Published on

Palantir Post Mortem

Authors
  • avatar
    Name
    michael
    Twitter

Palantir

Data scraping in a systems language like C++ is an ugly art, from limited Unicode support to clunky string manipulation it's not surprising most sensible people reach for Python when opting to scrape websites.

Unfortunately I'm more familiar with C++ than Python so I began this grim undertaking with a CMakeLists

Let's explore some of the core building blocks that made this a reality.

Scoped Locks and Mutexes

One of the best ways to write modern C++ is using the RAII (resource acquisition is initialisation) idiom. An egregious name for a programming paradigm that removes a lot of grief from the language by cleaning up resources as they move out of scope.

Another common source of grief is multithreading, but despite its challenging reputation it can be easy to reason about if broken down to basic synchronisation primitives like mutexes.

Without going deep into atomics, cache lines or memory order semantics, a mutex is a well-understood abstraction for locking access to memory and can be coupled with RAII for a straightforward thread-safe implementation of many common STL containers.

palantir.hpp
namespace Palantir {

    template<typename T>
    class mutexQueue {
        std::queue<T> queue;
        mutable std::mutex mutex;

    public:
        size_t size() const {
            std::scoped_lock lock(mutex);
            return queue.size();
        }
    };
}

We use a templated STL queue for that enviable constant O(1) insertion/deletion complexity with a scoped lock guard to ensure that typically unsafe operations can now be performed across multiple threads.

The extension of the queue above to support writing is left as an exercise for the reader.

Zero Copy

Copying data is regrettable, but sometimes unavoidable (e.g. if using coroutines), when optimising for performance.

There are a multitude of strategies, compiler optimisations and patterns to leverage when approaching "zero copy" by design. I'll try to cover some of the essentials that were involved in Palantir.

Move Semantics

Move semantics, as the name implies, can improve performance characteristics by simply moving data rather than copying it.

Although one must be careful to avoid using std::move() on function return values as this prevents the compiler from exercising copy elision (NRVO).

When designing objects an easy way to ensure your data's moved rather than copied is to delete the copy constructor and the copy assignment operator. This approach is Rust-like but works well.

myclass(const myclass&) = delete;
myclass& myclass::operator=(const myclass&) = delete;

The compiler will give you some grief if you don't move said class, which can be desirable if the alternative is an expensive copy operation.

Static Initialisation

Straightforward, fundamental, and sometimes forgotten: we can avoid copy by ensuring that data we know ahead of time is allocated appropriately.

scraper.h
class Scraper {

    static constexpr std::array nodes
            {
                    "str1",
                    "str2",
                    "etc"
            };
}

Here static lets all instances of the Scraper class share the same variable to avoid duplication. Such variables are initialised before main() is called, during _start provided by your C standard library of choice.

Additionally, its immutability by way of constexpr lets the compiler store this in the .data section of the executable during compilation.

Taking advantage of compile time constants is often the gateway into metaprogramming...

Compile Time Metaprogramming

The most obvious path to avoiding excess copy is to do any heavy lifting at compile time instead of runtime. Some functions, or variables, may be more suitable as constant expressions with the constexpr keyword if they can be evaluated at compile time.

Although consteval in C++20 is more appropriate, as the compiler is not necessarily obliged to honour a constexpr for some reason.

A good challenge would be rewriting the mythical Fizz Buzz using template metaprogramming and fold parameter packs for a runtime complexity O(0) implementation.

Or you could even use your MMU for some meta meta programming.