Implementing order books in C++ and Rust - part 2/3

In the previous part, we have sketched out the structure of implementing a simple order book in C++, supporting order insertion and efficient cancellation. (Implementing uncrossing, that is, trades when a buy order and a sell order agree on a price, is an exercise left to the reader, as is implementing order modification.)

In this article, let’s explore how we can implement the same thing in Rust, starting with the basic types we need before we can even start implementing an orderbook.

Defining `Order` and `OrderKey`

Let’s start by defining the same Order and OrderKey types:

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Side {
    Buy,
    Sell,
}

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct OrderKey {
    price: f64,
    seq_num: u64,
}

#[derive(Debug, Clone)]
struct Order {
    seq_num: u64,
    id: u32,
    side: Side,
    volume: i32,
    price: f64,
}

impl Order {
    fn key(&mut self) -> OrderKey {
        OrderKey {
            price: self.price,
            seq_num: self.seq_num,
        }
    }
}

So far here, there shouldn’t be anything too surprising, except perhaps the #[derive(...)] syntax, which can be used to auto-generate implementations of traits. Traits are a sort of a mix between C++ concepts and abstract classes (interfaces), and I’m told most similar to Haskell’s meta-classes. For example, implementing the Eq trait requires that you implement equality comparison as a function, and it, well, makes values of this type equality-comparable: the equivalent of defining operator== in C++. derive()ing Eq then is the Rust equivalent to writing bool operator==(...) = default; in C++.

The code above actually would fail to compile due to attempting to derive Eq for OrderKey. The hint as to why lies in the existence of PartialEq: floats do not have a total equivalence relation, namely because NaNs violate the requirement of reflexivity: x != x if x = NaN. Therefore, floats (f32, f64) cannot be Eq in Rust, only PartialEq. Our OrderKey structure contains a price: f64 field, which makes it impossible to derive Eq for. If we have a guarantee though that price cannot possible be NaN, then we can tell the compiler that it’s all good by implementing Eq manually:

#[derive(Debug, PartialEq, Clone, Copy)]
struct OrderKey {
    price: f64,
    seq_num: u64,
}

// we check the price for NaN before we can get here
impl Eq for OrderKey {}

Rust is big on explicitness in general. For instance, you can implement the Clone trait by implementing the clone(&self) -> Self method, used for cloning the object, in a similar manner as a C++ copy-constructor would, except that in Rust:

types are not cloneable unless you explicitly make them so,
you always have to call clone() manually, except if you implement the Copy trait, which makes it happen implicitly whenever an object of this type is passed by value.

Rust has pretty good documentation, especially for basics like this, and they also explain all this: Trait std::marker::Copy.

Putting the `Ord` in the `OrderKey`

The closest Rust equivalent of the C++ std::map<> is std::collections::BTreeMap<>, but there’s one big difference: while std::map is typically implemented as a red-black tree, BTreeMap is implemented as a (you guessed it) B-Tree (which is not the same as a binary search tree!) that stores multiple elements in one node to improve data locality. I expect, though I haven’t benchmarked it, that Rust’s BTreeMap would generally significantly outperform C++’s std::map, while providing essentially the same functionality. Nice!

In std::map we could pass in a comparator type to be used as the third template argument, but Rust’s BTreeMap is more opinionated than that: it requires that our key type implement Ord, the trait for total ordering. This is certainly possible to do, but requires that we put the order’s side into OrderKey as well, otherwise we cannot tell whether a lower of higher price is better:

use core::cmp::Ordering;
// ...

#[derive(Debug, PartialEq, Clone, Copy)]
struct OrderKey {
    side: Side,
    price: f64, // assume non-NaN
    seq_num: u64,
}

impl Eq for OrderKey {}

impl Ord for OrderKey {
    fn cmp(&self, other: &OrderKey) -> Ordering {
        if self.price < other.price {
            match self.side {
                Side::Buy => Ordering::Greater,
                Side::Sell => Ordering::Less,
            }
        } else if self.price > other.price {
            match self.side {
                Side::Buy => Ordering::Less,
                Side::Sell => Ordering::Greater,
            }
        } else {
            self.seq_num.cmp(&other.seq_num)
        }
    }
}

impl PartialOrd for OrderKey {
    fn partial_cmp(&self, other: &OrderKey) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

This shows off Rust’s pattern-matching, which makes this type of code a pleasure to write compared to C++: even in a simple scenario like this one, the equivalent C++ switch (self.side) { ... } would be more to annoying to write, because:

switch and case are not expressions and therefore do not yield values,
even if you switch on an enum and you have a case for each possible value, the compiler is not satisfied: this is because in C++, it is completely legal for an enum to have a value that is not one of its listed options.

For example:

enum class Side {
    Buy = 1,
    Sell = 2,
};
...
auto weird_side = static_cast<Side>(3); // this is ok!

Admittedly, this is terrible code, but it is perfectly legal, due to the legacy of C enums being just named integer literals. This is not possible in safe Rust, and triggers undefined behaviour in unsafe Rust: one of the many cases where Rust is stricter than C++, but as a result can optimize better.

Okay, so now that OrderKey is Ord, we can use it as a key in BTreeMap. Unfortunately, we ended up with much less efficient code than the C++ version, due to having to store the order side in OrderKey, and branching on side in cmp(). One way to deal with this situation is to define wrapper types around OrderKey and implement Ord differently for them. For example:

use core::cmp::Ordering;
// ...

#[derive(Debug, PartialEq, Clone, Copy)]
struct OrderKey {
    price: f64,
    seq_num: u64,
}

impl Eq for OrderKey {}

// order key for buy orders
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct BidKey(OrderKey);

impl Ord for BidKey {
    fn cmp(&self, other: &BidKey) -> Ordering {
        if self.0.price < other.0.price {
            Ordering::Greater
        } else if self.0.price > other.0.price {
            Ordering::Less
        } else {
            self.0.seq_num.cmp(&other.0.seq_num)
        }
    }
}

impl PartialOrd for BidKey {
    fn partial_cmp(&self, other: &BidKey) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// order key for sell orders
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct AskKey(OrderKey);

impl Ord for AskKey {
    fn cmp(&self, other: &AskKey) -> Ordering {
        if self.0.price < other.0.price {
            Ordering::Less
        } else if self.0.price > other.0.price {
            Ordering::Greater
        } else {
            self.0.seq_num.cmp(&other.0.seq_nym)
        }
    }
}

impl PartialOrd for AskKey {
    fn partial_cmp(&self, other: &AskKey) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

Here BidKey and AskKey are both tuple types, hence the need for self.0 to access the underlying OrderKey. We had to duplicate a lot of the code unfortunately to achieve this.

Another approach: using traits

Another way to solve the issue and deduplicate the code is by using Rust’s trait system. The idea is as follows: we define the trait (concept) of an orderbook side which knows how to decide if a particular price is better than another. Then we define two types (Bid and Ask) that implement that concept. Finally, the we make OrderKey generic over these.

Let’s see:

use core::cmp::Ordering;
use std::marker::PhantomData;

trait OrderbookSide: Eq {
    type Opposite: OrderbookSide;

    fn is_price_better_than(left: f64, right: f64) -> bool;
}

// the buyer side
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct Bid;

// the seller side
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct Ask;

impl OrderbookSide for Bid {
    type Opposite = Ask;

    fn is_price_better_than(left: f64, right: f64) -> bool {
        left > right
    }
}

impl OrderbookSide for Ask {
    type Opposite = Bid;

    fn is_price_better_than(left: f64, right: f64) -> bool {
        left < right
    }
}

#[derive(Debug, PartialEq, Clone, Copy)]
struct OrderKey<Side: OrderbookSide> {
    price: f64, // assume not-NaN
    seq_num: u64,
    _side: PhantomData<Side>,
}

impl<Side: OrderbookSide> Eq for OrderKey<Side> {}

impl<Side: OrderbookSide> Ord for OrderKey<Side> {
    fn cmp(&self, other: &Self) -> Ordering {
        if self.price == other.price {
            self.seq_num.cmp(&other.seq_num)
        } else if Side::is_price_better_than(self.price, other.price) {
            Ordering::Less
        } else {
            Ordering::Greater
        }
    }
}

impl<Side: OrderbookSide> PartialOrd for OrderKey<Side> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

One peculiar you thing you might notice is the field _side: PhantomData<Side> in OrderKey. PhantomData here is necessary, otherwise rustc will not accept our program, due to the unused Side type parameter in the struct definition. Fortunately, it does not pose a run-time cost: Rust supports zero-size types (such as our Bid and our Ask marker types) and so PhantomData<T> does not actually increase the size of OrderKey.

The <Side: OrderbookSide> syntax represents a trait bound: requirements on what traits the type parameter must implement. Rust would not otherwise allow the use of associated functions and types (such as is_price_better_than and Opposite) on Side, unlike in C++. Rust traits act as C++ template specializations, concepts, and also interfaces (abstract base classes).

Even this way, our Rust code is much longer and more verbose than the equivalent C++ one, mainly due to BTreeMap requiring the key to be Ord, which in turn means that it also needs to be PartialOrd, PartialEq, and Eq. We did not have to bother with any of this in C++, as std::map just lets us pass in a comparator as the third template argument. I don’t see any real reason why Rust is less ergonomic here, so I’m going to chalk it up to a difference in culture and/or philosophy. I can see a sort-of theoretical purity to forcing the same ordering and equality implementations to apply in all cases, but I’m not a fan of the imposed cost.

In the final part of the series, we will implement the Orderbook and Exchange structures and build support for inserting and cancelling orders.

Defining Order and OrderKey

Putting the Ord in the OrderKey

Another approach: using traits

Defining `Order` and `OrderKey`

Putting the `Ord` in the `OrderKey`