A fairly standard programming problem in the world of trading is building an order book. In this article, I will be showing one particular interesting aspect of this problem, and how to solve it in C++. In the next two parts, we will be trying to implement the same solution in Rust.
This is meant to be comprehensive tutorial of Rust, but I will do my best to explain some of the most interesting or unusual parts. The series will try to approach the work of programming in Rust in much the same way as I expect another C++ developer would experience it. Therefore, some C++ knowledge is taken for granted.
Brief introduction: orders and order books
An order represents willingness to buy or sell something: for example, Buy AAPL 25 @ 137.85
would represent a buy order on the Apple stock (identified by the symbol AAPL
) for a volume of 25 (i.e. 25 shares) at the price of (up to) 137.85. An exchange (also called stock exchange or market) maintains books with all active orders in them, allowing potential buyers and sellers to trade with each other.
Order books tend to be split by side, and the orders are sorted so that an order with higher priority comes before an order with lower priority. Priority is usually determined by two things: the price first and foremost, and time second (to act as an arbitrator in the case of multiple orders with the same price). For buy orders, a higher price is better (since it is more likely to lead to a trade), and therefore has priority; for sell orders, it’s the opposite.
Defining Order
and Orderbook
Let’s start with some simple C++ code to define the types we need to represent an order book:
// Used for sorting orders
struct OrderKey {
double price = NAN;
uint64_t seq_num = 0; // for time-priority
};
enum class Side {
None = 0,
Bid, // buy
Ask, // sell
};
using OrderId = uint32_t;
struct Order {
uint64_t seq_num = 0;
OrderId id = 0;
Side side = Side::None;
uint32_t volume = 0;
double price = NAN;
OrderKey key() const {
return OrderKey{
.price = price,
.seq_num = seq_num,
};
}
};
// Comparators for OrderKey-s such that the orders most likely to trade come first.
// For buy orders (bids) this means price-descending order; for sell orders (asks) this means price-ascending order.
struct CompareBids {
bool operator()(const OrderKey& lhs, const OrderKey& rhs) const noexcept {
if (lhs.price > rhs.price) return true;
else if (lhs.price == rhs.price) return lhs.seq_num < rhs.seq_num;
else return false;
}
};
struct CompareAsks {
bool operator()(const OrderKey& lhs, const OrderKey& rhs) const noexcept {
if (lhs.price < rhs.price) return true;
else if (lhs.price == rhs.price) return lhs.seq_num < rhs.seq_num;
else return false;
}
};
struct Orderbook {
std::map<OrderKey, Order, CompareBids> bids;
std::map<OrderKey, Order, CompareAsks> asks;
};
There are many different ways to structure the data, but here I opted to introduce an OrderKey
structure which is used to keep all Order
s properly sorted in their maps. OrderKey
is strictly a subset of the fields of Order
, which means that each map entry has some data duplication (both price
and seq_num
is stored in both the key and the value), but it makes the code much easier to work with.
In this approach, we must take care to never accidentally modify an Order
’s price
or seq_num
without also updating the corresponding OrderKey
, which requires the order to be removed and re-inserted to the map. (As of C++ 17 this can be done efficiently done using std::map<>::extract()
.)
Note that while it seems that some code duplication could be avoided by making Order
inherit from OrderKey
, I consider this to be a bad idea for a number of reasons:
- An
Order
is not anOrderKey
in a sense where you’d ever want anOrder
to be valid in a place where anOrderKey
is accepted. Requiring an explicit call toOrder::key()
is much less confusing. - It prevents
Order
from being constructed using the designated initializers syntax that I’m using inOrder::key()
. - It is not necessarily the case that as the code evolves,
Order
remains a superset ofOrderKey
. For example,seq_num
arguably should not be a field inOrder
at all, as its only use is encoding time-priority.
Of course, there are many things we could here. We could optimize Order
to take up less memory, e.g. by using bit fields: keep 1 or 2 bits for side (depending on if you want to keep Side::None
or not), and 31 or 30 bits for volume to reduce sizeof(Order)
by 7 bytes. As I mentioned before, arguably we do not want seq_num
to be a field in Order
at all, which would save an additional 8 bytes. Saving memory at this scale is not really a relevant factor for overall memory consumption, but it can make a lot of difference for performance: the smaller our objects, the more of them can fit into caches.
I am using std::map
here to store the orders in the ordering we want them: now the top of the book (the orders that have the highest chance of trading) are the first elements in both collections. We could also use std::priority_queue
to achieve the same thing, but std::map
has a nice properties that will come in handy later. (We cannot use std::set
because orders can change during lifetime, e.g. their volume might decrease if they are traded against, and std::set
cannot allow that it would allow elements to be mutated such that their position should change.)
Defining Exchange
and implementing insertion
Next we should define an Exchange
class that will contain all of our order books. We have to maintain the order book separately for each instrument, as orders on different instruments are completely unrelated to each other. However, we are going to add one extra caveat: we are going to assume that all orders are identified by an ID that is unique across books, and we want to be able to efficiently cancel orders buy specifying only their ID, without having to repeat the side.
We could go about it like so:
struct OrderHandle { /*???*/ };
using Symbol = std::string;
struct Exchange {
std::unordered_map<Symbol, Orderbook> order_book_by_symbol;
std::unordered_map<OrderId, OrderHandle> order_handle_by_id;
void insert(const Symbol&, const Order&);
void cancel(OrderId);
};
Inserting an order means inserting it into the appropriate order book’s appropriate side map (bids
or asks
), as well as registering it in order_handle_by_id
. We do not yet know what OrderHandle
will have to look like for Exchange::cancel()
to be as efficient as possible, so we left that empty for now.
Let’s implement Exchange::insert()
next:
void Exchange::insert(const Symbol& symbol, const Order& order) {
// look up the order book if it already exists, otherwise insert it
const auto [orderBookIt, orderBookInserted] = order_book_by_symbol.try_emplace(symbol);
Orderbook& orderBook = orderBookIt->second;
// ... logic for matching this order against orders on the opposite side to see if we have any trades ...
if (order.volume <= 0) {
// the new order is fully traded, so nothing is inserted into the orderbook
return;
}
// store the new order and a handle to it
OrderHandle newHandle;
if (order.side == Side::Bid) {
const auto [orderIt, orderInserted] = orderBook.bids.try_emplace(order.key(), order);
assert(orderInserted && "Duplicate order key");
// TODO: fill out `newHandle`
} else {
assert(order.side == Side::Ask);
// similarly...
}
order_handle_by_id.try_emplace(order.id, std::move(newHandle));
}
Implementing efficient order cancellation
How could we implement Exchange::cancel()
as efficiently as possible?
One option would be to have OrderHandle
store all the information needed to locate the order: the containing order book, order side, and order key. However, this would make cancellation quite slow, as it would require 3 map lookups:
order_handle_by_id
to find theOrderHandle
,order_book_by_symbol
as we cannot store anOrderbook*
directly unless we can guarantee thatorder_book_by_symbol
will never rehash,Orderbook::bids
orOrderbook::asks
.
Looking at the order insertion code again, the insight for how to optimize this comes by realizing that at the end of order insertion, we have everything we need to effectively remove the order. It is just two things: a reference to the map that contains the order, and iterator referencing the order in the map. Since std::map
guarantees address stability, we can rely on std::map<>::iterator
to never be invalidated unless the corresponding element is removed, meaning that it is save to store.
Unfortunately, we cannot have OrderHandle
store these directly, because both the map type and the iterator type depend on the comparator used, which will be different between bids and asks. We would need to template OrderHandle
on our comparator, which would prevent our bid and ask order handles to be stored together. That doesn’t work.
Type erasure to the rescue! We can encapsulate the order removal into a lambda expression at the end of order insertion, and save that in an std::function
which can then go on to live inside our OrderHandle
:
struct OrderHandle {
std::function<void()> remove_fn = nullptr;
};
void Exchange::insert(const Symbol& symbol, const Order& order) {
// ...
OrderHandle newHandle;
if (order.side == Side::Bid) {
const auto [orderIt, orderInserted] = orderbook.bids.try_emplace(order.key(), order);
assert(orderInserted && "Duplicate order key");
newHandle.remove_fn = [&orderbookSide=orderbook.bids, orderIt]() {
orderbookSide.erase(orderIt);
};
} else {
assert(order.side == Side::Ask);
const auto [orderIt, orderInserted] = orderbook.asks.try_emplace(order.key(), order);
assert(orderInserted && "Duplicate order key");
newHandle.remove_fn = [&orderbookSide=orderbook.asks, orderIt]() {
orderbookSide.erase(orderIt);
};
}
...
}
void Exchange::cancel(OrderId orderId) {
const auto orderHandleIt = order_handle_by_id.find(orderId);
if (orderHandleIt == order_handle_by_id.end()) {
return; // no such order
}
const OrderHandle& orderHandle = orderHandleIt->second;
orderHandle.remove_fn();
order_handle_by_id.erase(orderHandleIt);
}
Neat! We managed to cut down our 3 map lookups needed to cancel an order to just 1!
We do have to pay attention that if we ever remove an order from an order book, we also must remove it from Exchange::order_handle_by_id
, otherwise the OrderHandle
will dangle due to the invalidated iterator.
Unfortunately, one bug has managed to sneak in, of the kind that C++ is notorious for: the reference to orderbook.bids
(or orderbook.asks
) is only valid so long as the Orderbook
object stays in place; but as I’ve indicated before, the std::unordered_map
containing the Orderbook
objects may rehash, which involves relocating all of it elements.
This is sometimes possible to prevent by reserving enough capacity for the std::unordered_map
upfront, but this is not always practical: often, it is difficult or impossible to give a reasonable upper bound on the maximum size the map will ever have, or if it is, the number is so huge that it is prohibitively expensive to actually do so.
The easiest, and guaranteed way of solving this is to store the Orderbook
s in an std::unique_ptr
.
This issue is very nasty, as during testing with just one or two symbols everything will appear to be working perfectly, but in production, when exposed to a large number of different symbols, the application will very quickly hit undefined behaviour, and crash in the best-case scenario. Ah, C++!
De-duplicating the code
Finally, Exchange::insert()
has some unfortunate code-duplication: two almost identical code paths for bids and asks. We can deduplicate this by adding some templates. Let’s also take this opportunity and refactor the code a little bit, by moving the bulk of the order insertion logic into Orderbook
:
struct Orderbook {
// ...
template <Side S>
auto& get_side() {
if constexpr (S == Side::Bid) return bids;
else if constexpr (S == Side::Ask) return asks;
else __builtin_unreachable();
}
template <Side S>
OrderHandle insert(Order order) {
// ... logic for matching this order against orders on the opposite side to see if we have any trades ...
if (order.volume <= 0) {
// the new order is fully traded, so nothing is inserted into the orderbook
return OrderHandle::Invalid;
}
auto& orderbookSide = get_side<S>();
const auto [orderIt, orderInserted] = orderbookSide.try_emplace(order.key(), order);
assert(orderInserted && "Duplicate order key");
return OrderHandle{
.remove_fn = [&orderbookSide, orderIt]() {
orderbookSide.erase(orderIt);
},
};
}
};
...
void Exchange::insert(const Symbol& symbol, const Order& order) {
if (order.side == Side::Bid) {
insert_impl<Side::Bid>(symbol, order);
} else {
assert(order.side == Side::Ask);
insert_impl<Side::Ask>(symbol, order);
}
}
template <Side S>
void Exchange::insert_impl(const Symbol& symbol, const Order& order) {
const auto [orderBookIt, orderBookInserted] = order_book_by_symbol.try_emplace(symbol);
Orderbook& orderBook = orderBookIt->second;
OrderHandle newHandle = orderbook.insert<S>(order);
if (newHandle) {
order_handle_by_id.try_emplace(order.id, std::move(newHandle));
}
}
Note that we also need some API changes to OrderHandle
:
struct OrderHandle {
static const OrderHandle Invalid;
std::function<void()> remove_fn = nullptr;
/*implicit*/ operator bool() const {
return remove_fn != nullptr;
}
};
const OrderHandle OrderHandle::Invalid{};
Pretty nice. Yay for templates!
In the next part we will begin implementing the same solution in Rust, and we will begin seeing where the two languages differ. The final part will see the bulk of the work being done, implementing order insertion and cancellation and the highlight of the show: fighting with the borrow checker.