In the previous part we have started working on our Rust orderbook re-implementation of the C++ original, and defined the types Order
and OrderKey<Side>
, where Side
is a type implementing the OrderbookSide
trait: either Bid
(for buy orders) or Ask
(for sell orders).
Defining OrderHandle
Let’s try to implement inserting an order into the orderbook in the same way we have done in C++. First, let’s define the OrderHandle
and Orderbook
types:
struct OrderHandle {
remove_fn: Box<dyn FnOnce()>,
}
struct Orderbook {
bids: BTreeMap<OrderKey<Bid>, Order>,
asks: BTreeMap<OrderKey<Ask>, Order>,
}
impl Orderbook {
fn new() -> Orderbook {
Orderbook {
bids: BTreeMap::new(),
asks: BTreeMap::new(),
}
}
// ...
}
In the previous article I have already introduced BTreeMap
, which we will be using instead of std::map
in C++, to similar semantics, with one important caveat that we will get to later. We have implemented Ord
for OrderKey<Side>
to be able to use as keys here.
Note the Orderbook::new()
implementation: in Rust, there is no such thing as a default constructor (the closest is Default::default()
for types that implement the Default
trait), so you have to initialize every field explicitly and manually. The convention is the define an associated new()
function (the equivalent of a static
method in C++), but this is not required by the language.
OrderHandle
deserves some explanation as well. For reference, this is how we defined it in C++:
struct OrderHandle {
std::function<void()> remove = nullptr;
};
We know that std::function
in C++ provides type erasure for callable types (such as lambda expressions), and therefore stores the callable object on the heap (barring small-function optimization). In the Rust implementation, we want to achieve the same result, which we can with the combination of two features: Box<>
and dyn Trait
. Let’s start with Box<>
because that is easier: this is the Rust version of std::unique_ptr<>
.
dyn Trait
is the Rust way of describing any type that implements the Trait
trait; in this case, FnOnce()
: the trait for functions that may not be called multiple times because the call function takes self
by value (moving by default). The equivalent to C++ would be an abstract base class (interface as it is called in other languages) called FnOnce
that defines a pure virtual method call_once()
, which every function, lambda expression, and otherwise callable type must implement. This would allow us to store a FnOnce
pointer or reference (e.g. std::unique_ptr<FnOnce>
), relying on dynamic dispatch (via a vtable lookup) to call the correct call_once()
implementation. In C++, this is not how you treat callable objects, because it would force every single callable (including every function!) object to contain an extra, hidden vptr field that points to the concrete type’s vtable, increasing memory use and costing runtime performance. Instead, this is only done when you actually want type erasure and dynamic dispatch of a function and you opt-in to this by using std::function<>
. However, inheriting from a base class with any virtual methods always incurs the overhead of a vptr in C++, whether you ever need it for dynamic dispatch or not.
Rust has no such thing as inheritance, the closest thing it has to base classes are traits. Because traits are so ubiquitous (think of e.g. the Eq
, Ord
, and Add
traits), the C++ (and C#, Java, etc.) approach is not workable. Instead, in Rust, you only pay the overhead of dynamic dispatch when you explicitly request that feature by using the dyn
keyword. This makes the dyn Trait
type have an compile-time unknown size: it depends on the size of the implementing object which is only known at runtime (the entire point of using dynamic dispatch!), forcing us to use pointer indirection. In this case, we need to store the function and so need ownership of it: therefore, Box<>
, though any other smart pointer type, such as Rc<>
or Arc<>
(the equivalents of the C++ std::shared_ptr<>
) would also work.
Implementing Orderbook::get_side()
For our Orderbook
class, we will attempt to define a similar API as in C++, so something like this:
impl Orderbook {
fn get_side<Side>(&mut self) -> &mut ??;
fn insert<Side>(&mut self, order: Order) -> OrderHandle;
}
insert()
will use get_side()
so let’s start there. What should this function return? For a refresher, here is our C++ implementation:
template <Side S>
auto& get_side() {
if constexpr (S == Side::Bid) return bids;
else if constexpr (S == Side::Ask) return asks;
else __builtin_unreachable();
}
In Rust, we cannot just hand-wave the return type away like so, we have to name a type. We also cannot do compile-time branching like C++ if constexpr
, so we have to delegate the implementation to OrderbookSide::get_orders()
. Let’s revisit the relevant definitions from the previous post:
trait OrderbookSide: Copy + Eq + Sized + 'static {
// ...
fn get_orders(orderbook: &mut Orderbook) -> &mut BTreeMap<OrderKey<Self>, Order>;
}
// ...
impl OrderbookSide for Bid {
// ...
fn get_orders(orderbook: &mut Orderbook) -> &mut BTreeMap<OrderKey<Self>, Order> {
&mut orderbook.bids
}
}
impl OrderbookSide for Ask {
// ...
fn get_orders(orderbook: &mut Orderbook) -> &mut BTreeMap<OrderKey<Self>, Order> {
&mut orderbook.asks
}
}
In Rust, this is the only way (that I know of) of doing compile-time branching: using traits. That is a lot more code than the C++ version! But with that in place, we can finally define Orderbook::get_side()
:
fn get_side<Side: OrderbookSide>(&mut self) -> &mut BTreeMap<OrderKey<Side>, Order> {
Side::get_orders(self)
}
It is worth highlighting an important way in which Rust generics differ from C++ templates, which is that generics are much more restrictive. C++ assumes that any code you wrote using templates is a-okay unless it fails to compile a particular instantiation, meaning that get_side()
can be used without any particular difficulties… that is, unless you count the fact that the IDE cannot help you with the return type of get_side()
and what methods are callable on it. A human can prove that it’ll always be a std::unordered_map<OrderKey, Order, Something>
, but the tooling is lacking, also because there’s nothing stopping you from writing code where get_side()
will return completely different types for different Side
values.
Rust, much like Java or C#, takes the opposite approach and doesn’t let you do anything to generic types unless you have constrained (bound) it. So Orderbook::get_side()
needs the Side: OrderbookSide
bound: otherwise, Rust will refuse compilation, saying that Side
might not have an associated function get_orders()
.
Furthermore, this would not work:
trait OrderbookSide {
type OrderMap;
fn get_orders(orderbook: &mut Orderbook) -> &mut Self::OrderMap;
}
impl OrderbookSide for Bid {
type OrderMap = BTreeMap<OrderKey<Self>, Order>;
fn get_orders(orderbook: &mut Orderbook) -> &mut Self::OrderMap {
&mut orderbook.bids
}
}
impl OrderbookSide for Ask {
type OrderMap = BTreeMap<OrderKey<Self>, Order>;
fn get_orders(orderbook: &mut Orderbook) -> &mut Self::OrderMap {
&mut orderbook.asks
}
}
Here is the diagnostics:
error[E0308]: mismatched types
--> blogpost.rs:180:9
|
179 | fn get_side<Side: OrderbookSide>(&mut self) -> &mut BTreeMap<OrderKey<Side>, Order> {
| ------------------------------------ expected `&mut std::collections::BTreeMap<OrderKey<Side>, Order>` because of return type
180 | Side::get_orders(self)
| ^^^^^^^^^^^^^^^^^^^^^^ expected `&mut BTreeMap<OrderKey<Side>, ...>`, found `&mut <... as OrderbookSide>::OrderMap`
|
= note: expected mutable reference `&mut std::collections::BTreeMap<OrderKey<Side>, Order>`
found mutable reference `&mut <Side as OrderbookSide>::OrderMap`
help: consider constraining the associated type `<Side as OrderbookSide>::OrderMap` to `std::collections::BTreeMap<OrderKey<Side>, Order>`
|
179 | fn get_side<Side: OrderbookSide<OrderMap = std::collections::BTreeMap<OrderKey<Side>, Order>>>(&mut self) -> &mut BTreeMap<OrderKey<Side>, Order> {
| ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Side::OrderMap
is an opaque, unconstrained type that we are trying to coerce into a BTreeMap<>
, and Rust is having none of it. A fix would be to add an additional constraint for Side
, namely that Side::OrderMap
is BTreeMap<OrderKey<Side>, Order>
… but then we have gone full circle and we might as well name the type properly as we have done originally.
Naive Orderbook::insert()
Moving on to Orderbook::insert()
, we have to make a small change to the return type: in C++ we have returned OrderHandle::Invalid
(with remote_fn = nullptr
) if the order was not inserted, but Rust does not allow null (outside of unsafe
code using raw pointers), forcing to return an Option<OrderHandle>
instead. This is probably a cleaner API indeed, but I’d guess that the codegen ends up being worse (though I have not checked).
impl Orderbook {
// ...
fn insert<Side: OrderbookSide>(&mut self, order: Order) -> Option<OrderHandle> {
let order_key = OrderKey::<Side> {
price: order.price,
seq_num: order.seq_num,
_side: PhantomData,
};
// ...
if order.volume <= 0 {
return None;
}
let same_side = self.get_side::<Side>();
same_side.insert(order_key, order);
let order_handle = OrderHandle{
remove_fn: Box::new(move || {
same_side.remove_entry(&order_key);
}),
};
Some(order_handle)
}
}
This is notably missing an optimization present in the C++ code: capturing an iterator to the inserted order map entry so that removal can occur without first having to perform a lookup. This is not possible to do with Rust’s BTreeMap
unfortunately, because a B-tree cannot provider the pointer stability that a node-based implementation can. So in Rust, we are stuck with capturing same_side
by reference and order_key
by value (moving it).
It might look like that we are pretty much done, but Rust rejects our code. Here is the compiler diagnostics:
error: lifetime may not live long enough
--> blogpost.rs:198:24
|
180 | fn insert<Side: OrderbookSide>(&mut self, order: Order) -> Option<OrderHandle> {
| - let's call the lifetime of this reference `'1`
...
198 | remove_fn: Box::new(move || {
| ________________________^
199 | | same_side.remove_entry(&order_key);
200 | | }),
| |______________^ cast requires that `'1` must outlive `'static`
What the compiler is saying is that for this code to be safe, &mut self
must outlive 'static
, that is, the Orderbook
object must provably continue to exist until the program terminates. This however we cannot guarantee in general, let alone in our current case, as our Orderbook
lives inside a HashMap
:
struct Exchange {
order_books: HashMap<String, Orderbook>,
order_handles: HashMap<u32, OrderHandle>,
}
The issue that Rust is highlighting is that there is nothing stopping us from removing a value from Exchange::order_books
, leaving OrderHandle
s potentially storing a dangling reference (same_side
). The HashMap
might also rehash, an issue we already talked about with the C++ implementation.
Even if we were somehow able to convince the compiler that neither of this is a concern, the code would still be rejected, as Orderbook
may end up being used in a different context later.
Idiomatic Orderbook::insert()
using ref-counting
As best as I can figure, the idiomatic Rust solution to this is guaranteeing that the Orderbook
remains accessible while any OrderHandle
s referencing it remain by using reference-counting. Rust’s Rc<>
however comes with a severe limitation: it does not provide mutable access to its contents. This is because otherwise Rc<>
would allow us to circumvent’s Rust safety rule that a value may either be referenced by a single mutable reference or any number of shared references at the same time, making it possible to write code such as:
let numbers1 = Rc::new(vec![1, 2, 3]);
let numbers2 = numbers1.clone();
for item in numbers1.iter() {
numbers2.push(item + 1);
println!("Got {item}");
}
Here the vector is borrowed immutably for the scope of the for
due to numbers1.iter()
(which yields items by reference), but then also borrowed mutably by the call to push()
. This is unsafe because pushing to a vector might force a reallocation, leaving the item
reference to dangle.
The Rust solution to this is to defer this kind of safety checking to runtime by using a mutable container such as RefCell<>
. RefCell<>
enforces the borrow checker’s rules during runtime, at the cost of incurring a performance overhead. However, this allows our code to compile and run correctly:
use std::cell::RefCell;
use std::rc::Rc;
// ...
impl Orderbook {
// ...
fn insert<Side: OrderbookSide>(orderbook_ptr: Rc<RefCell<Orderbook>>, order: Order) -> Option<OrderHandle> {
let mut orderbook = orderbook_ptr.borrow_mut();
// ... as before ...
let order_handle = {
let captured_orderbook_ptr = orderbook_ptr.clone();
OrderHandle{
remove_fn: Box::new(move || {
let mut orderbook = captured_orderbook_ptr.borrow_mut();
let same_side = orderbook.get_side::<Side>();
same_side.remove_entry(&order_key);
}),
}
};
Some(order_handle)
}
}
Notice how Orderbook::insert()
now takes orderbook_ptr
instead of &mut self
, meaning that it can be no longer called as a method. This is necessary so that we can then clone the reference-counted pointer and capture that clone in the lambda expression.
Unsafe Orderbook::insert()
: C++ing it up
Warning: the code and approach presented here actually leads to undefined behaviour in Rust. See Order book in Rust: fixing the undefined behaviour for an explanation and a fix. I’m keeping the original version here for authenticity.
Using Rc<RefCell<>>
incurs a performance hit which we may not be willing to pay. Nor should we have to, as we can prove that our code would work, if only the pesky borrow checker could have some chill.
What we can do then is to acknowledge that the compiler is not going to be able to prove that our code is safe, and \ take matters into our own hands by using unsafe
. Specifically, we are going to store same_side
not as a reference, but as a raw pointer, which has no lifetime restrictions as far as the borrow checker is concerned:
let order_handle = {
let same_side_ptr = same_side as *mut BTreeMap<OrderKey<Side>, Order>;
OrderHandle{
remove_fn: Box::new(move || {
let same_side = unsafe { &mut *same_side_ptr };
same_side.remove_entry(&order_key);
}),
}
};
This will work if we remember to keep our Orderbook
objects boxed in the HashMap
so that a rehashing does not move them:
struct Exchange {
order_books: HashMap<String, Box<Orderbook>>,
order_handles: HashMap<u32, OrderHandle>,
}
It is worth keeping in mind though that writing unsafe Rust is a bit more difficult and less convenient than writing C++, as Rust has more restrictions and forces code to be more explicit, to the extent that unsafe code ends up usually being a lot more verbose than the C or C++ equivalent would be. A good example this can be seen in Learn Rust the Dangerous Way.
Our unsafe
version of Orderbook::insert()
will work only so long as the code outside Orderbook
ensures that it uses the class correctly by ensuring that the Orderbook
reference stays valid at least as long as there may be any OrderHandle
s using it. By using unsafe
, the safety of our code can no longer be reasoned about locally, i.e. just by looking at individual pieces of code, but we must do so globally, by considering our entire codebase as a whole. Safety is only maintained if every use of Orderbook
is correct. For C++ programmers, this is just business as usual, but for sane people, this is very undesirable. Sometimes though, such as when performance is critical, this trade-off is worth it, and that is why Rust has unsafe
.
Wrapping up: implementing Exchange
All that is left is implementing Exchange::insert()
and cancel()
(using the safe implementation with Rc<RefCell<>>
):
impl Exchange {
fn new() -> Exchange {
Exchange {
order_books: HashMap::new(),
order_handles: HashMap::new(),
}
}
fn insert(&mut self, symbol: String, order: Order) {
let order_id = order.id;
let orderbook_ptr = self
.order_books
.entry(symbol)
.or_insert_with(|| Rc::new(RefCell::new(Orderbook::new())))
.clone();
let new_handle = match order.side {
Side::Bid => Orderbook::insert::<Bid>(orderbook_ptr, order),
Side::Ask => Orderbook::insert::<Ask>(orderbook_ptr, order),
};
if let Some(new_handle) = new_handle {
self.order_handles.insert(order_id, new_handle);
}
}
fn cancel(&mut self, order_id: u32) {
if let Some(order_handle) = self.order_handles.remove(&order_id) {
(order_handle.remove_fn)();
}
}
}
We can improve the API a bit using the newtype pattern, creating a “strong type alias” for symbols and order IDs:
#[derive(Hash, PartialEq, Eq, Debug, Clone)]
struct Symbol(String);
#[derive(Hash, PartialEq, Eq, Debug, Clone, Copy)]
struct OrderId(u32);
#[derive(Debug, Clone)]
struct Order {
seq_num: u64,
id: OrderId,
// ...
}
// ...
struct Exchange {
order_books: HashMap<Symbol, Rc<RefCell<Orderbook>>>,
order_handles: HashMap<OrderId, OrderHandle>,
}
impl Exchange {
// ...
fn insert(&mut self, symbol: Symbol, order: Order) { /* ... */ }
fn cancel(&mut self, order_id: OrderId) { /* ... */ }
}
Further reading
Thank you for following along with me in this mini-series! I hope you have enjoyed it, and have gotten a bit more familiar with Rust along the way. See also Order book in Rust: fixing the undefined behaviour for further exploration of how to implement this in unsafe
Rust while avoiding undefined behaviour.
If you would like to learn more, here are a few resources and articles I have found particularly useful or interesting:
- Wrapper Types in Rust: Choosing Your Guarantees: explaining the vocabulary types of Rust, like references, pointers,
Box
,Rc
,Cell
, etc. - The Problem With Single-threaded Shared Mutability: why is Rust so restrictive about references and mutability?
- Rust by Example: learn Rust one code example at a time
- Rustlings: small exercises to get you used to reading and writing Rust code
- Learn Rust the Dangerous Way: converting a small C program (nbody) into Rust,
unsafe
first