# Redesigned Internal Memory Pool

Date: 2025-05-28

## Goals and expectations

Lay the foundations in the hub for the new memory pool design.

## Thought train

We're putting aside the support for AI clusters and focusing on the
HFT side of things for the time being. They have very different
network workloads, and one system that combines them both is not
really a good option.

This also means we can remove the plans for congestion control;
that's mostly done at the application layer, not the network layer,
in HFT infrastructure. This will speed up development and let me
focus on getting the most essential parts to function as intended.

A single block of BRAM typically allows only two simultaneous
operations, and only to non-conflicting addresses. This means that
servicing multiple interfaces at the same time is impractical.

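To make the port constraint concrete, here's a behavioral sketch in Python (not RTL; `DualPortBram` and its interface are illustrative stand-ins, not part of the actual design). It conservatively flags any same-address pair in one cycle as a conflict:

```python
class DualPortBram:
    """Behavioral model of a true dual-port BRAM: at most two
    operations per cycle, to non-conflicting addresses."""

    def __init__(self, depth: int):
        self.mem = [0] * depth

    def cycle(self, ops):
        # ops: list of ("read", addr) or ("write", addr, data)
        if len(ops) > 2:
            raise ValueError("a BRAM block exposes only two ports")
        addrs = [op[1] for op in ops]
        if len(set(addrs)) != len(addrs):
            # Conservative: treat any same-address pair as a conflict.
            raise ValueError("same-address conflict in one cycle")
        out = []
        for op in ops:
            if op[0] == "read":
                out.append(self.mem[op[1]])
            else:
                self.mem[op[1]] = op[2]
        return out
```

The exact conflict rules depend on the vendor's BRAM primitive and port modes; the model only captures the "two ports per cycle" limit that makes servicing many interfaces from one block impractical.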
So, I decided to implement a round-robin approach for both reads and
writes (it feels like being back at the same spot as a few weeks ago,
but there are some differences).

The approach is quite straightforward: every cycle, the hub selects
one of the interfaces to service, and while servicing it, it checks
for both RX-side and TX-side transmissions.

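The per-cycle selection above can be sketched in Python as a simple round-robin pointer; `RoundRobinHub`, `Iface`, and the pending flags are made up for the example, not taken from the actual RTL:

```python
class Iface:
    """Illustrative stand-in for an interface with pending work."""

    def __init__(self, name, rx=False, tx=False):
        self.name, self.rx_pending, self.tx_pending = name, rx, tx


class RoundRobinHub:
    """Each cycle, grant exactly one interface, advancing the
    pointer; while servicing it, check both the RX and TX sides."""

    def __init__(self, interfaces):
        self.interfaces = interfaces
        self.ptr = 0

    def cycle(self):
        iface = self.interfaces[self.ptr]
        self.ptr = (self.ptr + 1) % len(self.interfaces)
        serviced = []
        if iface.rx_pending:
            serviced.append("rx")
            iface.rx_pending = False
        if iface.tx_pending:
            serviced.append("tx")
            iface.tx_pending = False
        return iface.name, serviced
```

In hardware the pointer advance and the RX/TX checks would all happen in the same clock cycle; the sequential Python just makes the ordering of grants visible.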
Note that it's up to the interfaces to keep track of when a packet
has been fully received, but it's left to the hub to collect the
free slots.

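A minimal sketch of the slot bookkeeping, assuming a plain FIFO of slot indices (the class and method names are illustrative; the real `free_queue` is BRAM-backed RTL): RX allocation pops a free slot, and the TX side pushes a slot back once its packet is gone.

```python
from collections import deque


class FreeQueue:
    """Free-slot FIFO: starts holding every packet slot; the RX side
    allocates from the front, the TX side returns freed slots."""

    def __init__(self, num_slots):
        self.q = deque(range(num_slots))

    def alloc(self):
        # RX side: grab a free slot for an incoming packet, if any.
        return self.q.popleft() if self.q else None

    def free(self, slot):
        # TX side: return a slot once its packet has been sent.
        self.q.append(slot)
```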
The interfaces will have their own packet address queues to keep
track of their outgoing packets. Furthermore, each queue can be
limited to a fraction of the packet queue's size, giving control
over the maximum number of packets per packet buffer.

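A sketch of that capping idea, with an illustrative `fraction` parameter (the actual fraction and back-pressure mechanism are design choices not specified here): a full queue rejects the push, so no single interface can claim the whole packet buffer.

```python
from collections import deque


class PacketAddressQueue:
    """Per-interface queue of outgoing packet addresses, capped at a
    fraction of the total slot count."""

    def __init__(self, total_slots, fraction=0.25):
        self.capacity = max(1, int(total_slots * fraction))
        self.q = deque()

    def push(self, addr):
        if len(self.q) >= self.capacity:
            return False  # back-pressure: interface must wait
        self.q.append(addr)
        return True

    def pop(self):
        return self.q.popleft() if self.q else None
```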
This centralized, dynamic memory allocation strategy should handle
bursts well and ensure that lightweight flows still get serviced
during a burst event, which is good for handling HFT-like workloads.

## Results

A very good evening of coding. I finished the following:

1. Reworked most of the hub's logic, implemented the RX side of
   things, and left some TODO notes.
2. Implemented the `free_queue` for allocating free slots for
   incoming packets and enqueueing slots freed by the TX-side logic.
3. Implemented the `memory_pool` for the packet memory.
4. Wrote the first draft of the FLORA/ROSE coding style guide.

## Reflections

1. Focus. Focus is the key to getting what you really want.
2. Modularize. Modularization keeps the work limited to manageable
   chunks, which is much more important when developing alone.
3. Write everything down. Keep track of every thought, whether in
   handwritten notes, documentation, or even ChatGPT conversation
   history. This helps when there are a few dozen things to keep in
   mind every day.
4. Start doing things. Start writing down thoughts, start
   discussions about future plans, start coding. Start a momentum,
   and keep it alive.

## Final thoughts

FPGAs are great tools, and I've only begun to scratch the surface of
them. Take implementing BRAM-based queues: I had to think about how
to synchronize all the components so that everything I need would be
ready exactly when I want it.

I feel like I'm beginning the transformation from a sequential
thinker who thinks in steps into a clock-aligned combinational
thinker - I think about when each step would happen, not in what
order, but at what time.

Also, explicitly recognizing the hidden semantics of `logic`
implying ownership (a signal should have exactly one driver) helped
me structure my code better.

## Next steps

Complete the hub, then move on to the interfaces.