Redesigned Internal Memory Pool

Date: 2025-05-28

Goals and expectations

Lay the foundations in the hub for the new memory pool design.

Thought train

We're putting aside support for AI clusters and focusing on the HFT side of things for the time being. They have very different network workloads, and a single system that combines both is not really a good option.

This also means we can drop the plans for congestion control, since in HFT infrastructure that is mostly done at the application layer, not the network layer. This will speed up development and let me focus on getting the most essential parts to function as intended.

A single block of BRAM typically allows only two simultaneous operations, and only to non-conflicting addresses. This means that servicing multiple interfaces at the same time is impractical.

So, I decided to implement a round-robin approach for both reads and writes (it feels like being back in the same spot as a few weeks ago, but there are some differences).

The approach is quite straightforward: every cycle, the hub selects one of the interfaces to service, and while servicing it, checks for both RX-side and TX-side transmissions.
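The selection scheme described above can be sketched as a small behavioral model. This is not the hub's actual HDL, only a Python illustration; the names `Interface`, `Hub`, `rx_pending`, `tx_pending`, and `tick` are hypothetical. It assumes the hub advances to the next interface every cycle whether or not the current one had work, and that at most one RX operation and one TX operation are handled per cycle.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Interface:
    """Hypothetical stand-in for one network interface attached to the hub."""
    name: str
    rx_pending: deque = field(default_factory=deque)  # data waiting to be written to memory
    tx_pending: deque = field(default_factory=deque)  # addresses waiting to be read out


class Hub:
    """Services exactly one interface per clock cycle, in fixed rotation."""

    def __init__(self, interfaces):
        self.interfaces = interfaces
        self.turn = 0  # index of the interface serviced in the current cycle

    def tick(self):
        """Model one clock cycle and return the operations performed."""
        iface = self.interfaces[self.turn]
        serviced = []
        # Check both directions for the selected interface.
        if iface.rx_pending:
            serviced.append(("rx", iface.name, iface.rx_pending.popleft()))
        if iface.tx_pending:
            serviced.append(("tx", iface.name, iface.tx_pending.popleft()))
        # Assumption: advance unconditionally, so each of N interfaces
        # gets one slot every N cycles regardless of load.
        self.turn = (self.turn + 1) % len(self.interfaces)
        return serviced
```

With two interfaces, calling `tick()` repeatedly alternates between them, so an idle interface still consumes its slot. That wastes some cycles but keeps the timing of each interface's turn fixed and predictable.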

Note that it's up to the interfaces to keep track of when a packet has been completely received, while collecting the freed slots is left to the hub.

The interfaces will also have their own packet address queues to keep track of their outgoing packets. Furthermore, each of these queues can be limited to a fraction of the packet queue's size, which gives control over the maximum number of packets per packet buffer.

This centralized, dynamic memory allocation strategy should handle bursts well and ensure that lightweight flows are still serviced during a burst event, which is good for HFT-like workloads.
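To make the allocation idea concrete, here is a minimal behavioral sketch in Python of a shared slot pool with a central free queue and capped per-interface address queues. It is an illustration under assumptions, not the actual design: the class and method names are hypothetical, and the policy of dropping a packet when no slot is free or the destination's queue is full is an assumption, since the drop behavior is not specified above.

```python
from collections import deque


class MemoryPool:
    """Shared packet slots with a central free queue (behavioral sketch)."""

    def __init__(self, num_slots, num_interfaces, max_queued_per_interface):
        self.free_queue = deque(range(num_slots))  # slot addresses not in use
        self.slots = [None] * num_slots            # packet storage
        # One outgoing address queue per interface, capped below the pool size.
        self.tx_queues = [deque() for _ in range(num_interfaces)]
        self.max_queued = max_queued_per_interface

    def receive(self, packet, dest):
        """Store a packet and queue its address for `dest`.

        Returns False if the packet is dropped (assumed policy).
        """
        if not self.free_queue or len(self.tx_queues[dest]) >= self.max_queued:
            return False
        addr = self.free_queue.popleft()
        self.slots[addr] = packet
        self.tx_queues[dest].append(addr)
        return True

    def transmit(self, iface):
        """Send the next packet for `iface` and return its slot to the free queue."""
        if not self.tx_queues[iface]:
            return None
        addr = self.tx_queues[iface].popleft()
        packet, self.slots[addr] = self.slots[addr], None
        self.free_queue.append(addr)
        return packet
```

For example, with 4 slots and a cap of 2 per interface, a burst toward interface 0 can occupy at most 2 slots, so a single packet for interface 1 arriving during the burst still finds a free slot.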

Results

A very good evening of coding. I finished the following:

Reworked most of the hub's logic, implemented the RX side of things and left some TODO notes.
Implemented the free_queue, which allocates free slots to incoming packets and enqueues slots freed by the TX-side logic.
Implemented the memory_pool for the packet memory.
Wrote the first draft of the FLORA/ROSE coding style guide.

Reflections

Focus. Focus is the key to getting what you really want.
Modularize. Modularization will keep the work limited to more manageable chunks, which is much more important when developing alone.
Write everything down. Keep track of every thought, whether in handwritten notes, documentation, or even ChatGPT conversation history. This will help when there are a few dozen things to keep in mind every day.
Start doing things. Start writing down thoughts, start discussions about future plans, start coding. Start a momentum, and start keeping it alive.

Final thoughts

FPGAs are great tools, and I've only begun to scratch the surface of them. Take implementing BRAM-based queues: I have to think about how to sync all the components so that everything I need is ready exactly when I want it.
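As a small illustration of the timing concern, the following Python sketch models a memory with a registered read, where data for an address appears one cycle after the address is presented. The one-cycle latency and the read-before-write behavior on a same-cycle conflict are assumptions chosen for the example (both are common block RAM configurations, but the actual settings used here are not stated); `SyncBram` and `tick` are hypothetical names.

```python
class SyncBram:
    """Memory whose read data appears one tick after the address is presented."""

    def __init__(self, depth):
        self.mem = [0] * depth
        self.rdata = None  # output register; holds the result of the previous read

    def tick(self, raddr=None, waddr=None, wdata=None):
        """Model one clock edge with an optional read and an optional write."""
        # Assumption: read-first mode, so a read sees the contents
        # from before this cycle's write to the same address.
        next_rdata = self.mem[raddr] if raddr is not None else self.rdata
        if waddr is not None:
            self.mem[waddr] = wdata
        self.rdata = next_rdata


bram = SyncBram(depth=4)
bram.tick(waddr=1, wdata=42)          # cycle 0: write 42 to address 1
bram.tick(raddr=1)                    # cycle 1: present read address 1
first = bram.rdata                    # data is available only after that edge
bram.tick(raddr=1, waddr=1, wdata=7)  # cycle 2: read and write the same address
conflict = bram.rdata                 # read-first: still the old value
```

Any logic that consumes `rdata` has to be scheduled one cycle after the address is issued, which is the kind of alignment a queue built on top of such a memory needs to account for.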

I feel like I'm beginning the transformation from a sequential thinker who thinks in steps into a clock-aligned combinational thinker: I think about when each step happens, not in terms of what order, but in terms of what time.

Also, making explicit the hidden rule that logic implies ownership helped me structure my code better.

Next steps

Complete the hub, then move on to the interfaces.
