# Redesigned Internal Memory Pool

Date: 2025-05-28

## Goals and expectations

Lay the foundations in the hub for the new memory pool design.

## Thought train

We're putting aside the support for AI clusters and focusing on the
HFT side of things for the time being. They have very different
network workloads, and one system that combines them both is not
really a good option.

This also means we can remove the plans for congestion control;
that's mostly done at the application layer, not the network layer,
in HFT infrastructure. This will speed up development and let me
focus on getting the most essential parts to function as intended.

A single block of BRAM typically allows only two simultaneous
operations, and only to non-conflicting addresses. This means that
servicing multiple interfaces at the same time is impractical.

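To make the port constraint concrete, here's a behavioral sketch in Python (not RTL; `DualPortBram` and its interface are illustrative stand-ins, not part of the actual design). It conservatively flags any same-address pair in one cycle as a conflict:

```python
class DualPortBram:
    """Behavioral model of a true dual-port BRAM: at most two
    operations per cycle, to non-conflicting addresses."""

    def __init__(self, depth: int):
        self.mem = [0] * depth

    def cycle(self, ops):
        # ops: list of ("read", addr) or ("write", addr, data)
        if len(ops) > 2:
            raise ValueError("a BRAM block exposes only two ports")
        addrs = [op[1] for op in ops]
        if len(set(addrs)) != len(addrs):
            # Conservative: treat any same-address pair as a conflict.
            raise ValueError("same-address conflict in one cycle")
        out = []
        for op in ops:
            if op[0] == "read":
                out.append(self.mem[op[1]])
            else:
                self.mem[op[1]] = op[2]
        return out
```

The exact conflict rules depend on the vendor's BRAM primitive and port modes; the model only captures the "two ports per cycle" limit that makes servicing many interfaces from one block impractical.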
So, I decided to implement a round-robin approach for both reads and
writes (it feels like being back at the same spot as a few weeks ago,
but there are some differences).

The approach is quite straightforward: every cycle, the hub selects
one of the interfaces to service, and while servicing it, it checks
for both RX-side and TX-side transmissions.

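The per-cycle selection above can be sketched in Python as a simple round-robin pointer; `RoundRobinHub`, `Iface`, and the pending flags are made up for the example, not taken from the actual RTL:

```python
class Iface:
    """Illustrative stand-in for an interface with pending work."""

    def __init__(self, name, rx=False, tx=False):
        self.name, self.rx_pending, self.tx_pending = name, rx, tx


class RoundRobinHub:
    """Each cycle, grant exactly one interface, advancing the
    pointer; while servicing it, check both the RX and TX sides."""

    def __init__(self, interfaces):
        self.interfaces = interfaces
        self.ptr = 0

    def cycle(self):
        iface = self.interfaces[self.ptr]
        self.ptr = (self.ptr + 1) % len(self.interfaces)
        serviced = []
        if iface.rx_pending:
            serviced.append("rx")
            iface.rx_pending = False
        if iface.tx_pending:
            serviced.append("tx")
            iface.tx_pending = False
        return iface.name, serviced
```

In hardware the pointer advance and the RX/TX checks would all happen in the same clock cycle; the sequential Python just makes the ordering of grants visible.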
Note that it's up to the interfaces to keep track of when a packet
has been fully received, but it's left to the hub to collect the
free slots.

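A minimal sketch of the slot bookkeeping, assuming a plain FIFO of slot indices (the class and method names are illustrative; the real `free_queue` is BRAM-backed RTL): RX allocation pops a free slot, and the TX side pushes a slot back once its packet is gone.

```python
from collections import deque


class FreeQueue:
    """Free-slot FIFO: starts holding every packet slot; the RX side
    allocates from the front, the TX side returns freed slots."""

    def __init__(self, num_slots):
        self.q = deque(range(num_slots))

    def alloc(self):
        # RX side: grab a free slot for an incoming packet, if any.
        return self.q.popleft() if self.q else None

    def free(self, slot):
        # TX side: return a slot once its packet has been sent.
        self.q.append(slot)
```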
The interfaces will have their own packet address queues to keep
track of their outgoing packets. Furthermore, each queue can be
limited to a fraction of the packet queue's size, giving control
over the maximum number of packets per packet buffer.

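A sketch of that capping idea, with an illustrative `fraction` parameter (the actual fraction and back-pressure mechanism are design choices not specified here): a full queue rejects the push, so no single interface can claim the whole packet buffer.

```python
from collections import deque


class PacketAddressQueue:
    """Per-interface queue of outgoing packet addresses, capped at a
    fraction of the total slot count."""

    def __init__(self, total_slots, fraction=0.25):
        self.capacity = max(1, int(total_slots * fraction))
        self.q = deque()

    def push(self, addr):
        if len(self.q) >= self.capacity:
            return False  # back-pressure: interface must wait
        self.q.append(addr)
        return True

    def pop(self):
        return self.q.popleft() if self.q else None
```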
This centralized, dynamic memory allocation strategy should handle
bursts well and ensure that lightweight flows still get serviced
during a burst event, which is good for handling HFT-like workloads.

## Results

A very good evening of coding. I finished the following:

1. Reworked most of the hub's logic, implemented the RX side of
   things, and left some TODO notes.
2. Implemented the `free_queue` for allocating free slots for
   incoming packets and enqueueing slots freed by the TX-side logic.
3. Implemented the `memory_pool` for the packet memory.
4. Wrote the first draft of the FLORA/ROSE coding style guide.

## Reflections

1. Focus. Focus is the key to getting what you really want.
2. Modularize. Modularization keeps the work limited to manageable
   chunks, which is much more important when developing alone.
3. Write everything down. Keep track of every thought, whether in
   handwritten notes, documentation, or even ChatGPT conversation
   history. This helps when there are a few dozen things to keep in
   mind every day.
4. Start doing things. Start writing down thoughts, start
   discussions about future plans, start coding. Start a momentum,
   and keep it alive.

## Final thoughts

FPGAs are great tools, and I've only begun to scratch the surface of
them. Take implementing BRAM-based queues: I had to think about how
to synchronize all the components so that everything I need would be
ready exactly when I want it.

I feel like I'm beginning the transformation from a sequential
thinker who thinks in steps into a clock-aligned combinational
thinker - I think about when each step would happen, not in what
order, but at what time.

Also, explicitly recognizing the hidden semantics of `logic`
implying ownership (a signal should have exactly one driver) helped
me structure my code better.

## Next steps

Complete the hub, then move on to the interfaces.