WORK IN PROGRESS. Revamped the files and naming, so git is a bit confused
mem_hub.sv -> hub.sv, spi_slave.sv -> interface.sv
devlog/2025-05-17-Routing-logic.md
@ -0,0 +1,120 @@
# Routing Logic

Date: 2025-05-17

## Goals and expectations

I've mostly recovered from the flu, so I expected to get some work done.
The first goal is to complete the core routing logic. Congestion is
still not considered: if a TX buffer is full, the data is simply
dropped silently.
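
The drop-on-full behaviour can be pictured with a small software model. This is only a sketch in Python, not the RTL: the class name `DroppingTxBuffer`, the `depth` parameter, and the `dropped` counter are hypothetical and exist purely for illustration.

```python
from collections import deque


class DroppingTxBuffer:
    """Bounded FIFO that silently discards writes when it is full."""

    def __init__(self, depth):
        self.depth = depth
        self.fifo = deque()
        self.dropped = 0  # illustration only; the design drops silently

    def push(self, byte):
        # No back-pressure: a write to a full buffer is simply lost.
        if len(self.fifo) >= self.depth:
            self.dropped += 1
            return False
        self.fifo.append(byte)
        return True

    def pop(self):
        # Oldest byte first, or None when the buffer is empty.
        return self.fifo.popleft() if self.fifo else None


buf = DroppingTxBuffer(depth=2)
results = [buf.push(b) for b in (0x10, 0x11, 0x12)]  # third write is dropped
```

The sender gets no indication that the third byte was lost, which is the trade-off accepted for now in exchange for simpler logic.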

## Thought train

The RX queue should also be able to send high-priority messages to the
interface when it is congested because the routing logic is congested.
Congestion management methods could even be developed based on how
full the RX queue is relative to the packet sizes and the number of
connected devices.

## Results

### Trivial (not really)

Because my younger self blindly used `verilog-mode`'s default 3-space
indentation, the files have been reworked to use 4 spaces for
indentation. I also renamed some files and modules.

Indentation matters in good code, and 3 spaces is probably more evil
than using tabs.

### Completed routing logic

I originally had the idea of streaming packets directly without any
buffer involved, but to simplify the round-robin logic (and potentially
allow multiple streams of data), I went with a small buffer that
absorbs 1 byte from each interface.

I also rethought the routing logic again: all interfaces can send
incoming data to the routing logic, so that I don't have to deal with
sync-related issues on the RX side of things, and the service buffer
lives inside the routing logic. It works as follows:

On the RX side, if the buffer for that interface (1 byte) is full or
about to be filled, `rx_ready` is turned off. If the buffer is empty,
the data received from the interface is moved into the service buffer.

On the TX side, I implemented a round-robin approach that services only
one buffer at any given time. If the destination is ready, the byte is
sent and `rx_ready` is set to true. Note that because only one TX queue
is serviced at any given time, I have to update the `tx_valid` bit of
the last destination to avoid sending duplicates.
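
The RX and TX behaviour described above can be sketched as a cycle-level Python model. This is not the actual `hub.sv` logic: the class name `RouterModel`, the `step` interface, the `(dest, byte)` tuple format, and the choice to advance the round-robin pointer on every cycle are all assumptions made for illustration.

```python
class RouterModel:
    """Behavioural sketch: one 1-byte service buffer per source port."""

    def __init__(self, ports=4):
        self.ports = ports
        self.service = [None] * ports    # each slot holds (dest, byte) or None
        self.rr = 0                      # round-robin pointer (assumed to move every cycle)
        self.tx_valid = [False] * ports
        self.tx_data = [0] * ports
        self.last_dest = None            # destination serviced in the previous cycle

    def step(self, rx_inputs, tx_ready):
        """Advance one clock cycle.

        rx_inputs: {src_port: (dest_port, byte)} offered this cycle
        tx_ready:  list of bools, one per destination port
        Returns (accepted_sources, delivered) where delivered is (dest, byte) or None.
        """
        # rx_ready is sampled from the state at the start of the cycle.
        rx_ready = [slot is None for slot in self.service]

        # Drop tx_valid of the last destination so its byte is not sent twice.
        if self.last_dest is not None:
            self.tx_valid[self.last_dest] = False
            self.last_dest = None

        # TX side: service exactly one buffer per cycle.
        delivered = None
        slot = self.service[self.rr]
        if slot is not None and tx_ready[slot[0]]:
            dest, byte = slot
            self.tx_data[dest] = byte
            self.tx_valid[dest] = True
            self.last_dest = dest
            self.service[self.rr] = None   # frees the slot, so rx_ready rises next cycle
            delivered = slot
        self.rr = (self.rr + 1) % self.ports

        # RX side: only ports whose buffer was empty may hand over a byte.
        accepted = []
        for src, item in rx_inputs.items():
            if rx_ready[src]:
                self.service[src] = item
                accepted.append(src)
        return accepted, delivered


router = RouterModel()
all_ready = [True] * 4
log = []
log.append(router.step({0: (1, 0xAA), 2: (1, 0xBB)}, all_ready))
log.append(router.step({0: (3, 0xCC)}, all_ready))  # port 0 is still full, so this is refused
for _ in range(3):
    log.append(router.step({}, all_ready))
```

In this run the byte from port 2 is delivered in cycle 2 and the byte from port 0 in cycle 4, because the pointer visits each buffer once every 4 cycles. If a destination is not ready, the byte stays in its service buffer and is retried on the pointer's next visit.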

#### Potential problem

The `rx_ready` design is currently under evaluation, and I have two
approaches in mind. The safe one is to assign `rx_ready = ~in_buffer`,
which would definitely prevent an interface from sending when the
buffer isn't ready. But it would mean skipping cycles in cases where
one interface could stream directly to another.

The other option is to turn off `rx_ready` only when the interface is
trying to write to a full buffer, while the incoming byte stays in a
register, which enables continuous streaming.

However, since any specific buffer is polled only once every 4 cycles,
skipping 1 cycle on the RX side of one interface is trivial. So I went
with the first approach.

This did give me inspiration for another thing: I can allow a direct
stream mode so that one device can just stream to another. Better yet,
I can use a shared pool of memory to avoid any kind of streaming,
although that would significantly complicate the logic involved and
reduce queue size flexibility.
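
The reasoning behind picking the safe `rx_ready` approach can be checked with a rough occupancy model. This is a simplification under stated assumptions: the function `delivered_bytes` and its parameters are hypothetical, the sender is assumed to always have data, and the second approach is approximated as one extra byte of storage (the buffer plus the register holding the incoming byte).

```python
def delivered_bytes(capacity, poll_period, cycles=16):
    """Count bytes drained from one interface's buffer over a number of cycles.

    capacity=1 models rx_ready = ~in_buffer (the safe approach).
    capacity=2 stands in for the second approach (buffer + holding register).
    poll_period is how often the routing logic visits this buffer.
    """
    occupancy = 0
    delivered = 0
    for cycle in range(cycles):
        rx_ready = occupancy < capacity                    # sampled at the start of the cycle
        drain = occupancy > 0 and cycle % poll_period == 0 # router services this buffer
        accept = rx_ready                                  # sender always has a byte waiting
        occupancy += int(accept) - int(drain)
        delivered += int(drain)
    return delivered


direct_safe = delivered_bytes(capacity=1, poll_period=1)   # 8
direct_skid = delivered_bytes(capacity=2, poll_period=1)   # 15
polled_safe = delivered_bytes(capacity=1, poll_period=4)   # 3
polled_skid = delivered_bytes(capacity=2, poll_period=4)   # 3
```

When the buffer is drained every cycle, the safe approach roughly halves the throughput. When the buffer is only visited once every 4 cycles, both approaches deliver the same number of bytes, so the extra register buys nothing in the round-robin case.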

### BRAM access

Exciting stuff, finally getting into BRAM land. A 1-cycle delay is
acceptable when the logic itself is running faster than the interface.
I learned how to access BRAM safely (within the same clock domain, of
course, which is why there's an RX buffer) and wrote some logic for
the RX buffer (incomplete).
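
The 1-cycle read delay can be illustrated with a tiny model of a synchronous-read memory. This is a sketch, not a description of any particular FPGA primitive: the class `SyncReadBram` and its `clock` method are hypothetical, and read-before-write ordering is assumed here, whereas real block RAM behaviour on a simultaneous read and write to the same address depends on the device and its configuration.

```python
class SyncReadBram:
    """Memory whose read data appears one clock edge after the address."""

    def __init__(self, depth):
        self.mem = [0] * depth
        self.rdata = 0  # output register, updated only on a clock edge

    def clock(self, raddr, we=False, waddr=0, wdata=0):
        # One rising edge: register the read first, then commit the write
        # (read-before-write is an assumption of this sketch).
        self.rdata = self.mem[raddr]
        if we:
            self.mem[waddr] = wdata


bram = SyncReadBram(depth=4)
bram.clock(raddr=0, we=True, waddr=2, wdata=0x5A)  # write 0x5A to address 2
before = bram.rdata                                # still the old contents of address 0
bram.clock(raddr=2)                                # present address 2
after = bram.rdata                                 # 0x5A is visible only after this edge
```

Any logic that consumes `rdata` therefore has to present the address one cycle before it needs the data.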

## Reflections

1. Trade-offs are being made. If I construct more complex logic, then
   I can eliminate the need for central routing logic for data, but
   there are a few catches:
   1. The memory management would be more complex. It would allow
      more flexible memory allocation and handle bursts better, but
      it would also mean having 4 smaller queues inside a bigger
      memory pool and using a memory collection queue to keep track
      of which buffers are empty.
   2. If one interface is congested, the entire fabric is probably
      going to be congested.
   - As always, there's the design of using reserved queues for each
     interface and a shared central buffer for handling bursts. BUT
     WITH EVEN MORE COMPLEXITY!
2. Reworking the design is acceptable, but I should still keep track
   of all of my ideas just in case I want to go back to them one day.
   A lot of things came up as I gathered my thoughts for this devlog,
   combining unimplemented ideas with my current implementation. Best
   to save this devlog for future reference.
3. FPGAs are restrictive, but as I dug deeper into constructing logic
   for them, I felt as inspired as when I first found out that
   programming is like teaching a child to do everything as explicitly
   and as accurately as you can.
4. Ideas are cheaper than implementation, but that doesn't mean they'll
   stay. Keep track of the ideas.
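
The shared-pool idea from reflection 1 can be sketched as a free list plus per-interface queues of block indices. This is a software model of an unimplemented idea, not a design: the class `SharedPool`, its method names, and the one-byte block size are assumptions made for illustration.

```python
from collections import deque


class SharedPool:
    """Four logical queues living inside one shared block of memory."""

    def __init__(self, num_blocks, num_queues=4):
        self.storage = [None] * num_blocks
        self.free = deque(range(num_blocks))                 # the "memory collection queue"
        self.queues = [deque() for _ in range(num_queues)]   # block indices per interface

    def enqueue(self, queue_id, byte):
        # Any queue may take any free block, which is what absorbs bursts.
        if not self.free:
            return False
        idx = self.free.popleft()
        self.storage[idx] = byte
        self.queues[queue_id].append(idx)
        return True

    def dequeue(self, queue_id):
        if not self.queues[queue_id]:
            return None
        idx = self.queues[queue_id].popleft()
        self.free.append(idx)                                # hand the block back to the pool
        return self.storage[idx]


pool = SharedPool(num_blocks=3)
burst = [pool.enqueue(0, b) for b in (1, 2, 3)]   # queue 0 takes the whole pool
starved = pool.enqueue(1, 9)                      # queue 1 finds no free block
first = pool.dequeue(0)                           # frees one block
retry = pool.enqueue(1, 9)                        # now succeeds
```

The example shows both sides of the trade-off: one queue can absorb a burst larger than a fixed share, but while it does so the other queues are starved, which matches the concern that congestion on one interface spreads to the whole fabric.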

## Lessons learned

1. Respect the hardware. Get to know it better, like how BRAM access
   has a 1-cycle delay and how non-BRAM variables will use up your
   LUTs.
2. There are multiple ways to do things. Weigh them carefully and
   decide what to do with them: they can be ditched, implemented, or
   saved for the future.
3. Rethink and connect. One design choice can lead to another, and one
   idea can be combined with another. Go back to previous thoughts
   and think about how you can refine the current implementation by
   taking a page out of those past books.

## Final thoughts

As I continue working on ROSE, I see more of its potential, and I can
see that I'm taking steps toward realizing much of it.

I might write down all of my ideas for someone (perhaps me in a more
distant future) to implement.

## Next steps

Complete the RX and TX queues, and test them on a testbench.