initial commit: figuring out SPI on the Tang Primer 20K, already 3 devlogs, will commit on a per-devlog/document change basis

2025-05-11 00:35:24 -04:00
commit 24bf28db9d
8 changed files with 549 additions and 0 deletions


@ -0,0 +1,120 @@
# SPI slave implementation on the Tang Primer 20K
Date: 2025-05-03
## Goals and expectations
Today's goal is to get some simple SPI modules running on the Tang
Primer 20K that receive bits from a master (a Raspberry Pi) and send
them back, optionally incremented by one.
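
As a rough reference for what the master should see, here is a minimal
byte-level sketch in Python. It is not the RTL: `expected_miso`, its
`increment` flag, and the `idle` first byte are hypothetical names and
assumptions made for illustration. The one thing it relies on is a
property of full-duplex SPI itself: the slave cannot answer a byte
before it has received it, so the reply to byte N can only arrive while
byte N+1 is being clocked out.

```python
def expected_miso(mosi_bytes, increment=True, idle=0x00):
    """Bytes the master should clock in while clocking out mosi_bytes.

    Assumptions (not from the design): the slave sends `idle` during the
    very first byte, and incrementing wraps around from 0xFF to 0x00.
    """
    step = 1 if increment else 0
    replies = [(b + step) & 0xFF for b in mosi_bytes]
    # Shift everything one byte later: reply N is seen during byte N+1.
    return bytes(([idle] + replies)[:len(replies)])


print(expected_miso(b"ABCD"))
```

To read back the reply to the last byte, the master would have to clock
out one extra padding byte.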
### Before diving in
- I have not formally learned SystemVerilog yet, and perhaps I won't
until the early stages of THORN, since simulation is one of its
concerns.
- Today's development will follow a learn-as-I-go model. The point is
to quickly get my hands dirty playing with the FPGA and related things,
and to give myself some positive feedback after laying out such a big
plan and honestly getting a bit scared by the details (even though I
set pretty loose deadlines).
- Hopefully I can figure out how to run the vendor's programmer on
Linux as well, and hopefully the toolchain offers command-line access;
otherwise I would have to ship the synthesized binary in the repository
so that I can automate flashing it.
## Results
We didn't get to pushing the logic to the FPGA, but the design did
pass simulation in `verilator`, with a testbench (`tb`) generating the
master's signals. This is a great step forward, as I implemented it
with a buffer in mind, which would correspond to buffering an entire
ROSE packet in memory.
## Reflections
1. Use elegant solutions. If something is ugly, it should not have
been working in the first place.
2. Programming in SystemVerilog is very different from your normal,
sequential programming languages, especially with non-blocking
assignments.
- Non-blocking assignments (`<=`) evaluate their right-hand sides with
the values from before the clock edge, and all of the left-hand sides
are updated together at the end of the time step.
- For example: if I want to update a buffer with the incremented
received byte upon reception of the 8th bit, I have to write
`tx_buff <= {rx_shift[6:0], mosi} + 1` instead of
`tx_buff <= rx_shift + 1`, because `rx_shift` hasn't been updated yet
in that cycle. It took me a while to figure out that the latter
updates the buffer with stale data.
3. **ALWAYS** test with longer tests, and every bit matters.
- For starters, my initial (incorrect) logic ran fine against a
single byte of data, but I didn't notice that I was actually
updating the buffer on reception of the 9th bit. It quickly fell
apart when I tested it against something like "HELLO" and got
gibberish back. It was at that moment I knew that I had incorrect
timing for updating the buffer.
- Then came the next problem with "HELLO": the "fixed" version
returned results that were incremented by **2**. It was tempting to
just decrement by 1 when updating `tx_shift` and when sending out the
first bit of the byte (since `tx_shift` isn't updated at that time,
that bit has to be drawn from `tx_buff`), or to simply not add the 1
to `tx_buff`. But fixing the logic mattered more, and that's when I
noticed that they all had 0's at their second-to-last bit for the
entire string. Yup, another sync issue. So I moved on to testing
with something with more coverage like "ABCD", which covers the
parity.
4. Plan out what to do at each clock edge. Clocks are confusing with
combinational logic, especially paired with SPI's use of both the
rising and falling edges. It took me a while to realize that the
rising edge happens **before** the falling one, which means that
anything written on the rising edge has already been updated by the
time the falling-edge logic runs.
5. Use `$display` to poke around the bits and bytes. An oscilloscope
might be even better, and I should probably set one up. Displaying
debug messages has helped me catch many errors in my logic
(completely avoidable, just a newbie programmer's fault).
6. ChatGPT might not be the best tool to help. I think it did well
helping me plan this project, but not in helping me with the actual
code (according to my philosophy, it should never even try). It
tried to write a 2FF synchronizer for the slave module, which is
completely unnecessary since the slave is clocked directly by the
master's clock. It would be helpful later on, though, for moving
data between clock domains.
- Do your own research, and use ChatGPT for suggestions and for
analyzing error messages (thanks to `verilator` for giving me good
error messages and kidnapping me like a Rust compiler).
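
The non-blocking pitfall from reflection 2 can be reproduced in a few
lines of Python. This is a software model under assumptions, not the
actual module: the names `clock_edge` and `run`, the `bit_cnt` counter,
MSB-first bit order, and registers starting at zero are all hypothetical.
Only the two candidate assignments to `tx_buff` come from the text above.
Each call computes every next value from the old state and commits them
together, which is what `<=` does on a clock edge.

```python
def clock_edge(state, mosi, stale):
    """Model one sampling edge of SCLK with non-blocking semantics."""
    rx_shift, bit_cnt, tx_buff = state          # OLD values only
    shifted = ((rx_shift << 1) | mosi) & 0xFF   # {rx_shift[6:0], mosi}
    next_tx = tx_buff
    if bit_cnt == 7:                            # this edge delivers the 8th bit
        if stale:
            next_tx = (rx_shift + 1) & 0xFF     # tx_buff <= rx_shift + 1
        else:
            next_tx = (shifted + 1) & 0xFF      # tx_buff <= {rx_shift[6:0], mosi} + 1
    # Commit all registers at once, like the end of the time step.
    return (shifted, (bit_cnt + 1) % 8, next_tx)


def run(data, stale):
    """Shift bytes in MSB first and record tx_buff after each byte."""
    state = (0, 0, 0)
    out = []
    for byte in data:
        for i in range(7, -1, -1):
            state = clock_edge(state, (byte >> i) & 1, stale)
        out.append(state[2])
    return out


print(bytes(run(b"ABCD", stale=False)))                 # b'BCDE'
print([hex(b) for b in run(b"ABCD", stale=True)])
```

In the stale variant the 8th bit is missing and the last bit of the
previous byte is still sitting in the top of `rx_shift`, so a
multi-byte input exposes the bug through garbage carried over between
bytes.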
## Final thoughts
SystemVerilog is confusing. Combinational logic is confusing.
Designing logic and tests within that framework is confusing.
Fuddling with them befuddled me. But it's sweet to see some positive
feedback after all that planning. It's a great start (even if it's
just some simple receive-and-send-back logic), I feel like I'm
actually starting to learn the ropes here, something that I could
never have learned by reading online tutorials or some reference book.
Even though most of today was me shooting myself in the foot with a
poor understanding of the principles behind SV, it helped me grasp how
the language and what it produces work. It felt like opening myself
to a new domain, where everything can be run all at once, something
achievable in Python or C only by running different threads and
using mutexes to prevent data corruption. It felt like a leap of
faith into abstracting the logic gates but keeping the logic alive. I
still have faith that ROSE will succeed, along with THORN and PETAL.
I still set up a testbench in ROSE, but once a working prototype is
complete, it will be migrated/integrated into THORN.
For now, let the rose sprout on its own and let it gain the momentum
to grow its stems and leaves...
## The next step
Dump the logic onto the FPGA, and see how it responds to the Raspberry
Pi.
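
A possible Pi-side check for that first hardware test, sketched in
Python. Everything specific here is an assumption rather than something
from this devlog: the `spidev` package, bus 0 / device 0, SPI mode 0, a
500 kHz clock, and the helper names `check_echo`, `open_spi`, and `main`.
The check itself takes any transfer function, so it can be tried against
a fake slave before the FPGA is wired up.

```python
def check_echo(xfer, payload):
    """Send payload plus one padding byte and compare the replies.

    `xfer` takes a list of ints and returns the bytes clocked in, like
    spidev's xfer2. The reply to byte N is expected during byte N+1,
    so the first received byte is ignored.
    """
    tx = list(payload) + [0x00]              # padding flushes the last reply
    rx = list(xfer(tx))
    want = [(b + 1) & 0xFF for b in payload]
    return rx[1:] == want, rx


def open_spi(bus=0, device=0, speed_hz=500_000):
    """Open /dev/spidev<bus>.<device>; only works on the Pi itself."""
    import spidev                            # assumed to be installed on the Pi
    spi = spidev.SpiDev()
    spi.open(bus, device)
    spi.max_speed_hz = speed_hz
    spi.mode = 0                             # assumption: CPOL=0, CPHA=0
    return spi


def main():
    spi = open_spi()
    try:
        ok, rx = check_echo(spi.xfer2, b"ABCD")
        print("PASS" if ok else "FAIL", [hex(b) for b in rx])
    finally:
        spi.close()
```

Calling `main()` on the Pi runs one transfer; longer payloads are just a
different argument to `check_echo`.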
Then, try to set up a buffer inside the FPGA's registers to hold
packets. Perhaps try to receive a number of ROSE packets, and then
send them out in reverse order, with their contents modified (like
switching the source and destination fields).
Might as well enable UART dumps for debug messages; they would come
in handy once the logic is on the FPGA. This is optional for a "next
step", but a must in the near term.