initial commit: figuring out SPI on the Tang Primer 20K, already 3 devlogs, will commit on a per-devlog/document change basis

2025-05-11 00:35:24 -04:00
commit 24bf28db9d
8 changed files with 549 additions and 0 deletions
--- a/devlog/2025-05-03-SPI-slave-implementation.md
+++ b/devlog/2025-05-03-SPI-slave-implementation.md
@@ -0,0 +1,120 @@
+# SPI slave implementation on the Tang Primer 20K
+Date: 2025-05-03
+
+## Goals and expectations
+Today's goal is to get some simple SPI modules running on the Tang
+Primer 20K.  The modules should receive bits from a master (a
+Raspberry Pi) and send them back, optionally incrementing them by
+one.
+
+### Before diving in
+- I have not yet formally learned SystemVerilog, and perhaps I won't
+  until the early stages of THORN, since simulation is one of its
+  concerns.
+- Today's development will follow a learn-as-I-go model.  This is to
+  quickly get my hands dirty playing with the FPGA and related things
+  and to give myself some positive feedback after laying out such a
+  big plan and getting honestly a bit scared by the details (even
+  though I set pretty loose deadlines).
+- Hopefully, I can figure out how to run their programmer on Linux as
+  well, and hopefully they provide command-line access to their
+  toolchain; otherwise I would have to ship the synthesized binary in
+  the repository so that I can automate flashing it.
+  
+## Results
+We didn't get to pushing the logic to the FPGA, but we did pass the
+simulation in `verilator`, with a testbench (`tb`) generating the
+master signal.  This is a great step forward, as I implemented it
+with a buffer in mind, which would correspond to buffering an entire
+ROSE packet to memory.
+
+## Reflections
+1. Use elegant solutions.  If something is ugly, it should not have
+   been working in the first place.
+2. Programming in SystemVerilog is very different from your normal,
+   sequential programming languages, especially with non-blocking
+   assignments.
+   - Non-blocking assignments all read the pre-edge values first, and
+     only then do all of them update at once.
+   - For example: if I want to update a buffer with the result of
+     incrementing a received byte upon the reception of the 8th bit,
+     I would have to write it as `tx_buff <= {rx_shift[6:0], mosi} +
+     1` instead of `tx_buff <= rx_shift + 1`, because `rx_shift`
+     hasn't been updated yet in that cycle.  This actually took me a
+     while to figure out: I was updating the buffer with older
+     results when I used the latter.
+3. **ALWAYS** test with longer inputs; every bit matters.
+   - For starters, my initial (incorrect) logic ran fine against a
+     single byte of data, but I didn't notice that I was actually
+     updating the buffer at the reception of the 9th bit, which meant
+     that it quickly fell apart when I tested it against something
+     like "HELLO" and got gibberish back.  It was at that moment I
+     knew that I had incorrect timing for updating the buffer.
+   - Then came the next problem with "HELLO": I got results that
+     were incremented by **2** when I ran the "fixed" version.  It
+     was tempting to just decrement by 1 when I update `tx_shift`
+     and when sending out the first bit of the byte (since
+     `tx_shift` isn't updated at that time, it has to draw that bit
+     from `tx_buff`), or simply not add the 1 to `tx_buff`.  But
+     fixing the logic mattered more, and that's when I noticed that
+     they all had 0's at their second-to-last bit for the entire
+     string.  Yup, another sync issue.  So, I moved on to testing
+     with something with more coverage like "ABCD", which covers the
+     parity.
+4. Plan out what to do at each clock edge.  Clocks are confusing with
+   combinational logic, especially paired with SPI's use of both the
+   rising and falling edges.  It took me a while to realize that the
+   rising edge happens **before** the falling one, which means that
+   by the falling edge, the data from the rising edge has already
+   been updated.
+5. Use `$display` to poke around the bits and bytes.  Although an
+   oscilloscope might be even better, I should probably set that up.
+   Displaying debug messages has helped me catch many errors in my
+   logic (completely avoidable, just a newbie programmer's fault).
+6. ChatGPT might not be the best tool to help.  I think it did well
+   helping me plan this project, but not in helping me with the actual
+   code (according to my philosophy, it never should even try).  It
+   tried to write a 2FF synchronizer for the slave module, which is
+   completely unnecessary since we're clocking directly off the
+   master's clock.  That would be helpful later on, though, for
+   moving data across clock domains.
+   - Do your own research; use ChatGPT for suggestions and analyzing
+     the error messages (thanks to `verilator` for giving me good
+     error messages and kidnapping me like a Rust compiler).
+
+## Final thoughts
+SystemVerilog is confusing.  Combinational logic is confusing.
+Designing logic and tests within that framework is confusing.
+Fuddling with them befuddled me.  But it's sweet to see some positive
+feedback after all that planning.  It's a great start (even if it's
+just some simple receive-and-send-back logic).  I feel like I'm
+actually starting to learn the ropes here, something that I could
+never have learned by reading online tutorials or some reference book.
+
+Even though most of today was me shooting myself in the foot with a
+poor understanding of the principles behind SV, it helped me grasp how
+the language and what it produces work.  It felt like opening myself
+to a new domain, where everything can be run all at once, something
+achievable in Python or C only by running different threads and
+using mutexes to prevent data corruption.  It felt like a leap of
+faith into abstracting the logic gates but keeping the logic alive.  I
+still have faith that ROSE will succeed, along with THORN and PETAL.
+
+I still set up a testbench in ROSE, but after the completion of a
+working prototype, it will be migrated/integrated into THORN.
+
+For now, let the rose sprout on its own and let it gain the momentum
+to grow its stems and leaves...
+
+## The next step
+Dump the logic onto the FPGA, and see how it responds to the Raspberry
+Pi.
+
+Then, try to set up a buffer inside the FPGA's registers to hold
+packets.  Perhaps try to receive a number of ROSE packets, and then
+send them out in reverse order, with their contents modified (like
+switching the source and destination fields).
+
+Might as well enable UART dumps for debug messages; they would come
+in handy once I hand the logic to the FPGA.  This is highly optional
+for a "next step", but a must in the near term.
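
A note on reflection 2 of this devlog: the stale-`rx_shift` pitfall can be reproduced without a simulator. Below is a rough Python model of it, not the actual SystemVerilog from this commit. It assumes MSB-first transfers and a single clock edge per bit. The names `clock_edge`, `receive`, `bit_cnt`, and `use_stale_rx` are made up for illustration; only `rx_shift`, `tx_buff`, and `mosi` come from the devlog. The idea is that every right-hand side reads a snapshot of the pre-edge state, and all registers are committed together afterwards.

```python
def clock_edge(state, mosi, use_stale_rx):
    """Model one clock edge with non-blocking assignment semantics.

    Every right-hand side reads `state` (the pre-edge values). The
    returned dict holds the post-edge values, committed all at once.
    """
    nxt = dict(state)
    # Python stand-in for {rx_shift[6:0], mosi}
    shifted = ((state["rx_shift"] << 1) | mosi) & 0xFF
    nxt["rx_shift"] = shifted
    nxt["bit_cnt"] = (state["bit_cnt"] + 1) % 8
    if state["bit_cnt"] == 7:  # the 8th bit of the byte arrives on this edge
        if use_stale_rx:
            # like `tx_buff <= rx_shift + 1`: rx_shift still holds 7 bits
            nxt["tx_buff"] = (state["rx_shift"] + 1) & 0xFF
        else:
            # like `tx_buff <= {rx_shift[6:0], mosi} + 1`
            nxt["tx_buff"] = (shifted + 1) & 0xFF
    return nxt


def receive(data, use_stale_rx):
    """Clock `data` in MSB first; return tx_buff as seen after each byte."""
    state = {"rx_shift": 0, "bit_cnt": 0, "tx_buff": 0}
    out = []
    for byte in data:
        for i in range(7, -1, -1):
            state = clock_edge(state, (byte >> i) & 1, use_stale_rx)
        out.append(state["tx_buff"])
    return bytes(out)


if __name__ == "__main__":
    print(receive(b"ABCD", use_stale_rx=False))  # b'BCDE'
    print(receive(b"ABCD", use_stale_rx=True))   # does not match b'BCDE'
```

With `use_stale_rx=True` the model increments a value that is missing the newest bit, which is the same class of bug the devlog describes. Multi-byte inputs such as "HELLO" or "ABCD" expose it more clearly than a single byte does.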