I did some further investigating - it's apparently due to not having enough setup time on the RX PIO SM. Even though the PIO clocking is fixed at 100 MHz, there are CRC errors at the lower system clocks. I tried changing the delay in the PIO instruction that starts the RX sampling, but that only made things worse (as expected). I also tried disabling the input synchronizers, with no improvement.
Hmm, interesting. Am I understanding it correctly that you're doing some kind of reset on the RX PIO from regular C code, and the time for "RX finish -> interrupt CPU -> reset RX PIO" is longer than the gap between packets?
If so, might it be possible to use two RX PIOs, automatically starting the next one via inter-PIO IRQ when a packet is finished? That'd give you an entire packet receive time to reset the original PIO, which should be plenty.
Nothing nearly so complex. Here's the code in question:
.wrap_target
    irq set 0           ; signal end of active packet
start:
    wait 1 pin 2        ; wait for CRS_DV assertion
    wait 1 pin 0        ; wait for RX<0> to assert, signalling preamble start
    wait 1 pin 1 [2]    ; wait for Start of Frame Delimiter, align to sample clk
sample:
    in pins, 2          ; accumulate di-bits
    jmp pin sample      ; as long as CRS_DV is asserted
.wrap
It's run at a fixed 100 MHz, regardless of system clock speed, by clocking the PIO at a fraction of the system clock. So, for a 300 MHz system clock, the PIO is clocked once every three system clocks. I'm speculating that the extra two clocks (at 300 MHz) allow more setup time on the PIO inputs. The [2] above adds an extra two PIO clock delays before executing the next instruction. I tried changing this from zero to three at a 100 MHz system clock (i.e. a PIO clock divisor of one), and wasn't able to fix the problem. Though it should be noted that the LAN8742 isn't a very forgiving chip - I've seen RX Data Valid (DV) go metastable when the TX clock is interrupted/changed, so another pass through might be worthwhile.
BTW, Sandeep's original code clocked the RX PIO SM at 50 MHz, pushing all the samples to the output FIFO, and relied on the processor getting interrupted at the falling edge of DV to figure out which samples constituted a packet.
I expect the RP2350 to perform much better in this scenario! At minimum, one of the DMA channels should be eliminated, and I'm hoping the CRC calculation will get faster.
Fortunately, the receive ISR isn't cracking packets - it's just calculating a checksum and passing the packet on to LWIP. I wish there were two DMA sniffers, so that the checksum could be calculated by the DMA engine(s); that's where a lot of processor time is spent (even with a table-driven CRC routine).
You can do it using PIO. I did that for emulating a Memory Stick slave on the RP2040: one PIO SM plus two DMA channels with chained descriptors. XOR is achieved by writing through the atomic XOR alias (base + 0x1000) of any I/O register you don't need - the datasheet documents these register aliases.
Almost correct - the third implementation does generate the clock, but it isn't necessary to drive the clock directly from the system clock, as there are m/n clock dividers available. I use a 300 MHz system clock, and divide down to 50 MHz which works well. (I've also addressed a few other shortcomings of this library, but am not done yet...) Haven't looked at the 10 MHz half duplex mode, though.
Do you have your code publicly available? I was just discussing with a friend how it would be nice to add an optional Ethernet mode to my motor controller, but the limitations of this library and of other approaches reduce the appeal.
Also, are there any other approaches that might be better and would offer 100meg or even gigabit links with the RP2350? Thanks!
I'm still working on it - hopefully it will be on the DECstation2040 GitHub soon. The interface uses the LAN8720/8740 PHY chips, which provide 10/100 Mbit. (I haven't tried 10 Mbit, so no idea if it works or not.) FYI, here's a test result:
ping -q -i 0.005 192.168.1.6
PING 192.168.1.6 (192.168.1.6) 56(84) bytes of data.
229501/229501 packets, 0% loss, min/avg/ewma/max = 0.063/76.725/88.698/160.145 ms
295393/295987 packets, 0% loss, min/avg/ewma/max = 0.063/76.792/37.003/160.171 ms
295393/296278 packets, 0% loss, min/avg/ewma/max = 0.063/76.792/37.003/160.171 ms
(Above is with killall -s QUIT ping)
As you can see, it eventually hangs, and CMSIS reports:
target halted due to breakpoint, current mode: Handler HardFault
xPSR: 0x61000003 pc: 0x200001c4 msp: 0x20040f40
I'm assuming you've looked at the pico-rmii-ethernet library? If so, I feel your pain - I've been fixing issues, and am about halfway done. (This is for the DECstation2040 project, available on GitHub.) Look for a release in late Aug/early Sep. (Maybe with actual LANCE code? Dmitry??) The RP2350 will make RMII slightly easier - the endless DMA allows elimination of the DMA reload channel(s).
I looked at it and dismissed it as too hacky for production. I don't remember the real reason why; I'd have to look through my notes. The main question is whether the RP2350 will change that - as in, is it actually possible to do bug-free, without weird hacks?
The DMA Sniffer is part of the RP2040 DMA engine. It can perform a variety of functions on the DMA'd data, including CRCs, accumulation, bit-reversal, etc. I use the accumulation function to increment the address contained in the HyperRAM command packet after each scan line is DMA'd from HyperRAM to the scan line buffer. This allows re-use of the HyperRAM command packet, eliminating the need for a per-scan-line packet (three uint32s per scan line). That's important for the emulator, as it runs out of RP2040 RAM (vs. flash).
Thanks. This got me curious enough that I'm now reading the datasheet. It seems exciting enough that I may well choose it for my next project that requires a microcontroller. (When I don't want to use FPGAs, which I tend to overuse because I'm a big fan of them, and because cost margins usually don't matter for private projects.)
Maybe you'd like to help me get this over the finish line? I'm looking at a steep learning curve, trying to write a LANCE chip emulator. I have been able to get LWIP to run on the DECstation HW. (Though Rev 2.1 does exhibit some packet drops, which I'm looking at. Hopefully, there's some slight difference between the LAN8720a on the Waveshare boards and the LAN8742 on rev 2.1 that needs some software...)