Hi @BrianM
I have not gotten any of this to work in any version of u-boot. The "pci" command replies, "No such device." I have to, eventually, get it working because I wish to use a 2MB QSPI Flash, not a 16MB QSPI Flash to store u-boot, the FPGA and the kernel. I want the FPGA image and the kernel on the SSD.
Ideally we would have liked to access the NVME SSD from u-boot as well, but we were told early on by Intel (through a distribution) that for the Arria 10 this was simply not possible because u-boot did not support the PCI-E hard core in that device. There is support for the Stratix 10 PCI-E core; however, when I queried whether we could port this driver to the Arria 10 I was told that this would be very complex and involved. This may be the case for the Cyclone V as well, which could explain why the pci command fails.
A possible solution is to have a very cut down Linux image in flash, enough to get the PCI-E/NVME up, and then switch kernels with kexec, however we have failed to get kexec working on the Arria 10 also.
- Here are the entries from socfpga.dtsi that I use for the FPGA. This includes the ones already there. Note: the latest kernel has the other two included. Finding this was a significant part of getting things working for me. These entries, along with the PCIe modules, are added to socfpga.dtsi, then in your top level dts, they are enabled with status="okay".
I have similar bridge entries in my u-boot dts handoff file generated from the firmware project (Cyclone V may be different). Without these, any accesses to those memory regions just generate processor exceptions. They don't appear in the generated Linux device tree at all; I suspect they are not needed there, since u-boot only uses them to correctly enable the bridge access through the bridge enable command. I scrapped the PCI-E core device tree entry in the u-boot dts once we realised that it was not supported.
- Yes, I'm only dealing with linux at the moment. However, I have added it to u-boot and u-boot seems to respect it as well. It's a bit harder to verify since I don't have pci working in u-boot.
I tried adding the reserved-memory entry to the u-boot dts as well as Linux, but I still get the System RAM being reported as 00000000-7fffffff. I'm not sure how u-boot handles these entries in its dts; maybe it prevents it relocating itself into the reserved region when it copies to DRAM. I did trace through u-boot parsing the Linux dts and setting up the reserved sections for the kernel, rootfs and device tree (there were some others, not sure what they were) and could see my reserve from 0x00000000-0x00200000 being applied there.
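For anyone wanting to check the same thing, here is a minimal sketch of how the reported System RAM range could be compared against the reserve. It assumes the range quoted above comes from /proc/iomem (it may equally have come from the boot log), and the function names and the sample text are hypothetical, made up for illustration only.

```python
import re

# Matches top-level /proc/iomem lines such as "00000000-7fffffff : System RAM".
# Indented child lines (e.g. "Kernel code") are deliberately not matched.
_RAM_LINE = re.compile(r"^([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*System RAM\s*$")


def system_ram_ranges(iomem_text):
    """Return (start, end) tuples, end inclusive, for each System RAM line."""
    ranges = []
    for line in iomem_text.splitlines():
        match = _RAM_LINE.match(line)
        if match:
            ranges.append((int(match.group(1), 16), int(match.group(2), 16)))
    return ranges


def covers(ranges, start, end):
    """True if a single System RAM range fully contains start..end (inclusive)."""
    return any(lo <= start and end <= hi for lo, hi in ranges)


# Hypothetical sample text. On the target you would instead read the real
# file as root, e.g. open("/proc/iomem").read().
SAMPLE = (
    "00000000-7fffffff : System RAM\n"
    "  00208000-00bfffff : Kernel code\n"
)

if __name__ == "__main__":
    ram = system_ram_ranges(SAMPLE)
    # The reserve discussed above: 0x00000000 up to (not including) 0x00200000.
    print(covers(ram, 0x00000000, 0x001FFFFF))
```

If this prints True, the bottom 2MB is still listed inside System RAM, which matches what I see. That on its own does not prove the reserve is being ignored, since a reserved region can still sit inside the reported RAM range.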
- If you check out that other thread with folks trying to get PCIe SATA and PCIe NVMe drivers working, you can see the full extent of our understanding. It's not much. My theory is that the Root Port is writing to offset zero in the DRAM page. I attempted to get it to move that address via many and different means, from adding address_span_extenders with offsets to the TXs bus from the HPS, by adding an offset to the ASE between the root port and the DRAM, and setting dma-ranges in the dts.
I'm not a software expert. I barely know anything about PCIe but it's clear to me that the driver is not fully developed.
I am more on the software than firmware side, however any Linux kernel driver is still complex for me, as is PCI-E. I did take a quick look at the pcie-altera driver code, and it's a lot sparser than I expected. I did not see any references to it looking up dma-ranges from the dts, either explicitly or through of_pci_dma_range_parser_init() calls which several other drivers used, so you may be right with your assessment of the driver state.
- For Cyclone V I right-click the avalon-mm pci express module in Platform Designer and choose Edit. On the bottom of the page under "Avalon to PCIe Address Translation Settings" there are two entries: Set "Number of address pages" to 2 and set "Size of address pages" to "1 MByte - 20 bits". It must be different for the 10 series of FPGAs. You could try changing the "Avalon-MM address width" to 64 bit and see if it gives you more options. I don't know why they would limit it down to 64K when as I understand it has to be at least one megabyte.
Looks like I'm in the right place, although how I got there and the tab page names seem different, probably Cyclone V vs Arria 10 differences. As I said, the only choices for me with 2 pages are 12-16 and 22-32 bits, i.e. 4KB-64KB and 4MB-4GB. I did try setting the address width to 64-bit, and then the drop down lists disappear and only give me an address width text box, no page count. This is free form so I could set this to 20 or 21 bits, but then I got a load of instantiation warnings. I really don't understand what the significance of changing these would be; I would need to get the firmware guys to look into this.
- If you can't make it any smaller than 4 MB * 2, then you are going to have to find a way to move the kernel. I tried and was not successful. It seems to me that the ARM arch of the kernel doesn't want to set a start address.
- I had tried the textoffset method or something like it before, but I was actually trying to move the whole kernel... I did so much that I probably broke something else or it might have worked for me.
I had no real issues moving the kernel's offset and entry point. I did not modify u-boot, although you may need to specify different boot parameters I guess. I found that setting the start and entry addresses in the FIT image did not affect its final location, even though that's what u-boot would have read, I assume. All I did in the end was patch the arch/arm/Makefile to add an explicit textofs-$(CONFIG_ARCH_SOCFPGA) := 0x00208000; there are some other examples there for SA11xx StrongARM CPUs you can follow. Just for completeness I did set the entry and start in my FIT to the same number (I think originally I messed up and set it to 0x00200000 but it still worked, so maybe u-boot does not honor the values in the FIT).
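To show where 0x00208000 comes from, here is a small sketch of the arithmetic. It assumes the usual ARM default text offset of 0x8000 from arch/arm/Makefile and that DRAM starts at physical address 0, as the System RAM range above suggests. The function and constant names are hypothetical and only there for illustration.

```python
# Usual ARM default (textofs-y := 0x00008000 in arch/arm/Makefile).
DEFAULT_TEXT_OFFSET = 0x00008000

# The region to keep clear at the bottom of DRAM: 2 MB, matching the
# 0x00000000-0x00200000 reserve mentioned earlier.
RESERVED_BYTES = 0x00200000


def text_offset_for_reserve(reserved_bytes, default=DEFAULT_TEXT_OFFSET):
    """Text offset that places the kernel just above the reserved region."""
    return reserved_bytes + default


def kernel_load_address(dram_base, text_offset):
    """Physical address where the kernel image starts (assumed: base + offset)."""
    return dram_base + text_offset


if __name__ == "__main__":
    textofs = text_offset_for_reserve(RESERVED_BYTES)
    print(hex(textofs))                            # 0x208000
    print(hex(kernel_load_address(0x0, textofs)))  # 0x208000 with DRAM at 0
```

So the value is just the 2MB reserve plus the default 0x8000, which is also why the FIT entry of 0x00200000 I set by mistake was off by 0x8000.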
Perhaps trying to shift it by 64MB is too much. The AER errors I saw always noted memory write addresses less than 0x00100000, which would sort of tie in with the 1MB behind the bridge reports from lspci -vvv.
Are you having problems with interrupts? I'm getting this message when I try to do heavy writing to the SSD:
[ 90.312675] nvme nvme0: I/O 832 QID 1 timeout, completion polled
No, I have not seen any of those. What do you mean by heavy: many parallel writes, etc? I ran a test last night where I copied a 1GB random test file from the NVME and back to it as a differently named file 100 times, but this was sequential. Then I checksummed all files repeatedly and everything was fine, no apparent errors. I might try setting up a load of parallel copies back and forward and check dmesg.
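A parallel version of that copy-and-checksum test could look roughly like the sketch below. This is not the script I actually ran; the function names, file names and the default copy and worker counts are all made up for illustration. Note that reading a file back straight after writing it may be served from the page cache rather than the SSD, so treat a pass as a weak result unless caches are dropped between the copy and the check.

```python
import hashlib
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor


def sha256_of(path, chunk=1 << 20):
    """SHA-256 of a file, read in 1 MiB chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


def copy_and_verify(src, dst, expected):
    """Copy src to dst, then return (dst, True if the checksum matches)."""
    shutil.copyfile(src, dst)
    return dst, sha256_of(dst) == expected


def parallel_copy_test(src, dest_dir, copies=8, workers=4):
    """Copy src into dest_dir several times in parallel; return failed paths."""
    expected = sha256_of(src)
    dests = [os.path.join(dest_dir, f"copy_{i}.bin") for i in range(copies)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda d: copy_and_verify(src, d, expected), dests))
    return [dst for dst, ok in results if not ok]


if __name__ == "__main__":
    # Small self-contained run in a temp directory. On the target, dest_dir
    # would be a directory on the NVMe mount and src a much larger file.
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "random.bin")
        with open(source, "wb") as handle:
            handle.write(os.urandom(4 << 20))
        print(parallel_copy_test(source, tmp))
```

An empty list means every copy matched the source checksum. Checking dmesg for the nvme timeout message afterwards would still be a manual step.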
I was thinking you were on to something because I think vectors are at 0x00000000 and if the RP is writing there it would be bad. That might be my problem, which means I'm no closer to solving the issue than I was two months ago.
Usually ARM chips put their exception vector jump addresses at 0x00000000, so you need to be careful. Some newer ARM cores have the ability to put the vectors higher up in memory also. If the PCI-E core is using this lower 1MB of DRAM for something, then it must be keeping clear of the bottom 8 words which I think is the extent of these vectors. If it was to write to one of these then the whole kernel would most likely crash.
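To put numbers on that, here is a quick sketch of the overlap check. It assumes 32-bit words, so 8 vector words occupy 32 bytes (0x00 to 0x1F), and that the vectors sit at base 0 as described above. The function name and the example write addresses are hypothetical, not values taken from my AER logs.

```python
WORD_BYTES = 4    # 32-bit ARM word (assumption)
VECTOR_WORDS = 8  # the "bottom 8 words" mentioned above


def overlaps_vectors(write_addr, length, base=0x00000000):
    """True if a write of `length` bytes at write_addr touches the vector table."""
    vector_end = base + VECTOR_WORDS * WORD_BYTES  # first byte past the table
    write_end = write_addr + length                # first byte past the write
    return write_addr < vector_end and write_end > base


if __name__ == "__main__":
    # Hypothetical writes: one at offset zero, one just above the table.
    print(overlaps_vectors(0x00000000, 4))
    print(overlaps_vectors(0x00000020, 64))
```

Any write below 0x00100000 that starts at or above 0x20 would miss the table under these assumptions, which would fit with the kernel surviving.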
BTW thanks for your original post, it's really helped me progress. It's a definite start but I would really like to try and understand why moving the kernel or reserving that region seems to fix the issue.
If you make any other progress let me know, I'll try and do the same.