Enable FPGA-to-HPS SDRAM Interface

jackfrye11 · October 24, 2020, 4:40pm

How can you enable the FPGA-to-SDRAM interface from the newer versions of u-boot that include the preloader?

There is what seems to be an older guide here (GSRD v13.1 - Programming FPGA from HPS | Documentation | RocketBoards.org) that uses a u-boot command:

run bridge_enable_handoff

This command is not in the environment in the newer releases of u-boot-socfpga. I am using the branch origin/socfpga_v2019.04.

I am fairly confident this bridge is not enabled, based on what I am seeing in sysfs in Linux:

root@cyclone5:~# ls /sys/class/fpga_bridge/
br0 br1
root@cyclone5:~# cat /sys/class/fpga_bridge/br0/name
lwhps2fpga
root@cyclone5:~# cat /sys/class/fpga_bridge/br1/name
hps2fpga

The question follows from the issues I was seeing in SignalTap when debugging the master ports to an Altera DMA core, mentioned in my recent post (Cyclone 5 F2SDRAM issue - #5 by jackfrye11).

schnudi · October 26, 2020, 4:33pm

Hi,
It seems that the command was replaced with the following one:

bridge enable

Not sure how it works on the CV device but at least on the A10 devices the bridges should be declared in the u-boot device tree blob.

jackfrye11 · October 29, 2020, 10:22pm

I do run that command in my script with a bit mask of 0xffffffff. The interface does not seem to be enabled.

bridge enable 0xffffffff

Where are the u-boot device tree blob files located? In the u-boot-socfpga source or in linux-socfpga? Is it the same blob that is used for the kernel? Where are the files for Arria 10? Did you have to change anything in those files?

jackfrye11 · November 25, 2020, 4:19am

Where can I find an example of the FPGA2HPS SDRAM node for the device tree? The other interfaces are listed in the decompiled dts I am looking at, but not the node for FPGA2HPS. I think I might need to add it manually.

davidb · December 17, 2020, 12:34pm

I had a similar issue, but “bridge enable” fixed it for me. IIRC from looking at the u-boot source, the bridge command only enables bridges for which the u-boot device tree has the “hps_fpgabridge<x>” entry present with “init-val = <1>;” set. As this is in the handoff dtsi file generated from the Quartus project, I suspect you may be missing something in the project that causes that entry to be created.

jackfrye11 · December 17, 2020, 9:58pm

With the new bootloader flow, how can you change the bootloader device tree? Is there some file in u-boot-socfpga that will allow me to do that? I think I already did this for the kernel one. How can I make one for u-boot, or tell u-boot where my kernel one is, which includes the setting you have listed above?

davidb · December 18, 2020, 8:24am

Hi @jackfrye11
The process will depend somewhat on what board you are using and what version of u-boot you have. We are using u-boot 2020.04; in that release all the supported board device trees are in uboot/arch/arm/dts. In there you should find your board’s dts file if it’s supported. For example, if you have an Arria 10 SoCDK board, there are a number of files whose names start with socfpga_arria10_socdk_*.dts. There are several for the SoCDK board because it has different boot methods: NAND, QSPI, SD, etc.

So depending on which boot method is applicable, and on the board (they don’t all follow the same structure), there should be a *_handoff.dtsi file. This is the file you need to replace with the device tree file generated from the firmware project’s hps_isw_handoff folder. Again using the Arria 10 SoCDK as an example, when you build u-boot it should compile all the Arria 10 device trees (assuming you have selected the Arria 10 SoCDK as the Altera SOCFPGA board in the u-boot config), generating equivalent .dtb files.

The device tree files are hierarchical. It starts with, say, the socfpga_arria10_socdk_nand.dts file. This includes various common dtsi files such as socfpga_arria10_socdk.dtsi, which describes features of all SoCDK boards and in turn includes other files like socfpga_arria10.dtsi, as well as the handoff file, socfpga_arria10_handoff.dtsi, which contains the project-specific SoC FPGA differences layered on top of the common definitions.

Now you may have to do things slightly differently depending on your board. From earlier info it looks like you are using a Cyclone V; I assume the process is broadly the same. Like I said, some Altera board device tree files have been quite nicely broken out and structured, others not so much.

Hope that helps.

jackfrye11 · December 18, 2020, 11:06pm

I tried adding this to arch/arm/dts/socfpga.dtsi:

fpga_bridge2: fpga-bridge@ffc25080 {
	compatible = "altr,socfpga-fpga2sdram-bridge";
	reg = <0xffc25080 0x10000>;
	bridge-enable = <1>;
	init_val = <1>;
};

I am seeing the same behavior. Do I need it in the kernel device tree too? I have it in there, but it seems like it is not registering. I see nothing in dmesg about it.

jackfrye11 · December 27, 2020, 3:04pm

I have it registering in the kernel. Still the same failing behavior (cmd ports never ready). What does your u-boot device tree node look like? Here is the node I have in both the kernel and u-boot device trees:

	fpga_bridge3: fpga-bridge@ffc25080 {
		compatible = "altr,socfpga-fpga2sdram-bridge";
		reg = <0xffc25080 0x4>;
		bridge-enable = <1>;
		read-ports-mask = <0x0000000f>;	/* appended from boardinfo */
		write-ports-mask = <0x0000000f>;	/* appended from boardinfo */
		cmd-ports-mask = <0x00000001>;	/* appended from boardinfo */
	};

davidb · January 6, 2021, 9:31am

Hi @jackfrye11

Sorry for the delay, been away for a few weeks.

So did you manually add that to the socfpga.dtsi as part of the uboot code?

My uboot dtsi segment is quite a bit different. The handoff file generated from the hps_isw_handoff firmware folder contains entries as below for the three SDRAM bridges (numbers differ):

hps_fpgabridge3: fpgabridge@3 {
    compatible = "altr,socfpga-fpga2sdram0-bridge";
    init-val = <1>;
};

However, as this is just an incremental dts file, some of the info you have, such as the bridge address, will be defined in a common file. Again, the caveat is that the above is for an Arria 10. One thing that is different is that init-val has a hyphen, not an underscore.

The main concern I have is that you manually added that entry rather than having it generated from the firmware project, assuming a Cyclone V follows the same flow as an Arria 10.

jackfrye11 · January 6, 2021, 10:37pm

@davidb thanks for the response.
I am looking at Building Bootloader for Stratix 10 and Agilex | Documentation | RocketBoards.org and it seems the flow is different between Arria10 and CycloneV.

It seems for Arria10 that bsp-editor generates either the device tree itself or a handoff file for u-boot to generate the device tree depending on what version of the Intel SoC tools you are using. CycloneV does not have as extensive support.

According to the wiki, “All custom user settings must be done directly in U-Boot (device tree, configuration and source code).”

I will play around with the “init_val” vs “init-val” change and let you know if I see anything. I too worry about exactly how the SDRAM interface node is being fit into the larger DT structure and whether that is being done properly. Does u-boot use the same device tree compiler as Linux? Perhaps I can find a way to compile and decompile the u-boot device tree to flatten it out, then add the node, then recompile with dtc, telling u-boot-socfpga to use my newer recompiled version of the device tree.

UPDATE
I tried adding

		fpga_bridge3: fpga_bridge@ffc25080 {
			compatible = "altr,socfpga-fpga2sdram0-bridge";
			init-val = <1>;
		};

into socfpga.dtsi based on the nodes right before it

  fpga_bridge0: fpga_bridge@ff400000 {
  	compatible = "altr,socfpga-lwhps2fpga-bridge";
  	reg = <0xff400000 0x100000>;
  	resets = <&rst LWHPS2FPGA_RESET>;
  	clocks = <&l4_main_clk>;
  };

  fpga_bridge1: fpga_bridge@ff500000 {
  	compatible = "altr,socfpga-hps2fpga-bridge";
  	reg = <0xff500000 0x10000>;
  	resets = <&rst HPS2FPGA_RESET>;
  	clocks = <&l4_main_clk>;
  };

but that did not seem to work. Same behavior where cmd_ready is never asserted and Avalon DMA times out.

davidb · January 7, 2021, 8:13am

Hi @jackfrye11

I’ve had a quick look at the Cyclone V dts files in uboot and the technical reference manual. It looks like the Cyclone V is quite a bit different to the Arria 10 and as such I’m not sure I can suggest anything else of any help.

During the quick look I tried to find the F2H_SDRAM registers in the reference manual and failed, so I have no idea whether they need to be set up in the same way as the Arria 10, i.e. where any bridge enable bits might be. I’m not even sure that the socfpga-fpga2sdram-bridge device type in the dts file is even appropriate to the Cyclone V. If I’m reading the manual right the Cyclone V has up to six paths from the FPGA to SDRAM, the Arria 10 only has three.

You may need to dig into the uboot source code and trace through the bridge enable command and see how this applies to a Cyclone V, this may then give you a clue as to what is needed in the device tree.

Sorry I could not have been any more help.

jackfrye11 · January 14, 2021, 10:55pm

OK. I dug through the u-boot code. It seems the “bridge” command has no concept of the fpga2sdram bridge. If you look at the code and macro, it appears to be concerned only with the other three bridges.

u-boot-socfpga/blob/socfpga_v2020.07/arch/arm/mach-socfpga/misc.c

#ifndef CONFIG_SPL_BUILD
static int do_bridge(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[])
{
	unsigned int mask = ~0;

	if (argc < 2 || argc > 3)
		return CMD_RET_USAGE;

	argv++;

	if (argc == 3)
		mask = simple_strtoul(argv[1], NULL, 16);

	switch (*argv[0]) {
	case 'e':	/* Enable */
		do_bridge_reset(1, mask);
		break;
	case 'd':	/* Disable */
		do_bridge_reset(0, mask);
		break;
	default:
		return CMD_RET_USAGE;
	}

	return 0;
}

U_BOOT_CMD(bridge, 3, 1, do_bridge,
	   "SoCFPGA HPS FPGA bridge control",
	   "enable [mask] - Enable HPS-to-FPGA, FPGA-to-HPS, LWHPS-to-FPGA bridges\n"
	   "bridge disable [mask] - Enable HPS-to-FPGA, FPGA-to-HPS, LWHPS-to-FPGA bridges\n"
	   ""
);

Where is the support for fpga2sdram? Is that handled by another command, or just by parsing of the device tree?

jackfrye11 · January 16, 2021, 5:03pm

Next test I performed. I looked at

2. For FPGA to SDRAM bridge:
mw $fpga2sdram <my_fpga2sdram_new_value> 
go $fpga2sdram_apply

So, I added the following commands to my boot script
# fpga-sdram bridge
mw FFC25080 00000311
mw FFC2505C 00000004
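
As a rough aid for reading the value written to 0xFFC25080 above, here is a minimal sketch that packs and unpacks per-port-type masks. It assumes the field layout believed to be used by the Linux altera-fpga2sdram driver for this register (read ports in bits 3:0, write ports in bits 7:4, command ports in bits 13:8). The struct and function names are made up for illustration, so check the layout against the Cyclone V register map before relying on it.

```c
#include <stdint.h>

/* ASSUMED field layout of the register at 0xFFC25080; verify against the
 * Cyclone V HPS register map. All names here are hypothetical. */
#define F2S_RD_SHIFT   0u   /* read ports,    4 bits */
#define F2S_WR_SHIFT   4u   /* write ports,   4 bits */
#define F2S_CMD_SHIFT  8u   /* command ports, 6 bits */

struct f2s_ports {
    uint32_t rd;   /* one bit per read port */
    uint32_t wr;   /* one bit per write port */
    uint32_t cmd;  /* one bit per command port */
};

/* Combine the three port masks into a single register value. */
static uint32_t f2s_pack(uint32_t rd, uint32_t wr, uint32_t cmd)
{
    return ((rd  & 0x0fu) << F2S_RD_SHIFT) |
           ((wr  & 0x0fu) << F2S_WR_SHIFT) |
           ((cmd & 0x3fu) << F2S_CMD_SHIFT);
}

/* Split a register value back into the three port masks. */
static struct f2s_ports f2s_unpack(uint32_t reg)
{
    struct f2s_ports p;

    p.rd  = (reg >> F2S_RD_SHIFT)  & 0x0fu;
    p.wr  = (reg >> F2S_WR_SHIFT)  & 0x0fu;
    p.cmd = (reg >> F2S_CMD_SHIFT) & 0x3fu;
    return p;
}
```

Under that assumption, f2s_unpack(0x311) gives a read mask of 0x1, a write mask of 0x1 and a command mask of 0x3, while the masks from the device tree node earlier in the thread (0xf, 0xf, 0x1) pack to 0x1ff.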

Now it seems the ready signals are asserted, but when I start the DMA transaction, the kernel seems to hang.

The idea of my program is to read values from some SDRAM location and write them to a separate location. These values are initialized from 0-19.

printf("Setting RX buf values to 0...\n");
for(i = 0; i <  20; i++)
{
    *(rx_buf+i) = 0;
}

printf("Reading back RX buf values...\n");
for(i = 0; i <  20; i++)
{
    printf("rx_buf[%d]: %d\n", i, *(rx_buf+i));
}

printf("Calling DMA ioctl...\n");
ioctl(fd_t, 0, &dummy); // trigger my linux driver for Altera DMA


printf("Reading back RX buf values...\n");
for(i = 0; i <  20; i++)
{
    printf("rx_buf[%d]: %d\n", i, *(rx_buf+i));
}

These are the print messages

Setting TX buf values...
Reading back TX buf values...
tx_buf[0]: 0
tx_buf[1]: 1
tx_buf[2]: 2
tx_buf[3]: 3
tx_buf[4]: 4
tx_buf[5]: 5
tx_buf[6]: 6
tx_buf[7]: 7
tx_buf[8]: 8
tx_buf[9]: 9
tx_buf[10]: 10
tx_buf[11]: 11
tx_buf[12]: 12
tx_buf[13]: 13
tx_buf[14]: 14
tx_buf[15]: 15
tx_buf[16]: 16
tx_buf[17]: 17
tx_buf[18]: 18
tx_buf[19]: 19
Setting RX buf values to 0...
Reading back RX buf values...
rx_buf[0]: 0
rx_buf[1]: 0
rx_buf[2]: 0
rx_buf[3]: 0
rx_buf[4]: 0
rx_buf[5]: 0
rx_buf[6]: 0
rx_buf[7]: 0
rx_buf[8]: 0
rx_buf[9]: 0
rx_buf[10]: 0
rx_buf[11]: 0
rx_buf[12]: 0
rx_buf[13]: 0
rx_buf[14]: 0
rx_buf[15]: 0
rx_buf[16]: 0
rx_buf[17]: 0
rx_buf[18]: 0
rx_buf[19]: 0
Calling DMA ioctl...
[ 88.997482] Read addr: 2f2b5000
[ 89.005098] Write addr: 2f2b4000
[ 89.066627] Length: 1000
[ 89.069157] Control: 84

It seems that such data come in starting at bit 5 on the cmd_data_0 bus.

I am not sure if there are other things I need to do in u-boot to make sure the processor and the FPGA communicate with the SDRAM in a “responsible” way, such that one hitting the memory doesn’t mess up the state of the other (in this case, the FPGA manipulating the L3 interconnect such that the processor cannot use it).

davidb · January 18, 2021, 12:02pm

Hi @jackfrye11

It looks like this is where the Arria 10 and Cyclone V diverge, with bridge implementation significantly different. For Arria 10 the do_bridge_enable() ultimately calls the socfpga_reset_deassert_bridges_handoff() method which in turn loops over all the bridges defined by the bridge_cfg_tbl[] and checks for the init-val in the u-boot handoff dts file. That config table includes H2F, LWH2F, F2H, as well as the three F2SDR.

The code in misc_gen5.c, which I presume is used by Cyclone V, is significantly different, ultimately calling socfpga_bridges_set_handoff_regs() from reset_manager_gen5.c. It does appear that in that method only the H2F, LWH2F and F2H bridges are handled.

Looks like you may have to write to the appropriate reset register to enable the SDRAM bridges yourself.

davidb · January 18, 2021, 12:14pm

Hi @jackfrye11

OK so you have an Altera mSGDMA instantiated in the fabric, reading and writing to HPS DRAM. That is broadly what we have, albeit separate DMA engines are used to copy data from the HPS DRAM to the FPGA fabric and data from the fabric to the HPS DRAM and it works fine.

This may be an obvious question, but are you taking into account that once in Linux the DRAM is virtualised, and that you need to perform a virtual-to-physical lookup in order to provide the fabric DMA engine with a valid address? This applies to both the read from and the write to DRAM. Any mistake there could result in your data being written back to a piece of memory in use by the kernel, hence the crashes.

Any DMA transfer will also need to account for the memory paging as a result of the physical to virtual mapping. Data in user space buffers may span multiple physical pages, which are not guaranteed to be contiguous unless you allocate such memory and stage your data there before passing to the DMA engine. Data does not need to be copied to a contiguous buffer if you lock down all pages associated with the user space buffer and then set the DMA engine up with multiple scatter-gather descriptors.
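
To make the paging point concrete, here is a small sketch that counts how many pages a buffer touches, which is a lower bound on the number of scatter-gather descriptors needed when those pages are not physically contiguous. The 4 KiB page size and the function name are assumptions for illustration; a real driver would use the kernel's own page size and pinning helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size (4 KiB); hypothetical name, for illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/* Number of pages touched by the byte range [addr, addr + len). */
static size_t pages_spanned(uintptr_t addr, size_t len)
{
    uintptr_t first_page;
    uintptr_t last_page;

    if (len == 0)
        return 0;

    first_page = addr / SKETCH_PAGE_SIZE;
    last_page = (addr + len - 1) / SKETCH_PAGE_SIZE;
    return (size_t)(last_page - first_page + 1);
}
```

A page-aligned 0x1000-byte buffer stays within one page, but a 0x20-byte buffer starting at 0x1ff0 already crosses a page boundary and touches two pages.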

jackfrye11 · January 19, 2021, 1:41am

Just to be clear, this is the DMA I am using

I don’t think it has SG capability, so I’m not writing that metadata to some FPGA RAM or anything that complicated.

For allocation of memory, this is the important code in my driver.

#define BUFFER_SIZE 0x1000

typedef struct tx_dma_buf {
    struct cdev cdev;
    volatile phys_addr_t phys_addr_tx;
    volatile unsigned int *virt_addr_tx;
    volatile unsigned int *dma_regs;
    dev_t dev_node;
    struct class *class;
} tx_dma_buf_t;

typedef struct rx_dma_buf {
    struct cdev cdev;
    volatile phys_addr_t phys_addr_rx;
    volatile unsigned int *virt_addr_rx; 
    dev_t dev_node;
    struct class *class;
} rx_dma_buf_t;

tx_dma_buf_t *tx_dma_buf;
rx_dma_buf_t *rx_dma_buf;


static int __init my_init(void)
{
    int err;

    // kmalloc() takes both a size and GFP flags
    tx_dma_buf = kmalloc(sizeof(tx_dma_buf_t), GFP_KERNEL);
    rx_dma_buf = kmalloc(sizeof(rx_dma_buf_t), GFP_KERNEL);

    // Create TX and RX buffers by letting the kernel choose some physical address and give us a virtual pointer
    //   Use kmalloc with correct flags to ensure memory is contiguous
    tx_dma_buf->virt_addr_tx = (unsigned int *)kmalloc(BUFFER_SIZE, GFP_KERNEL);
    rx_dma_buf->virt_addr_rx = (unsigned int *)kmalloc(BUFFER_SIZE, GFP_KERNEL);

    // Get the physical addresses for programming the DMA using virt_to_phys()
    tx_dma_buf->phys_addr_tx = virt_to_phys( (volatile void *)tx_dma_buf->virt_addr_tx);
    rx_dma_buf->phys_addr_rx = virt_to_phys( (volatile void *)rx_dma_buf->virt_addr_rx);

    // %p prints a pointer; %pa prints a phys_addr_t passed by reference
    printk( KERN_INFO "tx_dma_buf->virt_addr_tx: %p\n", tx_dma_buf->virt_addr_tx);
    printk( KERN_INFO "rx_dma_buf->virt_addr_rx: %p\n", rx_dma_buf->virt_addr_rx);
    printk( KERN_INFO "tx_dma_buf->phys_addr_tx: %pa\n", &tx_dma_buf->phys_addr_tx);
    printk( KERN_INFO "rx_dma_buf->phys_addr_rx: %pa\n", &rx_dma_buf->phys_addr_rx);
    ...
}

Now the question would be whether the kmalloc is kosher or if I should be using dma_alloc_coherent(). I’ve done kmalloc on a Xilinx platform before.

Also worth noting, the TX physical address printed when I “insmod” the driver and the init runs matches what I am seeing in the SignalTap (which makes sense since that is what I write to the register in the application code.)

Based on the RTL generated from Platform Designer, which shows how the AXI signals are mapped onto the raw interfaces, it seems like SDRAM interface is happy with the address commands, but never wants to RVALID-up and start sending read data.

davidb · January 20, 2021, 8:24am

Hi @jackfrye11

I’m not familiar with that particular DMA core you are using, that said the FPGA side of things is not my speciality, I work more on the software side.

The code snippet seems OK. kmalloc() allocates contiguous physical memory so can be used as a source or destination for a DMA transaction. I believe the main difference between kmalloc() and dma_alloc_coherent(), which is what my driver uses, is that the latter automatically handles cache coherency for you. With kmalloc() you would need to manually flush the buffer before the DMA engine transmits the data, or invalidate the buffer after a DMA receive before the processor accesses it. However that would not stop it working, which I think is the issue you are seeing, it would however potentially show invalid data.

jackfrye11 · January 30, 2021, 2:52pm

It has nothing to do with software.

I created a design with an address space extender IP with a 64K window that would allow me to write to different 64K areas of the SDRAM via a register that would set base address for a given 64K area. I tried doing a simple write via HPS_AXI_LW (which I proved worked in Signal Tap), through this IP and to the SDRAM interface. I did this in u-boot console and when I initiated the transaction, u-boot froze and then rebooted. This is definitely a HW/FW problem. For reference, I had GPIO at address 0x0 in the FPGA LW BASE address space.

=> mw FF200000 000000ff
=> mw FF220000 00000000
=> mw FF220004 40000000
=> md FF200000 1
ff200000: 000000ff …
=> md FF210000 1
ff210000:

I am reading about SDRAM clocking. I think some of the blocks are supposed to be set up by u-boot via settings in pll_config.h, which is created by running qts_filter.sh on your Quartus project. I think your device has the same flow that includes qts_filter. Do you mind sending me your pll_config.h file when you get a chance?

jackfrye11 · January 30, 2021, 4:14pm

I switched to the 20.1 versions of the embedded_command_shell.sh/u-boot-socfpga source and now the FPGA2SDRAM interface works.

I used:

intelFPGA_lite v20.1 of embedded_command_shell.sh
u-boot-socfpga branch origin/socfpga_v2020.07

A couple of notes:
u-boot-socfpga 2020.07 will automatically scan for u-boot.scr (compiled u-boot script)
The image I was using was still generated from intelFPGA_lite v19.1 (wasn’t forced to upgrade F/W to get it to work)