Enabling ACP over the Cyclone V SoC FPGA-to-HPS bridge

I’m using an ARM Cortex-A9 processor on an Arria V SoC in a bare-metal project.

My Problem:

I’m trying to use the ACP to send data over the FPGA-to-HPS bridge from one device to another (both memory regions are defined in my MMU as Device memory and are shareable). The device is a RapidIO slave/master that I use for my transaction protocol between boards. For packets larger than 1024 I use the DMA to transfer over the ACP. The problem starts when packets are smaller than 1024: the DMA is not needed in that case, and a normal memcpy would be the natural choice.
As a result, when I read from the RapidIO receive window and the size is below the DMA's minimum transaction size, I have to invalidate my cache for every received RapidIO packet, which badly hurts my bare-metal application (it receives/sends one packet at a time).

I also don't fully understand the meaning of the AXI signals described in the documentation. Section B3.8 of the ARMv7-AR Architecture Reference Manual indicates that AWUSER/ARUSER and AWCACHE/ARCACHE need to be initialized.
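For reference, here is a minimal sketch of how these sideband signals are usually described: the AXI3 AxCACHE[3:0] attribute bits, plus the "shared" bit in AxUSER[0] that the Cortex-A9 ACP is documented to look at. The macro and function names are hypothetical (they are not register definitions from any vendor header), and the coherency condition is my reading of the Cortex-A9 MPCore documentation, so please check it against the TRM.

```c
#include <assert.h>
#include <stdint.h>

/* AXI3 AxCACHE[3:0] attribute bits (same layout for ARCACHE and AWCACHE). */
#define AXCACHE_BUFFERABLE   (1u << 0)
#define AXCACHE_CACHEABLE    (1u << 1)
#define AXCACHE_READ_ALLOC   (1u << 2)
#define AXCACHE_WRITE_ALLOC  (1u << 3)

/* Assumption: AxUSER[0] carries the "shared" attribute on the A9 ACP. */
#define AXUSER_SHARED        (1u << 0)

/* Compile-time sanity check on the bit position used below. */
static_assert(AXCACHE_CACHEABLE == 0x2u, "AxCACHE[1] is the cacheable bit");

/* Hypothetical helper: returns 1 if a request with these sideband values
 * would be treated as coherent (cacheable AND shared), 0 otherwise. */
static inline int acp_request_is_coherent(uint32_t axcache, uint32_t axuser)
{
    return (axcache & AXCACHE_CACHEABLE) && (axuser & AXUSER_SHARED);
}
```

With this encoding, a write-back, read/write-allocate request would carry AxCACHE = 0xF, and it would only count as coherent if the shared bit is set as well.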

What I've tried:

I’ve successfully initialized the AWUSER/ARUSER register as described, initialized the SCU and the shared-memory registers, and added 0x80000000 to my receive buffer pointer so that accesses go through the ACP.
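To make the address adjustment concrete, this is roughly what the pointer offset looks like. It is only a sketch: `ACP_WINDOW_BASE` is the 0x80000000 offset mentioned above, while the 1 GB window size and the names `ACP_WINDOW_SIZE` and `acp_alias` are assumptions for illustration, not taken from any HPS header.

```c
#include <assert.h>
#include <stdint.h>

#define ACP_WINDOW_BASE 0x80000000u  /* offset added to the buffer pointer */
#define ACP_WINDOW_SIZE 0x40000000u  /* assumption: 1 GB ACP window */

/* Hypothetical helper: map a physical buffer address to the address the
 * FPGA-side master should use so that the access is routed via the ACP. */
static inline uint32_t acp_alias(uint32_t phys_addr)
{
    /* The buffer must fit inside the window for the alias to be valid. */
    assert(phys_addr < ACP_WINDOW_SIZE);
    return ACP_WINDOW_BASE + phys_addr;
}
```

For example, a receive buffer at 0x00100000 would be presented to the FPGA-side master as 0x80100000.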

But when I tried to read up on how to enable AWCACHE/ARCACHE, I got a bit confused. In the technical reference manual, the only place I saw these signals being enabled is in the EMAC section, for the EMAC device, which is connected to the L3 as a master. The EMAC is not part of my transaction, though: my data goes from the FPGA-to-HPS bridge straight into the ACP ID Mapper and then into the ACP itself.

The only relevant register I found that mentions them is the l3master register, which belongs to the EMAC. I’ve been stuck on this problem for weeks and have been unable to find any good examples or further documentation about another register that does this.

I tried initializing the l3master register and turning on all of its bits for read/write allocation on the shared memory, but it didn't help, so I’m looking for advice on what I should do next.

Which register is used to enable the cache coherency signals within the L3?
Is there a known problem between signals on the AXI and Avalon buses? (RapidIO works on the Avalon bus.)

I’m looking for clarification on the process of enabling my ACP!

Thank you in advance,
Roy Carter