SMP ARM cores hang when using DMA with two cores enabled


I am experiencing a complete ARM core hang when both cores are running in SMP mode and DMA is in use.
I have tested with Linux kernels 3.10, 4.1 and 4.6 in SMP mode.
The SoC is an Altera Cyclone V SoC FPGA with a dual-core Cortex-A9.
The DMA transfers go from DDR to the FPGA logic.

SignalTap shows that the FPGA logic is still running when the cores hang.
Running the kernel with maxcpus=1 in the command line makes the problem go away.
The two cores are connected via the L2 cache controller and the SCU to the switch fabric (NIC-301).
To check the software, I added a cyclic history tracer that logs each operation and the pointer fed to the DMA,
and enabled the watchdog. After the watchdog reset, I stop the boot process at U-Boot and inspect the log in memory.
The addresses fed to the DMA are all valid, so the software looks OK.
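For anyone who wants to reproduce this kind of check, here is a minimal sketch of a cyclic history tracer in C. It is not my actual driver code: the names (trace_log, trace_record, trace_get), the entry layout and the 64-entry size are all made up for illustration. It also assumes a single writer and that the log sits in memory that survives the watchdog reset and is uncached or flushed after each write; a real SMP kernel version would need locking or atomics on top of this.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRACE_ENTRIES 64u          /* hypothetical log depth */
#define TRACE_MAGIC   0x54524143u  /* marker to locate the log in a memory dump */

/* Hypothetical operation codes for the traced DMA steps. */
enum trace_op { TRACE_DMA_SUBMIT = 1, TRACE_DMA_DONE = 2 };

struct trace_entry {
    uint32_t seq;       /* running sequence number of this record */
    uint32_t op;        /* one of enum trace_op */
    uint32_t dma_addr;  /* pointer handed to the DMA engine */
    uint32_t len;       /* transfer length in bytes */
};

struct trace_log {
    uint32_t magic;
    uint32_t head;      /* total number of records written so far */
    struct trace_entry e[TRACE_ENTRIES];
};

/* Clear the log and stamp it with the magic marker. */
static void trace_init(struct trace_log *log)
{
    memset(log, 0, sizeof(*log));
    log->magic = TRACE_MAGIC;
}

/* Append one record, overwriting the oldest one once the buffer is full.
 * Assumption: single writer; on real hardware a cache clean of the entry
 * would be needed here so the record reaches DDR before a hang. */
static void trace_record(struct trace_log *log, uint32_t op,
                         uint32_t dma_addr, uint32_t len)
{
    struct trace_entry *e = &log->e[log->head % TRACE_ENTRIES];

    e->seq = log->head;
    e->op = op;
    e->dma_addr = dma_addr;
    e->len = len;
    log->head++;
}

/* Return the i-th oldest surviving record, or NULL if i is out of range. */
static const struct trace_entry *trace_get(const struct trace_log *log,
                                           uint32_t i)
{
    uint32_t count = log->head < TRACE_ENTRIES ? log->head : TRACE_ENTRIES;
    uint32_t first = log->head - count;

    if (i >= count)
        return NULL;
    return &log->e[(first + i) % TRACE_ENTRIES];
}
```

After a reset, the same structure can be read back from a raw memory dump: find the magic word, read head, and walk the entries from oldest to newest the way trace_get does.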

Taken together, these observations move the suspicion to the ARM cores themselves.

There are errata describing hangs or memory corruption caused by race conditions in the cache management
of the two Cortex-A9 cores.
I have applied the workarounds for ARM Cortex-A9 errata 761320, 845369, 764369 and 794072, but I am still experiencing the hang.

I can try turning on, one by one, the bits in the diagnostic register mentioned in the above errata, but before
I do that I would be glad of any help from someone who has experienced something similar with other Cortex-A9 SoCs
and is aware of additional errata relating to SMP cache coherency or cache management race conditions
that might help solve the issue.