Errors Related to Modified GHRD Project

Jambie · December 21, 2017, 11:33pm

I periodically get ‘udevd’ errors (shown below) reported from Angstrom Linux (I am using an A10 SoC Devkit board and Quartus v17.0.2). I only got these when I made my own project by modifying a version of the GHRD. It looks to me like there are Linux system services tied to the components used in the GHRD, so all I should need to do is find them and turn them off. Has anyone else seen sporadic errors like these after using a modified GHRD project? (I am assuming that the GHRD is where these came from, but if I run an unmodified GHRD they do not seem to occur.) Did the authors of the GHRD insert Linux services that link to their design? (Seems unlikely but???)

If you recognize the error listed below let me know what causes it. - Thanks

root@arria10:~# [ 846.542928] Unable to handle kernel paging request at virtual address 0000600c
[ 846.550123] pgd = edc04000
[ 846.552817] [0000600c] *pgd=00000000
[ 846.556389] Internal error: Oops: 5 [#1] SMP ARM
[ 846.560982] Modules linked in: altera_sysid
[ 846.565171] CPU: 0 PID: 1100 Comm: systemd-udevd Not tainted 4.1.22-ltsi-altera #1
[ 846.572703] Hardware name: Altera SOCFPGA Arria10
[ 846.577384] task: edc3eac0 ti: ee65a000 task.ti: ee65a000
[ 846.582764] PC is at _raw_spin_lock_irqsave+0x24/0x60
[ 846.587795] LR is at __pm_relax+0x2c/0x7c
[ 846.591785] pc : [] lr : [] psr: 20070093
[ 846.591785] sp : ee65beb8 ip : ee65bec8 fp : ee65bec4
[ 846.603207] r10: ee5bfa00 r9 : 60070013 r8 : ee65bf58
[ 846.608406] r7 : 00000000 r6 : ee65bef0 r5 : 0000600c r4 : 00006000
[ 846.614901] r3 : ee65bef0 r2 : 0000600c r1 : 0000ab80 r0 : 20070093
[ 846.621398] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user
[ 846.628584] Control: 10c5387d Table: 2dc0404a DAC: 00000015
[ 846.634302] Process systemd-udevd (pid: 1100, stack limit = 0xee65a218)
[ 846.640883] Stack: (0xee65beb8 to 0xee65c000)
[ 846.645219] bea0: ee65bee4 ee65bec8
[ 846.653358] bec0: c0378cc4 c059f040 00000000 ee65bf58 00000000 ee5bfa34 ee65bf24 ee65bee8
[ 846.661496] bee0: c0163878 c0378ca4 ee65bf24 00000001 ee65bef0 ee65bef0 00000000 00000000
[ 846.669635] bf00: ee737000 00000009 be9a06e0 ee737000 ee5bfa00 00000000 ee65bfa4 ee65bf28
[ 846.677773] bf20: c0164a64 c0163748 00000000 00000000 00000009 be9a06e0 ee737000 ee737000
[ 846.685912] bf40: 00000000 00000000 00000001 ee5bfa34 ee65bf6c ee65bf60 00000009 be9a06e0
[ 846.694050] bf60: 00000001 edc3eac0 c004f940 00000100 00000200 c000fbe4 ee65bfb0 7f667df0
[ 846.702188] bf80: 00000001 be9a06e0 000000fc c000fbe4 ee65a000 00000000 00000000 ee65bfa8
[ 846.710327] bfa0: c000fa40 c0164804 7f667df0 00000001 0000000a be9a06e0 00000009 ffffffff
[ 846.718465] bfc0: 7f667df0 00000001 be9a06e0 000000fc ffffffff ffffffff 00000009 0000000a
[ 846.726603] bfe0: 00000000 be9a06d4 7f62514d b6e0bfa0 60000010 0000000a ca2ba5a9 ca2ba5a9
[ 846.734749] [] (_raw_spin_lock_irqsave) from [] (__pm_relax+0x2c/0x7c)
[ 846.742980] [] (__pm_relax) from [] (ep_scan_ready_list+0x13c/0x1d8)
[ 846.751036] [] (ep_scan_ready_list) from [] (SyS_epoll_wait+0x26c/0x460)
[ 846.759438] [] (SyS_epoll_wait) from [] (ret_fast_syscall+0x0/0x3c)
[ 846.767406] Code: e1a02000 e10f0000 f10c0080 f592f000 (e1923f9f)
[ 846.773476] ---[ end trace 32a4162ace9bcd9f ]---

jhaberly · December 27, 2017, 8:45pm

It would be interesting to see the differences (if any) between the device trees (DTS files) generated by the GHRD and your modified version of it.
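
One way to make that comparison, as a minimal sketch: decompile both device tree blobs with dtc and diff the results. This assumes dtc (the device tree compiler) is installed; the function name and the file names in the usage line are placeholders, not paths from this thread.

```shell
#!/bin/sh
# Decompile two device tree blobs to source and print a unified diff.
# Assumes `dtc` is installed. Returns 0 if identical, 1 if they differ,
# 2 if either blob could not be decompiled (same convention as diff).
dts_diff() {
    ref_dtb=$1
    mod_dtb=$2
    tmpdir=$(mktemp -d) || return 2
    # -I dtb / -O dts: read a binary blob, write device tree source.
    if ! dtc -I dtb -O dts -o "$tmpdir/ref.dts" "$ref_dtb" ||
       ! dtc -I dtb -O dts -o "$tmpdir/mod.dts" "$mod_dtb"; then
        rm -rf "$tmpdir"
        return 2
    fi
    status=0
    diff -u "$tmpdir/ref.dts" "$tmpdir/mod.dts" || status=$?
    rm -rf "$tmpdir"
    return "$status"
}
```

Usage would look like `dts_diff ghrd.dtb modified.dtb` (both names hypothetical). Nodes that appear only on the GHRD side are the ones a driver may still be expecting.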

hoople · December 7, 2018, 2:48pm

I have the exact same trouble. If you find a solution, be sure to post it.

Jambie · January 8, 2019, 8:01pm

I did find a way to make it stop scrolling errors at me by placing the Altera Sysid Peripheral back into my project and setting it up with the connections that allow it to respond to the Linux service.
I had removed the altera_sysid from my QSYS since I have no functional need of it, but Altera has ensured it must stay by inserting a system service that will crash your Linux if you change or remove it. To put the sysid back into your HPS project, add the sysid peripheral in QSYS. It has three connections: clock, reset and control_slave. Connect clock and reset to any available and active clock and reset in your QSYS design. Connect the control_slave to the h2f_lw_axi_master in your HPS device. (I had tried connecting it to the AXI master, which connects OK, but it did not stop the crashes.) You should also rename the sysid peripheral to “sysid” (I am not sure if this is required, but it is part of how I got it to work).

Altera added a golden system reference design (GSRD) driver to the Angstrom Linux that it distributes to users of the A10_SoC_DevKit, and it can be found in the /sys/module/altera_sysid directory in your Angstrom distribution. It is not documented as far as I can see, so there is little available to try to control it through settings (secret code bugs me). I have no interest in using the GSRD beyond some testing, so I doubt there is anything in there that helps me, and it probably causes other unwanted effects besides sysid crashes. I did not get around to shutting it down (using the appropriate systemctl commands), but I am fairly sure you could do that successfully and then remove the Altera-sysid completely from your QSYS.
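
A quick way to confirm whether the module is actually present on a running system is to look for its sysfs directory. This is only a sketch: the function name and the optional second argument (an alternate sysfs root, there so it can be tried against a test tree) are made up for illustration.

```shell
#!/bin/sh
# Succeed if the named kernel module has a directory under /sys/module.
# That is the case for a loaded module (built-in code with parameters
# also appears there). Dashes are mapped to underscores, as sysfs uses.
module_loaded() {
    mod=$(printf '%s' "$1" | sed 's/-/_/g')
    root=${2:-/sys}
    [ -d "$root/module/$mod" ]
}
```

For example, `module_loaded altera_sysid && echo "altera_sysid is present"`.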

My projects are flat FPGA designs and there is no risk of installing code for a different processor. I don’t use partial configuration and at present I am not using any NIOS code since the HPS works well to do all I need for now. If I add some NIOS again later I may need the sysid back, but I doubt I’ll have need of a system service that helps crash my system.

Jambie · January 8, 2019, 8:25pm

It looks like I spoke too soon with the post above since it just crashed again (though it ran well for two days - a new Altera record!).
I’ll have a go at it with the systemctl commands and post again.

Jambie · January 8, 2019, 11:10pm

I’ve been running OK so far without the altera-gsrd system service, but time will tell if this is the real fix.
Using systemd commands you can see and control the system services.
To list the service unit files that are installed enter
ls /lib/systemd/system/*.service /etc/systemd/system/*.service
To see the status of a particular service (for example the altera-gsrd.service ) enter
systemctl status altera-gsrd.service
To stop a service (for example the altera-gsrd.service) enter
systemctl stop altera-gsrd.service
That will stop the service now, but it will restart on the next reboot.
To start a service (etc…)
systemctl start altera-gsrd.service
To disable a service so it does not start again on the next boot (so it stays turned off) enter
systemctl disable altera-gsrd.service
To enable a service so it will restart when the system is booted up enter
systemctl enable altera-gsrd.service
So far the sysid errors are not happening to my system but I have only been testing it for one afternoon.
I still have the Altera_sysid component in my QSYS but if testing looks good I will try removing it to see what happens.
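
The stop and disable steps above can be combined into one small helper that also reports the resulting state. This is a sketch only: the function name is invented here, it needs root on the target, and it uses nothing beyond the systemctl stop, disable, is-active and is-enabled subcommands.

```shell
#!/bin/sh
# Stop a service now, keep it from starting at boot, then print its state.
service_off() {
    unit=$1
    systemctl stop "$unit" || return 1
    systemctl disable "$unit" || return 1
    # is-active / is-enabled each print a single word describing the unit.
    echo "$unit: $(systemctl is-active "$unit") / $(systemctl is-enabled "$unit")"
}
```

Calling `service_off altera-gsrd.service` should then report the unit as inactive and disabled. To see which services are actually running, rather than which unit files are installed, `systemctl list-units --type=service` is the usual command.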

LATE REPORT: It seems that still isn’t enough since I still see the dreaded Altera_sysid oops traces from my Linux terminal. I’ll keep working on it.

Jambie · January 24, 2019, 11:52pm

The next chapter in my ongoing battle with altera_sysid. To recap, I am not using the altera_sysid module for anything because I have no NIOS processors (I use HPS only) and I do not dynamically load or reload code on the fly. I am also the only user, so I don’t need to guard against the possibility that someone might come along and load a new program.
I started out with the SD Card loaded with the golden system reference design but moved on from spinoffs of that program long ago. Along the way I began getting Linux oops tracebacks reporting failures related to altera-sysid. This topic covers my attempt to make those (and a few other system generated errors) go away so my Linux is more stable to work with.
I found that there is a system service called altera-gsrd.service that does something (unspecified) to support the golden system reference design. This is not documented as far as I can see and is a likely suspect in inciting the altera-sysid messages. This system service resides in /lib/systemd/system/altera-gsrd.service
I also found that there is a module named altera_sysid residing in /sys/module/altera_sysid
I had also occasionally seen warnings from periodic checks, either watchdog checks or timed system events (something that happens at 180-second intervals), that seem to have no effect.
To alleviate these I have added the following lines to my initial login sequence, and so far I have had very good luck at eliminating the non-useful errors and warnings. I have also removed the altera_sysid component from the QSYS in my project. The lines are as follows:

rmmod altera-sysid
echo 'V' > /dev/watchdog
systemctl stop systemd-udev-trigger
mesg n
loglevel=3
systemctl stop altera-gsrd
echo 0 > /proc/sys/kernel/hung_task_timeout_secs

It may be that I do not need all of these any more. They accumulated over time, and some may now be redundant because others already prevent the same errors, but together they are working better for me than my earlier attempts.
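
One note on the list above: typed at a shell prompt, loglevel=3 only sets a shell variable, because loglevel= is a kernel command-line parameter. A minimal sketch of a run-time equivalent follows, assuming the intent was to set the console log level to 3. The function name is invented for illustration, and dmesg -n needs root.

```shell
#!/bin/sh
# Set the kernel console log level at run time with `dmesg -n`.
# The default of 3 is an assumption taken from the "loglevel=3" line.
quiet_console() {
    level=${1:-3}
    case "$level" in
        [1-8]) ;;
        *) echo "quiet_console: level must be 1-8" >&2; return 2 ;;
    esac
    dmesg -n "$level"
}
```

Calling `quiet_console` with no argument uses level 3. Unlike the boot parameter, this only lasts until the next reboot, so it would have to stay in the login sequence.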

Jambie · January 25, 2019, 3:10pm

I spoke too soon again… Now I am seeing a different error. I would hold off on some of the actions above until I find out which is unstable…