I’m troubleshooting an issue on an Arria 10 SoC board running the Linux 4.20 kernel, where we intermittently see corruption reported on filesystems located on the eMMC. It happens infrequently and unpredictably, but after many reboots one or more of the mounted filesystems report errors (even filesystems mounted read-only). While tracking down the issue (and before we had narrowed it down to the eMMC), we tried other filesystem types (switching from ext3 to ext4 and from ext3 to cramfs), but the problem persisted.
An interesting data point is that we previously used the Linux 3.10 kernel and did NOT see this issue. We went back and confirmed this by testing multiple boards on the 3.10 kernel over the course of 10 days, with no occurrences. Given this and the other debugging we have done, the issue appears to be related to MMC driver changes in the 4.20 kernel (which look substantial, based on a cursory comparison of the two versions).
I’m currently comparing the MMC controller registers and have added CMD logging to the MMC kernel driver so that I can compare the commands being sent during configuration.
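For the CMD comparison, here is a rough sketch of the kind of script that could diff two captured logs, for example one from the 3.10 kernel and one from the 4.20 kernel. The log line format is an assumption for illustration only (something like "mmc0: CMD6 arg=0x03b90100"), and the function names `parse_cmds` and `diff_cmds` are hypothetical, so the regex would need to be adjusted to whatever the added logging actually prints. Timestamps are dropped so that only the command sequence and arguments are compared.

```python
import difflib
import re

# Assumed (hypothetical) log line format; adjust to the real printk output:
#   "[    1.234567] mmc0: CMD6 arg=0x03b90100"
CMD_RE = re.compile(r"CMD(\d+)\s+arg[=\s]\s*(?:0x)?([0-9a-fA-F]+)")


def parse_cmds(log_text):
    """Return (cmd_index, arg) tuples in the order they appear in the log."""
    cmds = []
    for line in log_text.splitlines():
        m = CMD_RE.search(line)
        if m:  # lines without a CMD entry are ignored
            cmds.append((int(m.group(1)), int(m.group(2), 16)))
    return cmds


def diff_cmds(old_cmds, new_cmds):
    """Return unified-diff lines between two command sequences."""
    def fmt(cmd):
        return f"CMD{cmd[0]} arg=0x{cmd[1]:08x}"

    return list(difflib.unified_diff(
        [fmt(c) for c in old_cmds],
        [fmt(c) for c in new_cmds],
        fromfile="3.10", tofile="4.20", lineterm=""))
```

Usage would be along the lines of `print("\n".join(diff_cmds(parse_cmds(old_log), parse_cmds(new_log))))`, where `old_log` and `new_log` are the text of the two captures. An empty result means both kernels issued the same commands with the same arguments in the same order.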
Has anyone encountered this issue? Any suggestions on things to try?
If you have any questions for me, please feel free to ask.