USB3 disk writes trigger OOM reaper on Pi 4 #3210
Comments
|
The SSD is powered using the official 3A Pi 4 power supply. vl805_update_0137ab does not fix the issue. The WD USB disk drive is powered by its own power supply. I'm testing on the SSD only.
After the firmware upgrade, I ran:
dmesg output
|
Expecting the 4B to power an external SSD, even with a known-good power supply, is ambitious. That particular USB<->SATA adaptor has been seen to cause problems. |
On the USB-SATA adapter part, I disagree. I use two Western Digital 8TB USB disks with ZFS, and they have the same issue. The WD disk has its own power supply, and I am using the genuine 5V 3A standard power supply. |
@wushilin |
Using the 64bit kernel did not solve the problem. At least for me. |
Tried multiple SATA USB controllers, including the ASMedia ASM1153. The issue still persists. I don't think it's a power supply issue either. I'm using the official power supply with no peripherals (keyboard, mouse etc.) attached, which should be more than enough power to run the SATA SSD. |
Please update to the latest release kernel and retry, please report results. |
What's changed @JamesH65 ?
|
I had this happen to my Pi last night too. I was doing a DD to a USB 3 hard drive and got the OOM errors in the syslog file. This is what my Pi shows when I do a free -m (after a reboot). Is it strange that the swap is being slightly used even with so much memory spare?
Mem: 3776 450 266 159 3059 3068 |
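For anyone comparing numbers, here is a minimal sketch (not from the thread) that turns the `Mem:` row of `free -m` into percentages of total RAM. It assumes the usual procps column order: total, used, free, shared, buff/cache, available. The function name `mem_summary` is hypothetical.

```sh
# Summarise the "Mem:" row of `free -m` as percentages of total RAM.
# Assumed column order: total used free shared buff/cache available.
mem_summary() {
    awk '$1 == "Mem:" && $2 > 0 {
        printf "used=%d%% cache=%d%% available=%d%%\n",
               $3 * 100 / $2, $6 * 100 / $2, $7 * 100 / $2
    }'
}

# Usage on a live system:
#   free -m | mem_summary
```

Fed the `Mem:` row above, this prints `used=11% cache=81% available=81%`: most of the RAM is holding page cache rather than application memory.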
Having the same issue. It was first triggered by Transmission, then by Deluge when downloading more torrents, and then just by copying files over the network to the Pi. I tried making a huge swap, which did not help. Now I am trying to disable the OOM killer by https://serverfault.com/questions/141988/avoid-linux-out-of-memory-application-teardown. It's really frustrating, as I cannot use the Pi as backup network storage. I have the latest kernel and Raspberry Pi OS. |
I'm no expert, but it seems that it's the data transfer itself that causes the issue, not the application. I was just using dd to copy an image to the USB 3 hard drive and had the same issue, with OOM kicking in and killing all kinds of things in an attempt to get it to work. I have a Pi 4 with 4GB of RAM and I've never seen the amount used go over around 20%. I've not rebooted my Pi since posting the memory above and it's now total used free shared buff/cache available |
This issue is a definite deal breaker for anyone trying to use the Pi 4 as a desktop replacement or a media hub. Please look into this. |
@rowanalex123 So, are you still having the same issue, with the same things in the logs, even now? I know that the level of detail in your post was what helped me figure out I was having the same, or a very similar, issue. I don't think it happens all the time for me, but last week while doing a dd like you did, it slowed my Pi to a crawl and OOM started closing all sorts of things down even though the memory seemed to be OK. |
I've not tested it myself of late. Going by the above two reports, though, I think it's safe to say the issue is not resolved with the latest kernel? |
I regularly run apt update and full-upgrade so I'm running the latest firmware, bootloader and kernel. |
I also tried a huge 80GB swap, but it's not used at all. I would like to have this resolved. When the storage drive was NTFS and transfer speeds over the network were <20MB/s, I did not have this issue, but now that the storage is on ext4 and speeds are ~100MB/s, it's being crashed by the OOM killer. |
Can people seeing the problem please post the output of the following commands:
|
And:
|
Actually, this rings a bell with me. I had my external drive formatted as NTFS and kept getting all sorts of errors related to NTFS-3G, but I can't remember any issues with OOM. I reformatted the drive to ext4 and got the OOM error. |
Hi, thanks for the info. Here are the results of the commands on my Pi:
Linux raspberrypiv4 5.4.51-v7l+ #1333 SMP Mon Aug 10 16:51:40 BST 2020 armv7l GNU/Linux
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
Bus 002 Device 002: ID 1058:2627 Western Digital Technologies, Inc. |
That's a rev 1.1 4B with 4GB RAM running a recent kernel. The rev 1.1 boards were fitted with BCM2711B0 - the first production silicon - which had a few restrictions that were improved in the later C0 revision:
The latter restriction may be having an effect on the disk throughput. You can eliminate it as a possibility by (as an experiment!) rebooting with |
How can I tell which silicon revision my BCM2711 is? Is it etched into the top of the package? All mine seem to be B0 - I'm guessing it's the B0 in the middle of this:
Do the chips with C0 silicon say C0 instead? |
It's the penultimate 2 characters that matter: 2711ZPKFSB06C0T. On a running system, the easiest way to tell is to look at the declaration of the bus the SD card controller is on:
|
As far as I know, all production 8GB units are fitted with C0s. |
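The "penultimate 2 characters" rule can be sketched as a small helper. This is an illustration only, assuming the marking is typed in by hand from the package; the function name `silicon_rev` is hypothetical.

```sh
# Print the silicon revision encoded in a BCM2711 package marking:
# the two characters immediately before the final character.
# Strings shorter than three characters are printed unchanged.
silicon_rev() {
    printf '%s\n' "$1" | sed 's/.*\(..\).$/\1/'
}

# Usage:
#   silicon_rev "$marking"    # e.g. prints B0 or C0
```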
Thanks for the info. So, my board not only has the USB-C power issue, it also can't access all of the memory? Will the total_mem=3072 limit the board to 3GB instead of 4GB of memory? |
It's a test - I'm trying to ascertain whether or not SWIOTLB is a factor or not. Yes it will limit the system to 3GB during the test. |
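To confirm that the limit took effect after rebooting, one option is to read `MemTotal` from `/proc/meminfo`. The sketch below is an assumption about how to check, not something posted in the thread, and the function name `memtotal_mb` is hypothetical. With total_mem=3072 the figure should come out somewhat below 3072, since GPU memory and kernel reservations are not counted.

```sh
# Print MemTotal from a meminfo-style file in whole MiB.
# The file argument defaults to /proc/meminfo.
memtotal_mb() {
    awk '$1 == "MemTotal:" { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Usage:
#   memtotal_mb
```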
Hi @pelwell thanks for the info. My Pi will be doing a dd tomorrow night as part of a cron job. I'll let it do it without me changing anything to see if it occurs again. If it does, I'll make the change and then let it do the dd again to see if it happens again and I will report back. This will probably be the end of next week though. Thanks again. |
I confirm that this issue is still around in recent kernels. Using only mounted Samba shares (no local USB drives) and running an I/O-intensive program like resilio-sync or transmission, I get the familiar OOM errors despite gigabytes of free RAM. I can't force a crash using the loopback script mentioned earlier.
kernel: Linux LibreELEC 5.10.27 #1 SMP Sat Apr 17 00:41:27 CEST 2021 armv7l GNU/Linux |
What USB adapter are you using? I recently bought a Toshiba Canvio disk and it worked great. I can't reproduce with dd on ZFS for 100GB; speed tops out at 70mbps on USB 3.0.
|
I was referring to a test without any local usb drives at all, all i/o was performed over a remote samba share. I can reproduce this error for both usb 3.0 mount and remote samba mount. I don't believe that a simple dd will trigger this error. I think it has to be lots of small r/w operations. Something about caching and queues perhaps? |
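A workload of "lots of small r/w operations" could be approximated along these lines. This is a rough sketch only: it has not been confirmed to trigger the OOM reaper, and the function name and all three parameters (target directory, file count, file size in KiB) are hypothetical.

```sh
# Write COUNT files of SIZE_KB KiB each into DIR, flushing each one to disk.
# DIR, COUNT and SIZE_KB are illustrative parameters, not values from the thread.
small_write_stress() {
    dir=$1; count=${2:-1000}; size_kb=${3:-4}
    mkdir -p "$dir" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        dd if=/dev/zero of="$dir/f$i" bs=1024 count="$size_kb" conv=fsync 2>/dev/null || return 1
        i=$((i + 1))
    done
}

# Usage (point DIR at the USB or Samba mount under test):
#   small_write_stress /mnt/target/stress 100000 4
```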
For me, dd was able to consistently trigger the OOM reaper with a JMicron SATA-to-USB converter.
|
@tempestnano What is your gpu_mem set to? |
gpu_mem=76 |
@P33M, comprehensive details are in this post. In short, conditions that increase the likeliness of triggering the OOM reaper vary across systems. For example:
Beware that developers do not deem Effectiveness of |
A google search led me to this issue as I'm experiencing the same on my RPi4. Here is the data asked above:
|
Can you try adding |
(setting changed to lower case) |
Sorry for the delayed feedback. |
OK - thanks. That's one theory to cross off the list. |
Try another SATA connector. I am fine with Toshiba and WD external disks directly connected.
ASMedia and JMicron USB-to-SATA connectors fail big time.
|
@wushilin
If there was an issue with the USB adaptor of the drive, there would be an error in the kernel log. I have tested numerous adaptors and know what that kind of error looks like. There are no errors of that kind. This is a different story. |
Isn't an OOM caused by a memory leak an error?
As I said, it is very easy to reproduce with a JMicron or ASMedia SATA converter. It is impossible to reproduce on my Toshiba and WD external disks.
|
The same converter didn't have any issue on Windows btw.
|
I have triggered this without any usb drive on the 32-bit rpi4 using a second rpi4 (with a 64-bit kernel) and a network share over gigabit ethernet. It is somehow related to filesystem I/O, but not any hardware directly. |
As transferring via SMB is no longer an option, I tried rsync, and there is no such problem. Performance is mediocre, though. I guess @tempestnano has a point about the I/O issue: SMB spawns a mass of processes while it runs the copy. @wushilin |
@popcornmix I have been experiencing some OOM errors as well and I'm thinking that decreasing |
You can't have 0 gpu_mem, the minimum is 16 (the gpu still handles the initial boot from sdcard and manages clocks and power). |
In the Raspberry Pi forum I recently discussed problems like bus errors, kernel panics etc. that I encountered using gpu_mem=16. I was advised to use 32 as a minimum, which indeed solved these problems. So I suggest not using 16 as a minimum.
16:03 nasberrypi: % grep Revision /proc/cpuinfo |
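A quick way to see which gpu_mem value a config file sets, and whether it is below the 32 suggested here, might look like the sketch below. It assumes a plain `gpu_mem=<n>` line, ignores conditional sections such as `[pi4]`, and assumes that the last assignment in the file is the one that applies. The function name `check_gpu_mem` is hypothetical, and the config path is passed in rather than assumed.

```sh
# Print the last gpu_mem value set in a config.txt-style file and flag
# values below 32 (the floor suggested in this thread).
check_gpu_mem() {
    value=$(sed -n 's/^gpu_mem=\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1)
    if [ -z "$value" ]; then
        echo "gpu_mem not set"
    elif [ "$value" -lt 32 ]; then
        echo "gpu_mem=$value (below 32)"
    else
        echo "gpu_mem=$value"
    fi
}

# Usage:
#   check_gpu_mem /boot/config.txt
```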
There are fixes in #3981 which may be related. Has anyone checked whether the problem still exists with the fixes applied? |
The fixes in #3981 did not resolve this issue for my rpi4b 8gb.
It still encountered an overzealous OOM reaper while writing to an externally powered usb3 external HDD via nfs:
|
I wanted to drop some words and links that will hopefully help other users with my issue find this via Google. I came to this thread in a roundabout way via this Raspberry Pi Forum post. I was trying to format two 4TB USB 3.0 hard drives in a RAID1 array with my Pi 4. I was able to create the RAID array, but when I tried to create the filesystem with this command:
I would get this error:
I followed the tip in this thread from @jozsefDevs:
After rebooting the Pi, I was able to run mkfs.ext4. |
This does absolutely fix the issue, but it persists for those stuck on the 32-bit kernel (in my case, LibreELEC). |
LibreELEC has switched to 64-bit kernel. See: LibreELEC/LibreELEC.tv#5507 |
I faced this issue as well. During large downloads, the OOM reaper would reap basically all of my load-bearing daemons. I installed the 64-bit kernel and started using that, and now I have none of these issues. My downloads are also much faster and the kernel now seems to fill its cache correctly (before, the use of cached memory was all over the place). |
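Since several reports here come down to 32-bit versus 64-bit kernels, a small helper for telling them apart may be useful. This sketch maps the machine string from `uname -m` (`armv7l` in the 32-bit reports above, `aarch64` on a 64-bit kernel) to a word size. The function name `kernel_bits` is hypothetical.

```sh
# Map a `uname -m` machine string to the kernel's word size.
# With no argument, the running kernel is inspected.
kernel_bits() {
    case "${1:-$(uname -m)}" in
        aarch64|arm64) echo 64 ;;
        armv6l|armv7l|armv8l) echo 32 ;;
        *) echo unknown ;;
    esac
}

# Usage:
#   kernel_bits
```

Note that this reports on the kernel only. A 64-bit kernel can run with a 32-bit userland, which is the combination some posters here switched to.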
Describe the bug
Writing to a USB3 SSD disk results in the kernel out-of-memory reaper killing random processes
To reproduce
Boot Pi 4 from SATAII SSD HDD on USB3 with quirks mode enabled
Run
sudo dd if=/dev/zero of=~/test.tmp bs=500K count=8024
Run
dmesg
Expected behaviour
No processes killed due to OOM reaper
Actual behaviour
Bunch of core system processes are killed by the OOM reaper.
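To put a number on "a bunch of processes" after a reproduction run, the kernel log can be filtered for reaper messages. A minimal sketch, assuming the kernel's usual `oom_reaper: reaped process <pid> (<name>), ...` log format; the function name `count_oom_reaped` is hypothetical.

```sh
# Count OOM-reaper lines in kernel log text read from stdin.
# `|| true` keeps the exit status at 0 when grep finds no matches.
count_oom_reaped() {
    grep -c 'oom_reaper: reaped process' || true
}

# Usage:
#   dmesg | count_oom_reaped
```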
dmesg output
slabtop output
System
Forum troubleshooting thread - link