Further Adventures With The Homeserver

Honestly, I just need to learn to leave good enough alone.

In an update to the previous Home Server Saga: it crashed again. Same issue: systemd-journald dumped core and the system was unresponsive. I had noticed during my previous log reviews that every time, right before journald dumped core and the system froze, Plex was running its Chapter Thumbnail generation. The strange thing was that on each boot it was Plex running Chapter Thumbnail generation against the SAME FILES. Every night at 2 AM it would generate new thumbnails of the same 40 or so files. This didn't seem normal or productive, so I turned off Chapter Thumbnail generation in the Plex settings and let the system run.

Well, a few days later I logged into a fully responsive Linux system to do my morning syslog perusal, and I saw that systemd-journald had dumped core three times overnight. Each time the process was killed and restarted successfully. It seems that Plex tying up disk I/O by endlessly regenerating the same thumbnails is what had been freezing the system the previous times.
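(Side note, for anyone following along at home: assuming systemd-coredump is what's catching core dumps on the box, the crash history is a one-liner to pull up, and the second command shows errors from the previous boot, which is where I was doing my digging.)

coredumpctl list systemd-journald   # every recorded core dump from journald
journalctl -b -1 -p err             # errors and worse from the previous boot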

Now is when a normal human would leave well enough alone!

NOT ME! I saw in the logs that Plex Transcoder was getting killed for "out of memory" issues around the same time:

<hostname> kernel: Out of memory: Killed process <PID> (Plex Transcoder) total-vm:15262104kB, anon-rss:15212436kB
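(If you want to fish the same lines out of your own journal, the OOM killer messages come from the kernel, so -k narrows it right down; swap in -b -1 to look at the previous boot instead.)

journalctl -k | grep -i "out of memory"   # OOM killer activity from the current boot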

Well, if the Plex Transcoder is running out of memory, maybe I should enable HugePages to help it out!

Dear Reader: Let me tell you that I had 0 clue what HugePages actually were. I thought I was enabling 100 MB of huge pages with sudo sysctl -w vm.nr_hugepages=102400 and checked htop to see how that went. Imagine my surprise when I saw 11.9 GB of my 15.6 GB of RAM in use, when the system normally sits between 730 and 900 MB. Getting scared, I changed it to 51200 (50 MB, or so I thought) and added a line to my sysctl.conf file to do the same. Still seeing high memory usage, I panic-rebooted the machine.
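What I know now, and what a thirty-second check would have told me then: vm.nr_hugepages is a count of pages, not a size in MB, and on a stock x86_64 kernel each huge page is 2 MB. So 102400 of them is a request to pin roughly 200 GB of RAM on a 15.6 GB machine.

grep Hugepagesize /proc/meminfo      # 2048 kB on a typical x86_64 box
grep HugePages_Total /proc/meminfo   # how many pages the kernel actually managed to reserve
echo $((102400 * 2 / 1024))          # what I asked for, in GB: 200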

I could not have done a worse thing.

The actual play would have been to run sudo sysctl -w vm.nr_hugepages=0 and go on with my day. Instead, by adding that line to my sysctl.conf file and rebooting, I told the kernel to reserve those huge pages at boot, dooming my system to effectively NO RAM for the foreseeable future. This was at 9:30 this morning.

Once my ping told me the machine was back up, I SSH'd in. The time on my status bar said 9:48 AM. At about 10:16 AM the password prompt showed up and I entered my password.

AT 11:07 AM EST I FINALLY GOT A COMMAND PROMPT

A very zippy one hour and 19 minutes to complete a login. My computer was dying. I quickly found that pressing up for command history didn't work; it gave me garbage input, as did backspace. Having no way to delete the garbage, I had to press enter and wait for the computer to chew through it and give me a fresh command line. After probably an hour I got a fresh prompt, and I painstakingly typed in the command to disable HugePages. I could enter about 4 or 5 characters into the buffer and count on them showing up, then enter the next few. I eventually got it all entered and pressed enter. It didn't look like it did anything, so I pressed enter again.

So about an hour later I got the second prompt, this time for my sudo password. I entered it, and a new command line showed up almost immediately. My OhMyZSH prompt told me it took ONE HOUR AND TWENTY-SIX MINUTES AND SIXTEEN SECONDS to run that command (including the botched double enter, which made me sit through a second sudo prompt). Now the machine was zippy again, and I removed the line from sysctl.conf to make extra sure I don't run into that again.
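For anyone who ends up in the same hole, the whole fix is two commands, assuming the offending line went into /etc/sysctl.conf like mine did (the sed is just a lazy way to delete it; any editor works):

sudo sysctl -w vm.nr_hugepages=0                    # release the reservation right now
sudo sed -i '/vm.nr_hugepages/d' /etc/sysctl.conf   # and stop it from coming back at the next boot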

So now I'll do what I should have done in the first place and just leave the system alone to see how it runs down the road. If journald crashes every few days but recovers? I'm fine with that. As long as the system doesn't lock up, it can do its thing!
