Home Servers And Me: Dear God, What Have I Gotten Myself Into

No seriously, WTH.

15th Sep 2023

IN THE BEGINNING

I have, for DECADES now, had a personal server of one sort or another to run the web pages and ill-conceived projects that come into my head. The first dedicated server was an old Dell Pentium 3 workstation that I got from an office move that I was contracted to help on. They had heaps of these old machines after everyone got upgraded to newer ones, so I got to take one home and I installed Mandrake Linux on it. This original install eventually failed and I replaced it with FreeBSD 5.2. It ran through 2 moves and was eventually replaced with a homemade Athlon 64 X2 machine that I installed Gentoo Linux on (for my sins). Eventually this machine died and the files from it went into a virtual machine on a friend's server, which then died. I only have a Wayback Machine link for the virtual machine version, but the "a brief history of scummbox.org" page mentions the Athlon X2 and Gentoo version existing in 2009. There was a brief React.js version in 2013 where I was very angry at a lot of corporations, apparently. For the last while it has been a DigitalOcean Droplet that I have run this very blog from!

A cloud machine is fine for webpages and whatnot, but there are some things you want to run in your home network.

Enter: Plex.

Plex is like a "roll your own Netflix" sort of service. You start by ripping all your DVDs of TV programs and movies to a storage medium somewhere (like a Network Attached Storage device) and then you add those directories to Plex as "Libraries". Plex will then process the files, matching them to TVDB and various Movie Database entries to pull down metadata (actors, synopsis, posters, etc.) which it then presents in a very Netflix-y way. You can then point a client on a Smart TV or an app on your computer, or even browse to a webpage in a browser, to play the media that you have added. It's really cool!

It's also pretty resource intensive on the host machine, especially when you get into transcoding media. (Transcoding is when you take a video intended for, say, a 1080p screen and resize it to work on a 480p screen. Plex has to do this on the fly and send the transcoded bits to your client.)

I used to run Plex on the same Synology NAS that the ripped videos were stored on. This was fine when it was all 720p content going to 720p or 1080p devices. But once I started ripping 1080p and 4K content the NAS' CPU really struggled to keep up. I clearly needed a home server to run Plex for me.

As luck would have it I got a Mac Mini (the Late 2014 hardware refresh, though this particular unit was slightly newer than that) for nearly no money, so I used that as my Plex server with only a few minor hiccoughs. Every so often the CPU would pin itself for no well-explained reason and it would sound like a jet engine and Plex would stutter like crazy. I found that a quick reboot would sort that out, and it handled the Plex load for a number of years... UNTIL the curse of the Fusion drive reared its ugly head.

For those not in the know, a "fusion" drive is a regular spinning rust hard drive with a small amount of flash storage nailed to it. In theory the frequently accessed OS bits would go on the flash for speedy access, and the rest would go on the slower spinning rust. Sometimes these drives were in a single 2.5" laptop hard drive package, and sometimes (like in the Mac Mini) it's a logical volume containing a small discrete flash drive and a laptop-sized hard drive. The great part about this arrangement is that sometimes, out of the blue, macOS kills this logical volume. The only solution, apparently, is to smile and completely re-install the OS and start from scratch. After a prolonged period of swearing, I got the WRONG Internet Recovery started and re-installed a version of OS X (not macOS) so old that Safari couldn't open the newfangled TLS-backed HTTPS sites. It was a nightmare to get brought up to "current" macOS versions (not too current, though. Apple has decided that even though the Late 2014 Mac Minis can run macOS Ventura, it would drop support for it on them anyway.) Once that was done, I installed Plex and the other services I used on the Mac and went my "happy" way.

Until a few weeks later when the logical volume died again after a reboot. There was no way I was going through rebuilding the logical volume and re-installing the OS again, so I went to Amazon and ordered a "known good" 2.5" laptop SSD by Samsung. Once I got that I found out that you need a special tool to extract the mainboard from a Mac Mini. Having had way more than my fill of Apple at this point I just muscled the mainboard out and took out the offending spinning rust drive. I got the new drive installed in its place, got the Mini back together and proceeded to install from the more correct Internet Recovery, which got me to within one major macOS version of current. I re-re-installed Plex and the other services I used and went my "disgruntled" way.

Until the VERY NEXT TIME I REBOOTED THE MACHINE! It's at this point I should tell you that Apple doesn't want you knowing what a Mac does at boot, no way no how. If an Apple logo shows up, then it found the hard drive with the OS installed on it. If a progress bar shows up, it found the directory containing the OS. There is nothing that tells you what "the progress bar sits at exactly half-way through for 12+ hours with no further progress" means. The only way I was able to do anything was by REINSTALLING THE OS AGAIN. Weeping tears of rage I initiated the WRONG Internet Recovery AGAIN and ended up on the ancient version of OS X. I managed to get it upgraded again (which included rebooting to go from OS X to macOS) and then once it was up I installed Plex and the other services I used and went on my "disgusted" way. This time I swore I would never reboot the machine again, no matter how loud its fan screamed.

It locked up not a week later.

So after destroying the Mac Mini with a 5 pound mini-sledgehammer (this is an exaggeration, I unplugged it and put it in the storage room to be silently reviled for all time) I went looking for something to replace it as the home server. I have been looking at various NUCs (Next Unit of Computing) since Intel announced them in 2012. There was just something compelling about a tiny computer that packed a somewhat large punch! The only problem is that the Intel NUCs were REAL expensive, and not many 3rd parties made them in the early years. Well, in 2023 that is no longer true. Lots of companies now make NUC-form factor mini computers, and I had been keeping tabs on them. One name that came up often was Beelink. They made a lot of different NUCs in different CPU flavors (Intel Celeron, Intel i5, Intel i7 and AMD Ryzen) and were generally well regarded in the Reddit homeserver and Plex subreddits. After some digging I found that Plex's HDR to SDR tone mapping (basically stripping out the High Dynamic Range additions and re-rendering them in Standard Dynamic Range for displays that don't handle HDR) would ONLY offload to the GPU on Intel systems with Intel GPUs, and only running under Linux. With this guiding my choice, I went to Amazon!

I ended up selecting the Beelink SEi 11 Pro 11320H. (I would normally insert an Amazon link there, but I have just noticed that Amazon automatically redirects the URL for the SEi 11 to the SEi 12! Sneaky!) It has an Intel i5-11320H processor that can run at up to 4.5 GHz, an Intel Iris Xe GPU, 16 GB of dual-channel RAM and a 512 GB NVMe drive with Windows 11 Pro pre-installed. This last bit was very unexpected, but nice! Once I had downloaded the recovery environment and saved it to a USB key, I went about making a Debian 12.1 "Bookworm" USB installer and wiped the tiny computer. I have been running Debian on my machines for ages, ever since I stopped using Ubuntu. It's a robust OS with a package system I really like.

With me being me, I couldn't do a straightforward install. Normally you would either do a "headless" install (i.e. no monitor, keyboard or mouse connected to the server, just command line goodness) or a Desktop Environment install (an X Window server, a window manager and all the GUI applications that come along with it). But me? I want a headless server that runs X but only via a VNC server. I selected the Desktop Environment option and chose KDE Plasma 5 as the Window Manager (hey, my Steam Deck uses KDE Plasma 5 in desktop mode, it should be familiar at least!) as well as the SSH server option and then let Debian go to town! Once the partitioning and formatting of the drive were done, it copied all the initial packages to the new filesystem and then restarted!

Now the familiar joy of updating the APT database, updating the pre-installed packages and installing the other packages I would need. Let's see... Nginx for serving web pages... Wine for running that one Windows program I need under Linux... of course Plex as well as some other odds and ends. I changed the default shell from Bash to Zsh and installed Oh My Zsh and my preferred theme, Powerlevel10k. Oh yes, and a VNC server. I've used TightVNC Server in the past, so I installed that one. At this point I did a bit of googling about how to get TightVNC Server working with KDE Plasma and... Huh. It seems like any VNC server and KDE Plasma don't work no how! Well, theoretically there is a KDE Plasma VNC Server package that you can install, but then you have to be running KDE Plasma as the default interface, and I want the machine to boot to the command line. Clearly I should have done my research BEFORE installing the Window Manager!

So I fell back to my "old faithful", XFCE4. It's a pretty lightweight (i.e. low resource usage in most situations) window manager and I used it a LOT back when I ran Gentoo Linux, so that should work! There are plenty of tutorials on getting XFCE4 working with TightVNC Server in a systemd environment, so getting it up and running was easy!

So now I have a little (but powerful!) server running Debian 12 hosting Plex as well as running a VNC server for when I need to access a GUI. Perfect!

Well, it would be perfect if the machine didn't lock up after a few weeks.

This was probably my fault. I couldn't leave good enough alone! I had decided to migrate from having Plex installed as a package in the main OS to having it run as a Docker container. Docker is sort of like Virtual Machines, except instead of running an entire OS and the applications installed in that OS, it basically just runs the applications. This allows you to run a lot more applications in the same amount of processing space that a single VM would take up. Not long after moving Plex to Docker the machine was unresponsive one morning. I had my partner (who works from home) restart it, and when it came back up it let me SSH in, but when I tried to do anything it said that it was out of disk space. I knew this wasn't true, as df -h was telling me less than 10% of the / partition was being used. After a lot of googling I finally figured out that the inodes for my root (/) partition were 100% used up! (df -ih is invaluable for diagnosing inode issues!) After poking around it looked like a Docker container update for Plex went wrong and ended up using all the available inodes for my filesystem!
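Here is a minimal sketch of that kind of inode hunt, assuming the GNU find and coreutils that Debian ships. df -ih is the command mentioned above; the helper name count_entries_per_dir is made up for this example and is not a standard tool.

```shell
#!/bin/sh
# Hypothetical helper: for every immediate subdirectory of $1 (default /),
# print how many filesystem entries live under it (the directory itself
# included), largest first. Every file, directory and symlink costs one
# inode, so the biggest counts are the prime suspects.
count_entries_per_dir() {
    top="${1:-/}"
    top="${top%/}"                  # "/" becomes "", so the glob below is "/*/"
    for dir in "$top"/*/; do
        [ -d "$dir" ] || continue   # skip the unexpanded glob if nothing matched
        # -xdev stays on one filesystem, because inodes are counted per filesystem
        count=$(find "$dir" -xdev 2>/dev/null | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$count" "${dir%/}"
    done | sort -rn
}
```

Something like df -ih / confirms the filesystem really is out of inodes (IUse% at 100%), and count_entries_per_dir /var/lib run as root then narrows down where they went. Drilling into the biggest directory and repeating usually finds the culprit in a few steps.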

I attempted to clean this up manually, but purging the unused images in Docker only got it down to about 75% inode use, and that was WAY too high! I purged all the Docker images (after removing the containers) and this got it back to a much more sane 2% inode usage. At this point I decided to back away from Docker and go back to just running Plex as a package. Problem solved!
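Roughly, that kind of cleanup looks like the sketch below. The docker subcommands are real, but the wrapper function purge_docker is a name invented for this example, and this is a reconstruction rather than the exact commands from back then. Be warned that it deletes every container and every image on the host.

```shell
#!/bin/sh
# Hypothetical wrapper. DESTRUCTIVE: removes all containers and all images.
purge_docker() {
    # IDs of every container, running or stopped
    containers=$(docker ps -aq)
    if [ -n "$containers" ]; then
        # Unquoted on purpose so each ID becomes its own argument
        docker rm -f $containers
    fi
    # With no containers left, --all removes every image, not just dangling ones
    docker image prune --all --force
    # Summarize what Docker still holds on disk
    docker system df
}
```

Running df -ih / before and after calling purge_docker shows how many inodes came back.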

Well, no. The machine became unresponsive again.

It turns out that it had been core dumping almost endlessly without me knowing it. The Wi-Fi driver that Debian uses for Intel Wi-Fi cards wasn't 100% compatible with the Wi-Fi chipset used in the machine, and it was just core dumping over and over again, filling up the /tmp directory. Thus began my holy war against the Wi-Fi driver, which ended shortly thereafter once I found I could blacklist it in systemd and that made Debian ignore the card entirely. I also found out that when I installed XFCE4 it cheekily made itself the default boot option, and that it might have been applying sleep or hibernation rules that were making the computer unresponsive. So I set Debian back to boot to the command line, and gutted all possible sleep or hibernation options that I could find.
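For reference, a sketch of what those fixes can look like. The exact mechanism I used isn't recorded here, so this shows the usual Debian route for ignoring a driver (a modprobe.d blacklist file) and assumes the stock Intel driver module name iwlwifi. The function names and the file name in the usage note are made up for illustration; the systemctl subcommands and target names are standard.

```shell
#!/bin/sh
# Print a modprobe.d snippet that stops the kernel auto-loading one module.
# The module name is an argument; iwlwifi is only an assumed example.
blacklist_conf() {
    printf '# Do not auto-load the %s driver\n' "$1"
    printf 'blacklist %s\n' "$1"
}

# Boot to a text console and make every sleep state unreachable.
# Needs root; nothing here runs until you call it yourself.
boot_to_console_and_never_sleep() {
    systemctl set-default multi-user.target
    systemctl mask sleep.target suspend.target hibernate.target hybrid-sleep.target
}
```

Usage would be along the lines of blacklist_conf iwlwifi | sudo tee /etc/modprobe.d/blacklist-wifi.conf, then sudo update-initramfs -u and a reboot. The second function is best run from a root shell.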

I went on a little detour (now that I was looking at the journalctl logs frequently I was noticing a lot of errors and warnings in there!) getting Bluetooth disabled and getting PipeWire audio working on the system. Listen, I know I will never have any speakers hooked up to it, but it's the principle of the thing, ok? It also cleaned up the logs a lot! Pro Tip: to get PipeWire working with XFCE4 running via VNC you have to add export XDG_RUNTIME_DIR=/run/user/<UID> (where <UID> is your numeric user ID) to your .vnc/xstartup file. With all that done, I could finally sit back and

It was unresponsive AGAIN the next day.

Dear reader, I was perplexed! I had spent DAYS neck deep in system logs and had htop running constantly, monitoring the CPU and memory loads. But perversely this is what I love about running my own server. Chasing things down, following (hopefully new, but oftentimes old) forum threads to see what other people have tried, and just poking at hundreds of different things to see how to get things running the way I wanted them to! It looked to be disk space issues related to inodes running out again, so I started to plan to move the whole filesystem from the venerable, but old, EXT4 to the newer BTRFS, which has no fixed inode limit but is also potentially less stable. (I choose to call it B-Tree FS, like back in the olden days when I first started following its development.) While researching the best way to go about doing this migration I was also combing the logs and I found a LOT of the ancillary bits of XFCE4 were core dumping over and over again. I decided it was time to change the VNC window manager again!

So to cut this part short, I tried a lot of old favorite window managers, and none worked how I wanted them to. I tried Enlightenment (It ran but it refused to put window controls on any program that wasn't built in to Enlightenment. This really stung because in the late 90s Enlightenment was my JAM!), Blackbox (I am a bare-bones window manager enjoyer, but it was too bare bones even for me!), Fluxbox (an old favorite that I used to run from the early 2000s, but the method for configuring it is now far too cryptic for me to try to use it day-to-day), and then AwesomeWM (I hadn't heard of this one before, and it would have worked great, if the main window manager process didn't crash every hour or so. Seriously, the background would be loaded and the cursor would move, but you couldn't access the application menu or click on anything, the window bars would be gone and you'd have to restart the whole VNC server! I spent the most time trying to make this one work.)

While I was on the Window Manager carousel, I also read about changing /tmp from just a directory in / to its own tmpfs (a type of file system that is basically a RAM drive: after every reboot it's gone and is recreated from scratch), so I did that. This seems to have kept the inodes on the root filesystem from getting used up.
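One way to do that is an /etc/fstab entry, sketched below. The 2G size cap is an arbitrary example value and the helper name is invented; pick a size that fits your RAM.

```shell
#!/bin/sh
# Print an /etc/fstab line that mounts /tmp as a RAM-backed tmpfs.
# $1 is the size cap; the 2G default is only an assumed example.
tmpfs_fstab_line() {
    size="${1:-2G}"
    # mode=1777 keeps /tmp world-writable with the sticky bit set, as usual
    printf 'tmpfs /tmp tmpfs rw,nosuid,nodev,mode=1777,size=%s 0 0\n' "$size"
}
```

Appending the output to /etc/fstab (tmpfs_fstab_line 2G | sudo tee -a /etc/fstab) and rebooting puts /tmp in RAM. Afterwards df -ih /tmp shows it with its own inode pool, separate from the one for /.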

This brings us to this week, and once again the machine was locked when I woke up. I had a monitor connected to it, and the last thing printed on the screen was the systemd syslog journaling system, journald, core dumping. So it seems like the journal log got too large and caused the process to dump its memory to disk? After EVEN MORE googling, I created a journaltrim.service and journaltrim.timer for systemd to run. So every day at 1am the timer trips, running the .service file which tells journalctl to "vacuum" its log down to 3 GB in size. This trims the journal down to the most recent 3 GB of logs, and will hopefully keep it from core dumping again.
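A sketch of what such a pair of units can look like. systemd service units use the .service suffix, so the sketch names them journaltrim.service and journaltrim.timer. The unit contents are my best reconstruction, not a copy of the real files, and the two shell function names are invented so the text can be piped wherever it needs to go.

```shell
#!/bin/sh
# Print the service unit: a one-shot job that trims the journal to 3 GB.
journaltrim_service() {
    cat <<'EOF'
[Unit]
Description=Trim the systemd journal down to 3 GB

[Service]
Type=oneshot
ExecStart=/usr/bin/journalctl --vacuum-size=3G
EOF
}

# Print the timer unit: fires every day at 01:00 and starts the service above.
journaltrim_timer() {
    cat <<'EOF'
[Unit]
Description=Daily journal trim at 1am

[Timer]
OnCalendar=*-*-* 01:00:00
Unit=journaltrim.service

[Install]
WantedBy=timers.target
EOF
}
```

To install them, something like journaltrim_service | sudo tee /etc/systemd/system/journaltrim.service (and the same for the timer), followed by sudo systemctl daemon-reload and sudo systemctl enable --now journaltrim.timer. An alternative that needs no timer at all is setting SystemMaxUse=3G in /etc/systemd/journald.conf, which has journald enforce the cap itself.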

Once I had that done, I admitted defeat in the Window Manager War and switched back to XFCE4 as my VNC Window Manager of choice. In order to fix up some errors that were showing up around XKEYBOARD, I switched to TigerVNC server (apparently it was a fork of the aborted TightVNC 4 server) which is supposed to not generate those errors. For some reason re-installing XFCE4 led to a LOT fewer core dumps by its subsystems (except for light-locker, which is a screen locker for X that I was never going to use, so I uninstalled it). Naturally by changing VNC servers my TightVNC server xstartup file no longer actually started up XFCE4, so more googling was required. This is what I ended up needing to get XFCE4 to start properly with TigerVNC:

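#!/bin/sh
# Don't inherit a session manager or D-Bus session from the shell that launched VNC
unset SESSION_MANAGER
unset DBUS_SESSION_BUS_ADDRESS
# Replace <UID> with your numeric user ID, as printed by `id -u`
export XDG_RUNTIME_DIR=/run/user/<UID>
# Blocks until the XFCE4 session ends; the lines below only run after that
/usr/bin/startxfce4
[ -x /etc/vnc/xstartup ] && exec /etc/vnc/xstartup
[ -r "$HOME/.Xresources" ] && xrdb "$HOME/.Xresources"
x-window-manager &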
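The number under /run/user is the numeric user ID of the account running the VNC session, so instead of hard-coding it the export line can ask id -u for it. A small sketch, assuming a systemd-logind system like Debian where that directory layout applies:

```shell
#!/bin/sh
# Point XDG_RUNTIME_DIR at the per-user runtime directory that
# systemd-logind creates, using the current user's numeric ID.
XDG_RUNTIME_DIR="/run/user/$(id -u)"
export XDG_RUNTIME_DIR
```

That way the same xstartup file works unchanged for any user on the machine.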

One other nagging error I would get on boot was unable to create directory '/run/user/<UID>/dconf': Permission denied. dconf will not work properly. It turns out that with no actual user logged in at boot time, when VNC would start up and ask for the dconf directory, it wouldn't be there. The solution I found was running sudo loginctl enable-linger <USERNAME> which tells systemd that my user account is to be loaded at boot and persist until the server shuts down. And lo and behold, no more errors on startup!
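To confirm the setting took, loginctl can be asked for the user's Linger property. A sketch; the function name linger_enabled is invented for the example:

```shell
#!/bin/sh
# Succeeds (exit status 0) when lingering is enabled for the given user.
linger_enabled() {
    # show-user prints "Linger=yes" or "Linger=no"; any error counts as "no"
    [ "$(loginctl show-user "$1" --property=Linger 2>/dev/null)" = "Linger=yes" ]
}
```

For example: linger_enabled "$USER" && echo "lingering is on".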

So now it's just a waiting game. I hope the system is stable now, but there are no guarantees, and a new, startling failure seemingly awaits every morning! Since this post is now novel sized, if there are further issues, I will write them up in a new blog post. I apologize for not having EVERY solution documented in this blog; a lot of this happened in mid-to-late August, and writing this stuff up from vague memories and terse Discord posts is difficult. I've given the broad strokes, and googling errors from your syslog is your friend!
