Category Archives: Linux

Raising Software: A Tale of Unruly Tools

One of my retirement hobbies has been shooting and editing videos, mostly of our bicycle adventures.  When I was young, back in the middle of the 20th century, one of my uncles had a movie camera and projector, and I learned to splice broken film and edit clips together with razor blades and transparent tape.  He moved away and I grew up, and the skill was filed away as just another data point in How The World Works.

Fast forward to the second decade of the 21st century, when I decided to strap a cheap digital camera to the handlebars of the bicycle.  Digital editing requires software, a much less messy and more forgiving (i.e., non-destructive to the original “film”) process.  Since we use Linux exclusively as our computing platform of choice, there were a number of choices for Open Source video editing projects, mostly attempts to clone commercial video editing software for Windows and Apple.

I looked at Cinelerra and Kdenlive, which are fairly complex tools with a steep learning curve, but settled on OpenShot, a simpler tool with a lot of attractive features and a cleaner, no-frills user interface.  OpenShot 1 was essentially a one-man project, a user interface written in Python with the GTK graphical toolkit, built on the Unix principle of lots of little programs working together, using the multimedia command-line editor ffmpeg and associated libraries, the Inkscape vector graphics editor, and the Blender animation tools and libraries.

Openshot 1.4.3

OpenShot made it possible to load up a project with multiple clips, arrange them on a timeline, trim the ends or snip out shorter clips, and add titles and audio tracks, like voice-over and musical scores, in multiple overlaid tracks, and turn video and audio on and off by track and clip.   For several years, this worked fine.  However, success is not always a good thing, and Openshot suffered from it.  Not in the usual riches-to-rags story of an entrepreneur or rock star who descends into excess and loses his way, but in the attempt to seek wider appeal.

OpenShot was originally a purely Linux product, as mentioned.  To port the project to Windows, it was necessary, for a project with limited manpower resources, to keep a common code base.  OpenShot attempts to keep a smoothly flowing user interface through parallel processing, using OpenMP.  The Windows philosophy is based on a single-user, single-task model rather than the multi-user, multi-tasking model of Unix.  When Windows evolved into a multi-tasking system, it used the pre-emptive model, which is relatively inefficient for the pipelined processing flow in the Unix cooperative model.  So, Windows applications tend to be monolithic, with all resources loaded in one huge process.  Parallel processing in Windows monolithic applications is accomplished largely through threads, rather than through inter-process communication protocols.  I’ve programmed with threads in Linux, which tends to be tricky at best, and takes a thorough knowledge of parallel processing and memory management to do successfully.

The move to a Windows-compatible architecture necessitated rewriting a lot of the Unix-specific standard library code in C++, which introduces the danger of memory-management issues.  OpenShot began to get buggy, with newer versions crashing often.  The developers claim it is the fault of unstable system libraries, but I’m not buying that explanation.  Since the user interface was also getting a major overhaul, work on version 2 meant that no more bug fixes were forthcoming for the now-crippled version 1.4.3.  Alas, initial releases of version 2, with the back end still largely the 1.4 code base, were also prone to crashing, as well as presenting an unfamiliar user experience.

Openshot 2.4.1

So, we stayed with version 1.4.3 for a while longer, with short auto-save intervals.  Finally, with crashes and deadlocks rampant, we just had to try version 2 again.  Yes, the crashes had been largely fixed, but the new version was a monolithic package wrapped in a “launcher” (AppImage), apparently a type of container tool including all of the library dependencies, which rapidly ate up all available memory and much of the swap space, becoming so slow as to be indistinguishable from deadlock.  Memory leaks come to mind when seeing this type of behavior.  On top of that, some of our favorite tools for controlling video blanking and audio muting by track were missing, to be restored by popular demand in a future revision.  Back to 1.4.3.

Kdenlive

The other alternative, Kdenlive, based on the K Desktop Environment (KDE, not native to Ubuntu, thus necessitating loading the complete KDE library support suite), is yet another learning curve, with many differences in editing features and rendering options.  We did use this for one video, as the internal de-shaking algorithm is a bit more efficient than reprocessing the clips with command-line utilities, and we had a bad experience with a new camera mount that was sensitive to shaking.  Kdenlive also crashes from time to time, lending some credence to the OpenShot claim that the system libraries are at fault.

But, I continue on, putting up with slow response, freezes, and crashes, because I’m familiar with the features I like, and it produces acceptable videos.  I may spend some time to learn Kdenlive, but, hopefully, OpenShot 2 will improve over time.  The other alternative is to try to build a native Ubuntu version from source, which is a daunting task, since most open source software has very specific version dependencies on its supporting software.  Despite the woes with OpenShot’s growing pains, it is still faster than writing command-line scripts to use the Linux multimedia-processing utilities ffmpeg or avconv to trim and assemble video clips and sound files.  I use those command-line tools to assemble time-lapse videos from my home-built security camera system, but that is much simpler.

Affordable Linux: A Quest

As most readers know, we here at Chaos Central are a Linux establishment for our primary computing.  We use Apple iPhones and iPads for mobile convenience and casual surfing, but depend on Linux for everything else–with the exception of what has become known at CC as “The Screaming Season,” where we are dependent on Windows to run Turbotax to render unto Caesar.

As much as we’d like to be on a 3 to 5-year cycle for computer upgrade/replacement, retirement, and before that, negotiated fee schedules, have reduced that to “when it breaks, MacGyver it.”  The result has been dismal.  When Judy’s desktop Linux machine died last year, having already received a CPU cooling fan transplant several years ago, salvaged from an old machine before it went to the recyclers, we pressed our old 2007-vintage laptop back into service, restoring her files from backup.  That machine had had a new hard drive and more memory installed during its last incarnation, but the extra memory had faded away, and the CPU was not up to the demands of the 2016 version of Ubuntu Linux, so it was running slowly with the reduced-capability desktop.

Since the current crop of machines with pre-installed Linux is aimed at developers rather than standard office use, we couldn’t justify a new laptop from one of the few vendors who supply them.  So, as in the “bad old days,” we went shopping for a cheap Windows laptop on which we could install Linux.  The “cheap” machines are now essentially Microsoft’s answer to the iPad and Android tablets they see as their main competition–a light-weight, small-screen device with a small (32GB) flash memory (eMMC) instead of a hard drive, and no optical drive.  Not to worry, I loaded a light-weight version of Linux (Lubuntu) on a 64GB flash drive and it became her new machine, with the added advantage of being portable.  The big disadvantage was the need to plug in the Linux memory stick to run it.  The biggest problem was the short life span of flash drives when used as the primary drive on a Unix-like system.  And, the system was incredibly slow.

Well, in only a few months, the flash drive expired.  No problem, I thought, just build another and restore.  But, as many others have found, the ability to install in the first place was a fluke: these little Windows tablets masquerading as a laptop by having an attached keyboard are finicky: try as I might, I could not get it to boot from the install memory stick again.  So, the little blue Dell machine has become the new Turbotax home, effectively retiring the $80 refurbished Vista->Windows7->Windows10 desktop we acquired a couple years ago and built up (another $40 video card) to run Win10 (badly).

So, Judy was back to the old laptop.  This time, we pressed her old 2010 HP Netbook back into service, even slower and more clunky than the older Compaq (which she also used during this evolution), running a light-weight version of Mint Linux.  The office was beginning to look like the computer labs of the early 2000s, where we pressed piles of recycled obsolete machines into experimental compute clusters to get 21st-century performance out of 20th-century machines, a distributed computing architecture conceived in the mid-20th century but not economically feasible until there was a huge supply of retired machines awaiting landfill space.

A comprehensive check of Linux-capable systems was not promising.  The custom-built laptops (our new criterion is portability, to fit our mobile retirement life-style) are expensive and beyond our budget with no current paying clients.  My own laptop, which was high-end (and expensive) in its day (2010), is showing its age, with a too-small disk, weak battery, and some screen burn-in.  But, the equivalent new models are in excess of $2000, difficult to justify on a pension without contract income to support it.  Nevertheless, it is only a matter of time before it wheezes its last, so we need to first find an affordable and long-term solution for Judy’s immediate need while keeping in mind the future budget hit for a developer machine.  The other option for affordable Linux-ready or pre-installed laptops is a refurbished machine proven to be compatible.  The disadvantage there is the same one facing my situation: the batteries are in mid-life, so long-term reliability is an issue, as well as mobile portability.  Older machines are also heavier than desired.

So, we finally settled on a fourth option: installing Linux on a Chromebook.  Chrome OS is a laptop operating system from Google, a cousin of the Android system used on smart phones and tablets.  Since Chrome OS and Android are both built on the Linux kernel, it is relatively easy to add a full Linux distribution alongside Chrome OS using Crouton, which installs it in a chroot environment, a lighter-weight relative of the containers that Docker manages on native Linux machines.  I say relatively, but that assumes some heavy-duty skills in system administration, since Chrome OS is a locked system.  First, the system must be unlocked into Developer mode, which basically voids the warranty and security protections.  At least, Google does not provide support for a machine in this state.  The hacker community, then, is at the mercy of well-intentioned, but not always well-informed peers for advice.

I followed the available advice in the hacker forums into the corners that most others had painted themselves, finally figuring out an important missing step.  To use the Crouton system, the wily hacker must first use the official Google developer shell to set the root and administrator passwords.  Then, Crouton can access the administrative account to install the Linux container.  The less-helpful solutions offered essentially required reinstalling Chrome and starting over, which was not necessary at all, but the default obvious solution to developers and admins raised in the Age of Windows, where frequent rebooting and periodic reinstalling is considered a normal operational necessity.  Unix and Linux admins only reboot after installing a new kernel, which may be months or years, depending on the stability of the system and whether the management decides to incorporate the latest patches when they are released (many don’t, for fear it will break fragile applications, despite the security risk of not upgrading).

So, now Judy has a brand-new system that, hopefully, will be stable for a reasonable time.  The advantage of this system is that it is still Chrome, so she can use the user-friendly desktop apps available to Chrome, similar to the ones on Apple iOS, and switch to the Linux desktop with a simple key-combination to run the Linux applications we depend on.  A similar key-press combination switches back to Chrome.  And, the Chromebook, though large enough to have a full HD display, is light enough to be highly portable, with a very long battery life.  The minor concession to our ill-advised attempt to co-exist with Windows is that we now have a portable tax preparation machine.  During the rest of the year, we may turn it on once in a while to let the updates install and top off the battery, so that it won’t take three days to update next tax season.

Chromebook, running Chrome with Crouton in a browser window.

 

Chromebook, running Ubuntu Linux with LXDE, with a simple browser and essential apps, including Fiberworks Bronze weaving software running under WINE (a Windows compatibility layer running in a Linux container under Chrome–boggles the mind, no?).

Home-grown Webcam Evolution

Good, fast, or cheap: pick any two. A few years ago, I decided to build a webcam rather than buy one; commercial units were about $100, plus a monthly service charge for hosting the link in the cloud. I’m not sure I beat the cost, quality, or speed, but it’s kept me actively managing the system. Instead of a plug-n-play wifi-enabled little module, I have a rat’s nest of wires, USB hubs, USB external disk drives, a Raspberry Pi with an external camera on a ribbon cable, and, now, extension cords and a 50-foot CAT5e cable. About once a year, I wear out the flash drive that the system runs from, so there is some on-going cost. Plus, there is much coding in Python and Bash, a distributed network system to process the video, cron jobs, and an API key and code to get weather information and sunrise/sunset times to turn the camera on and off.
 
Meanwhile, the landscaping has grown up around the office window, so the camera sees mostly flowers and bees (left view). So, I moved it to the office closet, which was not so simple. 1) Being “cheap and fast,” the software wasn’t very “good,” so I had to modify the Python code to provide a way to restart the system during the day without losing all the footage: the system keeps a week’s worth of data, and erases last week’s when starting a new day. This also entailed generating images with a timestamp, rather than a simple index, as the camera software libraries start indexing at 1 each instance.
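For the curious, here is a minimal sketch of the restart-safe naming idea, not the actual webcam code. It assumes one directory per weekday and a YYYYMMDD-HHMMSS filename; the directory layout and the function names `frame_path` and `prepare_day` are made up for illustration.

```python
import datetime
import pathlib


def day_directory(root, now):
    """Directory for this weekday, e.g. <root>/day4 for a Friday (assumed layout)."""
    return pathlib.Path(root) / f"day{now.weekday()}"


def frame_path(root, now):
    """Timestamp-based image path, so a restart never reuses an index."""
    return day_directory(root, now) / (now.strftime("%Y%m%d-%H%M%S") + ".jpg")


def prepare_day(root, now):
    """Create today's directory and erase only frames left from last week.

    Frames whose names start with today's date are kept, so restarting
    the program mid-day does not discard footage already captured.
    """
    day_dir = day_directory(root, now)
    day_dir.mkdir(parents=True, exist_ok=True)
    today = now.strftime("%Y%m%d")
    for old in day_dir.glob("*.jpg"):
        if not old.name.startswith(today):
            old.unlink()
    return day_dir
```

Calling `prepare_day()` at every start-up, rather than only at the first start of the day, is what makes a mid-day restart harmless in this sketch.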
OK, that’s done, and the system retested, bugs fixed, etc., which ended up losing most of a couple of days’ surveillance: “cheap” means not having a second system for development and test, and “fast” means not doing a proper code review before testing, which leaves “good” out of the equation.
Of course, nothing ever goes smoothly: after moving the computer/camera, the USB hub, and the disks into the closet, we weren’t getting communication with the processor.  So, we dragged everything out next to the desk so we could hook up a console (keyboard, mouse, and monitor) to the computer, and retested with the original ethernet cable, then with the long one.  Everything worked, inexplicably, since nothing really changed except having the console hooked up.  We unhooked the console, moved everything, still running, back into the closet, then adjusted the camera view, and we were done–except for resetting the key agent so the computer could talk to the video processing computer.
Our program takes a photo every 10 seconds, sends it to the web server, then assembles a timelapse video once an hour, showing one hour in 30 seconds.  After letting the revised program run for a couple of hours, we checked the logs and directories: still showing last week’s video.  Aha.  The video compositor program needs a numerical sequence for the images in order to assemble a video: the timestamp doesn’t meet specification.  So, back to the drawing board, to rewrite the Bash script on the video processing computer to renumber the files in a format the video assembly utility understands.  Success at last.  The system is now fully functional, but made a bit more complex by the simple addition of a restart ability.
The results can be viewed at http://www.parkins.org/webcam
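The renumbering fix and the timelapse arithmetic can be sketched as follows. This is in Python rather than the Bash script that actually runs on the video processing computer, and it assumes ffmpeg as the assembly utility and YYYYMMDD-HHMMSS filenames; the function names and the five-digit numbering are illustrative choices, not the real configuration.

```python
import pathlib
import shutil


def frames_per_second(capture_interval_s, real_seconds, video_seconds):
    """Frame rate needed to play `real_seconds` of captures in `video_seconds`."""
    frames = real_seconds / capture_interval_s
    return frames / video_seconds


def renumber(src_dir, dst_dir, pattern="*.jpg"):
    """Copy timestamp-named frames to dst_dir as 00001.jpg, 00002.jpg, ...

    Sorting by name gives chronological order because the assumed
    timestamp format sorts the same way as text and as time.
    """
    dst = pathlib.Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    frames = sorted(pathlib.Path(src_dir).glob(pattern))
    for index, frame in enumerate(frames, start=1):
        shutil.copyfile(frame, dst / f"{index:05d}.jpg")
    return len(frames)


def ffmpeg_command(dst_dir, fps, output):
    """Build (but do not run) an ffmpeg call for a numbered image sequence."""
    return ["ffmpeg", "-framerate", str(fps),
            "-i", str(pathlib.Path(dst_dir) / "%05d.jpg"),
            "-pix_fmt", "yuv420p", output]
```

One photo every 10 seconds is 360 frames per hour, so `frames_per_second(10, 3600, 30)` works out to 12 frames per second for a 30-second clip.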

So, not fast, not good, and not cheap, when you consider the effort put into a custom, one-of-a-kind system. But, it keeps me in practice coding and designing.  And, because it runs on Linux, I can keep the security patches current: many purchased plug-and-play “appliances” have their code burned in at time of manufacture, and may be designed around already obsolete and buggy software.  My little system has undergone several major upgrades of the Debian Linux distribution core system (Linux kernel 4.9.35, patched 30 June 2017: latest release is 4.12) and gets regular security patches and bug fixes.  That’s even newer than my primary laptop (kernel 3.13.0, patched 26 June 2017).  Considering all the little Raspberry Pi machines scattered around the house, it may be prudent to work on configuring them for diskless boot, in order to preserve the flash memory chips on-board.

Not your plug-n-play webcam…

Beware the Wily Hacker: a Cautionary Tale

Not an approved method of securing your network...

A while back, we wrote about rebuilding a crashed Raspberry Pi system.  In the course of reinstalling the system (on a new chip–the old SD card that contains the operating system had “worn out”), we had made a fatal slip.  This system happens to be our gateway system, i.e., connected directly to the Internet to provide us access to our files and some web services while out of the office.  Unfortunately, this also provides the opportunity for the world-wide hacker community to try to break in.

Normally, we have safeguards in place, like restricting which network ports are open to the outside and which machines and accounts are allowed login access.  However, in our haste to get the new system up and running as quickly as possible, we connected the device to the Internet to download upgrades before the configuration was complete, meaning the system was exposed without full protection for several hours to several days.

Screenshot-larye@raspberrypi2: -var-log
One hour’s worth of break-in attempts by hackers. Note that attempts to access system accounts (root, pi) are denied because we don’t allow external logins for these accounts.  The accounts that are allowed are restricted to public-key authentication (basically, a 1700-character random password).  Attempts on this one-hour snapshot come from four different sources: Porto, Portugal; Shanghai and Baoding in China; and Tokyo, Japan (possibly hacked machine, as the name is spoofed).

Now, our security logs record, on a normal day, hundreds of break-in attempts (see screenshot above).  We aren’t the Democratic National Committee or Sony, just a small one-man semi-retired consulting business.  But, the hackers use automation: they don’t just seek out high-value targets, they scan the entire Internet, looking for any machine that isn’t fully protected.  If they can’t steal data or personal information, they will use your machine to hack other machines.  If you use Microsoft Windows, you are undoubtedly familiar with all sorts of malware, as there are many tens of thousands of viruses, trojan horses, adware, ransomware, and other malevolent software that invades, corrupts, and otherwise takes over or cripples your machine.  Unix systems are less susceptible to these common attacks, but, if an account can be compromised, or a bug in the login process exploited, eventually a persistent attacker can gain system privileges and install a ‘rootkit,’ a software package that replaces the common monitoring and logging software, redirecting calls through the rootkit, which hides its existence and activities from the reporting tools, even the directory listing utilities.
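To get a feel for the volume, a few lines of Python can tally the attempts by source. This is only a sketch: it assumes the "Failed password" line format that OpenSSH writes to the authentication log, and the sample lines, host name, and addresses below are made up (the addresses are from the ranges reserved for documentation), not taken from our logs.

```python
import collections
import re

# Matches the "Failed password" lines OpenSSH writes to the auth log.
FAILED = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)


def summarize(lines):
    """Count failed password attempts per source address and per account."""
    by_ip = collections.Counter()
    by_user = collections.Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            by_ip[match.group("ip")] += 1
            by_user[match.group("user")] += 1
    return by_ip, by_user


# Made-up sample lines for illustration only.
SAMPLE = [
    "Jul  1 14:00:01 gateway sshd[811]: Failed password for root from 192.0.2.10 port 40022 ssh2",
    "Jul  1 14:00:05 gateway sshd[812]: Failed password for invalid user admin from 192.0.2.10 port 40030 ssh2",
    "Jul  1 14:00:09 gateway sshd[815]: Failed password for pi from 198.51.100.7 port 51515 ssh2",
    "Jul  1 14:01:00 gateway sshd[820]: Accepted publickey for alice from 203.0.113.5 port 50000 ssh2",
]
```

Running `summarize()` over the real log file instead of `SAMPLE` would give the same kind of per-address tally; other rejection messages (for accounts that are simply not allowed to log in) would need their own patterns.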

Once an attacker takes over a Unix or Linux machine, there is no limit to the damage they can do on the Internet, as Unix/Linux is the basis of most of the servers on the Internet, and the machine can become a spam server, web-spoofer, or hacker-bot itself.  (Microsoft Windows Server has the rest, nearly half, and they are even easier to break into.)  I began to suspect this might have happened when normal system functions failed to terminate or run correctly.  We have a lot of custom software built on this machine, which runs on a scheduler.  The machine got slower and slower, and it was apparent that the jobs run by the scheduler were never exiting, filling up the process table with jobs that weren’t doing anything, except taking up space, which is always at a premium.  Clearing out all the jobs, restarting the machine, and starting the processes manually worked–until about 2:00pm.  Very suspicious: a rootkit, once installed, can repair or re-install itself even if the administrator restores many of the co-opted command files by normal upgrades or by a conscious attempt to recover from the intrusion.

The main problem seemed to be with /bin/sh, the system shell, which is actually /bin/dash, a shared object.  Cron, the scheduler, uses dash to run the jobs, where the normal user login shell uses /bin/bash, a non-linkable executable shell with similar functionality.  A rootkit is generally constructed as a filter, wrapped around the co-opted commands, so it would be easier to link to the *real* /bin/dash in an undetectable manner from the filter program than it would to wrap /bin/bash.  In this case,  assuming an intrusion was the cause, something went wrong, rendering dash non-functional.  Perhaps the intrusion was not compiled for the ARM  processor used by Pi, though most of a rootkit would be scripted to be portable among different CPU architectures and Unix/Linux versions.

An analogy to the problem would be finding out who let the horses out: it is easy to identify wolf or horse-thief tracks outside the barn when the door is barred, but, if you left it open and the horses have bolted, it is more difficult to find out what happened–the traces are covered. I did install some intrusion-detection software, but running it after the tracks are covered over is usually a futile effort.  However, there were enough questionable traces to warrant taking corrective action.  Besides, even if the problem had been caused by some inadvertent misconfiguration on my part (unlikely, considering the fact that the machine could be made to run for several hours before the problem reasserted itself), the solution was clear:  reinstall everything.
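The reason timing matters can be shown with a toy version of what file-integrity checkers do. This is a minimal sketch, not the intrusion-detection package I installed; the function names are made up, and a real tool also records ownership, permissions, and more.

```python
import hashlib
import pathlib


def fingerprint(paths):
    """Map each path to the SHA-256 digest of its current contents."""
    return {str(p): hashlib.sha256(pathlib.Path(p).read_bytes()).hexdigest()
            for p in paths}


def changed(baseline, current):
    """Return paths whose digest differs from, or is missing in, the baseline."""
    return sorted(path for path, digest in current.items()
                  if baseline.get(path) != digest)
```

The comparison is only as good as the baseline: a fingerprint taken after the break-in simply records the attacker's files as "normal," and a rootkit can also lie to the tools doing the reading. That is why the baseline belongs on a clean install, stored somewhere off the machine.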

The first step is to back up the data, including configurations.  Now, this is not just an ordinary computer:  Raspbian, the Debian-based operating system distribution designed for the Raspberry Pi computer, comes with a simple desktop intended to introduce new users to Linux.  But, this machine doesn’t use the desktop and is not even connected to a monitor most of the time: it is an internet gateway, web server, and custom webcam driver, so it has a lot of “extras,” both loaded from software repositories and written especially for this installation.  Backups are important, since much of the software only exists on this machine.  And, since we only have one camera, fail-over isn’t possible without physically moving the camera from one machine to another, not a trivial exercise, as the connection is on the motherboard rather than an external connector.

Now comes the glitch: Since the introduction of the Raspbian operating system, it has been based on Debian 7; but, since Debian 8 was recently released, a new version of Raspbian is also available.  So, the machine was rebuilt with Raspbian “Jessie”, replacing Raspbian “Wheezy” (the releases named after Toy Story characters rather than just the numbers–as with Apple OS/X, Debian releases tend to have names in addition to release numbers).  Installation on Raspberry Pi is not like other computers.  Since there is no external boot device, the operating system “live” image is loaded onto the SD card that serves as the boot device and operating system storage.  Initial configuration is best done without a network connection, since the startup password is preset and well-known.

So, avoiding that mistake (booting on a network with the default passwords, the single most preventable source of hacker intrusions), we booted with the network cable disconnected and a monitor and keyboard attached, changed the password and expanded the system to fill the SD card, set up the other user accounts, then shut down the system, removed the SD card, and mounted it on the backup server to finish transferring vital data, like the security keys and system security configurations.  In larger systems with a permanent internal boot drive, such “hacker-proof” installation is done on an isolated network, but, since the boot drive on a Pi is removable, it is easy enough to edit the configuration files on another system.

So, with the system fairly well hardened by securing the system accounts and user accounts, it was rebooted attached to the network and the system upgrades and extra software packages (like the web server) installed.  So far, so good.  But, since we upgraded the operating system, the server packages were also upgraded, most notably moving from the webserver, Apache, version 2.2 to version 2.4.  Apache has been the predominant web server software on the Internet for 20 years, so it is in a constant state of upgrade, for security and feature enhancements.  Between version 2.2 and 2.4, many changes to the structure of the configuration files  were made, so that not only did the site configuration need to be restored manually, but there was a fairly steep learning curve to identify the proper sequence and methodology by which to apply the changes.
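One well-known example of the 2.2-to-2.4 changes is access control: the old `Order`, `Allow`, `Deny`, and `Satisfy` directives give way to `Require` in 2.4. A small sketch like the following can flag leftovers in a restored site file; the function name and the sample configuration are illustrative, and this covers only that one category of change, not everything we had to fix.

```python
import re

# Access-control directives from Apache 2.2 that 2.4 replaces with "Require".
LEGACY = re.compile(r"^\s*(Order|Allow|Deny|Satisfy)\b", re.IGNORECASE)


def legacy_directives(config_text):
    """Return (line_number, line) pairs that still use 2.2-style access control."""
    return [(number, line.strip())
            for number, line in enumerate(config_text.splitlines(), start=1)
            if LEGACY.match(line)]


# Illustrative fragment mixing old and new styles.
SAMPLE_SITE = """<Directory /var/www/webcam>
    AllowOverride None
    Order allow,deny
    Allow from all
    Require all granted
</Directory>
"""
```

Here `legacy_directives(SAMPLE_SITE)` reports the `Order` and `Allow` lines but leaves `AllowOverride` and `Require` alone.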

Then, of course, there were the additional Python modules that needed to be installed to support the custom software, which involved downloading and compiling the latest versions of those, since Python 2 also upgraded from version 2.7.3 to 2.7.9 (we haven’t yet ported the applications to Python 3, which moved from version 3.2.3 to 3.4.2 between Debian 7 and Debian 8).  Finally, there were other tweaks, like comparing system configuration files to update group memberships for access to the camera hardware, loading the camera drivers, and setting file ownership and permissions for data and program files.

We could have saved most of this by sticking with Raspbian Wheezy, but eventually, support for older systems goes away, and the newer systems are usually more robust and faster: open source software evolves rapidly, with new minor releases every six months and new major releases every two years for most distributions, and an average life span of five years for maintenance of major releases and a year for minor releases.  As we said before, Linux is free, if your time is worth nothing.  The price of keeping current is constant maintenance.  Patch releases occur as they are available, with maintenance upgrades almost daily.

Finally, after a week of tweaking and fiddling, the webcam service is back up and running.  And, the security logs show break-in attempts every few minutes, from multiple sites all over the world (one from Portugal recently, others from unassigned addresses–ones with assigned addresses undoubtedly come from computers that have been compromised and used as hacking robots, as hackers don’t want to be traced back to their own computers, ever).

So, the moral of this post is:  don’t ever expose a stock, unmodified computer system directly on the Internet (which is difficult to do, when all upgrades, new software, etc., are available only through download from the Internet–which should be accomplished only from behind a proven firewall).  But, you can set passwords and change system accounts before joining a network.  And, if your computer is hacked, take it to a professional, and don’t grumble about the cost or time it takes to restore it.  Pay for malware scanning software and keep your subscription up to date, as well as scheduling upgrades on a regular basis.  And, if you are a professional, don’t take shortcuts (i.e., install and configure off-net or behind a firewall), keep good backups, install intrusion-detection software early, and check for security upgrades daily.  Change the default passwords immediately, create a new, weirdly named administrative user, and deny external logins for all administrative users.  Use two-factor authentication and public-key encryption on all authorized user accounts.  They are out there, and they are coming for your computer, even if you don’t have data worth stealing: they can use your computer to spread spam or steal data from someone else.
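Part of that checklist can be verified mechanically before the network cable goes back in. The sketch below checks the text of an OpenSSH server configuration for three of the settings implied above (no root logins, no password logins, public keys enabled). The function name is made up, and the check ignores `Match` blocks and compiled-in defaults, so treat it as a reminder rather than an audit.

```python
def audit_sshd_config(text):
    """Return warnings for settings that contradict the advice above.

    Only the first occurrence of a keyword counts, which mirrors how
    sshd itself reads its configuration file.
    """
    wanted = {
        "permitrootlogin": "no",
        "passwordauthentication": "no",
        "pubkeyauthentication": "yes",
    }
    seen = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            seen.setdefault(parts[0].lower(), parts[1].strip().lower())
    warnings = []
    for key, value in wanted.items():
        if key not in seen:
            warnings.append(f"{key}: not set explicitly")
        elif seen[key] != value:
            warnings.append(f"{key}: is '{seen[key]}', expected '{value}'")
    return warnings
```

Since the boot drive on a Pi is removable, a check like this can be pointed at the configuration file on the SD card while it is mounted on another machine.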

System Administrator Appreciation Day

The last Friday in July (today!) is the 16th annual system administrator appreciation day, an obscure celebration started in 2000 (by a system administrator, of course) as a response to an H-P ad showing users expressing gratitude to their sysadmin for installing the advertiser’s latest printer.  To my knowledge, none of us have ever gotten flowers or even donuts on “our” day, but it does remind us in the profession that our job is to keep the users happy, mostly by keeping the machines happy, but also by attending to their needs in a prompt and professional manner.

I was reminded of the event not only by notices in the discussion forums and IT email lists, but by the fact that today, the replacement memory module for our network server came, and I installed it. A simple procedure, but one that takes a fair portion of the sysadmin’s bag of tricks and tools to accomplish.  Bigger shops might have a service contract with the hardware vendor, but in many cases, the sysadmin is also the hardware mechanic.

Dell110_DSCF1134

For a few months, the server, a Dell T110, has been crashing every few weeks, fortunately not while we were on our two-month grand tour, but of concern, naturally, especially because it is a virtual machine host, and often has a half-dozen virtual machines running on it, which means, when the server goes down, half of our network goes with it.  Virtualization is a great way to run different versions or distributions of operating systems when developing and testing software, so not too many have production roles in the network, but it is still an inconvenience to have to restart all of them in the event of a crash.

A red light appeared on the front panel of the server, indicating an internal hardware condition, so it was time to check it out.  First, hardware designed for use as servers (the T110 is aimed at small offices like mine) is a lot more robust than the average tower workstation you might have on your desk.  Note above the heavy-duty CPU heat sink (air baffles have been removed for access to the memory modules–the four horizontal strips above the CPU fins).  In addition, big computers have little computers inside that keep track of the status of the various components, like memory, CPU, fans, and disk drives, and turn on the light on the panel  that indicates the machine needs service.   Server memory has error-correction circuitry, as do most server-quality disk arrays, but this is limited to one error–the next one will bring the system down.
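The "one error" limit is easier to see with a toy example. The sketch below uses the textbook Hamming(7,4) code, which is an assumption for illustration only: real server memory uses wider codes that can also detect (but not fix) a second error. The principle is the same, though: extra parity bits pinpoint a single flipped bit, and a second flip defeats the correction.

```python
def encode(data_bits):
    """Encode 4 data bits as a 7-bit Hamming codeword (positions 1..7)."""
    d3, d5, d6, d7 = data_bits
    p1 = d3 ^ d5 ^ d7      # parity over positions 1, 3, 5, 7
    p2 = d3 ^ d6 ^ d7      # parity over positions 2, 3, 6, 7
    p4 = d5 ^ d6 ^ d7      # parity over positions 4, 5, 6, 7
    return [p1, p2, d3, p4, d5, d6, d7]


def decode(word):
    """Correct at most one flipped bit, then return the 4 data bits."""
    w = list(word)
    syndrome = 0
    for position, bit in enumerate(w, start=1):
        if bit:
            syndrome ^= position
    if syndrome:                      # nonzero syndrome = position of the bad bit
        w[syndrome - 1] ^= 1
    return [w[2], w[4], w[5], w[6]]
```

Flip any one of the seven bits and `decode()` still returns the original data; flip two and it confidently "corrects" the wrong bit, which is the toy equivalent of the crash that follows the second memory error.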

System administrators depend on these self-correcting circuits and error indications to schedule orderly shutdowns for maintenance, so that the machine doesn’t crash in the middle of the workday.  For most offices, this means late evening or weekend work.  For 24-hour operations, like web sites, it means shifting the load to one or more redundant systems while the ailing one is repaired, so no data is lost.  Companies like Dell supply monitoring software to notify sysadmins of impending problems, which is vital to operations where there is a room full of noisy servers and the admins are in a nice quiet office in the back room.  In our case, with just one server, we don’t use the monitoring software regularly, but it is useful for telling us which component the red light is for; then we can look up the location in the service manual and order the right part, and hope the system doesn’t crash before it arrives.

Normally, businesses that are thriving and need to keep competitive in the market replace their machines at least every three years.  Others, like ours, that operate on a shoestring and buy whatever resources we need for a project when we need it, tend to run machines five years or more, sometimes until repair parts are no longer available: since we run Linux, we have machines eight years or older that are still useful for running some network services.

Our server is almost five years old, so when I order replacement parts, they don’t always look like the ones we took out, or have the same specifications.  For that reason, I usually take replacement as an opportunity to upgrade, replacing all of a group of components with a new set, which I did when a disk drive failed a couple of years ago.  However, this time, since I’m semi-retired and don’t have a steady cash flow, I only ordered one memory module, to replace the failing one.  Memory comes in pairs, so having slightly different configurations in a pair causes the machine to complain on startup, but it still runs.  The “upgrade” alternative would have replaced all four modules, or at least the two paired ones, with the larger size, at a cost of $150 to $300 instead of just replacing a $50 module and putting up with having to manually restart the machine on reboot.

So, the sysadmin not only needs to keep the machines running, but running within budget, while making sure the operating systems and hardware capabilities can support the software users need to do their jobs.  If he or she is doing the job right, there won’t be any red lights in the server room, and the sysadmin will look like they aren’t doing anything…