25 October, 2014

Goodbye, Computer, and Thank You to Richard

Oh, fudge.  My most heavily used, electricity-efficient computer died today.  It was an Athlon 64 X2 5400+ system with 4GB RAM and 160GB of partitioned disk space (it's a 320G disk, but I eventually wanted to try things like Win7/8/10, so I left some space unpartitioned).  Of that, about 70G were in use.  I got it for $10 from a guy, Richard, who was once into computer building and tweaking, but just didn't want to deal with its instability anymore.  He said, basically, if you want to see if you can get something useful out of it, here you go, have at it.  It was a slightly unstable upgrade from the 10.5-year-old Dell Precision 360, which is in the class of a 2.4GHz Pentium IV with 2.5GB RAM.  As you can imagine, the '360 is excruciatingly slow by today's standards, and for various reasons is still running what's basically FC 5 with a custom-built 2.6.17 kernel. (Yeah, I know, it's bad, I should have kept up with upgrades, so at least the software worked with modern stuff...water under the bridge which cannot be fixed without the aid of something like a TARDIS.)

I knew something was probably radically wrong with the Athlon system because in the middle of this past week, it shut itself down due to the ACPI thermal threshold being exceeded.  This was the first time that had happened since I got the system, so that was odd thing number 1.  When it initially happened, I thought, this is probably the hardest I have driven this system since I got it.  It was doing real-time decoding of HD video (I think w/o GPU assistance) to full screen, so scaling it too.  It was stuttering some, which I would expect for having to do both decoding and scaling.

Odd thing number 2 is that after working stably for I'd guess 6 weeks or more, the system started throwing "processor context corrupt" panics, which honestly were nothing new.  As long as they didn't bootloop or occur more than about 3 times a day, I could deal with them.  Most times if I just wiggled the (PCIe) video board a little, those sorts of errors would not occur for a while.

Odd thing number 3 came yesterday, when I had what I think were thermal trips while just idling (!!!), and the system would shut down.  I thought I would see what was going on in the BIOS setup "PC Health and Status" page, which lets you set the ACPI thermal threshold from 60 to 90 degrees (in increments of 5) and shows the fan tachs, system temperature, and reported CPU temperature, refreshed every half second or so.  I am pretty sure bulk silicon, even a die as small as an Athlon 64 X2, cannot rise in temperature 12 or more degrees in a second.  If anything it will rise maybe a degree or two per second.  I would see the reported CPU temp idle along at a normal-looking 38 degrees, but then jump to 42, then back to 38-ish, then 50, then 47, then 40, then 52, then 54, then 52, then 48, ...well, I think you get the picture: temperatures going wildly all over the place.
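The reasoning above (a real die cannot heat or cool that fast, so the sensor or its wiring is suspect) can be written down as a simple rate-of-change check.  What follows is only a minimal sketch: the function name, the 0.5-second sampling interval, and the 2-degrees-per-second limit are assumptions taken from the rough figures in this post, and the readings are the ones listed above (with "38-ish" treated as 38).

```python
# Sketch: flag temperature readings whose implied rate of change is
# physically implausible. All names and thresholds here are illustrative
# assumptions, not anything the BIOS itself does.

def implausible_jumps(readings, interval_s=0.5, max_rate=2.0):
    """Return (index, previous, current, rate) for each consecutive pair
    of readings whose change exceeds max_rate degrees per second."""
    flagged = []
    for i in range(1, len(readings)):
        rate = abs(readings[i] - readings[i - 1]) / interval_s
        if rate > max_rate:
            flagged.append((i, readings[i - 1], readings[i], rate))
    return flagged


# Readings as reported on the BIOS health page, one per refresh.
readings = [38, 42, 38, 50, 47, 40, 52, 54, 52, 48]
flagged = implausible_jumps(readings)

for index, prev, cur, rate in flagged:
    print(f"sample {index}: {prev} -> {cur} ({rate:.0f} degrees/second)")
```

With those assumptions, every single step in the sequence exceeds the limit, which is consistent with a flaky sensor reading rather than a CPU that is genuinely heating and cooling.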

I unscrewed the fan from atop the CPU heatsink and vacuumed out as much dust as I could (and reinstalled it, of course).  But that really didn't help much, especially since the first power on did not POST.  Again, for this system, that was unusual but not super unusual; often it would just lose a GB of RAM or something, and wiggling the RAM modules would restore sanity somewhat.  So I just shut off all power, pressed on the modules, wiggled the video board, and such-like things, and the next power on did POST.  That seemed to work OK, except for one or two more ACPI thermal trips.

This morning I had more troubles with the system.  I don't remember now whether it was more kernel panics, thermal trips, or whatnot, but the telling sign that something was just about irretrievably wrong was that the boot firmware detected something wrong in the CMOS or in booting, and had to fall back to fail-safe, default, etc. clocking.  I had never, ever seen that message out of this system, but once again, this being a system which was given to me due to its instability, I figured it was yet another episode of it wanting to be challenging to me.  But at some point, the entire system froze, unresponsive, on the Xorg screen, with a handful of browser windows open.  I think that had not happened for a few months at least.  So I hit the reset, and went into setup to double check that this "defaults programming" didn't mess with stuff like Cool 'n' Quiet, which I knew would lock up the system; it apparently did not, CnQ was still off.  I assumed that since CnQ had not been turned back on, everything else which I may have tweaked in order to have it not freeze was likewise left alone.

Well, it worked for a while longer, but eventually, it would turn on the fans, but nothing else would happen.  No amount of pressing on the CPU heat sink, wiggling the video board, or tapping the RAM would make this board even light the third of its diagnostic LEDs (it ordinarily only takes about a second for it to light that 3rd LED, and if it's having trouble initializing the video, it'll blink that one at about 1/2 Hz).

OK, so what else can I try?  I went and got a spare PSU; no change.  I know some systems don't even function unless they have a viable CR2032 installed, so I fetched a fresh one of those.  The old one was at about 1.8V (for those unfamiliar, it's supposed to be 3V); no change.  I got out the manual PDF, looked at how to reset the CMOS, did that; no help.  It was about as dead as a doornail, but not quite because the fans would still spin.

So, although there is still that rock-solid workhorse of a '360, it has really limited usefulness.  But there is at least a little hope without having to spend any money (on hardware anyway).  Again, through the kindness of someone who was just going to toss a lot of computing hardware on the recycling heap because he has stuff that's better in certain ways, I have another machine (see my post on Google+ about how I was quite blessed back in March).  It's a real humdinger of a system: a Dell Precision 670 with dual Xeons (which /proc/cpuinfo describes as 3.4G, yet its "cpu MHz" field says 2800...dunno, so either 3.4G or 2.8G) and 4GB of ECC RAM.  It could have computed rings around that Athlon system, but the problem is power: I measured the wattage draw of both at idle.  The Athlon was 70W or a touch under; this one is more like 150.  Ouch.  So it's good hardware, just a touch on the expensive side to run.
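To put a rough number on "expensive to run", here is a back-of-the-envelope sketch of the yearly difference between the two idle draws.  It assumes both machines sit at their measured idle wattage around the clock, and the electricity price is purely a placeholder (PRICE_PER_KWH is hypothetical; substitute your own rate).

```python
# Sketch: yearly energy use at the measured idle draws, assuming the
# machine is on 24 hours a day and never leaves idle (a simplification).

HOURS_PER_YEAR = 24 * 365


def annual_kwh(idle_watts, hours=HOURS_PER_YEAR):
    """Convert a constant draw in watts to kilowatt-hours over `hours`."""
    return idle_watts * hours / 1000.0


ATHLON_IDLE_W = 70.0   # measured: "70W or a touch under"
XEON_IDLE_W = 150.0    # measured: "more like 150"
PRICE_PER_KWH = 0.12   # hypothetical rate in dollars, not from this post

athlon_kwh = annual_kwh(ATHLON_IDLE_W)
xeon_kwh = annual_kwh(XEON_IDLE_W)
extra_kwh = xeon_kwh - athlon_kwh
extra_cost = extra_kwh * PRICE_PER_KWH

print(f"Athlon: {athlon_kwh:.1f} kWh/year, dual Xeon: {xeon_kwh:.1f} kWh/year")
print(f"Extra: {extra_kwh:.1f} kWh/year, about ${extra_cost:.2f} at the assumed rate")
```

Under those assumptions the dual Xeon burns roughly 700 kWh a year more than the Athlon did, before it does any actual work.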

But I pretty much have very little choice.  I have lots of systems around here on the shelf (which work, and a few more which don't and I don't know what's wrong with them), but they're all from Pentium Pro through <= 2GHz Pentium IV/Celeron.  I could try installing Xubuntu onto one of these "tiny" machines, but ultimately, I don't have much hope that it'd be any better than hobbling along on my '360.  The only thing which would improve is interoperability with the world (such as Google+).  Just about every Web page these days demands at the extreme minimum a dual-core something or other, and the only (working) specimen in my collection is the power-chugging dual Xeon computer on which I'm typing this.

My thought though is, I have been wanting to get another computer since Phenom was a thing (then Phenom II, then Bulldozer/Zambezi...well, you get the idea I think, continuously thinking about it but not doing it).  Although I am currently out of work, computing is (or at least was) my chosen profession, so it's probably a good thing to spend the money to upgrade to something stable, powerful AND energy-efficient.  I still have considerable savings, it's just a question of how long I have to make it last until I'm working again.

But there is a glimmer of hope on that front.  I just got a call from one of my friends who thinks I may be able to work with him relatively soon.  So we shall see.  That would at least take some sting out of spending $1200 - $1500 on a new system.

Direct all comments to this Google+ post.
