
Maintaining Thor

Thor was dirty. No, Thor was filthy. Thor is Pam’s desktop computer, an Intel quad core box I built for her back in 2008. It sits next to her desk, raised a few inches off the floor, and we regularly clear the surface dust and clean the air filters, but it had been a while – a couple of years, probably – since it had been properly torn open and cleaned. Lately, signs of instability were growing more frequent. So the other day I opened the case.

Well, I guess it was to be expected. The innards were choked with dust. The squirrel-cage fan on the graphics card, one of those big honkin’ GeForce cards, hardly had room to spin! I looked inside the box, looked at the can of Dust Off in my hand, looked back inside, thought about how many cans I might have in the basement store… Nah, this would never do.

So I set up a work table outside the garage door and hauled out my shop compressor. 100 PSI? I thought about the possibility of blowing components right off the motherboard, the moisture that would accumulate in that air after a few cycles… I changed the blowgun tip to something a little more diffuse and got to work.

It took a while. But when I was finished Thor’s innards once again looked like new. I closed the box, cleaned up my tools, wrestled the box back upstairs. And it wouldn’t boot.

Nothing really seemed out-of-place, I was careful with the air streams, I hadn’t forgotten any cables. Still, no boot. Or, more precisely, the pulsating orbs of Windows 7 starting up would halt and the blips of drive activity would take on a regularity that indicates a hang. To add an interesting twist, it booted nicely to Safe Mode.

Because of the way Windows works, this was pointing toward an issue with video. The card was obviously initializing so I replaced the driver and exercised the various modes. All looked fine but the situation was unchanged.

Maybe the boot drive was going south from running in all that heat before the cleanup, and the shock of moving stuff around pushed it over the edge. Before I went to work I imaged the drive. I could virtualize the image, recover Pam’s settings and apply them to a new Windows 7 install. As part of Thor’s long-overdue maintenance I planned to change out the boot drive for one of those hybrid drives I like, and the drive was in there anyway, empty and waiting. The install media booted fine and the installation began. Wouldn’t you know, though, when the installer got to the point where it boots the newly installed kernel, before personalization, it hung again!

Puzzling. The hardware POSTs, Safe Mode boots, a normal boot hangs, as does a new Windows install. Log checks in Safe Mode, as well as other diagnostics run from bootable media, all seemed okay. Everything pointed to a video issue.

So I pulled the GeForce card out, grabbed a loupe and looked it over. Aha! There was corrosion on some of the contacts! Cleaned ’em up, that’s what I did, and coated ’em with Stabilant. What’s that? From the tech notes…

Stabilant 22 is an initially non-conductive amorphous-semiconductive block polymer that when used in thin films within contacts acts under the effect of the electrical field and switches to a conductive state. The electric field gradient at which this occurs is established during its manufacture so that the material will remain non-conductive.
Thus, when applied to electromechanical contacts, Stabilant 22 provides the connection reliability of a soldered joint without bonding the contacting surfaces together!

It’s amazing stuff. It’s also seriously expensive. It’s by far the most expensive fluid in the house. Old whisky? Nah. Even printer ink is way cheaper. But it works. On the good side, a little goes a long way. I’ve still got more than half of the 15 mL I bought back in 2006.

The graphics card slipped into its connector with friction-free ease. And not only did Thor POST faster than I’d ever seen it POST, but it booted like nothing had ever been amiss.

 

Storage: the plex is missing

Last year, in the midst of migrating the VM farm from VMware to VirtualBox, I had a Seagate drive go tits up. Luckily it was part of a RAID so I just substituted another drive and that was that. It was still under warranty so I figured that one day I would clear out the confidential data and RMA the thing. No rush.

Every so often, as time permitted, I would haul the thing out and play with it a little. This morning was one of those times.

Since I’ve been rather unsuccessful with the thing so far I figured to try swapping logic boards on the drive. I’ve got a spare, of sorts; it’s on a drive that’s part of the RAID mirror in my primary desktop. Software RAID, that is, on a Windows 7 system.

It’d be a simple matter to pull the drive, failing the RAID. Then the plan was to install the known-good logic board onto the failed drive, cable it up to the eSATA port and (possibly) do the wipe. Recovery would be just as easy: put the logic board back and re-install the RAID drive. Then recover/resync the mirror and that would be that.

Before I got started I figured a backup would be prudent. The RAID mirror is where I do all my work. The better part of a terabyte was soon copied to a spare drive.

The drive pull took but a moment. Gotta love those big, roomy cases! I booted to find that the array had NOT failed; instead it went missing altogether! Oops. No concern, though, right? Microsoft documentation says that breaking a mirror results in two drives containing the data, just no more mirror. My exercise should have merely simulated a drive failure. When I re-installed the drive it should be fine.

Okay, so I did the logic board swap and futzed with that a bit, still feeling a bit uneasy about the mirror. Didn’t get anywhere for my trouble. It looks like the failed drive is just that – a failed drive. (More about that later.)

I put the known-good logic board back on the mirror drive, shoved it into the case, cabled it up and booted. Uh oh. Still no mirror. One of the two formerly mirrored drives appeared uninitialized while the other was foreign. I imported the foreign disk, which then got its old drive letter back. The data appeared to be intact but (I guess) since the companion volume remained uninitialized it still reported itself as having “failed redundancy.” I couldn’t break the mirror, nor could I remove the mirror. It looked like it was in some kind of limbo. I tried to reactivate the volume and had a nice little “WTF” moment: “the plex is missing” mocked the resulting error message.

I’m running out of time, there’s stuff I need to be doing and it’s certainly not this.

I initialized the uninitialized drive, made it dynamic and formatted it. Then I copied the data from the drive whose plex – whatever the hell that is – was missing onto the newly formatted volume. Continuing, I wiped the plex-less drive. Would it now offer itself up as a candidate to accept a mirror? Yes, it would. So I did just that and it took a while – longer than all the file copying – to resync.
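For anyone who would rather script that sequence than click through Disk Management, here is a rough, untested sketch. It only builds the text of two diskpart scripts and never touches a disk. The disk numbers, the drive letter and the helper names are made up for illustration, and using diskpart at all is an assumption on my part; the file copy between the two phases is left to whatever copy tool you prefer.

```python
from typing import List


def prepare_blank_disk(disk: int, letter: str) -> List[str]:
    """Phase 1: turn the uninitialized disk into a formatted dynamic volume."""
    return [
        f"select disk {disk}",
        "clean",                  # DESTRUCTIVE: drops any partition info on this disk
        "convert dynamic",
        "create volume simple",
        "format fs=ntfs quick",
        f"assign letter={letter}",
    ]


def remirror(old_disk: int, letter: str) -> List[str]:
    """Phase 2 (after the data is copied): wipe the plex-less disk and
    add it as a mirror of the freshly written volume."""
    return [
        f"select disk {old_disk}",
        "clean",                  # DESTRUCTIVE: only after the copy is verified
        "convert dynamic",
        f"select volume {letter}",
        f"add disk={old_disk}",   # mirrors the selected simple volume onto old_disk
    ]


def to_script(lines: List[str]) -> str:
    """Join the commands into text suitable for `diskpart /s file.txt`."""
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical numbering: disk 2 is the blank one, disk 1 lost its plex.
    print(to_script(prepare_blank_disk(2, "M")))
    print(to_script(remirror(1, "M")))
```

Keeping the two phases as separate scripts is deliberate: the second one destroys the only other copy of the data, so it should not run until the copy has been checked.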

Now, I’ve had good luck with Windows’ software RAID mirrors before but this exercise worried me a little. Should I have broken the mirror instead of simply yanking the drive? What if it had failed electrically? Or if I knocked a cable loose doing some unrelated maintenance? Or someone stole the drive? What happens when a drive fails under certain circumstances? Have I just been lucky all along, where the failures I’ve experienced have just been the right kind of failures that were recoverable? Ponder, ponder.

I guess I need to set up a testbed VM and experiment. Meanwhile, I have my panic copy and the same mirror arrangement I had this morning, no lossage.
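A possible starting point for that testbed, sketched under a few assumptions: VirtualBox is the hypervisor, a Windows guest already exists under the made-up name "mirror-testbed", and its storage controller is called "SATA". The code only assembles VBoxManage command lines; it does not run them. Older VirtualBox releases spell the first subcommand createhd rather than createmedium.

```python
from typing import List

VM = "mirror-testbed"   # hypothetical VM name
CTL = "SATA"            # assumed storage controller name inside that VM


def create_disk(path: str, size_mb: int) -> List[str]:
    """Command line to create a small virtual disk for one half of the mirror."""
    return ["VBoxManage", "createmedium", "disk",
            "--filename", path, "--size", str(size_mb), "--format", "VDI"]


def attach(port: int, path: str) -> List[str]:
    """Command line to attach a disk image to the given controller port."""
    return ["VBoxManage", "storageattach", VM, "--storagectl", CTL,
            "--port", str(port), "--device", "0",
            "--type", "hdd", "--medium", path]


def yank(port: int) -> List[str]:
    """Command line to detach whatever is on the port: the 'pulled drive' case."""
    return ["VBoxManage", "storageattach", VM, "--storagectl", CTL,
            "--port", str(port), "--device", "0", "--medium", "none"]


if __name__ == "__main__":
    plan = [
        create_disk("mirror-a.vdi", 2048),
        create_disk("mirror-b.vdi", 2048),
        attach(1, "mirror-a.vdi"),
        attach(2, "mirror-b.vdi"),
        yank(2),                      # with the VM powered off, simulate the pull
        attach(2, "mirror-b.vdi"),    # then put the drive back and boot
    ]
    for cmd in plan:
        print(" ".join(cmd))
```

With the mirror built inside the guest across the two small disks, detaching and re-attaching one of them should reproduce this morning’s experiment without risking real data. Whether the guest reacts the same way as real hardware is exactly the thing to find out.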

Oh, and the old drive that I was trying to wipe? Glad you asked. It’s still on the shelf. There’s confidential data on there, if one were to recover it. I haven’t been able to get to it in order to properly cleanse it. I don’t trust Seagate; not that Seagate’s evil or anything. It’s just that, well, the responsibility’s mine and I don’t take that lightly. Terabyte drives are only worth about $75 retail these days and I got a couple of good years out of the thing.

What would YOU do with a drive full of confidential but unreachable data? Can you suggest any tools that I might use to get at the drive to wipe it without needing to access it with Windows or Linux, the two predominant OSs we run here?