Sunday, March 17, 2013

backing up my systems: it ain't my day (or month)

OK.  I've been in three weeks of hardware hell, mainly because I wanted to get all my machines (a MacBook Pro, my main Linux video editing workstation and an older Windows Vista digital audio workstation (DAW)) properly backed up.  I detailed my strategy for this in my last post.  This post is more of a rant than anything else, so please excuse the lack of any real mentorship on problem solving, except maybe "Google is Your Friend."

Issue #1: Drobo runs out of space
The Drobo has been a fine unit for me.  But as time goes on, you acquire more media and your available space runs out.  You'd think it would be a simple matter of buying a new disk, putting it in the Drobo and letting BeyondRAID rebuild its array.  Well, the first drive I bought, a Western Digital Green 1TB, died after the first rebuild.  I had never had a drive fail out of the box before, so I didn't truly believe it was dead.

With my non-belief firmly in place, I tried to use the drive in different capacities.  So as a test, I formatted the disk using my Thermaltake BlacX connected to my Mac.  I was able to copy files over to it (though I didn't copy gigs and gigs worth as a true test).  But when I put the unit back in the Drobo, the Drobo gave an immediate "red" light for that drive bay, indicating the drive was bad.  I switched drives in the Drobo unit around, because I thought it could have been a faulty drive bay.  
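In hindsight, a more convincing version of that copy test would have been a write/read-back check with checksums.  Here is a minimal sketch in Python, not what I actually ran: the function name, the scratch file name and the sizes are all made up for illustration.  It writes random data to the drive, reads it back and compares SHA-256 digests.  Keep in mind that a small test can be served from the operating system's cache rather than the disk, so a size well beyond your RAM is a more honest test.

```python
import hashlib
import os


def write_read_test(directory, total_mb=64, chunk_mb=4):
    """Write random data to a scratch file in `directory`, read it back,
    and return True if the SHA-256 digests of both passes match."""
    path = os.path.join(directory, "drive_test.bin")  # hypothetical scratch file
    chunk_size = chunk_mb * 1024 * 1024
    written = hashlib.sha256()
    try:
        with open(path, "wb") as f:
            for _ in range(total_mb // chunk_mb):
                chunk = os.urandom(chunk_size)
                written.update(chunk)
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push buffered data to the device

        read_back = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                read_back.update(chunk)

        return written.hexdigest() == read_back.hexdigest()
    finally:
        # Always clean up the scratch file, even if a read or write failed.
        if os.path.exists(path):
            os.remove(path)
```

You would point it at wherever the drive is mounted, for example `write_read_test("/Volumes/TestDrive", total_mb=50000)`, where the mount point is an assumption.  An I/O error or a returned `False` would both be bad news for the drive.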

Then, I had the bright idea to move the data off the 2TB system drive of my main Linux machine to the new Western Digital, put the 2TB in the Drobo and use the new 1TB (which I really thought was a good, error-free drive) as my Linux system drive.  Still thinking that the 1TB drive was good, I had to do some fancy footwork to make this possible, as the system drive was a logical volume.  This entailed a week of work figuring out how to shrink a logical volume so that the used space on the 2TB drive (which was less than a terabyte) would fit onto the 1TB.
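The size check behind that plan is simple arithmetic, and it is worth doing before touching any volumes.  A rough sketch in Python follows; the function name, the 5% headroom and the 900 GB figure are illustrative assumptions, not numbers from my machine.  The main point is that a drive sold as "1TB" holds 10^12 bytes, which most tools report as about 931 GiB.

```python
TB = 10**12   # drive makers count in decimal terabytes
GIB = 2**30   # most operating system tools report binary gibibytes


def fits_on_target(used_bytes, target_bytes, headroom=0.05):
    """Return True if `used_bytes` fits on the target drive while leaving
    a `headroom` fraction of the target free (assumed safety margin)."""
    usable = target_bytes * (1 - headroom)
    return used_bytes <= usable


if __name__ == "__main__":
    target = 1 * TB
    print(f"A 1TB drive is about {target / GIB:.0f} GiB")
    # 900 GB of used space is a made-up example value.
    print("Fits with headroom:", fits_on_target(900 * 10**9, target))
```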

I learned a lot from that experience, to be detailed in a later post.  Suffice it to say that in the end, the 1TB was truly dead and I ended up getting a new 1TB (a Western Digital Black) from Best Buy and that solved my Drobo storage issue.  Kudos to Best Buy, as they were able to give me the Black at the same price as the Green for my trouble.

Issue #2: Mac Time Machine "the identity of this backup disk has changed" (Sparsebundle Problem)
This was an odd one.  After installing the new disk in the Drobo, Time Machine started showing the error "the identity of this backup disk has changed".  From the below post:

I executed the "chflags" command listed.  This ran for about four hours.  Afterwards, I tried to execute the "hdiutil" command listed, but the Mac said it had already run the command.  So to test the result of the chflags command, I shut down and restarted the Drobo.  When Time Machine started backing up, it no longer gave me the error.  Hooray.  Another one down.

Issue #3: Windows Vista DAW crashes
So after a week spent on #1 and #2, I was ready to start work on a new musical project with some friends.  Firing up my old Dell 400SC running Windows Vista (OK, OK..I know I need to upgrade to Win7, but I've got a recording session coming up soon and didn't want to change OSes yet), I was presented with this error:
c\windows\system32\config\system corrupt

Oh, wonderful.  So I popped in the Vista Ultimate DVD and selected "Repair".  After it ran, the system rebooted and I was pleasantly surprised to find that this fixed the problem and that I was able to get back into the system.

Getting back into the system, I reasoned that if the drive was going bad, I'd better make a backup.  So I ponied up $40 for Drobo's PC Backup product, the ugly step-brother of the seamless Drobo integration with Mac Time Machine.  Assuming the PC product worked the same way the Mac product did, I selected the defaults.  Well, the defaults do NOT back up the entire drive.  Only your user data.  My bad for not reading the fine print, but I believe that a Drobo product should be consistent between systems and the default should be to back up your entire drive with all system data included, as long as you have the space on your Drobo.  But that's just me.

The missing data would be crucial for what happened next.

Issue #4: Windows Vista DAW crashes again
After taking a two-day hiatus from my backup shenanigans, I fired up the DAW again.  And guess what..a new error appears:
\Windows\system32\winload.exe is missing or corrupt (status 0xc000000f)

Oh great.  Going back to my ritual, I loaded in the Vista Ultimate DVD and selected "Repair".  However, after the reboot, no go..still the same "missing or corrupt" error.  I tried the repair a number of times, as the Vista repair process would show slightly different screens every time it booted and recognized the system.  This gave me false hope that the DVD was actually repairing something correctly.  Also, the frustrating part of this process was that, for whatever reason, the DVD would take 10 minutes to load on my Dell.  I'm not sure what the problem was there.  So I chewed up a few hours doing this multiple times.

Finally, after reading some Google posts by people with the same issue, I decided to run "chkdsk /r" from the command line, rather than relying on the non-informative Windows Vista screen to run some unknown fix command.  I had to specifically boot into the System Recovery Options screen as shown in the below post:

Once I was there, I selected "Command Prompt" and typed in good ol' "chkdsk /r", the "repair" option to chkdsk.  This time, I was rewarded with an actual status screen that told me "bad clusters found" and that Windows was marking the clusters as bad and moving the files located on those clusters to good sectors on the disk.  (Sector and cluster primer here: http://t.co/DLFjrXAp5C).  This process took about three hours, unlike the half-hearted effort that Windows Vista attempted.  I wonder why Vista did not default to doing a real "chkdsk /r".  That doesn't help anyone who has a failing disk.  Bad default!

After the bad cluster identification and repair, I was really glad to see Vista boot up properly!  But since there were so many bad clusters, I had to make a full backup or clone of that drive but quick!  For this, I popped in an unused 500GB SATA I had lying around.  I repartitioned and formatted this drive.  It had been a second Vista system disk at one point, so I knew the drive's main partition was marked as bootable.  So I was good to go there.  I then dragged all the files from my C: onto the new E: (my DVD being the D:).  However, on bootup, Vista showed an error:
"System volume on disk is corrupt"

I suspected this was a problem with the NTFS boot files on the 500GB drive as they had links from the partition map from the old 256GB drive that was failing.  Luckily, when I ran Vista repair, Vista was able to fix this issue and the system started properly.
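A drag-and-drop copy like the one from C: to E: gives you no proof that every file actually made it across, which matters when the source disk is throwing bad clusters.  As a hedged sketch (the function names are invented for this example, and it is not something I ran at the time), a short Python script can walk the source tree and compare SHA-256 digests against the copy.  It makes no attempt to handle files that are locked or unreadable on a running system; those will simply raise an error.

```python
import hashlib
import os


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source, dest):
    """Return (missing, different): relative paths present under `source`
    but absent under `dest`, and paths whose contents differ."""
    missing, different = [], []
    for root, _dirs, files in os.walk(source):
        for name in files:
            src_path = os.path.join(root, name)
            rel = os.path.relpath(src_path, source)
            dst_path = os.path.join(dest, rel)
            if not os.path.isfile(dst_path):
                missing.append(rel)
            elif file_digest(src_path) != file_digest(dst_path):
                different.append(rel)
    return sorted(missing), sorted(different)
```

Calling `compare_trees("C:\\", "E:\\")` (drive letters as in my setup) would return two lists; anything in either list deserves another copy attempt before the old disk dies completely.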

Issue #5: Windows Vista continually keeps "preparing your desktop"
After the system came up, I made sure all my applications (Reaper, Drobo PC Backup, etc.) were working properly.  Unfortunately, they were not, as Vista continually kept giving me the message "Preparing Your Desktop" when I logged into my profile.  I tried a number of things from Google, but those suggestions did not work.  I didn't have any critical data in the old profile, so I figured I'd bite the bullet and create a new profile.  After doing this, the message disappeared and I was able to save my desktop settings and application preferences properly.

In Sum
Wow.  So this has been three weeks of hell.  I "think" I am back to steady state with my systems.  I was able to reset Drobo PC Backup to a full system backup of my Vista DAW to the Drobo.  The Drobo is backing up the Mac just fine and CrashPlan is encrypting my main Linux box backup to the Cloud.

Maybe now I can go outside and get some sun?

1 comment:

Unknown said...

You most likely had a block of the drive which was iffy, not giving out good data. In the future I suggest you purchase a copy of SpinRite; for well over 15 years I've used this product to get machines which will not boot (with errors like the missing file you had) to boot. Generally after this I will buy a new drive, but sometimes it's just a case of data rot from the data being on the drive for a long time. If SpinRite reports the drive as being in good condition and I'm sure it's data rot, I'll just leave it.
