Thursday, April 02, 2009

Fix Corrupted Volume and Directory /var/run on Linux

A few days ago I worked on a host running CentOS 5 with a SATA RAID 1 mirror. Watching the boot screen, I noticed that one of the disks was not being detected, so I powered the host off, disconnected the main disk, and configured the mirrored disk as the main disk; since it should hold the same data, it shouldn't matter. I don't know why the array didn't pick up the mirrored disk immediately when the main disk failed, but my prime concern right now is to get the data off, as I am informed they don't have an up-to-date backup. During boot I also discovered that various services were unable to write to "/var/run", failing with "Can't create/write to file '/var/run".

Running ls -la on "/var" shows that the "/var/run" directory is corrupted (see also the screenshot below):

[/var]# ls -la
total 2147483844
drwxr-xr-x 24 root root 4096 Mar 11 2008 .
drwxr-xr-x 24 root root 4096 Mar 31 14:30 ..
?rwsrwsrwt 65535 4294967295 4294967295 4294967295 Jan 1 1970 run
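The telltale sign in the listing is the "?" in the file-type position of the "run" entry. As a rough sketch only, the check could be scripted like this; the function name find_unreadable_entries is made up for illustration and is not part of the original procedure.

```shell
# Hypothetical helper (not from the original post): read `ls -la` output
# on stdin and print the name of every entry whose file-type character
# is "?", which is what the damaged "run" entry shows in the listing.
# Assumes names without spaces, since only the last field is printed.
find_unreadable_entries() {
    awk 'substr($0, 1, 1) == "?" { print $NF }'
}
```

It would be used as: ls -la /var | find_unreadable_entries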

Attempts to chown and chmod the directory failed, so I decided to unmount "/var", since it sits on its own volume.

Before attempting to unmount the volume, I copied the entire contents of "/var" to a spare external disk, except for "/var/run", since at this point I planned to simply format the "/var" volume.
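A minimal sketch of that copy step, assuming the external disk is mounted somewhere writable. The function name backup_except_run and the destination path in the usage line are placeholders of mine, not names from the original setup.

```shell
# Hypothetical helper (not from the original post): copy the top-level
# entries of SRC into DEST, skipping the one named "run".
# cp -pr recurses and preserves mode, ownership (when run as root),
# and timestamps.
backup_except_run() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for entry in "$src"/* "$src"/.[!.]*; do
        # Skip glob patterns that matched nothing
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        name=${entry##*/}
        # Leave the corrupted directory behind
        [ "$name" = "run" ] && continue
        cp -pr "$entry" "$dest"/ || return 1
    done
}
```

For example: backup_except_run /var /mnt/external/var-backup (the destination path is only an illustration).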

After copying the files, the only way I was able to unmount "/var" was to edit "/etc/fstab" and comment out its entry:

LABEL=/usr/local1 /usr/local ext3 defaults 1 2
#LABEL=/var1 /var ext3 defaults 1 2
LABEL=SWAP-pdc_dagahb swap swap defaults 0 0
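The same edit can be made without opening an editor. The sketch below is one possible way to do it, under the assumption that the mount point is the second whitespace-separated field of each entry; comment_out_mount is a name I made up, and it keeps a .bak copy of the file before changing it.

```shell
# Hypothetical helper (not from the original post): comment out the
# fstab entry whose mount point (second field) equals MOUNT_POINT.
# A backup is saved as FSTAB.bak before the file is rewritten.
comment_out_mount() {
    fstab=$1
    mount_point=$2
    cp -p "$fstab" "$fstab.bak" || return 1
    awk -v mp="$mount_point" '
        # Leave existing comments alone; prefix a matching entry with "#"
        $0 !~ /^[ \t]*#/ && $2 == mp { print "#" $0; next }
        { print }
    ' "$fstab.bak" > "$fstab"
}
```

For example: comment_out_mount /etc/fstab /var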

NOTE: When copying the files, make sure to use the -p option, as in cp -pr, to keep permissions and owners intact.

Then reboot.

With the corrupted volume no longer mounted, I booted into single-user mode and ran:

# mkfs.ext3 /dev/mapper/pdc_dagahbchp8
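Since mkfs.ext3 destroys whatever is on the device, it may be worth confirming first that nothing is still mounted there. The sketch below is my own addition, not a step from the original procedure. It scans a mounts table in the /proc/mounts layout (device in the first field, mount point in the second), and is_mounted is a hypothetical name. Device-mapper volumes can appear under a different device path in that table, so checking the mount point is probably the more reliable of the two.

```shell
# Hypothetical helper (not from the original post): succeed if TARGET
# appears as a device or a mount point in the mounts table.
# The table defaults to /proc/mounts; a second argument overrides it.
is_mounted() {
    target=$1
    mounts_file=${2:-/proc/mounts}
    awk -v t="$target" '
        $1 == t || $2 == t { found = 1 }
        END { exit !found }
    ' "$mounts_file"
}
```

For example: is_mounted /var && echo "/var is still mounted, do not format yet"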

After formatting, I edited "/etc/fstab" and added the line below so that the formatted volume mounts on "/var":

/dev/mapper/pdc_dagahbchp8 /var ext3 defaults 1 2

Then I ran mount -a and copied all the files back to "/var". I had to manually create some entries in "/var/run", such as mysqld and utmp. I set chmod 644 on utmp and chmod 770 on mysqld, but that didn't work for mysqld (as shown in the screenshot), so I ended up doing chmod 777 /var/run, which I will have to fix later on. Anyway, I was able to bring up MySQL on the host and get the data out. I know I could just copy the "/var/lib/mysql/data" folder and recreate the tables in the MySQL database, but I also wanted to bring the host up first to make sure the files are not corrupted, hoping I can salvage this host instead of doing a rebuild. I hope this helps.
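For reference, recreating the "/var/run" entries could be sketched as below, using the modes mentioned in the text. The function name recreate_run_entries is made up. The chown step is purely an assumption of mine: one possible reason chmod 770 was not enough is that the directory was created by root, so the mysql account could not write to it, but the original notes do not confirm this.

```shell
# Hypothetical helper (not from the original post): recreate the
# mysqld directory and the utmp file under RUN_DIR with the modes
# used in the write-up (770 and 644).
recreate_run_entries() {
    run_dir=$1
    mkdir -p "$run_dir/mysqld" || return 1
    chmod 770 "$run_dir/mysqld" || return 1
    # utmp is a regular file; touch creates it without truncating
    touch "$run_dir/utmp" || return 1
    chmod 644 "$run_dir/utmp" || return 1
    # ASSUMPTION: hand the directory to the mysql account when running
    # as root and that account exists; otherwise leave ownership alone.
    if [ "$(id -u)" -eq 0 ] && id mysql >/dev/null 2>&1; then
        chown mysql "$run_dir/mysqld" || return 1
    fi
}
```

For example: recreate_run_entries /var/run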
