How To Fix / Repair Bad Blocks In Linux

October 10, 2012

Linux Bad Blocks

Bad blocks are portions of a storage device that can no longer be read reliably. A bad block can often be recovered once its exact location is known, because writing to that location lets the drive reallocate the sector. SMART (Self-Monitoring, Analysis and Reporting Technology), which is built into many storage devices (such as many ATA-3, later ATA, IDE and SCSI-3 hard drives), monitors the reliability of the drive and can predict drive failures. The smartmontools package provides command line utilities for carrying out different types of drive self-tests. This article describes the actions that can be taken when smartmontools detects and reports bad blocks on a disk.

The Smartmontools

The smartmontools package provides two utilities: smartd and smartctl.

smartd is the daemon that polls ATA and SCSI devices every 30 minutes (this interval can be changed) and logs SMART errors and changes in SMART attributes through the syslog interface.

smartctl performs SMART tasks on demand. It can print SMART self-test and error logs, and it also supports other tasks such as polling TapeAlert messages from SCSI tape drives. Its usage will become clear as we proceed. The rest of this article works through examples of disk failure on different types of filesystem.

The dd command

The ‘dd’ command is very useful when working at the disk level. In this article it is used to write raw data over the faulty blocks of a drive. An example of using dd to write a file is:

# dd if=/dev/zero of=/myfile.txt bs=1024 count=10

This command creates a file /myfile.txt of size 10 KB. The ‘if’ option names the input file for dd, here /dev/zero, a special file that supplies an endless stream of zero bytes. The file created therefore contains only zeroes, i.e. every bit has the value 0. The ‘of’ option specifies the output file, ‘bs’ is the block size in bytes, and ‘count’ is the number of blocks to write.
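
The same command can be tried without touching the root directory. The following is only a sketch: the scratch path comes from mktemp instead of /myfile.txt, and the wc and tr checks are additions that confirm the size and the all-zero content.

```shell
#!/bin/sh
# Write 10 blocks of 1024 zero bytes to a scratch file (path chosen by mktemp).
tmpfile=$(mktemp)
dd if=/dev/zero of="$tmpfile" bs=1024 count=10 2>/dev/null

# Size in bytes; the arithmetic expansion strips any padding printed by wc.
size=$(( $(wc -c < "$tmpfile") ))

# Count the bytes that are NOT zero by deleting every NUL byte first.
nonzero=$(( $(tr -d '\0' < "$tmpfile" | wc -c) ))

echo "size=$size nonzero=$nonzero"
rm -f "$tmpfile"
```

The script should report a size of 10240 bytes (10 blocks of 1024 bytes) and no non-zero bytes.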

Now that we know the basic usage of the dd command, we can proceed to the examples.

Ext2/ext3 first example

The smartctl command reports a bad block at Logical Block Address (LBA) 0x016561e9, which is 23421417 in decimal.

root]# smartctl -l selftest /dev/hda

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 217 0x016561e9
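
The hexadecimal LBA from the self-test log can be converted to decimal in the shell. This is a small sketch; the function name lba_to_decimal is made up for illustration and is not part of smartmontools.

```shell
#!/bin/sh
# Convert a hexadecimal LBA, as printed in the self-test log, to decimal.
# printf treats an argument with a 0x prefix as a hexadecimal constant.
lba_to_decimal() {
    printf '%d\n' "$1"
}

lba_to_decimal 0x016561e9   # prints 23421417
```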

The LBA counts sectors in units of 512 bytes, starting at zero. The raw value of the "Current_Pending_Sector" attribute in the output of "smartctl -A" confirms the bad sector:

root]# smartctl -A /dev/hda
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 1
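
When the check has to be scripted, the raw value can be picked out of the attribute table with awk. The sketch below assumes the column layout shown above (attribute name in the second column, raw value in the last); the helper name pending_sectors is hypothetical, and the sample rows are copied from the output above rather than read from a real drive.

```shell
#!/bin/sh
# Print the raw value (last column) of the Current_Pending_Sector attribute
# from "smartctl -A" style text on standard input.
pending_sectors() {
    awk '$2 == "Current_Pending_Sector" { print $NF }'
}

# Sample input: the attribute rows shown above. On a real system the input
# would come from a pipe instead: smartctl -A /dev/hda | pending_sectors
pending=$(pending_sectors <<'EOF'
  5 Reallocated_Sector_Ct   0x0033 100 100 005 Pre-fail Always  - 0
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age  Always  - 0
197 Current_Pending_Sector  0x0022 100 100 000 Old_age  Always  - 1
198 Offline_Uncorrectable   0x0008 100 100 000 Old_age  Offline - 1
EOF
)
echo "$pending"   # prints 1
```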

The following steps are taken to remove this bad block:

Step1

Locate the partition on which the bad block resides
The fdisk command can be used to view the sectors of the hard disk partitions.

root]# fdisk -lu /dev/hda

Disk /dev/hda: 123.5 GB, 123522416640 bytes
255 heads, 63 sectors/track, 15017 cylinders, total 241254720 sectors
Units = sectors of 1 * 512 = 512 bytes

Device Boot Start End Blocks Id System
/dev/hda1 * 63 4209029 2104483+ 83 Linux
/dev/hda2 4209030 5269319 530145 82 Linux swap
/dev/hda3 5269320 238227884 116479282+ 83 Linux
/dev/hda4 238227885 241248104 1510110 83 Linux

Here we can see that LBA 23421417 lies in the third partition, /dev/hda3, which spans sectors 5269320 to 238227884. The offset of the bad sector within /dev/hda3 is 23421417 - 5269320 = 18152097 sectors.
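
The lookup can also be written as a short shell function. This is a sketch under the assumption that the start and end sectors have been copied by hand from the fdisk output above; the function name find_partition is hypothetical.

```shell
#!/bin/sh
# Print the partition that contains the given LBA, followed by the offset
# of that LBA from the first sector of the partition.
find_partition() {
    lba=$1
    # Columns: device, start sector, end sector (from "fdisk -lu /dev/hda").
    while read -r dev start end; do
        if [ "$lba" -ge "$start" ] && [ "$lba" -le "$end" ]; then
            echo "$dev $((lba - start))"
            return 0
        fi
    done <<'EOF'
/dev/hda1 63 4209029
/dev/hda2 4209030 5269319
/dev/hda3 5269320 238227884
/dev/hda4 238227885 241248104
EOF
    return 1   # the LBA is outside every partition
}

find_partition 23421417   # prints: /dev/hda3 18152097
```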

Now we need to check the type of filesystem on the partition. This can be looked up in the /etc/fstab file.

root]# grep hda3 /etc/fstab
/dev/hda3 /data ext2 defaults 1 2

Step2

Now we need to find the block size of the filesystem using the tune2fs command:

root]# tune2fs -l /dev/hda3 | grep Block
Block count: 29119820
Block size: 4096

This reports the block size to be 4096 bytes.

Step3

Find the filesystem block that contains this problematic LBA. We use the following formula:

b = (int)((L-S)*512/B)

where:

b = File System block number
B = File system block size in bytes
L = LBA of bad sector
S = Starting sector of partition as shown by fdisk -lu
and (int) denotes the integer part.

For our example, L=23421417, S=5269320, and B=4096.

b = (int)(18152097*512/4096) = (int)(2269012.125)

so b=2269012.
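
The formula maps directly onto integer arithmetic in the shell, where division already discards the fractional part. This is a minimal sketch; the function name fs_block is chosen here for illustration, and it assumes a shell with 64-bit arithmetic, since the intermediate product exceeds 32 bits.

```shell
#!/bin/sh
# b = (int)((L - S) * 512 / B)
#   $1 = L, LBA of the bad sector
#   $2 = S, starting sector of the partition
#   $3 = B, filesystem block size in bytes
fs_block() {
    lba=$1; start=$2; bsize=$3
    echo $(( (lba - start) * 512 / bsize ))
}

fs_block 23421417 5269320 4096   # prints 2269012
```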

Step4

Use debugfs to find out whether this block is in use and, if so, which inode and hence which file it belongs to.

root]# debugfs
debugfs 1.32 (09-Nov-2002)
debugfs: open /dev/hda3
debugfs: testb 2269012
Block 2269012 not in use

In this case, the block is not in use, so the rest of this step can be skipped and we can jump directly to the next step. If instead the block is in use, debugfs reports this as shown below, and the icheck and ncheck commands identify the inode and the file:

debugfs: testb 2269012
Block 2269012 marked in use
debugfs: icheck 2269012
Block Inode number
2269012 41032
debugfs: ncheck 41032
Inode Pathname
41032 /S1/R/H/714197568-714203359/H-R-714202192-16.gwf

In this case, the problematic file is /data/S1/R/H/714197568-714203359/H-R-714202192-16.gwf (the path reported by ncheck is relative to the mount point /data).

On an ext3 filesystem, this block can be part of the journal itself. In that case the inode number will be very small and debugfs will not be able to report any filename.

debugfs: testb 2269012
Block 2269012 marked in use
debugfs: icheck 2269012
Block Inode number
2269012 8
debugfs: ncheck 8
Inode Pathname
debugfs:

In this case, we can remove the journal with tune2fs command:

tune2fs -O ^has_journal /dev/hda3

Now we repeat step 4, and if the problem is no longer reported, we can rebuild the journal:

tune2fs -j /dev/hda3

Step5

This step destroys the data in that block by writing zeroes to it. The bad block will be recovered, but the data of the file stored there will be lost. If you are sure, proceed with the following commands:

root]# dd if=/dev/zero of=/dev/hda3 bs=4096 count=1 seek=2269012
root]# sync
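
Because a wrong seek value destroys good data, it can help to rehearse the command on a scratch file first. The sketch below is not part of the procedure itself: it builds a hypothetical 8-block image filled with 0xFF bytes, zeroes block 5 only, and checks that the neighbouring block is untouched. The conv=notrunc option is needed here only because the target is a regular file.

```shell
#!/bin/sh
img=$(mktemp)

# Build an 8-block image (4096 bytes per block) in which every byte is 0xFF.
dd if=/dev/zero bs=4096 count=8 2>/dev/null | LC_ALL=C tr '\0' '\377' > "$img"

# Overwrite block 5 only. conv=notrunc stops dd from truncating a regular
# file after the written block.
dd if=/dev/zero of="$img" bs=4096 count=1 seek=5 conv=notrunc 2>/dev/null

# Count the non-zero bytes left in block 5 and the non-0xFF bytes in block 4.
block5_nonzero=$(( $(dd if="$img" bs=4096 count=1 skip=5 2>/dev/null | LC_ALL=C tr -d '\0' | wc -c) ))
block4_changed=$(( $(dd if="$img" bs=4096 count=1 skip=4 2>/dev/null | LC_ALL=C tr -d '\377' | wc -c) ))
size=$(( $(wc -c < "$img") ))

echo "block5_nonzero=$block5_nonzero block4_changed=$block4_changed size=$size"
rm -f "$img"
```

Both counters should be 0 and the size should stay at 32768 bytes, which shows that seek counts in units of bs and that only the addressed block was overwritten.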

Now we can again check the "smartctl -A" output to verify that everything is back to normal.

root]# smartctl -A /dev/hda
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 1
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 1
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 1

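Here you can see that the value of "Current_Pending_Sector" is zero, and that "Reallocated_Sector_Ct" has gone from 0 to 1, which shows that the drive has reallocated the sector.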

Ext2/ext3 second example

To: ballen
Subject: SMART error (selftest) detected on host: medusa-slave166.medusa.phys.uwm.edu

This email was generated by the smartd daemon running on host:
medusa-slave166.medusa.phys.uwm.edu in the domain: master001-nis

The following warning/error was logged by the smartd daemon:
Device: /dev/hda, Self-Test Log error count increased from 0 to 1

This email from smartd is the first sign of trouble. As in the previous example, we run "smartctl -a /dev/hda" to confirm the problem:

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 80% 682 0x021d9f44

The LBA reported is 0x021d9f44 (base 16) = 35495748 (base 10)

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 3
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 3

Here, SMART reports 3 unreadable sectors. Using the following bash loop, we can read the sectors around that area one at a time:

[root]# export i=35495730
[root]# while [ $i -lt 35495800 ]
> do echo $i
> dd if=/dev/hda of=/dev/null bs=512 count=1 skip=$i
> let i+=1
> done

...

35495734
1+0 records in
1+0 records out
35495735
dd: reading `/dev/hda': Input/output error
0+0 records in
0+0 records out

...

35495751
dd: reading `/dev/hda': Input/output error
0+0 records in
0+0 records out
35495752
1+0 records in
1+0 records out

...

This shows that the 17 sectors 35495735-35495751 are unreadable, more than the 3 reported by SMART.
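
The loop can be wrapped in a function that prints only the sectors dd fails to read. This is a sketch, not the original script: the function name scan_sectors is made up, and the demonstration runs on a healthy scratch file so that it can be tried without a failing disk.

```shell
#!/bin/sh
# Print the number of every 512-byte sector in [first, last] that dd
# cannot read from the given device or file.
scan_sectors() {
    dev=$1; i=$2; last=$3
    while [ "$i" -le "$last" ]; do
        if ! dd if="$dev" of=/dev/null bs=512 count=1 skip="$i" 2>/dev/null; then
            echo "$i"
        fi
        i=$((i + 1))
    done
}

# Demonstration on a healthy 10-sector scratch file: no sector is reported.
scratch=$(mktemp)
dd if=/dev/zero of="$scratch" bs=512 count=10 2>/dev/null
bad=$(scan_sectors "$scratch" 0 9)
echo "unreadable sectors: ${bad:-none}"
rm -f "$scratch"
```

On the disk from this example, the equivalent call would be scan_sectors /dev/hda 35495730 35495799, run as root.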

The filesystem blocks that contain this area are:

L=35495735 to 35495751
S=5269320
B=4096
so b=3778301 to 3778303
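
The block range follows from applying the formula of the first example to the first and the last unreadable sector. A short sketch of that arithmetic, with variable names chosen here for illustration:

```shell
#!/bin/sh
start=5269320        # S: first sector of /dev/hda3
bsize=4096           # B: filesystem block size in bytes
first_lba=35495735   # first unreadable sector
last_lba=35495751    # last unreadable sector

# Integer division discards the fractional part, as (int) does in the formula.
first_block=$(( (first_lba - start) * 512 / bsize ))
last_block=$(( (last_lba - start) * 512 / bsize ))
count=$(( last_block - first_block + 1 ))

echo "$first_block $last_block $count"   # prints: 3778301 3778303 3
```

The count of 3 blocks is the value passed to dd as count when the blocks are overwritten.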

To identify files at these locations, we run debugfs:

[root]# debugfs
debugfs 1.32 (09-Nov-2002)
debugfs: open /dev/hda3
debugfs: icheck 3778301
Block Inode number
3778301 45192
debugfs: icheck 3778302
Block Inode number
3778302 45192
debugfs: icheck 3778303
Block Inode number
3778303 45192
debugfs: ncheck 45192
Inode Pathname
45192 /S1/R/H/714979488-714985279/H-R-714979984-16.gwf
debugfs: quit

We can use md5sum to confirm that this file is unreadable:

[root]# md5sum /data/S1/R/H/714979488-714985279/H-R-714979984-16.gwf
md5sum: /data/S1/R/H/714979488-714985279/H-R-714979984-16.gwf: Input/output error

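So, we force the disk to reallocate the bad blocks by overwriting them (the contents of the file in these blocks are lost). This can be done per filesystem block on the partition: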

[root]# dd if=/dev/zero of=/dev/hda3 bs=4096 count=3 seek=3778301
[root]# sync

or per sector on the whole disk:

[root]# dd if=/dev/zero of=/dev/hda bs=512 count=17 seek=35495735
[root]# sync

Now we can check that the bad blocks are no longer causing trouble by looking at the value of the "Current_Pending_Sector" attribute (smartctl -A command):

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0

Unassigned sectors

In the examples above, debugfs may show that a bad block is not assigned to any file. In that case, we can first create a file that is large enough to fill the remaining free space of the filesystem, so that every unused block, including the bad one, gets written (again, the dd command comes to the rescue). The output must be a file on the affected filesystem; the name fillfile below is only a placeholder.

dd if=/dev/zero of=/some/mount/point/fillfile bs=4k

This command runs until there is no space left on the filesystem. Once it has finished, the file can be deleted and we can proceed through the rest of the steps.

ReiserFS example

In this example, the filesystem used is ReiserFS. So, some of the commands used will be different from the above case.

The SMART error log reports the bad block at LBA 58656333. The partition table shows that this sector lies in a partition with a ReiserFS filesystem, which starts at sector 54781650.

Step1: Get the block size of filesystem

# debugreiserfs /dev/hda3 | grep '^Blocksize'
Blocksize: 4096

Step2: Get the block number

# echo "(58656333-54781650)*512/4096" | bc -l
484335.37500000000000000000

The integer part, 484335, is the filesystem block number.

Step3: Try to get more information about the block

# debugreiserfs -1 484335 /dev/hda3
debugreiserfs 3.6.19 (2003 http://www.namesys.com)
484335 is free in ondisk bitmap

The problem has occurred looks like a hardware problem.

bread: Cannot read the block (484335): (Input/output error).

Aborted

Reading the block fails, so at least we have the correct bad block, and the output also tells us that the block is unused. At this point, we can try to write to the bad block and see whether the drive remaps it. If the drive cannot remap the block, use the bad block option (-B) of the reiserfs utilities to handle this block correctly.

Step4: Try to find the affected file

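tar -cO /mydir | cat >/dev/null

This reads every file under /mydir, so a file that contains the bad block shows up as a read error. If no unreadable file is found, the block may be free or may belong to the filesystem metadata.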

Step5: We can try running badblocks -n to provoke reallocation

# badblocks -b 4096 -p 3 -s -v -n /dev/hda3 `expr 484335 + 100` `expr 484335 - 100`
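
badblocks takes the last block before the first block, which is easy to get backwards. The sketch below only prints the command line for a given block and margin instead of running it; the helper name badblocks_cmd is hypothetical, and the options are the ones used in the command above.

```shell
#!/bin/sh
# Print (do not run) a non-destructive read-write badblocks command that
# covers `margin` blocks on each side of `block`.
# Argument order expected by badblocks: device, last block, first block.
badblocks_cmd() {
    dev=$1; block=$2; margin=$3
    echo "badblocks -b 4096 -p 3 -s -v -n $dev $((block + margin)) $((block - margin))"
}

badblocks_cmd /dev/hda3 484335 100
# prints: badblocks -b 4096 -p 3 -s -v -n /dev/hda3 484435 484235
```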

If everything happens as expected, debugreiserfs -1 484335 /dev/hda3 reports no errors. Otherwise:

Step6: Use the dd command to write zeroes to that block

# dd if=/dev/zero of=/dev/hda3 count=1 bs=4096 seek=484335
1+0 records in
1+0 records out
4096 bytes transferred in 0.007770 seconds (527153 bytes/sec)

LVM Repair

This example considers the bad block to be on an LVM volume:

An error is reported, and the following command shows the bad block to be at LBA 37383668:

# smartctl -a /dev/hdb
...
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 90% 66 37383668

sfdisk can help to find the physical partition of the bad block:

# sfdisk -luS /dev/hdb

Or

# fdisk -ul /dev/hdb

Disk /dev/hdb: 9729 cylinders, 255 heads, 63 sectors/track
Units = sectors of 512 bytes, counting from 0

Device Boot Start End #sectors Id System
/dev/hdb1 63 996029 995967 82 Linux swap / Solaris
/dev/hdb2 * 996030 1188809 192780 83 Linux
/dev/hdb3 1188810 156296384 155107575 8e Linux LVM
/dev/hdb4 0 - 0 0 Empty

The bad block is in the /dev/hdb3 partition, which is an LVM partition. The offset of this block within the partition is 37383668 - 1188810 = 36194858 sectors.

The physical partition used by LVM is divided into physical extents (PE). The 'pvdisplay' command gives the PE size of the LVM partition in KB:

# part=/dev/hdb3 ; pvdisplay -c $part | awk -F: '{print $8}'
4096

To express this size in LBA sectors (512 bytes or 0.5 KB each), we multiply this number by 2: 4096 * 2 = 8192 sectors for each PE.

Now we find the PE in which the bad block resides: sector offset within the physical partition / sizeof(PE)

36194858 / 8192 = 4418.3176
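
The same two steps in shell arithmetic, as a sketch with variable names chosen here for illustration. Integer division gives the PE number directly.

```shell
#!/bin/sh
pe_kb=4096                           # PE size in KB, field 8 of "pvdisplay -c"
pe_sectors=$(( pe_kb * 2 ))          # 1 KB is two 512-byte sectors
offset=36194858                      # bad sector offset inside /dev/hdb3
pe_index=$(( offset / pe_sectors ))  # PE that holds the bad sector

echo "$pe_sectors $pe_index"   # prints: 8192 4418
```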

Now we need to find the logical volume that contains PE number 4418.

# lvdisplay --maps |egrep 'Physical|LV Name|Type'
LV Name /dev/WDC80Go/racine
Type linear
Physical volume /dev/hdb3
Physical extents 0 to 127
LV Name /dev/WDC80Go/usr
Type linear
Physical volume /dev/hdb3
Physical extents 128 to 1407
LV Name /dev/WDC80Go/var
Type linear
Physical volume /dev/hdb3
Physical extents 1408 to 1663
LV Name /dev/WDC80Go/tmp
Type linear
Physical volume /dev/hdb3
Physical extents 1664 to 1791
LV Name /dev/WDC80Go/home
Type linear
Physical volume /dev/hdb3
Physical extents 1792 to 3071
LV Name /dev/WDC80Go/ext1
Type linear
Physical volume /dev/hdb3
Physical extents 3072 to 10751
LV Name /dev/WDC80Go/ext2
Type linear
Physical volume /dev/hdb3
Physical extents 10752 to 18932

Hence, PE 4418 is in the /dev/WDC80Go/ext1 logical volume (physical extents 3072 to 10751).
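
With many logical volumes, the lookup in the extent map is easier to script than to do by eye. This sketch assumes that the volume names and extent ranges have been copied by hand from the lvdisplay output above; the function name find_lv is hypothetical.

```shell
#!/bin/sh
# Print the logical volume whose physical extent range contains the given PE.
find_lv() {
    pe=$1
    # Columns: LV name, first PE, last PE (from "lvdisplay --maps").
    while read -r lv first last; do
        if [ "$pe" -ge "$first" ] && [ "$pe" -le "$last" ]; then
            echo "$lv"
            return 0
        fi
    done <<'EOF'
/dev/WDC80Go/racine 0 127
/dev/WDC80Go/usr 128 1407
/dev/WDC80Go/var 1408 1663
/dev/WDC80Go/tmp 1664 1791
/dev/WDC80Go/home 1792 3071
/dev/WDC80Go/ext1 3072 10751
/dev/WDC80Go/ext2 10752 18932
EOF
    return 1   # the PE is not mapped to any logical volume
}

find_lv 4418   # prints /dev/WDC80Go/ext1
```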
The block size of the filesystem on /dev/WDC80Go/ext1 is:

# dumpe2fs /dev/WDC80Go/ext1 | grep 'Block size'
dumpe2fs 1.37 (21-Mar-2005)
Block size: 4096

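The logical volume starts at PE 3072. Its first sector, counted from the start of the physical partition, is:

(first PE of the volume * sizeof(PE)) + partition offset [pe_start] = (3072 * 8192) + 384 = 25166208

where pe_start (384 sectors here) is the offset of the first PE from the start of the physical volume.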

Each sector of the physical partition is 512 bytes, so the filesystem block number of the bad block is:

(36194858 - 25166208) / (sizeof(fs block) / 512) = 11028650 / (4096 / 512) = 1378581.25
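
The last two calculations can be checked with shell arithmetic. This is a sketch using the numbers from this example; the variable names are chosen here for illustration.

```shell
#!/bin/sh
pe_sectors=8192     # sectors per PE
lv_first_pe=3072    # first PE of /dev/WDC80Go/ext1
pe_start=384        # offset of the first PE, in sectors
offset=36194858     # bad sector offset inside /dev/hdb3
bsize=4096          # filesystem block size in bytes

# First sector of the logical volume, relative to the physical partition.
lv_start=$(( lv_first_pe * pe_sectors + pe_start ))

# Filesystem block inside the logical volume; integer division drops the .25.
fs_block=$(( (offset - lv_start) / (bsize / 512) ))

echo "$lv_start $fs_block"   # prints: 25166208 1378581
```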

Taking the integer part, the block number is 1378581. You can verify that this is the actual bad block by reading it with the dd command:

dd if=/dev/WDC80Go/ext1 of=block1378581 bs=4096 count=1 skip=1378581

If the command reports an error, the calculation of the bad block is correct. Otherwise, recheck the calculations to find the correct block. Once you have found the correct block, resolve the issue with the dd command as explained in the examples above:

dd if=/dev/zero of=/dev/WDC80Go/ext1 count=1 bs=4096 seek=1378581

Conclusion

All the examples given in this article concentrate on finding the correct bad block and the partition it lies in. Once you have found the bad block, all you need to do is run the dd command to write zeroes to that block. The different examples show how to find the bad block location on different filesystems and in different scenarios.

Comments (5)

  1. kevin says:

    Thanks for the page
    I'm having problems trying to find the partition offset with fdisk on my gpt partitioned disks.

    For disks that have been partitioned with gpt, I have to use gdisk right?
    trying to use fdisk as above, gave me confusion

    gdisk reports like this on my gpt partitioned disk
    $ gdisk -l /dev/sda
    GPT fdisk (gdisk) version 0.8.1

    Partition table scan:
    MBR: protective
    BSD: not present
    APM: not present
    GPT: present

    Found valid GPT with protective MBR; using GPT.
    Disk /dev/sda: 7814037168 sectors, 3.6 TiB
    Logical sector size: 512 bytes
    Disk identifier (GUID): 58CF1ED7-6886-401C-B98A-08F60893C58A
    Partition table holds up to 128 entries
    First usable sector is 34, last usable sector is 7814037134
    Partitions will be aligned on 8-sector boundaries
    Total free space is 1739 sectors (869.5 KiB)

    Number Start (sector) End (sector) Size Code Name
    1 2048 102402047 48.8 GiB 8300
    2 102402048 204802047 48.8 GiB 8300
    3 204802048 307202047 48.8 GiB 8300
    4 307202048 409602047 48.8 GiB 8300
    5 409602048 512002047 48.8 GiB 8300
    6 512002048 4608002047 1.9 TiB 8300
    7 4608002048 7199637503 1.2 TiB 8300
    8 7199637504 7814035455 293.0 GiB 8200
    9 34 1987 977.0 KiB EF02

    -kevin

  2. SG says:

    This article was immensely helpful. Thank you.

  3. deco says:

    The script below finds bad sectors, puts them into a text file, then test if text file size is different than zero, so e2fsck will mark bad sectors (these sectors will not be used by operating system)

    #!/bin/sh
    minsize=0
    target="/tmp/bad-blocks.txt"
    for disc in `fdisk -l | grep '^/' | awk '{ print $1 }'`; do
    badblocks -v $disc > "$target"
    tam=$(du -k "$target" | cut -f 1)
    if [ $tam -eq $minsize ]; then
    echo "no badblocks on $disc"
    else
    echo "badblock(s) found(s) on $disc"
    e2fsck -l "$target" "$disc"
    fi
    done

  4. KDV says:

    It looks like dd command must have bs parameter equal to physical block size, not logical one. Otherwise writing fails with I/O error and HDD does not reallocate anything.
