Friday 9 November 2012

Recovering Lost Files With Linux

Unfortunately, data loss and corruption are common ailments in storage devices such as hard disk drives, solid state memory cards and USB sticks.


Whether your faulty device is a memory card from your camera, or a hard drive from an Apple Mac or a Windows PC, all may not be lost if you can connect the device to a computer and use a couple of magical Linux tools.


The other day I was handed an SD card, which should have contained the record of a great holiday in Egypt. There were a few jpeg files listed in file manager, but they would not open, and many seemed to have a file size of 0 bytes.

The outlook was gloomy (as was my well-tanned friend) but in just a few minutes I was able to restore over 200 happy holiday snaps.



Hard drives and solid state memory devices can of course fail in a number of different modes, some of which make it very difficult (and probably impractical) to retrieve data. For example, if the disk in a hard drive stops spinning, you won't be able to extract data using software tools. But where files have been accidentally deleted, or where the file system has become corrupt, you are still in with a good chance of recovering something.

Identification

So the first step is to start Linux, either from a permanent installation, or from a LiveUSB Linux distro, and determine the identity of the faulty memory device.

In Linux, disk drives and USB memory devices are first "discovered" and then attached to the file structure. If the drive is not found or recognised as a drive, it won't be attached to the file structure, and you won't be able to use this procedure to retrieve files.

Drives are attached to the /dev directory as "block devices" and are normally named like this:-
  hda, hdb, hdc   {e.g. IDE hard drive "a"}
...or:-
  sda, sdb, sdc     {e.g. SCSI disk "a", a name also used for SATA and USB drives}

Drives with partitions are shown numbered. For example, if your hard drive is sda and it has 2 partitions, in /dev you may see:-
  sda
  sda1
  sda2

The reason for mentioning this is that your faulty drive may or may not be "mounted" on the file structure when you connect it. So if you run the terminal command "df" before and after you connect the faulty drive, you may see something like this:-


In this example, the faulty drive is called "sdb" and has one partition, "sdb1". But "df" only lists mounted file systems, so if our faulty drive does not have a partition (or the partition was not mounted), it will not appear in the "df" output.
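
The before-and-after comparison can also be done on the device names themselves, which works even when nothing gets mounted. This is only a rough sketch: "new_devices" is a helper name I have made up, and the /tmp file names are just examples.

```shell
#!/bin/sh
# Sketch: print the device names that are in the "after" list
# but not in the "before" list. new_devices is a hypothetical helper.
new_devices() {
    # -v invert the match, -x whole lines only, -F fixed strings,
    # -f read the names to exclude from the "before" file
    grep -vxFf "$1" "$2"
}

# Typical use (nothing here writes to any drive):
#   ls /dev/sd* > /tmp/before.txt    # with the faulty drive unplugged
#   ls /dev/sd* > /tmp/after.txt     # a few seconds after plugging it in
#   new_devices /tmp/before.txt /tmp/after.txt
```

Anything printed by the last command appeared when the drive was connected, so it is a good candidate for the faulty device.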

Maybe the most conclusive way to find the drive identity is to use the file manager to examine the block devices in /dev.


Here we can see that our main drive is sda and has 3 partitions (1, 2 & 5) while our faulty drive (which appeared in the file manager window as I connected it) is sdb and it does not have a partition.
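
If you prefer the terminal to the file manager, the kernel keeps the same information in /proc/partitions (sizes are given in 1 KiB blocks). The sketch below assumes that file's usual four-column layout, and "list_block_devices" is a name I have invented for the example.

```shell
#!/bin/sh
# Sketch: list every drive and partition the kernel knows about,
# with an approximate size. list_block_devices is a hypothetical helper.
list_block_devices() {
    # Default to the kernel's own table; another file can be passed instead.
    # The first two lines are a header and a blank line, so skip them.
    awk 'NR > 2 && NF == 4 { printf "/dev/%s %d MiB\n", $4, $3 / 1024 }' \
        "${1:-/proc/partitions}"
}

# Only run the live listing where the kernel table is available
if [ -r /proc/partitions ]; then
    list_block_devices
fi
```

A drive with no partitions shows up as a single line (e.g. /dev/sdb) with no numbered entries beneath it.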

So having identified our faulty drive, the next step is to un-mount the partition (if it has one) from the Linux file structure, which we do from the terminal:-
 sudo umount /dev/sdb1
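
If you are not sure whether the partition is mounted at all, you can check before un-mounting. This is a small sketch that reads the kernel's mount table in /proc/mounts; "is_mounted" is a made-up helper name, and /dev/sdb1 is simply the example partition used above.

```shell
#!/bin/sh
# Sketch: succeed (exit status 0) only if the given device is mounted.
# is_mounted is a hypothetical helper.
is_mounted() {
    # $1 = device node, $2 = mount table (defaults to /proc/mounts)
    # The first column of each line in the table is the device node.
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Typical use:
#   if is_mounted /dev/sdb1; then
#       sudo umount /dev/sdb1
#   fi
```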

Make A Copy

Now we can make a copy of the faulty drive and save it to an image file.

First, use file manager to create 2 new directories (e.g. /home/steve/DD and /home/steve/XX).
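
The same can be done from the terminal, and it is worth checking that there is enough free space, because the image file will be as big as the whole faulty drive. A minimal sketch, with helper names ("prepare_dirs", "free_kib") that are my own invention:

```shell
#!/bin/sh
# Sketch: create the two working directories and report free space.
# prepare_dirs and free_kib are hypothetical helpers.
prepare_dirs() {
    # $1 = base directory; DD will hold the image, XX the recovered files
    mkdir -p "$1/DD" "$1/XX"
}

free_kib() {
    # Available space, in KiB, on the file system holding directory $1
    # (-P gives the portable one-line-per-file-system layout, -k uses KiB)
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Typical use:
#   prepare_dirs /home/steve
#   free_kib /home/steve/DD
```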

Return to the terminal and type this command:-
 sudo dd if=/dev/sdb of=/home/steve/DD/sdb.dd bs=2k

This may take some time, and the screen doesn't give you any feedback!

But you can look at this new directory with your file manager and [hopefully] see that the image file "sdb.dd" has been created and is slowly getting bigger.
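
If you would rather have a number than watch a file grow, you can compare the size of the image so far with the size of the drive. This is just a sketch of the arithmetic; "percent_done" is a made-up name, and the byte counts are examples based on the 8004304896-byte memory stick that appears later in this post.

```shell
#!/bin/sh
# Sketch: how far has the copy got, as a whole-number percentage?
# percent_done is a hypothetical helper.
percent_done() {
    # $1 = bytes written so far, $2 = total bytes expected
    echo $(( $1 * 100 / $2 ))
}

# On a real run the two numbers could come from, for example:
#   stat -c %s /home/steve/DD/sdb.dd      # current image size (GNU stat)
#   sudo blockdev --getsize64 /dev/sdb    # size of the source drive
percent_done 2001076224 8004304896        # prints 25
```

GNU dd will also print its own statistics if you send it the USR1 signal from a second terminal (sudo kill -USR1 followed by the process ID of dd), and newer versions of GNU dd accept status=progress on the command line.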

Scan With Foremost

In the meantime you can install a program called "foremost", which should be available in your Linux distro's repository.

The next step is to run foremost on your newly created image file. If you are just trying to retrieve photos from a camera card, type this in terminal:-
 foremost -t jpeg -o /home/steve/XX -i /home/steve/DD/sdb.dd

This will scan your image file (sdb.dd) and retrieve jpeg files to the output directory (/home/steve/XX).
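
Foremost sorts what it finds into one sub-directory per file type (jpg, png and so on) and writes a report called audit.txt alongside them. Once the scan has finished, a quick tally can be made from the terminal. This is only a sketch, and "count_recovered" is a name I have invented.

```shell
#!/bin/sh
# Sketch: count the recovered files by extension.
# count_recovered is a hypothetical helper.
count_recovered() {
    # $1 = foremost output directory; prints "<count> <extension>" per type
    find "$1" -type f -name '*.*' ! -name 'audit.txt' |
        sed 's/.*\.//' |      # keep only the text after the last dot
        sort | uniq -c |      # group identical extensions and count them
        awk '{ print $1, $2 }'
}

# Typical use:
#   count_recovered /home/steve/XX
```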

This may also take a while, especially on big drives, but you will at least be entertained with a truly fascinating growing line of stars:-
 |********

If you want to recover other types of file you could specify:-
 foremost -t all -o /home/steve/XX -i /home/steve/DD/sdb.dd

...or take a look at the documentation from the terminal:-
 man foremost


Look! No Partition!

After I deliberately deleted the partition from one of my memory sticks, the audit file generated by foremost looked like this:-

Invocation: foremost -t all -o /home/steve/XX -i /home/steve/DD/sdb.dd
Output directory: /home/steve/XX
Configuration file: /etc/foremost.conf
------------------------------------------------------------------
File: /home/steve/DD/sdb.dd
Start: Fri Nov  9 12:11:52 2012
Length: 7 GB (8004304896 bytes)

Finish: Fri Nov  9 12:44:22 2012
.

.
.{lots of files listed}
.
.
1113 FILES EXTRACTED
jpg:= 2
htm:= 16
png:= 1095


As you can see, it took just over 30 minutes (32 minutes 30 seconds, to be exact) to scan 7GB.
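
The figures in the audit file are easy to cross-check. The sketch below adds up the per-type counts and works out the average scan speed from the start and finish times shown above. "sum_audit_counts" is a made-up helper, and it assumes the summary lines keep the "type:= count" layout seen in this listing.

```shell
#!/bin/sh
# Sketch: total the "type:= count" lines of a foremost audit summary.
# sum_audit_counts is a hypothetical helper that reads standard input.
sum_audit_counts() {
    # Split each line on ":=" and add up the second field
    awk -F':= *' 'NF == 2 { total += $2 } END { print total + 0 }'
}

printf 'jpg:= 2\nhtm:= 16\npng:= 1095\n' | sum_audit_counts   # prints 1113

# 12:11:52 to 12:44:22 is 32 min 30 s, which is 1950 seconds
echo $(( 8004304896 / 1950 ))     # prints 4104771 (bytes per second)
```

So the total matches the "1113 FILES EXTRACTED" line, and the scan averaged roughly 4 MB per second.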

Data Protection?

I was interested to find that some of the recovered files from this memory stick were from a time when it was used to run Puppy Linux as a live distro. Since then it has been wiped a couple of times, and it most recently hosted Tiny Core Linux.

I guess this just shows how data can hang around on old discarded drives!
