Disk Scan for deleted entries

Disk Scan is a low-level enumeration of all entries in the root folders on FAT12, FAT16, and FAT32 volumes, or in the Master File Table (MFT) on NTFS and NTFS5 volumes. The goal is to find and display deleted entries.

Although the structure of a file or folder entry differs between file systems, every entry contains the same basic information: name, size, creation and modification date/time, file attributes, existing/deleted status, and so on.

A drive contains a root file table, and every file table (the MFT, the root folder of the drive, a regular folder, or even a deleted folder) has a location, a size, and a predefined structure. We can therefore scan a table from beginning to end, check whether each entry is deleted, and display information for every deleted entry found.

Deleted entries are marked differently depending on the file system. On FAT, for example, a deleted entry (file or folder) is marked by overwriting the first byte of the entry with the value 229 (0xE5). On NTFS, the file record header contains a flag that indicates whether the file has been deleted.

Example of scanning a folder on FAT16:

  1. Existing folder MyFolder entry (long entry and short entry)
0003EE20   41 4D 00 79 00 46 00 6F  00 6C 00 0F 00 09 64 00   AM.y.F.o.l....d.
0003EE30   65 00 72 00 00 00 FF FF  FF FF 00 00 FF FF FF FF   e.r...yyyy..yyyy
0003EE40   4D 59 46 4F 4C 44 45 52  20 20 20 10 00 4A C4 93   MYFOLDER   ..JA"
0003EE50   56 2B 56 2B 00 00 C5 93  56 2B 02 00 00 00 00 00   V+V+..A"V+......
  2. Deleted file MyFile.txt entry (long entry and short entry)
0003EE60   E5 4D 00 79 00 46 00 69  00 6C 00 0F 00 BA 65 00   aM.y.F.i.l...?e.
0003EE70   2E 00 74 00 78 00 74 00  00 00 00 00 FF FF FF FF   ..t.x.t.....yyyy
0003EE80   E5 59 46 49 4C 45 20 20  54 58 54 20 00 C3 D6 93   aYFILE  TXT .AO"
0003EE90   56 2B 56 2B 00 00 EE 93  56 2B 03 00 33 B7 01 00   V+V+..i"V+..3·..
  3. Existing file Setuplog.txt entry (short entry only)
0003EEA0   53 45 54 55 50 4C 4F 47  54 58 54 20 18 8C F7 93   SETUPLOGTXT .??"	
0003EEB0   56 2B 56 2B 00 00 03 14  47 2B 07 00 8D 33 03 00   V+V+....G+..?3..
0003EEC0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
0003EED0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

The table for this folder contains three entries, one of which has been deleted.

The first entry is an existing folder, MyFolder. The second is a deleted file, MyFile.txt. The third is an existing file, Setuplog.txt.

The first byte of each entry belonging to the deleted file is 0xE5, so the disk scanner can conclude that this entry has been deleted.
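The FAT check described above can be sketched in Python. This is a minimal illustration, not the scanner's actual implementation: the name scan_fat_directory and the constants are hypothetical, and the field positions used (attribute byte at 0x0B, file size at 0x1C, attribute value 0x0F for long-name slots, first byte 0x00 as the end-of-table mark) come from the standard FAT directory entry layout rather than from this text. The input bytes are copied from the dump above.

```python
import struct

# Bytes 0x3EE20..0x3EEBF of the dump above, followed by the two all-zero rows.
DUMP = bytes.fromhex("".join("""
41 4D 00 79 00 46 00 6F 00 6C 00 0F 00 09 64 00
65 00 72 00 00 00 FF FF FF FF 00 00 FF FF FF FF
4D 59 46 4F 4C 44 45 52 20 20 20 10 00 4A C4 93
56 2B 56 2B 00 00 C5 93 56 2B 02 00 00 00 00 00
E5 4D 00 79 00 46 00 69 00 6C 00 0F 00 BA 65 00
2E 00 74 00 78 00 74 00 00 00 00 00 FF FF FF FF
E5 59 46 49 4C 45 20 20 54 58 54 20 00 C3 D6 93
56 2B 56 2B 00 00 EE 93 56 2B 03 00 33 B7 01 00
53 45 54 55 50 4C 4F 47 54 58 54 20 18 8C F7 93
56 2B 56 2B 00 00 03 14 47 2B 07 00 8D 33 03 00
""".split())) + bytes(32)

ENTRY_SIZE = 32
ATTR_LONG_NAME = 0x0F   # attribute value that marks a long-name slot
DELETED_MARK = 0xE5     # first byte of a deleted entry
END_MARK = 0x00         # first byte of a never-used entry (end of table)


def scan_fat_directory(table):
    """Yield (offset, deleted, short_name, size) for each short entry."""
    for off in range(0, len(table) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = table[off:off + ENTRY_SIZE]
        if entry[0] == END_MARK:
            break                      # nothing was ever stored past here
        if entry[0x0B] == ATTR_LONG_NAME:
            continue                   # long-name slots carry name pieces only
        deleted = entry[0] == DELETED_MARK
        raw = entry[:11]
        if deleted:
            raw = b"?" + raw[1:]       # the first character was overwritten
        base = raw[:8].decode("ascii", "replace").rstrip()
        ext = raw[8:11].decode("ascii", "replace").rstrip()
        name = base + ("." + ext if ext else "")
        (size,) = struct.unpack_from("<I", entry, 0x1C)
        yield off, deleted, name, size


for off, deleted, name, size in scan_fat_directory(DUMP):
    state = "deleted" if deleted else "existing"
    print(f"+0x{off:02X}  {state:8}  {name:12}  {size} bytes")
```

For the dump above, the sketch reports MYFOLDER and SETUPLOG.TXT as existing and ?YFILE.TXT as deleted. The first character of a deleted short name cannot be read from the short entry because 0xE5 has replaced it.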

Example of scanning a folder on NTFS5 (Windows 2000):

Our drive has the following parameters:

  • Total Sectors 610406
  • Cluster size 512 bytes
  • One Sector per Cluster
  • MFT starts from offset 0x4000, non-fragmented
  • MFT record size 1024 bytes
  • MFT Size 1968 records

Thus we can iterate through all 1968 MFT records, starting from the absolute offset 0x4000 on the volume, looking for deleted entries. We are interested in MFT record 57, located at offset 0x4000 + 57 * 1024 = 74752 = 0x12400, because it contains our recently deleted file "My Presentation.ppt".

MFT record number 57 is displayed below:
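The offset arithmetic and the iteration over records can be sketched as follows. The function names are hypothetical, the volume argument is assumed to be any seekable binary file object (for example an opened disk image), and the sketch only covers the non-fragmented MFT described above.

```python
MFT_START = 0x4000     # absolute offset of the MFT on the volume
RECORD_SIZE = 1024     # bytes per MFT record
RECORD_COUNT = 1968    # number of records in the MFT


def mft_record_offset(index, mft_start=MFT_START, record_size=RECORD_SIZE):
    """Absolute offset of record `index` in a non-fragmented MFT."""
    return mft_start + index * record_size


def iter_mft_records(volume, count=RECORD_COUNT, record_size=RECORD_SIZE):
    """Yield (index, raw_record) for each record that can be read in full."""
    for index in range(count):
        volume.seek(mft_record_offset(index, record_size=record_size))
        raw = volume.read(record_size)
        if len(raw) < record_size:
            break                      # ran past the end of the image
        yield index, raw


print(hex(mft_record_offset(57)))      # 0x12400
```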

Offset      0  1  2  3  4  5  6  7   8  9  A  B  C  D  E  F

00012400   46 49 4C 45 2A 00 03 00  9C 74 21 03 00 00 00 00   FILE*...?t!.....
00012410   47 00 02 00 30 00 00 00  D8 01 00 00 00 04 00 00   G...0...O.......
00012420   00 00 00 00 00 00 00 00  05 00 03 00 00 00 00 00   ................
00012430   10 00 00 00 60 00 00 00  00 00 00 00 00 00 00 00   ....`...........
00012440   48 00 00 00 18 00 00 00  20 53 DD A3 18 F1 C1 01   H....... SY?.nA.
00012450   00 30 2B D8 48 E9 C0 01  C0 BF 20 A0 18 F1 C1 01   .0+OHeA.A?  .nA.
00012460   20 53 DD A3 18 F1 C1 01  20 00 00 00 00 00 00 00    SY?.nA. .......
00012470   00 00 00 00 00 00 00 00  00 00 00 00 02 01 00 00   ................
00012480   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
00012490   30 00 00 00 78 00 00 00  00 00 00 00 00 00 03 00   0...x...........
000124A0   5A 00 00 00 18 00 01 00  05 00 00 00 00 00 05 00   Z...............
000124B0   20 53 DD A3 18 F1 C1 01  20 53 DD A3 18 F1 C1 01    SY?.nA. SY?.nA.
000124C0   20 53 DD A3 18 F1 C1 01  20 53 DD A3 18 F1 C1 01    SY?.nA. SY?.nA.
000124D0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000124E0   20 00 00 00 00 00 00 00  0C 02 4D 00 59 00 50 00    .........M.Y.P.
000124F0   52 00 45 00 53 00 7E 00  31 00 2E 00 50 00 50 00   R.E.S.~.1...P.P.
00012500   54 00 69 00 6F 00 6E 00  30 00 00 00 80 00 00 00   T.i.o.n.0...€...
00012510   00 00 00 00 00 00 02 00  68 00 00 00 18 00 01 00   ........h.......
00012520   05 00 00 00 00 00 05 00  20 53 DD A3 18 F1 C1 01   ........ SY?.nA.
00012530   20 53 DD A3 18 F1 C1 01  20 53 DD A3 18 F1 C1 01    SY?.nA. SY?.nA.
00012540   20 53 DD A3 18 F1 C1 01  00 00 00 00 00 00 00 00    SY?.nA.........
00012550   00 00 00 00 00 00 00 00  20 00 00 00 00 00 00 00   ........ .......
00012560   13 01 4D 00 79 00 20 00  50 00 72 00 65 00 73 00   ..M.y. .P.r.e.s.
00012570   65 00 6E 00 74 00 61 00  74 00 69 00 6F 00 6E 00   e.n.t.a.t.i.o.n.
00012580   2E 00 70 00 70 00 74 00  80 00 00 00 48 00 00 00   ..p.p.t.€...H...
00012590   01 00 00 00 00 00 04 00  00 00 00 00 00 00 00 00   ................
000125A0   6D 00 00 00 00 00 00 00  40 00 00 00 00 00 00 00   m.......@.......
000125B0   00 DC 00 00 00 00 00 00  00 DC 00 00 00 00 00 00   .U.......U......
000125C0   00 DC 00 00 00 00 00 00  31 6E EB C4 04 00 00 00   .U......1neA....
000125D0   FF FF FF FF 82 79 47 11  00 00 00 00 00 00 00 00   yyyy‚yG.........
000125E0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
000125F0   00 00 00 00 00 00 00 00  00 00 00 00 00 00 03 00   ................
...............
00012600   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................

An MFT record has a predefined structure. It holds a set of attributes that define the parameters of a file or folder.

The MFT record begins with the standard File Record Header (offset 0x00):

  • "FILE" identifier (4 bytes)
  • Offset to update sequence (2 bytes)
  • Size of update sequence (2 bytes)
  • $LogFile Sequence Number (LSN) (8 bytes)
  • Sequence Number (2 bytes)
  • Reference Count (2 bytes)
  • Offset to the first attribute (2 bytes)
  • Flags (2 bytes)
  • Real size of the FILE record (4 bytes)
  • Allocated size of the FILE record (4 bytes)
  • File reference to the base FILE record (8 bytes)
  • Next Attribute Id (2 bytes)

The most important information for us in this block is the file state: deleted or in use. If the lowest bit (0x0001) of the Flags field (offset 0x16) is set, the file is in use. In our example the field is zero, so the file is deleted.

Starting at offset 0x48, we have the Standard Information attribute:
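The header fields listed above can be unpacked with Python's struct module, as in the following sketch. The names FileRecordHeader and parse_file_record_header are hypothetical. The 2-byte field at offset 0x14 is read here as the offset of the first attribute, an interpretation that matches the attribute starting at 0x30 in the dump. The input is the first 0x30 bytes of record 57.

```python
import struct
from collections import namedtuple

# First 0x30 bytes of MFT record 57 (offsets 0x12400..0x1242F in the dump).
RECORD_START = bytes.fromhex("".join("""
46 49 4C 45 2A 00 03 00 9C 74 21 03 00 00 00 00
47 00 02 00 30 00 00 00 D8 01 00 00 00 04 00 00
00 00 00 00 00 00 00 00 05 00 03 00 00 00 00 00
""".split()))

FileRecordHeader = namedtuple(
    "FileRecordHeader",
    "magic usa_offset usa_size lsn sequence link_count "
    "first_attr_offset flags real_size allocated_size base_record next_attr_id",
)

HEADER_FORMAT = "<4sHHQHHHHIIQH"   # little-endian, 42 bytes, no padding
FLAG_IN_USE = 0x0001               # lowest bit of the Flags field


def parse_file_record_header(record):
    """Unpack the File Record Header from the start of an MFT record."""
    fields = struct.unpack_from(HEADER_FORMAT, record, 0)
    header = FileRecordHeader._make(fields)
    if header.magic != b"FILE":
        raise ValueError("record does not start with the FILE identifier")
    return header


def is_deleted(header):
    """A record whose in-use bit is clear describes a deleted file."""
    return not (header.flags & FLAG_IN_USE)


header = parse_file_record_header(RECORD_START)
print(header.flags, is_deleted(header))   # 0 True
```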

  • File Creation Time (8 bytes)
  • File Last Modification Time (8 bytes)
  • File Last Modification Time for File Record (8 bytes)
  • File Last Access Time (8 bytes)
  • DOS File Permissions (4 bytes); 0x20 in our case, i.e. the Archive attribute
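The fields of this attribute can be decoded as in the sketch below. It assumes that the 8-byte time values are Windows FILETIME values, that is, counts of 100-nanosecond intervals since 1601-01-01 UTC; the text above does not state the encoding. The helper name filetime_to_datetime is hypothetical, and the input is the 36 bytes starting at offset 0x48 of the record.

```python
import struct
from datetime import datetime, timedelta, timezone

# Standard Information content, offsets 0x12448..0x1246B in the dump.
STD_INFO = bytes.fromhex("".join("""
20 53 DD A3 18 F1 C1 01
00 30 2B D8 48 E9 C0 01
C0 BF 20 A0 18 F1 C1 01
20 53 DD A3 18 F1 C1 01
20 00 00 00
""".split()))

EPOCH_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)
ATTR_ARCHIVE = 0x20


def filetime_to_datetime(value):
    """Convert a FILETIME (100 ns ticks since 1601, assumed) to a datetime."""
    return EPOCH_1601 + timedelta(microseconds=value // 10)


created, modified, record_modified, accessed, dos_flags = struct.unpack_from(
    "<QQQQI", STD_INFO, 0
)

print("created :", filetime_to_datetime(created))
print("modified:", filetime_to_datetime(modified))
print("archive :", bool(dos_flags & ATTR_ARCHIVE))
```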

Next, following a standard attribute header, comes the File Name attribute for the DOS name space (short file name), at offset 0xA8. After it, again following a standard attribute header, comes the File Name attribute for the Win32 name space (long file name), at offset 0x120. Both have the same layout:

  • File Reference to the Parent Directory (8 bytes)
  • File Modification Times (32 bytes)
  • Allocated Size of the File (8 bytes)
  • Real Size of the File (8 bytes)
  • Flags (8 bytes)
  • Length of File Name (1 byte)
  • File Name Space (1 byte)
  • File Name (Length of File Name * 2 bytes)

From this section we can extract the file name, "My Presentation.ppt", the file creation and modification times, and the parent directory record number.

Starting at offset 0x188, there is a non-resident Data attribute:
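Extracting these values from the Win32 File Name attribute can be sketched as follows. The function name parse_file_name is hypothetical. Two details are assumptions not stated in the text above: the file name is taken to be UTF-16LE, and the lower 48 bits of the 8-byte parent reference are taken as the record number (the upper 16 bits being a sequence number). The input is the attribute content at offsets 0x120..0x187 of the record.

```python
import struct

# Win32 File Name attribute content, offsets 0x12520..0x12587 in the dump.
FILE_NAME = bytes.fromhex("".join("""
05 00 00 00 00 00 05 00 20 53 DD A3 18 F1 C1 01
20 53 DD A3 18 F1 C1 01 20 53 DD A3 18 F1 C1 01
20 53 DD A3 18 F1 C1 01 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 20 00 00 00 00 00 00 00
13 01 4D 00 79 00 20 00 50 00 72 00 65 00 73 00
65 00 6E 00 74 00 61 00 74 00 69 00 6F 00 6E 00
2E 00 70 00 70 00 74 00
""".split()))

FIXED_FORMAT = "<Q32sQQQBB"     # the fixed fields listed above: 66 bytes
NAME_OFFSET = struct.calcsize(FIXED_FORMAT)
NAMESPACES = {0: "POSIX", 1: "Win32", 2: "DOS", 3: "Win32 & DOS"}


def parse_file_name(content):
    """Return the name, name space and parent record of a File Name attribute."""
    (parent_ref, times, allocated, real, flags,
     name_len, namespace) = struct.unpack_from(FIXED_FORMAT, content, 0)
    raw_name = content[NAME_OFFSET:NAME_OFFSET + 2 * name_len]
    return {
        "name": raw_name.decode("utf-16-le"),
        "namespace": NAMESPACES.get(namespace, "unknown"),
        # Assumption: low 48 bits hold the parent's MFT record number.
        "parent_record": parent_ref & 0xFFFFFFFFFFFF,
        # Four 8-byte time values, left undecoded here.
        "times": struct.unpack("<4Q", times),
    }


info = parse_file_name(FILE_NAME)
print(info["name"], "|", info["namespace"], "| parent record", info["parent_record"])
```

For the dump above, the sketch yields the name "My Presentation.ppt" in the Win32 name space with parent record 5.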

  • Attribute Type (4 bytes) (e.g. 0x80)
  • Length including header (4 bytes)
  • Non-resident flag (1 byte)
  • Name length (1 byte)
  • Offset to the Name (2 bytes)
  • Flags (2 bytes)
  • Attribute Id (2 bytes)
  • Starting VCN (8 bytes)
  • Last VCN (8 bytes)
  • Offset to the Data Runs (2 bytes)
  • Compression Unit Size (2 bytes)
  • Padding (4 bytes)
  • Allocated size of the attribute (8 bytes)
  • Real size of the attribute (8 bytes)
  • Initialized data size of the stream (8 bytes)
  • Data Runs ...

In this section we are interested in the Compression Unit Size (zero in our case, meaning the data is not compressed), the Allocated and Real sizes of the attribute, which equal our file size (0xDC00 = 56320 bytes), and the Data Runs (see the next chapter).
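The fields of the non-resident Data attribute can be read as in the following sketch, which also cross-checks the allocated size against the VCN range and the 512-byte cluster size given earlier. The names DataAttribute and parse_nonresident_attribute are hypothetical. The data runs are returned as raw bytes without being decoded, since decoding belongs to the next chapter. The input is the 0x48 bytes at offsets 0x188..0x1CF of the record.

```python
import struct
from collections import namedtuple

# Non-resident Data attribute, offsets 0x12588..0x125CF in the dump.
DATA_ATTR = bytes.fromhex("".join("""
80 00 00 00 48 00 00 00
01 00 00 00 00 00 04 00 00 00 00 00 00 00 00 00
6D 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00
00 DC 00 00 00 00 00 00 00 DC 00 00 00 00 00 00
00 DC 00 00 00 00 00 00 31 6E EB C4 04 00 00 00
""".split()))

DataAttribute = namedtuple(
    "DataAttribute",
    "type length non_resident name_length name_offset flags attribute_id "
    "start_vcn last_vcn runs_offset compression_unit padding "
    "allocated_size real_size initialized_size",
)

ATTR_FORMAT = "<IIBBHHHQQHHIQQQ"   # the fields listed above: 64 bytes
CLUSTER_SIZE = 512                 # from the drive parameters


def parse_nonresident_attribute(attr):
    """Return the unpacked header and the raw (undecoded) data-run bytes."""
    header = DataAttribute._make(struct.unpack_from(ATTR_FORMAT, attr, 0))
    if not header.non_resident:
        raise ValueError("attribute is resident; this layout does not apply")
    raw_runs = attr[header.runs_offset:header.length]
    return header, raw_runs


data, raw_runs = parse_nonresident_attribute(DATA_ATTR)
clusters = data.last_vcn - data.start_vcn + 1
print("compressed:", data.compression_unit != 0)
print("real size :", data.real_size, "bytes")
print("clusters  :", clusters, "->", clusters * CLUSTER_SIZE, "bytes allocated")
```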