Skip to content

mhw/topfield-hdsave

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

64 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Topfield HDSave

by Mark Wilkinson [email protected]

http://github.com/mhw/topfield-hdsave/

Current State

HDSave is still in development and should be considered alpha quality. The Unix command line program (tfhd) is able to list the contents of directories and copy files from the Topfield hard disk to the host filesystem. This might be enough functionality for you to warrant the risk of letting this code loose on your hard disk. To make you feel more comfortable I'll point out that the raw disk device is opened read-only, so it should not be possible for tfhd to alter the disk's contents. Your mileage may vary though.

All development work and testing has been done on a 64-bit Intel processor; the software has not been tested on a 32-bit system yet.

Work on the TAP mentioned below has not progressed beyond a Makefile that compiles some of the source code with the Topfield MIPS tool chain.

Downloading and Building

For the time being you will need to compile HDSave yourself. You can download a source tarball or zip file, or check the code out using git, from the Github URL above.

Once you have a copy of the source code you should be able to build the tfhd command line program by running make in the unix subdirectory:

$ cd unix
$ make
[...]
$ ./tfhd
usage: tfhd [options] <command> [args...]
options:
        -f DEVICE       Topfield disk to manipulate
        -m FILE         Use a previously saved map file
        -s SIZE         Set disk size instead of probing device
commands:
        info            Print basic information about the disk
        ls [dir]        List contents of a directory
        cp <src> <dst>  Copy contents of a file to host filesystem
$

If it fails to compile, please let me know what went wrong. Better still, if you get it to build on a new platform either fork the code on github, commit your changes and then send me a pull request, or just send me a patch and I'll incorporate it.

Using tfhd

To use thfd you need to connect your Topfield hard disk to your PC. I use USB caddies to do this, but directly attaching the disk to the IDE or SATA bus should also work. As this necessitates removing the hard disk from the Toppy I need to say this: Be careful of high voltages on the power supply components when working inside the Toppy. Removing the cover of your Toppy will void any warranty on the hardware. You follow these instructions at your own risk.

tfhd needs read access to the raw disk device corresponding to the Topfield hard disk. I recommend you grant read access to all users on your system in preference to running tfhd as the root user. On an Ubuntu system you can do that like this:

$ sudo chmod o+r /dev/sdb

replacing /dev/sdb with the appropriate raw device. On Ubuntu systems the device files are recreated each time the device is plugged in or the system rebooted, so these permission changes are not permanent.

You should now be able to use tfhd to list the contents of the directories:

$ ./tfhd -f /dev/sdb ls
__RECYCLE__/
DataFiles/
ProgramFiles/
MP3/
$ ./tfhd -f /dev/sdb ls /DataFiles
[...]
$

You can copy a file from the Topfield hard disk to the host filesystem like this:

$ ./tfhd -f /dev/sdb cp /ProgramFiles/Auto\ Start/MyStuff.tap MyStuff.tap
$

Note that you need to quote spaces in filenames to prevent the shell from splitting them into separate arguments.

You can create a 'disk map' file, which records the filenames of all files and directories on the disk, along with the position and size of all the clusters that each file is stored in. The command to do this is:

$ ./tfhd -f /dev/sdb map disk.map
$

disk.map will contain the disk map. The disk map is plain text so you can look at it with a text editor.

Sparse Clones

It is also possible to create a 'sparse clone' of the disk: this is a file in the host filesystem which contains only the blocks of data that tfhd has actually read from the Topfield disk during a run. The sparse clone can then be used in place of the original disk, as long as the commands being executed are a subset of those originally cloned. For example:

$ ./tfhd -f /dev/sdb -c disk.img info
/dev/sdb: 160.042G device - 312581808 * 512 byte blocks
[...]
$ ls -lh disk.img
-rw-r--r-- 1 mhw mhw 1.0K 2010-03-09 17:13 disk.img
$

So at this point disk.img contains 1k of data, those being the two superblocks from the start of the disk that the info command reads.

$ ./tfhd -f /dev/sdb -c disk.img map disk.map
$ ls -lh disk.img
-rw-r--r-- 1 mhw mhw 131G 2010-03-09 17:13 disk.img
$ du -h disk.img
3.2M    disk.img
$

disk.img is now a 131Gb file that only occupies 3.2Mb of actual disk space because most of the blocks have never been written to.

$ ./tfhd -f disk.img ls
warning: superblock 2444 blocks per cluster does not match calculated 2256 blocks per cluster
__RECYCLE__/
DataFiles/
ProgramFiles/
MP3/
$

We can now run commands against the sparse clone. The warning message appears because the physical size of the sparse clone does not match the size of the raw disk, and hence the calculated number of blocks per cluster is lower than it should be. You could silence the message by using the -s option to override the disk size:

$ ./tfhd -f disk.img -s 160G ls
forcing device size to be 160G
__RECYCLE__/
DataFiles/
ProgramFiles/
MP3/
$

Future Plans

The intention is for HDSave to have a couple of parts. The first part is a TAP that runs on the Topfield device making periodic backups of the directory structure and extent locations of all files on the Topfield hard disc (this backup we will refer to as the 'disk map'). The second part is a program that runs on a PC (under Windows, Linux or OS X) and allows files to be recovered from a disk by using the extent locations captured by the TAP. The idea is that the disk map represents a complete enough backup of the Topfield disk structure that the majority of the underlying recordings could be recovered in the event that the disk structure gets trashed by the crash.

Some Implementation Thoughts

  • The TAP needs to be written in C. It will be easier for me to build the code on a Linux machine with a Topfield disk plugged in to it, so the core of the directory interpretation code needs to be abstracted from the bits that implement reading raw disk blocks.
  • This abstraction layer can also be used to make the command line recovery tool portable between Unix and other operating systems.
  • The disk structure dumper should be runnable as a command line program under Linux for testing. It should be simple enough to wrap the core disk structure dumping logic into either a CLI program or a TAP.

Disk Map Structure and Location

  • The disk map should be written into a file in the Topfield filesystem when running as a TAP. This would allow the disk map to be copied off the device periodically by an attached WL500g or similar device.
  • The disk map can be found by looking at the first block of each cluster to see if it looks like a disk map file. The time to read the first block of each cluster should be reasonably small.
  • The disk map file should be written to a new file each time to maximise the number of clusters that will contain a copy of the disk map from some point in time.
  • The disk map could also be written to the last sectors of the disk if these sectors are unused, on the basis that they are likely to remain unused and will be easy to find after a disk structure failure.
  • If the last sectors are used it may then be more useful to recreate the disk map file each time it is written to maximise the number of copies of the disk map that might be left on the disk in the event of a crash.

The disk map should have header and trailer lines that allow the map to be found and validated. If there is no off-disk backup of the disk map and the last sectors contain file data a disk scan may be able to locate the sectors that contained the file copy of the disk map by looking for the disk map header. By matching something like a timestamp between a header line and a footer line it should be possible to verify that the whole disk map has been recorded. A checksum might also be a good idea.

If the disk map spans more than one cluster it is possible that the disk map could be split into smaller fragments. This might mean that it is better to write header and trailer lines around each block-sized chunk of the disk map.

The disk map should be plain text. In the simplest case it should be possible to take the raw block offsets and lengths from a copy of the disk map and use the Unix 'dd' command to extract the file data from the disk.

Thanks

Thanks to:

  • Firebird for documenting the disk structure, and for firebirdlib.
  • R2-D2 for explaining directory entries when I hadn't read the documentation properly.

License

Copyright 2010 Mark H. Wilkinson

HDSave is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

HDSave is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with HDSave. If not, see http://www.gnu.org/licenses/.

About

Topfield PVR hard disk tools

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages