dd_rescue (1) - Linux Manuals
dd_rescue: Data recovery and protection tool
NAME
dd_rescue - Data recovery and protection tool
SYNOPSIS
dd_rescue [options] infile outfile
dd_rescue [options] [-2/-3/-4/-z/-Z seed/seedfile] outfile
dd_rescue [options] [--shred2/--shred3/--shred4/--random/--frandom seed/seedfile] outfile
DESCRIPTION
dd_rescue is a tool that copies data from a source (file, block device, pipe, ...) to one (or several) output file(s).
If input and output files are seekable (block devices or regular files), dd_rescue copies with large blocks (softbs) to increase performance. When a read error is encountered, dd_rescue falls back to reading smaller blocks (hardbs) to recover the maximum amount of data. If blocks still cannot be read, dd_rescue by default skips over them in the output file as well, which avoids overwriting data that might have been copied there successfully in a previous run. (Option -A / --alwayswrite changes this.)
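The fallback from softbs to hardbs can be illustrated with a small Python sketch. This is not dd_rescue's implementation; the names rescue_copy, make_reader and read_at are hypothetical, and the "device" is a byte string with simulated bad ranges.

```python
def rescue_copy(read_at, size, softbs, hardbs):
    """Copy `size` bytes using large reads, falling back to small reads on error.

    `read_at(pos, length)` is a hypothetical reader that returns bytes or
    raises OSError. Returns (chunks, bad): chunks maps offset -> data that
    was read, bad lists the offsets of unreadable hardbs-sized blocks.
    """
    chunks, bad = {}, []
    pos = 0
    while pos < size:
        length = min(softbs, size - pos)
        try:
            chunks[pos] = read_at(pos, length)   # fast path: one large read
        except OSError:
            # Slow path: retry the same range in hardbs-sized pieces.
            end = pos + length
            small = pos
            while small < end:
                n = min(hardbs, end - small)
                try:
                    chunks[small] = read_at(small, n)
                except OSError:
                    bad.append(small)            # skipped; output stays untouched
                small += n
        pos += length
    return chunks, bad


def make_reader(data, bad_ranges):
    """Build a simulated reader over `data` that fails inside `bad_ranges`."""
    def read_at(pos, length):
        for lo, hi in bad_ranges:
            if pos < hi and pos + length > lo:   # request overlaps a bad range
                raise OSError("simulated read error")
        return data[pos:pos + length]
    return read_at
```

With a softbs of 256 and a hardbs of 32, a 10-byte defect costs only the single 32-byte block that contains it, rather than the whole 256-byte block.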
dd_rescue can copy in reverse direction as well, which allows approaching a bad spot from both directions. As trying to read over a bad spot of significant size can take very long (and potentially cause further damage), this is an important optimization when recovering data. The dd_rhelp tool takes advantage of this and automates data recovery. dd_rescue does not (by default) truncate the output file.
dd_rescue by default reports on progress, and optionally also writes into a logfile. It has a progress bar and gives an estimate for the remaining time. dd_rescue has a wealth of options that influence its behavior, such as the possibility to use direct IO for input/output, to use fallocate() to preallocate space for the output file, using splice copy (in kernel zerocopy) for efficiency, looking for empty blocks to create sparse files, or using a pseudo random number generator (PRNG) to quickly overwrite data with random numbers.
The modes to overwrite partitions or files with pseudo random numbers make dd_rescue a tool that can be used for secure data deletion and thus not just a data recovery and backup tool but also a data protection tool.
You can use "-" as infile or outfile, meaning stdin or stdout. Note that such a file is not seekable, which limits the usefulness of some of dd_rescue's features.
OPTIONS
When parsing numbers, dd_rescue assumes bytes. It accepts the following suffixes:
b -- 512 size units (blocks)
k -- 1024 size units (binary kilobytes, kiB)
M -- 1024^2 size units (binary megabytes, MiB)
G -- 1024^3 size units (binary gigabytes, GiB)
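As a rough illustration of the suffix table above, the following Python sketch converts such a string into a byte count. The function name parse_size is hypothetical, and only the four suffixes listed here are handled; anything else dd_rescue might accept is outside the scope of this sketch.

```python
# Multipliers taken from the suffix list above.
SUFFIXES = {"b": 512, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a string such as '64k' or '1M' into a number of bytes."""
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)  # no suffix: plain bytes
```

For example, the default softbs of 64k for buffered I/O corresponds to parse_size("64k"), i.e. 65536 bytes.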
The following options may be used to modify the behavior of dd_rescue.
General options
- -h, --help
- This option tells dd_rescue to output a list of options and exit.
- -V, --version
- Display version number and exit.
- -q, --quiet
- tells dd_rescue to be less verbose.
- -v, --verbose
- makes dd_rescue more verbose.
- -c 0/1, --color=0/1
- controls whether dd_rescue uses colors. By default it does, unless the terminal type from TERM is unknown or dumb or ends in -m or -mono.
- -f, --force
- makes dd_rescue skip some sanity checks (e.g. automatically setting reverse direction when input and output file are the same and ipos < opos).
- -i, --interactive
- tells dd_rescue to ask before overwriting existing files.
Block sizes
- -b softbs, --softbs=softbs, --bs=softbs
- sets the (larger) block size to softbs bytes. dd_rescue will transfer chunks of that size unless a read error is encountered (or the end of the input file or the maximum transfer size has been reached). The default value for this is 64k for buffered I/O and 1M for direct I/O.
- -B hardbs, --hardbs=hardbs, --block-size=hardbs
- sets the (smaller) fallback block size to hardbs bytes. When dd_rescue encounters read errors, it will fall back to copying data in chunks of this size. This value defaults to 4k for buffered I/O and 512 bytes for direct I/O.
hardbs should be equal to or smaller than softbs. If both block sizes are identical, no fallback mechanism (and thus no retry) will take place on read errors.
- -y syncsize, --syncfreq=syncsize
- tells dd_rescue to call fsync() on the output file every syncsize bytes (will be rounded to multiples of softbs sized blocks). It will also update the progress indicator at least as often. By default, syncsize is set to 0, meaning that fsync() is only issued at the end of the copy operation.
Positions and length
- -s ipos, --ipos=ipos, --input-position=ipos
- sets the starting position of the infile to ipos. Note that ipos is specified in bytes (but suffixes can be used, see above), not in terms of softbs or hardbs sized blocks. The default value for this is 0. When reverse direction copy is used, an ipos of 0 is treated specially, meaning the end of file.
Negative positions result in an error message.
- -S opos, --opos=opos, --output-position=opos
- sets the starting position of the outfile to opos. If not specified, opos is set to ipos, so the file offsets in input and output file are the same. For reverse direction copy, an explicit opos of 0 will position at the end of the output file.
- -x, --extend, --append
- changes the interpretation of the output position to start at the end of the existing output file, making appending to a file convenient. If the output file does not exist, an error will be reported and dd_rescue will abort.
- -m maxxfer, --maxxfer=maxxfer, --max-size=maxxfer
- specifies the maximum number of bytes (suffixes apply, but it's NOT counted in blocks) that dd_rescue copies. If EOF is encountered before maxxfer bytes have been transferred, this option will be silently ignored.
- -M, --noextend
- tells dd_rescue to not extend the output file. This option is particularly helpful when overwriting a file with random data or zeroes for safe data destruction. If the output file does not exist, an error message will be generated and the program will abort.
Error handling
- -e maxerr, --maxerr=maxerr
- tells dd_rescue to exit after maxerr read errors have been encountered. By default, this is set to 0, resulting in dd_rescue trying to move on until it hits EOF (or maxxfer bytes have been transferred).
- -w, --abort_we
- makes dd_rescue abort on any write errors. By default, on reported write errors, dd_rescue tries to rewrite the blocks with small block size writes, so a small failure in a larger block will not cause the whole block not to be written. Note that this may be handled similarly by your Operating System kernel with buffered writes without the user or dd_rescue noticing; the write retry logic in dd_rescue is mostly useful for direct I/O writes where write errors can be reliably detected.
Write error detection with buffered writes is unreliable; the kernel reports success, and traces of the failing writeback operations may only appear later in your syslog. dd_rescue does try to notify the user by calling fsync() and carefully checking the return values of fsync() and close() calls.
Note that dd_rescue does exit if writes to the output file result in the Operating System reporting that no space is left.
Sparse files and write avoidance
- -A, --alwayswrite
- changes the behavior of dd_rescue to write zeroes to the output file when the input file could not be read. By default, it just skips over, leaving whatever content was in the output file at the file position before. The default behavior may be desired, if e.g. previous copy operations may have resulted in good data being in place; it may be undesired if the output file may contain garbage (or sensitive information) that should rather be overwritten with zeroes.
- -a, --sparse
- will make dd_rescue look for empty blocks (of at least half of softbs size), i.e. blocks filled with zeroes. Rather than writing those zeroes to the output file, it will then skip forward in the output file, resulting in a sparse file, saving space in the output file system (if it supports sparse files). Note that if the output file already exists and already has data stored at the location where zeroes are skipped over, this will result in an incomplete copy: the output file differs from the input file at the locations where blocks of zeroes were skipped over. dd_rescue tries to detect this and issue a warning, but it does not prevent this from happening.
- -W, --avoidwrite
- results in dd_rescue reading a block (softbs sized) from the output file prior to writing it. If it is already identical to the data that would be written to it, the write is skipped. This option may be useful for devices where writes should be avoided (e.g. because they may impact the remaining lifetime or because they are very slow compared to reads).
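The per-block decisions behind -a / --sparse and -W / --avoidwrite can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not dd_rescue's logic: the function name plan_write and its return values are hypothetical, and zero detection is done on whole blocks only, whereas the text above mentions runs of at least half of softbs.

```python
def plan_write(new_block, existing_block=None, sparse=False, avoidwrite=False):
    """Decide what to do with one block that is about to be written.

    Returns 'skip-sparse' if the block is all zeroes and sparse mode is on,
    'skip-identical' if the output already holds the same bytes and
    avoidwrite mode is on, and 'write' otherwise.
    """
    if sparse and new_block.count(0) == len(new_block):
        return "skip-sparse"        # seek forward instead of writing zeroes
    if avoidwrite and existing_block is not None and existing_block == new_block:
        return "skip-identical"     # output already matches, no write needed
    return "write"
```

Note that the sparse branch never looks at existing_block, which mirrors the caveat above: skipping zeroes over pre-existing data leaves that old data in place.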
Other optimization
- -R, --repeat
- tells dd_rescue to only read one block (softbs sized) and then repeatedly write it to the output file. Note that this results in never hitting EOF on the input file and should be used with a limit for the transfer size (options -m or -M) or when filling up an output device completely.
This option is automatically set, if the input file name equals "/dev/zero".
- -u, --rmvtrim
- instructs dd_rescue to remove the output file after writing to it has completed and issue a FITRIM on the file system that contains the output file. This only makes sense if writing zeros (or random numbers) as opposed to useful content from another file. (dd_rescue will ask for confirmation if this is specified with a normal input file and no -f (--force) is used.) This option may be used to ensure that all empty blocks of a file system are filled with zeros (rather than containing fragments of deleted files with possibly sensitive information).
The FITRIM ioctl (on Linux) tells the flash storage to consider the freed space as unused (like the fstrim tool or the discard option) by issuing ATA TRIM commands. This will only succeed with superuser privileges (but the error can otherwise be safely ignored). This is useful to ensure full performance of flash memory / SSDs. Note that FITRIM can take a while on large file systems, especially if the file systems are not mounted with the discard option and have not been trimmed (with e.g. fstrim) for a while. Not all file systems and not all flash-based storage support this.
- -k, --splice
- tells dd_rescue to use the Linux in-kernel zerocopy splice() copy operation rather than reading blocks into a user space buffer. Note that this operation mode does prevent the support of a number of dd_rescue features that can normally be used, such as falling back to smaller block sizes, avoiding writes, sparse mode, repeat optimization, reverse direction copy. A warning is issued to make the user aware.
- -P, --fallocate
- results in dd_rescue calling fallocate() on the output file, telling the file system how much space to preallocate for the output file. (The size is determined by the expected last position, as inferred from the input file length and maxxfer). On file systems that support it, this results in them making better allocation decisions, avoiding fragmentation. (Note that it does not make sense to use sparse together with fallocate().)
This option is only available if dd_rescue is compiled with fallocate() support. For optimal support, it should be compiled with the libfallocate library.
Misc options
- -r, --reverse
- tells dd_rescue to copy in reverse direction, starting at ipos (with special case 0 meaning EOF) and working towards the beginning of the file. This is especially helpful if the input file has a bad spot which can be extremely slow to skip over, so approaching it from both directions saves a lot of time (and may prevent further damage).
Note that dd_rescue does automatically switch to reverse direction copy, if input and output file are identical and the input position is smaller than the output position, similar to the intelligence that memmove() uses to prevent loss of data when overlapping areas are copied. The option -f / --force does prevent this intelligence from happening.
- -p, --preserve
- When copying files, this option results in file metadata (timestamps, ownership, access rights, xattrs) being copied, similar to the option with the same name in the cp program.
Note that ACLs and xattrs will only be copied if dd_rescue has been compiled with libxattr support and the library can be dynamically loaded on the system. Also note that failing to copy the attributes with -p is not considered a failure and thus won't negatively affect the exit code of dd_rescue.
- -t, --truncate
- tells dd_rescue to open the output file with O_TRUNC, so that the output file (if it is a regular file) is truncated to 0 bytes before writing to it, removing all previous content that the file may have contained. By default, dd_rescue does not remove previous content.
- -T, --trunclast
- tells dd_rescue to truncate the output file to the highest copied position after the copy operation completed, thus ensuring there's no data beyond the end of the data that has been copied in this run.
- -d, --odir_in
- instructs dd_rescue to open infile with O_DIRECT, bypassing the kernel buffers. While this option has a negative effect on performance (the kernel does read-ahead for buffered I/O), it will result in errors being detected more quickly (kernel won't retry) and allows for smaller I/O units (hardware sector size, 512 bytes for most hard disks).
O_DIRECT may not be available on all platforms.
- -D, --odir_out
- tells dd_rescue to open outfile with O_DIRECT, bypassing kernel buffers. This has a significant negative effect on performance, as the program needs to wait for writes to hit the disks as opposed to the asynchronous nature of buffered writeback. On the flip side, the return status from writing is reliable this way and smaller I/O chunks (hardware sector size, 512 bytes) are possible.
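The memmove()-like direction switch described under -r / --reverse above can be demonstrated with a small Python sketch. It copies a region within one buffer block by block, with the output position behind the input position; the function name copy_within is hypothetical and the buffer stands in for a file that is both infile and outfile.

```python
def copy_within(buf, ipos, opos, length, bs, reverse):
    """Copy `length` bytes inside `buf` from ipos to opos in bs-sized blocks."""
    offsets = range(0, length, bs)
    for off in (reversed(offsets) if reverse else offsets):
        n = min(bs, length - off)
        # Same-length slice assignment, so the buffer size never changes.
        buf[opos + off:opos + off + n] = buf[ipos + off:ipos + off + n]
    return buf


# ipos (0) < opos (2) and the ranges overlap: forward copy destroys the
# source before it has been read, reverse copy does not.
forward = copy_within(bytearray(b"abcdefgh"), 0, 2, 6, 2, reverse=False)
backward = copy_within(bytearray(b"abcdefgh"), 0, 2, 6, 2, reverse=True)
```

Here backward ends up as b"ababcdef" (the first six bytes moved intact), while forward degenerates into repeated copies of the first block.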
Logging
- -l logfile, --logfile=logfile
- Unless in quiet mode, dd_rescue does produce constant updates on the status of the copy operation to stderr. With this option, these updates are also written to the specified logfile. The control characters (to move the cursor up to overwrite the existing status lines) are not written to the logfile.
- -o bbfile, --bbfile=bbfile
- instructs dd_rescue to write a list of bad blocks to bbfile. The file will contain a list of numbers (ASCII), one per line, where the numbers indicate the offset in terms of hardbs sized blocks. The file format is compatible with that of badblocks. Using dd_rescue on a block device (partition) and setting hardbs to the block size of a file system that you want to create, you should be able to feed the bbfile to mke2fs with the option -l.
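Since the bbfile records offsets in units of hardbs rather than bytes, a reader of the file has to multiply by the block size that was in effect. A minimal Python sketch of that conversion follows; the function name bad_block_ranges is hypothetical, and it assumes exactly the format described above (one ASCII number per line).

```python
def bad_block_ranges(bbfile_text, hardbs):
    """Turn a bad block list into (start, end) byte ranges, end exclusive."""
    ranges = []
    for line in bbfile_text.splitlines():
        line = line.strip()
        if not line:
            continue                      # tolerate blank lines
        block = int(line)
        ranges.append((block * hardbs, (block + 1) * hardbs))
    return ranges
```

With a hardbs of 4096, an entry of 3 refers to bytes 12288 up to (but not including) 16384.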
Multiple output files
- -Y ofileX, --outfile=ofileX, --of=ofileX
- If you want to copy data to multiple files simultaneously, you can specify this option. It can be specified multiple times, so many copies can be made. Note that these files are secondary output files; they share file position with the primary output file outfile. Errors when writing to a secondary output file are ignored.
Data protection by overwriting with random numbers
- -z RANDSEED, --random=RANDSEED
- -Z RANDSEED, --frandom=RANDSEED
- -2 RANDSEED, --shred2=RANDSEED
- -3 RANDSEED, --shred3=RANDSEED
- -4 RANDSEED, --shred4=RANDSEED
- When you want to overwrite a file, partition or disk with random data, using /dev/urandom (on Linux) as input is not a very good idea; the interface has not been designed to yield a high bandwidth. It's better to use a user space Pseudo Random Number Generator (PRNG). With option -z / --random, the C library's PRNG is used. With -Z / --frandom and the -2/-3/-4 / --shred2/3/4 options, an RC4 based PRNG is used.
Note that in this mode, there is no infile so the first non-option argument is the output file.
The PRNG needs seeding; the C library's PRNG takes a 32bit integer (4 bytes); the RC4 based PRNG takes 256 bytes. If RANDSEED is an integer, the integer number will be used to seed the C library's PRNG. For the RC4 method, the C library's PRNG then generates the 256 bytes to seed it. This creates repeatable PRNG data. The RANDSEED value of 0 is special; it will create a seed value that's based on the current time and the process' PID and should be different for multiple runs of dd_rescue.
If RANDSEED is not an integer, it's assumed to be a file name from which the seed values can be read. dd_rescue will read 4 or 256 bytes from the file to seed the C library's or the RC4 PRNG. For good pseudo random numbers, using /dev/urandom to seed is a good idea.
The modes -2/-3/-4 resp. --shred2/--shred3/--shred4 will overwrite the output file multiple times; after each pass, fsync() will ensure that the data does indeed hit the file. The last pass for these modes will overwrite the file with zeroes. The rationale behind doing this is to make it easier to hide that important data may have been overwritten, to make it easier for intelligent storage systems (such as SSDs) to recycle the empty blocks and to allow for better compression of a file system image containing such data.
With -2 / --shred2, one pass with RC4 PRNG generated random numbers is performed and then zeroes are written. With -3 / --shred3, there are two passes with RC4 PRNG generated random numbers and a zero pass; the second PRNG pass writes the bit-wise inverse of the numbers from the first pass. -4 / --shred4 works like -3 / --shred3, with an additional pass with independent random numbers as third pass.
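The pass sequences of the three shred modes can be summarized in a short Python sketch. The names shred_passes and invert and the pass labels are hypothetical; reading "inverse" as the bit-wise complement (every bit flipped) is an assumption of this sketch, not something the text above spells out.

```python
def shred_passes(mode):
    """Return the ordered list of passes for --shred2, --shred3 or --shred4."""
    passes = {
        2: ["random", "zero"],
        3: ["random", "inverse", "zero"],
        4: ["random", "inverse", "random-independent", "zero"],
    }
    return passes[mode]


def invert(block):
    """Bit-wise complement of a block (assumed meaning of 'inverse')."""
    return bytes(b ^ 0xFF for b in block)
```

Under this reading, the first two passes of --shred3 together guarantee that every bit on the medium has been written as both 0 and 1 before the final zero pass.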
Plugins
Since version 1.42, dd_rescue has an interface for plugins. Plugins have the ability to analyze the copied data or to transform it prior to it being written.
- -L plugin1[=param1[:param2[:..]]][,plugin2[=..][,..]], --plugins=plugin1[=param1[:param2[:..]]][,plugin2[=..][,..]]
- loads plugins plugin1 ... and passes parameters to them. All plugins should support at least the help parameter and provide information on their usage.
Plugins may impose limits on dd_rescue. Plugins that look at the data can't work with splice, as this avoids copying data to user space. Also the interface currently does not facilitate reverse direction copy. Some plugins may impose further restrictions w.r.t. alignment of data in the file or not using sparse detection.
See section PLUGINS for an overview of available plugins.
PLUGINS
null
The null plugin (ddr_null) does nothing, except if you specify the [no]lnchange or the [no]change options in which case the plugin indicates to others that it transforms the length of the output or the data of the stream. (With the no prefix, it's reset to the default no-change indication again.) This may be helpful for testing or to influence which file the hash plugin considers for reading/writing extended attributes from/to and for plugins to change their behavior with respect to hole detection.
ddr_null_ddr also allows you to specify debug in which case it just reports the blocks that it passes on.
hash
When the hash plugin (subsequently referred to as ddr_hash) is loaded, it will calculate a cryptographic hash and optionally also a HMAC over the copied data and print the result at the end of the copy operations. The hash algorithm can be chosen by specifying alg[o[rithm]]=ALG where ALG is one of md5, sha1, sha256, sha224, sha512, sha384. (Specify alg=help to get a list.) To abbreviate the syntax, the alg= piece can be omitted.
For backwards compatibility, the hash plugin can also be referred to with the old MD5 name; it then defaults to the md5 algorithm.
The computed value should be identical to calling md5sum/sha256sum/... on the target file (unless you only write part of the file), but saves time by not accessing the (possibly large) file a second time. The hash plugin handles sparse writes and arbitrary offsets fine.
ddr_hash also supports the parameter append=STRING which appends the given STRING to the output before computing the cryptographic hash. Treating the STRING as a shared secret, this can actually be used to protect against someone not knowing the secret altering the contents (and recomputing the hash) without anyone noticing. It's thus a cheap way of a cryptographic signature (but with preshared secrets as opposed to public key cryptography). Use HMAC for a somewhat better way to sign data with a shared secret.
ddr_hash also supports prepend=STRING which is likely harder to attack with brute force than an appended string. Note that ddr_hash always prepends multiples of the hash algorithm's block size and pads the STRING with 0 to match.
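The padding rule for prepend= can be sketched with Python's hashlib. This is an illustration under assumptions, not ddr_hash's code: the function name hash_with_prepend is hypothetical, and "pads the STRING with 0" is read here as padding with NUL bytes up to the next multiple of the algorithm's block size.

```python
import hashlib


def hash_with_prepend(data, secret, alg="sha256"):
    """Hash `data` preceded by `secret`, padded to whole hash blocks."""
    h = hashlib.new(alg)
    bs = h.block_size                          # 64 bytes for md5/sha1/sha256
    padded_len = -(-len(secret) // bs) * bs    # round up to a multiple of bs
    h.update(secret.ljust(padded_len, b"\0"))  # assumed: NUL byte padding
    h.update(data)
    return h.hexdigest()
```

Padding to a full block means the secret occupies its own compression step, so the data always starts on a block boundary regardless of the secret's length.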
ddr_hash can be used to compute a HMAC (Hash-based Message Authentication Code) instead of the plain hash. The HMAC uses a password that's prepended and transformed twice to the data which is then hashed twice. HMAC is believed to protect somewhat better against extension or collision attacks than a plain hash (with a plain prepended secret), so it's a better way to authenticate data with a shared secret. (You can use append/prepend in addition to HMAC, if you have a need for a scheme with more than one secret.)
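The "prepended and transformed twice ... hashed twice" description matches the standard HMAC construction, which can be written out in a few lines of Python. The function name manual_hmac is hypothetical; the sketch follows the generic HMAC definition and makes no claim about how ddr_hash implements it internally.

```python
import hashlib
import hmac


def manual_hmac(key, data, alg="sha256"):
    """Compute HMAC(key, data) step by step, following the generic definition."""
    bs = hashlib.new(alg).block_size
    if len(key) > bs:
        key = hashlib.new(alg, key).digest()   # long keys are hashed first
    key = key.ljust(bs, b"\0")                 # then padded to one block
    ipad = bytes(b ^ 0x36 for b in key)        # first transformation of the key
    opad = bytes(b ^ 0x5C for b in key)        # second transformation of the key
    inner = hashlib.new(alg, ipad + data).digest()      # first hash
    return hashlib.new(alg, opad + inner).hexdigest()   # second hash


# The standard library computes the same value directly:
reference = hmac.new(b"secret", b"payload", hashlib.sha256).hexdigest()
```

The outer hash over the inner digest is what blocks the length extension attacks that affect a plain prepended secret.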
When HMAC is enabled with one of the following parameters, both the plain hash and the HMAC are computed by ddr_hash. Both are output to the console/log, but the HMAC is used instead of the hash value to be written to a CHECKSUMS file or to an extended attribute or checked against (see below).
hmacpwd=STRING sets the shared secret (password) for computing the HMAC. Passing the secret on the command line has the disadvantage that the shell may mistreat some bytes as special characters and that the command line may be visible to all logged in users on the system.
hmacpwdfd=INT sets a file descriptor from which the secret (password) for HMAC computation will be read. Specifying 0 means standard input, in which case ddr_hash even prints a prompt for you ... Other numbers may be useful if dd_rescue is called from another program that opens a pipe to pass the secret.
hmacpwdnm=INNAME sets a file from which the shared secret (password) is read. Note that all bytes (up to 2048 of them) are read and used, including trailing white space, 0-bytes or newlines.
Please note that the ddr_hash plugin at this point does NOT take a lot of care to prevent the password/pre/appended secret from remaining in memory or leaking into a swap/page file. (This will be improved once I look into encryption plugins.)
ddr_hash accepts the parameter output, which will cause ddr_hash to output the cryptographic hash to stdout in the same format that md5sum/sha256sum/... use. You can also specify outfd=INT to have the plugin write the hash to a different file descriptor specified by the integer number INT. Note that ddr_hash always processes data in binary mode and correctly indicates this with a star (*) in the output generated with output/outfd=.
The checksum can also be written to a file by giving the outnm=OUTNAME parameter. Then a file with OUTNAME will be created and a md5sum/sha256sum/... compatible line will be printed to the file. If the file exists and contains an entry for the file, it will be updated. If the file exists and does not contain an entry for the file, one will be appended. If OUTNAME is omitted, the file name CHECKSUMS.alg (or HMACS.alg if HMAC is enabled) will be used (alg is replaced by the chosen algorithm). If the checksum can't be written, a warning will be printed and the exit code of dd_rescue will become non-zero.
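The md5sum-style line with the binary mode star and the update-or-append rule for a checksums file can be sketched in Python as follows. The function names checksum_line and update_checksums are hypothetical, and the sketch works on a list of lines rather than a real file; it is meant to illustrate the described behavior, not to reproduce ddr_hash's file handling.

```python
import hashlib


def checksum_line(data, name, alg="sha256"):
    """Format a checksum entry: hex digest, space, star (binary mode), name."""
    return f"{hashlib.new(alg, data).hexdigest()} *{name}"


def update_checksums(lines, name, digest):
    """Replace the entry for `name` in `lines`, or append one if it is missing."""
    entry = f"{digest} *{name}"
    out, found = [], False
    for line in lines:
        parts = line.split(None, 1)            # digest, then the rest
        fname = parts[1] if len(parts) == 2 else ""
        if fname.startswith("*"):
            fname = fname[1:]                  # drop the binary mode marker
        if len(parts) == 2 and fname == name:
            out.append(entry)                  # existing entry: update in place
            found = True
        else:
            out.append(line)
    if not found:
        out.append(entry)                      # no entry yet: append
    return out
```

Entries for other files are passed through unchanged, so one CHECKSUMS file can serve a whole directory.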
The checksum can be validated using checknm=CHKNAME. The file will be read and ddr_hash will look for an md5sum/sha256sum/... compatible line with a matching file name to take the checksum from and compare it to the one computed. If CHKNAME is omitted, the same default as described above (in outnm=...) will be used. You can also read the checksum from stdin if you prefer by specifying the check option.
Note that in any case, the check is only performed after the copy operation is completed -- a faulty checksum will thus NOT result in the copy not taking place. However, the exit code of dd_rescue will indicate the error. (If you want to avoid copying data with a broken checksum into the final target, use a temporary target that you delete upon error and only move to the final location if dd_rescue's exit value is 0; you can of course also copy to /dev/null for testing beforehand, but it might be too costly reading the input file twice.)
You can store the cryptographic hash into the files by using the set_xattr option. The hash will be stored into the extended attribute user.checksum.ALG by default (user.hmac.ALG if HMAC is enabled), but you can override the name of the attribute by specifying set_xattr=XATTR.NAME instead. If the xattr can't be written, an error will be reported, unless you also specify the fallb[ack][=CHKNAME] option. In that case, ddr_hash tries to write the checksum to the CHKNAME checksums file. (For the default for CHKNAME, see outnm= option above.)
chk_xattr will validate that the computed hash matches the one read from the extended attribute. The same default attribute name applies and you can likewise override it with chk_xattr=XATTR.NAME. A missing attribute is considered an error (although the same fallback is tried if you specify the fallback option). A broken checksum is of course considered an error as well, but just like with checknm=CHKNAME won't prevent the copy. See the discussion there.
Note that for output, outfd=, outnm= and set_xattr, ddr_hash will use the output file name to attach the checksum to (be it by setting xattr or the file name used in the checksum file), unless a plugin in the chain after ddr_hash indicates that it changes the data. In that case, it will warn and associate the checksum with the input file name, unless there's another plugin before ddr_hash in the chain which indicates data transformation as well. In that case, there is no file that the checksum could be associated with and ddr_hash will report an error.
Likewise for checknm=, check and chk_xattr, ddr_hash will use the input file name to get the checksum (be it by reading the xattr or by looking for the input file name in a checksums file) unless there's a plugin in the chain before ddr_hash that indicates that it changes the data. The output file name will then be used, unless there's another plugin after ddr_hash indicating data change as well, in which case there's no file we could get the checksum for and thus an error is reported.
If your system supports extended attributes, those have the advantage of traveling with the files; thus a rename or copy (with dd_rescue -p) will maintain the checksum. Checksum files on the other hand can be handled everywhere (including the transfer via ftp or http) and can be cryptographically signed with PGP/GnuPG.
Please note that the md5 algorithm is NOT recommended any more for good protection against malicious attempts to hide data modification; it's not considered strong enough any more to prevent hash collisions. sha1 is better, but the recommendation is to use the SHA-2 family of hashes. On 32bit machines, I'd recommend sha256, while on 64bit machines, sha512 is faster and thus the best choice.
ddr_hash also supports using the HMAC code and hashes for deriving keys from passwords using the PKCS5 PBKDF2 (password-based key derivation function) that allows you to improve the protection from mediocre passwords by using a salt and a relatively expensive key stretching operation. This is only meant for testing and may be removed in the future. It's thus not documented in this man page. See the built-in help function for a brief summary on the usage.
lzo
The lzo plugin allows compressing and decompressing data using liblzo2. lzo is an algorithm that is faster than most other algorithms but does not compress as well. See the ddr_lzo(1) man page for more details.
crypt
The crypt plugin allows encrypting and decrypting data on the fly. It currently supports a variety of AES ciphers. See the ddr_crypt(1) man page for more details.
EXIT STATUS
On successful completion, dd_rescue returns an exit code of 0. Any other exit code indicates that the program has aborted because of an error condition or that copying of the data has not been entirely successful.
EXAMPLES
- dd_rescue -k -P -p -t infile outfile
- copies infile to outfile and does truncate the output file on opening (so deleting any previous data in it), copies mode, times, ownership at the end, uses fallocate to reserve the space for the output file and uses efficient in kernel splice copy method.
- dd_rescue -A -d -D -b 512 /dev/sda /dev/sda
- reads the contents of every sector of disk sda and writes it back to the same location. Typical hard disks reallocate flaky and faulty sectors on writes, so this operation may result in the complete disk being usable again when there were errors before. Unreadable blocks however will contain zeroes after this.
- dd_rescue -2 /dev/urandom -M outfile
- overwrites the file outfile twice; once with good pseudo random numbers and then with zeroes.
- dd_rescue -t -a image1.raw image2.raw
- copies a file system image and looks for empty blocks to create a sparse output file to save disk space. (If the source file system has been used a bit, on that file system creating a large file with zeroes and removing it again prior to this operation will result in more sectors with zeroes. dd_rescue -u /dev/zero DUMMY will achieve this ...)
- dd_rescue -ATL hash=md5:output,lzo=compress:bench,MD5:output in out.lzo
- copies the file in to out.lzo using lzo (lzo1x_1) compression and calculating an md5 hash (checksum) on both files. The md5 hashes for both are also written to stdout in the md5sum output format. Note that the compress parameter to lzo is not strictly required here; the plugin could have deduced it from the file names. This example shows that you can specify multiple plugins with multiple parameters; the plugins form a filter chain. You can specify the same plugin multiple times.
- dd_rescue -L hash=sha512:set_xattr:fallb,null=change infile /dev/null
- reads the file infile and computes its sha512 hash. It stores it in the input file's user.checksum.sha512 attribute (and falls back to writing it to CHECKSUMS.sha512 if xattrs can't be written). Note the use of the null plugin with faking data change with the change parameter; this causes the hash plugin to write to the input file which it would not normally have done. Of course this will fail if you don't have the appropriate privileges to write xattrs to infile nor to write the checksum to CHECKSUMS.sha512.
See also README.dd_rescue and ddr_lzo(1) to learn about the possibilities.
TESTING
Untested code is buggy, almost always. I happen to have a damaged hard disk that I use for testing dd_rescue from time to time. But to allow for automated testing of error recovery, it's better to have predictable failures for the program to deal with. So there is a fault injection framework.
Specifying -F 5w/1,17r/3,42r/-1,80-84r/0 on the command line has the following effect: writing the 5th block (counted in hardbs sized blocks) will fail once (from which dd_rescue should recover, as it tries a second time for failed writes), block no 17 will fail to be read 3 times, block no 42 will read fine once but then fail afterwards, and blocks 80 through 83 are completely unreadable (will fail infinite times). Note that the range excludes the last block (80-84 means 4 blocks starting @ 80).
Block offsets are always counted in absolute positions, so starting in the middle of a file with -s or reverse copying won't affect the absolute position that is hit with the fault injection. (This has changed since 1.98.)
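The structure of the -F argument can be made explicit with a small Python parser. The grammar used here (block or block range, r or w, a slash, a signed count) is inferred from the single example above and is an assumption; the names parse_faults and FAULT_RE are hypothetical and this is not the parser dd_rescue uses.

```python
import re

# start[-end], operation (r or w), "/", count (may be negative)
FAULT_RE = re.compile(r"^(\d+)(?:-(\d+))?([rw])/(-?\d+)$")


def parse_faults(spec):
    """Parse a fault spec such as '5w/1,80-84r/0' into a list of dicts."""
    faults = []
    for item in spec.split(","):
        m = FAULT_RE.match(item)
        if not m:
            raise ValueError(f"bad fault spec: {item!r}")
        start = int(m.group(1))
        # The end of a range is exclusive: 80-84 covers blocks 80..83.
        end = int(m.group(2)) if m.group(2) else start + 1
        faults.append({
            "blocks": range(start, end),
            "op": m.group(3),              # 'r' = read fault, 'w' = write fault
            "count": int(m.group(4)),      # per the example: n > 0 fails n times,
                                           # 0 always fails, -n succeeds n times first
        })
    return faults
```

Applied to the example string, the last entry yields the block list 80, 81, 82, 83 with a count of 0.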
BUGS/LIMITATIONS
The source code does use the 64bit functions provided by glibc for file positioning. However, your kernel might not support it, so you might be unable to copy partitions larger than 2GB into a file.
This program has been written using Linux and only tested on a couple of Linux systems. People have reported to have successfully used it on other Un*xish systems (such as xBSD or M*cOS), but these systems get little regular test coverage; so please be advised to test properly (possibly using the make check test suite included with the source distribution) before relying on dd_rescue on non Linux based systems.
Currently, the escape sequence for moving the cursor up is hard coded in the sources. It's fine for most terminal emulations (including vt100 and linux), but it should use the terminal description database instead.
Since dd_rescue-1.10, non-seekable input or output files are supported, but there are of course limits to recovering from errors in such cases.
dd_rescue does not automate the recovery of faulty files or partitions by automatically keeping a list of copied sectors and approaching bad spots from both sides. There is a helper script dd_rhelp from LAB Valentin that does this. Integration of such a mode into dd_rescue itself is non-trivial and due to the complexity of the source code might not happen.
There also is a tool, GNU ddrescue, that is a reimplementation of this tool and which contains the capabilities to automate recovery of bad files in the way dd_rhelp does. It does not have the feature richness of dd_rescue, but is reported to be easier to operate for error recovery than dd_rescue with dd_rhelp.
If your data is very valuable and you are considering sending your disk to a data recovery company, you might be better off NOT trying to use imaging tools like dd_rescue, dd_rhelp or GNU ddrescue. If you're unlucky, the disk has suffered some mechanical damage (e.g. by having been dropped), and continuing to use it may make the head damage the surface further. You may be able to detect this condition by rapidly rising error counts in the SMART attributes or by a clicking noise.
Please report bugs to me via email.
Data destruction considerations
The modes for overwriting data with pseudo random numbers, meant for securely deleting sensitive data, deliberately implement only a limited number of overwrites. While Peter Gutmann's classic analysis concludes that the then-current hard disk technology requires more overwrites to be really secure, the author believes that modern hard disk technology does not allow data restoration of sectors that have been overwritten with the --shred4 mode. This is in compliance with the recommendations from BSI GSDS M7.15.
Overwriting whole partitions or disks with random numbers is a fairly safe way to destroy data, unless the underlying storage device does too much magic. SSDs are doing fancy stuff in their Flash Translation Layer (FTL), so this tool might be insufficient to get rid of data. Use SECURITY_ERASE (via hdparm) there or, if available, encrypt data with AES256 and safely destroy the key. Normal hard disks have a small risk of leaking a few sectors due to reallocation of flaky sectors.
For securely destroying single files, your mileage may vary. The more advanced your file system, the less likely dd_rescue's destruction will be effective. In particular, journaling file systems may carry old data in the journal. File systems that do copy-on-write (COW), such as btrfs, are very likely to have old copies of your supposedly erased file. After overwriting critical files with random numbers, it might help somewhat to fill the file system with zeros (dd_rescue -u /dev/zero /path/to/fs/DUMMYNAME) to force the file system to release and overwrite non-current data. If you can, better destroy a whole partition or disk.
AUTHOR
Kurt Garloff <kurt [at] garloff.de>
CREDITS
Many little issues were reported by Valentin LAB, the author of dd_rhelp.
The RC4 PRNG (frandom) is a port from Eli Billauer's kernel mode PRNG.
A number of recent ideas and suggestions came from Thomas.
COPYRIGHT
This program is protected by the GNU General Public License (GPL) v2 or v3 - at your option.
HISTORY
Since version 1.10, non seekable input and output files are supported.
Splice copy -k is supported since 1.15.
A progress bar exists since 1.17.
Support for preallocation (fallocate) -P exists since 1.19.
Since 1.23, we default to -y0, enhancing performance.
The Pseudo Random Number modes were introduced with 1.29.
Write avoidance -W was implemented in 1.30.
Multiple output files -Y were added in 1.32.
Long options and man page came with 1.33.
Optimized sparse detection (SSE2, armv6, armv8 asm, AVX2) has been present since 1.35 and been enhanced until 1.43.
We support copying extended attributes since 1.40 using libxattr.
Removing and (fs)trimming the output file's file system exists since 1.41. Support for compilation with bionic (Android's C library) with most features enabled also came with 1.41.
Plugins exist since 1.42, the MD5 plugin came with 1.42, the lzo plugin with 1.43. 1.44 renamed the MD5 plugin to hash and added support for the SHA-2 family of hashes. 1.45 added SHA-1 and the ability to store and validate checksums.
1.98 brought encryption and the fault injection framework, 1.99 support for ARMv8 crypto acceleration.
Some additional information can be found on http://garloff.de/kurt/linux/ddrescue/
LAB Valentin's dd_rhelp can be found on http://www.kalysto.org/utilities/dd_rhelp/index.en.html