A fast duplicate file finder

Overview

The Directory Differential hTool

DDH traverses input directories and their subdirectories. It also hashes files as needed and reports findings.

The H in "hTool" is silent. The H in its abbreviation, "DDH," is not.

This tool is called DDH for two very good reasons.

  • DDT is a dangerous pesticide
  • I mistyped when I created the project

Usage

DDH is usable both as a library and as a standalone CLI tool, and aims to be simple to use in both cases.

Library example

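use rayon::prelude::*; // brings par_iter() into scope

// Scan and hash everything under `dirs`; `errors` holds the failures reported alongside the results.
let (files, errors): (Vec<Fileinfo>, Vec<(_, _)>) = ddh::deduplicate_dirs(dirs);
// Split the results: entries with more than one path are shared, the rest are unique.
let (shared, unique): (Vec<&Fileinfo>, Vec<&Fileinfo>) = files
                    .par_iter()
                    .partition(|&x| x.get_paths().len()>1);
// process_full_output and arguments are supplied by the caller (not shown here).
process_full_output(&shared, &unique, &files, &errors, &arguments);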

CLI Install

  • Install Rust
  • cargo install --git https://github.com/darakian/ddh ddh
  • The DDH binary will be installed into $CARGO_HOME/bin/ddh, which is usually $HOME/.cargo/bin/ddh. This should already be in your PATH if you're using rustup.

CLI Features

DDH supports both a standard output format for human reading and a parsable JSON output for custom tools such as ddh-move.

CLI Example

Directory Difference hTool
Jon Moroney [email protected]
Compare and contrast directories.
Example invocation: ddh /home/jon/downloads /home/jon/documents -f duplicates
Example pipe: ddh ~/Downloads/ -o no -v all -f json | someJsonParser.bin

USAGE:
    ddh [OPTIONS] <Directories>...

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -b, --blocksize <Blocksize>    Sets the display blocksize to Bytes, Kilobytes, Megabytes or Gigabytes. Default is
                                   Kilobytes. [possible values: B, K, M, G]
    -f, --format <Format>          Sets output format. [possible values: standard, json, off]
    -o, --output <Output>          Sets file to save all output. Use 'no' for no file output.
    -v, --verbosity <Verbosity>    Sets verbosity for printed output. [possible values: quiet, duplicates, all]

ARGS:
    <Directories>...    Directories to parse

How Does DDH Work?

DDH works by hashing files to determine their uniqueness and, as such, depends heavily on disk speeds for performance. The algorithmic choices in use are discussed here.
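As a rough illustration of hash-based duplicate detection, here is a minimal sketch using only the Rust standard library. It is not DDH's actual implementation: the function names (collect_files, hash_file, find_duplicates), the size-then-hash strategy, and the use of DefaultHasher are all assumptions made for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::fs;
use std::hash::Hasher;
use std::io::{self, Read};
use std::path::{Path, PathBuf};

/// Recursively collect regular files under `dir`, skipping symlinks.
/// (Hypothetical helper for this sketch.)
fn collect_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // symlink_metadata does not follow links, so symlinks are neither dir nor file here.
        let meta = fs::symlink_metadata(&path)?;
        if meta.is_dir() {
            collect_files(&path, out)?;
        } else if meta.is_file() {
            out.push(path);
        }
    }
    Ok(())
}

/// Hash a file's full contents, reading it in fixed-size chunks.
fn hash_file(path: &Path) -> io::Result<u64> {
    let mut file = fs::File::open(path)?;
    let mut hasher = DefaultHasher::new();
    let mut buf = [0u8; 8192];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break;
        }
        hasher.write(&buf[..n]);
    }
    Ok(hasher.finish())
}

/// Return groups of paths whose files share both length and content hash.
fn find_duplicates(root: &Path) -> io::Result<Vec<Vec<PathBuf>>> {
    let mut files = Vec::new();
    collect_files(root, &mut files)?;

    // First pass: bucket by size, since files of different lengths cannot be identical.
    let mut by_size: HashMap<u64, Vec<PathBuf>> = HashMap::new();
    for path in files {
        let len = fs::metadata(&path)?.len();
        by_size.entry(len).or_default().push(path);
    }

    // Second pass: hash only the files that share a size with at least one other file.
    let mut by_hash: HashMap<(u64, u64), Vec<PathBuf>> = HashMap::new();
    for (len, paths) in by_size {
        if paths.len() < 2 {
            continue;
        }
        for path in paths {
            let hash = hash_file(&path)?;
            by_hash.entry((len, hash)).or_default().push(path);
        }
    }

    // Keep only the groups that contain more than one path.
    Ok(by_hash.into_values().filter(|group| group.len() > 1).collect())
}
```

Calling find_duplicates on a directory returns one Vec per set of matching files. Bucketing by size first avoids reading files that cannot have a twin, which matters because hashing is bound by disk speed. A 64-bit hash can collide, so a real tool would likely use a stronger hash or confirm matches byte by byte.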

Comments
  • ddh not working with directories that have Arabic characters

    Hello

    First thank you for the idea and this implementation.

    I tested this in a folder whose name contains an Arabic word, but it had no effect. I am learning Rust and would like to help fix issues like this, but I may need some hints if possible.

    bug Windows 
    opened by moaz-mokhtar 32
  • UB from a crater run

    I got a panic from a crater run which included this crate (in case you are not aware, crater is a tool that runs rustc against a set number of locked crates to check for regressions). It was with version 0.10.9, so I'm not sure whether it still holds true in 0.11.0. Also, it could be caused by one of the dependencies. I tried tracking it down but I couldn't find anything.

    The main point of the crater run was to find UB according to this advisory: https://github.com/RustSec/advisory-db/issues/298.

    Edit: there is a small possibility that this could be due to the file permissions being denied in crater runs

    The crater output is:

    [INFO] [stdout] failures:
    [INFO] [stdout] 
    [INFO] [stdout] ---- src/lib.rs - deduplicate_dirs (line 216) stdout ----
    [INFO] [stdout] Test executable failed (terminated by signal).
    [INFO] [stdout] 
    [INFO] [stdout] stderr:
    [INFO] [stdout] thread '<unnamed>' panicked at 'attempted to leave type `internal::Local` uninitialized, which is invalid', /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/mem/mod.rs:661:5
    [INFO] [stdout] stack backtrace:
    [INFO] [stdout] thread '<unnamed>' panicked at 'attempted to leave type `internal::Local` uninitialized, which is invalid', /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/mem/mod.rs:661:5
    [INFO] [stdout] thread '<unnamed>' panicked at 'attempted to leave type `internal::Local` uninitialized, which is invalid', /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/mem/mod.rs:661:5
    [INFO] [stdout] thread '<unnamed>' panicked at 'attempted to leave type `internal::Local` uninitialized, which is invalid', /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/mem/mod.rs:661:5
    [INFO] [stdout] thread '<unnamed>' panicked at 'attempted to leave type `internal::Local` uninitialized, which is invalid', /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/mem/mod.rs:661:5
    [INFO] [stdout] thread '<unnamed>' panicked at 'attempted to leave type `internal::Local` uninitialized, which is invalid', /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/mem/mod.rs:661:5
    [INFO] [stdout] thread '<unnamed>' panicked at 'attempted to leave type `internal::Local` uninitialized, which is invalid', /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/mem/mod.rs:661:5
    [INFO] [stdout] thread '<unnamed>' panicked at 'attempted to leave type `internal::Local` uninitialized, which is invalid', /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/mem/mod.rs:661:5
    [INFO] [stdout]    0:     0x55caa32b01c4 - backtrace::backtrace::libunwind::trace::h61666fa4b9834a45
    [INFO] [stdout]                                at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
    [INFO] [stdout]    1:     0x55caa32b01c4 - backtrace::backtrace::trace_unsynchronized::h01d3725fb5da8653
    [INFO] [stdout]                                at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
    [INFO] [stdout]    2:     0x55caa32b01c4 - std::sys_common::backtrace::_print_fmt::h5c7c4751bc296f90
    [INFO] [stdout]                                at src/libstd/sys_common/backtrace.rs:78
    [INFO] [stdout]    3:     0x55caa32b01c4 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h7729313ad2d729c7
    [INFO] [stdout]                                at src/libstd/sys_common/backtrace.rs:59
    [INFO] [stdout]    4:     0x55caa32cf9bc - core::fmt::write::hb0049f6c12bdaffc
    [INFO] [stdout]                                at src/libcore/fmt/mod.rs:1069
    [INFO] [stdout]    5:     0x55caa32ae113 - std::io::Write::write_fmt::hfb486fa6ccb90f14
    [INFO] [stdout]                                at src/libstd/io/mod.rs:1504
    [INFO] [stdout]    6:     0x55caa32b29a5 - std::sys_common::backtrace::_print::ha087cac92e84a414
    [INFO] [stdout]                                at src/libstd/sys_common/backtrace.rs:62
    [INFO] [stdout]    7:     0x55caa32b29a5 - std::sys_common::backtrace::print::h6b013d3bfe6cbcb8
    [INFO] [stdout]                                at src/libstd/sys_common/backtrace.rs:49
    [INFO] [stdout]    8:     0x55caa32b29a5 - std::panicking::default_hook::{{closure}}::h016d7de02def2736
    [INFO] [stdout]                                at src/libstd/panicking.rs:198
    [INFO] [stdout]    9:     0x55caa32b26e2 - std::panicking::default_hook::h9386f7e484840463
    [INFO] [stdout]                                at src/libstd/panicking.rs:218
    [INFO] [stdout]   10:     0x55caa32b3002 - std::panicking::rust_panic_with_hook::h3846eb1b2e5a9312
    [INFO] [stdout]                                at src/libstd/panicking.rs:511
    [INFO] [stdout]   11:     0x55caa32b2beb - rust_begin_unwind
    [INFO] [stdout]                                at src/libstd/panicking.rs:419
    [INFO] [stdout]   12:     0x55caa32ceda1 - core::panicking::panic_fmt::hc5953a7331f9805c
    [INFO] [stdout]                                at src/libcore/panicking.rs:111
    [INFO] [stdout]   13:     0x55caa32ceced - core::panicking::panic::hbfb03ddd2cbb509e
    [INFO] [stdout]                                at src/libcore/panicking.rs:54
    [INFO] [stdout]   14:     0x55caa3297489 - core::mem::uninitialized::h869ad29ea76dd030
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/mem/mod.rs:661
    [INFO] [stdout]   15:     0x55caa3297489 - <crossbeam_epoch::internal::Local as crossbeam_epoch::sync::list::IsElement<crossbeam_epoch::internal::Local>>::entry_of::hdb4452d7a6e0a610
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/crossbeam-epoch-0.7.1/src/internal.rs:493
    [INFO] [stdout]   16:     0x55caa329e823 - crossbeam_epoch::sync::list::List<T,C>::insert::h3ac4d85186c907f0
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/crossbeam-epoch-0.7.1/src/sync/list.rs:176
    [INFO] [stdout]   17:     0x55caa3296454 - crossbeam_epoch::internal::Local::register::h91b5c90d84206ca9
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/crossbeam-epoch-0.7.1/src/internal.rs:291
    [INFO] [stdout]   18:     0x55caa32a0a5e - crossbeam_epoch::collector::Collector::register::h5691db836e05a7d5
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/crossbeam-epoch-0.7.1/src/collector.rs:39
    [INFO] [stdout]   19:     0x55caa329a7b4 - crossbeam_epoch::default::HANDLE::__init::h05b698916945b7d4
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/crossbeam-epoch-0.7.1/src/default.rs:17
    [INFO] [stdout]   20:     0x55caa3299ec9 - core::ops::function::FnOnce::call_once::hf43b3b136e2a1b46
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/ops/function.rs:232
    [INFO] [stdout]   21:     0x55caa32a053c - std::thread::local::lazy::LazyKeyInner<T>::initialize::h72653fcc8deb826a
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/thread/local.rs:289
    [INFO] [stdout]   22:     0x55caa329ced2 - std::thread::local::fast::Key<T>::try_initialize::h30683c7ab15cafc2
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/thread/local.rs:425
    [INFO] [stdout]   23:     0x55caa329d027 - std::thread::local::fast::Key<T>::get::hd10d89894258aff9
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/thread/local.rs:410
    [INFO] [stdout]   24:     0x55caa329a7ed - crossbeam_epoch::default::HANDLE::__getit::h962c871568622af7
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/thread/local.rs:175
    [INFO] [stdout]   25:     0x55caa32749d5 - std::thread::local::LocalKey<T>::try_with::h3f261a7c4f47ce5a
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/thread/local.rs:262
    [INFO] [stdout]   26:     0x55caa3283a53 - crossbeam_epoch::default::with_handle::h269db34920d39f6c
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/crossbeam-epoch-0.7.1/src/default.rs:42
    [INFO] [stdout]   27:     0x55caa3283d37 - crossbeam_epoch::default::is_pinned::hb6e44e79801d0481
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/crossbeam-epoch-0.7.1/src/default.rs:29
    [INFO] [stdout]   28:     0x55caa328b6d2 - crossbeam_deque::Stealer<T>::steal::h9ae6be401072c7cd
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/crossbeam-deque-0.6.3/src/lib.rs:660
    [INFO] [stdout]   29:     0x55caa326c2fa - rayon_core::registry::WorkerThread::steal::{{closure}}::hca21cf343bc94b68
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/rayon-core-1.5.0/src/registry.rs:764
    [INFO] [stdout]   30:     0x55caa32694cb - core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut::h59bd862209ac9487
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/ops/function.rs:274
    [INFO] [stdout]   31:     0x55caa3287921 - core::iter::traits::iterator::Iterator::find_map::check::{{closure}}::h6647419eb8e863d9
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/iter/traits/iterator.rs:2239
    [INFO] [stdout]   32:     0x55caa327fd65 - core::iter::adapters::filter_try_fold::{{closure}}::h0e27e4a4e21c5718
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/iter/adapters/mod.rs:935
    [INFO] [stdout]   33:     0x55caa327effc - core::ops::function::impls::<impl core::ops::function::FnMut<A> for &mut F>::call_mut::h5b9d2db37f7cb14f
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/ops/function.rs:274
    [INFO] [stdout]   34:     0x55caa3287ed4 - core::iter::traits::iterator::Iterator::try_fold::h9c4820e0b9b6a2ec
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/iter/traits/iterator.rs:1878
    [INFO] [stdout]   35:     0x55caa327e3c3 - <core::iter::adapters::chain::Chain<A,B> as core::iter::traits::iterator::Iterator>::try_fold::h80243e00b6de1e21
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/iter/adapters/chain.rs:83
    [INFO] [stdout]   36:     0x55caa327e2a6 - <core::iter::adapters::Filter<I,P> as core::iter::traits::iterator::Iterator>::try_fold::ha3e1ae2a61dda215
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/iter/adapters/mod.rs:984
    [INFO] [stdout]   37:     0x55caa327fb01 - core::iter::traits::iterator::Iterator::find_map::he70084dfab691d9a
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/iter/traits/iterator.rs:2245
    [INFO] [stdout]   38:     0x55caa327e316 - <core::iter::adapters::FilterMap<I,F> as core::iter::traits::iterator::Iterator>::next::hdc570465366b354c
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/iter/adapters/mod.rs:1084
    [INFO] [stdout]   39:     0x55caa326c228 - rayon_core::registry::WorkerThread::steal::h9bd36c493aa21796
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/rayon-core-1.5.0/src/registry.rs:758
    [INFO] [stdout]   40:     0x55caa326bf52 - rayon_core::registry::WorkerThread::wait_until_cold::{{closure}}::h590a1fcd9de02876
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/rayon-core-1.5.0/src/registry.rs:714
    [INFO] [stdout]   41:     0x55caa327916f - core::option::Option<T>::or_else::hf431f9e43aa9978c
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/option.rs:766
    [INFO] [stdout]   42:     0x55caa326bc00 - rayon_core::registry::WorkerThread::wait_until_cold::hd72b2229314009a4
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/rayon-core-1.5.0/src/registry.rs:712
    [INFO] [stdout]   43:     0x55caa326b5f6 - rayon_core::registry::WorkerThread::wait_until::h608ca86b940e41fd
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/rayon-core-1.5.0/src/registry.rs:692
    [INFO] [stdout]   44:     0x55caa326c87d - rayon_core::registry::main_loop::hf39058cfb43bdbf9
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/rayon-core-1.5.0/src/registry.rs:812
    [INFO] [stdout]   45:     0x55caa3269720 - rayon_core::registry::ThreadBuilder::run::ha145b7a27b5851e2
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/rayon-core-1.5.0/src/registry.rs:58
    [INFO] [stdout]   46:     0x55caa3269c21 - <rayon_core::registry::DefaultSpawn as rayon_core::registry::ThreadSpawn>::spawn::{{closure}}::ha5a4cee564920880
    [INFO] [stdout]                                at /opt/rustwide/cargo-home/registry/src/github.com-1ecc6299db9ec823/rayon-core-1.5.0/src/registry.rs:103
    [INFO] [stdout]   47:     0x55caa3283e90 - std::sys_common::backtrace::__rust_begin_short_backtrace::hffc2c09a197264dd
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/sys_common/backtrace.rs:130
    [INFO] [stdout]   48:     0x55caa3285171 - std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}}::hfdfb97e3734c02e3
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/thread/mod.rs:475
    [INFO] [stdout]   49:     0x55caa32838b3 - <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::h27a0583c4b616fd0
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/panic.rs:318
    [INFO] [stdout]   50:     0x55caa326e384 - std::panicking::try::do_call::h8929afe6053c86ae
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/panicking.rs:331
    [INFO] [stdout]   51:     0x55caa326f6dd - __rust_try
    [INFO] [stdout]   52:     0x55caa326e174 - std::panicking::try::h3fadfd4c27ff95a2
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/panicking.rs:274
    [INFO] [stdout]   53:     0x55caa3284323 - std::panic::catch_unwind::h422593581e096c2f
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/panic.rs:394
    [INFO] [stdout]   54:     0x55caa3284f61 - std::thread::Builder::spawn_unchecked::{{closure}}::hf49384bceeda1e85
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libstd/thread/mod.rs:474
    [INFO] [stdout]   55:     0x55caa3274fbf - core::ops::function::FnOnce::call_once{{vtable.shim}}::h6ecc5d6bbd1cf8dc
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/libcore/ops/function.rs:232
    [INFO] [stdout]   56:     0x55caa32a9f9f - <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h7deaeeba480c6ce2
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/liballoc/boxed.rs:1008
    [INFO] [stdout]   57:     0x55caa32b53f3 - <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h968b1dfb87756523
    [INFO] [stdout]                                at /rustc/41997647ba6a77908f6ef64401414feb00bccf16/src/liballoc/boxed.rs:1008
    [INFO] [stdout]   58:     0x55caa32b53f3 - std::sys::unix::thread::Thread::new::thread_start::h2399addeb6370595
    [INFO] [stdout]                                at src/libstd/sys/unix/thread.rs:87
    [INFO] [stdout]   59:     0x7f326ac7a6db - start_thread
    [INFO] [stdout]   60:     0x7f326a78b88f - __clone
    [INFO] [stdout]   61:                0x0 - <unknown>
    [INFO] [stdout] Rayon: detected unexpected panic; aborting
    opened by Dylan-DPC-zz 11
  • Excessive CPU and memory use + thread panics

    Tried this on my home directory with ~1.5 million files.

    /usr/bin/time -v target/release/ddh ~
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    thread '<unnamed>' panicked at 'Error reading Symlink Metadata: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/lib.rs:285:25
    ...
    

    There seems to be no way to turn off symlink following, so I guess it might have escaped from my home dir and tried to scan the whole filesystem. Either way, the CPU and memory usage was extremely high. The system time is quite unusual, assuming this was an I/O-bound workload:

    ^CCommand terminated by signal 2
    	Command being timed: "target/release/ddh /home/pkolaczk"
    	User time (seconds): 50.08
    	System time (seconds): 983.47
    	Percent of CPU this job got: 776%
    	Elapsed (wall clock) time (h:mm:ss or m:ss): 2:13.10
    	Average shared text size (kbytes): 0
    	Average unshared data size (kbytes): 0
    	Average stack size (kbytes): 0
    	Average total size (kbytes): 0
    	Maximum resident set size (kbytes): 18724728
    	Average resident set size (kbytes): 0
    	Major (requiring I/O) page faults: 0
    	Minor (reclaiming a frame) page faults: 4688422
    	Voluntary context switches: 27301
    	Involuntary context switches: 150856
    	Swaps: 0
    	File system inputs: 103592
    	File system outputs: 0
    	Socket messages sent: 0
    	Socket messages received: 0
    	Signals delivered: 0
    	Page size (bytes): 4096
    

    ddh consumed 18 GB, while other programs like fdupes, jdupes, rmfind, fclones need about 150-500 MB.

    opened by pkolaczk 11
  • -o Option has no effect

    I built ddh from master branch on Ubuntu 16.04 with rustc 1.33.0-nightly (ec194646f 2019-01-02). Used 'cargo build --release' to compile.

    If I run ddh from the target/release directory with the following command:

    ./ddh ~/Downloads/ -f json -o ~/Downloads/dupes.json
    

    The dupes.json file is created in the current working directory (target/release) but not at the location specified (~/Downloads/dupes.json).

    Despite this minor issue, I really appreciate this tool! Thank you for your hard work :+1:

    bug 
    opened by JayWalker512 8
  • Ignore well-known directories

    Can ddh ignore well-known directories such as node_modules, .mypy_cache, CMakefile Release/Debug directories, etc.? Or could it just respect .gitignore, either by default or as an option?

    opened by laoshaw 6
  • Incorrect help output

    ddh --help reports:

    USAGE:
        ddh [OPTIONS] <Directories>...
    

    but it should be ddh <Directories>... [OPTIONS].

    Sadly I don't know clap and how to fix it.

    opened by Tobi823 6
  • Did a bit of benchmarking, 2021-12-04

    Local directory with 930 JPEG files, including 2 duplicate JPEG files.

    Ubuntu 20.04, no GUI:

    rdfind   real: 0m00.657s   user: 0m00.007s  sys: 0m00.114s
    ddh      real: 0m00.007s   user: 0m00.002s  sys: 0m00.004s
    fdupes   real: 0m32.932s   user: 0m00.074s  sys: 0m05.685s
    

    Windows 11 pro:

    ddh        total: 0m00.011s (started from a .cmd file) 
    powershell total: 0m00.012s (incl. grouping & sorting)
    DupeGuru couldn't find the duplicate files
    

    Overall I would say nice job 👍👏.

    PS: Some of the rdfind options would be nice (shameless, I know 😇)

    opened by neunmalelf 5
  • Add Dockerfile

    This Dockerfile uses a multi-stage Docker build to produce a relatively minimal image for the ddh tool.

    Once built, it can be used like so:

    docker run --rm \
        -v "$(pwd):/target/:ro" \
        -v "/my-host-results-dir/:/results/" \
        <ddh image> \
        ddh . --output /results/results_1.txt
    

    This way, you can safely check for duplicates in your current working directory (marked as read-only), and write results to a separate directory (e.g. /my-host-results-dir/ in the example above).

    For users without Rust installed on their machines, this may be an easier execution option that is relatively platform agnostic.

    TODO:

    • Hook build, and publish to CI
    • Update README.md to include instructions on running with Docker
    enhancement 
    opened by MrSaints 5
  • Latest release does not properly handle spaces in directory names for input arguments

    When using windows binary release from this commit: https://github.com/darakian/ddh/commit/8c64d23dce401330b4c78dcf9b93a3928f78bffd

    It does not handle directories with spaces in their names and crashes.

    For example:

    ddh.exe .\                     (runs fine)
    ddh.exe ".\space in name"      (causes the error below)

    thread 'main' panicked at 'Error canonicalizing input: Os { code: 123, kind: Other, message: "The filename, directory name, or volume label syntax is incorrect." }', libcore\result.rs:945:5 stack backtrace: 0: 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: BaseThreadInitThunk 11: RtlUserThreadStart

    help wanted Windows 
    opened by fiveseven808 5
  • Strange results on Windows

    If I try to ddh my entire user directory, even from an administrator prompt, I get a bunch of "cannot open" lines as well as the errors in the attached image.

    However, if I run ddh on its own target directory (ddh .\), it works as expected.

    opened by fiveseven808 5
  • LGPL-3.0 is deprecated

    LGPL-3.0 is deprecated; consider using LGPL-3.0-only or LGPL-3.0-or-later. (You can just add a license header to the source file or in the README.) Thanks!

    https://github.com/darakian/ddh/blob/e4c574ffd9855384cb35ec7534d3eba6b03ba867/Cargo.toml#L7

    opened by chenrui333 3
  • Feature request: Allow usage of perceptual hashes for images

    This would allow users to dedup pictures even if they have different file formats, resolutions or quality. Usually one wants to keep the largest file.

    im_hash delivers the rust functionality.

    opened by grayfallstown 0
Releases(0.13.0)
Owner
Jon Moroney