[NBLUG/talk] Finding duplicate files

Ross Thomas spamb8r at netscape.net
Mon Jul 7 01:24:00 PDT 2003


Lincoln Peters wrote:
> Is there an easy way to do a recursive search for duplicate files, 
> preferably from the shell?

Put this into a file, make it executable, and run it with the directory
you wish to check.  It defaults to the current directory.  It relies on
GNU tools (md5sum, uniq -D -w), so it will only really work on Linux.
Other *nix OSes may not support the -print0 and -0 args, so there you
would need to insert a 'sed' command or make the xargs more restrictive.

This one only finds real files.  Symlinks are ignored.

It also handles embedded blanks and tabs in file names, but misbehaves
when newlines are embedded in a file name (sort & uniq aren't that
sophisticated).
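
On the symlink point: the find in the script below uses -type f, which
matches regular files only.  If symlinked files should be checked too,
one possible approach is find's -L option, which applies -type to the
link target.  A rough sketch for comparing the two; the helper names
regular_only and follow_links are made up for illustration and are not
part of the script:

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# What the script does: regular files only, symlinks skipped.
regular_only() { find "${1:-.}" -type f; }

# With -L, a symlink that points at a regular file also matches -type f.
follow_links() { find -L "${1:-.}" -type f; }
```

Note that with -L a link and its target have the same contents, so they
would be reported as duplicates of each other.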

------------- Cut Here ----------------
#!/bin/sh

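# Accept at most one argument: the directory to scan.
if [ $# -ge 2 ]
then
    echo "Usage: $(basename "$0") [<dir_name>]" >&2
    exit 1
fi

# Checksum every regular file (NUL-delimited, so blanks and tabs are safe),
# sort so equal checksums end up adjacent, then print every line whose
# first 32 characters (the MD5 digest) are repeated.
find "${1:-.}" -type f -print0 | xargs -0 md5sum | sort | uniq -D -w 32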
------------- Cut Here ----------------
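
For systems without -print0, -0, or GNU uniq, here is one possible
variant.  It is only a sketch: it assumes an md5sum command is
installed, uses find's "-exec ... {} +" form in place of xargs, and
swaps "uniq -D -w 32" for a small awk filter that compares the first 32
characters of each line.  The function name dupes_portable is made up
for illustration.  It has the same trouble with newlines in file names.

```shell
#!/bin/sh
# Sketch only: the same pipeline without -print0, -0, or uniq -D/-w.
# Assumes md5sum is available; dupes_portable is a hypothetical name.
dupes_portable() {
    # "-exec ... {} +" hands file names straight to md5sum, so blanks
    # and tabs in names are safe without NUL delimiters.
    find "${1:-.}" -type f -exec md5sum {} + |
        sort |
        awk '{ sum = substr($0, 1, 32) }        # the MD5 digest column
             sum == prev {                      # same digest as the previous line
                 if (held != "") print held     # emit the first member of the group once
                 print
                 held = ""
                 next
             }
             { prev = sum; held = $0 }          # start of a possible new group'
}
```

Called as "dupes_portable /some/dir" (or with no argument for the
current directory), it should print the same lines as the script above.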

HTH.

Ross.



