
Finding Duplicate Files

HarshLanguage Registered User regular
edited September 2007 in Games and Technology
I need to organize the files on my PC. I've got hundreds of gigs of files on a couple different drives that have accumulated over the last few years. I've got backups from PCs that no longer exist, but the backed-up files overlap with each other. Now, I finally have a drive big enough to hold all this stuff in one place (a 500gb MyBook external) and I want to clean. it. out.

I know there are lots of duplicate files in there, so I want to take care of them first. I have a very simple program, called, obviously enough, Duplicate File Finder, that I've used for smaller duplicate checks before. But it's shareware, not free, and really basic, so before I pay for it, I want to see if there are any dupe-file checking programs that are better.

Ideally, I'd want something with a Windows Explorer-style interface (folder trees), clear display of the vital stats and location of each file, and a simple way of previewing the files and deleting the dupes. I'd prefer not to pay too much, let's say up to $25 ballpark, but free is even better.

Google and a search of shareware sites lead to too many results to be helpful, and I hate dl'ing random unknown apps from download.com or wherever. So, tell me G&T, what have you guys used and liked?

> turn on light

Good start to the day. Pity it's going to be the worst one of your life. The light is now on.

Posts

  • HarshLanguage Registered User regular
    edited September 2007
    This Duplicate File Finder program is... pretty OK. The interface is so barebones, though. It doesn't highlight which set of duplicates you're working on, so if you're working fast it would be possible to delete the wrong files.

    Any other suggestions?

    (Yes, I replied to myself, because I really do want to know if there's a better program out there for this. :| )

  • jedijz Registered User regular
    edited September 2007
    Well, currently I'm using a feature in Advanced Uninstaller PRO 2005 to find duplicate files. It's pretty effective, with multiple settings for how thorough you want to be, and it categorizes the results.

    Goomba wrote: »
    It is no easy task winning a 1v3. You must jump many a hurdle, bettering three armies, the smallest.

    Aye, no mere man may win an uphill battle against thrice your men, it takes a courageous heart and will that makes steel look like copper. When you are that, then, and only then, may you win a 1v3.

    http://steamcommunity.com/id/BlindProphet
  • mntorankusu I'm not sure how to use this thing.... Registered User regular
    edited December 2011
    Duplicate Cleaner is the best one I've used, and it's free.

    Edit: Oh. Spammer resurrected this thread. Whoops. Still useful information!!

  • kitch Registered User regular
    edited January 2012
    I know this isn't what you want, but hear me out.

    Download Cygwin http://www.cygwin.com/
    Install it with the setup.exe it has you download. You don't need to install every package from the package list; the defaults should be fine.

    Once installed, navigate to the folder where you installed it, and run cygwin.bat. You're basically running BASH within cmd.exe.

    On the command line, type the following. /cygdrive/c/ is the top of the C drive, /cygdrive/d/ is the top of the D drive, etc.
    cd /cygdrive/c/THE_PATH/TO/THE/FOLDER_YOU_WANT_TO_START_IN  
    

    If you get lost, type "pwd" to print your current directory. "ls" will print the names of the files in that directory.

    When you get to where you want to be, you can then type the following. This runs md5sum against every file in your current folder and below to see if there are any duplicate files. Depending on where you start, this could take a few hours to run. Every file on your computer (200,000+) vs 10,000 files in your media folder is a big difference. Also you might not have enough RAM to check every file on your computer in one go.
    find . -type f -exec stat --printf='%32s ' {} \; -exec md5sum {} \; | sort -rn | uniq -d -w65 --all-repeated=separate > DUPLICATE_FILES_01.TXT
    

    Once it has finished running, all of the duplicates will be listed in DUPLICATE_FILES_01.TXT.
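
    If you'd rather see what the command does before pointing it at a real drive, here is a rough sandbox sketch. Everything in it (the demo_dir folder, the three file names, the report variable) is made up for illustration. It also writes the report outside the folder being scanned, which is my own tweak so the scan never picks up its own output file. It assumes GNU find, stat, md5sum, sort and uniq, which is what Cygwin ships.

    ```shell
    # Throwaway folder: two identical files under different names, plus one unique file.
    demo_dir=$(mktemp -d)
    mkdir -p "$demo_dir/a" "$demo_dir/b"
    printf 'same contents\n' > "$demo_dir/a/one.txt"
    printf 'same contents\n' > "$demo_dir/b/renamed_copy.txt"
    printf 'something else\n' > "$demo_dir/a/unique.txt"

    # The report lives outside the scanned folder so the scan never reads its own output.
    report=$(mktemp)

    # Same pipeline as above: stat prints the size padded to 32 characters,
    # md5sum appends the checksum and path. sort -rn puts the biggest files first;
    # uniq compares only the first 65 characters (32 size + 1 space + 32 md5)
    # and keeps every line that belongs to a repeated group.
    (
      cd "$demo_dir" &&
      find . -type f -exec stat --printf='%32s ' {} \; -exec md5sum {} \; |
        sort -rn |
        uniq -d -w65 --all-repeated=separate
    ) > "$report"

    cat "$report"
    ```

    The report should list one.txt and renamed_copy.txt as a pair and leave unique.txt out.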

    Type this to view the text file in your terminal. Press the down/up or pagedown/pageup keys to scroll. "Q" will exit/quit it.
    cat -n DUPLICATE_FILES_01.TXT | less
    

    Or just open it in Wordpad or some rich text editor that is not Notepad.


    It'll look something like this:
                                   0 d41d8cd98f00b204e9800998ecf8427e *./folder_name/desktop/your_media_stash/gross.mp4
                                   0 d41d8cd98f00b204e9800998ecf8427e *./folder_name/desktop/your_media_stash/someotherfile.mkv
    

    That's the file size in bytes, then the md5sum, then the path of the file. A matching size and md5 means those files are duplicates.
    Since it's checking the md5, this allows you to check for duplicate files which may be named totally differently.
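
    Since each line in the report starts with the file size, you can also get a rough idea of how much space the duplicates are eating. This is only a sketch: the sample report below is invented (the hashes and paths are placeholders, not real checksums), and it assumes you'd keep one copy from each group and delete the rest.

    ```shell
    # Made-up sample report in the same layout: size, md5, path.
    # The hashes and paths here are placeholders, not real checksums.
    report=$(mktemp)
    {
      printf '%32s %s *%s\n' 1000 0123456789abcdef0123456789abcdef ./a/movie.avi
      printf '%32s %s *%s\n' 1000 0123456789abcdef0123456789abcdef ./b/movie_copy.avi
      printf '%32s %s *%s\n' 1000 0123456789abcdef0123456789abcdef ./c/movie_old.avi
      printf '\n'
      printf '%32s %s *%s\n' 250 fedcba9876543210fedcba9876543210 ./a/notes.txt
      printf '%32s %s *%s\n' 250 fedcba9876543210fedcba9876543210 ./b/notes.txt
    } > "$report"

    # For every size+md5 pair, the first file is the keeper and each later one
    # counts as wasted space. Blank separator lines are skipped by the NF check.
    wasted_summary=$(awk '
      NF >= 3 {
        key = $1 " " $2
        if (key in seen) { wasted += $1 } else { seen[key] = 1 }
      }
      END { printf "%.0f bytes could be reclaimed\n", wasted }
    ' "$report")

    echo "$wasted_summary"
    ```

    With the made-up sample, that works out to two extra copies of the 1000-byte file plus one extra copy of the 250-byte file, so it reports 2250 bytes.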


    Forgot to add: if you end up running this in multiple places and have a bunch of random DUPLICATE_FILES_01, 02, 03, etc. lying around, put them all into the same folder and type:
    cat DUPLICATE_FILES_01.TXT DUPLICATE_FILES_02.TXT DUPLICATE_FILES_03.TXT | sort > NEW_FILE_NAME.TXT
    
    That will write them all to a single file in sorted order. The blank lines that separated the groups will all sort to the top of the file.
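
    If you'd like the merged file to keep the same layout as the individual reports (biggest files first, blank line between groups), something along these lines might do it. Treat it as a sketch: work_dir, MERGED.TXT and the two sample reports are invented for the example, and the hashes are placeholders. Keep in mind the paths in each report are relative to wherever that run started, so the merged list won't tell you which folder a "./" belongs to.

    ```shell
    # Two made-up reports standing in for DUPLICATE_FILES_01.TXT and _02.TXT.
    work_dir=$(mktemp -d)
    {
      printf '%32s %s *%s\n' 250 fedcba9876543210fedcba9876543210 ./docs/notes.txt
      printf '%32s %s *%s\n' 250 fedcba9876543210fedcba9876543210 ./docs/notes_copy.txt
    } > "$work_dir/DUPLICATE_FILES_01.TXT"
    {
      printf '%32s %s *%s\n' 1000 0123456789abcdef0123456789abcdef ./movies/a.avi
      printf '%32s %s *%s\n' 1000 0123456789abcdef0123456789abcdef ./movies/b.avi
    } > "$work_dir/DUPLICATE_FILES_02.TXT"

    merged="$work_dir/MERGED.TXT"

    # Drop the blank separators, drop exact repeat lines, sort biggest first,
    # then let uniq put a blank line back between groups.
    cat "$work_dir/DUPLICATE_FILES_01.TXT" "$work_dir/DUPLICATE_FILES_02.TXT" |
      grep -v '^$' |
      sort -u |
      sort -rn |
      uniq -d -w65 --all-repeated=separate > "$merged"

    cat "$merged"
    ```

    The sort -u and sort -rn steps are kept separate on purpose: combined as sort -rn -u, sort would treat every line with the same size as a repeat and throw away all but one.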

  • Phoenix-D Registered User regular
    Rare case where a necrothreading spammer was actually useful, I needed that info! :D

This discussion has been closed.