dupfinder

duplicate finder for files.

This Python script searches a specified directory for files and copies one instance of each unique file into another directory. The original files and directories are left untouched.

A new directory called Copied is created in the working directory, and the files are copied into it.

Running dupfinder.py

Usage: dupfinder.py [options]

Options:
  -h, --help            show this help message and exit
  -x FILE_EXTENSION, --ext=FILE_EXTENSION
                        the extension used for file filter
  -p PATH, --path=PATH  file path to use
  -d DESTINATION, --dst=DESTINATION
                        where the unique files are to be copied into
  -l MAX_NUMBER_OF_FILES, --limits=MAX_NUMBER_OF_FILES
                        set a limit on the number of files to process
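
The help text above has the layout that Python's `optparse` module produces, so the option handling might look roughly like the sketch below. This is an assumption, not the script's actual code: the `dest` names, the integer type for `--limits`, and the `Copied` default for `--dst` (taken from the description above) are all guesses made for illustration.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser that mirrors the documented options (hypothetical sketch)."""
    parser = OptionParser(prog="dupfinder.py")
    parser.add_option("-x", "--ext", dest="file_extension",
                      metavar="FILE_EXTENSION",
                      help="the extension used for file filter")
    parser.add_option("-p", "--path", dest="path", metavar="PATH",
                      help="file path to use")
    # Assumption: the destination falls back to a directory named "Copied".
    parser.add_option("-d", "--dst", dest="destination",
                      metavar="DESTINATION", default="Copied",
                      help="where the unique files are to be copied into")
    # Assumption: the limit is parsed as an integer.
    parser.add_option("-l", "--limits", dest="max_number_of_files",
                      metavar="MAX_NUMBER_OF_FILES", type="int",
                      help="set a limit on the number of files to process")
    return parser


if __name__ == "__main__":
    options, _args = build_parser().parse_args()
    print(options)
```

With a parser like this, an invocation such as `dupfinder.py -x jpg -p photos -d unique -l 500` would filter on `jpg` files under `photos`, copy into `unique`, and stop after 500 files.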

The problem it is trying to solve

My wife and I take a lot of photos with our phones. When I transfer them onto our computer, there is no way of knowing whether I have already transferred a given file. So to be safe, I copy all the files from the phone to the computer, and then clean up the duplicates on the computer.

The reason for doing this

There are quite a few duplicate finders available, but I undertook this project as an exercise to learn Python.

The algorithm

For the specified path, the Python script searches for files that match the specified file extension. When it finds one, it generates an MD5 digest for that file. It then looks through its log of already-processed files for an entry with the same MD5 digest. If one exists, it increases the count for that digest. It also logs the file's full path against that digest. When all the files have been filtered, it iterates over each entry in the list and copies it to the destination folder.
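
The steps above can be sketched in a few functions. This is a minimal illustration under stated assumptions, not the script's actual code: the function names, the chunked file reading, and the digest prefix added to copied file names (to avoid collisions between different files that share a name) are all hypothetical choices. The count per digest is simply the length of its path list.

```python
import hashlib
import os
import shutil


def md5_digest(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_files(root, extension):
    """Map each MD5 digest to the full paths of all matching files that share it."""
    seen = {}
    suffix = "." + extension.lstrip(".").lower()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(suffix):
                full_path = os.path.join(dirpath, name)
                seen.setdefault(md5_digest(full_path), []).append(full_path)
    return seen


def copy_unique(seen, destination):
    """Copy one file per digest into destination; the originals are not modified."""
    os.makedirs(destination, exist_ok=True)
    copied = []
    for digest, paths in seen.items():
        source = paths[0]
        # Assumption: prefix the name with part of the digest so that two
        # different files with the same base name do not overwrite each other.
        target = os.path.join(destination,
                              digest[:8] + "_" + os.path.basename(source))
        shutil.copy2(source, target)
        copied.append(target)
    return copied
```

Calling `copy_unique(find_files("photos", "jpg"), "Copied")` would leave one copy of each distinct `jpg` file in `Copied`. Any digest whose path list has more than one entry marks a set of duplicates.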

Contributors

tyc
