Coder Social home page Coder Social logo

photocopier's Introduction

Photocopier: Organize and De-duplicate Your Media

Photocopier is a Python script that organizes your images and videos by date and ensures you don't copy the same file twice. This is especially useful for large photo libraries where organization can get chaotic over time.

๐ŸŒŸ Features

  • Organization by Date: Rearranges your media into folders by year and month.
  • De-duplication: Uses MD5 hashing to avoid copying the same media file twice.
  • Stock Take: A mode to get a quick overview of the extensions and the number of files in the source directory.
  • Logging: Detailed logging to provide insight into the process.

๐Ÿ”ง Prerequisites

  • Python 3.6 or higher

  • Required Python libraries: pyexifinfo, click, hashlib, os, shutil, datetime

    Install them using:

    pip install click pyexifinfo

๐Ÿš€ Usage

Basic Organization

Copy and organize your images:

python photocopier.py /path/to/source /path/to/destination

Stock Take Mode

For a stock take of extensions in the source directory:

python photocopier.py /path/to/source /path/to/destination -s

๐Ÿ›  How It Works

  1. Stock Take (Optional): Analyzes the source directory to display the number of files and the file extensions.
  2. Load Existing Hashes: Before organizing, the script checks a hash file (copied_hashes.txt) in the destination directory to see which files have been previously copied.
  3. Organization: Each media file is read, its date is extracted (from metadata or the file's modification date), and it's copied into the appropriate year-month directory.
  4. Avoiding Duplicates: Each media file's MD5 hash is calculated. If the hash exists in the hashes file, it's skipped; otherwise, it's copied to the destination and the hash is saved.

๐Ÿ’ก Note: The copied_hashes.txt file is essential and should not be deleted. It resides in the destination directory and serves as a record of what's already been copied. Deleting this file would lose that record and may result in duplicates.

๐Ÿ–ฅ๏ธ Output

Whenever a hash exists, a dot (.) is printed to show that this file has been found in your library already. When a file is not found and copied across, a plus (+) is printed. The output can look something like this:

Screenshot 2023-08-06 at 22 17 29

๐Ÿค Contributing

If you find any issues or have suggestions, feel free to open an issue or make a pull request.

photocopier's People

Contributors

dsmurrell avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.