customclean's Introduction

CustomClean

For HCP outputs and other things you need to clean up!

Installation

  1. Requires Python.
  2. Clone custom-clean repo to the desired location.

Cleaning JSON file

This file holds the rules and patterns to be applied to the folder being cleaned.

The JSON file contains 2 elements:

  • The "file_system_data" element, which holds the directory tree of the folder being cleaned. This is the "rules" portion that tells the script what to do (keep or delete) with each file and folder. The tree must match the directory tree of the folder being cleaned.
  • The "pattern_list" element, whose value is a list of pattern strings.
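
As a rough illustration, the two elements could be laid out as in the Python sketch below, which builds a rules dictionary and prints it as JSON. Only the names "file_system_data", "pattern_list", and "state" (with the values "keep" and "delete") come from this README. The "children" key, the nesting, and the example file names are assumptions made for illustration, so check a real CustomClean JSON for the actual layout.

```python
import json

# Hypothetical rules file. "file_system_data", "pattern_list", and "state"
# are named in the README; the "children" key and the nesting shown here
# are assumptions, not the confirmed CustomClean schema.
cleaning_rules = {
    "file_system_data": {
        "sub-01": {
            "state": "keep",
            "children": {
                "task-rest_run-01": {"state": "keep"},
                "temp0": {"state": "delete"},
            },
        },
    },
    "pattern_list": ["task-rest_run-*", "temp*"],
}

# Serialize to the text that would be saved as the cleaning JSON file.
json_text = json.dumps(cleaning_rules, indent=2)
print(json_text)
```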

Patterns

If the data to be cleaned contains numbered files or folders, you can take the rule (keep or delete) written for one numbered file or folder and apply it to all matching numbered files and folders.

To do this, either add a pattern to the "pattern_list" or pass it with the pattern flag (-p) of the cleaning script, described below.

Example 1

You have BIDS data and a lot of files and folders contain strings like "task-rest_run-01". You don't know how many runs there are going to be, but you want the rules for all task-rest runs to be the same.

  • Put rules in your tree for files and folders whose names contain "task-rest_run-01".
  • Add the pattern "task-rest_run-*" to the list value of the "pattern_list" element. (The * is shorthand for all numbers.)

All rules containing "task-rest_run-" followed by any number will be applied to all files and folders whose names contain "task-rest_run-" followed by any number.

Note that if you want the same rules to apply to all of your tasks (not just task-rest), your pattern can be just "run-*". Just be a little careful with patterns until you get used to them.
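
One way to picture how a rule written for run-01 can cover every run is sketched below. This is not taken from the CustomClean source: it assumes the * stands for one or more digits, and the helper names pattern_to_regex and canonical_name are hypothetical. Each name has its numbered part replaced by the pattern itself, and a rule applies to a file when the two canonical names are equal.

```python
import re


def pattern_to_regex(pattern):
    """Turn a pattern like 'task-rest_run-*' into a regex where '*' means digits.

    Assumption: '*' is shorthand for one or more digits (0-9).
    """
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(r"\d+".join(literal_parts))


def canonical_name(name, patterns):
    """Replace every numbered match with the pattern text so numbers no longer matter."""
    for pattern in patterns:
        # A function is used as the replacement so the pattern is inserted literally.
        name = pattern_to_regex(pattern).sub(lambda _match, p=pattern: p, name)
    return name


patterns = ["task-rest_run-*"]
rule_name = "sub-01_task-rest_run-01_bold.nii.gz"   # name that appears in the rules tree
file_name = "sub-01_task-rest_run-07_bold.nii.gz"   # name found on disk
other_task = "sub-01_task-nback_run-07_bold.nii.gz"

print(canonical_name(rule_name, patterns) == canonical_name(file_name, patterns))
print(canonical_name(rule_name, patterns) == canonical_name(other_task, patterns))
```

With the broader pattern "run-*", the task-nback file would still not share the task-rest rule, because the task names differ. It would instead share whatever rule exists for a task-nback run.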

Example 2

There is a set of files in your data called temp01, temp02, .... Again, you don't know how many there will be, but you want them all gone.

  • Make a rule for a file called temp0 (or temp6 or temp382 - doesn't matter, just temp followed by a number). Set the value of its "state" to "delete".
  • Add a pattern to the list for "temp*".

Don't worry, none of your files with "template" in their names will disappear. The pattern only affects files with the string "temp" followed directly by a number.
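
The difference between "temp followed by a number" and "template" can be checked with a short Python sketch. It assumes that "a number" means one or more digits and that a match anywhere in the name counts. The file names are made up for illustration, and the real script may match differently.

```python
import re

# Assumption: the pattern "temp*" behaves like "temp" followed by one or more digits.
TEMP_PATTERN = re.compile(r"temp\d+")

names = [
    "temp01",
    "temp02",
    "temp382",
    "template.nii.gz",
    "my_template_file.txt",
    "temporary",
]

# Only names where "temp" is followed directly by a digit are selected.
to_delete = [name for name in names if TEMP_PATTERN.search(name)]
to_keep = [name for name in names if not TEMP_PATTERN.search(name)]

print("matched by temp*:", to_delete)
print("left alone:", to_keep)
```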

CustomClean Cleaning Script (cleaning_script.py)

Delete things in a given directory based on the rules and patterns given in the JSON.

Required arguments:

  • -j --json [path to JSON]
  • -d --dir [path to target directory]

Optional arguments:

  • -p --pattern [string to use for numbered series]

Error information will display on the console. Success information (i.e. what files, directories, and links were removed) will be written to a file called custom_clean_success_record.txt at the top level of the target directory.
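
The arguments above can be mirrored with a small argparse sketch, shown here only to make the command line shape concrete. It is not the source of cleaning_script.py: the help strings, the example paths, and the assumption that -p takes a single string are illustrative. The flag names and the success record file name are the ones listed in this README.

```python
import argparse
import os


def build_parser():
    """Build a parser with the flags documented for the cleaning script."""
    parser = argparse.ArgumentParser(
        description="Sketch of the documented CustomClean command line."
    )
    parser.add_argument("-j", "--json", required=True, help="path to JSON")
    parser.add_argument("-d", "--dir", required=True, help="path to target directory")
    parser.add_argument(
        "-p", "--pattern", default=None, help="string to use for numbered series"
    )
    return parser


# Example invocation with made-up paths, parsed from a list instead of sys.argv.
args = build_parser().parse_args(
    ["-j", "rules.json", "-d", "/data/sub-01", "-p", "run-*"]
)

# The README says the success record is written at the top level of the target directory.
record_path = os.path.join(args.dir, "custom_clean_success_record.txt")
print(args.json, args.dir, args.pattern, record_path)
```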

CustomClean GUI

NOTE: The GUI is no longer being maintained. In addition, as of 2.0.0, the cleaning script does not work with JSON files created by the CustomClean GUI. If you still wish to try using the GUI and a cleaning script, use an older version of CustomClean (pre 2.0.0).

customclean's People

Contributors

dasturge, ericearl, kathy-snider, lucimoore, lundq163, madisoth, tjhendrickson

Forkers

gregconan

customclean's Issues

Need documentation about design decisions.

For example: what if a "folder" is listed in the json with "delete", but not all of its children are deleted (e.g. more files are written now than when the json was created, so removing the ones listed for "delete" in the json leaves some other stuff around)? Should we delete the folder anyway? What if a folder is empty but its status is "keep"? Should we delete it anyway?
We need to remember what was decided or to decide anew. Either way, we need to document it so both the developer and the user know what the code is supposed to do.

Changed File Names in pipeline

Hi, I'm Noah Baden, with the Wash U team downloading the ABCD Year 2 data. Going through the given data showed that the CustomClean script stopped working because the names of all the files were changed from task-**** to ses-2YearFollowUpYArm1_task-****. So you may need to re-run the script with an edited json to fix this issue.

Current cleaning script does not work with jsons from Custom Clean UI.

We have changed the format of the json files so that, eventually, the user can edit an existing json from the GUI without the original directory being around anymore, and so that the cleaning script can support more features (like patterns). But we have not rewritten the GUI, so the current GUI makes one format of json and the cleaning script takes a different format. Bottom line: the GUI is not really used right now.

CustomClean needs a template option like file-mapper

In cases where a subject ID or a session ID may appear in the paths of files that should be deleted on a subject-specific or other case-specific basis, CustomClean should support templates the same way file-mapper does.
