osfclient / osfclient

A python library and command-line client for file storage on OSF

Home Page: http://osfclient.readthedocs.io/en/stable/

License: BSD 3-Clause "New" or "Revised" License

Python 99.18% Shell 0.82%
science open-science data-management osf python

osfclient's Introduction

osfclient

The osfclient is a Python library and a command-line client for uploading and downloading files to and from your Open Science Framework projects. The Open Science Framework (OSF) is an open source project that facilitates open collaboration between researchers on the web by sharing data and other research outputs.

As such, the OSF hosts large data sets, associated with papers or scientific projects, that can be freely downloaded. The osfclient allows people to store and retrieve large datasets associated with their scientific projects and papers on the OSF via the command-line interface. If you are completely new to the OSF, you can read their introductory materials.

This is a very new project and it has some rough edges.

Installing

To use osfclient, install it via pip:

$ pip install osfclient

For details on participating in the development of osfclient check out the Contributing section.

Usage

This project provides two things: a Python library and a command-line program for interacting with files stored in the OSF.

The Python library forms the basis for the command-line program. If you want programmatic access to your files, use the library; otherwise, try out the command-line program.

Read the full documentation: https://osfclient.readthedocs.io/en/latest/

Below are some examples of how to use it:

# get help and see available commands, get help on a specific command
$ osf -h
$ osf <command> -h

# set up a local folder for an existing project
$ osf init

# list all files for the project
$ osf ls

# fetch all files for the project
$ osf clone

# fetch an individual file from a project
$ osf fetch remote/path.txt local/file.txt

# get web view url for an individual file from a project
$ osf geturl remote/path.txt

# add a new file
$ osf upload local/file.txt remote/path.txt

# add a new directory
$ osf upload -r local/directory/ remote/directory

If the project is private, you will need to provide authentication details: either username and password credentials or a Personal Access Token (PAT). Set the OSF_USERNAME and OSF_PASSWORD environment variables for the former, or the OSF_TOKEN environment variable for the latter. If OSF_PASSWORD is not set, the tool asks you for the password when you run it.

You can set default values for the username and project by using a configuration file in the current directory. This is what osf init does for you. To set the username and project ID manually, create .osfcli.config:
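The lookup of credentials from the environment can be sketched as follows. This is a minimal illustration, not osfclient's actual implementation: the helper name resolve_credentials and the choice to prefer a token when both kinds of credentials are set are assumptions; only the three environment variable names come from the text.

```python
import os


def resolve_credentials(env=None):
    """Return a dict describing which credentials were found.

    Hypothetical helper: a token is preferred when both kinds are present.
    """
    env = os.environ if env is None else env
    token = env.get("OSF_TOKEN")
    if token:
        return {"kind": "token", "token": token}
    username = env.get("OSF_USERNAME")
    if username:
        # A missing password means the tool would have to prompt for it.
        return {
            "kind": "basic",
            "username": username,
            "password": env.get("OSF_PASSWORD"),
        }
    return {"kind": "anonymous"}


print(resolve_credentials({"OSF_USERNAME": "someone"}))
```

Passing a plain dict instead of os.environ keeps the helper easy to test without touching the real environment.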

[osf]
username = [email protected]
project = 9zpcy

To avoid having to provide credentials on each use, you can provide either your password or a PAT in your config with the following keys:

# basic auth (username/password)
password = this-password-is-fake

# token auth
token = kej2R9IU6Gr2uThsswSNdP1cd0cu9eaCerVXjVf7zNwfXHyT0QzMZtX0PGTYmp9Fzaixwq

After that, you can simply run osf ls to list the contents of the project.
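The config file uses INI syntax, so it can also be read from your own scripts with Python's standard configparser module. The sketch below parses an in-memory sample; the username is a placeholder and the helper name read_osf_section is made up for illustration. It does not show how osfclient itself loads the file.

```python
import configparser

# Sample contents of .osfcli.config (placeholder values).
SAMPLE = """\
[osf]
username = user@example.org
project = 9zpcy
"""


def read_osf_section(text):
    """Parse config text and return the [osf] section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["osf"]) if parser.has_section("osf") else {}


settings = read_osf_section(SAMPLE)
print(settings["project"])
```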

Contributing

Contributions from everyone and anyone are welcome. Fork this repository, make your changes, add a test to cover them, and create a Pull Request. One of the maintainers will then review your changes. When all comments have been addressed and all tests pass, your changes will be merged.

To set up a development version:

$ git clone https://github.com/YOURNAMEHERE/osfclient
$ cd osfclient
$ git remote add upstream https://github.com/osfclient/osfclient
$ pip install -r devRequirements.txt -c constraints.txt
$ pip install -e . -c constraints.txt

There are a few secret keys relevant to this project, like passwords to pypi.org, test.pypi.org, and the osfclient email account. We store these in an encrypted git repo on Keybase. If you need access to this repo, contact any of the following maintainers on Keybase:

  • Tim Head (@betatim)
  • Ben Lindsay (@benlindsay)
  • Fitz Elliott (@felliott)
  • Longze Chen (@cslzchen)

For more details and instructions: CONTRIBUTING.md

osfclient's Issues

Command to obtain sharable URL

New subcommand to get the URL from which a file in a project can be downloaded with wget etc. For easy sharing with others or use in setups without osfclient.

Maybe like osf url <remote_path> which then prints out the URL to fetch the file.

Can we generate a link containing a "secret" so that files from private repos can be shared as well?

Additional information on contributing

Add more details on the particular style of how to work together/make contributions.

Topics:

  • [MRG] vs [WIP] tags on PRs
  • rebase your PRs instead of merges to keep history pretty and linear
  • how to write new tests, especially regarding mock

CONTRIBUTING.md is probably the place for this information.

syntax error with python 2.7.12

is this a py3 issue?

# after running sudo pip install -e .
% osf
Traceback (most recent call last):
  File "/usr/local/bin/osf", line 9, in <module>
    load_entry_point('osfclient', 'console_scripts', 'osf')()
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 542, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2569, in load_entry_point
    return ep.load()
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2229, in load
    return self.resolve()
  File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2235, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/home/titus/osf-cli/osfclient/__init__.py", line 2, in <module>
    from .api import OSF
  File "/home/titus/osf-cli/osfclient/api.py", line 1, in <module>
    from .models import OSFCore
  File "/home/titus/osf-cli/osfclient/models/__init__.py", line 6, in <module>
    from .core import OSFCore
  File "/home/titus/osf-cli/osfclient/models/core.py", line 31
    def _get_attribute(self, json, *keys, default=None):
                                                ^
SyntaxError: invalid syntax
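
Keyword-only arguments after *keys are Python 3 syntax, which is why Python 2.7 rejects the definition at import time. One portable alternative is to accept **kwargs and pull default out by hand. The sketch below is a standalone function, and its lookup behaviour (walking nested keys and returning the default when one is missing) is an assumption, since only the signature appears in the traceback.

```python
def get_attribute(json, *keys, **kwargs):
    """Walk nested keys; accepts default= on both Python 2 and Python 3.

    The lookup behaviour is assumed; only the signature is known.
    """
    default = kwargs.pop("default", None)
    if kwargs:
        raise TypeError("unexpected keyword arguments: %r" % sorted(kwargs))
    value = json
    for key in keys:
        try:
            value = value[key]
        except (KeyError, TypeError, IndexError):
            return default
    return value


print(get_attribute({"data": {"id": "9zpcy"}}, "data", "id"))
```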

Better PyPI project page

Switch setup.py to read in the README as long description so it appears on PyPI's page for the project.

Logo design

This project doesn't have a logo yet! No code writing required to contribute to this issue.

Collect ideas and examples here.

Maybe something that connects visually with https://osf.io/ so people realise that this is a tool for using with osf.io

The global sprint bright green and blue is pretty cool...maybe a bit hard on the eyes for the long term. Ideas welcome.

Local config file support

Add support for a config file in the local directory .osf-storage.config to store the username and project ID.

We should not offer any support that would encourage people to store secrets in a file/accidentally have it end up in some kind of history.

EOF occurred in violation of protocol error

I installed osf-cli and have a .osfcli.config with this content:
[osf]
username = [email protected]
project = d3jx7

Why do I get an "EOF occurred in violation of protocol (_ssl.c:661)" error when I run the command
osf -p d3jx7 list
and enter my OSF password at the prompt, as shown below?

LIB-2176:osf-cli nmunn$ osf -p d3jx7 list
Please input your password:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/bin/osf", line 11, in <module>
    load_entry_point('osfclient', 'console_scripts', 'osf')()
  File "/Users/nmunn/osf-cli/osfclient/__main__.py", line 97, in main
    exit_code = args.func(args)
  File "/Users/nmunn/osf-cli/osfclient/cli.py", line 182, in list_
    project = osf.project(args.project)
  File "/Users/nmunn/osf-cli/osfclient/api.py", line 24, in project
    return Project(self._json(self._get(url), 200), self.session)
  File "/Users/nmunn/osf-cli/osfclient/models/core.py", line 23, in _get
    return self.session.get(url, *args, **kwargs)
  File "/Users/nmunn/osf-cli/osfclient/models/session.py", line 43, in get
    response = super(OSFSession, self).get(url, *args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/sessions.py", line 515, in get
    return self.request('GET', url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/sessions.py", line 502, in request
    resp = self.send(prep, **send_kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/sessions.py", line 612, in send
    r = adapter.send(request, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:661)
LIB-2176:osf-cli nmunn$

`osf init` to help creating config files

Add an osf init command that asks the user some questions and creates a .osfcli.config in the current directory.

This should make it easier for new users to set things up.

Companion command to `osf clone`

Related to #18: it would be useful to have a simple command that is the opposite of osf clone. I thought about integrating it into osf upload -r but I think the semantics are sufficiently different (for example you don't really want to specify a destination storage, it should use the same as a file currently uses).

Also need to think about how to handle the case when a local file is different from what is on OSF. There are versions on OSF so we might be able to figure out something smarter than just "remote file is different, overwrite?".

Maybe this is the time to switch to osf clone to make the initial copy, osf fetch to update local from OSF and osf push to update OSF from local. Following the git names/pattern a bit.

Add integration tests

All tests here fake the OSF in order to run fast and not rely on some project on the OSF that needs resetting. This means the build could be green despite the code not working in practice anymore because something has changed on the OSF side.

We should add some (light?) tests using one of the HTTP replay/recorder libraries to check we are still compatible with the OSF, and give feedback to the OSF when something that used to work breaks.

Tests still fail locally

After our discussion today @betatim:

Somehow running py.test still fails for me when running it locally, even when there's no .osfcli.config present in the osf-cli folder. I tried a fresh clone of this repo and it gives 6 failed, 60 passed in 1.76 seconds and in all 6 cases the error is the following:

AttributeError: <module 'osfclient.cli' from '…/osf-cli/osfclient/cli.py'> does not have the attribute 'open'

I've tested on Python 3.4.4 and Python 2.7.12.
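
The message matches what mock.patch raises when asked to replace an attribute that the target module does not define. If the failing tests patch osfclient.cli.open (an assumption; the test code is not shown here), passing create=True lets the patch add the name temporarily. The sketch uses a stand-in module called demo_cli rather than osfclient itself, and unittest.mock from Python 3; on Python 2 the separate mock package offers the same interface.

```python
import sys
import types
from unittest.mock import mock_open, patch

# Stand-in for a module that calls the builtin open() without defining its own.
demo_cli = types.ModuleType("demo_cli")
exec(
    "def read_config(path):\n"
    "    with open(path) as fp:\n"
    "        return fp.read()\n",
    demo_cli.__dict__,
)
sys.modules["demo_cli"] = demo_cli


def read_with_fake_open(data):
    """Call read_config with open() replaced by a mock inside demo_cli only."""
    fake = mock_open(read_data=data)
    # create=True allows patching a name the module does not define itself.
    with patch("demo_cli.open", fake, create=True):
        return demo_cli.read_config(".osfcli.config")


print(read_with_fake_open("[osf]\nproject = 9zpcy\n"))
```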

Configurable mapping of remote storage to local path

Following on from this discussion we could use the config file to specify how to map remote storage to local directories (for use in osf clone).

Currently we create a subdirectory for each storage, which can be tedious especially if you only have one storage. Something along the lines of:

[osf]
osfstorage = local_path
githubstorage = other_local_path

with defaults equal to the current behaviour.
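
A minimal sketch of the lookup this proposal implies, assuming the mapping lives in the [osf] section as in the example and that an unmapped storage falls back to a directory named after itself (the current behaviour). The function name local_dir_for is hypothetical.

```python
import configparser


def local_dir_for(storage, config_text):
    """Return the local directory for a storage, defaulting to its own name."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if parser.has_section("osf"):
        return parser.get("osf", storage, fallback=storage)
    return storage


CONFIG = "[osf]\nosfstorage = local_path\ngithubstorage = other_local_path\n"
print(local_dir_for("osfstorage", CONFIG))
print(local_dir_for("dropbox", CONFIG))
```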

Explain how to install osfclient

The documentation on RTD doesn't actually explain how to install osfclient. Should be fixed by adding pip install osfclient to the index and maybe also the user guide?

Nicer error message when trying to access private project?

Sorry to raise another issue, but I really like the idea of having this command-line client and I hope the feedback is helpful. When I try to access a private project, I get a messy traceback. I see that this traceback is caused by a 401 message from osf.io. Is there a way to tell that the user is trying to access a private repo and tell that to the user rather than spitting out a traceback, or is osf.io pretty opaque about that?

(osf3) lindsb@rrlogin:osf-cli$ osf -p wy5pj ls
Traceback (most recent call last):
  File "/home/lindsb/usr/miniconda/envs/osf3/bin/osf", line 11, in <module>
    load_entry_point('osfclient', 'console_scripts', 'osf')()
  File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/__main__.py", line 97, in main
    exit_code = args.func(args)
  File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/cli.py", line 182, in list_
    project = osf.project(args.project)
  File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/api.py", line 24, in project
    return Project(self._json(self._get(url), 200), self.session)
  File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/models/core.py", line 23, in _get
    return self.session.get(url, *args, **kwargs)
  File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/models/session.py", line 45, in get
    raise UnauthorizedException()
osfclient.exceptions.UnauthorizedException
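
One way to do this would be to catch the exception at the point where the subcommand is dispatched and turn it into a short message and a non-zero exit code. The sketch below defines a local stand-in for the exception class so that it runs on its own; the wrapper name run_command and the message wording are made up for illustration.

```python
import sys


class UnauthorizedException(Exception):
    """Local stand-in for osfclient.exceptions.UnauthorizedException."""


def run_command(func, args):
    """Run a subcommand and return (exit_code, message).

    message is None unless the command failed for lack of authorization.
    """
    try:
        return func(args), None
    except UnauthorizedException:
        return 1, (
            "Not authorized to access this project. If it is private, "
            "provide credentials and try again."
        )


def fake_ls(args):
    """Pretend subcommand that always hits a 401 response."""
    raise UnauthorizedException()


exit_code, message = run_command(fake_ls, None)
if message:
    sys.stderr.write(message + "\n")
```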

subdirectory problem

ref #2 (comment), it's trying to get a download link for a directory.

python -m osfclient fetch foo 7g6vu
looking at: testfoo
Downloading: Questionnaire.docx to /Users/t/dev/osf-cli/foo/dropbox/Questionnaire.docx...
Downloading: README.md to /Users/t/dev/osf-cli/foo/github/README.md...
Downloading: Recycling Event.png to /Users/t/dev/osf-cli/foo/osfstorage/Recycling Event.png...
Downloading: osf_subdirectory to /Users/t/dev/osf-cli/foo/osfstorage/osf_subdirectory...
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/t/dev/osf-cli/osfclient/__main__.py", line 31, in <module>
    main()
  File "/Users/t/dev/osf-cli/osfclient/__main__.py", line 28, in main
    args.func(args)
  File "/Users/t/dev/osf-cli/osfclient/__init__.py", line 42, in fetch
    response = oo.request_session.get(c.raw['links']['download'])
KeyError: 'download'
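
The KeyError suggests that folder entries carry no download link. A guard along these lines would skip them instead of crashing. The sketch assumes each entry is a dict shaped like the c.raw value in the traceback; the sample entries and URL are made up.

```python
def download_url(raw):
    """Return the download link of an entry, or None if it has none."""
    return raw.get("links", {}).get("download")


# Made-up entries: one file with a link, one folder without.
entries = [
    {"name": "README.md", "links": {"download": "https://example.org/readme"}},
    {"name": "osf_subdirectory", "links": {}},
]

for entry in entries:
    url = download_url(entry)
    if url is None:
        print("skipping (no download link):", entry["name"])
    else:
        print("would download:", entry["name"])
```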

explicit storage directories, or no?

Currently, python -m osfclient fetch foo 7g6vu puts files under subdirectories named after the storage.

% ls -R
dropbox         github          osfstorage

foo/dropbox:
Questionnaire.docx

foo/github:
README.md

foo/osfstorage:
Recycling Event.png

It looks to me like the Right Approach is to use the attr['materialized_path'] attribute for filenames, which is what ls currently does --

% python -m osfclient ls 7g6vu 
looking at id=7g6vu, title=testfoo
/Questionnaire.docx
/README.md
/Recycling Event.png
/osf_subdirectory/

but I don't know what happens if a file from one osfstorage steps on a file from another osfstorage. Any tips @felliott?

Link to github from RTD

The read the docs front page of osfclient should have a link to this GitHub repository to make it easy to find.

Some basic next steps

  • Make username/password configurable
  • fix osfstorage/subdirectory problem
  • put under test, + travis

More broadly:

  • figure out auth/credentials grabbing, maybe w/dot-file

Can not overwrite an existing file

osf upload can only create new files. If a file with the same name already exists, it can't overwrite it.

The request is similar (omit name parameter), the tricky part is determining whether the user meant to overwrite the file or if it was a typo/mistake. And how to let users set this from the CLI.

AttributeError: 'file' object has no attribute 'peek'

Here's the traceback:

lindsb@rrlogin:~$ osf -p wy5pj -u [email protected] upload 2016-12-15_usage_1.txt 2016-12-15_usage_1.txt
Traceback (most recent call last):
  File "/home/lindsb/usr/miniconda/bin/osf", line 11, in <module>
    load_entry_point('osfclient', 'console_scripts', 'osf')()
  File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/__main__.py", line 97, in main
    exit_code = args.func(args)
  File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/cli.py", line 214, in upload
    store.create_file(remote_path, fp, update=args.force)
  File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/models/storage.py", line 106, in create_file
    not_empty = fp.peek(1)
AttributeError: 'file' object has no attribute 'peek'

Any idea what's going on here?

Thanks in advance!
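
The paths in the first traceback do not show a Python version, but the 'file' type named in the error is Python 2's built-in file object, which has no peek() method. Opening the file with io.open in binary mode returns a buffered reader that provides peek() on both Python 2 and Python 3. The sketch below shows the emptiness check on its own; the helper name is_not_empty is hypothetical and this is not a patch against osfclient.

```python
import io
import os
import tempfile


def is_not_empty(path):
    """Return True if the file has at least one byte, using peek()."""
    # io.open returns an io.BufferedReader in binary mode on Python 2 and 3.
    with io.open(path, "rb") as fp:
        return bool(fp.peek(1))


# Demonstrate on a temporary file.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"usage data")
os.close(fd)
print(is_not_empty(demo_path))
os.remove(demo_path)
```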

Python 2 build for travis

Travis only tests python 3, not older versions. This means python3 only features (#46) are sneaking into the code. While there was no promise ever to also support python2 it seems like a good idea to do that where possible, or at least make some effort.

Step one of this is to add a python2 build to travis.

Fetch individual files

It isn't possible to download a single file with osf fetch; you always get the whole project.

Need to add a way to filter the files before downloading and some conventions on command line arguments.

Asynchronous?

hi,

i don't suppose you have any plans to make osf-cli asynchronous?

with thanks

Upload to non OSF Storage

Is it currently possible to upload a file to a storage option that isn't OSF Storage?

I'm using the figshare add-on and would like to be able to upload to its storage area, but can't figure out how to indicate this. osf ls shows the storage area and any files in it:

$ osf ls
figshare/data.csv
osfstorage/data2.csv
osfstorage/data.csv

But if I try to upload using:

osf upload data3.csv figshare/data3.csv

I end up creating a figshare directory in osfstorage:

ethan@gandalf:~/osftest$ osf ls
figshare/data.csv
osfstorage/figshare/data3.csv
osfstorage/data2.csv
osfstorage/data.csv
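
One possible convention would be to treat the first path component as the storage name when it matches a storage attached to the project, and to fall back to osfstorage otherwise. The sketch below only illustrates that parsing rule; the function name split_remote_path is hypothetical, and it does not describe what osf upload currently does.

```python
def split_remote_path(path, known_storages, default="osfstorage"):
    """Split 'figshare/data3.csv' into ('figshare', 'data3.csv').

    Paths whose first component is not a known storage go to the default.
    """
    clean = path.lstrip("/")
    first, _, rest = clean.partition("/")
    if rest and first in known_storages:
        return first, rest
    return default, clean


storages = {"figshare", "osfstorage"}
print(split_remote_path("figshare/data3.csv", storages))
print(split_remote_path("data3.csv", storages))
```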

Reduce duplication in tests

This is a bit of a vague issue. A lot of the test code duplicates things; this will lead (or already has led) to bugs and is hard to maintain.

We should explore options to reduce this without making the tests harder to understand (no one likes debugging the tests while debugging real code).

Rsync style --delete option?

Maybe this would be hard to do, but it would be great if there was a way to include a --delete option when doing a recursive upload so that if a file has been deleted locally, the file will also be deleted in the OSF project, maybe asking for permission for each file by default.
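
The core of such an option is a set difference between the remote and local file listings. The sketch below assumes both sides are available as relative paths using the same separator; the function name paths_to_delete is hypothetical, and the per-file confirmation prompt is left out.

```python
def paths_to_delete(local_paths, remote_paths):
    """Return remote paths with no local counterpart, sorted for stable output."""
    return sorted(set(remote_paths) - set(local_paths))


local = ["data.csv", "notes/readme.txt"]
remote = ["data.csv", "notes/readme.txt", "old/results.csv"]
for path in paths_to_delete(local, remote):
    print("would delete from OSF:", path)
```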

Funky progress bar stuff happens on clone

When I tried cloning a project, the progress bar was a little confusing, and then it showed up kind of unattractively on the next command line prompt when the clone finished, like so:

lindsb@rrlogin:scratch$ osf -u [email protected] -p wy5pj clone
Please input your password:
82files [03:41,  1.57s/files]
lindsb@rrlogin:scratch$ █████████████████████████| 755/755 [00:00<00:00, 3.46Mbytes/s]

During cloning, the progress bar stayed at 100% the whole time, and the number of files incremented above the progress bar without showing how many files would be downloaded in total.

I feel like however the download progress is reported for cloning should be consistent with progress reporting for the download and upload commands. Maybe even add a --verbose flag for cloning too, like mentioned in issue #93?

Add support for caching and etag

Problem: osfcli does not look at cache headers or etag information.

Add support for taking this information into account and save on network requests made.

Releasing

Thoughts on releasing/packaging:

  • set up publishing to pypi on tag
    • which pypi credentials to use?
  • set version info osfcli.__version__
  • put pypi specifiers in setup.py
  • polish docs -> #31

Handing out merge rights on first PR merge

There are various ways of working together and organising open-source projects. The one I believe in most is the "someone else has to merge your PR" rule. (assuming the basics like automatic tests etc are covered)

This creates a bit of a challenge for new projects, because often there are only one or two people. A popular way around it is to offer to bestow the privilege of merging PRs on anyone who has had a PR merged. I can imagine all sorts of ways that this could go wrong or lead to stress, but in my (limited) experience it works well in practice.

If there is no objection voiced here I will start following this for all future PRs. I wouldn't broadly advertise it, just ask people when their first PR is merged.

File sizes are zero

osf clone produces the right file tree but all files have zero size.

python 3, current master.

Looking into it at the moment.

Upload a whole (tree of) folders

osf upload only works with single files.

Add support to allow the user to upload a folder and its contents or even a whole tree of folders.

Better usage information

Can we do better with the error messages and help output provided?

Right now going to an empty directory and typing:

osf clone

yields:

usage: osf clone [-h] [output]
osf clone: error: You have to specify a project ID via the command line, configuration file or environment variable.

which is correct but it would be more helpful if we could also print out the usage information you get when typing osf -h. I think because we use sub parsers and subcommands you only get the usage for the subcommand and the "global" options like -p don't get mentioned.

The way to get started with this is to take a look at the argparse documentation https://docs.python.org/3/library/argparse.html to see if there is something in there to change this behaviour before we build our own.
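
As a starting point, argparse parsers expose format_help(), so the top-level help can be composed into the error text by hand. The sketch below builds a small stand-in parser with a global -p option and a clone subcommand; it is not osfclient's real parser, and the helper error_with_full_help is hypothetical.

```python
import argparse


def build_parsers():
    """Build a top-level parser with a global -p option and a clone subcommand."""
    parser = argparse.ArgumentParser(prog="osf")
    parser.add_argument("-p", "--project", help="OSF project ID")
    subparsers = parser.add_subparsers(dest="command")
    clone = subparsers.add_parser("clone", help="fetch all files of a project")
    clone.add_argument("output", nargs="?", help="local output directory")
    return parser, clone


def error_with_full_help(parser, message):
    """Compose an error text that includes the top-level help."""
    return "{}\nerror: {}\n".format(parser.format_help(), message)


parser, _clone = build_parsers()
print(error_with_full_help(parser, "You have to specify a project ID."))
```

The composed text mentions the global options such as -p, which the subcommand's own usage line leaves out.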

Add verbose flag for uploads and downloads

This is related to and mentioned in #47, but it would be nice to have a --verbose/-v flag to indicate which files have been uploaded or downloaded. This would be nice especially for recursive uploads. Maybe it would be worth having it available for single file uploads and downloads as well, so that if, for example, a bash script loops over files and uploads them individually, we could still see a printout of which file was just uploaded.

This could be as simple as just printing out file paths as they're uploaded, but it might be nice to show the % completion or even a small progress bar to the right of each file to show it's not hung when downloading a large file.

Allow -r flag to be used on single files

I tried looping over all files and directories in a directory and uploading each to OSF using

for i in *; do echo $i; osf upload -r $i /; done

But the upload fails on single files with the error message

RuntimeError: Expected source (boris_rod_lengths.dat) to be a directory when using recursive mode.

It would be nice if it could just ignore the -r flag on single files.
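
A sketch of the requested behaviour: when the source is a regular file, treat it as a one-element tree instead of raising. The function name files_to_upload is hypothetical, and it only collects (local path, relative path) pairs; the upload itself is left out.

```python
import os
import tempfile


def files_to_upload(source):
    """Return (local_path, relative_path) pairs for a file or a directory tree."""
    if os.path.isfile(source):
        # A single file is treated as a one-element tree instead of an error.
        return [(source, os.path.basename(source))]
    pairs = []
    for root, _dirs, names in os.walk(source):
        for name in sorted(names):
            local = os.path.join(root, name)
            pairs.append((local, os.path.relpath(local, source)))
    return pairs


# Demonstrate with a temporary directory holding one empty file.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "boris_rod_lengths.dat")
open(demo_file, "w").close()
print(files_to_upload(demo_file))
print(files_to_upload(demo_dir))
```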
