osfclient / osfclient
A Python library and command-line client for file storage on OSF
Home Page: http://osfclient.readthedocs.io/en/stable/
License: BSD 3-Clause "New" or "Revised" License
This project doesn't have a logo yet! No code writing required to contribute to this issue.
Collect ideas and examples here.
Maybe something that connects visually with https://osf.io/ so people realise that this is a tool for using with osf.io
The global sprint bright green and blue is pretty cool...maybe a bit hard on the eyes for the long term. Ideas welcome.
The documentation on RTD doesn't actually explain how to install osfclient. This should be fixed by adding pip install osfclient to the index and maybe also the user guide.
Currently,
python -m osfclient fetch foo 7g6vu
puts files under subdirectories named after the storage.
% ls -R
dropbox github osfstorage
foo/dropbox:
Questionnaire.docx
foo/github:
README.md
foo/osfstorage:
Recycling Event.png
It looks to me like the Right Approach is to use the attr['materialized_path'] attribute for filenames, which is what ls currently does:
% python -m osfclient ls 7g6vu
looking at id=7g6vu, title=testfoo
/Questionnaire.docx
/README.md
/Recycling Event.png
/osf_subdirectory/
but I don't know what happens if a file from one osfstorage steps on a file from another osfstorage. Any tips @felliott?
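To make the collision concern concrete, here is a minimal sketch that groups files by materialized path and reports any path claimed by more than one storage. The input shape (pairs of storage name and path) and the function name are assumptions for illustration, not what osfclient returns.

```python
from collections import defaultdict


def find_collisions(files):
    """Return {path: [storages]} for paths that appear in more than one storage.

    `files` is an iterable of (storage_name, materialized_path) tuples; this
    shape is hypothetical and only used for the sketch.
    """
    seen = defaultdict(list)
    for storage, path in files:
        seen[path].append(storage)
    return {path: storages for path, storages in seen.items() if len(storages) > 1}


listing = [
    ("dropbox", "/Questionnaire.docx"),
    ("github", "/README.md"),
    ("osfstorage", "/Recycling Event.png"),
    ("osfstorage", "/README.md"),  # made-up entry to force a clash
]
collisions = find_collisions(listing)
```

A check like this could run before writing anything to disk, so a clash can be reported instead of one file silently overwriting another.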
I installed osf-cli and have a .osfcli.config with this content:
[osf]
username = [email protected]
project = d3jx7
Why do I get an EOF occurred in violation of protocol (_ssl.c:661) error when I try the command:
osf -p d3jx7 list
and enter my osf password at the prompt as shown below?
LIB-2176:osf-cli nmunn$ osf -p d3jx7 list
Please input your password:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/2.7/bin/osf", line 11, in <module>
load_entry_point('osfclient', 'console_scripts', 'osf')()
File "/Users/nmunn/osf-cli/osfclient/__main__.py", line 97, in main
exit_code = args.func(args)
File "/Users/nmunn/osf-cli/osfclient/cli.py", line 182, in list_
project = osf.project(args.project)
File "/Users/nmunn/osf-cli/osfclient/api.py", line 24, in project
return Project(self._json(self._get(url), 200), self.session)
File "/Users/nmunn/osf-cli/osfclient/models/core.py", line 23, in _get
return self.session.get(url, *args, **kwargs)
File "/Users/nmunn/osf-cli/osfclient/models/session.py", line 43, in get
response = super(OSFSession, self).get(url, *args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/sessions.py", line 515, in get
return self.request('GET', url, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/sessions.py", line 502, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/sessions.py", line 612, in send
r = adapter.send(request, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: EOF occurred in violation of protocol (_ssl.c:661)
LIB-2176:osf-cli nmunn$
Sorry to raise another issue, but I really like the idea of having this command-line client and I hope the feedback is helpful. When I try to access a private project, I get a messy traceback. I see that this traceback is caused by a 401 message from osf.io. Is there a way to tell that the user is trying to access a private repo and tell that to the user rather than spitting out a traceback, or is osf.io pretty opaque about that?
(osf3) lindsb@rrlogin:osf-cli$ osf -p wy5pj ls
Traceback (most recent call last):
File "/home/lindsb/usr/miniconda/envs/osf3/bin/osf", line 11, in <module>
load_entry_point('osfclient', 'console_scripts', 'osf')()
File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/__main__.py", line 97, in main
exit_code = args.func(args)
File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/cli.py", line 182, in list_
project = osf.project(args.project)
File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/api.py", line 24, in project
return Project(self._json(self._get(url), 200), self.session)
File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/models/core.py", line 23, in _get
return self.session.get(url, *args, **kwargs)
File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/models/session.py", line 45, in get
raise UnauthorizedException()
osfclient.exceptions.UnauthorizedException
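Since the traceback ends in a dedicated UnauthorizedException, one option is to catch it at the top level and print a short message instead. The sketch below uses a stand-in exception class so it runs on its own; run_command and the message wording are hypothetical, and the real handler would catch the class from osfclient.exceptions.

```python
import sys


class UnauthorizedException(Exception):
    """Stand-in for osfclient.exceptions.UnauthorizedException."""


def run_command(func, args, stderr=sys.stderr):
    """Run a CLI handler and turn an auth failure into a short message."""
    try:
        return func(args)
    except UnauthorizedException:
        # Hypothetical wording; -u is the username flag used elsewhere here.
        stderr.write("Not authorized to access this project. "
                     "If it is private, pass a username with -u.\n")
        return 1
```

Returning a non-zero exit code keeps shell scripts able to detect the failure without the traceback.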
After our discussion today @betatim:
Somehow running py.test still fails for me locally, even when there's no .osfcli.config present in the osf-cli folder. I tried a fresh clone of this repo and it gives 6 failed, 60 passed in 1.76 seconds, and in all 6 cases the error is the following:
AttributeError: <module 'osfclient.cli' from '…/osf-cli/osfclient/cli.py'> does not have the attribute 'open'
I've tested on Python 3.4.4 and Python 2.7.12.
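This message is what mock typically raises when a test patches a module-level open that the module never defined itself; unittest.mock only started adding builtins automatically in Python 3.5, which would fit the versions listed. A minimal sketch of the usual workaround, create=True, assuming that is the cause here. The fake_cli module and read_config function are hypothetical stand-ins, not osfclient code.

```python
import types
from unittest.mock import mock_open, patch

# Build a throwaway module whose function calls the builtin open().
fake_cli = types.ModuleType("fake_cli")
source = (
    "def read_config(path):\n"
    "    with open(path) as fp:\n"
    "        return fp.read()\n"
)
exec(source, fake_cli.__dict__)


def read_with_fake_open(path, data):
    """Call read_config with open() replaced by a mock returning `data`."""
    # create=True lets patch add an attribute the module does not define.
    with patch.object(fake_cli, "open", mock_open(read_data=data), create=True):
        return fake_cli.read_config(path)
```

On Python 2 the same idea applies with the external mock package in place of unittest.mock.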
Add an osf init command that asks the user some questions and creates a .osfcli.config in the current directory.
This should make it easier for new users to set things up.
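A rough sketch of the file-writing half of such a command, using the [osf] layout with username and project keys that appears in the config example elsewhere in this tracker. The write_config name and its parameters are assumptions; the interactive questions are left out.

```python
import configparser
import os


def write_config(directory, username, project, filename=".osfcli.config"):
    """Write an [osf] section with username and project; return the file path."""
    config = configparser.ConfigParser()
    config["osf"] = {"username": username, "project": project}
    path = os.path.join(directory, filename)
    with open(path, "w") as fp:
        config.write(fp)
    return path
```

Only the username and project ID are stored; the password is deliberately not part of the file.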
Can we do better with the error messages and help output provided?
Right now going to an empty directory and typing:
osf clone
yields:
usage: osf clone [-h] [output]
osf clone: error: You have to specify a project ID via the command line, configuration file or environment variable.
which is correct, but it would be more helpful if we could also print out the usage information you get when typing osf -h. I think that because we use subparsers and subcommands you only get the usage for the subcommand, and the "global" options like -p don't get mentioned.
The way to get started with this is to take a look at the argparse documentation https://docs.python.org/3/library/argparse.html to see if there is something in there to change this behaviour before we build our own.
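One possible direction, sketched under assumptions: keep a reference to the top-level parser and append its format_help() output to the error text, so the global options show up too. The parser layout below is a simplified guess at the real CLI, and missing_project_message is a hypothetical helper.

```python
import argparse


def build_parser():
    """Simplified stand-in for the osf parser with one subcommand."""
    parser = argparse.ArgumentParser(prog="osf")
    parser.add_argument("-p", "--project", help="OSF project ID")
    subparsers = parser.add_subparsers(dest="command")
    clone = subparsers.add_parser("clone", help="copy a project locally")
    clone.add_argument("output", nargs="?")
    return parser


def missing_project_message(parser):
    """Build an error text that also includes the top-level help."""
    return ("You have to specify a project ID via the command line, "
            "configuration file or environment variable.\n\n"
            + parser.format_help())
```

The subcommand handler could print this text and exit non-zero instead of calling the subparser's own error method.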
Add more details on the particular style of how to work together/make contributions.
Topics:
mock
CONTRIBUTING.md
is probably the place for this information.
https://github.com/elaine84/jupyter-synchronized-folders is an example of using a different storage to sync work. This isn't quite providing a new storage backend for jupyter. Using a sync'ed folder has the advantage that more than just the notebook is synced.
Worth investigating if/how we could have a sync'ed folder backed by OSF.
More broadly:
There are various ways of working together and organising open-source projects. The one I believe in most is the "someone else has to merge your PR" rule. (assuming the basics like automatic tests etc are covered)
This creates a bit of a challenge for new projects, because often there are only one or two people ... a popular way around it is to offer to bestow the privilege of merging PRs on anyone who has a PR merged. I can imagine all sorts of ways that this could go wrong or lead to stress etc. but in my (limited) experience it works well in practice.
If there is no objection voiced here I will start following this for all future PRs. I wouldn't broadly advertise it, just ask people when their first PR is merged.
Related to #18: it would be useful to have a simple command that is the opposite of osf clone. I thought about integrating it into osf upload -r but I think the semantics are sufficiently different (for example you don't really want to specify a destination storage, it should use the same as a file currently uses).
Also need to think about how to handle the case when a local file is different from what is on OSF. There are versions on OSF so we might be able to figure out something smarter than just "remote file is different, overwrite?".
Maybe this is the time to switch to osf clone to make the initial copy, osf fetch to update local from OSF and osf push to update OSF from local, following the git names/pattern a bit.
Maybe this would be hard to do, but it would be great if there was a way to include a --delete option when doing a recursive upload so that if a file has been deleted locally, the file will also be deleted in the OSF project, maybe asking for permission for each file by default.
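The core of a --delete option is a set difference between remote and local paths plus a per-file confirmation. A minimal sketch, assuming both sides can be listed as relative path strings; the function names are hypothetical and no OSF calls are made.

```python
def remote_only(local_paths, remote_paths):
    """Return remote paths with no local counterpart, sorted for stable output."""
    return sorted(set(remote_paths) - set(local_paths))


def confirm_deletions(candidates, ask=input):
    """Ask about each path in turn; keep only the ones the user approves."""
    approved = []
    for path in candidates:
        answer = ask("Delete remote file {}? [y/N] ".format(path))
        if answer.strip().lower() == "y":
            approved.append(path)
    return approved
```

Passing ask as a parameter makes the prompt easy to replace in tests or to bypass with a future "yes to all" flag.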
All tests here fake the OSF in order to run fast and not rely on some project on the OSF that needs resetting. This means the build could be green despite the code not working in practice anymore because something has changed on the OSF side.
We should add some (light?) tests using one of the HTTP replay/recorder libraries to check we are still compatible with the OSF, and give feedback to the OSF when something that used to work breaks.
Travis only tests python 3, not older versions. This means python3 only features (#46) are sneaking into the code. While there was no promise ever to also support python2 it seems like a good idea to do that where possible, or at least make some effort.
Step one of this is to add a python2 build to travis.
https://raw.githubusercontent.com/dib-lab/osf-cli/master/docs/index.rst
We are missing that crucial _ at the end of the OSF introductory materials and GitHub repository mentions that are meant to be links. Right now reStructuredText doesn't recognise them as links.
If a username is given but OSF_PASSWORD isn't set, prompt the user for it.
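A small sketch of that fallback using the standard getpass module. The get_password name and its injectable environ and prompt parameters are assumptions made so the logic can be tested without a terminal.

```python
import getpass
import os


def get_password(username, environ=os.environ, prompt=getpass.getpass):
    """Return the password from OSF_PASSWORD, prompting only as a fallback."""
    if username is None:
        # Anonymous access: nothing to ask for.
        return None
    password = environ.get("OSF_PASSWORD")
    if password is None:
        password = prompt("Please input your password: ")
    return password
```

getpass does not echo the typed characters, so the password does not end up on screen or in the shell history.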
hi,
i don't suppose you have any plans to make osf-cli asynchronous?
with thanks
I tried looping over all files and directories in a directory and uploading each to OSF using
for i in *; do echo $i; osf upload -r $i /; done
But the upload fails on single files with the error message
RuntimeError: Expected source (boris_rod_lengths.dat) to be a directory when using recursive mode.
It would be nice if it could just ignore the -r flag on single files.
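A sketch of the requested behaviour: check whether the source is a regular file first, and only walk the tree when it is a directory. files_to_upload is a hypothetical helper, and the error message for a directory without -r is made up for the example.

```python
import os


def files_to_upload(source, recursive=False):
    """Return the list of local files to upload for `source`."""
    if os.path.isfile(source):
        # A single file is uploaded as-is, whether or not -r was given.
        return [source]
    if not recursive:
        # Hypothetical message, not the one osfclient prints.
        raise RuntimeError("{} is a directory; use -r to upload it.".format(source))
    found = []
    for root, _dirs, names in os.walk(source):
        for name in sorted(names):
            found.append(os.path.join(root, name))
    return found
```

With this shape the shell loop over * works for a mix of files and directories.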
The current setup is pretty involved, especially for newcomers or people who only contribute infrequently.
We should try to streamline things a bit (see #42) as well as write a short guide for adding new tests and how to use mock.
Running just osf should result in some output like osf -h. Currently it is blank.
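With argparse on Python 3 the subcommand is optional by default, so a bare invocation parses cleanly and does nothing. A minimal sketch of printing the help in that case; the parser here is a simplified stand-in and the exit codes are an assumption.

```python
import argparse
import sys


def main(argv, out=sys.stdout):
    """Parse argv; print the top-level help when no subcommand is given."""
    parser = argparse.ArgumentParser(prog="osf")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.add_parser("ls", help="list files")
    args = parser.parse_args(argv)
    if args.command is None:
        parser.print_help(out)
        return 1
    return 0
```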
this worked nicely:
% for i in anno.*.tar.gz; do osf -p jzpxn -u [email protected] upload $i osfstorage/prokka/$i; done
but was entirely silent, which is a frustrating default :)
New subcommand to get the URL from which a file in a project can be downloaded with wget etc., for easy sharing with others or use in setups without osfclient.
Maybe like osf url <remote_path>
which then prints out the URL to fetch the file.
Can we generate a link containing a "secret" so that files from private repos can be shared as well?
osf upload only works with single files.
Add support to allow the user to upload a folder and its contents or even a whole tree of folders.
is this a py3 issue?
# after running sudo pip install -e .
% osf
Traceback (most recent call last):
File "/usr/local/bin/osf", line 9, in <module>
load_entry_point('osfclient', 'console_scripts', 'osf')()
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 542, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2569, in load_entry_point
return ep.load()
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2229, in load
return self.resolve()
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 2235, in resolve
module = __import__(self.module_name, fromlist=['__name__'], level=0)
File "/home/titus/osf-cli/osfclient/__init__.py", line 2, in <module>
from .api import OSF
File "/home/titus/osf-cli/osfclient/api.py", line 1, in <module>
from .models import OSFCore
File "/home/titus/osf-cli/osfclient/models/__init__.py", line 6, in <module>
from .core import OSFCore
File "/home/titus/osf-cli/osfclient/models/core.py", line 31
def _get_attribute(self, json, *keys, default=None):
^
SyntaxError: invalid syntax
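The caret points at a keyword-only argument after *keys, which is Python 3 syntax and a SyntaxError on 2.7. A version that parses on both pulls default out of **kwargs instead. The function body below is a guess at what such a helper does (walk nested dictionaries), not the actual osfclient code.

```python
def get_attribute(json, *keys, **kwargs):
    """Walk nested dict `json` along `keys`; return `default` if a key is missing."""
    # Emulate `def f(json, *keys, default=None)` in a Python 2 compatible way.
    default = kwargs.pop("default", None)
    if kwargs:
        raise TypeError("unexpected keyword arguments: {}".format(sorted(kwargs)))
    value = json
    for key in keys:
        try:
            value = value[key]
        except (KeyError, TypeError):
            return default
    return value
```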
To reduce the number of HTTP requests we make, check out https://developer.osf.io/#Introduction_embedding
can we use it to store SBTs and minhashes? (c.f. https://github.com/dib-lab/sourmash/)
can it be used as a Jupyter Notebook file system?
what about a purely JavaScript upload/download app?
Problem: osfcli does not look at cache headers or etag information.
Add support for taking this information into account and save on network requests made.
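A minimal in-memory sketch of ETag handling: remember the ETag and body per URL, send If-None-Match on the next request, and reuse the stored body on a 304 Not Modified. The ETagCache class is hypothetical, no network calls are made, and whether the OSF API returns ETags for a given endpoint is an assumption to verify.

```python
class ETagCache:
    """Remember the last ETag and body seen for each URL."""

    def __init__(self):
        self._entries = {}

    def request_headers(self, url):
        """Headers to send so the server can answer 304 Not Modified."""
        entry = self._entries.get(url)
        return {"If-None-Match": entry[0]} if entry else {}

    def store(self, url, status, headers, body):
        """Record a response and return the body the caller should use."""
        if status == 304:
            # Nothing changed on the server: reuse the cached body.
            return self._entries[url][1]
        etag = headers.get("ETag")
        if etag is not None:
            self._entries[url] = (etag, body)
        return body
```

The session's get method could call request_headers before sending and store after receiving.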
Add support for a config file in the local directory, .osf-storage.config, to store the username and project ID.
We should not offer any support that would encourage people to store secrets in a file/accidentally have it end up in some kind of history.
This is a bit of a vague issue. A lot of the test code duplicates things; this will lead (and has led) to bugs and is hard to maintain.
We should explore options to reduce this without making the tests harder to understand (no one likes debugging the tests while debugging real code)
It isn't possible to download a single file with osf fetch; you always get the whole project.
Need to add a way to filter the files before downloading and some conventions on command line arguments.
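One possible convention is shell-style patterns matched against the remote path. A sketch using the standard fnmatch module; select_paths is a hypothetical helper, and note that * in fnmatch also matches across "/".

```python
import fnmatch


def select_paths(paths, patterns):
    """Keep paths matching at least one shell-style pattern; no patterns keeps all."""
    if not patterns:
        return list(paths)
    return [path for path in paths
            if any(fnmatch.fnmatchcase(path, pattern) for pattern in patterns)]
```

An exact remote path is also a valid pattern, so fetching a single file falls out of the same mechanism.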
The list of recognised storage backends in osfclient is missing a lot of valid ones.
I think this is a good place to start finding out the name of all the valid storage providers (what osfclient calls storage backends): https://waterbutler.readthedocs.io/en/latest/providers.html
Thoughts on releasing/packaging:
osfcli.__version__
setup.py
osf clone produces the right file tree but all files have zero size.
python 3, current master.
Looking into it at the moment.
ref #2 (comment), it's trying to get a download link for a directory.
python -m osfclient fetch foo 7g6vu
looking at: testfoo
Downloading: Questionnaire.docx to /Users/t/dev/osf-cli/foo/dropbox/Questionnaire.docx...
Downloading: README.md to /Users/t/dev/osf-cli/foo/github/README.md...
Downloading: Recycling Event.png to /Users/t/dev/osf-cli/foo/osfstorage/Recycling Event.png...
Downloading: osf_subdirectory to /Users/t/dev/osf-cli/foo/osfstorage/osf_subdirectory...
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/Users/t/dev/osf-cli/osfclient/__main__.py", line 31, in <module>
main()
File "/Users/t/dev/osf-cli/osfclient/__main__.py", line 28, in main
args.func(args)
File "/Users/t/dev/osf-cli/osfclient/__init__.py", line 42, in fetch
response = oo.request_session.get(c.raw['links']['download'])
KeyError: 'download'
Here's the traceback:
lindsb@rrlogin:~$ osf -p wy5pj -u [email protected] upload 2016-12-15_usage_1.txt 2016-12-15_usage_1.txt
Traceback (most recent call last):
File "/home/lindsb/usr/miniconda/bin/osf", line 11, in <module>
load_entry_point('osfclient', 'console_scripts', 'osf')()
File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/__main__.py", line 97, in main
exit_code = args.func(args)
File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/cli.py", line 214, in upload
store.create_file(remote_path, fp, update=args.force)
File "/mnt/rrio1/home/lindsb/usr/osf-cli/osfclient/models/storage.py", line 106, in create_file
not_empty = fp.peek(1)
AttributeError: 'file' object has no attribute 'peek'
Any idea what's going on here?
Thanks in advance!
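The 'file' in the message is the Python 2 built-in file type, which has no peek method; peek belongs to io.BufferedReader. A portable way to test for an empty file is to read one byte and seek back, sketched below. is_empty is a hypothetical helper and assumes the file object is seekable.

```python
import io


def is_empty(fp):
    """Return True if no bytes remain to be read from a seekable file object."""
    position = fp.tell()
    first = fp.read(1)
    # Restore the position so the later upload still sees the whole file.
    fp.seek(position)
    return len(first) == 0
```

Opening the file with io.open(path, "rb") instead of the built-in open would be another way to get an object with peek on Python 2.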
Link to https://cos.io/our-products/open-science-framework/ in addition to osf.io? Might give a better idea to total newcomers what this is all about?
...and run by various people to make sure it's clear!
I'll take this on.
Welcome #mozsprint'ers!
Let's try and close all the issues for https://github.com/dib-lab/osf-cli/milestone/1 and get v0.1 shipped.
If you have questions or comments post here.
When I tried cloning a project, the progress bar was a little confusing, and then it showed up kind of unattractively on the next command line prompt when the clone finished, like so:
lindsb@rrlogin:scratch$ osf -u [email protected] -p wy5pj clone
Please input your password:
82files [03:41, 1.57s/files]
lindsb@rrlogin:scratch$ █████████████████████████| 755/755 [00:00<00:00, 3.46Mbytes/s]
During cloning, the progress bar stayed at 100% the whole time, and the number of files incremented above the progress bar without showing how many files would be downloaded in total.
I feel like however the download progress is reported for cloning should be consistent with progress reporting for the download and upload commands. Maybe even add a --verbose flag for cloning too, as mentioned in issue #93?
osf upload can only create new files; if a file with the same name already exists it can't overwrite it.
The request is similar (omit name parameter), the tricky part is determining whether the user meant to overwrite the file or if it was a typo/mistake. And how to let users set this from the CLI.
The usage examples are out of date since #25 as the project ID is now a global parameter.
Is it currently possible to upload a file to a storage option that isn't OSF Storage?
I'm using the figshare add-on and would like to be able to upload to its storage area, but can't figure out how to indicate this. osf ls shows the storage area and any files in it:
$ osf ls
figshare/data.csv
osfstorage/data2.csv
osfstorage/data.csv
But if I try to upload using:
osf upload data3.csv figshare/data3.csv
I end up creating a figshare directory in osfstorage:
ethan@gandalf:~/osftest$ osf ls
figshare/data.csv
osfstorage/figshare/data3.csv
osfstorage/data2.csv
osfstorage/data.csv
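One way the CLI could interpret the remote path is to treat the first component as a storage name when it matches a storage attached to the project, and fall back to osfstorage otherwise. A sketch under that assumption; split_storage and its parameters are hypothetical.

```python
def split_storage(remote_path, known_storages, default="osfstorage"):
    """Split 'figshare/data3.csv' into ('figshare', 'data3.csv')."""
    path = remote_path.lstrip("/")
    first, _sep, rest = path.partition("/")
    if rest and first in known_storages:
        return first, rest
    # No recognised storage prefix: keep the whole path in the default storage.
    return default, path
```

The trade-off is that a folder in osfstorage named after another storage becomes ambiguous and would need an explicit osfstorage/ prefix.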
Uploading a file of size zero silently fails.
This should at the very least create an error message or ideally an empty file on the storage.
The Read the Docs front page of osfclient should have a link to this GitHub repository to make it easy to find.
This is related to and mentioned in #47, but it would be nice to have a --verbose/-v flag to indicate which files have been uploaded or downloaded. This would be nice especially for recursive uploads. Maybe it would be worth having it available for single file uploads and downloads as well, so that if, for example, a bash script loops over files and uploads them individually, we could still see a printout of which file was just uploaded.
This could be as simple as just printing out file paths as they're uploaded, but it might be nice to show the % completion or even a small progress bar to the right of each file to show it's not hung when downloading a large file.
Following on from this discussion we could use the config file to specify how to map remote storage to local directories (for use in osf clone).
Currently we create a subdirectory for each storage, which can be tedious especially if you only have one storage. Something along the lines of:
[osf]
osfstorage = local_path
githubstorage = other_local_path
with defaults equal to the current behaviour.
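Reading such a mapping is a one-liner with configparser, using the storage name itself as the fallback so the default matches the current behaviour. A sketch; local_dir_for is a hypothetical helper and the key names follow the proposal above.

```python
import configparser


def local_dir_for(config_text, storage):
    """Return the local directory for `storage`, defaulting to its own name."""
    config = configparser.ConfigParser()
    config.read_string(config_text)
    # fallback covers both a missing [osf] section and a missing key.
    return config.get("osf", storage, fallback=storage)
```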
Switch setup.py to read in the README as long description so it appears on PyPI's page for the project.
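A sketch of the helper setup.py would need. The README filename is an assumption, and io.open is used so the encoding argument also works on Python 2.

```python
import io
import os


def read_long_description(directory, filename="README.md"):
    """Return the README contents for use as setup()'s long_description."""
    with io.open(os.path.join(directory, filename), encoding="utf-8") as fp:
        return fp.read()
```

In setup.py this would be passed as long_description=read_long_description(here), with here being the directory that contains setup.py.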