papers's People

Contributors

aiotter, elonhub, flaminglasrswrd, gl-yziquel, hugovk, kirk86, malfatti, maxbachmann, perrette, s3170, seblemaguer

papers's Issues

tests: $HOME not set?

After #53 -- which is a good idea! -- I see:

(python311) → master Work/papers tox -e py311                                                    8:07:09
GLOB sdist-make: /home/boyan/boyanshouse/Vazhno/Work/papers/setup.py
py311 create: /home/boyan/boyanshouse/Vazhno/Work/papers/.tox/py311
py311 installdeps: bibtexparser, scholarly, crossrefapi, rapidfuzz, unidecode, normality, pytest, pytest-cov
py311 inst: /home/boyan/boyanshouse/Vazhno/Work/papers/.tox/.tmp/package/1/papers-cli-2.3.dev130+gfd9291b.zip
py311 installed: alabaster==0.7.13,anyio==3.6.2,arrow==1.2.3,async-generator==1.10,attrs==23.1.0,Babel==2.12.1,banal==1.0.6,beautifulsoup4==4.12.2,bibtexparser==1.4.0,certifi==2022.12.7,chardet==5.1.0,charset-normalizer==3.1.0,coverage==7.2.4,crossrefapi==1.5.0,Deprecated==1.2.13,docutils==0.18.1,exceptiongroup==1.1.1,fake-useragent==1.1.3,free-proxy==1.1.1,h11==0.14.0,httpcore==0.17.0,httpx==0.24.0,idna==3.4,imagesize==1.4.1,iniconfig==2.0.0,Jinja2==3.1.2,lxml==4.9.2,MarkupSafe==2.1.2,normality==2.4.0,outcome==1.2.0,packaging==23.1,papers-cli @ file:///home/boyan/boyanshouse/Vazhno/Work/papers/.tox/.tmp/package/1/papers-cli-2.3.dev130%2Bgfd9291b.zip,pluggy==1.0.0,Pygments==2.15.1,pyparsing==3.0.9,PySocks==1.7.1,pytest==7.3.1,pytest-cov==4.0.0,python-dateutil==2.8.2,python-dotenv==1.0.0,rapidfuzz==3.0.0,requests==2.29.0,scholarly==1.7.11,selenium==4.9.0,six==1.16.0,sniffio==1.3.0,snowballstemmer==2.2.0,sortedcontainers==2.4.0,soupsieve==2.4.1,Sphinx==6.2.1,sphinx-rtd-theme==1.2.0,sphinxcontrib-applehelp==1.0.4,sphinxcontrib-devhelp==1.0.2,sphinxcontrib-htmlhelp==2.0.1,sphinxcontrib-jquery==4.1,sphinxcontrib-jsmath==1.0.1,sphinxcontrib-qthelp==1.0.3,sphinxcontrib-serializinghtml==1.1.5,text-unidecode==1.3,trio==0.22.0,trio-websocket==0.10.2,typing_extensions==4.5.0,Unidecode==1.3.6,urllib3==1.26.15,wrapt==1.15.0,wsproto==1.2.0
py311 run-test-pre: PYTHONHASHSEED='2774689394'
py311 run-test: commands[0] | pytest --cov=papers --cov-append --cov-report=term-missing -xv
========================================== test session starts ==========================================
platform linux -- Python 3.11.3, pytest-7.3.1, pluggy-1.0.0 -- /home/boyan/boyanshouse/Vazhno/Work/papers/.tox/py311/bin/python
cachedir: .tox/py311/.pytest_cache
rootdir: /home/boyan/boyanshouse/Vazhno/Work/papers
plugins: cov-4.0.0, anyio-3.6.2
collected 0 items / 1 error                                                                             

================================================ ERRORS =================================================
__________________________________ ERROR collecting tests/test_add.py ___________________________________
../../../miniconda3/envs/python311/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1206: in _gcd_import
    ???
<frozen importlib._bootstrap>:1178: in _find_and_load
    ???
<frozen importlib._bootstrap>:1128: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
    ???
<frozen importlib._bootstrap>:1206: in _gcd_import
    ???
<frozen importlib._bootstrap>:1178: in _find_and_load
    ???
<frozen importlib._bootstrap>:1149: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:690: in _load_unlocked
    ???
<frozen importlib._bootstrap_external>:940: in exec_module
    ???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
    ???
tests/__init__.py:3: in <module>
    sp.check_call('git config --list | grep user.name || git config --global user.name "Papers Tests"', shell=True)
../../../miniconda3/envs/python311/lib/python3.11/subprocess.py:413: in check_call
    raise CalledProcessError(retcode, cmd)
E   subprocess.CalledProcessError: Command 'git config --list | grep user.name || git config --global user.name "Papers Tests"' returned non-zero exit status 128.
-------------------------------------------- Captured stderr --------------------------------------------
fatal: $HOME not set
======================================== short test summary info ========================================
ERROR tests/test_add.py - subprocess.CalledProcessError: Command 'git config --list | grep user.name || git config --global us...
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
=========================================== 1 error in 0.22s ============================================
ERROR: InvocationError for command /home/boyan/boyanshouse/Vazhno/Work/papers/.tox/py311/bin/pytest --cov=papers --cov-append --cov-report=term-missing -xv (exited with code 1)
________________________________________________ summary ________________________________________________
ERROR:   py311: commands failed

Not sure what the thing is here...
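One possible reading of the log: HOME is not reaching the test process (tox filters environment variables), so `git config --global` has nowhere to write. A minimal sketch of a guard, assuming a hypothetical helper `git_env` (not part of papers) and assuming that a throwaway directory is acceptable as HOME during tests:

```python
import os
import subprocess
import tempfile


def git_env(environ=None):
    """Return a copy of the environment in which HOME is always defined."""
    env = dict(os.environ if environ is None else environ)
    if not env.get("HOME"):
        # Assumption: a temporary directory is good enough as HOME for tests.
        env["HOME"] = tempfile.mkdtemp(prefix="papers-home-")
    return env


def ensure_git_user(name="Papers Tests", environ=None):
    """Set a global git user.name only if none is configured yet."""
    env = git_env(environ)
    current = subprocess.run(
        ["git", "config", "--get", "user.name"],
        env=env, capture_output=True, text=True,
    )
    if current.returncode != 0:
        subprocess.check_call(
            ["git", "config", "--global", "user.name", name], env=env
        )
    return env
```

Passing HOME through in the tox configuration would be the other obvious route; the sketch only shows how the test setup could protect itself.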

DOI parsing fails in a few cases

The current method for retrieving the DOI searches the first two pages with regular expressions and keeps the first match.

Accepted prefixes are (lower or upper case):

'doi:', 'doi: ', 'doi ', 'dx\.doi\.org/', 'doi/'

DOI itself is searched as:

r"10\.\d\d\d\d/.*?"

And is expected to finish with:

r"[, \n]"

The method fails in a few cases:

  • when DOI spreads over two lines (e.g. here)
  • when other DOIs appear before the actual paper's DOI, for example here

These could be solved by more permissive DOI parsing, but the parsing stays conservative for now until a good solution is found.

Nevertheless, existing edits / fixes currently include:

  • an underscore sometimes gets converted into a space by pdftotext, so we also detect an ending made of any space followed by a digit. This solves at least one case.
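The conservative rule described above (prefix, then the DOI, then a terminator) can be sketched as a single pattern. This is an illustration assembled from the pieces quoted in this issue, not the actual code in papers; `find_doi` is a hypothetical name, and the extra "space followed by a digit" ending is left out.

```python
import re

# Prefixes, DOI body and terminator as quoted above, case-insensitive.
DOI_PATTERN = re.compile(
    r"(?:doi: |doi:|doi |dx\.doi\.org/|doi/)"  # accepted prefixes
    r"(10\.\d\d\d\d/.*?)"                      # the DOI itself (lazy match)
    r"[, \n]",                                 # expected ending
    re.IGNORECASE,
)


def find_doi(text):
    """Return the first DOI found in text, or None."""
    match = DOI_PATTERN.search(text)
    return match.group(1) if match else None


# A DOI on a single line is found as expected.
print(find_doi("DOI: 10.5194/esd-4-11-2013\n"))
# A DOI split over two lines is cut at the line break (first failure case).
print(find_doi("doi:10.1016/j.cis.\n2013.06.006 "))
```

The second call returns only the part before the line break, which is the first failure mode listed above.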

Is there any way to get the new full path of a moved and renamed PDF file?

I use zathura for PDF viewing and added some bindings for using papers inside zathura. When I add a paper, its location changes and I have to re-open it manually. Is there any way to get the new location after adding a paper, so that re-opening can be automated?

P.S. my bind is:
map <C-b> feedkeys ":exec papers add -r $FILE<Return>"

attribute loads is disappearing in module bibtexparser

Hi.

I tend to like living on git HEAD. I therefore installed bibtexparser from source, i.e. version 2.0.0b3. Launching the papers cli then yields:

AttributeError: module 'bibtexparser' has no attribute 'loads'

Just warning you that it will come to bite you when you upgrade bibtexparser to the planned version.
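A minimal sketch of a compatibility shim, assuming that bibtexparser 2.x exposes `parse_string` as its string-parsing entry point (worth checking against the version actually installed). `pick_loader` is a hypothetical helper, and the two namespaces are stand-ins so the sketch runs without bibtexparser.

```python
import types


def pick_loader(module):
    """Return a callable that parses a BibTeX string with either API."""
    loads = getattr(module, "loads", None)
    if loads is not None:
        return loads  # bibtexparser 1.x style
    parse_string = getattr(module, "parse_string", None)
    if parse_string is not None:
        return parse_string  # assumed bibtexparser 2.x style
    raise ImportError("unsupported bibtexparser: no loads or parse_string")


# Stand-ins for the two API generations (not the real library).
fake_v1 = types.SimpleNamespace(loads=lambda s: ("v1", s))
fake_v2 = types.SimpleNamespace(parse_string=lambda s: ("v2", s))

print(pick_loader(fake_v1)("@article{a}"))
print(pick_loader(fake_v2)("@article{a}"))
```

Note that the objects returned by the two real APIs differ, so code that reads the parsed entries would likely need adapting as well.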

Controlling key and filename fields

Hey! Thanks for sharing this very nice software!

I'm trying to generate entries with a key format like perrette2013scaling instead of Perrette_2013 and to rename the file as Perrette2013_AScalingApproachToProjectRegionalSeaLevelRiseAndItsUncertainties.pdf. Is such control implemented yet? If not, I could play with this and try to implement --key-fields and --filename-fields arguments to some subcommands, where the default would be --key-fields author,_,year :) Would you accept pull requests?

Cheers,
Malfatti
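To make the request concrete, here is a rough sketch of the two formats asked for. The helper names, the stop-word list and the author value are all illustrative assumptions, not existing papers code or options.

```python
import re

# Hypothetical stop-word list used to pick the first "meaningful" title word.
STOPWORDS = {"a", "an", "the", "of", "on", "in", "to", "and", "for"}


def first_author_lastname(author_field):
    """Last name of the first author in a BibTeX 'author' field."""
    first = author_field.split(" and ")[0].strip()
    if "," in first:
        return first.split(",")[0].strip()  # "Last, First" form
    return first.split()[-1]                # "First Last" form


def make_key(entry):
    """Key of the form perrette2013scaling."""
    last = first_author_lastname(entry["author"]).lower()
    words = re.findall(r"[A-Za-z]+", entry["title"])
    word = next((w.lower() for w in words if w.lower() not in STOPWORDS), "")
    return f"{last}{entry['year']}{word}"


def make_filename(entry):
    """File name of the form Perrette2013_TitleInCamelCase.pdf."""
    last = first_author_lastname(entry["author"])
    words = re.findall(r"[A-Za-z]+", entry["title"])
    title = "".join(w.capitalize() for w in words)
    return f"{last}{entry['year']}_{title}.pdf"


entry = {
    "author": "Perrette, M.",
    "year": "2013",
    "title": "A scaling approach to project regional sea level rise "
             "and its uncertainties",
}
print(make_key(entry))
print(make_filename(entry))
```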

Strange behavior with pdf-folder and bibtex folder

my config:

{
  "bibtex": "/home/user/Library/Bibtex/lib.bib",
  "filesdir": "/home/user/Library/PDFs",
  "git": true,
  "gitdir": "/home/user/Library"
}

If I try to use papers add file.pdf, I get this error:

Traceback (most recent call last):
  File "/usr/bin/papers", line 5, in <module>
    papers.bib.main()
  File "/usr/lib/python3.9/site-packages/papers/bib.py", line 1350, in main
    check_install() and addcmd(o)
  File "/usr/lib/python3.9/site-packages/papers/bib.py", line 987, in addcmd
    savebib(my, o)
  File "/usr/lib/python3.9/site-packages/papers/bib.py", line 898, in savebib
    config.gitcommit()
  File "/usr/lib/python3.9/site-packages/papers/config.py", line 115, in gitcommit
    if not os.path.samefile(self.bibtex, target):
  File "/usr/lib/python3.9/genericpath.py", line 101, in samefile
    s2 = os.stat(f2)
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/Library/lib.bib'

But the bib file is located at /home/user/Library/Bibtex/lib.bib.
If I add a symlink from /home/user/Library/Bibtex/lib.bib -> /home/user/Library/lib.bib,
it works fine.

"list index out of range" issue stops multi-file adds

Absolutely loving this project. I have several folders of many pdfs I'm trying to add to papers, using

papers add -rc *.pdf

papers is hitting an error, with output

Traceback (most recent call last):
  File "/usr/local/bin/papers", line 5, in <module>
    papers.bib.main()
  File "/usr/local/lib/python3.4/dist-packages/papers/bib.py", line 1343, in main
    check_install() and addcmd(o)
  File "/usr/local/lib/python3.4/dist-packages/papers/bib.py", line 969, in addcmd
    **kw)
  File "/usr/local/lib/python3.4/dist-packages/papers/bib.py", line 390, in add_pdf
    entry = bib.entries[0]
IndexError: list index out of range

And when it does this, none of the references have been added to the .bib file.

This also happens if I include --ignore-errors as an argument, or if I don't use the -rc options. I've used setup to set up a config, and entering papers by hand works, as have several folders of just a few papers.

Suggestions?

Thanks!
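A sketch of the behaviour one might expect from a batch add: each file is handled on its own, and a failure on one file is recorded rather than aborting the whole run. `add_many` and the `add_one` callback are hypothetical names, not the papers API.

```python
def add_many(paths, add_one):
    """Call add_one on each path, collecting failures instead of aborting."""
    added, failed = [], []
    for path in paths:
        try:
            added.append(add_one(path))
        except (IndexError, ValueError) as exc:
            # e.g. no BibTeX entry could be built for this PDF
            failed.append((path, repr(exc)))
    return added, failed


def fake_add(path):
    """Stand-in for the real add step; fails on one specific file."""
    if path == "bad.pdf":
        return [][0]  # mimics `bib.entries[0]` on an empty list
    return path.replace(".pdf", "")


added, failed = add_many(["a.pdf", "bad.pdf", "b.pdf"], fake_add)
print(added)
print(failed)
```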

FileNotFoundError: [WinError 2] The system cannot find the file specified

When I call python scripts\papers extract esd-4-11-2013.pdf in the anaconda prompt, also as an administrator, I get this error:

(PaperDownload) (base) C:\phd_scripts\paper-download\papers-master>python scripts\papers extract esd-4-11-2013.pdf
Traceback (most recent call last):
  File "scripts\papers", line 5, in <module>
    papers.bib.main()
  File "C:\ProgramData\Anaconda3\lib\site-packages\papers\bib.py", line 1359, in main
    extractcmd(o)
  File "C:\ProgramData\Anaconda3\lib\site-packages\papers\bib.py", line 1284, in extractcmd
    print(extract_pdf_metadata(o.pdf, search_doi=not o.fulltext, search_fulltext=True, scholar=o.scholar, minwords=o.word_count, max_query_words=o.word_count))
  File "C:\ProgramData\Anaconda3\lib\site-packages\papers\extract.py", line 201, in extract_pdf_metadata
    txt = pdfhead(pdf, maxpages, minwords, image=image)
  File "C:\ProgramData\Anaconda3\lib\site-packages\papers\extract.py", line 138, in pdfhead
    txt += readpdf(pdf, first=i, last=i)
  File "C:\ProgramData\Anaconda3\lib\site-packages\papers\extract.py", line 37, in readpdf
    sp.check_call(cmd)
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 342, in check_call
    retcode = call(*popenargs, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 323, in call
    with Popen(*popenargs, **kwargs) as p:
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 1178, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

I'm on Windows 10, 64bit
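The error is raised while launching an external command from `readpdf`, so a likely cause (an assumption, since the command itself is not shown in the traceback) is that the PDF-to-text tool is not on PATH. A small sketch of an early check with a clearer message; `require_tool` is a hypothetical helper and `pdftotext` is assumed to be the tool in question.

```python
import shutil


def require_tool(name):
    """Return the full path of an executable, or raise with a clear message."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(
            f"'{name}' was not found on PATH; install it "
            "(e.g. via poppler) or add its folder to PATH"
        )
    return path
```

Calling `require_tool("pdftotext")` before the subprocess call would turn the bare WinError 2 into an actionable message.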

Improve test coverage

We now have about 60% test coverage (test coverage issues were fixed in #44).

Name                  Stmts   Miss  Cover
-----------------------------------------
papers/__init__.py        4      0   100%
papers/__main__.py      520    206    60%
papers/_version.py        2      0   100%
papers/bib.py           492    192    61%
papers/config.py        250     86    66%
papers/duplicate.py     384    136    65%
papers/encoding.py       84     22    74%
papers/extract.py       236    151    36%
papers/filename.py       51      4    92%
papers/latexenc.py       57     36    37%
-----------------------------------------
TOTAL                  2080    833    60%

As a first objective, we should aim for 100% test coverage for __main__.py, i.e. that every sub-command and if/then/else branch is executed at least once. That way we can be sure that namespace, import and other syntax errors are caught. From there, we can start thinking about semantics, i.e. about simple and intricate cases where we expect a meaningful result.

The coverage reporting includes a missing lines section that is more informative:

Name                  Stmts   Miss  Cover   Missing
---------------------------------------------------
papers/__init__.py        4      0   100%
papers/__main__.py      520    206    60%   26, 30, 79-87, 103-119, 123-133, 147-149, 152, 171, 177-178, 185-186, 190-204, 215, 219-220, 239, 245-250, 253-260, 271, 306, 318-330, 333-339, 343-347, 356, 360-370, 374, 377, 381-386, 391-496, 499, 510-511, 815, 820-821, 827, 829, 831, 833, 838-842
papers/_version.py        2      0   100%
papers/bib.py           492    192    61%   45-47, 67, 140-143, 157-158, 175-190, 214, 233-234, 237, 262, 273, 332, 371, 381, 388, 401-402, 408, 440-446, 450, 455-459, 469, 505, 513-514, 523, 541, 562-565, 575-580, 590-594, 598-613, 616-629, 633-660, 664-668, 671, 674-682, 686, 692-703, 709-781
papers/config.py        250     86    66%   32-39, 66-67, 94-98, 105, 110, 124, 157-159, 164-165, 171, 174-186, 189-205, 213, 219, 229, 232, 234, 241, 246, 248, 254, 257-262, 299, 306-309, 317-319, 322-326, 333, 344-345, 349-359
papers/duplicate.py     384    136    65%   72-74, 88-89, 95-96, 100-101, 150-157, 163, 174, 176, 178, 195, 200-235, 242, 246, 272, 294, 304, 312, 314-318, 340, 349-371, 401-404, 407, 410, 417, 420, 427, 430-433, 436-451, 477-484, 487, 490, 493, 496-497, 500, 511-525, 538-542, 560-561, 619, 622
papers/encoding.py       84     22    74%   35, 63-64, 106-120, 125-128
papers/extract.py       236    151    36%   32, 51-78, 97, 105, 109, 117-118, 132, 144-159, 179-189, 193-199, 202, 215-221, 226-229, 234-239, 244-245, 250-268, 272, 277-286, 292-332, 337-367, 372-386
papers/filename.py       51      4    92%   18-19, 45-46
papers/latexenc.py       57     36    37%   23-31, 41-90, 100-101
---------------------------------------------------
TOTAL                  2080    833    60%

Interactive papers add ?

Hi.

I would like to know whether it would be possible (or whether it is on the roadmap, though it may well not be) to make papers add run interactively and provide a mechanism to populate papers.bib interactively.

I'd like, for instance, to cite Théorie de l'addition des variables aléatoires, Paul Lévy, 1937, and it would be sensible, I guess, to enter that entry manually.

If it could be done manually, interactively, via the papers cli, that would be great. I do not see command line options for that, however.

Drop 2.7 support

Need to make a new version without six etc...
Would make papers simpler to maintain and extend.

IndexError: string index out of range (when reproducing your example)

Ubuntu 18.04, Python 2.7 (although also tried with 3.5), poppler and pip-dependencies installed.

  1. A regular papers install, no problems
    papers install --bibtex papers.bib --filesdir files --git --gitdir ./

  2. Download your paper and try to extract from it; this gives an error message
    papers add --rename --copy --bibtex papers.bib --filesdir files esd-4-11-2013.pdf --info

INFO:papers:bibtex: papers.bib
INFO:papers:filesdir: files
Traceback (most recent call last):
  File "/usr/local/bin/papers", line 5, in <module>
    papers.bib.main()
  File "/usr/local/lib/python2.7/dist-packages/papers/bib.py", line 1343, in main
    check_install() and addcmd(o)
  File "/usr/local/lib/python2.7/dist-packages/papers/bib.py", line 942, in addcmd
    my = Biblio.load(o.bibtex, o.filesdir)
  File "/usr/local/lib/python2.7/dist-packages/papers/bib.py", line 258, in load
    return cls(bibtexparser.loads(bibtexs), filesdir)
  File "/usr/local/lib/python2.7/dist-packages/papers/encoding.py", line 12, in <lambda>
    bibtexparser.loads = lambda s: _bloads(s.decode('utf-8') if type(s) is str else s)
  File "/usr/local/lib/python2.7/dist-packages/bibtexparser/__init__.py", line 48, in loads
    return parser.parse(bibtex_str)
  File "/usr/local/lib/python2.7/dist-packages/bibtexparser/bparser.py", line 145, in parse
    bibtex_file_obj = self._bibtex_file_obj(bibtex_str)
  File "/usr/local/lib/python2.7/dist-packages/bibtexparser/bparser.py", line 213, in _bibtex_file_obj
    if bibtex_str[0] == byte:
IndexError: string index out of range

Any clues? And thanks for what looks like such a great way to manage a bibliography!

Have a more local --local install

Right now, papers install --local writes .papersconfig locally but keeps the same global defaults for filesdir and bibtex.
While this may be OK and can be overwritten manually, the use case for --local is to have parallel, independent, medium-size projects. To me, this is one of the main uses of papers.

It would be good to adjust the defaults so that:

  • papers install --local defaults to local bibliography and filesdir
  • default to an existing bibliography if any is found

To make it more useful, it might also be good to have a recursive, upward search like git.

Parsing braces when generating citation key

If the BibTeX returned by the DOI query includes braces in the author list, then it seems the code which generates the citation key based on the last name of the first author fails.

Example: The article http://dx.doi.org/10.4169/amer.math.monthly.118.05.450 returns via papers extract the following "entry"

@article{Alan D. Sokal_2011,
 author = {{Alan D. Sokal}, },
 doi = {10.4169/amer.math.monthly.118.05.450},
 journal = {The American Mathematical Monthly},
 number = {5},
 pages = {450},
 publisher = {Informa UK Limited},
 title = {A Really Simple Elementary Proof of the Uniform Boundedness Theorem},
 url = {http://dx.doi.org/10.4169/amer.math.monthly.118.05.450},
 volume = {118},
 year = {2011}
}

Note that the citation key contains spaces and is invalid BibTeX.
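A sketch of how the first author's last name could be extracted when the name is wrapped in braces. The function names are hypothetical, and it assumes that a braced name is written in "First ... Last" order.

```python
import re


def first_author_lastname(author_field):
    """Last name of the first author, tolerating a braced full name."""
    first = author_field.split(" and ")[0].strip().rstrip(",").strip()
    braced = re.match(r"^\{(.+)\}$", first)
    if braced:
        # Assumption: braced names read "First ... Last".
        return braced.group(1).split()[-1]
    if "," in first:
        return first.split(",")[0].strip()  # "Last, First" form
    return first.split()[-1]


def make_key(entry):
    """Key of the form Lastname_Year, with non-word characters removed."""
    last = re.sub(r"\W+", "", first_author_lastname(entry["author"]))
    return f"{last}_{entry['year']}"


print(make_key({"author": "{Alan D. Sokal}, ", "year": "2011"}))
```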

Tests fail when ``.papers`` does not exist.

This is kinda nit-picky and it's not even clear if it's worth a "fix", but when the .papers dir does not exist, a test fails:

__________________________________________ TestAdd2.test_add ___________________________________________
[gw2] linux -- Python 3.11.3 /home/boyan/boyanshouse/miniconda3/envs/python311/bin/python

self = <test_papers.TestAdd2 testMethod=test_add>

    def test_add(self):
        self.assertTrue(os.path.exists(self.mybib))
        print("bibtex", self.mybib, 'exists?', os.path.exists(self.mybib))
>       sp.check_call('papers add --bibtex {} {}'.format(
            self.mybib, self.pdf), shell=True)

tests/test_papers.py:177: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

popenargs = ('papers add --bibtex /tmp/papers.biba8iolx_d /home/boyan/boyanshouse/Vazhno/Work/papers/tests/downloadedpapers/esd-4-11-2013.pdf',)
kwargs = {'shell': True}, retcode = 1
cmd = 'papers add --bibtex /tmp/papers.biba8iolx_d /home/boyan/boyanshouse/Vazhno/Work/papers/tests/downloadedpapers/esd-4-11-2013.pdf'

    def check_call(*popenargs, **kwargs):
        """Run command with arguments.  Wait for command to complete.  If
        the exit code was zero then return, otherwise raise
        CalledProcessError.  The CalledProcessError object will have the
        return code in the returncode attribute.
    
        The arguments are the same as for the call function.  Example:
    
        check_call(["ls", "-l"])
        """
        retcode = call(*popenargs, **kwargs)
        if retcode:
            cmd = kwargs.get("args")
            if cmd is None:
                cmd = popenargs[0]
>           raise CalledProcessError(retcode, cmd)
E           subprocess.CalledProcessError: Command 'papers add --bibtex /tmp/papers.biba8iolx_d /home/boyan/boyanshouse/Vazhno/Work/papers/tests/downloadedpapers/esd-4-11-2013.pdf' returned non-zero exit status 1.

../../../miniconda3/envs/python311/lib/python3.11/subprocess.py:413: CalledProcessError
----------------------------------------- Captured stdout call ------------------------------------------
bibtex /tmp/papers.biba8iolx_d exists? True
----------------------------------------- Captured stderr call ------------------------------------------
Traceback (most recent call last):
  File "/home/boyan/boyanshouse/miniconda3/envs/python311/bin/papers", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/boyan/boyanshouse/Vazhno/Work/papers/papers/bib.py", line 1577, in main
    check_install() and addcmd(o)
    ^^^^^^^^^^^^^^^
  File "/home/boyan/boyanshouse/Vazhno/Work/papers/papers/bib.py", line 1568, in check_install
    logger.info('filesdir: '+config.filesdir)
                ~~~~~~~~~~~~^~~~~~~~~~~~~~~~
TypeError: can only concatenate str (not "NoneType") to str
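The final error comes from concatenating a string with `None`. A small sketch of a None-safe way to build the log message, using `%r` formatting; `setting_message` is a hypothetical helper.

```python
import logging

logger = logging.getLogger("papers")


def setting_message(name, value):
    """Format a setting for logging; works for None as well as strings."""
    return "%s: %r" % (name, value)


logger.info(setting_message("filesdir", None))
print(setting_message("filesdir", None))
print(setting_message("filesdir", "/tmp/files"))
```

Passing the value as a logging argument, as in `logger.info('filesdir: %r', value)`, would work the same way.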

Tags?

Is it possible to attach tags to each paper and use them to make searches?

Respect crossref's etiquette

Respect crossref's etiquette:

  • cache request results
    Currently results are cached using a local .crossref-bibtex.json file. This should be saved in a centralized configuration file (see related Planned Features in readme).

  • Specify a User-Agent header that properly identifies your script or tool and that provides a means of contacting you via email using "mailto:". For example: GroovyBib/1.1 (https://example.org/GroovyBib/; mailto:[email protected]) BasedOnFunkyLib/1.4.
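The User-Agent point can be sketched with the standard library alone. The tool name, version, homepage and email below are placeholders, and the request is only built, not sent.

```python
import urllib.request


def polite_request(url, tool, version, homepage, email):
    """Build a request whose User-Agent identifies the tool and a contact."""
    user_agent = f"{tool}/{version} ({homepage}; mailto:{email})"
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = polite_request(
    "https://api.crossref.org/works/10.5194/esd-4-11-2013",
    tool="papers-cli",                 # placeholder values
    version="2.3",
    homepage="https://example.org/papers",
    email="maintainer@example.org",
)
print(req.get_header("User-agent"))
```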

hardlinking kinda fails, sometimes..

OK, this is some inside baseball, but it is worth noting that I have occasionally started to see the following test failure on one machine:

self = <tests.test_add.TestAdd testMethod=test_add_rename_copy>

    def test_add_rename_copy(self):
    
>       paperscmd(f'add -rc --bibtex {self.mybib} --filesdir {self.filesdir} {self.pdf}')

tests/test_add.py:76: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/common.py:59: in paperscmd
    return speedy_paperscmd(cmd, *args, **kwargs)
tests/common.py:53: in speedy_paperscmd
    return call(main, args, check=check, check_output=check_output)
tests/common.py:34: in call
    return f(*args, **kwargs)
papers/__main__.py:1071: in main
    check_install(subp, o, config) and addcmd(subp, o, config)
papers/__main__.py:452: in addcmd
    biblio.add_pdf(file, attachments=o.attachment, rename=o.rename, copy=o.copy,
papers/bib.py:432: in add_pdf
    self.insert_entry(entry, update_key=True, **kw)
papers/bib.py:288: in insert_entry
    self.insert_entry_check(entry, update_key=update_key, rename=rename, copy=copy, **checkopt)
papers/bib.py:320: in insert_entry_check
    self.insert_entry(entry, update_key, rename=rename, copy=copy)
papers/bib.py:311: in insert_entry
    if rename: self.rename_entry_files(entry, copy=copy)
papers/bib.py:529: in rename_entry_files
    self.move(file, newfile, copy)
papers/bib.py:223: in move
    return _move(file, newfile, copy=copy, dryrun=papers.config.DRYRUN)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

f1 = '/home/boyan/boyanshouse/Vazhno/Work/papers/tests/downloadedpapers/bg-8-515-2011.pdf'
f2 = '/tmp/papers.files1tk1bilj/perrette_et_al_2011_near-ubiquity-of-ice-edge-blooms-in-the-arctic.pdf', copy = True
interactive = True, dryrun = False, hardlink = True

    def move(f1, f2, copy=False, interactive=True, dryrun=False, hardlink=True):
        maybe = 'dry-run:: ' if dryrun else ''
        dirname = os.path.dirname(f2)
        if dirname and not os.path.exists(dirname):
            logger.info(f'{maybe}create directory: {dirname}')
            if not dryrun: os.makedirs(dirname)
        if f1 == f2:
            logger.info('dest is identical to src: '+f1)
            return
    
        if os.path.exists(f2):
            # if identical file, pretend nothing happened, skip copying
            if os.path.samefile(f1, f2) or checksum(f2) == checksum(f1):
                if not copy and not dryrun:
                    logger.info(f'{maybe}rm {f1}')
                    os.remove(f1)
                return
    
            elif interactive:
                ans = input(f'dest file already exists: {f2}. Replace? (y/n) ')
                if ans.lower() != 'y':
                    return
            else:
                logger.info(f'{maybe}rm {f2}')
                if not dryrun:
                    os.remove(f2)
    
        if copy:
            # If we can do a hard-link instead of copy-ing, let's do:
            if hardlink:
                cmd = f'{maybe}ln {f1} {f2}'
                logger.info(cmd)
                if not dryrun:
>                   os.link(f1, f2)
E                   OSError: [Errno 18] Invalid cross-device link: '/home/boyan/boyanshouse/Vazhno/Work/papers/tests/downloadedpapers/bg-8-515-2011.pdf' -> '/tmp/papers.files1tk1bilj/perrette_et_al_2011_near-ubiquity-of-ice-edge-blooms-in-the-arctic.pdf'

papers/utils.py:119: OSError

This is not reproducible on two other machines -- and I think the filesystems are all flat! -- but one of the solutions seems to be here: higlass/higlass-manage#3 (tldr: replace os.link with shutil.copy2).

There's some back and forth of increasing dubiousness about hardlinks being bad and somewhat dangerous, and re-considering my behavior in the past, I'm tending to agree; I tend to just copy the files, and let the filesystem then dedupe them if needs be -- the option here may be just not supporting the --hardlink option...

But again, this only rears its head on one machine (and when I run tests manually, not through tox, cf #54).

scholarly.scholarly not found?

Hello folks -- @perrette first off, very very glad to see this fantastic project, and very much considering replacing my workflow (that relies on one of the particularly outdated proprietary packages you have listed on your front page) entirely with this! Thanks kindly! Thing is, I have a local library of 80 Gb of PDFs that's a good set of test cases here...

When I try papers extract yanofsky_qc.pdf --scholar, which should work, I get:

ModuleNotFoundError: No module named 'scholarly.scholarly'

This is with pip install papers-cli which may be out of date...

Anybody else seeing this?

papers --add fails if in a subdir?

I see

papers add 2013_AdvCIS_Modeling\ and\ simulation\ of\ electrostatically\ gated\ nanochannels.pdf --rename --copy --info         
INFO:papers:bibtex: '/home/boyan/Vazhno/Work/Literature/library.bib'
INFO:papers:filesdir: '/home/boyan/Vazhno/Work/Literature/papers_organized'
INFO:papers:8036 entry files were updated
INFO:papers:pdftotext -f 1 -l 1 2013_AdvCIS_Modeling and simulation of electrostatically gated nanochannels.pdf /tmp/tmppsa0k9ff.txt
INFO:papers:found doi:10.1016/j.cis.2013.06.006
INFO:papers:duplicate :: update key to match existing entry: 2013/2013_pardon_van-der-wijngaart_modeling-and-simulation-of-electrostatically-gated-nanochannels => Pardon2013
Traceback (most recent call last):
  File "/home/boyan/miniconda3/envs/python/bin/papers", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/__main__.py", line 1071, in main
    check_install(subp, o, config) and addcmd(subp, o, config)
                                       ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/__main__.py", line 452, in addcmd
    biblio.add_pdf(file, attachments=o.attachment, rename=o.rename, copy=o.copy,
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 432, in add_pdf
    self.insert_entry(entry, update_key=True, **kw)
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 288, in insert_entry
    self.insert_entry_check(entry, update_key=update_key, rename=rename, copy=copy, **checkopt)
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 345, in insert_entry_check
    file = merge_files([candidate, entry], relative_to=self.relative_to)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/duplicate.py", line 290, in merge_files
    check = checksum(f) if os.path.exists(f) else None
            ^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/utils.py", line 81, in checksum
    return hash_bytestr_iter(file_as_blockiter(open(fname, 'rb')), hashlib.sha256())
                                               ^^^^^^^^^^^^^^^^^
IsADirectoryError: [Errno 21] Is a directory: '/home/boyan/Vazhno/Work/Literature'

The pdf itself is OK, I think -- this is the right PDF metadata after the add.

papers extract 2013_AdvCIS_Modeling\ and\ simulation\ of\ electrostatically\ gated\ nanochannels.pdf                    
@article{Pardon_2013, title={Modeling and simulation of electrostatically gated nanochannels}, volume={199–200}, ISSN={0001-8686}, url={http://dx.doi.org/10.1016/j.cis.2013.06.006}, DOI={10.1016/j.cis.2013.06.006}, journal={Advances in Colloid and Interface Science}, publisher={Elsevier BV}, author={Pardon, G. and van der Wijngaart, W.}, year={2013}, month=nov, pages={78–94} }

My papers is installed, and the config is:

(python) → working Literature/Stage cat ~/.local/share/config.json                                                                                                   
{
  "absolute_paths": true,
  "backup_files": false,
  "bibtex": "/home/boyan/Vazhno/Work/Literature/library.bib",
  "editor": null,
  "filesdir": "/home/boyan/Vazhno/Work/Literature/papers_organized",
  "git": true,
  "gitdir": "/home/boyan/.local/share",
  "gitlfs": true,
  "keyformat": {
    "author_num": 2,
    "author_sep": "_",
    "template": "{year}/{year}_{author}_{title}",
    "title_length": 100,
    "title_sep": "-",
    "title_word_num": 100,
    "title_word_size": 1
  },
  "local": false,
  "nameformat": {
    "author_num": 2,
    "author_sep": "_",
    "template": "{authorX}_{year}_{title}",
    "title_length": 100,
    "title_sep": "-",
    "title_word_num": 100,
    "title_word_size": 1
  }
}

Note that if I switch to the {journal} tag in the config (by setting "template": "{journal}/{year}{author}{title}"), which should be supported since {journal} is a valid BibTeX field, I get

INFO:papers:bibtex: '/home/boyan/Vazhno/Work/Literature/library.bib'
INFO:papers:filesdir: '/home/boyan/Vazhno/Work/Literature/papers_organized'
INFO:papers:8036 entry files were updated
INFO:papers:pdftotext -f 1 -l 1 2013_AdvCIS_Modeling and simulation of electrostatically gated nanochannels.pdf /tmp/tmp1if0wmzn.txt
INFO:papers:found doi:10.1016/j.cis.2013.06.006
Traceback (most recent call last):
  File "/home/boyan/miniconda3/envs/python/bin/papers", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/__main__.py", line 1071, in main
    check_install(subp, o, config) and addcmd(subp, o, config)
                                       ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/__main__.py", line 452, in addcmd
    biblio.add_pdf(file, attachments=o.attachment, rename=o.rename, copy=o.copy,
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 427, in add_pdf
    entry['ID'] = self.generate_key(entry)
                  ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/bib.py", line 367, in generate_key
    key = self.keyformat(entry)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/filename.py", line 108, in __call__
    return self.render(**entry)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/filename.py", line 105, in render
    return stringify_entry(entry, **vars(self))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/boyan/miniconda3/envs/python/lib/python3.11/site-packages/papers/filename.py", line 68, in stringify_entry
    res = template.format(**fields)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'journal'

I can refile this as two issues, but am I calling "add" correctly? The behavior I expect is to have the PDF renamed and moved, and the entry added to the end of library.bib.
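
For reference, the KeyError at the bottom of the traceback is what str.format raises when a template names a field that is absent from the mapping it receives. The sketch below reproduces that behaviour outside of papers and shows one possible fallback. It is not the papers implementation: render_template, the "unknown" default and the sample fields dict are all hypothetical.

```python
from collections import defaultdict


def render_template(template, fields, default="unknown"):
    """Format `template`, using `default` for any field missing from `fields`.

    Hypothetical helper for illustration only; not part of papers.
    """
    # format_map looks keys up directly in the mapping, so a defaultdict
    # can supply a placeholder instead of raising KeyError.
    safe_fields = defaultdict(lambda: default, fields)
    return template.format_map(safe_fields)


# Sample entry without a 'journal' key (hypothetical data).
fields = {"year": "2013", "author": "Pardon", "title": "Modeling"}

try:
    "{journal}/{year}_{author}".format(**fields)
except KeyError as err:
    print("plain str.format fails on missing field:", err)

print(render_template("{journal}/{year}_{author}", fields))
```

Whether a silent placeholder is preferable to a clear error message naming the missing field is a design choice for the maintainers.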

tests fail on master?

(python311) → master Work/papers pytest -vv .                                                                               18:37:55
======================================================== test session starts ========================================================
platform linux -- Python 3.11.2, pytest-7.3.1, pluggy-1.0.0 -- /home/boyan/boyanshouse/miniconda3/envs/python311/bin/python3.11
cachedir: .pytest_cache
rootdir: /home/boyan/boyanshouse/Vazhno/Work/papers
plugins: anyio-3.6.2
collected 95 items                                                                                                                  

tests/test_papers.py::TestBibtexFileEntry::test_format_file PASSED                                                            [  1%]
tests/test_papers.py::TestBibtexFileEntry::test_format_files PASSED                                                           [  2%]
tests/test_papers.py::TestBibtexFileEntry::test_parse_file PASSED                                                             [  3%]
tests/test_papers.py::TestBibtexFileEntry::test_parse_files PASSED                                                            [  4%]
tests/test_papers.py::TestSimple::test_doi PASSED                                                                             [  5%]
tests/test_papers.py::TestSimple::test_fetch PASSED                                                                           [  6%]
tests/test_papers.py::TestSimple::test_fetch_scholar PASSED                                                                   [  7%]
tests/test_papers.py::TestInstall::test_local_install PASSED                                                                  [  8%]
tests/test_papers.py::TestAdd::test_add PASSED                                                                                [  9%]
tests/test_papers.py::TestAdd::test_add_rename PASSED                                                                         [ 10%]
tests/test_papers.py::TestAdd::test_add_rename_copy PASSED                                                                    [ 11%]
tests/test_papers.py::TestAdd::test_fails_without_install PASSED                                                              [ 12%]
tests/test_papers.py::TestAdd2::test_add PASSED                                                                               [ 13%]
tests/test_papers.py::TestAdd2::test_add_attachment PASSED                                                                    [ 14%]
tests/test_papers.py::TestAdd2::test_add_rename PASSED                                                                        [ 15%]
tests/test_papers.py::TestAdd2::test_add_rename_copy PASSED                                                                   [ 16%]
tests/test_papers.py::TestAdd2::test_fails_without_install PASSED                                                             [ 17%]
tests/test_papers.py::TestAddBib::test_addbib PASSED                                                                          [ 18%]
tests/test_papers.py::TestAddDir::test_adddir_pdf FAILED                                                                      [ 20%]
tests/test_papers.py::TestAddDir::test_adddir_pdf_cmd FAILED                                                                  [ 21%]
tests/test_papers.py::TestDuplicates::test_anotherkey PASSED                                                                  [ 22%]
tests/test_papers.py::TestDuplicates::test_conflictauthor PASSED                                                              [ 23%]
tests/test_papers.py::TestDuplicates::test_conflictdoi PASSED                                                                 [ 24%]
tests/test_papers.py::TestDuplicates::test_conflictyear PASSED                                                                [ 25%]
tests/test_papers.py::TestDuplicates::test_exactsame PASSED                                                                   [ 26%]
tests/test_papers.py::TestDuplicates::test_missingdoi PASSED                                                                  [ 27%]
tests/test_papers.py::TestDuplicates::test_missingfield PASSED                                                                [ 28%]
tests/test_papers.py::TestDuplicates::test_missingtitauthor PASSED                                                            [ 29%]
tests/test_papers.py::TestDuplicatesExact::test_anotherkey PASSED                                                             [ 30%]
tests/test_papers.py::TestDuplicatesExact::test_conflictauthor PASSED                                                         [ 31%]
tests/test_papers.py::TestDuplicatesExact::test_conflictdoi PASSED                                                            [ 32%]
tests/test_papers.py::TestDuplicatesExact::test_conflictyear PASSED                                                           [ 33%]
tests/test_papers.py::TestDuplicatesExact::test_exactsame PASSED                                                              [ 34%]
tests/test_papers.py::TestDuplicatesExact::test_missingdoi PASSED                                                             [ 35%]
tests/test_papers.py::TestDuplicatesExact::test_missingfield PASSED                                                           [ 36%]
tests/test_papers.py::TestDuplicatesExact::test_missingtitauthor PASSED                                                       [ 37%]
tests/test_papers.py::TestDuplicatesGood::test_anotherkey PASSED                                                              [ 38%]
tests/test_papers.py::TestDuplicatesGood::test_conflictauthor PASSED                                                          [ 40%]
tests/test_papers.py::TestDuplicatesGood::test_conflictdoi PASSED                                                             [ 41%]
tests/test_papers.py::TestDuplicatesGood::test_conflictyear PASSED                                                            [ 42%]
tests/test_papers.py::TestDuplicatesGood::test_exactsame PASSED                                                               [ 43%]
tests/test_papers.py::TestDuplicatesGood::test_missingdoi PASSED                                                              [ 44%]
tests/test_papers.py::TestDuplicatesGood::test_missingfield PASSED                                                            [ 45%]
tests/test_papers.py::TestDuplicatesGood::test_missingtitauthor PASSED                                                        [ 46%]
tests/test_papers.py::TestDuplicatesPartial::test_anotherkey PASSED                                                           [ 47%]
tests/test_papers.py::TestDuplicatesPartial::test_conflictauthor PASSED                                                       [ 48%]
tests/test_papers.py::TestDuplicatesPartial::test_conflictdoi PASSED                                                          [ 49%]
tests/test_papers.py::TestDuplicatesPartial::test_conflictyear PASSED                                                         [ 50%]
tests/test_papers.py::TestDuplicatesPartial::test_exactsame PASSED                                                            [ 51%]
tests/test_papers.py::TestDuplicatesPartial::test_missingdoi PASSED                                                           [ 52%]
tests/test_papers.py::TestDuplicatesPartial::test_missingfield PASSED                                                         [ 53%]
tests/test_papers.py::TestDuplicatesPartial::test_missingtitauthor PASSED                                                     [ 54%]
tests/test_papers.py::TestDuplicatesAdd::test_anotherkey SKIPPED (skip cause does not make sense with add)                    [ 55%]
tests/test_papers.py::TestDuplicatesAdd::test_conflictauthor PASSED                                                           [ 56%]
tests/test_papers.py::TestDuplicatesAdd::test_conflictdoi PASSED                                                              [ 57%]
tests/test_papers.py::TestDuplicatesAdd::test_conflictyear PASSED                                                             [ 58%]
tests/test_papers.py::TestDuplicatesAdd::test_exactsame SKIPPED (skip cause does not make sense with add)                     [ 60%]
tests/test_papers.py::TestDuplicatesAdd::test_missingdoi PASSED                                                               [ 61%]
tests/test_papers.py::TestDuplicatesAdd::test_missingfield PASSED                                                             [ 62%]
tests/test_papers.py::TestDuplicatesAdd::test_missingtitauthor PASSED                                                         [ 63%]
tests/test_papers.py::TestAddResolveDuplicate::test_append PASSED                                                             [ 64%]
tests/test_papers.py::TestAddResolveDuplicate::test_conflict_updated_from_original PASSED                                     [ 65%]
tests/test_papers.py::TestAddResolveDuplicate::test_conflict_updated_from_original_but_originalkey PASSED                     [ 66%]
tests/test_papers.py::TestAddResolveDuplicate::test_original_updated_from_conflict PASSED                                     [ 67%]
tests/test_papers.py::TestAddResolveDuplicate::test_overwrite PASSED                                                          [ 68%]
tests/test_papers.py::TestAddResolveDuplicate::test_raises PASSED                                                             [ 69%]
tests/test_papers.py::TestAddResolveDuplicate::test_skip PASSED                                                               [ 70%]
tests/test_papers.py::TestAddResolveDuplicateCommand::test_append PASSED                                                      [ 71%]
tests/test_papers.py::TestAddResolveDuplicateCommand::test_conflict_updated_from_original PASSED                              [ 72%]
tests/test_papers.py::TestAddResolveDuplicateCommand::test_conflict_updated_from_original_but_originalkey PASSED              [ 73%]
tests/test_papers.py::TestAddResolveDuplicateCommand::test_original_updated_from_conflict PASSED                              [ 74%]
tests/test_papers.py::TestAddResolveDuplicateCommand::test_overwrite PASSED                                                   [ 75%]
tests/test_papers.py::TestAddResolveDuplicateCommand::test_raises PASSED                                                      [ 76%]
tests/test_papers.py::TestAddResolveDuplicateCommand::test_skip PASSED                                                        [ 77%]
tests/test_papers.py::TestCheckResolveDuplicate::test_merge PASSED                                                            [ 78%]
tests/test_papers.py::TestCheckResolveDuplicate::test_not_a_duplicate PASSED                                                  [ 80%]
tests/test_papers.py::TestCheckResolveDuplicate::test_pick_conflict_1 PASSED                                                  [ 81%]
tests/test_papers.py::TestCheckResolveDuplicate::test_pick_reference_2 PASSED                                                 [ 82%]
tests/test_papers.py::TestCheckResolveDuplicate::test_raises PASSED                                                           [ 83%]
tests/test_papers.py::TestCheckResolveDuplicate::test_skip_check PASSED                                                       [ 84%]
tests/test_papers.py::TestAddConflict::test_add_conflict_key_check_raises PASSED                                              [ 85%]
tests/test_papers.py::TestAddConflict::test_add_conflict_key_nocheck_raises PASSED                                            [ 86%]
tests/test_papers.py::TestAddConflict::test_add_conflict_key_update PASSED                                                    [ 87%]
tests/test_papers.py::TestAddConflict::test_add_miss_doi_merge PASSED                                                         [ 88%]
tests/test_papers.py::TestAddConflict::test_add_miss_field_fails PASSED                                                       [ 89%]
tests/test_papers.py::TestAddConflict::test_add_miss_merge PASSED                                                             [ 90%]
tests/test_papers.py::TestAddConflict::test_add_miss_titauthor_merge PASSED                                                   [ 91%]
tests/test_papers.py::TestAddConflict::test_add_same PASSED                                                                   [ 92%]
tests/test_papers.py::TestAddConflict::test_add_same_but_file PASSED                                                          [ 93%]
tests/test_papers.py::TestAddConflict::test_add_same_but_key_fails PASSED                                                     [ 94%]
tests/test_papers.py::TestAddConflict::test_add_same_but_key_interactive PASSED                                               [ 95%]
tests/test_papers.py::TestAddConflict::test_add_same_but_key_update PASSED                                                    [ 96%]
tests/test_papers.py::TestAddConflict::test_add_same_doi_fails PASSED                                                         [ 97%]
tests/test_papers.py::TestAddConflict::test_add_same_doi_unchecked PASSED                                                     [ 98%]
tests/test_papers.py::TestAddConflict::test_add_same_doi_update_key PASSED                                                    [100%]

============================================================= FAILURES ==============================================================
____________________________________________________ TestAddDir.test_adddir_pdf _____________________________________________________

self = <test_papers.TestAddDir testMethod=test_adddir_pdf>

    def setUp(self):
        self.pdf1, self.doi, self.key1, self.newkey1, self.year, self.bibtex1, self.file_rename1 = prepare_paper()
        self.pdf2, self.si, self.doi, self.key2, self.newkey2, self.year, self.bibtex2, self.file_rename2 = prepare_paper2()
        self.somedir = tempfile.mktemp(prefix='papers.somedir')
        self.subdir = os.path.join(self.somedir, 'subdir')
        os.makedirs(self.somedir)
        os.makedirs(self.subdir)
        shutil.copy(self.pdf1, self.somedir)
        shutil.copy(self.pdf2, self.subdir)
        self.mybib = tempfile.mktemp(prefix='papers.bib')
>       sp.check_call('papers install --local --no-prompt --bibtex {}'.format(self.mybib), shell=True)

tests/test_papers.py:302: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

popenargs = ('papers install --local --no-prompt --bibtex /tmp/papers.bib8eszhc3e',), kwargs = {'shell': True}, retcode = 2
cmd = 'papers install --local --no-prompt --bibtex /tmp/papers.bib8eszhc3e'

    def check_call(*popenargs, **kwargs):
        """Run command with arguments.  Wait for command to complete.  If
        the exit code was zero then return, otherwise raise
        CalledProcessError.  The CalledProcessError object will have the
        return code in the returncode attribute.
    
        The arguments are the same as for the call function.  Example:
    
        check_call(["ls", "-l"])
        """
        retcode = call(*popenargs, **kwargs)
        if retcode:
            cmd = kwargs.get("args")
            if cmd is None:
                cmd = popenargs[0]
>           raise CalledProcessError(retcode, cmd)
E           subprocess.CalledProcessError: Command 'papers install --local --no-prompt --bibtex /tmp/papers.bib8eszhc3e' returned non-zero exit status 2.

../../../miniconda3/envs/python311/lib/python3.11/subprocess.py:413: CalledProcessError
------------------------------------------------------- Captured stderr call --------------------------------------------------------
usage: papers [-h]
              {status,install,add,check,filecheck,list,doi,fetch,extract,undo,git}
              ...
papers: error: unrecognized arguments: --no-prompt
__________________________________________________ TestAddDir.test_adddir_pdf_cmd ___________________________________________________

self = <test_papers.TestAddDir testMethod=test_adddir_pdf_cmd>

    def setUp(self):
        self.pdf1, self.doi, self.key1, self.newkey1, self.year, self.bibtex1, self.file_rename1 = prepare_paper()
        self.pdf2, self.si, self.doi, self.key2, self.newkey2, self.year, self.bibtex2, self.file_rename2 = prepare_paper2()
        self.somedir = tempfile.mktemp(prefix='papers.somedir')
        self.subdir = os.path.join(self.somedir, 'subdir')
        os.makedirs(self.somedir)
        os.makedirs(self.subdir)
        shutil.copy(self.pdf1, self.somedir)
        shutil.copy(self.pdf2, self.subdir)
        self.mybib = tempfile.mktemp(prefix='papers.bib')
>       sp.check_call('papers install --local --no-prompt --bibtex {}'.format(self.mybib), shell=True)

tests/test_papers.py:302: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

popenargs = ('papers install --local --no-prompt --bibtex /tmp/papers.bib_o_zoa39',), kwargs = {'shell': True}, retcode = 2
cmd = 'papers install --local --no-prompt --bibtex /tmp/papers.bib_o_zoa39'

    def check_call(*popenargs, **kwargs):
        """Run command with arguments.  Wait for command to complete.  If
        the exit code was zero then return, otherwise raise
        CalledProcessError.  The CalledProcessError object will have the
        return code in the returncode attribute.
    
        The arguments are the same as for the call function.  Example:
    
        check_call(["ls", "-l"])
        """
        retcode = call(*popenargs, **kwargs)
        if retcode:
            cmd = kwargs.get("args")
            if cmd is None:
                cmd = popenargs[0]
>           raise CalledProcessError(retcode, cmd)
E           subprocess.CalledProcessError: Command 'papers install --local --no-prompt --bibtex /tmp/papers.bib_o_zoa39' returned non-zero exit status 2.

../../../miniconda3/envs/python311/lib/python3.11/subprocess.py:413: CalledProcessError
------------------------------------------------------- Captured stderr call --------------------------------------------------------
usage: papers [-h]
              {status,install,add,check,filecheck,list,doi,fetch,extract,undo,git}
              ...
papers: error: unrecognized arguments: --no-prompt
====================================================== short test summary info ======================================================
FAILED tests/test_papers.py::TestAddDir::test_adddir_pdf - subprocess.CalledProcessError: Command 'papers install --local --no-prompt --bibtex /tmp/papers.bib8eszhc3e' returned non-zero e...
FAILED tests/test_papers.py::TestAddDir::test_adddir_pdf_cmd - subprocess.CalledProcessError: Command 'papers install --local --no-prompt --bibtex /tmp/papers.bib_o_zoa39' returned non-zero e...
============================================= 2 failed, 91 passed, 2 skipped in 19.02s ==============================================
(python311) → master Work/papers                                                                                            18:38:

This is on Python 3.11, on master. Is this a blocker for anything?
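
For context, exit status 2 together with "unrecognized arguments" is how argparse reports an option that the parser was not built with. The sketch below reproduces that with a stand-in parser. build_parser and exit_status are hypothetical helpers, and the parser only mimics the option names seen in the log above; it is not the real papers CLI.

```python
import argparse


def build_parser(with_no_prompt):
    """Build a stand-in 'papers' parser (hypothetical, for illustration)."""
    parser = argparse.ArgumentParser(prog="papers")
    sub = parser.add_subparsers(dest="cmd")
    install = sub.add_parser("install")
    install.add_argument("--local", action="store_true")
    install.add_argument("--bibtex")
    if with_no_prompt:
        install.add_argument("--no-prompt", action="store_true")
    return parser


def exit_status(parser, argv):
    """Return the exit status argparse would produce for `argv`."""
    try:
        parser.parse_args(argv)
    except SystemExit as exc:  # argparse exits with status 2 on usage errors
        return exc.code
    return 0


argv = ["install", "--local", "--no-prompt", "--bibtex", "library.bib"]
print(exit_status(build_parser(with_no_prompt=False), argv))
print(exit_status(build_parser(with_no_prompt=True), argv))
```

If the papers executable found on PATH comes from a different version than the checked-out tests, this is the kind of mismatch that would show up. That is only a guess; running `papers install --help` in the same environment would confirm which options the installed command accepts.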

Backup, Sync and git tracking

Originally, the git tracking feature was added to make handling a global papers install safer.
Its implementation is now compromised by local installs. Local installs are often git-tracked themselves, and nested git repos do not play well together. Worse, a papers install with git enabled might trigger commits in a directory where they are not expected (fortunately it is off by default, so it still requires explicit user action to enable). In the original implementation, the git directory could also be separate from the bibtex file. In that case, the bibtex would be copied to the git directory upon saving, and a commit would be made. That works, but using git commands to revert or reset to a previous commit would then only affect the git repo, not the original bibtex, which makes the overall behavior unintuitive. Clearly, some overhaul is needed.

It is not entirely clear to me yet how that feature should evolve, but the basic idea of using git to safeguard the bibtex and undo unwanted changes is still relevant IMO. Here are a few options:

  • use git as an internal tool in papers, without explicitly asking about it. papers undo (and a new command papers redo) could be used to navigate git history. The git repo would be saved in a central papers dir, using different branches to handle different bibtex locations (using a slug of the full bibtex path as branch name, for instance). That could work even without a proper installation. Maybe. Issue: bibtex rename would break the flow by creating a new branch. We could live with that.

  • propose hooks upon bibtex save. Users could fine-tune a whole workflow this way. The hooks could also be used internally to implement higher-level features.

  • add options to track files, sync with a remote server etc.

For now I'll just leave this issue open to collect ideas. The current simplistic implementation works OK.
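
To make the first option more concrete, here is a minimal sketch of how a branch name could be derived from the full bibtex path. Everything in it is an assumption for discussion: branch_for_bibtex, the "bib/" prefix and the short hash suffix do not exist in papers.

```python
import hashlib
import re


def branch_for_bibtex(path):
    """Derive a git branch name from a bibtex path (hypothetical scheme).

    The slug keeps the name readable; the short hash keeps two paths that
    slugify identically (e.g. 'a/b.bib' and 'a-b.bib') on separate branches.
    """
    text = str(path)
    slug = re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-").lower()
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()[:8]
    return f"bib/{slug}-{digest}"


print(branch_for_bibtex("/home/user/Literature/library.bib"))
```

As noted in the first option, renaming the bibtex file changes the path and therefore the branch, so the history would restart on a new branch.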

FILE field

Currently the assumed format for the file field is FILE:TYPE[;FILE:TYPE]... where TYPE is always pdf (so far) and FILE indicates the full file path.

This is understood by JabRef (at least for one file), but according to this discussion, each individual file should instead be BASENAME:FILE:TYPE, :FILE:TYPE, or even :FILE:

Since BASENAME is redundant (it can be obtained from FILE), I kind of like :FILE:TYPE.
In any case, this should be accounted for when parsing the file field of a bibtex entry.
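
A minimal sketch of a parser that accepts all of the variants listed above could look like the following. parse_file_field is a hypothetical name, not the existing papers function, and the sketch assumes that paths contain no ':' or ';' characters, so Windows drive letters and escaped separators are not handled.

```python
def parse_file_field(value):
    """Extract file paths from a bibtex 'file' field.

    Accepts FILE:TYPE, BASENAME:FILE:TYPE, :FILE:TYPE and :FILE:,
    with several files separated by ';'. Hypothetical sketch: paths
    containing ':' or ';' are not supported.
    """
    paths = []
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        parts = item.split(":")
        if len(parts) >= 3:
            # BASENAME:FILE:TYPE, :FILE:TYPE or :FILE:
            path = parts[1]
        else:
            # FILE or FILE:TYPE (fall back to the second part if the first is empty)
            path = parts[0] or parts[-1]
        paths.append(path)
    return paths


print(parse_file_field("/a/b.pdf:pdf;:/c/d.pdf:pdf"))
```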

'papers extract' results in a call with nonsensical arguments to pdftotext

Hi,

I've been using 'papers' for quite a while now and this is the first time I've seen this issue. I am trying to extract the bibliographic info of this article* from its PDF. The program throws this exception:

Command Line Error: Wrong page range given: the first page (2) can not be after the last page (1).
Traceback (most recent call last):
  File "/usr/bin/papers", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/lib/python3.11/site-packages/papers/__main__.py", line 1091, in main
    extractcmd(subp, o)
  File "/usr/lib/python3.11/site-packages/papers/__main__.py", line 546, in extractcmd
    print(extract_pdf_metadata(o.pdf, search_doi=not o.fulltext, search_fulltext=True, scholar=o.scholar, minwords=o.word_count, max_query_words=o.word_count, image=o.image))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/papers/extract.py", line 208, in extract_pdf_metadata
    txt = pdfhead(pdf, maxpages, minwords, image=image)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/papers/extract.py", line 134, in pdfhead
    txt += readpdf(pdf, first=i, last=i)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/papers/extract.py", line 41, in readpdf
    sp.check_call(cmd)
  File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['pdftotext', '-f', '2', '-l', '2', 'paper.pdf', '/tmp/tmpaq14gv_5.txt']' returned non-zero exit status 99.

Apparently, 'papers' is calling 'pdftotext' with arguments that make no sense. What is making 'papers' confused about those arguments?

(Have I mentioned how much I like this program? Cheers!)

*https://www.nature.com/articles/s41567-020-0990-x
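
Reading the error message, pdftotext seems to see a last page of 1 while the command asks for page 2 (-f 2 -l 2), which would happen if pages are read one by one up to some maximum without checking how many pages the file has. That is only an interpretation of the message, not a diagnosis from the source code. Under that assumption, a guard could be sketched as below. pages_to_read and pdftotext_cmd are hypothetical helpers, and total_pages is assumed to be known from elsewhere.

```python
def pages_to_read(maxpages, total_pages):
    """Return the 1-based page numbers to read, never past the last page."""
    last = min(maxpages, total_pages)
    return list(range(1, last + 1))


def pdftotext_cmd(pdf, page, out):
    """Build a pdftotext command that extracts a single page to `out`."""
    return ["pdftotext", "-f", str(page), "-l", str(page), pdf, out]


# With a one-page PDF, only page 1 is requested even if maxpages is larger.
for page in pages_to_read(maxpages=10, total_pages=1):
    print(pdftotext_cmd("paper.pdf", page, "/tmp/out.txt"))
```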

install dir explicit path?

Might it make sense to change the install dir from ~/.local/config/ to ~/.local/config/papers, with the git keeping and all that? I also note that I'm not sure how well this plays with Windows, but I can't test since I don't have a Windows dev env up.
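
As a starting point for discussion, here is a sketch of a per-application config directory that also has a Windows branch. It is not what papers does today: config_dir is a hypothetical helper, and the use of XDG_CONFIG_HOME (falling back to ~/.config) and APPDATA reflects common conventions rather than the paths mentioned above.

```python
import os
from pathlib import Path


def config_dir(app="papers", env=None, platform=None):
    """Return a per-application config directory (hypothetical helper).

    `env` and `platform` default to os.environ and os.name; they are
    parameters only so the logic can be exercised without touching the
    real environment.
    """
    env = os.environ if env is None else env
    platform = os.name if platform is None else platform
    if platform == "nt" and env.get("APPDATA"):
        return Path(env["APPDATA"]) / app
    home = env.get("HOME") or os.path.expanduser("~")
    base = env.get("XDG_CONFIG_HOME") or os.path.join(home, ".config")
    return Path(base) / app


print(config_dir(env={"HOME": "/home/user"}, platform="posix"))
```

Falling back to os.path.expanduser when HOME is unset also covers environments, such as some test runners, that do not export it.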
