flamingtempura / bibtex-tidy
Cleaner and Formatter for BibTeX files
Home Page: https://flamingtempura.github.io/bibtex-tidy/
License: MIT License
I encountered an inputenc failure due to U+2212 in an article title.
What would be an appropriate escaping for the math minus sign?
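One possible answer, sketched in Python purely for illustration (this is not something bibtex-tidy is known to do): replace U+2212 with a math-mode minus, `$-$`. The function name and the default replacement are assumptions; `\textminus` from the textcomp package would be another candidate for text mode. The sketch does not check whether the character already sits inside math mode, where a plain `-` would be the right replacement.

```python
MINUS_SIGN = "\u2212"  # U+2212 MINUS SIGN


def escape_minus(text, replacement="$-$"):
    """Replace every U+2212 with a LaTeX-safe form (assumed default: math-mode minus)."""
    return text.replace(MINUS_SIGN, replacement)


# Made-up title fragment, only to show the call.
print(escape_minus("Cooling to \u221240 degrees"))
```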
Very, very small issue. It would be great if the following error on your example website were fixed:
In the "sort entries by" area, there is the text:
(space delimited, e.g: id, type, publisher, author)
This was confusing at first as my input (exact same as the line) did not work. This should instead read:
(space delimited, e.g: id type publisher author)
i.e. just remove the commas. Thanks for the very useful tool btw!
Do you think it would be possible to change the behavior of bibtex-tidy
so that it accepts a list of files as CLI arguments, rather than only a single file?
bibtex-tidy [file1 [file2 [..]]]
Thanks for your consideration!
Love your tool, it would be great if the configuration of https://flamingtempura.github.io/bibtex-tidy/ was in the URL as parameters. Then one could just share the link with the correct configuration for a project.
Add option to escape uppercase letters, e.g., in the title field. Example:
{{Model-Based Safety Assessment with SysML and Component Fault Trees: Application and Lessons Learned}}
-> {{{M}odel-{B}ased {S}afety {A}ssessment with {SysML} and {C}omponent {F}ault {T}rees: {A}pplication and {L}essons {L}earned}}
I.e.:
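Read as a rule, the example above could be sketched roughly as follows in Python. This is an assumption about the intended behaviour, not an existing bibtex-tidy option: a word with capitals after its first letter (such as SysML) is wrapped whole, and a word that only starts with a capital gets just that letter wrapped. The function name is hypothetical, and the sketch ignores braces or LaTeX commands already present in the title.

```python
import re


def protect_capitals(title):
    """Wrap capital letters in braces so BibTeX styles keep their case."""

    def protect(match):
        word = match.group(0)
        if any(ch.isupper() for ch in word[1:]):
            # Mixed-case or all-caps word: protect the whole word.
            return "{" + word + "}"
        if word[0].isupper():
            # Only the initial is capitalised: protect just that letter.
            return "{" + word[0] + "}" + word[1:]
        return word

    # Each run of ASCII letters is treated as one word.
    return re.sub(r"[A-Za-z]+", protect, title)


print(protect_capitals("Model-Based Safety Assessment with SysML"))
```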
Love this tool, thank you very much. This is a feature request to git tag the releases so that pre-commit autoupdate works correctly.
pre-commit autoupdate uses tags to upgrade to the latest version of a tool. Currently pre-commit cannot know which commit is the latest release, so the autoupdate command uses the latest commit available, which may be a work in progress or broken. It would be nice to tag the "Release vX.Y.Z" commits.
I suggest a default maximum of 70 or maybe 80 characters per line, especially for fields that can contain long text, such as abstract or comments.
The matching criteria for spotting duplicated entries work pretty well, although on some occasions, when the search is performed with the 'Similar author and title' option, I get too many false positives. For example, these two entries are tagged as duplicates, but the only field that actually matches is the title:
@book{hr09,
title = {Robust statistics},
author = {Huber, Peter J. and Ronchetti, Elvezio M.},
year = 2009,
publisher = {John Wiley \& Sons, Inc., Hoboken, NJ},
series = {Wiley Series in Probability and Statistics},
pages = {xvi+354 pp. + loose erratum},
doi = {10.1002/9780470434697},
isbn = {978-0-470-12990-6},
edition = {Second}
}
@book{huber81,
title = {Robust statistics},
author = {Huber, Peter J.},
year = 1981,
publisher = {John Wiley \& Sons, Inc., New York},
series = {Wiley Series in Probability and Mathematical Statistics},
pages = {ix+308},
isbn = {0-471-41805-6}
}
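A stricter check could require the set of author last names to match as well as the title. The Python sketch below is only an illustration under that assumption and does not reflect how bibtex-tidy compares entries; the helper names and the dict representation of an entry are hypothetical.

```python
import re


def normalize(text):
    """Lowercase and collapse everything that is not a letter or digit."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def last_names(author_field):
    """Collect the last names from a BibTeX author list."""
    names = set()
    for author in re.split(r"\s+and\s+", author_field.strip()):
        if not author.strip():
            continue
        if "," in author:
            last = author.split(",")[0]   # "Last, First" form
        else:
            last = author.split()[-1]     # "First Last" form
        names.add(normalize(last))
    return names


def is_duplicate(a, b):
    """Duplicate only if the title and the full set of last names agree."""
    return (normalize(a["title"]) == normalize(b["title"])
            and last_names(a["author"]) == last_names(b["author"]))


hr09 = {"title": "Robust statistics",
        "author": "Huber, Peter J. and Ronchetti, Elvezio M."}
huber81 = {"title": "Robust statistics", "author": "Huber, Peter J."}
print(is_duplicate(hr09, huber81))
```

With the two entries from the report, the author sets differ, so the pair is not flagged.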
Do you think it is possible to enable (by default, or as option) addition of a trailing comma (for an entry's last key-val pair)?
A reason why this is useful is given in the documentation of black:
You might have noticed [...] that a trailing comma is always added. Such formatting produces smaller diffs; when you add or remove an element, it's always just one line.
It seems that JabRef also adds a trailing comma in files that it manipulates.
Thanks for your consideration!
Hi,
Some entries, especially from CoRR, have extra whitespace between authors when full names are separated with 'and'. Examples:
https://dblp.uni-trier.de/rec/bibtex/journals/corr/abs-1811-02883
Anyway good work!
Hello, thank you for this very useful tool! It is really helping with the quite huge bibliography of my PhD thesis!
Anyway, I'm afraid I found a small problem: when I enable "escape special characters", it works great on most entries, except when I have math content. By escaping dollar symbol, it breaks math strings. I got quite a headache to pinpoint the problem!
For example, if I process this:
@Article{ashikari:1999,
title = {Rice gibberellin-insensitive dwarf mutant gene Dwarf 1 encodes the
author = {Ashikari, Motoyuki and Wu, Jianzhong and Yano, Masahiro and Sasaki, Takuji and Yoshimura, Atsushi},
year = 1999,
journal = {Proceedings of the National Academy of Sciences},
publisher = {National Acad Sciences},
volume = 96,
number = 18,
pages = {10284--10289},
}
the software returns:
@Article{ashikari:1999,
title = {Rice gibberellin-insensitive dwarf mutant gene Dwarf 1 encodes the
author = {Ashikari, Motoyuki and Wu, Jianzhong and Yano, Masahiro and Sasaki, Takuji and Yoshimura, Atsushi},
year = 1999,
journal = {Proceedings of the National Academy of Sciences},
publisher = {National Acad Sciences},
volume = 96,
number = 18,
pages = {10284--10289}
}
..and then of course LaTeX gets angry! Now that I know I'll just keep an eye on suspicious "$", I only wanted to let you know.
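One way to avoid this would be to escape special characters only outside inline math. The Python sketch below is an assumption about how that could work, not bibtex-tidy's implementation: it splits the value on `$...$` spans, leaves those untouched, and escapes the rest. The set of escaped characters is chosen for illustration, and display math written as `$$...$$` is not handled.

```python
import re

# Characters escaped outside math; the exact set is an assumption.
SPECIALS = re.compile(r"(?<!\\)([%&#_$])")
# An unescaped "$", anything but "$", then the closing "$".
INLINE_MATH = re.compile(r"((?<!\\)\$[^$]*\$)")


def escape_outside_math(value):
    """Escape LaTeX specials everywhere except inside $...$ spans."""
    parts = INLINE_MATH.split(value)
    escaped = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            escaped.append(part)  # odd positions are the captured math spans
        else:
            escaped.append(SPECIALS.sub(r"\\\1", part))
    return "".join(escaped)


print(escape_outside_math("50% of $x_1$ & more"))
```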
It's the only flaw I found, besides that it's perfect!
Your very nice tool didn't produce the result I was looking for. I have duplicated entries with different IDs and duplicated entries with the same IDs. What I want is to merge all duplicated entries with the same IDs, but keep the other duplicates. (This is because in a large LaTeX document, I have references to the different kinds of IDs, so I have to keep all of them.) It would be nice to add such an option.
Journal volumes are often in Roman numerals (e.g. VII). However, when the option to remove ALLCAPS is enabled, it rewrites these as Vii instead of leaving them as VII.
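A guard for this case could look roughly like the Python sketch below, which title-cases an all-caps value but leaves words that parse as Roman numerals alone. It is an illustration only; the function names are hypothetical and the real option may behave differently. Ordinary words that happen to be valid numerals (for example MIX) would also be kept, and acronyms such as IEEE cannot be recognised by pattern, so they would need an explicit exclusion list.

```python
import re

# Standard pattern for Roman numerals up to 3999.
ROMAN = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")


def is_roman(word):
    """True for a non-empty string that is a well-formed Roman numeral."""
    return bool(word) and ROMAN.fullmatch(word) is not None


def drop_all_caps(value):
    """Title-case an all-caps value, keeping Roman numerals untouched."""
    if not value.isupper():
        return value  # mixed-case values are left as they are

    def fix(match):
        word = match.group(0)
        return word if is_roman(word) else word.capitalize()

    return re.sub(r"[A-Z]+", fix, value)


print(drop_all_caps("VII"))
```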
The link for the options at the bottom of the README file is broken.
Thanks for the tool.
With this input
@misc{TheKey,
title = {{The \textsf{Secret}}},
}
and these options
bibtex-tidy --curly --space=2 --align=13 --no-escape --no-tidy-comments --no-remove-dupe-fields --no-lowercase --enclosing-braces=title
the online bibtex-tidy outputs
@misc{TheKey,
title = {{The \textsfSecret}}
}
I expected no change, but the "inner" curly braces were stripped.
Is this a bug or is my expectation wrong?
It would be great to have an option that deletes empty fields.
For instance,
@inproceedings{fran2017,
author = {P. D. Francesco},
booktitle = {2017 IEEE International Conference on Software Architecture Workshops (ICSAW)},
title = {Architecting Microservices},
year = {2017},
volume = {},
number = {},
pages = {224--229},
issn = {},
month = {April}
}
will be changed to
@inproceedings{fran2017,
author = {P. D. Francesco},
booktitle = {2017 IEEE International Conference on Software Architecture Workshops (ICSAW)},
title = {Architecting Microservices},
year = {2017},
pages = {224--229},
month = {April}
}
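The requested behaviour is easy to state in code. The Python sketch below assumes an entry's fields are available as a dict of strings, which is a simplification for illustration and not bibtex-tidy's internal representation.

```python
def remove_empty_fields(fields):
    """Drop fields whose value is empty or only whitespace."""
    return {name: value for name, value in fields.items() if value.strip()}


fran2017 = {
    "author": "P. D. Francesco",
    "title": "Architecting Microservices",
    "year": "2017",
    "volume": "",
    "number": "",
    "pages": "224--229",
    "issn": "",
    "month": "April",
}
print(remove_empty_fields(fran2017))
```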
First of all, nice project!
I have found an issue while tidying my BibTeX file. When checking the "Drop all caps" flag, publisher/institute/school tags composed only of acronyms (e.g. IEEE, UFRJ) become capitalized (e.g. Ieee, Ufrj). I'm not sure how you could deal with this, perhaps by adding an option to exclude those tags?
Throws an error on comment lines; that is, lines starting with a semicolon.
BibTeX Tidy attempts to escape some characters (here: the underscore _ by \_) in URLs, which then breaks the URL. I encountered this problem with Biber. Biber allows _ and fails with \_, though BibTeX and BibLaTeX may behave in exactly the opposite way.
As suggested here, underscores should rather be represented by their percent-encoding %5F, which works fine for me in Biber (though maybe someone should also test this with BibTeX and BibLaTeX).
In general, using percent-encodings in URLs should always be supported (see here). I guess this would be a good strategy for all special characters in URL fields, not only underscores.
Example:
@book{mybook,
title = {My Book},
author = {John Doe},
url = {https://en.wikipedia.org/wiki/Underscore_(disambiguation)}
}
Tidied-up example:
@book{mybook,
title = {My Book},
author = {John Doe},
url = {https://en.wikipedia.org/wiki/Underscore\_(disambiguation)}
}
Suggested correction:
@book{mybook,
title = {My Book},
author = {John Doe},
url = {https://en.wikipedia.org/wiki/Underscore%5F(disambiguation)}
}
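The suggested correction amounts to a small transformation of the url field. The Python sketch below is illustrative only (the function name is hypothetical): it first undoes any LaTeX-style `\_` escape and then percent-encodes every underscore.

```python
def encode_url_underscores(url):
    """Undo LaTeX escaping of underscores, then percent-encode them as %5F."""
    return url.replace("\\_", "_").replace("_", "%5F")


print(encode_url_underscores(
    "https://en.wikipedia.org/wiki/Underscore_(disambiguation)"))
```

Applying the function to an already encoded URL leaves it unchanged, so it is safe to run on every tidy pass.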
Hi!
It is a wonderful tool for tidying up bib files. But while checking for duplicates, I noticed that two unique entries in my file were erroneously merged into one. The entries are listed here for your reference.
@Article{Raku2,
title = {Focusing of a vortex carrying beam with Gaussian background by a lens in the presence of spherical aberration and defocusing},
author = {R.K.Singh and P.Senthilkumaran and K. Singh},
year = 2007,
journal = {Optics and Lasers in Engg.,},
volume = 45,
pages = {773--782},
date-added = {2020-05-09 00:08:30 +0530},
date-modified = {2020-05-09 00:10:16 +0530}
}
@Article{Raku10,
title = {Focusing of a vortex carrying beam with Gaussian background by an apertured system in presence of coma},
author = {R.K.Singh and P.Senthilkumaran and K. Singh},
year = 2008,
journal = {Opt.Commun.},
volume = 281,
pages = {923--934},
date-added = {2020-05-09 00:13:37 +0530},
date-modified = {2020-05-09 00:14:50 +0530}
}
Notice that the titles of the articles are quite similar, but they are in different journals. Is it possible to check for duplicates based on journal and year criteria as well?
Feature Request - Put titles in double curly brackets {{ }} to keep capital letters.
Hey there!
I'd like to report a security issue but cannot find contact instructions on your repository.
If not a hassle, might you kindly add a SECURITY.md file with an email, or another contact method? GitHub recommends this best practice to ensure security issues are responsibly disclosed, and it would serve as a simple instruction for security researchers in the future.
Thank you for your consideration, and I look forward to hearing from you!
(cc @huntr-helper)
Sometimes, there are entries that have more than double curly braces. Cleaning those up would be nice :)
MWE:
@article{Zyngier2012,
title = {{{{UOPSS : A New Paradigm for Modeling Production Planning {\&} Scheduling Systems}}}},
author = {Zyngier, Danielle and Kelly, Jeffrey D},
year = 2012,
journal = {Symposium on Computer Aided Process Engineering},
number = {June},
pages = {17--20},
keywords = {decision-making,modeling,optimization,planning,scheduling}
}
(Added several curly braces by mistake when hitting tidy several times.)
In #37 and #38 it was suggested to use cat references.bib | bibtex-tidy --quiet -
for reading from stdin and outputting to stdout. But I'm getting
node:internal/fs/utils:344
throw err;
^
Error: ENOENT: no such file or directory, open '-'
at Object.openSync (node:fs:585:3)
at readFileSync (node:fs:453:35)
at start (/usr/lib/node_modules/bibtex-tidy/bin/bibtex-tidy:5450:47)
at Object.<anonymous> (/usr/lib/node_modules/bibtex-tidy/bin/bibtex-tidy:5466:1)
at Module._compile (node:internal/modules/cjs/loader:1101:14)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1153:10)
at Module.load (node:internal/modules/cjs/loader:981:32)
at Function.Module._load (node:internal/modules/cjs/loader:822:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:79:12)
at node:internal/main/run_main_module:17:47 {
errno: -2,
syscall: 'open',
code: 'ENOENT',
path: '-'
}
even though the file exists. Is this still implemented, and if so, how do I use it?
@FlamingTempura @mildblimp
I couldn't find anything about it in documentation.
I'm using v1.7.1.
When the "Enclose in double braces" option is active, e.g., for the title field, and the first or last part of the field is escaped with curly braces, tidying up keeps adding curly braces.
To reproduce, tidy up this entry with the "Enclose in double braces" option active for the title field:
@article{munk2020model,
title = {{{M}odel-{B}ased {S}afety {A}ssessment with {SysML} and {C}omponent {F}ault {T}rees: {A}pplication and {L}essons {L}earned}},
author = {Munk, Peter and Nordmann, Arne},
year = 2020,
journal = {Softw Syst Model},
volume = 19,
pages = {889--910},
doi = {10.1007/s10270-020-00782-w}
}
Expected:
Number of braces is unchanged by the tidy operation.
Observed:
It keeps adding braces on each tidy operation.
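An idempotent version of the operation has to tell a value that is wrapped as a whole from one that merely starts and ends with a brace, as in `{M}odel ... {L}earned`. The Python sketch below shows one way to do that; it is an illustration under the assumption that the function receives the text between the field's own delimiters, and the function names are hypothetical.

```python
def is_wrapped(value):
    """True if the first "{" is closed by the very last character."""
    if len(value) < 2 or value[0] != "{" or value[-1] != "}":
        return False
    depth = 0
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\":
            i += 2  # skip escaped characters such as \{ or \&
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i == len(value) - 1
        i += 1
    return False


def enclose_once(value):
    """Strip every fully enclosing brace pair, then add exactly one."""
    inner = value
    while is_wrapped(inner):
        inner = inner[1:-1]
    return "{" + inner + "}"


title = "{M}odel-{B}ased {S}afety {A}ssessment with {SysML}: {L}essons {L}earned"
once = enclose_once(title)
print(once == enclose_once(once))
```

Because surplus layers are stripped before wrapping, the same function would also collapse values that have accumulated extra braces from repeated runs.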
Hi,
It would be nice if the website showed the command-line call that reproduces the last result. I'm thinking of giving the user a single command to run in their terminal, so that next time they can, like me, incorporate bibtex-tidy into their workflow.
Another feature request I have in mind concerns an entry's key. A few options I would find convenient are: drop all caps, and generate the key, although the latter is much more complex.
Thanks,
Great job!
Hi, does this package validate a bibtex file and point to the potential errors? Thanks!
Is there a way to sort the bib entries in descending order in the web-version?
I sort the bib entries according to the year. In this case, an option to sort in the descending order (reverse chronological order) would be beneficial.
Hi! I recently discovered this tool and I'm extremely happy with it. I noticed a bit of unexpected behavior when sorting by year/month/day. The months are being sorted alphabetically and not by the calendar. Strangely, this alphabetical order still happens if I specify the months as 10, 3, and 4 instead of oct, mar, and apr. I tried to find the place in the source code where sorting occurs but didn't see it from a quick search. If someone can give me a pointer on where this code is located, I could try to make a PR. Also happy to review a PR if someone else makes it.
@book{impossible2,
title = {The other impossible book},
author = {Stefan Sweig},
year = 1942,
month = oct,
publisher = {Dead Poet Society}
}
@book{impossible,
title = {The impossible book},
author = {Stefan Sweig},
year = 1942,
month = mar,
publisher = {Dead Poet Society}
}
@book{sweig42,
title = {The impossible book},
author = {Stefa{n} Sweig},
year = 1942,
month = apr,
publisher = {Dead Poet Society}
}
This command shows the sorting flags I'm using. You can flip -month to month to swap between A-Z months and Z-A months.
bibtex-tidy --curly --numeric --space=2 --align=0 --sort=-year,-month,-day YOUR_FILE.bib
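Calendar ordering needs a sort key that maps a month value to its number before comparing. The Python sketch below illustrates the idea and is not taken from the bibtex-tidy source; the helper name and the dict layout are assumptions. It accepts three-letter abbreviations, full English names, and numeric strings, and maps anything else to 0.

```python
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def month_number(month):
    """Map "oct", "October" or "10" to 10; unknown values map to 0."""
    text = str(month).strip().lower()
    if text.isdigit():
        return int(text)  # numeric months compare as numbers, not as text
    abbrev = text[:3]
    return MONTHS.index(abbrev) + 1 if abbrev in MONTHS else 0


entries = [
    {"key": "impossible2", "year": 1942, "month": "oct"},
    {"key": "impossible", "year": 1942, "month": "mar"},
    {"key": "sweig42", "year": 1942, "month": "apr"},
]

# Equivalent in spirit to --sort=-year,-month: newest first.
newest_first = sorted(
    entries,
    key=lambda e: (e["year"], month_number(e["month"])),
    reverse=True,
)
print([e["key"] for e in newest_first])
```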
It is common for certain BibItems to contain "AND" instead of "and". It would be worth having an option that ensures "and" is printed instead of "AND".
Thank you in advance for considering this suggestion.
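Such a normalization could be a single substitution, as in the Python sketch below. This is an illustration, not an existing option: the function name is hypothetical, and the sketch would also rewrite an "And" that appears inside a braced corporate author, which a real implementation would need to skip. It collapses extra whitespace around the separator as a side effect.

```python
import re


def normalize_author_separators(authors):
    """Rewrite any-case "and" separators, with surrounding whitespace, as " and "."""
    return re.sub(r"\s+and\s+", " and ", authors.strip(), flags=re.IGNORECASE)


print(normalize_author_separators("M. Born AND  J. R. Oppenheimer"))
```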
I work with an automatic formatter which needs the file to be outputted to stdout. Expected behaviour:
File bibliography.bib:
@Book{sweig42,
Author = { Stefa{n} Sweig },
title = { The impossible book },
publisher = { Dead Poet Society},
year = 1942,
month = mar
}
bibtex-tidy --quiet --no-in-place bibliography.bib
Output:
@book{sweig42,
title = {The impossible book},
author = {Stefa{n} Sweig},
year = 1942,
month = mar,
publisher = {Dead Poet Society}
}
Or, in many other tools, you can do something like cat test.py | black --quiet -, which reads the file from stdin and outputs the formatted file to stdout.
For some reason I cannot run the npm installer on the current machine. If I clone the GitHub version, how do I install it on a Linux system?
This is not a bug, but it would be useful to have an option within the "VALUES" block that can remove all braces from BibTeX fields. For example, the BibItem
@article{11Mellau,
title = {Highly excited rovibrational states of {HNC}},
author = {G. Ch. {M}ellau},
year = {2011},
journal = {J. Mol. Spectrosc.},
volume = {269},
pages = {77}
}
should be modified to
@article{11Mellau,
title = {Highly excited rovibrational states of HNC},
author = {G. Ch. Mellau},
year = {2011},
journal = {J. Mol. Spectrosc.},
volume = {269},
pages = {77}
}
It would also help if one could control which fields are needed to be cleaned in terms of extra braces.
Perhaps those braces which are part of math expressions (that is, surrounded with $$), should be retained.
Thank you in advance for considering this option.
I got the following error while cloning the repo:
error: invalid path 'test/bibliographies/better-bibtex/Options to use default import process? #1562.bib'
Cannot enclose values such as year and volume in braces
I remember an earlier version of the bibtex-tidy online website and I really liked it to clean up messy bibtex files. However, the current version seems not to work: clicking 'Tidy' at https://flamingtempura.github.io/bibtex-tidy/ doesn't do anything.
Firefox Developer Console gives the following two Javascript errors (on page load; with no new errors when clicking the 'Tidy' button).
SyntaxError: invalid regular expression flag s bibtex-tidy.js:2207:46
ReferenceError: bibtexTidy is not defined main.js:53:12
<anonymous> https://flamingtempura.github.io/bibtex-tidy/main.js:53
(Latest Firefox 74.0.1. The problem also occurs with all add-ons disabled.)
Under BibTeX, it is allowed to use two different forms of author list:
(1) author = {M. Born and J. R. Oppenheimer}
and
(2) author = {Born, M. and Oppenheimer, J. R. }
It would be useful to have an option which provides a standard form for printing the author list, say (1).
Another problem could be the abbreviation of the first names of the authors. Users would be thankful if BibTeX Tidy had such an option.
Thank you for considering these suggestions.
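Converting form (2) to form (1) can be sketched in Python as below. This is a simplified illustration rather than a proposal for the actual implementation: the function name is hypothetical, and names with a "Jr" part (two commas) or with "and" inside braces are left as they are or not handled.

```python
import re


def to_first_last(authors):
    """Rewrite "Last, First" names as "First Last"; other names pass through."""
    names = []
    for name in re.split(r"\s+and\s+", authors.strip()):
        parts = [part.strip() for part in name.split(",")]
        if len(parts) == 2:
            name = f"{parts[1]} {parts[0]}"  # swap around the single comma
        names.append(name.strip())
    return " and ".join(names)


print(to_first_last("Born, M. and Oppenheimer, J. R. "))
```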
I'd like to remove the url field on all books, but keep it on online sources.
Is this possible already?
When drawing bib files from multiple sources you can end up with non-standard BibTeX keys. It would be great to be able to standardise them automatically.
I installed bibtex-tidy with npm as indicated on Ubuntu 20.04, and when I run it I get the following error:
var OPTIONS = new Set(optionDefinitions.flatMap((def) => Object.keys(def.cli)));
^
TypeError: optionDefinitions.flatMap is not a function
at Object.<anonymous> (/usr/local/lib/node_modules/bibtex-tidy/bin/bibtex-tidy:5342:41)
at Module._compile (internal/modules/cjs/loader.js:778:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
at Module.load (internal/modules/cjs/loader.js:653:32)
at tryModuleLoad (internal/modules/cjs/loader.js:593:12)
at Function.Module._load (internal/modules/cjs/loader.js:585:3)
at Function.Module.runMain (internal/modules/cjs/loader.js:831:12)
at startup (internal/bootstrap/node.js:283:19)
at bootstrapNodeJSCore (internal/bootstrap/node.js:623:3)
The command I'm running it with is simply bibtex-tidy refs.bib. That file exists in the current directory.
I really like this project. It helps me a lot.
It can report duplicate entry keys to me, but I have to remove them myself.
Maybe we can add a button to remove the duplicated entries with one click.
Hi! Thank you for the project @FlamingTempura!
Are you open to making an option (e.g. --no-backup) to support not creating a .original file when using the script?
Also, are you open to having a --quiet mode that doesn't output to stdout/stderr?
For context: I'm using this as a formatter in vim (link, link 2 for removing .original file)
Thanks very much for the very convenient tools.
FWIW, it seems that when I chose "generating bibtex keys", sometimes there could be some errors. E.g., for the following entry
@inproceedings{aristidou2008predicting,
title = {
Predicting Missing Markers to Drive Real-Time Centre of Rotation
Estimation
},
author = {Aristidou, Andreas and Cameron, Jonathan and Lasenby, Joan},
year = 2008,
booktitle = {
AMDO '08: Proceedings of the 5th international conference on Articulated
Motion and Deformable Objects
},
location = {Port d'Andratx, Mallorca, Spain},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
pages = {238--247},
doi = {http://dx.doi.org/10.1007/978-3-540-70517-8_23},
isbn = {978-3-540-70516-1}
}
After generating the bibtex keys, it becomes
@inproceedings{aristidou2008{
predicting missing markers to drive real-time centre of rotation
estimation
},
title = {
Predicting Missing Markers to Drive Real-Time Centre of Rotation
Estimation
},
author = {Aristidou, Andreas and Cameron, Jonathan and Lasenby, Joan},
year = 2008,
booktitle = {
AMDO '08: Proceedings of the 5th international conference on Articulated
Motion and Deformable Objects
},
location = {Port d'Andratx, Mallorca, Spain},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
pages = {238--247},
doi = {http://dx.doi.org/10.1007/978-3-540-70517-8_23},
isbn = {978-3-540-70516-1}
}
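The broken key suggests the title was used with its leading brace and line breaks intact. The Python sketch below shows a key generator that extracts words from the title before building the key. The template (first author's last name, year, first title word) is an assumption inferred from the original key in this report, not the documented bibtex-tidy template, and the function name is hypothetical.

```python
import re


def generate_key(fields):
    """Build a key like "aristidou2008predicting" from author, year and title."""
    first_author = re.split(r"\s+and\s+", fields["author"].strip())[0]
    if "," in first_author:
        last = first_author.split(",")[0]  # "Last, First" form
    else:
        last = first_author.split()[-1]    # "First Last" form
    # Pull out words only, so braces and newlines in the title are ignored.
    title_words = re.findall(r"[A-Za-z0-9]+", fields["title"])
    first_word = title_words[0] if title_words else ""
    key = f"{last}{fields['year']}{first_word}".lower()
    return re.sub(r"[^a-z0-9]", "", key)


entry = {
    "author": "Aristidou, Andreas and Cameron, Jonathan and Lasenby, Joan",
    "year": 2008,
    "title": "\n  Predicting Missing Markers to Drive Real-Time Centre of Rotation\n  Estimation\n",
}
print(generate_key(entry))
```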
Thanks so much!
Hi! Firstly, thanks for setting up this linter! it's great!
I was wondering if it might be possible for you to add a feature to reduce the number of authors for bib entries. For example, the Event Horizon M87 paper and the LIGO papers have too many authors.
Maybe using something like this: https://gist.github.com/zimmerst/9cb2ccad69b5f55a0a222c01b1d8e183
Perhaps it would be practical to have an option that adds one extra space before and after a web link in a BibItem. This would make it possible to jump to the web page from a text editor that otherwise treats the adjacent braces as part of the link.
Furthermore, it would also be worth having a button that surrounds a link with the \url{} command. These options should work for all web links, not only those specified within the "url" field.
Thank you in advance for considering these options.
As far as I know, @STRING is an accepted form of syntax in BibTeX.
However, bibtex-tidy does not seem to support it, as e.g.
@string{Aubert={Aubert, Clément}}
@string{Varacca={Varacca, Daniele}}
@Inproceedings{Aubert2021h,
author=aubert#{ and }#Varacca,
title = {Processes, Systems \& Tests: Defining Contextual Equivalences},
pages = {1-21},
doi = {10.4204/EPTCS.347.1},
}
returns
There's a problem with the bibtex (Syntax Error)
…
Unexpected "a" in concat.
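BibTeX is generally understood to treat macro names case-insensitively, so `aubert` would refer to the `Aubert` macro defined above. The Python sketch below illustrates resolving such a concatenation with a case-insensitive lookup. It is not bibtex-tidy's parser: the function names are hypothetical, and only braced literals, numbers, and macro names are handled (no double-quoted strings).

```python
def split_concat(expression):
    """Split on "#" that is not inside braces."""
    parts, depth, current = [], 0, ""
    for ch in expression:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "#" and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    parts.append(current.strip())
    return parts


def resolve_concat(expression, macros):
    """Expand macros (matched case-insensitively) and join the pieces."""
    table = {name.lower(): value for name, value in macros.items()}
    pieces = []
    for token in split_concat(expression):
        if token.startswith("{") and token.endswith("}"):
            pieces.append(token[1:-1])          # braced literal
        elif token.isdigit():
            pieces.append(token)                # bare number
        else:
            pieces.append(table[token.lower()])  # KeyError if undefined
    return "".join(pieces)


macros = {"Aubert": "Aubert, Clément", "Varacca": "Varacca, Daniele"}
print(resolve_concat("aubert#{ and }#Varacca", macros))
```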
I installed via npm and I get this
$ bibtex-tidy myrefs.bib
/usr/local/lib/node_modules/bibtex-tidy/bin/bibtex-tidy:5342
var OPTIONS = new Set(optionDefinitions.flatMap((def) => Object.keys(def.cli)));
^
TypeError: optionDefinitions.flatMap is not a function
at Object.<anonymous> (/usr/local/lib/node_modules/bibtex-tidy/bin/bibtex-tidy:5342:41)
at Module._compile (module.js:652:30)
at Object.Module._extensions..js (module.js:663:10)
at Module.load (module.js:565:32)
at tryModuleLoad (module.js:505:12)
at Function.Module._load (module.js:497:3)
at Function.Module.runMain (module.js:693:10)
at startup (bootstrap_node.js:188:16)
at bootstrap_node.js:609:3
I suppose it may be an issue with my java settings, any hint?
Installing bibtex-tidy using npm on Fedora 32 and running bibtex-tidy --help results in a No option "--help" error, although bibtex-tidy -h shows --help as an option (and not -h).