inveniosoftware / invenio
Invenio digital library framework
Home Page: https://invenio.readthedocs.io
License: MIT License
Originally on 2010-04-21
≍≬☠☮☻♥♾
Originally on 2010-05-18
In bibindex_engine, we find 8 occurrences of 'raise StandardError' without a message. If such an exception is not caught and handled, it leads to useless e-mails being sent to the admin: they do not contain the cause of the exception and thus prevent the admin from correctly fixing the problem.
Here are the lines causing problems:
1245: raise StandardError
1311: raise StandardError
1486: raise StandardError
1510: raise StandardError
1542: raise StandardError
1565: raise StandardError
1605: raise StandardError
1629: raise StandardError
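A minimal sketch of the suggested fix (the helper and message text below are illustrative, not actual bibindex_engine code): each raise should carry a description of the failure, so that the admin alert e-mails are actionable.

```python
# Hedged sketch of the suggested fix (Python 2 style, as in Invenio at
# the time; the helper and message are illustrative assumptions).
try:
    StandardError
except NameError:  # Python 3 removed StandardError
    StandardError = Exception

def index_term(term, recid):
    """Fail with context instead of a bare 'raise StandardError'."""
    if not term:
        # Before: raise StandardError
        # After: say what failed and for which record, so the admin
        # alert e-mail identifies the cause.
        raise StandardError(
            "bibindex_engine: cannot index empty term for recID=%s" % recid)
    return term.lower()
```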
Originally on 2010-05-04
======================================================================
FAIL: bibcirculation - availability of your loans page
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/invenio/bibcirculation_regression_tests.py", line 44, in test_your_loans_page_availability
self.fail(merge_error_messages(error_messages))
AssertionError:
*** ERROR: Page http://pcuds33.cern.ch/yourloans/ (login guest) not accessible. HTTP Error 500: Internal Server Error
Originally on 2010-04-22
Creating an icon is essentially a conversion from one format to another, so this is a good use case for a high-level plugin to add to the WebSubmit Converter Tools library.
Originally on 2010-05-05
The action Run batchuploader is currently not included in the Administration menu.
Originally on 2010-05-04
======================================================================
ERROR: bibdocfile - BibDocs functions
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/invenio/bibdocfile_regression_tests.py", line 165, in test_BibDocs
my_new_bibdoc.add_icon( CFG_PREFIX + '/lib/webtest/invenio/icon-test.gif', basename=None, format=None)
TypeError: add_icon() got an unexpected keyword argument 'basename'
Originally on 2010-05-04
======================================================================
ERROR: bibconvert - availability of BibConvert Admin Guide parts
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/invenio/bibconvert_regression_tests.py", line 43, in test_availability_bibconvert_admin_guide_parts
test_web_page_existence(CFG_SITE_URL + '/admin/bibconvert/bibtex.cfg')
File "/usr/lib/python2.5/site-packages/invenio/testutils.py", line 146, in test_web_page_existence
browser.open(url)
File "/usr/lib/python2.5/site-packages/mechanize/_mechanize.py", line 209, in open
return self._mech_open(url, data, timeout=timeout)
File "/usr/lib/python2.5/site-packages/mechanize/_mechanize.py", line 261, in _mech_open
raise response
httperror_seek_wrapper: HTTP Error 404: Not Found
BTW, I'll take care of this when merging mpp->wsgi branches.
Originally on 2010-05-05
The batchuploader facility output messages should be prettified.
E.g. try to submit a document for insertion vs. a document for
revision: the mode works, but the output message is misleading.
E.g. try to submit a MARCXML file for which you don't have proper
collection rights (the demo user Dorian can submit only Books files):
the check works, but if he does not have the rights, the general
output message from the access control system is misleading.
Originally on 2010-05-18
BibClassify is very slow when loading its cache.
The cache contains regexes (and many of them), and Python recompiles the regexes during load: http://stackoverflow.com/questions/65266/caching-compiled-regex-objects-in-python
Since we have a much bigger taxonomy now, I need to investigate ways to speed up this load
(the cache is loaded only once, as this profile shows):
Mon May 17 15:25:57 2010 /opt/cds-invenio/var/tmp/invenio-profile-stats-20100517152447.raw
10719206 function calls (10505840 primitive calls) in 69.884 CPU seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 69.884 69.884 webinterface_handler.py:330(_handler)
2/1 0.000 0.000 69.883 69.883 webinterface_handler.py:171(_traverse)
1 0.000 0.000 69.878 69.878 websearch_webinterface.py:418(call)
1 0.000 0.000 69.874 69.874 search_engine.py:3986(perform_request_search)
1 0.000 0.000 69.762 69.762 search_engine.py:3214(print_records)
1 0.000 0.000 69.750 69.750 bibclassify_webinterface.py:65(main_page)
1 0.000 0.000 69.734 69.734 bibclassify_webinterface.py:189(generate_keywords)
1 0.000 0.000 69.720 69.720 bibclassify_engine.py:141(get_keywords_from_local_file)
1 0.000 0.000 69.637 69.637 bibclassify_engine.py:165(get_keywords_from_text)
1 0.000 0.000 56.950 56.950 bibclassify_ontology_reader.py:69(get_regular_expressions)
1 0.003 0.003 56.948 56.948 bibclassify_ontology_reader.py:572(_get_cache)
1 1.354 1.354 56.945 56.945 {cPickle.load}
17752 0.527 0.000 55.818 0.003 re.py:227(_compile)
17701 0.465 0.000 55.023 0.003 sre_compile.py:501(compile)
17701 0.325 0.000 31.684 0.002 sre_parse.py:669(parse)
28313/17701 0.743 0.000 30.786 0.002 sre_parse.py:307(_parse_sub)
34424/17701 8.570 0.000 30.433 0.002 sre_parse.py:385(_parse)
17701 0.197 0.000 22.618 0.001 sre_compile.py:486(_code)
83568/17701 5.385 0.000 17.814 0.001 sre_compile.py:38(_compile)
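The recompilation happens because pickling a compiled regex stores only its pattern source, so unpickling compiles it again. One way around this, sketched below under assumptions (the class name is illustrative, not the BibClassify API), is to cache only the pattern source and compile lazily on first use:

```python
import re

class LazyPattern(object):
    """Hedged sketch: store only the pattern source in the cached
    taxonomy and compile on first use, so unpickling the cache no
    longer recompiles thousands of regexes up front."""

    def __init__(self, source, flags=0):
        self.source = source
        self.flags = flags
        self._compiled = None  # compiled lazily, not at unpickle time

    def __getstate__(self):
        # Pickle only the cheap parts, never the compiled object.
        return (self.source, self.flags)

    def __setstate__(self, state):
        self.source, self.flags = state
        self._compiled = None

    def search(self, text):
        if self._compiled is None:
            self._compiled = re.compile(self.source, self.flags)
        return self._compiled.search(text)
```

With this, loading the cache stays cheap and only the patterns that are actually used against a given document pay the compilation cost.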
Originally on 2010-04-30
Since submitted comments are washed from potentially unsafe attributes and tags, it does not make sense to offer the 'TextColor' item in the toolbar displayed for WebComment input form (as the "style" attribute will be washed away). This item should be removed from the default settings.
Behaviour of other toolbar items should also be checked.
Originally on 2010-05-11
By default, in the HTML brief format on the search results pages, we
are showing only a limited number of snippets, governed by the
CFG_WEBSEARCH_FULLTEXT_SNIPPETS configuration variable.
The snippet box on the search results page should offer a link in the
bottom-right entitled e.g. `Show more' that the user could click upon
to see more snippets. This link would lead the user to the Fulltext
tab on the detailed record page for the given record. This page would
offer on the right-hand side a box entitled `Find inside' that would
provide a snippet grepping facility with more snippets to show. It
would be kind of similar to `Search in this book' of Google Books.
Here is an ASCII art mock-up of the detailed record page tab:
Information | References (34) | Citations (124) | Fulltext (3)
Main file(s): Find inside:
thesis [neutralino_______] [FIND]
version 1
thesis.pdf [247.7 KB] 04 May 2010 1. ... supersymmetric neutralino ...
2. ... 238 GeV neutralinos ...
20. ... neutralino dark matter ...
[view next 20 snippets]
Originally on 2010-05-04
======================================================================
FAIL: websearch - query ellis, citation summary output format
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/invenio/websearch_regression_tests.py", line 1420, in test_ellis_citation_summary
expected_link_label='1'))
AssertionError: [] != ['ERROR: Page http://pcuds33.cern.ch/search?p=ellis&of=hcs (login guest) led to an error: ERROR: Page http://pcuds33.cern.ch/search?p=ellis&of=hcs (login guest) does not contain link to http://pcuds33.cern.ch/search?p=ellis%20cited%3A1-%3E9&rm=citation entitled 1..']
Originally on 2010-05-11
The full-text snippet configuration needs to use the number of
characters, not the number of words. We are allowed to show say 100
characters around the pattern, rounded to the closest word outside of
these 100 characters. So we need to replace
CFG_WEBSEARCH_FULLTEXT_SNIPPETS_WORDS configuration variables with
character counting before v1.0 is out, in order to stabilize the
config file.
Moreover, the length of the snippet depends on the full-text file
provenance. The provenance is currently stored as the bibdoc type, so
this has to be analyzed when snippets are generated. The full-text
snippet configuration should then look almost like a dictionary:
CFG_WEBSEARCH_FULLTEXT_SNIPPETS_CHARS = {
'arXiv': 200,
'Springer': 180,
'APS': 100,
}
Even the number of snippets to show can perhaps vary per source, so
it may be perhaps good to store it in the configuration as well, e.g.
(50, 200) would mean we are able to show up to 50 snippets containing
up to 200 characters.
The configuration variable that determines how many snippets are
shown per record in the HTML brief output format on the search results
pages can probably stay source independent.
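The character-based cutting rule described above can be sketched as follows (the function name is illustrative, not the Invenio API): take the configured number of characters around the match, then round outwards to the closest word boundaries so no word is cut in half.

```python
def extract_snippet(text, pattern, nchars=100):
    """Hedged sketch of character-based snippet cutting: take about
    nchars characters around the first match of pattern, extended
    outwards to the nearest whitespace so words stay whole."""
    pos = text.lower().find(pattern.lower())
    if pos == -1:
        return ""
    half = nchars // 2
    start = max(0, pos - half)
    end = min(len(text), pos + len(pattern) + half)
    # Round outwards to the closest word boundaries.
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    while end < len(text) and not text[end].isspace():
        end += 1
    return text[start:end]
```

The per-provenance character budget would then simply be looked up in the CFG_WEBSEARCH_FULLTEXT_SNIPPETS_CHARS dictionary before calling such a helper.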
Originally on 2010-05-11
Running 'make kwalitee-check-sql-queries' reveals, among others, use of escape_string in generation of queries in webstat_engine.py.
** SQL queries using charset-ignorant escape_string():
...
./modules/webstat/lib/webstat.py:33:from invenio.dbquery import run_sql, escape_string
./modules/webstat/lib/webstat.py:174: arg = escape_string(argument)
./modules/webstat/lib/webstat_engine.py:25:from invenio.dbquery import run_sql, escape_string
./modules/webstat/lib/webstat_engine.py:259: sql_query.append("AND `%s`" % escape_string(col_title))
./modules/webstat/lib/webstat_engine.py:261: sql_query.append("OR `%s`" % escape_string(col_title))
./modules/webstat/lib/webstat_engine.py:263: sql_query.append("AND NOT `%s`" % escape_string(col_title))
./modules/webstat/lib/webstat_engine.py:317: sql_query.append("AND `%s`" % escape_string(col_title))
./modules/webstat/lib/webstat_engine.py:319: sql_query.append("OR `%s`" % escape_string(col_title))
./modules/webstat/lib/webstat_engine.py:321: sql_query.append("AND NOT `%s`" % escape_string(col_title))
...
This should be cleaned.
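Note that col_title here is used as a column identifier between backticks, and identifiers cannot be bound as run_sql() placeholders. A hedged sketch of a cleanup (the helper names are illustrative, not the actual dbquery API) is to strictly validate the identifier instead of escaping it, while values continue to go through placeholders:

```python
import re

# Only allow characters that can safely appear in a column title;
# the exact character set is an assumption for this sketch.
_IDENT_RE = re.compile(r'^[A-Za-z0-9_ ]+$')

def wash_table_column_name(name):
    """Reject any column name that could break out of backticks,
    instead of passing it through the charset-ignorant escape_string()."""
    if not _IDENT_RE.match(name):
        raise ValueError("forbidden characters in column name: %r" % name)
    return name

def build_condition(col_title, connector="AND"):
    # Only the validated identifier is interpolated; any values in the
    # full query would still use run_sql() %s placeholders.
    return "%s `%s`" % (connector, wash_table_column_name(col_title))
```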
Originally on 2010-05-08
Many places in search_engine catch Exception when they should be more specific about the errors that they catch. This not only helps the reader understand how the try-wrapped call could fail; it also means that unexpected exception types still get raised up the stack, so we see more kinds of errors and can fix them.
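The suggested pattern can be sketched as follows (function names are illustrative, not actual search_engine code): catch only the failures the call can plausibly raise, and let anything unexpected propagate so it stays visible.

```python
def get_cached_collection(cache, name):
    """Before: 'except Exception: return None' would hide real bugs.
    After: handle only the one failure we expect."""
    try:
        return cache[name]
    except KeyError:
        return None

def parse_record_id(value):
    """Expected failure modes of int() are named; a genuine programming
    error elsewhere would still surface as a traceback."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None
```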
Originally on 2010-04-20
test
Originally on 2010-04-20
This is a Test to see how newly opened tickets show up in RSS/feeds
Originally by [email protected] on 2009-10-22
test
Originally by [email protected] on 2009-10-22
test
Originally on 2010-04-22
Currently, recently merged converter tools are implemented as a set of hardcoded algorithms in the websubmit_converter_tools.py module. This is not directly extensible by Invenio admins.
The library should be refactored to use plugins.
Originally on 2010-04-27
The citation dictionary is cached inside each WSGI Invenio daemon
process for speed purposes. It looks like this: (for the demo site)
{18: [96],
74: [92],
77: [85, 86],
78: [79, 91],
79: [91],
81: [82, 83, 87, 89],
84: [85, 88, 91],
91: [92],
94: [80],
95: [77, 86]}
For bigger sites containing 1M records and having fuller citation
maps, this dictionary can get quite big; e.g. the WSGI daemon
processes of the INSPIRE instance eat about 1 GB of RAM.
It would be good to decrease the memory footprint of this citation
dictionary, especially since we are running on a 64-bit OS, where we
may easily consume more bytes to store list elements (of `unsigned
mediumint' type) than necessary.
We should investigate potential local replacements for the list
structure, for example using numpy.array. We can measure the memory
footprint of various data structures via sys.getsizeof() or via
ps auxw process sizes, aiming to find a more memory-optimized, yet
still fast enough, data structure to represent the citation dict.
If needed, we can even create a dedicated intbitset-like C extension
that would be capable of storing recID vectors in a memory-efficient
way. This is arguably the best micro-optimization technique that we
could go for, albeit it would represent a bit more work than reusing
numpy.array or some other such pre-existing module.
Note that this task is of a micro-optimization kind only, keeping the
overall citation indexer and searcher machinery unchanged, only
changing its internal data structures. The tests will show how much
such a micro-optimization would be worth it. The overall rethinking
of the citation dictionary handling and the inherent memory sharing
procedures would be another task, see some older musings at
[https://twiki.cern.ch/twiki/bin/view/CDS/InvenioScalability].
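As a starting point, the sys.getsizeof() measurement mentioned above can be sketched like this (helper names are illustrative; the compact candidate uses the stdlib array module rather than numpy so the sketch stays self-contained):

```python
import sys
from array import array

def deep_size_of_citation_dict(cites):
    """Hedged sketch: rough footprint of the citation dictionary via
    sys.getsizeof(). Counts the dict, its keys and the per-record
    containers plus their int elements; shared int objects may be
    counted more than once, so treat this as an upper-bound estimate."""
    total = sys.getsizeof(cites)
    for recid, cited_by in cites.items():
        total += sys.getsizeof(recid) + sys.getsizeof(cited_by)
        total += sum(sys.getsizeof(x) for x in cited_by)
    return total

def as_compact_arrays(cites):
    """Candidate replacement: one unsigned-int array per record, which
    stores each recID in 4 bytes instead of a full Python int object."""
    return dict((recid, array('I', cited_by))
                for recid, cited_by in cites.items())
```

Comparing such numbers for a realistic citation map would show how much the list-of-ints representation overpays per element on a 64-bit OS.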
Originally on 2010-05-04
The output messages of /yourgroups subsystem should be prettified.
Currently, when you try to edit a group that you don't have rights for
(via URL mangling), the system says:
Error: Sorry there was an error with the database.
<type 'exceptions.TypeError'> 'dict' object is not callable
which is misleading. Moreover, invenio.err is unnecessarily filled
with ERR_WEBSESSION_DB_ERROR alarms.
The UI should report something more user friendly, like:
Error: Sorry, you don't have rights to edit this user group.
or:
Error: Sorry, group foo does not exist.
The database errors should be reported only when there are real
database problems.
Originally on 2010-05-05
Catalogers may prefer to work offline with Text MARC files instead of
with MARCXML files in their editors. So batchuploader should permit
submitting of Text MARC files, not only MARCXML files. For Text MARC
submissions, the conversion utility (textmarc2xmlmarc, to be
committed soon) should be called to convert Text MARC files into
MARCXML files before further processing. (That is, before checking
collection rights, submitting to the bibupload queue, etc.)
Originally by [email protected] on 2009-10-22
Test ticket
Originally by man on 2010-04-30
In some websearch utils, author tags are still hard-coded in the Python source. I'll replace them with tag names.
Originally by [email protected] on 2009-10-22
test
Originally on 2010-04-30
If the Invenio source code is placed under a directory containing a space in its name, the `make install' command fails.
Originally on 2010-05-04
======================================================================
FAIL: htmlutils - washing of tags altering formatting of a page (e.g. </html>)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/invenio/htmlutils_tests.py", line 42, in test_forbidden_formatting_tags
'')
AssertionError: '</div>' != ''
----------------------------------------------------------------------
Originally on 2010-05-04
======================================================================
FAIL: bibformat - Detailed HTML output
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/invenio/bibformat_regression_tests.py", line 155, in test_detailed_html_output
self.assertEqual([], result)
AssertionError: [] != ['ERROR: Page http://pcuds33.cern.ch/record/7?of=hd (login guest) led to an error: ERROR: Page http://pcuds33.cern.ch/record/7?of=hd (log
in guest) does not contain <img src="http://pcuds33.cern.ch/record/7/files/icon-9806033.gif" alt="" /><br /><font size="-2"><b>\xc2\xa9 CERN Geneva</b></font>..']
Originally on 2010-05-04
======================================================================
FAIL: webjournal - gets an article view of a journal from cache
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/invenio/webjournal_regression_tests.py", line 311, in test_get_article_page_from_cache
assert("April 14th, 1832.—Leaving Socêgo, we rode to another estate on the Rio Macâe" in value)
AssertionError
Originally by [email protected] on 2009-10-22
test
Originally on 2010-04-26
The citation and download history grapher tool should be more
clever WRT x-axis unit calculation. For articles published a long
time ago, the unit used for x-axis ticks is so small that the
final result is unreadable:
[http://inspirebeta.net/record/3198/citations]
The x-axis ticks should be adapted to the real x-axis range used
in the input data, close to (x_max - x_min) / 10.
Originally on 2010-05-04
======================================================================
FAIL: bibformat - MARCXML output
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/invenio/bibformat_regression_tests.py", line 345, in test_marcxml_output
self.assertEqual([], result)
AssertionError: [] != ['ERROR: Page http://pcuds33.cern.ch/record/9?of=xm (login guest) led to an error: ERROR: Page http://pcuds33.cern.ch/record/9?of=xm (login guest) does not contain <?xml version="1.0" encoding="UTF-8"?>\n<collection xmlns="http://www.loc.gov/MARC21/slim">\n<record>\n <controlfield tag="001">9</controlfield>\n <datafield tag="041" ind1=" " ind2=" ">\n <subfield code="a">eng</subfield>\n </datafield>\n <datafield tag="088" ind1=" " ind2=" ">\n <subfield code="a">PRE-25553</subfield>\n </datafield>\n <datafield tag="088" ind1=" " ind2=" ">\n <subfield code="a">RL-82-024</subfield>\n </datafield>\n <datafield tag="100" ind1=" " ind2=" ">\n <subfield code="a">Ellis, J</subfield>\n <subfield code="u">University of Oxford</subfield>\n </datafield>\n <datafield tag="245" ind1=" " ind2=" ">\n <subfield code="a">Grand unification with large supersymmetry breaking</subfield>\n </datafield>\n <datafield tag="260" ind1=" " ind2=" ">\n <subfield code="c">Mar 1982</subfield>\n </datafield>\n <datafield tag="300" ind1=" " ind2=" ">\n <subfield code="a">18 p</subfield>\n </datafield>\n <datafield tag="650" ind1="1" ind2="7">\n <subfield code="2">SzGeCERN</subfield>\n <subfield code="a">General Theoretical Physics</subfield>\n </datafield>\n <datafield tag="700" ind1=" " ind2=" ">\n <subfield code="a">Ibanez, L E</subfield>\n </datafield>\n <datafield tag="700" ind1=" " ind2=" ">\n <subfield code="a">Ross, G G</subfield>\n </datafield>\n <datafield tag="909" ind1="C" ind2="0">\n <subfield code="y">1982</subfield>\n </datafield>\n <datafield tag="909" ind1="C" ind2="0">\n <subfield code="b">11</subfield>\n </datafield>\n <datafield tag="909" ind1="C" ind2="1">\n <subfield code="u">Oxford Univ.</subfield>\n </datafield>\n <datafield tag="909" ind1="C" ind2="1">\n <subfield code="u">Univ. Auton. 
Madrid</subfield>\n </datafield>\n <datafield tag="909" ind1="C" ind2="1">\n <subfield code="u">Rutherford Lab.</subfield>\n </datafield>\n <datafield tag="909" ind1="C" ind2="1">\n <subfield code="c">1990-01-28</subfield>\n <subfield code="l">50</subfield>\n <subfield code="m">2002-01-04</subfield>\n <subfield code="o">BATCH</subfield>\n </datafield>\n <datafield tag="909" ind1="C" ind2="S">\n <subfield code="s">h</subfield>\n <subfield code="w">1982n</subfield>\n </datafield>\n <datafield tag="980" ind1=" " ind2=" ">\n <subfield code="a">PREPRINT</subfield>\n </datafield>\n</record>\n</collection>..']
Originally on 2010-05-16
As Invenio is a digital object repository, when big files are being handled, BitTorrent could be a perfect solution for users to access such big files. Although not currently allowed at CERN, it could be a good solution for other installations handling huge multimedia material or, e.g., huge scientific data.
A possible integration would be as a special plugin to the WebSubmit converter tools. We can introduce a new converter to .torrent files (e.g. by using the transmission CLI) that would first create the torrent and then seed it.
This would require integration with a tracker; e.g. we might use OpenTracker, which can be configured to track only white-listed torrents (e.g. those created before with transmission).
Since BitTorrent would be useful only for huge files which are of interest to many users, we might either extend the /files handler to answer even for a format that does not exist yet, registering the desire for a conversion that can later be handled by a daemon (see ticket #47), or alternatively we might have an automatic procedure that, based on thresholds of size and number of downloads, pre-creates the torrent.
Originally on 2010-04-23
The /yourgroups facility should improve its argument washing.
A URL such as https://localhost/yourgroups/edit?grpID=foo leads to a
500 Internal Server Error and a traceback, because grpID had not been
washed properly in the web interface layer before being passed on to
the business logic layer.
>>>> Frame edit in /usr/lib/python2.5/site-packages/invenio/websession_webinterface.py at line 1190
*******************************************************************************
1187 else :
1188 (body, errors, warnings)= webgroup.perform_request_edit_group(uid=uid,
1189 grpID=argd['grpID'],
----> 1190 ln=argd['ln'])
1191
1192
1193
*******************************************************************************
>>>> Frame perform_request_edit_group in /usr/lib/python2.5/site-packages/invenio/webgroup.py at line 387
*******************************************************************************
384
385 body = ''
386 errors = []
----> 387 user_status = db.get_user_status(uid, grpID)
388 if not len(user_status):
389 errors.append('ERR_WEBSESSION_DB_ERROR')
390 return (body, errors, warnings)
*******************************************************************************
>>>> Frame get_user_status in /usr/lib/python2.5/site-packages/invenio/webgroup_dblayer.py at line 296
*******************************************************************************
293 WHERE id_user = %s
294 AND id_usergroup=%s"""
295 uid = int(uid)
----> 296 grpID = int(grpID)
297 res = run_sql(query, (uid, grpID))
298 return res
299
*******************************************************************************
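The frames above show int(grpID) failing deep in the DB layer. A hedged sketch of the fix (helper names are illustrative, not the actual wash_urlargd machinery): coerce the URL argument in the interface layer and handle the failure there with a friendly message.

```python
def wash_group_id(raw_grpID):
    """Coerce the grpID URL argument to an int in the interface layer,
    returning a sentinel instead of letting int() raise in webgroup_dblayer."""
    try:
        return int(raw_grpID)
    except (TypeError, ValueError):
        return -1  # sentinel: no such group; handled by the caller

def edit_group_page(uid, raw_grpID):
    """Illustrative caller: mangled URLs get a friendly error instead
    of a 500 Internal Server Error."""
    grpID = wash_group_id(raw_grpID)
    if grpID < 0:
        return "Error: Sorry, group %r does not exist." % raw_grpID
    return "editing group %d for user %d" % (grpID, uid)
```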
Originally on 2010-05-10
BibEdit produces traceback if for some reason there are no MARCXML
historical versions for the given record in hstRECORD table. BibEdit
should simply continue gracefully in these cases, the History panel
being empty.
How to reproduce:
$ echo "TRUNCATE hstRECORD" | /opt/cds-invenio/bin/dbexec
and then edit a record.
The traceback is:
File "/usr/lib/python2.5/site-packages/invenio/bibedit_engine.py", line 527, in perform_request_record
record_revision, record = create_cache_file(recid, uid)
File "/usr/lib/python2.5/site-packages/invenio/bibedit_utils.py", line 104, in create_cache_file
record_revision = get_record_last_modification_date(recid)
File "/usr/lib/python2.5/site-packages/invenio/bibedit_dblayer.py", line 61, in get_record_last_modification_date
return run_sql("SELECT max(job_date) FROM hstRECORD WHERE id_bibrec=%s", (recid, ))[0][0].timetuple();
AttributeError: 'NoneType' object has no attribute 'timetuple'
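SELECT max(job_date) returns a row containing NULL when hstRECORD holds no revisions, so the fix is a guard before .timetuple(). A sketch, with run_sql passed in as a parameter purely so the example is self-contained (the real function imports it from invenio.dbquery):

```python
def get_record_last_modification_date(recid, run_sql):
    """Hedged sketch of a graceful variant: return None when the record
    has no historical revisions, so BibEdit can show an empty History
    panel instead of producing a traceback."""
    rows = run_sql("SELECT max(job_date) FROM hstRECORD "
                   "WHERE id_bibrec=%s", (recid,))
    if not rows or rows[0][0] is None:
        return None  # no revisions in hstRECORD for this record
    return rows[0][0].timetuple()
```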
Originally on 2010-04-26
Instead of using gnuplot in the grapher, investigate the usage of
jQuery plotting tools, such as [http://code.google.com/p/flot/].
Advantage being the possibility to zoom into graphs easily.
Disadvantage may be higher server side stress related to Ajax. So we
would have to nicely cache the {(X1,Y1),(X2,Y2),...} JSON structures
that Flot would need, just as we are now caching the PNGs produced by
gnuplot. The server stress would have to be tested. Caching PNGs is
definitely more scalable: once cached, they are served directly by
Apache, not by Invenio WSGI application.
Flot would be especially well suited for admin-like graphs in the
WebStat module, where currently the x-axis scale is chosen from a
drop-down selection box. It would be even more practical to zoom
here, and the stress issue would matter less here, since there are
only a few connections to the admin-level pages, as opposed to the
user-level pages.
Originally on 2010-05-18
When editing documents that have a physical copy, a link to the BibCirculation interface should be provided.
Originally on 2010-04-20
One more thing to do before 1.0 in the jQuery department: there are
still some svn/tags/latest parts in install-jquery-plugins. This
is too bleeding edge, and we have been bitten by this in the past, so
to speak, e.g. file renames in
0c841f1, e.g. calendar changes in
e2a978e.
So it would be good to go through the whole install-jquery-plugins
target and change remaining parts in order to wget always some very
specific version of a jQuery library, as is needed, like in my changes
mentioned above. (I have been doing that as was necessary for
merging, I have not gone systematically through every jQuery
dependency we have.)
Can you please look at that?
If there is trouble with the URL stability of some dependency
library, or versions are lacking on the remote site, then we can host
the dependency on the cdsware.cern.ch site, like the other
installation files, e.g. demo records or keyword ontologies.
Originally on 2010-05-18
Adding new fields in BibEdit seems broken. UI says "Updating..." but nothing happens.
Originally on 2010-05-04
======================================================================
FAIL: websearch - restricted pictures not available to Mr. Hyde
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/invenio/websearch_regression_tests.py", line 988, in test_restricted_pictures_hyde
self.fail(merge_error_messages(error_messages))
AssertionError:
*** ERROR: Page http://pcuds33.cern.ch/record/1/files/0106015_01.jpg (login hyde) not accessible. HTTP Error 401: Unauthorized
Note: as with many failed tests I'm now ticketizing, this is a problem
with the test case comparison technique, not with the file restriction
facility. The files are well restricted. I'm noting this down just
in case the summary of this ticket may appear to be alarming...
Originally by [email protected] on 2009-10-21
test
Originally on 2010-05-04
======================================================================
FAIL: websearch - check formats exported through /record/1/export/ URLs
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/invenio/websearch_regression_tests.py", line 272, in test_exported_formats
CFG_SITE_LANG))
AssertionError: [] != ['ERROR: Page http://pcuds33.cern.ch/record/1/export/hs (login guest) led to an error: ERROR: Page http://pcuds33.cern.ch/record/1/export/hs (login guest) does not contain <a href="/record/1?ln=en">ALEPH experiment..']
Originally on 2010-04-22
Currently, when a PDF is re-generated after an intermediate OCR step in the WebSubmit Converter Tools, the generated images are uncompressed bitmaps that are stored directly in the final PDF, thus unnecessarily bloating it. Using JPEG instead is advisable.
Originally by [email protected] on 2009-10-21
test
Originally on 2010-05-04
======================================================================
FAIL: webjournal - checks if an article is new or not
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib/python2.5/site-packages/invenio/webjournal_regression_tests.py", line 42, in test_is_new_article
self.assertEqual(article, True)
AssertionError: False != True
Originally on 2010-04-29
Surprisingly:
astro-ph/0607086
does not find rattazzon even though it is in the fulltext (see snippet for "honor theorist" search in fulltext)
This may be due to its enclosure in '' in the text...
Originally on 2010-05-11
Link to 'Record Merge' is missing under the administration menu.
Originally on 2010-05-16
It would be great to have a daemon (BibTask) to perform conversions. We might have a new table (e.g. conversion) with:
id_bibdoc
format
status
id_users
comment
where id_bibdoc is the document involved in the conversion, format is the desired format, status is either 'waiting', 'running', 'done' or 'error', and id_users is an intbitset of all the uids of users that should be alerted about the outcome of a given conversion (e.g. a user who has asked for a conversion, or a submitter).
In case of error, comment would contain the error message.
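The proposed table could look roughly like the DDL below. This is a sketch under assumptions: the ticket specifies only the column names and status values, so the column types, keys and the serialization of the intbitset are illustrative guesses.

```python
# Hedged sketch of the proposed 'conversion' table. Only the column
# names and the status values come from the ticket; everything else
# (types, keys, intbitset stored as a serialized blob) is assumed.
CONVERSION_TABLE_DDL = """
CREATE TABLE conversion (
  id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
  id_bibdoc MEDIUMINT UNSIGNED NOT NULL,
  format VARCHAR(50) NOT NULL,
  status ENUM('waiting', 'running', 'done', 'error')
         NOT NULL DEFAULT 'waiting',
  id_users LONGBLOB,  -- serialized intbitset of uids to notify
  comment TEXT,       -- error message when status='error'
  PRIMARY KEY (id),
  KEY (id_bibdoc, format)
)
"""
```

A BibTask daemon would then poll for rows in 'waiting' status, perform the conversion, flip the status, and alert every uid in id_users about the outcome.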