Comments (3)
Hi Ronald,
A nice, informative piece.
After we parted, another difference between Python 2 and Python 3 occurred to me: sorting.
In Python 2 you can pass a compare function to sort. In Python 3 you no longer can, but you can still pass a key function.
That seems weaker.
But with a key you can still simulate a compare, namely by using as the key an object whose methods compare it with other objects.
There is even a module with a function that automatically converts a compare function into a key.
See cmp_to_key in https://docs.python.org/3.4/library/functools.html#module-functools
I think you will certainly need this for your collation algorithm.
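As a minimal sketch of the two approaches described above: the first uses `functools.cmp_to_key` from the standard library, the second builds the key object by hand. The compare function `compare_length_then_alpha`, the class `CompareKey`, and the word list are hypothetical illustrations and are not taken from CollateX.

```python
from functools import cmp_to_key


def compare_length_then_alpha(a, b):
    """Old-style compare function (hypothetical example).

    Returns a negative number, zero, or a positive number,
    ordering shorter strings first and ties alphabetically.
    """
    if len(a) != len(b):
        return len(a) - len(b)
    return (a > b) - (a < b)


words = ["pear", "fig", "apple", "kiwi"]

# Approach 1: let functools wrap the compare function into a key function.
sorted_words = sorted(words, key=cmp_to_key(compare_length_then_alpha))


class CompareKey:
    """Approach 2: a hand-written key object (hypothetical name).

    Python's sort only needs __lt__, which delegates to the compare function.
    """

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return compare_length_then_alpha(self.value, other.value) < 0


sorted_by_hand = sorted(words, key=CompareKey)

print(sorted_words)
print(sorted_by_hand)
```

Both calls should produce the same order, since `cmp_to_key` builds essentially the same kind of wrapper object as the hand-written class.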
Regards, Dirk
Dirk Roorda
researcher
Data Archiving and Networked Services (DANS)
DANS promotes sustained access to digital research data. DANS is an institute of KNAW and NWO.
http://www.dans.knaw.nl/
On 2014-10-22, at 16:41, Ronald Haentjens Dekker wrote:
Python 3 is a backwards-incompatible break from Python 2, with the goal of making Unicode strings the default. See the reasoning on the following page:
http://ncoghlan-devs-python-notes.readthedocs.org/en/latest/python3/questions_and_answers.html
Unicode support is a major problem in the current preview versions of CollateX Python.
I am currently investigating what a clean port to Python 3 would entail.
from collatex.
Hi Dirk,
Thanks for this additional information. I did indeed run into this issue.
Old code:
    #################################
    # Dichotomy search of subString #
    #################################
    lower=0
    upper=self.length
    success=False
    while upper-lower >0:
        middle=(lower+upper)//2
        middleSubString=string[SA[middle]:min(SA[middle]+lenSubString,self.length)]
        cmpRes=cmp(subString, middleSubString)
        if cmpRes == -1:
            upper=middle
        elif cmpRes == 1:
            lower=middle+1
        else:
            success=True
            break
    if not success:
        return False
    else:
        return middle
New code:
        middleSubString=string[SA[middle]:min(SA[middle]+lenSubString,self.length)]
        #NOTE: the cmp function is removed in Python 3
        #Strictly speaking we are doing one comparison more now
        if subString < middleSubString:
            upper=middle
        elif subString > middleSubString:
            lower=middle+1
        else:
            success=True
            break
This conversion is correct, but as I remarked in the NOTE comment, it is strictly speaking less efficient: in the worst case each iteration now performs two string comparisons instead of one.
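For reference, here is a self-contained sketch of the same dichotomy (binary) search over a suffix array in Python 3. It is not the CollateX code: the names `cmp`, `find_substring`, `text`, and `suffix_array` are hypothetical, the suffix array is built naively for the example, and the `cmp` stand-in uses the common `(a > b) - (a < b)` idiom, which also performs two comparisons internally.

```python
def cmp(a, b):
    """Stand-in for Python 2's built-in cmp: returns -1, 0, or 1."""
    return (a > b) - (a < b)


def find_substring(string, suffix_array, sub):
    """Binary search for sub among the sorted suffixes of string.

    Returns an index into suffix_array whose suffix starts with sub,
    or None if there is no such suffix.
    """
    lower, upper = 0, len(suffix_array)
    while upper - lower > 0:
        middle = (lower + upper) // 2
        start = suffix_array[middle]
        # Compare against the suffix truncated to the length of sub.
        candidate = string[start:start + len(sub)]
        result = cmp(sub, candidate)
        if result < 0:
            upper = middle
        elif result > 0:
            lower = middle + 1
        else:
            return middle
    return None


text = "banana"
# Naive suffix array, for illustration only: start positions sorted by suffix.
suffix_array = sorted(range(len(text)), key=lambda i: text[i:])

hit = find_substring(text, suffix_array, "nan")
miss = find_substring(text, suffix_array, "xyz")
print(hit, miss)
```

The sketch returns `None` on failure rather than `False`, because `False == 0` in Python and a caller could otherwise confuse a failed search with a match at index 0.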
This issue is fixed in the CollateX Python 2.0.0pre10 release.