brainkim / archieml-python
Python parser for the Archie Markup Language (ArchieML)
License: Other
pip install archieml still installs version 0.1.0 instead of 0.3.0! This is probably due to the missing archive for 0.3.0; cf. https://pypi.python.org/pypi/archieml/0.1.0 vs. https://pypi.python.org/pypi/archieml/0.3.0
The following valid ArchieML doesn't parse correctly:
[a]
* first
[b]
* second
[]
c: No more parsing?
However this is fine:
[a]
* first
[]
[b]
* second
[]
c: This is just fine.
Presumably the logic behind closing arrays needs to work backward and close all the "open" ones. I'll have a look at the code and see if I can fix it, but maybe someone will beat me to it.
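For illustration, here is a toy sketch of the behaviour I'd expect, not the library's actual code. It assumes top-level arrays never nest, so opening a new one implicitly closes the previous one and a single [] returns to the top level. The name toy_loads and both regexes are made up for this example, and it only understands simple arrays and plain key-value lines.

```python
import re

# Hypothetical, deliberately narrow patterns for this toy only.
ARRAY_OPEN = re.compile(r'^\[([A-Za-z0-9_.-]+)\]$')
KEY_VALUE = re.compile(r'^([A-Za-z0-9_.-]+)\s*:\s*(.*)$')


def toy_loads(text):
    """Parse a tiny ArchieML subset: [name], [], '* item' and 'key: value'."""
    data = {}
    current = None  # the simple array currently open, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line == '[]':
            # Closing marker: go back to the top level.
            current = None
            continue
        m = ARRAY_OPEN.match(line)
        if m:
            # Opening a new array replaces (implicitly closes) the open one.
            data[m.group(1)] = []
            current = data[m.group(1)]
            continue
        if current is not None:
            # Inside a simple array only '* item' lines count in this toy.
            if line.startswith('*'):
                current.append(line[1:].strip())
            continue
        m = KEY_VALUE.match(line)
        if m:
            data[m.group(1)] = m.group(2)
    return data


failing_doc = "[a]\n* first\n[b]\n* second\n[]\nc: No more parsing?"
result = toy_loads(failing_doc)
```

With this approach there is no list of open arrays to unwind: at most one top-level array is open at any time, so the failing document and the working one produce the same a and b values.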
The following key-value pairs are successfully parsed in the ArchieML sandbox but don't convert into a dict via archieml.loads(). It looks like the KEY_PATTERN regex isn't covering them:
a/b: 5
a(b): 1e+16 cm⁻³
Σ_(ms)⁻¹.κₑ_W/(mKs): 10
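As a rough sketch of what a more permissive key pattern could look like: instead of listing the characters a key may contain, exclude only the characters that have structural meaning. PERMISSIVE_KEY and split_key_value are hypothetical names, and the choice of excluded characters (whitespace, brackets, braces, colon) is my assumption, not the library's KEY_PATTERN.

```python
import re

# Assumption: a key is any run of characters that are not whitespace,
# square brackets, curly braces or a colon.
PERMISSIVE_KEY = re.compile(r'^\s*([^\s\[\]{}:]+)[ \t\r]*:[ \t\r]*(.*)$')


def split_key_value(line):
    """Return (key, value) if the line looks like 'key: value', else None."""
    m = PERMISSIVE_KEY.match(line)
    return (m.group(1), m.group(2)) if m else None


pairs = [split_key_value(line) for line in (
    'a/b: 5',
    'a(b): 1e+16 cm⁻³',
    'Σ_(ms)⁻¹.κₑ_W/(mKs): 10',
)]
```

All three lines from the report split into a key and a value with this pattern, while a line without a colon still returns None.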
I have been playing with downloading an ArchieML-formatted Google Doc as plain text, then parsing it with archieml-python.
It pretty much works, except for the very first line. I have found that any time I have a key/value pair on line 1 of the text file, the parser doesn't recognize it.
If, however, I add a blank line to the top of the document, then the key/value on line 2 WILL be parsed.
After some more digging, it seems clear that when I download a file as "text/plain" from Google Docs, it's actually a UTF-8 file with a BOM.
So, if I change the following line in archieml-python's __init__.py:
line = line.decode('utf-8')
to:
line = line.decode('utf-8-sig')
Then it works.
Not sure if you might want to consider adding BOM detection, à la http://stackoverflow.com/questions/13590749/reading-unicode-file-data-with-bom-chars-in-python.
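To make the difference concrete, here is a small standalone sketch using only the standard library. The sample bytes and the decode_text helper are invented for the example; the only real pieces are the two codec names and codecs.BOM_UTF8.

```python
import codecs

# Simulated download: UTF-8 text preceded by a byte order mark.
raw = codecs.BOM_UTF8 + 'title: Hello\nauthor: Someone'.encode('utf-8')


def decode_text(data):
    """Decode UTF-8 bytes, dropping a leading BOM if one is present."""
    return data.decode('utf-8-sig')


with_bom = raw.decode('utf-8')   # keeps U+FEFF as the first character
without_bom = decode_text(raw)   # BOM removed

first_line_plain = with_bom.splitlines()[0]
first_line_clean = without_bom.splitlines()[0]
```

With plain 'utf-8' the first line begins with the invisible U+FEFF character, so the key on line 1 no longer starts with a letter. The 'utf-8-sig' codec also decodes input that has no BOM, so switching to it should not affect files that were already working.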
A nested structure like the one below will fail in your parser. I believe it is due to the multiple "image" typed objects.
You can pass the same structure to the ArchieML sandbox on the official site and it parses fine. So my first guess is that this is a bug in your code. I've attached the traceback I get as well.
[.profiles]
who:The women’s <br>rights activist
full_profile:y
one_liner:‘Keep us alive in your thoughts’
photo:order-4
frame_position:left
[.+copy]
{.image}
bleed:normal,
photo:order-4
top:35%
left:55%
width:65
height:65
fit:cover
placement:left
order:1
{}
{.who}
{}
When Mahbouba Seraj talks about her fears, an unusual thing happens: She sounds more indomitable than ever.
{.image}
bleed:normal,
photo:p1-2
top:65%
left:20%
width:40
height:55
fit:cover
placement:left
order:6
{}
At 73, Seraj is the doyenne of Afghan women’s rights activists — a stature and status that put her in the crosshairs of Taliban rulers.
“Of course I’m afraid,” she said. “Everyone’s afraid.”
But Seraj, who spent more than a quarter-century in exile in the United States before returning in 2003 to help build a women’s movement and women’s institutions, is standing fast in her refusal to leave, and to find ways of continuing her work.
{.image}
bleed:normal,
photo:p1-3
top:70%
left:65%
width:60
height:40
fit:cover
placement:left
order:10
{}
“I’m staying,” she said, interviewed in her Kabul home weeks after the Taliban seized control.
Seraj said she believed — or hoped, at least — that the Taliban movement would come to understand that the Afghanistan of today is not the same country it was 20 years ago.
“We are 18 million women; we have close to 6 million who are educated,” she said, pointing to the roles women carved out for themselves in business, education, medicine, media and government.
A niece of Amanullah Khan, the Afghan sovereign who led the country to independence from Britain in 1919, Seraj has a more finely honed sense than many of history’s vicissitudes.
Named one of Time magazine’s 100 most influential people in 2021, she thinks there is much to be learned from the two-decade-long U.S. presence in Afghanistan. The role of Afghan women was vastly elevated, she said — but in allocating aid and jobs, outsiders also trampled at times on the sensibilities of women reluctant to adopt certain Western mores.
Now, she said, the country’s women must seek a new path.
“If you can, keep us alive in your thoughts and your memory — read about us, ask about us…. It will be a great help,” she said. “We really have to be strong.”
[]
[]
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/palewire/.local/share/virtualenvs/afghan-women-in-hiding-YJqOGRxh/lib/python3.9/site-packages/archieml/__init__.py", line 237, in load
return Loader().load(fp)
File "/home/palewire/.local/share/virtualenvs/afghan-women-in-hiding-YJqOGRxh/lib/python3.9/site-packages/archieml/__init__.py", line 158, in load
self.load_scope(m.group('brace'), m.group('flags'), m.group('scope_key'))
File "/home/palewire/.local/share/virtualenvs/afghan-women-in-hiding-YJqOGRxh/lib/python3.9/site-packages/archieml/__init__.py", line 205, in load_scope
self.set_value(
File "/home/palewire/.local/share/virtualenvs/afghan-women-in-hiding-YJqOGRxh/lib/python3.9/site-packages/archieml/__init__.py", line 129, in set_value
data[k] = value
IndexError: list assignment index out of range
Consider this very simple example:
>>> s1 = """question: Pourquoi ?
... réponse: Parce que."""
...
>>> archieml.loads(s1)
OrderedDict([('question', 'Pourquoi ?')])
That's not right.
The ArchieML sandbox does it correctly, displaying:
{
"question": "Pourquoi ?",
"réponse": "Parce que."
}
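One plausible cause, which I have not confirmed against the source, is a key pattern built on an ASCII-only character class. The two patterns below are hypothetical stand-ins rather than the library's KEY_PATTERN; they only show how such a class rejects an accented key while \w, which matches Unicode letters for str patterns in Python 3, accepts it.

```python
import re

# Hypothetical ASCII-only key pattern versus a Unicode-aware one.
ASCII_KEY = re.compile(r'^([A-Za-z0-9_.-]+)\s*:\s*(.*)$')
UNICODE_KEY = re.compile(r'^([\w.-]+)\s*:\s*(.*)$')

# The accent is written as an escape so the precomposed character is certain.
line = 'r\u00e9ponse: Parce que.'

ascii_match = ASCII_KEY.match(line)      # stops at the accented letter
unicode_match = UNICODE_KEY.match(line)  # matches the whole key

unicode_key = unicode_match.group(1)
unicode_value = unicode_match.group(2)
```

If the real pattern behaves like ASCII_KEY, the second line is never recognised as a key, which would explain why only the first pair ends up in the result.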
Leaving this note here just to warn people that there are a couple of edge cases that don't really make sense:
For instance:
archieml.loads(
"""[foo]
{foo}
k: v
{}
* bar""")
produces the dictionary:
{'foo': {0: 'bar', 'k: v'}}
Some of the edge cases are listed in the main archieml.org repository (newsdev/archieml.org#25), so if this bothers you, you should propose fixes there.