archieml-python's People

Contributors

adrian-the-git, brainkim, kirkman, palewire, tschaume

archieml-python's Issues

Logic behind array-closing markers ([]) is broken

The following valid ArchieML doesn't parse correctly:

[a]
* first
[b]
* second
[]
c: No more parsing?

However this is fine:

[a]
* first
[]
[b]
* second
[]
c: This is just fine.

Presumably the logic behind closing arrays needs to work backward and close all the "open" ones. I'll have a look at the code and see if I can fix it, but maybe someone will beat me to it.

keys that should be valid ArchieML

The following key-value pairs are successfully parsed in the ArchieML sandbox but don't convert into a dict via archieml.loads(). It looks like the KEY_PATTERN regex isn't covering them.

a/b: 5
a(b): 1e+16 cm⁻³
Σ_(ms)⁻¹.κₑ_W/(mKs): 10
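
As a rough illustration of what a more permissive key pattern might look like, the sketch below compares an ASCII-only character class with a pattern that rejects only whitespace and ArchieML's structural characters. Both patterns and the `parse_key_value` helper are assumptions made for this example; neither is copied from archieml-python's actual KEY_PATTERN or from the reference parser.

```python
import re

# Hypothetical ASCII-only pattern, standing in for a restrictive KEY_PATTERN.
STRICT_KEY = re.compile(r'^\s*([A-Za-z0-9_\-.]+)[ \t\r]*:[ \t\r]*(.*)$')

# Hypothetical permissive pattern: a key is any run of characters that are
# not whitespace, brackets, braces, backslashes, or colons.
LOOSE_KEY = re.compile(r'^\s*([^\s\[\]{}\\:]+)[ \t\r]*:[ \t\r]*(.*)$')

lines = [
    "a/b: 5",
    "a(b): 1e+16 cm⁻³",
    "Σ_(ms)⁻¹.κₑ_W/(mKs): 10",
]


def parse_key_value(pattern, line):
    """Return (key, value) if the line matches the pattern, else None."""
    match = pattern.match(line)
    return (match.group(1), match.group(2)) if match else None


strict_results = [parse_key_value(STRICT_KEY, line) for line in lines]
loose_results = [parse_key_value(LOOSE_KEY, line) for line in lines]

for line, strict, loose in zip(lines, strict_results, loose_results):
    print(repr(line), "strict:", strict, "loose:", loose)
```

The exclusion-based pattern accepts all three keys, while the ASCII-only class rejects each of them at the first `/`, `(`, or non-ASCII character.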

Problems with Google Docs

I have been playing with downloading an ArchieML-formatted Google Doc as plain text, then parsing it with archieml-python.

It pretty much works, except for the very first line. I have found that any time I have a key/value pair on line 1 of the text file, the parser doesn't recognize it.

If, however, I add a blank line to the top of the document, then the key/value on line 2 WILL be parsed.

After some more digging, it seems clear that when I download a file as "text/plain" from Google Docs, it's actually a UTF-8 file with a BOM.

So, if I change the following line in archieml-python's __init__.py:
line = line.decode('utf-8')
to:
line = line.decode('utf-8-sig')

Then it works.

Not sure if you might want to consider adding BOM detection, à la http://stackoverflow.com/questions/13590749/reading-unicode-file-data-with-bom-chars-in-python.
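
A small standard-library illustration of the difference between the two codecs. The sample bytes are made up for this example; `codecs.BOM_UTF8` is the three-byte UTF-8 byte order mark.

```python
import codecs

# Made-up sample: a UTF-8 file that starts with a BOM, as described above.
raw = codecs.BOM_UTF8 + "headline: Hello\n".encode("utf-8")

plain = raw.decode("utf-8")          # keeps the BOM as U+FEFF at the start
stripped = raw.decode("utf-8-sig")   # drops a leading BOM if one is present

print(repr(plain[:9]))
print(repr(stripped[:9]))

# 'utf-8-sig' also decodes input that has no BOM, so it is safe for both cases.
no_bom = "headline: Hello\n".encode("utf-8")
no_bom_decoded = no_bom.decode("utf-8-sig")
```

With plain 'utf-8', the first line begins with U+FEFF rather than the key, which is consistent with a key on line 1 going unrecognized while the same key on line 2 parses.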

Parsing bug with some nested data.

A nested structure like the one below will fail in your parser. I believe it is due to the multiple "image" typed objects.

You can pass the same structure to the ArchieML sandbox on the official site and it parses fine. So my first guess is that this is a bug in your code. I've attached the traceback I get as well.

[.profiles]
  who:The women’s <br>rights activist
  full_profile:y
  one_liner:‘Keep us alive in your thoughts’
  photo:order-4
  frame_position:left
  [.+copy]
  {.image}
    bleed:normal,
    photo:order-4
    top:35%
    left:55%
    width:65
    height:65
    fit:cover
    placement:left
    order:1
  {}

  {.who}
  {}

  When Mahbouba Seraj talks about her fears, an unusual thing happens: She sounds more indomitable than ever.

   {.image}
    bleed:normal,
    photo:p1-2
    top:65%
    left:20%
    width:40
    height:55
    fit:cover
    placement:left
    order:6
  {}

  At 73, Seraj is the doyenne of Afghan women’s rights activists — a stature and status that put her in the crosshairs of Taliban rulers.

  “Of course I’m afraid,” she said. “Everyone’s afraid.”

  But Seraj, who spent more than a quarter-century in exile in the United States before returning in 2003 to help build a women’s movement and women’s institutions, is standing fast in her refusal to leave, and to find ways of continuing her work.

  {.image}
    bleed:normal,
    photo:p1-3
    top:70%
    left:65%
    width:60
    height:40
    fit:cover
    placement:left
    order:10
  {}

  “I’m staying,” she said, interviewed in her Kabul home weeks after the Taliban seized control.

  Seraj said she believed — or hoped, at least — that the Taliban movement would come to understand that the Afghanistan of today is not the same country it was 20 years ago.

  “We are 18 million women; we have close to 6 million who are educated,” she said, pointing to the roles women carved out for themselves in business, education, medicine, media and government.

  A niece of Amanullah Khan, the Afghan sovereign who led the country to independence from Britain in 1919, Seraj has a more finely honed sense than many of history’s vicissitudes.

  Named one of Time magazine’s 100 most influential people in 2021, she thinks there is much to be learned from the two-decade-long U.S. presence in Afghanistan. The role of Afghan women was vastly elevated, she said — but in allocating aid and jobs, outsiders also trampled at times on the sensibilities of women reluctant to adopt certain Western mores.

  Now, she said, the country’s women must seek a new path.

  “If you can, keep us alive in your thoughts and your memory — read about us, ask about us…. It will be a great help,” she said. “We really have to be strong.”

  []
[]

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/palewire/.local/share/virtualenvs/afghan-women-in-hiding-YJqOGRxh/lib/python3.9/site-packages/archieml/__init__.py", line 237, in load
    return Loader().load(fp)
  File "/home/palewire/.local/share/virtualenvs/afghan-women-in-hiding-YJqOGRxh/lib/python3.9/site-packages/archieml/__init__.py", line 158, in load
    self.load_scope(m.group('brace'), m.group('flags'), m.group('scope_key'))
  File "/home/palewire/.local/share/virtualenvs/afghan-women-in-hiding-YJqOGRxh/lib/python3.9/site-packages/archieml/__init__.py", line 205, in load_scope
    self.set_value(
  File "/home/palewire/.local/share/virtualenvs/afghan-women-in-hiding-YJqOGRxh/lib/python3.9/site-packages/archieml/__init__.py", line 129, in set_value
    data[k] = value
IndexError: list assignment index out of range

Parser doesn't handle non-ascii correctly

Consider this very simple example:

>>> s1 = """question: Pourquoi ?
... réponse: Parce que."""
... 
>>> archieml.loads(s1)
OrderedDict([('question', 'Pourquoi ?')])

That's not right.

The ArchieML sandbox does it correctly, displaying:

{
  "question": "Pourquoi ?",
  "réponse": "Parce que."
}
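
The truncated result is what one would expect if keys were matched with an ASCII-only character class, although that cause is an assumption here. The sketch below is a toy line-by-line reader, not archieml-python's parser; `ASCII_KEY`, `UNICODE_KEY`, and `toy_loads` are hypothetical names. It shows that `\w`, which is Unicode-aware for `str` patterns in Python 3, is enough to accept the accented key in this example.

```python
import re

# Hypothetical patterns: the first allows only ASCII letters, digits,
# underscores, and dashes; the second uses \w, which matches Unicode letters.
ASCII_KEY = re.compile(r'^\s*([A-Za-z0-9_\-]+)\s*:\s*(.*)$')
UNICODE_KEY = re.compile(r'^\s*([\w\-]+)\s*:\s*(.*)$')


def toy_loads(text, key_pattern):
    """Collect key/value pairs from lines that match key_pattern.

    Lines that do not match are simply ignored in this toy reader.
    """
    result = {}
    for line in text.splitlines():
        match = key_pattern.match(line)
        if match:
            result[match.group(1)] = match.group(2)
    return result


s1 = "question: Pourquoi ?\nréponse: Parce que."
ascii_result = toy_loads(s1, ASCII_KEY)
unicode_result = toy_loads(s1, UNICODE_KEY)

print(ascii_result)
print(unicode_result)
```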

Parser edge cases create weird data

Leaving this note here just to warn people that there are a couple of edge cases that don't really make sense:

For instance:

archieml.loads(
"""[foo] 
{foo} 
k: v 
{}
* bar""")

produces the dictionary:

{'foo': {0: 'bar', 'k': 'v'}}

Some of the edge cases are listed on the main archieml.org repository (newsdev/archieml.org#25), so if this bothers you, you should propose fixes there.
