PLY (Python Lex-Yacc)                   Version 3.3

Copyright (C) 2001-2009,
David M. Beazley (Dabeaz LLC)
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

* Redistributions of source code must retain the above copyright notice,
  this list of conditions and the following disclaimer.  
* Redistributions in binary form must reproduce the above copyright notice, 
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.  
* Neither the name of David Beazley nor the name of Dabeaz LLC may be
  used to endorse or promote products derived from this software without
  specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Introduction
============

PLY is a 100% Python implementation of the common parsing tools lex
and yacc. Here are a few highlights:

 -  PLY is very closely modeled after traditional lex/yacc.
    If you know how to use these tools in C, you will find PLY
    to be similar.

 -  PLY provides *very* extensive error reporting and diagnostic 
    information to assist in parser construction.  The original
    implementation was developed for instructional purposes.  As
    a result, the system tries to identify the most common types
    of errors made by novice users.  

 -  PLY provides full support for empty productions, error recovery,
    precedence specifiers, and moderately ambiguous grammars.

 -  Parsing is based on LR-parsing which is fast, memory efficient, 
    better suited to large grammars, and which has a number of nice
    properties when dealing with syntax errors and other parsing problems.
    Currently, PLY builds its parsing tables using the LALR(1)
    algorithm used in yacc.

 -  PLY uses Python introspection features to build lexers and parsers.  
    This greatly simplifies the task of parser construction since it reduces 
    the number of files and eliminates the need to run a separate lex/yacc 
    tool before running your program.

 -  PLY can be used to build parsers for "real" programming languages.
    Although it is not ultra-fast due to its Python implementation,
    PLY can be used to parse grammars consisting of several hundred
    rules (as might be found for a language like C).  The lexer and LR 
    parser are also reasonably efficient when parsing typically
    sized programs.  People have used PLY to build parsers for
    C, C++, ADA, and other real programming languages.

How to Use
==========

PLY consists of two files: lex.py and yacc.py.  These are contained
within the 'ply' directory which may also be used as a Python package.
To use PLY, simply copy the 'ply' directory to your project and import
lex and yacc from the associated 'ply' package.  For example:

     import ply.lex as lex
     import ply.yacc as yacc

Alternatively, you can copy just the files lex.py and yacc.py
individually and use them as modules.  For example:

     import lex
     import yacc

The file setup.py can be used to install ply using distutils.

The file doc/ply.html contains complete documentation on how to use
the system.

The example directory contains several different examples including a
PLY specification for ANSI C as given in K&R 2nd Ed.   

A simple example is found at the end of this document.

Requirements
============
PLY requires the use of Python 2.2 or greater.  However, you should
use the latest Python release if possible.  It should work on just
about any platform.  PLY has been tested with both CPython and Jython.
It also seems to work with IronPython.

Resources
=========
More information about PLY can be obtained on the PLY webpage at:

     http://www.dabeaz.com/ply

For a detailed overview of parsing theory, consult the excellent
book "Compilers : Principles, Techniques, and Tools" by Aho, Sethi, and
Ullman.  The topics found in "Lex & Yacc" by Levine, Mason, and Brown
may also be useful.

A Google group for PLY can be found at

     http://groups.google.com/group/ply-hack

Acknowledgments
===============
A special thanks is in order for all of the students in CS326 who
suffered through about 25 different versions of these tools :-).

The CHANGES file acknowledges those who have contributed patches.

Elias Ioup did the first implementation of LALR(1) parsing in PLY-1.x. 
Andrew Waters and Markus Schoepflin were instrumental in reporting bugs
and testing a revised LALR(1) implementation for PLY-2.0.

Special Note for PLY-3.0
========================
PLY-3.0 is the first PLY release to support Python 3. However, backwards
compatibility with Python 2.2 is still preserved. PLY provides dual
Python 2/3 compatibility by restricting its implementation to a common
subset of basic language features. You should not convert PLY using
2to3--it is not necessary and may in fact break the implementation.

Example
=======

Here is a simple example showing a PLY implementation of a calculator
with variables.

# -----------------------------------------------------------------------------
# calc.py
#
# A simple calculator with variables.
# -----------------------------------------------------------------------------

tokens = (
    'NAME','NUMBER',
    'PLUS','MINUS','TIMES','DIVIDE','EQUALS',
    'LPAREN','RPAREN',
    )

# Tokens

t_PLUS    = r'\+'
t_MINUS   = r'-'
t_TIMES   = r'\*'
t_DIVIDE  = r'/'
t_EQUALS  = r'='
t_LPAREN  = r'\('
t_RPAREN  = r'\)'
t_NAME    = r'[a-zA-Z_][a-zA-Z0-9_]*'

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

# Ignored characters
t_ignore = " \t"

def t_newline(t):
    r'\n+'
    t.lexer.lineno += t.value.count("\n")
    
def t_error(t):
    print "Illegal character '%s'" % t.value[0]
    t.lexer.skip(1)
    
# Build the lexer
import ply.lex as lex
lex.lex()

# Precedence rules for the arithmetic operators
precedence = (
    ('left','PLUS','MINUS'),
    ('left','TIMES','DIVIDE'),
    ('right','UMINUS'),
    )

# dictionary of names (for storing variables)
names = { }

def p_statement_assign(p):
    'statement : NAME EQUALS expression'
    names[p[1]] = p[3]

def p_statement_expr(p):
    'statement : expression'
    print p[1]

def p_expression_binop(p):
    '''expression : expression PLUS expression
                  | expression MINUS expression
                  | expression TIMES expression
                  | expression DIVIDE expression'''
    if p[2] == '+'  : p[0] = p[1] + p[3]
    elif p[2] == '-': p[0] = p[1] - p[3]
    elif p[2] == '*': p[0] = p[1] * p[3]
    elif p[2] == '/': p[0] = p[1] / p[3]

def p_expression_uminus(p):
    'expression : MINUS expression %prec UMINUS'
    p[0] = -p[2]

def p_expression_group(p):
    'expression : LPAREN expression RPAREN'
    p[0] = p[2]

def p_expression_number(p):
    'expression : NUMBER'
    p[0] = p[1]

def p_expression_name(p):
    'expression : NAME'
    try:
        p[0] = names[p[1]]
    except LookupError:
        print "Undefined name '%s'" % p[1]
        p[0] = 0

def p_error(p):
    print "Syntax error at '%s'" % p.value

import ply.yacc as yacc
yacc.yacc()

while 1:
    try:
        s = raw_input('calc > ')
    except EOFError:
        break
    yacc.parse(s)


Bug Reports and Patches
=======================
My goal with PLY is to simply have a decent lex/yacc implementation
for Python.  As a general rule, I don't spend huge amounts of time
working on it unless I receive very specific bug reports and/or
patches to fix problems. I also try to incorporate submitted feature
requests and enhancements into each new version.  To contact me about
bugs and/or new features, please send email to [email protected].

In addition there is a Google group for discussing PLY related issues at

    http://groups.google.com/group/ply-hack
 
-- Dave









Issues
======

Decorator on t_RULE() func breaks the rule ordering

What steps will reproduce the problem?
Adding a decorator to t_RULE functions completely breaks the function
ordering, which is significant for correct lexing and parsing.


What is the expected output? What do you see instead?
Lexing rule order is broken.

What version of the product are you using? On what operating system?
PLY 3.3 with Python 2.5.4 on MacOs 10.6.4 (but I think OS doesn't matter)


Please provide any additional information below.
I haven't tried this on Python 3 because I need to use Python 2.5 (but I
guess the problem is the same in py3).  On Python 2.5 the problem
reproduces.
The root cause is in the ordering code of lex.py: it uses the following
func_code(), which returns the code object for a given function f (here,
one of our t_RULE functions); the co_firstlineno attribute of that
object, the line number of the function's first line, is later used for
sorting:

if sys.version_info[0] < 3:
    def func_code(f):
        return f.func_code
else:
    def func_code(f):
        return f.__code__

So the problem is that when I use a decorator:

@MyDecorator
def t_MYRULE(t):
    ...

and MyDecorator actually wraps (substitutes) t_MYRULE, then f.func_code
returns the code object of the decorator's wrapper function, not of the
original t_MYRULE.  This breaks the correct ordering of the t_RULE
functions.

Unfortunately the co_firstlineno attribute is read-only, so the obvious
workaround of assigning
decorated_func.func_code.co_firstlineno = original_func.func_code.co_firstlineno
doesn't work.

Does somebody know workaround for this problem?
Thanks in advance.

Original issue reported on code.google.com by [email protected] on 12 Oct 2010 at 8:18
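The behavior described above can be reproduced without PLY at all. The sketch below (Python 3 syntax; `my_decorator`, `t_PLAIN`, and `t_WRAPPED` are made-up names, not PLY code) shows why any sort keyed on `co_firstlineno` is confused by a wrapping decorator. One workaround is to write the decorator so that it mutates and returns the original function rather than returning a new wrapper function.

```python
import functools

def my_decorator(f):
    # A typical wrapping decorator: the wrapper is a *new* function,
    # so its code object points at these lines, not at the rule it wraps.
    @functools.wraps(f)
    def wrapper(t):
        return f(t)
    return wrapper

def t_PLAIN(t):
    r'\d+'
    return t

@my_decorator
def t_WRAPPED(t):
    r'[a-z]+'
    return t

# functools.wraps copies __name__ and __doc__ (so the regex docstring
# survives), but NOT the code object: t_WRAPPED.__code__ still belongs
# to `wrapper`, and its co_firstlineno points at the decorator body.
```

Because the decorator is defined above the rules, the wrapped rule's `co_firstlineno` now sorts *before* every undecorated rule, regardless of where it appears in the file.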

Enh: New arg start_state for lex()

What steps will reproduce the problem?
This isn't a bug report but an enhancement proposal.  Lexing currently
always starts in the predefined INITIAL state; there is no way to start
from a state of my own.

What version of the product are you using? On what operating system?
PLY 3.3

Please provide any additional information below.
As I saw from the lex source code, lexing always starts in the INITIAL
state.  My grammar has many states, and there are no common rules that I
could place in INITIAL (or maybe I simply don't want to single out such
common rules).  In other words: I have defined and named all of my
states, so how can I start lexing in a start state of my own instead of
the predefined INITIAL state?

Original issue reported on code.google.com by [email protected] on 14 Oct 2010 at 9:24

'nonassoc' does not work in ply-2.3 (includes patch and testcase)

What steps will reproduce the problem?
1. Make a simple grammar with a nonassoc operator, such as
exp -> exp LT exp
     | INT
with LT given nonassoc precedence
2. Lexer supplies INT LT INT LT INT
3. Run parser

What is the expected output? What do you see instead?

Expected output: parse error
Actual output: parses without complaint

What version of the product are you using? On what operating system?

ply-2.3 on linux (arch/os doesn't matter) 

Please provide any additional information below.

The bug is on line 1746 of yacc.py:

                                    if (slevel > rlevel) or ((slevel == rlevel) and (rprec != 'left')): # BUG HERE
                                        # We decide to shift here... highest precedence to shift
                                        st_action[a] = j
                                        st_actionp[a] = p
                                        if not rlevel:
                                            n_srconflict += 1
                                            _vfc.write("shift/reduce conflict in state %d resolved as shift.\n" % st)
                                            _vf.write("  ! shift/reduce conflict for %s resolved as shift.\n" % a)
                                    elif (slevel == rlevel) and (rprec == 'nonassoc'):
                                        st_action[a] = None

Brief visual inspection confirms that the elif branch a few lines down
can never execute: if slevel == rlevel and rprec == 'nonassoc', the if
branch on line 1746 executes instead.  The solution is to change line
1746 to:

                                    if (slevel > rlevel) or ((slevel == rlevel) and (rprec == 'right')):


Note that this also brings it more in line with line 1691 (except that
on 1746 we want to shift on right associativity, rather than reduce on
left as on 1691).

I am attaching a patch that appears to fix the problem. 

Original issue reported on code.google.com by [email protected] on 28 Feb 2008 at 2:07

Attachments:
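The one-token change can be checked in isolation. The sketch below is a stand-alone paraphrase of the resolution logic from the report, not the actual yacc.py code (the function names `resolve_fixed` and `resolve_buggy` are invented); it shows that with the original `rprec != 'left'` condition the nonassoc branch is unreachable, while the proposed `rprec == 'right'` condition lets nonassoc conflicts become errors as intended.

```python
def resolve_fixed(slevel, rlevel, rprec):
    # Proposed fix: shift only on strictly higher shift precedence,
    # or on equal precedence with *right* associativity.
    if (slevel > rlevel) or ((slevel == rlevel) and (rprec == 'right')):
        return 'shift'
    elif (slevel == rlevel) and (rprec == 'nonassoc'):
        return 'error'          # INT LT INT LT INT now fails to parse
    else:
        return 'reduce'

def resolve_buggy(slevel, rlevel, rprec):
    # Original condition: rprec != 'left' is true for 'nonassoc' too,
    # so the elif branch below can never be reached.
    if (slevel > rlevel) or ((slevel == rlevel) and (rprec != 'left')):
        return 'shift'
    elif (slevel == rlevel) and (rprec == 'nonassoc'):
        return 'error'          # unreachable
    else:
        return 'reduce'
```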

ply documentation enhancement (braces example)

While working through the braces example I discovered some minor issues.
I took the liberty of correcting them and of extending the example into
a working script (see below and the attachment).

I don't know whether the documentation is part of the open source
project on Google Code, so I sent a mail to Dave Beazley as well.


CHANGES:
--------------------------------------------------------------------------------
OLD: t.value = t.lexer.lexdata[t.lexer.code_start:t.lexer.lexpos+1]
NEW: t.value = t.lexer.lexdata[t.lexer.code_start:t.lexer.lexpos-1]

OLD: r'(/\*(.|\n)*?*/)|(//.*)'
NEW: r'\"([^\\\n]|(\\.))*?\"'

thanks to: http://d.hatena.ne.jp/noritsugu/20080401
--------------------------------------------------------------------------------

SCRIPT:
# ----------------------------------------
# bracelex.py
# 
# Demonstrates how to grab arbitrary C code enclosed by braces.
# It resolves nested braces 
# and ignores braces in comments ( /* */ , //) 
# and strings (" ",' ')
# ----------------------------------------
# Import
import ply.lex as lex
import re
tokens = (
    'CCODE',
)
# Declare the state
states = (
  ('ccode','exclusive'),
)
# Match the first {. Enter ccode state.
def t_ccode(t):
    r'\{'
    t.lexer.code_start = t.lexer.lexpos        # Record the starting position
    t.lexer.level = 1                          # Initial brace level
    t.lexer.begin('ccode')                     # Enter 'ccode' state
# Rules for the ccode state
def t_ccode_lbrace(t):     
    r'\{'
    t.lexer.level +=1                
def t_ccode_rbrace(t):
    r'\}'
    t.lexer.level -=1
    # If closing brace, return the code fragment
    if t.lexer.level == 0:
         t.value = t.lexer.lexdata[t.lexer.code_start:t.lexer.lexpos-1]
         t.type = "CCODE"
         t.lexer.lineno += t.value.count('\n')
         t.lexer.begin('INITIAL')           
         return t
#C or C++ comment (ignore)    
def t_ccode_comment(t):
    r'(/\*(.|\n)*?\*/)|(//.*)'
    pass
# C string
def t_ccode_string(t):
   r'\"([^\\\n]|(\\.))*?\"'
# C character literal
def t_ccode_char(t):
   r'\'([^\\\n]|(\\.))*?\''
# Any sequence of non-whitespace characters (not braces, strings)
def t_ccode_nonspace(t):
   r'[^\s\{\}\'\"]+'
# Ignored characters (whitespace)
t_ccode_ignore = " \t\n"
# For bad characters, we just skip over it
def t_ccode_error(t):
    t.lexer.skip(1)
# Mandatory Error handling rule
def t_error(t):
    #print "?:%s" % t.value[0]
    t.lexer.skip(1)
    
#Execute
data = '''
// {this is some comment between braces}
function () {
    // {this is a single line comment between braces}
    this is interesting code
    /* {this is another single line comment between braces} */
    /*
    {this is a multiline comment between braces}
    */
    { code between nested braces 1
        { code between nested braces 2 }
    }
    '{string between braces}'
    "{another string between braces}"
    
}
'''
# create lexer
lexer = lex.lex(reflags = re.IGNORECASE)
    
# Tokenize (case insensitive matches)
lexer.input(data)
for tok in lexer:
    print tok.type.rjust(24), tok.value
    #print tok.type, tok.value, tok.lineno, tok.lexpos

Original issue reported on code.google.com by [email protected] on 4 Sep 2010 at 8:46

Attachments:

Python 3

What steps will reproduce the problem?
1. install python 3
2. import lex or yacc v.2.5

The new Python implementation no longer supports '%' string formatting,
and no longer supports raise, print, exec, etc. with unparenthesized
arguments.

Original issue reported on code.google.com by [email protected] on 7 Jan 2009 at 12:15

Wrong output filename for the parsing tables module

I've found a problem with ply 2.3 related to parsing table reloading.  I
am using Python 2.5.1 (on Windows 2000 :().

When we want to read the parsing tables from a dotted module name like
'pckg.subpckg.parsetab' (by adding 'tabmodule = "pckg.subpckg.parsetab" to
the parser constructor), the lr_read_tables function executes 'import
pckg.subpckg.parsetab as parsetab'. This is fairly ok, but that's not the
problem.

The problem is, when yacc writes the tables, it will write to a file named
'pckg.subpckg.parsetab.py' inside the specified outputdir (using the full
module name, and not only parsetab).
This isn't what we intended, so at reload, yacc will search for
'pckg.subpckg.parsetab' but it won't find a file named 'parsetab.py' inside
a 'pckg/subpckg' directory on python's module search path. 

So, instead of loading the tables from the previously generated file, it
will regenerate the parsing tables every time it gets loaded.

I've fixed the bug in my ply source.  The solution is rather ugly (to
blend with the rest of yacc's source :P), but the problem seems to be
gone.  If anyone's interested in my solution, please contact me by email
(cptbinho at gmail dot com) and I'll gladly send you my modified version.

Regards

Original issue reported on code.google.com by [email protected] on 8 Apr 2008 at 8:52
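The kind of fix the reporter describes can be sketched in a few lines. This is an illustration only, not the reporter's actual patch or yacc.py's real code (the function name `table_filename` and its parameters are invented): keep only the last component of the dotted module name when building the file name, so the file written into outputdir matches what a later `import pckg.subpckg.parsetab` will look for.

```python
import os.path

def table_filename(tabmodule, outputdir):
    # For a dotted module name such as 'pckg.subpckg.parsetab', write
    # only 'parsetab.py' into outputdir, rather than a file literally
    # named 'pckg.subpckg.parsetab.py'.
    base = tabmodule.split('.')[-1]
    return os.path.join(outputdir, base + '.py')
```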

lexer in debug mode does not output tokens during lexing

What steps will reproduce the problem?
1. Make lexer with lex.lex(debug=True)
2. Run lexer.input('a string'); list(lexer)

What is the expected output? What do you see instead?

The documentation in section 4.13 says that the lexer in debug mode will print 
debug information about the "tokens generating during lexing".  I took this to 
mean that the lexer would log the tokens during the lexing process, but this 
does not occur. 

It's no real problem to print the tokens as they arrive:

for t in lexer:
    print t

but maybe the documentation would benefit from clarification?

What version of the product are you using? On what operating system?

SVN r86 OSX 10.6

Please provide any additional information below.


Original issue reported on code.google.com by [email protected] on 31 Dec 2010 at 1:35

set_lineno uses wrong parameter

In yacc.py on line 202, the method set_lineno is

 def set_lineno(self,n,lineno):
     self.slice[n].lineno = n

instead of

 def set_lineno(self,n,lineno):
     self.slice[n].lineno = lineno

This results in the line number to be wrong - in the most common case,
likely ending up as 0.

Original issue reported on code.google.com by [email protected] on 3 Mar 2009 at 9:35

Python 3 slicing no longer uses __getslice__

What steps will reproduce the problem?
1. Use a slice in a production rule, eg:

def p_toplevel(p):
    """ toplevel : alpha beta gamma TOKEN delta """
    p[0] = p[1:4] + p[5:]

What is the expected output? What do you see instead?
Expected: p[0] is a list of the nonterminals alpha, beta, gamma, and delta.
Got:
  File "/usr/lib/python3.1/site-packages/ply/yacc.py", line 198, in __getitem__
    if n >= 0: return self.slice[n].value
TypeError: unorderable types: slice() >= int()

What version of the product are you using? On what operating system?

This is python 3.1, but this has been true since python 3.0: 
http://docs.python.org/py3k/whatsnew/3.0.html#operators-and-special-methods

Namely, instead of __getslice__ being called as it was in python 2.x, 
__getitem__ is called with a slice object, which is being compared with an int 
in YaccProduction's case. (Similarly, my testing shows that even in Python 2.7, 
a slice that includes a step, eg. a[1:4:2], sends a slice object to __getitem__ 
instead of to __getslice__.)

Simple patch included (Python 2 compatible) to fix this, but I note it allows 
the use of negative indices in slices in a way that doesn't match how negative 
indexing currently works.

Original issue reported on code.google.com by jokeserver on 28 Feb 2011 at 1:25

Attachments:
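The slice-handling requirement is easy to demonstrate with a minimal stand-in for the production object. This sketch is not the attached patch (`ProductionSketch` and `Sym` are made-up names); it only shows the shape of the fix: on Python 3, `p[1:4]` reaches `__getitem__` as a `slice` object, so that case must be tested before any integer comparison.

```python
class Sym:
    # Stand-in for a grammar symbol carrying a semantic value.
    def __init__(self, value):
        self.value = value

class ProductionSketch:
    # Minimal stand-in for YaccProduction indexing.  Python 3 never
    # calls __getslice__, so __getitem__ must check for slice objects
    # itself, before comparing the index with integers.
    def __init__(self, symbols):
        self.slice = symbols
    def __getitem__(self, n):
        if isinstance(n, slice):
            return [s.value for s in self.slice[n]]
        return self.slice[n].value

p = ProductionSketch([Sym(v) for v in ['S', 'alpha', 'beta', 'gamma', 'TOK', 'delta']])
```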

Ply lexer does not reset line number on each new scan.

What steps will reproduce the problem?
1. Run 'python ply_bug.py' (see attachment).

---

What is the expected output? What do you see instead?

The expected output is to have the line number the same each time, as we
parse the same input. However, it seems the line number is not reset each time.

---

What version of the product are you using? On what operating system?

Ply 3.3 on Linux.

---
Please provide any additional information below.

I included a patch (lex_bugfix.patch attachment) that resets the lineno
attribute of the lexer each time the input is set. If you apply the patch
and rerun the example, you will see that the line number of the parse error
is the same each time.

Also note that the lexpos attribute of the lexer is properly reset in the
input() function.

Original issue reported on code.google.com by [email protected] on 8 Apr 2010 at 6:05

Attachments:
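Until a patch like the attached one is applied, the reset can be done by hand before each scan. A sketch of that workaround (the `tokenize_fresh` name is invented; it assumes the usual PLY lexer interface of `input()` plus iteration):

```python
def tokenize_fresh(lexer, text):
    # Work around the bug: lexer.input() resets lexpos but not lineno,
    # so restore lineno manually before each new scan.
    lexer.lineno = 1
    lexer.input(text)
    return list(lexer)
```

With this wrapper, parsing the same input twice reports the same line numbers both times.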

calc.py crash after syntax error

What steps will reproduce the problem?
1. Start calc.py
2. type 2=
3.

What is the expected output? What do you see instead?
Expected: Syntax error at '='
I get:
Syntax error at '='
Traceback (most recent call last):
  File "./calc.py", line 106, in <module>
    yacc.parse(s)
  File "../ply/yacc.py", line 303, in parse
  File "./calc.py", line 84, in p_expression_number
    p[0] = p[1]
  File "../ply/yacc.py", line 121, in __getitem__
IndexError: list index out of range


What version of the product are you using? On what operating system?
UBUNTU and python2.5, ply 2.3

Please provide any additional information below.


Original issue reported on code.google.com by [email protected] on 26 Sep 2007 at 11:08

lineno attribute not mentioned for 'tok' object

In section 3.19 (Miscellaneous Issues), the rules that should be
followed when writing an external lexer could be extended, with respect
to the tokens, to mention the need for a 'lineno' attribute to get line
number information.
You may also expect column information somewhere, but I never use that,
so I don't know... :)

Original issue reported on code.google.com by [email protected] on 5 Feb 2008 at 3:44

pip can't install on Python 3.2 without errors

With PLY 3.4 on Python 3.2 on Windows XP, if I type "pip install ply", I
get this:

Downloading/unpacking ply
  Real name of requirement ply is ply
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\python32\lib\threading.py", line 740, in _bootstrap_inner
    self.run()
  File "C:\python32\lib\threading.py", line 693, in run
    self._target(*self._args, **self._kwargs)
  File "C:\python32\lib\site-packages\pip-1.0.2-py3.2.egg\pip\index.py", line 239, in _get_queued_page
    page = self._get_page(location, req)
  File "C:\python32\lib\site-packages\pip-1.0.2-py3.2.egg\pip\index.py", line 324, in _get_page
    return HTMLPage.get_page(link, req, cache=self.cache)
  File "C:\python32\lib\site-packages\pip-1.0.2-py3.2.egg\pip\index.py", line 445, in get_page
    inst = cls(u(resp.read()), real_url, headers)
  File "C:\python32\lib\site-packages\pip-1.0.2-py3.2.egg\pip\backwardcompat.py", line 58, in u
    return s.decode('utf-8')
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf8 in position 10783: invalid start byte

  Downloading ply-2.5.tar.gz (119Kb): 119Kb downloaded
  Running setup.py egg_info for package ply

Installing collected packages: ply
  Running setup.py install for ply
      File "C:\python32\Lib\site-packages\ply\lex.py", line 187
        exec "import %s as lextab" % tabfile
                                 ^
    SyntaxError: invalid syntax

      File "C:\python32\Lib\site-packages\ply\yacc.py", line 174
        raise YaccError, "Can't directly instantiate Parser. Use yacc() instead."
                       ^
    SyntaxError: invalid syntax


Successfully installed ply
Cleaning up...


It looks like there are two separate issues: the Unicode error and the exec 
error. I haven't looked into the Unicode error, but lex.py is using exec as a 
statement, not as a function, which is clearly wrong for Python 3.

Original issue reported on code.google.com by [email protected] on 20 Dec 2011 at 4:08
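The exec error has a straightforward fix: the parenthesized, function-call form of exec with an explicit namespace is valid syntax on both Python 2 and Python 3. A sketch of that pattern (the helper name `load_module_via_exec` is invented, and `json` stands in for a real table module; it is not the code lex.py actually uses):

```python
def load_module_via_exec(modname):
    # exec(...) called with a string and an explicit namespace dict
    # parses on both Python 2 and Python 3, unlike the Python 2-only
    # statement form:  exec "import %s as tabmodule" % modname
    namespace = {}
    exec("import %s as tabmodule" % modname, namespace)
    return namespace["tabmodule"]

mod = load_module_via_exec("json")
```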

cpp.py skips preprocessor directives that follow a comment line

Using ply-3.3, if I process the following text using cpp.py:

// A comment
#define A_DIRECTIVE

then the directive line is _not_ detected as a directive. This occurs 
because the regex definition of a CPP_COMMENT (line 59) causes the \n to be 
included in the comment token, which then causes Preprocessor.group_lines 
to interpret the directive as part of the previous comment line. 

This can be easily fixed by changing the CPP_COMMENT definition to:
r'(/\*(.|\n)*?\*/)|(//[^\n]*)' 


Original issue reported on code.google.com by [email protected] on 30 Nov 2009 at 7:40
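The difference between the two patterns can be checked with the re module directly. This is a quick stand-alone sketch, not the cpp.py code (the constant name is invented):

```python
import re

# Fixed pattern from the report: a // comment now stops before the
# newline instead of swallowing it.
CPP_COMMENT_FIXED = re.compile(r'(/\*(.|\n)*?\*/)|(//[^\n]*)')

m = CPP_COMMENT_FIXED.match('// A comment\n#define A_DIRECTIVE')
# m.group(0) is '// A comment' -- the newline is left for the line
# grouping logic, so the directive stays on its own line.
```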

in clex.py comments are not detected

What steps will reproduce the problem?
1. The regular expression for comments is bad, so comments are not
   detected.  TIMES and DIVIDE tokens are found instead.

What is the expected output? What do you see instead?
The regular expression in clex.py for comments (lines 143 to 146) is:
143 # Comments
144 def t_comment(t):
145     r' /\*(.|\n)*?\*/'
146     t.lineno += t.value.count('\n')

There is a stray space as the first character of the pattern.

The correct regular expression is:
143 # Comments
144 def t_comment(t):
145     r'/\*(.|\n)*?\*/'
146     t.lineno += t.value.count('\n')


What version of the product are you using? On what operating system?
PLY 2.3 on Windows XP

Please provide any additional information below.


Original issue reported on code.google.com by [email protected] on 5 May 2007 at 6:59
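The effect of the stray leading space is easy to verify with the re module on its own (a quick sketch, not the clex.py code; the constant names are invented):

```python
import re

# With the stray leading space, the pattern can only match a comment
# that is preceded by a literal space at the match position, so a
# comment at the start of a token never matches:
BAD  = re.compile(r' /\*(.|\n)*?\*/')
GOOD = re.compile(r'/\*(.|\n)*?\*/')

assert BAD.match('/* a comment */') is None
assert GOOD.match('/* a comment */') is not None
```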

Add EOF rule for lexing

It seems there is no EOF rule in PLY right now, to match the end of file, 
dirty hacks must be used.

What version of the product are you using? On what operating system?
V3.3 on Linux


Original issue reported on code.google.com by [email protected] on 11 Mar 2010 at 4:50

Type in ANSI C grammar

While poking around in the ANSI C example I noticed what looks like a typo 
in the second struct_declaration_list rule.  Instead of:

def p_struct_declaration_list_2(t):
    'struct_declaration_list : struct_declarator_list struct_declaration'
    pass


I think it should be:

    'struct_declaration_list : struct_declaration_list struct_declaration'
                                          ^^^^^^^

Regards,
Phil



Original issue reported on code.google.com by [email protected] on 30 Mar 2007 at 3:09

Syntax error in ply/cpp.py in r39

What steps will reproduce the problem?
1. I checked out r39 and installed it (under Python 2.3, as the log
   below shows).
2. ply/cpp.py seems to contain a syntax error.
3. I think the '... for ... in ...' needs '[' and ']' around it
   (generator expressions require Python 2.4 or later)...

Dennis Hendriks

Output:

$ python setup.py install
running install
running bdist_egg
running egg_info
creating ply.egg-info
writing ply.egg-info/PKG-INFO
writing top-level names to ply.egg-info/top_level.txt
writing dependency_links to ply.egg-info/dependency_links.txt
writing manifest file 'ply.egg-info/SOURCES.txt'
reading manifest file 'ply.egg-info/SOURCES.txt'
writing manifest file 'ply.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-i686/egg
running install_lib
running build_py
creating build
creating build/lib
creating build/lib/ply
copying ply/__init__.py -> build/lib/ply
copying ply/lex.py -> build/lib/ply
copying ply/yacc.py -> build/lib/ply
copying ply/cpp.py -> build/lib/ply
creating build/bdist.linux-i686
creating build/bdist.linux-i686/egg
creating build/bdist.linux-i686/egg/ply
copying build/lib/ply/__init__.py -> build/bdist.linux-i686/egg/ply
copying build/lib/ply/lex.py -> build/bdist.linux-i686/egg/ply
copying build/lib/ply/yacc.py -> build/bdist.linux-i686/egg/ply
copying build/lib/ply/cpp.py -> build/bdist.linux-i686/egg/ply
byte-compiling build/bdist.linux-i686/egg/ply/__init__.py to __init__.pyc
byte-compiling build/bdist.linux-i686/egg/ply/lex.py to lex.pyc
byte-compiling build/bdist.linux-i686/egg/ply/yacc.py to yacc.pyc
byte-compiling build/bdist.linux-i686/egg/ply/cpp.py to cpp.pyc
  File "build/bdist.linux-i686/egg/ply/cpp.py", line 742
    filename = "".join(x.value for x in tokens[1:i])
                                 ^
SyntaxError: invalid syntax
creating build/bdist.linux-i686/egg/EGG-INFO
copying ply.egg-info/PKG-INFO -> build/bdist.linux-i686/egg/EGG-INFO
copying ply.egg-info/SOURCES.txt -> build/bdist.linux-i686/egg/EGG-INFO
copying ply.egg-info/dependency_links.txt ->
build/bdist.linux-i686/egg/EGG-INFO
copying ply.egg-info/top_level.txt -> build/bdist.linux-i686/egg/EGG-INFO
zip_safe flag not set; analyzing archive contents...
creating dist
creating 'dist/ply-2.4-py2.3.egg' and adding 'build/bdist.linux-i686/egg' to it
removing 'build/bdist.linux-i686/egg' (and everything under it)
Processing ply-2.4-py2.3.egg
Copying ply-2.4-py2.3.egg to <***>/lib/python2.3/site-packages
Adding ply 2.4 to easy-install.pth file

Installed <***>/lib/python2.3/site-packages/ply-2.4-py2.3.egg
Processing dependencies for ply==2.4
Finished processing dependencies for ply==2.4

Original issue reported on code.google.com by [email protected] on 29 Apr 2008 at 6:30

In Python 2.7, broken by `from __future__ import unicode_literals`

What steps will reproduce the problem?
1. Make a virtualenv with Python 2.7
2. Take some working example that uses precedence
3. Add `from __future__ import unicode_literals`

What is the expected output?
The behavior should be unchanged from the basic Python 2.7 behavior.

What do you see instead?
The error raised is `YaccError: Unable to build parser` and the verbose
output says `precedence associativity must be a string`

What version of the product are you using? On what operating system?
PLY 3.4
Python 2.7 (probably true for any 2.X)
OSX 10.7.5 (probably irrelevant)

Please provide any additional information below.
Everything does work fine in Python 3.  I am working on a project that
should support both Python 2 and Python 3 simultaneously.

Original issue reported on code.google.com by [email protected] on 26 Mar 2013 at 5:11
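A user-side workaround, assuming the cause suggested by the error message (PLY presumably checks the associativity with something like isinstance(..., str), which unicode literals fail on Python 2), is to force the precedence entries back to native strings:

```python
from __future__ import unicode_literals

# Under unicode_literals, 'left' is a unicode object on Python 2 and
# would fail an isinstance(..., str) check; wrapping each entry in
# str() keeps the tuple contents as native strings on both Python 2
# and Python 3 (where str() is a no-op here).
precedence = (
    (str('left'), str('PLUS'), str('MINUS')),
    (str('left'), str('TIMES'), str('DIVIDE')),
)
```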

calc.py has typo in the division operation (correction is included)

What steps will reproduce the problem?
1. Run calc.py and submit "10/2".

What is the expected output? What do you see instead?
Expected value is 5, but actual value is "None".

What version of the product are you using? On what operating system?
Copy hosted at http://www.dabeaz.com/ply/example.html, downloaded 3/19/2007.

Please provide any additional information below.
The patched function should read:

def p_expression_binop(t):
    '''expression : expression PLUS expression
                  | expression MINUS expression
                  | expression TIMES expression
                  | expression DIVIDE expression'''
    if t[2] == '+'  : t[0] = t[1] + t[3]
    elif t[2] == '-': t[0] = t[1] - t[3]
    elif t[2] == '*': t[0] = t[1] * t[3]
    elif t[2] == '/': t[0] = t[1] / t[3]

(The last line said "elif t[3] == '/'...")

Original issue reported on code.google.com by [email protected] on 20 Mar 2007 at 3:22
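The corrected dispatch is just a chain of tests on the operator token, t[2] in every branch. Extracted as a stand-alone function (the name `binop` is made up for illustration), the fix is easy to check:

```python
def binop(left, op, right):
    # Corrected dispatch: every branch tests the operator (t[2] in the
    # grammar rule), never an operand slot such as t[3].
    if op == '+':
        return left + right
    elif op == '-':
        return left - right
    elif op == '*':
        return left * right
    elif op == '/':
        return left / right
```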

ply no longer python 2.3 compatible

What steps will reproduce the problem?
1. Use lex and make sure the table file is written.

What is the expected output? What do you see instead?
Line 143 in lex.py: "basetabfilename = tabfile.rsplit(".",1)[-1]". In
python 2.3, the rsplit method doesn't exist. It is new in python 2.4.

What version of the product are you using? On what operating system?
I'm using 2.4 prerelease version on python 2.3.

Please provide any additional information below.
One option is a regular split(), at some cost in performance. Alternatively,
rfind() can be used.
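
A sketch of the rfind() route, equivalent to `tabfile.rsplit(".", 1)[-1]` but using only methods available in Python 2.3 (the helper name is hypothetical):

```python
def last_dot_part(name):
    # Equivalent of name.rsplit(".", 1)[-1] without rsplit(),
    # which only appeared in Python 2.4; rfind() exists in 2.3.
    i = name.rfind(".")
    if i < 0:
        return name
    return name[i + 1:]
```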

Original issue reported on code.google.com by [email protected] on 6 May 2008 at 8:20

lex state does not reset with lex.input()

What steps will reproduce the problem?
1. Run lexer with states, where lexer finishes in state other than initial
2. Rerun lexer after lexer.input(txt); continues in previous state rather than 
initial state

See lex_state_history.py (attached)

What is the expected output?

[LexToken(SOMETHINGFIRST,'start@',1,0), LexToken(STATECHANGER,'change@',1,6), 
LexToken(SOMETHINGSTATEY,'later',1,13)]
[LexToken(SOMETHINGFIRST,'start@',1,0), LexToken(STATECHANGER,'change@',1,6), 
LexToken(SOMETHINGSTATEY,'later',1,13)]

What do you see instead?

[LexToken(SOMETHINGFIRST,'start@',1,0), LexToken(STATECHANGER,'change@',1,6), 
LexToken(SOMETHINGSTATEY,'later',1,13)]
Traceback (most recent call last):
  File "lex_state_history.py", line 31, in <module>
    print list(lexer)
  File "/Users/mb312/usr/local/lib/python2.6/site-packages/ply/lex.py", line 405, in next
    t = self.token()
  File "/Users/mb312/usr/local/lib/python2.6/site-packages/ply/lex.py", line 393, in token
    raise LexError("Illegal character '%s' at index %d" % (lexdata[lexpos],lexpos), lexdata[lexpos:])
ply.lex.LexError: Illegal character 's' at index 0

What version of the product are you using? On what operating system?

Current SVN checkout r86 OSX 10.6

Please provide any additional information below.

Suggested patch attached
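
A toy model of the intended behavior (not PLY's actual implementation): an `input()` that resets the lexer to its initial state instead of carrying the previous run's state over, which is what the attached patch aims at:

```python
class TinyStatefulLexer:
    """Stand-in illustrating the fix; all names here are hypothetical."""
    def __init__(self):
        self.lexstate = 'INITIAL'

    def begin(self, state):
        self.lexstate = state

    def input(self, text):
        self.lexdata = text
        self.begin('INITIAL')   # the fix: reset state on every new input

lexer = TinyStatefulLexer()
lexer.begin('statey')           # simulate finishing in a non-initial state
lexer.input('start@ change@ later')
```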


Original issue reported on code.google.com by [email protected] on 31 Dec 2010 at 1:28

Attachments:

re.compile errors in lex when reading from lextab file.

This was also posted to ply-hack, but without the attached patch.  Sorry
for the duplication.

What steps will reproduce the problem?
1. specify a lexer using # comments and whitespace in regexps (re.VERBOSE).
2. don't pass any reflags to lex.  (But see below).
3. compile a lextab file.
4. do a second run from lextab file.

What is the expected output? What do you see instead?
This works when compiling the regexps from your source file, but fails when
compiling them from the lextab file.  The re.VERBOSE flag is set for the
first case, but not the second case.

What version of the product are you using? On what operating system?
ply 3.2 on linux, with python 2.6.

Please provide any additional information below.
There is actually a second problem because calling lex() with
reflags=re.VERBOSE doesn't get stored in the lextab file, so doesn't fix
this.  I'm supplying a patch to fix this second problem, but the first
problem remains if re.VERBOSE is not passed to lex().
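
A minimal reproduction of the root cause, using only the `re` module: a pattern written for `re.VERBOSE` stops matching when recompiled without that flag, which is what happens when the regexps are rebuilt from a lextab file that did not record the original reflags:

```python
import re

# Whitespace and the comment are ignored under re.VERBOSE, so the
# pattern reduces to \d+; without the flag they are literal characters.
pattern = r"""
    \d+     # an integer literal
"""
with_flag = re.match(pattern, "42", re.VERBOSE)
without_flag = re.match(pattern, "42")
```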

Original issue reported on code.google.com by [email protected] on 25 Jul 2009 at 9:08

Attachments:

ply 3.4 fails to install with Python 3.3

When I type

  easy_install ply

using Python 3.3, I get the following error:

Traceback (most recent call last):
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/bin/easy_install", line 9, in <module>
    load_entry_point('distribute==0.6.27', 'console_scripts', 'easy_install')()
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/easy_install.py", line 1915, in main
    with_ei_usage(lambda:
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/easy_install.py", line 1896, in with_ei_usage
    return f()
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/easy_install.py", line 1919, in <lambda>
    distclass=DistributionWithoutHelpCommands, **kw
  File "/store/store/sandia/python3.3/ver-3.3.0/lib/python3.3/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/store/store/sandia/python3.3/ver-3.3.0/lib/python3.3/distutils/dist.py", line 917, in run_commands
    self.run_command(cmd)
  File "/store/store/sandia/python3.3/ver-3.3.0/lib/python3.3/distutils/dist.py", line 936, in run_command
    cmd_obj.run()
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/easy_install.py", line 350, in run
    self.easy_install(spec, not self.no_deps)
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/easy_install.py", line 590, in easy_install
    return self.install_item(spec, dist.location, tmpdir, deps)
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/easy_install.py", line 620, in install_item
    dists = self.install_eggs(spec, download, tmpdir)
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/easy_install.py", line 814, in install_eggs
    return self.build_and_install(setup_script, setup_base)
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/easy_install.py", line 1094, in build_and_install
    self.run_setup(setup_script, setup_base, args)
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/easy_install.py", line 1080, in run_setup
    run_setup(setup_script, args)
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/sandbox.py", line 31, in run_setup
    lambda: exec(compile(open(
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/sandbox.py", line 79, in run
    return func()
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/sandbox.py", line 34, in <lambda>
    {'__file__':setup_script, '__name__':'__main__'})
  File "setup.py", line 29, in <module>
  File "/store/store/sandia/python3.3/ver-3.3.0/lib/python3.3/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/store/store/sandia/python3.3/ver-3.3.0/lib/python3.3/distutils/dist.py", line 917, in run_commands
    self.run_command(cmd)
  File "/store/store/sandia/python3.3/ver-3.3.0/lib/python3.3/distutils/dist.py", line 936, in run_command
    cmd_obj.run()
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/bdist_egg.py", line 227, in run
    os.path.join(archive_root,'EGG-INFO'), self.zip_safe()
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/bdist_egg.py", line 266, in zip_safe
    return analyze_egg(self.bdist_dir, self.stubs)
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/bdist_egg.py", line 402, in analyze_egg
    safe = scan_module(egg_dir, base, name, stubs) and safe
  File "/home/hudson/slave/workspace/PyUtilib_trunk_python3.3/python/lib/python3.3/site-packages/distribute-0.6.27-py3.3.egg/setuptools/command/bdist_egg.py", line 429, in scan_module
    code = marshal.load(f);  f.close()
ValueError: bad marshal data (unknown type code)

Error installing package ply with easy_install


I'm installing on a 64-bit Linux box with RH 6.

--Bill

Original issue reported on code.google.com by whart222 on 8 Nov 2012 at 12:26

rununit.py FAILED (failures=10) ply-2.5

What steps will reproduce the problem?
1. after python setup.py install, run test/rununit.py

What is the expected output? What do you see instead?
The result is 10 failures out of 52 tests. I assume failures are unexpected?
The attached file is the stderr of rununit.

What version of the product are you using? On what operating system?
ply-2.5.tar.gz
Linux localhost.localdomain 2.6.27.9-159.fc10.x86_64 #1 SMP Tue Dec 16
14:47:52 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
$ python
Python 2.5.2 (r252:60911, Sep 30 2008, 15:42:03) 
[GCC 4.3.2 20080917 (Red Hat 4.3.2-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

Please provide any additional information below.
note: this is an install of ply-2.5 under virtualenv.
a virtualenv install of ply-1.8 works (tests) ok:
$ python rununit.py
........................................
----------------------------------------------------------------------
Ran 40 tests in 2.736s

OK

Original issue reported on code.google.com by [email protected] on 26 Jan 2009 at 9:08

Attachments:

PLY 2.5 fails to shift a token

What steps will reproduce the problem?
1. python cxx.py bug.cpp
2. python cxx.py -v bug.cpp

What is the expected output? What do you see instead?
1. No output
2. Show the PLY trace, and terminate normally.

What version of the product are you using? On what operating system?

PLY 2.5, under Cygwin.

Please provide any additional information below.

I submitted a note about this to the ply-hack discussion, and I have
subsequently refined the C++ parser here.  I've tested the parser on a
variety of C++ test examples.  At this point, the remaining parsing issue
appears to lie with the table that PLY generates to parse nested ids, like
A::B::C.

In the bug.cpp example, the parser generates an error in state 124, when
processing a SCOPE token (which represents '::').  But state 124 is
summarized as:

state 124

    (5) id_scope -> id . SCOPE
    (8) nested_id -> id .

    SCOPE           shift and go to state 401

So, this state should shift the SCOPE token, but it doesn't.

Perhaps a clue to the issue with PLY is that there are two other
'equivalent' states in the table:

state 280

    (8) nested_id -> id .
    (5) id_scope -> id . SCOPE

state 641

    (5) id_scope -> id . SCOPE

Perhaps PLY is getting confused...???

Original issue reported on code.google.com by whart222 on 16 Sep 2008 at 7:00

Attachments:

Lex.py will always write out lextab files if in optimize mode, even if the lexer was created using a lextab file

What steps will reproduce the problem?
1. Use optimize mode
2. Generate tables
3. Unset write permission on the tables just generated

What is the expected output? What do you see instead?
Nothing. Why should lex.py write to the lextab file if it read the tables
from there to begin with?

Instead, I see an error message, that lex.py has tried to write to the
lextab file and failed.

What version of the product are you using? On what operating system?
3.2, linux (ubuntu 9.04, installed ply from pypi)

Please provide any additional information below.
A patch to fix this is attached

Original issue reported on code.google.com by [email protected] on 26 Aug 2009 at 10:33

Attachments:

yacc.parse(debug=1) does not generate 'copious amounts of debugging during parsing'

The documentation of PLY promises 'copious amounts of debugging during
parsing' with yacc.parse(debug=1) in the section '5.12 Yacc implementation
notes'.
debug=1, however, only prints an additional line when an error is found.

To get the desired output, one should pass a number > 1 (i.e.,
`yacc.parse(debug=2)`).

This probably holds also when re-directing output to file (the next item).

Original issue reported on code.google.com by [email protected] on 22 Jan 2008 at 12:22

way to create parsers from lr tables

yacc.yacc() builds the tables and returns a new LRParser. It would be nice to
be able to simply get an LRTable back, so that new LRParsers could be
created from it. Since parsers keep their state on their instances, it should
be easy to create one from the same tables, so concurrent parsing can occur.

It would also be nice if something similar could be done for lex.


Original issue reported on code.google.com by [email protected] on 28 May 2010 at 8:30

lexer line number attribute is not properly incremented

What steps will reproduce the problem?
1. Build a parser that reports lexer.lineno on error
2. Trigger an error

What is the expected output? What do you see instead?

Expected output is the line number of the error.  Actual output is 1. 

What version of the product are you using? On what operating system?

PLY 3.3

Please provide any additional information below.

Inspection of the source code reveals that the lexer's lineno attribute is 
initialized to 1, but never incremented as characters are read.  The following 
patch appears to fix the problem.

--- a/lex.py
+++ b/lex.py
@@ -309,6 +309,8 @@ class Lexer:
         lexdata   = self.lexdata

         while lexpos < lexlen:
+            if lexdata[lexpos] == '\n':
+                self.lineno += 1
             # This code provides some short-circuit code for whitespace, tabs, and other ignored characters
             if lexdata[lexpos] in lexignore:
                 lexpos += 1
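
For completeness, the documented PLY idiom is to track newlines with an explicit `t_newline` rule in the lexer module rather than patching lex.py itself. A sketch, exercised outside PLY with hypothetical stand-in objects:

```python
# The standard PLY idiom: a rule that bumps lineno by the number of
# newlines matched (the docstring is the token's regex in PLY).
def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

# Minimal stand-ins (hypothetical names) to exercise the rule here:
class FakeLexer:
    lineno = 1

class FakeToken:
    lexer = FakeLexer()
    value = '\n\n\n'

tok = FakeToken()
t_newline(tok)   # lineno advances by 3
```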

Original issue reported on code.google.com by [email protected] on 20 Sep 2010 at 2:37

Negative indexes no longer work in PLY

What steps will reproduce the problem?
1. Use a query production like:

def p_query(p):
    '''query : query_part'''
    p[0] = Query(p[-1])

What is the expected output? What do you see instead?
This used to work in PLY 2.5.  I found that it no longer works in PLY 3.3 when 
I upgraded today.  Instead, I get the exception:
    AttributeError: YaccSymbol instance has no attribute 'value'

It is possible to use p[1] rather than p[-1], and perhaps this is actually a 
better way of doing it, but I thought it was worth noting since this was a 
change that broke existing code.
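
The portable form mentioned above, sketched with a stand-in `Query` class and a plain list in place of PLY's YaccProduction: with a single symbol on the right-hand side, `p[1]` refers to it directly and is stable across PLY versions.

```python
class Query:
    """Stand-in for the reporter's class."""
    def __init__(self, part):
        self.part = part

def p_query(p):
    '''query : query_part'''
    p[0] = Query(p[1])   # p[1], not p[-1]: works in PLY 2.5 and 3.x

# Exercise the rule outside PLY: index 0 is the result slot.
p = [None, 'some_query_part']
p_query(p)
```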

Original issue reported on code.google.com by [email protected] on 31 Jul 2010 at 5:48

Python's RE module causes the wrong Token to come out of the Lexifier

What steps will reproduce the problem?
1. Create Tokens, where one is a broadly generic version of the second
2. Declare them both as functions (probably not necessary, but it
guarantees priority) with the specific case listed first.
3. Try parsing a file with this lexical structure; to see it in action, please
use the attached files.

What is the expected output? What do you see instead?
When reaching the line that says #begin environment support, I expect a
K_BEGIN_ENV_SUPPORT token.  Instead the Parser gets a K_ANYTHING token, and
uses the wrong rule.

What version of the product are you using? On what operating system?
Version 2.3, downloaded straight from the website. Running Fedora 7
(tragically), using Eclipse for development and debugging (version 3.2 with
the latest release of pydev 1.3.4). I made a small change so that I can see
all the groups from the Match object in lex. I'm including my copy of
lex.py and yacc.py.

Please provide any additional information below.
As far as I can tell, lex seems to compile all the regexes into one massive
regex, with parameters.  I plan on making a patch where it will instead
iterate through the regexes until one matches.  This will probably be
slower.  Please let me know if this is desired, or if there will be another
solution in the works.


Original issue reported on code.google.com by [email protected] on 25 Jun 2007 at 3:00

Attachments:

No notice the project has moved to github

I didn't know until I checked the main PLY website just now that the source is 
now hosted on github instead of here. I think the project summary here should 
be updated to reflect this.

Original issue reported on code.google.com by [email protected] on 23 Dec 2011 at 7:43

Patch for allowing custom LexToken class

I needed to have a filename attached to each token.
One hacky way is to replace the LexToken class with my own version after
import (i.e., `lex.LexToken = MyTokenClass`).
A better way would be to have the lexer construct tokens from a variable 
holding the class rather than a hardcoded name.
The attached patch realizes this.

My token class can copy the file name as part of the constructor.

It might be better to also have a function call that delivers the token data to 
the new token (eg the token constructor??).
That would solve the problem of deciding when it is safe to read eg 
token.lineno. I don't know how that would affect lexer speed though, so I left 
this change out.

Original issue reported on code.google.com by [email protected] on 9 Jun 2010 at 6:59

Attachments:
