parser's People

Contributors

halpo

parser's Issues

Remove Computed Assignment at load time

Brian Ripley informed me that the parser package does computed assignment at load time, which is strongly discouraged in the Writing R Extensions manual. Remove if possible.

Non-top level `if` causes wrong parse tree

A token following an if clause can be recorded twice, and the output will contain an empty terminal token. For example, for

{
    if (x) 2
    f(x)
}

the result looks like:

   token id parent top_level           token.desc terminal text
1    123  1     35         0                  '{'     TRUE    {
2    272  4     18         0                   IF     TRUE   if
3     40  6     18         0                  '('     TRUE    (
4    263  7      9         0               SYMBOL     TRUE    x
5     41  8     18         0                  ')'     TRUE    )
6     78  9     18         0                 expr    FALSE     
7    261 12     13         0            NUM_CONST     TRUE    1
8     78 13     18         0                 expr    FALSE     
9    263 16     35         0               SYMBOL     TRUE    f
10    78 18     35         0                 expr    FALSE     
11   297 21     23         0 SYMBOL_FUNCTION_CALL     TRUE     
12    40 22     29         0                  '('     TRUE    (
13    78 23     29         0                 expr    FALSE     
14   263 24     26         0               SYMBOL     TRUE    x
15    41 25     29         0                  ')'     TRUE    )
16    78 26     29         0                 expr    FALSE     
17    78 29     35         0                 expr    FALSE     
18   125 33     35         0                  '}'     TRUE    }
19    78 35      0         0                 expr    FALSE

Row 9 is redundant, and the text f should appear in row 11, where the SYMBOL_FUNCTION_CALL token currently has empty text.
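For comparison, the same snippet can be run through base R's own parser. This is only a sketch of a sanity check, assuming that `utils::getParseData()` is an acceptable reference for what the terminal tokens should look like; it does not call parser itself.

```r
# Parse the snippet from this issue with base R and keep the source
# references, which getParseData() needs.
code <- "{\n    if (x) 2\n    f(x)\n}"
pd <- utils::getParseData(parse(text = code, keep.source = TRUE))

# Keep only terminal tokens and show their type and text.
terminals <- pd[pd$terminal, c("token", "text")]
print(terminals, row.names = FALSE)

# Every terminal token should carry non-empty text, and f should be
# recorded exactly once.
all(nzchar(terminals$text))
sum(terminals$text == "f")
```

A check of this shape could be turned into a regression test for parser once the duplicated token is fixed.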

id attr on symbols and NULLs

parser tries to attach "id" attributes to symbols and NULLs. However, a "symbol" is, like an environment, only a reference to a single global entity (even though the symbol will be interpreted differently according to context when R is executing code containing it). Thus, putting an attribute on one instance of 'x' will put the same attribute on ALL occurrences of 'x'!

The example below looks odd because GitHub has applied GitHub-flavoured Markdown to it, but the point should be clear. Example:

p <- parser( text='x;x')
p[[1]]
x
attr(,"id")
[1] 8
p[[2]]
x
attr(,"id")
[1] 8

Uh-oh; p[[2]] shouldn't have the same tag (id attr) as p[[1]]! Look-up now doesn't work.

quote( x)
x
attr(,"id")
[1] 8

Big uh-oh; the symbol 'x' has globally acquired a tag!

Also, parser( text='NULL') tries to put a tag on the NULL but can't, because NULL can't have attributes. Again, there is only one "NULL".

My suggested solution is not to tag each element directly, but instead to tag its parent element with a vector of child tags, one element for each child. NULL and symbols can't have children, so this will never try to tag illegal things AFAICS.
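The suggestion above can be sketched as follows. This is an illustration only: base `parse()` stands in for `parser::parser()`, and the attribute name `child_ids`, the helper `child_id()` and the id values are all hypothetical.

```r
# Parse two occurrences of the same symbol plus a NULL. Base parse() is
# used here as a stand-in for parser::parser().
exprs <- parse(text = "x; x; NULL", keep.source = FALSE)

# Hypothetical ids, one per child, stored on the parent expression vector
# instead of on the children themselves.
attr(exprs, "child_ids") <- c(3L, 8L, 12L)

# Look up the id of the i-th child through its parent.
child_id <- function(parent, i) attr(parent, "child_ids")[[i]]

child_id(exprs, 1L)
child_id(exprs, 2L)

# Both children are the same symbol, yet each has its own id, and the
# NULL child never needs to carry an attribute.
identical(exprs[[1]], exprs[[2]])
is.null(exprs[[3]])
```

Because the ids live on the parent, nothing is ever attached to a symbol or to NULL, so the global symbol `x` is left untouched.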

.Call improperly formed

From Prof. Brian Ripley:

For a package with a namespace (all these days), .C etc are allowed
without a PACKAGE argument to refer to entry points in the package,
but that was never checked.

I wrote some diagnostic code, and the namespace identification is not
doing a particularly good job. For example calls like

 drop(.C(...))
 do.call(".C", args)

are all missed, including some in your package (and ca 25 others). We
will do better in R-devel, but if you want the package to work well in
earlier versions you should add PACKAGE arguments.

An even better idea is to register your entry points and refer to them
as R objects. This is done extensively in R itself, so for example
fisher.test contains

    else PVAL <- .C(C_fexact, nr, nc, x, nr, as.double(-1), 
        as.double(100), as.double(0), double(1), p = double(1), 
        as.integer(workspace), mult = as.integer(mult), PACKAGE = "stats")$p

(and the PACKAGE = "stats" is not actually needed).
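To see why a form such as `do.call(".C", args)` is hard to spot statically, here is a small sketch of a checker that walks an R expression and collects direct foreign-function calls lacking a `PACKAGE` argument. It is not the diagnostic code mentioned in the message; the function name `find_unqualified` and the entry point name `"foo"` are made up for illustration.

```r
# Recursively collect direct calls to .C/.Call/.Fortran/.External that
# have no PACKAGE argument. Hypothetical helper, for illustration only.
# Note: it does not handle calls with empty arguments such as x[, 1].
find_unqualified <- function(e) {
  hits <- list()
  if (is.call(e)) {
    fn <- e[[1]]
    is_foreign <- is.symbol(fn) &&
      as.character(fn) %in% c(".C", ".Call", ".Fortran", ".External")
    if (is_foreign && !("PACKAGE" %in% names(e))) {
      hits <- list(e)
    }
    # Descend into the function position and every argument.
    for (arg in as.list(e)) {
      hits <- c(hits, find_unqualified(arg))
    }
  }
  hits
}

# A nested call is found by walking the tree.
length(find_unqualified(quote(drop(.C("foo", x)))))

# Here ".C" is only a string, so a walker looking for calls misses it.
length(find_unqualified(quote(do.call(".C", args))))

# With PACKAGE supplied there is nothing to report.
length(find_unqualified(quote(.C("foo", x, PACKAGE = "pkg"))))
```

Adding `PACKAGE` arguments, or registering entry points as suggested above, avoids relying on any such detection.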

Non-ASCII characters

I encountered a strange issue with non-ASCII characters while using (the really awesome) parser. E.g. let us check out how a formula is parsed:

> parser::parser(text='hp ~ wt')
expression(hp ~ wt)
attr(,"data")
  line1 col1 byte1 line2 col2 byte2 token id parent top_level token.desc
1     1    0     0     1    2     2   263  1      4         0     SYMBOL
2     1    3     3     1    4     4   126  3      9         0        '~'
3     1    0     0     1    2     2    77  4      9         0       expr
4     1    5     5     1    7     7   263  6      8         0     SYMBOL
5     1    5     5     1    7     7    77  8      9         0       expr
6     1    0     0     1    7     7    77  9      0         0       expr
  terminal text
1     TRUE   hp
2     TRUE    ~
3    FALSE     
4     TRUE   wt
5    FALSE     
6    FALSE     
attr(,"file")
[1] "/tmp/RtmpHBZpw0/file62187b7f21c0"
attr(,"encoding")
[1] "unknown"
attr(,"class")

This is pretty cool, but when I try to parse a formula with some accented chars, I get this:

> parser::parser(text='hp ~ é')
Error in parser::parser(text = "hp ~ é") : 
/tmp/RtmpHBZpw0/file6218234db34e:1:5
        unexpected input

This does not happen with base::parse:

> parse(text='hp ~ é')
expression(hp ~ é)

devtools::install_github("halpo/parser") fatal error: R_ext/rlocale.h: No such file or directory

When I run
devtools::install_github("halpo/parser")

I get this error:
gram.y:248:28: fatal error: R_ext/rlocale.h: No such file or directory
How do I install the package?
Which environment variables do I need to set up?

I know I can get the R language source code from here.
https://github.com/wch/r-source/blob/b156e3a711967f58131e23c1b1dc1ea90e2f0c43/src/include/rlocale.h

Thank you.
Andre Mikulec
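As a first diagnostic, one can check whether the running R installation ships the header that gram.y includes. This is a sketch that assumes the compiler error means the file is absent from R's include directory; it does not by itself fix the installation.

```r
# Build the path where the compiler would look for the header, relative
# to this R installation's include directory.
hdr <- file.path(R.home("include"), "R_ext", "rlocale.h")

# Report whether the header is present.
if (file.exists(hdr)) {
  message("Found: ", hdr)
} else {
  message("Not found: ", hdr)
}
```

If the header is reported as not found, setting environment variables alone is unlikely to help, since the file is simply not part of the installed headers.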

Request: grammar tokens documentation.

I am working on an R code optimizer project, in which I am trying to rely on base R packages as much as possible.
The utils::getParseData function has been really useful for my tasks. However, it is hard to work out what each token means. So far, the best I could find was the grammar definition in this repo.
@halpo, @romainfrancois, @jimhester and @dmurdoch, as you have worked on parsing-related projects, could any of you point me to documentation describing which R expression each token maps to?

Sorry for the inconvenience.
Thank you very much!
Rodriguez, Juan Cruz
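In the absence of such documentation, one practical approach is to parse small snippets and read off the token next to its text. A minimal sketch using only base R:

```r
# Parse a one-line assignment, keeping source references so that
# getParseData() can return the token table.
pd <- utils::getParseData(parse(text = "y <- x + 1", keep.source = TRUE))

# Terminal rows pair each token name with the source text it covers.
tokens <- pd[pd$terminal, c("token", "text")]
print(tokens, row.names = FALSE)
```

Repeating this for each construct of interest (function definitions, `if`, `for`, and so on) builds up a token-to-syntax table empirically.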

install attempt: g++.exe: error: [1]: No such file or directory

When I try to install, I get the following error:

> devtools::install_github("halpo/parser")
Downloading GitHub repo halpo/parser@master
from URL https://api.github.com/repos/halpo/parser/zipball/master
Installing parser
"W:/R-3.4._/App/R-Portable/bin/x64/R" --no-site-file --no-environ --no-save  \
  --no-restore --quiet CMD INSTALL  \
  "W:/R-3.4._/R_USER_3.4.__R_STUDIO/AppData/Local/Temp/Rtmpme9vf0/devtools30e06fdf2d7f/halpo-parser-910d1fb"  \
  --library="W:/R-3.4._/R_LIBS_USER_3.4._" --with-keep.source --install-tests

[1] "W:/R-3.4._/R_LIBS_USER_3.4._"      "W:/R-3.4._/App/R-Portable/library"
* installing *source* package 'parser' ...
** libs

*** arch - i386
W:/Rtools34/mingw_32/bin/g++  -I"W:/R-34~1._/App/R-PORT~1/include" -DNDEBUG -I. [1] "W:/R-3.4._/R_LIBS_USER_3.4._"      "W:/R-3.4._/App/R-Portable/library" -IW:/R-3.4._/R_LIBS_USER_3.4._/Rcpp/include -I"W:/R-3.4._/R_LIBS_USER_3.4._/Rcpp/include"   -I"d:/Compiler/gcc-4.9.3/local330/include"     -O2 -Wall  -mtune=core2 -c Module.cpp -o Module.o
g++.exe: error: [1]: No such file or directory
make: *** [Module.o] Error 1
Warning: running command 'make -f "Makevars.win" -f "W:/R-34~1._/App/R-PORT~1/etc/i386/Makeconf" -f "W:/R-34~1._/App/R-PORT~1/share/make/winshlib.mk" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="parser.dll" OBJECTS="Module.o files.o gram.o io.o stretchyList.o tokens.o toplevel.o utf8.o"' had status 2
ERROR: compilation failed for package 'parser'
* removing 'W:/R-3.4._/R_LIBS_USER_3.4._/parser'
Installation failed: Command failed (1)
