halpo / parser
R parser package
The parser does not differentiate between the double-arrow assignment <<- and the single-arrow assignment <-.
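For comparison, here is a minimal sketch using base R's utils::getParseData rather than the parser package (an assumption: it is only meant to illustrate the distinction, not how parser works internally). Base R also reports both operators under the same token name, LEFT_ASSIGN, so the two can only be told apart by looking at the text column. The column name superassign is made up for this example.

```r
# Parse two assignments, keeping source references so parse data is available.
exprs <- parse(text = "a <- 1; b <<- 2", keep.source = TRUE)
pd <- utils::getParseData(exprs)

# Both operators share the token name LEFT_ASSIGN; only `text` differs.
assigns <- pd[pd$token == "LEFT_ASSIGN", c("token", "text")]

# Hypothetical helper column: TRUE for the double-arrow form.
assigns$superassign <- assigns$text == "<<-"
print(assigns, row.names = FALSE)
```

A similar check on the text column of the parser output might serve as a workaround until the token itself is differentiated.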
Brian Ripley informed me that the parser package does computed assignment at load time; this is strongly discouraged in the Writing R Extensions manual. Remove if possible.
A token following an if clause can be recorded twice, and the output will contain an empty terminal token. For example, for
{
if (x) 2
f(x)
}
the result looks like:
    token  id  parent  top_level  token.desc            terminal  text
1   123    1   35      0          '{'                   TRUE      {
2   272    4   18      0          IF                    TRUE      if
3   40     6   18      0          '('                   TRUE      (
4   263    7   9       0          SYMBOL                TRUE      x
5   41     8   18      0          ')'                   TRUE      )
6   78     9   18      0          expr                  FALSE
7   261    12  13      0          NUM_CONST             TRUE      1
8   78     13  18      0          expr                  FALSE
9   263    16  35      0          SYMBOL                TRUE      f
10  78     18  35      0          expr                  FALSE
11  297    21  23      0          SYMBOL_FUNCTION_CALL  TRUE
12  40     22  29      0          '('                   TRUE      (
13  78     23  29      0          expr                  FALSE
14  263    24  26      0          SYMBOL                TRUE      x
15  41     25  29      0          ')'                   TRUE      )
16  78     26  29      0          expr                  FALSE
17  78     29  35      0          expr                  FALSE
18  125    33  35      0          '}'                   TRUE      }
19  78     35  0       0          expr                  FALSE
Row 9 is redundant, and f should appear as the text of row 11.
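A small diagnostic along these lines could flag the symptom automatically. This is only a sketch: the data frame below re-types rows 9 to 11 of the output above by hand, and find_empty_terminals is a hypothetical helper, not a function from the parser package.

```r
# Rows 9-11 of the reported output, re-typed by hand for illustration.
pd <- data.frame(
  id         = c(16L, 18L, 21L),
  parent     = c(35L, 35L, 23L),
  token.desc = c("SYMBOL", "expr", "SYMBOL_FUNCTION_CALL"),
  terminal   = c(TRUE, FALSE, TRUE),
  text       = c("f", "", ""),
  row.names  = c(9L, 10L, 11L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: a terminal token should always carry some text,
# so return every terminal row whose text is empty.
find_empty_terminals <- function(pd) {
  pd[pd$terminal & !nzchar(pd$text), , drop = FALSE]
}

bad <- find_empty_terminals(pd)
print(bad)
```

On the rows above this picks out row 11, the SYMBOL_FUNCTION_CALL with no text.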
parser tries to attach "id" attributes to symbols and NULLs. However, a "symbol" is, like an environment, only a reference to a single global entity (even though the symbol will be interpreted differently according to context when R is executing code containing it). Thus, putting an attribute on one instance of 'x' will put the same attribute on ALL occurrences of 'x'!
The example below looks funny because Git has decided to apply "git-flavoured markdown" to it, but the point should be clear. Example:
p <- parser( text='x;x')
p[[1]]
x
attr(,"id")
[1] 8
p[[2]]
x
attr(,"id")
[1] 8
Uh-oh; p[[2]] shouldn't have the same tag (id attr) as p[[1]]! Look-up now doesn't work.
quote( x)
x
attr(,"id")
[1] 8
Big uh-oh; the symbol 'x' has globally acquired a tag!
Also, parser( text='NULL') tries to put a tag on the NULL but can't, because NULL can't have attributes. Again, there is only one "NULL" .
My suggested solution is not to tag each element directly, but instead to tag its parent element with a vector of child tags, one element for each child. NULL and symbols can't have children, so this will never try to tag illegal things AFAICS.
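A rough sketch of that idea, assuming base R's parse() as a stand-in for the parser package. The function name tag_children and the attribute name "child_ids" are invented for this example. Only expression vectors and calls receive an attribute, so symbols and NULL are never touched.

```r
# Hypothetical helper: give each container (expression vector or call) a
# "child_ids" attribute holding one id per child, instead of tagging the
# children themselves. Assumes no empty arguments such as x[1, ].
tag_children <- function(e, counter = new.env()) {
  if (is.null(counter$n)) counter$n <- 0L
  if (is.call(e) || is.expression(e)) {
    ids <- integer(length(e))
    for (i in seq_along(e)) {
      counter$n <- counter$n + 1L
      ids[i] <- counter$n
      child <- e[[i]]
      # Recurse into calls only; symbols, constants and NULL stay untouched.
      if (is.call(child)) e[[i]] <- tag_children(child, counter)
    }
    attr(e, "child_ids") <- ids
  }
  e
}

p <- tag_children(parse(text = "x; x", keep.source = FALSE))
attr(p, "child_ids")   # one distinct id per occurrence of x
attributes(p[[1]])     # the symbol itself carries no attribute

q <- tag_children(parse(text = "NULL", keep.source = FALSE))
attr(q, "child_ids")   # NULL is identified through its parent
```

Look-up would then go through the parent: the id of p[[2]] is attr(p, "child_ids")[2].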
From Prof. Brian Ripley:
For a package with a namespace (all these days), .C etc are allowed
without a PACKAGE argument to refer to entry points in the package,
but that was never checked.
I wrote some diagnostic code, and the namespace identification is not
doing a particularly good job. For example calls like
drop(.C(...))
do.call(".C", args)
are all missed, including some in your package (and ca 25 others). We
will do better in R-devel, but if you want the package to work well in
earlier versions you should add PACKAGE arguments.
An even better idea is to register your entry points and refer to them
as R objects. This is done extensively in R itself, so for example
fisher.test contains
else PVAL <- .C(C_fexact, nr, nc, x, nr, as.double(-1),
as.double(100), as.double(0), double(1), p = double(1),
as.integer(workspace), mult = as.integer(mult), PACKAGE = "stats")$p
(and the PACKAGE = "stats" is not actually needed).
I encountered a strange issue with non-ASCII characters while using (the really awesome) parser. For example, let us check how a formula is parsed:
> parser::parser(text='hp ~ wt')
expression(hp ~ wt)
attr(,"data")
line1 col1 byte1 line2 col2 byte2 token id parent top_level token.desc
1 1 0 0 1 2 2 263 1 4 0 SYMBOL
2 1 3 3 1 4 4 126 3 9 0 '~'
3 1 0 0 1 2 2 77 4 9 0 expr
4 1 5 5 1 7 7 263 6 8 0 SYMBOL
5 1 5 5 1 7 7 77 8 9 0 expr
6 1 0 0 1 7 7 77 9 0 0 expr
terminal text
1 TRUE hp
2 TRUE ~
3 FALSE
4 TRUE wt
5 FALSE
6 FALSE
attr(,"file")
[1] "/tmp/RtmpHBZpw0/file62187b7f21c0"
attr(,"encoding")
[1] "unknown"
attr(,"class")
This is pretty cool, but when I try to parse a formula with some accented characters, I get this:
> parser::parser(text='hp ~ é')
Error in parser::parser(text = "hp ~ é") :
/tmp/RtmpHBZpw0/file6218234db34e:1:5
unexpected input
This does not happen with base::parse:
> parse(text='hp ~ é')
expression(hp ~ é)
When I do
devtools::install_github("halpo/parser")
I get this error.
gram.y:248:28: fatal error: R_ext/rlocale.h: No such file or directory
How do I install?
Which environment variables do I need to set up?
I know I can get the R language source code from here.
https://github.com/wch/r-source/blob/b156e3a711967f58131e23c1b1dc1ea90e2f0c43/src/include/rlocale.h
Thank you.
Andre Mikulec
I am working on an R code optimizer project, in which I am trying to use R base packages as much as possible.
The utils::getParseData function has been really useful for my tasks. However, it is getting hard to decipher what each token means. So far, the best I could find was the grammar definition found in this repo.
@halpo, @romainfrancois, @jimhester and @dmurdoch, as you have worked on parsing-related projects, could any of you point me to documentation on which R expression each token maps to?
Sorry for the inconvenience.
Thank you very much!
Rodriguez, Juan Cruz
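In the absence of such documentation, one way to explore the mapping is to parse small snippets and list which source text produced each token name. The following is only a sketch of that approach using utils::getParseData. The snippet being parsed and the variable name token_map are arbitrary choices for illustration.

```r
# Parse a small snippet, keeping source references so parse data is available.
exprs <- parse(text = "if (x) 2 else f(x)", keep.source = TRUE)
pd <- utils::getParseData(exprs)

# Keep only terminal tokens and pair each token name with its source text.
token_map <- unique(pd[pd$terminal, c("token", "text")])
print(token_map, row.names = FALSE)
```

Repeating this with other constructs (loops, function definitions, operators) builds up an empirical table of token names.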
When I try to install, I get the following error:
> devtools::install_github("halpo/parser")
Downloading GitHub repo halpo/parser@master
from URL https://api.github.com/repos/halpo/parser/zipball/master
Installing parser
"W:/R-3.4._/App/R-Portable/bin/x64/R" --no-site-file --no-environ --no-save \
--no-restore --quiet CMD INSTALL \
"W:/R-3.4._/R_USER_3.4.__R_STUDIO/AppData/Local/Temp/Rtmpme9vf0/devtools30e06fdf2d7f/halpo-parser-910d1fb" \
--library="W:/R-3.4._/R_LIBS_USER_3.4._" --with-keep.source --install-tests
[1] "W:/R-3.4._/R_LIBS_USER_3.4._" "W:/R-3.4._/App/R-Portable/library"
* installing *source* package 'parser' ...
** libs
*** arch - i386
W:/Rtools34/mingw_32/bin/g++ -I"W:/R-34~1._/App/R-PORT~1/include" -DNDEBUG -I. [1] "W:/R-3.4._/R_LIBS_USER_3.4._" "W:/R-3.4._/App/R-Portable/library" -IW:/R-3.4._/R_LIBS_USER_3.4._/Rcpp/include -I"W:/R-3.4._/R_LIBS_USER_3.4._/Rcpp/include" -I"d:/Compiler/gcc-4.9.3/local330/include" -O2 -Wall -mtune=core2 -c Module.cpp -o Module.o
g++.exe: error: [1]: No such file or directory
make: *** [Module.o] Error 1
Warning: running command 'make -f "Makevars.win" -f "W:/R-34~1._/App/R-PORT~1/etc/i386/Makeconf" -f "W:/R-34~1._/App/R-PORT~1/share/make/winshlib.mk" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="parser.dll" OBJECTS="Module.o files.o gram.o io.o stretchyList.o tokens.o toplevel.o utf8.o"' had status 2
ERROR: compilation failed for package 'parser'
* removing 'W:/R-3.4._/R_LIBS_USER_3.4._/parser'
Installation failed: Command failed (1)