
splparser's People

Contributors

keroro824, richzeng, salspaugh, stevedh


splparser's Issues

Parser litters directories from which it's called with LALR table files named <cmd>_parsetab.py.

Background: LALR tables are required for parsing. Because constructing these tables is expensive, PLY writes them to a file the first time it builds them and reads that file on subsequent parser calls. By default this file is named ./parsetab.py. In splparser, the parser for each command writes its parse table to a file named ./<cmd>_parsetab.py.

The issue is that any directory from which you call the parser becomes littered with dozens of such files. A second, somewhat less important problem is that the cost of table creation is incurred again every time you call the parser from a new directory.

Proposed solution: Write all LALR table files to a single location in the file system (probably configured at installation time), for example, /etc/splparser/parsetabs/<cmd>_parsetab.py.
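A minimal sketch of the proposed solution, not splparser's actual code. It assumes a PLY 3.x release whose ply.yacc.yacc() accepts the tabmodule and outputdir keyword arguments. The helper name parsetab_options, the SPLPARSER_PARSETAB_DIR environment variable, and the ~/.splparser/parsetabs default are all hypothetical; a per-user default is used here because /etc is usually not writable without root.

```python
import os


def parsetab_options(cmd, base_dir=None):
    """Build keyword arguments for ply.yacc.yacc() so that the table file for
    one command lands in a shared directory, not the current directory."""
    if base_dir is None:
        # SPLPARSER_PARSETAB_DIR is a hypothetical variable, not an existing
        # splparser setting; the fallback is a per-user directory.
        base_dir = os.environ.get(
            "SPLPARSER_PARSETAB_DIR",
            os.path.join(os.path.expanduser("~"), ".splparser", "parsetabs"),
        )
    os.makedirs(base_dir, exist_ok=True)
    # tabmodule names the generated module; outputdir says where to write it.
    return {"tabmodule": cmd + "_parsetab", "outputdir": base_dir}
```

Each command parser would then be built with something like ply.yacc.yacc(module=rules, **parsetab_options("dedup")), where rules stands for that command's rule module. Depending on the PLY version, the shared directory may also need to be importable for the cached table to be reused.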

The parsetree module should make erroneous ParseTreeNodes accessible.

A recent policy is that when we parse queries that do not match the documentation, we still "accept" them by having a valid parser rule for them (within reason: we shouldn't accept 'cat on the keyboard'-like garbage). In addition to accepting them, we set an "error" flag to True on the ParseTreeNode containing the erroneous part of the query, i.e., p[0].error = True.

For this to be useful, we need some way of indicating that the entire parse tree comes from an erroneous query.

Viable options (not mutually exclusive) for this include:

  • Propagate the error up to the root, and have it set a flag attribute such as "self.haserrors = True", and possibly keep a pointer to the erroneous children.
  • Add a function that returns true when a node or its subtree contains an error, and a function for printing out the error.
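The second option could be sketched as follows. The ParseTreeNode class here is a simplified, hypothetical stand-in and not splparser's real class; only the error attribute comes from the description above, while role, raw, children, and the method names are assumptions made for illustration.

```python
class ParseTreeNode:
    """Hypothetical, simplified stand-in for splparser's node class."""

    def __init__(self, role, raw="", error=False):
        self.role = role      # e.g. 'EQ', 'KEY', 'VALUE'
        self.raw = raw        # the query text this node covers
        self.error = error    # set to True by a permissive parser rule
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        return child

    def erroneous_nodes(self):
        """Yield this node and every descendant whose error flag is set."""
        if self.error:
            yield self
        for child in self.children:
            yield from child.erroneous_nodes()

    def has_errors(self):
        """Return True when this node or anything in its subtree is flagged."""
        return any(True for _ in self.erroneous_nodes())

    def format_errors(self):
        """Return one line per flagged node, suitable for printing."""
        return "\n".join(
            "%s: %r" % (node.role, node.raw) for node in self.erroneous_nodes()
        )
```

Computing the answer on demand avoids keeping a root-level flag in sync as the tree is built; propagating the flag upward at construction time, as in the first option, would trade that for a cheaper lookup.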

Warnings printed when parsetabs are created need to be addressed.

When the parsetabs are created you might notice warnings like:

WARNING: 53 shift/reduce conflicts
WARNING: Token 'OUTPUTNEW' defined, but not used
WARNING: Token 'ASUC' defined, but not used
WARNING: Token 'ASLC' defined, but not used
WARNING: Token 'OUTPUT' defined, but not used
WARNING: /Users/salspaugh/splparser/splparser/rules/common/valuerules.py:5: Rule 'value' defined, but not used
WARNING: There are 4 unused tokens
WARNING: There is 1 unused rule
WARNING: Symbol 'value' is unreachable

Two things should be done:
(1) Change the logging so that the warnings get written to the logs and not to stderr, so that we can identify which command is generating which warning.
(2) Fix all warnings except those dealing with shift/reduce conflicts.

@richzeng Since it looks like the dedup command is responsible for some of these warnings, do you want to take a stab at this? Another command you wrote might also be causing them, since it gives the same warnings, but because the warnings are written to stderr without any indication of which command caused them, it's hard to be sure. The second part should be really easy to address and is just a code cleanliness thing. The first part shouldn't be hard either, and is a usefulness thing. The question is whether you pass the debug / logging info to the parser when it's created, to the parse command, or somewhere else. Depending on which it is, it will involve changes to splparser/parser.py or splparser/decorators.py.
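One way to approach item (1), sketched under assumptions: ply.yacc.yacc() accepts an errorlog argument, and a standard logging.Logger can be passed there so that table-construction warnings go to a file instead of stderr. The helper name make_command_logger, the "splparser.<cmd>" logger naming, and the <cmd>_parser.log file name are hypothetical and not taken from splparser.

```python
import logging
import os


def make_command_logger(cmd, log_dir):
    """Return a logger that writes warnings for one command to its own file."""
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger("splparser." + cmd)
    logger.setLevel(logging.WARNING)
    # Do not pass records on to ancestor loggers, so nothing reaches stderr.
    logger.propagate = False
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        path = os.path.join(log_dir, cmd + "_parser.log")
        handler = logging.FileHandler(path)
        # Including the logger name in each line identifies the command.
        handler.setFormatter(
            logging.Formatter("%(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

The logger would then be handed to PLY when the parser is created, for example ply.yacc.yacc(module=rules, errorlog=make_command_logger("dedup", log_dir)), where rules and log_dir stand for the command's rule module and a chosen log directory.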

Failed test cases cascade when there's an error

The way most tests are currently structured causes errors to cascade to later tests, e.g.:

>>> parsed = splparser.parser.parse("blahblah1")
>>> parsed.print_tree()
# tree prints
>>> parsed = splparser.parser.parse("blahblah2")
# ERROR
>>> parsed.print_tree()
# prints the tree for blahblah1

We should fix this by changing the test cases to say:

>>> splparser.parser.parse("blahblah1").print_tree()
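The cascade can be reproduced with a small self-contained doctest experiment. Everything here is a stand-in: parse and FakeTree are hypothetical substitutes for splparser.parser.parse and its tree class, and the query "bad" simulates a query that fails to parse because of a parser bug.

```python
import doctest


class FakeTree:
    """Hypothetical stand-in for a parse tree."""

    def __init__(self, query):
        self.query = query

    def print_tree(self):
        print("tree:" + self.query)


def parse(query):
    """Hypothetical stand-in for the parser; "bad" simulates a parser bug."""
    if query == "bad":
        raise ValueError("cannot parse")
    return FakeTree(query)


CASCADING = '''
>>> parsed = parse("good")
>>> parsed.print_tree()
tree:good
>>> parsed = parse("bad")
>>> parsed.print_tree()
tree:bad
'''

CHAINED = '''
>>> parse("good").print_tree()
tree:good
>>> parse("bad").print_tree()
tree:bad
'''


def run_doctest(text):
    """Run one doctest string and return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        text, {"parse": parse}, "example", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test, out=lambda message: None)  # discard reports
    return results.failed, results.attempted


cascading_result = run_doctest(CASCADING)
chained_result = run_doctest(CHAINED)
```

In the first style the failed assignment leaves parsed bound to the earlier tree, so the next line prints the stale tree and is reported as a second, misleading failure. In the chained style only the line that actually failed is reported.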

Fix rex tests so that they don't cause a cascading error.

Most tests have been changed from:

parsed = splparser.parser.parse("search foo")
parsed.print_tree()

to:

splparser.parser.parse("search foo").print_tree()

The former style causes errors to cascade. The rex tests need to be changed to the latter style to fix this bug too.

Inconsistent search command output

The output of the search command when there is an argument like key=value is sometimes:

('EQ')
    ('KEY')
    ('VALUE')

and other times:

('KEY')
    ('VALUE')

This should be consistent across commands.

Individual command logging is not working.

Problem: Although each command sets up logging and specifies a file named parser.log, these log files are never written to. All debugging messages appear to be logged to splparser.log instead.

Possible solutions:
(1) Each command has its own logging: Fix it so that each command writes to its own log file. In this case, choose a single place in the filesystem where all command logs are written (configured at installation time).
(2) All commands share logging: Although it's not clear why this works, it may be acceptable for all commands to write to splparser.log. In this case, the duplicate code in each command parser module that tries (and fails) to set up logging can be removed.

search field after search command

e.g.
| loadjob rt_scheduler__dscolandsa_U3BsdW5rX2Zvcl9BY3RpdmVEaXJlY3Rvcnk__RMD5771d74ce16fe2d13_at_1393942723_1106.1 | head 1 | tail 1| search search=""

dedup argument sequence

We only support field+optlist, not optlist+field, e.g.
search eventtype=msad-successful-computer-logons user="$" dest_nt_domain="EDM"|table _time,host,src_ip|dedup consecutive=T src_ip|lookup SiteInfo host|table _time,src_ip,Site
