process-forest's People

Contributors

davedittrich, hiddenillusion, kolayne, matthewdunwoody, poorbillionaire, williballenthin

process-forest's Issues

Handle lxml unicode error

I have a few evtx files where the record_xml being passed to to_lxml() is unicode, with no odd characters. lxml (3.7.3) raises the following:

$ python process-forest/src/process_forest.py SYSMON%4OPERATIONAL.EVTX summary
INFO:process-forest.global:using evtx log file
DEBUG:Evtx.Evtx:FILE HEADER at 0x0.
DEBUG:Evtx.Evtx:CHUNK HEADER at 0x1000.
DEBUG:Evtx.Evtx:Record at 0x1200.
Traceback (most recent call last):
  File "process-forest/src/process_forest.py", line 483, in <module>
    main()
  File "process-forest/src/process_forest.py", line 461, in main
    analyzer.analyze(get_entries_with_eids(evtx, set([4688, 4689, 1, 5])))
  File "process-forest/src/process_forest.py", line 211, in analyze
    for entry in entries:
  File "process-forest/src/process_forest.py", line 193, in get_entries_with_eids
    for entry in get_entries(evtx):
  File "process-forest/src/process_forest.py", line 183, in get_entries
    yield Entry(xml, record)
  File "process-forest/src/process_forest.py", line 73, in __init__
    self._node = to_lxml(self._xml)
  File "process-forest/src/process_forest.py", line 25, in to_lxml
    record_xml.replace("xmlns=\"http://schemas.microsoft.com/win/2004/08/events/event\"", ""))
  File "src/lxml/lxml.etree.pyx", line 3213, in lxml.etree.fromstring (src/lxml/lxml.etree.c:79010)
  File "src/lxml/parser.pxi", line 1843, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:118282)
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

One possible solution, at least for this instance, is to wrap the return in to_lxml() in a try block, catch the ValueError, set the etree.XMLParser to use utf-8 encoding, and encode the result of record_xml.replace() as well:

...
    except ValueError:
        utf8_parser = etree.XMLParser(encoding='utf-8')
        return etree.fromstring("<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\" ?>%s" % record_xml.replace("xmlns=\"http://schemas.microsoft.com/win/2004/08/events/event\"", "").encode('utf-8'), parser=utf8_parser)

Or just encode manually, without defining a new parser:

...
return etree.fromstring("<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\" ?>%s" % record_xml.replace("xmlns=\"http://schemas.microsoft.com/win/2004/08/events/event\"", "").encode('utf-8'))
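For Python 3, where `"...%s" % some_bytes` would embed the `b'...'` repr rather than the raw bytes, a minimal sketch of the same idea is to build the whole document as a str first and encode it once, so lxml only ever sees bytes. The helper name `build_document_bytes` and the two constants are hypothetical and not part of process_forest.py; the declaration and namespace strings are copied from the snippets above.

```python
# Namespace attribute and XML declaration as they appear in the snippets above.
EVENT_XMLNS = 'xmlns="http://schemas.microsoft.com/win/2004/08/events/event"'
XML_DECLARATION = '<?xml version="1.0" encoding="utf-8" standalone="yes" ?>'


def build_document_bytes(record_xml):
    """Hypothetical helper: return the record as a UTF-8 encoded XML document."""
    # Accept bytes as well, so the helper works whichever type the caller has.
    if isinstance(record_xml, bytes):
        record_xml = record_xml.decode("utf-8")
    stripped = record_xml.replace(EVENT_XMLNS, "")
    # Concatenate as str and encode once; the parser then receives bytes,
    # which is what the ValueError message asks for.
    return (XML_DECLARATION + stripped).encode("utf-8")


def to_lxml(record_xml):
    # Imported here so build_document_bytes() stays usable without lxml installed.
    from lxml import etree
    return etree.fromstring(build_document_bytes(record_xml))
```

With bytes input there should be no need for the try/except or for a separate `etree.XMLParser(encoding='utf-8')`, since the declaration inside the document already names the encoding.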

`TypeError` during serialization

(I made some other changes to the code before testing this, so I am not 100% sure this is correct, but it should be.)

When main is asked to serialize, it opens a file in "wb" mode and passes it to ProcessTreeAnalyzer.serialize. But ProcessTreeAnalyzer.serialize tries to write the result of json.dumps to that file, and json.dumps (according to the documentation) returns a str, not bytes, which results in the following error:

Traceback (most recent call last):
  File "process-forest/src/process_forest.py", line 504, in <module>
    main()
  File "process-forest/src/process_forest.py", line 498, in main
    analyzer.serialize(f)
  File "process-forest/src/process_forest.py", line 339, in serialize
    f.write(s)
TypeError: a bytes-like object is required, not 'str'
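A minimal sketch of one way around this, assuming the file stays open in "wb" mode: encode the JSON text before writing it. The function name `serialize_to_binary` and the `data` argument are hypothetical stand-ins, since the body of ProcessTreeAnalyzer.serialize is not shown here. The alternative would be to open the file in "w" mode and leave the write as it is.

```python
import io
import json


def serialize_to_binary(data, f):
    """Hypothetical stand-in for serialize(): write JSON to a binary file object."""
    s = json.dumps(data)       # json.dumps returns a str
    f.write(s.encode("utf-8"))  # a file opened with "wb" only accepts bytes


# Usage with an in-memory binary buffer in place of open(path, "wb").
buf = io.BytesIO()
serialize_to_binary({"pid": 4, "children": []}, buf)
```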

DEBUG:Evtx.Evtx:Record at 0x....

Dear,

With the minute patch in place, where a b is inserted on line 25, I now run into unexpected output:

INFO:process-forest.global:using evtx log file
DEBUG:Evtx.Evtx:FILE HEADER at 0x0.
DEBUG:Evtx.Evtx:CHUNK HEADER at 0x1000.
DEBUG:Evtx.Evtx:Record at 0x1200.
DEBUG:Evtx.Evtx:Record at 0x1c10.
DEBUG:Evtx.Evtx:Record at 0x20b8.
DEBUG:Evtx.Evtx:Record at 0x2370.
DEBUG:Evtx.Evtx:Record at 0x2a80.
....
DEBUG:Evtx.Evtx:Record at 0x10fc8.
DEBUG:Evtx.Evtx:CHUNK HEADER at 0x11000.
DEBUG:Evtx.Evtx:Record at 0x11200.
.....

This is with the latest master version of process-forest.

Add setup.py

It would be nice to see a setup.py, and possibly to see this published to the Python package repo. At the very least it would make it installable via pip using git+ssh.

Thanks.
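As a rough illustration of what is being asked for, a setup.py could look something like the sketch below. It assumes the single-module layout seen in the tracebacks above (src/process_forest.py). The version number, description, and dependency list are placeholders and guesses, not taken from the project, and the `process-forest` console command is hypothetical.

```python
# setup.py (sketch only; metadata values below are placeholders)
from setuptools import setup

setup(
    name="process-forest",
    version="0.0.0",                  # placeholder version
    description="Placeholder description",
    package_dir={"": "src"},          # the module lives under src/
    py_modules=["process_forest"],
    install_requires=[
        "lxml",                       # assumed: imported by to_lxml()
        "python-evtx",                # assumed: provides the Evtx module seen in the logs
    ],
    entry_points={
        # Hypothetical command name; main() appears in the tracebacks above.
        "console_scripts": ["process-forest=process_forest:main"],
    },
)
```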
