Coder Social home page Coder Social logo

charlieyqin / yextend Goto Github PK

View Code? Open in Web Editor NEW

This project forked from bayshorenetworks/yextend

0.0 2.0 0.0 94.66 MB

Yara integrated software to handle archive file data.

License: BSD 3-Clause "New" or "Revised" License

Makefile 1.49% Shell 0.24% C++ 48.78% M4 1.02% C 38.41% Python 7.28% JavaScript 2.70% ASP 0.07%

yextend's Introduction

yextend

Yara integrated software to handle archive file data.

yextend was written for the sake of augmenting yara. yara by itself is great but we realized that it could not natively handle archived content in the granular way that we needed it to. For instance, if we were hunting for malware and it happened to be buried a few levels into archived content, yara in its native form could not help us. So what we have done is natively handle the inflation of archived content. And we pass the inflated content of each discovered resource to yara so that it can work its magic natively on one file's payload. Then yara does what it does quite well in terms of pattern matching and such based on a given set of rules.

Notes:

  • (08/30/2016) yextend version 1.5

    • native parsing for PDF documents such that the Yara rulesets are now running against:
      • the target PDF's binary data set
      • the raw text (extracted out of the binary data set) from the target PDF
    • file type detection has been re-written so that it is now performed against a Yara ruleset
    • added some initial debug code (based on bool flag)
    • more test files added
    • nose based unit tests added
    • yextend is now licensed via the BSD (3-clause) license
  • (03/18/2016) yextend version 1.4 - output enhancements and runtime helper prog

    • output now includes the offset and string definition identifier for every hit reported by Yara
    • output now includes the name of the Yara ruleset file at hand
    • initial release of run_yextend prog
  • (10/24/2015) yextend version 1.3 will only work with yara 3.4.

    • if your rules have data in the 'meta' section they will now show up in the output, take a look at 'RULEWITHMETA' below and you will see an example of such output
  • (05/28/2015) yextend version 1.2 will only work with yara 3.3 and above

  • This software was written and tested on Linux (both Fedora and Debian). Ports to other platforms are currently TBD.

  • If a dir (and not a file) is passed in then this version will process all of the files at that top level. Subdirectories are not processed yet.

Requirements to build and run:

  • g++ (GNU c++ compiler)
  • autoconf 2.69 or above
  • openssl devel lib (sudo yum install openssl-devel or sudo apt-get install libssl-dev)
  • zlib devel lib (sudo yum install zlib-devel or sudo apt-get install zlib1g-dev)
  • libarchive (v4) be installed (sudo yum install libarchive-devel or sudo apt-get install libarchive-dev)
  • pcrecpp (sudo yum install pcre-devel or sudo apt-get install libpcre3-dev)
  • uuid (sudo yum install uuid-devel or sudo apt-get install uuid-dev libuuid-devel)
  • pdf2text (sudo yum install poppler-utils or sudo apt-get install poppler-utils)
  • yara v3.X be fully installed
  • if you are running yara pre-version 3.1.X then yara v3 lib header files to be moved to a specific location after a typical yara install, steps: A. cd into the dir where you extracted yara (for this example I will use "/tmp/yara") B. sudo cp /tmp/yara/libyara/include/yara/* /usr/local/include/yara/

Instructions:

1 - Make sure all requirements set forth above are met

2 - Extract our software in the directory of your choice (referred to as THEDIR from now on)

- cd THEDIR
- tar -xvzf yextend.tar.gz

3 - Build (make sure the executable bit is set on the file system for build.sh):

- ./build.sh

4 - Test (optional)

- make unittests

5 - Run:

- 2 options to run:

A. use executable run_yextend - it wraps the native yextend executable. To run:

- the program 'run_yextend' takes in 2 arguments:

	1. A yara ruleset file or directory of ruleset files
	2. A file name or a directory of target files

usage:

	- ./run_yextend rule_entity target_file_entity

	***** make sure the executable bit is set on the file system for run_yextend *****

B. run yextend executable - prefix the run statement by telling LD_LIBRARY_PATH where the yara shared object lib (or its symlink) is. If you changed nothing during the yara install then that value is '/usr/local/lib'

- the program 'yextend' takes in 2 arguments:

	1. A yara ruleset file
	2. A file name or a directory name where the target files exist

example:

	- LD_LIBRARY_PATH=/usr/local/lib ./yextend ~/Desktop/bayshore.yara.rules /tmp/targetfiles/filex
	- LD_LIBRARY_PATH=/usr/local/lib ./yextend ~/Desktop/bayshore.yara.rules /tmp/targetfiles/
	
	***** 
		if you don't want to set LD_LIBRARY_PATH on each prog run then you can set it in your .bashrc (or equivalent on your system) as such:
	
		export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
		
		then the program runs would be as such:

			- ./yextend ~/Desktop/bayshore.yara.rules /tmp/targetfiles/filex
			- ./yextend ~/Desktop/bayshore.yara.rules /tmp/targetfiles/			
	*****

6 - Analyze output. The output will be structured as such (number of result stanzas will obviously vary based on the content at hand):

===============================ALPHA===================================
Ruleset File Name: w
File Name: x
File Size: y
File Signature (MD5): z

=======================================================================

Yara Result(s): RULE1:[detected offsets=0x48:$a,hit_count=1]
Scan Type: a
Parent File Name: b
Child File Name: c
File Signature (MD5): d


Yara Result(s): RULE2, RULEWITHMETA:[type=Simple string search,signature=Looking for 95,author=Dre,testint=9,d=false,detected offsets=0x0:$a-0x1:$a-0x2:$a-0x3:$a,hit_count=4], RULE3
Scan Type: v
Parent File Name: x
Child File Name: y
File Signature (MD5): z


===============================OMEGA===================================

For each result stanza you will be presented with a listing of the relevant Yara rules that hit. This will be in the key/value pair with key "Yara Result(s)". The value here is formatted as follows:

	RULE_ID:[META_DATA,OFFSETS,HIT_COUNT], RULE_ID[OFFSETS,HIT_COUNT], ... RULE_ID[OFFSETS,HIT_COUNT]

	Inside the square brackets:
	
		The "META_DATA" are comma-separated and represent the data from your rules "meta" section. As this is optional you may see no meta-data output from yextend.
		The OFFSETS are comma-separated. Each offset listing is delimited by a dash and represent the offset in the content where the rule hit took place and the respective string definition identifier from your Yara rule.
		The HIT_COUNT is simply a listing of the number of rule hits.

	Here is an example:

		EXE_DROP:[detected offsets=0x4e:$a,hit_count=1], MZ_PORTABLE_EXE, S95:[detected offsets=0xd94e:$a-0x2c5b5:$b,hit_count=2]

	Breakdown: In this example there are 3 rules that hit:

		RULE_ID: EXE_DROP
		META_DATA: No meta data listed as there is none in the rule
		OFFSETS: detected offsets=0x4e:$a ("0x4e" is the offset in the content and "$a" is the respective string definition identifier)
		HIT_COUNT: hit_count=1

		RULE_ID: MZ_PORTABLE_EXE (Nothing else is listed because this rule only has the required "condition" section)

		RULE_ID: S95
		META_DATA: No meta data listed as there is none in the rule
		OFFSETS: detected offsets=0xd94e:$a-0x2c5b5:$b ("0xd94e" is one offset in the content that hit on string definition "$a". "0x2c5b5" is another offset in the content that hit on string definition "$b")
		HIT_COUNT: hit_count=2

	
	
A. example output from one of the test files:

	===============================ALPHA===================================
	Ruleset File Name: test_rules/ruleset_blah
	File Name: test_files/rands_tarball.tar.gz
	File Size: 271386
	File Signature (MD5): 74edc10648f6d65e90cd859120eaa31b
	
	=======================================================================
	
	Yara Result(s): JUDO:[detected offsets=0x48:$a,hit_count=1]
	Scan Type: Yara Scan (ASCII Text File) inside GZIP Archive file
	Parent File Name: AAA.gz
	Child File Name: AAA
	File Signature (MD5): bf6aadaf4b6fb726040c1131d809bfc2
	
	
	Yara Result(s): EXE_DROP:[detected offsets=0x4e:$a,hit_count=1], MZ_PORTABLE_EXE, S95:[detected offsets=0xd94e:$a-0x2c5b5:$a,hit_count=2]
	Scan Type: Yara Scan (Windows - Portable Executable) inside ZIP 2.0 (deflation) file
	Parent File Name: rand987.zip
	Child File Name: spoolsy.exe
	File Signature (MD5): 25ca7beed707e94ce70137f7d0c7b14e
	
	
	===============================OMEGA===================================
	
	
	This is based on file "test_files/rands_tarball.tar.gz" that has the following structure:
	
	rands_tarball.tar.gz
	|_ rands.tar
	   |_ rand123.zip
	      |_ AAA.gz
	         |_ AAA
	      rand987.zip
	      |_ spoolsy.exe


B. another more complex example using one of the test files:


	===============================ALPHA===================================
	Ruleset File Name: test_rules/ruleset_blah
	File Name: test_files/step1-zips.tar.gz
	File Size: 2400255
	File Signature (MD5): 98178b84fd9280fa1ed469c6512cd0ee

	=======================================================================

	Yara Result(s): EXE_DROP:[type=EXE DROP detection,signature=Some test sig,uuid=ce354f47467b31e4addcf4b7e2c79a19,detected offsets=0x4e:$a,hit_count=1], MZ_PORTABLE_EXE, S95:[type=Simple string search,signature=Looking for 95,author=Dre,detected offsets=0xd94e:$a-0x2c5b5:$a,hit_count=2]
	Scan Type: Yara Scan (Windows - Portable Executable) inside POSIX ustar format file
	Parent File Name: test_files/step1-zips.tar
	Child File Name: spoolsy.exe
	File Signature (MD5): 25ca7beed707e94ce70137f7d0c7b14e


	Yara Result(s): CTXTESTXML:[type=nocase 'XML' search,signature=Some test XML sig,testint=9,d=false,detected offsets=0x2:$a-0x40:$a-0x5a:$a-0xbf:$a-0x154:$a-0x175:$a-0x18f:$a-0x1ad:$a-0x1d6:$a-0x1fc:$a-0x235:$a-0x25c:$a-0x282:$a-0x2b4:$a-0x2dc:$a-0x302:$a-0x330:$a-0x359:$a-0x37f:$a-0x3b3:$a-0x3f4:$a-0x453:$a-0x479:$a-0x499:$a-0x4c3:$a-0x4e9:$a-0x51e:$a-0x54a:$a-0x570:$a-0x5a7:$a-0x5d0:$a-0x5f6:$a-0x619:$a,hit_count=33]
	Scan Type: Yara Scan (XML Document) inside ZIP 2.0 (deflation) file
	Parent File Name: Lorem-winlogon-spoolsy-pptx.docx
	Child File Name: [Content_Types].xml
	File Signature (MD5): 64a53ef2e6b1b9ed389232cc69898289

	Yara Result(s): S95:[type=Simple string search,signature=Looking for 95,author=Dre,detected offsets=0x2ee:$a-0x1698:$a,hit_count=2], CTXTESTXML:[type=nocase 'XML' search,signature=Some test XML sig,testint=9,d=false,detected offsets=0x2:$a-0x42:$a-0x5e:$a-0x1980:$a-0x19c4:$a,hit_count=5]
	Scan Type: Yara Scan (XML Document) inside ZIP 2.0 (deflation) file
	Parent File Name: Lorem-winlogon-spoolsy-pptx.docx
	Child File Name: word/theme/theme1.xml
	File Signature (MD5): 3badbd456f490c89a3a9a118c0eb9aca

	...

	Yara Result(s): CTXTESTXML:[type=nocase 'XML' search,signature=Some test XML sig,testint=9,d=false,detected offsets=0x2:$a-0x49:$a-0x65:$a-0x89:$a-0xa5:$a-0xd7:$a-0xf3:$a-0xd0d:$a-0xd16:$a,hit_count=9]
	Scan Type: Yara Scan (XML Document) inside ZIP 2.0 (deflation) file
	Parent File Name: word/embeddings/Microsoft_Office_PowerPoint_Presentation1.pptx
	Child File Name: ppt/presentation.xml
	File Signature (MD5): c312ab4768729234a74c94a67ccce68d


	Yara Result(s): LOREM:[detected offsets=0x31b:$a-0x4881:$a,hit_count=2], CTXTESTXML:[type=nocase 'XML' search,signature=Some test XML sig,testint=9,d=false,detected offsets=0x2:$a-0x40:$a-0x5c:$a-0x80:$a-0x9c:$a-0xce:$a-0xea:$a-0x5d07:$a-0x5d10:$a,hit_count=9]
	Scan Type: Yara Scan (XML Document) inside ZIP 2.0 (deflation) file
	Parent File Name: word/embeddings/Microsoft_Office_PowerPoint_Presentation1.pptx
	Child File Name: ppt/slides/slide2.xml
	File Signature (MD5): f8625ea3f7e5c1e10ca78b9dc72ebcf5


	Yara Result(s): CTXTESTXML:[type=nocase 'XML' search,signature=Some test XML sig,testint=9,d=false,detected offsets=0x2:$a-0x40:$a-0x5c:$a-0x80:$a-0x9c:$a-0xce:$a-0xea:$a-0x587:$a-0x590:$a,hit_count=9]
	Scan Type: Yara Scan (XML Document) inside ZIP 2.0 (deflation) file
	Parent File Name: word/embeddings/Microsoft_Office_PowerPoint_Presentation1.pptx
	Child File Name: ppt/slides/slide3.xml
	File Signature (MD5): 0bd56ae94bb2d043477c4d2189c44723


	Yara Result(s): CTXTESTXML:[type=nocase 'XML' search,signature=Some test XML sig,testint=9,d=false,detected offsets=0x2:$a-0x40:$a-0x5c:$a-0x80:$a-0x9c:$a-0xce:$a-0xea:$a-0x587:$a-0x590:$a,hit_count=9]
	Scan Type: Yara Scan (XML Document) inside ZIP 2.0 (deflation) file
	Parent File Name: word/embeddings/Microsoft_Office_PowerPoint_Presentation1.pptx
	Child File Name: ppt/slides/slide1.xml
	File Signature (MD5): 01b2314a10da34be2161e2057198e5ef

	...

	Yara Result(s): S95:[type=Simple string search,signature=Looking for 95,author=Dre,detected offsets=0xa48:$a-0xd21:$a,hit_count=2], CTXTESTXML:[type=nocase 'XML' search,signature=Some test XML sig,testint=9,d=false,detected offsets=0x2:$a-0x46:$a-0x62:$a-0x86:$a-0xa2:$a-0xd4:$a-0xf0:$a-0x349:$a-0x352:$a,hit_count=9]
	Scan Type: Yara Scan (XML Document) inside ZIP 2.0 (deflation) file
	Parent File Name: word/embeddings/Microsoft_Office_PowerPoint_Presentation1.pptx
	Child File Name: ppt/slideLayouts/slideLayout16.xml
	File Signature (MD5): 51f0848ae9a1c864fe467bddac461cde


	Yara Result(s): S95:[type=Simple string search,signature=Looking for 95,author=Dre,detected offsets=0x6c0:$a-0x999:$a,hit_count=2], CTXTESTXML:[type=nocase 'XML' search,signature=Some test XML sig,testint=9,d=false,detected offsets=0x2:$a-0x46:$a-0x62:$a-0x86:$a-0xa2:$a-0xd4:$a-0xf0:$a-0x33f:$a-0x348:$a,hit_count=9]
	Scan Type: Yara Scan (XML Document) inside ZIP 2.0 (deflation) file
	Parent File Name: word/embeddings/Microsoft_Office_PowerPoint_Presentation1.pptx
	Child File Name: ppt/slideLayouts/slideLayout6.xml
	File Signature (MD5): eefa70fd1d867683d9a4ad3e36b4b9fa


	===============================OMEGA===================================





	This is based on file "test_files/step1-zips.tar.gz" that has the following structure:
	
	step1-zips.tar.gz
	|_ docx-embedded-step3.xlsx.zip
	   |_ docx-embedded-step3.xlsx
	      |_ Lorem-winlogon-spoolsy-pptx.docx
	         |_ winlogon.exe
	         |_ spoolsy.exe
	         |_ pptx-plain.pptx
	|_ Lorem-winlogon-spoolsy-pptx.docx.zip
	   |_ Lorem-winlogon-spoolsy-pptx.docx
	      |_ winlogon.exe
	      |_ spoolsy.exe
	      |_ pptx-plain.pptx
	|_ spoolsy.exe

yextend's People

Contributors

dre avatar cecobayshore avatar garbagecat avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.