emersonelectricco / fsf

File Scanning Framework

License: Apache License 2.0

Python 81.96% JSONiq 2.25% Dockerfile 5.19% YARA 10.60%
security-tools network-defense file-analysis file-analyzer yara

fsf's People

Contributors

akniffe1, bryant1410, compsecmonkey, dcode, jxb5151, spartan782, teetam, wzod, zachsis, zcatbear

fsf's Issues

Automated Docker Image Build

It would be great to take PR #60 and have the repo owners create an automatic Docker image build of FSF that fires off when you folks push an update to master. If you need help setting it up, let me or @wzod know. Here's the docs: https://docs.docker.com/docker-hub/builds/

We'd do it ourselves but you need to have rw access to the repo to enable this.

Then there would be a public image always up to date with FSF ready to go!

ft_macho yara signature has matching string condition to ft_java_class

While working on #40, I noticed that ft_macho was firing on a Java class file:

fsfclient datar/fsf_dump_1485954008_9700feb2e081ce6a0eb9d8d6c10604e7/ 
{
    "Scan Time": "2017-02-02 12:27:12.296612", 
    "Filename": "", 
    "Source": "Analyst", 
    "Object": {
        "META_BASIC_INFO": {
            "MD5": "", 
            "SHA1": "ff24ac9300e4fca38bb44b8750d393ce3babd79b", 
            "SHA256": "", 
            "SHA512": "12ca384903d66857adae386303d283b40d384d43e7ff1c8049477ff67fc842dc59218c77b0e14e6ca78804bfc62bf6d3d632ccee60efca8deab14750691a4de7", 
            "ssdeep": "12:zMGBtmHS8FwMqEaO4M/MTdy1blPC/MOWe93LbRF7MW1B0BRdW/fEKN9Ip2bie:AGBtmHS8bky1blqF3LbzF1izdWjT", 
            "Size": "664 bytes"
        }, 
        "SCAN_YARA": {
            "ft_macho": {
                "company": "BroEZ", 
                "lastmod": "September 5 2016", 
                "desc": "Signature to trigger on mach-o file format.", 
                "author": "Jamie Ford"
            }, 
            "ft_java_class": {
                "company": "Emerson", 
                "lastmod": "20160126", 
                "desc": "File magic for detecting a Java bytecode file.", 
                "author": "Jason Batchelor"
            }
        }, 
        "META_JAVA_CLASS": {
            "implements": [
                "a.a.a.A"
            ], 
            "name": "a", 
            "fields": [], 
            "platform": "1.5", 
            "constants_pool": [
                {
                    "index": 1, 
                    "type": "class", 
                    "value": "#34"
                }, 
                {
                    "index": 2, 
                    "type": "class", 
                    "value": "#36"
                }, 
                {
                    "index": 3, 
                    "type": "class", 
                    "value": "#37"
                }, 
                {
                    "index": 4, 
                    "type": "class", 
                    "value": "#38"
                }, 
                {
                    "index": 5, 
                    "type": "class", 
                    "value": "#40"
                }, 
                {
                    "index": 6, 
                    "type": "class", 
                    "value": "#41"
                }, 
                {
                    "index": 7, 
                    "type": "class", 
                    "value": "#42"
                }, 
                {
                    "index": 8, 
                    "type": "class", 
                    "value": "#43"
                }, 
                {
                    "index": 9, 
                    "type": "Field", 
                    "value": "#2.#20"
                }, 
                {
                    "index": 10, 
                    "type": "Field", 
                    "value": "#2.#22"
                }, 
                {
                    "index": 11, 
                    "type": "Field", 
                    "value": "#5.#19"
                }, 
                {
                    "index": 12, 
                    "type": "Method", 
                    "value": "#1.#17"
                }, 
                {
                    "index": 13, 
                    "type": "Method", 
                    "value": "#4.#18"
                }, 
                {
                    "index": 14, 
                    "type": "Method", 
                    "value": "#6.#16"
                }, 
                {
                    "index": 15, 
                    "type": "Method", 
                    "value": "#8.#21"
                }, 
                {
                    "index": 16, 
                    "type": "NameAndType", 
                    "value": "#28:#23"
                }, 
                {
                    "index": 17, 
                    "type": "NameAndType", 
                    "value": "#36:#24"
                }, 
                {
                    "index": 18, 
                    "type": "NameAndType", 
                    "value": "#36:#27"
                }, 
                {
                    "index": 19, 
                    "type": "NameAndType", 
                    "value": "#36:#30"
                }, 
                {
                    "index": 20, 
                    "type": "NameAndType", 
                    "value": "#36:#33"
                }, 
                {
                    "index": 21, 
                    "type": "NameAndType", 
                    "value": "#39:#26"
                }, 
                {
                    "index": 22, 
                    "type": "NameAndType", 
                    "value": "#40:#31"
                }, 
                {
                    "index": 23, 
                    "type": "Utf8", 
                    "value": "()V"
                }, 
                {
                    "index": 24, 
                    "type": "Utf8", 
                    "value": "(LRunApplet;)La/a/a/h;"
                }, 
                {
                    "index": 25, 
                    "type": "Utf8", 
                    "value": "(Lb;Ljava/lang/String;)V"
                }, 
                {
                    "index": 26, 
                    "type": "Utf8", 
                    "value": "(Ljava/lang/Object;ILjava/lang/Object;II)V"
                }, 
                {
                    "index": 27, 
                    "type": "Utf8", 
                    "value": "(Ljava/lang/String;[Ljava/lang/String;)Ljava/lang/Object;"
                }, 
                {
                    "index": 28, 
                    "type": "Utf8", 
                    "value": "<init>"
                }, 
                {
                    "index": 29, 
                    "type": "Utf8", 
                    "value": "Code"
                }, 
                {
                    "index": 30, 
                    "type": "Utf8", 
                    "value": "LRunApplet;"
                }, 
                {
                    "index": 31, 
                    "type": "Utf8", 
                    "value": "Lb;"
                }, 
                {
                    "index": 32, 
                    "type": "Utf8", 
                    "value": "LineNumberTable"
                }, 
                {
                    "index": 33, 
                    "type": "Utf8", 
                    "value": "Ljava/lang/String;"
                }, 
                {
                    "index": 34, 
                    "type": "Utf8", 
                    "value": "RunApplet"
                }, 
                {
                    "index": 35, 
                    "type": "Utf8", 
                    "value": "SourceFile"
                }, 
                {
                    "index": 36, 
                    "type": "Utf8", 
                    "value": "a"
                }, 
                {
                    "index": 37, 
                    "type": "Utf8", 
                    "value": "a/a/a/A"
                }, 
                {
                    "index": 38, 
                    "type": "Utf8", 
                    "value": "a/a/a/h"
                }, 
                {
                    "index": 39, 
                    "type": "Utf8", 
                    "value": "arraycopy"
                }, 
                {
                    "index": 40, 
                    "type": "Utf8", 
                    "value": "b"
                }, 
                {
                    "index": 41, 
                    "type": "Utf8", 
                    "value": "java/lang/Object"
                }, 
                {
                    "index": 42, 
                    "type": "Utf8", 
                    "value": "java/lang/String"
                }, 
                {
                    "index": 43, 
                    "type": "Utf8", 
                    "value": "java/lang/System"
                }
            ], 
            "source_file": "SourceFile", 
            "extends": "java.lang.Object", 
            "version": [
                49, 
                0
            ], 
            "class_requires": [
                "RunApplet", 
                "b", 
                "java.lang.Object", 
                "a.a.a.h.a(java.lang.String,java.lang.String[]):java.lang.Object", 
                "java.lang.System.arraycopy(java.lang.Object,int,java.lang.Object,int,int):void", 
                "java.lang.Object.<init>():void", 
                "java.lang.System", 
                "a.a.a.A", 
                "RunApplet.a(RunApplet):a.a.a.h", 
                "java.lang.String", 
                "a.a.a.h", 
                "b.a:RunApplet"
            ], 
            "class_provides": [
                "a", 
                "a.a(java.lang.String,java.lang.String[]):java.lang.Object"
            ], 
            "methods": []
        }
    }, 
    "Summary": {
        "Yara": [
            "ft_java_class", 
            "ft_macho"
        ], 
        "Modules": [
            "META_BASIC_INFO", 
            "META_JAVA_CLASS", 
            "SCAN_YARA"
        ], 
        "Observations": []
    }, 
    "Alert": false
}

Running yara with the -s flag shows that both ft_java_class and ft_macho have a similar string they're targeting:

ft_java_class ../datar/fsf_dump_1485954008_9700feb2e081ce6a0eb9d8d6c10604e7/
0x0:$class: CA FE BA BE 
ft_macho ../datar/fsf_dump_1485954008_9700feb2e081ce6a0eb9d8d6c10604e7/
0x0:$FAT_MAGIC: CA FE BA BE

@zcatbear, is the $FAT_MAGIC string a sufficient trigger for a Mach-O file if it's the only string observed and it sits at offset 0? Unfortunately, that's the same string and offset the java_class signature relies on, so we'd need to add additional conditions to either the java_class signature or the macho signature.

Because both signatures need that exact string at the same offset and have no other conditions or strings required to trigger a hit, we can't simply say in ft_macho that we're going to ignore hits on that string if we also had a hit on ft_java_class, like below.

rule ft_macho
{
   meta:
      author = "Jamie Ford"
      company = "BroEZ"
      lastmod = "September 5 2016"
      desc = "Signature to trigger on mach-o file format."

   strings:
      $MH_CIGAM_64 = { CF FA ED FE }
      $MH_MAGIC_64 = { FE ED FA CF }
      $MH_MAGIC_32 = { FE ED FA CE }
      $MH_CIGAM_32 = { CE FA ED FE }
      $FAT_MAGIC = { CA FE BA BE }
      $FAT_CIGAM = { BE BA FE CA }

   condition:
      ($MH_CIGAM_64 at 0) or ($MH_MAGIC_64 at 0) or ($MH_CIGAM_32 at 0) or ($MH_MAGIC_32 at 0) or ($FAT_MAGIC at 0 and not ft_java_class) or ($FAT_CIGAM at 0)
}
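
One possible way to break the tie is to look at the four bytes that follow the shared magic. In a Java class file they hold minor_version and major_version (two big-endian 16-bit values, with major versions starting at 45), while in a fat Mach-O header they hold nfat_arch, the number of architecture slices, which is small in practice. The Python sketch below only illustrates that heuristic: the function name and the cut-off of 45 are assumptions, and an equivalent YARA condition (for example one built on uint32be(4)) would still have to be written and tested against real samples.

```python
import struct

CAFEBABE = 0xCAFEBABE
# Assumption: Java class major versions start at 45, so any value at or above
# this is treated as a class-file version rather than a slice count.
MIN_JAVA_MAJOR = 45


def classify_cafebabe(header):
    """Classify the first bytes of a file that may start with CA FE BA BE.

    Returns 'java_class', 'macho_fat', or 'unknown'.
    """
    if len(header) < 8:
        return "unknown"
    # Both formats store these fields big-endian.
    magic, next_word = struct.unpack(">II", header[:8])
    if magic != CAFEBABE:
        return "unknown"
    # Java: next_word is (minor << 16) | major. Fat Mach-O: next_word is nfat_arch.
    if next_word >= MIN_JAVA_MAJOR:
        return "java_class"
    return "macho_fat"
```

The sample above reports version [49, 0], which would appear as 00 00 00 31 after the magic and so fall on the Java side of the cut-off.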

Errors When Importing Macholibre

It has been reported from multiple sources that the MACHO module is having issues importing the macholibre library on a new install of FSF:
from macholibre import macholibre
ImportError: cannot import name macholibre

thresholding for alerting

Rather than alerting only when a yara sig or jq sig has the alert condition set, it would be very helpful to also allow for thresholded alerting wherein one could establish in the dispositioner a relative "suspiciousness" on a score of -10 to +10 for a yara sig or post processor sig and also set an alerting threshold so that a series of relatively suspicious things could trigger an alert or archival decision.
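
A minimal sketch of what such scoring could look like is shown here. Nothing in it exists in FSF today: the score table, the signature names, and the function names are all hypothetical, and the only things taken from the request are the -10 to +10 range and the idea of a configurable threshold.

```python
def clamp(score, low=-10, high=10):
    """Keep a configured score inside the -10..+10 range from the proposal."""
    return max(low, min(high, score))


def should_alert(hits, scores, threshold):
    """Sum the scores of all signatures that hit and compare to a threshold.

    hits      -- names of the yara / post-processor signatures that fired
    scores    -- hypothetical mapping of signature name to suspiciousness
    threshold -- total at or above which an alert (or archive) is raised
    Unknown signatures contribute 0. Returns (alert, total).
    """
    total = sum(clamp(scores.get(name, 0)) for name in hits)
    return total >= threshold, total
```

With scores of 4 and 5 for two mildly suspicious signatures and a threshold of 8, neither hit alerts on its own but both together do.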

sanitize sample artifact in misc_hexascii_pe_in_html.yara

The misc_hexascii_pe_in_html.yara comments include a sample POC of the behavior this sig is targeting, which is great! Sadly, McAfee disagrees. After infuriating our monitoring guys, I poked around and found that a simple sanitization of the iframe src tag clears this problem right up (for my environment).

/* 
Example target...

<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

...

<iframe src="hxxp://NtKrnlpa[.]cn/rc/" width=1 height=1 style="border:0"></iframe>
</body></html><SCRIPT Language=VBScript><!--
DropFileName = "svchost.exe"
WriteData = "4D5A90000300000004000000FFFF0000B800000000000000400000000000000..
Set FSO = CreateObject("Scripting.FileSystemObject")
DropPath = FSO.GetSpecialFolder(2) & "\" & DropFileName
If FSO.FileExists(DropPath)=False Then
Set FileObj = FSO.CreateTextFile(DropPath, True)
For i = 1 To Len(WriteData) Step 2
FileObj.Write Chr(CLng("&H" & Mid(WriteData,i,2)))
Next
FileObj.Close
End If
Set WSHshell = CreateObject("WScript.Shell")
WSHshell.Run DropPath, 0
//--></SCRIPT>

Source: http://pastebin.com/raw/mkDzzjEv
*/
rule misc_hexascii_pe_in_html : encoding html suspicious
{
    meta:
        author = "Jason Batchelor"
        created = "2016-03-02"
        modified = "2016-03-02"
        university = "Carnegie Mellon University"
        description = "Detect on presence of hexascii encoded executable inside scripted code section of html file"

    strings:
        $html_start = "<html>" ascii nocase // HTML tags
        $html_end = "</html>" ascii nocase
        $mz = "4d5a"  ascii nocase // MZ header constant
        $pe = "50450000" ascii nocase // PE header constant

    condition:
        all of ($html*) and $pe in (@mz[1] .. filesize)
}

Keys with '.' causing error on indexing of JSON scan report in Elasticsearch

While using Elasticsearch as a database for FSF scan reports, I noticed an indexing problem caused by the key structure in META_PE, shown below:

         "Imports": {
                "version.dll": [...]

It appears that this will affect any database that is storing the raw report as 'flat' JSON, though I've only tested on Elasticsearch directly.
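
One workaround, sketched below under the assumption that the report is an ordinary nested structure of dicts and lists, is to rewrite the offending keys before the report is shipped to the database. The function name and the choice of "_" as a replacement are illustrative only, and two keys that differ only by "." versus "_" would collide after the rewrite.

```python
def sanitize_keys(value, replacement="_"):
    """Return a copy of a nested report with '.' removed from every dict key."""
    if isinstance(value, dict):
        cleaned = {}
        for key, item in value.items():
            if isinstance(key, str):
                key = key.replace(".", replacement)
            cleaned[key] = sanitize_keys(item, replacement)
        return cleaned
    if isinstance(value, (list, tuple)):
        return [sanitize_keys(item, replacement) for item in value]
    # Scalars (strings, numbers, booleans, None) are returned unchanged.
    return value
```

Only keys are touched, so a value such as a DLL name stored inside a list keeps its dot.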

Yum Error After Adding EPEL

I ran into an issue after adding the EPEL repo for CentOS 7 - Minimal.

After enabling the EPEL repo (sudo yum install epel-release), I get the following error when running yum install python-pip ssdeep-devel:
Error: Cannot retrieve metalink for repository: epel. Please verify its path and try again

To fix this, I ran the following:
sudo sed -i 's/https/http/g' /etc/yum.repos.d/epel.repo

Not sure if this was just a me problem, but this fixed the issue.

Recursive File Type based alerting

Hello, I'm seeking some help identifying the best way to write a module that will set the alert flag if the parent is a compressed file (rar/zip) and the current file is an executable. Any thoughts?

Decision: Post Processor Efficiency

The post processor is awesome, but tends to be underutilized. This is a bit of a blessing as it has a potential scaling limitation, and its error handling could use some extension as well.

For each post processor signature we pipe out a json.dumps on the scan_report to the JQ interpreter, meaning that we're looping through the sigs performing 1 at a time. In addition to this, we also abandon all post processing steps if there's a problem on a single signature attempt--meaning that poorly designed JQ sigs could limit the effectiveness of all other sigs with little logging to indicate why. We could treat the JQ sigs like yara sigs where we attempt to compile the sigs (in this case a simple test of running them through the JQ interpreter) before initializing the fsf-server. We can also return the exit code from popen so there's some manner of audit trail for failed things.

Fixes:

  • test JQ sigs before initializing the daemon
  • dump the scan report to json just once, and update post processor documentation to indicate that there's no post processing of post processor results (inception?). Alternatively, if there is a positive JQ response then we can update the scan_report and then json.dumps
  • switch from the default json library to simplejson for this function as there's a well documented 50-60% performance gain in it over the builtin json library.

See below:

def post_processor(s, report):

   observations = []

   jq_location = find_executable('jq')
   if jq_location == None:
      s.dbg_h.error('%s Unable to find JQ, aborting post-processing routine...' % dt.now())
      return

   for script, observation, alert in disposition.post_processor:
      args = [jq_location, '-f', '%s/%s/%s' % (os.path.dirname(os.path.realpath(__file__)), 'jq', script)]
      proc = Popen(args, stdin=PIPE, stdout=PIPE, stderr=STDOUT)
      results = proc.communicate(input=json.dumps(report))[0].split('\n')

      if proc.returncode:
         s.dbg_h.error('%s There was a problem executing the JSON interpreter...' % dt.now())
         return

      for r in results:
         if r == 'true':
            observations.append(observation)
            # Allow ourselves to alert on certain observations
            if alert:
               s.alert = True
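
A rough sketch of the first two changes follows: the report is serialized once, and a failing signature is logged and skipped instead of aborting the whole routine. This is not the FSF implementation. The function names, the injected runner, and the log_error callback are assumptions made so the logic can be exercised without jq installed, and unlike the loop above it records an observation only once per signature.

```python
import json
import subprocess


def run_jq(jq_path, script_path, report_json):
    """Run one jq script on pre-serialized JSON; return (exit code, output lines)."""
    proc = subprocess.Popen([jq_path, "-f", script_path],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    output, _ = proc.communicate(input=report_json)
    return proc.returncode, output.split("\n")


def post_process(report, signatures, runner, log_error):
    """Apply every (script, observation, alert) signature to one report.

    runner(script, report_json) must return (exit code, output lines).
    Returns (observations, alert).
    """
    report_json = json.dumps(report)  # serialize once, not once per signature
    observations = []
    alert = False
    for script, observation, should_alert in signatures:
        returncode, lines = runner(script, report_json)
        if returncode:
            # Keep an audit trail, then carry on with the remaining signatures.
            log_error("jq exited with code %s for %s" % (returncode, script))
            continue
        if "true" in lines:
            observations.append(observation)
            alert = alert or should_alert
    return observations, alert
```

In the server, runner could be something like lambda script, data: run_jq(jq_location, script, data). The same runner could also be called once per signature at start-up with a tiny dummy report to catch scripts that do not parse.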

RTF Processing Improvement

Newer versions of oletools contain an rtfobj API implementation that was incompatible with the previous implementation from FSF.

The crux of the issue being the new value added here:

for index, orig_len, data in ...

as opposed to what was currently done...

for index, data in ...

There appear to have been several other API changes there.

https://github.com/decalage2/oletools/wiki/rtfobj

It was a quick and trivial fix to get things working using the 'rtf_iter_objects' API method (just add the field ;)). Done here.

It might also be an opportunity to revisit the way it's being done now (write the file to disk, then process it) in favor of something else (pass the buffer straight to the method), if permitted by the current project.
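
If both old and new oletools releases need to be supported, one option is to avoid unpacking a fixed number of fields. The helper below is a hypothetical sketch, not part of FSF or oletools, and it assumes only what the two loops quoted above suggest: that the index is the first element of each yielded tuple and the data is the last.

```python
def iter_index_and_data(objects):
    """Yield (index, data) from tuples of either (index, data) or
    (index, orig_len, data) shape."""
    for item in objects:
        # First element is the index, last element is the extracted data;
        # anything in between (such as orig_len) is ignored.
        yield item[0], item[-1]
```

The module would then loop over iter_index_and_data(...) wrapped around whatever the library's iterator returns.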

Installs for Ubuntu?

Yum installer doesn't really work that well on Ubuntu.....but using apt-get I can't find the repositories. I keep getting "Unable to locate".

I can post pictures if necessary...has anyone been able to install on Ubuntu?

Add file output with JSON

I will write that myself, give me a day or two, so no worries here.

It would be very useful to have JSON output of scan results written to a file, with no buffering and with proper separation between stdout and stderr.

How about adding a new parameter that takes a file name and dumps JSON with scan results there?
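
As a sketch of what that parameter might look like: the --json-out flag and both function names below are hypothetical, not existing fsf_client options. Each report is written as one JSON document per line and the file is closed after every write, so nothing stays buffered between scans.

```python
import argparse
import json
import sys


def build_parser():
    """Argument parser with a hypothetical --json-out option."""
    parser = argparse.ArgumentParser(description="Sketch of a JSON file output option")
    parser.add_argument("--json-out", metavar="PATH", default=None,
                        help="append each scan report to PATH as one JSON line")
    return parser


def emit_report(report, json_out=None):
    """Write one report to json_out if given, otherwise to stdout."""
    line = json.dumps(report)
    if json_out:
        # Opening in append mode and closing right away flushes every report.
        with open(json_out, "a") as handle:
            handle.write(line + "\n")
    else:
        sys.stdout.write(line + "\n")
        sys.stdout.flush()
```

Errors and diagnostics would go to sys.stderr separately, which keeps the file and stdout limited to valid JSON.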

Using hardcoded python paths is incompatible with virtualenv

Hey,

when trying to run fsf in a virtualenv, I noticed most files use hardcoded #!/usr/bin/python instead of #!/usr/bin/env python - thus it ignores the overwrite that virtualenv does.

Could you make fsf virtualenv compatible with this simple change? :)

Extend FSF_RPC to pass more source content

Since FSF offers such a useful post processing framework for managing alerts based on file metadata it would be supremely handy for there to be an additional key in the FSF_RPC that allows a client to pass additional file context as JSON.

META_JAVA_CLASS returns tuples, which are not supported in JSON

When dumping JSON info about the META_JAVA_CLASS module, you're leveraging python-javatools. Here they're dumping the constants_pool as a tuple, which doesn't exist in JSON. The standard way JSON serializers handle this is to create a list, but you can (and do in this case) end up with a list of different types.

Namely, running against a sample gave me the following snippet:

                      "META_JAVA_CLASS": {
                            "implements": [],
                            "name": "a.a.a.K",
                            "fields": [],
                            "platform": "1.5",
                            "constants_pool": [
                                [
                                    1,
                                    "class",
                                    "#32"
                                ],

This tuple starts with an int, followed by two strings. The problem arises when using tools like Elasticsearch, which try to analyze the tuple. It cannot treat the values of that list as both a numeric and a string type, and it blows up. 😢

It's obviously a bit more work on FSF's part, but one way to handle this particular use case is to remove the integer and preserve order in the `constants_pool` list, as the int looks like an index, but I could be wrong.

I've also read about some weird approaches of nested dictionaries.
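
Another option is to turn each tuple into a small dictionary, which is the index/type/value shape visible in the ft_macho scan output earlier on this page. The helper below is a sketch rather than the module's actual code: its name is made up, and casting every value to a string is an assumption made so that the "value" field has a single type.

```python
def constants_to_dicts(constants_pool):
    """Convert (index, type, value) tuples into JSON-friendly dictionaries."""
    return [
        {"index": index, "type": const_type, "value": str(value)}
        for index, const_type, value in constants_pool
    ]
```

Each field then maps to exactly one type, so an indexer no longer sees a list that mixes numbers and strings.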

Method for Archiving on Conditions Without Alerting

A method is needed for archiving files that come through FSF in a more granular way than the current broad options of archiving everything, archiving only when there is an alert, or not archiving at all. Many parties could benefit from this: analysts wishing to gain a better understanding of particular files that meet certain conditions, developers testing a new module, and organizations that wish to bulk archive objects but simply do not have the capacity to take everything. It would also cover any other use case where you want to archive a subset of files without triggering an "alert" condition that is tied to other systems being monitored.

The proposed solution for this would be to add an additional Boolean "archive" item to the disposition tuple.

Can't get to run

Hi,

I'm on Ubuntu 18.04.2 LTS and believe I installed all dependencies.

Testing using ./fsf_client.py ~/fsf-master/docs/Test.zip results in the following output

2019-09-15 16:15:37.765584 There was a problem getting data for Test.zip from 127.0.0.1 on port 5800. Error: <class 'struct.error'>

/tmp/daemon.log contains:

Traceback (most recent call last):
  File "/usr/lib/python2.7/SocketServer.py", line 571, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 331, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python2.7/SocketServer.py", line 652, in __init__
    self.handle()
  File "./main.py", line 52, in handle
    from scanner import Scanner
  File "/home/andrejas/fsf-master/fsf-server/scanner.py", line 28, in <module>
    import processor
  File "/home/andrejas/fsf-master/fsf-server/processor.py", line 41, in <module>
    from modules import *
  File "/home/andrejas/fsf-master/fsf-server/modules/META_MACHO.py", line 24, in <module>
    import macholibre
  File "/usr/local/lib/python2.7/dist-packages/macholibre-1.1.0-py2.7.egg/macholibre/__init__.py", line 37
    def parse(macho, certs: bool=False, out=None):
                          ^
SyntaxError: invalid syntax

Any idea what I missed?

Strings based analysis

In a somewhat similar vein to #11, I was wondering if the capability was there to extract and process string data? For example, a module that extracts URLs and strings from a PE binary could then be consumed by another module later on that is able to do a lookup based on this output.

Another example is to have a base64 module that can consume base64 encoded data of attachments in emails. This would then be resubmitted into fsf.

Love to hear your thoughts

Add Colorized Error and Warnings

We modified the script to output yellow warnings and red errors so that we can better see when something went wrong. However, we did this via bash color escape sequences, which might not translate to everything. There appears to be a Python library, colorconsole, that might be helpful, although I have never used it.

No netstat to check listening port

In CentOS 7 - Minimal, there is no netstat to check to ensure that the server is listening properly:

netstat -na | grep 5800
tcp        0      0 127.0.1.1:5800          0.0.0.0:*               LISTEN

It might be prudent to include the following in the dependencies for CentOS:
yum install net-tools
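
Where installing net-tools is not an option, a short Python check can answer the same question, since Python is already an FSF dependency. The sketch below is an illustration only and the function name is made up. It attempts a TCP connection, so it confirms that something accepts connections on the port, not which process owns it.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
    conn.close()
    return True
```

For the server shown above, that would be is_listening("127.0.1.1", 5800).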

python-yara compile error can disable module execution

Recently observed an interesting problem with SCAN_YARA and the rules.yara file.

An analyst wrote a new yara signature, added its reference to rules.yara, and then successfully tested the updated rules.yara by running yara from its CLI wrapper. Having not observed a compile error, they then checked the updated rules.yara in and deployed the updated FSF configuration.

As it turned out, the newly deployed rules.yara also contained a reference to a yara signature that had been removed. When tested with the YARA CLI tools, this did not prevent yara from executing. However, when the inaccurate rules.yara file was compiled using python-yara, this became a fatal exception that resulted in no yara signatures (including the vital filetype signatures that drive module execution) running on the samples submitted. The error was identified via dbg.log, which shows that there was a YaraSyntaxError on all files submitted.

I think there are two lessons here:

  1. Some yara signatures are crucial to the overall function of FSF and also crucial to the generation of useful metadata for retroactive file hunting. These need to be protected from the frequent manipulation that can occur when maintaining a single yara signatures repository with multiple contributors and signatures from many different sources.
  2. Should SCAN_YARA errors be handled differently from other modules, since it is vital to the execution of modules?

Possible fixes

Protecting yara signatures that trigger modules

  • Rules that trigger modules could be quickly protected by the creation of a second rules.yara that only contains the yara signatures vital to module execution, while all other signatures could be added to another 'rules.yara'. This would require extending SCAN_YARA to first attempt to compile from two filepaths (the rules.yara that analysts can update with their non-module triggering signatures AND the vital module triggering rules). In the event that yara.compile raises a YaraSyntaxError, then SCAN_YARA will fall back to compiling only the rules needed for module execution. This will also require an additional scanner config to specify the trigger rules.
  • It may also be useful to extend scanner.check_yara_file as well, so that instead of simply checking whether the rules files exist, it also runs yara.compile and exits if an exception is raised.
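
The fallback described in the first bullet could be sketched roughly as follows. To keep the example self-contained, the compile step is passed in as a callable instead of importing yara, and every name here (compile_with_fallback, the namespace keys, the log_error callback) is hypothetical. In the server, compile_rules would presumably be a thin wrapper such as lambda paths: yara.compile(filepaths=paths), with errors narrowed to the yara.SyntaxError class seen in dbg.log.

```python
def compile_with_fallback(compile_rules, analyst_rules, trigger_rules,
                          log_error, errors=(Exception,)):
    """Compile all rules, falling back to the module-trigger rules alone.

    compile_rules -- callable taking a {namespace: filepath} dict
    analyst_rules -- {namespace: filepath} for freely edited signatures
    trigger_rules -- {namespace: filepath} for signatures that drive modules
    Returns (compiled_rules, used_full_set).
    """
    merged = dict(trigger_rules)
    merged.update(analyst_rules)
    try:
        return compile_rules(merged), True
    except errors as exc:
        log_error("Full rule set failed to compile, using trigger rules only: %s" % exc)
    # If the trigger rules are broken too, let the exception propagate so the
    # daemon refuses to start instead of silently scanning with nothing.
    return compile_rules(trigger_rules), False
```

Running the same call once at start-up would also cover the second bullet, since a failure in the trigger rules raises before any file is accepted.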

Treating SCAN_YARA differently from other modules

  • Maybe if SCAN_YARA raises a compile exception we could divert incoming files to the server to the EXPORT_PATH and raise a unique message?

Question about META_PE_SIGNATURE.py?

  • How can I get the fields of the structure below?
  • Are they in the decoded signed_data?
    For example, how can I extract the "dwLength" field?

typedef struct _WIN_CERTIFICATE
{
    DWORD dwLength;
    WORD  wRevision;
    WORD  wCertificateType;
    BYTE  bCertificate[ANYSIZE_ARRAY];
} WIN_CERTIFICATE, *LPWIN_CERTIFICATE;

thanks.
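
Going by the structure definition alone, dwLength, wRevision, and wCertificateType are a fixed 8-byte little-endian header that sits in front of bCertificate, so they would have to be read from the raw certificate table entry, not from the decoded signed data. The sketch below shows that unpacking with the standard struct module. It assumes the caller already has the raw bytes of one WIN_CERTIFICATE entry; how META_PE_SIGNATURE.py obtains those bytes is not covered here, and the function name is made up.

```python
import struct

# DWORD dwLength, WORD wRevision, WORD wCertificateType (little-endian).
WIN_CERT_HEADER = struct.Struct("<IHH")


def parse_win_certificate(blob):
    """Split one raw WIN_CERTIFICATE entry into its header fields and payload."""
    if len(blob) < WIN_CERT_HEADER.size:
        raise ValueError("buffer too short for a WIN_CERTIFICATE header")
    length, revision, cert_type = WIN_CERT_HEADER.unpack_from(blob, 0)
    return {
        "dwLength": length,
        "wRevision": revision,
        "wCertificateType": cert_type,
        # dwLength counts the 8 header bytes as well as the certificate data.
        "bCertificate": blob[WIN_CERT_HEADER.size:length],
    }
```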

Update Dockerfile

The current dockerfile is super out of date. While the project is largely in maint mode we should be good stewards of the internet and either update it or remove it.

fsf fails with yara-python 3.4.0.00 in virtualenv

Hey!

I installed fsf in a virtualenv and stumbled upon a different problem

Test.zip fails to be analyzed

(fsf)brobuntu% ./fsf_client.py ../docs/Test.zip
{
"Scan Time": "2016-07-15 06:03:07.894859",
"Filename": "Test.zip",
"Source": "Analyst",
"Object": {
"META_BASIC_INFO": {
"MD5": "f77489b604a106a0e0cd715d84a975a4",
"SHA1": "872f221d3a893afb3895f3eae82b7877ba38b655",
"SHA256": "1e97ee119b350f12ffb366aebab625a46c24d44f22416f289412e5de5053794d",
"SHA512": "cd10a807da10ed796f1b9a8fa10e378efec9b12e16acbb0a983c3e2579f622dc6e96f65e69e19e056934ab1999777492638bb75dd04ed2b35a1e9944418af3a3",
"ssdeep": "6144:Cl5QCGRIcQ5lunKBVOEfvWRmG7t1u9VhcLjs0ZAIdB084E5fnKESn4zbhxJfmeH1:CvQ1uf5l4KBIEfvW3wT6jjafLWK1nibV",
"Size": "351521 bytes"
}
},
"Summary": {
"Yara": [],
"Modules": [
"META_BASIC_INFO"
],
"Observations": [
"Only one kind of module was run on for this report.",
"There doesn't appear to be any Yara signature hits for this scan."
]
},
"Alert": false
}

(fsf)brobuntu% cat /tmp/scan.log
{"Scan Time": "2016-07-15 06:03:07.894859", "Filename": "Test.zip", "Source": "Analyst", "Object": {"META_BASIC_INFO": {"MD5": "f77489b604a106a0e0cd715d84a975a4", "SHA1": "872f221d3a893afb3895f3eae82b7877ba38b655", "SHA256": "1e97ee119b350f12ffb366aebab625a46c24d44f22416f289412e5de5053794d", "SHA512": "cd10a807da10ed796f1b9a8fa10e378efec9b12e16acbb0a983c3e2579f622dc6e96f65e69e19e056934ab1999777492638bb75dd04ed2b35a1e9944418af3a3", "ssdeep": "6144:Cl5QCGRIcQ5lunKBVOEfvWRmG7t1u9VhcLjs0ZAIdB084E5fnKESn4zbhxJfmeH1:CvQ1uf5l4KBIEfvW3wT6jjafLWK1nibV", "Size": "351521 bytes"}}, "Summary": {"Yara": [], "Modules": ["META_BASIC_INFO"], "Observations": ["Only one kind of module was run on for this report.", "There doesn't appear to be any Yara signature hits for this scan."]}, "Alert": false}

(fsf)brobuntu% cat /tmp/dbg.log
2016-07-15 06:03:07.923590 Failed to run module SCAN_YARA on 351521 byte buffer supplied for file Test.zip. Error: <class 'yara.SyntaxError'>

(fsf)brobuntu% which yara

/usr/local/bin/yara

(fsf)brobuntu% yara -v

yara 3.4.0

(fsf)brobuntu% pip list

argparse (1.2.1)
cffi (1.7.0)
ConcurrentLogHandler (0.9.1)
czipfile (1.0.0)
future (0.15.2)
hachoir-core (1.3.3)
hachoir-metadata (1.3.3)
hachoir-parser (1.3.4)
hachoir-regex (1.0.5)
hachoir-subfile (0.5.3)
javatools (1.3)
oletools (0.47)
pefile (2016.3.28)
pip (1.5.4)
pyasn1 (0.1.9)
pyasn1-modules (0.0.8)
pycparser (2.14)
pyelftools (0.23)
pylzma (0.4.8)
PyPDF2 (1.26.0)
rarfile (2.8)
requests (2.10.0)
setuptools (2.2)
six (1.10.0)
ssdeep (3.1.1)
wsgiref (0.1.2)
xmltodict (0.10.2)

yara-python (3.4.0.00) <-- should match

On the other hand, when I run yara manually, using your set of rules, it can find two exe files inside.

That is Ubuntu 14.04 LTS, newest master, yara-python from pip in a virtualenv.

modular logging outputs

The native FSF scan output is great, but sadly many databases and log handlers don't appreciate recursive JSON the way an analyst might.

We could accommodate this while still preserving the inherent goodness of the default data structure.

Feature Request: Enable modules to process the FSF scan output

I'd like to see a way for modules to run on the returned scan results, like a "meta module". The advantage of this is that analysts could begin to develop post-processing logic for the delicious data produced by FSF. Imagine a world with:

  • modules that could help identify significant object hierarchies (i.e. a zip that contains somewhere a high entropy exe)
  • modules to submit chunks of FSF analysis (not just the raw file buffer) to other services / logging engines for aggregation (i.e. combining the parent file attributes with another log source containing the state of the transaction that was observed and led to the submission of the file for analysis)

In keeping with the overall objective, I'd suggest that "meta modules" be triggered by content logged to the orderedDict, by default module execution, or finally after the execution of all available modules.

offload initializing loggers, configs, and yara.compile to FSF main process

Adding a SIGHUP handler in the main process could allow us to monitor for changes to the fsf-server configs, dispositioner, and yara rules directory to reload configs when those files change. We could also initialize the loggers here too, as the modular loggers should be subject to the fsf-server config.

Moving these processes to main affords the opportunity to find some performance gains for FSF by reducing the number of init steps in the scanner and processor.
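
A minimal sketch of the signal side of this idea is given below, for POSIX systems only. The class and function names are hypothetical and nothing here reloads FSF configuration itself. The handler only sets a flag, and the main loop would check that flag between requests and then re-read the configs, dispositioner, and yara rules.

```python
import signal


class ReloadFlag(object):
    """Records that a SIGHUP arrived so the main loop can reload later."""

    def __init__(self):
        self.pending = False

    def request(self, signum, frame):
        # Keep the handler trivial; the real work happens in the main loop.
        self.pending = True

    def consume(self):
        """Return True once per received request, then reset the flag."""
        pending, self.pending = self.pending, False
        return pending


def install_sighup_handler(flag):
    """Register flag.request as the SIGHUP handler (main thread only)."""
    signal.signal(signal.SIGHUP, flag.request)
```

An operator could then trigger a reload with kill -HUP on the server PID after changing a rules file.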

wget errors when installing dependencies

On a fresh CentOS 7 - Minimal build, I got a lot of certificate errors when downloading the master.zip, Yara, and the RPMForge repo.

To fix this, you may want to include --no-check-certificate in the documentation when using wget.
