Python Lambda Inspector

Contributors

  • Graham Jones
  • Andrew Krug

Package Deployment Zip

From the project root, run `zip lambda-inspector.zip *.py`.

Sample Output

Sample output is contained in the sample directory. (Slightly redacted.)

python-lambda-inspector's Issues

tokenize filesystem info to standardized dict

For Windows and Linux friendliness we're going to have to make dicts a standard. Something like:

{
  "filesystems": [
    {
      "mount_point": "/run/shm",
      "name": "none",
      "size": "1020452",
      "used": "0",
      "writeable": "true"
    },
    {
      "mount_point": "/run/shm",
      "name": "none",
      "size": "1020452",
      "used": "0",
      "writeable": "true"
    }
  ]
}
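A minimal sketch of tokenizing `df -k`-style output into that shape (the column layout is the usual POSIX one; the `writeable` field isn't in `df` output, so it is omitted here):

```python
def parse_df(output):
    """Tokenize `df -k`-style output into a list of standardized dicts."""
    filesystems = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue
        filesystems.append({
            "name": fields[0],         # Filesystem
            "size": fields[1],         # 1K-blocks
            "used": fields[2],         # Used
            "mount_point": fields[5],  # Mounted on
        })
    return filesystems
```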

Sanitize sensitive data from env

Redact the following fields gathered from get_env() by truncating each to roughly the first 12 characters:

  • AWS_SESSION_TOKEN
  • AWS_SECURITY_TOKEN
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY

Fallback to s3 storage on exception

Request code needs a bit of exception handling to fall back to S3 storage if it can't reach the API. The current failure looks like:

{
"stackTrace": [
[
"/var/task/main.py",
163,
"lambda_handler",
"api_call = store_results(res)"
],
[
"/var/task/main.py",
150,
"store_results",
"response = urllib2.urlopen(req)"
],
[
"/usr/lib64/python2.7/urllib2.py",
154,
"urlopen",
"return opener.open(url, data, timeout)"
],
[
"/usr/lib64/python2.7/urllib2.py",
435,
"open",
"response = meth(req, response)"
],
[
"/usr/lib64/python2.7/urllib2.py",
548,
"http_response",
"'http', request, response, code, msg, hdrs)"
],
[
"/usr/lib64/python2.7/urllib2.py",
473,
"error",
"return self._call_chain(*args)"
],
[
"/usr/lib64/python2.7/urllib2.py",
407,
"_call_chain",
"result = func(*args)"
],
[
"/usr/lib64/python2.7/urllib2.py",
556,
"http_error_default",
"raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)"
]
],
"errorType": "HTTPError",
"errorMessage": "HTTP Error 504: G

Sanitize Env Vars a little too good

This code:

sanitize_envvars = {
    "AWS_SESSION_TOKEN":
        {
            "func": truncate, "args": [], "kwargs": {'end': 12}
        },
    "AWS_SECURITY_TOKEN":
        {
            "func": truncate, "args": [], "kwargs": {'end': 12}
        },
    "AWS_ACCESS_KEY_ID":
        {
            "func": truncate, "args": [], "kwargs": {'end': 12}
        },
    "AWS_SECRET_ACCESS_KEY":
        {
            "func": truncate, "args": [], "kwargs": {'end': 12}
        }
}

Seems to actually be removing values from the real environment (os.environ) instead of just from the dict. When this code is called, the function is no longer able to upload to S3. Lambda outputs the error:

The provided token is malformed or otherwise invalid.
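The likely fix is to sanitize a detached copy rather than a live view of os.environ; a minimal sketch (the `get_env` name follows the issue above):

```python
import os

def get_env():
    # return a detached copy of the environment, so later sanitization
    # can't mutate os.environ and break the S3 credentials
    return dict(os.environ)
```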

Update is_warm to account for windows

Azure instances appear to have a writable temp location. Basically the approach would be:

1. Detect that the OS is Windows (we already do this in main.py).
2. Write the warm file to D:\local\Temp.

Bazinga!
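A sketch of an OS-aware warm-file path (the helper name and the exact file name are my own; the D:\local\Temp location comes from the issue above):

```python
import platform

def warm_file_path():
    # pick a writable warm-marker location per OS
    if platform.system() == "Windows":
        return r"D:\local\Temp\warm"  # Azure's writable temp location
    return "/tmp/warm"                # AWS Lambda's writable temp location
```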

Lambda functions and internet access

Turns out Lambda functions don't get any access to the internet without the presence of a cost-prohibitive NAT gateway. This means that Lambda functions running inside the ThreatResponse AWS account will need to POST their results in a different way than runtimes out in the wild.

Potential options are:
1. Profiler writes a {uuid.hex()}.json.gz file directly to S3 the same way the API does.
2. Profiler writes to Dynamo and we pick that up somewhere else (seems unnecessary).
3. Deploy a NAT gateway (not cost effective).
4. Deploy the Lambda in the same VPC as the API box and point it directly at the API instead.

So I'm sure you've gathered that option 1 is preferred. It's just a matter of writing a little logic that only does the S3 upload when running from within a Lambda function. We'll still need the urllib2.Request method in the Python profiler that @jeffbryner wrote. Oddly, the missing internet access doesn't actually cause the function to fail; simply nothing ever happens...

Option 4 isn't a bad choice either, but it has implications if/when we want to go multi-region, and it puts heavier requirements on the CI/CD pipeline to attach things.
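A sketch of the option-1 gate, keyed off AWS_LAMBDA_FUNCTION_NAME, which the Lambda execution environment sets (the helper names and the two store callables are my own):

```python
import os

def running_in_lambda():
    # the Lambda runtime sets this reserved environment variable
    return "AWS_LAMBDA_FUNCTION_NAME" in os.environ

def store_results(res, api_store, s3_store):
    # inside Lambda, upload straight to S3; elsewhere, POST to the API
    if running_in_lambda():
        return s3_store(res)
    return api_store(res)
```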

Lookups return types that can not be stored in dynamo

#"warm_since": is_warm.warm_since, # Issues with dynamo types
#"warm_for":   is_warm.warm_for, # Issues with dynamo types
#"dmesg":      get_dmesg, # Issues with dynamo types

Errors are:

TypeError: Float types are not supported. Use Decimal types instead.

ClientError: An error occurred (ValidationException) when calling the PutItem operation: One or more parameter values were invalid: An AttributeValue may not contain an empty string
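A sketch of a pre-write scrubber covering both errors (floats become Decimal, empty-string values are dropped); the function name is my own:

```python
from decimal import Decimal

def to_dynamo_safe(value):
    # DynamoDB rejects float types (it wants Decimal) and
    # empty-string attribute values, per the two errors above
    if isinstance(value, float):
        return Decimal(str(value))
    if isinstance(value, dict):
        return {k: to_dynamo_safe(v) for k, v in value.items() if v != ""}
    if isinstance(value, list):
        return [to_dynamo_safe(v) for v in value if v != ""]
    return value
```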

Support posting to observatory

Here's a snippet that works against my local dev.

import json
import urllib2


def store_results_api(res):
    """Store results via the API component.

    Store results via urllib2, or directly in S3 when running in
    Lambda. The HTTP request becomes a POST instead of a GET when
    the data parameter is provided.
    """
    data = json.dumps(res)
    api_key = '4KGRb4PMlx1bBLZQ'
    headers = {
        "Authorization": "Basic %s" % api_key,
        'Content-Type': 'application/json'
    }

    req = urllib2.Request(
        'http://localhost:5000/api/profile',
        data=data,
        headers=headers
    )
    response = urllib2.urlopen(req)
    return response.read()

Log timedata

Each run should also log a date-time, in epoch seconds, as part of the JSON.
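A minimal sketch (the field name is my own):

```python
import time

def with_timestamp(res):
    # stamp each profiler run with the current epoch time in seconds
    res["run_epoch"] = int(time.time())
    return res
```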

Tokenize CPUInfo Options to_dict

I thought it would be easy to do this in fluentd... turns out it's not that easy without writing a fluentd plugin.

The bottom line is that if we take the output of CPUInfo and tokenize it in Python as a dict, it will parse to fields automatically in Elastic.
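A sketch of that tokenization, assuming the CPUInfo source is /proc/cpuinfo-style `key : value` lines; keys are normalized to snake_case so they make clean Elastic field names:

```python
def cpuinfo_to_dict(text):
    # split "key : value" lines into a flat dict of string fields
    info = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        info[key.strip().replace(" ", "_").lower()] = value.strip()
    return info
```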

Standardize DateTimes

TypeError: datetime.datetime(2017, 3, 6, 19, 13, 16) is not JSON serializable occurs in Python for warm_since and warm_for.

Let's move them all to the same format.
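One way to standardize (a sketch, not necessarily the format the project settles on): render every datetime as an ISO 8601 string via the `default` hook of json.dumps:

```python
import datetime
import json

def json_default(obj):
    # serialize datetimes as ISO 8601 so json.dumps never chokes on them
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    raise TypeError("%r is not JSON serializable" % obj)

payload = json.dumps(
    {"warm_since": datetime.datetime(2017, 3, 6, 19, 13, 16)},
    default=json_default,
)
```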

Add runtime to json structure

Add which runtime we're evaluating as a static but always-present field in the output. Example:

{
  "runtime": "python"
}

where the value is one of "python", "c-sharp", or "javascript".

/var/task

It'd be great to get `find /var/task` in this profiler output if you're releasing it as a general-purpose tool.

@Miserlou: just capturing this in an issue.
