swiser-young / cspm-gpt

This project forked from samvas-codes/cspm-gpt

The following is a simple example of how LLMs and langchain agents can simplify asking questions to understand the security posture of a cloud environment.

License: Apache License 2.0

Shell 3.97% Python 91.73% Dockerfile 4.31%

cspm-gpt's Introduction

Cloud Security Posture Management (CSPM) powered by GPT-4

Cloud Security Posture Management (CSPM) tools have evolved over the past years. What began as only an audit of cloud resource configurations has grown into a complex capability that requires querying across relationships, security-related data, and events. Users of these tools need to customize them to their environment to make security issues easier to address. This requires building custom logic or queries in query languages that can be difficult to learn and adapt.

The following is a simple example of how LLMs and LangChain agents can simplify asking questions to understand the security posture of a cloud environment. The project now supports multiple DB types, including PostgreSQL and Neo4j. It is also extensible across data ingest platforms, including Cartography and CloudQuery. Initial attempts to validate shortest paths have been successful when prompted appropriately. This is intended to be a PoC for generating ad-hoc attack paths, provided assets are labeled appropriately.

Disclaimer: The app is a demo and several improvements can be made. The queries made and results displayed are currently best effort.

NOTE: The Dockerfile hasn't been updated to include graph DBs.

Installation

There are two options to use this project.

1. Run it as is as a Streamlit app
2. Run it as a containerized Streamlit app

Prerequisites

1. Install CloudQuery or Cartography on your machine.
2. Set up a Postgres or Neo4j DB; this example uses either one as its backend.

Run Locally

Clone the project

  git clone https://github.com/samvas-codes/cspm-gpt.git

Go to the project directory

  cd cspm-gpt

Install dependencies

  pip install -r requirements.txt

Start the app

  streamlit run app.py

Demo

Insert gif or link to demo

Environment Variables

To run this project in a container, you will need to add the following environment variables to your .env file:

  AWS_ACCESS_KEY_ID=<AWS-ACCESS-KEY>
  AWS_SECRET_ACCESS_KEY=<AWS-SECRET>
  AWS_DEFAULT_REGION=<AWS-DEFAULT-REGION>
  PGDATA=<POSTGRES DATA PATH>
  POSTGRES_USER=""
  POSTGRES_PASSWORD=""
  POSTGRES_DB=""

A sample .env file is provided which contains the default postgres configuration used.
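Before starting the container, it can help to confirm that every variable listed above is set. The following is a minimal sketch using only the Python standard library; the helper names `parse_env` and `missing_keys` are hypothetical and not part of this project, and the parser only handles simple `KEY=VALUE` lines.

```python
# Hypothetical helper, not part of cspm-gpt: checks a .env file for the
# variables listed in this README.
REQUIRED_KEYS = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
    "PGDATA",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to an empty value."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        missing = missing_keys(handle.read())
    print("Missing or empty:", ", ".join(missing) if missing else "none")
```

Run as a script from the project directory, it prints the variables that still need values.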

In addition, the Streamlit app needs access to your OpenAI API key. To add it:

1. Create a .streamlit/secrets.toml file in the project directory
2. Add the following line: OPENAI_API_KEY = "<API-KEY>"

Usage/Examples

Once the app is running and you have ingested data from your AWS accounts (e.g. using CloudQuery), try the following prompts.

1. How many running EC2 instances are present? List the instance IDs.
2. How many EBS volumes are attached to EC2 instances?
3. How many EC2 instances are public? What are their public IPs? List the instance IDs and the public IPs as a table.
4. List all CIS checks that have failed. Get the resources that have failed these checks. List the checks failed and resources as a table.
5. How many EC2 instances also have an IAM role attached? List the instance IDs, IAM roles, and the IAM policy attached to each role.
6. Find the shortest path between an EC2 instance and an S3 bucket, and describe how they are connected (GRAPH USE CASE)
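For the graph use case in prompt 6, the kind of Cypher the agent might generate can be sketched by hand. The snippet below is only an illustration: the function `shortest_path_query` is hypothetical, the node labels `EC2Instance` and `S3Bucket` assume a Cartography-style Neo4j schema, and the hop limit of 6 is an arbitrary choice. The query the app actually produces may look different.

```python
import re

# Labels are interpolated into the query text, so restrict them to plain
# identifiers to avoid building a malformed or injected query.
_LABEL = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def shortest_path_query(source_label, target_label, max_hops=6):
    """Build a Cypher query for one shortest path between two node labels."""
    for label in (source_label, target_label):
        if not _LABEL.match(label):
            raise ValueError(f"invalid label: {label!r}")
    if not isinstance(max_hops, int) or max_hops < 1:
        raise ValueError("max_hops must be a positive integer")
    return (
        f"MATCH (src:{source_label}), (dst:{target_label}) "
        f"MATCH p = shortestPath((src)-[*..{max_hops}]-(dst)) "
        "RETURN p LIMIT 1"
    )


if __name__ == "__main__":
    # Assumed labels; adjust to whatever your ingest tool writes to Neo4j.
    print(shortest_path_query("EC2Instance", "S3Bucket"))
```

Running a query like this by hand in the Neo4j browser is one way to cross-check an answer from the app.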

Running a few examples and their results

Let's start simple and find EC2 instances that have EBS volumes attached to them. [screenshot: ec2-ebs-llm-query]

Let's verify that this query actually works by running it directly against the Neo4j DB. [screenshot: ec2-ebs-verified-query]

Now let's ask the typical attack path question: can you find an EC2 instance that has access to an S3 bucket? [screenshot: ec2-s3-llm-query]

Is it hallucinating? Nope! [screenshot: ec2-s3-llm-query]

Taking it a step further, let's find an internet-exposed EC2 instance that has access to an S3 bucket. [screenshot: ec2-s3-llm-query]

What I currently observe is that as long as GPT knows the schema and you prompt-engineer your question, the results are fairly accurate.
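One way to make the schema known to the model is to include it in the prompt itself. The following is a rough sketch of that idea, not the prompt this project uses: `build_prompt` is a hypothetical helper, and the schema lines in the example are illustrative placeholders rather than the real Cartography or CloudQuery schema.

```python
def build_prompt(schema, question, dialect="Cypher"):
    """Combine schema descriptions and a user question into one prompt."""
    schema_lines = "\n".join(f"- {line}" for line in schema)
    return (
        f"You translate questions into {dialect} queries.\n"
        "Use only the labels, relationships, and properties listed below.\n"
        f"Schema:\n{schema_lines}\n"
        f"Question: {question.strip()}\n"
        f"{dialect} query:"
    )


if __name__ == "__main__":
    # Illustrative schema fragments only; replace with your actual schema.
    example_schema = [
        "(:EC2Instance {instanceid, publicipaddress})",
        "(:EBSVolume)-[:ATTACHED_TO]->(:EC2Instance)",
    ]
    print(build_prompt(example_schema, "Which EC2 instances have EBS volumes attached?"))
```

Telling the model to use only the listed names is meant to reduce the chance that it invents labels or columns that do not exist in the database.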

Use cases

1. The app can be a natural language query builder for CSPM tools 
2. The app can help SOC teams query their data on demand and visualize them 
3. It can also help security engineers quickly develop complex queries using natural language 
4. CISOs who want to understand the state of their environment can easily ask questions

Roadmap

1. Updates to include knowledge graphs as a datasource (Neo4J, AWS Neptune) -- DONE 
2. Adding vector stores to cache similar queries 
3. Display generated queries to allow manual intervention
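Roadmap item 2 (caching similar queries) can be prototyped without a vector store. The sketch below is a simplified stand-in: it uses string similarity from Python's `difflib` instead of embeddings, and the class `SimilarQueryCache` and its 0.9 threshold are assumptions made for illustration, not a planned design.

```python
from difflib import SequenceMatcher


class SimilarQueryCache:
    """Toy cache that returns a stored query for a near-identical question.

    String similarity stands in for the embedding lookup a real vector
    store would provide.
    """

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self._entries = []  # list of (normalized question, generated query)

    @staticmethod
    def _normalize(question):
        # Lowercase and collapse whitespace so trivial variations match.
        return " ".join(question.lower().split())

    def get(self, question):
        """Return the best cached query at or above the threshold, else None."""
        key = self._normalize(question)
        best_score, best_query = 0.0, None
        for stored, query in self._entries:
            score = SequenceMatcher(None, key, stored).ratio()
            if score > best_score:
                best_score, best_query = score, query
        return best_query if best_score >= self.threshold else None

    def put(self, question, query):
        """Store the generated query under the normalized question."""
        self._entries.append((self._normalize(question), query))
```

A caller would check `get` before invoking the LLM and call `put` once a generated query has been confirmed to work.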

cspm-gpt's People

Contributors

samvas-codes
