cookpad / minerva

Serverless Log Search Architecture for Security Monitoring based on Amazon Athena

License: MIT License

Makefile 0.69% Go 93.63% JavaScript 0.33% TypeScript 5.35%
amazon-athena security-monitoring cloudformation golang search-engine

minerva's Introduction

minerva

Serverless Log Search Architecture for Security Monitoring based on Amazon Athena.

Overview

In security monitoring, a security engineer must analyze alerts from security devices to determine the risk each alert poses. When analyzing a security alert, logs from systems, applications, middleware, networks, and third-party services greatly help the engineer understand what happened around the alert. Many useful log search engine products and services already exist. However, these products and services become expensive as log volume grows.

Minerva is designed for cost effectiveness by leveraging AWS managed services. The target use case is log search for several security alerts per day.

  • Advantages
    • Low running cost: for example, 7.5 TB of logs and several searches per day cost only about $300/mo in total.
    • Low operational cost: All components of Minerva are managed services and require minimal operation. Additionally, the preprocessing Lambda function scales in and out smoothly.
  • Disadvantages
    • Cost increases with the number of searches, so Minerva is not appropriate for continuous search workloads (e.g. threat hunting).
    • Amazon Athena search latency ranges from about 10 seconds to several minutes, which is higher than that of other search engines (e.g. Elasticsearch).

Minerva provides only an API to search logs. See Strix for a web-based user interface for Minerva. The following figure shows an abstracted architecture of Minerva and Strix.

[Figure: rough architecture of Minerva and Strix]

On a side note, Minerva is the Roman goddess who is equated with Athena.

Getting Started

Prerequisites

  • Tools
    • aws-cdk >= 1.38.0
    • go >= 1.13
  • Resources
    • An S3 bucket that stores logs (assumed bucket name: s3-log-bucket)
    • An S3 bucket that stores Parquet files (assumed bucket name: s3-parquet-bucket)
    • An Amazon SNS topic that receives s3:ObjectCreated events. See docs to configure. (assumed topic name: s3-log-create-topic)
    • An IAM role that lets the Lambda function access the S3 buckets and other resources (assumed role name: YourLambdaRole)
    • Additionally, these resources are assumed to be in the ap-northeast-1 region, and the account ID is 1234567890x

Configurations

Initialize your configuration directory with the cdk init command.

$ cdk init --language typescript

Then update bin/cdk.ts as follows.

#!/usr/bin/env node
import "source-map-support/register";
import * as cdk from "@aws-cdk/core";
import { MinervaStack } from "../minerva/lib/minerva-stack";

const app = new cdk.App();
const stackID = "your-stack-name";
new MinervaStack(
  app,
  stackID,
  {
    dataS3Region: "ap-northeast-1",
    dataS3Bucket: "s3-parquet-bucket",
    dataS3Prefix: "production/", // Set as you like it
    athenaDatabaseName: "minerva_db", // Set as you like it
    dataSNSTopicARN:
      "arn:aws:sns:ap-northeast-1:1234567890x:s3-log-create-topic",
    lambdaRoleARN: "arn:aws:iam::1234567890x:role/YourLambdaRole",
  },
  {
    stackName: stackID,
    env: {
      region: "ap-northeast-1",
      account: "1234567890x",
    },
  }
);
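The two ARN strings in the configuration above follow the standard arn:partition:service:region:account-id:resource layout. As a small sketch, the hypothetical helper below rebuilds them from the names assumed in the prerequisites; it is only an illustration of how the pieces fit together, not part of Minerva.

```go
package main

import "fmt"

// arn (hypothetical helper) joins the standard ARN segments for the "aws" partition.
func arn(service, region, account, resource string) string {
	return fmt.Sprintf("arn:aws:%s:%s:%s:%s", service, region, account, resource)
}

func main() {
	const region, account = "ap-northeast-1", "1234567890x"
	// SNS topic ARN used for dataSNSTopicARN.
	fmt.Println(arn("sns", region, account, "s3-log-create-topic"))
	// IAM is a global service, so the region segment is left empty.
	fmt.Println(arn("iam", "", account, "role/YourLambdaRole"))
}
```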

After that, create indexer.go. An example follows.

package main

import (
	"context"

	"github.com/m-mizutani/rlogs"
	"github.com/m-mizutani/rlogs/parser"
	"github.com/m-mizutani/rlogs/pipeline"

	"github.com/aws/aws-lambda-go/events"
	"github.com/aws/aws-lambda-go/lambda"
	"github.com/m-mizutani/minerva/pkg/indexer"
)

func main() {
	lambda.Start(func(ctx context.Context, event events.SQSEvent) error {
		logEntries := []*rlogs.LogEntry{
			// VPC FlowLogs
			{
				Pipe: pipeline.NewVpcFlowLogs(),
				Src: &rlogs.AwsS3LogSource{
					Region: "ap-northeast-1",
					Bucket: "my-flow-logs",
					Key:    "AWSLogs/",
				},
			},

			// Syslog
			{
				Pipe: rlogs.Pipeline{
					Ldr: &rlogs.S3LineLoader{},
					Psr: &parser.JSON{
						Tag:             "ec2.syslog",
						TimestampField:  rlogs.String("timestamp"),
						TimestampFormat: rlogs.String("2006-01-02T15:04:05-0700"),
					},
				},
				Src: &rlogs.AwsS3LogSource{
					Region: "ap-northeast-1",
					Bucket: "my-ec2-syslog",
					Key:    "logs/",
				},
			},
		}

		return indexer.RunIndexer(ctx, event, rlogs.NewReader(logEntries))
	})
}

indexer.go is built on rlogs. See the rlogs repository for more details.

Lastly, clone the minerva repository.

$ git clone git@github.com:m-mizutani/minerva.git

Deployment

$ go mod init indexer
$ env GOARCH=amd64 GOOS=linux go build -o build/indexer .
$ npm install
$ npm run build
$ cdk deploy

Development

Architecture Overview

[Figure: architecture overview]

License

MIT License

minerva's People

Contributors

dependabot[bot], m-mizutani

minerva's Issues

indexer.go in Readme.md is out of date

Hi.
Thank you for kindly open-sourcing your great code.

I tried to run minerva by following your readme.md.
Now I get the following errors when I execute "env GOARCH=amd64 GOOS=linux go build -o build/indexer .":

---- error starts ----
# indexer
./indexer.go:46:28: too many arguments in call to indexer.RunIndexer
        have (context.Context, events.SQSEvent, *rlogs.Reader)
        want (*rlogs.Reader)
./indexer.go:46:28: indexer.RunIndexer(ctx, event, rlogs.NewReader(logEntries)) used as value
---- error ends ----

It seems that your repository has been updated, but this "indexer.go" is out of date.
In "9f7ce5c", the function "RunIndexer" was changed, but the indexer.go in the readme does not match this update.

Would you take a look at this code?
Thank you.
