
bucket-antivirus-function's Introduction

bucket-antivirus-function


Scan new objects added to any S3 bucket using AWS Lambda. More details are in this post.

Features

  • Easy to install
  • Send events from an unlimited number of S3 buckets
  • Prevent reading of infected files using S3 bucket policies
  • Accesses the end-user’s separate installation of open source antivirus engine ClamAV

How It Works

architecture-diagram

  • Each time a new object is added to a bucket, S3 invokes the Lambda function to scan the object
  • The function package downloads (if needed) current antivirus definitions from an S3 bucket. Transfer speeds between an S3 bucket and Lambda are typically faster and more reliable than from another source
  • The object is scanned for viruses and malware. Archive files are extracted and the files inside are scanned as well
  • The object's tags are updated to reflect the result of the scan, CLEAN or INFECTED, along with the date and time of the scan.
  • Object metadata is updated to reflect the result of the scan (optional)
  • Metrics are sent to DataDog (optional)
  • Scan results are published to an SNS topic (optional; you can choose to publish only INFECTED results)
  • Files found to be INFECTED are automatically deleted (optional)
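The tagging step above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's code: the helper name `build_scan_tags` and the timestamp format are assumptions, while the tag keys and status values are the defaults listed under Configuration.

```python
from datetime import datetime, timezone

# Default tag keys and values, taken from the Configuration table.
AV_STATUS_METADATA = "av-status"
AV_TIMESTAMP_METADATA = "av-timestamp"
AV_STATUS_CLEAN = "CLEAN"
AV_STATUS_INFECTED = "INFECTED"


def build_scan_tags(infected, existing_tags=None, now=None):
    """Return an S3-style TagSet recording the scan result and scan time.

    Hypothetical helper: existing tags are kept, except for earlier
    av-status / av-timestamp entries, which are replaced.
    """
    now = now or datetime.now(timezone.utc)
    status = AV_STATUS_INFECTED if infected else AV_STATUS_CLEAN
    reserved = {AV_STATUS_METADATA, AV_TIMESTAMP_METADATA}
    tags = [t for t in (existing_tags or []) if t["Key"] not in reserved]
    tags.append({"Key": AV_STATUS_METADATA, "Value": status})
    # The timestamp format is an assumption made for this sketch.
    tags.append(
        {"Key": AV_TIMESTAMP_METADATA, "Value": now.strftime("%Y/%m/%d %H:%M:%S UTC")}
    )
    return tags
```

The returned list has the shape that boto3's `put_object_tagging` accepts as `Tagging={"TagSet": tags}`.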

Installation

Build from Source

To build the archive to upload to AWS Lambda, run make all. The build process is completed using the amazonlinux Docker image. The resulting archive will be built at build/lambda.zip. This file will be uploaded to AWS for both Lambda functions below.

Create Relevant AWS Infra via CloudFormation

Use CloudFormation with the cloudformation.yaml located in the deploy/ directory to quickly spin up the AWS infra needed to run this project. CloudFormation will create:

  • An S3 bucket that will store AntiVirus definitions.
  • A Lambda Function called avUpdateDefinitions that will update the AV Definitions in the S3 Bucket every 3 hours. This function accesses the user’s above S3 Bucket to download updated definitions using freshclam.
  • A Lambda Function called avScanner that is triggered on each new S3 object creation, scans the object, and tags it appropriately. It is created with 1600 MB of memory, which should be enough; if you start to see function timeouts, this memory may have to be increased. In the past we recommended 1024 MB, but that started causing Lambda timeouts, and raising the memory resolved them.

When you run CloudFormation, it asks for two inputs for this stack:

  1. BucketType: private (default) or public. This is applied to the S3 bucket that stores the AntiVirus definitions. We recommend using public only when other AWS accounts need access to this bucket.
  2. SourceBucket: [a non-empty string]. The name (do not include s3://) of the S3 bucket that will have its objects scanned. Note: this is only used to create the IAM Policy; you can add or change source buckets later via the IAM Policy that CloudFormation outputs.

After the stack has been created successfully, three manual steps remain:

  1. Upload the build/lambda.zip file that was created by running make all to the avUpdateDefinitions and avScanner Lambda functions via the Lambda Console.
  2. To trigger the Scanner function on new S3 objects, go to the avScanner Lambda function console, navigate to Configuration -> Trigger -> Add Trigger -> Search for S3, and choose your bucket(s) and select All object create events, then click Add. Note - if you chose more than 1 bucket as the source, or chose a different bucket than the Source Bucket in the CloudFormation parameter, you will have to also edit the IAM Role to reflect these new buckets (see "Adding or Changing Source Buckets")
  3. Navigate to the avUpdateDefinitions Lambda function and manually trigger the function to get the initial Clam definitions in the bucket (instead of waiting for the 3 hour trigger to happen). Do this by clicking the Test section, and then clicking the orange test button. The function should take a few seconds to execute, and when finished you should see the clam_defs in the av-definitions S3 bucket.

Adding or Changing Source Buckets

Changing or adding Source Buckets is done by editing the AVScannerLambdaRole IAM Role, specifically the S3AVScan and KmsDecrypt parts of that role's policy.

S3 Events

Configure scanning of additional buckets by adding a new S3 event to invoke the Lambda function. This is done from the properties of any bucket in the AWS console.

s3-event

Note: If configured to update object metadata, events must only be configured for PUT and POST. Metadata is immutable, which requires the function to copy the object over itself with updated metadata. This can cause a continuous loop of scanning if improperly configured.
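As an extra safeguard against the loop described in the note, a handler could also check the event name in code. The following is a sketch under stated assumptions, not part of this project: `should_scan` is a hypothetical helper, and it relies on the `eventName` field that S3 includes in each notification record.

```python
# S3 event names that indicate a fresh upload. A copy of the object over
# itself (as done when rewriting metadata) arrives as "ObjectCreated:Copy".
UPLOAD_EVENTS = {"ObjectCreated:Put", "ObjectCreated:Post"}


def should_scan(record):
    """Return True only for PUT/POST uploads; skip copies and unknown events."""
    return record.get("eventName") in UPLOAD_EVENTS
```

Note that this filter mirrors the PUT/POST restriction exactly, so it would also skip other creation events such as completed multipart uploads.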

Configuration

Runtime configuration is accomplished using environment variables. See the table below for reference.

| Variable | Description | Default | Required |
| --- | --- | --- | --- |
| AV_DEFINITION_S3_BUCKET | Bucket containing antivirus definition files | | Yes |
| AV_DEFINITION_S3_PREFIX | Prefix for antivirus definition files | clamav_defs | No |
| AV_DEFINITION_PATH | Path containing files at runtime | /tmp/clamav_defs | No |
| AV_SCAN_START_SNS_ARN | SNS topic ARN to publish notification about start of scan | | No |
| AV_SCAN_START_METADATA | The tag/metadata indicating the start of the scan | av-scan-start | No |
| AV_SIGNATURE_METADATA | The tag/metadata name representing file's AV type | av-signature | No |
| AV_STATUS_CLEAN | The value assigned to clean items inside of tags/metadata | CLEAN | No |
| AV_STATUS_INFECTED | The value assigned to infected items inside of tags/metadata | INFECTED | No |
| AV_STATUS_METADATA | The tag/metadata name representing file's AV status | av-status | No |
| AV_STATUS_SNS_ARN | SNS topic ARN to publish scan results (optional) | | No |
| AV_STATUS_SNS_PUBLISH_CLEAN | Publish AV_STATUS_CLEAN results to AV_STATUS_SNS_ARN | True | No |
| AV_STATUS_SNS_PUBLISH_INFECTED | Publish AV_STATUS_INFECTED results to AV_STATUS_SNS_ARN | True | No |
| AV_TIMESTAMP_METADATA | The tag/metadata name representing file's scan time | av-timestamp | No |
| CLAMAVLIB_PATH | Path to ClamAV library files | ./bin | No |
| CLAMSCAN_PATH | Path to ClamAV clamscan binary | ./bin/clamscan | No |
| FRESHCLAM_PATH | Path to ClamAV freshclam binary | ./bin/freshclam | No |
| DATADOG_API_KEY | API Key for pushing metrics to DataDog (optional) | | No |
| AV_PROCESS_ORIGINAL_VERSION_ONLY | Controls that only original version of an S3 key is processed (if bucket versioning is enabled) | False | No |
| AV_DELETE_INFECTED_FILES | Controls whether infected files should be automatically deleted | False | No |
| EVENT_SOURCE | The source of antivirus scan event, "S3" or "SNS" (optional) | S3 | No |
| S3_ENDPOINT | The Endpoint to use when interacting with S3 | None | No |
| SNS_ENDPOINT | The Endpoint to use when interacting with SNS | None | No |
| LAMBDA_ENDPOINT | The Endpoint to use when interacting with Lambda | None | No |
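To illustrate how such environment-driven configuration is typically read, here is a minimal sketch. The function `load_config`, the dictionary keys, and the way boolean values are parsed are assumptions made for this example; only the variable names and defaults come from the table.

```python
import os


def load_config(environ=None):
    """Read a few of the documented variables, applying their defaults."""
    env = os.environ if environ is None else environ

    def flag(name, default):
        # Assumption: boolean settings are spelled "True"/"False".
        return str(env.get(name, default)).strip().lower() == "true"

    if "AV_DEFINITION_S3_BUCKET" not in env:
        # The only variable marked as required in the table.
        raise KeyError("AV_DEFINITION_S3_BUCKET is required")

    return {
        "definition_bucket": env["AV_DEFINITION_S3_BUCKET"],
        "definition_prefix": env.get("AV_DEFINITION_S3_PREFIX", "clamav_defs"),
        "definition_path": env.get("AV_DEFINITION_PATH", "/tmp/clamav_defs"),
        "status_tag": env.get("AV_STATUS_METADATA", "av-status"),
        "delete_infected": flag("AV_DELETE_INFECTED_FILES", False),
        "publish_clean": flag("AV_STATUS_SNS_PUBLISH_CLEAN", True),
    }
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check locally, for example `load_config({"AV_DEFINITION_S3_BUCKET": "my-defs-bucket"})`.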

S3 Bucket Policy Examples

Deny downloading the object if it is not "CLEAN"

This policy denies downloads of the object until both of the following are true:

  1. The Lambda that runs ClamAV has finished (so the object has a tag)
  2. The file is tagged CLEAN

Check CloudTrail for the arn:aws:sts principal: find the event, open it, and copy the STS ARN. It should be in the format shown below:

{
    "Effect": "Deny",
    "NotPrincipal": {
        "AWS": [
            "arn:aws:iam::<<aws-account-number>>:role/<<bucket-antivirus-role>>",
            "arn:aws:sts::<<aws-account-number>>:assumed-role/<<bucket-antivirus-role>>/<<bucket-antivirus-role>>",
            "arn:aws:iam::<<aws-account-number>>:root"
        ]
    },
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::<<bucket-name>>/*",
    "Condition": {
        "StringNotEquals": {
            "s3:ExistingObjectTag/av-status": "CLEAN"
        }
    }
}

Deny downloading and re-tagging of "INFECTED" objects

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": ["s3:GetObject", "s3:PutObjectTagging"],
      "Principal": "*",
      "Resource": ["arn:aws:s3:::<<bucket-name>>/*"],
      "Condition": {
        "StringEquals": {
          "s3:ExistingObjectTag/av-status": "INFECTED"
        }
      }
    }
  ]
}
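If you manage several buckets, the policy above can be generated rather than edited by hand. The sketch below is only an illustration: `deny_infected_policy` is a hypothetical helper, and the bucket name in the usage line is a placeholder.

```python
import json


def deny_infected_policy(bucket_name, status_tag="av-status", infected_value="INFECTED"):
    """Build the 'deny download and re-tag of INFECTED objects' bucket policy."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": ["s3:GetObject", "s3:PutObjectTagging"],
                "Principal": "*",
                "Resource": ["arn:aws:s3:::{}/*".format(bucket_name)],
                "Condition": {
                    "StringEquals": {
                        "s3:ExistingObjectTag/{}".format(status_tag): infected_value
                    }
                },
            }
        ],
    }


# Print the policy as JSON, ready to paste into the bucket policy editor.
print(json.dumps(deny_infected_policy("my-example-bucket"), indent=2))
```

The `status_tag` and `infected_value` parameters should match AV_STATUS_METADATA and AV_STATUS_INFECTED if you changed their defaults.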

Manually Scanning Buckets

You may want to scan all the objects in a bucket that have not previously been scanned or were created prior to setting up your lambda functions. To do this you can use the scan_bucket.py utility.

pip install boto3
scan_bucket.py --lambda-function-name=<lambda_function_name> --s3-bucket-name=<s3-bucket-to-scan>

This tool scans all objects in the bucket that have not been previously scanned and invokes the Lambda function asynchronously. As a result, you will have to check your CloudWatch logs to see the scan results or failures. The script uses the same environment variables as your Lambda, so you can configure it similarly.
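The selection of "not previously scanned" objects can be pictured as a filter on object tags. The sketch below assumes that an object counts as scanned once it carries the av-status tag; how scan_bucket.py actually decides is not described here, and both function names are hypothetical.

```python
def needs_scan(tag_set, status_tag="av-status"):
    """Return True when an object's TagSet has no scan-status tag yet."""
    return all(tag["Key"] != status_tag for tag in tag_set)


def select_unscanned(objects):
    """Given {key: tag_set}, return the sorted keys that still need a scan."""
    return sorted(key for key, tags in objects.items() if needs_scan(tags))


# Placeholder listing; in practice the tag sets would come from S3.
listing = {
    "old/report.pdf": [],
    "new/photo.png": [{"Key": "av-status", "Value": "CLEAN"}],
}
print(select_unscanned(listing))
```

Each selected key would then be handed to the Lambda function; with boto3, an asynchronous call corresponds to `invoke(..., InvocationType="Event")`.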

Testing

There are two types of tests in this repository: pre-commit tests and Python tests. All of these tests are run by CircleCI.

pre-commit Tests

The pre-commit tests ensure that code submitted to this repository meets the standards of the repository. To get started with these tests, run make pre_commit_install. This will install the pre-commit tool and then install it in this repository. The GitHub pre-commit hook will then run these tests before you commit your code.

To run the tests manually run make pre_commit_tests or pre-commit run -a.

Python Tests

The python tests in this repository use unittest and are run via the nose utility. To run them you will need to install the developer resources and then run the tests:

pip install -r requirements.txt
pip install -r requirements-dev.txt
make test

Local lambdas

You can run the lambdas locally to test out what they are doing without deploying to AWS. This is accomplished by using docker containers that act similarly to lambda. You will need to have set up some local variables in your .envrc.local file and modify them appropriately first before running direnv allow. If you do not have direnv it can be installed with brew install direnv.

For the Scan lambda you will need a test file uploaded to S3 and the variables TEST_BUCKET and TEST_KEY set in your .envrc.local file. Then you can run:

direnv allow
make archive scan

If you want a file that will be recognized as a virus, you can download a test file from the EICAR website and upload it to your bucket.

For the Update lambda you can run:

direnv allow
make archive update

License

Upside Travel, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

ClamAV is released under the GPL Version 2 License and all source for ClamAV is available for download on Github.

bucket-antivirus-function's People

Contributors

adminrobert, albertocubeddu, andrewlane, avipinto, dependabot[bot], dmarkey, edhgoose, groybal, igavrilov, jacek99, jaygorrell, jdepp, jfurmankiewiczupgrade, smellman


bucket-antivirus-function's Issues

Update to Latest ClamAV 0.100.2

Hello All,
How can I update ClamAV to the latest release, 0.100.2?
EPEL 7 supports only 0.100.1-3. Is there a way to update to the latest?
Thanks,
Krishh

I created a terraform module that deploys the bucket antivirus function

Hey there!

Took the liberty to automate the process of deploying this function. It does everything from compiling and deploying the code to a generic s3 bucket, creating the functions, setting the time trigger for the update function, configure the bucket event triggers for arbitrary, external buckets, as well as the necessary permissions for the lambda function to scan the bucket.

Now all there is to do is create the antivirus with a module like this:

module "antivirus" {
  source = "gchamon/bucket-antivirus/aws"

  buckets-to-scan = [
    "my-test-bucket"
  ]

  scanner-environment-variables = {
    AV_DELETE_INFECTED_FILES = "True"
  }
  
  allow-public-access = true
}

All the environment variables are supported.

I wrote a deployment script based on your make script. I had to do that, because terraform executes bash code outside shell environment, so no TTY available for docker. This is why I needed more control over the code.

Here is the repository project: https://github.com/gchamon/terraform-aws-bucket-antivirus

Here is the terraform registry page: https://registry.terraform.io/modules/gchamon/bucket-antivirus/aws

I didn't include support for sns permissions yet, because in my original project I didn't need them and I couldn't write a test to assure that sns is correctly set, but with help I could update the project to include support for sns.

Hope you enjoy!

datadog events & SNS

Hi, I'm having an issue with the datadog and SNS functionality. I've set up an API key within datadog and set the environment variable within the bucket-antivirus-function, but I'm not seeing events or metrics. I've also set up a topic (and subscribed to it), but I don't get any notices from it either.

I'm using an EICAR text file to test, so it gets an av-status of 'infected' within S3 itself. It's just not sending anything to datadog or SNS.

Can you point me in a direction to poke at this a bit more? Thanks!

Private Bucket

Has anyone had problems using a private bucket to store the virus scan rules?

My bucket policy has to be this:

{
    "Version": "2012-10-17",
    "Id": "BucketPolicy",
    "Statement": [
        {
            "Sid": "SecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::XXXXXXXXXXXXXXXXXXXXX/*",
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        }
    ]
}

I have given the role that executes the lambda full access to the bucket

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "WriteCloudWatchLogs",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        },
        {
            "Sid": "s3GetAndPutWithTagging",
            "Action": [
                "*"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::xxxxxxxxxx/*",
                "arn:aws:s3:::xxxxxxxxxx"
            ]
        },
        {
            "Sid": "s3HeadObject",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": [
                "arn:aws:s3:::xxxxxxxxx/*",
                "arn:aws:s3:::xxxxxxxxx"
            ]
        },
        {
            "Sid": "kmsDecrypt",
            "Action": [
                "kms:Decrypt"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::xxxxxxxxxxx/*"
            ]
        }
    ]
}

However I am getting the following error:

S3UploadFailedError: Failed to upload /tmp/clamav_defs/main.cvd to XXXXXXXXXXXXXXXXXXX/clamav_defs/main.cvd: An error occurred (AccessDenied) when calling the CreateMultipartUpload operation: Access Denied

Query regarding Python 2.7 End-Of-Life approaching

With Python 2.7 EOL approaching, and given how proactive AWS is about phasing out EOL runtimes from Lambda (e.g. recently Node 8.x), I was wondering whether you were planning to update the GitHub code base to 3.x or leave it as 2.7 for the foreseeable future.

Love this solution and appreciate that you released it open source, it has been very helpful.

daily.cud and bytecode.cud files missing

After running the bucket-antivirus-update lambda function, there were only these files in the clamav_defs folder:

  • main.cvd
  • daily.cvd
  • bytecode.cvd

There were no files named daily.cud and bytecode.cud which caused the bucket-antivirus-function lambda function to give an error in downloading these files.

For now, I have removed these 2 files from https://github.com/upsidetravel/bucket-antivirus-function/blob/master/common.py#L31 and it works fine.

I see that this is a recent change. Is the current master not stable? Let me know if I am missing something

Missing dependencies

This looks really interesting.
I have now made two attempts to set this up.
First I tried with Ubuntu on my Windows bash prompt,
then I started an Ubuntu EC2 instance and ran make there.
Same result:
When testing the Lambda I get an error:

`START RequestId: nnn-bnnn-nnnn-nnnn Version: $LATEST
Script starting at 2018/12/18 12:26:12 UTC

Attempting to create directiory /tmp/clamav_defs.

Starting freshclam with defs in /tmp/clamav_defs.
freshclam output:
./bin/freshclam: error while loading shared libraries: libjson-c.so.2: cannot open shared object file: No such file or directory

Unexpected exit code from freshclam: 127.
Script finished at 2018/12/18 12:26:12 UTC
...
`

Any ideas about how to get through this?

The bucket-antivirus-function times out when calling clamav

Hi!

After installing the functions manually, the function that downloads clamav and the definitions works fine; however, the function that is supposed to scan files hangs when calling clamav.

The last message in the log is "Starting clamscan of /tmp/gfx-av-tests/test.txt." Then AWS kills the lambda when the execution time gets over the timeout (5 min in my case).

It looks like it is hanging after executing output = av_proc.communicate()[0]

Could you please advise what could be the issue?

Exception error

I've been able to get the Update AV functionality working without issue, but keep running into this exception error when I upload a file to the bucket to be scanned..

Starting clamscan of /tmp/bucket/Testfile.txt.

./bin/clamscan: error while loading shared libraries: libpcre.so.1: cannot open shared object file: No such file or directory

Unexpected exit code from clamscan: 127.
: Exception
Traceback (most recent call last):
File "/var/task/scan.py", line 101, in lambda_handler
scan_result = clamav.scan_file(file_path)
File "/var/task/clamav.py", line 133, in scan_file
raise Exception(msg)
Exception: Unexpected exit code from clamscan: 127.

Any advice or pointers gratefully received

Deprecation of python 2 and upgrade to python 3 https://pythonclock.org/

It seems there's going to be an inevitable incompatibility with this repo moving forward, since Python 2 is being deprecated in one month and will no longer receive any updates, including fixes for severe security issues.

https://pythonclock.org/

https://www.python.org/doc/sunset-python-2/

When a language is deprecated AWS also removes support for creating new lambda functions and editing functions with that runtime. So they should be sending out emails shortly telling users to update their functions to python 3.

It does seem there are pull requests that are attempting to address this issue:

#71 & #74

freshclam not updating daily.cvd

There appears to be an issue with updating on our installation. The cdiffs download, and freshclam appears to generate the daily.cld file, but it doesn't create the cvd and upload it. I added -v to the freshclam command, and it returns exit code 1, i.e. the database is up to date, even though my daily.cvd hasn't been updated since the 30th November.

Is there something I'm missing here? There was an instance last week where the cdiff section failed, so it downloaded the latest daily.cvd automatically, but it's only had one instance of that in over a month.

Full output with -v attached.

cloudwatch.output.txt

Freshclam returning -9 error code

Since 12th October my update lambda logs the following in CloudWatch:

WARNING: Your ClamAV installation is OUTDATED!
WARNING: Local version: 0.101.2 Recommended version: 0.101.4
DON'T PANIC! Read https://www.clamav.net/documents/upgrading-clamav
main.cvd is up to date (version: 58, sigs: 4566249, f-level: 60, builder: sigmgr)
Downloading daily-25600.cdiff [100%]
Downloading daily-25601.cdiff [100%]
Downloading daily-25602.cdiff [100%]
Downloading daily-25603.cdiff [100%]
Downloading daily-25604.cdiff [100%]
Downloading daily-25605.cdiff [100%]
Downloading daily-25606.cdiff [100%]
Downloading daily-25607.cdiff [100%]
Downloading daily-25608.cdiff [100%]
Downloading daily-25609.cdiff [100%]

Unexpected exit code from freshclam: -9.
Not uploading main.cvd because md5 on remote matches local.
Not uploading daily.cld because md5 on remote matches local.
Not uploading bytecode.cvd because md5 on remote matches local.

I've looked online for freshclam error code -9 and didn't find anything. The -9 only reminds me of SIGKILL.
The log is provided by the update_defs_from_freshclam function in clamav.py.
I am running the update lambda code from this repository from around February 2019, so I don't have the latest commits you've been merging these past months.

Would you know what this is and how to fix it?
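One general point that may help when reading such logs: Python's subprocess module reports a negative return code -N when the child process was terminated by signal N (on POSIX), so -9 is consistent with the SIGKILL guess above. The helper below is a hypothetical sketch for decoding such codes; it does not say why the process was killed.

```python
import signal


def describe_exit_code(returncode):
    """Explain a subprocess return code, decoding negative values as signals."""
    if returncode is None:
        return "process still running"
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return "killed by signal {} ({})".format(-returncode, name)
    return "exited with status {}".format(returncode)


print(describe_exit_code(-9))
```

Logging this description next to the raw code makes it easier to tell a freshclam error status apart from the process being killed from outside.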

Send only message to SNS if av-status = INFECTED

Hi, is it possible to send a message to SNS only if the av-status is INFECTED? Publishing to SNS works fine with the original code (for both clean and infected results).

I've tried to add:
if result == AV_STATUS_CLEAN:
    return

Under def sns_scan_results in common.py but get this error:

An error occurred (403) when calling the HeadObject operation: Forbidden: ClientError
Traceback (most recent call last):
File "/var/task/scan.py", line 146, in lambda_handler
sns_scan_results(s3_object, scan_result)
File "/var/task/scan.py", line 120, in sns_scan_results
"version": s3_object.version_id,
File "/var/runtime/boto3/resources/factory.py", line 339, in property_loader

Version tags

Hi, are you planning to use version tags in the repo?

Thanks for your work.

"errorMessage": "cannot use a string pattern on a bytes-like object"

When running a test, I get the following error message:
{
"errorMessage": "cannot use a string pattern on a bytes-like object",
"errorType": "TypeError",
"stackTrace": [
" File "/var/task/update.py", line 44, in lambda_handler\n clamav.update_defs_from_freshclam(AV_DEFINITION_PATH, CLAMAVLIB_PATH)\n",
" File "/var/task/clamav.py", line 115, in update_defs_from_freshclam\n ":".join(current_library_search_path()),\n",
" File "/var/task/clamav.py", line 47, in current_library_search_path\n return rd_ld.findall(ld_verbose)\n"
]
}

I have a public bucket and the role has full rights to the s3 bucket
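For context, Python 3 raises this TypeError whenever a regular expression compiled from a str is applied to bytes, and subprocess output is bytes unless it is decoded. The sketch below shows the general fix of decoding first; the pattern and the sample output are hypothetical stand-ins, not the contents of clamav.py.

```python
import re

# Hypothetical pattern and sample output, for illustration only.
SEARCH_DIR_PATTERN = re.compile(r'SEARCH_DIR\("=([^"]+)"\)')
sample_output = b'SEARCH_DIR("=/usr/local/lib64"); SEARCH_DIR("=/usr/lib64");'


def find_search_dirs(raw_output):
    """Decode subprocess output to text before applying a str regex."""
    if isinstance(raw_output, bytes):
        raw_output = raw_output.decode("utf-8")
    return SEARCH_DIR_PATTERN.findall(raw_output)


print(find_search_dirs(sample_output))
```

Calling `SEARCH_DIR_PATTERN.findall(sample_output)` directly on the bytes value reproduces the error message quoted in this issue.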

No Object Tags?

Hello, first of all, great functions. I have been looking for a solution like this for a while.

I have tested everything as per the instructions and it seems to be working well. However, objects in my S3 bucket are not being tagged, and I am not sure why: no tags are applied and no errors are generated.

As for permissions, these are already basically s3:*, and I've also given full permissions on my test environment setup, but there are still no tags.

generating clamscan file and lib files

Hi, Just wanted to ask, how do you generate the clamav binaries which come as part of your package? Could you point us towards any documentation you use for this? We are looking at this solution as it looks powerful and want to build something similar (but not on Lambda due to the limits) but we are a little unclear on the best way to get the standalone One-Time Scanning clamscan file itself (and the various lib files) as generated by the clamav installation. (https://www.clamav.net/documents/scanning#one-time-scanning) These seem very useful standalone as we can see in your repo.

To add a little more detail, we tried running the instructions here:
https://www.clamav.net/documents/installing-clamav-on-unix-linux-macos-from-source

We installed ClamAV to a folder on an EC2 instance and found the standalone clamscan file here:
/home/ec2-user/clamav-0.101.4/clamscan/.libs

We know that lib files are needed and found these in:
/home/ec2-user/clamav-0.101.4/libclamav/.libs

The lib files I mean are (as can be seen in your package):
libclamav.so.7
libclamav.so.7.1.1
libclammspack.so.0
libclammspack.so.0.1.0
libjson.so.0
libjson.so.0.1.0
libjson-c.so.2
libjson-c.so.2.0.1

How do we know which of these we need? Do those files need to be updated when ClamAv updates?

I have somewhat reverse engineered the codebase here to see what is needed to run clamAv standalone (without installation) and I wonder if I am over complicating this. I wonder if the clamscan file and the relevant libs are provided by ClamAv somewhere and I am simply overcomplicating matters by hacking away at the installation to get to the relevant files!

Any guidance is very appreciated :)

Updating ClamAV

Hi Support,
How can I update ClamAV to the latest version?
Any support will be highly helpful
-Best,
Krishna

Definitions Update Lambda fails on Test

Getting the following error trying to run a test after I created the AV Definitions update Lambda. Any idea what I did wrong?

START RequestId: e5822bad-6030-11e8-aaee-23e96b59c6e3 Version: $LATEST
Unable to import module 'update': No module named update

Very Slow

The time between the file upload and the end of the scan process is more than 10 seconds. Is this normal?

Is there a faster way? Local scanning takes no more than 200 ms.

Not compatible with AmazonLinux2

I'll open a PR, but the latest version of amazon linux will require an alternative build script - especially as clamav is not available in the default repositories.

Support maximum file size limit

Thank you for the project!

It would be useful to support an optional maximum file size via environment variable. If a file is uploaded exceeding this size, it would be tagged with a new status (e.g. AV_STATUS_SKIPPED).

I'm happy to put up a PR for this but wanted to check if it is a desired feature first.

Additional Databases

Is there any way to utilize additional databases with this? (Such as what would come in the clamav-unofficial-dbs package)

Thank you. <3

A Lambda-based ClamAV thingy for S3 makes my life so much easier. Thanks for making this!

getting below error bucket-antivirus-function

START RequestId: 44a5bfce-0772-11e9-a343-cd0a75d4f9ef Version: $LATEST
Script starting at 2018/12/24 11:51:37 UTC

'Records': KeyError
Traceback (most recent call last):
File "/var/task/scan.py", line 140, in lambda_handler
s3_object = event_object(event)
File "/var/task/scan.py", line 29, in event_object
bucket = event['Records'][0]['s3']['bucket']['name']
KeyError: 'Records'

END RequestId: 44a5bfce-0772-11e9-a343-cd0a75d4f9ef
REPORT RequestId: 44a5bfce-0772-11e9-a343-cd0a75d4f9ef Duration: 1.95 ms Billed Duration: 100 ms Memory Size: 1152 MB Max Memory Used: 61 MB
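The traceback shows the handler reading event['Records'][0]['s3']['bucket']['name'], so a KeyError on 'Records' suggests the function was invoked with a payload that is not an S3 notification (for example, a generic test event). The sketch below shows a minimal event containing only the fields read in the tracebacks on this page; the function name, bucket name, and key are hypothetical.

```python
def event_bucket_and_key(event):
    """Extract bucket and key from an S3 notification, with a clearer error."""
    records = event.get("Records")
    if not records:
        raise ValueError("event has no 'Records'; expected an S3 notification payload")
    s3_info = records[0]["s3"]
    return s3_info["bucket"]["name"], s3_info["object"]["key"]


# Minimal test event with only the fields read above (values are placeholders).
test_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-test-bucket"}, "object": {"key": "eicar.txt"}}}
    ]
}
print(event_bucket_and_key(test_event))
```

A payload of this shape can be pasted as JSON into a Lambda console test event, with the bucket and key replaced by an object that actually exists.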

Problem with update function on first run

Problems addressed in #93

Namely, there was an import issue with fromtimestamp, which is a method of datetime.datetime not datetime. And for head_object to fall into the exception implementation, there should be s3:ListBucket permission for the definitions bucket

Starting freshclam with defs failing

Within the last week I have noticed the update failing with the following error:

Can't add daily.mdb to new daily.cld - please check if there is enough disk space available

It looks like we have exceeded the Lambda 500 MB limit here. I hadn't updated the package in a year or so, so I updated and now see the following errors:

Starting freshclam with defs in /tmp/clamav_defs.
freshclam output:
ClamAV update process started at Tue Oct 29 13:25:29 2019
main.cvd is up to date (version: 58, sigs: 4566249, f-level: 60, builder: sigmgr)
Downloading daily-25614.cdiff [100%]
ERROR: cdiff_cmd_close: Can't write to ./clamav-f9e93c076af9719bbedd7d8f184007e4.tmp
ERROR: cdiff_apply: Can't execute command CLOSE
ERROR: cdiff_apply: Error executing command at line 1578
ERROR: getpatch: Can't apply patch
WARNING: Incremental update failed, trying to download daily.cvd
Downloading daily.cvd [100%]tabase load killed by signal 9
ERROR: Failed to load new database
Unexpected exit code from freshclam: 55.
File does not exist: main.cld

Intermittent failures

On both the scan file function and the update operation I'm seeing Head operation failures. The update works the first time and then starts to return Head operation failures. I'm aware this seems to be an S3 permission issue, but I can't identify why this works the first time and then stops working.

An error occurred (403) when calling the HeadObject operation: Forbidden: ClientError
Traceback (most recent call last):
File "/var/task/scan.py", line 100, in lambda_handler
clamav.update_defs_from_s3(AV_DEFINITION_S3_BUCKET, AV_DEFINITION_S3_PREFIX)
File "/var/task/clamav.py", line 39, in update_defs_from_s3
s3.Bucket(bucket).download_file(s3_path, local_path)
File "/var/runtime/boto3/s3/inject.py", line 168, in bucket_download_file
ExtraArgs=ExtraArgs, Callback=Callback, Config=Config)
File "/var/runtime/boto3/s3/inject.py", line 130, in download_file
extra_args=ExtraArgs, callback=Callback)
File "/var/runtime/boto3/s3/transfer.py", line 307, in download_file
future.result()
File "/var/runtime/s3transfer/futures.py", line 73, in result
return self._coordinator.result()
File "/var/runtime/s3transfer/futures.py", line 233, in result
raise self._exception
ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

Can't open database files in scanner function

I have recently deployed the antivirus function to a new aws account, and I can't seem to scan files anymore:

Script starting at 2019/10/22 20:24:03 UTC

Downloading definition file main.cvd from s3://bucket-antivirus-definitions20191022185411604100000004/clamav_defs
Downloading definition file daily.cvd from s3://bucket-antivirus-definitions20191022185411604100000004/clamav_defs
Downloading definition file bytecode.cvd from s3://bucket-antivirus-definitions20191022185411604100000004/clamav_defs
Starting clamscan of /tmp/test-antivirus20191022185411604300000005/eicar.txt.
clamscan output:
LibClamAV Error: cli_loaddbdir(): No supported database files found in /tmp/clamav_defs
ERROR: Can't open file or directory

----------- SCAN SUMMARY -----------
Known viruses: 0
Engine version: 0.101.4
Scanned directories: 0
Scanned files: 0
Infected files: 0
Data scanned: 0.00 MB
Data read: 0.00 MB (ratio 0.00:1)
Time: 0.005 sec (0 m 0 s)
u'/tmp/test-antivirus20191022185411604300000005/eicar.txt': KeyError
Traceback (most recent call last):
File "/var/task/scan.py", line 228, in lambda_handler
scan_result, scan_signature = clamav.scan_file(file_path)
File "/var/task/clamav.py", line 199, in scan_file
signature = summary[path]
KeyError: u'/tmp/test-antivirus20191022185411604300000005/eicar.txt'

It seems that the Lambda function has some problem accessing files in the /tmp folder.
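The log above ends in a KeyError because the scan summary has no entry for the scanned path once clamscan fails to load its database. The following is a minimal sketch, not the project's code, of a lookup that fails with a clearer message. It assumes clamscan reports each file as `<path>: OK` or `<path>: <signature> FOUND`; the function names and the sample strings are hypothetical.

```python
def parse_clamscan_results(output):
    """Map each scanned path to its signature (None when the file is clean)."""
    results = {}
    for line in output.splitlines():
        # Split on the last ": " so that paths containing ": " stay intact.
        path, sep, verdict = line.rpartition(": ")
        if not sep:
            continue
        if verdict == "OK":
            results[path] = None
        elif verdict.endswith(" FOUND"):
            results[path] = verdict[: -len(" FOUND")]
    return results


def signature_for(output, path):
    """Return the signature for path, or raise if clamscan never reported on it."""
    results = parse_clamscan_results(output)
    if path not in results:
        raise RuntimeError(
            "clamscan reported no result for %s; check that the definitions loaded" % path
        )
    return results[path]


# Illustrative output strings (assumed format, not taken from a real run).
infected_output = "/tmp/eicar.txt: Example-Test-Signature FOUND\nInfected files: 1"
failed_output = "ERROR: Can't open file or directory\nKnown viruses: 0"

print(signature_for(infected_output, "/tmp/eicar.txt"))
```

With `failed_output`, `signature_for` raises a RuntimeError that points at the definitions instead of a bare KeyError.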

unquote_plus fails in 3.7

Running under Python 3.7

scan.py
[ERROR] AttributeError: module 'urllib' has no attribute 'unquote_plus'
Traceback (most recent call last):
File "/var/task/scan.py", line 180, in lambda_handler
s3_object = event_object(event)
File "/var/task/scan.py", line 33, in event_object
key = urllib.unquote_plus(event["Records"][0]["s3"]["object"]["key"].encode("utf8"))

I changed this line in my local copy to:
key = urllib.parse.unquote_plus(event["Records"][0]["s3"]["object"]["key"])
and the routine now runs correctly.
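A minimal Python 3 sketch of the same decoding step, for illustration. In Python 3, `unquote_plus` lives in `urllib.parse`, which must be imported explicitly, and the key is already a `str`, so no `.encode("utf8")` is needed. The helper name `event_object_key` and the sample event are hypothetical.

```python
import urllib.parse


def event_object_key(event):
    """Return the URL-decoded object key from the first record of an S3 event."""
    raw_key = event["Records"][0]["s3"]["object"]["key"]
    # S3 event notifications URL-encode the key: "+" stands for a space
    # and "%28" / "%29" stand for parentheses.
    return urllib.parse.unquote_plus(raw_key)


sample_event = {
    "Records": [{"s3": {"object": {"key": "uploads/my+report%281%29.pdf"}}}]
}
print(event_object_key(sample_event))  # uploads/my report(1).pdf
```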

Project has no tests

This project currently has no tests, and with limited internal bandwidth it is difficult to keep up with requests and PRs, largely because of that. With a set of tests around the project, we could do a better job of keeping up with demand.

AWS Lambda Layers

Hi!
Wouldn't it be interesting to use AWS Lambda Layers to hold the required binaries and libraries for both the bucket-antivirus-update and bucket-antivirus-function functions?
We could move most of the function's size (bins/libs) into the layer, which would make it much easier to update and test using the Lambda console.

Tagging broke Cloudberry Explorer copy/move operation

We implemented this a few days ago and it worked almost as expected. Uploaded files are tagged properly, but we found an issue that we can't seem to solve. In our firm, we use Cloudberry Explorer (CE) to move files from

our_bucket_name/others/ (I'll call this A)

to

our_bucket_name/others/finished/ (I'll name this B)

Before we implemented the Lambda function, our staff were able to drag processed files from location A to location B. Since we added the virus scanning feature, the move operation results in an "Access Denied" error. We ruled out a permission issue, because testing from Node with the AWS SDK (aws.copyObject) worked without errors. Looking at scan.py, the only operation in which the scanner modifies the S3 object appears to be set_av_tags.

We also tried upgrading Cloudberry Explorer to the newest available version, but the error persists. I understand that the issue might lie with CE, but can someone shed some light on why the scanner's tagging operation would break CE? Removing virus scanning resolved the "Access Denied" error.

Build Issue

I am unable to get the build to work. I get the error below when running make. Any help?

image

Current script doesn't work for amazon latest

Edit: Another update after testing.

I have been trying to build this and ran into several issues with the latest image I pulled from Docker while following the instructions. I had to make the following changes just to get it to compile the zip. I am not sure whether it even works; I'll update after testing tomorrow. Here are the changes:

yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm -y
yum update -y
yum install yum-utils -y
yum install -y cpio python2-pip zip
pip install --no-cache-dir virtualenv
pip install --upgrade pip
virtualenv env
. env/bin/activate
pip install --no-cache-dir -r requirements.txt


pushd /tmp
yumdownloader -x \*i686 --archlist=x86_64 clamav clamav-lib clamav-update json-c pcre2
rpm2cpio pcre2*.rpm | cpio -idmv
rpm2cpio json-c*.rpm | cpio -idmv
rpm2cpio clamav-0*.rpm | cpio -idmv
rpm2cpio clamav-lib*.rpm | cpio -idmv
rpm2cpio clamav-update*.rpm | cpio -idmv
popd
mkdir -p bin
cp /tmp/usr/bin/clamscan /tmp/usr/bin/freshclam /tmp/usr/lib64/* bin/.
echo "DatabaseMirror database.clamav.net" > bin/freshclam.conf

mkdir -p build
zip -r9 $lambda_output_file *.py bin
cd env/lib/python2.7/site-packages
zip -r9 $lambda_output_file *

Short version: added a repo that includes ClamAV, changed python27-pip to python2-pip, installed yum-utils to get yumdownloader, and included json-c and pcre2.

Error when testing with Eicar file

I am getting the error below when I run a test on an infected file, using the EICAR antivirus test file.
Any help would be much appreciated.
Error trace:
An error occurred (403) when calling the HeadObject operation: Forbidden: ClientError
Traceback (most recent call last):
File "/var/task/scan.py", line 160, in lambda_handler
sns_scan_results(s3_object, scan_result)
File "/var/task/scan.py", line 128, in sns_scan_results
"version": s3_object.version_id,
File "/var/runtime/boto3/resources/factory.py", line 339, in property_loader
self.load()
File "/var/runtime/boto3/resources/factory.py", line 505, in do_action
response = action(self, *args, **kwargs)
File "/var/runtime/boto3/resources/action.py", line 83, in call
response = getattr(parent.meta.client, operation_name)(**params)
File "/var/runtime/botocore/client.py", line 314, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/var/runtime/botocore/client.py", line 612, in _make_api_call
raise error_class(parsed_response, operation_name)
ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

Invalid parameter: TargetArn: InvalidParameterException for SNS status

I have created an SNS topic and set AV_STATUS_SNS_ARN to the ARN of the topic. When the bucket-antivirus-function Lambda function runs, it fails with an InvalidParameterException, seemingly caused by the TargetArn value.

On inspecting the code, I noticed that AV_STATUS_SNS_ARN is passed to sns_client.publish as the TargetArn value (rather than TopicArn). Could this be the problem? And if so:

  1. What should the AV_STATUS_SNS_ARN be set to?
  2. How can I use an SNS topic (so I can manage subscriptions to the notifications)?

An error occurred (InvalidParameter) when calling the Publish operation: Invalid parameter: TargetArn: InvalidParameterException
Traceback (most recent call last):
File "/var/task/scan.py", line 256, in lambda_handler
result_time,
File "/var/task/scan.py", line 195, in sns_scan_results
"StringValue": scan_signature,
File "/var/runtime/botocore/client.py", line 357, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/var/runtime/botocore/client.py", line 661, in _make_api_call
raise error_class(parsed_response, operation_name)
InvalidParameterException: An error occurred (InvalidParameter) when calling the Publish operation: Invalid parameter: TargetArn
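For reference, the boto3 SNS client's `publish` call accepts either `TopicArn` or `TargetArn`. The following is a minimal sketch of publishing with `TopicArn`, under the assumption that AV_STATUS_SNS_ARN holds a topic ARN. It does not confirm that this change resolves the error above. The helper name, the example ARN, and the `RecordingClient` stand-in (used so that the sketch runs without AWS credentials) are all hypothetical.

```python
import json


def publish_scan_result(sns_client, topic_arn, result):
    """Publish a scan result to an SNS topic, addressing it by TopicArn."""
    return sns_client.publish(
        TopicArn=topic_arn,
        Message=json.dumps(result),
    )


class RecordingClient:
    """Stand-in for boto3.client("sns") that records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def publish(self, **kwargs):
        self.calls.append(kwargs)
        return {"MessageId": "example"}


client = RecordingClient()
publish_scan_result(
    client,
    "arn:aws:sns:us-east-1:123456789012:av-scan-results",  # hypothetical ARN
    {"status": "CLEAN"},
)
print(client.calls[0]["TopicArn"])
```

In a real function, `sns_client` would be `boto3.client("sns")`, and subscriptions would be managed on the topic itself.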

Updating AV definitions timing out

The script makes it to this line before timing out, I assume on the next line, where it tries to start the subprocess.

This only started happening with one of the recent commits. I couldn't tell you which one, but it was around the commits that fixed the fromtimestamp issue.

EDIT: This happens regardless of which commit I'm running.

freshclam runs out of disk

/tmp for Lambda functions is limited to 512MB. When I run freshclam in Lambda I get an error:

Downloading daily-25044.cdiff [100%]
Downloading daily-25045.cdiff [100%]
Downloading daily-25046.cdiff [100%]
ERROR: buildcld: Can't add daily.hsb to new daily.cld - please check if there is enough disk space available
ERROR: Can't create local database
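A small sketch of how free space in the temporary directory could be logged before running freshclam, to confirm whether disk space is the cause. It uses only the standard library. The helper names and the 300 MB threshold are hypothetical, and `tempfile.gettempdir()` is assumed to resolve to /tmp inside Lambda.

```python
import shutil
import tempfile


def free_megabytes(path):
    """Return the free space, in MB, on the filesystem that holds path."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 * 1024)


def has_room_for(required_mb, path):
    """Return True when at least required_mb megabytes are free at path."""
    return free_megabytes(path) >= required_mb


tmp_dir = tempfile.gettempdir()
print("%.0f MB free in %s" % (free_megabytes(tmp_dir), tmp_dir))

# 300 is an illustrative threshold, not a measured freshclam requirement.
if not has_room_for(300, tmp_dir):
    print("Low disk space: freshclam may fail while rebuilding the database")
```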

Scanning with AV_UPDATE_METADATA overwrites Content-Disposition

Hi, thanks for open sourcing this code, it helped us a lot. There is one bug:
during upload we set Content-Disposition, which is later overwritten at scan.py:75.

Our quick fix was to change set_av_metadata to:

def set_av_metadata(s3_object, result):
    content_type = s3_object.content_type
    content_disposition = s3_object.content_disposition
    metadata = s3_object.metadata
    metadata[AV_STATUS_METADATA] = result
    metadata[AV_TIMESTAMP_METADATA] = datetime.utcnow().strftime("%Y/%m/%d %H:%M:%S UTC")
    s3_object.copy(
        {
            'Bucket': s3_object.bucket_name,
            'Key': s3_object.key
        },
        ExtraArgs={
            "ContentType": content_type,
            "ContentDisposition": content_disposition,
            "Metadata": metadata,
            "MetadataDirective": "REPLACE"
        }
    )

This is enough for us, as Content-Disposition is the only content-related header we set.
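One caveat with the fix above: if an object was uploaded without Content-Disposition, the attribute is None, and passing None in ExtraArgs is likely to be rejected by boto3's parameter validation. The following is a hedged sketch of building the ExtraArgs dictionary so that only headers actually present on the object are carried over. The helper name is hypothetical, and the set of headers shown is an assumption about which ones a caller may want to preserve.

```python
def build_copy_extra_args(metadata, content_type=None, content_disposition=None,
                          content_encoding=None, cache_control=None):
    """Build ExtraArgs for an in-place copy that replaces the object's metadata."""
    extra_args = {"Metadata": metadata, "MetadataDirective": "REPLACE"}
    optional_headers = {
        "ContentType": content_type,
        "ContentDisposition": content_disposition,
        "ContentEncoding": content_encoding,
        "CacheControl": cache_control,
    }
    # Carry over only the headers that were set on the original object.
    for name, value in optional_headers.items():
        if value is not None:
            extra_args[name] = value
    return extra_args


example_args = build_copy_extra_args(
    {"av-status": "CLEAN"},
    content_type="application/pdf",
    content_disposition='attachment; filename="report.pdf"',
)
print(sorted(example_args))
```

The result could then be passed as `ExtraArgs=` to `s3_object.copy(...)` in place of the literal dictionary.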

ClamAV Lambda hangs on scan

Hi,

I have pulled the latest version using the version 1 tag. The latest tag does not work.

ClamAV was working fine a month ago. I built the latest code and the build is fine, but once deployed, it simply hangs while scanning. The Lambda function times out after 5 minutes.

We are unable to use it at the moment. It gives no clue as to what is going on.

START RequestId: b80ae18a-8b9c-11e8-b25f-0506706b1a96 Version: $LATEST
Script starting at 2018/07/19 21:43:06 UTC

Attempting to create directiory /tmp/dv1-rh-virus-defs.

Attempting to create directiory /tmp/clamav_defs.

Downloading definition file main.cvd from s3://dv1-rh-virus-defs/clamav_defs
Downloading definition file daily.cvd from s3://dv1-rh-virus-defs/clamav_defs
Downloading definition file bytecode.cvd from s3://dv1-rh-virus-defs/clamav_defs
Starting clamscan of /tmp/dv1-rh-virus-defs/024707E3-0B86-4265-8FEE-BA513DA0B9FA_ROBERTHALF_170822161936000.
END RequestId: b80ae18a-8b9c-11e8-b25f-0506706b1a96
REPORT RequestId: b80ae18a-8b9c-11e8-b25f-0506706b1a96 Duration: 300096.32 ms Billed Duration: 300000 ms Memory Size: 512 MB Max Memory Used: 513 MB
2018-07-19T21:48:06.222Z b80ae18a-8b9c-11e8-b25f-0506706b1a96 Task timed out after 300.10 seconds

Thanks
Jaspal Sandhu

Fails to build

Checked out on Fedora 27 and ran make for the first time:

make
rm -rf compile/lambda.zip
docker run --rm -ti \
	-v /home/jfurmank/src/tmp/bucket-antivirus-function:/opt/app \
	amazonlinux:latest \
	/bin/bash -c "cd /opt/app && ./build_lambda.sh"
/bin/bash: ./build_lambda.sh: Permission denied
make: *** [Makefile:34: archive] Error 126
