Comments (24)

christophgysin commented on August 26, 2024

Why not have the client upload the file directly to S3? No need to pay for Lambda execution time just to forward a file to S3.

from examples.

Keksike commented on August 26, 2024

Hey,

My idea was that the Lambda function could do something like manipulate the file, or use the file's data for something.

This could also be done with an S3 event trigger (so when a file gets uploaded to the S3 bucket, the Lambda gets triggered with the uploaded file in the event), but in some cases it would be handier to upload the file through API Gateway and the Lambda function.

christophgysin commented on August 26, 2024

For uploading files, the best way would be to return a pre-signed URL, then have the client upload the file directly to S3. Otherwise you'll have to implement uploading the file in chunks.

princeinexile commented on August 26, 2024

@Keksike thank you, but I want to do it with only Lambda and API Gateway (S3 set up in CloudFormation), but it is showing HTML.
Can you help me with how to link the Lambda function and API Gateway?

waltermvp commented on August 26, 2024

@Keksike @christophgysin also, should this issue be given the question label?

Keksike commented on August 26, 2024

@princeinexile It seems to be possible, however I never finished my implementation.

You might want to check this blog post, which has quite simple instructions on how to do it with Zappa:

http://blog.stratospark.com/secure-serverless-file-uploads-with-aws-lambda-s3-zappa.html

7hibault commented on August 26, 2024

I'm quite new to serverless myself, but isn't the most widely seen approach more costly than what OP asked about, for a file upload plus processing in a Lambda?

What we usually see is sending the file to S3 directly, or asking a Lambda for a signed URL and then uploading to S3 (like in https://www.netlify.com/blog/2016/11/17/serverless-file-uploads/).

However, if the file needs to be processed, that means we fetch the file from S3 when we could have accessed it directly in the Lambda (and then stored it in S3 if needed), so we pay for an access we don't really need.

Am I wrong in thinking that the approach OP asked about (uploading in chunks, then saving to S3) would be more cost-efficient than uploading to S3 when some processing is involved?

christophgysin commented on August 26, 2024

@vinyoliver As mentioned before, just store the state of the upload in your database.

You could use Lambda, S3 triggers and DynamoDB TTL to implement a flow like:

  • client calls API to get upload URL
  • backend creates database record with state: initiated and ttl: 3600
  • backend creates signed URL and returns it to the client
  • client receives URL, and starts uploading file directly to S3
  • when upload is complete, S3 triggers lambda
  • lambda updates database record: state: complete (remove ttl field)

All records in the DB with state: complete are available in S3. Records with state: initiated are either uploading (and will turn to state: complete), or abandoned, and will be removed automatically when the TTL expires.

waltermvp commented on August 26, 2024

This worked for me with runtime: nodejs6.10 and the dependencies installed. Let me know if you have any questions.

```javascript
"use strict";

const uuid = require("uuid");
const dynamodb = require("./dynamodb");
const AWS = require("aws-sdk");
const s3 = new AWS.S3();
const shortid = require("shortid");

module.exports.create = (event, context, callback) => {
  const timestamp = new Date().getTime();
  const data = JSON.parse(event.body);

  if (typeof data.title !== "string") {
    console.error("Validation Failed");
    callback(null, {
      statusCode: 400,
      headers: { "Content-Type": "text/plain" },
      body: "Couldn't create the todo item due to missing title."
    });
    return;
  }

  if (typeof data.subtitle !== "string") {
    console.error("Validation Failed");
    callback(null, {
      statusCode: 400,
      headers: { "Content-Type": "text/plain" },
      body: "Couldn't create the todo item due to missing subtitle."
    });
    return;
  }

  if (typeof data.description !== "string") {
    console.error("Validation Failed");
    callback(null, {
      statusCode: 400,
      headers: { "Content-Type": "text/plain" },
      body: "Couldn't create the todo item due to missing description."
    });
    return;
  }

  if (typeof data.sectionKey !== "string") {
    console.error("Validation Failed");
    callback(null, {
      statusCode: 400,
      headers: { "Content-Type": "text/plain" },
      body: "Couldn't create the todo item due to missing section key."
    });
    return;
  }

  if (typeof data.sortIndex !== "number") {
    console.error("Validation Failed");
    callback(null, {
      statusCode: 400,
      headers: { "Content-Type": "text/plain" },
      body: "Couldn't create the todo item due to missing sort index."
    });
    return;
  }

  if (typeof data.image !== "string") {
    console.error("Validation Failed");
    callback(null, {
      statusCode: 400,
      headers: { "Content-Type": "text/plain" },
      body: "Couldn't create the todo item due to missing image."
    });
    return;
  }

  // Extract the MIME type from the data URI, e.g. "data:image/png;base64,..."
  let result = null;
  const mime = data.image.match(/data:([a-zA-Z0-9]+\/[a-zA-Z0-9-.+]+).*,.*/);
  if (mime && mime.length) {
    result = mime[1];
  }

  if (result !== "image/png" && result !== "image/jpeg") {
    callback(null, {
      statusCode: 400,
      body: JSON.stringify({
        message:
          "Must have a valid png or jpeg image value, encoded as base64String. Instead got " +
          result
      }),
      headers: {
        "x-custom-header": "My Header Value"
      }
    });
    return;
  }

  const imageType = result === "image/png" ? "png" : "jpeg";

  // Strip the data URI prefix and decode the base64 payload
  const buffer = Buffer.from(
    data.image.replace(/^data:image\/\w+;base64,/, ""),
    "base64"
  );
  const imagePrefix = `slide-images/${shortid.generate()}.${imageType}`;
  const s3Params = {
    Bucket: process.env.BUCKET,
    Key: imagePrefix,
    Body: buffer,
    ACL: "public-read",
    ContentEncoding: "base64",
    ContentType: result
  };

  s3.putObject(s3Params).promise()
    .then(() => {
      const imageName =
        "http://" + process.env.BUCKET + ".s3.amazonaws.com/" + imagePrefix;

      const params = {
        TableName: process.env.DYNAMODB_TABLE,
        Item: {
          id: uuid.v1(),
          title: data.title,
          subtitle: data.subtitle,
          description: data.description,
          createdAt: timestamp,
          updatedAt: timestamp,
          sectionKey: data.sectionKey,
          sortIndex: data.sortIndex,
          image: imageName
        }
      };

      // write the todo to the database
      dynamodb.put(params, (error) => {
        // handle potential errors
        if (error) {
          console.error(error);
          callback(new Error("Couldn't create the todo item."));
          return;
        }

        callback(null, {
          statusCode: 200,
          body: JSON.stringify(params.Item)
        });
      });
    })
    .catch((err) => {
      console.log(err);
      callback(null, {
        statusCode: 500,
        body: JSON.stringify({ message: "Unable to load image to S3" })
      });
    });
};
```

waltermvp commented on August 26, 2024

@christophgysin @Keksike is this the recommended pattern? I'm pretty new to building RESTful APIs (serverless is awesome), so I'm not exactly sure whether I should accept a base64-encoded string via the create method, or first create an object via one RESTful call and then put the base64-encoded string (the image) in a second call. Any examples would be greatly appreciated :)

I know there are examples for S3 upload and post-processing, but there is no example used with a RESTful/DynamoDB setup.

princeinexile commented on August 26, 2024

@waltermvp @Keksike @christophgysin @rupakg did anybody try to create this? I'm also working on it.

princeinexile commented on August 26, 2024

@Keksike hi, I need help creating a CloudFormation JSON file to upload a file directly to S3 using API Gateway and a Lambda function. Is it possible to do this?

Keksike commented on August 26, 2024

@princeinexile

Can you help me with how to link lambda function and api gateway

This question is not really related to this thread.

ChunAllen commented on August 26, 2024

I'm also using API Gateway and Lambda to upload an image to S3. I posted a question on StackOverflow

aemc commented on August 26, 2024

I'm able to upload a file through a presigned URL but for some reason the file loses its extension. Did anyone come across this?

Keksike commented on August 26, 2024

@aemc I think you should be able to set the file name when creating the presigned URL. There you can include the extension, if you wish to.

christophgysin commented on August 26, 2024

Uploading a file can be slow. If you upload through Lambda, you have to pay for Lambda compute time while you are essentially just waiting for the data to trickle in over the network. If you get a presigned URL, you only pay for the few ms it takes to generate the URL, and the time it takes to upload is free. Once the file is complete, you can then read it from Lambda, which is probably a lot faster, saving you Lambda execution cost.

dbartholomae commented on August 26, 2024

Has anyone who worked with the signed-URL approach found a way to bundle the upload in a transaction? We are currently handling files up to 50 MB, so uploading through Lambda (or even API Gateway) is not an option due to the current limits. Whenever a file is uploaded, we have to make a database entry at the same time. If the database entry is made in a separate request (e.g. when creating the signed upload link), we run into trouble: if the client calls the Lambda but then loses internet access and cannot finish the file upload, there is inconsistent state between the database and S3.
What is the serverless way to run transactions involving a database and S3?

christophgysin commented on August 26, 2024

You can create a database record when the signed URL is created, and then update it from a Lambda triggered by an S3 event when the object has been created. If you want to handle aborted uploads, you could trigger a Lambda from a CloudWatch schedule that handles (e.g. removes) records for files that have not been uploaded within the validity of the signed URL. Or, if using DynamoDB, you could set a TTL on the records for pending uploads.

dbartholomae commented on August 26, 2024

Hmm, this would still leave me with an inconsistent state. I could set a flag on the record that states whether the file is already confirmed or not. Sounds like monkey-patching a transaction system, though. If there is no better solution I will stay off serverless for these uploads for a while longer.

christophgysin commented on August 26, 2024

Yes, you could store the state of the upload (e.g. initiated/completed).
The serverless approach often requires you to think asynchronously. The advantage is that you don't have to pay compute time while you are just waiting for the client to upload data over a potentially slow connection.

vinyoliver commented on August 26, 2024

@dbartholomae have you figured this out? I'm facing a similar situation. I thought of creating the database record when creating the signed URL, but I'm not sure how I can handle my database state in case something goes wrong or the user just gives up uploading the file.

@christophgysin in cases like this, wouldn't it be better to handle the upload using a Lambda function?

vinyoliver commented on August 26, 2024

I've ended up doing something pretty close to what you described. What I did was:

  1. Client calls the API to get an upload URL
  2. Client uploads the file to the provided URL
  3. When the file is uploaded to S3, a Lambda that listens for this event inserts the data into my database

Thanks for your help @christophgysin

spawn-guy commented on August 26, 2024

Both approaches are valid.
With presigned S3 URLs you have to implement upload logic on both the backend and the frontend, and PUT requests cannot be redirected.
As an additional note to the aws-node-signed-uploads example: it is better to sign the content type and file size together with the filename, so that S3 checks those as well (in case an attacker wants to send some .exe file instead).

But receiving a file and processing it (even without S3 involved) is also a valid use case. Thanks @waltermvp, the code looks interesting. It looks like API Gateway is base64-encoding the octet streams.

This piece of code seems to do the job:

```javascript
let encodedImage = JSON.parse(event.body).user_avatar;
let decodedImage = Buffer.from(encodedImage, 'base64');
```
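API Gateway also sets `event.isBase64Encoded` when binary media types are configured, in which case the whole body is itself base64-encoded. A slightly more defensive sketch (the `decodeAvatar` helper name is hypothetical; `user_avatar` is the field from the snippet above):

```javascript
// Sketch only: decode a base64 image field from an API Gateway event,
// handling the case where the entire body is base64-encoded too.
function decodeAvatar(event) {
  const raw = event.isBase64Encoded
    ? Buffer.from(event.body, "base64").toString("utf8")
    : event.body;
  return Buffer.from(JSON.parse(raw).user_avatar, "base64");
}
```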
