humanmade / tachyon

Faster than light image resizing service that runs on AWS. Super simple to set up, highly available and very performant.

Home Page: https://engineering.hmn.md/projects/tachyon/

License: ISC License

JavaScript 19.22% Dockerfile 2.13% TypeScript 78.65%
wordpress image-processing aws-lambda aws-s3

tachyon's Introduction

Tachyon
A Human Made project. Maintained by @joehoyle.

Tachyon is a faster than light image resizing service that runs on AWS. Super simple to set up, highly available and very performant.

Setup

Tachyon comes in two parts: the server to serve images and the plugin to use it. To use Tachyon, you need to run at least one server, as well as the plugin on every site where you want to use it.

Installation on AWS Lambda

We require using Tachyon on AWS Lambda to offload image processing tasks in a serverless configuration. This ensures you don't need lots of hardware to handle thousands of image resize requests, and can scale essentially infinitely. One Tachyon stack is required per S3 bucket, so we recommend using a common region bucket for all sites, which then only requires a single Tachyon stack per region.

Tachyon requires the following Lambda Function spec:

  • Runtime: Node.js 18
  • Function URL activated
  • Env vars:
    • S3_BUCKET=my-bucket
    • S3_REGION=my-bucket-region
    • S3_ENDPOINT=http://my-custom-endpoint (optional)
    • S3_FORCE_PATH_STYLE=1 (optional)
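
As a rough illustration of how the environment variables above fit together, the sketch below reads them into a typed config object. The `TachyonEnvConfig` shape and the `readConfig` helper are hypothetical names for this example, not Tachyon's actual code; treating `S3_FORCE_PATH_STYLE=1` as a boolean flag is also an assumption.

```typescript
// Hypothetical config shape for illustration only.
interface TachyonEnvConfig {
  bucket: string;
  region: string;
  endpoint: string | undefined;
  forcePathStyle: boolean;
}

// Read the documented variables from an environment map (e.g. process.env).
// Throws if either required variable is missing or empty.
function readConfig(env: Record<string, string | undefined>): TachyonEnvConfig {
  const bucket = env["S3_BUCKET"];
  const region = env["S3_REGION"];
  if (!bucket || !region) {
    throw new Error("S3_BUCKET and S3_REGION are required");
  }
  return {
    bucket,
    region,
    // Optional: a custom S3-compatible endpoint.
    endpoint: env["S3_ENDPOINT"] || undefined,
    // Assumption: the flag is on only when the value is exactly "1".
    forcePathStyle: env["S3_FORCE_PATH_STYLE"] === "1",
  };
}
```

In a Lambda handler this would typically be called once with `process.env`.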

Take the lambda.zip from the latest release and upload it to your function.

Documentation

Credits

Created by Human Made for high volume and large-scale sites. We run Tachyon on sites with millions of monthly page views, and thousands of sites.

Written and maintained by Joe Hoyle.

Tachyon is inspired by Photon by Automattic. Tachyon is not an all-purpose image resizer; it works from a media library in Amazon S3, so it has a different use case to Photon.

Tachyon uses the Sharp Node.js library (used under the Apache License 2.0) for the resizing operations, which in turn uses the great libvips library.

Interested in joining in on the fun? Join us, and become human!

tachyon's People

Contributors

alsguimaraes, brenofabio, dependabot[bot], japh, jerico, joehoyle, kingkool68, kovshenin, mattheu, mikelittle, nathanielks, pandelisz, peterwilsoncc, rmccue, roborourke, shadyvb, sivanovhm, wisyhambolu

tachyon's Issues

Limit maximum size of images in resize parameter

Currently, it's possible to request images at arbitrary sizes. With sizes larger than the source image, Sharp will attempt to upscale the image, which is resource-intensive. With very large sizes, this can take a while and cause timeouts. It's also a waste of CPU time.

As a possible mitigation for this, we could clamp the resize values to the size of the image (e.g. resize_width = min( args.resize.width, image.width )).
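
A minimal sketch of the clamping idea, assuming the requested and source dimensions are both known before resizing. The `Size` type and `clampResize` function are hypothetical names for this example, not part of Tachyon.

```typescript
// Hypothetical shape for a width/height pair.
interface Size {
  width: number;
  height: number;
}

// Clamp each requested dimension to the source dimension so the resizer
// is never asked to upscale.
function clampResize(requested: Size, source: Size): Size {
  return {
    width: Math.min(requested.width, source.width),
    height: Math.min(requested.height, source.height),
  };
}
```

Note that clamping each dimension independently can change the requested aspect ratio; a real implementation might scale both dimensions by the same factor instead.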

Make memory/CPU allocation configurable

Currently we have a 256MB memory limit for Lambda invocations, and the CPU a function gets is directly proportional to its memory. In many cases I think this is too low. The more memory you give a function, the higher its "cost per second", but the more CPU it gets, so in many cases the invocation will be faster and therefore not cost much more.
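
To make the trade-off concrete, here is a rough sketch that assumes billing is proportional to memory multiplied by duration (GB-seconds) and ignores any per-request charge. `invocationCost` and its `pricePerGbSecond` parameter are placeholders for this example, not quoted AWS prices.

```typescript
// Cost of one invocation under a simple GB-second billing model.
// pricePerGbSecond is a placeholder parameter, not a real price.
function invocationCost(
  memoryMb: number,
  durationMs: number,
  pricePerGbSecond: number
): number {
  const gigabytes = memoryMb / 1024;
  const seconds = durationMs / 1000;
  return gigabytes * seconds * pricePerGbSecond;
}
```

Under this model, doubling the memory costs the same per invocation if the extra CPU halves the duration, for example 256MB for 1000ms versus 512MB for 500ms.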

Document how to enable X-Ray in non lambda envs

When running in Lambda, the X-Ray wrapper is used automatically if X-Ray is enabled for the container. You may still want to use X-Ray in other environments, so we need to document how this is done.

Cloud Formation Template Has Errors

The LambdaFunction key has two "Handler" properties. I think the line "Handler": "index.handler", can be deleted.

I ended up having to reconstruct my own to get everything up and running.

Handle paths with percent encoding

When Tachyon gets paths with percent encoding, it looks up the S3 bucket without converting encoded characters. The key with the percent encoding is not found in the bucket.

Tachyon should decode the path before looking up S3.
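
A minimal sketch of the decoding step, assuming the S3 key is simply the request path without its leading slash. `pathToS3Key` is a hypothetical helper name, not Tachyon's actual function.

```typescript
// Convert a request path into an S3 key: strip leading slashes, then
// decode percent-encoded characters. Returns null if the encoding is malformed.
function pathToS3Key(path: string): string | null {
  const trimmed = path.replace(/^\/+/, "");
  try {
    return decodeURIComponent(trimmed);
  } catch {
    // decodeURIComponent throws URIError on malformed sequences such as "%zz".
    return null;
  }
}
```

Returning null lets the caller respond with a client error rather than passing a bad key to S3.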

Lossless compression could be optimized for jpg files

I used https://yellowlab.tools to scan a web page with images served by Tachyon, and it recommended file size optimisations. These are pretty significant (a total of 1.6MB).

To test this assumption, I downloaded an image from the page, and optimised it with the ImageOptim app, using lossless compression with default settings. Note that there is an "insane" setting...
The results showed an extra gain of 28.3%
[Screenshots: 2019-06-05 at 09:26:04, 09:25:54 and 09:15:38]

Smart cropping options

Proposing 3 new values that can be set as the crop argument:

  • ?crop=attention: uses sharp's built in attention cropping strategy
  • ?crop=entropy: uses sharp's built in Shannon entropy cropping strategy
  • ?crop=smart: uses the smartcrop library - this might make a good default setting too.

How does the above sound @rmccue ?

Send CORS headers

We should send an Access-Control-Allow-Origin header by default, with the requesting domain as the value, or read the value from environment variables or a config file. There will likely be an extra step to whitelist the headers in CloudFront.

Right now there are issues where images loaded via JavaScript won't work properly if the tachyon service runs on a separate domain.
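
One way this could look, as a sketch only: build the CORS headers from the request's Origin and a configured list of allowed origins. The `corsHeaders` function and the `allowedOrigins` list (which might come from an environment variable or config file) are assumptions for this example.

```typescript
// Build CORS response headers for a request.
// allowedOrigins is a hypothetical configured list; "*" allows any origin.
function corsHeaders(
  requestOrigin: string | undefined,
  allowedOrigins: string[]
): Record<string, string> {
  if (allowedOrigins.indexOf("*") !== -1) {
    return { "Access-Control-Allow-Origin": "*" };
  }
  if (requestOrigin && allowedOrigins.indexOf(requestOrigin) !== -1) {
    // Echoing the origin back means caches need to vary on the Origin header.
    return { "Access-Control-Allow-Origin": requestOrigin, Vary: "Origin" };
  }
  // Unknown origin: send no CORS headers.
  return {};
}
```

Echoing the requesting origin interacts with CDN caching, which is why the sketch also sets `Vary: Origin`.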

What is tachyon's license?

What's the license for tachyon itself? I might need to extend it a little to fit it into our workflow and would like to know how much of the source I need to keep public :) Thanks for the awesome work!

(Obv I'll contribute back all I can!)

Support for WebP

This has been tricky because it's difficult to cache at the CDN level: you need to vary on the Accept header, which has a huge number of possible values that basically make the cache worthless.

I think we can actually do this with Lambda@Edge though, more to come!
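
A sketch of the normalisation an edge function might perform, assuming the goal is to collapse the Accept header into a single yes/no value (such as the X-WebP header mentioned elsewhere in these issues) so the cache only varies on two values. `acceptsWebp` is a hypothetical helper, and it deliberately ignores quality values such as `q=0`.

```typescript
// Return true if the Accept header lists image/webp as a media type.
// Parameters after ";" (for example q-values) are ignored in this sketch.
function acceptsWebp(acceptHeader: string | undefined): boolean {
  if (!acceptHeader) {
    return false;
  }
  const mediaTypes = acceptHeader
    .split(",")
    .map((part) => (part.split(";")[0] ?? "").trim().toLowerCase());
  return mediaTypes.indexOf("image/webp") !== -1;
}
```

The edge function could then set a header like `X-WebP: 1` only when this returns true, and the cache could vary on that header instead of on Accept.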

Update CloudFormation template for Node 6.10 end of life.

Node 6.10 will reach EOL on April 30, 2019.

Creating new functions using 6.10 will be disabled immediately, and updating existing functions will be disabled on AWS on May 30, 2019.

I can't find an online source but this is from their email:

The Node Foundation has previously published that node.js 6.x "Boron" will be declared End-of-Life (EOL) on April 2019 [1], and will stop receiving bug fixes, security updates, or performance improvements. Per the AWS Lambda runtime support policy [2], language runtimes that have reached their EOL are deprecated in AWS Lambda.

Invokes for functions configured to run on node.js 6.10 will continue to work normally, however the ability to create new Lambda functions configured to use the node.js 6.10 runtime will be disabled on April 30 2019. Code updates to existing functions using node.js 6.10 will be disabled 30 days later on May 30 2019

The run time support policy is quite vague about when invocations will be disabled for current functions:

After a runtime is deprecated, Lambda may retire it completely at any time by disabling invocation.

Rename to tachyon

We don't use the main tachyon repo any more, so we should rename that to tachyon-php and this one to tachyon to simplify the URLs.

Add support for `zoom` param

The zoom query parameter is part of the Photon API and makes it easy to get bigger images for high DPI displays. zoom=2 for example gives an image 2x bigger than the result of fit, resize, w or crop.

In terms of the implementation we'd factor this in to any calculated pixel dimensions (except those used for a crop) and multiply them by zoom.

Might not be a good idea, but we could potentially reduce the quality by a factor of the zoom number to keep performance up. Images shown at half their actual size don't need to be as high quality.
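
The dimension scaling described above could be sketched as follows. `ZoomArgs` and `applyZoom` are hypothetical names for this example; it handles only `w` and `h`, leaves crop values alone, and does not implement the quality reduction idea.

```typescript
// Hypothetical subset of the query args relevant to zoom.
interface ZoomArgs {
  w?: number;
  h?: number;
  zoom?: number;
}

// Multiply any provided pixel dimensions by the zoom factor.
// A missing or non-positive zoom is treated as 1 (no change).
function applyZoom(args: ZoomArgs): { w: number | undefined; h: number | undefined } {
  const zoom = args.zoom !== undefined && args.zoom > 0 ? args.zoom : 1;
  return {
    w: args.w === undefined ? undefined : Math.round(args.w * zoom),
    h: args.h === undefined ? undefined : Math.round(args.h * zoom),
  };
}
```

For example, `w=300&zoom=2` would resolve to a 600 pixel wide image.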

CloudFormation template - The specified key does not exist - NoSuchKey

Hi,

Has anyone experienced the above issue with the latest CloudFormation template and the latest release bundle? To me this indicates the object doesn't exist, however, it does and numerous tests can confirm the object is present and publicly available. If I call the gateway directly and pass the arguments the lambda throws no errors, however, via CloudFront - it's almost like it's transforming the requested url and breaking the request?

{
"errorMessage": "The specified key does not exist.",
"errorType": "NoSuchKey",
"stackTrace": [
"Request.extractError (/var/runtime/node_modules/aws-sdk/lib/services/s3.js:585:35)",
"Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:106:20)",
"Request.emit (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:78:10)",
"Request.emit (/var/runtime/node_modules/aws-sdk/lib/request.js:683:14)",
"Request.transition (/var/runtime/node_modules/aws-sdk/lib/request.js:22:10)",
"AcceptorStateMachine.runTo (/var/runtime/node_modules/aws-sdk/lib/state_machine.js:14:12)",
"/var/runtime/node_modules/aws-sdk/lib/state_machine.js:26:10",
"Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:38:9)",
"Request.<anonymous> (/var/runtime/node_modules/aws-sdk/lib/request.js:685:12)",
"Request.callListeners (/var/runtime/node_modules/aws-sdk/lib/sequential_executor.js:116:18)"
]
}

Any guidance would be greatly appreciated.

Thanks

Large PNG image error

It seems like a large image (PNG file > 2000x2000 px) is causing an error: {"message": "Internal server error"}.

I’ve tried increasing the Lambda memory allocation but it hasn’t worked. Any other ideas how I might troubleshoot the issue?

Trouble with `Invoke Error` and private S3 bucket

Hi,

I've set up private S3 buckets for the Tachyon source and storage (uploads), provisioned an EC2 instance with Node 10 to handle the Lambda execution environment requirements, installed the modules, zipped and sent to S3, and built out the CloudFormation stack without any apparent issues.

However, when testing the service, it 500s and I can't figure out where I went wrong:

2020-02-20T21:30:43.582Z	b45e3b76-4f85-4233-8eeb-b67d9eb16f6f	ERROR	Invoke Error	
{
    "errorType": "TypeError",
    "errorMessage": "Cannot read property 'params' of undefined",
    "stack": [
        "TypeError: Cannot read property 'params' of undefined",
        "    at makeRequest (/var/runtime/node_modules/aws-sdk/lib/service.js:188:21)",
        "    at Object.module.exports.s3 (/var/task/index.js:39:9)",
        "    at Runtime.exports.handler (/var/task/lambda-handler.js:13:17)",
        "    at Runtime.handleOnce (/var/runtime/Runtime.js:66:25)"
    ]
}

I've got PrivateS3Bucket set to true, and all public access blocked for the referenced UploadsS3Bucket. Looking at the IAM Role that the stack built out for the Lambda, it almost matches the documented policy here, with the exception that instead of the Resource ARN being the documented arn:aws:s3:::*, the stack built out arn:aws:s3:::tachyon-storage/*, which should be fine since I only have that one bucket -- but I updated the policy just in case, with no effect (as expected).

Does the above error ring any bells, or does anyone have ideas on what I could try next?

Thanks!

Make API Gateway timeout match Lambda Timeout

Currently the API Gateway timeout for the Lambda response is 29s, but the Lambda function timeout is 60s. If a request takes more than 29s, the client never gets the image. Also, if an image takes more than 29s but less than 60s, the Lambda logs won't show any errors.

Permit whitelisting specific hosts

As a Tachyon user, it would be helpful if I could whitelist specific image domains, to only permit resizing / hosting images from those domains.
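
A minimal sketch of such a check, assuming the image source arrives as a full URL and the allowed hosts are configured as a list of exact hostnames. `isHostAllowed` and `allowedHosts` are hypothetical names for this example.

```typescript
// Return true only if the URL parses and its hostname exactly matches
// one of the allowed hosts (case-insensitive). No wildcard support.
function isHostAllowed(imageUrl: string, allowedHosts: string[]): boolean {
  let host: string;
  try {
    host = new URL(imageUrl).hostname.toLowerCase();
  } catch {
    // Not a valid absolute URL.
    return false;
  }
  return allowedHosts.some((allowed) => allowed.toLowerCase() === host);
}
```

Exact matching is the conservative choice here; supporting subdomain wildcards would need extra care to avoid matching look-alike domains.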

Support for colour grading using LUTs

It's not a super common use-case but when images from multiple sources are brought together they can make a design look somewhat "off".

It's possible for Tachyon to support colour grading using .cube files typically used with programs like Lightroom to apply transformations across multiple images to normalise the colourspace and luminance etc.

After a bit of research we could potentially use the following libraries:

The trickier part of this is how to get or specify the .cube file to be used. The most flexible way would be to fetch it from a URL specified in a query param and cache it strongly but initial image loads could be super slow with this method.

We'd also want to ensure that the .cube file was coming from the same domain as the original image request. Shouldn't be too hard but would need to check very carefully how that's handled when using nginx rewrites to pipe requests to a tachyon service within a network.

It would be useful to have ability to embed on a background

I've found that we often want an image to result in an exact size even if the resized image would be less than the requested size. Sharp has an embed() function that works similarly to max but fills in the background out to the requested dimensions.

I've created a PR that adds this

#41

Look into API Gateway caching

Currently the cache-hit rate doesn't seem great on CloudFront because the cache lives on each edge node. It might be possible to make use of API Gateway's cache offering, as I think this is central to the Lambda containers and would get a much higher hit rate.

New TTL Issues with Cloudformation

Using the latest cloudformation-template.json the creation process would cause an error with a message like this:
MinTTL, MaxTTL and DefaultTTL should follow order MinTTL less than or equal to DefaultTTL less than or equal to MaxTTL (Service: AmazonCloudFront; Status Code: 400; Error Code: InvalidTTLOrder; Request ID: xxx)

It looks like fairly recently the config was updated to address this, but still won't successfully build. Adjusting the values to be further apart from each other still triggered this error. To complete the build I had to remove all TTL values from the .json file and rely on the defaults, including the DefaultTTL under CloudFrontDistribution -> DistributionConfig -> CacheBehaviors. Thought you should know!

Support for crop location

Some plugins generate an asymmetrically centred cropped image. It'd be good to have this option eg a crop x & y param to indicate the point of interest on the image. Adding these values to the tachyon URLs can be done on plugin by plugin basis.

eg. tchyn.io/path/to/image.jpg?width=200&height=400&crop=1&cropxy=600,100
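
A sketch of how a `cropxy` value might be turned into a crop rectangle, assuming the value is a point of interest in source-image pixels and the crop window is centred on it, then shifted to stay inside the image. `CropBox` and `cropAroundPoint` are hypothetical names, and treating the crop size as source pixels is an assumption.

```typescript
// Hypothetical crop rectangle in source-image pixels.
interface CropBox {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Centre a cropW x cropH window on the "x,y" point, clamped to the image bounds.
// Returns null if cropxy is not two finite numbers.
function cropAroundPoint(
  srcW: number,
  srcH: number,
  cropW: number,
  cropH: number,
  cropxy: string
): CropBox | null {
  const parts = cropxy.split(",").map((p) => Number(p.trim()));
  if (parts.length !== 2 || !parts.every((n) => isFinite(n))) {
    return null;
  }
  const [x, y] = parts as [number, number];
  // The window can never be larger than the source image.
  const width = Math.min(cropW, srcW);
  const height = Math.min(cropH, srcH);
  const left = Math.min(Math.max(Math.round(x - width / 2), 0), srcW - width);
  const top = Math.min(Math.max(Math.round(y - height / 2), 0), srcH - height);
  return { left, top, width, height };
}
```

With the example URL's `cropxy=600,100` and a 200 by 400 window on a 1200 by 800 source, the window would be pushed down to the top edge because the point sits near the top of the image.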

Missing image results in 502 error

Currently the Lambda function returns an "Access denied" error for images that don't exist, and then the API Gateway response sends a 502 error.

Support for rotation

Now that Tachyon will be used outside of the HM sphere there are requirements from one client project where we'll need to provide a full thumbnail editor, this includes the cropping from #8 and also rotation.

It's really simple to add with minimal overhead too so I can't think of a reason not to support it.

Filesize optimizations

For reference here's some good test files:

https://us-east-1.tchyn.io/humanmade-production/uploads/2017/05/Humans-resized.jpg?w=500
https://us-east-1.tchyn.io/humanmade-production/uploads/2017/05/humans.png?w=500
https://us-east-1.tchyn.io/humanmade-production/uploads/2017/04/[email protected]?w=500

The JPG compression is very good, and I don't expect to improve there, however the PNG compression is not good for "simple" vector style pngs.

In Sharp we can disable a thing called adaptiveFiltering which makes the compression of vector style PNGs much better. This is essentially what something like Optipng does. However, the problem is it makes PNGs larger if they are more like photos (image 2 above). Sharp can't do any smart testing to only apply it if necessary.

We could potentially create 2 versions of the image and serve the smaller of the two, but that's less than ideal due to significantly more memory usage and CPU etc.

There's also a thing called pngquant which allows you to do lossy compression of PNG images. Right now we have no lossy compression of PNGs. It might be possible to install the pngquant NPM module and apply that, but I'm not sure if we want lossy PNGs in most cases.

Add face detection boost to improve smart cropping

The smartcrop.js library gets decent results a lot of the time but can be helped by providing areas of an image to 'boost'.

The examples on the website suggest a few libraries for face detection but they're all client side libraries that rely on the DOM & canvas.

We could use node-opencv to run it against the basic face cascade as an initial experiment.

Idea to route requests to cache file

Currently there is no disk cache for Tachyon files, only the CloudFront edge cache. By introducing an S3 cache and Edge function, we might be able to solve a couple problems in one:

  • Tachyon should cache its output to S3 when it generates a file. We can use S3 object lifecycle expiration for simple expiry.
  • A Lambda@Edge function checks for the S3 file; if it exists, the function adjusts the origin request so the file is served directly from S3.

I'm not 100% sure you can do this with Edge functions (adjust the origin), but I think you can. We already do it with an X-WebP header which changes the CloudFront behaviour.

By routing cache requests directly to S3, we can save on many Lambda invocations, and we can also get around the 5MB limit we currently have for image responses.

The uncached Lambda process will still be subject to a 5MB response limit, which I don't think we'd be able to get around though, but it would mean only a first request would fail, subsequent cached requests to S3 would be fine.

403 (Forbidden) https://dl.bintray.com/lovell/sharp/libvips-8.4.2-darwin-x64.tar.gz

Getting a failure on install on download of https://dl.bintray.com/lovell/sharp/libvips-8.4.2-darwin-x64.tar.gz with a 403 (Forbidden)

We are using commit 1b776d0 but does the same on master too.

> node-gyp rebuild

ERROR: Download of https://dl.bintray.com/lovell/sharp/libvips-8.4.2-darwin-x64.tar.gz failed: Response code 403 (Forbidden)
gyp: Call to 'node -e "require('./binding').download_vips()"' returned exit status 1 while in binding.gyp. while trying to load binding.gyp
gyp ERR! configure error
gyp ERR! stack Error: `gyp` failed with exit code: 1
gyp ERR! stack     at ChildProcess.onCpExit (/usr/local/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:305:16)
gyp ERR! stack     at emitTwo (events.js:106:13)
gyp ERR! stack     at ChildProcess.emit (events.js:191:7)
gyp ERR! stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:215:12)
gyp ERR! System Darwin 15.6.0
gyp ERR! command "/usr/local/bin/node" "/usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
gyp ERR! cwd /Users/steve.gibbings/WebstormProjects/tachyon/node_modules/sharp
gyp ERR! node -v v6.9.1
gyp ERR! node-gyp -v v3.4.0
gyp ERR! not ok
[email protected] /Users/steve.gibbings/WebstormProjects/tachyon
├─┬ [email protected]
│ └── [email protected]
├─┬ [email protected]
│ ├─┬ [email protected]
│ │ ├── [email protected]
│ │ ├── [email protected]
│ │ └── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├─┬ [email protected]
│ │ └── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ └─┬ [email protected]
│   └── [email protected]
├─┬ [email protected]
│ └── [email protected]
└── [email protected]

npm ERR! Darwin 15.6.0
npm ERR! argv "/usr/local/bin/node" "/usr/local/bin/npm" "i"
npm ERR! node v6.9.1
npm ERR! npm  v3.10.8
npm ERR! code ELIFECYCLE

npm ERR! [email protected] install: `node-gyp rebuild`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] install script 'node-gyp rebuild'.
npm ERR! Make sure you have the latest version of node.js and npm installed.
npm ERR! If you do, this is most likely a problem with the sharp package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR!     node-gyp rebuild
npm ERR! You can get information on how to open an issue for this project with:
npm ERR!     npm bugs sharp
npm ERR! Or if that isn't available, you can get their info via:
npm ERR!     npm owner ls sharp
npm ERR! There is likely additional logging output above.

npm ERR! Please include the following file with any support request:
npm ERR!     /Users/steve.gibbings/WebstormProjects/tachyon/npm-debug.lo

Enable authenticated requests for S3 based on environment variable

At the company I work for, we're using Tachyon and want to keep the bucket private. I have validated that this is possible by changing the S3 method from makeUnauthenticatedRequest to makeRequest, keeping all options as they are.
Note that with this option enabled, the user must configure a role that allows the Lambda function to fetch data from the S3 bucket.

Support for images without a query string

Firstly, many thanks for releasing this publicly; I believe this is revolutionary in the context of dealing with images in WordPress.

I've noticed that images without a query string return an "internal server error" message rather than an image; for example: https://us-east-1.tchyn.io/humanmade-production/uploads/2017/04/ustwo.jpg?w=870 vs https://us-east-1.tchyn.io/humanmade-production/uploads/2017/04/ustwo.jpg

Would it be possible to adjust the functionality to make the query string optional, and serve the image at the original resolution where one isn't specified?
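
A sketch of what an optional query string could look like on the parsing side, assuming resize args are a flat key/value map and that an empty map means "serve the original". `parseResizeArgs` and `shouldServeOriginal` are hypothetical helpers, not Tachyon's actual functions.

```typescript
// Parse a query string (with or without a leading "?") into a flat map.
// A missing or empty query string yields an empty map instead of an error.
function parseResizeArgs(queryString: string | null | undefined): Record<string, string> {
  const args: Record<string, string> = {};
  if (!queryString) {
    return args;
  }
  new URLSearchParams(queryString).forEach((value, key) => {
    args[key] = value;
  });
  return args;
}

// With no args at all, the original image would be served untouched.
function shouldServeOriginal(args: Record<string, string>): boolean {
  return Object.keys(args).length === 0;
}
```

For the two URLs above, `?w=870` would produce `{ w: "870" }`, while the bare URL would produce an empty map and fall through to the original image.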

Add watermark support

This is something sharp can do pretty easily.

Suggested API:

?wm=/path/to/image.png,southeast

Whereby the S3 path is the first arg and after the comma is the position which could default to bottom right.
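
The suggested `wm` value could be parsed roughly like this. `WatermarkSpec` and `parseWatermark` are hypothetical names for this sketch; it assumes the position is whatever follows the last comma and defaults to "southeast" (bottom right) when omitted.

```typescript
// Hypothetical parsed form of the wm query parameter.
interface WatermarkSpec {
  key: string; // S3 key of the watermark image, without a leading slash
  position: string; // e.g. "southeast"
}

// Split "path,position" on the last comma; position defaults to "southeast".
// Returns null when no path is given.
function parseWatermark(wm: string): WatermarkSpec | null {
  const commaIndex = wm.lastIndexOf(",");
  const path = commaIndex === -1 ? wm : wm.slice(0, commaIndex);
  const position = commaIndex === -1 ? "" : wm.slice(commaIndex + 1).trim();
  const key = path.trim().replace(/^\/+/, "");
  if (key === "") {
    return null;
  }
  return { key, position: position || "southeast" };
}
```

Splitting on the last comma means paths containing commas still work as long as a position is given explicitly.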
