The converjon from berlinonline

Improved status page and improved statistics

/status should give an HTML page instead of just a JSON response.

Also, the data shown there should give more information about the actual state of the application.

Cache Configuration for individual URL namespaces.

Include the source URL in a response header

For debugging/testing/troubleshooting purposes, the original source URL should be included in the responses as a header.

In 1.x.x, checking the origin of an image means first decoding the source URL. Having it in a plaintext header will make this easier.

Better cleanup of downloader requests

The fix for #34 in bc57f48 is just an improvised one.

There needs to be a clean handling of the requests.

Request 189 Error during image analysis

Hi, I often get this kind of error ("Error during image analysis"), and the resulting images are broken.

Is there any way to fix the "Error during image analysis"?
The original files that get downloaded are OK, and if I try a few more times I do get a good image.

NaNs in crop arguments.

Apparently there is a possibility that NaN values find their way into command line arguments.

[Mon, 11 Mar 2013 14:41:03 GMT] Process error output convert ./cache/source/www.berlinonline.de/image.php/localnews/asset-25929 -crop NaNxNaN+NaN+NaN -resize 620x250! jpg:./cache/target/www.berlinonline.de/image.php/localnews/asset-25929/0bd13514a1f6933aedf08795e3efece708b81bebcaa58d5a07919f0a0f065068.jpeg convert: invalid argument for option `-crop': NaNxNaN+NaN+NaN @ error/convert.c/ConvertImageCommand/1090.\n"
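One way to keep NaN out of the command line is to validate the crop values before the argument is built. This is a minimal sketch, assuming the crop is available as an object with `width`, `height`, `x` and `y`; `buildCropArgument` is a hypothetical helper, not existing converjon code.

```javascript
// Hypothetical helper: returns a "-crop" geometry string, or null when any
// value is missing, non-numeric, negative, or the area would be empty.
function buildCropArgument(crop) {
    var values = [crop.width, crop.height, crop.x, crop.y].map(Number);
    var allFinite = values.every(function (v) {
        return isFinite(v) && v >= 0;
    });
    if (!allFinite) {
        return null;
    }
    var rounded = values.map(Math.round);
    if (rounded[0] === 0 || rounded[1] === 0) {
        return null; // a zero-sized crop is not useful either
    }
    return rounded[0] + "x" + rounded[1] + "+" + rounded[2] + "+" + rounded[3];
}
```

A caller could skip the `-crop` option (or reject the request) whenever `null` comes back, instead of passing `NaNxNaN+NaN+NaN` to `convert`.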

Support of Mime-Types

Some MIME types are delivered with charset=binary, so the internal MIME type check in converjon does not match these images correctly.

My hot fix was to stop delivering the charset information, but I think the right way would be to handle this in converjon itself?

information for handling mime-types: http://www.mhonarc.org/~ehood/MIME/2046/rfc2046.html#4.2
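A possible approach is to compare only the type/subtype part of the Content-Type header and ignore parameters such as `charset`. This is a rough sketch; `baseMimeType` is a hypothetical name, and how converjon performs its check internally is not shown here.

```javascript
// Hypothetical helper: strip parameters such as "; charset=binary" and
// normalise case, so only "type/subtype" is compared.
function baseMimeType(contentType) {
    if (typeof contentType !== "string") {
        return "";
    }
    return contentType.split(";")[0].trim().toLowerCase();
}

// Example comparison against a list of accepted types (illustrative list).
var accepted = ["image/jpeg", "image/png", "image/gif"];
var isAccepted = accepted.indexOf(baseMimeType("image/jpeg; charset=binary")) !== -1;
```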

source status 404 is delivered as »502 Bad gateway«

Example:

http://www.berlin.de/converjon/?width=620&height=250&url=http%3A%2F%2Fwww.berlinonline.de%2Fimage.php%2Fmovie%2Fasset-134002

Source:

http://www.berlinonline.de/image.php/movie/asset-134002

@see #29

Can not write cache file if image URL ends with '/'

Service dies if image url ends with '/'
Example URL: http://www.berlinonline.de/binaries/asset/image_assets/1073880/source/1263907606/667x500/

Check compatibility with node 0.10

Check whether Converjon runs on node 0.10.0 without additional problems.
Correct the engine dependency accordingly.

Socket pool gets exhausted by failing requests.

The HTTP agent's socket pool fills up to its configured limit and then stops accepting any more download requests. This results in the Converjon server running into "Processing Timeout" errors on every subsequent request.

Disable agent pooling as a quick fix. A better solution should be found in 1.7.0.

Queue info in status page

The status page should contain information about the requests that are currently processed or waiting.

Support some kinda cache-busting parameter

It would be great if one could pass an ETag, a timestamp or something equivalent to converjon when requesting images. When it is set, converjon could ignore any (default) cache settings, purge the cached item and then deliver a fresh one.

Status page information to server side file.

The status information should have a configuration option to disable the public status page and write the data into a file instead.

Remove connect.js dependency

The use of the connect middleware framework does not yield any real benefit to converjon.

It should be removed and URL routing should be done directly in the app since it's a really simple routing.

This will also remove the "bouncer" middleware. It was never really used and it's basically out of scope for converjon. A dedicated reverse proxy can probably provide this functionality much better.

Automatic cache cleanup

Clear invalid cache items when they expire to reduce storage clutter.

Pass thru all headers

converjon should pass all headers from the backend response to the client, except the Content-Length.
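The rule above could look roughly like this. It is a sketch only; `copyBackendHeaders` is a hypothetical helper name, and the header object is assumed to have the shape of Node's `response.headers`.

```javascript
// Hypothetical helper: copy every backend header except Content-Length,
// which no longer matches once the image has been converted.
function copyBackendHeaders(backendHeaders) {
    var result = {};
    Object.keys(backendHeaders).forEach(function (name) {
        if (name.toLowerCase() !== "content-length") {
            result[name] = backendHeaders[name];
        }
    });
    return result;
}
```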

Start/Stop system for sysv-init systems

We need a stable way to start/stop/respawn converjon on a sysv-init based system (SLE11SP2)

Installable via NPM

Converjon should be installable from NPM. The package should provide a bin script to launch the server.

Cleanup debug logs.

Numbers on all log lines for a request.
URL just on the first entry.

One request must be entirely grepable.

"Temp" cache directory is not cleared on launch

All 3 cache subdirectories ("source", "target", "temp") should be cleared on launch.

Support for HTTPS in backend connections

converjon should be able to fetch images from HTTPS servers.

EventsEmitter memory leak

Possible explanation:

When internal success/error events are fired, the respective other event handler is not removed. They accumulate until Converjon crashes with an EventEmitter memory leak exception.
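If that explanation is right, the fix would be to detach both handlers as soon as either event fires. This is a minimal sketch; the event names `success` and `error` and the helper `onceEither` are assumptions for illustration, not converjon's actual internals.

```javascript
var EventEmitter = require("events").EventEmitter;

// Hypothetical helper: register a success/error pair so that whichever
// event fires first removes BOTH listeners from the emitter.
function onceEither(emitter, onSuccess, onError) {
    function cleanup() {
        emitter.removeListener("success", success);
        emitter.removeListener("error", error);
    }
    function success() {
        cleanup();
        onSuccess.apply(null, arguments);
    }
    function error() {
        cleanup();
        onError.apply(null, arguments);
    }
    emitter.on("success", success);
    emitter.on("error", error);
}
```

With this pattern the listener count on a long-lived emitter stays constant, no matter how many requests succeed or fail.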

a health check URL would be nice

For operating behind varnish a simple health check URL would be nice. Answering this URL should not depend on any secondary server.

Renamed: Constraints on a by-source basis

Due to legal issues (licence agreement with a commercial photo service) we need a maximum width and height for pictures.
The configuration should be in the global environment and should affect all delivered images.

Environment setting via file in working copy

As an alternative to env variables there should be a way to set the environment for a working copy via a text file in the base directory.

The env variable should still override this setting.
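The lookup order described here could be sketched as follows. The variable name `CONVERJON_ENV`, the file name `environment` and the fallback `"default"` are all assumptions made for this example; the issue does not name any of them.

```javascript
var fs = require("fs");
var path = require("path");

// Hypothetical lookup: env variable first, then a one-line text file in the
// base directory, then a fallback. All three names are illustrative.
function resolveEnvironment(baseDir, env) {
    if (env.CONVERJON_ENV) {
        return env.CONVERJON_ENV; // the env variable still overrides the file
    }
    try {
        var fromFile = fs.readFileSync(path.join(baseDir, "environment"), "utf8").trim();
        return fromFile || "default";
    } catch (e) {
        return "default"; // no file in this working copy
    }
}
```

A call such as `resolveEnvironment(__dirname, process.env)` at startup would then pick the environment for the working copy.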

More information on status page

Add some more useful information to the status page:

served requests
statistics
etc.

Clustering in Node

Load balancing and clustering directly in node.js

Converjon CLI utility

Thanks @aeytom, for the idea:

Converjon should provide a CLI utility that exposes the same features and API as the web server but with local files/IO-Streams.

Example calls could look like this:

cat foo.jpg | converjon-cli --width=140 --quality=65 > thumb.jpg

and/or:

converjon-cli --width=140 --quality=65 foo.jpg thumb.jpg

This could enable other applications to use converjon as a wrapper around imagemagick.

Resize Bug in GIF

We have a GIF image which is scaled from 166x99px to 214x125px, but the result should be 166x125px. I prepared an example here:

http://test.berlin.de/converjon/?height=125&width=166&url=http%3A%2F%2Ftest.berlin.de%2Ftest%2F_assets%2Flogo_gute_tat.gif&mime=image%2Fgif
http://test.berlin.de/test/_assets/logo_gute_tat.gif
(password protected, contact me for a login)

Public, but might be deleted in near future:

http://www.berlin.de/converjon/?height=125&width=166&url=http%3A%2F%2Fwww.berlin.de%2Fvak%2F_assets%2Faktuelles%2F2014%2Flogo_gute_tat.gif&mime=image%2Fgif
http://www.berlin.de/vak/_assets/aktuelles/2014/logo_gute_tat.gif

Perhaps some problem with the GIF file format?

Versions:

converjon 1.7.1
ImageMagick 6.8.6-9 2014-03-06 Q16
exiftool 9.34

Status URL

Make a /status URL that always answers with "200 OK" and maybe returns some useful stats.

Automated load test

Fetch random images, maybe from google image search
keep running with random conversions for extended period of time

Add a 'Cache-Control: max-age=…' or 'Expires: …' response header

The headers should depend on the age of the scaled image and the internally used cache timings.
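As a rough illustration, the remaining lifetime can be derived from the time the item was cached and the configured lifetime. `cacheHeaders` and its parameters are hypothetical; how converjon stores these timings is not specified in this issue.

```javascript
// Hypothetical helper: storedAtMs is when the converted image was cached,
// ttlSeconds is the configured cache lifetime, nowMs is the current time.
function cacheHeaders(storedAtMs, ttlSeconds, nowMs) {
    var ageSeconds = Math.floor((nowMs - storedAtMs) / 1000);
    var remaining = Math.max(0, ttlSeconds - ageSeconds);
    return {
        "Cache-Control": "max-age=" + remaining,
        "Expires": new Date(nowMs + remaining * 1000).toUTCString()
    };
}
```

For an item cached ten minutes ago with a one-hour lifetime, this yields `max-age=3000`, so downstream caches never hold the image longer than converjon itself would.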

Insecure SSL requests for dev boxes - do not reject

The service is used via HTTPS, and images are coming from HTTPS servers.

When I test on my dev box I only have self-signed certificates, so the node.js https module rejects the image requests.

I patched it myself in lib/downloader.js around line 60:

--- a/lib/downloader.js
+++ b/lib/downloader.js
@@ -57,6 +57,7 @@ module.exports = function(req) {

         var download_options = url.parse(sourceUrl);
         download_options.method = "GET";
+        download_options.rejectUnauthorized = false; /* */

         var auth_credentials = authentication.getCredentials(sourceUrl);
         if (auth_credentials) {

It would be nice to be able to configure, via URL regexp, that unauthorized SSL requests are permitted for certain domains.
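A configurable variant of the patch could look something like this. The pattern list and the function `buildDownloadOptions` are made up for the example; only `url.parse` and the `rejectUnauthorized` request option are real Node.js features.

```javascript
var url = require("url");

// Hypothetical config: sources matching one of these patterns may use
// self-signed certificates. The domain is a placeholder.
var insecureSourcePatterns = [/^https:\/\/[^\/]+\.dev\.example\//];

function buildDownloadOptions(sourceUrl, patterns) {
    var options = url.parse(sourceUrl);
    options.method = "GET";
    // Certificates are verified unless the URL matches a configured pattern.
    options.rejectUnauthorized = !patterns.some(function (pattern) {
        return pattern.test(sourceUrl);
    });
    return options;
}
```

Keeping verification on by default means a missing or empty pattern list behaves exactly like the unpatched downloader.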

Distorted images

Images are sometimes distorted when using AOI-Cropping.

Conversions start before files are downloaded completely

When the same source image is requested multiple times simultaneously, a convert worker sometimes starts reading the file before it is downloaded completely.

This happens because imageFetcher.js (line 107) only checks whether the file exists, which is already true while the file is still being written by the downloadQueue.

This bug results in corrupted target image data.
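One common way around this kind of race is to write the download to a temporary name and rename it into place only when it is complete, so an existence check can never see a partial file. This is a sketch of the idea, not the actual fix; `writeAtomically` is a hypothetical helper, and a real downloader would probably stream instead of buffering.

```javascript
var fs = require("fs");

// Hypothetical helper: the final path only appears once all data is on
// disk, because rename within one filesystem is atomic.
function writeAtomically(finalPath, data, callback) {
    var tempPath = finalPath + ".part-" + process.pid + "-" + Date.now();
    fs.writeFile(tempPath, data, function (err) {
        if (err) {
            return callback(err);
        }
        fs.rename(tempPath, finalPath, callback);
    });
}
```

The temporary file has to live on the same filesystem as the target for the rename to be atomic.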

Processing timeout

Just now, we got many "Processing timeout" errors.
There are no relevant problems on the status page.

Sample URL: http://www.berlinonline.de/converjon/?width=180&height=125&url=http%3A%2F%2Fwww.berlinonline.de%2Fimage.php%2Flocalnews%2Fasset-41133

A restart of converjon.service solves the problem for now

Error log tail in status page

The status page should include latest errors from the log.

New single-tier cache storage

The two-tiered cache storage of 1.x.x should be eliminated. Given the right timing, it can lead to an image being cached almost double the configured time.

The new storage will store every source image in a directory, similar to the source cache of 1.x.x but instead of having the additional target cache directory, the converted images are just stored in the same directory as the source and are identified by their conversion parameters. The storage dir for every item should contain a metadata file (probably JSON) that is used to hold HTTP headers from the origin server, the expiration date and other additional information if necessary.

When a cached item becomes stale, its source file and all conversion targets will be deleted.

Include host name in status information

Failed downloads result in processing timeouts on subsequent requests.

When the download of a source URL fails, it is not removed from the list of active downloads. Every subsequent request to that URL will result in a "Processing Timeout" error.
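The underlying rule is that the bookkeeping entry has to be removed on both outcomes. This is a simplified sketch under the assumption that active downloads are tracked per URL; `activeDownloads`, `trackDownload` and the `startDownload` callback are illustrative names, not converjon's real structures.

```javascript
// Hypothetical registry: source URL -> callbacks waiting for that download.
var activeDownloads = Object.create(null);

function trackDownload(sourceUrl, startDownload, callback) {
    if (activeDownloads[sourceUrl]) {
        activeDownloads[sourceUrl].push(callback); // join the running download
        return;
    }
    activeDownloads[sourceUrl] = [callback];
    startDownload(sourceUrl, function (err, result) {
        var waiting = activeDownloads[sourceUrl];
        // Remove the entry on success AND on failure, so a failed download
        // does not block every later request for the same URL.
        delete activeDownloads[sourceUrl];
        waiting.forEach(function (cb) {
            cb(err, result);
        });
    });
}
```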

Move local config out of application directories

Currently, all configs are located in the config directory. This will no longer be possible if converjon is installed via NPM (see #47).

The default config should still be packaged with the application, but a local config file should be given to the application at launch time via a command line argument.

Something like: converjon --config=foo/bar/conf.json

Check-URL

There needs to be a URL that can be called to determine whether the service is ready to process requests or whether the waiting queues are full.

Move server specific configuration to separate repo

I think we should move things like the openSUSE service config into a separate repository.

Readme improvements

Suggestions from @graste

Possibly a bit more description than "An on-the-fly image conversion service", and maybe link directly to "Usage"
Possibly link the dependencies (installation page)
"The configuration file are read and merged in this order:" FILES
URL whitelisting is possibly too lenient in the examples?
- http://evil.com/foo.jpg#http://localhost/yolo.png
- maybe add a security note with a few good rules/examples?
"URLs from which images are allowed to be donwloaded must" - DOWNLOADED
Bouncer: "threshhold" should probably be "threshold"
Downloader: likewise a security note for the "authentication.*" expressions
"treated as stale nad will be refreshed" AND
Is the "/" at the end of paths mandatory, or is it added automatically?
Logging: "wether" should surely be "whether"
Analyzer: "basic setting for" SETTINGS?
- AOI: add "see below", or link directly to the section further down?
"request with aource URLs from localhost" A SOURCE
"The AOI is a rectangle in the folling format:" FOLLOWING
"embedded in the original images metadata" IMAGE'S?
"it is preferre over the" PREFERRED
"summary of Converjons current state" CONVERJON'S?
Copyright: mention the library's license with a reference to LICENSE.md

HTTPS Sources

We should also be able to fetch images from HTTPS sources.

Configurable authentication for URL spaces

For configured URLs (keys from whitelist) converjon needs to pass auth headers to the source server.

Running Processes Counter is increased on process creation instead of actual start

The counter for running processes is increased too soon. It is increased in the process constructor instead of when the process actually starts running.

This allows the running process count to go beyond the configured limit and prevents subsequent requests from ever being processed.
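The intended behaviour can be sketched with a small queue in which creating a task does not touch the counter and only the actual start does. `ProcessQueue` and its methods are hypothetical names for this illustration, not the classes used in converjon.

```javascript
// Hypothetical queue: "running" only counts tasks that have really started.
function ProcessQueue(limit) {
    this.limit = limit;
    this.running = 0;
    this.waiting = [];
}

ProcessQueue.prototype.add = function (task) {
    this.waiting.push(task); // creation/enqueueing leaves the counter alone
    this._drain();
};

ProcessQueue.prototype._start = function (task) {
    var self = this;
    var finished = false;
    self.running++; // incremented at the actual start
    task(function done() {
        if (finished) {
            return; // guard against double completion
        }
        finished = true;
        self.running--;
        self._drain();
    });
};

ProcessQueue.prototype._drain = function () {
    while (this.running < this.limit && this.waiting.length > 0) {
        this._start(this.waiting.shift());
    }
};
```

Because the counter is decremented exactly once per started task, it cannot drift above the limit and starve later requests.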

Enlarge area-of-interest fraction depending on thumbnail size

For art directional uses it may be cool to increase the fraction the area of interest has when thumbnails get smaller. That is, the smaller the thumbnail the larger the fraction of the image the area of interest takes.

See: http://usecases.responsiveimages.org/#art-direction for a small example. Bullet point 6 in http://blog.cloudfour.com/8-guidelines-and-1-rule-for-responsive-images/ gives another example image.

It would be nice if this feature would be configurable and even allow target images having other ratios than the source image as that is of use in many use cases (huge source images and multiple target teaser sizes and ratios)

Identifier for converjon instances

The /status page should include an identifier to help distinguish between multiple converjon instances behind a load balancer.
