labexp / spreads

This project forked from diybookscanner/spreads


Workflow assistant for camera-based book digitization

License: GNU Affero General Public License v3.0

CSS 5.38% Python 69.15% JavaScript 24.75% Makefile 0.10% HTML 0.08% NSIS 0.54%

spreads's Introduction

https://raw.github.com/jbaiter/spreads/master/doc/_static/logo.png


spreads is a software suite for the digitization of printed material. Its main focus is to integrate existing solutions for individual parts of the scanning workflow into a cohesive package that is intuitive to use and easy to extend.

At its core, it handles the communication with the imaging devices, the post-processing of the captured material and its assembly into output formats like PDF or ePub. On top of this base layer, we have built a variety of interfaces that should fit most use cases: a full-fledged, mobile-friendly web interface that can be served from even the most low-powered devices (like a Raspberry Pi), a graphical wizard for classical desktop users, and a bare-bones command-line interface for purists.

As for extensibility, we offer a plugin API that allows developers to hook into almost every part of the architecture and extend the application according to their needs. There are interfaces for developing a device driver to communicate with new hardware and for writing new postprocessing or output plugins that take advantage of as-yet unsupported third-party software. It is even possible to create a completely new user interface that is better suited to specific environments.

Features

  • Support for cameras running CHDK as well as cameras supported by libgphoto2 (experimental), with extensive configuration options.
  • Cropping of the images during capture (only supported in web interface)
  • Shoot with two devices simultaneously, directly storing the images in a single directory on your computer in the right order.
  • Automatically rotate images
  • Run captured images through ScanTailor (attended or unattended)
  • Recognize text from the images through Tesseract OCR
  • Generate PDF and DjVu files with hidden text layers
  • Every project is stored in a directory on your computer and contains all the information that is needed in human-readable form, laid out according to the BagIt specification. This makes it easy to exchange projects between computers.
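Because a BagIt bag carries a checksum manifest next to its payload, a project copied between computers can be checked for corruption. The following is a minimal sketch of such a check, not code from spreads itself: the function name `verify_bag` is hypothetical, and the use of an MD5 manifest (`manifest-md5.txt`) is an assumption, since the checksum algorithm used by spreads is not stated here.

```python
import hashlib
import os


def verify_bag(bag_dir, algorithm="md5"):
    """Return the payload paths whose checksum differs from the manifest.

    Hypothetical helper for illustration. Assumes the bag contains a
    manifest named "manifest-<algorithm>.txt", as BagIt prescribes.
    A payload file that is missing entirely raises an IOError/OSError.
    """
    manifest = os.path.join(bag_dir, "manifest-%s.txt" % algorithm)
    mismatches = []
    with open(manifest, "rb") as fp:
        for raw_line in fp:
            line = raw_line.decode("utf-8").strip()
            if not line:
                continue
            # Each manifest line has the form "<checksum> <relative/path>"
            expected, rel_path = line.split(None, 1)
            digest = hashlib.new(algorithm)
            with open(os.path.join(bag_dir, rel_path), "rb") as payload:
                # Read in chunks so large page images are not loaded at once
                for chunk in iter(lambda: payload.read(65536), b""):
                    digest.update(chunk)
            if digest.hexdigest() != expected.lower():
                mismatches.append(rel_path)
    return mismatches
```

An empty result means every file listed in the manifest matched its recorded checksum.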

Interfaces

Web


The interface with the most features. You have the choice between three modes: scanner, processor and full. The first is ideal for slim scanning workstations that just deal with the capturing of the images and little more. From it, you can transfer your scans either to a USB stick or to another instance of spreads running in one of the other two modes (all from your browser!), where they will be post-processed. It is currently the only interface that supports cropping and on-the-fly changing of settings during capture.

GUI

A graphical wizard that guides you through every step, from setting up the devices to postprocessing the images.

CLI


A text-only command-line interface that exposes each step as a subcommand. Ideal for controlling a scanner over SSH and for command-line fetishists.

Getting Started

If you are on Debian unstable, Ubuntu 14.04 or Raspbian stable, you can use our APT repositories. Just add one of the lines below to your sources.list:

# Debian unstable/sid (i386, amd64)
deb http://spreads.jbaiter.de/debian unstable main

# Ubuntu 14.04 LTS (i386, amd64)
deb http://spreads.jbaiter.de/ubuntu trusty main

# Raspbian stable/wheezy (armhf)
deb http://spreads.jbaiter.de/raspbian wheezy main

Now run apt-get update and install one or more of spreads, spreads-web or spreads-gui.

Please note that these repositories currently include snapshots from the Git repository, so they might not work from time to time.

On other distributions you will have to install it yourself with pip. Please refer to the documentation for details.

Documentation

You can find the detailed manual for users and developers at http://spreads.readthedocs.org

Please note that it is currently woefully incomplete and partially out of date. If you want to help with it, please get in touch!

Getting Help

spreads's People

Contributors

aladarthehun, chunkerchunker, duerig, gareth8118, jbaiter, markvdb, matti-kariluoma, mumme74, nafraf, takluyver


spreads's Issues

UnicodeDecodeError in tesseract.py

Postprocessing was not finishing correctly: whenever Tesseract ran, it ended with this error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1214: ordinal not in range(128)

We started debugging and managed to fix it by adding the parameter encoding = 'utf8' to every open() call in tesseract.py
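The kind of change described in this report can be sketched as follows. This is an illustration only, not the actual code from tesseract.py: the helper names `read_text` and `write_text` are hypothetical. It uses `io.open`, which accepts an `encoding` argument on both Python 2 and Python 3 (on Python 3 it is the same function as the built-in `open`). Byte 0xe2 is the lead byte of many UTF-8 encoded characters, such as typographic quotes, which is why decoding OCR output as ASCII fails.

```python
# -*- coding: utf-8 -*-
import io


def read_text(path):
    """Read a text file as UTF-8 instead of the default codec.

    Hypothetical helper; stands in for the open() calls in tesseract.py.
    """
    with io.open(path, "r", encoding="utf-8") as fp:
        return fp.read()


def write_text(path, text):
    """Write unicode text to a file, encoded as UTF-8."""
    with io.open(path, "w", encoding="utf-8") as fp:
        fp.write(text)
```

Passing the encoding explicitly means the result no longer depends on the locale of the machine that runs the postprocessing step.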
