inuwamobarak / nougat

Nougat is Meta AI's OCR model, designed to transcribe scientific PDFs into an easy-to-use Markdown format.

Home Page: https://www.analyticsvidhya.com/blog/2023/11/enhancing-scientific-document-processing-with-nougat/

Jupyter Notebook 100.00%
ai artificial-intelligence encoder-decoder-model huggingface-transformers nougat transformer-models transformers vision-transformer vit

Nougat: Revolutionizing OCR for Scientific Documents

About Nougat

Nougat is a Transformer-based OCR model that converts complex scientific documents, typically stored as PDFs, into machine-readable Markdown. Developed by Meta AI, Nougat is trained end to end, with the aim of making scientific knowledge more accessible and usable.

Key Features

  • Transformer Architecture: Nougat uses a Swin Transformer as a vision encoder and an mBART-based text decoder, allowing for end-to-end transcription of scientific PDFs.

  • End-to-End Training: With Nougat, there's no need for complex pipelines. The model takes raw pixels as input and generates Markdown text as output, simplifying the entire OCR process.

  • Bridging the Gap: Nougat not only transcribes scientific documents but also bridges the gap between human-readable content and machine-readable text, making it easier to access and utilize scientific knowledge.

    git clone https://github.com/inuwamobarak/nougat.git
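
The encoder-decoder flow described under Key Features can be sketched with the Hugging Face `transformers` library. This is a minimal illustration, not necessarily the code used in this repository's notebook. It assumes the `facebook/nougat-base` checkpoint, a page already rasterized to an image, and an arbitrary `max_new_tokens` limit; the post-processing step may need extra packages such as `nltk` and `python-Levenshtein`.

```python
def load_nougat(checkpoint="facebook/nougat-base"):
    """Load the processor and model from the Hugging Face Hub.

    The checkpoint name is an assumption; this call downloads the weights.
    """
    # Imported lazily so the rest of the module works without transformers.
    from transformers import NougatProcessor, VisionEncoderDecoderModel

    processor = NougatProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    return processor, model


def transcribe_page(image, processor, model, max_new_tokens=1024):
    """Transcribe one rasterized page (e.g. a PIL image) into Markdown text."""
    # Vision encoder input: the processor resizes and normalizes the raw
    # page pixels into a tensor.
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # Text decoder: Markdown tokens are generated autoregressively from
    # the encoded page.
    token_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)

    # Convert token ids back to text and tidy the generated markup.
    text = processor.batch_decode(token_ids, skip_special_tokens=True)[0]
    return processor.post_process_generation(text, fix_markdown=False)


# Example usage ("page.png" is a hypothetical file name):
#   from PIL import Image
#   processor, model = load_nougat()
#   page = Image.open("page.png").convert("RGB")
#   print(transcribe_page(page, processor, model))
```

Passing the processor and model in as arguments keeps the transcription step independent of how the weights are loaded, so the same function can be reused across the pages of a document.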
