mignotlabnoxturnalparser's Introduction

Author: Giorgio Ricciardiello

Project dedicated to reading and parsing Noxturnal PDF reports into Excel tables.

INPUT
Place the PDF files to be parsed in the folder -> data/input_psg_reports

OUTPUT
The output is a single Excel table containing all the information found in each Noxturnal report.
Each row is a different de-identified subject, and the columns contain the data present in the Noxturnal report.
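
As an illustration of the one-row-per-subject layout, here is a minimal sketch that flattens a nested report dictionary into a flat row. The subject IDs, section names and field names are hypothetical and are not taken from an actual Noxturnal report; the dotted column naming is also an assumption, not necessarily what the project uses.

```python
from typing import Any


def flatten_report(report: dict[str, Any], sep: str = ".") -> dict[str, Any]:
    """Flatten a nested report dictionary into a single table row."""
    row: dict[str, Any] = {}
    for key, value in report.items():
        if isinstance(value, dict):
            # Prefix nested keys with the parent key, e.g. "Sleep.TST_min".
            for sub_key, sub_value in flatten_report(value, sep).items():
                row[f"{key}{sep}{sub_key}"] = sub_value
        else:
            row[key] = value
    return row


# Hypothetical parsed reports, keyed by a de-identified subject ID.
reports = {
    "SUBJ_001": {"Respiratory": {"AHI": 12.4}, "Sleep": {"TST_min": 402}},
    "SUBJ_002": {"Respiratory": {"AHI": 3.1}, "Sleep": {"TST_min": 377}},
}

# One row per subject, one column per leaf value.
rows = [{"subject_id": sid, **flatten_report(rep)} for sid, rep in reports.items()]
```

If pandas and an Excel writer such as openpyxl are installed, a list of rows like this could be written with pandas.DataFrame(rows).to_excel("output.xlsx", index=False); the output file name here is only a placeholder.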

CONFIGURATION
The confi folder contains the configuration files
    config.py
        contains the paths
    pdf_structure.py
        contains a nested dictionary with all the sections and subsections in the Noxturnal report. It is important
        that all PDFs follow this structure. If the PDFs are updated with a new structure, this nested dictionary
        must be reconfigured to match.
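
To make the idea of a structure dictionary concrete, the sketch below shows one possible shape for it, together with a check that reports which expected headings are absent from the text of a report. The section names, subsection names and field labels are hypothetical placeholders, and missing_headings is an illustrative helper, not a function from this project.

```python
# Hypothetical structure: section -> subsection -> expected field labels.
PDF_STRUCTURE = {
    "Respiratory Parameters": {
        "Apneas + Hypopneas": ["AHI", "Apnea Index"],
    },
    "Sleep Parameters": {
        "Sleep Summary": ["Total Sleep Time", "Sleep Efficiency"],
    },
}


def missing_headings(text: str, structure: dict) -> list[str]:
    """Return the section and subsection headings not found in the report text."""
    missing = []
    for section, subsections in structure.items():
        if section not in text:
            missing.append(section)
        for subsection in subsections:
            if subsection not in text:
                missing.append(subsection)
    return missing


# Example: a report text that lacks one expected subsection.
sample_text = "Respiratory Parameters\nApneas + Hypopneas\nSleep Parameters"
absent = missing_headings(sample_text, PDF_STRUCTURE)
```

A check of this kind, run before parsing, is one way to notice that a report layout has changed and that the dictionary needs updating.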

PROGRAM
The src folder contains the different classes
    read_pdf
        Class dedicated to extracting all the information from the PDF report

    nox_to_excel.py
        Class dedicated to parsing the dictionary extracted and constructed from the PDF, including the text
        extracted from images. The class builds the final dataframe that stores all the information available in
        the NOX report.

    image_classifier_model_tesserac
        The class contains the run method, which automates all the operations needed to pre-process the images of
        the NOX sleep report and extract their text. The class processes one patient at a time.

    main.py
        Main script. The PDF is read as a complete string and the information is extracted with the necessary
        processing tools. A dictionary is saved so that the extracted information of each subject can be analysed.
        Lastly, the information from the saved dictionary is parsed into an Excel file and merged with that of all
        the other patient PDFs placed in the input folder.
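
The overall flow described for main.py can be sketched roughly as below. This is not the project's code: run_pipeline and fake_extract are hypothetical stand-ins, and the real reading, OCR and parsing steps are replaced by a placeholder so that the loop over the input folder can be shown on its own.

```python
import tempfile
from pathlib import Path
from typing import Callable


def run_pipeline(input_dir: Path, extract: Callable[[Path], dict]) -> list[dict]:
    """Apply `extract` to every PDF in `input_dir` and collect one row per report."""
    rows = []
    for pdf_path in sorted(input_dir.glob("*.pdf")):
        record = extract(pdf_path)  # stand-in for reading, OCR and parsing
        rows.append({"source_file": pdf_path.name, **record})
    return rows


def fake_extract(pdf_path: Path) -> dict:
    # Placeholder only: a real extractor would read the PDF text and images.
    return {"n_bytes": pdf_path.stat().st_size}


# Demonstration on a temporary folder standing in for data/input_psg_reports.
with tempfile.TemporaryDirectory() as tmp:
    input_dir = Path(tmp)
    (input_dir / "b.pdf").write_bytes(b"%PDF-1.4")
    (input_dir / "a.pdf").write_bytes(b"%PDF")
    (input_dir / "notes.txt").write_text("ignored")  # non-PDF files are skipped
    rows = run_pipeline(input_dir, fake_extract)
```

Passing the extraction step in as a function keeps the folder loop independent of how each report is actually read, which is a design choice of this sketch rather than a description of the project.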
