
tdallison / construction_doc_semantic_search


This project forked from iantorweihe/construction_doc_semantic_search


This Python script uses OpenAI's GPT models to locate and extract data from unstructured text, such as text extracted from scanned handwritten reports.

Shell 0.02% C++ 0.27% Python 97.68% C 1.28% Nu 0.02% Fortran 0.01% PowerShell 0.05% Nix 0.01% Cython 0.68%

construction_doc_semantic_search's Introduction

Construction_Doc_Semantic_Search

This Python script is designed to extract location data from text that is not formatted for machine readability. An example application is processing scanned handwritten reports after their text has been extracted. The script reads the entire contents of a text file into a string variable and then uses OpenAI's gpt-3.5-turbo language model to search the text for location data. First, the script splits the text at a fixed marker and searches a fixed number of lines for location data, using a GPT-powered binary classification switch. If it does not find any location data, or the PDF formatting does not match the assumed formatting, the script uses the GPT model to search the entire extracted text for locations. The output is a string containing the location data found in the text.
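The flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the script's actual code: the marker string, the number of lines scanned, the prompt wording, and every function name here are hypothetical, since the introduction does not give them. The model call is passed in as a plain callable (`ask_model`) so the control flow can be read and tested without network access.

```python
from typing import Callable

# Hypothetical values: the introduction does not state the real marker
# or how many lines the script scans after it.
SECTION_MARKER = "LOCATION:"
LINES_TO_SCAN = 5

# Any function that sends a prompt to a language model and returns its reply.
AskModel = Callable[[str], str]


def has_location(ask_model: AskModel, snippet: str) -> bool:
    """Binary classification switch: ask the model a yes/no question."""
    answer = ask_model(
        "Does the following text contain location data? Answer YES or NO.\n\n"
        + snippet
    )
    return answer.strip().upper().startswith("YES")


def extract_location(ask_model: AskModel, text: str) -> str:
    """Ask the model to return the location data found in the text."""
    reply = ask_model("Extract the location data from the following text.\n\n" + text)
    return reply.strip()


def find_location(ask_model: AskModel, text: str) -> str:
    # Step 1: split the text at the fixed marker.
    _, found, tail = text.partition(SECTION_MARKER)
    if found:
        # Step 2: inspect only a fixed number of lines after the marker.
        window = "\n".join(tail.splitlines()[:LINES_TO_SCAN])
        if has_location(ask_model, window):
            return extract_location(ask_model, window)
    # Fallback: the marker is missing or the window held no location,
    # so search the entire extracted text.
    return extract_location(ask_model, text)
```

In real use, `ask_model` would wrap a request to the gpt-3.5-turbo model and `text` would be the full contents of the extracted text file. Keeping the model call behind a callable also makes it easy to substitute a stub when checking the fallback logic.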
