thewillyhuman / snoicd-codex

📚 Optimized search system for SNOMED CT & ICD API

License: MIT License

Java 97.41% Dockerfile 0.46% Kotlin 2.13%
icd-9 icd-10 snomed-ct api docker icd terminologies database

snoicd-codex's Introduction

SNOMED-CT & ICD Codex

Architecture Status Tests Coverage
Ubuntu Trusty 14.04 x86_64 Build Status codecov

Welcome to snoicd-codex

Snoicd-codex is a set of tools for crawling, indexing and querying the ICD-9, ICD-10 and SNOMED CT terminologies. Because the data is crawled and indexed ahead of time, queries run with very low latency, and the whole system occupies no more than 1 GB.

Its architecture is based on that of the well-known Google search engine, which has a crawler that collects data about websites, an indexer that reduces all that information to a physical location in its databases/indexes, and a search algorithm that, given a query, finds the best-matching results in those indexes.

The following illustration shows the architecture described above.

Snomed logo

Crawler

The crawler is in charge of finding the data in the different terminologies, adding it to a common file system where the indexer will look for it, and adding the relations between the crawled items. At the beginning of the crawl process we have a set of unstructured and unlinked files; at the end of the process we have a single JSON file containing all the crawled data. This is the file the indexer reads.
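As a rough illustration of that last step, the sketch below serializes a list of crawled concepts, together with their relations, into a single JSON array. It is not the project's crawler: the class `CrawlSketch`, the `Concept` fields and the JSON property names are all assumptions made for this example, and the codes used with it are placeholders rather than real terminology codes.

```java
import java.util.List;

// Hypothetical sketch: class, field and property names are illustrative,
// not taken from the snoicd-codex sources.
final class CrawlSketch {

    /** One crawled concept: its terminology, its code, its descriptions and the codes it relates to. */
    static final class Concept {
        final String terminology;
        final String code;
        final List<String> descriptions;
        final List<String> relatedCodes;

        Concept(String terminology, String code, List<String> descriptions, List<String> relatedCodes) {
            this.terminology = terminology;
            this.code = code;
            this.descriptions = descriptions;
            this.relatedCodes = relatedCodes;
        }
    }

    /** Wraps a value in double quotes, escaping backslashes and quotes (control characters are not handled). */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders a list of strings as a JSON array of string literals. */
    static String toJsonArray(List<String> values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) sb.append(",");
            sb.append(quote(values.get(i)));
        }
        return sb.append("]").toString();
    }

    /** Merges all crawled concepts into one JSON document, the single file handed to the indexer. */
    static String toJson(List<Concept> concepts) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < concepts.size(); i++) {
            Concept c = concepts.get(i);
            if (i > 0) sb.append(",");
            sb.append("{\"terminology\":").append(quote(c.terminology))
              .append(",\"code\":").append(quote(c.code))
              .append(",\"descriptions\":").append(toJsonArray(c.descriptions))
              .append(",\"related\":").append(toJsonArray(c.relatedCodes))
              .append("}");
        }
        return sb.append("]").toString();
    }
}
```

A real crawler would more likely rely on a JSON library; the hand-written serializer only keeps the sketch free of dependencies.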

Indexer

The indexer module starts by reading the file produced by the crawler and creates two index files: one where concepts are indexed by their unique terminology code, and another where concepts are text-indexed by their descriptions.
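The two indexes can be pictured as two maps: one from code to concept, and an inverted index from each word to the concepts whose description contains it. The sketch below keeps both in memory for brevity, whereas the project writes index files. The class name `IndexSketch`, the `terminology:code` key format and the tokenization rule are assumptions, not the project's actual implementation.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: names, key format and in-memory maps are assumptions.
final class IndexSketch {

    // Code index: "terminology:code" -> description, used for exact lookups by code.
    final Map<String, String> byCode = new HashMap<>();

    // Text index: lower-cased word -> keys of the concepts whose description contains that word.
    final Map<String, Set<String>> byWord = new HashMap<>();

    /** Adds one concept to both indexes. */
    void add(String terminology, String code, String description) {
        String key = terminology + ":" + code;
        byCode.put(key, description);
        for (String word : description.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.isEmpty()) {
                continue; // split() yields an empty first token when the text starts with punctuation
            }
            byWord.computeIfAbsent(word, w -> new TreeSet<>()).add(key);
        }
    }
}
```

Prefixing each key with the terminology keeps codes from different terminologies apart, so a lookup by code does not need to scan the descriptions at all.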

Search

The searcher is an API module that exposes, through REST requests, queries over the indexed data. At present it supports queries by terminology code and free-text queries. Because the crawled data is indexed in advance, it runs free-text queries over the whole set of descriptions in less than 10 ms. This holds both for single-word queries and for n-word queries using intersection search.
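Intersection search over n words can be sketched as follows: each word of the query is looked up in a word-to-concepts index, and only the concepts present in every result set are kept. The class `SearchSketch`, the shape of the index (`Map<String, Set<String>>`) and the tokenization are assumptions for illustration; the project's own search code may differ.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: the index shape and all names are assumptions.
final class SearchSketch {

    /**
     * Returns the keys of the concepts whose description contains every word of the query.
     * byWord maps a lower-cased word to the keys of the concepts containing it.
     */
    static Set<String> search(Map<String, Set<String>> byWord, String query) {
        Set<String> result = null;
        for (String word : query.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.isEmpty()) {
                continue; // ignore the empty token produced by leading punctuation or an empty query
            }
            Set<String> hits = byWord.getOrDefault(word, Collections.<String>emptySet());
            if (result == null) {
                result = new TreeSet<>(hits); // first word: start from its hits
            } else {
                result.retainAll(hits);       // later words: keep only the shared concepts
            }
        }
        return result == null ? new TreeSet<String>() : result;
    }
}
```

A single-word query costs one map lookup, and each additional word adds one lookup plus one set intersection, which is why indexing ahead of time keeps free-text queries fast.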

Getting started

These instructions give the most direct path to working with this module.

System Requirements

As the project is developed in Java, macOS, Windows and Linux distributions are natively supported. You will need a supported JDK (see the note below) and Docker installed on your computer. Also, depending on where you are going to run the database, you will need either an internet connection or MongoDB installed and running on your machine.

Java Development Kit (JDK) ⚠️ No support for Java 10 or 11 ⚠️

A Java Development Kit (JDK) is a program development environment for writing Java applets and applications. It consists of a runtime environment that "sits on top" of the operating system layer as well as the tools and programming that developers need to compile, debug, and run applets and applications written in the Java programming language.

If you do not have the latest stable version, you can download it here.

Docker

Docker provides container software that is ideal for developers and teams looking to get started and experiment with container-based applications. It can be downloaded here for your favourite OS.

Deploying snoicd-codex

These instructions give the most direct path to a working snoicd-codex. The first thing to do is to clone the repository onto your local computer:

git clone https://github.com/thewilly/snoicd-codex

Then change your working directory to snoicd-codex, build the API sources and finally deploy the Docker containers:

cd snoicd-codex;
cd api;
mvn package -DskipTests;
cd ..;
docker-compose up;

Note that once docker-compose is launched, the services take between 1 and 5 minutes to start. Once they are up, you can connect at http://localhost:8082. Find more information about the API here. The API is documented with Swagger 2, so you can try out the endpoints at http://localhost:8082/swagger-ui.html.
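Since start-up can take a few minutes, a client may want to wait until the API answers before sending queries. The following is a minimal sketch of such a readiness check using only the JDK's `HttpURLConnection`. The class `ReadinessSketch`, its method names and the choice of treating any 2xx response as "up" are assumptions for this example, not part of snoicd-codex.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: class and method names are illustrative.
final class ReadinessSketch {

    /** Returns true when an HTTP GET on the given address answers with a 2xx status. */
    static boolean isUp(String address) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
            conn.setConnectTimeout(2000);
            conn.setReadTimeout(2000);
            int status = conn.getResponseCode();
            conn.disconnect();
            return status >= 200 && status < 300;
        } catch (IOException e) {
            return false; // malformed address, connection refused or timeout: treat as not ready
        }
    }

    /** Calls the probe up to maxAttempts times, sleeping delayMillis between failed attempts. */
    static boolean waitUntil(BooleanSupplier probe, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (probe.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up
                    return false;
                }
            }
        }
        return false;
    }
}
```

For instance, `ReadinessSketch.waitUntil(() -> ReadinessSketch.isUp("http://localhost:8082/swagger-ui.html"), 60, 5000)` would retry every five seconds for roughly five minutes, which covers the start-up window mentioned above.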

Did you find an issue?

If you find an issue or have any questions about the system, just ask by submitting an issue.

Versions included

Terminology | Version | Internal code
ICD-9       |         | icd9
ICD-10      |         | icd10
SNOMED CT   |         | snomed
