anusharanganathan / canary

This project forked from contentmine/canary

Canary is a UI to the contentmine tools getpapers, quickscrape, norma, and ami.

License: MIT License

CSS 1.38% HTML 32.10% JavaScript 66.53%

Canary

version 0.0.1

Canary is a controller for and user interface to other ContentMine tools - quickscrape, getpapers, norma, and AMI.

It is a node.js meteor app that uses mongodb for backend storage. It can also send extracted facts to an elasticsearch index, either one installed locally or a remote one.

NOTE: this code remains at an early stage, and is not yet well structured (I was learning whilst building it). The next commit is likely to see quite a re-structuring and separation of configs etc now that I know node and meteor better.

Install

First install all the other tools (quickscrape, getpapers, norma, and AMI); see their repos for how to do so.

By installing quickscrape and getpapers you will have ensured you already have node installed.

Install meteor (https://www.meteor.com/install):

curl https://install.meteor.com/ | sh

Get the codebase:

git clone http://github.com/contentmine/canary

Run it:

cd canary

meteor

If you want to have your own index running, install elasticsearch too (https://www.elastic.co/).

Configure

NOTE: this is all early stage, and not ideally set up for configuration...

At the top of the canary.js file there are various options that can be set. It is best to check directly there to ensure you are seeing the most up to date possibilities. But here is a rough overview:

First there are some dir settings, to tell canary where to find various bits and pieces such as the scrapers for quickscrape, and where to put the output files.

Then there are some url settings, to tell canary how to show links through to the storage from the UI (so this could be localhost if you are running the system locally), and some URLs for where it should try to send facts and article metadata for elasticsearch indexing.

Next are the settings for what to run: runcron for running a daily extraction, runlocal for whether or not to run locally, and sendremote for whether or not to send extracted facts to a remote index of facts (by default our contentmine one). These are followed by some augmentations to the settings that apply if runlocal is true, and a couple of remote URLs for sending facts and metadata remotely (again, to our contentmine server).

Finally there is a setting for which processes should be available for running - this will depend on what AMI can support, and which ones you want to run. Put the names of the AMI processes you want to run into the availableProcesses setting.
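
As a rough illustration of the overview above, the settings might be laid out along the following lines. This is only a sketch, not the actual contents of canary.js: runcron, runlocal, sendremote and availableProcesses are the only names taken from this README, while every other key, path, URL, and process name, and the applyLocalOverrides helper, is a hypothetical placeholder. Check canary.js itself for the real options.

```javascript
// Hypothetical sketch of the kinds of settings described above.
// Only runcron, runlocal, sendremote and availableProcesses are named
// in this README; all other keys and values are placeholders.
var settings = {
  // dir settings: where to find scrapers and where to write output (assumed names)
  scraperdir: '/path/to/quickscrape/scrapers',
  storedir: '/path/to/canary/output',

  // url settings: links to storage from the UI, and indexing targets (assumed names)
  storeurl: 'http://example.org/store',
  indexurl: 'http://example.org/index',

  // what to run
  runcron: false,    // run a daily extraction
  runlocal: true,    // run locally or not
  sendremote: false, // send extracted facts to a remote index

  // AMI processes to offer in the UI (placeholder names; use what your AMI supports)
  availableProcesses: ['species', 'gene']
};

// Assumed form of the "augmentations if runlocal is true": point the
// storage and index URLs at localhost. Returns a copy, leaving the input as-is.
function applyLocalOverrides(base) {
  var out = Object.assign({}, base);
  if (out.runlocal) {
    out.storeurl = 'http://localhost:3000/store';
    out.indexurl = 'http://localhost:9200';
  }
  return out;
}

var resolved = applyLocalOverrides(settings);
console.log(resolved.storeurl);
```

Keeping the local overrides in a separate step means the base settings stay readable, and switching runlocal off leaves the URLs untouched.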
