argus's People

Contributors

pasky, silvicek, tealmill

argus's Issues

Panoptes backend for sports domain

This has a lot of open questions, e.g. how to precisely specify the match / race.

Sources:

  • Investigate Yahoo Fantasy - bad, see wiki
  • Investigate scraping NBCSports - against ToS
  • Investigate Wolfram Alpha
  • Investigate http://www.cricapi.com/ for Cricket, specifically

Baseline choice (NBA NFL NHL): https://www.stattleship.com/

Domains:

  • soccer
  • tennis
  • cricket
  • basketball
  • baseball
  • football
  • golf
  • boxing
  • nascar
  • horse racing

Panoptes backends for finance domain

  • Stock market (using Yahoo API - NYSE, NASDAQ, LSE and more covered)
  • Commodities (oil, gold, silver, copper and more on the U.S. commodities market and exchange, see e.g. CNNMoney)
  • Currency exchange rates
  • E-currency prices

Panoptes backend for weather

Lower priority.

  • Investigate Wolfram Alpha
  • Investigate forecast.io
  • Investigate Yahoo Weather API
  • Look for more?

Data types:

  • Temperature

Date entry in web interface

For evaluation, we have generated event dates for the questions in our set (well, at least some of them; XXX). Whenever a date is available, we use it to limit the search to the 14-day period after that date, which improves result relevance.

It'd be nice to have an optional date entry in the web interface too; our original motivation is that this data is available in Augur anyway.
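The 14-day restriction described above could be sketched roughly as follows. This is only an illustration: the function name `in_window`, the inclusive boundaries, and the choice to skip filtering when no date is given are all assumptions, not the project's actual code.

```python
from datetime import date, timedelta


def in_window(published, event_date, days=14):
    """Return True if `published` lies in the `days`-day period after `event_date`.

    Hypothetical helper: boundaries are treated as inclusive (an assumption).
    """
    if event_date is None:
        # The date entry is optional, so without a date nothing is filtered out.
        return True
    return event_date <= published <= event_date + timedelta(days=days)


# Made-up example: an event on 2015-02-01 and two article dates.
event = date(2015, 2, 1)
print(in_window(date(2015, 2, 3), event))
print(in_window(date(2015, 3, 1), event))
```

Treating a missing date as "no restriction" keeps the web interface field optional, matching the wish stated above.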

Clean up train/test splits

We should do a train/test split on input data before processing anything, not just when reporting the results, to ensure that proper data hygiene is kept.

(At a later time, we should also further split the train set into trainmodel and val, train sub-classifiers such as the sentiment one on trainmodel, and measure their performance on val rather than test, so that we don't overfit through parameter tuning. However, we have too little data to afford that at this point, so it's just something to bear in mind for now.)
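One way to fix the split before any processing is to derive it from the question text itself, so that a question always lands in the same set regardless of processing order. The sketch below is an assumption about how this might be done, not the project's method; `assign_split` and the 20% test fraction are hypothetical.

```python
import hashlib


def assign_split(question, test_fraction=0.2):
    """Deterministically map a question string to "train" or "test".

    Hypothetical helper: hashes the text and compares the result, scaled
    to [0, 1), against `test_fraction`.
    """
    digest = hashlib.sha256(question.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return "test" if bucket < test_fraction else "train"


question = "Will the New England Patriots defeat the Seattle Seahawks in Super Bowl XLIX?"
print(assign_split(question))
```

Because the assignment depends only on the text, rerunning the pipeline or adding new questions never moves an existing question across the split. The later trainmodel/val split could reuse the same idea with a second threshold.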

Training / web inconsistency

There seems to be some inconsistency between the web interface and the training process:

  • In tests/ftrain.tsv, "Will the New England Patriots defeat the Seattle Seahawks in Super Bowl XLIX?" is marked as YES 0.521451771259, with "as it happened" as the most relevant article.
  • In the web interface right now, with the same model, it's 0.08 for NO, and ABC comes first by relevance, with the "as it happened" story second.

Odd commodity output for Aluminium

When using the benchtest script, the output for the text question regarding aluminium is odd: despite a wide date range being selected, the min and max are the same value and both refer to the same day. Is only a single data point being used to describe the entire range?

  • Implement proper search for the minimal/maximal day in the range or define special behaviour.
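A proper search for the minimal and maximal day could look roughly like the sketch below. It assumes prices are available as a mapping from day to value; the function name `price_extremes`, the data layout, and returning `None` for an empty range (one possible "special behaviour") are all hypothetical, and the sample values are invented.

```python
from datetime import date


def price_extremes(prices, start, end):
    """Return ((low_day, low), (high_day, high)) for days in [start, end].

    Hypothetical helper: `prices` maps datetime.date to a float.
    Returns None when no data point falls inside the range.
    """
    in_range = [(day, value) for day, value in prices.items() if start <= day <= end]
    if not in_range:
        return None
    low = min(in_range, key=lambda pair: pair[1])
    high = max(in_range, key=lambda pair: pair[1])
    return low, high


# Invented sample data, for illustration only.
sample = {
    date(2015, 1, 2): 1790.0,
    date(2015, 1, 5): 1812.5,
    date(2015, 1, 6): 1775.0,
    date(2015, 2, 2): 1900.0,
}
print(price_extremes(sample, date(2015, 1, 1), date(2015, 1, 31)))
```

With more than one data point in the range, the minimum and maximum generally fall on different days, which is the behaviour the benchtest output currently lacks.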
