
kryptooracle's Introduction

KryptoOracle: A Real-Time Cryptocurrency Price Prediction Platform Using Twitter Sentiments

The source code repository for the IEEE Big Data 2019 workshop publication: https://arxiv.org/abs/2003.04967.

KryptoOracle is a novel real-time and adaptive cryptocurrency price prediction platform based on Twitter sentiments. The integrative and modular platform is based on (i) a Spark-based architecture that handles the large volume of incoming data in a persistent and fault-tolerant way; (ii) a sentiment analysis approach that can respond to large numbers of natural language processing queries in real time; and (iii) a predictive method grounded in online learning, in which a model adapts its weights to cope with new prices and sentiments.

The Jupyter notebooks (1-4) must be executed in order. They obtain historical Twitter data, preprocess it to clean it, and run sentiment analysis with VADER to obtain a compound score. They also obtain the cryptocurrency price data. The two datasets are written to separate CSV files.
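The cleaning step that precedes sentiment scoring might look roughly like the sketch below. It is not taken from the notebooks: the function name `clean_tweet` and the specific rules (dropping URLs and @mentions, keeping hashtag words) are assumptions chosen for illustration, and the notebooks may clean the text differently.

```python
import re

# Hypothetical cleaning rules; the notebooks' actual rules may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Strip URLs and @mentions, drop the '#' from hashtags, collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")
    return WHITESPACE_RE.sub(" ", text).strip()
```

For example, `clean_tweet("@alice Bitcoin to the moon! #BTC https://t.co/abc123")` returns `"Bitcoin to the moon! BTC"`, which could then be passed to a sentiment scorer.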

Next, the first half of notebook 5 must be executed, up to "Run Twitter Stream Now". This loads the data, processes it to obtain features, sets up the Spark context, and loads the processed data into Spark. It also bootstraps the ML model with the processed data, so that the model is ready to make predictions.

Next, notebook 6 must be executed. This launches the Twitter streamer, which fetches real-time tweets and computes their sentiment scores.

Lastly, the second half of notebook 5 must be executed. This launches the prediction engine, which makes a price prediction based on the last minute's Twitter sentiment scores and then retrains the model once the actual price arrives a minute later.
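The predict-then-retrain cycle described above can be sketched in a few lines. The sketch is only an illustration of the online-learning pattern: it uses a plain linear model updated by stochastic gradient descent as a stand-in, which is not claimed to be the model used in the notebooks, and the names `OnlineLinearModel` and `run_loop` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Sequence, Tuple


@dataclass
class OnlineLinearModel:
    """Stand-in model: linear regression trained one sample at a time (SGD)."""
    n_features: int
    learning_rate: float = 0.01
    weights: List[float] = field(default_factory=list)
    bias: float = 0.0

    def __post_init__(self) -> None:
        if not self.weights:
            self.weights = [0.0] * self.n_features

    def predict(self, x: Sequence[float]) -> float:
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x: Sequence[float], y_true: float) -> None:
        # One gradient step on the squared error for this single sample.
        error = self.predict(x) - y_true
        self.weights = [
            w - self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error


def run_loop(
    model: OnlineLinearModel,
    stream: Iterable[Tuple[Sequence[float], float]],
) -> List[float]:
    """Predict from the current features, then learn once the actual value is known.

    `stream` yields (features, actual_price) pairs; in a live setting the
    actual price would only become available a minute after the prediction.
    Returns the absolute error of each prediction.
    """
    abs_errors = []
    for features, actual in stream:
        predicted = model.predict(features)    # made before the price is known
        abs_errors.append(abs(predicted - actual))
        model.update(features, actual)         # adapt weights to the new price
    return abs_errors
```

In this pattern the model is never evaluated on data it has already been trained on: each prediction is made first and the weights are adjusted only afterwards.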

The 'validation_script.py' script was run for a few weeks to accumulate the Twitter data used for bootstrapping the model.

We acknowledge the GitHub repo https://github.com/Drabble/TwitterSentimentAndCryptocurrencies, which provided starter code that we extended to create this project.

Citation

If you find this work useful, please cite:

@INPROCEEDINGS{9006554,
  author={S. {Mohapatra} and N. {Ahmed} and P. {Alencar}},
  booktitle={2019 IEEE International Conference on Big Data (Big Data)}, 
  title={KryptoOracle: A Real-Time Cryptocurrency Price Prediction Platform Using Twitter Sentiments}, 
  year={2019},
  volume={},
  number={},
  pages={5544-5551},
  doi={10.1109/BigData47090.2019.9006554}}

kryptooracle's People

Contributors

mshubhankar, nomiizz


kryptooracle's Issues

#4 "not enough values to unpack (expected 2, got 0)"

Hi! I'm looking to fix this error in part 4: "not enough values to unpack (expected 2, got 0)"

The code is: stream_id_crypto, stream_id_tweets = tls.get_credentials_file()['stream_ids'][:2]

Do you know how I can go about fixing this? Thank you!
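The message itself says that the right-hand side was an empty sequence: the 'stream_ids' entry in the credentials file contained no items, so there was nothing to unpack into the two names. Assuming `tls` refers to `plotly.tools` from an older Plotly release (an assumption, since the import is not shown here), the stream ids would have to be stored in the credentials file before this line runs. The sketch below reproduces the error without Plotly and shows a guard with a clearer message; `first_two_stream_ids` is a hypothetical helper.

```python
from typing import Sequence, Tuple


def first_two_stream_ids(stream_ids: Sequence[str]) -> Tuple[str, str]:
    """Return the first two stream ids, failing with an explicit message otherwise."""
    if len(stream_ids) < 2:
        raise ValueError(
            "expected at least 2 stream ids in the credentials file, "
            f"found {len(stream_ids)}"
        )
    first, second = stream_ids[:2]
    return first, second


# Reproduce the original error: slicing an empty list still yields an empty list.
try:
    stream_id_crypto, stream_id_tweets = [][:2]
except ValueError as exc:
    print(exc)  # not enough values to unpack (expected 2, got 0)
```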

bitcoin_currency_grouped.csv

Greetings, I'm writing to ask how the bitcoin_currency_grouped.csv file is built and what it contains. I would appreciate a quick response.

Speed up of step 1

Hi, just a note that I sped up the preprocessing part of step 1 by approximately 3000 times (from 14it/s to about 45000) by changing "d.loc[i, 'Text'] = text" to "d.at[i, 'Text'] = text". Thought it might help you.
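A minimal sketch of the change, assuming a DataFrame `d` with a 'Text' column as in the snippet quoted above (the sample rows and the cleaning applied here are made up). `DataFrame.at` is pandas' accessor for reading or writing a single scalar by label, so it avoids the more general indexing machinery that `.loc` runs on every call.

```python
import pandas as pd

# Made-up sample data standing in for the tweet table.
d = pd.DataFrame({"Text": ["  Hello ", " WORLD  "]})

for i in d.index:
    text = d.at[i, "Text"].strip().lower()  # scalar read
    d.at[i, "Text"] = text                  # scalar write, instead of d.loc[i, 'Text'] = text

print(d["Text"].tolist())  # ['hello', 'world']
```

When the per-row work can be expressed with pandas string methods (for example `d["Text"].str.strip().str.lower()`), dropping the Python loop altogether is usually faster still.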

[BUG] Outdated Access Token

The Twitter API access tokens provided in this repository are outdated and the code doesn't work. Is there a way to provide an updated version?

cannot run 02_CleanedTweetsIntoMultipleFiles

I cannot run the getvar method. Nothing in the code creates a var.csv file, and the structure of var.csv is not documented. Please provide the var.csv file or a way to create it.
The error message:
Traceback (most recent call last):
  File "D:/code/KryptoOracle-master/demo/cleanedtwttesintomulfiles.py", line 84, in <module>
    add_new_crypto(CURRENCY_SYMBOL)
  File "D:/code/KryptoOracle-master/demo/cleanedtwttesintomulfiles.py", line 67, in add_new_crypto
    df_var = pd.read_csv("data/twitter/var.csv", sep=',', dtype={'LINE_COUNT': np.int32})
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 452, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 936, in __init__
    self._make_engine(self.engine)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 1168, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 1998, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas\_libs\parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: 'data/twitter/var.csv'
When I create a blank var.csv file, the error message is:
Traceback (most recent call last):
  File "D:/code/KryptoOracle-master/demo/cleanedtwttesintomulfiles.py", line 84, in <module>
    add_new_crypto(CURRENCY_SYMBOL)
  File "D:/code/KryptoOracle-master/demo/cleanedtwttesintomulfiles.py", line 67, in add_new_crypto
    df_var = pd.read_csv("data/twitter/var.csv", sep=',', dtype={'LINE_COUNT': np.int32})
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 452, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 936, in __init__
    self._make_engine(self.engine)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 1168, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 1998, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 540, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
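The second traceback arises because a zero-byte file has no header row, so `pd.read_csv` finds no columns to parse. A file that contains at least a header line is read without error. The sketch below is only a guess at a workaround: the only column name that appears in the traceback is 'LINE_COUNT', so the header written here contains just that column, and the script may well expect others. `read_var_file` is a hypothetical helper, not part of the repository.

```python
import os

import pandas as pd


def read_var_file(path: str) -> pd.DataFrame:
    """Read the bookkeeping CSV, first creating a header-only file if it is missing or empty.

    Assumption: 'LINE_COUNT' is the only known column (taken from the dtype
    argument in the traceback); the real file may need additional columns.
    """
    if not os.path.exists(path) or os.path.getsize(path) == 0:
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        pd.DataFrame(columns=["LINE_COUNT"]).to_csv(path, index=False)
    return pd.read_csv(path, sep=",", dtype={"LINE_COUNT": "int32"})
```

Calling `read_var_file("data/twitter/var.csv")` on a missing or blank file then returns an empty DataFrame with a 'LINE_COUNT' column instead of raising.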

stream_crypto.js file missing?

Hi,

Thanks a lot for this amazing work!

I have a small request: by the end of notebook 4 you reference a file named 'streamer/stream_crypto.js' that is not present in the repository and is needed to run the script below.
Could you also upload this script?

Kind regards,

Steven

API Key and Secret

Is this still live? : APP_KEY = 'mPQKoRwd2Pb9qpQyQmyG5s8KR'
APP_SECRET = 'HLvIhusvfzDLKaRXY8CnZGP143kp3E3f2KqQBIEMfVL5mOxZjq'
OAUTH_TOKEN = "3459248236-0XPtHldG3ou6BfpTwaKWnOL2ywFk2niQekLwE7K"
OAUTH_TOKEN_SECRET = "08Vy2wuOkp7AmuC3rbjCHFJ94MLG2sWqdvGQtoiXmkVKr"

"training_data table not found" Part 5

Hi! Thank you guys for the work you've done, if you can help me with some issues I would really appreciate it.

I'm having issues with this section:
[image]
Where are the tables created?
