nomiizz / kryptooracle

A Real-Time Cryptocurrency Price Prediction Platform Using Twitter Sentiments

Jupyter Notebook 99.53% Python 0.38% JavaScript 0.09%

kryptooracle's Introduction

KryptoOracle: A Real-Time Cryptocurrency Price Prediction Platform Using Twitter Sentiments

The source code repository for the IEEE Big Data 2019 workshop publication: https://arxiv.org/abs/2003.04967.

KryptoOracle is a novel real-time and adaptive cryptocurrency price prediction platform based on Twitter sentiments. The integrative and modular platform is based on (i) a Spark-based architecture which handles the large volume of incoming data in a persistent and fault-tolerant way; (ii) an approach that supports sentiment analysis and can respond to large amounts of natural language processing queries in real time; and (iii) a predictive method grounded in online learning, in which a model adapts its weights to cope with new prices and sentiments.

The Jupyter notebooks (1-4) must be executed in order. They obtain historical Twitter data, preprocess it to clean it, and run sentiment analysis with VADER to obtain a compound score. They also obtain the cryptocurrency price data. The two datasets are written to separate CSV files.
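For orientation, VADER's compound score is a sum of word valence scores squashed into the range [-1, 1]. The sketch below reproduces only that normalization step in plain Python, assuming VADER's default normalization constant of 15. The function name `compound_from_valence_sum` and the sample sums are illustrative; the notebooks call the VADER library itself rather than this code.

```python
import math


def compound_from_valence_sum(valence_sum: float, alpha: float = 15.0) -> float:
    """Squash a raw valence sum into [-1, 1], mirroring VADER's normalization."""
    score = valence_sum / math.sqrt(valence_sum * valence_sum + alpha)
    # Clamp defensively so rounding can never push the score out of range.
    return max(-1.0, min(1.0, score))


# Illustrative raw sums: strongly negative, neutral, mildly and strongly positive.
for raw in (-4.0, 0.0, 1.5, 6.0):
    print(raw, round(compound_from_valence_sum(raw), 4))
```

Larger absolute sums approach -1 or 1 without reaching them, which is why compound scores from very different tweets remain comparable.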

Next, the first half of notebook 5 must be executed, up to Run Twitter Stream Now. This loads the data, processes it to obtain features, sets up the Spark context, and loads the processed data into Spark. It also bootstraps the ML model with the processed data, so that the model is ready to make predictions.

Next, notebook 6 must be executed. This launches the Twitter streamer, which fetches real-time tweets and computes their sentiment scores.

Lastly, the second half of notebook 5 must be executed. This launches the prediction engine, which makes a price prediction based on the last minute's Twitter scores and then retrains the model once the actual price arrives a minute later.
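The predict-then-retrain cycle described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model used in the notebooks: `OnlineLinearModel` is a hypothetical stand-in (a linear regressor updated by stochastic gradient descent), and the stream is synthetic data in which the target depends linearly on a single sentiment score.

```python
import random


class OnlineLinearModel:
    """Linear regressor trained one sample at a time with SGD (a stand-in model)."""

    def __init__(self, n_features: int, lr: float = 0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def update(self, x, y):
        """Take one gradient step on squared error; return the pre-update error."""
        err = self.predict(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err
        return err


def run_stream(model, stream):
    """Predict for each minute, then learn once the true value is known."""
    abs_errors = []
    for features, actual in stream:
        predicted = model.predict(features)  # made before the price is known
        abs_errors.append(abs(predicted - actual))
        model.update(features, actual)       # retrain when the price arrives
    return abs_errors


# Synthetic stream (an assumption): the target is a linear function of sentiment.
rng = random.Random(0)
stream = []
for _ in range(500):
    sentiment = rng.uniform(-1, 1)
    stream.append(([sentiment], 2.0 * sentiment + 0.5))

errors = run_stream(OnlineLinearModel(n_features=1), stream)
print("mean abs error, first 50 steps:", sum(errors[:50]) / 50)
print("mean abs error, last 50 steps:", sum(errors[-50:]) / 50)
```

Because each prediction is recorded before the corresponding update, the error list measures genuine one-step-ahead performance rather than fit to data the model has already seen.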

The 'validation_script.py' script was run for a few weeks to accumulate the Twitter data used to bootstrap the model.

We acknowledge the GitHub repo https://github.com/Drabble/TwitterSentimentAndCryptocurrencies, which provided starter code that we extended to create this project.

Citation

If you find this work useful, please cite:

@INPROCEEDINGS{9006554,
  author={S. {Mohapatra} and N. {Ahmed} and P. {Alencar}},
  booktitle={2019 IEEE International Conference on Big Data (Big Data)}, 
  title={KryptoOracle: A Real-Time Cryptocurrency Price Prediction Platform Using Twitter Sentiments}, 
  year={2019},
  volume={},
  number={},
  pages={5544-5551},
  doi={10.1109/BigData47090.2019.9006554}}

kryptooracle's Issues

#4 "not enough values to unpack (expected 2, got 0)"

Hi! I'm looking to fix this error in part 4: "not enough values to unpack (expected 2, got 0)"

The code is: stream_id_crypto, stream_id_tweets = tls.get_credentials_file()['stream_ids'][:2]

Do you know how I can go about fixing this? Thank you!
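For context, this message is what Python raises when a two-name unpacking receives an empty sequence, which suggests that the `stream_ids` list in the credentials file is empty. A minimal sketch follows, with a plain dict standing in for the credentials; the `first_two_stream_ids` helper and the dict are hypothetical and not part of plotly.

```python
def first_two_stream_ids(credentials: dict) -> tuple:
    """Return the first two stream IDs, or raise a descriptive error."""
    stream_ids = credentials.get("stream_ids", [])
    if len(stream_ids) < 2:
        raise ValueError(
            f"expected at least 2 stream IDs, found {len(stream_ids)}"
        )
    return stream_ids[0], stream_ids[1]


# Reproduce the original failure: slicing an empty list yields an empty list,
# so there is nothing to assign to the two names on the left-hand side.
empty = {"stream_ids": []}
try:
    stream_id_crypto, stream_id_tweets = empty["stream_ids"][:2]
except ValueError as exc:
    print(exc)  # not enough values to unpack (expected 2, got 0)
```

Checking the list length first turns the cryptic unpacking error into a message that points at the missing stream IDs.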

Couldn't find file "streamdata.txt" in code 06_Realtime_Time_Series.

Hi, I couldn't find the file "streamdata.txt" referenced in "06_Realtime_Time_Series". What should I do?
I also had some problems running stream.open() in the last part of "04_TwitterSentimentAndCryptocurrencies".

plotly no longer supports live streaming services.

Hi.
We have a problem using plotly real-time streaming service.
https://plotly.com/python/v3/streaming-tutorial/

Can I get some advice about it?

bitcoin_currency_grouped.csv

Greetings, I'm writing to ask how to build the bitcoin_currency_grouped.csv file, or what it is made of. I would appreciate a fast response.

Speed up of step 1

Hi, just a note that I sped up the preprocessing part of step 1 by approximately 3000 times (from 14it/s to about 45000) by changing "d.loc[i, 'Text'] = text" to "d.at[i, 'Text'] = text". Thought it might help you.
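A small sketch of the change described above, assuming pandas is installed. The sample rows and the whitespace cleanup are made up for illustration; only the 'Text' column name and the switch from `.loc` to `.at` come from the note.

```python
import pandas as pd

d = pd.DataFrame({"Text": ["  BTC to the moon ", "ETH  dips  again", " hodl "]})

# Scalar reads and writes in a loop: .at is a fast path for single cells,
# whereas .loc goes through the general label-based indexing machinery.
for i in d.index:
    text = " ".join(d.at[i, "Text"].split())  # collapse runs of whitespace
    d.at[i, "Text"] = text

# Where a step can be expressed column-wise, a vectorized call avoids the loop.
d["Length"] = d["Text"].str.len()
print(d)
```

The `.at` accessor only handles one row label and one column label at a time, which is exactly the access pattern in a row-by-row cleaning loop.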

[BUG] Outdated Access Token

The Twitter API access tokens provided in this repository are outdated and the code doesn't work. Is there a way to provide an updated version?

Twitter var file missing

Step 2, twitter var file is missing. I took the one from here and renamed it: https://github.com/Drabble/TwitterSentimentAndCryptocurrencies/tree/master/data/twitter

cannot run 02_CleanedTweetsIntoMultipleFiles

I cannot run the getvar method: the code provides no way to create a var.csv file, and the structure of var.csv is not documented. Please provide the var.csv file or a way to create it.
The error message:
Traceback (most recent call last):
  File "D:/code/KryptoOracle-master/demo/cleanedtwttesintomulfiles.py", line 84, in <module>
    add_new_crypto(CURRENCY_SYMBOL)
  File "D:/code/KryptoOracle-master/demo/cleanedtwttesintomulfiles.py", line 67, in add_new_crypto
    df_var = pd.read_csv("data/twitter/var.csv", sep=',', dtype={'LINE_COUNT': np.int32})
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 452, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 936, in __init__
    self._make_engine(self.engine)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 1168, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 1998, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas\_libs\parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: 'data/twitter/var.csv'
When I create a blank var.csv file, the error message is:
Traceback (most recent call last):
  File "D:/code/KryptoOracle-master/demo/cleanedtwttesintomulfiles.py", line 84, in <module>
    add_new_crypto(CURRENCY_SYMBOL)
  File "D:/code/KryptoOracle-master/demo/cleanedtwttesintomulfiles.py", line 67, in add_new_crypto
    df_var = pd.read_csv("data/twitter/var.csv", sep=',', dtype={'LINE_COUNT': np.int32})
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 686, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 452, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 936, in __init__
    self._make_engine(self.engine)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 1168, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "D:\anaconda\envs\twitter\lib\site-packages\pandas\io\parsers.py", line 1998, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 540, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file

stream_crypto.js file missing?

Hi,

Thanks a lot for this amazing work!

I have a small request: by the end of notebook 4 you reference a file named 'streamer/stream_crypto.js' that is not present in the repository and is needed to run the script below.
Could you also upload this script?

Kind regards,

Steven

API Key and Secret

Is this still live? : APP_KEY = 'mPQKoRwd2Pb9qpQyQmyG5s8KR'
APP_SECRET = 'HLvIhusvfzDLKaRXY8CnZGP143kp3E3f2KqQBIEMfVL5mOxZjq'
OAUTH_TOKEN = "3459248236-0XPtHldG3ou6BfpTwaKWnOL2ywFk2niQekLwE7K"
OAUTH_TOKEN_SECRET = "08Vy2wuOkp7AmuC3rbjCHFJ94MLG2sWqdvGQtoiXmkVKr"

"training_data table not found" Part 5

Hi! Thank you guys for the work you've done. If you can help me with some issues, I would really appreciate it.

I'm having issues with this section:

Where are the tables created?
