guptarohit / cryptocmd

Cryptocurrency historical price data library in Python. Data from https://coinmarketcap.com.

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
cryptocurrency scraper historical-data historical-cryptocurrency-prices coinmarketcap dataset python utility coinmarketcap-api hacktoberfest2023

cryptocmd's People

Contributors

davjack, dependabot[bot], fossabot, giocaizzi, guptarohit, planeer, ttozatto

cryptocmd's Issues

non-unique symbols issue

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

When using "get_coin_id(coin_code)" from utils with the symbol CTK:

https://web-api.coinmarketcap.com/v1/cryptocurrency/map?symbol=CTK

The API returns JSON containing 2 coins, since the CTK symbol is used for two different coins.
The function returns json_data["data"][0]["slug"], so only the first coin is processed. This should be only a minor problem, since in most cases the coin symbol is unique.

Api output below.
Kind regards
Vincent


{"status":{"timestamp":"2022-05-11T18:17:38.133Z","error_code":0,"error_message":null,"elapsed":13,"credit_count":0,"notice":null},

"data":[
{"id":4807,"name":"CertiK","symbol":"CTK","slug":"certik","rank":370,"is_active":1,"first_historical_data":"2020-10-21T08:40:00.000Z","last_historical_data":"2022-05-11T18:05:00.000Z","platform":null},
{"id":4596,"name":"Cryptyk Token","symbol":"CTK","slug":"cryptyk-token","rank":null,"is_active":0,"platform":{"id":1,"name":"Ethereum","symbol":"CTK","slug":"cryptyk-token","token_address":"0x42a501903afaa1086b5975773375c80e363f4063"}}]}
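
One possible way to handle the duplicate, sketched under the assumption that the caller either knows the coin name or prefers the listing that is still active: filter the "data" array from the output above instead of taking index 0. pick_slug is a hypothetical helper, not part of cryptocmd, and the sample payload is trimmed from the API output above.

```python
import json

# Trimmed copy of the API output shown above (only the fields used here).
SAMPLE = """
{"data": [
  {"id": 4807, "name": "CertiK", "symbol": "CTK", "slug": "certik", "is_active": 1},
  {"id": 4596, "name": "Cryptyk Token", "symbol": "CTK", "slug": "cryptyk-token", "is_active": 0}
]}
"""


def pick_slug(payload, name=None):
    """Hypothetical helper: choose one slug when a symbol maps to several coins."""
    coins = payload["data"]
    if name is not None:
        # An explicit coin name removes the ambiguity entirely.
        coins = [c for c in coins if c["name"].lower() == name.lower()]
    else:
        # Otherwise prefer active listings, falling back to the full list.
        coins = [c for c in coins if c.get("is_active")] or coins
    if not coins:
        raise LookupError("no coin matches the request")
    return coins[0]["slug"]


payload = json.loads(SAMPLE)
print(pick_slug(payload))                        # certik
print(pick_slug(payload, name="Cryptyk Token"))  # cryptyk-token
```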

Open, Close, High, Low times

I see that your package has Open, Close, High, Low and Volume values for cryptocurrencies. Since trading for cryptos is always possible, what times do the Open and Close values correspond to?

ValueError: time data does not match format

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

When I run your example script, I get this message

Traceback (most recent call last):
  File "test.py", line 12, in <module>
    scraper.export("csv", name="btc_all_time")
  File "/home/pi/.local/lib/python2.7/site-packages/cryptocmd/core.py", line 207, in export
    data = self.get_data(format, **kwargs)
  File "/home/pi/.local/lib/python2.7/site-packages/cryptocmd/core.py", line 105, in get_data
    self._download_data(**kwargs)
  File "/home/pi/.local/lib/python2.7/site-packages/cryptocmd/core.py", line 84, in _download_data
    self.end_date, self.start_date, self.headers, self.rows = extract_data(table)
  File "/home/pi/.local/lib/python2.7/site-packages/cryptocmd/utils.py", line 150, in extract_data
    row[0] = datetime.datetime.strptime(row[0], "%b %d %Y").strftime("%d-%m-%Y")
  File "/usr/lib/python2.7/_strptime.py", line 332, in _strptime
    (data_string, format))
ValueError: time data '3981.90\nUSD' does not match format '%b %d %Y'

Command line interface

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

Create a basic command-line utility, e.g. one that can create the CSV file of historical data right from the terminal. Something like: cryptocmd -c BTC -f 28-4-2017 -t 31-12-2017
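
A minimal sketch of how such a command could be parsed, assuming argparse and the -c / -f / -t flags from the example command above. The long option names, parse_date, build_parser and main are hypothetical; only the CmcScraper constructor and export("csv", name=...) calls are taken from usage shown elsewhere in these issues.

```python
import argparse
from datetime import datetime

DATE_FORMAT = "%d-%m-%Y"  # day-month-year, as in the example command


def parse_date(text):
    """Validate a date argument and return it unchanged if it parses."""
    try:
        datetime.strptime(text, DATE_FORMAT)
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected DD-MM-YYYY, got {text!r}")
    return text


def build_parser():
    parser = argparse.ArgumentParser(prog="cryptocmd")
    parser.add_argument("-c", "--coin", required=True, help="coin code, e.g. BTC")
    parser.add_argument("-f", "--from", dest="start_date", type=parse_date)
    parser.add_argument("-t", "--to", dest="end_date", type=parse_date)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Imported here so the parser can be exercised without the package installed.
    from cryptocmd import CmcScraper

    if args.start_date and args.end_date:
        scraper = CmcScraper(args.coin, args.start_date, args.end_date)
    else:
        scraper = CmcScraper(args.coin)
    scraper.export("csv", name=f"{args.coin.lower()}_data")
```

Calling main(["-c", "BTC", "-f", "28-4-2017", "-t", "31-12-2017"]) would then mirror the command above; wiring it up as a console entry point is left out of this sketch.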

How to get price information of coins with same symbol

@guptarohit

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

There are many coins with the same ticker symbol. For example, UNUS SED LEO and LEOcoin both use the ticker symbol LEO; Cosmos and Atomic Coin both use ATOM; CyberMiles and Comet both use CMT. This situation is common.

So I am wondering how to get the price information of these coins that have the same symbol?

Getting hourly data

Hello,
Would it be possible to get hourly data by specifying the date and time?

Best regards

Error: coin code is unavailable on coinmarketcap.com

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

I am trying to fetch the historical price information of over four thousand tokens on coinmarketcap.com. The script works well to extract the two hundred highest-ranked tokens on coinmarketcap.com. But for all the remaining tokens, such as MTXLT and CCXX, it only returns the error "[Coin Name] coin code is unavailable on coinmarketcap.com".
I also noticed the reason: the page https://coinmarketcap.com/all/views/all/, which is used to translate the coin code to a coin name, no longer displays all currencies and only displays the first 200 coins (#34).

Multiples coins and specific date

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

I want to pass multiple coins to the scraper (e.g. BTC,XRP), but that does not seem possible in one go, or I have a syntax issue. I would also like the ability to scrape the history for 2 specific dates instead of the full interval, but I am not sure that is possible.
Thanks

coin code is unavailable on coinmarketcap.com

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

I am trying to fetch the historical price information of over four thousand tokens on coinmarketcap.com. The script works well to extract the two hundred highest-ranked tokens on coinmarketcap.com. But for all the remaining tokens, it only returns the error "[Coin Name] coin code is unavailable on coinmarketcap.com".

I also noticed the reason: the page https://coinmarketcap.com/all/views/all/, which is used to translate the coin code to a coin name, no longer displays all currencies and only displays the first 200 coins (#34).

Error in utils.py

Sorry, I should probably submit a commit rather than raise an issue, but I don't work with GitHub often and am short on time.

But I found that coin_id = coin_link.values()[0].lstrip("/currencies/")[:-1] in utils.py should be changed to coin_id = coin_link.values()[0].replace("/currencies/","")[:-1]

I discovered it while running into trouble trying to use your code for currency EOS.

I found that '/currencies/eos' in your code resulted in 'os' as the coin_id, while it should be 'eos'. I thought it was a bug in Python but found it wasn't > https://bugs.python.org/issue5318
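
The behaviour can be reproduced in isolation. This sketch assumes the scraped link has the form '/currencies/eos/' with a trailing slash, which is what the [:-1] in the quoted line suggests. str.lstrip treats its argument as a set of characters rather than a prefix, so the leading "e" of "eos" is stripped too.

```python
link = "/currencies/eos/"  # assumed shape of the scraped href

# lstrip removes any leading characters found in the given set, so the
# "e" of "eos" goes as well; only "o" stops the stripping.
buggy = link.lstrip("/currencies/")[:-1]

# replace removes the literal substring instead.
fixed = link.replace("/currencies/", "")[:-1]

print(buggy)  # os
print(fixed)  # eos
```

On Python 3.9 and later, str.removeprefix("/currencies/") would be another option that states the intent directly.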

List out of range error

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

When I try the following code:

scraper = CmcScraper("XRP", "15-10-2019", "25-10-2019")
df = scraper.get_dataframe()

This results in the following error:


IndexError Traceback (most recent call last)
in
2 scraper = CmcScraper("XRP", "15-10-2019", "25-10-2019")
3 # get dataframe for the data
----> 4 df = scraper.get_dataframe()

~/opt/anaconda3/lib/python3.7/site-packages/cryptocmd/core.py in get_dataframe(self, date_as_index, **kwargs)
138 )
139
--> 140 self._download_data(**kwargs)
141
142 dataframe = pd.DataFrame(data=self.rows, columns=self.headers)

~/opt/anaconda3/lib/python3.7/site-packages/cryptocmd/core.py in _download_data(self, **kwargs)
81 print(self.coin_code, self.start_date, self.end_date)
82
---> 83 table = download_coin_data(self.coin_code, self.start_date, self.end_date)
84
85 self.end_date, self.start_date, self.headers, self.rows = extract_data(table)

~/opt/anaconda3/lib/python3.7/site-packages/cryptocmd/utils.py in download_coin_data(coin_code, start_date, end_date)
71 end_date = yesterday.strftime("%d-%m-%Y")
72
---> 73 coin_id = get_coin_id(coin_code)
74
75 # Format the dates as required for the url.

~/opt/anaconda3/lib/python3.7/site-packages/cryptocmd/utils.py in get_coin_id(coin_code)
50 raise InvalidCoinCode("'{}' coin code is unavailable on coinmarketcap.com".format(coin_code))
51 except Exception as e:
---> 52 raise e
53
54

~/opt/anaconda3/lib/python3.7/site-packages/cryptocmd/utils.py in get_coin_id(coin_code)
44
45 for _row in raw_data("tr")[1:]:
---> 46 symbol = _row.cssselect("td.text-left.col-symbol")[0].text_content()
47 coin_id = _row.values()[0][3:]
48 if symbol == coin_code:

IndexError: list index out of range

Add unit test cases

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:
Add unit test cases

How to deal with Connection error

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

In the utils.py file, there is the function get_url_data(url). While extracting data, my application stops running when there is something wrong with the http connection. I get the error "Error message (get_url_data) :....". Everything is normal so far. But what I need is to continue, i.e. catching the exception in my source code and re-request data. Currently, that is not possible because of sys.exit(1). The get_url_data(url) function should have the flexibility of disabling this line of code. Also, what kind of exception is this exactly? It would be great to be more specific with it so that this exception can be raised and I can catch it in my application.
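
Until the library raises a dedicated exception, one workaround on the caller's side is to catch SystemExit, which is what sys.exit() raises. The sketch below is an assumption about how an application might wrap such calls; DownloadFailed, without_exit and flaky_download are hypothetical names, and flaky_download only stands in for a function that calls sys.exit(1).

```python
import sys


class DownloadFailed(Exception):
    """Hypothetical exception raised in place of a hard process exit."""


def without_exit(func, *args, **kwargs):
    """Call func and convert a sys.exit() inside it into DownloadFailed."""
    try:
        return func(*args, **kwargs)
    except SystemExit as exc:
        raise DownloadFailed(f"call exited with code {exc.code}") from exc


def flaky_download(url):
    # Stand-in for a function that reports an error and calls sys.exit(1).
    sys.exit(1)


try:
    without_exit(flaky_download, "https://example.invalid")
except DownloadFailed as err:
    print("recovered:", err)  # recovered: call exited with code 1
```

The application can then retry or skip the request in its own except block instead of terminating.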

How to get OHLC of Today?

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

How to get the OHLC data for the current unfinished day, as most of other APIs provide? Obviously, the Close (and HL) will be temporary from "live" data until the day ends.

Thanks!

Error downloading data

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:
When I try the code

from cryptocmd import CmcScraper
scraper = CmcScraper("XRP")
headers, data = scraper.get_data()

I get the error below. It used to work until a few days ago...

File "", line 1, in
File "/home/davide/.local/lib/python3.7/site-packages/cryptocmd/core.py", line 106, in get_data
self._download_data(**kwargs)
File "/home/davide/.local/lib/python3.7/site-packages/cryptocmd/core.py", line 83, in _download_data
table = download_coin_data(self.coin_code, self.start_date, self.end_date)
File "/home/davide/.local/lib/python3.7/site-packages/cryptocmd/utils.py", line 73, in download_coin_data
coin_id = get_coin_id(coin_code)
File "/home/davide/.local/lib/python3.7/site-packages/cryptocmd/utils.py", line 52, in get_coin_id
raise e
File "/home/davide/.local/lib/python3.7/site-packages/cryptocmd/utils.py", line 46, in get_coin_id
symbol = _row.cssselect("td.text-left.col-symbol")[0].text_content()
IndexError: list index out of range

br davide

1 hour or 4 hour data

How can we get data for specific time intervals, e.g. 30 minutes, 1 hour, 4 hours, or maybe 1 week?

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

Same Symbol for coins

Hey, is there a workaround for coins with the same symbol? For example:

Wonderland (TIME)
Chrono.tech (TIME)

Bug - pandas datetime parse

The call to pd.to_datetime() is missing the dayfirst=True parameter. Data that comes from CMC is in the format DD-MM-YYYY, and pd.to_datetime() defaults dayfirst to False, which leads to completely messed up dates.

Just fixed that in my local virtualenv and it works flawlessly now.

Thank you.
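
The ambiguity is easy to see without pandas. The sketch below uses only the standard library and a made-up sample date to show how the same DD-MM-YYYY string yields two different dates depending on which field is read first. In pandas, passing dayfirst=True or an explicit format="%d-%m-%Y" to pd.to_datetime() are the usual ways to pin the interpretation.

```python
from datetime import datetime

raw = "05-04-2021"  # made-up sample: 5 April 2021 when read as DD-MM-YYYY

day_first = datetime.strptime(raw, "%d-%m-%Y")
month_first = datetime.strptime(raw, "%m-%d-%Y")

print(day_first.date())    # 2021-04-05
print(month_first.date())  # 2021-05-04
```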

README instructions to run

Hi! I am trying to use the code following the instructions in the README. I installed the package and created a python file run.py with the following commands

from cryptocmd import CmcScraper
scraper = CmcScraper('XRP')
scraper.export_csv('xrp_all_time.csv')

and run it with
python run.py

I get the following errors from python
Traceback (most recent call last):
  File "scrape_test.py", line 1, in <module>
    from cryptocmd import CmcScraper
  File "/home/ggrossi/cryptoCMD/cryptocmd/__init__.py", line 1, in <module>
    from .core import *
  File "/home/ggrossi/cryptoCMD/cryptocmd/core.py", line 84
    print(*self.headers, sep=', ')
          ^
SyntaxError: invalid syntax

I don't know if it's my issue but maybe some more detailed instructions would be nice. Thank you

* in headers

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

Because of column name changes on coinmarketcap, the headers now contain *, i.e. Date, Open*, High, Low, Close**, Volume, Market Cap.
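
As a caller-side workaround, the footnote markers can be stripped after the headers are fetched. A minimal sketch, using the header list from this report as literal input:

```python
headers = ["Date", "Open*", "High", "Low", "Close**", "Volume", "Market Cap"]

# str.strip("*") removes any run of asterisks at either end of each name.
clean = [h.strip("*") for h in headers]

print(clean)
# ['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Market Cap']
```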

Mass Scraping Multiple Coins within a for loop error

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

I am trying to mass scrape multiple coins in a for loop. However, I always get this error on the 6th or 7th coin, regardless of putting time.sleep() before each scrape. It keeps repeating for a few attempts and then gets through, but sometimes it gets stuck for 5-10 minutes and I can't mass scrape.

Error example:

Error fetching price data for TON for interval '28-4-2013' and '24-09-2023'
Error message (download_data) : Expecting value: line 1 column 1 (char 0)

throwing error "list index out of range"

This is a(n):

  • Error
    CmcScraper is not working. It is giving "list index out of range".

Details:
Below are the steps to reproduce.

Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from cryptocmd import CmcScraper
>>> scraper = CmcScraper("XRP")
>>> 
>>> headers, data = scraper.get_data()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/gaurang/anaconda3/lib/python3.6/site-packages/cryptocmd/core.py", line 105, in get_data
    self._download_data(**kwargs)
  File "/home/gaurang/anaconda3/lib/python3.6/site-packages/cryptocmd/core.py", line 84, in _download_data
    self.end_date, self.start_date, self.headers, self.rows = extract_data(table)
  File "/home/gaurang/anaconda3/lib/python3.6/site-packages/cryptocmd/utils.py", line 154, in extract_data
    end_date, start_date = rows[0][0], rows[-1][0]
IndexError: list index out of range
>>> 

Infinite loop while mass scraping data

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

I want to import the historical data to calculate volatility over the last period for all crypto coins. For this, I made a loop that goes through all coins and scrapes them one-by-one. Unfortunately, I already experienced problems with doing more than 10 requests within a minute.

Now my question is:

  • Are there any limitations on scraping historical data?
  • Is there any way I can still scrape the historical data while respecting the limitations?

for i in alldataCMC['Symbol']:
    scraper = CmcScraper(i)
    while True:
        try:
            data = scraper.get_dataframe()
        except IndexError:
            print("Downloading data of "+ i +" failed. Next try in 1 minute")
            time.sleep(10)
            continue
        break
    data = scraper.get_dataframe()
    data = data.set_index(['Date'])

This is the part of my code where it got stuck around #850 on coinmarketcap. I got the same symbol failing over and over again for 2 hours.

I hope you can help me out.
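
One way to keep a failing symbol from blocking the whole run is to bound the retries and back off between them. This is a sketch under assumptions, not a statement about coinmarketcap's actual limits: fetch_with_backoff, its parameters and the delays are hypothetical, and fetch is any callable that raises IndexError on failure, as in the loop above.

```python
import time


def fetch_with_backoff(fetch, symbol, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Try fetch(symbol) a limited number of times, doubling the wait each time.

    Returns the fetched data, or None once the attempts are used up, so the
    caller can move on to the next symbol instead of looping forever.
    """
    for attempt in range(attempts):
        try:
            return fetch(symbol)
        except IndexError:
            # Wait before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return None
```

In the loop above this could be used as data = fetch_with_backoff(lambda s: CmcScraper(s).get_dataframe(), i), followed by a continue when data is None, so a symbol that keeps failing is skipped after a few tries.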

Add BTC and ETH convert pairs

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:
As the title states, would like to add the ability to receive data on BTC and ETH base pairs instead of just USD.

Broken symbols

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:

The following symbols appear to be broken:

['ΤBTC',
 'DEFI++',
 'POP!',
 'MOON STOP',
 'yVault LP-yCurve(YYCRV)',
 'BTC++',
 'DEFI+L',
 'DEFI+S',
 'YDAI+YUSDC+YUSDT+YTUSD',
 'PXUSD_MAR2021',
 'MD+']

For example trying to get a DataFrame for DEFI++ does not work:

scraper = CmcScraper('DEFI++')
scraper.get_dataframe(date_as_index=True).sort_index()

It prints:

Error fetching coin id data for coin code {} defi++
Error message: 'defi++' coin code is unavailable on coinmarketcap.com
Error fetching price data for {} for interval '{}' and '{}' defi++ 28-4-2013 16-05-2021
Error message (download_data) : "slug" must only contain lowercase characters

Trying to replace DEFI++ with defi++ or other variants did not work for me either. Is this a known problem?

close price

How do you define close price? Is it 4:00 pm EST or is it 00:00 AM EST?

CmcScraper ignoring coin symbol and always returning data for BCH

This is a(n):

  • New feature
  • Update to an existing feature
  • Error
  • Proposal to the Repository

Details:
Python 3.7.1 under Anaconda
cryptocmd 0.4.3
MacOS X 10.11.6

scraper = CmcScraper('XRP')
scraper.get_dataframe().head()
Out[20]:
         Date    Open    High     Low   Close     Volume  Market Cap
0  2018-12-26  172.06  186.34  165.54  175.51  537284643  3077145391
1  2018-12-25  183.03  183.03  155.40  170.91  621674076  2996171177
2  2018-12-24  200.03  213.72  179.09  182.26  611323718  3194811505
3  2018-12-23  197.89  209.86  189.16  197.66  578345903  3464468332
4  2018-12-22  196.30  206.45  184.08  196.67  716022686  3446690555

scraper = CmcScraper('LTC')
scraper.get_dataframe().head()
Out[22]:
         Date    Open    High     Low   Close     Volume  Market Cap
0  2018-12-26  172.06  186.34  165.54  175.51  537284643  3077145391
1  2018-12-25  183.03  183.03  155.40  170.91  621674076  2996171177
2  2018-12-24  200.03  213.72  179.09  182.26  611323718  3194811505
3  2018-12-23  197.89  209.86  189.16  197.66  578345903  3464468332
4  2018-12-22  196.30  206.45  184.08  196.67  716022686  3446690555

IndexError: list index out of range

This is a(n):

  • New feature
  • Update to an existing feature
  • [ x] Error
  • Proposal to the Repository

Details:

Hi,

I am trying to run the scraper for multiple coins and getting the error below. Can I run the scraper for all coins in one go?

Code that I am trying:

scraper = CmcScraper('CS')
scraper.export('json', name='cs_all_time')
scraper = CmcScraper('PHX')
scraper.export('json', name='phx_all_time')
scraper = CmcScraper('DMT')
scraper.export('json', name='dmt_all_time')
scraper = CmcScraper('TEN')

ERROR

Traceback (most recent call last):
  File "/Users/scrapper.py", line 634, in <module>
    scraper.export('json', name='cs_all_time')
  File "/Users/.virtualenvs/lib/python3.7/site-packages/cryptocmd/core.py", line 206, in export
    data = self.get_data(format, **kwargs)
  File "/Users/.virtualenvs/lib/python3.7/site-packages/cryptocmd/core.py", line 104, in get_data
    self._download_data(**kwargs)
  File "/Users/.virtualenvs/lib/python3.7/site-packages/cryptocmd/core.py", line 83, in _download_data
    self.end_date, self.start_date, self.headers, self.rows = extract_data(table)
  File "/Users/.virtualenvs/lib/python3.7/site-packages/cryptocmd/utils.py", line 154, in extract_data
    end_date, start_date = rows[0][0], rows[-1][0]
IndexError: list index out of range

Empty new line within rows

When opening the exported CSV files, an empty line appears between rows. I am using Windows as the operating system.
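
Assuming the file is written with Python's csv module, this symptom usually comes from opening the output file without newline="". The sketch below shows the documented pattern; the rows and the temporary path are made up for illustration.

```python
import csv
import os
import tempfile

rows = [["Date", "Close"], ["01-01-2021", "1.0"]]  # made-up sample rows
path = os.path.join(tempfile.mkdtemp(), "sample.csv")

# newline="" lets the csv module control line endings itself. Without it,
# Windows text mode turns the writer's "\r\n" into "\r\r\n", which shows up
# as a blank line between rows.
with open(path, "w", newline="") as handle:
    csv.writer(handle).writerows(rows)

with open(path, "rb") as handle:
    content = handle.read()

print(content)
```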

New Issue - list index out of range

This is a(n):

  • Error

Details:
On https://pypi.org/project/cryptocmd/ the example use is:

from cryptocmd import CmcScraper

# initialise scraper with time interval
scraper = CmcScraper("XRP", "15-10-2017", "25-10-2017")

# get raw data as list of list
headers, data = scraper.get_data()

# get data in a json format
json_data = scraper.get_data("json")

# export the data to csv
scraper.export("csv")

# get dataframe for the data
df = scraper.get_dataframe()

This used to work perfectly when I ran it a couple of days ago. However, now it keeps returning:

>>> from cryptocmd import CmcScraper
>>> scraper = CmcScraper("XRP", "15-10-2017", "25-10-2017")
>>> headers, data = scraper.get_data()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/$USER/.local/lib/python3.8/site-packages/cryptocmd/core.py", line 97, in get_data
    self._download_data(**kwargs)
  File "/home/$USER/.local/lib/python3.8/site-packages/cryptocmd/core.py", line 83, in _download_data
    self.end_date, self.start_date, self.headers, self.rows = extract_data(table)
  File "/home/$USER/.local/lib/python3.8/site-packages/cryptocmd/utils.py", line 161, in extract_data
    end_date, start_date = rows[0][0], rows[-1][0]
IndexError: list index out of range

I tried different currencies and dates. All give the same error.

Could it be that coinmarketcap changed their HTML layout? I tried fixing the issue myself, but my knowledge of PyQuery is limited and I couldn't figure it out. It seems that headers = [col.text_content().strip("*") for col in raw_data("table:first>thead>tr>th")] is now returning empty strings.
