paytaca / bcmr-indexer
Bitcoin Cash Metadata Registry (BCMR) indexer and validator
Home Page: https://bcmr.paytaca.com
The indexer first calls getblock to retrieve the list of transaction hashes, and then getrawtransaction for every transaction.
Thus it may take a very long time to process a block with, say, 100k transactions.
getblock has a verbosity option to show all transaction details. BCHN is even more advanced than standard Bitcoin nodes as it has level 3, which shows input details.
So the proposal is to use getblock with sufficient verbosity instead of querying every transaction with getrawtransaction. This should speed up the process considerably, perhaps by something like 1000x.
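A minimal sketch of the proposal, assuming a node that exposes the standard JSON-RPC interface over HTTP with basic auth. The helper names (build_getblock_request, call_node, transactions_from_block) and the request id are hypothetical and are not taken from the indexer's code; only the getblock method and its verbosity parameter come from the text above.

```python
import base64
import json
import urllib.request


def build_getblock_request(block_hash, verbosity=2):
    """Build the JSON-RPC payload for a single getblock call.

    With verbosity 2 (or 3 on BCHN) the "tx" field of the result holds
    full transaction objects instead of bare transaction hashes.
    """
    return {
        "jsonrpc": "1.0",
        "id": "bcmr-indexer-sketch",  # arbitrary id, chosen for this sketch
        "method": "getblock",
        "params": [block_hash, verbosity],
    }


def call_node(rpc_url, rpc_user, rpc_password, payload):
    """Send one JSON-RPC request to the node and return its "result" field."""
    token = base64.b64encode(f"{rpc_user}:{rpc_password}".encode()).decode()
    request = urllib.request.Request(
        rpc_url,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read())["result"]


def transactions_from_block(block):
    """Return the decoded transactions of a block fetched with verbosity >= 2."""
    txs = block.get("tx", [])
    if txs and not isinstance(txs[0], dict):
        # Verbosity 1 returns only txids, which would force one
        # getrawtransaction call per transaction again.
        raise ValueError("block contains only txids; request verbosity 2 or higher")
    return txs
```

With this shape, a block costs one RPC round trip instead of one per transaction: call_node(url, user, password, build_getblock_request(block_hash)) followed by transactions_from_block on the result.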
https://bcmr.paytaca.com/api/status/latest-block/
reports block #802559 while the tip is #802780.
Examples:
When requesting token data, it'd be useful to know what's the latest block number the indexer has processed.
For example, if I'm requesting token data on tokens in block N, I'd like to know if the indexer has already processed this block and has the relevant data.
Otherwise there's no way to distinguish between "there's no token data available, because there's no data" and "because it hasn't been processed yet".
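To illustrate the distinction being asked for, here is a small client-side sketch. It assumes the caller already knows the height of the block containing the token and has obtained the indexer's latest processed height by some means; the function name and the three return labels are hypothetical.

```python
def classify_lookup(token_found, token_block_height, indexer_height):
    """Explain an indexer answer for a token that appeared in a given block.

    token_found        -- whether the indexer returned metadata for the token
    token_block_height -- height of the block the token appeared in
    indexer_height     -- latest block height the indexer reports as processed
    """
    if token_found:
        return "available"
    if indexer_height < token_block_height:
        # The indexer has not reached the block yet, so "no data" is not final.
        return "not_indexed_yet"
    # The block was processed and nothing was found for this token.
    return "no_metadata"
```

For instance, with the indexer at block 802559 and a token first seen at block 802780, an empty answer would be classified as "not_indexed_yet" rather than "no_metadata".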
Reindexing tokens from their first block of appearance encounters errors that loop the process.
While indexing, the indexer might encounter 404 errors when requesting metadata. Retrying such a request is ineffective, and once a certain number of attempts is exceeded the request fails with ResponseError: too many 404 error responses.
[2024-02-22 21:12:02,337 bcmr_main.utils] INFO [/code/bcmr_main/utils.py:55] - Downloading from: https://bafkreiejafiz23ewtyh6m3dpincmxouohdcimrd33abacrq3h2pacewwjm.ipfs.dweb.link/.well-known/bitcoin-cash-metadata-registry.json
[2024-02-22 21:12:15,170 bcmr_main.management.commands.block_scanner] ERROR [/code/bcmr_main/management/commands/block_scanner.py:53] - Error processing txid: 8a93e3c629b2d406f4f1623a8a83dd117239afcb62118b710edc0d216abb48e4
--URL: https://gist.githubusercontent.com/Sydwell/04b1f4b9059eefbbe633d3e996e246d4/raw
--URL: https://bafkreiejafiz23ewtyh6m3dpincmxouohdcimrd33abacrq3h2pacewwjm.ipfs.dweb.link/.well-known/bitcoin-cash-metadata-registry.json
urllib3.exceptions.ResponseError: too many 404 error responses
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 948, in urlopen
return self.urlopen(
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 948, in urlopen
return self.urlopen(
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 948, in urlopen
return self.urlopen(
[Previous line repeated 4 more times]
File "/usr/local/lib/python3.9/site-packages/urllib3/connectionpool.py", line 938, in urlopen
retries = retries.increment(method, url, response=response, _pool=self)
File "/usr/local/lib/python3.9/site-packages/urllib3/util/retry.py", line 515, in increment
raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='bafkreiejafiz23ewtyh6m3dpincmxouohdcimrd33abacrq3h2pacewwjm.ipfs.dweb.link', port=443): Max retries exceeded with url: /.well-known/bitcoin-cash-metadata-registry.json (Caused by ResponseError('too many 404 error responses'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/code/manage.py", line 22, in <module>
main()
File "/code/manage.py", line 18, in main
execute_from_command_line(sys.argv)
File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
utility.execute()
File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 413, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 354, in run_from_argv
self.execute(*args, **cmd_options)
File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 398, in execute
output = self.handle(*args, **options)
File "/code/bcmr_main/management/commands/block_scanner.py", line 54, in handle
raise exc
File "/code/bcmr_main/management/commands/block_scanner.py", line 51, in handle
process_tx(tx)
File "/usr/local/lib/python3.9/site-packages/celery/local.py", line 188, in __call__
return self._get_current_object()(*a, **kw)
File "/usr/local/lib/python3.9/site-packages/celery/app/task.py", line 388, in __call__
return self.run(*args, **kwargs)
File "/code/bcmr_main/tasks.py", line 280, in process_tx
_process_tx(txn, bchn)
File "/code/bcmr_main/tasks.py", line 204, in _process_tx
process_op_return(**{
File "/code/bcmr_main/op_return.py", line 68, in process_op_return
response = download_url(decoded_bcmr_url)
File "/code/bcmr_main/utils.py", line 84, in download_url
response = _request_url(url)
File "/code/bcmr_main/utils.py", line 56, in _request_url
response = session.get(url, timeout=30)
File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 602, in get
return self.request("GET", url, **kwargs)
File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.9/site-packages/requests/adapters.py", line 510, in send
raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPSConnectionPool(host='bafkreiejafiz23ewtyh6m3dpincmxouohdcimrd33abacrq3h2pacewwjm.ipfs.dweb.link', port=443): Max retries exceeded with url: /.well-known/bitcoin-cash-metadata-registry.json (Caused by ResponseError('too many 404 error responses'))
2024-02-22 21:12:15,320 INFO exited: block_scanner (exit status 1; not expected)
This causes an infinite loop with an enormous number of retries.
For example, it can be fixed in bcmr_main/utils.py:52:
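import requests
from requests.adapters import HTTPAdapter
from urllib3.exceptions import LocationParseError
from urllib3.util.retry import Retry

# LOGGER is the module-level logger already defined in bcmr_main/utils.py

def _request_url(url):
    response = None
    try:
        session = requests.Session()
        # Retry on every known status code except success, redirects, and 404
        retry_triggers = tuple(x for x in requests.status_codes._codes if x not in [200, 301, 302, 307, 308, 404])
        retries = Retry(total=7, backoff_factor=0.1, status_forcelist=retry_triggers)
        session.mount('https://', HTTPAdapter(max_retries=retries))
        LOGGER.info('Downloading from: ' + url)
        response = session.get(url, timeout=30)
    except (requests.exceptions.ConnectionError, requests.exceptions.InvalidURL, LocationParseError):
        # Unreachable host or malformed URL: return None to the caller
        pass
    return response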
The change adds 404 to the list of status codes that do not trigger a retry.
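As a variation on the same idea, the list of retry triggers could be derived from the standard library rather than from the private requests.status_codes._codes mapping. This is only a sketch: the NON_RETRYABLE set mirrors the codes excluded in the proposal above, and the function name is hypothetical.

```python
from http import HTTPStatus

# Status codes that should be returned to the caller as-is instead of retried.
# 404 is included because a missing registry file will not appear on retry.
NON_RETRYABLE = frozenset({200, 301, 302, 307, 308, 404})


def retry_triggers(non_retryable=NON_RETRYABLE):
    """Return every standard HTTP status code that should trigger a retry."""
    return tuple(
        status.value for status in HTTPStatus if status.value not in non_retryable
    )
```

The resulting tuple could be passed as status_forcelist in the same way as in the snippet above. A narrower list limited to transient errors such as 429, 502, 503 and 504 would be another option.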
The second issue arises when requested metadata has an invalid JSON structure. This triggers an infinite loop where the same block and its transactions are repeatedly requested.
[2024-02-22 21:28:05,888 bcmr_main.op_return] ERROR [/code/bcmr_main/op_return.py:20] - COMPARED AGAINST: ['', '8e770cf9dac37f143a6ef069e78d82be71fa3049b279634b876a2389713e5003']
[2024-02-22 21:28:05,889 bcmr_main.management.commands.block_scanner] ERROR [/code/bcmr_main/management/commands/block_scanner.py:53] - Error processing txid: a3e2c5b28b5eb2af49be716f196d632f60c51043b447ecc00d901858f1cf964f
--URL: https://gist.githubusercontent.com/alpsy05/1fedc74d7420c2407485ab421cd84f06/raw
Traceback (most recent call last):
File "/code/manage.py", line 22, in <module>
main()
File "/code/manage.py", line 18, in main
execute_from_command_line(sys.argv)
File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
utility.execute()
File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 413, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 354, in run_from_argv
self.execute(*args, **cmd_options)
File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 398, in execute
output = self.handle(*args, **options)
File "/code/bcmr_main/management/commands/block_scanner.py", line 54, in handle
raise exc
File "/code/bcmr_main/management/commands/block_scanner.py", line 51, in handle
process_tx(tx)
File "/usr/local/lib/python3.9/site-packages/celery/local.py", line 188, in __call__
return self._get_current_object()(*a, **kw)
File "/usr/local/lib/python3.9/site-packages/celery/app/task.py", line 388, in __call__
return self.run(*args, **kwargs)
File "/code/bcmr_main/tasks.py", line 280, in process_tx
_process_tx(txn, bchn)
File "/code/bcmr_main/tasks.py", line 204, in _process_tx
process_op_return(**{
File "/code/bcmr_main/op_return.py", line 102, in process_op_return
BitcoinCashMetadataRegistry.validate_contents(response.text)
File "/code/bcmr_main/app/BitcoinCashMetadataRegistry.py", line 74, in validate_contents
validate(instance=json.loads(contents) if type(contents) == str else contents, schema=bcmr_schema)
File "/usr/local/lib/python3.9/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.9/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.9/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 26 column 7 (char 678)
2024-02-22 21:28:06,027 INFO exited: block_scanner (exit status 1; not expected)
It can be solved in bcmr_main/app/BitcoinCashMetadataRegistry.py:
@staticmethod
def validate_contents(contents):
    with open(f'{settings.BASE_DIR}/bcmr_main/app/bcmr-schema-v2.json', 'r') as bcmr_schema_file:
        bcmr_schema = json.load(bcmr_schema_file)
    try:
        validate(instance=json.loads(contents) if isinstance(contents, str) else contents, schema=bcmr_schema)
    except json.decoder.JSONDecodeError:
        # Malformed JSON: skip this registry instead of crashing the block scanner
        pass
The change catches the JSON decoding exception.
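One caveat with silently passing is that the caller cannot tell a malformed registry from a valid one. A possible alternative, sketched here with the standard library only, is to parse first and report failure explicitly. The function name parse_registry is hypothetical, and schema validation is left out of this sketch.

```python
import json


def parse_registry(contents):
    """Parse downloaded registry contents, returning None if they are not valid JSON.

    Accepts either a JSON string or an already-decoded object, like
    validate_contents in the snippet above.
    """
    if not isinstance(contents, str):
        return contents
    try:
        return json.loads(contents)
    except json.JSONDecodeError:
        # Malformed JSON is reported to the caller instead of raising,
        # so the scanner can skip this transaction and move on.
        return None
```

A caller could then skip the transaction when parse_registry returns None and run schema validation only on a successfully parsed object.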
The info for the token 7c6c8b889a825e928d9edb76841b3aaa899e5df8666457f4f4c8a9bbf1c03706 has not updated correctly in the BCMR indexer.
The indexer shows https://bcmr.paytaca.com/api/registries/7c6c8b889a825e928d9edb76841b3aaa899e5df8666457f4f4c8a9bbf1c03706/latest/
while the latest metadata is https://ipfs.io/ipfs/bafkreib2avpze4ccc2akjatv32xyh4vdzv2bzn35hnpv6r377rt5ct3rhm
The token's metadata was updated 8/15 09:35 UTC (3mo, 16d ago).
The authhead is 1c696bf8a8cba7b29a95411d10c4b95b65a5950dea842957f0d54652625d6df5, as shown on my BCMR token explorer:
https://tokenexplorer.cash/?tokenId=7c6c8b889a825e928d9edb76841b3aaa899e5df8666457f4f4c8a9bbf1c03706
https://bcmr.paytaca.com/api/status/latest-block/ reports block #813529 while the tip is #813738.
Right now there's no way to understand whether the API path has changed (or the service has been shut down) or there's simply no data for the requested token: in both cases the same generic 404 page is returned.
Cashonize is transitioning to using the token/NFT-specific endpoints instead of fetching the full registries from the BCMR indexer.
The problem is that multiple NFT collections (Cash-Ninjas, Pepis, Reapers, Puffers) all start with the VM number zero, which is an empty commitment.
Currently it is possible to get the NFT-specific metadata only for NFT number 2 and beyond, for example for the Pepi collection:
https://bcmr.paytaca.com/api/tokens/1a05bce0af8b57e27b11e9429fc534d0fc27230fc541928f38b3ca945c4bca11/01
As normal commitments can only be hexadecimal, the endpoint for the empty commitment could be something like /empty, /none, or /xx.
For example, spaces in a string are either encoded with %20 or replaced with the plus sign (+), so other options would be /+/ or /%20/.
The specific endpoint used is an implementation detail.
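From the client side, the request could look roughly like the sketch below. The URL pattern follows the Pepi example above; the "empty" path segment is only one of the placeholders floated in this issue and is not an existing endpoint, and the function name is hypothetical.

```python
BASE_URL = "https://bcmr.paytaca.com/api/tokens"


def nft_metadata_url(category, commitment_hex, empty_placeholder="empty"):
    """Build the NFT-specific metadata URL for a token category and commitment.

    An empty commitment (VM number zero) cannot be written as a hex path
    segment, so it is mapped to a placeholder segment instead.
    """
    if commitment_hex == "":
        segment = empty_placeholder  # hypothetical segment, not a live endpoint
    else:
        bytes.fromhex(commitment_hex)  # raises ValueError if this is not valid hex
        segment = commitment_hex.lower()
    return f"{BASE_URL}/{category}/{segment}"
```

Since a placeholder such as "empty" is not valid hexadecimal, it cannot collide with a real commitment.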