LBRY Hub

This repo provides hub, a python library for building services that make ongoing use of processed data from the LBRY blockchain. Hub contains three core executable services that are used together:

  • scribe (hub.scribe.service) - maintains a rocksdb database containing the LBRY blockchain.
  • herald (hub.herald.service) - an electrum server for thin-wallet clients (such as lbry-sdk). It provides an api for clients to use thin simple-payment-verification (spv) wallets and to resolve and search claims published to the LBRY blockchain. A drop-in replacement port of herald written in Go, herald.go, is currently being worked on.
  • scribe-elastic-sync (hub.elastic_sync.service) - a utility to maintain an elasticsearch database of metadata for claims in the LBRY blockchain.

Features and overview of hub as a python library

Installation

Scribe may be run from source, a binary, or a docker image. Our releases page contains pre-built binaries of the latest release, pre-releases, and past releases for macOS and Debian-based Linux. Prebuilt docker images are also available.

Prebuilt docker image

docker pull lbry/hub:master

Build your own docker image

git clone https://github.com/lbryio/hub.git
cd hub
docker build -t lbry/hub:development .

Install from source

Scribe has been tested with python 3.7-3.9. Higher versions probably work but have not yet been tested.

  1. clone the scribe repo
git clone https://github.com/lbryio/hub.git
cd hub
  2. make a virtual env
python3.9 -m venv hub-venv
  3. from the virtual env, install scribe
source hub-venv/bin/activate
pip install -e .

That completes the installation; you should now have the commands scribe, scribe-elastic-sync, and herald.

These can also be run with python -m hub.scribe, python -m hub.elastic_sync, and python -m hub.herald.

Usage

Requirements

Scribe needs elasticsearch and either the lbrycrd or lbcd blockchain daemon to be running.

With high-performance options enabled, everything can be run on the same machine if you have 64GB of memory and 12 cores. However, the recommended setup is to run elasticsearch on one instance with 8GB of memory and at least 4 dedicated cores, the blockchain daemon on another with 16GB of memory and at least 4 cores, and the scribe hub services on their own instance with between 16GB and 32GB of memory (depending on settings) and 8 cores.

As of block 1147423 (4/21/22) the size of the scribe rocksdb database is 120GB and the size of the elasticsearch volume is 63GB.

docker-compose

The recommended way to run a scribe hub is with docker. See this guide for instructions.

If you have the resources to run all of the services on one machine (at least 300gb of fast storage, preferably nvme, 64gb of RAM, 12 fast cores), see this docker-compose example.

From source

Options

Content blocking and filtering

For various reasons it may be desirable to block or filter content from claim search and resolve results; here are instructions for how to configure and use this feature, as well as information about the recommended defaults.

Common options across scribe, herald, and scribe-elastic-sync:

  • --db_dir (required) Path of the directory containing lbry-rocksdb, set from the environment with DB_DIRECTORY
  • --daemon_url (required for scribe and herald) URL for rpc from lbrycrd or lbcd, in the form <rpcuser>:<rpcpassword>@<daemon rpc host>:<daemon rpc port>.
  • --reorg_limit Max reorg depth, defaults to 200, set from the environment with REORG_LIMIT.
  • --chain Which blockchain to use - either mainnet, testnet, or regtest - set from the environment with NET
  • --max_query_workers Size of the thread pool, set from the environment with MAX_QUERY_WORKERS
  • --cache_all_tx_hashes If this flag is set, all tx hashes will be stored in memory. For scribe, this speeds up the rate at which it can apply blocks and process mempool. For herald, this speeds up syncing address histories. This setting will use 10+GB of memory. It can be set from the environment with CACHE_ALL_TX_HASHES=Yes
  • --cache_all_claim_txos If this flag is set, all claim txos will be indexed in memory. Set from the environment with CACHE_ALL_CLAIM_TXOS=Yes
  • --prometheus_port If provided this port will be used to provide prometheus metrics, set from the environment with PROMETHEUS_PORT
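
For example, a scribe invocation combining these common options might look like the following. The db path, credentials, host, and rpc port are illustrative placeholders (the port shown matches the lbcd example in the issues below); adjust them for your setup:

scribe --db_dir /mnt/hub-db --daemon_url https://rpcuser:rpcpassword@127.0.0.1:9245 --chain mainnet --max_query_workers 4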

Options for scribe

  • --db_max_open_files This setting translates into the max_open_files option given to rocksdb. A higher number will use more memory. Defaults to 64.
  • --address_history_cache_size The count of items in the address history cache used for processing blocks and mempool updates. A higher number will use more memory; it shouldn't ever need to be higher than 10000. Defaults to 1000.
  • --index_address_statuses Maintain an index of the statuses of address transaction histories; this makes handling notifications for transactions in a block uniformly fast at the expense of more time to process new blocks and somewhat more disk space (~10gb as of block 1161417).

Options for scribe-elastic-sync

  • --reindex If this flag is set, drop and rebuild the elasticsearch index.

Options for herald

  • --host Interface for server to listen on, use 0.0.0.0 to listen on the external interface. Can be set from the environment with HOST
  • --tcp_port Electrum TCP port to listen on for hub server. Can be set from the environment with TCP_PORT
  • --udp_port UDP port to listen on for hub server. Can be set from the environment with UDP_PORT
  • --elastic_services Comma separated list of items in the format elastic_host:elastic_port/notifier_host:notifier_port. Can be set from the environment with ELASTIC_SERVICES
  • --query_timeout_ms Timeout for claim searches in elasticsearch in milliseconds. Can be set from the environment with QUERY_TIMEOUT_MS
  • --blocking_channel_ids Space separated list of channel claim ids used for blocking. Claims that are reposted by these channels can't be resolved or returned in search results. Can be set from the environment with BLOCKING_CHANNEL_IDS.
  • --filtering_channel_ids Space separated list of channel claim ids used for filtering. Claims that are reposted by these channels aren't returned in search results. Can be set from the environment with FILTERING_CHANNEL_IDS
  • --index_address_statuses Use the address history status index; this makes handling notifications for transactions in a block uniformly fast (must be turned on in scribe too).
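
An illustrative herald invocation combining these with the common options (all values are placeholders: 50001 is the conventional electrum tcp port, 9200 the elasticsearch default, and the notifier port depends on how scribe-elastic-sync is configured):

herald --db_dir /mnt/hub-db --daemon_url https://rpcuser:rpcpassword@127.0.0.1:9245 --host 0.0.0.0 --tcp_port 50001 --elastic_services 127.0.0.1:9200/127.0.0.1:19080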

Contributing

Contributions to this project are welcome, encouraged, and compensated. For more details, please check this link.

License

This project is MIT licensed. For the full license, see LICENSE.

Security

We take security seriously. Please contact [email protected] regarding any security issues. Our PGP key is here if you need it.

Contact

The primary contact for this project is @jackrobison.

hub's Issues

Don't return results with deleted channels

Probably related to: #76

{"method": "claim_search", "params":{"no_totals": true, "order_by": ["trending_group", "trending_mixed"], "claim_type": ["stream", "repost"], "stream_types": ["audio", "video"], "limit_claims_per_channel": 999, "page": 1, "page_size": 20, "release_time": ">1645103286", "channel_ids": ["80d2590ad04e36fb1d077a9b9e3a8bba76defdf8", "a8d1094b9c6624f59f19f6e139152d1e00caa9f4", "d4d17e20bec31c971b1ab6370a11203ccec095a4", "4841ccaac983b40eff8c7724afd31f4163277cbe", "5b0b41c364c89c5cb13f011823e0d6ee9b89af26", "5a1b164d0a2e7adf1db08d7363ea1cb06c30cd74", "7566c26e4b0e51d84900b8f153fc6f069ad09ef7", "3346a4ff80b70ee7eea8a79fc79f19c43bb4464a", "2b6c71a57bad61e17276ba9b9e4c58959cad1d7b", "273163260bceb95fa98d97d33d377c55395e329a", "c5cd9b63e2ba0abc191feae48238f464baecb147", "96043a243e14adf367281cc9e8b6a38b554f4725", "719b2540e63955fb6a90bc4f2c4fd9cfd8724e1a", "589276465a23c589801d874f484cc39f307d7ec7", "5e0333be82071767a3aa44a05bb77dcec4c30341", "32d4c07ecf01f2aeee3f07f9b170d9798b5e1d37", "25f384bd95e218f6ac37fcaca99ed40f36760d8c", "48c7ea8bc2c4adba09bf21a29689e3b8c2967522", "18b0d45be9f72c3c20a47f992325cb0f8af0fe7c", "fee415182e20af42122bea8d1682dc6a4d99a0d6", "d2af9d4cec08f060dfe47510f6b709ebf01d5686", "49fe7ca8bb2f7a794b1cba1d877d98dae520ac73", "ba79c80788a9e1751e49ad401f5692d86f73a2db", "c1a5fd043a1dbc8ff4ec992aefc482c970e7568e", "b6e207c5f8c58e7c8362cd05a1501bf2f5b694f2", "fb364ef587872515f545a5b4b3182b58073f230f", "beddc710e4a9f8f296fa3a6d7ea13f816422ffa5", "e4264fc7a7911ce8083a61028fe47e49c74100cf", "f3e79bf8229736a9f3ae208725574436e9d4ac03", "f1dff225e758dd5bc8ab8b91894096215297b2be", "468aa10ee3f12f0ba6cf2641f11e558c841f12fa", "b6a8abdc754fd7f86d571fd98a04deaac4cef889", "3808a556e5994e51b7e6b86f1173fdaf558dfd4e", "af927bd2092e7383789df183ff1eaf95c7041ee9", "7566c26e4b0e51d84900b8f153fc6f069ad09ef7", "b3c1ec3211a801de8ac0ef979467da4c721c0ec4", "3f89fd1bb05eb81f1b159d7f9d3cf15431ede280", "15627c8d79e7c45b15fbe726b34d47accf11b8e2", "a3e35f723d9ad82159b4858ad628e090d0e372df", "6f3940e512a40f2ac8068103bd9195fa07107043", "1487afc813124abbeb0629d2172be0f01ccec3bf", "76283e4cd0168e392128124a013c8335a987186d", "a87ee0c50662b13f11b6fdd3eefd4cee17930599", "1eb4bf3b47b07f646d760c0accf7a4295aa89024", "04a11f1d97ec47103239322f921673c1c4b9bb10", "abf8c3b0426cd89fce01770a569d525c648a92b5", "9d37b138c50014eaaf9d5e6110d58250acc521fd", "9dc7f2791db1fefb0f4aea2c856b9ecea6f3f5d0", "dcdfcd5b837a6ce46d4e8c997b1bf9a9d294d4f4", "8d497e7e96c789364c56aea7a35827d2dc1eea65", "e5872eb7237883e4158cf88e96a465f7a674c968", "3e63119a8503a6f666b0a736c8fdeb9e79d11eb4", "a29db3ebf677f1fe317ca4ecf0a65a172d4735be", "d3f050228497b023747fe18d6639105c89611255", "871ba605db0cad46e43081c4b3d942b80696359f", "4a7f6709df6770e0786b04599fb00262a98d220b", "d273a5d2b57785d19d4c123255bc54f9e45f7e83", "3511b71e5843ae53c35a5fff3e6ef7a3377dd0f7", "bc935e4482c6bf70d14dd872fae159a65c552eb3"], "not_tags": ["porn", "porno", "nsfw", "mature", "xxx", "sex", "creampie", "blowjob", "handjob", "vagina", "boobs", "big boobs", "big dick", "pussy", "cumshot", "anal", "hard fucking", "ass", "fuck", "hentai", "pron", "p0rn", "pr0n", "s3x", "camporn", "fetish", "pornographic", "pornography"]}}
returns:

{
    "jsonrpc": "2.0",
    "result": {
        "lbry://super-easy-and-tasty-greek-pita-bread": {
            "address": "bDdShxA5qBbcsqckEBTu2gb9pH8C1ZN26r",
            "amount": "0.002",
            "canonical_url": "lbry://super-easy-and-tasty-greek-pita-bread#2",
            "claim_id": "23b558d8d1fc11b525080823baf628a3901bc75d",
            "claim_op": "create",
            "confirmations": 3169,
            "height": 1206989,
            "is_channel_signature_valid": false,
            "meta": {
                "activation_height": 1206989,
                "creation_height": 1206989,
                "creation_timestamp": 1660145586,
                "effective_amount": "16.32699039",
                "expiration_height": 3309389,
                "is_controlling": true,
                "reposted": 2,
                "support_amount": "16.32499039",
                "take_over_height": 1206989,
                "trending_global": 0.0,
                "trending_group": 0,
                "trending_local": 0.0,
                "trending_mixed": 0.0
            },
            "name": "super-easy-and-tasty-greek-pita-bread",
            "normalized_name": "super-easy-and-tasty-greek-pita-bread",
            "nout": 0,
            "permanent_url": "lbry://super-easy-and-tasty-greek-pita-bread#23b558d8d1fc11b525080823baf628a3901bc75d",
            "short_url": "lbry://super-easy-and-tasty-greek-pita-bread#2",
            "signing_channel": {
                "channel_id": "04a11f1d97ec47103239322f921673c1c4b9bb10"
            },
            "timestamp": 1660145586,
            "txid": "678e8add6c0f7ea2d4176410893d7517d5b7744ce12b3611289d7b826ca14d3f",
            "type": "claim",
            "value": {

--daemon_url option can fail to parse <USER>:<PASS> causing aiohttp post() to fail

Saw this while trying to start scribe using autogenerated rpcuser/rpcpass from lbcd.conf.

--daemon_url=https://FQYAibqCp/zFfOSbxaB7K4=:<password>@127.0.0.1:9245

2022-08-17 15:33:30,709 ERROR hub.scribe.daemon:137: connection problem - is your daemon running?  Retrying occasionally...
2022-08-17 15:34:30,763 ERROR hub.scribe.daemon:137: connection problem - is your daemon running?  Retrying occasionally...

The root cause is restrictions on the character classes allowed in the username & password embedded in the URL string. Aiohttp appears to use yarl.URL to parse the URL; in my case '/' was the offending character.

>>> x = yarl.URL("https://FQYAibqCp/zFfOSbxaB7K4=:<password>@127.0.0.1:9245")
>>> x.user
>>> x.password
>>> x = yarl.URL("https://FQYAibqCpzFfOSbxaB7K4=:<password>@127.0.0.1:9245")
>>> x.user
'FQYAibqCpzFfOSbxaB7K4='
>>> x.password
'<password>'
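
A likely workaround (a sketch, not necessarily the fix hub adopted) is to percent-encode the userinfo components before embedding them in the URL, since yarl decodes percent-encoded characters correctly:

from urllib.parse import quote

import yarl

# Hypothetical credentials containing '/' and '=', which break naive
# userinfo parsing when embedded raw in the URL string.
rpc_user = "FQYAibqCp/zFfOSbxaB7K4="
rpc_pass = "some/secret"

# Encode with safe='' so '/', ':' and '@' cannot be mistaken for URL structure.
daemon_url = f"https://{quote(rpc_user, safe='')}:{quote(rpc_pass, safe='')}@127.0.0.1:9245"

url = yarl.URL(daemon_url)
assert url.user == rpc_user      # yarl decodes back to the original values
assert url.password == rpc_pass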

missing claim search - reposts + stream_type

Randomly, some reposts won't show when a stream type is passed, though it works for others.

Compare:
{"jsonrpc":"2.0","method":"claim_search","params":{"page_size":36,"page":1,"claim_type":["repost"],"no_totals":true,"not_channel_ids":[],"order_by":["release_time"],"has_source":true,"channel_ids":["bc935e4482c6bf70d14dd872fae159a65c552eb3"],"release_time":">1657206000"},"id":1658422571434}
to
{"jsonrpc":"2.0","method":"claim_search","params":{"page_size":36,"page":1,"claim_type":["repost"],"no_totals":true,"not_channel_ids":[],"order_by":["release_time"],"has_source":true,"channel_ids":["bc935e4482c6bf70d14dd872fae159a65c552eb3"],"release_time":">1657206000","stream_types":["video"]},"id":1658422571434}

Support using ssl with lbcd

Currently lbcd must be run on localhost with ssl turned off. By adding a setting to specify a .cert file, the hub would be able to use an lbcd node listening on 0.0.0.0 with ssl turned on.

Scribe writer does not use `multi_get` api

The writer can be made a good bit faster by batching the RevertablePut and RevertableDelete ops given to RevertableOpStack.extend_ops, internally combining them into fewer multi_get calls instead of verifying integrity on each key.
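
A rough sketch of the batching idea (the op and db interfaces below are assumed for illustration, not hub's actual ones): fetch the keys for a whole batch of ops with one multi_get and verify each op against the returned values, instead of doing a point lookup per op.

def extend_ops_batched(db, ops):
    # Assumed interface: db.multi_get(keys) returns values aligned with
    # keys (None for a missing key); each op has .is_put, .key and .value.
    values = db.multi_get([op.key for op in ops])  # one round trip
    for op, existing in zip(ops, values):
        if op.is_put and existing is not None:
            raise ValueError(f'put over existing key: {op.key!r}')
        if not op.is_put and existing != op.value:
            raise ValueError(f'delete of mismatched value: {op.key!r}')
    # ...append the now-verified ops to the stack as before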

Cache mempool hashX histories

Cache the hashX history string for touched hashXes in mempool so that it doesn't need to be concatenated again each time it's touched, and so it's already in memory when the block comes in.
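
A minimal sketch of the idea with hypothetical names (hub's real structures differ):

class MempoolHistoryCache:
    # Sketch: cache the concatenated 'txid:height:' history string per
    # touched hashX so it isn't rebuilt on every mempool touch and is
    # already in memory when the block comes in.

    def __init__(self):
        self._cache = {}  # hashX -> concatenated history string

    def get(self, hashx, build_history):
        # build_history(hashx) is the existing full-concatenation fallback.
        if hashx not in self._cache:
            self._cache[hashx] = build_history(hashx)
        return self._cache[hashx]

    def touch(self, hashx, txid, height):
        # Append the newly touching tx instead of re-concatenating.
        if hashx in self._cache:
            self._cache[hashx] += f'{txid}:{height}:'

    def on_block(self, touched_hashxes):
        # After the block is processed, drop entries whose confirmed
        # histories just changed.
        for hashx in touched_hashxes:
            self._cache.pop(hashx, None)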

Block `scribe-hub` startup on initial catchup

Expected behavior: scribe-hub starts listening for electrum sessions after scribe is caught up with the blockchain daemon

Actual behavior: scribe-hub starts the server up right away and notifies clients as the writer catches up.
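
A minimal sketch of the expected ordering (the method names here are hypothetical, not hub's actual api): block the server start on initial catchup rather than serving immediately.

import asyncio

async def start_after_catchup(service):
    # Hypothetical: wait until the local db height reaches the daemon tip
    # before accepting electrum sessions.
    while service.local_height() < await service.daemon_height():
        await asyncio.sleep(5)  # poll until initial catchup completes
    await service.start_listening()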

Handle elasticsearch becoming read only due to low disk space

The elastic sync service should handle ES temporarily becoming read only and recover when ES becomes writable; currently the failed queries are ignored:

2022-04-02 10:42:23,620 - scribe.service.ElasticSyncService - WARNING - indexing failed for an item: {'update': {'_index': 'claims', '_type': '_doc', '_id': '4023ca39b415cd916a304d71a3d528d8770f2ca5', 'status': 429, 'error': {'type': 'cluster_block_exception', 'reason': 'index [claims] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];'}}}
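
A sketch of one possible recovery strategy, assuming an elasticsearch-py AsyncElasticsearch client and bulk actions expressed as (action, doc) line pairs: re-send items that failed with a retryable 429 status after a backoff instead of dropping them.

import asyncio

RETRYABLE_STATUSES = {429}  # e.g. cluster_block_exception from the disk watermark

async def bulk_update_with_retry(es, action_pairs, max_retries=5, backoff=30.0):
    pending = list(action_pairs)
    for attempt in range(1, max_retries + 1):
        body = [line for pair in pending for line in pair]
        resp = await es.bulk(body=body)
        if not resp.get('errors'):
            return []
        # Each item in the response corresponds positionally to one pair.
        pending = [
            pair for item, pair in zip(resp['items'], pending)
            if next(iter(item.values())).get('status') in RETRYABLE_STATUSES
        ]
        if not pending:
            return []  # the remaining failures were not retryable
        await asyncio.sleep(backoff * attempt)
    return pending  # still failing after all retries; surface to the caller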

duplicate partial keys given to multi_get_dict

New uncaught resolve error

herald_1               | 2022-06-17 04:18:12,256 ERROR hub.herald.session:1063: exception handling Request('blockchain.claimtrie.resolve', ['lbry://@neilmccoyward#a', 'lbry://how-to-completely-avoid-a-question#9b1242408625c2c007c23c498f6e6821075ae45c', 'lbry://science-got-it#2', 'lbry://@neilmccoyward#a/it-happened-again…-coincidence-!#9', 'lbry://no-wonder-he-got-put-on-leave#a3a7a16c49df8b30706154a5628f938fcc3cebf2', "lbry://america's-economy-will-collapse...#a", 'lbry://biden-threatening-oil-companies#0553abf48a0968fcb224ad51a4627a5fd8b20b09', 'lbry://UK-Column-News-15th-June-2022#4e28f2e82d24466ec95fffdf3f696c8982e6dcdb', 'lbry://ep025-gmos#3e892771fa7d1ca07807ebeb6eba9db1f1f95e72', 'lbry://I-Wonder-What-Brad-Hazzard-Is-Paid-To-Say-What-He-Does-Here#c5a6db374bf28bbe4e1c8ab815dd5b172aefd31c'])
herald_1               | Traceback (most recent call last):
herald_1               |   File "/home/lbry/hub/herald/session.py", line 1056, in _handle_request
herald_1               |     result = await self.handle_request(request)
herald_1               |   File "/home/lbry/hub/herald/session.py", line 962, in handle_request
herald_1               |     return await coro(*request.args)
herald_1               |   File "/home/lbry/hub/herald/session.py", line 1292, in claimtrie_resolve
herald_1               |     resolved_needed = await self.db.batch_resolve_urls(list(needed))
herald_1               |   File "/home/lbry/hub/db/db.py", line 350, in batch_resolve_urls
herald_1               |     needed, await self._batch_resolve_parsed_urls(needed)
herald_1               |   File "/home/lbry/hub/db/db.py", line 272, in _batch_resolve_parsed_urls
herald_1               |     async for _, v in self.prefix_db.txo_to_claim.multi_get_async_gen(self._executor, list(needed_full_claim_hashes.values())):
herald_1               |   File "/home/lbry/hub/db/interface.py", line 107, in multi_get_async_gen
herald_1               |     assert len(packed_keys) == len(key_args), 'duplicate partial keys given to multi_get_dict'
herald_1               | AssertionError: duplicate partial keys given to multi_get_dict

Automatically compactify address histories

This makes subscriptions and wallet sync much faster for the client and much less heavy for the server. Currently this is done as a manual maintenance task every few months, but it should be automated and could be done each block.

Address subscriptions are slow for addresses with large histories

When a client subscribes to an address, the hub needs to send it the hash of the address history, at minimum so the client can check that it has the same history. Currently this is an expensive call for the hub: it has to concatenate the full history and then hash it, so the cost of recalculating the status hash grows with the address history.

This can be fixed by moving the calculation of address statuses into two new indexes maintained by the block processor: one for address statuses at the new block and another for address statuses factoring in mempool.
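
For reference, the status the hub must recompute is the sha256 of the concatenated 'txid:height:' pairs of the address history (the same format that appears in the incorrect-address-status issue below). A minimal sketch:

from hashlib import sha256

def address_status(history):
    # history is a list of (txid_hex, height) pairs in confirmation order.
    # The cost grows linearly with the history, which is why large
    # histories make subscriptions slow without a precomputed index.
    concatenated = ''.join(f'{txid}:{height}:' for txid, height in history)
    return sha256(concatenated.encode()).hexdigest()

# The two-item example from the issue below:
status = address_status([
    ('3c1b765b412703a9f02d0328886df43fe9516ecfea66e70be61658f79b48d70a', 1142400),
    ('32ddebefc0620d020d91b0f9de5a93082b636905b5b934675e90dbd26b6f7141', 1176287),
])  # '49050a8ffaf514cbae321f21e8ab9649da3743ccaeda102fb107fbde491f7971'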

Blocking and filtering metadata

Support blocking/filtering reposts using additional description/title/tags fields. These can be used by users to specify why a claim is blocked/filtered.

  • include the blocking/filtering repost (and the channel if it has one) in the extra txos list in an Outputs response
  • include the description from the blocking/filtering repost in the error message
  • if a claim is blocked/filtered by multiple channels, include all of them so all of the blocking reasons can be shown

lbry-sdk suite: test_es_sync_utility: AssertionError: 217 != 218

It seems there is a race in es_writer.start(): it can fail if interleaved with a task adding new block(s) (e.g. create_task(self.generate(N))). Hence filing against lbryio/hub.

Test Code:

        # stop the es writer and advance the chain by 1, adding a new claim. upon resuming the es writer, it should
        # add the new claim
        await es_writer.stop()
        await self.stream_create(f"stream11", bid='0.001', confirm=False)
        generate_block_task = asyncio.create_task(self.generate(1))
        await es_writer.start()
        await generate_block_task
        self.assertEqual(11, len(await self.claim_search(order_by=['height'])))

https://github.com/lbryio/lbry-sdk/blob/8becf1f69f38019c8c1d0ac6fbba80897f94c8ed/tests/integration/blockchain/test_wallet_server_sessions.py#L126

Failure:

======================================================================
FAIL: test_es_sync_utility (integration.blockchain.test_wallet_server_sessions.TestESSync)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/lbry-sdk/lbry-sdk/lbry/testcase.py", line 145, in run
    self.loop.run_until_complete(maybe_coroutine)
  File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
    return future.result()
  File "/home/runner/work/lbry-sdk/lbry-sdk/tests/integration/blockchain/test_wallet_server_sessions.py", line 131, in test_es_sync_utility
    await es_writer.start()
  File "/home/runner/work/lbry-sdk/lbry-sdk/.tox/blockchain/lib/python3.9/site-packages/hub/elastic_sync/service.py", line 389, in start
    return await super().start()
  File "/home/runner/work/lbry-sdk/lbry-sdk/.tox/blockchain/lib/python3.9/site-packages/hub/service.py", line 81, in start
    await start_task
  File "/home/runner/work/lbry-sdk/lbry-sdk/.tox/blockchain/lib/python3.9/site-packages/hub/elastic_sync/service.py", line 315, in catch_up
    self.advance(height)
  File "/home/runner/work/lbry-sdk/lbry-sdk/.tox/blockchain/lib/python3.9/site-packages/hub/elastic_sync/service.py", line 230, in advance
    super().advance(height)
  File "/home/runner/work/lbry-sdk/lbry-sdk/.tox/blockchain/lib/python3.9/site-packages/hub/service.py", line 159, in advance
    assert len(self.db.tx_counts) == height, f"{len(self.db.tx_counts)} != {height}"
AssertionError: 217 != 218

Examples:
https://github.com/lbryio/lbry-sdk/runs/6944766671?check_suite_focus=true#step:11:686
https://github.com/lbryio/lbry-sdk/runs/6846528407?check_suite_focus=true#step:11:499
https://github.com/lbryio/lbry-sdk/runs/6846200096?check_suite_focus=true#step:11:469

Incorrect address statuses in the address status index

There is a bug in the address status index; it's not yet clear how many addresses are impacted. It can result in clients not fully syncing their histories (which can lead them to erroneously and unsuccessfully try to spend already-spent utxos).

For the address bSeWnSKTiQRtQ76jkz4pru2XXnw9wchmqb the history is correct, but the precalculated status omits the last transaction when it's hashed, i.e. it's supposed to be

sha256('3c1b765b412703a9f02d0328886df43fe9516ecfea66e70be61658f79b48d70a:1142400:32ddebefc0620d020d91b0f9de5a93082b636905b5b934675e90dbd26b6f7141:1176287:'.encode()).hex()
'49050a8ffaf514cbae321f21e8ab9649da3743ccaeda102fb107fbde491f7971'

but it actually contains

sha256('3c1b765b412703a9f02d0328886df43fe9516ecfea66e70be61658f79b48d70a:1142400:'.encode()).hex()
'9ad3840619ee1ba015c29ce78cd7bb1b4c89d40c5c682bd2c646e476599003a9'

This needs to be reproduced in a test; the bug is probably in the startup / initial catchup code (or else it would be far more common).

Permission issue rolling over log file

herald_1               | --- Logging error ---
herald_1               | Traceback (most recent call last):
herald_1               |   File "/usr/lib/python3.9/logging/handlers.py", line 74, in emit
herald_1               |     self.doRollover()
herald_1               |   File "/usr/lib/python3.9/logging/handlers.py", line 177, in doRollover
herald_1               |     self.rotate(self.baseFilename, dfn)
herald_1               |   File "/usr/lib/python3.9/logging/handlers.py", line 115, in rotate
herald_1               |     os.rename(source, dest)
herald_1               | PermissionError: [Errno 13] Permission denied: '/database/herald.log' -> '/database/herald.log.1'

MemoryError (OOM) in advance block

Traceback (most recent call last):
  File "/home/lbry/scribe/blockchain/service.py", line 1625, in process_blocks_and_mempool_forever
    raise err
  File "/home/lbry/scribe/blockchain/service.py", line 1620, in process_blocks_and_mempool_forever
    await self.check_and_advance_blocks(blocks)
  File "/home/lbry/scribe/blockchain/service.py", line 212, in check_and_advance_blocks
    txo_count = await self.run_in_thread_with_lock(self.advance_block, block)
  File "/home/lbry/scribe/blockchain/service.py", line 140, in run_in_thread_with_lock
    return await asyncio.shield(run_in_thread_locked())
  File "/home/lbry/scribe/blockchain/service.py", line 139, in run_in_thread_locked
    return await asyncio.get_event_loop().run_in_executor(self._executor, func, *args)
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/lbry/scribe/blockchain/service.py", line 1441, in advance_block
    self.db.prefix_db.commit(self.height, self.tip)
  File "/home/lbry/scribe/db/interface.py", line 209, in commit
    undo_ops = self._op_stack.get_undo_ops()
  File "/home/lbry/scribe/db/revertable.py", line 163, in get_undo_ops
    return b''.join(op.invert().pack() for op in reversed(self))
MemoryError

This error can be recovered from by restarting scribe. The server that encountered this also runs lbrycrd and has limited available memory.

"fee" parameter misbehaving

How to reproduce (July 11, spv19, the channel is probably not important):

$ lbrynet claim search --channel=@DistributedBarbecue:3 --fee_amount=0 | grep total_it
  "total_items": 4,
$ lbrynet claim search --channel=@DistributedBarbecue:3  | grep total_it
  "total_items": 13,
$ lbrynet claim search --channel=@DistributedBarbecue:3 --fee_currency=LBC | grep total_it
  "total_items": 0,
$ lbrynet claim search --channel=@DistributedBarbecue:3 --fee_amount=">=0" | grep total_it
  "total_items": 4,
$ lbrynet claim search --channel=@DistributedBarbecue:3 --fee_amount="<0" | grep total_it
  "total_items": 0,

Looks like reposts are gone? Not sure if only reposts are affected

Improve reposted count performance

Maintain a dedicated index to track how many times a claim has been reposted so that it doesn't need to be summed every time it's requested.
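
A sketch of the idea (an in-memory stand-in; in hub this would presumably be a rocksdb index maintained by the writer): keep a running count per reposted claim, updated as reposts are created and spent, so reads are O(1).

from collections import defaultdict

class RepostCountIndex:
    def __init__(self):
        self._counts = defaultdict(int)  # reposted claim_hash -> count

    def on_repost_created(self, reposted_claim_hash):
        self._counts[reposted_claim_hash] += 1

    def on_repost_spent(self, reposted_claim_hash):
        self._counts[reposted_claim_hash] -= 1
        if self._counts[reposted_claim_hash] <= 0:
            del self._counts[reposted_claim_hash]

    def reposted_count(self, claim_hash):
        # O(1) read instead of summing matching reposts on every request.
        return self._counts.get(claim_hash, 0)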

Dead code pertaining to "any_languages" query generation

Saw this while investigating lbryio/lbry-sdk#3328 and any_languages search term implementation.

1 File:
https://github.com/lbryio/scribe/blob/baf630dfa7854dd6ec4152ffdf35b13999093cde/scribe/elasticsearch/search.py#L404

...contains duplicate elif cases pertaining to any_languages:

        elif key == 'any_languages':
            query['must'].append({"terms": {'languages': clean_tags(value)}})
        elif key == 'any_languages':
            query['must'].append({"terms": {'languages': value}})

Since both conditions are identical, the second elif branch is unreachable dead code; it's unclear which body is the intended behavior.

Add an admin api

As a hub operator, it would be useful to be able to:

  • change values of hub component settings without restarting the components
  • check the status of hub components
  • run diagnostics and maintenance on the database (scan for integrity errors, repair damaged dbs, rebuild indexes, drop optional indexes, etc)
  • request usage statistics
  • request hub component resource usage information
  • request mempool / other information from either lbcd or lbrycrd directly
  • call generate on lbcd/lbrycrd on regtest so that the docker stack can be used for integration tests

A new HubAdminService can be added that can expose a single admin api to do these things across the other running and configured hub services.

`scribe-hub` update loop crashes

There is an uncaught exception in the update loop for scribe-hub which causes it to stop polling for updates. This causes it to appear stuck on a block and stop sending notifications, even though scribe may continue writing more blocks.
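
A sketch of a defensive shape for such a loop (the poll_for_updates callable is hypothetical): catch and log exceptions inside each iteration so a single failure can't permanently stop polling.

import asyncio
import logging

log = logging.getLogger(__name__)

async def refresh_blocks_forever(poll_for_updates, interval=1.0):
    # Keep polling even if one iteration raises, so the session manager
    # doesn't silently stop sending notifications.
    while True:
        try:
            await poll_for_updates()
        except asyncio.CancelledError:
            raise  # allow orderly shutdown
        except Exception:
            log.exception('uncaught error in update loop, retrying')
        await asyncio.sleep(interval)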

incorrect claim returned in resolve (when there's a collision on name in channel)

Hablando-de-Bitcoin-y-Criptomonedas#b was requested, but hablando-de-bitcoin-y-criptomonedas#3 was returned.

This doesn't happen often, but there are cases of it.

See:

(lbry-venv) C:\Users\thoma\Documents\lbry-sdk>lbrynet resolve lbry://@criptomonedastv#6/Hablando-de-Bitcoin-y-Criptomonedas#b
{
  "lbry://@criptomonedastv#6/Hablando-de-Bitcoin-y-Criptomonedas#b": {
    "address": "bR81DLPEwTwkMQLqNz48jPtPZFEWJoTcUJ",
    "amount": "0.005",
    "canonical_url": "lbry://@criptomonedastv#6/hablando-de-bitcoin-y-criptomonedas#3",
    "claim_id": "3462d2d415ecd3821ed02bcca15d28847ecab681",
    "claim_op": "update",
    "confirmations": 388845,
    "height": 664642,
    "is_channel_signature_valid": true,
    "meta": {
      "activation_height": 664642,
      "creation_height": 462851,
      "creation_timestamp": 1541135509,
      "effective_amount": "41.02061598",
      "expiration_height": 2767042,
      "is_controlling": false,
      "reposted": 0,
      "support_amount": "41.01561598",
      "take_over_height": 1003910,
      "trending_global": 0.0,
      "trending_group": 0,
      "trending_local": 0.0,
      "trending_mixed": 0.0
    },
    "name": "hablando-de-bitcoin-y-criptomonedas",
    "normalized_name": "hablando-de-bitcoin-y-criptomonedas",
    "nout": 0,
    "permanent_url": "lbry://hablando-de-bitcoin-y-criptomonedas#3462d2d415ecd3821ed02bcca15d28847ecab681",
    "short_url": "lbry://hablando-de-bitcoin-y-criptomonedas#3",
    "signing_channel": {
      "address": "bR81DLPEwTwkMQLqNz48jPtPZFEWJoTcUJ",
