aih / billsim
License: Other
For elasticsearch, see: https://github.com/elastic/elastic-github-actions/tree/master/elasticsearch
Also add Postgres (docker?) and save data to the Postgres db:
See https://github.com/microsoft/vscode-dev-containers/tree/main/containers/python-3-postgres
I tried running billsim with python3.8 and python3.9, and got two separate issues.
With 3.8
___________ ERROR collecting tests/constants_test.py ___________
tests/constants_test.py:3: in <module>
from billsim.pymodels import BillPath
src/billsim/pymodels.py:36: in <module>
class Section(SectionMeta):
src/billsim/pymodels.py:37: in Section
similar_sections: list[SimilarSection]
E TypeError: 'type' object is not subscriptable
_____________ ERROR collecting tests/utils_test.py _____________
tests/utils_test.py:9: in <module>
from billsim.pymodels import BillPath
src/billsim/pymodels.py:36: in <module>
class Section(SectionMeta):
src/billsim/pymodels.py:37: in Section
similar_sections: list[SimilarSection]
E TypeError: 'type' object is not subscriptable
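The `list[SimilarSection]` annotation uses built-in generics (PEP 585), which are only subscriptable at runtime from Python 3.9 onward, so the class body fails on 3.8. A minimal sketch of a 3.8-compatible spelling with `typing.List` follows. The dataclasses and their fields are hypothetical stand-ins, since the real models in `pymodels.py` are not shown in this report:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SimilarSection:
    # Hypothetical fields; the real model in pymodels.py may differ.
    billnumber: str
    score: float


@dataclass
class Section:
    # typing.List is subscriptable on Python 3.8, unlike the built-in list.
    similar_sections: List[SimilarSection] = field(default_factory=list)


section = Section(similar_sections=[SimilarSection("117hr200", 0.92)])
print(len(section.similar_sections))
```

The alternative is to require Python 3.9 or later in the package metadata, which keeps the shorter `list[...]` syntax.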
With 3.9
from lxml import etree
ImportError: dlopen(/opt/homebrew/lib/python3.9/site-packages/lxml/etree.cpython-39-darwin.so, 2): no suitable image found. Did find:
/opt/homebrew/lib/python3.9/site-packages/lxml/etree.cpython-39-darwin.so: mach-o, but wrong architecture
/opt/homebrew/lib/python3.9/site-packages/lxml/etree.cpython-39-darwin.so: mach-o, but wrong architecture
The latter appears to be an issue specific to M1 Macs, since the error complains about the architecture of the compiled lxml binary.
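A quick way to check whether the interpreter and the installed binary could disagree is to print the architecture the running Python reports. This is only a diagnostic sketch; the helper names are made up for illustration. On Apple Silicon, a native interpreter typically reports `arm64`, while one running under Rosetta reports `x86_64`:

```python
import platform
import sys


def interpreter_arch() -> str:
    """Return the machine architecture reported for the running interpreter."""
    return platform.machine()


def describe_interpreter() -> str:
    """Pair the interpreter path with its architecture for a bug report."""
    return f"{sys.executable} -> {interpreter_arch()}"


print(describe_interpreter())
```

If the reported architecture does not match the one the lxml wheel was built for, reinstalling lxml with the same interpreter that runs the tests is a reasonable first thing to try.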
When we find a similarity from bill A to bill B, we also need to store that similarity from B to A, so that looking up bill B brings up bill A.
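A minimal in-memory sketch of that idea, assuming the same score is recorded in both directions (the issue does not say whether A-to-B and B-to-A scores can differ). The function and bill numbers are hypothetical and not part of billsim:

```python
from collections import defaultdict
from typing import DefaultDict, Dict

SimilarityIndex = DefaultDict[str, Dict[str, float]]


def add_similarity(index: SimilarityIndex, bill_a: str, bill_b: str,
                   score: float) -> None:
    """Record the similarity under both bills so either lookup finds it."""
    index[bill_a][bill_b] = score
    index[bill_b][bill_a] = score


index: SimilarityIndex = defaultdict(dict)
add_similarity(index, "117hr200", "117s45", 0.87)
print(index["117s45"])  # looking up the second bill brings up the first
```

In the database, the equivalent would be writing two bill2bill rows per pair, or querying on either column of a single row.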
Add a table with run instance info. Each row includes an id, version, and date.
Add a run instance id to each bill2bill row.
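One possible shape for these tables, sketched with the standard-library `sqlite3` module as a stand-in for Postgres so that it runs anywhere. All table and column names other than `bill2bill` are assumptions, and the real bill2bill schema likely has more columns:

```python
import sqlite3
from datetime import datetime, timezone


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a run_instance table and a bill2bill table that references it."""
    conn.executescript(
        """
        CREATE TABLE run_instance (
            id INTEGER PRIMARY KEY,
            version TEXT NOT NULL,
            run_date TEXT NOT NULL
        );
        CREATE TABLE bill2bill (
            id INTEGER PRIMARY KEY,
            bill_id TEXT NOT NULL,
            bill_to_id TEXT NOT NULL,
            score REAL,
            run_instance_id INTEGER NOT NULL REFERENCES run_instance(id)
        );
        """
    )


def start_run(conn: sqlite3.Connection, version: str) -> int:
    """Insert one run_instance row and return its id."""
    cur = conn.execute(
        "INSERT INTO run_instance (version, run_date) VALUES (?, ?)",
        (version, datetime.now(timezone.utc).isoformat()),
    )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
create_schema(conn)
run_id = start_run(conn, "0.1.0")
# Every similarity row saved during this run carries the run's id.
conn.execute(
    "INSERT INTO bill2bill (bill_id, bill_to_id, score, run_instance_id) "
    "VALUES (?, ?, ?, ?)",
    ("117hr200", "117s45", 0.87, run_id),
)
```

Tagging rows this way makes it possible to compare or discard the output of a specific run.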
See https://github.com/aih/billsim/blob/main/src/billsim/utils_db.py#L382
We currently use sqlalchemy for this batch save operation. However, in some cases, it causes errors:
Traceback (most recent call last):
File "/home/ubuntu/.pyenv/versions/3.9.1/envs/py391/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1799, in _execute_context
self.dialect.do_execute(
File "/home/ubuntu/.pyenv/versions/3.9.1/envs/py391/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 717, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.StatementTooComplex: stack depth limit exceeded
HINT: Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depth limit is adequate.
We may try to increase max_stack_depth, but the statements also appear to be unnecessarily large and recursive.
Can we take advantage of psycopg3 improvements and use it directly for saves?
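Independent of the driver, one way to keep each statement small is to send rows in fixed-size batches through a parameterized `executemany` call. The sketch below works with any DB-API cursor. It is demonstrated with `sqlite3` so that it runs without a server; with psycopg the placeholder would be `%s` rather than `?`. The table, columns, and helper names are hypothetical, and whether this avoids the stack depth error in billsim's case is untested:

```python
import sqlite3
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Row = Tuple


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield successive lists of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def save_in_batches(cursor, sql: str, rows: Iterable[Row],
                    batch_size: int = 1000) -> int:
    """Run one executemany call per batch and return the number of rows sent."""
    saved = 0
    for chunk in batched(rows, batch_size):
        cursor.executemany(sql, chunk)
        saved += len(chunk)
    return saved


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bill2bill (bill_id TEXT, bill_to_id TEXT, score REAL)")
rows = [(f"bill{i}", f"bill{i + 1}", 0.5) for i in range(2500)]
total = save_in_batches(
    conn.cursor(), "INSERT INTO bill2bill VALUES (?, ?, ?)", rows
)
conn.commit()
```

Because every batch reuses the same short parameterized statement, the size of the SQL text no longer grows with the number of rows.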
This issue is to create a GitHub Action that runs the tests on each pull request.
Consider, e.g., SQLAlchemy's create_async_engine() for async Postgres calls.
This function builds on the functions in this repository and the Go functions in aih/bills.
Assumptions:
The generic similarity functions would:
For a Library of Congress project, we need to connect to Postgres on port 5433, not port 5432.
The billsim code doesn't seem to use the values defined in the .env file at the top level of the repo. I thought it did use those .env values, but I may have hardcoded port 5433 into the billsim package and forgotten about it, then reinstalled billsim and wiped that change away.
How can we best pass the postgres port and host into the billsim package? Potentially we should pass it at runtime rather than as an env variable? Or is there some way to have the billsim package use the .env file at the top level of the repo? This would appear to be an issue for any user of the billsim package, as generally you'd want to pass in username/password/etc and not use defaults.
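One option that covers both approaches is a small helper that takes explicit arguments at runtime and falls back to environment variables, then to defaults. This is a sketch only: the function and the environment variable names (`POSTGRES_HOST`, `POSTGRES_PORT`, and so on) are assumptions, not billsim's current configuration:

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def postgres_url(host: Optional[str] = None, port: Optional[int] = None,
                 env: Optional[Mapping[str, str]] = None) -> str:
    """Build a connection URL: explicit arguments win, then env, then defaults."""
    env = os.environ if env is None else env
    user = env.get("POSTGRES_USER", "postgres")
    password = env.get("POSTGRES_PASSWORD", "")
    database = env.get("POSTGRES_DB", "postgres")
    host = host or env.get("POSTGRES_HOST", "localhost")
    port = port if port is not None else int(env.get("POSTGRES_PORT", "5432"))
    # Percent-encode credentials so special characters do not break the URL.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"postgresql://{credentials}@{host}:{port}/{database}"


# Explicit argument overrides whatever the environment says.
print(postgres_url(port=5433, env={"POSTGRES_USER": "billsim"}))
```

A caller can load the top-level .env file into the environment before calling the helper, so the package itself never needs to know where that file lives.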