nextstrain / flu_frequencies
Flu clade and mutation frequencies
Home Page: https://flu-frequencies.vercel.app
In the web app, there are currently two "China" entries in the list of locations. One is a region ("China") and the other is a country ("CHN"). We need to disambiguate these.
Are these still different?
Options:
To reproduce:
Some possible enhancements to interface:
This seems to be due to a lack of data in this region. One way around this would be to add a level in the region-country hierarchy, so that this region would inherit global frequency estimates. Alternatively, we could allow frequency estimates for proximal countries to affect estimates for countries with completely missing data.
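The first option above (letting a location with no data inherit estimates from a higher level) can be sketched as a simple lookup that walks up the hierarchy. This is only an illustration of the fallback idea, not the model the project fits; the names `parent`, `estimates`, `inherited_estimate`, and the example locations are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical hierarchy: each location points to its parent level.
parent: Dict[str, Optional[str]] = {
    "CountryA": "RegionX",
    "RegionX": "global",
    "global": None,
}

# Hypothetical frequency estimates; CountryA and RegionX have no data.
estimates: Dict[str, float] = {"global": 0.4}


def inherited_estimate(
    location: str,
    estimates: Dict[str, float],
    parent: Dict[str, Optional[str]],
) -> Optional[Tuple[str, float]]:
    """Return (source_level, estimate) from the nearest level that has data."""
    current: Optional[str] = location
    seen = set()  # guards against accidental cycles in the hierarchy
    while current is not None and current not in seen:
        if current in estimates:
            return current, estimates[current]
        seen.add(current)
        current = parent.get(current)
    return None  # no level in the chain has an estimate


print(inherited_estimate("CountryA", estimates, parent))  # ('global', 0.4)
```

Returning the source level alongside the value lets the interface show that a displayed estimate was inherited rather than fitted from local data.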
Suggestion from @rneher
npm or snakemake
git clone https://github.com/neherlab/flu_frequencies.git
web branch (wanting to focus on contributing to front-end development)
python3 -m venv venv; source venv/bin/activate
pip to install Python dependencies (pandas, matplotlib, polars)
frequencies.py, expecting different data files - seems that web branch is behind master
sudo apt install snakemake
docs/dev/developer-guide.md - ran into problems with missing data folder, see nextstrain/nextclade#1140
sudo apt install npm - installed npm 6.14.4
sudo npm install --global yarn - installed yarn v1.22.19
yarn - attempted to install package dependencies, but this threw a large number of warnings for version discordance in dependencies, i.e., X has unmet peer dependency Y
yarn add postcss etc. to manually add these dependencies to the project, but this modifies tracked files, i.e., yarn.lock and package.json, so it does not seem to be the right way to go about it
npm run test also threw an error:
(venv) art@Kestrel:~/git/flu_frequencies/web$ npm run test
> [email protected] test /home/art/git/flu_frequencies/web
> yarn test:nowatch --watch --verbose
yarn run v1.22.19
$ jest --config=config/jest/jest.config.js --passWithNoTests --watch --verbose
/home/art/git/flu_frequencies/web/node_modules/jest-cli/build/run.js:129
if (error?.stack) {
^
SyntaxError: Unexpected token .
at Module._compile (internal/modules/cjs/loader.js:723:23)
sudo apt install docker.io
sudo docker run hello-world - runs OK
sudo docker build -f docker/docker-dev.dockerfile . - fails at step 23 of 24:
The command 'bash -euxo pipefail -c set -euxo pipefail >/dev/null && if [ -z "$(getent group ${GID})" ]; then
...
returned a non-zero code: 2
When the flu frequencies workflow gets to the fit_single_frequencies.py step, recent versions of polars throw a panic exception with the following error message:
$ python scripts/fit_single_frequencies.py --metadata data/vic/combined_na.tsv --geo-categories region --frequency-category clade --min-date 2021-01-01 --days 14 --inclusive-clades flu --output-csv results/vic_na/region-frequencies.csv
thread '<unnamed>' panicked at crates/polars-core/src/series/iterator.rs:74:9:
assertion `left == right` failed: impl error
left: 4
right: 1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
--- PyO3 is resuming a panic after fetching a PanicException from Python. ---
Python stack trace below:
Traceback (most recent call last):
File "/Users/jlhudd/miniconda3/envs/flu_frequencies/lib/python3.10/site-packages/polars/expr/expr.py", line 3976, in __call__
result = self.function(*args, **kwargs)
File "/Users/jlhudd/miniconda3/envs/flu_frequencies/lib/python3.10/site-packages/polars/expr/expr.py", line 4299, in wrap_f
return x.map_elements(
File "/Users/jlhudd/miniconda3/envs/flu_frequencies/lib/python3.10/site-packages/polars/series/series.py", line 5270, in map_elements
self._s.apply_lambda(function, pl_return_dtype, skip_nulls)
pyo3_runtime.PanicException: assertion `left == right` failed: impl error
left: 4
right: 1
Traceback (most recent call last):
File "/Users/jlhudd/projects/nextflu-reports/who-2024-02/flu_frequencies/scripts/fit_single_frequencies.py", line 163, in <module>
data, totals, counts, time_bins = load_and_aggregate(d, args.geo_categories, freq_cat,
File "/Users/jlhudd/projects/nextflu-reports/who-2024-02/flu_frequencies/scripts/fit_single_frequencies.py", line 44, in load_and_aggregate
d = d.with_columns([pl.col('date').map_elements(lambda x: to_day_count(x, start_date)).alias("day_count")])
File "/Users/jlhudd/miniconda3/envs/flu_frequencies/lib/python3.10/site-packages/polars/dataframe/frame.py", line 8270, in with_columns
return self.lazy().with_columns(*exprs, **named_exprs).collect(_eager=True)
File "/Users/jlhudd/miniconda3/envs/flu_frequencies/lib/python3.10/site-packages/polars/lazyframe/frame.py", line 1730, in collect
return wrap_df(ldf.collect())
pyo3_runtime.PanicException: assertion `left == right` failed: impl error
left: 4
right: 1
I don't see any obvious changes to our input data between when this used to work and now. Downgrading polars to 0.20.3 allows the frequencies script to run without an error, suggesting that the issue first appeared in polars 0.20.4 (released Jan 12, 2024). This is all with Python 3.10.13 on an Intel Mac (OS version 12.6).
I confirmed that the error only occurs when calling the map_elements section of the failing expression above.
As a band-aid, we could pin polars to 0.20.3 in the Conda environment.
As a longer-term solution, we might try to replace the officially discouraged map_elements call with a different approach.
Or we could switch to pandas.
In PR #27 I am filtering away rows containing countries and regions named "?". But this should be done at the pipeline level.
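A pipeline-level filter could be as small as the sketch below, which drops metadata rows whose location is the "?" placeholder before any frequencies are fitted. It uses only the standard library to stay independent of the data-frame library in use. The column names `region` and `country`, the function name `drop_unknown_locations`, and the example rows are assumptions for illustration.

```python
import csv
import io
from typing import Sequence


def drop_unknown_locations(
    tsv_text: str,
    columns: Sequence[str] = ("region", "country"),  # assumed column names
    placeholder: str = "?",
) -> str:
    """Return the TSV text without rows whose location equals the placeholder."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Skip the row if any of the location columns holds the placeholder.
        if any(row.get(col) == placeholder for col in columns):
            continue
        writer.writerow(row)
    return out.getvalue()


# Hypothetical metadata with two rows that have an unknown location.
metadata = (
    "strain\tregion\tcountry\n"
    "A\tEurope\tFrance\n"
    "B\t?\tFrance\n"
    "C\tAsia\t?\n"
)
print(drop_unknown_locations(metadata))
```

Running a step like this once on the combined metadata means the web app never receives "?" locations, so front-end filtering is no longer needed.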