
ml-struct-bio / cryodrgn

Neural networks for cryo-EM reconstruction

Home Page: http://cryodrgn.cs.princeton.edu

License: GNU General Public License v3.0

Python 86.35% Shell 1.22% Jupyter Notebook 12.43%

cryodrgn's People

Contributors

adamlerer, adriangzlz97, biochem-fan, douweschulte, guillawme, jordibc, kttn8769, lipan6461188, michal-g, mstabrin, rish-raghu, ryanfeathers, thorstenwagner, vineetbansal, zhonge


cryodrgn's Issues

cryodrgn parse_ctf_csparc fails when .cs file is missing image size

Sometimes the cryoSPARC CTF metadata file (.cs) doesn't contain the image size. Add additional arguments to cryodrgn parse_ctf_csparc to override these fields.

$ cryodrgn parse_ctf_csparc J150/P10_J150_passthrough_particles.cs -o ctf.pkl
2021-04-18 13:15:07     1313694 particles
Traceback (most recent call last):
  File "/nobackup/users/zhonge/anaconda3/envs/cryodrgn4/bin/cryodrgn", line 11, in <module>
    load_entry_point('cryodrgn', 'console_scripts', 'cryodrgn')()
  File "/nobackup/users/zhonge/dev/cryodrgn/master/cryodrgn/__main__.py", line 52, in main
    args.func(args)
  File "/nobackup/users/zhonge/dev/cryodrgn/master/cryodrgn/commands/parse_ctf_csparc.py", line 27, in main
    ctf_params[:,0] = metadata['blob/shape'][0][0]
ValueError: no field of name blob/shape
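One way to make the command robust to this, sketched below, is to check for the blob/shape field before reading it and fall back to a user-supplied box size. This assumes the .cs file loads as a NumPy structured array (as cryoSPARC metadata does); the `override_D` argument and `get_image_size` helper are hypothetical, not cryodrgn's actual API:

```python
import numpy as np

def get_image_size(metadata, override_D=None):
    """Return the box size from a .cs structured array, falling back to a
    user-supplied value when the blob/shape field is absent."""
    if "blob/shape" in (metadata.dtype.names or ()):
        return int(metadata["blob/shape"][0][0])
    if override_D is None:
        raise ValueError(".cs file has no blob/shape field; pass the box size explicitly")
    return override_D

# A passthrough-style .cs record array that lacks blob/shape:
meta = np.zeros(3, dtype=[("ctf/df1_A", "f4"), ("ctf/df2_A", "f4")])
print(get_image_size(meta, override_D=128))  # falls back to the override: 128
```

In the CLI this would correspond to an added flag whose value is used whenever blob/shape is missing from the passthrough file.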

Wrong version reported for v0.3.0b

Hi @zhonge - just putting this on your radar that when the cryodrgn version is queried, it returns the previous version number even though I'm running v0.3.0b:

$ cryodrgn --version
cryoDRGN 0.2.1b

Mike

Trying to refine poses fails

Hello,

I am trying to run a training with pose refinement, but it fails before it even starts. Here is the command I ran and the error I got:

cryodrgn train_vae j167-particles-box128_for-cryodrgn.mrcs --o train_009/ --zdim 8 --poses j167-particles-poses-from-RELION.pkl --ctf j167-particles-CTFs-from-RELION.pkl --seed 12345 -n 50 --domain hartley --do-pose-sgd

I am using --domain hartley because an initial attempt without this option told me it is necessary for pose refinement.

2021-03-15 16:25:28     /opt/miniconda3/envs/cryodrgn-0.3.1/bin/cryodrgn train_vae j167-particles-box128_for-cryodrgn.mrcs --o train_009/ --zdim 8 --poses j167-particles-poses-from-RELION.pkl --ctf j167-particles-CTFs-from-RELION.pkl --seed 12345 -n 50 --domain hartley --do-pose-sgd
2021-03-15 16:25:28     Namespace(amp=False, batch_size=8, beta=1.0, beta_control=None, checkpoint=1, ctf='/data/luka/processing/201208_relion/cryodrgn/j167-box128-exploration/j167-particles-CTFs-from-RELION.pkl', datadir=None, do_pose_sgd=True, domain='hartley', emb_type='quat', enc_mask=None, encode_mode='resid', func=<function main at 0x7f1dffde65f0>, ind=None, invert_data=True, lazy=False, load=None, log_interval=1000, lr=0.0001, multigpu=False, norm=None, num_epochs=50, outdir='/data/luka/processing/201208_relion/cryodrgn/j167-box128-exploration/train_009', particles='/data/luka/processing/201208_relion/cryodrgn/j167-box128-exploration/j167-particles-box128_for-cryodrgn.mrcs', pdim=256, pe_dim=None, pe_type='geom_lowf', players=3, pose_lr=0.0003, poses='/data/luka/processing/201208_relion/cryodrgn/j167-box128-exploration/j167-particles-poses-from-RELION.pkl', pretrain=1, qdim=256, qlayers=3, relion31=False, seed=12345, tilt=None, tilt_deg=45, use_real=False, verbose=False, wd=0, window=True, zdim=8)
2021-03-15 16:25:28     Use cuda True
2021-03-15 16:25:29     Loaded 43698 128x128 images
2021-03-15 16:25:55     Normalized HT by 0 +/- 231.78614807128906
Traceback (most recent call last):
  File "/opt/miniconda3/envs/cryodrgn-0.3.1/bin/cryodrgn", line 33, in <module>
    sys.exit(load_entry_point('cryodrgn==0.3.1', 'console_scripts', 'cryodrgn')())
  File "/opt/miniconda3/envs/cryodrgn-0.3.1/lib/python3.7/site-packages/cryodrgn-0.3.1-py3.7.egg/cryodrgn/__main__.py", line 50, in main
    args.func(args)
  File "/opt/miniconda3/envs/cryodrgn-0.3.1/lib/python3.7/site-packages/cryodrgn-0.3.1-py3.7.egg/cryodrgn/commands/train_vae.py", line 318, in main
    pose_optimizer = torch.optim.SparseAdam(posetracker.parameters(), lr=args.pose_lr) if do_pose_sgd else None
  File "/opt/miniconda3/envs/cryodrgn-0.3.1/lib/python3.7/site-packages/torch/optim/sparse_adam.py", line 49, in __init__
    super(SparseAdam, self).__init__(params, defaults)
  File "/opt/miniconda3/envs/cryodrgn-0.3.1/lib/python3.7/site-packages/torch/optim/optimizer.py", line 47, in __init__
    raise ValueError("optimizer got an empty parameter list")
ValueError: optimizer got an empty parameter list

Is there anything wrong with my command? I am keeping most options at their default values, and the exact same command without --domain hartley --do-pose-sgd completes successfully, so I'm not sure what I'm missing.

Thank you in advance for your help.

Import error in write_starfile.py

Hi Ellen, I pulled the latest code and am now getting an import error when trying to run write_starfile.py (even though the cryodrgn environment is active). Would appreciate any guidance, thanks!

  File "/home/darst/cryodrgn/utils/write_starfile.py", line 12, in <module>
    from cryodrgn import dataset
ImportError: No module named cryodrgn

amp running error

I uninstalled cryodrgn and reinstalled.

Then, following https://github.com/NVIDIA/apex#quick-start, I re-installed Apex via the Python-only build (because installation with the CUDA and C++ extensions errored).

However,

~/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/bin/cryodrgn train_vae cryosparc_P4_J251_009_particles_cs_abs_w_mrcs_star_06_25.64.mrcs --poses pose_128.pkl --ctf ctf.pkl --zdim 64 -n 100 -o vae64_z64_e100_b_64_amp_wo_tonga_interactive --batch-size 64 --amp

resulted in

2020-09-07 19:19:31     1315458 parameters in model
Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.
Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Traceback (most recent call last):
  File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/bin/cryodrgn", line 33, in <module>
    sys.exit(load_entry_point('cryodrgn==0.2.1b0', 'console_scripts', 'cryodrgn')())
  File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/__main__.py", line 50, in main
    args.func(args)
  File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/commands/train_vae.py", line 364, in main
    model, optim = amp.initialize(model, optim, opt_level='O1')
  File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/lib/python3.7/site-packages/apex/amp/frontend.py", line 358, in initialize
    return _initialize(models, optimizers, _amp_state.opt_properties, num_losses, cast_model_outputs)
  File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/lib/python3.7/site-packages/apex/amp/_initialize.py", line 171, in _initialize
    check_params_fp32(models)
  File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/lib/python3.7/site-packages/apex/amp/_initialize.py", line 93, in check_params_fp32
    name, param.type()))
  File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/lib/python3.7/site-packages/apex/amp/_amp_state.py", line 32, in warn_or_err
    raise RuntimeError(msg)
RuntimeError: Found param encoder.main.0.weight with type torch.FloatTensor, expected torch.cuda.FloatTensor. When using amp.initialize, you need to provide a model with parameters located on a CUDA device before passing it no matter what optimization level you chose. Use model.to('cuda') to use the default device.

nan during train_vae

Maybe this is not a cryodrgn-specific issue, but a more general VAE problem.

Anyway, have you ever seen "nan" values like the ones below?

Thank you

--

2020-08-15 07:13:21     # =====> Epoch: 2457 Average gen loss = 0.838749, KLD = 3.326967, total loss = 0.839008; Finished in 0:03:07.132678
2020-08-15 07:17:01     # =====> Epoch: 2458 Average gen loss = 0.838748, KLD = 3.325715, total loss = 0.839007; Finished in 0:03:02.530018
2020-08-15 07:20:41     # =====> Epoch: 2459 Average gen loss = nan, KLD = 4347.587754, total loss = nan; Finished in 0:03:05.964074
2020-08-15 07:24:22     # =====> Epoch: 2460 Average gen loss = nan, KLD = 297.862892, total loss = nan; Finished in 0:03:02.167709
2020-08-15 07:28:03     # =====> Epoch: 2461 Average gen loss = 0.908481, KLD = 8.505987, total loss = 0.909143; Finished in 0:03:03.817373
2020-08-15 07:31:55     # =====> Epoch: 2462 Average gen loss = 1.12218, KLD = 7.893573, total loss = 1.122793; Finished in 0:03:10.836993
2020-08-15 07:35:33     # =====> Epoch: 2463 Average gen loss = 0.840698, KLD = 3.719698, total loss = 0.840987; Finished in 0:03:03.367703
2020-08-15 07:39:14     # =====> Epoch: 2464 Average gen loss = 0.928574, KLD = 3.505119, total loss = 0.928846; Finished in 0:03:08.811053
2020-08-15 07:42:48     # =====> Epoch: 2465 Average gen loss = 0.838928, KLD = 3.334144, total loss = 0.839187; Finished in 0:03:00.716042
2020-08-15 07:46:29     # =====> Epoch: 2466 Average gen loss = 0.8393, KLD = 3.351988, total loss = 0.839561; Finished in 0:03:02.098446
2020-08-15 07:50:07     # =====> Epoch: 2467 Average gen loss = 0.839245, KLD = 3.370683, total loss = 0.839507; Finished in 0:03:02.463680
2020-08-15 07:53:52     # =====> Epoch: 2468 Average gen loss = 0.839711, KLD = 3.369308, total loss = 0.839973; Finished in 0:03:07.060146
2020-08-15 07:57:28     # =====> Epoch: 2469 Average gen loss = 0.838877, KLD = 3.337764, total loss = 0.839137; Finished in 0:03:00.234357
2020-08-15 08:01:08     # =====> Epoch: 2470 Average gen loss = 0.839412, KLD = 3.340018, total loss = 0.839672; Finished in 0:03:06.454027
2020-08-15 08:05:04     # =====> Epoch: 2471 Average gen loss = 0.83875, KLD = 3.335607, total loss = 0.839010; Finished in 0:03:14.294994
2020-08-15 08:08:58     # =====> Epoch: 2472 Average gen loss = 0.838798, KLD = 3.332927, total loss = 0.839057; Finished in 0:03:17.291361
2020-08-15 08:12:41     # =====> Epoch: 2473 Average gen loss = 0.839363, KLD = 3.337848, total loss = 0.839623; Finished in 0:03:02.637869
2020-08-15 08:16:24     # =====> Epoch: 2474 Average gen loss = 0.83879, KLD = 3.334751, total loss = 0.839049; Finished in 0:03:06.469525
2020-08-15 08:20:06     # =====> Epoch: 2475 Average gen loss = 0.838746, KLD = 3.335152, total loss = 0.839005; Finished in 0:03:02.485381
2020-08-15 08:23:46     # =====> Epoch: 2476 Average gen loss = 0.839, KLD = 3.334727, total loss = 0.839259; Finished in 0:03:03.908909
2020-08-15 08:27:28     # =====> Epoch: 2477 Average gen loss = 0.838743, KLD = 3.335539, total loss = 0.839003; Finished in 0:03:06.945390
2020-08-15 08:31:05     # =====> Epoch: 2478 Average gen loss = 0.838759, KLD = 3.336668, total loss = 0.839019; Finished in 0:03:01.373870
2020-08-15 08:34:52     # =====> Epoch: 2479 Average gen loss = 0.838768, KLD = 3.334897, total loss = 0.839027; Finished in 0:03:04.845571
2020-08-15 08:38:32     # =====> Epoch: 2480 Average gen loss = 0.838751, KLD = 3.331835, total loss = 0.839010; Finished in 0:03:03.818442
2020-08-15 08:42:07     # =====> Epoch: 2481 Average gen loss = nan, KLD = 33385075859247759360.000000, total loss = nan; Finished in 0:03:01.271044
2020-08-15 08:45:51     # =====> Epoch: 2482 Average gen loss = nan, KLD = 319211574631489344.000000, total loss = nan; Finished in 0:03:05.973848
2020-08-15 08:49:58     # =====> Epoch: 2483 Average gen loss = nan, KLD = 64188202244660896.000000, total loss = nan; Finished in 0:03:24.483174
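I haven't traced where the NaNs originate, but a generic guard against epochs like these (a sketch only, not cryodrgn's actual behavior) is to skip non-finite batch losses when accumulating the epoch average, and report how many were dropped:

```python
import math

def safe_accumulate(losses):
    """Average only the finite batch losses, reporting how many were skipped."""
    kept = [x for x in losses if math.isfinite(x)]
    skipped = len(losses) - len(kept)
    avg = sum(kept) / len(kept) if kept else float("nan")
    return avg, skipped

# One nan and one inf batch out of five:
avg, skipped = safe_accumulate([0.84, float("nan"), 0.91, float("inf"), 0.83])
print(avg, skipped)
```

A stricter variant would abort (or reload the last checkpoint) as soon as a non-finite loss appears, since a single nan gradient step can corrupt the model weights.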

cryodrgn analyze crash while running UMAP

Hi,
I ran the training on my own data (~900k particles, box size 130) like this and then tried to analyze some intermediate steps:

cryodrgn train_vae stack130.mrcs --poses pose.pkl --ctf ctf.pkl --zdim 10 -n 50 --qdim 1024 --qlayers 3 --pdim 1024 --players 3 -o 01_SE_z10_bx130 --lazy

cryodrgn analyze 01_SE_z10_bx130 18 --Apix 3.252 --device 1

Whenever I do this I get a crash at the end of the analysis. I get volumes for PC1/2 and kmeans20, but I seem to be missing some PNGs and the Jupyter notebook files.

Any idea what could be causing this?

Log:

2020-06-19 16:05:09 Saving results to /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18
2020-06-19 16:05:09 Perfoming principal component analysis...
2020-06-19 16:05:09 Explained variance ratio:
2020-06-19 16:05:09 [0.25000114 0.12236101 0.11018245 0.09589829 0.09308468 0.08158701
0.07270409 0.06875768 0.05479458 0.05062907]
2020-06-19 16:05:09 K-means clustering...
2020-06-19 16:05:37 Generating volumes...
2020-06-19 16:05:37 Running command:
CUDA_VISIBLE_DEVICES=1 cryodrgn eval_vol /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/weights.18.pkl --config /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/config.pkl --zfile /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/pc1/z_values.txt -o /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/pc1 --Apix 3.252
2020-06-19 16:05:40 Use cuda True
2020-06-19 16:05:40 Namespace(Apix=3.252, D=130, config='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/config.pkl', domain='fourier', downsample=None, enc_mask=65, encode_mode='resid', flip=False, func=<function main at 0x7f0a9272b440>, l_extent=0.5, n=10, norm=[0, 130.7921984067214], o='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/pc1', pdim=1024, pe_dim=None, pe_type='geom_lowf', players=3, prefix='vol_', qdim=1024, qlayers=3, verbose=False, weights='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/weights.18.pkl', z=None, z_end=None, z_start=None, zdim=10, zfile='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/pc1/z_values.txt')
2020-06-19 16:05:42 Using circular lattice with radius 65
2020-06-19 16:05:42 Loading weights from /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/weights.18.pkl
2020-06-19 16:05:43 Generating 10 volumes
2020-06-19 16:05:43 [ 0.02677018 1.16648743 -1.00013293 0.37046741 0.03779986 1.58210712
-0.21389392 0.75957619 0.09397226 -0.31373119]
2020-06-19 16:05:45 [-0.01569043 0.84260323 -0.71331644 0.2482569 -0.00333118 1.14377593
-0.22116149 0.59137023 0.07997815 -0.26208557]
2020-06-19 16:05:48 [-0.05815104 0.51871902 -0.42649996 0.12604639 -0.04446223 0.70544475
-0.22842907 0.42316426 0.06598405 -0.21043995]
2020-06-19 16:05:51 [-0.10061165 0.19483482 -0.13968347 0.00383587 -0.08559327 0.26711356
-0.23569665 0.2549583 0.05198994 -0.15879433]
2020-06-19 16:05:53 [-0.14307226 -0.12904939 0.14713302 -0.11837464 -0.12672432 -0.17121763
-0.24296423 0.08675234 0.03799583 -0.10714871]
2020-06-19 16:05:56 [-0.18553287 -0.45293359 0.4339495 -0.24058515 -0.16785536 -0.60954882
-0.2502318 -0.08145363 0.02400173 -0.0555031 ]
2020-06-19 16:05:59 [-0.22799348 -0.7768178 0.72076599 -0.36279566 -0.20898641 -1.04788001
-0.25749938 -0.24965959 0.01000762 -0.00385748]
2020-06-19 16:06:02 [-0.27045409 -1.10070201 1.00758247 -0.48500618 -0.25011745 -1.4862112
-0.26476696 -0.41786556 -0.00398648 0.04778814]
2020-06-19 16:06:04 [-0.31291469 -1.42458621 1.29439896 -0.60721669 -0.2912485 -1.92454239
-0.27203453 -0.58607152 -0.01798059 0.09943376]
2020-06-19 16:06:07 [-0.3553753 -1.74847042 1.58121545 -0.7294272 -0.33237955 -2.36287358
-0.27930211 -0.75427748 -0.03197469 0.15107938]
2020-06-19 16:06:10 Finsihed in 0:00:31.376099
2020-06-19 16:06:11 Running command:
CUDA_VISIBLE_DEVICES=1 cryodrgn eval_vol /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/weights.18.pkl --config /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/config.pkl --zfile /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/pc2/z_values.txt -o /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/pc2 --Apix 3.252
2020-06-19 16:06:14 Use cuda True
2020-06-19 16:06:14 Namespace(Apix=3.252, D=130, config='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/config.pkl', domain='fourier', downsample=None, enc_mask=65, encode_mode='resid', flip=False, func=<function main at 0x7fc4e2df9440>, l_extent=0.5, n=10, norm=[0, 130.7921984067214], o='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/pc2', pdim=1024, pe_dim=None, pe_type='geom_lowf', players=3, prefix='vol_', qdim=1024, qlayers=3, verbose=False, weights='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/weights.18.pkl', z=None, z_end=None, z_start=None, zdim=10, zfile='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/pc2/z_values.txt')
2020-06-19 16:06:16 Using circular lattice with radius 65
2020-06-19 16:06:16 Loading weights from /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/weights.18.pkl
2020-06-19 16:06:16 Generating 10 volumes
2020-06-19 16:06:16 [ 1.26795325 -0.75064031 -0.33676747 -0.38691357 0.51339586 0.44470336
0.67303123 0.36097304 0.62133282 0.43958098]
2020-06-19 16:06:19 [ 0.95961906 -0.56249652 -0.27260132 -0.30833687 0.37685765 0.37421879
0.46977053 0.32547674 0.49331972 0.30971192]
2020-06-19 16:06:22 [ 0.65128487 -0.37435273 -0.20843517 -0.22976017 0.24031943 0.30373421
0.26650983 0.28998044 0.36530663 0.17984287]
2020-06-19 16:06:25 [ 0.34295068 -0.18620894 -0.14426902 -0.15118348 0.10378121 0.23324964
0.06324912 0.25448415 0.23729353 0.04997381]
2020-06-19 16:06:28 [ 0.03461649 0.00193485 -0.08010287 -0.07260678 -0.03275701 0.16276506
-0.14001158 0.21898785 0.10928043 -0.07989524]
2020-06-19 16:06:30 [-0.2737177 0.19007864 -0.01593672 0.00596992 -0.16929522 0.09228049
-0.34327229 0.18349156 -0.01873267 -0.2097643 ]
2020-06-19 16:06:33 [-0.58205189 0.37822243 0.04822943 0.08454661 -0.30583344 0.02179591
-0.54653299 0.14799526 -0.14674576 -0.33963335]
2020-06-19 16:06:36 [-0.89038609 0.56636622 0.11239558 0.16312331 -0.44237166 -0.04868866
-0.7497937 0.11249897 -0.27475886 -0.46950241]
2020-06-19 16:06:39 [-1.19872028 0.75451001 0.17656173 0.2417 -0.57890988 -0.11917324
-0.9530544 0.07700267 -0.40277196 -0.59937146]
2020-06-19 16:06:41 [-1.50705447 0.9426538 0.24072788 0.3202767 -0.71544809 -0.18965781
-1.1563151 0.04150638 -0.53078505 -0.72924052]
2020-06-19 16:06:44 Finsihed in 0:00:31.934163
2020-06-19 16:06:45 Running command:
CUDA_VISIBLE_DEVICES=1 cryodrgn eval_vol /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/weights.18.pkl --config /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/config.pkl --zfile /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/kmeans20/z_values.txt -o /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/kmeans20 --Apix 3.252
2020-06-19 16:06:48 Use cuda True
2020-06-19 16:06:48 Namespace(Apix=3.252, D=130, config='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/config.pkl', domain='fourier', downsample=None, enc_mask=65, encode_mode='resid', flip=False, func=<function main at 0x7fce93b994d0>, l_extent=0.5, n=10, norm=[0, 130.7921984067214], o='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/kmeans20', pdim=1024, pe_dim=None, pe_type='geom_lowf', players=3, prefix='vol_', qdim=1024, qlayers=3, verbose=False, weights='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/weights.18.pkl', z=None, z_end=None, z_start=None, zdim=10, zfile='/scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/analyze.18/kmeans20/z_values.txt')
2020-06-19 16:06:51 Using circular lattice with radius 65
2020-06-19 16:06:51 Loading weights from /scratch/lk/cryoDRGN_test2/01_SE_z10_bx130/weights.18.pkl
2020-06-19 16:06:51 Generating 20 volumes
2020-06-19 16:06:51 [-0.97132516 -0.64415139 1.90188849 0.34832573 -1.67768192 -2.35388446
-0.97918421 0.40027481 -0.57715338 0.90271413]
2020-06-19 16:06:54 [-0.59527075 1.36312187 -0.56915736 -0.54813004 -0.82470226 0.73684198
-0.01291508 0.48670399 0.92089242 -0.69209099]
2020-06-19 16:06:56 [-0.0711624 1.09866858 -1.84852767 0.22727478 -0.67786133 1.02027476
0.78967232 -0.61266065 0.23449217 0.34753153]
2020-06-19 16:06:59 [-0.14201422 -1.40049565 1.12201524 -2.0198772 -0.76233393 -2.03652024
-0.38700676 -0.23054661 -0.39152634 -0.43772185]
2020-06-19 16:07:02 [-0.16621244 0.81222612 -1.20070136 -0.31326452 -0.76325411 0.31952477
-1.06328392 0.73861891 -0.71570486 -0.5696395 ]
2020-06-19 16:07:05 [-0.72862613 0.12580048 -0.75117576 1.07117748 0.57677025 1.36989951
-0.00699391 -0.49410373 -1.05118728 -0.25212762]
2020-06-19 16:07:07 [-0.43604895 -1.51223695 0.66002649 0.35800946 0.45665389 -1.19505262
-0.04969365 -2.26757884 -0.01168675 0.23415926]
2020-06-19 16:07:10 [-0.18047653 0.01970948 0.62216055 0.84891927 1.28573275 1.66495371
-0.61027855 0.83711773 0.06010877 -0.74653506]
2020-06-19 16:07:13 [-0.1910335 0.34751123 -0.31187138 0.74085212 0.71890074 1.61903584
0.31797802 -0.46175307 0.50515455 -0.30812699]
2020-06-19 16:07:16 [-1.60206747 0.92126781 -0.17496088 0.43709821 -0.3615002 1.04394341
-0.79904234 0.60373396 -0.44559866 -0.7461015 ]
2020-06-19 16:07:19 [ 0.40465146 0.44999629 0.80674958 1.4138279 -1.33104229 -1.16981912
-0.26164401 0.44663632 0.42494154 1.19738328]
2020-06-19 16:07:21 [ 0.59802741 -2.95357847 1.96336532 -0.59710884 0.59047776 -1.21152484
-0.09131909 -0.28926587 0.17629631 -0.00756058]
2020-06-19 16:07:24 [ 0.77092814 0.35969877 -0.51961726 -0.52401483 -0.02129811 1.14776206
0.40104675 1.6071167 -0.65990335 0.07273325]
2020-06-19 16:07:27 [ 0.08617193 -1.41762102 -0.02224282 -0.902996 0.02453905 -1.86269879
0.67480099 -1.76085651 -0.01389218 1.05045223]
2020-06-19 16:07:30 [ 0.87995082 -0.17809655 -0.79066652 0.18712413 0.41063935 1.22204924
-0.83809823 1.44176102 0.48284733 -0.23812459]
2020-06-19 16:07:33 [ 1.20228064 0.32292104 -1.41099977 -0.02424157 0.16307265 1.43185556
-0.1174975 -0.32243931 -0.10479626 0.71535921]
2020-06-19 16:07:35 [ 0.77479064 0.13963751 -0.3947379 0.10177684 0.1000241 1.31804752
0.50730151 1.11336291 1.3616066 -0.4266327 ]
2020-06-19 16:07:38 [-0.79391146 0.16412331 0.62868607 -0.64603126 -0.09520882 -1.21085966
0.01015996 0.07755069 1.40696418 -0.88159704]
2020-06-19 16:07:41 [ 0.70707965 2.43540907 4.02419424 2.76443481 -1.47308397 -0.29974866
-2.50558877 -2.63116574 1.03713751 -3.34746337]
2020-06-19 16:07:44 [-1.09395432 -1.4983263 0.95806909 -1.22308755 0.21075344 -1.88790619
-1.16221488 -0.54821455 -0.60948658 -0.93069148]
2020-06-19 16:07:47 Finsihed in 0:00:59.782863
2020-06-19 16:07:47 Running UMAP...
Traceback (most recent call last):
  File "/home/lk/anaconda3/envs/cryodrgn/bin/cryodrgn", line 11, in <module>
    load_entry_point('cryodrgn==0.2.1b0', 'console_scripts', 'cryodrgn')()
  File "/home/lk/anaconda3/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/__main__.py", line 50, in main
    args.func(args)
  File "/home/lk/anaconda3/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/commands/analyze.py", line 161, in main
    analyze_zN(z, outdir, vg, args.skip_umap)
  File "/home/lk/anaconda3/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/commands/analyze.py", line 88, in analyze_zN
    umap_emb = analysis.run_umap(z)
  File "/home/lk/anaconda3/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/analysis.py", line 68, in run_umap
    z_embedded = reducer.fit_transform(z)
  File "/home/lk/anaconda3/envs/cryodrgn/lib/python3.7/site-packages/umap_learn-0.4.4-py3.7.egg/umap/umap_.py", line 2012, in fit_transform
    self.fit(X, y)
  File "/home/lk/anaconda3/envs/cryodrgn/lib/python3.7/site-packages/umap_learn-0.4.4-py3.7.egg/umap/umap_.py", line 1833, in fit
    self._search_graph.transpose()
  File "/home/lk/anaconda3/envs/cryodrgn/lib/python3.7/site-packages/scipy-1.5.0rc1-py3.7-linux-x86_64.egg/scipy/sparse/lil.py", line 437, in transpose
    return self.tocsr(copy=copy).transpose(axes=axes, copy=False).tolil(copy=False)
  File "/home/lk/anaconda3/envs/cryodrgn/lib/python3.7/site-packages/scipy-1.5.0rc1-py3.7-linux-x86_64.egg/scipy/sparse/lil.py", line 468, in tocsr
    _csparsetools.lil_get_lengths(self.rows, lengths)
  File "_csparsetools.pyx", line 109, in scipy.sparse._csparsetools.lil_get_lengths
  File "stringsource", line 658, in View.MemoryView.memoryview_cwrapper
  File "stringsource", line 349, in View.MemoryView.memoryview.__cinit__
TypeError: a bytes-like object is required, not 'list'
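The bottom frames show scipy 1.5.0rc1's lil-matrix code being called from umap-learn 0.4.4, which suggests a version incompatibility rather than a cryodrgn bug. A sketch of a pre-flight version check follows; the "<1.5" cutoff is an assumption inferred from this traceback, not a documented constraint, and `version_tuple`/`scipy_ok_for_umap` are hypothetical helpers:

```python
def version_tuple(v):
    """Parse '1.5.0rc1' -> (1, 5, 0), ignoring pre-release suffixes."""
    parts = []
    for p in v.split("."):
        digits = ""
        for ch in p:
            if ch.isdigit():
                digits += ch
            else:
                break
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def scipy_ok_for_umap(scipy_version):
    # Assumption from this traceback: umap-learn 0.4.x breaks against
    # scipy >= 1.5, so either pin scipy < 1.5 or upgrade umap-learn.
    return version_tuple(scipy_version) < (1, 5)

print(scipy_ok_for_umap("1.4.1"))    # True
print(scipy_ok_for_umap("1.5.0rc1"))  # False
```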

Jupyter notebook interactive widgets not showing up

Documenting a known issue where the interactive widgets in the jupyter notebook silently fail to appear. The text VBox(...) appears instead of the widget.


It has something to do with the interaction between plotly and the jupyter extension, but I'm not exactly sure what the issue is. Here are a few documented workarounds:

  • Using jupyter-lab instead of jupyter-notebook to open the jupyter notebook (thanks @sora)
  • Installing additional jupyter lab extensions (thanks @lkinman):
conda install -c conda-forge nodejs
jupyter labextension install @jupyter-widgets/jupyterlab-manager
export NODE_OPTIONS=--max-old-space-size=4096
jupyter labextension install [email protected] --no-build
jupyter labextension install [email protected] --no-build
jupyter lab build
unset NODE_OPTIONS

Chunked data loading for large datasets

The default behavior is to load the whole dataset into memory for training; however, for particularly large datasets that don't fit in memory, an option (--lazy) is currently provided to load images from disk on the fly during training. This is a very poor filesystem access pattern, and the latency of disk access can be a severe bottleneck in some cases.

Probably what makes more sense is to preprocess the data into chunks (how big?) and train on each chunk sequentially. Slightly less randomness in the mini-batches, but assuming there's no order in the dataset, this likely doesn't matter. Could also add FFT + normalization in this preprocessing step too. The main downside I see is additional disk space usage for storing the chunks.
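The chunking idea above can be sketched as follows, with hypothetical file names and a plain NumPy array standing in for the particle stack; chunks are written once during preprocessing, then each epoch streams one chunk at a time and shuffles only within it:

```python
import os
import tempfile
import numpy as np

def write_chunks(images, chunk_size, outdir):
    """Preprocess a particle stack into sequentially numbered chunk files."""
    paths = []
    for i in range(0, len(images), chunk_size):
        path = os.path.join(outdir, f"chunk_{i // chunk_size:04d}.npy")
        np.save(path, images[i:i + chunk_size])  # FFT + normalization could go here
        paths.append(path)
    return paths

def iter_minibatches(chunk_paths, batch_size, rng):
    """Train on one chunk at a time; only one chunk is resident in memory."""
    for path in chunk_paths:
        chunk = np.load(path)
        order = rng.permutation(len(chunk))  # shuffle within the chunk only
        for j in range(0, len(chunk), batch_size):
            yield chunk[order[j:j + batch_size]]

rng = np.random.default_rng(0)
images = np.arange(100, dtype=np.float32).reshape(25, 2, 2)  # 25 tiny "particles"
with tempfile.TemporaryDirectory() as d:
    paths = write_chunks(images, chunk_size=10, outdir=d)
    n = sum(len(b) for b in iter_minibatches(paths, batch_size=4, rng=rng))
print(n)  # 25: every image is still seen once per epoch
```

Shuffling the chunk order between epochs would recover some of the lost mini-batch randomness at no extra I/O cost.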

Compatibility issue with pytorch v1.7.0 and pose SGD

When I run cryodrgn train_nn data/toy_projections.mrcs --poses data/toy_rot_zerotrans.pkl -o output/toy_recon --do-pose-sgd, I get "ValueError: optimizer got an empty parameter list". The whole log is below:

ZOU@LAPTOP-GMM4MN25 MINGW64 /d/Study/The Road to AI/AIRoad/cryodrgn/testing (master)
$ cryodrgn train_nn data/toy_projections.mrcs --poses data/toy_rot_zerotrans.pkl -o output/toy_recon --do-pose-sgd
2020-12-14 09:27:02     Namespace(amp=False, batch_size=8, checkpoint=1, ctf=None, datadir=None, dim=256, do_pose_sgd=True, domain='fourier', emb_typ
e='quat', func=<function main at 0x000001E7D44836A8>, ind=None, invert_data=True, l_extent=0.5, layers=3, lazy=False, load=None, log_interval=1000, l
r=0.0001, multigpu=False, norm=None, num_epochs=20, outdir='D:\\Study\\The Road to AI\\AIRoad\\cryodrgn\\testing\\output\\toy_recon', particles='D:\\
Study\\The Road to AI\\AIRoad\\cryodrgn\\testing\\data\\toy_projections.mrcs', pe_dim=None, pe_type='geom_lowf', pose_lr=0.0001, poses='D:\\Study\\Th
e Road to AI\\AIRoad\\cryodrgn\\testing\\data\\toy_rot_zerotrans.pkl', pretrain=5, relion31=False, seed=95562, verbose=False, wd=0, window=True)
2020-12-14 09:27:02     Use cuda True
2020-12-14 09:27:02     Loaded 1000 30x30 images
2020-12-14 09:27:02     Normalized HT by 0 +/- 25.786991119384766
2020-12-14 09:27:06     FTPositionalDecoder(
  (decoder): ResidLinearMLP(
    (main): Sequential(
      (0): Linear(in_features=90, out_features=256, bias=True)
      (1): ReLU()
      (2): ResidLinear(
        (linear): Linear(in_features=256, out_features=256, bias=True)
      )
      (3): ReLU()
      (4): ResidLinear(
        (linear): Linear(in_features=256, out_features=256, bias=True)
      )
      (5): ReLU()
      (6): ResidLinear(
        (linear): Linear(in_features=256, out_features=256, bias=True)
      )
      (7): ReLU()
      (8): Linear(in_features=256, out_features=2, bias=True)
    )
  )
)
2020-12-14 09:27:06     221186 parameters in model
Traceback (most recent call last):
  File "D:\Anaconda\envs\cryodrgn\Scripts\cryodrgn-script.py", line 33, in <module>
    sys.exit(load_entry_point('cryodrgn==0.3.0b0', 'console_scripts', 'cryodrgn')())
  File "D:\Anaconda\envs\cryodrgn\lib\site-packages\cryodrgn-0.3.0b0-py3.7.egg\cryodrgn\__main__.py", line 50, in main
    args.func(args)
  File "D:\Anaconda\envs\cryodrgn\lib\site-packages\cryodrgn-0.3.0b0-py3.7.egg\cryodrgn\commands\train_nn.py", line 169, in main
    pose_optimizer = torch.optim.SparseAdam(posetracker.parameters(), lr=args.pose_lr)
  File "D:\Anaconda\envs\cryodrgn\lib\site-packages\torch\optim\sparse_adam.py", line 49, in __init__
    super(SparseAdam, self).__init__(params, defaults)
  File "D:\Anaconda\envs\cryodrgn\lib\site-packages\torch\optim\optimizer.py", line 47, in __init__
    raise ValueError("optimizer got an empty parameter list")
ValueError: optimizer got an empty parameter list

I found it is caused by this code in /cryodrgn/commands/train_nn.py:

# load poses
if args.do_pose_sgd:
    posetracker = PoseTracker.load(args.poses, Nimg, D, args.emb_type, ind)
    pose_optimizer = torch.optim.SparseAdam(posetracker.parameters(), lr=args.pose_lr)

If I try to run "train_vae" with the same setting, I get the same error caused by the same code in /cryodrgn/commands/train_vae.py:

# load poses
if args.do_pose_sgd:
    assert args.domain == 'hartley', "Need to use --domain hartley if doing pose SGD"
do_pose_sgd = args.do_pose_sgd
posetracker = PoseTracker.load(args.poses, Nimg, D, 's2s2' if do_pose_sgd else None, ind)
pose_optimizer = torch.optim.SparseAdam(posetracker.parameters(), lr=args.pose_lr) if do_pose_sgd else None

So I think it may be an undiscovered bug.
My environment: Windows 10, PyCharm, cryodrgn 0.3.0b
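One plausible cause, consistent with the issue title: in PyTorch around v1.7, SparseAdam's parameter validation iterates the `params` argument, so passing the generator returned by `posetracker.parameters()` exhausts it before the base Optimizer sees any parameters; wrapping it in `list(...)` would avoid this. This is an assumption about the failure mode, illustrated library-independently below (`validate`/`make_optimizer` are stand-ins, not PyTorch API):

```python
def validate(params):
    # Iterating the argument consumes it when it is a generator.
    for p in params:
        assert p is not None

def make_optimizer(params):
    validate(params)          # exhausts a generator...
    remaining = list(params)  # ...so the "real" consumer sees nothing
    if not remaining:
        raise ValueError("optimizer got an empty parameter list")
    return remaining

gen = (w for w in [1.0, 2.0])  # like posetracker.parameters()
try:
    make_optimizer(gen)
except ValueError as e:
    print(e)  # reproduces: optimizer got an empty parameter list

print(make_optimizer(list(w for w in [1.0, 2.0])))  # [1.0, 2.0]
```

If this is the cause, changing the call sites to `torch.optim.SparseAdam(list(posetracker.parameters()), lr=args.pose_lr)` would be the minimal fix.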

NameError: name 'amp' is not defined

Hi Ellen, I am using cryodrgn version 0.3.3b0. It fails with a NameError on 'amp'. Here is the error:

Traceback (most recent call last):
  File "/sdf/home/d/donghuac/anaconda3/envs/cryodrgn/bin/cryodrgn", line 33, in <module>
    sys.exit(load_entry_point('cryodrgn==0.3.3b0', 'console_scripts', 'cryodrgn')())
  File "/sdf/home/d/donghuac/anaconda3/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.3.3b0-py3.7.egg/cryodrgn/__main__.py", line 52, in main
    args.func(args)
  File "/sdf/home/d/donghuac/anaconda3/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.3.3b0-py3.7.egg/cryodrgn/commands/train_vae.py", line 379, in main
    model, optim = amp.initialize(model, optim, opt_level='O1')
NameError: name 'amp' is not defined

eval_vol error

Hi Ellen,

I have been reluctant to report this, since the eval_vol error may have come from my own mistake (my cryodrgn installation, or my changes to the cryodrgn code...).

However, would you please quickly run the simplest possible eval_vol on your computer with the latest cryodrgn code (2/11/2021)? Then I can better investigate what needs to be fixed.

I used to run eval_vol fine, but nowadays I can't, even after erasing the whole cryodrgn installation and re-installing. I tried both my personal data and the
https://github.com/zhonge/cryodrgn_empiar#empiar-10076-assembling-bacterial-50s-ribosome
dataset that you used.

However, I keep seeing

 File "/people/kimd999/bin/Miniconda3-py37_4.8.3-Linux-x86_64/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.3.2b0-py3.7.egg/cryodrgn/models.py", line 305, in positional_encoding_geom
    assert x.shape[-1] == self.in_dim
AssertionError

(don't worry about the different line number; I just added print statements as below):

print ("self.in_dim:" + str(self.in_dim))    # 908
print ("self.zdim:" + str(self.zdim)) # 8
print ("*coords.shape[:-2]:" + str(*coords.shape[:-2])) # 1
print ("x.shape before view:" + str(x.shape)) # torch.Size([1, 3, 300])
x = x.view(*coords.shape[:-2], self.in_dim-self.zdim) # B x in_dim-zdim
print ("x.shape after view:" + str(x.shape)) # torch.Size([1, 900])
if self.zdim > 0:
     x = torch.cat([x,coords[...,3:,:].squeeze(-1)], -1)
     print ("x.shape after cat:" + str(x.shape))  # torch.Size([1, 901])
     assert x.shape[-1] == self.in_dim
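For reference, the shapes in the debug prints above make the mismatch explicit: the view yields 3 × 300 = 900 positional features, and with zdim = 8 the decoder expects in_dim = 908, but only one z column ended up concatenated (900 + 1 = 901), which trips the assert. A small check using only the numbers printed above:

```python
# Numbers taken from the debug prints above.
in_dim, zdim = 908, 8
pe_features = 3 * 300            # x.shape after view -> (1, 900)
assert pe_features == in_dim - zdim
after_cat = pe_features + 1      # only 1 z column concatenated -> (1, 901)
assert after_cat != in_dim       # hence the AssertionError
```

So whatever z tensor reaches positional_encoding_geom here has width 1 instead of 8; that is the discrepancy to chase down.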

parsing relion 3.1 star file doesn't work

Hi, I'm trying to run the first step to downsample particles using a relion 3.1 star file:

00020:   source ~/rc/conda.rc&& conda activate cryodrgn-0.2.1 && cryodrgn downsample Runs/000416_CryoDrgnProtPreprocess/extra/input_particles.star  -o Runs/000416_CryoDrgnProtPreprocess/extra/downsampled_parts.mrcs  --datadir Runs/000130_ProtRelionPreprocessParticles/extra -D 64 
00021:   Traceback (most recent call last):
00022:     File "/home/azazello/soft/scipion3/scipion_home/software/em/void/cryodrgn/cryodrgn/dataset.py", line 25, in load_particles
00023:       particles = starfile.Starfile.load(mrcs_txt_star, relion31=relion31).get_particles(datadir=datadir, lazy=lazy)
00024:     File "/home/azazello/soft/scipion3/scipion_home/software/em/void/cryodrgn/cryodrgn/starfile.py", line 60, in load
00025:       assert words.ndim == 2, f"Uneven # columns detected in parsing {set([len(x) for x in words])}. Is this a RELION 3.1 starfile?" 
00026:   AssertionError: Uneven # columns detected in parsing {0, 1, 3, 9, 11}. Is this a RELION 3.1 starfile?

Are 3.1 files not supported yet (I'm using latest master branch)?
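The assertion message itself hints at the cause: RELION 3.1 star files begin with a data_optics block whose column count differs from the data_particles block, so the uniform-column check fails. A hedged workaround sketch (block names assumed from the RELION 3.1 format) that keeps only the particles block before parsing:

```python
def strip_optics(lines):
    """Keep only the data_particles block of a RELION 3.1 star file.

    Sketch only: assumes the RELION 3.1 block names data_optics /
    data_particles; not cryodrgn's actual parser.
    """
    out, keep = [], False
    for line in lines:
        if line.strip().startswith('data_'):
            keep = line.strip() == 'data_particles'
        if keep:
            out.append(line)
    return out
```

Note the traceback's `starfile.Starfile.load(..., relion31=relion31)` call suggests the parser already has a RELION 3.1 code path that may just need to be switched on.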

analyze error message

Command was
cryodrgn analyze vae128_z10_e100_conv 75 --Apix 0.2966015625 --ksample 3

...
2020-09-19 22:52:56     Generating 3 volumes
2020-09-19 22:52:56     [ 8.76704752e-02 -3.42603713e-01  1.30035257e+00  1.18819630e+00
  1.42611980e-01 -1.22894645e-02 -1.77807882e-01  4.19831753e-01
  5.52097082e-01  7.15658069e-04]
2020-09-19 22:53:02     [-0.60905266  0.1405278   0.11575085 -1.17220247  0.0083428   0.01721188
  0.06637335  0.77177167  0.17333728 -0.13592422]
2020-09-19 22:53:08     [ 0.72590917  0.10197172 -0.96716791 -0.3785781  -0.0888432  -0.39900404
  0.07986095 -0.39363769  0.14182752 -0.13963668]
2020-09-19 22:53:12     Finsihed in 0:00:20.618744
2020-09-19 22:53:17     Running UMAP...
2020-09-19 22:58:51     Generating plots...
2020-09-19 22:59:11     Creating jupyter notebook...
/bin/sh: 1: cp: not found
Traceback (most recent call last):
  File "/people/kimd999/.conda/envs/cryodrgn/bin/cryodrgn", line 33, in <module>
    sys.exit(load_entry_point('cryodrgn==0.2.1b0', 'console_scripts', 'cryodrgn')())
  File "/people/kimd999/.conda/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/__main__.py", line 50, in main
    args.func(args)
  File "/people/kimd999/.conda/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/commands/analyze.py", line 168, in main
    subprocess.check_call(cmd, shell=True)
  File "/people/kimd999/.conda/envs/cryodrgn/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'cp /people/kimd999/.conda/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/templates/cryoDRGN_viz_template.ipynb /pic/projects/MARScryo/dn/cryodrgn/exp/PDX/coexp/latest/vae128_z10_e100_conv/analyze.75/cryoDRGN_viz.ipynb' returned non-zero exit status 127.

Since I use cp command all the time in my linux terminal tab (who wouldn't?), I don't understand why it says
/bin/sh: 1: cp: not found

This cp command appears to do nothing more than copy the template notebook file, as in
cp /people/kimd999/.conda/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/templates/cryoDRGN_viz_template.ipynb /pic/projects/MARScryo/dn/cryodrgn/exp/PDX/coexp/latest/vae128_z10_e100_conv/analyze.75/cryoDRGN_viz.ipynb

Therefore, for now, I can copy the template notebook file manually, but it would be nice if this were done automatically.
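Since the `cp` is invoked through `subprocess` with `shell=True`, the failure looks like a PATH problem in that shell rather than a missing `cp` binary. Copying with Python's own `shutil` avoids the shell entirely; a minimal sketch with placeholder paths standing in for the template and destination:

```python
import os
import shutil
import tempfile

# Placeholder paths standing in for the notebook template and destination.
src = os.path.join(tempfile.mkdtemp(), 'cryoDRGN_viz_template.ipynb')
open(src, 'w').close()
dst = src.replace('_template', '')

shutil.copyfile(src, dst)  # no shell, no PATH lookup involved
assert os.path.exists(dst)
```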

which pixel value for analyze?

According to https://github.com/zhonge/cryodrgn

I need to specify the original dimension when I generate pose.pkl.

This pose.pkl is used for train_vae, which uses my downsampled .mrcs.

Then, when I run
cryodrgn analyze
should I enter the original pixel size (1.33 A), since pose.pkl used the original dimension?
Or should I enter the adjusted pixel size (5.32 A = 1.33 x 256/64), since my train_vae used the downsampled .mrcs?

Thanks.
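For reference, the adjusted value quoted in the question follows from the pixel size scaling inversely with the box size:

```python
# Numbers from the question above: 256-pixel box downsampled to 64.
apix_orig, D_orig, D_new = 1.33, 256, 64
apix_new = apix_orig * D_orig / D_new  # A/pix grows as the box shrinks
assert abs(apix_new - 5.32) < 1e-9
```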

Running UMAP hangs

Hi Ellen,

I have had this issue from the start but have just been going along with PCA. Unless I skip running umap, all my cryodrgn analyze runs appear to hang at this stage (i.e. till the job exits on hitting the cluster time limit). Our system admin tells me the job isn't doing anything, and no errors pop up in the log either:

(cryodrgn) -bash-4.2$ tail -f CryoDRGN-01_vae128_big_z8-analyze_umap.out
2021-04-21 18:36:24 Saving results to /home/ts387/CryoDRGN/01_vae128_big_z8/analyze.24
2021-04-21 18:36:24 Perfoming principal component analysis...
2021-04-21 18:36:24 Explained variance ratio:
2021-04-21 18:36:24 [0.14331131 0.13593155 0.13252919 0.12811412 0.12108949 0.11471545
0.11343906 0.11086983]
2021-04-21 18:36:24 Generating volumes...
2021-04-21 18:36:24 K-means clustering...
2021-04-21 18:36:32 Generating volumes...
2021-04-21 18:36:32 Running UMAP...

I am looking at ~150,000 particles. Can you tell what might be going on, and if there's a solution or workaround? Just in case, I am attaching my slurm script as a text file. I apologise if the question is too tangential to core cryodrgn functionality.

Thanks a million!

Taha

cryodrgn analyze (--skip-vol).txt
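A hedged workaround while a hang like this is being diagnosed: fit UMAP on a random subsample of the latent coordinates, which is often dramatically faster at ~150k points. The commented umap-learn call is an assumption based on its public API, not something from the analyze log above:

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.standard_normal((150_000, 8))        # stand-in for the z.pkl contents
idx = rng.choice(len(z), 10_000, replace=False)
z_sub = z[idx]                               # embed only the subsample
# import umap
# emb = umap.UMAP().fit_transform(z_sub)
```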

Clean up testing scripts

Since you renamed the "lib-python" folder to "cryodrgn" (a local library), you forgot to update the code under the testing folder accordingly.
For example, in /testing/test_entropy.py:

sys.path.insert(0,'../lib-python')
import lie_tools

Though it is a small problem, it can be a little annoying...

cryodrgn analyze fail

Hi Ellen,

My cryodrgn analyze run upon the first training (downsampled, small architecture) is failing with following error:

size mismatch for encoder.main.0.weight: copying a param with shape torch.Size([256, 16641]) from checkpoint, the shape in current model is torch.Size([256, 4])

I believe I took care to ensure -D and -Apix parameters corresponded to the original images when parsing poses and ctf.

I have made it work in the past and can't seem to figure out what I might be missing this time.

Please let me know if there's an obvious explanation.

Many thanks!

Taha

UnboundLocalError: local variable 'umap_emb' referenced before assignment

Hi, I am getting an Local error after running analyze function after the initial training. Did anyone get this error?

File "C:\Users\User\anaconda3\envs\cryodrgn\Scripts\cryodrgn-script.py", line 33, in <module>
  sys.exit(load_entry_point('cryodrgn==0.3.3b0', 'console_scripts', 'cryodrgn')())
File "C:\Users\User\anaconda3\envs\cryodrgn\lib\site-packages\cryodrgn-0.3.3b0-py3.8.egg\cryodrgn\__main__.py", line 52, in main
  args.func(args)
File "C:\Users\User\anaconda3\envs\cryodrgn\lib\site-packages\cryodrgn-0.3.3b0-py3.8.egg\cryodrgn\commands\analyze.py", line 190, in main
  analyze_zN(z, outdir, vg, skip_umap=args.skip_umap, num_pcs=args.pc, num_ksamples=args.ksample)
File "C:\Users\User\anaconda3\envs\cryodrgn\lib\site-packages\cryodrgn-0.3.3b0-py3.8.egg\cryodrgn\commands\analyze.py", line 140, in analyze_zN
  analysis.scatter_color(umap_emb[:,0], umap_emb[:,1], pc[:,i], label=f'PC{i+1}')
UnboundLocalError: local variable 'umap_emb' referenced before assignment

Symmetry

Hi @zhonge -

How do you recommend we prepare data from symmetric particles? Should we run symmetric refinement (e.g. D2) or an asymmetric refinement (C1 symmetry) as the consensus refinement? And, if we are doing a C1 refinement, should we 'symmetry expand' the data so that all asymmetric units are oriented in the same manner?

Thanks!
Mike

seaborn plotting error: ValueError: `kind` must be one of ['scatter', 'hist', 'hex', 'kde', 'reg', 'resid'], but hexbin was passed.`

Hi Ellen - glad to see the new features in 0.3.0 ! I don't know if this is a change that happened in this release or if it's something else, but I got this error when I was running the updated code.

I don't know if this is a version problem, but I'm running seaborn version 0.11.0.

For my installation, after changing lines 99 & 112 from 'hexbin' to 'hex' in file cryodrgn/lib/python3.7/site-packages/cryodrgn-0.3.0-py3.7.egg/cryodrgn/commands/analyze.py that fixed the error.

Mike

  File "/lsi/groups/mcianfroccolab/mcianfro/conda-pkgs/cryodrgn/bin/cryodrgn", line 33, in <module>
    sys.exit(load_entry_point('cryodrgn==0.3.0', 'console_scripts', 'cryodrgn')())
  File "/lsi/groups/mcianfroccolab/mcianfro/conda-pkgs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.3.0-py3.7.egg/cryodrgn/__main__.py", line 50, in main
    args.func(args)
  File "/lsi/groups/mcianfroccolab/mcianfro/conda-pkgs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.3.0-py3.7.egg/cryodrgn/commands/analyze.py", line 172, in main
    analyze_zN(z, outdir, vg, skip_umap=args.skip_umap, num_pcs=args.pc, num_ksamples=args.ksample)
  File "/lsi/groups/mcianfroccolab/mcianfro/conda-pkgs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.3.0-py3.7.egg/cryodrgn/commands/analyze.py", line 99, in analyze_zN
    g = sns.jointplot(pc[:,0], pc[:,1], kind='hexbin')
  File "/lsi/groups/mcianfroccolab/mcianfro/conda-pkgs/cryodrgn/lib/python3.7/site-packages/seaborn/_decorators.py", line 46, in inner_f
    return f(**kwargs)
  File "/lsi/groups/mcianfroccolab/mcianfro/conda-pkgs/cryodrgn/lib/python3.7/site-packages/seaborn/axisgrid.py", line 2040, in jointplot
    _check_argument("kind", plot_kinds, kind)
  File "/lsi/groups/mcianfroccolab/mcianfro/conda-pkgs/cryodrgn/lib/python3.7/site-packages/seaborn/utils.py", line 675, in _check_argument
    f"`{param}` must be one of {options}, but {value} was passed.`"
ValueError: `kind` must be one of ['scatter', 'hist', 'hex', 'kde', 'reg', 'resid'], but hexbin was passed.`
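The fix the reporter describes is simply passing 'hex' (the name newer seaborn validates against) rather than 'hexbin'. A sketch of the validation seen in the traceback, with the kinds list taken verbatim from the error message:

```python
# Valid kinds copied from the seaborn 0.11 error message above.
VALID_KINDS = ['scatter', 'hist', 'hex', 'kde', 'reg', 'resid']

def check_kind(kind):
    """Mimic seaborn's _check_argument for jointplot's `kind` parameter."""
    if kind not in VALID_KINDS:
        raise ValueError(f"`kind` must be one of {VALID_KINDS}, but {kind} was passed.")
    return kind
```

So `sns.jointplot(..., kind='hex')` passes the check, while the old 'hexbin' string raises.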

Only a single particle stack supported for write_starfile.py

After selecting particles using the Jupyter notebook, I am unable to convert them to a .star file. My input:

python $SRC/utils/write_starfile.py particles.256.txt ctf.pkl --ind ind_keep.6795_particles.pkl -o output.star

Output:

Traceback (most recent call last):
  File "/home/darst/cryodrgn/utils/write_starfile.py", line 91, in <module>
    main(parse_args().parse_args())
  File "/home/darst/cryodrgn/utils/write_starfile.py", line 44, in main
    assert args.particles.endswith('.mrcs'), "Only a single particle stack as an .mrcs is currently supported"
AssertionError: Only a single particle stack as an .mrcs is currently supported

Thank you!

Tensorflow not installed

Hi, I ran cryodrgn analyze but the log says "Tensorflow not installed" (see below). What version of tensorflow should I install? I remember the install instructions don't require it.

2021-02-16 11:34:15 [-0.56919843 -0.28779495 -2.99930263 0.91964149 -1.15936553 1.9017576
-0.19979304 -1.33526468]
2021-02-16 11:34:33 Finsihed in 0:06:16.000652
2021-02-16 11:34:34 Running UMAP...
2021-02-16 11:35:51 Generating plots...
2021-02-16 11:35:59 Creating jupyter notebook...
2021-02-16 11:35:59 /data/N/cryoDGRN/P217_J1204_particles/02_vae256_big_z8/analyze.24/cryoDRGN_viz.ipynb
2021-02-16 11:35:59 Finished in 0:14:13.429942
/opt/conda3/envs/cryodrgn/lib/python3.7/site-packages/umap/__init__.py:9: UserWarning: Tensorflow not installed; ParametricUMAP will be unavailable
warn("Tensorflow not installed; ParametricUMAP will be unavailable")

Backproject_voxel error

With the published 50S ribosome set, and some of my own sets, I have no problem.

However, for two of my own sets, I have a problem with backproject_voxel (all other jobs, e.g. downsampling, generation of pkl files, train_vae, analyze, eval_vol, worked well and produced expected results).

Running with --invert-data resulted in the same error message.

cryodrgn backproject_voxel cryosparc_P4_J251_009_particles_cs_abs_w_mrcs_star.mrcs --poses pose_300.pkl --ctf ctf.pkl -o backproject_w_pose_300_not_inverted_data.cryosparc_P4_J251_009_particles_cs_abs_w_mrcs_star.mrc --first 284133
-->

2020-06-18 01:14:58 Namespace(ctf='/pic/projects/MARScryo/cryodrgn/real_data/PDX/coexp/w_new_mrcs/ctf.pkl', datadir=None, first=284133, func=<function main at 0x7f335b0139e0>, ind=None, invert_data=False, o='/pic/projects/MARScryo/cryodrgn/real_data/PDX/coexp/w_new_mrcs/backproject_w_pose_300_not_inverted_data.cryosparc_P4_J251_009_particles_cs_abs_w_mrcs_star.mrc', particles='/pic/projects/MARScryo/cryodrgn/real_data/PDX/coexp/w_new_mrcs/cryosparc_P4_J251_009_particles_cs_abs_w_mrcs_star.mrcs', poses='/pic/projects/MARScryo/cryodrgn/real_data/PDX/coexp/w_new_mrcs/pose_300.pkl', tilt=None, tilt_deg=45)
2020-06-18 01:14:59 Use cuda True
2020-06-18 01:15:00 Loaded 284133 300x300 images
2020-06-18 01:15:04 Loading ctf params from /pic/projects/MARScryo/cryodrgn/real_data/PDX/coexp/w_new_mrcs/ctf.pkl
2020-06-18 01:15:04 Image size (pix) : 300
2020-06-18 01:15:04 A/pix : 1.0124000310897827
2020-06-18 01:15:04 DefocusU (A) : 21748.703125
2020-06-18 01:15:04 DefocusV (A) : 21489.791015625
2020-06-18 01:15:04 Dfang (deg) : -30.64923858642578
2020-06-18 01:15:04 voltage (kV) : 300.0
2020-06-18 01:15:04 cs (mm) : 2.700000047683716
2020-06-18 01:15:04 w : 0.07000000029802322
2020-06-18 01:15:04 Phase shift (deg) : 0.0
2020-06-18 01:15:04 Using circular lattice with radius 150
2020-06-18 01:15:04 image 0
...
2020-06-18 01:20:24 image 19300

/tmp/pip-req-build-v3zvx_ui/aten/src/ATen/native/cuda/IndexKernel.cu:60: lambda ->auto::operator()(int)->auto: block: [44,0,0], thread: [112,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Traceback (most recent call last):
File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/bin/cryodrgn", line 11, in
load_entry_point('cryodrgn==0.2.1b0', 'console_scripts', 'cryodrgn')()
File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/__main__.py", line 50, in main
args.func(args)
File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/commands/backproject_voxel.py", line 126, in main
add_slice(V, counts, ff_coord, ff, D)
File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/commands/backproject_voxel.py", line 54, in add_slice
add_for_corner(xf,yc,zf)
File "/people/kimd999/bin/Miniconda3-latest-Linux-x86_64/envs/cryodrgn/lib/python3.7/site-packages/cryodrgn-0.2.1b0-py3.7.egg/cryodrgn/commands/backproject_voxel.py", line 49, in add_for_corner
w[w<0]=0
RuntimeError: copy_if failed to synchronize: device-side assert triggered
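The device-side assert means some scatter indices fall outside the volume, typically because a coordinate (e.g. from a pose translation) lands outside [0, D). A hedged illustration of masking out-of-range coordinates before a scatter-add, using numpy; this is illustrative only, not cryodrgn's actual fix:

```python
import numpy as np

# Toy Fourier-slice coordinates for a D=300 box; two rows are out of range.
D = 300
coords = np.array([[-1, 3, 299], [150, 150, 150], [2, 2, 301]])

# Keep only coordinates that index safely into the D^3 volume.
in_bounds = ((coords >= 0) & (coords < D)).all(axis=1)
coords = coords[in_bounds]
```

If a mask like this discards a substantial fraction of coordinates, the underlying poses/ctf pairing probably deserves a closer look rather than silent clipping.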

Get "ImportError: attempted relative import with no known parent package" when run test_mrc.py on Windows

When I run test_mrc.py, I get the error below:

Traceback (most recent call last):
  File "D:/Study/The Road to AI/AIRoad/cryodrgn/testing/test_mrc.py", line 13, in <module>
    import dataset
  File "D:\Study\The Road to AI\AIRoad\cryodrgn\testing/../cryodrgn\dataset.py", line 7, in <module>
    from . import fft
ImportError: attempted relative import with no known parent package

It's caused by this code in cryodrgn/dataset.py:

from . import fft
from . import mrc
from . import utils
from . import starfile

I want to know how to fix it. Does it only appear on Windows?
My system is Windows 10, and I run the code in PyCharm.
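This isn't Windows-specific: running a file that lives inside a package directly (instead of via `python -m` or the installed `cryodrgn` entry point) breaks relative imports on any OS. One hedged workaround is a package-relative import with a top-level fallback; the sketch below uses stdlib `os` as a stand-in so it is self-contained, with the cryodrgn module names being those quoted above:

```python
import importlib

def flexible_import(package, name):
    """Import package.name if possible, else plain name (for standalone runs)."""
    try:
        return importlib.import_module(f"{package}.{name}")
    except ImportError:
        return importlib.import_module(name)

# e.g. fft = flexible_import("cryodrgn", "fft") inside dataset.py;
# here a nonexistent package demonstrates the fallback path.
mod = flexible_import("no_such_package", "os")  # falls back to stdlib os
```

The simpler fix is usually to run the tests from the repo root with `python -m`, or against an installed cryodrgn package.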

RuntimeError: Expected object of scalar type double but got scalar type float for sequence element 1.

Hi Ellen,

We see this error with the command below.
It appears to be reproducible; we have seen it on a couple of runs now.
Looks like a possible bug to me, let me know if we can provide more info.

$ cryodrgn analyze 00_vae128_z1 49 --Apix 0.85

2020-05-15 12:20:12     Saving results to /nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/analyze.49

2020-05-15 12:20:16     Running command:

cryodrgn eval_vol /nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/weights.49.pkl --config /nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/config.pkl --zfile /nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/analyze.49/z_values.txt -o /nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/analyze.49 --Apix 0.85

2020-05-15 12:20:21     Use cuda True

2020-05-15 12:20:21     Namespace(Apix=0.85, D=128, config='/nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/config.pkl', domain='fourier', downsample=None, enc_mask=64, encode_mode='resid', flip=False, func=<function main at 0x7f38212e13b0>, l_extent=0.5, n=10, norm=[0, 480.96075], o='/nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/analyze.49', pdim=256, pe_type='geom_lowf', players=3, prefix='vol_', qdim=256, qlayers=3, verbose=False, weights='/nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/weights.49.pkl', z=None, z_end=None, z_start=None, zdim=1, zfile='/nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/analyze.49/z_values.txt')

2020-05-15 12:20:26     Using circular lattice with radius 64

2020-05-15 12:20:26     Loading weights from /nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/weights.49.pkl

2020-05-15 12:20:27     Generating 10 volumes

2020-05-15 12:20:27     [-1.62018406]

Traceback (most recent call last):

  File "/programs/x86_64-linux//cryodrgn/0.2.0/cryodrgn/bin/cryodrgn", line 11, in <module>

    load_entry_point('cryodrgn==0.2.0', 'console_scripts', 'cryodrgn')()

  File "/programs/x86_64-linux/cryodrgn/0.2.0/cryodrgn_extlib/miniconda3-4.7.12.1-2tlz/lib/python3.7/site-packages/cryodrgn-0.2.0-py3.7.egg/cryodrgn/__main__.py", line 50, in main

    args.func(args)

  File "/programs/x86_64-linux/cryodrgn/0.2.0/cryodrgn_extlib/miniconda3-4.7.12.1-2tlz/lib/python3.7/site-packages/cryodrgn-0.2.0-py3.7.egg/cryodrgn/commands/eval_vol.py", line 120, in main

    vol = model.decoder.eval_volume(lattice.coords, lattice.D, lattice.extent, args.norm, zz)

  File "/programs/x86_64-linux/cryodrgn/0.2.0/cryodrgn_extlib/miniconda3-4.7.12.1-2tlz/lib/python3.7/site-packages/cryodrgn-0.2.0-py3.7.egg/cryodrgn/models.py", line 299, in eval_volume

    x = torch.cat((x,z), dim=-1)

RuntimeError: Expected object of scalar type double but got scalar type float for sequence element 1.

Traceback (most recent call last):

  File "/programs/x86_64-linux//cryodrgn/0.2.0/cryodrgn/bin/cryodrgn", line 11, in <module>

    load_entry_point('cryodrgn==0.2.0', 'console_scripts', 'cryodrgn')()

  File "/programs/x86_64-linux/cryodrgn/0.2.0/cryodrgn_extlib/miniconda3-4.7.12.1-2tlz/lib/python3.7/site-packages/cryodrgn-0.2.0-py3.7.egg/cryodrgn/__main__.py", line 50, in main

    args.func(args)

  File "/programs/x86_64-linux/cryodrgn/0.2.0/cryodrgn_extlib/miniconda3-4.7.12.1-2tlz/lib/python3.7/site-packages/cryodrgn-0.2.0-py3.7.egg/cryodrgn/commands/analyze.py", line 159, in main

    analyze_z1(z, outdir, vg)

  File "/programs/x86_64-linux/cryodrgn/0.2.0/cryodrgn_extlib/miniconda3-4.7.12.1-2tlz/lib/python3.7/site-packages/cryodrgn-0.2.0-py3.7.egg/cryodrgn/commands/analyze.py", line 55, in analyze_z1

    vg.gen_volumes(outdir, ztraj)

  File "/programs/x86_64-linux/cryodrgn/0.2.0/cryodrgn_extlib/miniconda3-4.7.12.1-2tlz/lib/python3.7/site-packages/cryodrgn-0.2.0-py3.7.egg/cryodrgn/commands/analyze.py", line 131, in gen_volumes

    analysis.gen_volumes(self.weights, self.config, zfile, outdir, **self.vol_args)

  File "/programs/x86_64-linux/cryodrgn/0.2.0/cryodrgn_extlib/miniconda3-4.7.12.1-2tlz/lib/python3.7/site-packages/cryodrgn-0.2.0-py3.7.egg/cryodrgn/analysis.py", line 292, in gen_volumes

    return subprocess.check_call(cmd, shell=True)

  File "/programs/x86_64-linux/cryodrgn/0.2.0/cryodrgn_extlib/miniconda3-4.7.12.1-2tlz/lib/python3.7/subprocess.py", line 347, in check_call

    raise CalledProcessError(retcode, cmd)

subprocess.CalledProcessError: Command 'cryodrgn eval_vol /nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/weights.49.pkl --config /nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/config.pkl --zfile /nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/analyze.49/z_values.txt -o /nfs/lcemdata/fischer/mhunkeler/20190204_krios_hms/DRGN/00_vae128_z1/analyze.49 --Apix 0.85' returned non-zero exit status 1.
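The failing `torch.cat` mixes a float64 coordinate tensor with a float32 z; casting one to the other's dtype before concatenating resolves this class of error. A numpy sketch of the same dtype-matching idea (illustrative; an actual fix would cast the torch tensors inside eval_volume):

```python
import numpy as np

coords = np.zeros((4, 3), dtype=np.float64)  # "double", like the lattice coords
z = np.zeros((4, 1), dtype=np.float32)       # "float", like the z values
# Match dtypes before concatenation, mirroring z = z.to(coords.dtype) in torch.
x = np.concatenate([coords, z.astype(coords.dtype)], axis=-1)
assert x.dtype == np.float64
```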


CryoDRGN FILTERING & viz.ipynb

When I set dim=2, it allows me to skip UMAP, but the template code for filtering and debugging the analysis doesn't show up in analyze.49. Is there another way of getting this template code for the jupyter notebook?

Datasets with different pixel sizes

Is it possible to use a consensus refinement from particles with different pixel sizes? Specifically, I have a star file with different optics groups each with their own pixel size. I suspect this won't work with downsampling and training, but I might be wrong. Is there a workaround?

graph_traversal documentation

I ran as
cryodrgn graph_traversal data vae128_z8_e400_beta_4_amp
but it says

usage: cryodrgn graph_traversal [-h] --anchors ANCHORS [ANCHORS ...]
                                [--max-neighbors MAX_NEIGHBORS]
                                [--avg-neighbors AVG_NEIGHBORS]
                                [--batch-size BATCH_SIZE]
                                [--max-images MAX_IMAGES] -o PATH.TXT --out-z
                                Z.PATH.TXT
                                data
cryodrgn graph_traversal: error: the following arguments are required: --anchors, -o, --out-z

Could you explain what --anchors is?
Thank you.

Details of the EMPIAR-10076 homogeneous reconstruction

Hi,

Thanks for sharing your nice work 👍 I am trying to reproduce the experiments of your ICLR paper. In appendix C, it says that poses were determined by aligning the images to a mature LSU structure obtained from a homogeneous reconstruction of the full resolution dataset in cryoSPARC, i.e. "a consensus reconstruction".

Could you please provide more details on how to get the reconstruction from the particle images of EMPIAR-10076? Thanks!

Mis-scaled volumes using `cryodrgn eval_vol` with `--downsample` and `--zfile`

Documenting a bug in cryodrgn eval_vol when using --downsample AND --zfile to generate volumes at a downsampled box size. Fixed in commit 7a74656.

When downsampling a volume from D->d, a subvolume is extracted in Fourier space with an extent of d/D from the original box. Found an off-by-1 error where the extent was mis-scaled by 1/(D+1), e.g. for downsampling a volume from 256 to 128, the extent was 0.498 instead of 0.5 of the original. The effect should be extremely minor.

Note this does NOT affect cryodrgn downsample
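In numbers, for the 256 → 128 case described above:

```python
# Downsampling D=256 to d=128 in Fourier space.
D, d = 256, 128
buggy_extent = d / (D + 1)   # ~0.498 of the original box (the off-by-1)
fixed_extent = d / D         # 0.5, as intended
assert abs(buggy_extent - 0.498) < 1e-3
assert fixed_extent == 0.5
```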

How can I reproduce the experiments in the paper? (5.2 and 5.3)

Hello. Nice work!
I was impressed by your work and wanted to understand the model by running the code and reproducing the experiments in the paper. However, it is not obvious how to reproduce the results.
For example, I am not sure which main file corresponds to which experiment. I guess 5.2 HETEROGENEOUS RECONSTRUCTION WITH POSE SUPERVISION can be done using train_vae.py, but it is not clear to me where to start.
It seems (as far as I understood) that the code for 5.3 unsupervised heterogeneous reconstruction is not uploaded yet.
Would it be possible to have some kind of step-by-step instructions to download the data, train the model, and visualize the results?

Install shared cryodrgn environment for multi-user facility

Hi,

I have a question about installing a shared cryodrgn conda environment for a multi-user facility. I used the --prefix option to install the .conda environment in a different location that is accessible to all users.

After running the setup.py the cryodrgn software is installed in this shared .conda folder. However, somehow the path to the conda environment in my home directory is used during installation such that cryodrgn and other python scripts contain this at the start of the file: #!/home/nansa/.conda/envs/cryodrgn-0.3.1/bin/python instead of /shared/.conda/envs/cryodrgn-0.3.1/bin/python

This is causing a bunch of permissions issues since no one can read my home directory. Is there a way to trick the setup.py script to use this alternative conda path during installation?

EDIT: Adding --home to the command fixed the permissions issues
python setup.py install --home=/shared/.conda/envs/cryodrgn-0.3.1

Then I ran into this error: importlib_metadata.PackageNotFoundError: No package metadata was found for cryodrgn
I fixed it by running the normal install command:
python setup.py install

Bug in cryoDRGN_filtering.ipynb RR.from_dcm depreciated in Scipy 1.4 and above

Hi,
Minor bug in cryoDRGN_filtering_template.ipynb when using latest scipy (v1.6.1), I get the error:
AttributeError                            Traceback (most recent call last)
<ipython-input> in <module>
      1 # Convert rotation matrices to euler angles
----> 2 euler = RR.from_dcm(rot).as_euler('zyz', degrees=True)

AttributeError: type object 'scipy.spatial.transform.rotation.Rotation' has no attribute 'from_dcm'

From the release notes for SciPy v1.4.0 (https://docs.scipy.org/doc/scipy/reference/release.1.4.0.html):
"In scipy.spatial.Rotation methods from_dcm, as_dcm were renamed to from_matrix, as_matrix respectively. The old names will be removed in SciPy 1.6.0."

Replacing 'from_dcm' with 'from_matrix' makes the error go away :)
Many Thanks, Cheers, Dave H.
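A version-tolerant sketch of the same call, using from_matrix when available and falling back to the old name on pre-1.4 SciPy (the identity matrix here is just a sanity-check stand-in for the notebook's rotation array):

```python
import numpy as np
from scipy.spatial.transform import Rotation as RR

rot = np.eye(3)  # identity rotation as a sanity check
# from_matrix replaces the deprecated from_dcm (renamed in SciPy 1.4).
from_mat = getattr(RR, "from_matrix", None) or RR.from_dcm
euler = from_mat(rot).as_euler('zyz', degrees=True)
```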

Parallelization of downsampling

Hello,

This is low priority, especially if it's complicated to implement. But it would be nice if cryodrgn downsample was optionally parallelized, so it could run faster on multi-core computers (the one I use has 80 cores and downsampling was only using one core; not a big deal for small datasets, but it took a very long time for ~0.4 M particles).

A related question: so far I have extracted particles at their original (finest) pixel size in RELION, put them all in one file with relion_stack_create, then downsampled that stack with cryodrgn downsample to whatever box size I need for a given run. Does it make a difference if I downsample during the RELION Extract job instead? (It could be faster in some cases, since it is MPI-parallelized, although the extraction step adds to the run time, so I'm not sure it would be faster overall.)
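Until built-in parallelism exists, chunked parallelism over the stack is easy to sketch in user code. The binning below is a crude real-space stand-in, not cryodrgn's Fourier-space downsampling, and the chunk/worker counts are arbitrary:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def downsample_chunk(imgs):
    # Crude 2x real-space binning as a placeholder for the real operation.
    n, h, w = imgs.shape
    return imgs.reshape(n, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

stack = np.random.rand(8, 16, 16)           # toy particle stack
chunks = np.array_split(stack, 4)           # split across workers
with ThreadPoolExecutor(max_workers=4) as pool:
    out = np.concatenate(list(pool.map(downsample_chunk, chunks)))
```

Since the real downsampling is numpy-heavy, threads already buy some parallelism; a process pool (or per-chunk jobs on the cluster) is the heavier-duty variant of the same idea.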

extracting particles from latent space?

Hi,

I think it's a pretty naive question, but it would be fantastic if it's possible. (I hope I didn't miss this info in the README.)

From PCA or UMAP, would it be possible to extract particle information for specific classification(s)?
If possible, I'd like to get a subset of specific "classes" and go back to other software to refine or classify them further.

Thanks for the fantastic software again.

best,
heejong

I can't use result from write_starfile.py

Hi Ellen,

I can't generate an .mrcs from the star file produced by cryodrgn.
I ran:
(cryodrgn) [kimd999@marianas to_extract_particles]$ python ~/script/python/cryoEM/cryodrgn/utils/write_starfile.py cryosparc_P122_J44_003_particles_cs_abs_w_mrcs_star.64.mrcs --ind ind_selected.pkl -o class1_from_cryodrgn.star ctf.pkl

Then I did
(cryodrgn) [kimd999@marianas to_extract_particles]$ ~/bin/ccpem-20200722/bin/relion_stack_create --i class1_from_cryodrgn.star --o class1_from_cryodrgn_star.mrcs

However, it resulted in

in: /home/jenkins/workspace/CCP-EM/sl6_devtoolset/devtools/checkout/relion-ver3.1/src/jaz/obs_model.cpp, line 200
ERROR:
ERROR: not all necessary variables defined in _optics.star file: rlnPixelSize, rlnVoltage and rlnSphericalAberration. Make sure to convert older STAR files anew in version-3.1, with relion_convert_star.
=== Backtrace  ===
/qfs/people/kimd999/bin/ccpem-20200722/bin/../lib/librelion_lib.so(_ZN11RelionErrorC1ERKSsS1_l+0x49) [0x7f4c6092ef99]
/qfs/people/kimd999/bin/ccpem-20200722/bin/../lib/librelion_lib.so(_ZN16ObservationModelC2ERK13MetaDataTableb+0x476) [0x7f4c60cee9a6]
/qfs/people/kimd999/bin/ccpem-20200722/bin/../lib/librelion_lib.so(_ZN16ObservationModel10loadSafelyESsRS_R13MetaDataTableSsib+0x619) [0x7f4c60cf4e09]
/people/kimd999/bin/ccpem-20200722/bin/relion_stack_create(_ZN23stack_create_parameters3runEv+0x2d7c) [0x41b9cc]
/people/kimd999/bin/ccpem-20200722/bin/relion_stack_create(main+0x31) [0x4058f1]
/usr/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f4c51055c05]
/people/kimd999/bin/ccpem-20200722/bin/relion_stack_create() [0x405a41]
==================
ERROR:
ERROR: not all necessary variables defined in _optics.star file: rlnPixelSize, rlnVoltage and rlnSphericalAberration. Make sure to convert older STAR files anew in version-3.1, with relion_convert_star.

Indeed I can't find rlnPixelSize in class1_from_cryodrgn.star.

I'm using relion 3.1

(cryodrgn) [kimd999@marianas to_extract_particles]$ ~/bin/ccpem-20200722/bin/relion_stack_create --version
RELION version: 3.1.0-commit-842729
Precision: BASE=double, CUDA-ACC=single

I'm fairly sure that a RELION 3.1 star file needs to have two data blocks anyway.

For example, a RELION 3.1 star file should have something like:

# version 30001

data_optics

loop_
_rlnOpticsGroup #1
_rlnOpticsGroupName #2
_rlnAmplitudeContrast #3
_rlnSphericalAberration #4
_rlnVoltage #5
_rlnImageSize #6
_rlnImageDimensionality #7
           1 opticsGroup1     0.070000     2.700000   300.000000           64            2


# version 30001

data_particles

loop_
_rlnImageName #1
_rlnDefocusU #2
_rlnDefocusV #3
_rlnDefocusAngle #4
_rlnPhaseShift #5
_rlnOpticsGroup #6
19@cryosparc_P122_J44_003_particles_cs_abs_w_mrcs_star.64.mrcs 15314.800000 14930.100000     5.280000     0.000000            1
23@cryosparc_P122_J44_003_particles_cs_abs_w_mrcs_star.64.mrcs 15159.800000 14775.200000     5.280000     0.000000            1

Whereas I see only one section in class1_from_cryodrgn.star:

# Created 2020-10-14 07:10:00.521278

data_

loop_
_rlnImageName
_rlnDefocusU
_rlnDefocusV
_rlnDefocusAngle
_rlnVoltage
_rlnSphericalAberration
_rlnAmplitudeContrast
_rlnPhaseShift
19@cryosparc_P122_J44_003_particles_cs_abs_w_mrcs_star.64.mrcs 15314.8 14930.1 5.28 300.0 2.7 0.07 0.0
23@cryosparc_P122_J44_003_particles_cs_abs_w_mrcs_star.64.mrcs 15159.8 14775.2 5.28 300.0 2.7 0.07 0.0
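A hedged sketch of a workaround: prepend a minimal data_optics block to the legacy star file. The group values are copied from the RELION 3.1 example above; the pixel-size value is a placeholder, and my assumption is that the field RELION's error calls rlnPixelSize is written as _rlnImagePixelSize in 3.1 star files:

```python
# Minimal data_optics block to prepend to the single-block star file.
optics_block = "\n".join([
    "# version 30001",
    "",
    "data_optics",
    "",
    "loop_",
    "_rlnOpticsGroup #1",
    "_rlnOpticsGroupName #2",
    "_rlnAmplitudeContrast #3",
    "_rlnSphericalAberration #4",
    "_rlnVoltage #5",
    "_rlnImagePixelSize #6",      # pixel size value below is a placeholder
    "_rlnImageSize #7",
    "_rlnImageDimensionality #8",
    "1 opticsGroup1 0.070000 2.700000 300.000000 5.320000 64 2",
    "",
])
# Then write optics_block + the existing table renamed to data_particles,
# with an _rlnOpticsGroup column appended to every particle row.
```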
