lucy-forrest-lab / hdxer

HDXer is a package to compute Hydrogen-Deuterium exchange data from biomolecular simulations, compare to experiment, and perform ensemble refinement to fit a structural ensemble to the experimental data.

License: BSD 3-Clause "New" or "Revised" License

Python 80.38% Shell 0.45% Jupyter Notebook 19.06% Promela 0.11%

hdxer's People

Contributors

fabsugar, leesup, lucyforrest, rtb1c13

hdxer's Issues

Line 176 throws ValueError when the seg file has two columns (Python 3.8)

Hi,
Line 176 in analysis.py throws a ValueError when reading a seg file with two columns.
The code that should then read the two-column format is skipped, because the ValueError goes straight to the last exception handler:

Traceback (most recent call last):
  File "/usr/envs/HDXER_ENV/lib/python3.8/site-packages/HDXer/analysis.py", line 176, in read_segfile
    self.segres = np.loadtxt(self.params['segfile'],
  File "/usr/envs/HDXER_ENV/lib/python3.8/site-packages/numpy/lib/npyio.py", line 1356, in loadtxt
    arr = _read(fname, dtype=dtype, comment=comment, delimiter=delimiter,
  File "/usr/envs/HDXER_ENV/lib/python3.8/site-packages/numpy/lib/npyio.py", line 999, in _read
    arr = _load_from_filelike(
ValueError: the number of columns changed from 3 to 2 at row 1; use `usecols` to select a subset and avoid this error

self.segres = np.loadtxt(self.params['segfile'],
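As a possible workaround, here is a minimal sketch of a reader that accepts both two-column (start, end) and three-column (start, end, chain) segment files without relying on which exception NumPy raises. It is not HDXer's own code: the function name `read_segments` and the default chain index of 0 are assumptions made for illustration.

```python
import io
import numpy as np

def read_segments(source):
    """Read segments as rows of (start, end, chain).

    Hypothetical helper, not part of HDXer. Two-column input is
    padded with an assumed chain index of 0.
    """
    # ndmin=2 keeps a single-row file two-dimensional
    data = np.loadtxt(source, dtype=np.int32, ndmin=2)
    if data.shape[1] == 2:
        chain = np.zeros((data.shape[0], 1), dtype=np.int32)
        data = np.hstack([data, chain])
    elif data.shape[1] != 3:
        raise ValueError(f"Expected 2 or 3 columns, found {data.shape[1]}")
    return data

# Two-column input, read from memory instead of a file on disk
segments = read_segments(io.StringIO("1 10\n11 20\n"))
print(segments)
```

Checking the column count after loading avoids depending on the exception type, which appears to differ between NumPy versions.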

Problems running MaxEnt in the tutorial 03_reweighting notebook

Hi,
I'm getting this error when running the 03_reweighting notebook

---------------------------------------------------------------------------
AxisError                                 Traceback (most recent call last)
/tmp/ipykernel_27749/1894631224.py in <module>
     26 
     27 reweight_object = MaxEnt(do_reweight=True, do_params=False, stepfactor=0.00001)
---> 28 reweight_object.run(gamma=basegamma, data_folders=folders, kint_file=rates, exp_file=expt, times=times, restart_interval=100, out_prefix=f'reweighting_gamma_1x10^{exponent}_')
     29 print(f'Reweighting for gamma = 1x10^{exponent} completed')
     30 

~/tmp/HDX/hdxer/HDXer/HDXer/reweighting.py in run(self, gamma, resultsobj, analysisobj, restart, **run_params)
    971                                      self.runparams['kint_file'],
    972                                      self.runparams['exp_file'],
--> 973                                      self.runparams['times'])
    974             except KeyError:
    975                 raise HDX_Error("Missing parameters to set up a reweighting run.\n"

~/tmp/HDX/hdxer/HDXer/HDXer/reweighting.py in setup_no_runobj(self, folderlist, kint_file, expt_file_path, times)
     96         _contacts, _hbonds, _sorted_resids = read_contacts_hbonds(folderlist,
     97                                                                   self.runparams['contacts_prefix'],
---> 98                                                                   self.runparams['hbonds_prefix'])
     99         if self.runparams['do_subsample']:
    100             _contacts, _hbonds = subsample_contacts_hbonds(_contacts, _hbonds,

~/tmp/HDX/hdxer/HDXer/HDXer/reweighting_functions.py in read_contacts_hbonds(folderlist, contacts_prefix, hbonds_prefix)
    120         map(lambda x, y: x[y], [files_to_array(curr_hfiles) for curr_hfiles in hbondfiles], filters))
    121 
--> 122     contacts = np.concatenate(_contacts, axis=1)
    123     print("Contacts read")
    124     hbonds = np.concatenate(_hbonds, axis=1)

<__array_function__ internals> in concatenate(*args, **kwargs)

AxisError: axis 1 is out of bounds for array of dimension 1

My _contacts values are:

[array([22., 27., 32., 31., 23., 16., 26., 21., 28., 15.,  7., 30., 19.,
        20., 19., 16., 19., 20., 17., 15., 16., 32., 33., 28., 35., 30.,
        24., 17., 24., 10.,  9., 19., 17., 15., 16., 25., 26., 21., 13.,
        26., 28., 19., 21.,  3.,  5., 14., 28., 30., 24., 22., 14., 42.,
        37., 36., 34., 33., 15., 18., 13., 19., 36., 30., 20., 27., 29.,
        23., 24., 26., 24., 12., 11., 13., 19., 14., 24., 35., 30., 28.,
        30., 20.,  9., 18., 17., 11.,  9., 17., 32., 28., 22., 27., 27.,
        23., 33., 32., 29., 27., 23., 19., 25., 23., 32., 38., 36., 47.,
        39., 16., 12., 27., 24., 17., 23., 37., 21., 14., 18., 22., 30.,
        31., 32., 40., 35., 34., 30., 33., 35., 19., 26., 32., 22., 32.,
        33., 25., 20., 30., 33., 31., 24., 33., 49., 35., 19., 15., 20.,
        18., 30., 31., 40., 36., 36., 34., 34., 30.,  9., 14., 22., 20.,
        14., 19., 24., 20., 21., 18., 17., 12., 13.,  6., 11., 14., 11.,
        22., 30., 22., 24., 30., 25., 19., 28., 29., 22., 22., 31., 27.,
        22., 26., 27., 27., 18., 20., 19., 33., 32., 27., 23., 24., 24.,
        26., 23., 16., 17., 22., 14.,  0.,  3., 13., 13.,  6., 17.,  6.,
         7.,  3., 10., 17., 11., 21., 17., 12., 33., 43., 36., 24., 31.,
        35., 21., 18., 20., 19., 16., 13.,  9.,  9.,  5.,  1.,  1.,  2.,
         0.,  0.,  4., 16., 22., 15., 31., 28., 24., 36., 32., 26., 29.,
        27., 35., 35., 32., 26., 13., 20., 18., 20., 14., 22., 23., 20.,
        12., 18., 24., 27., 34., 32., 40., 40., 25., 10., 15.,  6.,  5.,
        18., 20.,  9.,  8., 14., 24., 14., 17.,  5., 12., 29., 36., 26.,
        31., 22.,  5., 16., 27., 16., 16., 19., 30., 27., 30., 39., 30.,
        24.,  0.,  0.,  0.,  0.,  0.,  1.,  2.,  2.,  2.,  2.,  2.,  0.,
         0.,  2.,  2.,  2.,  2.,  3.,  2.,  6., 19.,  8., 10., 28., 31.,
        20., 26., 35., 34., 26., 25., 26., 36., 38., 23., 10., 13., 15.,
        15., 28., 28., 27., 31., 32., 30., 37., 37., 27., 16., 17., 21.,
        21., 32., 39., 38., 32., 29., 37., 42., 23., 22., 16., 15., 19.,
        20., 18., 26., 27., 27., 30., 34., 26., 18., 19., 33., 37., 46.,
        37., 36., 34., 40., 19., 15., 11., 25., 29., 27., 22., 31., 29.,
        23., 23., 30., 33., 19., 25., 20., 23., 41., 40., 22., 25., 26.,
        15.,  6., 10.,  1.,  1.,  3., 17., 16., 15., 27., 35., 23., 17.,
        19., 15., 24., 26., 29., 35., 36., 39., 35., 15., 14., 28., 25.,
        13., 20., 16., 10., 15., 21., 26., 22., 25., 29., 19., 13., 14.,
        28., 31., 33., 30., 37., 41., 38., 36., 24., 29., 23., 31., 36.,
        42., 25., 21., 31., 26., 17., 22., 21.,  7., 14., 29., 26., 14.,
        21., 22., 20., 17., 23., 20., 17., 21.,  7., 20., 31., 37., 35.,
        38., 40., 37., 27., 24., 22., 17., 12., 23., 27., 22., 34., 31.,
        17., 14., 20., 34., 39., 32., 16., 22., 16., 14.,  1., 10.,  6.,
         2., 11., 15., 31., 35., 33., 30., 31., 12., 17., 35., 32., 18.,
        21., 21., 18., 21., 39., 22., 11., 12.,  9., 14., 13., 27., 19.,
        24., 19., 28., 21.,  7.,  7., 24., 22.,  5.,  2.,  2.,  2.,  2.,
         2.,  2.,  2.,  2.,  2.,  2.,  0.,  1.,  1.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  6.,  2.,  4.,  5.,
         6.,  8.,  9., 10.,  7.,  9., 13.,  1.,  0.,  0.,  2.,  2.,  2.,
         1.,  0.,  2.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.,  2.,  2.,  2.,  2.,  2.,  2.,  0.,  5., 13.,
         8., 11., 18., 29., 20., 23., 32., 21., 19., 29., 27., 29., 25.,
        26., 31., 17., 24., 27., 25., 20., 20., 31., 26., 15., 17., 15.,
        20., 25., 15., 26., 32., 32., 33., 23., 22., 27., 28., 27., 31.,
        40., 25., 23., 37., 37., 23., 10.,  9., 24., 34., 27., 19., 26.,
        20., 25., 33., 36., 28., 23., 32., 30., 23., 13., 23., 19., 19.,
        21., 20., 17., 25., 28., 23., 24., 29., 21., 19., 18., 12., 15.,
        13., 14., 24., 22.,  9.,  1.,  0.,  2.,  2.,  1.,  2.,  1.,  2.])]

How can I fix this, please?
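One possible cause, offered as an assumption rather than a confirmed diagnosis: if each per-residue file holds a single value (for example, a one-frame trajectory), `np.loadtxt` returns a 0-dimensional array, stacking those gives a 1-dimensional array like the one printed above, and concatenating along axis 1 then fails. The sketch below reproduces that shape problem and shows how `ndmin=1` keeps the frame axis. The helper name `load_per_residue` is hypothetical and is not HDXer's function.

```python
import io
import numpy as np

def load_per_residue(sources):
    """Stack per-residue data into an (n_residues, n_frames) array.

    Hypothetical helper. ndmin=1 ensures a file with a single value
    still yields a length-1 array rather than a 0-d array.
    """
    rows = [np.loadtxt(src, ndmin=1) for src in sources]
    return np.stack(rows, axis=0)

# Two residues, one frame each (in-memory stand-ins for files)
single_frame = [io.StringIO("22.0\n"), io.StringIO("27.0\n")]
arr = load_per_residue(single_frame)
print(arr.shape)

# With the frame axis preserved, concatenation along axis 1 works
combined = np.concatenate([arr, arr], axis=1)
print(combined.shape)
```

If the input files do contain only one frame, rerunning the analysis on a trajectory with more frames may be the simpler fix.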

MDtraj discontinued

MDtraj is currently under low maintenance and might be discontinued soon.
It might be useful to replace it in the near future with another package that reads topologies and trajectories.

reweighting error

I have been running this function with the correct parameters:
reweight_object.run(gamma=basegamma, data_folders=folders, kint_file=rates, exp_file=expt, times=times, restart_interval=100, out_prefix=f'reweighting_gamma_1x10^{exponent}')
and I keep getting this error:
Traceback (most recent call last):
File "/Users/ragarwal/Downloads/HDXer/HDXer/reweighting_functions.py", line 49, in files_to_array
return np.stack(datalist, axis=0)
File "<array_function internals>", line 180, in stack
File "/Users/ragarwal/miniconda3/envs/HDXER_ENV/lib/python3.8/site-packages/numpy/core/shape_base.py", line 426, in stack
raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 1, in
File "/Users/ragarwal/Downloads/HDXer/HDXer/reweighting.py", line 970, in run
self.setup_no_runobj(self.runparams['data_folders'],
File "/Users/ragarwal/Downloads/HDXer/HDXer/reweighting.py", line 96, in setup_no_runobj
_contacts, _hbonds, _sorted_resids = read_contacts_hbonds(folderlist,
File "/Users/ragarwal/Downloads/HDXer/HDXer/reweighting_functions.py", line 118, in read_contacts_hbonds
map(lambda x, y: x[y], [files_to_array(curr_cfiles) for curr_cfiles in contactfiles], filters))
File "/Users/ragarwal/Downloads/HDXer/HDXer/reweighting_functions.py", line 118, in
map(lambda x, y: x[y], [files_to_array(curr_cfiles) for curr_cfiles in contactfiles], filters))
File "/Users/ragarwal/Downloads/HDXer/HDXer/reweighting_functions.py", line 54, in files_to_array
raise ValueError("Error in stacking files read with np.loadtxt - are they all the same length?")
ValueError: Error in stacking files read with np.loadtxt - are they all the same length?

I checked the files and they all are of the same length.
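A quick way to double-check the lengths as NumPy sees them is sketched below. It counts the values `np.loadtxt` actually parses from each file and reports any file whose count differs from the most common one. Both function names are hypothetical helpers and are not part of HDXer.

```python
from collections import Counter
import numpy as np

def file_lengths(paths):
    """Map each path to the number of values np.loadtxt reads from it."""
    return {str(p): np.loadtxt(p, ndmin=1).shape[0] for p in paths}

def mismatched(lengths):
    """Return the entries whose length differs from the most common length."""
    if not lengths:
        return {}
    most_common_len, _ = Counter(lengths.values()).most_common(1)[0]
    return {name: n for name, n in lengths.items() if n != most_common_len}
```

For example, `mismatched(file_lengths(sorted(glob.glob(pattern))))` with `pattern` matching the Contacts or Hbonds .tmp files in one data folder (the pattern depends on your own file names) should return an empty dict if every file really has the same number of entries.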

Error in re-weighting step for multichain proteins

Hello, I'm having issues in the reweighting step because my protein has two chains. At first I ran the '03_reweighting' notebook and found that I got 'nan' for all segments in the second chain in the final_segment_fractions.dat file. I then realized that my Hbonds and Contacts .tmp files are indexed with '0' and '1', like 'Hbonds_chain_0_res.tmp' and 'Hbonds_chain_1_res_*.tmp', and that only the '0'-indexed ones are read, as in lines 433-444 of the HDXer/HDXer/reweighting.py script:

I tried running it like this, specifying the .tmp prefixes for both chains:

reweight_object = MaxEnt(do_reweight=True, do_params=False, stepfactor=0.00001)
reweight_object.run(gamma=basegamma, data_folders=folders, kint_file=rates, exp_file=expt, times=times, restart_interval=100, out_prefix=f'reweighting_gamma_1x10^{exponent}', hbonds_prefix=('Hbonds_chain_0_res','Hbonds_chain_1_res_'), contacts_prefix=['Contacts_chain_0_res_','Contacts_chain_1_res_'])

But I get this error:

ValueError Traceback (most recent call last)
Cell In[64], line 28
25 basegamma = 10**exponent
27 reweight_object = MaxEnt(do_reweight=True, do_params=False, stepfactor=0.00001)
---> 28 reweight_object.run(gamma=basegamma, data_folders=folders, kint_file=rates, exp_file=expt, times=times,
29 restart_interval=100, out_prefix=f'reweighting_gamma_1x10^{exponent}',
30 hbonds_prefix=('Hbonds_chain_0_res','Hbonds_chain_1_res_'),
31 contacts_prefix=('Contacts_chain_0_res_','Contacts_chain_1_res_'))
34 print(f'Reweighting for gamma = 1x10^{exponent} completed')
36 # Help text describing options and how to call the reweighting functions
37 # is available in the docstrings of the MaxEnt class, e.g.:
38 #help(MaxEnt)
39 #help(MaxEnt.run)

File ~/HDXer/HDXer/reweighting.py:970, in MaxEnt.run(self, gamma, resultsobj, analysisobj, restart, **run_params)
968 else:
969 try:
--> 970 self.setup_no_runobj(self.runparams['data_folders'],
971 self.runparams['kint_file'],
972 self.runparams['exp_file'],
973 self.runparams['times'])
974 except KeyError:
975 raise HDX_Error("Missing parameters to set up a reweighting run.\n"
976 "Please ensure a restart or calc_hdx object is provided,"
977 "or provide the following arguments to the run() call: "
978 "data_folders, kint_file, exp_file, times")

File ~/HDXer/HDXer/reweighting.py:96, in MaxEnt.setup_no_runobj(self, folderlist, kint_file, expt_file_path, times)
78 self.runvalues = {}
80 maxentvalues = { 'contacts' : None,
81 'hbonds' : None,
82 'minuskt' : None,
(...)
94 'curriter' : None,
95 'is_converged' : None }
---> 96 _contacts, _hbonds, _sorted_resids = read_contacts_hbonds(folderlist,
97 self.runparams['contacts_prefix'],
98 self.runparams['hbonds_prefix'])
99 if self.runparams['do_subsample']:
100 _contacts, _hbonds = subsample_contacts_hbonds(_contacts, _hbonds,
101 self.runparams['sub_start'],
102 self.runparams['sub_end'],
103 self.runparams['sub_interval'])

File ~/HDXer/HDXer/reweighting_functions.py:110, in read_contacts_hbonds(folderlist, contacts_prefix, hbonds_prefix)
108 for r, f in list(zip(resids, filters)):
109 new_resids.append(np.array(r)[f])
--> 110 new_resids = np.stack(new_resids)
111 if not np.diff(new_resids, axis=0).sum(): # If sum of differences between filtered resids == 0
112 pass

File <array_function internals>:200, in stack(*args, **kwargs)

File ~/.conda/envs/HDXER_ENV/lib/python3.8/site-packages/numpy/core/shape_base.py:464, in stack(arrays, axis, out, dtype, casting)
462 shapes = {arr.shape for arr in arrays}
463 if len(shapes) != 1:
--> 464 raise ValueError('all input arrays must have the same shape')
466 result_ndim = arrays[0].ndim + 1
467 axis = normalize_axis_index(axis, result_ndim)

ValueError: all input arrays must have the same shape

How can I specify more than one chain when running MaxEnt?

Any help would be much appreciated.
Thanks,

Issues with NaNs When Re-weighting Deuterated Fractions from MD

I want to preface this by saying that I'm much more familiar with MD than HDX, but I've recently become interested in determining structures from an MD ensemble that match HDX observables.

I've managed to create a script from the tutorial notebooks that works well for my purposes. The predicted DF from MD look good and correlate decently with experiment, but when I perform re-weighting, the resulting final_segment_fractions contain a lot of NaN values.

I'm attaching the script I'm using, as well as plots of the MD vs EXPT deuterated fractions before and after re-weighting.
Hoping you can help me understand where I'm going wrong. Many thanks.

[Attached screenshots: MD vs. EXPT deuterated fractions before and after re-weighting]

hdxer_test.txt

Std. Dev. across trajectory blocks

Hey, I am using multiple simulation trajectories as input, but I don't see the "Std. Dev. across trajectory blocks" in the output 'Deuteration fraction vs. Time' plots. How can I get it to show up?
