teese / eccpy

ECCpy is a program for EC50 calculation in python.

License: MIT License

Python 100.00%

eccpy's Issues

IC50/EC50 calculation directly from a pandas df

Hey,
First of all, I really appreciate the tool. It's quite useful, thanks for the effort :)
I find the tool's usage quite confusing, and having to depend on both a GUI-based tool (Excel) and the command prompt at the same time is annoying. Wouldn't it be a lot easier and more logical to have a pure Python version?
Something like a Jupyter notebook where we could use CSV files or pandas to do all the calculation and plotting?
This is more of a recommendation than an issue; nevertheless, I believe it would be a nice option to have.
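
As a rough illustration of what such a notebook cell could look like, here is a minimal sketch that fits a four-parameter Hill curve to two DataFrame columns with scipy.optimize.curve_fit. It does not use eccpy and is not how eccpy calculates its values; the function names hill and fit_ec50, the column names, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit


def hill(dose, bottom, top, ec50, hill_slope):
    """Four-parameter logistic (Hill) dose-response curve."""
    return bottom + (top - bottom) / (1.0 + (ec50 / dose) ** hill_slope)


def fit_ec50(df, dose_col="dose", resp_col="response"):
    """Fit the Hill curve to two DataFrame columns and return the parameters."""
    data = df[df[dose_col] > 0]  # the curve is undefined at dose == 0
    dose = data[dose_col].to_numpy(dtype=float)
    resp = data[resp_col].to_numpy(dtype=float)
    # Starting guesses: observed response range, geometric-mean dose, slope 1.
    p0 = [resp.min(), resp.max(), np.exp(np.mean(np.log(dose))), 1.0]
    # Bounds keep the EC50 and the slope positive during the fit.
    lower = [-np.inf, -np.inf, 1e-12, 1e-3]
    upper = [np.inf, np.inf, np.inf, 20.0]
    params, _ = curve_fit(hill, dose, resp, p0=p0, bounds=(lower, upper))
    return dict(zip(["bottom", "top", "ec50", "hill_slope"], params))


# Synthetic, noise-free data generated from a known EC50 of 2.0 (not measurements).
doses = np.array([0.0, 0.3125, 0.625, 1.25, 2.5, 5.0, 10.0])
df = pd.DataFrame({"dose": doses})
df["response"] = [0.0] + list(hill(doses[1:], 0.0, 100.0, 2.0, 1.5))

fit = fit_ec50(df)
print(f"EC50 = {fit['ec50']:.3f}")
```

Real data would need noise handling, quality checks, and plotting on top of this, which is where most of eccpy's settings come in.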

problem with skipped rows in versamax format

The excel file holding the dose concentrations in the Versamax format cannot have any skipped rows that are "FALSE" for Contains_Data.

If any Contains_Data rows are False, followed by True, the wrong sample names are added!
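
One plausible mechanism for this kind of shift is sketched below; it has not been confirmed against the eccpy source. The table layout, the index labels, and the sample names are hypothetical. The point is only that taking the n-th name from the unfiltered table misaligns every row after a skipped one, while reading names from the filtered table does not.

```python
import pandas as pd

# Hypothetical layout: one row per plate column, flagged by Contains_Data.
layout = pd.DataFrame(
    {
        "Contains_Data": [True, False, True],
        "samplename": ["sample_A", "unused", "sample_C"],
    },
    index=["col_1", "col_2", "col_3"],
)

used = layout[layout["Contains_Data"]]

# Positional lookup: the n-th data column takes the n-th row of the unfiltered
# table, so the skipped row shifts every later name.
positional = [layout["samplename"].iloc[i] for i in range(len(used))]

# Label-based lookup: names are read from the filtered table, keyed by its index.
by_label = used["samplename"].to_dict()

print(positional)
print(by_label)
```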

symbols are not accepted in samplenames

'float' object has no attribute 'max'

Hi! I ran your program on your test file generated_data_0.xlsx (macOS Big Sur, Python 3.8.5) and got the following message: 'float' object has no attribute 'max'

Here are details:
AttributeError Traceback (most recent call last)
in
1 settings = "/Users/nomadkml/Desktop/result/ECCpy_settings_template.xlsx"
----> 2 eccpy.run_curvefit(settings)
3 eccpy.run_gatherer(settings)

~/opt/anaconda3/lib/python3.8/site-packages/eccpy/curvefit.py in run_curvefit(settings_excel_file)
77 print("Starting run_curvefit program for selected samples.\n")
78 for fn in dff.loc[dff["run curvefit"] == True].index:
---> 79 calc_EC50(fn, dff, settings, t20)
80 else:
81 print("None of the datafiles are marked TRUE for 'run curvefit'. Suggest checking the excel settings file.")

~/opt/anaconda3/lib/python3.8/site-packages/eccpy/curvefit.py in calc_EC50(fn, dff, settings, t20)
1107 # set the x-axis limit so that the legend does not hide too many data points
1108 # find the maximum dose concentration in the whole experiment for that day
-> 1109 maxAC = x_orig.max().max()
1110 # # obtain the variable altering the extension of the x-axis
1111 # x_axis_extension_after_dosemax_in_summ_plot = dff.loc[fn, "x-axis extension in summary fig_0"]

AttributeError: 'float' object has no attribute 'max'
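
For context: DataFrame.max() returns a Series, so chaining a second .max() gives a scalar, whereas a plain Python float has no .max method. The message therefore suggests that x_orig.max() had already returned a float, presumably because x_orig was not a DataFrame at that point; the traceback alone does not show why. The sketch below is one possible way to take the overall maximum without depending on the container type. The helper name overall_max is hypothetical and is not part of eccpy.

```python
import numpy as np
import pandas as pd


def overall_max(x):
    """Largest non-NaN value in a DataFrame, Series, list, or scalar."""
    return float(np.nanmax(np.asarray(x, dtype=float).ravel()))


frame = pd.DataFrame({"s1": [0.1, 1.0, 10.0], "s2": [0.2, 2.0, np.nan]})

print(frame.max().max())          # DataFrame -> Series -> scalar
print(hasattr(1.0, "max"))        # a plain Python float has no .max method
print(overall_max(frame))         # 2-D input
print(overall_max(frame["s1"]))   # 1-D input
print(overall_max(3.5))           # scalar input
```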

Images, but no data appear.

It is unclear why, but even with all dependencies installed, the images of the curves are created while none of the Excel sheets containing the data are. I should note that the images do show the EC50s.
This may be a general Excel issue, as only the first data sheet in the template is read and the others are ignored. For instance, with the template, dataset 0 is read, but not datasets 1 or 2.
Any suggestions are greatly appreciated.

adapt to new pandas syntax

Problem: Running eccpy with the latest version of python-pandas gives the following error:
TypeError: read_excel() got an unexpected keyword argument sheetname

Eccpy should therefore be adapted to work with the latest version of pandas, numpy, and matplotlib.

Acceptance Criteria:

eccpy tests work successfully with the latest anaconda python

all other functions still work, and the plots still look okay.
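
The keyword in the error message is spelled sheet_name in current pandas. The lasting fix is to rename it at each call site; as a stopgap, a small wrapper could translate the old spelling. The sketch below assumes that approach, and the names modernise_excel_kwargs and read_excel_compat, as well as the sheet name "settings", are hypothetical.

```python
import pandas as pd


def modernise_excel_kwargs(kwargs):
    """Return a copy of kwargs with the old 'sheetname' key renamed to 'sheet_name'."""
    updated = dict(kwargs)
    if "sheetname" in updated:
        updated["sheet_name"] = updated.pop("sheetname")
    return updated


def read_excel_compat(path, **kwargs):
    """Hypothetical wrapper: accept either spelling, call pandas with the current one."""
    return pd.read_excel(path, **modernise_excel_kwargs(kwargs))


print(modernise_excel_kwargs({"sheetname": "settings", "index_col": 0}))
```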

Input preparation to run eccpy for n=5

Dear Mark Teese,

I have made 12 dose-response experiments, each following the scheme attached below.
So far I haven't found an option in the current version of the script to merge the n replicates in order to get mean and SEM curves.
How should I prepare the ECCpy_settings_template.xlsx?

Concentration	Response 1	Response 2	Response 3	Response 4	Response 5
10	Resp1-1	Resp1-2	Resp1-3	Resp1-4	Resp1-5
5	Resp2-1	Resp2-2	Resp2-3	Resp2-4	Resp2-5
2,5	Resp3-1	Resp3-2	Resp3-3	Resp3-4	Resp3-5
1,25	Resp4-1	Resp4-2	Resp4-3	Resp4-4	Resp4-5
0,625	Resp5-1	Resp5-2	Resp5-3	Resp5-4	Resp5-5
0,3125	Resp6-1	Resp6-2	Resp6-3	Resp6-4	Resp6-5
0	Resp7-1	Resp7-2	Resp7-3	Resp7-4	Resp7-5
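
Independently of eccpy, the mean and SEM per concentration for a table in this layout can be computed directly with pandas. This is a minimal sketch that does not touch the settings file; the response values are made-up placeholders, and only three replicates and three doses are shown to keep it short.

```python
import io

import pandas as pd

# Tab-separated text in the layout shown above, with invented response values.
raw = (
    "Concentration\tResponse 1\tResponse 2\tResponse 3\n"
    "10\t98\t102\t100\n"
    "2,5\t61\t58\t55\n"
    "0\t2\t4\t3\n"
)

# decimal="," makes pandas read "2,5" as 2.5.
wide = pd.read_csv(io.StringIO(raw), sep="\t", decimal=",", index_col="Concentration")

summary = pd.DataFrame(
    {
        "mean": wide.mean(axis=1),
        "sem": wide.sem(axis=1),  # sample standard deviation / sqrt(n)
        "n": wide.count(axis=1),
    }
)
print(summary)
```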

add requirements.txt

Problem: The compatible versions of pandas/numpy/matplotlib are not defined anywhere in eccpy. Breaking changes in the dependencies can leave the software non-functional.

Suggested implementation: after fixing Issue #10, save a snapshot of the Python dependencies and add them to a requirements.txt.

Acceptance criteria:

pip and manual installation (python setup.py install) download the exact working dependencies
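
One way to take such a snapshot from a known-working environment is sketched below, using the standard-library importlib.metadata module (Python 3.8+). The helper name pin_installed is hypothetical, and the package list contains only the three dependencies named in this issue; the printed versions are whatever happens to be installed where it runs.

```python
from importlib import metadata


def pin_installed(packages):
    """Return 'name==version' lines for the packages that are installed here."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Not installed in this environment, so there is nothing to pin.
            continue
    return lines


# Paste the printed lines into requirements.txt after checking that the tests pass.
print("\n".join(pin_installed(["pandas", "numpy", "matplotlib"])))
```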

Sample Replicates

When replicates of a single sample are added (using the exact same sample name) the values for the response of a given dose are not averaged and the error bars (SEM) are not displayed in the graph produced.

doseconc_stepsize does not seem to change data_needs_checking

Need to check that max_acceptable_doseconc_stepsize_at_EC50 is actually applied when data_needs_checking is set.

analysed barchart sample names are not in order

In one example, after collecting data from multiple experiments:

the bar chart showed the wrong sample names on the x-axis, BUT
the scatter plot with a point for each experiment was good
the saved excel and csv were both correct

Need to check if the bug can be repeated, and find how the sample names became unordered.

change n_neighb to num_dp_expected_in_curve

The n_neighb is a key parameter for the automatic judgement of curves, but is really difficult to explain.

At some stage, it might make sense to use a value that is easy to explain in the settings file:
num_dp_expected_in_curve (number of datapoints expected in curve)
or even better,
num_dp_expected_at_inflection (number of datapoints expected at inflection)

If we don't want to change the code, n_neighb could simply be half this value:
n_neighb = num_dp_expected_at_inflection / 2
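
A small caveat, assuming n_neighb has to be a positive integer: in Python 3 the / operator returns a float, so the conversion probably needs integer division. The helper name and the lower limit of 1 in this sketch are assumptions.

```python
def n_neighb_from_setting(num_dp_expected_at_inflection):
    """Derive n_neighb as half the user-facing setting, rounded down, at least 1."""
    return max(1, int(num_dp_expected_at_inflection) // 2)


for setting in (1, 5, 6):
    print(setting, n_neighb_from_setting(setting))
```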

filter for max_highdose_slope not applied

The calculated value is correct, but it is not applied as a TRUE/FALSE filter for good data.

drop-down menus in excel settings file

At some stage, options like the method_calc_y50 should be choosable from drop-down menus.

It's relatively easy to do, and would greatly help users choose different settings.

Index 0 is out of bounds for axis 0 with size 0

Hello teese,

I'm trying to get the EC50 from some of my own data, and I have not been able to do so with your program. The following error appears every time I run the code: index 0 is out of bounds for axis 0 with size 0
I sincerely do not know what I am doing wrong, even though I have read your tutorial over and over.
I would be very glad if you could help me.
Thanks in advance.
PS: here is a screenshot of the problem.
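
For anyone hitting the same message: this is the IndexError NumPy raises when the first element of an empty array is requested, so somewhere a selection returned no rows or values. Which selection was empty in this run cannot be determined from the report. The sketch below reproduces the message with made-up doses and shows a guard; the helper name first_or_none is hypothetical.

```python
import numpy as np


def first_or_none(values):
    """Return the first element, or None when there is nothing to index."""
    arr = np.asarray(values)
    return arr[0] if arr.size > 0 else None


doses = np.array([0.1, 1.0, 10.0])
selected = doses[doses > 100]  # no dose matches, so this array is empty

try:
    selected[0]
except IndexError as err:
    message = str(err)
    print(message)

print(first_or_none(selected))
```

A common place to start looking is any filter on the input table (sample names, TRUE/FALSE flags, sheet names) that might match nothing.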