genomiqueens / toulligqc
A post sequencing QC tool for Oxford Nanopore sequencers
License: Other
Hi!
I managed to run the latest version of toulligQC (2.3) with default guppy basecalling:
toulligqc \
--report-name "$run_id" \
--telemetry-source ./fastq_hac_400/sequencing_telemetry.js \
--sequencing-summary-source ./fastq_hac_400/sequencing_summary.txt \
--html-report-path ./toolligqc/"$run_id"_tooligqc_"$run_id".html \
--data-report-path ./toolligqc/"$run_id"_tooligqc_"$run_id".data
However when I ran it with the demultiplexing files I got the following error:
toulligqc \
--force \
--report-name "$run_id" \
--barcoding \
--telemetry-source "$run_path"/fastq_hac_400/sequencing_telemetry.js \
--sequencing-summary-source "$run_path"/fastq_hac_400/sequencing_summary.txt \
--sequencing-summary-source "$run_path"/guppy_demultiplexed_pass/barcoding_summary_pass.txt \
--sequencing-summary-source "$run_path"/guppy_demultiplexed_fail/barcoding_summary_fail.txt \
--html-report-path "$run_path"/toolligqc/"$run_id"_tooligqc_"$run_id".2.html \
--data-report-path "$run_path"/toolligqc/"$run_id"_tooligqc_"$run_id".2.data \
--barcodes BC01,BC02,BC03,BC04,BC05,BC06,BC07,BC08,BC09,BC10,BC11,BC12,BC13,BC14,BC15,BC16,BC17,BC18,BC19,BC20,BC21,BC22,BC23,BC24
* Start Basecaller sequencing summary extractor
Traceback (most recent call last):
File "/home/vincent.hahaut/anaconda3/bin/toulligqc", line 33, in <module>
sys.exit(load_entry_point('toulligqc==2.3', 'console_scripts', 'toulligqc')())
File "/home/vincent.hahaut/anaconda3/lib/python3.8/site-packages/toulligqc-2.3-py3.8.egg/toulligqc/toulligqc.py", line 347, in main
extractor.init()
File "/home/vincent.hahaut/anaconda3/lib/python3.8/site-packages/toulligqc-2.3-py3.8.egg/toulligqc/sequencing_summary_extractor.py", line 106, in init
self.dataframe_1d = self._load_sequencing_summary_data()
File "/home/vincent.hahaut/anaconda3/lib/python3.8/site-packages/toulligqc-2.3-py3.8.egg/toulligqc/sequencing_summary_extractor.py", line 408, in _load_sequencing_summary_data
dataframes_merged = pd.merge(
File "/home/vincent.hahaut/anaconda3/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 74, in merge
op = _MergeOperation(
File "/home/vincent.hahaut/anaconda3/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 598, in __init__
_left = _validate_operand(left)
File "/home/vincent.hahaut/anaconda3/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 2148, in _validate_operand
raise TypeError(
TypeError: Can only merge Series or DataFrame objects, a <class 'NoneType'> was passed
I did not manage to debug it myself. The links to the barcoding summary files are correct:
head -n 2 "$run_path"/guppy_demultiplexed_pass/barcoding_summary_pass.txt
read_id barcode_arrangement barcode_full_arrangement barcode_kit barcode_variant barcode_score barcode_front_id barcode_front_score barcode_front_refseq barcode_front_foundseq barcode_front_foundseq_length barcode_front_begin_index barcode_rear_id barcode_rear_score barcode_rear_refseq barcode_rear_foundseq barcode_rear_foundseq_length barcode_rear_end_index barcode_front_total_trimmed barcode_rear_total_trimmed barcode_mid_front_id barcode_mid_front_score barcode_mid_front_end_index barcode_mid_rear_id barcode_mid_rear_score barcode_mid_rear_end_index adapter_front_id adapter_front_score adapter_front_foundseq_len adapter_front_begin_index adapter_rear_id adapter_rear_score adapter_rear_foundseq_len adapter_rear_end_index adapter_mid_id adapter_mid_score adapter_mid_end_index
fa6dee83-eb61-5169-bddf-8f6c19de6b53 barcode19 NB19_var1 NB var1 46.3333 NB19_FWD 46.3333 AGGTTAAGTTCCTCGTGCAGTGTCAAGAGATCAGCACCT AGGTAGTCTGTACATAATTCAGAGAGAGGACAT 33 37 NB19_REV 21.9167 GGTGCTGATCTCTTGACACTGCACGAGGAACTTAACCTTAGCAAT TGCTGCCATTCGGCCAGTGAGTCTTCTCCCAAT 33 84 70 117 unclassified 0 -1 unclassified 0 -1 ADAPTER_LSK109_FWD 37.9286 20 73 ADAPTER_LSK110_REV 43.5556 13 114 unclassified 0 -1
head -n 2 "$run_path"/guppy_demultiplexed_fail/barcoding_summary_fail.txt
read_id barcode_arrangement barcode_full_arrangement barcode_kit barcode_variant barcode_score barcode_front_id barcode_front_score barcode_front_refseq barcode_front_foundseq barcode_front_foundseq_length barcode_front_begin_index barcode_rear_id barcode_rear_score barcode_rear_refseq barcode_rear_foundseq barcode_rear_foundseq_length barcode_rear_end_index barcode_front_total_trimmed barcode_rear_total_trimmed barcode_mid_front_id barcode_mid_front_score barcode_mid_front_end_index barcode_mid_rear_id barcode_mid_rear_score barcode_mid_rear_end_index adapter_front_id adapter_front_score adapter_front_foundseq_len adapter_front_begin_index adapter_rear_id adapter_rear_score adapter_rear_foundseq_len adapter_rear_end_index adapter_mid_id adapter_mid_score adapter_mid_end_index
82bb5bfb-5e34-565e-a6b4-6ad6947004b0 unclassified NB12_var2 NB var2 39.75 NB12_FWD 39.75 ATTGCTAAGGTTAATCCGATTCTGCTTCTTTCTACCTGCAGCACC TTGCTACATAGACGGGTGTGCTCTTTTCACTGTTCAG 37 41 NB12_REV 16.25 AGGTGCTGCAGGTAGAAAGAAGCAGAATCGGATTAACCT GGGCTAGGTTTAGCCCCATACTATGTTAGTTGATACC 37 39 0 0 unclassified 0 -1 unclassified 0 -1 ADAPTER_LSK109_FWD 37 22 126 ADAPTER_LSK109_REV 29.8261 12 5 unclassified 0 -1
Any idea why this is happening?
Thank you in advance
Hi,
I am having a problem running toulligQC. It can't seem to find the files it needs even though I am using absolute paths and I allow read/write permissions for everything. I am using docker since I couldn't get the standard install to work. I am on Ubuntu 16.04.
I pulled the image as in the instructions, and I also tried building it with docker build. I get the same error either way. I tried with my own data and with the test data that is downloaded.
If I modify the config file and run the test data:
sudo docker run genomicpariscentre/toulligqc -c /tools/toulligQC/test_data/config.txt -n /tools/toulligQC/test_data/091817_test -b
Traceback (most recent call last):
File "/usr/local/bin/toulligqc", line 11, in <module>
load_entry_point('toulligqc===-0.1a1-', 'console_scripts', 'toulligqc')()
File "/usr/local/lib/python3.5/dist-packages/toulligqc-_0.1a1_-py3.5.egg/toulligqc/toulligqc.py", line 228, in main
parse_args(config_dictionary)
File "/usr/local/lib/python3.5/dist-packages/toulligqc-_0.1a1_-py3.5.egg/toulligqc/toulligqc.py", line 87, in parse_args
config_dictionary.load(conf_file)
File "/usr/local/lib/python3.5/dist-packages/toulligqc-_0.1a1_-py3.5.egg/toulligqc/toulligqc_conf.py", line 78, in load
with open(conf_path, 'r') as config_file:
FileNotFoundError: [Errno 2] No such file or directory: '/tools/toulligQC/test_data/config.txt'
Running without the config file but with the individual files and the built docker image:
sudo docker run 8b035f9a408a -n test_individual_files -f /tools/toulligQC/test_data/dnacpc14_20170328_FNFAF04250_MN17734_mux_scan_1D_validation_test1_45344_ch282_read40_strand.fast5 -a /tools/toulligQC/test_data/sequencing_summary/sequencing_summary.txt -q /tools/toulligQC/test_data/fastq/20170328_FAF04250/20170328_FAF04250_barcode01.fastq -o /tools/toulligQC/test_data -s /tools/toulligQC/test_data/samplesheet.csv -b
Traceback (most recent call last):
File "/usr/local/bin/toulligqc", line 11, in <module>
load_entry_point('toulligqc===-0.1a1-', 'console_scripts', 'toulligqc')()
File "/usr/local/lib/python3.5/dist-packages/toulligqc-_0.1a1_-py3.5.egg/toulligqc/toulligqc.py", line 237, in main
barcode_selection = get_barcode(sample_sheet_file)
File "/usr/local/lib/python3.5/dist-packages/toulligqc-_0.1a1_-py3.5.egg/toulligqc/toulligqc.py", line 174, in get_barcode
with open(barcode_file) as csvfile:
FileNotFoundError: [Errno 2] No such file or directory: '/tools/toulligQC/test_data/samplesheet.csv'
Any help would be great. Thanks!
OK, I tried our barcoded nanopore data, and toulligqc generated reports. So what can I do for non-barcoded data, given that toulligqc requires a sample sheet file?
It gives KeyError: 'sample_sheet_file'
Thanks!
George
I am trying to run toulligQC to generate Albacore run reports. I tried different options and always got errors. It looks like toulligqc requires all the options even when the run does not have barcodes.
I ran the following command:
toulligqc -n test_run \
-f /storage/FILES/Oxford_Minion/Hudson_Sept8_FLO-MIN106_SQK-LSK108/Sept_R9.4_flowcells/testReport/0 \
-a /storage/FILES/Oxford_Minion/Hudson_Sept8_FLO-MIN106_SQK-LSK108/Sept_R9.4_flowcells/testReport/ \
-q /storage/FILES/Oxford_Minion/Hudson_Sept8_FLO-MIN106_SQK-LSK108/Sept_R9.4_flowcells/testReport/fastq \
-o /storage/FILES/Oxford_Minion/Hudson_Sept8_FLO-MIN106_SQK-LSK108/Sept_R9.4_flowcells/testReport/report
and got the following errors:
Traceback (most recent call last):
File "/home/apps/toulligQC/toulligQC-20170912/bin/toulligqc", line 11, in <module>
load_entry_point('toulligqc===-0.1a1-', 'console_scripts', 'toulligqc')()
File "/home/apps/toulligQC/toulligQC-20170912/lib/python3.6/site-packages/toulligqc-0.1a1-py3.6.egg/toulligqc/toulligqc.py", line 229, in main
check_conf(config_dictionary)
File "/home/apps/toulligQC/toulligQC-20170912/lib/python3.6/site-packages/toulligqc-0.1a1-py3.6.egg/toulligqc/toulligqc.py", line 142, in check_conf
if not config_dictionary['sample_sheet_file']:
File "/home/apps/toulligQC/toulligQC-20170912/lib/python3.6/site-packages/toulligqc-0.1a1-py3.6.egg/toulligqc/toulligqc_conf.py", line 46, in __getitem__
return self._config_dictionary[item]
KeyError: 'sample_sheet_file'
I am wondering if you could show me how to run toulligqc with the following output from an Albacore run. The pass directory has subdirectories 0, 1 and 2 holding fast5 files. The fastq files are also in the pass directory.
Thanks!
George
albacoreResults
----- configuration.cfg
----- pipeline.log
----- sequencing_summary.txt
----- workspace
------ fail
------ pass
--------- 0
--------- 1
--------- 2
--------- fastq_runid_31ee3c191ce764f50a56b1bbee67326bf4c6d40e_0.fastq
--------- fastq_runid_31ee3c191ce764f50a56b1bbee67326bf4c6d40e_1.fastq
--------- fastq_runid_77317b2780c17bc6d729717e91f58ff31823c8a8_2.fastq
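The traceback in this report ends in a plain dictionary lookup, so a run that never sets `sample_sheet_file` raises `KeyError` before the check can decide whether the option is needed. A minimal sketch of a tolerant lookup, assuming a wrapper shaped like the one in the traceback; the class name `ConfigSketch` and its `get` method are hypothetical and not toulligqc's API:

```python
class ConfigSketch:
    """Hypothetical stand-in for the config wrapper seen in the traceback."""

    def __init__(self):
        self._config_dictionary = {}

    def __setitem__(self, key, value):
        self._config_dictionary[key] = value

    def __getitem__(self, item):
        # Same kind of lookup as in the traceback: KeyError for unset options.
        return self._config_dictionary[item]

    def get(self, item, default=None):
        # Tolerant lookup: unset options fall back to a default value.
        return self._config_dictionary.get(item, default)


config = ConfigSketch()
config["barcoding"] = False  # run without barcodes, no sample sheet given

# Only insist on a sample sheet when barcoding is requested.
sample_sheet_required = bool(config.get("barcoding"))
sample_sheet = config.get("sample_sheet_file")
if sample_sheet_required and not sample_sheet:
    raise SystemExit("a sample sheet is needed for barcoded runs")
```

With a lookup like this, a non-barcoded run would pass the check without a sample sheet, while a barcoded run without one would still stop with a readable message.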
Hi!
Thank you for providing this tool. I have new projects that are stored in pod5. The tool does not seem to be compatible with this format.
Is there any workaround for this?
Thank you in advance.
Kind regards,
Oscar
I'm running toulligqc on very recent fast5 files and get the following error:
ToulligQC version 1.1
* Initialize extractors
* Start Sequencing telemetry extractor
* End of Sequencing telemetry extractor (done in 00:00:00)
* Start Fast5 extractor
Traceback (most recent call last):
File "/home/jroels/miniconda3/envs/toulligqc/bin/toulligqc", line 10, in <module>
sys.exit(main())
File "/home/jroels/miniconda3/envs/toulligqc/lib/python3.6/site-packages/toulligqc/toulligqc.py", line 354, in main
extractor.extract(result_dict)
File "/home/jroels/miniconda3/envs/toulligqc/lib/python3.6/site-packages/toulligqc/fast5_extractor.py", line 115, in extract
result_dict['sequencing.telemetry.extractor.flowcell.id'] = self._get_fast5_items(h5py_file, 'flow_cell_id')
File "/home/jroels/miniconda3/envs/toulligqc/lib/python3.6/site-packages/toulligqc/fast5_extractor.py", line 230, in _get_fast5_items
tracking_id_items = list(h5py_file["/UniqueGlobalKey/tracking_id"].attrs.items())
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/home/jroels/miniconda3/envs/toulligqc/lib/python3.6/site-packages/h5py/_hl/group.py", line 167, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: 'Unable to open object (component not found)'
I can't find the cause of the problem. Can anyone help?
In the event anyone tries to run this with numpy 1.24, you just need to change the numpy type casting for booleans in line 355 of sequencing_summary_extractor.py
from
sequencing_summary_datatypes = {
'channel': np.int16,
'start_time': np.float64,
'passes_filtering': np.bool, <--------------------------------------------
'sequence_length_template': np.uint32,
'mean_qscore_template': np.float32,
'duration': np.float32}
to
sequencing_summary_datatypes = {
'channel': np.int16,
'start_time': np.float64,
'passes_filtering': np.bool_, <--------------------------------------------
'sequence_length_template': np.uint32,
'mean_qscore_template': np.float32,
'duration': np.float32}
then it seems to work as expected. cheers
hi
I am trying to create a report for a duplex analysis (guppy_basecaller_duplex) and I got this error message.
toulligqc --report-name QC_duplex \
--barcoding \
--telemetry-source duplex/sequencing_telemetry.js \
--sequencing-summary-source duplex/sequencing_summary.txt \
--html-report-path duplex/QC_duplex.html \
--barcodes barcode01
duplex/QC_duplex.html
ToulligQC version 2.2.1
* Initialize extractors
* Start Toulligqc info extractor
* End of Toulligqc info extractor (done in 0m0.00s)
* Start Sequencing telemetry extractor
* End of Sequencing telemetry extractor (done in 0m0.00s)
* Start Basecaller sequencing summary extractor
- Load sequencing summary file (0.04 MB used) in 0m0.06s
Traceback (most recent call last):
File "/home/minion/miniconda3/envs/nano/bin/toulligqc", line 10, in <module>
sys.exit(main())
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/toulligqc/toulligqc.py", line 343, in main
extractor.extract(result_dict)
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/toulligqc/sequencing_summary_extractor.py", line 234, in extract
extract_barcode_info(self, result_dict,
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/toulligqc/sequencing_summary_common.py", line 144, in extract_barcode_info
dataframe_dict["read.fail.barcoded"] = _barcode_frequency(extractor, barcode_selection, result_dict,
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/toulligqc/sequencing_summary_common.py", line 273, in _barcode_frequency
set_result_value(extractor, result_dict, entry + '.count', sum(count_sorted.drop("unclassified")))
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/pandas/core/series.py", line 4771, in drop
return super().drop(
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/pandas/core/generic.py", line 4267, in drop
obj = obj._drop_axis(labels, axis, level=level, errors=errors)
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/pandas/core/generic.py", line 4311, in _drop_axis
new_axis = axis.drop(labels, errors=errors)
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 6644, in drop
raise KeyError(f"{list(labels[mask])} not found in axis")
KeyError: "['unclassified'] not found in axis"
I can send you the two guppy basecaller output files if necessary.
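The failing call is `count_sorted.drop("unclassified")`, which raises when no read carries the `unclassified` label, for example when every failed read was assigned a barcode. pandas' `drop` accepts `errors="ignore"` for that situation. The plain-Python sketch below shows the same count computed without assuming the label is present; the function name and the sample data are illustrative only:

```python
from collections import Counter


def count_classified(barcodes):
    """Count reads whose barcode is anything other than 'unclassified'."""
    counts = Counter(barcodes)
    # pop with a default never raises, unlike dropping a missing label.
    counts.pop("unclassified", None)
    return sum(counts.values())


print(count_classified(["barcode01", "barcode01", "unclassified"]))
print(count_classified(["barcode01", "barcode01"]))
```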
PS: I also have a request. Would it be possible to specify a barcode range for the --barcodes argument? I use a lot of barcodes and the command quickly becomes very long: --barcodes barcode01,barcode02, ... barcode48
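For versions without a range syntax, the long list can be generated rather than typed. A small sketch, assuming barcode names are a fixed prefix followed by a zero-padded number; `expand_barcode_range` and the `first:last` notation it parses are hypothetical and not taken from toulligqc:

```python
import re


def expand_barcode_range(spec):
    """Expand 'barcode01:barcode48' into a comma-separated barcode list."""
    first, last = spec.split(":")
    m_first = re.fullmatch(r"(\D+)(\d+)", first)
    m_last = re.fullmatch(r"(\D+)(\d+)", last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError(f"cannot expand barcode range: {spec!r}")
    prefix = m_first.group(1)
    width = len(m_first.group(2))  # keep the zero padding of the first name
    start, stop = int(m_first.group(2)), int(m_last.group(2))
    return ",".join(f"{prefix}{n:0{width}d}" for n in range(start, stop + 1))


print(expand_barcode_range("barcode01:barcode04"))
# barcode01,barcode02,barcode03,barcode04
```

The resulting string can then be passed to --barcodes as usual.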
Hi there,
Thanks for a very interesting software package. If you're still actively developing it, would you consider making it available via bioconda?
Thanks,
Miika
Hi,
after installation of the latest version of ToulligQC (from PyPi) images are not included in the HTML report.
The command:
$ toulligqc -a guppy_out_5.0.11/sequencing_summary.txt -o test.html
test.html
ToulligQC version 2.2
* Initialize extractors
* Start Toulligqc info extractor
* End of Toulligqc info extractor (done in 0m0.00s)
* Start Basecaller sequencing summary extractor
- Load sequencing summary file (54.05 MB used) in 0m3.21s
- Extract info from sequencing summary file in 0m10.38s
- Creation of image "Read count histogram" in 0m0.28s
- Creation of image "Distribution of read lengths" in 0m4.99s
- Creation of image "Yield plot through time" in 0m2.62s
- Creation of image "PHRED score distribution" in 0m4.65s
- Creation of image "PHRED score density distribution" in 0m1.03s
- Creation of image "Channel occupancy of the flowcell" in 0m1.11s
- Creation of image "Correlation between read length and PHRED score" in 0m1.24s
- Creation of image "Read length over time" in 0m3.15s
- Creation of image "PHRED score over time" in 0m3.60s
- Creation of image "Translocation speed" in 0m3.69s
* End of Basecaller sequencing summary extractor (done in 0m39.96s)
* Write HTML report
* Write statistics files
* End of the QC extractor (done in 0m39.98s)
produces an HTML file with only Run statistics and Device and software parts.
When run with the --images-directory option, HTML files with graphs are correctly produced in the specified directory but are not shown in the HTML report, so there is no problem with graphs production.
When running the previous version (2.1.1) on the same input everything is correct.
Do you have any solution?
System: Ubuntu 20.04, kernel 5.4.0-47
Python: 3.8.5
plotly: 5.5.0
matplotlib: 3.4.3
numpy: 1.19.2
I'm trying to run toulligqc to generate statistics for a run, but I can't persuade it to read a directory of fast5 files, let alone the standard sub-directory hierarchy produced by albacore. When I specify a directory containing all the fast5 files from a run, toulligqc fails with:
ToulligQC version 0.5
* Initialize extractors
fast5_directory
* Start FAST5 extractor
Traceback (most recent call last):
File "/cluster/gjb_lab/nschurch/cluster_installs/miniconda2/envs/toulligQC/bin/toulligqc", line 11, in <module>
sys.exit(main())
File "/cluster/gjb_lab/nschurch/cluster_installs/miniconda2/envs/toulligQC/lib/python3.6/site-packages/toulligqc/toulligqc.py", line 252, in main
extractor.extract(result_dict)
File "/cluster/gjb_lab/nschurch/cluster_installs/miniconda2/envs/toulligQC/lib/python3.6/site-packages/toulligqc/fast5_extractor.py", line 87, in extract
result_dict['flow_cell_id'] = self._get_fast5_items(h5py_file,'flow_cell_id')
File "/cluster/gjb_lab/nschurch/cluster_installs/miniconda2/envs/toulligQC/lib/python3.6/site-packages/toulligqc/fast5_extractor.py", line 192, in _get_fast5_items
tracking_id_items = list(h5py_file["/UniqueGlobalKey/tracking_id"].attrs.items())
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/cluster/gjb_lab/nschurch/cluster_installs/miniconda2/envs/toulligQC/lib/python3.6/site-packages/h5py/_hl/group.py", line 167, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: 'Unable to open object (component not found)'
Modifying the extractor Python script in the _read_fast5(self): method to see what is going on reveals that the directory extension is being set correctly. Printing the values from the glob lines below reveals:
elif self.fast5_file_extension == 'fast5_directory':
if glob.glob(self.fast5_source+self.run_name+'/*.fast5'):
self.fast5_file = self.fast5_source+self.run_name+'.fast5'
glob.glob: datadir/allfast5/*.fast5
self.fast5_file: datadir/allfast5.fast5
Where allfast5 is the run name, datadir is the input path specified with --fast5-source, and datadir/allfast5 contains all the *.fast5 files.
Should the extractor be looping over all the fast5 files in the glob?
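The printed values show the mismatch: the glob matches `datadir/allfast5/*.fast5`, but the path that is then opened is `datadir/allfast5.fast5`, which is not one of the matched files. A minimal sketch of picking an actual file from the match list, assuming a single representative FAST5 file is enough for the metadata; `pick_fast5` is a hypothetical helper, not the extractor's code:

```python
import glob
import os


def pick_fast5(fast5_source, run_name):
    """Return one FAST5 file from <fast5_source>/<run_name>/, or None."""
    pattern = os.path.join(fast5_source, run_name, "*.fast5")
    matches = sorted(glob.glob(pattern))
    # Use a file the glob really matched instead of building a new path.
    return matches[0] if matches else None
```

Sorting the matches only makes the choice reproducible between runs; any matched file would do if they all carry the same tracking metadata.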
Hej,
Thanks for this really pretty tool! I encountered a small issue when using it with dorado's output:
a) When demultiplexing with dorado demux, toulligQC (v2.6) does not produce the barcode plots. This is because sequencing_summary_extractor.py recognizes the column "barcode_arrangement", but dorado's column name is only "barcode".
I used dorado v0.5.1 and provided the barcodes via sample sheet and demultiplexing kit (dorado basecaller ${basecallmodel} pod5/ --sample-sheet ${sampleSheet} $demuxKit ). When I renamed the column in my sequencing_summary.txt to "barcode_arrangement", toulligQC produced the plots properly.
b) Is it possible to skip or modify the barcode-name check in toulligqc.py (ctrl+F: "Get barcode selection")? I have our lab's (integer) sample IDs as barcode names in the sequencing_summary and would have to add "BC" or "barcode" as a prefix so that I don't get the error "ERROR: No known barcode found in provided list of barcodes". In dorado, you can now also specify your own custom barcode sets with different names. It would be cool if toulligQC supported that!
Thanks a ton and have a great week!
Philipp
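The workaround described in (a), renaming the `barcode` column to `barcode_arrangement`, can be scripted. A sketch that rewrites only the header line of a tab-separated summary; the function name is illustrative, and it assumes the header is the first line of the file:

```python
import io


def rename_barcode_column(src, dst, old="barcode", new="barcode_arrangement"):
    """Copy a tab-separated summary from src to dst, renaming one header field."""
    header = src.readline().rstrip("\r\n").split("\t")
    # Rename exact matches only, so columns like 'barcode_kit' stay untouched.
    header = [new if name == old else name for name in header]
    dst.write("\t".join(header) + "\n")
    for line in src:  # data rows are copied unchanged
        dst.write(line)


# Small in-memory example; real use would pass two opened files instead.
src = io.StringIO("read_id\tbarcode\tbarcode_kit\nr1\t12\tNB\n")
dst = io.StringIO()
rename_barcode_column(src, dst)
print(dst.getvalue().splitlines()[0])
```

Writing to a new file rather than editing in place keeps the original dorado output available for comparison.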
When running toulligQC, I'm getting the following error:
ImportError: C extension: umpy.core.multiarray failed to import not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace --force' to build the C extensions first.
Running setup.py with --inplace --force does not solve it.
Hi,
first of all: Thank you for this tool!
I have some questions about the output and the usage of ToulligQC.
I'm running version 2.2.
At the end of the Readme.md you mention that the output should look like this:
RUN_ID
├── report.html
├── report.data
└── images
    └── plots.html
    └── plot.png
For me it looks like this after this program call:
toulligqc --report-name <ReportName> \
--telemetry-source /Path/To/sequencing_telemetry.js \
--sequencing-summary-source /PAth/To/sequencing_summary.txt \
--output-directory /Path/To/OutputDirectory/
RUN_ID
├── report.html
├── report.data
└── images
    └── plots.html
Am I missing an option, or do I misunderstand your output tree?
toulligqc --report-name <ReportName> \
--telemetry-source /Path/To/sequencing_telemetry.js \
--sequencing-summary-source /Path/To/sequencing_summary.txt \
--sequencing-summary-source /Path/To/barcoding_summary_pass.txt \
--sequencing-summary-source /Path/To/barcoding_summary_fail.txt \
--barcodes BP01,BP02,BP03,BP04,BP05,BP06,BP07,BP08,BP09,BP10,BP11,BP12 \
--output-directory /Path/To/OutputDirectory/
and got ERROR: No known barcode found in provided list of barcodes
Thank you in advance.
Good afternoon,
I'm opening this issue because I have two problems running toulligqc on two different computers:
The first one is:
Traceback (most recent call last):
File "/usr/local/bin/toulligqc", line 11, in <module>
load_entry_point('toulligqc==0.5', 'console_scripts', 'toulligqc')()
File "/usr/local/lib/python3.5/dist-packages/toulligqc/toulligqc.py", line 212, in main
check_conf(config_dictionary)
File "/usr/local/lib/python3.5/dist-packages/toulligqc/toulligqc.py", line 137, in check_conf
config_dictionary['result_directory'] = config_dictionary['result_directory'] + '/' + config_dictionary['run_name'] + '/'
TypeError: Can't convert 'NoneType' object to str implicitly
And the second is this one:
RuntimeError: module compiled against API version 0xb but this version of numpy is 0xa
Traceback (most recent call last):
File "/home/bio/miniconda3/lib/python3.6/site-packages/pandas-0.19.2-py3.6-linux-x86_64.egg/pandas/__init__.py", line 25, in <module>
from pandas import hashtable, tslib, lib
ImportError: numpy.core.multiarray failed to import
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/bio/miniconda3/bin/toulligqc", line 11, in <module>
load_entry_point('toulligqc===-0.1a1-', 'console_scripts', 'toulligqc')()
File "/home/bio/miniconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/pkg_resources/__init__.py", line 565, in load_entry_point
File "/home/bio/miniconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/pkg_resources/__init__.py", line 2598, in load_entry_point
File "/home/bio/miniconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/pkg_resources/__init__.py", line 2258, in load
File "/home/bio/miniconda3/lib/python3.6/site-packages/setuptools-27.2.0-py3.6.egg/pkg_resources/__init__.py", line 2264, in resolve
File "/home/bio/miniconda3/lib/python3.6/site-packages/toulligqc-_0.1a1_-py3.6.egg/toulligqc/toulligqc.py", line 36, in <module>
from toulligqc import fastq_extractor
File "/home/bio/miniconda3/lib/python3.6/site-packages/toulligqc-_0.1a1_-py3.6.egg/toulligqc/fastq_extractor.py", line 28, in <module>
import pandas as pd
File "/home/bio/miniconda3/lib/python3.6/site-packages/pandas-0.19.2-py3.6-linux-x86_64.egg/pandas/__init__.py", line 31, in <module>
"the C extensions first.".format(module))
ImportError: C extension: umpy.core.multiarray failed to import not built. If you want to import pandas from the source directory, you may need to run 'python setup.py build_ext --inplace --force' to build the C extensions first.
Trying to solve it, I updated all the libraries, updated Python, and removed and reinstalled the libraries, but the problems still occur.
Do you know what is happening? Is it something in the command line, the Python version, or the libraries?
Honestly, I don't know what the problem is.
Could you help me please,
Thanks in advance.
Luis Alfonso
I got this error with the latest version of toulligqc (2.5):
folder=ONV-15-2023_Detection
toulligqc --report-name $folder --barcoding \
--telemetry-source $folder/fastq/sequencing_telemetry.js \
--sequencing-summary-source $folder/fastq/sequencing_summary.txt \
--html-report-path $folder/QC_report.html \
--barcodes barcode01:barcode24
ONV-15-2023_Detection/QC_report.html
ToulligQC version 2.5
* Initialize extractors
* Start Toulligqc info extractor
* End of Toulligqc info extractor (done in 0m0.00s)
* Start Sequencing telemetry extractor
* End of Sequencing telemetry extractor (done in 0m0.00s)
* Start Basecaller sequencing summary extractor
Traceback (most recent call last):
File "/home/joel/miniforge3/envs/nano/bin/toulligqc", line 10, in <module>
sys.exit(main())
File "/home/joel/miniforge3/envs/nano/lib/python3.10/site-packages/toulligqc/toulligqc.py", line 388, in main
extractor.init()
File "/home/joel/miniforge3/envs/nano/lib/python3.10/site-packages/toulligqc/sequencing_summary_extractor.py", line 117, in init
self.dataframe_1d['barcode_arrangement'].cat.add_categories([0, 'other barcodes', 'passes_filtering'],
File "/home/joel/miniforge3/envs/nano/lib/python3.10/site-packages/pandas/core/accessor.py", line 112, in f
return self._delegate_method(name, *args, **kwargs)
File "/home/joel/miniforge3/envs/nano/lib/python3.10/site-packages/pandas/core/arrays/categorical.py", line 2893, in _delegate_method
res = method(*args, **kwargs)
TypeError: Categorical.add_categories() got an unexpected keyword argument 'inplace'
I will soon attach a link with the telemetry files (they are a bit big).
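As far as I know, the `inplace` argument of `Categorical.add_categories` was deprecated in pandas 1.3 and removed in pandas 2.0, which would match this `TypeError`. A sketch of the replacement pattern, assigning the result back; the sample series is made up, and this is not a patch against toulligqc itself:

```python
import pandas as pd

barcodes = pd.Series(["barcode01", "barcode02", "unclassified"], dtype="category")

# add_categories returns a new categorical; assign it back instead of inplace=True.
barcodes = barcodes.cat.add_categories(["other barcodes"])

print(list(barcodes.cat.categories))
```

Pinning pandas below 2.0 in the environment would presumably be another way around the error until the call is changed.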
Hi GenomiqueENS/toulligQC team.
I tried to run the provided sample data with toulligQC but I get the following error:
❯ toulligqc \
--report-name toulligqc_demo_aata \
--barcoding \
--barcodes BC01,BC02,BC03,BC04,BC05,B0C7 \
--telemetry-source sequencing_telemetry.js \
--sequencing-summary-source sequencing_summary.txt \
--sequencing-summary-source barcoding_summary_pass.txt \
--sequencing-summary-source barcoding_summary_fail.txt \
--output-directory output
output/toulligqc_demo_aata/report.html
ToulligQC version 2.2.1
I don't understand what is happening.
Could you help me solve this problem?
Thanks
Hi, a colleague has shown me reports generated with toulligQC that would be great for me. However, I am just a wet lab person! The output from the Nanopore Mk1c does not include a 'sequencing_telemetry' file. I see from the GitHub page that a fast5 file would be a fine substitute. Assuming this is true and the fastest way for me to get a report is to add a fast5 file, how do I isolate a fast5 file from the package of 4000 that is generated by the MinION? (I think the default size of fast5 folders is 4000 reads.)
Thanks,
Nick H.
Hi,
I'm running toulligqc on my Ubuntu system. I installed it as a Python package with pip3 in an environment.
When I run it I have the following problem:
/home/grid/programas/ToulligQC/bin/toulligqc --report-name test --fast5-source /data/test/20220404_1721_X5_FAS58594_1dd0a346/fast5_pass/ --sequencing-summary-source /data/test/20220404_1721_X5_FAS58594_1dd0a346/sequencing_summary_FAS58594_6132143a.txt --barcoding -l BC01,BC04,BC05,BC06,BC07,BC08,BC09,BC10,BC11,BC12 --html-report-path test.QC.html
test.QC.html
ToulligQC version 2.2.2
Could you please help me with this?
Thank you!
It seems toulligQC will not work with multiple summary or telemetry files, like those produced when basecalling with guppy_supervisor. I tried to concatenate the files, but it gives the following error for the telemetry file:
json.decoder.JSONDecodeError: Extra data: line 252161 column 2 (char 5928761)
any help appreciated
Mustafa
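The `Extra data` error is what `json` reports when a second JSON document follows the first, which is what plain concatenation of telemetry files produces. A sketch that reads such a file as a sequence of documents using `json.JSONDecoder.raw_decode`; the function name is illustrative, and how the resulting documents should be merged for toulligQC is left open:

```python
import json


def iter_json_documents(text):
    """Yield each JSON document from a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it first.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        document, pos = decoder.raw_decode(text, pos)
        yield document


concatenated = '[{"segment": 1}]\n[{"segment": 2}]\n'
print(list(iter_json_documents(concatenated)))
```

For the tab-separated summary files, the analogous issue is the repeated header line, which would need to be dropped from every file but the first when concatenating.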