1. Clone the project
git clone https://github.com/liuchuwei/pum6a.git
2. Install the conda environment
conda create -n pum6a python=3.8
conda activate pum6a
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
pip install dask==2023.5.0 h5py==3.10.0 numpy==1.24.3 pandas==2.0.3 scikit-learn==1.3.2 tqdm==4.66.1 toml==0.10.2 statsmodels==0.14.1
Using pum6a for m6A detection requires the tombo environment.
You can install the MINES, m6Anet, ELIGOS, Nanom6A, and Epinano environments as needed. Usage examples can be found in the m6a_detection directory.
3. Prepare the toolkit: check and modify the tool paths in the tookit.py file (in the 'utils' directory).
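Before editing the paths, it can help to see which external tools are already reachable from the shell. The helper below is a minimal sketch, not part of pum6a: `check_tools` is a hypothetical function, and the tool names in the example call are placeholders rather than names taken from tookit.py.

```shell
#!/bin/sh
# check_tools (hypothetical helper): report whether each named executable
# can be found on PATH. Returns 0 if all are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call; these tool names are placeholders, not read from tookit.py:
# check_tools minimap2 samtools
```

Any tool reported as missing needs either an absolute path in tookit.py or an entry on PATH.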
Experiments with the pum6a framework on different positive and unlabeled bag datasets.
python run.py experiment --config $*.toml
For example: python run.py experiment --config log/Internet_pum6a_0.5Freq_88888.toml
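To run several experiment configs in a row, the command above can be wrapped in a loop. This is a sketch under assumptions: `run_experiments` and the `DRY_RUN` variable are hypothetical names introduced here, and the loop assumes every .toml file in the given directory is an experiment config.

```shell
#!/bin/sh
# run_experiments DIR (hypothetical helper): run `python run.py experiment`
# once per .toml file in DIR. Set DRY_RUN=1 to print the commands instead
# of executing them.
run_experiments() {
    config_dir=$1
    for cfg in "$config_dir"/*.toml; do
        # If the glob matched nothing, the literal pattern is left; skip it.
        [ -e "$cfg" ] || continue
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "python run.py experiment --config $cfg"
        else
            python run.py experiment --config "$cfg" || return 1
        fi
    done
}

# Example: preview the commands for every config under log/
# DRY_RUN=1 run_experiments log
```

The dry run is useful for checking which configs would be picked up before committing to a long batch.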
1. Basecalling
python process/01.basecalling.py -i $fast5 -o $out
2. Resquiggle
preprocess
conda activate tombo
python process/02.resquiggle_pre.py -f $fast5 -o $out
annotate_raw_with_fastqs
cat *.fastq > merge.fastq
python process/03.resquiggle.py preprocess annotate_raw_with_fastqs \
--fast5-basedir $single \
--fastq-filenames $merge_fastq \
--overwrite \
--processes 8
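One pitfall of `cat *.fastq > merge.fastq` is that a second run in the same directory also matches the earlier merge.fastq, so the merged file is fed back into itself. The sketch below avoids that by skipping the output file and writing through a temporary file. `merge_fastq` is a hypothetical helper, and the read count it prints assumes standard four-line FASTQ records.

```shell
#!/bin/sh
# merge_fastq SRC_DIR OUT_FILE (hypothetical helper): concatenate every
# .fastq file in SRC_DIR into OUT_FILE and print the number of reads,
# assuming four lines per FASTQ record.
merge_fastq() {
    src_dir=$1
    out_file=$2
    tmp_file="$out_file.tmp.$$"
    : > "$tmp_file"
    for fq in "$src_dir"/*.fastq; do
        [ -e "$fq" ] || continue
        # Skip a previous merge result so it is not merged into itself.
        # This is a plain string comparison, so OUT_FILE should be given
        # as SRC_DIR/name when it lives inside SRC_DIR.
        [ "$fq" = "$out_file" ] && continue
        cat "$fq" >> "$tmp_file"
    done
    mv "$tmp_file" "$out_file"
    lines=$(wc -l < "$out_file")
    echo $((lines / 4))
}

# Example: merge_fastq ./fastq_dir ./fastq_dir/merge.fastq
```

Comparing the printed read count with the basecaller's reported read count is a cheap sanity check before annotating the fast5 files.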
resquiggling
python process/03.resquiggle.py resquiggle $fast5 $reference \
--rna \
--corrected-group RawGenomeCorrected_000 \
--basecall-group Basecall_1D_000 \
--overwrite \
--processes 16 \
--fit-global-scale \
--include-event-stdev
3. Minimap
python process/04.minimap.py -i <directory of fastq files> -o <output directory> -r <path of reference>
4. m6A detection
4.1 activate environment
conda activate pum6a
4.2 preprocess
python run.py preprocess --single $single_fast5 -o $output -g $genome.fa -r $transcript.fa -i $gene2transcripts.txt -b $bam
4.3 train model
python run.py train --config $config.toml
4.4 predict
python run.py predict --config $config.toml
4.5 evaluate
python run.py evaluate --config $config.toml
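Steps 4.3 to 4.5 share the same config file, so they can be chained and stopped at the first failure. The following is a sketch only: `run_pipeline` and `DRY_RUN` are hypothetical names, and it assumes the three stages are meant to run back to back with one config.

```shell
#!/bin/sh
# run_pipeline CONFIG (hypothetical helper): run the train, predict and
# evaluate stages in order, stopping at the first stage that fails.
# Set DRY_RUN=1 to print the commands without running them.
run_pipeline() {
    config=$1
    for stage in train predict evaluate; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "python run.py $stage --config $config"
        else
            python run.py "$stage" --config "$config" || {
                echo "stage '$stage' failed" >&2
                return 1
            }
        fi
    done
}

# Example (with the pum6a environment active):
# run_pipeline "$config.toml"
```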
Distributed under the MIT License. See LICENSE for more information.