
autohic's Introduction

AutoHiC


AutoHiC is a deep-learning tool that uses Hi-C data to support genome assembly. It automatically corrects errors during genome assembly and generates chromosome-level genomes.

Author: Zijie Jiang

Email: [email protected]

Contents

Notes

  • Currently AutoHiC integrates 3d-dna into the complete pipeline. If you are using YaHS, SALSA, Pin_hic, etc., please read this document: Other tools

  • AutoHiC is updated frequently. If you have already cloned AutoHiC, please delete the AutoHiC folder and clone it again to get the latest version.

Overview of AutoHiC

Citations

If you use AutoHiC in your research, please cite us:

AutoHiC: a deep-learning method for automatic and accurate chromosome-level genome assembly
Zijie Jiang, Zhixiang Peng, Yongjiang Luo, Lingzi Bie, Yi Wang

bioRxiv 2023.08.27.555031; doi: https://doi.org/10.1101/2023.08.27.555031

Installation

conda

# clone AutoHiC
git clone https://github.com/Jwindler/AutoHiC.git

# cd AutoHiC
cd AutoHiC

# create AutoHiC env
conda env create -f autohic.yaml

# activate AutoHiC
conda activate autohic

# configure the environment
cd ./src/models/swin

# install dependencies
pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple/

Note:

  1. If src/straw.cpp:34:10: fatal error: curl/curl.h: No such file or directory is encountered during installation, run sudo apt-get install libcurl-dev libcurl4-openssl-dev libssl-dev in the terminal, or refer to: https://stackoverflow.com/questions/11471690/curl-h-no-such-file-or-directory.

  2. The installation steps above work for both GPU and CPU machines; the program automatically detects the available hardware and runs accordingly.

  3. If you want to use a GPU, please install CUDA 11.3 and cuDNN 8.2 beforehand.

Docker

# pull images
sudo docker pull jwindler/autohic:main

# start container
sudo docker run -it -v $(pwd):/home/autohic jwindler/autohic:main bash

# Use mounts (-v) to exchange files between the host filesystem (where your user can write) and the container filesystem. (Default: "./")

# clone AutoHiC
git clone https://github.com/Jwindler/AutoHiC.git

# cd AutoHiC
cd AutoHiC

# activate AutoHiC
conda activate autohic

# configure the environment
cd ./src/models/swin

# install dependencies
pip install -e . -i https://pypi.tuna.tsinghua.edu.cn/simple/

Singularity

Many users run AutoHiC on HPC systems, where building the dependency environment may be restricted and Docker requires root privileges, so we also provide a Singularity version. Detailed documentation: doc

Pre-trained model download

Please select the most convenient download link below. You need to download error_model.pth, chr_model.pth, Juicer, and 3d-dna for the subsequent configuration files.

  • Google Drive (recommended): Pre-trained model
  • Baidu Netdisk (百度网盘): Pre-trained model
  • Quark (夸克): Pre-trained model

Usages

Data Preparation

  • Contig level genome
  • Hi-C reads
  • directory structure (as below)
species_name/
├── rawdata
│   └── fastq
│       ├── SRR_X_R1.fastq.gz
│       ├── SRR_X_R2.fastq.gz
└── references
    └── contig.fasta

Notes:

  1. The directory structure must match the layout above.
  2. Paired-end files must end with X_R1.fastq.gz and X_R2.fastq.gz (uncompressed formats such as X_R1.fastq and X_R2.fastq are also supported).
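The naming rule above can be checked before launching a long run. A minimal stdlib sketch (check_fastq_pairs is a hypothetical helper for illustration, not part of AutoHiC):

```python
import re
from pathlib import Path

# Accepted endings per the naming rule above: X_R1.fastq[.gz] / X_R2.fastq[.gz]
PAIR_RE = re.compile(r"^(?P<prefix>.+)_R(?P<mate>[12])\.fastq(\.gz)?$")

def check_fastq_pairs(fastq_dir):
    """Return sorted prefixes of complete (R1, R2) pairs under fastq_dir;
    raise if any mate file is missing."""
    mates = {}
    for f in Path(fastq_dir).iterdir():
        m = PAIR_RE.match(f.name)
        if m:
            mates.setdefault(m.group("prefix"), set()).add(m.group("mate"))
    incomplete = [p for p, s in mates.items() if s != {"1", "2"}]
    if incomplete:
        raise ValueError(f"missing mate file for: {incomplete}")
    return sorted(mates)
```

Running it on species_name/rawdata/fastq should list one prefix per library.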

Configs

Copy and edit the configuration file cfg-autohic.txt in your local folder.

An example cfg-autohic.txt file is available in the AutoHiC directory.

To ensure AutoHiC runs properly, please do not add any additional lines to cfg-autohic.txt.

  • Setting the configuration file
Option Description
JOB_NAME Name of the job
AutoHiC_DIR Path to AutoHiC eg: /path_to/AutoHiC
RESULT_DIR Path to AutoHiC result
N_CPU Number of CPUs allowed per job Default: 10
SPECIES_NAME Name of the species
REFERENCE_GENOME Path to reference genome
JUICER_DIR Path to Juicer
FASTQ_DIR Path to HiC reads (Just path to the rawdata directory, not fastq folder)
ENZYME Restriction enzyme eg: "HindIII" or "MboI"
TD_DNA_DIR Path to 3d-dna
NUMBER_OF_EDIT_ROUNDS Specifies number of iterative rounds for misjoin correction Default: 2 Modification is not recommended.
ERROR_PRETRAINED_MODEL Path to error pretrained model eg: /path/AutoHiC/src/models/cfgs/error_model.pth
CHR_PRETRAINED_MODEL Path to chromosome pretrained model eg: /path/AutoHiC/src/models/cfgs/chr_model.pth
TRANSLOCATION_ADJUST Whether to adjust for translocation errors Default: True
INVERSION_ADJUST Whether to adjust for inversion errors Default: True
DEBRIS_ADJUST Whether to adjust for debris Default: False
ERROR_MIN_LEN Minimum error length Default: 15000
ERROR_MAX_LEN Maximum error length Default: 20000000
ERROR_FILTER_IOU_SCORE Overlapping error filtering threshold Default: 0.8 Modification is not recommended.
ERROR_FILTER_SCORE Error filtering threshold Default: 0.9 Modification is not recommended.

Notes:

  1. The ERROR_PRETRAINED_MODEL and CHR_PRETRAINED_MODEL parameters are the paths to the pre-trained models you downloaded earlier.
  2. The JUICER_DIR and TD_DNA_DIR parameters are the paths where you downloaded and decompressed Juicer and 3d-dna, respectively (if you have already installed them, you can point to the existing installations directly).
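The configuration file is line-oriented. Assuming the KEY=value layout shown in the example file (the exact format is defined by the cfg-autohic.txt shipped with AutoHiC), reading it into a dict can be sketched as:

```python
def read_cfg(path):
    """Parse a cfg-autohic.txt-style file into a dict.

    Assumes one KEY=value pair per line, as in the example file;
    blank lines and '#' comments are skipped.
    """
    cfg = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            cfg[key.strip()] = value.strip()
    return cfg

def as_bool(value):
    """Interpret flags such as TRANSLOCATION_ADJUST=True."""
    return value.strip().lower() in ("true", "1", "yes")
```

A quick sanity check of the boolean options with as_bool before launching a multi-hour run can save a restart.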

Run

# cd AutoHiC directory
# Please modify according to your installation directory
cd /home/AutoHiC  

# run 
nohup python3.9 autohic.py -c cfg-autohic.txt > log.txt 2>&1 &

# nohup: run the program immune to hangup signals

Notes:

  1. Please specify the absolute path of cfg-autohic.txt.
  2. It is recommended to specify a directory for log.txt; it records the running information of AutoHiC.
  3. Delete the nohup command if you don't want the program to run in the background.
  4. If you modify the configuration file and re-run AutoHiC, you must manually delete the previously generated result files.
  5. If a warning appears in the log while AutoHiC is running, this is normal; the program is working correctly and you just have to wait for the results.

Results

After AutoHiC finishes, the following results are produced.

species_name/
├── AutoHiC
│   ├── autohic_results
│   │   ├── 0
│   │   ├── 1
│   │   ├── 2
│   │   ├── 3
│   │   ├── 4
│   │   └── chromosome
│   ├── data
│   │   ├── reference
│   │   └── restriction_sites
│   ├── hic_results
│   │   ├── 3d-dna
│   │   └── juicer
│   ├── logs
│   │   ├── 3d-dna.log
│   │   ├── 3_epoch.log
│   │   ├── 4_epoch.log
│   │   ├── bwa_index.log
│   │   ├── chromosome_epoch.log
│   │   └── juicer.log
│   ├── quast_output
│   │   ├── chromosome
│   │   └── contig
│   ├── chromosome_autohic.fasta 
│   └── result.html
├── cfg-autohic.txt

The main output:

  1. A fasta file with the "_autohic" suffix containing the output scaffolds at the chromosome level. If the size of genome_autohic.fasta differs significantly from the input genome, it is recommended to use genome.FINAL.fasta (path: /path/AutoHiC/autohic_results/chromosome) for optimal results. If the scaffolding is poor during chromosome assignment, the model may have misidentified chromosomes; in that case, other scaffolding software can be used first, followed by AutoHiC to correct and assign the chromosomes. Please refer to Other tools.

  2. The result.html file, which provides detailed information on the genome before and after correction, where each error occurred, and heat maps of Hi-C interactions and chromosome lengths before and after.

  3. Please see this document for a detailed description of the results.
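To decide between chromosome_autohic.fasta and genome.FINAL.fasta (point 1 above), comparing the total length and N50 of the two files is a quick check. A minimal stdlib sketch (for illustration, not part of AutoHiC):

```python
def fasta_lengths(path):
    """Map each FASTA record name to its sequence length."""
    lengths, current = {}, None
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                current = line[1:].split()[0]
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths

def n50(lengths):
    """N50: length of the scaffold at which half the total assembly is reached."""
    total, running = sum(lengths), 0
    for size in sorted(lengths, reverse=True):
        running += size
        if running * 2 >= total:
            return size
    return 0
```

A large drop in total length or N50 relative to the input genome is the "differs significantly" situation described above.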

Example

If you want to run AutoHiC with sample data, you can choose from the following datasets.

data

Please follow the link for your selected species to download the data and organize it into the required format; refer to: Data Preparation.

Species Reference genome Hi-C Data
Halictus ligatus hl.fa SRR14251351
Lasioglossum leucozonium ll.fa SRR14251345
Schistosoma haematobium sh.fa SRR16086854
Arachis hypogaea peanut.fa SRR6796709; SRR6832914
  • Reference genome: sample genome files are available in the example_genome folder at the pre-trained model download link: Pre-trained model download
  • The default enzyme used for the example data is DpnII
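For reference, the recognition sequences behind the ENZYME option are short motifs (DpnII and MboI both cut at GATC, HindIII at AAGCTT). Locating them is a simple scan; the sketch below is for illustration only and is not how Juicer builds its restriction-site file:

```python
# Recognition sequences for the enzymes mentioned in this README.
ENZYME_SITES = {"DpnII": "GATC", "MboI": "GATC", "HindIII": "AAGCTT"}

def site_positions(sequence, enzyme):
    """0-based start positions of each recognition site in `sequence`."""
    seq, site = sequence.upper(), ENZYME_SITES[enzyme]
    positions, start = [], 0
    while True:
        idx = seq.find(site, start)
        if idx == -1:
            return positions
        positions.append(idx)
        start = idx + 1  # step past this hit so overlapping sites are found
```

This is why the ENZYME value must match the enzyme actually used in the Hi-C library preparation: a wrong motif yields a wrong site map.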

run

cd AutoHiC

nohup python3.9 autohic.py -c cfg-autohic.txt > log.txt 2>&1 &
  • Please modify the cfg-autohic.txt file according to your setup; refer to: Configs.

result

The main results of AutoHiC are the chromosome-level genome and assembly reports. For a detailed description of the results, please refer to Results. We have also uploaded the assembly report to Google Drive for users to retrieve and view.

Plot HiC interaction map

AutoHiC also provides a script to visualise the HiC interaction matrix separately.

python3.9 visualizer.py -hic example.hic

For detailed commands, please refer to the help documentation (--help)


One Step AutoHiC (optional)

If you have already run Juicer and 3d-dna, you can use the following extended script to have AutoHiC detect Hi-C assembly errors and generate an adjusted assembly file.

# Enter the AutoHiC directory (modify according to your installation path).
cd /home/ubuntu/AutoHiC

# run onehic
python3.9 onehic.py -hic test.hic -asy test.assembly -autohic /home/ubuntu/AutoHiC -p pretrained.pth -out ./

# run 3d-dna to get fasta
bash run-asm-pipeline-post-review.sh -r adjusted.assembly genome.fasta merged_nodups.txt 

# Please specify the absolute path of each file
# adjusted.assembly is output from onehic.py
# merged_nodups.txt is output from Juicer

Notes:

  1. The .hic and .assembly files can be obtained from the 3d-dna results.
  2. -autohic : the path to AutoHiC.
  3. -p : the path to the error pre-trained model you downloaded earlier.

example

If you want to run onehic.py with example data, please get the corresponding data from the example_onehic folder at the Pre-trained model download link above.

Species Hi-C File Assembly File
Mastacembelus armatus Mastacembelus.hic Mastacembelus.assembly
Arachis hypogaea peanut.hic peanut.assembly

Split chromosome (optional)

If your genome is very complex, the model may not assign chromosomes very accurately. In that case, it is recommended to import the final adjustment file into Juicebox and split the chromosomes manually.

The .hic and .assembly files you need to use can be obtained from the chromosome folder under the autohic_results directory.

License

AutoHiC Copyright (c) 2022 Wang lab. All rights reserved.

This software is distributed under the MIT License (MIT).

autohic's People

Contributors

jwindler


autohic's Issues

Error with GPU/CUDA

Hi, I get this error while trying to use the latest version of AutoHiC:

RuntimeError: nms is not compiled with GPU support

It seems to be related to mmcv/mmdet. I have tried every possible step to solve this error, but I just can't. Can you please help?

error

Traceback (most recent call last):
File "/home/lx_sky6/yt/soft/AutoHiC/onehic.py", line 14, in
import torch
File "/home/lx_sky6/yt/soft/miniconda3/envs/autohic/lib/python3.9/site-packages/torch/__init__.py", line 197, in
from torch._C import * # noqa: F403
ImportError: /home/lx_sky6/yt/soft/miniconda3/envs/autohic/lib/python3.9/site-packages/torch/lib/../../../../libgomp.so.1: version `GOMP_4.0' not found (required by /home/lx_sky6/yt/soft/miniconda3/envs/autohic/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)

Hello author, I have configured the environment and files and started running, but it keeps reporting the following error:

[2023-11-14 16:54:06] autohic.py -> whole line:33 [INFO] : AutoHiC start running ...

[2023-11-14 16:54:06] autohic.py -> whole line:35 [INFO] : Get configuration items

[2023-11-14 16:54:06] autohic.py -> whole line:54 [INFO] : Check if the GPU is available
[2023-11-14 16:54:06] autohic.py -> whole line:60 [INFO] : GPU is available, AutoHiC will run on GPU

[2023-11-14 16:54:06] autohic.py -> whole line:69 [INFO] : Fastq file number is correct.

[2023-11-14 16:54:06] autohic.py -> whole line:86 [INFO] : Check genome length whether > 80 base
[2023-11-14 16:54:06] autohic.py -> whole line:102 [INFO] : Genome len < 80 base

[2023-11-14 16:54:06] autohic.py -> whole line:105 [INFO] : Stage 1: Run Juicer and 3d-dna
[2023-11-14 17:04:33] autohic.py -> whole line:109 [INFO] : Run Juicer and 3d-dna finished

[2023-11-14 17:04:33] autohic.py -> whole line:112 [INFO] : Stage 2: Select the min error number of hic file

[2023-11-14 17:04:33] autohic.py -> whole line:127 [INFO] : Check the 0 file
[2023-11-14 17:04:33] mul_gen_png.py -> mul_process line:49 [INFO] : Multiple Process Initiating ...

[2023-11-14 17:04:33] hic_adv_model.py -> init line:30 [INFO] : Base Model Initiating

[2023-11-14 17:04:33] hic_adv_model.py -> init line:40 [INFO] : Create genome folder: /home/cyz/soft/AutoHiC/result/AutoHiC-test/autohic_results/0/png

File /home/cyz/soft/AutoHiC/result/AutoHiC-test/hic_results/3d-dna/ll.0.hic cannot be opened for reading

Files in chromosome directory

Thank you for this great tool!
I would like to know about the files in the autohic_results/chromosome directory.
What do the files xxx_final.fasta and xxx_FINAL.fasta mean?

Utilizing Omni-C reads

Hi, thank you for developing this tool. I am interested in giving it a try and would like to know if you've tested it with Omni-C reads.

Best,

Sadii

File XXXX/autohic_results/3/XXXX.final.hic cannot be opened for reading

Hello,

I am using AutoHiC through docker.
I encountered an issue that looks like the one described in #5.

[2024-03-26  16:04:00] hic_adv_model.py -> __init__ line:40 [INFO] : Create genome folder: /home/autohic/<PATH>/result/AutoHiC/autohic_results/3/png

File /home/autohic/<PATH>/result/AutoHiC/autohic_results/3/Pfai_hap1.final.hic cannot be opened for reading

However, in that case it was resolved by writing the fasta contig file with 80 bp lines, and mine is already in that format. I can't find any other difference between my format and that of the example dataset.

the command was:

nohup python3.9 autohic.py -c /home/autohic/cfg-autohic_Pfai.txt > /home/autohic/log4.txt 2>&1

Here is the log file and config file
cfg-autohic_Pfai.txt
log4.txt

I thank you in advance for any suggestion regarding the origin of the issue.

Best regards

Add Options for Pore-C Data

Dear author,
Would it be possible to add an option specifically for Pore-C data, which could allow for a more accurate alignment of chromosomes?

Docker Container doesn't contain the code.

In the docker image, we have to git clone the code and activate the conda environment. The point of a Docker container is to have everything included so we can just run the program. Could you git clone the code into the image?

For us on an HPC cluster, users cannot edit Docker containers. We have to convert it to a Singularity container anyway to make it work for us.

Thanks.

Failure during chimera handling

Hi

I was able to install the conda version of AutoHiC correctly. However, the program crashed after the Juicer step. These are the last few lines of the Juicer log file. Please let me know if you need me to send you the entire log file as well.

[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -SP5M -t 40 /data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/results_rhinolophus/AutoHiC_rhinolophus/data/reference/rename_genome.fasta /data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/results_rhinolophus/AutoHiC_rhinolophus/hic_results/juicer/rename_genome/splits/clean_hic_R1.fastq.gz /data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/results_rhinolophus/AutoHiC_rhinolophus/hic_results/juicer/rename_genome/splits/clean_hic_R2.fastq.gz
[main] Real time: 31742.853 sec; CPU: 411122.844 sec
(-: Align of /data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/results_rhinolophus/AutoHiC_rhinolophus/hic_results/juicer/rename_genome/splits/clean_hic.fastq.gz.sam done successfully
awk: /data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/juicer/scripts/common/chimeric_blacklist.awk: line 671: function and never defined
awk: /data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/juicer/scripts/common/chimeric_blacklist.awk: line 671: function and never defined
awk: /data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/juicer/scripts/common/chimeric_blacklist.awk: line 671: function and never defined
awk: /data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/juicer/scripts/common/chimeric_blacklist.awk: line 671: function and never defined
awk: /data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/juicer/scripts/common/chimeric_blacklist.awk: line 671: function and never defined
***! Failure during chimera handling of /data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/results_rhinolophus/AutoHiC_rhinolophus/hic_results/juicer/rename_genome/splits/clean_hic.fastq.gz

Thank you
Best
Kritika

Would you mind giving a detailed, step-by-step pipeline for onehic.py?

Hi.

Thank you very much for developing the software. I generated the bam file using chromap, but I get an error when using matlock. Here is my command line. Could you give a detailed tutorial on using onehic? I got results with the example data but had many problems with my own data. Thank you very much.

##index
samtools faidx hifi_ONT_hifiasm.hap1_final_assembly.fa
chromap -i -r hifi_ONT_hifiasm.hap1_final_assembly.fa -o hap1.index

##mapping
chromap --preset hic -r hifi_ONT_hifiasm.hap1_final_assembly.fa -x hap1.index --remove-pcr-duplicates -1 P32_R1.clean.fq.gz -2 P32_R2.clean.fq.gz --SAM -o hap1.sam -t 50

##sort
samtools view -bh hap1.sam | samtools sort -@ 50 -n > hap1.bam
rm hap1.sam 

##.assembly
python ~/biosoft/juicebox_scripts/juicebox_scripts/makeAgpFromFasta.py hifi_ONT_hifiasm.hap1_final_assembly.fa out.agp
python ~/biosoft/juicebox_scripts/juicebox_scripts/agp2assembly.py out.agp out.assembly

##.hic
conda activate morehic
matlock bam2 juicer hap1.bam out.links.txt 

INFO: converting bam to juicer on hap1.bam
INFO: detected bam filetype
INFO: reading file "hap1.bam"
FATAL: something went wrong in process_pair

AutoHiC seems to be making changes, but results look same.

Hi, I have tested AutoHiC at least 5 times in "onehic.py" mode, and every single time it seems to be making adjustments (as indicated by the infer_results folder), but as soon as I use the 3d-dna post-review script to generate a .hic map for the adjusted assembly, it looks exactly the same as what was supplied to AutoHiC. Am I missing something?

Before AutoHiC: (screenshot attached)

After AutoHiC: (screenshot attached)

{"Raw error number": {"translocation": 3, "inversion": 0, "debris": 573}, "Score filtered error number": {"translocation": 3, "inversion": 0, "debris": 561}, "Length filtered error number": {"translocation": 1, "inversion": 0, "debris": 559}, "Length removed error number": {"translocation": 2, "inversion": 0, "debris": 2}, "Overlap filtered error number": {"debris": 365, "translocation": 1}, "Chromosome real length filtered error number": {"translocation": {"normal": 0, "abnormal": 1}, "inversion": {"normal": 0, "abnormal": 0}, "debris": {"normal": 45, "abnormal": 320}}}

This is the error summary.

Can you please help me out here?

Thanks

issue with docker: [fputs] No space left on device ***! Alignment

Dear All

I am trying AutoHiC for my work. I am using the docker image for the analysis. However, I am getting the following error during the juicer step

[M::process] read 2666668 sequences (400000200 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 0, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] skip orientation FR as there are not enough pairs
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 2666668 reads in 1132.546 CPU sec, 28.731 real sec
[fputs] No space left on device
***! Alignment of /AutoHiC/results_rhinolophus/AutoHiC_rhinolophus/hic_results/juicer/rename_genome/splits/clean_hic_R1.fastq.gz /AutoHiC/results_rhinolophus/AutoHiC_rhinolophus/hic_results/juicer/rename_genome/splits/clean_hic_R2.fastq.gz failed.

There is ~3Tb space available, so I am not sure how to proceed further.

With the conda installation, I am getting the following error

Traceback (most recent call last):
File "/data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/AutoHiC/autohic.py", line 19, in
from src.assembly.adjust_all_error import adjust_all_error
File "/data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/AutoHiC/src/assembly/adjust_all_error.py", line 16, in
from src.assembly.cut_errors_ctg import cut_errors_ctg
File "/data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/AutoHiC/src/assembly/cut_errors_ctg.py", line 15, in
from src.utils.get_cfg import get_ratio
File "/data3/balaji/rhinolophus_raw_genome/scaffolding/5_auto_hic/AutoHiC/src/utils/get_cfg.py", line 17, in
import hicstraw

I tried installing hicstraw, but there is a compatibility issue with the other programs.

Please suggest a solution.

Thank you
Best regards
Kritika

How to resume operation after interruption.

Hi.
Thank you very much for developing the software.
When I ran autohic and it had finally generated the autohic_results directory, I suddenly got an error due to insufficient memory. The first steps of AutoHiC, Juicer and 3d-dna, had already completed, but when the computing node or CPU is changed, the run starts from scratch. I don't want to run Juicer again; it takes too much time.
I would like to ask how I can generate a complete autohic_results directory starting from the step where it was interrupted. Or which directory's .assembly file should I use to run onehic.py? The following is my log output. Thank you very much.


[2024-04-11  16:00:59] autohic.py -> whole line:33 [INFO] : AutoHiC start running ...

[2024-04-11  16:00:59] autohic.py -> whole line:35 [INFO] : Get configuration items

[2024-04-11  16:00:59] autohic.py -> whole line:55 [INFO] : Check if the GPU is available
[2024-04-11  16:00:59] autohic.py -> whole line:59 [INFO] : GPU is not available, AutoHiC will run on CPU

[2024-04-11  16:00:59] autohic.py -> whole line:70 [INFO] : Fastq file number is correct.

[2024-04-11  16:00:59] autohic.py -> whole line:87 [INFO] : Check genome length whether > 80 base
[2024-04-11  16:00:59] autohic.py -> whole line:103 [INFO] : Genome len < 80 base

[2024-04-11  16:00:59] autohic.py -> whole line:106 [INFO] : Stage 1: Run Juicer and  3d-dna
[2024-04-12  08:32:04] autohic.py -> whole line:110 [INFO] : Run Juicer and  3d-dna finished

[2024-04-12  08:32:04] autohic.py -> whole line:113 [INFO] : Stage 2: Select the min error number of  hic file

[2024-04-12  08:32:04] autohic.py -> whole line:128 [INFO] : Check the 0 file
[2024-04-12  08:32:04] mul_gen_png.py -> mul_process line:49 [INFO] : Multiple Process Initiating ...

[2024-04-12  08:32:04] hic_adv_model.py -> __init__ line:30 [INFO] : Base Model Initiating

[2024-04-12  08:32:04] hic_adv_model.py -> __init__ line:40 [INFO] : Create genome folder: /public/home/off/project/02_mugua/07.autohic/AutoHiC/autohic_results/0/png

[2024-04-12  08:32:04] mul_gen_png.py -> mul_process line:56 [INFO] : Number of processes is : 90

Genome each line length:  60
[2024-04-12  08:32:05] hic_adv_model.py -> get_chr_len line:63 [INFO] : Hic file sequence length is : 353045643

[2024-04-12  08:32:12] mul_gen_png.py -> mul_process line:120 [INFO] : Multiple process finished

[2024-04-12  08:32:12] get_cfg.py -> get_ratio line:49 [INFO] : Ratio(assembly length / hic length) is 1.0

[2024-04-12  08:32:12] get_cfg.py -> get_hic_real_len line:105 [INFO] : Hic file real len: 353045643

[2024-04-12  08:32:12] autohic.py -> whole line:142 [INFO] : Detect the 0 file
/public/home/off/mambaforge/envs/autohic/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /opt/conda/conda-bld/pytorch_1639180487213/work/aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/public/home/off/biosoft/AutoHiC/src/models/swin/mmdet/datasets/utils.py:64: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
  warnings.warn(
/public/home/off/biosoft/AutoHiC/src/models/swin/mmdet/models/roi_heads/bbox_heads/bbox_head.py:353: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at  /opt/conda/conda-bld/pytorch_1639180487213/work/torch/csrc/utils/tensor_new.cpp:201.)
  scale_factor = bboxes.new_tensor(scale_factor).unsqueeze(1).repeat(
[2024-04-12  09:24:53] error_pd.py -> de_diff_overlap line:436 [INFO] : Filter all error category Done
[2024-04-12  09:24:53] error_pd.py -> chr_len_filter line:479 [INFO] : KeyError: translocation not in errors_dict
[2024-04-12  09:24:53] error_pd.py -> chr_len_filter line:479 [INFO] : KeyError: inversion not in errors_dict
[2024-04-12  09:24:53] error_pd.py -> loci_zoom line:513 [INFO] : zoom threshold: 0
[2024-04-12  09:24:53] error_pd.py -> loci_zoom line:527 [INFO] : KeyError: translocation not in errors_dict
[2024-04-12  09:24:53] error_pd.py -> loci_zoom line:527 [INFO] : KeyError: inversion not in errors_dict
[2024-04-12  09:24:53] error_pd.py -> divide_error line:558 [INFO] : Divide all error category Done
[2024-04-12  09:24:53] error_pd.py -> json_vis line:722 [INFO] : Done loading json file.
[2024-04-12  09:24:54] autohic.py -> whole line:147 [INFO] : Detect the 0 file finished

[2024-04-12  09:24:54] autohic.py -> whole line:165 [INFO] : The 0 file done

[2024-04-12  09:24:54] autohic.py -> whole line:128 [INFO] : Check the 1 file
[2024-04-12  09:24:54] mul_gen_png.py -> mul_process line:49 [INFO] : Multiple Process Initiating ...

[2024-04-12  09:24:54] hic_adv_model.py -> __init__ line:30 [INFO] : Base Model Initiating

[2024-04-12  09:24:54] hic_adv_model.py -> __init__ line:40 [INFO] : Create genome folder: /public/home/off/project/02_mugua/07.autohic/AutoHiC/autohic_results/1/png

[2024-04-12  09:24:54] mul_gen_png.py -> mul_process line:56 [INFO] : Number of processes is : 90

img size:  (1116, 1116)
Traceback (most recent call last):

  File "/public/home/off/biosoft/AutoHiC/autohic.py", line 383, in <module>
    typer.run(whole)

  File "/public/home/off/biosoft/AutoHiC/autohic.py", line 134, in whole
    mul_process(hic_file_path, "png", adjust_path, "dia", int(cfg_data["N_CPU"]))

  File "/public/home/off/biosoft/AutoHiC/src/common/mul_gen_png.py", line 57, in mul_process
    pool = Pool(process_num)  # process number

  File "/public/home/off/mambaforge/envs/autohic/lib/python3.9/multiprocessing/context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,

  File "/public/home/off/mambaforge/envs/autohic/lib/python3.9/multiprocessing/pool.py", line 212, in __init__
    self._repopulate_pool()

  File "/public/home/off/mambaforge/envs/autohic/lib/python3.9/multiprocessing/pool.py", line 303, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,

  File "/public/home/off/mambaforge/envs/autohic/lib/python3.9/multiprocessing/pool.py", line 326, in _repopulate_pool_static
    w.start()

  File "/public/home/off/mambaforge/envs/autohic/lib/python3.9/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)

  File "/public/home/off/mambaforge/envs/autohic/lib/python3.9/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)

  File "/public/home/off/mambaforge/envs/autohic/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)

  File "/public/home/off/mambaforge/envs/autohic/lib/python3.9/multiprocessing/popen_fork.py", line 66, in _launch
    self.pid = os.fork()

OSError: [Errno 12] Cannot allocate memory

error report

HI,
log.txt
an error issue had happen :

File /home/genome/ass/chi/hic/01.X101SC23102206-Z01-J001_hic_clean/combine/chick/result/AutoHiC/hic_results/3d-dna/contig.0.hic cannot be opened for reading

can you help me fix it :?

fewer chromosomes

Hi Zijie Jiang

Thank you so much for your help. I was finally able to run AutoHiC. However, the genome was assembled into fewer chromosomes than expected: 21 in my case, while we expect 28 for this species. Also, the final Hi-C plot is not very good; I have attached it for your reference. Is it possible to provide the number of chromosomes as a parameter during the run?

chromosome

Thank you
Best regards
Kritika

Installation error (conda)

Hi, following the conda installation process, I got an error at "conda env create -f autohic.yaml". The error is as follows:
Executing transaction: | b'By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA): https://docs.nvidia.com/cuda/eula/index.html\n' done
Ran pip subprocess with arguments:
['/public3/home/scb9766/.conda/envs/autohic/bin/python', '-m', 'pip', 'install', '-U', '-r', '/public3/home/scb9766/soft/AutoHiC/condaenv.4p1kidx4.requirements.txt']
Pip subprocess output:
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting addict==2.4.0
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/6a/00/b08f23b7d7e1e14ce01419a467b583edbb93c6cdb8654e54a9cc579cd61f/addict-2.4.0-py3-none-any.whl (3.8 kB)
Collecting charset-normalizer==3.1.0
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/33/97/9967fb2d364a9da38557e4af323abcd58cc05bdd8f77e9fd5ae4882772cc/charset_normalizer-3.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (199 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 199.2/199.2 kB 2.3 MB/s eta 0:00:00
Collecting click==8.1.3
Using cached https://pypi.tuna.tsinghua.edu.cn/packages/c2/f1/df59e28c642d583f7dacffb1e0965d0e00b218e0186d7858ac5233dce840/click-8.1.3-py3-none-any.whl (96 kB)
Collecting colorama==0.4.6
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Collecting filelock==3.12.0
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ad/73/b094a662ae05cdc4ec95bc54e434e307986a5de5960166b8161b7c1373ee/filelock-3.12.0-py3-none-any.whl (10 kB)
Collecting fsspec==2023.5.0
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ec/4e/397b234a369df06ec782666fcdf9791d125ca6de48729814b381af8c6c03/fsspec-2023.5.0-py3-none-any.whl (160 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 160.1/160.1 kB 2.3 MB/s eta 0:00:00
Collecting hic-straw==1.3.1
Downloading https://pypi.tuna.tsinghua.edu.cn/packages/8e/ec/431c76970f8973ea5937a9b5f2d1689a641b3fe6475246a32451274fa2dd/hic-straw-1.3.1.tar.gz (18 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'error'

Pip subprocess error:
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [123 lines of output]
/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/setuptools/__init__.py:84: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
!!

          ********************************************************************************
          Requirements should be satisfied by a PEP 517 installer.
          If you are using pip, you can try `pip install --use-pep517`.
          ********************************************************************************

  !!
    dist.fetch_build_eggs(dist.setup_requires)
  ERROR: Exception:
  Traceback (most recent call last):
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 438, in _error_catcher
      yield
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 561, in read
      data = self._fp_read(amt) if not fp_closed else b""
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 527, in _fp_read
      return self._fp.read(amt) if amt is not None else self._fp.read()
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_vendor/cachecontrol/filewrapper.py", line 90, in read
      data = self.__fp.read(amt)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/http/client.py", line 463, in read
      n = self.readinto(b)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/http/client.py", line 507, in readinto
      n = self.fp.readinto(b)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/socket.py", line 704, in readinto
      return self._sock.recv_into(b)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/ssl.py", line 1242, in recv_into
      return self.read(nbytes, buffer)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/ssl.py", line 1100, in read
      return self._sslobj.read(len, buffer)
  socket.timeout: The read operation timed out

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/cli/base_command.py", line 160, in exc_logging_wrapper
      status = run_func(*args)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/cli/req_command.py", line 247, in wrapper
      return func(self, options, args)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/commands/wheel.py", line 170, in run
      requirement_set = resolver.resolve(reqs, check_supported_wheels=True)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 92, in resolve
      result = self._result = resolver.resolve(
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_vendor/resolvelib/resolvers.py", line 481, in resolve
      state = resolution.resolve(requirements, max_rounds=max_rounds)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_vendor/resolvelib/resolvers.py", line 348, in resolve
      self._add_to_criteria(self.state.criteria, r, parent=None)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_vendor/resolvelib/resolvers.py", line 172, in _add_to_criteria
      if not criterion.candidates:
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_vendor/resolvelib/structs.py", line 151, in __bool__
      return bool(self._sequence)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 155, in __bool__
      return any(self)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 143, in <genexpr>
      return (c for c in iterator if id(c) not in self._incompatible_ids)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 47, in _iter_built
      candidate = func()
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 206, in _make_candidate_from_link
      self._link_candidate_cache[link] = LinkCandidate(
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 297, in __init__
      super().__init__(
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 162, in __init__
      self.dist = self._prepare()
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 231, in _prepare
      dist = self._prepare_distribution()
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 308, in _prepare_distribution
      return preparer.prepare_linked_requirement(self._ireq, parallel_builds=True)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/operations/prepare.py", line 491, in prepare_linked_requirement
      return self._prepare_linked_requirement(req, parallel_builds)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/operations/prepare.py", line 536, in _prepare_linked_requirement
      local_file = unpack_url(
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/operations/prepare.py", line 166, in unpack_url
      file = get_http_url(
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/operations/prepare.py", line 107, in get_http_url
      from_path, content_type = download(link, temp_dir.path)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/network/download.py", line 147, in __call__
      for chunk in chunks:
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_internal/network/utils.py", line 63, in response_chunks
      for chunk in response.raw.stream(
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 622, in stream
      data = self.read(amt=amt, decode_content=decode_content)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 587, in read
      raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/contextlib.py", line 137, in __exit__
      self.gen.throw(typ, value, traceback)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pip/_vendor/urllib3/response.py", line 443, in _error_catcher
      raise ReadTimeoutError(self._pool, None, "Read timed out.")
  pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out.
  Traceback (most recent call last):
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/setuptools/installer.py", line 96, in _fetch_build_egg_no_warn
      subprocess.check_call(cmd)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/subprocess.py", line 373, in check_call
      raise CalledProcessError(retcode, cmd)
  subprocess.CalledProcessError: Command '['/public3/home/scb9766/.conda/envs/autohic/bin/python', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmp1ttb4jhq', '--quiet', 'pybind11>=2.4']' returned non-zero exit status 2.

  The above exception was the direct cause of the following exception:

  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/tmp/pip-install-xzo_gzll/hic-straw_ed8afd254b414e379963ae13223f25ac/setup.py", line 106, in <module>
      setup(
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/setuptools/__init__.py", line 106, in setup
      _install_setup_requires(attrs)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/setuptools/__init__.py", line 79, in _install_setup_requires
      _fetch_build_eggs(dist)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/setuptools/__init__.py", line 84, in _fetch_build_eggs
      dist.fetch_build_eggs(dist.setup_requires)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/setuptools/dist.py", line 917, in fetch_build_eggs
      return _fetch_build_eggs(self, requires)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/setuptools/installer.py", line 38, in _fetch_build_eggs
      resolved_dists = pkg_resources.working_set.resolve(
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pkg_resources/__init__.py", line 827, in resolve
      dist = self._resolve_dist(
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pkg_resources/__init__.py", line 863, in _resolve_dist
      dist = best[req.key] = env.best_match(
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pkg_resources/__init__.py", line 1133, in best_match
      return self.obtain(req, installer)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/pkg_resources/__init__.py", line 1145, in obtain
      return installer(requirement)
    File "/public3/home/scb9766/.conda/envs/autohic/lib/python3.9/site-packages/setuptools/installer.py", line 98, in _fetch_build_egg_no_warn
      raise DistutilsError(str(e)) from e
  distutils.errors.DistutilsError: Command '['/public3/home/scb9766/.conda/envs/autohic/bin/python', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmp1ttb4jhq', '--quiet', 'pybind11>=2.4']' returned non-zero exit status 2.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

CondaEnvException: Pip failed
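The traceback shows the hic-straw sdist's setup.py fetching its build dependency pybind11 at install time, and that download timing out against files.pythonhosted.org. A possible workaround (an assumption based on the log, not an official fix) is to pre-install pybind11 from the mirror and then retry the failing package with a longer timeout, following the `--use-pep517` hint printed above:

```shell
# Pre-install the build dependency that setup.py was trying to fetch,
# then retry hic-straw; --default-timeout gives slow connections headroom.
pip install "pybind11>=2.4" -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip install --use-pep517 --default-timeout=120 hic-straw==1.3.1 \
    -i https://pypi.tuna.tsinghua.edu.cn/simple/
```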

IndexError: cannot do a non-empty take from an empty axes.

Hello,
I was testing the genome of a diploid animal with AutoHiC, and it successfully ran Juicer and 3D-DNA. However, the program stopped during the second round of execution; the result files are as follows:

[/lustre/home/geneticstest/hdx/cercopithecus/assembly/Hic/autohic/hongwei/result/AutoHiC_hongwei/autohic_results/2] $ ls
chr_len_filtered_errors.json  debris_error.json   infer_result              len_filtered_errors.xlsx      overlap_filtered_errors.xlsx  score_filtered_errors.xlsx
chr_len_filtered_errors.xlsx  error_summary.json  inversion_error.json      len_remove_error.xlsx         overlap_remove_error.txt      zoomed_errors.json
chr_len_remove_error.txt      error_summary.xlsx  len_filtered_errors.json  overlap_filtered_errors.json  png
The error information in the log file is as follows:
locals (numpy quantile internals): arr = array([], dtype=float64), quantiles = array(0.95),
values_count = 0, virtual_indexes = array(-0.95)

/lustre/home/geneticstest/mambaforge-pypy3/envs/autohic/lib/python3.9/site-packages/numpy/core/fromnumeric.py:190 in take
❱ 190 │ return _wrapfunc(a, 'take', indices, axis=axis, out=out, mode=mode)
locals: a = array([], dtype=float64), indices = -1, axis = 0, mode = 'raise', out = None

/lustre/home/geneticstest/mambaforge-pypy3/envs/autohic/lib/python3.9/site-packages/numpy/core/fromnumeric.py:57 in _wrapfunc
❱ 57 │ │ return bound(*args, **kwds)
locals: bound = <built-in method take of numpy.ndarray object at 0x7fd499594f90>, args = (-1,), kwds = {'axis': 0, 'out': None, 'mode': 'raise'}

IndexError: cannot do a non-empty take from an empty axes.
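The locals in the log point at numpy's quantile machinery receiving an empty array (values_count = 0), presumably because no errors survived the score filtering in that round. A minimal reproduction of the same exception, outside AutoHiC (this is not AutoHiC's code, just an illustration of the failure mode):

```python
import numpy as np

# An empty score array (as in the locals above) makes np.quantile's
# internal `take` use index -1 on a zero-length axis, which raises.
scores = np.array([], dtype=np.float64)  # e.g. nothing passed the filters
try:
    np.quantile(scores, 0.95)
except IndexError as exc:
    print(exc)  # cannot do a non-empty take from an empty axes.
```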

Attached are the configuration file and the log file.
hongwei_autohic.txt
log.txt
Could you please help me figure out what might be causing this issue? Also, would using the .hic file from the hic_results/3d-dna results folder to continue running onehic.py make a difference compared with the final AutoHiC result?

Thank you very much for helping me solve this problem

Issues with running AutoHiC on two datasets

Hello,

I've been trying to run AutoHiC on two datasets from scratch. In both cases the bwa-mem, Juicer, and 3D-DNA steps seemed to finish correctly; at least they didn't report any critical errors. But at later stages both runs failed, with different errors.
The first dataset is for the insect Nudaria mundana. The error is:

File /lustre/scratch123/tol/teams/tola/users/kk16/autohic_data/ilNudMud1/result/AutoHiC_ilNudMud1/autohic_results/3/ilNudMud1.final.hic cannot be opened for reading

Indeed, there is no such file, but the folder contents are:

$ ls ilNudMud1/result/AutoHiC_ilNudMud1/autohic_results/3/
black_list.txt				       ilNudMud1_lines.final.assembly  ilNudMud1_lines.final.hic
ilNudMud1_lines.cprops			       ilNudMud1_lines.FINAL.assembly  ilNudMud1_lines.mnd.txt
ilNudMud1_lines.final.asm		       ilNudMud1_lines.final.cprops    png
ilNudMud1_lines.final_asm.scaffold_track.txt   ilNudMud1_lines.final.fasta     test.assembly
ilNudMud1_lines.final_asm.superscaf_track.txt  ilNudMud1_lines.FINAL.fasta

The other is a high-coverage dataset for the protist Eimeria maxima, where the error is:

│ AutoHiC/src/utils/get_chr_data.py:127 in hic_loci2txt                       │
│                                                                              │
│   124 │   │   chr_len_list_sorted[chr_index + 1][0] = chr_len_list_sorted[ch │
│   125 │                                                                      │
│   126 │   #                                                                  │
│ ❱ 127 │   chr_len_list_sorted[0][0] = 0                                      │
│   128 │   if hic_len is not None:                                            │
│   129 │   │   chr_len_list_sorted[-1][1] = hic_len                           │
│   130 │   else:                                                              │
│                                                                              │
│ ╭───────────────────────────────── locals ─────────────────────────────────╮ │
│ │            chr_dict = {}                                                 │ │
│ │        chr_len_list = []                                                 │ │
│ │ chr_len_list_sorted = []                                                 │ │
│ │             hic_len = None                                               │ │
│ │       redundant_len = 200000                                             │ │
│ │            txt_path = 'aut… │ │
│ ╰──────────────────────────────────────────────────────────────────────────╯ │
╰──────────────────────────────────────────────────────────────────────────────╯
IndexError: list index out of range
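The locals show chr_len_list_sorted = [] (and chr_dict = {}), i.e. hic_loci2txt found no chromosomes to write, so the assignment at line 127 indexes an empty list. A minimal reproduction (not AutoHiC code, just the failure mode implied by those locals):

```python
# hic_loci2txt builds chr_len_list_sorted from detected chromosomes; when
# detection yields nothing (chr_dict == {}), the list is empty and the
# first assignment fails exactly as in the traceback above.
chr_len_list_sorted = []
try:
    chr_len_list_sorted[0][0] = 0
except IndexError as exc:
    print(exc)  # list index out of range
```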

The log files are attached.

autohic_ilNudMud121106.txt
autohic_pxEimMaxi123763.txt

I wonder if it's possible to get some help with troubleshooting?

Many thanks!

Some prerequisites for successful operation

Hi creative developers,
I am very happy to try this latest scaffolding tool, but before I start I have a small question: for a 3.3 Gb hybrid plant genome, is 45x Hi-C data sufficient, and how much disk space does a run require (my hard disk is only 1 TB)?
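For the disk-space part of the question, a rough back-of-envelope calculation (the multipliers here are my own assumptions, not numbers from the AutoHiC authors): 45x of a 3.3 Gb genome is about 148 Gb of raw Hi-C bases, and FASTQ files plus Juicer/3D-DNA intermediates typically multiply that several-fold.

```python
# Back-of-envelope only; the 2x and 3x multipliers are assumptions,
# not AutoHiC requirements.
genome_gb = 3.3                      # genome size
coverage = 45                        # Hi-C coverage
raw_bases_gb = genome_gb * coverage  # ~148.5 Gb of raw sequence
fastq_gb = raw_bases_gb * 2          # FASTQ also stores quality strings
working_gb = fastq_gb * 3            # alignments + intermediate files (rough)
print(raw_bases_gb, fastq_gb, working_gb)
```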

Onehic.py: list index out of range

Hi, when I used onehic.py, I got the error "IndexError: list index out of range". I am unsure what the "list index" refers to.
Genome size: 3.3 Gb; Hi-C data size: 4.7 Gb.

"slurm-3620847.log" provides more details.
slurm-3620847.log
