bedtoolsr's Issues

Add License

As part of the ongoing JOSS review of the package (openjournals/joss-reviews#1742 (comment)), please add a license file. The DESCRIPTION file declares the MIT license, so please add a LICENSE file containing the information CRAN needs for the MIT license (just the year and copyright holder, as described at https://kbroman.org/pkg_primer/pages/licenses.html). Please also add a LICENSE.md file with the full text of the MIT license, with the year and copyright holder filled in, and make sure to add this second file to .Rbuildignore.

bedtoolsr::bt.multicov issue

Hi,

First and foremost, thank you for the R version of bedtools. I'm attempting to use bedtools multicov, but I receive the following error.

bedfile <- read.table("peaks_mod.bed", header = F, sep = "\t")
head(bedfile)
V1 V2 V3 V4
1 chr8 85554228 85555389 -
2 chr11 60877576 60879894 -
3 chr12 108793341 108795193 -
4 chr2 92432803 92434088 -
5 chr13 100124401 100125923 -
6 chr7 109519172 109521109 -

bamfile <- BamFile("bam_100k_mm10_chip_sorted.bam") # to check that the BAM file loads correctly, I ran:
indexBam(bamfile)
E14_K4_100k_mm10_chip_sorted.bam
"E14_K4_100k_mm10_chip_sorted.bam.bai"

cov = bedtoolsr::bt.multicov(bams = bamfile, bed = bedfile)
Error in as.data.frame.default(x[[i]], optional = TRUE) :
cannot coerce class ‘structure("BamFile", package = "Rsamtools")’ to a data.frame

Could you please tell me how to solve it? Thanks.

I have done the same thing with command-line bedtools without any issue.

bedtools multicov -bams bam_100k_mm10_chip_sorted.bam -bed peaks_mod.bed > coverage.bed

head coverage.bed
chr8 85554228 85555389 - 1691
chr11 60877576 60879894 - 2410
chr12 108793341 108795193 - 3334
chr2 92432803 92434088 - 1043
chr13 100124401 100125923 - 1479
chr7 109519172 109521109 - 2703
chr16 35489817 35491172 - 2748
chr10 117844804 117846650 - 2151
chr6 119175202 119176003 - 1173
chr6 113075180 113077265 - 1608
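
The error message indicates that the BamFile object is being coerced to a data frame. A minimal sketch of one possible workaround follows, assuming (not confirmed in this thread) that bt.multicov accepts plain file paths for `bams` and `bed`, as the command-line tool does. The file names are the ones from the report; the guard only runs the call when the files and the package are available.

```r
# File names taken from the report above; adjust to your own data.
bam_path <- "bam_100k_mm10_chip_sorted.bam"
bed_path <- "peaks_mod.bed"

# Assumption: bt.multicov takes character file paths, so the Rsamtools
# BamFile object is not passed at all.
can_run <- file.exists(bam_path) && file.exists(bed_path) &&
  requireNamespace("bedtoolsr", quietly = TRUE)

if (can_run) {
  cov <- bedtoolsr::bt.multicov(bams = bam_path, bed = bed_path)
  print(head(cov))
}
```

If the BAM file has only been opened through Rsamtools, passing the original path string rather than the BamFile object is the point of the sketch.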

bedtoolsr installation fails due to "sh: 1: bedtools: not found" error

Hi,

I am trying to run CNV-ClinViewer locally through RStudio Server on Ubuntu 20.04 LTS. While installing the dependencies, the installation of bedtoolsr fails because RStudio Server cannot find bedtools. However, bedtools v2.30.0 is installed on the system and is on the PATH.

I also tried to set the path to bedtools manually, like so: options(bedtools.path = "/home/analyst/anaconda3/bin/").

I tried the same thing on Windows WSL and faced the exact same problem.

That did not help either. Do you have any idea how to overcome this problem?
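
One thing worth checking is the PATH that the R session itself sees, since an RStudio Server session may not inherit the PATH set in an interactive shell's startup files. The sketch below is only a diagnostic under that assumption: it asks R where bedtools is and, if nothing is found, prepends the conda directory mentioned above to the session's PATH.

```r
# Directory from the report above; treat it as an example location.
conda_bin <- "/home/analyst/anaconda3/bin"

# Sys.which() returns "" when the executable is not on R's own PATH.
found <- Sys.which("bedtools")

if (!nzchar(found)) {
  # Prepend the directory for this R session only.
  Sys.setenv(PATH = paste(conda_bin, Sys.getenv("PATH"),
                          sep = .Platform$path.sep))
}

# Show what the session sees after the adjustment.
print(Sys.which("bedtools"))
```

Running this before install or library calls shows whether the problem is the session environment rather than the bedtools installation.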

s = TRUE not working for bt.flank

I am using the following code to add flanks upstream of sequences while taking strand into account:

>bedtoolsr::bt.flank(i = ApalmGFF3, g = sequenceLengths, l = 1000,  r = 0,  s = TRUE)
data frame with 0 columns and 0 rows

If I run the same code with s = FALSE, I get a non-empty data frame:

>bedtoolsr::bt.flank(i = ApalmGFF3, g = sequenceLengths, l = 1000,  r = 0,  s = FALSE)
                        V1     V2     V3 V4
1   Sc0a5M3_100_HRSCAF_183     77   1077  +
2   Sc0a5M3_100_HRSCAF_183   3505   4505  +
3   Sc0a5M3_100_HRSCAF_183  11824  12824  -

Is there a bug, similar to what was reported in this issue, preventing me from getting output when s = TRUE? Or is there something else at play on my end? I am using bedtoolsr_2.30.0-5.
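
One possibility, which this thread does not confirm, is that the strand is not where bedtools expects it. The four-column output above suggests the input carries the strand in column 4, whereas the BED format places strand in column 6. The sketch below uses made-up coordinates and hypothetical placeholder values for name and score to reshape such a table into six BED columns before calling a strand-aware tool.

```r
# Hypothetical 4-column input (chrom, start, end, strand).
regions <- data.frame(
  chrom  = c("Sc0a5M3_100_HRSCAF_183", "Sc0a5M3_100_HRSCAF_183"),
  start  = c(1077L, 12824L),
  end    = c(2077L, 13824L),
  strand = c("+", "-"),
  stringsAsFactors = FALSE
)

# BED6 layout: chrom, start, end, name, score, strand.
# "." and 0 are placeholders for the unused name and score fields.
bed6 <- data.frame(
  chrom  = regions$chrom,
  start  = regions$start,
  end    = regions$end,
  name   = ".",
  score  = 0L,
  strand = regions$strand,
  stringsAsFactors = FALSE
)
print(bed6)
```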

Problem with strandedness

Hello,

I am using bedtoolsr::bt.intersect() and I want to specify strandedness. However, whether I specify s=T or s=F, the number of output regions is always the same.

For example:

> bedtoolsr::bt.intersect(a = a, b = b, s = T, u = T) %>% nrow()
[1] 8092
> bedtoolsr::bt.intersect(a = a, b = b, S = T, u = T) %>% nrow()
[1] 1337
> bedtoolsr::bt.intersect(a = a, b = b, s = F, u = T) %>% nrow()
[1] 8092

Doing the same command with the command-line bedtools:

$ bedtools intersect -s -u -a a.tmp.bed -b b.tmp.bed | wc -l
8092
$ bedtools intersect -S -u -a a.tmp.bed -b b.tmp.bed | wc -l
1337
$ bedtools intersect -u -a a.tmp.bed -b b.tmp.bed | wc -l
8594

It looks like bedtoolsr::bt.intersect() always enforces same-strand overlaps unless I explicitly request opposite strandedness, whereas in theory s = F should return intersections regardless of strand.

My bedtools version is 2.29.2, and my bedtoolsr version is 2.30.0.1. When I load bedtoolsr, a warning says that it was developed under bedtools version 2.30. Could this be the source of my problem?

Thank you very much.

Adrià
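
The counts above are what one would see if the -s flag were added whenever `s` is supplied, whatever its value. The sketch below is not bedtoolsr's code; `build_flags` is a hypothetical helper that only illustrates the expected semantics, where a logical option becomes a command-line flag solely when it is TRUE.

```r
# Hypothetical helper: turn named logical arguments into CLI flags,
# emitting a flag only for arguments that are exactly TRUE.
build_flags <- function(...) {
  opts <- list(...)
  on <- vapply(opts, isTRUE, logical(1))
  if (!any(on)) {
    return("")
  }
  paste0("-", names(opts)[on], collapse = " ")
}

same_strand <- build_flags(s = TRUE, u = TRUE)
no_strand   <- build_flags(s = FALSE, u = TRUE)
print(c(same_strand, no_strand))
```

Until the behaviour is clarified, leaving `s` out of the call entirely, rather than passing s = F, may be worth trying as a workaround.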

Bug report (Failure due to the special characters)

Dear developers, bedtoolsr is extremely useful software for me, but I ran into a potential issue when using the package.
Here is an example of the issue.
`

A.bed <- data.frame(chrom=c("chr1", "chr1"), start=c(10, 30), end=c(20, 40),source=c("pubmed","pubmed"))
B.bed <- data.frame(chrom=c("chr1"), start=15, end=20)
#The result is fine
bedtoolsr::bt.intersect(A.bed, B.bed)
V1 V2 V3 V4
1 chr1 15 20 pubmed
`

However, when I add another column that contains the special character ', the problem appears.

`

A.bed <- data.frame(chrom=c("chr1", "chr1"), start=c(10, 30), end=c(20, 40), anno = c("5'UTR", "CDS"), source=c("pubmed","pubmed"))
B.bed <- data.frame(chrom=c("chr1"), start=15, end=20)
#the problem occurs
bedtoolsr::bt.intersect(A.bed, B.bed)
[1] V1 V2 V3 V4 V5
<0 rows> (or 0-length row.names)
Warning message:
In utils::read.table(tempfile, header = FALSE, sep = "\t") :
incomplete final line found by readTableHeader on '/tmp/RtmpHeeAXx/bedtoolsr15c554c9da9df.txt'
`

This problem may occur whenever the contents of the variables contain special characters that bedtoolsr does not handle specially.
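
The warning above shows the result being read with utils::read.table(tempfile, header = FALSE, sep = "\t"). By default read.table treats both " and ' as quote characters, so an apostrophe such as the one in 5'UTR can open a quoted field that never closes. The sketch below reproduces the read on a small temporary file (the file contents are an illustration, not the actual bedtoolsr temp file) and shows that passing quote = "" reads the apostrophe literally.

```r
# One tab-delimited record containing an apostrophe.
tmp <- tempfile(fileext = ".txt")
writeLines("chr1\t15\t20\t5'UTR\tpubmed", tmp)

# Default quoting: the apostrophe is treated as an opening quote.
# Wrapped so that a warning or error does not stop the script.
default_read <- tryCatch(
  suppressWarnings(utils::read.table(tmp, header = FALSE, sep = "\t")),
  error = function(e) NULL
)

# Quoting disabled: every character is taken literally.
literal_read <- utils::read.table(tmp, header = FALSE, sep = "\t",
                                  quote = "", stringsAsFactors = FALSE)
print(literal_read)
unlink(tmp)
```

Whether bedtoolsr should disable quoting in its own read step is a decision for the maintainers; the sketch only isolates the base R behaviour.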

Best regards,

Add Windows support for the Windows R environment through WSL

Dear,
Thanks for developing bedtoolsr; it's really useful. According to Microsoft's documentation, it is actually feasible to call bedtools in WSL directly from Windows R with options(bedtools.path = "wsl /home/username/miniconda3/bin/"). You can easily test this in PowerShell with wsl /the/path/to/your/bedtools/in/wsl/ . However, the temp file path used by bedtoolsr is incompatible. Could you please add a patch to support it?
Best.
Zhang
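
The incompatibility described above is presumably that R on Windows produces temp paths like C:\Users\..., while a program running inside WSL sees the same file under /mnt/c/Users/.... The sketch below shows one way such a translation could look. `to_wsl_path` is a hypothetical helper, not part of bedtoolsr, and it assumes the default WSL drive mount point /mnt/<drive letter>.

```r
# Hypothetical helper: convert a Windows path to its WSL equivalent,
# assuming drives are mounted at /mnt/<lowercase drive letter>.
to_wsl_path <- function(path) {
  # Normalise backslashes to forward slashes.
  path <- gsub("\\", "/", path, fixed = TRUE)
  # Rewrite a leading drive letter such as "C:" to "/mnt/c".
  if (grepl("^[A-Za-z]:", path)) {
    drive <- tolower(substr(path, 1, 1))
    path <- paste0("/mnt/", drive, substr(path, 3, nchar(path)))
  }
  path
}

win_tmp <- "C:\\Users\\me\\AppData\\Local\\Temp\\bedtoolsr_a.txt"
wsl_tmp <- to_wsl_path(win_tmp)
print(wsl_tmp)
```

Paths that already look like Unix paths pass through unchanged, so the helper would be harmless on other platforms.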

bt.complement error

Hi @cwenger @dphansti ,

I am having trouble implementing bt.complement. Ultimately, I want to create a bed file that is the complement of a coding gene annotation (gtf/bed) file, that is, a bed file of all non-coding genomic regions (hg38).

If there is a publicly available non-coding annotation file for hg38, please let me know!

My subsetted data is attached for a reproducible example:

genes.sub.txt
genome.sub.txt

> head(genes.sub)
  V1    V2    V3
1 1  11869 14409
2 1  14404 29570
3 1  17369 17436
4 1  29554 31109
5 1  34554 36081
6 1  52473 53312
> head(genome.sub)
  V1        V2
1  1 248956422
2  2 242193529
3  3 198295559
4  4 190214555
5  5 181538259
6  6 170805979
> bt.complement(i=genes.sub,g=genome.sub)
Error: requested chromosome 1  does not exist in the genome file /tmp/RtmpjEuJzC/g_1.txt. Exiting.
data frame with 0 columns and 0 rows
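
The error message prints "chromosome 1" followed by two spaces, which may mean the chromosome names in the gene table carry trailing whitespace and so fail to match the genome file. That is a guess based only on the message. The sketch below uses small made-up tables to show how to check for unmatched names and trim them before calling bt.complement.

```r
# Made-up tables; the first two chromosome names have a trailing space.
genes <- data.frame(
  V1 = c("1 ", "1 ", "2"),
  V2 = c(11869L, 14404L, 100L),
  V3 = c(14409L, 29570L, 200L),
  stringsAsFactors = FALSE
)
genome <- data.frame(
  V1 = c("1", "2"),
  V2 = c(248956422, 242193529),
  stringsAsFactors = FALSE
)

# Chromosome names present in the intervals but absent from the genome.
missing_before <- setdiff(unique(genes$V1), genome$V1)

# Strip leading and trailing whitespace, then check again.
genes$V1  <- trimws(as.character(genes$V1))
genome$V1 <- trimws(as.character(genome$V1))
missing_after <- setdiff(unique(genes$V1), genome$V1)

print(list(before = missing_before, after = missing_after))
```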

Add Contribution Guidelines

As per the ongoing JOSS review, please add contribution guidelines for the package. At minimum, this should include a simple statement in the README about how best to contribute to the package, for those interested. One option is to add a "Contributions" section to the README (e.g., https://github.com/tlverse/origami#contributions) as well as a CONTRIBUTING.md (e.g., https://github.com/tlverse/origami/blob/master/CONTRIBUTING.md) that provides further details.

Keep randomly getting an error when using bt.sort or bt.merge. The error always takes the form of "It looks as though you have less than...columns at line...in file...Are you sure your files are tab-delimited?"

I can manually inspect the bed files, and there are always three columns of data at the line the error specifies.

Usually if I re-install the package the error will go away for a little while, but it continues to come back.

Package: bedtoolsr
Encoding: UTF-8
Type: Package
Title: Bedtools Wrapper
Version: 2.30.0-5
Date: 2022-11-08
Author: Mayura Patwardhan, Craig Wenger, Eric Davis, Doug Phanstiel
Maintainer: Doug Phanstiel [email protected]
Description: Wrapper library for the bedtools utilities for genome arithmetic.
Imports: utils
Suggests: testthat
License: MIT + file LICENSE
RoxygenNote: 7.2.1
RemoteType: github
RemoteHost: api.github.com
RemoteRepo: bedtoolsr
RemoteUsername: PhanstielLab
RemoteRef: HEAD
RemoteSha: cce152f
GithubRepo: bedtoolsr
GithubUsername: PhanstielLab
GithubRef: HEAD
GithubSHA1: cce152f
NeedsCompilation: no
Packaged: 2023-01-25 19:08:52 UTC; ben
Built: R 4.1.1; ; 2023-01-25 19:08:53 UTC; unix

-- File: /Library/Frameworks/R.framework/Versions/4.1/Resources/library/bedtoolsr/Meta/package.rds
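
Since the error names a specific line and file, one way to narrow this down is to count tab-separated fields per line in the file that bedtools actually received. The sketch below is a diagnostic only; `check_tab_fields` is a hypothetical helper built on utils::count.fields, and the demo file is made up to contain one space-delimited line.

```r
# Hypothetical helper: return the line numbers that have fewer than
# `expected` tab-separated fields.
check_tab_fields <- function(path, expected = 3L) {
  n <- utils::count.fields(path, sep = "\t", quote = "",
                           comment.char = "", blank.lines.skip = FALSE)
  which(is.na(n) | n < expected)
}

# Demo file: line 2 uses spaces instead of tabs.
tmp <- tempfile(fileext = ".bed")
writeLines(c("chr1\t10\t20", "chr1 30 40", "chr2\t5\t15"), tmp)
bad_lines <- check_tab_fields(tmp)
print(bad_lines)
unlink(tmp)
```

Pointing the helper at the temp file named in the error message, if it still exists, would show whether the file really has fewer columns at that line or whether the cause lies elsewhere.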

Default sorting of genome file does not work for Fisher function

Thanks for making this wrapper! I ran into this issue with version 2.30.0.1.

I was attempting to run bedtools fisher on two bed files and I got the following error:

Error: Sorted input specified, but the file <tmp file> has the following record with a different sort order than the genomeFile /Users/andrewduncan/Library/R/4.0/library/bedtoolsr/data/mm10

I had sorted the two bed files based on the information found on the Bedtools Fisher page (https://bedtools.readthedocs.io/en/latest/content/tools/fisher.html), but it seems that the genome file needs to be sorted as well. I ran the same sort command on the mm10 file from the above error, and the Fisher command now works.

I'm not sure if there are any implications for other functions if you sort the genome file in this way (sort -k1,1 -k2,2n), but it may be useful to have the genome files sorted this way by default.
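
For anyone who wants to do the same sort inside R, the sketch below mirrors sort -k1,1 -k2,2n on a genome table held as a data frame. It uses order() with method = "radix", which sorts character vectors in the C locale; the shell command's ordering depends on the shell's locale, so the two agree only when that is also C. The three rows are illustrative, not the full mm10 genome file.

```r
# Illustrative subset of a genome table: chromosome name and size.
genome <- data.frame(
  V1 = c("chr2", "chr10", "chr1"),
  V2 = c(182113224, 130694993, 195471971),
  stringsAsFactors = FALSE
)

# Sort by name (C-locale byte order), then numerically by size.
genome_sorted <- genome[order(genome$V1, genome$V2, method = "radix"), ]
rownames(genome_sorted) <- NULL
print(genome_sorted)
```

The sorted data frame could then be supplied as the genome argument in place of the packaged file, assuming the function accepts a data frame there.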

bt.shuffle() behavior

Hi all,

I was hoping I could get some assistance with strange behavior I'm observing when I try to use multiple options (seed and incl) in bt.shuffle(). When I run bt.shuffle(i=experimental.peaklist, g=genome, incl=regions.to.include), everything works well. When I run bt.shuffle(i=experimental.peaklist, g=genome, seed=1), everything also works well. However, when I try to specify both the seed and regions to include, bedtools suddenly doesn't recognize the number I set with seed.

Here is the command I ran and error output.

> bt.shuffle(i=experimental.peaklist, incl=peri, g=genome, seed=1) %>% head()

*****ERROR: Unrecognized parameter: 1 *****


Tool:    bedtools shuffle (aka shuffleBed)
Version: v2.29.2
Summary: Randomly permute the locations of a feature file among a genome.

Usage:   bedtools shuffle [OPTIONS] -i <bed/gff/vcf> -g <genome>

Options: 
	-excl	A BED/GFF/VCF file of coordinates in which features in -i
		should not be placed (e.g. gaps.bed).

	-incl	Instead of randomly placing features in a genome, the -incl
		options defines a BED/GFF/VCF file of coordinates in which 
		features in -i should be randomly placed (e.g. genes.bed). 
		Larger -incl intervals will contain more shuffled regions. 
		This method DISABLES -chromFirst. 
	-chrom	Keep features in -i on the same chromosome.
		- By default, the chrom and position are randomly chosen.
		- NOTE: Forces use of -chromFirst (see below).

	-seed	Supply an integer seed for the shuffling.
		- By default, the seed is chosen automatically.
		- (INTEGER)

	-f	Maximum overlap (as a fraction of the -i feature) with an -excl
		feature that is tolerated before searching for a new, 
		randomized locus. For example, -f 0.10 allows up to 10%
		of a randomized feature to overlap with a given feature
		in the -excl file. **Cannot be used with -incl file.**
		- Default is 1E-9 (i.e., 1bp).
		- FLOAT (e.g. 0.50)

	-chromFirst	
		Instead of choosing a position randomly among the entire
		genome (the default), first choose a chrom randomly, and then
		choose a random start coordinate on that chrom.  This leads
		to features being ~uniformly distributed among the chroms,
		as opposed to features being distribute as a function of chrom size.

	-bedpe	Indicate that the A file is in BEDPE format.

	-maxTries	
		Max. number of attempts to find a home for a shuffled interval
		in the presence of -incl or -excl.
		Default = 1000.
	-noOverlapping	
		Don't allow shuffled intervals to overlap.
	-allowBeyondChromEnd	
		Allow shuffled intervals to be relocated to a position
		in which the entire original interval cannot fit w/o exceeding
		the end of the chromosome.  In this case, the end coordinate of the
		shuffled interval will be set to the chromosome's length.
		By default, an interval's original length must be fully-contained
		within the chromosome.
Notes: 
	(1)  The genome file should tab delimited and structured as follows:
	     <chromName><TAB><chromSize>

	For example, Human (hg19):
	chr1	249250621
	chr2	243199373
	...
	chr18_gl000207_random	4262

Tips: 
	One can use the UCSC Genome Browser's MySQL database to extract
	chromosome sizes. For example, H. sapiens:

	mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e \
	"select chrom, size from hg19.chromInfo"  > hg19.genome

data frame with 0 columns and 0 rows

Any troubleshooting help you can provide would be greatly appreciated. Thanks a lot!
