
bedstat's People

Contributors: joseverdezoto, nsheff, oddodaoddo, stolarczyk

Forkers: joseverdezoto

bedstat's Issues

GenomicDistributions plotGCContent issue

g = plotGCContent(gcvec)

I submitted a subset of the LOLACore database to bedstat, but pipeline execution halts partway through, in the regionstat.R script. The pipeline log.md for each sample shows the following error:

Error in as.data.frame(list(gc = gcvec)) : object 'gcvec' not found
Calls: doitall -> plotGCContent -> as.data.frame

summary statistic for distance to TSS

Related to: databio/GenomicDistributions#173

We noticed that the average distance to TSSs was very high. At first we found a bug in GenomicDistributions, but that was fixed. Nevertheless, the average distances still seemed high.

After looking at this in depth, I realize now it's because these distances follow an exponential distribution, and I don't think the mean is a good way to summarize it.

I would suggest 2 changes to bedstat output:

  1. record the median distance to TSS instead of the mean
  2. rename "Mean absolute distance from TSS" to "Median TSS distance" (the original was too long)
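
As a quick illustration of the reasoning (not pipeline code; all numbers below are made up): for a heavy-tailed, exponential-like set of distances, a few large values dominate the mean while the median stays representative.

```python
from statistics import mean, median

# Hypothetical TSS distances (bp): mostly small, with a long tail,
# roughly the shape an exponential-like distribution produces.
tss_distances = [120, 340, 560, 800, 1200, 2500, 5000, 250000, 900000]

print(mean(tss_distances))    # dominated by the two tail values
print(median(tss_distances))  # 1200 -- robust to the tail
```

Here the mean is more than 100x the median, which is exactly the inflation observed in the reported summaries.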

Adding new GD plots

Now that new plots have been added to GD, we need to add the plots here:

  • peak width qthist
  • gc content
  • tissue specificity
  • FRIF

dbhost pipeline argument producing an error

When I try to run the bedstat pipeline on some samples, I keep coming across the error:
bedstat.py: error: the following arguments are required: --dbhost

I looked at the bedstat.py script and it looks like this argument should default to 'localhost'?
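
If the intent is for --dbhost to be optional, the argparse setup would need a default rather than required=True. A minimal sketch (the argument name comes from the error message; everything else is hypothetical):

```python
import argparse

parser = argparse.ArgumentParser()
# Optional with a default, instead of required=True:
parser.add_argument("--dbhost", default="localhost",
                    help="database host (default: localhost)")

args = parser.parse_args([])  # no --dbhost supplied on the command line
print(args.dbhost)            # falls back to "localhost"
```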

Reducing precision of data stored in database

Right now we are loading a lot of significant digits into the database:

(screenshot: stored stat values with full double precision)

In fact we don't really need this level of precision, and multiplied across 20k BED files it consumes a lot of database space, which costs storage and could also slow transfer speeds.

We should think carefully about how much data to store to streamline this.
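
One option (a sketch, assuming the stats are serialized to JSON from Python; the field names are made up) is to round to a fixed number of significant figures before committing:

```python
import json

def round_sig(x, digits=4):
    """Round a float to `digits` significant figures."""
    return float(f"{x:.{digits}g}")

stats = {"gc_content": 0.412345678901234, "mean_region_width": 287.6543219876}
compact = {k: round_sig(v) for k, v in stats.items()}
print(json.dumps(compact))  # much shorter values, same practical meaning
```

Four significant figures is likely plenty for plot-level statistics; the right cutoff is the thing to decide.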

regionstat changes

in regionstat.R

  1. Don't make the openSignalMatrix argument optional; we always pass it in the pipeline. It is just sometimes None (when not passed to the pipeline, since it's optional at that level).
  2. Handle the None in only one place (in the doitall function?).

this is not needed:

bedstat/tools/regionstat.R

Lines 151 to 157 in 0a372ec

if (!is.null(opt$openSignalMatrix)) {
    osm = opt$openSignalMatrix
    cellMatrix = data.table::fread(osm)
    doitall(query, fn, fileId, genome, cellMatrix)
} else {
    doitall(query, fn, fileId, genome)
}

Test time

The tests take too long to run (> 30 min). At first glance it looks like this is because we're installing lots of R packages, maybe including BSgenome objects (need to double-check), which takes a long time.

Would it be better to build a container with these dependencies and use that?

elastic error when loading data

Related to #8. I can get elastic running, but it's saying this:

[2019-10-24T19:19:43,521][WARN ][o.e.c.r.a.DiskThresholdMonitor] [zglgMLr] flood stage disk watermark [95%] exceeded on [zglgMLr9SxW_vHYqPURTJg][zglgMLr][/usr/share/elasticsearch/data/nodes/0] free: 1tb[2.3%], all indices on this node will be marked read-only

could that be preventing the pipeline from sticking data into elastic?

Misplaced output folder

outfolder = os.path.abspath(os.path.join(args.outfolder, fileid))
# try to create the directory and ignore failure if it already exists
#os.makedirs(outfolder, exist_ok=True)
pm = pypiper.PipelineManager(name="bedstat-pipeline", outfolder=outfolder, args=args)
command = "Rscript tools/regionstat.R --bedfile=%s --fileid=%s --outputfolder=%s --genome=%s" % (bfile, fileid, outfolder, args.genome_assembly)
target = os.path.abspath(os.path.join(outfolder, bedfile_portion))

The pipeline is producing the statistics outputs it's supposed to, but the output folder ends up in the wrong place: it isn't created under the path specified in the output_dir portion of the bedstat_config.yaml file.

Error in paste0(BSgenome, ".masked") : object 'BSgenome' not found

Not sure what the reason is; looking into it.

[mstolarczyk@MichalsMBP bedstat]: looper run test_bedstat.yaml --package local
Looper version: 0.12.6-dev
Command: run
Activating compute package 'local'
## [1 of 4] sample: ews1; pipeline: BEDSTAT
Writing script to /Users/mstolarczyk/submission/BEDSTAT_ews1.sub
Job script (n=1; 0.00Gb): /Users/mstolarczyk/submission/BEDSTAT_ews1.sub
Compute node: MichalsMBP
Start time: 2020-04-19 13:59:55
### Pipeline run code and environment:

*              Command:  `/Users/mstolarczyk/Uczelnia/UVA/code/bedstat/pipeline/bedstat.py --bedfile /Users/mstolarczyk/Desktop/bedmaker_output/ews1.bed.gz --genome hg19 --sample-yaml /Users/mstolarczyk/submission/ews1.yaml --openSignalMatrix /Users/mstolarczyk/Desktop/oc_mtx_hg19.txt.gz -O /Users/mstolarczyk/results_pipeline -O /Users/mstolarczyk/results_pipeline`
*         Compute host:  MichalsMBP
*          Working dir:  /Users/mstolarczyk/Desktop/testing/bedstat
*            Outfolder:  /Users/mstolarczyk/Uczelnia/UVA/rivanna_project_sshfs/resources/regions/bedstat_output/2d8b5b8a6699d3db7837de9a9aaa36b3/
*  Pipeline started at:   (04-19 13:59:55) elapsed: 0.0 _TIME_

### Version log:

*       Python version:  3.6.5
*          Pypiper dir:  `/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pypiper`
*      Pypiper version:  0.12.1
*         Pipeline dir:  `/Users/mstolarczyk/Uczelnia/UVA/code/bedstat/pipeline`
*     Pipeline version:  None
*        Pipeline hash:  a79ea643f8b9e5ad3f9f447c918f0cb74bf0cff4
*      Pipeline branch:  * dev_newplots
*        Pipeline date:  2020-04-17 15:55:24 -0400
*        Pipeline diff:  1 file changed, 32 insertions(+), 5 deletions(-)

### Arguments passed to pipeline:

*     `bedbase_config`:  `None`
*            `bedfile`:  `/Users/mstolarczyk/Desktop/bedmaker_output/ews1.bed.gz`
*        `config_file`:  `bedstat.yaml`
*              `cores`:  `1`
*              `dirty`:  `False`
*       `force_follow`:  `False`
*    `genome_assembly`:  `hg19`
*              `input`:  `None`
*             `input2`:  `None`
*     `just_db_commit`:  `False`
*             `logdev`:  `False`
*                `mem`:  `4000`
*          `new_start`:  `False`
*       `no_db_commit`:  `False`
*   `openSignalMatrix`:  `/Users/mstolarczyk/Desktop/oc_mtx_hg19.txt.gz`
*      `output_parent`:  `/Users/mstolarczyk/results_pipeline`
*            `recover`:  `False`
*        `sample_name`:  `None`
*        `sample_yaml`:  `/Users/mstolarczyk/submission/ews1.yaml`
*             `silent`:  `False`
*   `single_or_paired`:  `single`
*           `testmode`:  `False`
*          `verbosity`:  `None`

----------------------------------------

Target to produce: `/Users/mstolarczyk/Uczelnia/UVA/rivanna_project_sshfs/resources/regions/bedstat_output/2d8b5b8a6699d3db7837de9a9aaa36b3/ews1.json`  

> `Rscript /Users/mstolarczyk/Uczelnia/UVA/code/bedstat/tools/regionstat.R --bedfile=/Users/mstolarczyk/Desktop/bedmaker_output/ews1.bed.gz --fileId=ews1 --openSignalMatrix=/Users/mstolarczyk/Desktop/oc_mtx_hg19.txt.gz --outputfolder=/Users/mstolarczyk/Uczelnia/UVA/rivanna_project_sshfs/resources/regions/bedstat_output/2d8b5b8a6699d3db7837de9a9aaa36b3 --genome=hg19 --digest=2d8b5b8a6699d3db7837de9a9aaa36b3` (49357)
<pre>
Loading required package: IRanges
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which, which.max, which.min

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid

Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Error in paste0(BSgenome, ".masked") : object 'BSgenome' not found
Execution halted
</pre>
Command completed. Elapsed time: 0:00:03. Running peak memory: 0.135GB.  
  PID: 49357;	Command: Rscript;	Return code: 1;	Memory used: 0.135GB


### Pipeline failed at:  (04-19 13:59:58) elapsed: 3.0 _TIME_

Total time: 0:00:03
Failure reason: Subprocess returned nonzero result. Check above output for details
Traceback (most recent call last):
  File "/Users/mstolarczyk/Uczelnia/UVA/code/bedstat/pipeline/bedstat.py", line 52, in <module>
    pm.run(cmd=command, target=json_file_path)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pypiper/manager.py", line 785, in run
    self.callprint(cmd, shell, lock_file, nofail, container)  # Run command
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pypiper/manager.py", line 1028, in callprint
    self._triage_error(SubprocessError(msg), nofail)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pypiper/manager.py", line 2131, in _triage_error
    self.fail_pipeline(e)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pypiper/manager.py", line 1660, in fail_pipeline
    raise exc
pypiper.exceptions.SubprocessError: Subprocess returned nonzero result. Check above output for details

Error in tutorial

The tutorial pipeline is producing this command:

> `Rscript /home/nsheff/bedbase_tutorial/bedstat/tools/regionstat.R --bedfile=bedbase_BEDfiles/GSE105587_ENCFF018NNF_conservative_idr_thresholded_peaks_GRCh38.bed.gz --fileId=GSE105587_ENCFF018NNF_conservative_idr_thresholded_peaks_GRCh38 --outputfolder=/home/nsheff/bedbase_tutorial/bedstat/bedstat_output/78c0e4753d04b238fc07e4ebe5a02984 --genome=hg38 --digest=78c0e4753d04b238fc07e4ebe5a02984` (19376)

which is leading to this error:

[1] "Plotting: /home/nsheff/bedbase_tutorial/bedstat/bedstat_output/78c0e4753d04b238fc07e4ebe5a02984/GSE105587_ENCFF018NNF_conservative_idr_thresholded_peaks_GRCh38_tssdist"
Error in integer(binCountByChrom + 1) : invalid 'length' argument
Calls: doitall ... eval -> eval -> binRegion -> unlist -> vapply -> integer
In addition: Warning message:
Vectorized input to `element_text()` is not officially supported.
Results may be unexpected or may change in future versions of ggplot2. 
Execution halted

Decide on proper/unique naming scheme for samples in database

Problem: currently we decide the "id" of a sample when it is committed to the database (or when producing the bedstat pipeline output, results_pipeline/<id>) by simply taking the name of the BED file and throwing away the extension. For example, /path/to/LOLA/hg38/cistrome_cistrome/regions/3.bed gives us an id of "3". However, any other unrelated sample in a separate run of bedstat can also be named /path/to/somewhere/else/3.bed and will produce another entry with an id of "3". This new entry will then simply overwrite the old one.

We need a naming scheme that is relatively unique but not unfriendly. Using UUIDs would be unique (this is what we did in episb-provider) but maybe unfriendly? We could try inferring and concocting names like cistrome_cistrome_3, but that may fail if someone provides samples where such inference is not possible. @nsheff any input is welcome! 😃
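
One relatively unique and reproducible option (a sketch only; the later pipeline logs suggest an md5 digest of this sort was eventually adopted) is to derive the id from the file's contents rather than its name:

```python
import hashlib

def bedfile_id(path):
    """Derive a stable sample id from the BED file's bytes."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # read in 1 MiB chunks so large BED files don't load into memory at once
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()
```

Two unrelated files both named 3.bed then get different ids, while re-running the same file reproduces the same id. The trade-off is friendliness: a hex digest is opaque, so a human-readable alias could be stored alongside it.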

files are not getting inserted into elastic

Related to #11

I finally got elastic running -- but bedstat does not put any data in it.

Bedstat runs correctly, and no errors are given. In kibana I can see an index called "bedstat_bedfiles"

But it is empty and running the pipeline doesn't change it.

 ## [2 of 2773] /home/nsheff/code/bedstat/LOLACore/hg38/encode_tfbs/regions/wgEncodeAwgTfbsBroadDnd41Ezh239875UniPk.narrowPeak (bedstat)
Submission settings lack memory specification
Writing script to /home/nsheff/code/bedstat/output/submission/bedstat_/home/nsheff/code/bedstat/LOLACore/hg38/encode_tfbs/regions/wgEncodeAwgTfbsBroadDnd41Ezh239875UniPk.narrowPeak.sub
Job script (n=1; 0.00 Gb): output/submission/bedstat_/home/nsheff/code/bedstat/LOLACore/hg38/encode_tfbs/regions/wgEncodeAwgTfbsBroadDnd41Ezh239875UniPk.narrowPeak.sub
Compute node: puma
Start time: 2019-10-24 15:41:57
### Pipeline run code and environment:

*              Command:  `pipeline/bedstat.py --bedfile /home/nsheff/code/bedstat/LOLACore/hg38/encode_tfbs/regions/wgEncodeAwgTfbsBroadDnd41Ezh239875UniPk.narrowPeak --genome hg38 -O output/results_pipeline -R`
*         Compute host:  puma
*          Working dir:  /home/nsheff/code/bedstat
*            Outfolder:  /home/nsheff/code/bedstat/output/results_pipeline/wgEncodeAwgTfbsBroadDnd41Ezh239875UniPk/
*  Pipeline started at:   (10-24 15:42:00) elapsed: 3.0 _TIME_

### Version log:

*       Python version:  3.5.2
*          Pypiper dir:  `/home/nsheff/.local/lib/python3.5/site-packages/pypiper`
*      Pypiper version:  0.12.1
*         Pipeline dir:  `/home/nsheff/code/bedstat/pipeline`
*     Pipeline version:  None
*        Pipeline hash:  7c4973148eee59e1887a7b4961a4c049ef1409aa
*      Pipeline branch:  * dev
*        Pipeline date:  2019-10-23 08:31:24 -0400
*        Pipeline diff:  1 file changed, 2 insertions(+), 1 deletion(-)

### Arguments passed to pipeline:

*            `bedfile`:  `/home/nsheff/code/bedstat/LOLACore/hg38/encode_tfbs/regions/wgEncodeAwgTfbsBroadDnd41Ezh239875UniPk.narrowPeak`
*        `config_file`:  `bedstat.yaml`
*              `cores`:  `1`
*              `dirty`:  `False`
*       `force_follow`:  `False`
*    `genome_assembly`:  `hg38`
*              `input`:  `None`
*             `input2`:  `None`
*             `logdev`:  `False`
*                `mem`:  `4000`
*          `new_start`:  `False`
*         `nodbcommit`:  `False`
*      `output_parent`:  `output/results_pipeline`
*            `recover`:  `True`
*        `sample_name`:  `None`
*             `silent`:  `False`
*   `single_or_paired`:  `single`
*           `testmode`:  `False`
*          `verbosity`:  `None`

----------------------------------------

Target to produce: `/home/nsheff/code/bedstat/output/results_pipeline/wgEncodeAwgTfbsBroadDnd41Ezh239875UniPk/wgEncodeAwgTfbsBroadDnd41Ezh239875UniPk.narrowPeak`  

> `Rscript tools/regionstat.R --bedfile=/home/nsheff/code/bedstat/LOLACore/hg38/encode_tfbs/regions/wgEncodeAwgTfbsBroadDnd41Ezh239875UniPk.narrowPeak --fileid=wgEncodeAwgTfbsBroadDnd41Ezh239875UniPk --outputfolder=/home/nsheff/code/bedstat/output/results_pipeline/wgEncodeAwgTfbsBroadDnd41Ezh239875UniPk --genome=hg38` (32631)
<pre>
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colMeans,
    colnames, colSums, dirname, do.call, duplicated, eval, evalq,
    Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply,
    Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, Position, rank, rbind, Reduce, rowMeans, rownames,
    rowSums, sapply, setdiff, sort, table, tapply, union, unique,
    unsplit, which, which.max, which.min

Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid

Loading required package: IRanges
Loading required package: GenomeInfoDb
Registered S3 methods overwritten by 'ggplot2':
  method         from 
  [.quosures     rlang
  c.quosures     rlang
  print.quosures rlang
Saving 7 x 7 in image
BSAggregate: Calculating sizes. You can speed this up by supplying a regionsGRL.length vector...Done counting regionsGRL lengths.
Finding overlaps...
Setting regionIDs...
jExpr: .N
Combining...
Saving 7 x 7 in image
Loading required namespace: BSgenome.Hsapiens.UCSC.hg38.masked
Saving 7 x 7 in image
promoterCore :	found 227
promoterProx :	found 241
exon :	found 553
intron :	found 421
Saving 7 x 7 in image
</pre>
Command completed. Elapsed time: 0:00:13. Running peak memory: 0.02GB.  
  PID: 32631;	Command: Rscript;	Return code: 0;	Memory used: 0.02GB


### Pipeline completed. Epilogue
*        Elapsed time (this run):  0:00:17
*  Total elapsed time (all runs):  0:00:13
*         Peak memory (this run):  0.0197 GB
*        Pipeline completed time: 2019-10-24 15:42:13

Looper finished
Samples valid for job generation: 2 of 2
Successful samples: 2 of 2
Commands submitted: 2 of 2
Jobs submitted: 2

BED file digest

New way to digest BED files

e.g:

  1. join the chr, start, and end column values by , (sort the regions first?)
  2. hash each joined column
  3. hash the concatenation of the three column digests
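
The steps above could be sketched like this (illustrative only; the sort order, separators, and choice of hash are all open design decisions):

```python
import hashlib

def bed_digest(regions):
    """regions: list of (chr, start, end) tuples."""
    # 1. optionally sort the regions, then split into columns
    regions = sorted(regions)
    cols = zip(*regions)  # -> (chrs, starts, ends)
    # 2. join each column's values by "," and hash it
    col_digests = [
        hashlib.md5(",".join(map(str, col)).encode()).hexdigest()
        for col in cols
    ]
    # 3. hash the concatenation of the three column digests
    return hashlib.md5(",".join(col_digests).encode()).hexdigest()

print(bed_digest([("chr1", 0, 100), ("chr2", 50, 80)]))
```

Sorting first makes the digest invariant to region order, which seems desirable if two files with the same regions should be treated as the same BED file.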

bsgenome objects

In scripts/installRdeps.R you are downloading and installing gigabytes of BSgenome objects:

genomes = list(Hsapiens = c("hg18", "hg19", "hg38"),
               Mmusculus = c("mm10", "mm9"))
for (name in names(genomes)) {
    for (genome in genomes[[name]]) {
        # should install non-masked too
        .install_pkg(p = paste0("BSgenome.", name,
                                ".UCSC.", genome, ".masked"),
                     bioc = TRUE)
    }
}

referencing non-existent object -- gcvec

GC content calculation became optional, depending on BSgenome package availability, but its result is still required for JSON document creation:

Setting regionIDs...
Combining...
[1] "Plotting: /Users/mstolarczyk/Desktop/testing/bedbase_tutorial/outputs/bedstat_output/9cd65cf4f07b83af35770c4a098fd4c6/GSE91663_ENCFF319TPR_conservative_idr_thresholded_peaks_GRCh38_chrombins"
hg38 BSgenome package is not installed.
Calculating overlaps...
[1] "Plotting: /Users/mstolarczyk/Desktop/testing/bedbase_tutorial/outputs/bedstat_output/9cd65cf4f07b83af35770c4a098fd4c6/GSE91663_ENCFF319TPR_conservative_idr_thresholded_peaks_GRCh38_partitions"
[1] "Plotting: /Users/mstolarczyk/Desktop/testing/bedbase_tutorial/outputs/bedstat_output/9cd65cf4f07b83af35770c4a098fd4c6/GSE91663_ENCFF319TPR_conservative_idr_thresholded_peaks_GRCh38_expected_partitions"
[1] "Plotting: /Users/mstolarczyk/Desktop/testing/bedbase_tutorial/outputs/bedstat_output/9cd65cf4f07b83af35770c4a098fd4c6/GSE91663_ENCFF319TPR_conservative_idr_thresholded_peaks_GRCh38_cumulative_partitions"
[1] "Plotting: /Users/mstolarczyk/Desktop/testing/bedbase_tutorial/outputs/bedstat_output/9cd65cf4f07b83af35770c4a098fd4c6/GSE91663_ENCFF319TPR_conservative_idr_thresholded_peaks_GRCh38_widths_histogram"
[1] "Plotting: /Users/mstolarczyk/Desktop/testing/bedbase_tutorial/outputs/bedstat_output/9cd65cf4f07b83af35770c4a098fd4c6/GSE91663_ENCFF319TPR_conservative_idr_thresholded_peaks_GRCh38_neighbor_distances"
[1] "Plotting: /Users/mstolarczyk/Desktop/testing/bedbase_tutorial/outputs/bedstat_output/9cd65cf4f07b83af35770c4a098fd4c6/GSE91663_ENCFF319TPR_conservative_idr_thresholded_peaks_GRCh38_open_chromatin"
Error in mean(gcvec) : object 'gcvec' not found

Can't use kibana

I'm trying to use kibana to see what got into elastic, as described in the readme.

 docker ps | grep elasticsearch
ec89fe84e486        elasticsearch:6.5.4   "/usr/local/bin/do..."   11 minutes ago      Up 11 minutes                           elasticsearch

nsheff@puma:~$ docker --link ec89fe84e486:elasticsearch -p 5601:5601  docker.elastic.co/kibana/kibana:6.5.4
unknown flag: --link
See 'docker --help'.

I see this: https://docs.docker.com/network/links/

Paths should be relative

bedstat inserts absolute local paths into the database. It assumes the database instance will be on the same computer as the one running bedstat/bedhost.

These paths should be relative.
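
A minimal sketch of the change (assuming a configured base directory is available, e.g. from the bedbase config; the paths below are hypothetical):

```python
import os

base_dir = "/project/bedbase/output"  # hypothetical configured output root
abs_path = "/project/bedbase/output/bedstat_output/2d8b/ews1.json"

# Store this in the database instead of abs_path:
rel_path = os.path.relpath(abs_path, start=base_dir)
print(rel_path)  # -> bedstat_output/2d8b/ews1.json (on POSIX)
```

bedhost could then re-anchor the relative path against its own configured base, so the database and the pipeline no longer need to share a filesystem layout.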

BED file metadata inclusion in JSON document?

Currently bedstat produces a JSON document with the bedfile id, path, and stats calculated by GenomicDistributions. Is it a good idea to include the bedfile metadata (cell type, genome, description, protocol, data source, etc.) in the JSON doc? That way bedbuncher could access those key:value mappings to construct a more elaborate bedset PEP (currently it just encompasses sample name and file path).

dots in filenames may lead to data loss

based on real-life example

The bedstat.py pipeline does not account for BED files that contain dots (.) in their names. Currently, only the part of the name before the first . is used to construct the output dir name:

dot_separator_idx = bedfile_portion.find('.')
if (dot_separator_idx < 0):
    fileid = bedfile_portion
else:
    fileid = bedfile_portion[0:dot_separator_idx]

If two files are named similarly, e.g. experimentX.sample1.bed and experimentX.sample2.bed, the results of both pipeline runs are saved to the same dir, experimentX.

We need to remove just the file extension when constructing the dir, so that the results are saved to experimentX.sample1 and experimentX.sample2 respectively.
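
The fix could split from the right so only the final extension is dropped. A sketch (handling compressed names like .bed.gz by stripping the compression suffix first is an extra assumption here, but the pipeline logs show .bed.gz inputs):

```python
import os

def make_fileid(bedfile_portion):
    """Drop only the final extension, keeping interior dots."""
    # strip a compression suffix first, if present
    for suffix in (".gz", ".bz2"):
        if bedfile_portion.endswith(suffix):
            bedfile_portion = bedfile_portion[: -len(suffix)]
    fileid, _ext = os.path.splitext(bedfile_portion)
    return fileid

print(make_fileid("experimentX.sample1.bed"))     # experimentX.sample1
print(make_fileid("experimentX.sample2.bed.gz"))  # experimentX.sample2
```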

elasticsearch container memory error

I can't run the elasticsearch container... I think it's a memory limit thing. Do you have to change some other settings to get this to work?

docker run --rm -p 9200:9200 -p 9300:9300 -v /project/shefflab/database/elastic:/usr/share/elasticsearch/data elasticsearch:6.5.4
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
[2019-10-11T17:41:26,294][INFO ][o.e.e.NodeEnvironment    ] [zglgMLr] using [1] data paths, mounts [[/usr/share/elasticsearch/data (rivi:/project/shefflab)]], net usable_space [1.9tb], net total_space [45.4tb], types [fuse.sshfs]
[2019-10-11T17:41:26,296][INFO ][o.e.e.NodeEnvironment    ] [zglgMLr] heap size [989.8mb], compressed ordinary object pointers [true]
[2019-10-11T17:41:26,311][INFO ][o.e.n.Node               ] [zglgMLr] node name derived from node ID [zglgMLr9SxW_vHYqPURTJg]; set [node.name] to override
[2019-10-11T17:41:26,311][INFO ][o.e.n.Node               ] [zglgMLr] version[6.5.4], pid[1], build[default/tar/d2ef93d/2018-12-17T21:17:40.758843Z], OS[Linux/4.4.0-164-generic/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/11.0.1/11.0.1+13]
[2019-10-11T17:41:26,312][INFO ][o.e.n.Node               ] [zglgMLr] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch.QTGJybZo, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m, -Djava.locale.providers=COMPAT, -XX:UseAVX=2, -Des.cgroups.hierarchy.override=/, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config, -Des.distribution.flavor=default, -Des.distribution.type=tar]
[2019-10-11T17:41:27,503][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [aggs-matrix-stats]
[2019-10-11T17:41:27,503][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [analysis-common]
[2019-10-11T17:41:27,503][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [ingest-common]
[2019-10-11T17:41:27,503][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [lang-expression]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [lang-mustache]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [lang-painless]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [mapper-extras]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [parent-join]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [percolator]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [rank-eval]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [reindex]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [repository-url]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [transport-netty4]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [tribe]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-ccr]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-core]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-deprecation]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-graph]
[2019-10-11T17:41:27,504][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-logstash]
[2019-10-11T17:41:27,505][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-ml]
[2019-10-11T17:41:27,505][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-monitoring]
[2019-10-11T17:41:27,505][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-rollup]
[2019-10-11T17:41:27,505][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-security]
[2019-10-11T17:41:27,505][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-sql]
[2019-10-11T17:41:27,505][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-upgrade]
[2019-10-11T17:41:27,505][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded module [x-pack-watcher]
[2019-10-11T17:41:27,505][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded plugin [ingest-geoip]
[2019-10-11T17:41:27,505][INFO ][o.e.p.PluginsService     ] [zglgMLr] loaded plugin [ingest-user-agent]
[2019-10-11T17:41:29,946][INFO ][o.e.x.s.a.s.FileRolesStore] [zglgMLr] parsed [0] roles from file [/usr/share/elasticsearch/config/roles.yml]
[2019-10-11T17:41:30,354][INFO ][o.e.x.m.j.p.l.CppLogMessageHandler] [zglgMLr] [controller/76] [Main.cc@109] controller (64 bit): Version 6.5.4 (Build b616085ef32393) Copyright (c) 2018 Elasticsearch BV
[2019-10-11T17:41:30,836][INFO ][o.e.d.DiscoveryModule    ] [zglgMLr] using discovery type [zen] and host providers [settings]
[2019-10-11T17:41:31,387][INFO ][o.e.n.Node               ] [zglgMLr] initialized
[2019-10-11T17:41:31,388][INFO ][o.e.n.Node               ] [zglgMLr] starting ...
[2019-10-11T17:41:31,497][INFO ][o.e.t.TransportService   ] [zglgMLr] publish_address {172.17.0.2:9300}, bound_addresses {0.0.0.0:9300}
[2019-10-11T17:41:31,510][INFO ][o.e.b.BootstrapChecks    ] [zglgMLr] bound or publishing to a non-loopback address, enforcing bootstrap checks
ERROR: [1] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2019-10-11T17:41:31,520][INFO ][o.e.n.Node               ] [zglgMLr] stopping ...
[2019-10-11T17:41:31,554][INFO ][o.e.n.Node               ] [zglgMLr] stopped
[2019-10-11T17:41:31,554][INFO ][o.e.n.Node               ] [zglgMLr] closing ...
[2019-10-11T17:41:31,569][INFO ][o.e.n.Node               ] [zglgMLr] closed
[2019-10-11T17:41:31,571][INFO ][o.e.x.m.j.p.NativeController] [zglgMLr] Native controller process has stopped - no new native processes can be started
