I'm trying to run the evolverMammals example on an SGE cluster (where Docker and Singularity aren't supported) using the latest version of Cactus installed through git. My problem seems similar to another recent issue report: #57.
In case it's relevant, here are a few notes about how I installed Cactus. I first compiled the older version (progressiveCactus) from GitHub, because it automatically downloads and compiles the needed dependencies, including Kyoto Tycoon (the newest version of Cactus doesn't include it). I then sourced the progressiveCactus environment, compiled the newer version of Cactus and installed it via pip into a freshly created Conda environment.
The evolverMammals test works fine for me on a single node, i.e. when running the following through qsub:
However, the run fails when distributed across multiple nodes of an SGE queue, as follows:
Below are some of the errors I get. Each failed job is retried several times, but the workflow never manages to continue. The cluster nodes should be able to communicate with each other, so I'm not sure what could cause the ST_KV_DATABASE_EXCEPTION messages.
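To test the connectivity assumption directly, a minimal sketch like the following could be run on one compute node while the `KtServerService` job is still alive on another. The helper name `can_connect` and the 5-second default timeout are my own choices and not part of Cactus or Toil; the host and port would be taken from the `ktserver` and `ktremotemgr` lines in the log below.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only opens and closes a socket
    and does not speak the Kyoto Tycoon protocol.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, or timed out: treat all as "cannot connect".
        return False
    sock.close()
    return True
```

For example, `can_connect("172.16.13.37", 29439)` uses the host and port from the first attempt in the log. A `False` result only says that the TCP connection failed; it does not distinguish a blocked port from a server that has already exited.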
INFO:toil.leader:Issued job 'StartPrimaryDB' D/F/jobwjfXJl with job batch system ID: 150 and cores: 1, disk: 2.0 G, and memory: 3.3 G
INFO:toil.leader:Job ended successfully: 'StartPrimaryDB' D/F/jobwjfXJl
INFO:toil.leader:Issued job 'KtServerService' B/T/jobNX_Wvk with job batch system ID: 151 and cores: 0, disk: 2.0 G, and memory: 2.3 G
INFO:toil.leader:Issued job 'CactusSetupPhase' G/Y/jobmegNP3 with job batch system ID: 152 and cores: 1, disk: 2.0 G, and memory: 3.3 G
INFO:toil.leader:Job ended successfully: 'KtServerService' B/T/jobNX_Wvk
WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'KtServerService' B/T/jobNX_Wvk
WARNING:toil.leader:B/T/jobNX_Wvk INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:B/T/jobNX_Wvk INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.
WARNING:toil.leader:B/T/jobNX_Wvk WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:B/T/jobNX_Wvk WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:B/T/jobNX_Wvk INFO:cactus.shared.common:Running the command ['netstat', '-tuplen']
WARNING:toil.leader:B/T/jobNX_Wvk (No info could be read for "-p": geteuid()=98354 but you should be root.)
WARNING:toil.leader:B/T/jobNX_Wvk INFO:cactus.shared.common:Running the command ['ktserver', '-port', '29439', '-ls', '-tout', '200000', '-th', '64', '-bgs', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-6386a5c9-5d92-486a-9720-412b1ca610f6/tmpxersIz/e6b71d4f-cc17-405b-9945-bf74e2503b84/t7jtQpA/snapshot', '-bgsc', 'lzo', '-bgsi', '1000000', '-log', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-6386a5c9-5d92-486a-9720-412b1ca610f6/tmpxersIz/e6b71d4f-cc17-405b-9945-bf74e2503b84/tmpdU7AZM.tmp', ':#opts=ls#bnum=30m#msiz=50g#ktopts=p']
WARNING:toil.leader:B/T/jobNX_Wvk terminate called after throwing an instance of 'std::runtime_error'
WARNING:toil.leader:B/T/jobNX_Wvk what(): pthread_create
WARNING:toil.leader:B/T/jobNX_Wvk INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk INFO:cactus.shared.common:Running the command ['ktremotemgr', 'get', '-port', '29439', '-host', '172.16.13.37', 'TERMINATE']
WARNING:toil.leader:B/T/jobNX_Wvk Process ServerProcess-1:
WARNING:toil.leader:B/T/jobNX_Wvk Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
WARNING:toil.leader:B/T/jobNX_Wvk self.run()
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 82, in run
WARNING:toil.leader:B/T/jobNX_Wvk self.tryRun(*self.args, **self.kwargs)
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 118, in tryRun
WARNING:toil.leader:B/T/jobNX_Wvk raise RuntimeError("KTServer failed. Log: %s" % f.read())
WARNING:toil.leader:B/T/jobNX_Wvk RuntimeError: KTServer failed. Log: 2019-03-08T10:07:26.636823+02:00: [SYSTEM]: ================ [START]: pid=20742
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:07:26.637007+02:00: [SYSTEM]: opening a database: path=:#opts=ls#bnum=30m#msiz=50g#ktopts=p
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:07:26.638447+02:00: [SYSTEM]: starting the server: expr=:29439
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:07:26.638549+02:00: [SYSTEM]: server socket opened: expr=:29439 timeout=200000.0
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:07:26.638575+02:00: [SYSTEM]: listening server socket started: fd=4
WARNING:toil.leader:B/T/jobNX_Wvk
WARNING:toil.leader:B/T/jobNX_Wvk INFO:cactus.shared.common:Running the command ['ktremotemgr', 'set', '-port', '29439', '-host', '172.16.13.37', 'TERMINATE', '1']
WARNING:toil.leader:B/T/jobNX_Wvk ktremotemgr: DB::open failed: : 6: network error: connection failed
WARNING:toil.leader:B/T/jobNX_Wvk Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
WARNING:toil.leader:B/T/jobNX_Wvk job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1351, in _runner
WARNING:toil.leader:B/T/jobNX_Wvk returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1694, in _run
WARNING:toil.leader:B/T/jobNX_Wvk returnValues = self.run(fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1673, in run
WARNING:toil.leader:B/T/jobNX_Wvk if not service.check():
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverToil.py", line 55, in check
WARNING:toil.leader:B/T/jobNX_Wvk raise RuntimeError(msg)
WARNING:toil.leader:B/T/jobNX_Wvk RuntimeError: Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 82, in run
WARNING:toil.leader:B/T/jobNX_Wvk self.tryRun(*self.args, **self.kwargs)
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 118, in tryRun
WARNING:toil.leader:B/T/jobNX_Wvk raise RuntimeError("KTServer failed. Log: %s" % f.read())
WARNING:toil.leader:B/T/jobNX_Wvk RuntimeError: KTServer failed. Log: 2019-03-08T10:07:26.636823+02:00: [SYSTEM]: ================ [START]: pid=20742
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:07:26.637007+02:00: [SYSTEM]: opening a database: path=:#opts=ls#bnum=30m#msiz=50g#ktopts=p
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:07:26.638447+02:00: [SYSTEM]: starting the server: expr=:29439
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:07:26.638549+02:00: [SYSTEM]: server socket opened: expr=:29439 timeout=200000.0
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:07:26.638575+02:00: [SYSTEM]: listening server socket started: fd=4
WARNING:toil.leader:B/T/jobNX_Wvk
WARNING:toil.leader:B/T/jobNX_Wvk
WARNING:toil.leader:B/T/jobNX_Wvk ERROR:toil.worker:Exiting the worker because of a failed job on host haswell-wn37.grid.pub.ro
WARNING:toil.leader:B/T/jobNX_Wvk WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'KtServerService' B/T/jobNX_Wvk with ID B/T/jobNX_Wvk to 5
INFO:toil.leader:Issued job 'KtServerService' B/T/jobNX_Wvk with job batch system ID: 153 and cores: 0, disk: 2.0 G, and memory: 2.3 G
INFO:toil.leader:Job ended successfully: 'CactusSetupPhase' G/Y/jobmegNP3
WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'CactusSetupPhase' G/Y/jobmegNP3
WARNING:toil.leader:G/Y/jobmegNP3 INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:G/Y/jobmegNP3 INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.
WARNING:toil.leader:G/Y/jobmegNP3 WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:G/Y/jobmegNP3 INFO:toil.lib.bioio:Sequences in cactus setup: ['simHuman_chr6', 'simMouse_chr6', 'simRat_chr6', 'simCow_chr6', 'simDog_chr6']
WARNING:toil.leader:G/Y/jobmegNP3 INFO:toil.lib.bioio:Sequences in cactus setup filenames: ['>id=1|simHuman.chr6|0\n', '>id=0|simMouse.chr6\n', '>id=2|simRat.chr6\n', '>id=4|simCow.chr6|0\n', '>id=3|simDog.chr6|0\n']
WARNING:toil.leader:G/Y/jobmegNP3 INFO:cactus.shared.common:Running the command ['cactus_setup', '--speciesTree', '((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', '--cactusDisk', '<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />\n\t\t</st_kv_database_conf>\n\t', '--logLevel', 'INFO', '--outgroupEvents', 'simHuman_chr6 simDog_chr6 simCow_chr6', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp3wGM7F.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpqkriEI.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpo20GAf.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpL6ca4z.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp5PbyOe.tmp']
WARNING:toil.leader:G/Y/jobmegNP3 Set log level to INFO
WARNING:toil.leader:G/Y/jobmegNP3 Flower disk name : <st_kv_database_conf type="kyoto_tycoon">
WARNING:toil.leader:G/Y/jobmegNP3 <kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />
WARNING:toil.leader:G/Y/jobmegNP3 </st_kv_database_conf>
WARNING:toil.leader:G/Y/jobmegNP3
WARNING:toil.leader:G/Y/jobmegNP3 Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp3wGM7F.tmp
WARNING:toil.leader:G/Y/jobmegNP3 Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpqkriEI.tmp
WARNING:toil.leader:G/Y/jobmegNP3 Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpo20GAf.tmp
WARNING:toil.leader:G/Y/jobmegNP3 Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpL6ca4z.tmp
WARNING:toil.leader:G/Y/jobmegNP3 Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp5PbyOe.tmp
WARNING:toil.leader:G/Y/jobmegNP3 Exception: ST_KV_DATABASE_EXCEPTION: Opening connection to host: 172.16.13.37 with error: network error
WARNING:toil.leader:G/Y/jobmegNP3 Uncaught exception
WARNING:toil.leader:G/Y/jobmegNP3 Traceback (most recent call last):
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
WARNING:toil.leader:G/Y/jobmegNP3 job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 1096, in _runner
WARNING:toil.leader:G/Y/jobmegNP3 super(RoundedJob, self)._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1351, in _runner
WARNING:toil.leader:G/Y/jobmegNP3 returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1296, in _run
WARNING:toil.leader:G/Y/jobmegNP3 return self.run(fileStore)
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/cactus_workflow.py", line 641, in run
WARNING:toil.leader:G/Y/jobmegNP3 makeEventHeadersAlphaNumeric=self.getOptionalPhaseAttrib("makeEventHeadersAlphaNumeric", bool, False))
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 220, in runCactusSetup
WARNING:toil.leader:G/Y/jobmegNP3 parameters=["cactus_setup"] + args + sequences)
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 1040, in cactus_call
WARNING:toil.leader:G/Y/jobmegNP3 raise RuntimeError("Command %s failed with output: %s" % (call, output))
WARNING:toil.leader:G/Y/jobmegNP3 RuntimeError: Command ['cactus_setup', '--speciesTree', '((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', '--cactusDisk', '<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />\n\t\t</st_kv_database_conf>\n\t', '--logLevel', 'INFO', '--outgroupEvents', 'simHuman_chr6 simDog_chr6 simCow_chr6', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp3wGM7F.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpqkriEI.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpo20GAf.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmpL6ca4z.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-ee32135c-bc45-4f9b-bd5f-12666414cf0b/tmprNSNY8/9bdf7175-8ea1-4f43-a01a-815454f61b67/tmp5PbyOe.tmp'] failed with output:
WARNING:toil.leader:G/Y/jobmegNP3 ERROR:toil.worker:Exiting the worker because of a failed job on host haswell-wn41.grid.pub.ro
WARNING:toil.leader:G/Y/jobmegNP3 WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'CactusSetupPhase' G/Y/jobmegNP3 with ID G/Y/jobmegNP3 to 5
INFO:toil.leader:Issued job 'CactusSetupPhase' G/Y/jobmegNP3 with job batch system ID: 154 and cores: 1, disk: 2.0 G, and memory: 3.3 G
INFO:toil.leader:Job ended successfully: 'CactusSetupPhase' G/Y/jobmegNP3
WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'CactusSetupPhase' G/Y/jobmegNP3
WARNING:toil.leader:G/Y/jobmegNP3 INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:G/Y/jobmegNP3 INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.
WARNING:toil.leader:G/Y/jobmegNP3 WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:G/Y/jobmegNP3 INFO:toil.lib.bioio:Sequences in cactus setup: ['simHuman_chr6', 'simMouse_chr6', 'simRat_chr6', 'simCow_chr6', 'simDog_chr6']
WARNING:toil.leader:G/Y/jobmegNP3 INFO:toil.lib.bioio:Sequences in cactus setup filenames: ['>id=1|simHuman.chr6|0\n', '>id=0|simMouse.chr6\n', '>id=2|simRat.chr6\n', '>id=4|simCow.chr6|0\n', '>id=3|simDog.chr6|0\n']
WARNING:toil.leader:G/Y/jobmegNP3 INFO:cactus.shared.common:Running the command ['cactus_setup', '--speciesTree', '((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', '--cactusDisk', '<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />\n\t\t</st_kv_database_conf>\n\t', '--logLevel', 'INFO', '--outgroupEvents', 'simHuman_chr6 simDog_chr6 simCow_chr6', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp6DYzJV.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp4U7wPE.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp2oM2za.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpUgFVai.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpWG2t9T.tmp']
WARNING:toil.leader:G/Y/jobmegNP3 Set log level to INFO
WARNING:toil.leader:G/Y/jobmegNP3 Flower disk name : <st_kv_database_conf type="kyoto_tycoon">
WARNING:toil.leader:G/Y/jobmegNP3 <kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />
WARNING:toil.leader:G/Y/jobmegNP3 </st_kv_database_conf>
WARNING:toil.leader:G/Y/jobmegNP3
WARNING:toil.leader:G/Y/jobmegNP3 Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp6DYzJV.tmp
WARNING:toil.leader:G/Y/jobmegNP3 Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp4U7wPE.tmp
WARNING:toil.leader:G/Y/jobmegNP3 Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp2oM2za.tmp
WARNING:toil.leader:G/Y/jobmegNP3 Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpUgFVai.tmp
WARNING:toil.leader:G/Y/jobmegNP3 Sequence file/directory /export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpWG2t9T.tmp
WARNING:toil.leader:G/Y/jobmegNP3 Exception: ST_KV_DATABASE_EXCEPTION: Opening connection to host: 172.16.13.37 with error: network error
WARNING:toil.leader:G/Y/jobmegNP3 Uncaught exception
WARNING:toil.leader:G/Y/jobmegNP3 Traceback (most recent call last):
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
WARNING:toil.leader:G/Y/jobmegNP3 job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 1096, in _runner
WARNING:toil.leader:G/Y/jobmegNP3 super(RoundedJob, self)._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1351, in _runner
WARNING:toil.leader:G/Y/jobmegNP3 returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1296, in _run
WARNING:toil.leader:G/Y/jobmegNP3 return self.run(fileStore)
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/cactus_workflow.py", line 641, in run
WARNING:toil.leader:G/Y/jobmegNP3 makeEventHeadersAlphaNumeric=self.getOptionalPhaseAttrib("makeEventHeadersAlphaNumeric", bool, False))
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 220, in runCactusSetup
WARNING:toil.leader:G/Y/jobmegNP3 parameters=["cactus_setup"] + args + sequences)
WARNING:toil.leader:G/Y/jobmegNP3 File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/shared/common.py", line 1040, in cactus_call
WARNING:toil.leader:G/Y/jobmegNP3 raise RuntimeError("Command %s failed with output: %s" % (call, output))
WARNING:toil.leader:G/Y/jobmegNP3 RuntimeError: Command ['cactus_setup', '--speciesTree', '((simHuman_chr6:0.144018,(simMouse_chr6:0.084509,simRat_chr6:0.091589)mr:0.271974)Anc1:0.020593,(simCow_chr6:0.18908,simDog_chr6:0.16303)Anc2:0.032898)Anc0;', '--cactusDisk', '<st_kv_database_conf type="kyoto_tycoon">\n\t\t\t<kyoto_tycoon database_dir="fakepath" host="172.16.13.37" port="29439" />\n\t\t</st_kv_database_conf>\n\t', '--logLevel', 'INFO', '--outgroupEvents', 'simHuman_chr6 simDog_chr6 simCow_chr6', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp6DYzJV.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp4U7wPE.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmp2oM2za.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpUgFVai.tmp', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-88711676-5948-47d0-acd3-569974301115/tmpKeNbt7/ef794177-3ebd-4c51-8ae5-971a58ac7d96/tmpWG2t9T.tmp'] failed with output:
WARNING:toil.leader:G/Y/jobmegNP3 ERROR:toil.worker:Exiting the worker because of a failed job on host haswell-wn35.grid.pub.ro
WARNING:toil.leader:G/Y/jobmegNP3 WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'CactusSetupPhase' G/Y/jobmegNP3 with ID G/Y/jobmegNP3 to 4
INFO:toil.leader:Issued job 'CactusSetupPhase' G/Y/jobmegNP3 with job batch system ID: 155 and cores: 1, disk: 2.0 G, and memory: 3.3 G
INFO:toil.leader:Job ended successfully: 'KtServerService' B/T/jobNX_Wvk
WARNING:toil.leader:The job seems to have left a log file, indicating failure: 'KtServerService' B/T/jobNX_Wvk
WARNING:toil.leader:B/T/jobNX_Wvk INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
WARNING:toil.leader:B/T/jobNX_Wvk INFO:toil:Running Toil version 3.18.0-84239d802248a5f4a220e762b3b8ce5cc92af0be.
WARNING:toil.leader:B/T/jobNX_Wvk WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:B/T/jobNX_Wvk WARNING:toil.resource:'JTRES_5d2f846cd67858267ed5af4717d96bda' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
WARNING:toil.leader:B/T/jobNX_Wvk INFO:cactus.shared.common:Running the command ['netstat', '-tuplen']
WARNING:toil.leader:B/T/jobNX_Wvk (No info could be read for "-p": geteuid()=98354 but you should be root.)
WARNING:toil.leader:B/T/jobNX_Wvk INFO:cactus.shared.common:Running the command ['ktserver', '-port', '26666', '-ls', '-tout', '200000', '-th', '64', '-bgs', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-b02a3811-2b63-4208-851d-7815af46a62d/tmp2i3iEe/bbd7502d-1905-454c-8a42-2a91f1f28f96/tTaU1Kr/snapshot', '-bgsc', 'lzo', '-bgsi', '1000000', '-log', u'/export/home/ncit/external/a.mizeranschi/temp/cactus-test/cactusTemp/toil-f97fff5e-27d1-4f96-a60e-e3618942fc1e-b02a3811-2b63-4208-851d-7815af46a62d/tmp2i3iEe/bbd7502d-1905-454c-8a42-2a91f1f28f96/tmpFZwVOe.tmp', ':#opts=ls#bnum=30m#msiz=50g#ktopts=p']
WARNING:toil.leader:B/T/jobNX_Wvk terminate called after throwing an instance of 'std::runtime_error'
WARNING:toil.leader:B/T/jobNX_Wvk what(): pthread_create
WARNING:toil.leader:B/T/jobNX_Wvk INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk INFO:toil.lib.bioio:Ktserver running.
WARNING:toil.leader:B/T/jobNX_Wvk INFO:cactus.shared.common:Running the command ['ktremotemgr', 'get', '-port', '26666', '-host', '172.16.13.39', 'TERMINATE']
WARNING:toil.leader:B/T/jobNX_Wvk Process ServerProcess-1:
WARNING:toil.leader:B/T/jobNX_Wvk Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
WARNING:toil.leader:B/T/jobNX_Wvk self.run()
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 82, in run
WARNING:toil.leader:B/T/jobNX_Wvk self.tryRun(*self.args, **self.kwargs)
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 118, in tryRun
WARNING:toil.leader:B/T/jobNX_Wvk raise RuntimeError("KTServer failed. Log: %s" % f.read())
WARNING:toil.leader:B/T/jobNX_Wvk RuntimeError: KTServer failed. Log: 2019-03-08T10:11:46.189587+02:00: [SYSTEM]: ================ [START]: pid=9125
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:11:46.189773+02:00: [SYSTEM]: opening a database: path=:#opts=ls#bnum=30m#msiz=50g#ktopts=p
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:11:46.191313+02:00: [SYSTEM]: starting the server: expr=:26666
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:11:46.191411+02:00: [SYSTEM]: server socket opened: expr=:26666 timeout=200000.0
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:11:46.191438+02:00: [SYSTEM]: listening server socket started: fd=4
WARNING:toil.leader:B/T/jobNX_Wvk
WARNING:toil.leader:B/T/jobNX_Wvk INFO:cactus.shared.common:Running the command ['ktremotemgr', 'set', '-port', '26666', '-host', '172.16.13.39', 'TERMINATE', '1']
WARNING:toil.leader:B/T/jobNX_Wvk ktremotemgr: DB::open failed: : 6: network error: connection failed
WARNING:toil.leader:B/T/jobNX_Wvk Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/worker.py", line 314, in workerScript
WARNING:toil.leader:B/T/jobNX_Wvk job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1351, in _runner
WARNING:toil.leader:B/T/jobNX_Wvk returnValues = self._run(jobGraph, fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1694, in _run
WARNING:toil.leader:B/T/jobNX_Wvk returnValues = self.run(fileStore)
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/toil/job.py", line 1673, in run
WARNING:toil.leader:B/T/jobNX_Wvk if not service.check():
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverToil.py", line 55, in check
WARNING:toil.leader:B/T/jobNX_Wvk raise RuntimeError(msg)
WARNING:toil.leader:B/T/jobNX_Wvk RuntimeError: Traceback (most recent call last):
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 82, in run
WARNING:toil.leader:B/T/jobNX_Wvk self.tryRun(*self.args, **self.kwargs)
WARNING:toil.leader:B/T/jobNX_Wvk File "/export/home/ncit/external/a.mizeranschi/toil_conda/lib/python2.7/site-packages/cactus/pipeline/ktserverControl.py", line 118, in tryRun
WARNING:toil.leader:B/T/jobNX_Wvk raise RuntimeError("KTServer failed. Log: %s" % f.read())
WARNING:toil.leader:B/T/jobNX_Wvk RuntimeError: KTServer failed. Log: 2019-03-08T10:11:46.189587+02:00: [SYSTEM]: ================ [START]: pid=9125
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:11:46.189773+02:00: [SYSTEM]: opening a database: path=:#opts=ls#bnum=30m#msiz=50g#ktopts=p
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:11:46.191313+02:00: [SYSTEM]: starting the server: expr=:26666
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:11:46.191411+02:00: [SYSTEM]: server socket opened: expr=:26666 timeout=200000.0
WARNING:toil.leader:B/T/jobNX_Wvk 2019-03-08T10:11:46.191438+02:00: [SYSTEM]: listening server socket started: fd=4
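
One detail in the log above may matter more than the network messages: each `ktserver` start (with `-th 64`) is followed by a `std::runtime_error` with `what(): pthread_create`, and the `connection failed` errors only appear afterwards. It is therefore possible, though this is only an assumption, that the server dies while creating its threads and that the network errors are a consequence of that. The sketch below prints the resource limits that commonly affect thread creation on Linux, as seen from inside a job. The function name `thread_related_limits` and the choice of limits are hypothetical and not something Cactus or Toil provides.

```python
import resource

# Limits that can make pthread_create fail when they are set too low.
LIMITS = ("RLIMIT_NPROC", "RLIMIT_AS", "RLIMIT_STACK")


def thread_related_limits():
    """Return {name: (soft, hard)} for the limits listed in LIMITS."""
    def show(value):
        return "unlimited" if value == resource.RLIM_INFINITY else value

    report = {}
    for name in LIMITS:
        limit_id = getattr(resource, name, None)
        if limit_id is None:
            # This limit is not defined on the current platform.
            continue
        soft, hard = resource.getrlimit(limit_id)
        report[name] = (show(soft), show(hard))
    return report


if __name__ == "__main__":
    for name, (soft, hard) in sorted(thread_related_limits().items()):
        print("%s soft=%s hard=%s" % (name, soft, hard))
```

Running it once on the login node and once inside a job submitted through qsub would show whether the batch environment imposes tighter limits than an interactive shell.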