loostrum / darc
Data Analysis of Real-time Candidates from ARTS
Home Page: https://loostrum.github.io/darc
License: Apache License 2.0
Sometimes the following error occurs:
File "/home/oostrum/python36/lib/python3.6/site-packages/darc/amber_clustering.py", line 293, in _check_triggers
**sys_params)
ValueError: not enough values to unpack (expected 7, got 6)
This suggests tools.get_triggers sometimes returns only 6 values instead of the expected 7. Perhaps this is related to the recent addition of the number of candidates per cluster to the returned values.
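Until the root cause is fixed, the caller could accept both signatures. A minimal sketch, assuming the newly added per-cluster candidate count is the last returned value (the real field names and order in DARC may differ):

```python
def unpack_triggers(values):
    """Accept both the old 6-value and new 7-value return of get_triggers.

    Assumption: the number of candidates per cluster, if present, is the
    last element; the other six values are passed through unchanged.
    """
    if len(values) == 7:
        *fields, ncand_per_cluster = values
    elif len(values) == 6:
        fields, ncand_per_cluster = list(values), None
    else:
        raise ValueError(f"expected 6 or 7 values from get_triggers, got {len(values)}")
    return fields, ncand_per_cluster
```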
DARC should try to load an external config file and use the internal file as a fallback.
This way, it is possible to keep sensitive information in a local file.
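The selection logic itself is simple; a sketch, where both paths are placeholders for the real DARC config locations:

```python
from pathlib import Path

def select_config(external: Path, internal: Path) -> Path:
    """Prefer a user-provided external config file if it exists,
    otherwise fall back to the packaged internal one."""
    return external if external.is_file() else internal
```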
In the email from the processor, 3 triggers are listed even when there are 0 AMBER candidates.
Apparently it is counting the 3 header lines.
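The fix would be to skip header lines when counting. A sketch, assuming (as in this example input) that header lines start with '#'; check the actual AMBER output format:

```python
def count_candidates(lines):
    """Count candidate lines, skipping empty lines and '#' headers."""
    return sum(
        1 for line in lines
        if line.strip() and not line.lstrip().startswith("#")
    )
```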
Instead of using ghostscript, PDF merging could be done in Python with e.g. pypdf3.
Add this to both the old and new emailer.
Plots attached to the email are quite big; their size could be reduced by using PNG instead of PDF.
Upon a config change, the current pipeline requires a restart of the affected service,
which interrupts running observations. Instead, each service could reload the config
at each observation start, to avoid deadlocks.
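A minimal sketch of the suggested pattern, re-reading the config at each observation start rather than only at service startup. JSON stands in for the real (YAML) config format here to keep the example stdlib-only; the class and method names are illustrative:

```python
import json
from pathlib import Path

class Service:
    """Service that picks up config changes at each observation start."""

    def __init__(self, config_file):
        self.config_file = Path(config_file)
        self.config = {}

    def start_observation(self):
        # reload the config here, so a config change on disk takes
        # effect at the next observation without a service restart
        self.config = json.loads(self.config_file.read_text())
```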
Something like darc --service dada_trigger --attr port_iquv should print the DADATrigger.port_iquv attribute.
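The lookup behind such a flag could be a simple registry plus getattr. Illustrative only: the registry, the DADATrigger stub, and its port value below are made up, not the real darc entry point:

```python
class DADATrigger:
    port_iquv = 30001  # made-up value for illustration

SERVICES = {"dada_trigger": DADATrigger}

def get_service_attr(service_name, attr):
    """Resolve a service attribute by name, as --attr would."""
    service = SERVICES[service_name]
    if not hasattr(service, attr):
        raise SystemExit(f"{service_name} has no attribute {attr}")
    return getattr(service, attr)
```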
Verify that -1 is only printed if there is an error, and 0 if there are no triggers.
Sometimes -1 seems to be printed when there are 0 triggers, perhaps due to a bug
when reading an empty string or list.
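A sketch of the suspected bug pattern: an empty string or list is falsy in Python, so a bare truthiness check cannot distinguish "no triggers" from "read failed". Distinguishing on None (or an exception) avoids this:

```python
def trigger_count(raw):
    """Return -1 on error, otherwise the number of triggers.

    A bare `if not raw` would wrongly report an error for an empty
    string or list; only None signals an actual read failure here.
    """
    if raw is None:
        return -1          # error case
    return len(raw)        # 0 when there are simply no triggers
```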
emailer.py is still part of ARTS-obs.
It should be converted to Python 3 and made part of OfflineProcessing (master).
Currently using the hardcoded 20190416freq_time.hdf5, which is the one that should be used.
This even happens when there are no triggers to process. There are no errors in the logs.
Recent example: arts003 processing 20210311/2021-03-11-07:46:27.FRB20190117A/, taskid=210311011.
During this observation, the data streams were down. The data folder is empty, grouped_pulses
contains only a header. This is all as expected.
Did the node miss/fail to process the stop_observation message? Relevant part of the processor log:
2021-03-11 07:46:23,834.INFO.processor: Starting observation with task ID 210311011
2021-03-11 07:46:23,923.INFO.processor: Processor initialized
2021-03-11 07:46:24,023.INFO.processor: Starting Processor
2021-03-11 07:46:24,025.INFO.processor: Starting observation
2021-03-11 07:46:25,328.INFO.processor: Observation started
2021-03-11 07:46:25,411.INFO.clustering: Starting clustering thread
2021-03-11 07:46:25,417.INFO.extractor: Starting extractor thread
2021-03-11 07:46:25,420.INFO.extractor: Starting extractor thread
2021-03-11 07:46:25,423.INFO.extractor: Starting extractor thread
2021-03-11 07:46:25,426.INFO.extractor: Starting extractor thread
2021-03-11 07:46:25,427.INFO.classifier: Starting classifier thread
2021-03-11 07:46:26,304.INFO.processor: Received header: ['beam_id', 'batch_id', 'sample_id', 'integration_step', 'compacted_integration_steps', 'time', 'DM_id', 'D
2021-03-11 07:46:26,304.INFO.processor: Received header: ['beam_id', 'batch_id', 'sample_id', 'integration_step', 'compacted_integration_steps', 'time', 'DM_id', 'D
2021-03-11 07:46:26,304.INFO.processor: Received header: ['beam_id', 'batch_id', 'sample_id', 'integration_step', 'compacted_integration_steps', 'time', 'DM_id', 'D
2021-03-11 07:46:26,304.INFO.processor: Only header received - Canceling processing
2021-03-11 08:55:11,951.INFO.processor: Starting observation
2021-03-11 08:55:12,037.INFO.processor: Observation parset not found in input config, looking for master parset
2021-03-11 08:55:12,703.INFO.processor: Starting observation with task ID 210311018
There is no stop observation message in the log!
Compare to a snippet of the same part for arts004:
2021-03-11 07:46:26,625.INFO.classifier: Starting classifier thread
2021-03-11 08:17:24,650.INFO.processor: Stopping observation
2021-03-11 08:17:24,650.INFO.processor: Observation parset not found in input config, looking for master parset
2021-03-11 08:17:24,654.INFO.processor: Stopping observation
2021-03-11 08:17:24,655.INFO.processor: Finishing observation
2021-03-11 08:17:27,717.INFO.clustering: Stopping clustering thread
2021-03-11 08:17:29,085.INFO.extractor: Stopping extractor thread
2021-03-11 08:17:30,330.INFO.extractor: Stopping extractor thread
2021-03-11 08:17:31,593.INFO.extractor: Stopping extractor thread
2021-03-11 08:17:32,996.INFO.extractor: Stopping extractor thread
2021-03-11 08:17:33,311.INFO.classifier: Stopping classifier thread
2021-03-11 08:17:33,748.INFO.processor: No post-classifier candidates found, skipping visualization for taskid 210311011
2021-03-11 08:17:34,149.INFO.processor: Observation finished: 210311011: 2021-03-11-07:46:27.FRB20190117A
2021-03-11 08:17:56,084.INFO.processor: Scavenging thread of taskid 210311011
2021-03-11 08:55:11,951.INFO.processor: Starting observation
The problem is indeed that stop_observation was not processed. There is also no stop_observation
in the logs of the other modules on arts003, so it is not processor-specific.
Not all subprocesses exit - check which ones are hanging and why
Perhaps it is enough to only do this for the processor, as it is the only service that can run multiple observations at the same time.
The offline processing module will be deprecated anyway, and it prints the commands it runs including the observation name, so it is not required there.
If a candidate has a higher S/N at DM=0 than at the detection DM by some amount, it should be ignored.
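The filter itself is a one-line comparison; a sketch, where the threshold ratio is a placeholder that would come from the config:

```python
def passes_dm0_filter(snr_detection, snr_dm0, max_ratio=1.0):
    """Reject a candidate whose S/N at DM=0 reaches max_ratio times
    its S/N at the detection DM (likely RFI rather than a real burst)."""
    return snr_dm0 < max_ratio * snr_detection
```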
Test vs real event default should be set in config.
darc --offline --parset foo start_observation should send the start_observation command only to the offline processing queue. This would allow observations to be (re)processed while a real-time observation is running.
Needs a way to read the AMBER triggers
Based on the field name (and reference frame), the pipeline could automatically detect drift scans and run the calibration tools. Field names are like <source_name>drift<startCB><endCB> or <source_name>drift<CB>.
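Detection could be a regex on the field name. A sketch, assuming two-digit CB numbers (an assumption; adjust the pattern if CB numbers can have another width):

```python
import re

# "<source>drift<startCB><endCB>" or "<source>drift<CB>",
# with CB numbers assumed to be two digits each
DRIFT_RE = re.compile(r"^(?P<source>.+)drift(?P<start>\d{2})(?P<end>\d{2})?$")

def parse_drift(field):
    """Return (source, start_cb, end_cb) for a drift-scan field name,
    or None if the field is not a drift scan."""
    m = DRIFT_RE.match(field)
    if m is None:
        return None
    start = int(m.group("start"))
    end = int(m.group("end")) if m.group("end") else start
    return m.group("source"), start, end
```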
Required for running them as Process instead of Thread
The new processor should not use a white background for missing/zero data in the visualization, but the same background as the old pipeline.
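In matplotlib this can be done by setting the colormap's "bad" colour, e.g. to the lowest colour of the map instead of the default transparent/white. The choice of viridis here is only an example, not necessarily what either pipeline uses:

```python
import numpy as np
from matplotlib import colormaps

# copy the colormap before modifying it, so the registered
# version stays untouched
cmap = colormaps["viridis"].copy()
# paint masked/NaN pixels like the lowest data value
cmap.set_bad(cmap(0.0))

# masked entries now render in cmap(0.0) instead of white
data = np.ma.masked_invalid([[1.0, float("nan")], [2.0, 3.0]])
```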
The candidates_to_visualize attribute of Classifier could be extracted by the parent process with a Pipe, see https://docs.python.org/3/library/multiprocessing.html
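A minimal sketch of that pattern: the child process sends its result back over one end of a Pipe and the parent receives it. The worker and its stand-in result are illustrative, not the real Classifier interface:

```python
import multiprocessing as mp

def classifier_worker(conn):
    """Stand-in for the Classifier process; sends its result to the parent."""
    candidates_to_visualize = ["cand_a", "cand_b"]  # placeholder result
    conn.send(candidates_to_visualize)
    conn.close()

def run():
    parent_conn, child_conn = mp.Pipe()
    proc = mp.Process(target=classifier_worker, args=(child_conn,))
    proc.start()
    result = parent_conn.recv()  # receive before join to avoid blocking
    proc.join()
    return result
```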
Sometimes the processor thread does not exit, so it keeps showing on the processing web page. The observation itself does finish correctly. This may be caused by reaching the processing time limit.
arts001; stop of non-existing observation:
2020-12-02 16:46:18,570.ERROR.processor: Failed to stop observation: no such task ID 201202032
2020-12-02 16:46:18,571.ERROR.processor: Caught exception in main loop: <class 'KeyError'>: '201202032'
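The KeyError could be kept out of the main loop by handling an unknown task ID at the stop call itself. A sketch, where the observation dict and function names are illustrative:

```python
import logging

logger = logging.getLogger("processor")

def stop_observation(observations, taskid):
    """Stop an observation; log and return False for an unknown task ID
    instead of letting the KeyError reach the main loop."""
    obs = observations.pop(taskid, None)
    if obs is None:
        logger.error("Failed to stop observation: no such task ID %s", taskid)
        return False
    return True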
After a change to the yaml config, the end user should be able to load the new config without restarting the entire pipeline. Restarting services is ok and probably required
Create a status webpage showing the processing of each node / observation.
ProcessorManager on each node should create a .json file with processing status, to be picked up by a website
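Writing such a status file is a few lines; a sketch with made-up field names, to be matched to whatever the website expects:

```python
import json
import time
from pathlib import Path

def write_status(path, node, observations):
    """Dump this node's processing status to a JSON file that a
    status webpage can poll."""
    status = {
        "node": node,
        "updated": time.time(),          # unix timestamp of last update
        "observations": observations,    # e.g. {taskid: state}
    }
    Path(path).write_text(json.dumps(status, indent=2))
```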
amber_clustering corrects the arrival time from the top of the band to the centre of the band, but lofar_trigger assumes the arrival time refers to the top of the band. This should be fixed in lofar_trigger.
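The conversion between the two reference frequencies is the standard cold-plasma dispersion delay. A sketch; the frequencies in the test are placeholders, not the real ARTS band edges:

```python
# dispersion constant in MHz^2 s cm^3 pc^-1
KDM = 4.148808e3

def shift_to_top_of_band(t_centre, dm, f_centre, f_top):
    """Convert an arrival time referenced to the centre of the band
    to one referenced to the top of the band (frequencies in MHz).

    The pulse arrives earlier at the higher frequency, so the
    top-of-band time is earlier for any positive DM.
    """
    return t_centre - KDM * dm * (f_centre**-2 - f_top**-2)
```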
Add a module to trigger LOFAR directly, skipping the VOEvent system.