katdal's Introduction

katdal

This package serves as a data access library to interact with the chunk stores and HDF5 files produced by the MeerKAT radio telescope and its predecessors (KAT-7 and Fringe Finder), which are collectively known as MeerKAT Visibility Format (MVF) data sets. It uses memory carefully, allowing data sets to be inspected and partially loaded into memory. Data sets may be concatenated and split via a flexible selection mechanism. In addition, it provides a script to convert these data sets to CASA MeasurementSets.

Quick Tutorial

Open any data set through a single function to obtain a data set object:

import katdal
d = katdal.open('1234567890.h5')

The open function automatically determines the version and storage location of the data set. The versions roughly map to the various instruments:

- v1 : Fringe Finder (HDF5 file)
- v2 : KAT-7 (HDF5 file)
- v3 : MeerKAT (HDF5 file)
- v4 : MeerKAT (RDB file + chunk store based on objects in Ceph)

Each MVFv4 data set is split into a Redis dump (aka RDB) file containing the metadata in the form of a telescope state database, and a chunk store containing the visibility data split into many small blocks or chunks (typically served by a Ceph object store over the network). The RDB file is the main entry point to the data set and it can be accessed directly from the MeerKAT SDP archive if you have the appropriate permissions:

# This is just for illustration - the real URL looks a bit different
d = katdal.open('https://archive/1234567890/1234567890_sdp_l0.rdb?token=AsD3')

Multiple data sets (even of different versions) may also be concatenated together (as long as they have the same dump rate):

d = katdal.open(['1234567890.h5', '1234567891.h5'])

Inspect the contents of the data set by printing the object:

print(d)

Here is a typical output:

===============================================================================
Name: 1313067732.h5 (version 2.0)
===============================================================================
Observer: someone  Experiment ID: 2118d346-c41a-11e0-b2df-a4badb44fe9f
Description: 'Track on Hyd A,Vir A, 3C 286 and 3C 273'
Observed from 2011-08-11 15:02:14.072 SAST to 2011-08-11 15:19:47.810 SAST
Dump rate: 1.00025 Hz
Subarrays: 1
ID  Antennas                            Inputs  Corrprods
 0  ant1,ant2,ant3,ant4,ant5,ant6,ant7  14      112
Spectral Windows: 1
ID  CentreFreq(MHz)  Bandwidth(MHz)  Channels  ChannelWidth(kHz)
 0  1822.000         400.000          1024      390.625
-------------------------------------------------------------------------------
Data selected according to the following criteria:
subarray=0
ants=['ant1', 'ant2', 'ant3', 'ant4', 'ant5', 'ant6', 'ant7']
spw=0
-------------------------------------------------------------------------------
Shape: (1054 dumps, 1024 channels, 112 correlation products) => Size: 967.049 MB
Antennas: *ant1,ant2,ant3,ant4,ant5,ant6,ant7  Inputs: 14  Autocorr: yes  Crosscorr: yes
Channels: 1024 (index 0 - 1023, 2021.805 MHz - 1622.195 MHz), each 390.625 kHz wide
Targets: 4 selected out of 4 in catalogue
ID  Name    Type      RA(J2000)     DEC(J2000)  Tags  Dumps  ModelFlux(Jy)
 0  Hyd A   radec      9:18:05.28  -12:05:48.9          333      33.63
 1  Vir A   radec     12:30:49.42   12:23:28.0          251     166.50
 2  3C 286  radec     13:31:08.29   30:30:33.0          230      12.97
 3  3C 273  radec     12:29:06.70    2:03:08.6          240      39.96
Scans: 8 selected out of 8 total       Compscans: 1 selected out of 1 total
Date        Timerange(UTC)       ScanState  CompScanLabel  Dumps  Target
11-Aug-2011/13:02:14 - 13:04:26    0:slew     0:             133    0:Hyd A
            13:04:27 - 13:07:46    1:track    0:             200    0:Hyd A
            13:07:47 - 13:08:37    2:slew     0:              51    1:Vir A
            13:08:38 - 13:11:57    3:track    0:             200    1:Vir A
            13:11:58 - 13:12:27    4:slew     0:              30    2:3C 286
            13:12:28 - 13:15:47    5:track    0:             200    2:3C 286
            13:15:48 - 13:16:27    6:slew     0:              40    3:3C 273
            13:16:28 - 13:19:47    7:track    0:             200    3:3C 273

The first segment of the printout displays the static information of the data set, including observer, dump rate and all the available subarrays and spectral windows in the data set. The second segment (between the dashed lines) highlights the active selection criteria. The last segment displays dynamic information that is influenced by the selection, including the overall visibility array shape, antennas, channel frequencies, targets and scan info.

The data set is built around the concept of a three-dimensional visibility array with dimensions of time, frequency and correlation product. This is reflected in the shape of the dataset:

d.shape

which returns (1054, 1024, 112), meaning 1054 dumps by 1024 channels by 112 correlation products.

Let's select a subset of the data set:

d.select(scans='track', channels=slice(200, 300), ants='ant4')
print(d)

This results in the following printout:

===============================================================================
Name: /Users/schwardt/Downloads/1313067732.h5 (version 2.0)
===============================================================================
Observer: siphelele  Experiment ID: 2118d346-c41a-11e0-b2df-a4badb44fe9f
Description: 'track on Hyd A,Vir A, 3C 286 and 3C 273 for Lud'
Observed from 2011-08-11 15:02:14.072 SAST to 2011-08-11 15:19:47.810 SAST
Dump rate: 1.00025 Hz
Subarrays: 1
ID  Antennas                            Inputs  Corrprods
 0  ant1,ant2,ant3,ant4,ant5,ant6,ant7  14      112
Spectral Windows: 1
ID  CentreFreq(MHz)  Bandwidth(MHz)  Channels  ChannelWidth(kHz)
 0  1822.000         400.000          1024      390.625
-------------------------------------------------------------------------------
Data selected according to the following criteria:
channels=slice(200, 300, None)
subarray=0
scans='track'
ants='ant4'
spw=0
-------------------------------------------------------------------------------
Shape: (800 dumps, 100 channels, 4 correlation products) => Size: 2.560 MB
Antennas: ant4  Inputs: 2  Autocorr: yes  Crosscorr: no
Channels: 100 (index 200 - 299, 1943.680 MHz - 1905.008 MHz), each 390.625 kHz wide
Targets: 4 selected out of 4 in catalogue
ID  Name    Type      RA(J2000)     DEC(J2000)  Tags  Dumps  ModelFlux(Jy)
 0  Hyd A   radec      9:18:05.28  -12:05:48.9          200      31.83
 1  Vir A   radec     12:30:49.42   12:23:28.0          200     159.06
 2  3C 286  radec     13:31:08.29   30:30:33.0          200      12.61
 3  3C 273  radec     12:29:06.70    2:03:08.6          200      39.32
Scans: 4 selected out of 8 total       Compscans: 1 selected out of 1 total
Date        Timerange(UTC)       ScanState  CompScanLabel  Dumps  Target
11-Aug-2011/13:04:27 - 13:07:46    1:track    0:             200    0:Hyd A
            13:08:38 - 13:11:57    3:track    0:             200    1:Vir A
            13:12:28 - 13:15:47    5:track    0:             200    2:3C 286
            13:16:28 - 13:19:47    7:track    0:             200    3:3C 273

Compared to the first printout, the static information has remained the same while the dynamic information now reflects the selected subset. There are many possible selection criteria, as illustrated below:

d.select(timerange=('2011-08-11 13:10:00', '2011-08-11 13:15:00'), targets=[1, 2])
d.select(spw=0, subarray=0)
d.select(ants='ant1,ant2', pol='H', scans=(0,1,2), freqrange=(1700e6, 1800e6))

See the docstring of DataSet.select for more detailed information (e.g. run d.select? in IPython). Note that exactly one subarray and one spectral window must be selected at any time.
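
For example, the whole selection can be cleared again by calling select with no arguments (a minimal sketch):

d.select(scans='track', ants='ant4')
d.select()  # reset to the default selection covering the full data set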

Once a subset of the data has been selected, you can access the data and timestamps on the data set object:

vis = d.vis[:]
timestamps = d.timestamps[:]

Note the [:] indexing: the vis and timestamps properties are special LazyIndexer objects that only return actual data when indexed, so that the entire array is not inadvertently loaded into memory.

For the example data set with no selection, the vis array will have a shape of (1054, 1024, 112). The time dimension is labelled by d.timestamps, the frequency dimension by d.channel_freqs and the correlation product dimension by d.corr_products.
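
Since these indexers support standard NumPy-style slicing, only the requested part of the array is read. A minimal sketch (shapes assume the example data set above):

# Load only the first 10 dumps of channels 200-300 into memory;
# the rest of the visibility array is never read
small_vis = d.vis[:10, 200:300, :]
print(small_vis.shape)  # (10, 100, 112)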

Another key concept in the data set object is that of sensors. These are named time series of arbitrary data that are either loaded from the data set (actual sensors) or calculated on the fly (virtual sensors). Both variants are accessed through the sensor cache (available as d.sensor) and cached there after the first access. The data set object also provides convenient properties to expose commonly-used sensors, as shown in the plot example below:

import matplotlib.pyplot as plt
plt.plot(d.az, d.el, 'o')
plt.xlabel('Azimuth (degrees)')
plt.ylabel('Elevation (degrees)')

Other useful attributes include ra, dec, lst, mjd, u, v, w, target_x and target_y. These are all one-dimensional NumPy arrays that dynamically change length depending on the active selection.
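
Arbitrary named sensors can also be looked up directly in the cache. A small sketch (the sensor name below is illustrative and differs per data set version and instrument):

# List some available sensor names, then retrieve one as a time series
# aligned with the selected timestamps
print(sorted(d.sensor.keys())[:10])
azim = d.sensor['Antennas/ant1/pos_actual_scan_azim']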

As in katdal's predecessor (scape) there is a DataSet.scans generator that allows you to step through the scans in the data set. It returns the scan index, scan state and target object on each iteration, and updates the active selection on the data set to include only the current scan. It is also possible to iterate through the compound scans with the DataSet.compscans generator, which yields the compound scan index, label and first target on each iteration for convenience. These two iterators may also be used together to traverse the data set structure:

for compscan, label, target in d.compscans():
    plt.figure()
    for scan, state, scan_target in d.scans():
        # Only plot the coordinates of actual scans and tracks
        if state in ('scan', 'track'):
            plt.plot(d.ra, d.dec, 'o')
    plt.xlabel('Right ascension (J2000 degrees)')
    plt.ylabel('Declination (J2000 degrees)')
    plt.title(target.name)

Finally, all the targets (or fields) in the data set are stored in a catalogue available at d.catalogue, and the original HDF5 file is still accessible via a back door installed at d.file in the case of a single-file data set (v3 or older). On a v4 data set, d.source provides access to the underlying telstate for metadata and the chunk store for data.
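
For example, the catalogue can be inspected and a target looked up by name (a small sketch; katpoint catalogues support name-based lookup):

print(d.catalogue)             # summary of all targets in the data set
target = d.catalogue['Hyd A']  # look up a katpoint.Target by name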

katdal's People

Contributors

adriaanph, bennahugo, bmerry, brickza, cchristelis, ctgschollar, james-smith-za, jordatious, kimmcalpine, laurarichter, ludwigschwardt, mattieudv, neiljyoung, ratt-priv-ci, rubyvanrooyen, sharmilagoedhart, sjperkins, spassmoor, sratcliffe


katdal's Issues

BadChunk: Array '1543661287/correlator_data

I'm trying to download data from the archive using a link with a token, but it fails with a BadChunk error.

Running:

KATSDPTELSTATE_ALLOW_PICKLE=1 mvftoms.py --flags cam -f https://archive-gw-1.kat.ac.za/1543661287/1543661287_sdp_l0.full.rdb?token=********************

gives the following output:

No calibration products will be applied                                                                  
Per user request the following antennas will be selected: 'm000', 'm001', 'm002', 'm003', 'm004', 'm005', 'm006', 'm007', 'm009', 'm010', 'm011', 'm012', 'm013', 'm014', 'm015', 'm016', 'm017', 'm018', 'm019', '
m020', 'm021', 'm022', 'm023', 'm024', 'm025', 'm026', 'm027', 'm028', 'm029', 'm030', 'm031', 'm032', 'm033', 'm034', 'm035', 'm036', 'm037', 'm038', 'm039', 'm040', 'm041', 'm042', 'm043', 'm044', 'm045', 'm04
7', 'm048', 'm049', 'm050', 'm051', 'm053', 'm054', 'm055', 'm056', 'm057', 'm058', 'm059', 'm060', 'm061', 'm062'       
Per user request the following target fields will be selected: 'J1939-6342', '1613-586', 'T16R02C02'
Per user request the following scans will be dumped: 1, 3, 5
Extract MS for spw 0: centre frequency 1284000000 Hz
Will create MS output in 1543661287_sdp_l0.ms

#### Producing a full polarisation MS (HH,HV,VH,VV) ####


Using array as the reference antenna. All targets and scans will be based on this antenna.


Iterating through scans in dataset(s)...

Writing static meta data...
scan   1 (  37 samples) loaded. Target: 'J1939-6342'. Writing to disk...
Traceback (most recent call last):
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/katdal/chunkstore.py", line 453, in chunk_metadata
    shape = tuple(s.stop - s.start for s in slices)
TypeError: 'int' object is not iterable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/aramaila/.virtualenvs/katdal/bin/mvftoms.py", line 916, in <module>
    main()
  File "/home/aramaila/.virtualenvs/katdal/bin/mvftoms.py", line 666, in main
    scan_vis_data, scan_weight_data, scan_flag_data)
  File "/home/aramaila/.virtualenvs/katdal/bin/mvftoms.py", line 116, in load
    out=[vis, weights, flags])
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 585, in get
    kept = [dask_getitem(array.dataset, keep) for array in arrays]
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 585, in <listcomp>
    kept = [dask_getitem(array.dataset, keep) for array in arrays]
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 139, in dask_getitem
    dsk = dask.optimization.cull(out.dask, out.__dask_keys__())[0]
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/dask/optimization.py", line 51, in cull
    dependencies_k = get_dependencies(dsk, k, as_list=True)  # fuse needs lists
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/dask/core.py", line 258, in get_dependencies
    return keys_in_tasks(dsk, [arg], as_list=as_list)
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/dask/core.py", line 186, in keys_in_tasks
    if w in keys:
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/_collections_abc.py", line 666, in __contains__
    self[key]
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/dask/highlevelgraph.py", line 643, in __getitem__
    return self.layers[key[0]][key]
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/katdal/chunkstore.py", line 145, in __getitem__
    return self.getter(self.array_name, slices, self.dtype, **self.kwargs)
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/katdal/chunkstore.py", line 323, in get_chunk_or_placeholder
    return self.get_chunk(array_name, slices, dtype)
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 631, in get_chunk
    chunk_name, shape = self.chunk_metadata(array_name, slices, dtype=dtype)
  File "/home/aramaila/.virtualenvs/katdal/lib/python3.6/site-packages/katdal/chunkstore.py", line 455, in chunk_metadata
    raise BadChunk(f'Array {array_name!r}: chunk ID should be '
katdal.chunkstore.BadChunk: Array '1543661287_sdp_l0/correlator_data': chunk ID should be a sequence of slice objects, not 0

System info

  • Ubuntu 18.04
  • Python3
  • Katdal 0.20

Not converting all scans in archive data to ms and Read timed out

mvftoms.py does not convert all scans in the archive data to MS. Each time I try, it generates a different MS with several scans missing. It also sometimes gives a 'Read timed out' error somewhere midway and stops converting. I have tried this on the following data sets:

1536252666_sdp_l0.rdb
1536431487_sdp_l0.rdb
1536438420_sdp_l0.rdb
1536607816_sdp_l0.rdb

Log files for two conversions using 1536252666_sdp_l0.rdb are attached:

1536252666_std.txt
1536252666_std_1.txt

h5toms fails to extract some scans

This may be a problem with the h5 file itself - fresh installation of master into a venv:

(katdal)> $ which h5toms.py                                                                                                          
/home/hugo/DATASETS/meerkat_reductions/oms_ghosts/katdal/bin/h5toms.py
(katdal)> $ h5toms.py -b /home/hugo/DATASETS/meerkat_reductions/deep_2_wednesday/msdir/blank.ms -o 1497034809.ms -p HH,VV 1497034809.h5
Using 'pyrap' casacore binding to produce MS
WARNING:katdal.h5datav3:Irregular timestamps detected in file '1497034809.h5': expected 516.875 dumps based on dump period and start/end times, got 517 instead
WARNING:katpoint.catalogue:Skipped '1934-638' [radec] (already in catalogue)
WARNING:katpoint.catalogue:Skipped 'main_target' [radec] (already in catalogue)
WARNING:katpoint.catalogue:Skipped '1934-638' [radec] (already in catalogue)
WARNING:katpoint.catalogue:Skipped 'main_target' [radec] (already in catalogue)
Extract MS for spw 0: central frequency 1284.00 MHz
Will create MS output in 1497034809.ms

#### Producing MS with HH,VV polarisation(s) ####


Using m057 as the reference antenna. All targets and activity detection will be based on this antenna.

Writing static meta data...

Iterating through scans in file(s)...

scan   1 (  36 samples) loaded. Target: '1934-638'. Writing to disk...
Wrote scan data (497.250000 MB) in 7.684730 s (64.706242 MBps)

scan   3 (  37 samples) loaded. Target: 'main_target'. Writing to disk...
Wrote scan data (511.062500 MB) in 7.806931 s (65.462666 MBps)

scan   4 (  37 samples) loaded. Target: 'main_target'. Writing to disk...
Wrote scan data (511.062500 MB) in 7.825770 s (65.305074 MBps)

scan   5 (  38 samples) loaded. Target: 'main_target'. Writing to disk...
Wrote scan data (524.875000 MB) in 8.144289 s (64.447001 MBps)

scan   6 (  37 samples) loaded. Target: 'main_target'. Writing to disk...
Wrote scan data (511.062500 MB) in 8.626223 s (59.245223 MBps)

scan   7 (  38 samples) loaded. Target: 'main_target'. Writing to disk...
Wrote scan data (524.875000 MB) in 8.000473 s (65.605496 MBps)

Traceback (most recent call last):
  File "/home/hugo/DATASETS/meerkat_reductions/oms_ghosts/katdal/bin/h5toms.py", line 344, in <module>
    for scan_ind, scan_state, target in h5.scans():
  File "/home/hugo/DATASETS/meerkat_reductions/oms_ghosts/katdal/local/lib/python2.7/site-packages/katdal/dataset.py", line 873, in scans
    target = self.catalogue.targets[self.target_indices[0]]
IndexError: list index out of range

the augment script doesn't exist

When trying to read an h5 file with the open function, I get the error "HDF5 file not augmented - please run augment4.py (provided by k7augment package)". There are several similar mentions of augment scripts throughout the code, with package names like "k7augment", "k7_augment" and "katsdisp", but I can't seem to find these packages and/or scripts anywhere.

what's the proper way to read MeerKAT visibility out?

@ludwigschwardt
if I write
vis = data.vis

it's quick

but if I write
vis = data.vis[:,:,:128]

it takes a very long time...

The slowness seems to be caused by the limited network speed (10 MBps for me) or an inefficient readout method (e.g. perhaps contiguous reads would be much faster?). So what is the proper way to read out a segment of the visibility data (in my case I'm interested in the autocorrelation data)?

Use MeasurementSet.requiredTableDesc to avoid empty MS usage when creating a new MS

Opening this as a placeholder, enhancement issue.

Currently katdal uses an empty MeasurementSet to create a new one. Members of the SDP team have indicated that this is not ideal:

  1. MS are brittle.
  2. Reordering an existing MS to accommodate new problem sizes is probably inefficient from a disk access POV.

python-casacore is actively looking at exposing further MS functionality in casacore/python-casacore#61, particularly MeasurementSet.requiredTableDesc which defines the default columns required to create an MS. This paves the way for creating new MS using the python-casacore wrapper without depending on an existing empty MS.

Note that OSKAR2 has also implemented creation of an MS from scratch in C++, see ms/src/oskar_measurement_set.cpp in the source zip, but this is more work compared to using python-casacore.
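
For reference, a rough sketch of what MS creation could look like without a blank template, assuming the default_ms helper that python-casacore grew out of that discussion:

from casacore.tables import default_ms

# Create a new MS containing the default required columns,
# with no dependence on a pre-existing empty MS
ms = default_ms('new.ms')
ms.close()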

Dump of hdf5 files on disk

I haven't found a method yet to dump an HDF5 file, as read, onto a local disk - i.e. read a file from the archive with katdal.open, dump it to disk as-is, and then read it again with katdal.open. If you have a local copy, this makes repeated work much faster. If there is such a method, I'd appreciate a hint; if not, it might be good to implement it.

mvftoms.py crashing on 'chunk_name not found'

Hi

I have been running mvftoms.py on the recent 1K data set 1539238540_sdp_l0.full.rdb on chpc-com4.

Please find the error message I am getting:

nadeem@imgr-com-4:/data/nadeem/MeerKAT/1K_Mode$ mvftoms.py -m --caltables 1539238540_sdp_l0.full.rdb 
/home/nadeem/.local/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using 'pyrap' casacore binding to produce MS
Traceback (most recent call last):
  File "/home/nadeem/.local/bin/mvftoms.py", line 787, in <module>
    main()
  File "/home/nadeem/.local/bin/mvftoms.py", line 245, in main
    dataset = katdal.open(open_args, ref_ant=options.ref_ant)
  File "/home/nadeem/.local/lib/python2.7/site-packages/katdal/__init__.py", line 335, in open
    dataset = VisibilityDataV4(open_data_source(f, **kwargs),
  File "/home/nadeem/.local/lib/python2.7/site-packages/katdal/datasources.py", line 395, in open_data_source
    return TelstateDataSource.from_url(url, **kwargs)
  File "/home/nadeem/.local/lib/python2.7/site-packages/katdal/datasources.py", line 388, in from_url
    chunk_store = _infer_chunk_store(url_parts, telstate, **kwargs)
  File "/home/nadeem/.local/lib/python2.7/site-packages/katdal/datasources.py", line 292, in _infer_chunk_store
    data_path = os.path.join(store_path, telstate['chunk_name'])
  File "/home/nadeem/.local/lib/python2.7/site-packages/katsdptelstate/telescope_state.py", line 223, in __getitem__
    return self._get(key)
  File "/home/nadeem/.local/lib/python2.7/site-packages/katsdptelstate/telescope_state.py", line 500, in _get
    raise KeyError('{} not found'.format(key))
KeyError: 'chunk_name not found'

pip failure

Hi,

When I try to install katdal with pip I get the following.
Am I doing anything wrong?

Thanks,
Rhys Morris

pip3.6 install katdal --user
Collecting katdal
Downloading https://files.pythonhosted.org/packages/cd/1f/1aa5d487dc3e28d180cabdc2fd0ce158455e557bdbe1a2ec1cf57c23434a/katdal-0.9.5.tar.gz (129kB)
████████████████| 133kB 359kB/s
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-build-l7y6hjr3/katdal/setup.py", line 26, in <module>
    news = open(os.path.join(here, 'NEWS.rst')).read()
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-build-l7y6hjr3/katdal/NEWS.rst'

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-l7y6hjr3/katdal/
You are using pip version 9.0.1, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

IndexError in mvftoms.py

Happens both with current master and v0.16:

$ mvftoms.py -f --flags=cam 1564201853_sdp_l0.rdb 1564356988_sdp_l0.rdb
...

Traceback (most recent call last):
  File "/home/oms/.venv/katdal/bin/mvftoms.py", line 816, in <module>
    main()
  File "/home/oms/.venv/katdal/bin/mvftoms.py", line 477, in main
    scan_vis_data = np.empty(in_chunk_shape, dataset.vis.dtype)
  File "/home/oms/.venv/katdal/lib/python3.6/site-packages/katdal/concatdata.py", line 634, in vis
    return ConcatenatedLazyIndexer([d.vis for d in self.datasets])
  File "/home/oms/.venv/katdal/lib/python3.6/site-packages/katdal/concatdata.py", line 71, in __init__
    self.indexers = [indexer for indexer in indexers if indexer.shape[0]]
  File "/home/oms/.venv/katdal/lib/python3.6/site-packages/katdal/concatdata.py", line 71, in <listcomp>
    self.indexers = [indexer for indexer in indexers if indexer.shape[0]]
  File "/home/oms/.venv/katdal/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 624, in shape
    return self.dataset.shape
  File "/home/oms/.venv/katdal/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 523, in dataset
    dataset = dask_getitem(self._orig_dataset, self.keep)
  File "/home/oms/.venv/katdal/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 131, in dask_getitem
    indices = _simplify_index(indices, x.shape)
  File "/home/oms/.venv/katdal/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 78, in _simplify_index
    indices = da.slicing.normalize_index(indices, shape)
  File "/home/oms/.venv/katdal/lib/python3.6/site-packages/dask/array/slicing.py", line 827, in normalize_index
    check_index(i, d)
  File "/home/oms/.venv/katdal/lib/python3.6/site-packages/dask/array/slicing.py", line 883, in check_index
    % (x.size, dimension)
IndexError: Boolean array length 7320 doesn't equal dimension 7080

Replace underscores with dashes when loading buckets from RDBs

The new Ceph installation at MeerKAT has done away with underscores in bucket names, to make its S3 interface comply with the S3 protocol.

That means that some of our legacy data from before 2019 has had its buckets renamed, but the RDBs have not been updated to reflect that.

E.g. http://archive-gw-1.kat.ac.za:7480/1540580961_sdp_l1_flags is now http://archive-gw-1.kat.ac.za:7480/1540580961-sdp-l1-flags

Link to the example in the web archive: https://archive.sarao.ac.za/search/1540580961/. This data is currently in Ceph, so it is a good test.

We need to make a fix for katdal to replace underscores with dashes for all buckets accessed directly from the archive.
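
A sketch of the proposed mapping (purely illustrative):

# Map a legacy bucket name onto its renamed S3-compliant form
bucket = '1540580961_sdp_l1_flags'.replace('_', '-')
# -> '1540580961-sdp-l1-flags'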

I am happy to test this out on our transfer nodes once you have an update for this.

Cascading selection results in superset of fields

If I'm not missing something here, the select method modifies the data set in place. However, a cascading selection can result in a superset of the original selection. The following ought to be equivalent:

In [39]: kd.select(targets=["PKS 0408-65"])

In [40]: kd.select(scans=range(0, 100))

In [41]: list(kd.scans())
Out[41]: 
[(0, 'slew', <katpoint.Target 'PKS 0408-65' body=radec at 0x7fe2e6ef9c90>),
 (1, 'track', <katpoint.Target 'PKS 0408-65' body=radec at 0x7fe2e6ef9c90>),
 (2, 'slew', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (3, 'track', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (4, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (5, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (6, 'slew', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (7, 'track', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (8, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (9, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (10, 'slew', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (11, 'track', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (12, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (13, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (14, 'slew', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (15, 'track', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (16, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (17, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (18, 'slew', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (19, 'track', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (20, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (21, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (22, 'slew', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (23, 'track', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (24, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (25, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (26, 'slew', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (27, 'track', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (28, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (29, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (30, 'slew', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (31, 'track', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (32, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (33, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (34, 'slew', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (35, 'track', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (36, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (37, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (38, 'slew', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (39, 'track', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (40, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (41, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (42, 'slew', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (43, 'track', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (44, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (45, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (46, 'slew', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (47, 'track', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (48, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (49, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (50, 'slew', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (51, 'track', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (52, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (53, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (54, 'slew', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (55, 'track', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (56, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (57, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (58, 'slew', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (59, 'track', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (60, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (61, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (62, 'slew', <katpoint.Target 'PKS 1934-63' body=radec at 0x7fe2e6ec8510>),
 (63, 'track', <katpoint.Target 'PKS 1934-63' body=radec at 0x7fe2e6ec8510>),
 (64, 'slew', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (65, 'track', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (66, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (67, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (68, 'slew', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (69, 'track', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (70, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (71, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (72, 'slew', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (73, 'track', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (74, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (75, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (76, 'slew', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (77, 'track', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (78, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (79, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (80, 'slew', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (81, 'track', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (82, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (83, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (84, 'slew', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (85, 'track', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (86, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (87, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (88, 'slew', <katpoint.Target 'PKS 1934-63' body=radec at 0x7fe2e6ec8510>),
 (89, 'track', <katpoint.Target 'PKS 1934-63' body=radec at 0x7fe2e6ec8510>),
 (90, 'slew', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (91, 'track', <katpoint.Target 'A3528N' body=radec at 0x7fe2e6ef9d50>),
 (92, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (93, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (94, 'slew', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (95, 'track', <katpoint.Target 'A3528S' body=radec at 0x7fe2e6ef9dd0>),
 (96, 'slew', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (97, 'track', <katpoint.Target 'J1311-222' body=radec at 0x7fe2e6ef9d90>),
 (98, 'slew', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>),
 (99, 'track', <katpoint.Target 'A3532' body=radec at 0x7fe2e6ef9e50>)]

In [42]: kd.select(reset='T')

In [43]: kd.select(targets=["PKS 0408-65"], scans=range(0, 100))

In [44]: list(kd.scans())
Out[44]: 
[(0, 'slew', <katpoint.Target 'PKS 0408-65' body=radec at 0x7fe2e6ef9c90>),
 (1, 'track', <katpoint.Target 'PKS 0408-65' body=radec at 0x7fe2e6ef9c90>)]

scans don't apply multiple negated tags, or a mix including at least one negated tag, correctly

Currently, select(scans=["~slew", "~stop"]) does nothing, because the scan filters are ORed together. The same goes for compscans.

The current behaviour is unexpected, since it is reasonable to expect the rules of Boolean algebra to apply, i.e. all un-negated tags get ORed while negated tags get ANDed with the rest.

The relevant code appears in katdal/dataset.py line 740, where negated tags are ORed. This OR could simply be changed to AND.

While one may go overboard with this issue and invent a syntax to support complicated sequences of ANDs and ORs, I propose that a reasonable user would be happy to accept that all un-negated tags are ORed together, before being ANDed with the negated tags.
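
A sketch of the proposed combination rule (a hypothetical helper, not katdal's actual code):

import numpy as np

def combine_scan_masks(positive_masks, negated_masks):
    # positive_masks: per un-negated tag, the mask of dumps matching it;
    # negated_masks: per negated tag, the mask of dumps NOT matching it
    keep = (np.logical_or.reduce(positive_masks) if positive_masks
            else np.ones_like(negated_masks[0]))
    # AND in each negated tag, instead of ORing everything together
    for mask in negated_masks:
        keep = np.logical_and(keep, mask)
    return keep

Under this rule, select(scans=["~slew", "~stop"]) would keep only the dumps that are neither slews nor stops, instead of keeping everything.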

uvw coordinates calculated from wrong coordinate centre

Looking at data set 1497034809.h5, which contains multiple scans where the pointing centre is kept constant but the phase centre is moved by 15 degrees, I have noticed that the uvw coordinates have been calculated from the antenna pointing target and not the correlator's phase centre.

Splitting an MS created with h5toms fails

Failing to split out a calibrated field from dual correlation data in CASA 4.7. The error isn't particularly informative, so I'm not sure what triggers this. Does anyone have any idea what CATEGORY refers to? The closest thing I can find in memo 229 is the FLAG_CATEGORY field, but that looks fine to me. Even more puzzling, things work fine on single correlation data taken earlier:

vis                 = 'msdir/1491291289.ms' #  Name of input Measurement set or Multi-MS
outputvis           = 'msdir/1491291289.deep2.ms' #  Name of output Measurement set or Multi-MS
keepmms             =       True        #  If the input is a Multi-MS the output will also be a Multi-MS.
field               = 'DEEP_2'        #  Select field using ID(s) or name(s).
spw                 =         ''        #  Select spectral window/channels.
scan                =         ''        #  Select data by scan numbers.
antenna             =         ''        #  Select data based on antenna/baseline.
correlation         =         'XX'        #  Correlation: '' ==> all, correlation='XX,YY'.
timerange           =         ''        #  Select data by time range.
intent              =         ''        #  Select data by scan intent.
array               =         ''        #  Select (sub)array(s) by array ID number.
uvrange             =         ''        #  Select data by baseline length.
observation         =         ''        #  Select by observation ID(s).
feed                =         ''        #  Multi-feed numbers: Not yet implemented.
datacolumn          = 'corrected'       #  Which data column(s) to process.
keepflags           =       True        #  Keep *completely flagged rows* instead of dropping them.
width               =          1        #  Number of channels to average to form one output channel
timebin             =       '0s'        #  Bin width for time averaging

CASA <20>: go split
---------> go(split)
Executing:  split()

2017-04-07 04:09:19     SEVERE  MSTransformDataHandler::makeMSBasicStructure    Exception filling the sub-tables: TableRecordRep::get_pointer - incorrect data type used for field CATEGORY
2017-04-07 04:09:19     SEVERE  MSTransformDataHandler::makeMSBasicStructure+   Stack Trace: 
2017-04-07 04:09:19     SEVERE  MSTransformDataHandler::makeMSBasicStructure    msdir/1491291289.deep2.ms left unfinished.
2017-04-07 04:09:19     SEVERE  mstransformer:: (file /var/rpmbuild/BUILD/casa-prerelease/casa-prerelease-4.7.0/gcwrap/tools/mstransformer/mstransformer_cmpt.cc, line 30)    Exception Reported: Error creating output MS structure
2017-04-07 04:09:19     SEVERE  split::::       Error creating output MS structure

Missing AUTOCORRS

@ludwigschwardt
Running katdal 0.9.5 to convert some recent SKARAB data to MSs for processing. I use autocorrelations to monitor power levels across all fields and preflag suspect data before doing calibration.

When running h5toms without -a, no autocorrelations are dumped:

In [2]: t = tbl("/scratch/bhugo/modelling0408-65/msdir/raw/1523187060.ms")
Successful readonly open of default-locked table /scratch/bhugo/modelling0408-65/msdir/raw/1523187060.ms: 23 columns, 203532 rows

In [3]: a = t.getcol("ANTENNA1")
In [4]: b = t.getcol("ANTENNA2")
In [5]: c = t.getcol("DATA")
In [6]: c[a == b]
Out[6]: array([], shape=(0, 4096, 4), dtype=complex64)

The same thing happens on an older data set, 1477074305.h5.

Funnily enough, when you do specify -a, the autocorrelations are added to the MS but filled with zeros!

katdal not finding TelescopeState receiver serial number sensor due to ignoring the band

For example, in the file produced on the AR1 correlator:

/var/kat/archive3/data/RTS/telescope_products/2018/01/18/1516315038.h5

The h5 file has the sensors:

In [13]: print h5.sensor['TelescopeState/m005_ap_indexer_position']
Out[13]: ['l' 'l' 'l' ..., 'l' 'l' 'l']
In [14]: print h5.sensor['TelescopeState/m005_rsc_rxl_serial_number']
Out[14]: [4029 4029 4029 ..., 4029 4029 4029]

But the code does not examine the band:

# Try sanitised version of RX serial number first
rx_sensor = 'TelescopeState/%s_rx_serial_number' % (ant,)
rx_serial = self.sensor[rx_sensor][0] if rx_sensor in self.sensor else 0

Need a nice file identifier from katdal object

katdal needs to provide a unique identifier that can be used to produce file names for data products that don't clobber each other.

I have used:

filename.split('/')[-1].split('?')[0]

where filename could be a file name or a URL with a token, but it is getting cumbersome.
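
A sketch of the same extraction using the standard library instead (still a workaround rather than a katdal API):

import posixpath
from urllib.parse import urlparse

def dataset_id(name_or_url):
    # Strip the directory and query string from a filename or tokenised URL
    return posixpath.basename(urlparse(name_or_url).path)

dataset_id('https://archive/1234567890/1234567890_sdp_l0.rdb?token=AsD3')
# -> '1234567890_sdp_l0.rdb'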

cityhash issue with windows

It's not possible to install cityhash under Windows, unless one makes a few minor code modifications to it. See:
tensorflow/datasets#690
escherba/python-cityhash#25

Unfortunately the cityhash maintainer doesn't seem to want to incorporate the required changes.

Could you look into replacing cityhash with an alternative - perhaps, like the TensorFlow team, siphash / siphashc / csiphash?

Populate SIGMA_SPECTRUM

@landmanbester found that the mvftoms-populated (Wishart) WEIGHT_SPECTRUM is a very good reflection of the correct weights. However, there is an issue if CASA (>=4.4) is used to transfer calibrate / self-calibrate.
According to https://casa.nrao.edu/casadocs/casa-5-1.2/reference-material/data-weights the weights are going to be reinitialised based on SIGMA (or SIGMA_SPECTRUM), so we should have that inverse filled correctly here to make calWt work correctly in applycal. Currently, SIGMA_SPECTRUM is set to unity, so all the hard work @ludwigschwardt has done is being undone under the hood if CASA is used. This does not affect calibration performed by CubiCal/QuartiCal at present because the cal weights are not being written to the MS. This should be a very simple modification to make in mvftoms.
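
For reference, CASA's documented convention relates the columns as WEIGHT = 1 / SIGMA**2, so the fix would be along these lines (a sketch, not the actual mvftoms code):

import numpy as np

def sigma_from_weight(weight_spectrum):
    # Invert CASA's WEIGHT = 1 / SIGMA**2 convention (illustrative helper)
    return 1.0 / np.sqrt(weight_spectrum)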

@IanHeywood @o-smirnov @JSKenyon FYI

MStoh5

Hi Ludwig,
I found a script which can convert .h5 files to MS to do some processing on MeerKAT data. I now need a script which can convert back from MS to .h5 files. Is there such a script?

Cheers,
Abhik

mvftoms.py: AttributeError: module 'numpy' has no attribute 'bool'

Hi developers.

I am trying to retrieve the 32k data from the SARAO archive to IDIA.

I installed katdal on IDIA as follows:

srun --X11 --cpus-per-task=2 --mem=16GB --pty bash
virtualenv -p python3 /scratch3/users/mpati/virtualenvk/katdal
source /scratch3/users/blah/virtualenvk/katdal/bin/activate
pip install katdal
pip install python-casacore
deactivate

I am running katdal as below:

source /scratch3/users/blah/virtualenvk/katdal/bin/activate
export KATSDPTELSTATE_ALLOW_PICKLE=1
mvftoms.py <token-link> -v -C 19829,21551 -p HH,VV --flags '' -o outputfile.ms

I get the error below. The pip-installed katdal seems to be incompatible with something. In any case, how can we fix this issue?

/scratch3/users/blah/virtualenvk/katdal/lib/python3.8/site-packages/katdal/ms_extra.py:97: FutureWarning: In the future `np.bool` will be defined as the corresponding NumPy scalar.
  'BOOLEAN': np.bool,
Traceback (most recent call last):
  File "/scratch3/users/blah/virtualenvk/katdal/bin/mvftoms.py", line 39, in <module>
    from katdal import averager, ms_async, ms_extra
  File "/scratch3/users/blah/virtualenvk/katdal/lib/python3.8/site-packages/katdal/ms_async.py", line 36, in <module>
    from . import ms_extra
  File "/scratch3/users/blah/virtualenvk/katdal/lib/python3.8/site-packages/katdal/ms_extra.py", line 97, in <module>
    'BOOLEAN': np.bool,
  File "/scratch3/users/blah/virtualenvk/katdal/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'bool'. `np.bool` was a deprecated alias for the builtin `bool`. To avoid this error in existing code, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here. The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

update scans() to correctly propagate cascaded select()

The dataset.scans() code does not restore the selection filter completely correctly - it ignores "cascaded selects".

As currently seen on http://kat-imager.kat.ac.za:8888/notebooks

data = katdal.open(...)
data.select(scans=range(3,12))
print(">>>", len(list(data.scans())))      # out: >>>9
data.select(reset="",scans=["scan"])
print(">>>", len(list(data.scans())))      # out: >>>5
# up to here everything is as expected, but the following ignores the first select
data.select(reset="",scans=["scan"])
print(">>>", len(list(data.scans())))      # out: >>>520

Exactly the same happens if you remove the last "data.select" above.

Feedback from Ludwig Schwardt:

Yes, that's true.

The problem is that I store the selections in a dict (see 'data._selections'), so the last 'scans' setting wipes out the previous ones. I should really store the selections in a list of key-value pairs or something similar.

The 'cascading' works if you mix 'scans' with e.g. 'targets', which are both time-based selections and would otherwise replace each other.

You are welcome to file a feature request for the future.

POINTING table is not filled

As discussed elsewhere in the calibration implementation, the (compulsory) POINTING table is not being generated. This means that the pointing information (necessary for applying the far field response to a model) is lost on data sets that were rephased.

I suspect the following is never actually added into the generation of the dataset:

katdal/katdal/ms_extra.py

Lines 855 to 900 in 4898eaf

def populate_pointing_dict(num_antennas, observation_duration, start_time,
                           phase_center, pointing_name='default'):
    """Construct a dictionary containing the columns of the POINTING subtable.

    The POINTING subtable contains data on individual antennas tracking a target.
    It has one row per pointing/antenna?

    Parameters
    ----------
    num_antennas : integer
        Number of antennas
    observation_duration : float
        Length of observation, in seconds
    start_time : float
        Start time of observation, as a Modified Julian Date in seconds
    phase_center : array of float, shape (2,)
        Direction of phase center, in ra-dec coordinates as 2-element array
    pointing_name : string, optional
        Name for pointing

    Returns
    -------
    pointing_dict : dict
        Dictionary containing columns of POINTING subtable

    """
    phase_center = phase_center.reshape((2, 1, 1))
    pointing_dict = {}
    # Antenna Id (integer)
    pointing_dict['ANTENNA_ID'] = np.arange(num_antennas, dtype=np.int32)
    # Antenna pointing direction as polynomial in time (double, 2-dim)
    pointing_dict['DIRECTION'] = np.repeat(phase_center, num_antennas)
    # Time interval (double)
    pointing_dict['INTERVAL'] = np.tile(np.float64(observation_duration), num_antennas)
    # Pointing position name (string)
    pointing_dict['NAME'] = np.array([pointing_name] * num_antennas)
    # Series order (integer)
    pointing_dict['NUM_POLY'] = np.zeros(num_antennas, dtype=np.int32)
    # Target direction as polynomial in time (double, -1-dim)
    pointing_dict['TARGET'] = np.repeat(phase_center, num_antennas)
    # Time interval midpoint (double)
    pointing_dict['TIME'] = np.tile(np.float64(start_time), num_antennas)
    # Time origin for direction (double)
    pointing_dict['TIME_ORIGIN'] = np.tile(np.float64(start_time), num_antennas)
    # Tracking flag - True if on position (boolean)
    pointing_dict['TRACKING'] = np.ones(num_antennas, dtype=np.uint8)
    return pointing_dict


Question about applying calibration solutions

According to the information in this link for the L1 visibilities:

The generated calibration solutions are stored in the observation meta-data and can optionally be applied using katdal or when exporting to MSv2.

But when I push the data using the default calibrated setting, a corrected column is not obtained, unless the -m flag is enabled. Does this mean that the solutions are applied directly to the data column? Is there a way to keep the data column intact and have the solutions applied to only the corrected column?

Importing memory module fails

katdal seems to be unable to find katsdptelstate.memory

meerkathi - 2019-02-12 12:29:45,475 INFO - running: mvftoms.py --output-ms /home/jenkins/msdir/1477074305.ms --tar --channel-range 2525,2776 --model-data /input/1477074305.h5

meerkathi - 2019-02-12 12:29:45,475 INFO - Traceback (most recent call last):

meerkathi - 2019-02-12 12:29:45,475 INFO - File "/usr/local/bin/mvftoms.py", line 42, in <module>
meerkathi - 2019-02-12 12:29:45,476 INFO - import katdal
meerkathi - 2019-02-12 12:29:45,476 INFO - File "/usr/local/lib/python2.7/dist-packages/katdal/__init__.py", line 229, in <module>
meerkathi - 2019-02-12 12:29:45,476 INFO - from .datasources import open_data_source
meerkathi - 2019-02-12 12:29:45,476 INFO - File "/usr/local/lib/python2.7/dist-packages/katdal/datasources.py", line 29, in <module>
meerkathi - 2019-02-12 12:29:45,476 INFO - import katsdptelstate.memory
meerkathi - 2019-02-12 12:29:45,476 INFO - ImportError: No module named memory
meerkathi - 2019-02-12 12:29:45,476 INFO - Traceback (most recent call last):
meerkathi - 2019-02-12 12:29:45,476 INFO - File "/scratch/code/run.py", line 41, in <module>
meerkathi - 2019-02-12 12:29:45,476 INFO - utils.xrun(cab["binary"], args+files )
meerkathi - 2019-02-12 12:29:45,476 INFO - File "/scratch/stimela/utils/__init__.py", line 73, in xrun
meerkathi - 2019-02-12 12:29:45,476 INFO - raise SystemError('%s: returns errr code %d'%(command, process.returncode))

OutOfRangeError when changing the projection in datasets

When changing the projection on data sets with slews in them, the error OutOfRangeError: Target point more than 0.5 pi radians away from reference point is produced when there is a slew of that size.

I think this is a bug in katpoint, since it could come to a sensible answer, but a workaround would be for katdal to set katpoint.projection.set_out_of_range_treatment to either clip or nan rather than raise.
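
The suggested workaround would look something like this (a sketch, assuming katpoint exposes set_out_of_range_treatment as described above):

import katpoint

# Return NaN for out-of-range projections instead of raising, so that
# large slews no longer abort the coordinate conversion
katpoint.projection.set_out_of_range_treatment('nan')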

Centre frequency is reported incorrectly on older files

katdal version is 0.9.dev569+master.1a0f71c

For the observation below, the centre frequency is reported as 0.428 GHz. The channels then get shifted accordingly and we end up with a band of 0.0 to 0.856 GHz.

K = katdal.open('/var/kat/archive2/data/MeerKATAR1/telescope_products/2017/09/14/1505426738.h5')
>>> K
<katdal.H5DataV3 '/var/kat/archive2/data/MeerKATAR1/telescope_products/2017/09/14/1505426738.h5' shape (1846, 32768, 544) at 0x7fb0a769ee50>
>>> print K
===============================================================================
Name: /var/kat/archive2/data/MeerKATAR1/telescope_products/2017/09/14/1505426738.h5 (version 3.0)
===============================================================================
Observer: Tom  Experiment ID: 20170914-0179
Description: 'MKAIV-608 Imaging observation PHOENIX_DEEP'
Observed from 2017-09-14 22:05:38.379 UTC to 2017-09-15 02:11:40.134 UTC
Dump rate / period: 0.12505 Hz / 7.997 s
Subarrays: 1
  ID  Antennas                            Inputs  Corrprods
   0  m000,m002,m005,m017,m020,m021,m034,m041,m042,m043,m048,m049,m050,m054,m055,m056  32      544
Spectral Windows: 1
  ID Band Product  CentreFreq(MHz)  Bandwidth(MHz)  Channels  ChannelWidth(kHz)
   0 L    c856M32k    428.000         856.000          32768        26.123
-------------------------------------------------------------------------------
Data selected according to the following criteria:
  subarray=0
  ants=['m054', 'm055', 'm056', 'm043', 'm042', 'm041', 'm002', 'm000', 'm005', 'm017', 'm021', 'm020', 'm034', 'm049', 'm050', 'm048']
  spw=0
-------------------------------------------------------------------------------

This value seems to be obtained from the center_freq attribute on TelescopeModel.cbf:

>>> K.file['TelescopeModel']['cbf'].attrs['center_freq']
428000000.0
>>> K.file['TelescopeState'].attrs['sdp_l0_center_freq']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/kat/ve/local/lib/python2.7/site-packages/h5py/_hl/attrs.py", line 60, in __getitem__
    attr = h5a.open(self._id, self._e(name))
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5a.pyx", line 77, in h5py.h5a.open
KeyError: "Can't open attribute (can't locate attribute in name index)"

UVW Sign Convention Issue

We've converted DEEP2 data using h5toms.py, which seems to produce data which has a 180-degree rotation due to UVW sign convention issues.

Now, we are using katdal version 0.7.1 as built from KERN-2 and as released in January 2017. I see that @bmerry committed a change in April 2017 to "Fix upside-down MeerKAT images". Does this mean that our version of h5toms.py does or does not make the appropriate correction?

SPECTRAL_WINDOW subtable is missing with new RDB format conversion

hugo@stevie /scratch/bhugo/G330/msdir/raw [19:03:34]

$ aoflagger -strategy ../../../NGCxxx/input/jan2018_firstpass.rfis 1524852817.ms
AOFlagger 2.9.0 (2016-12-20) command line application
This program will execute an RFI strategy as can be created with the RFI gui
and executes it on one or several observations.

Author: André Offringa ([email protected])

Modified single-baseline strategy so it will execute strategy on all baselines and write flags.
Starting strategy on 2018-May-07 19:03:47.675879
0% : strategy...
0% : +-For each measurement set...
0% : +-+-Processing measurement set 1524852817.ms...
An exception occured during execution of the strategy!
Your set might not be fully flagged. Exception was:
SSMIndex::getIndex - access to non-existing row 0 in column NUM_CHAN of table /scratch/bhugo/G330/msdir/raw/1524852817.ms/SPECTRAL_WINDOW
No polarization statistics were collected.

Herd-based sensor readings

@spassmoor
Currently h5toms simply uses the first antenna by default as a "reference antenna" for sensor readings, in particular to work out scan boundaries for one:
https://github.com/ska-sa/katdal/blob/master/katdal/h5datav3.py#L401
and
https://github.com/ska-sa/katdal/blob/master/katdal/h5datav3.py#L538

This fails for data sets where the first antenna isn't tracking properly for some reason. Is there a particular reason why this reading cannot be herd-based, such that when, say, more than half of the antennas are on target the state goes into tracking, and the rest are quack-flagged until the sensors indicate that they reach the target?

See for instance this dataset:

If you want a  test case, here's the culprit: 1508095451.h5

running: h5toms.py --blank-ms /var/kat/static/blank.ms --output-ms /home/sharmila/msdir/1508095451.ms --tar  --verbose  --channel-range  16997,22738 --model-data  /input/1508095451.h5
WARNING:katdal.dataset:The selection criteria resulted in an empty data set
WARNING:katdal.dataset:The selection criteria resulted in an empty data set
WARNING:katdal.dataset:The selection criteria resulted in an empty data set
Using 'pyrap' casacore binding to produce MS
Extract MS for spw 0: central frequency 1284.00 MHz
Will create MS output in /home/sharmila/msdir/1508095451.ms

#### Producing MS with HH,VV polarisation(s) ####


Channel range 16997 through 22738.

Using m045 as the reference antenna. All targets and activity detection will be based on this antenna.


Iterating through scans in file(s)...

Successful read/write open of default-locked table /home/sharmila/msdir/1508095451.ms: 25 columns, 0 rows
Writing static meta data...
Table FEED:
  opened successfully
  added 16 rows
  wrote column 'NUM_RECEPTORS' with shape (16,)
  wrote column 'SPECTRAL_WINDOW_ID' with shape (16,)
  wrote column 'RECEPTOR_ANGLE' with shape (16, 2)
  wrote column 'INTERVAL' with shape (16,)
  wrote column 'POL_RESPONSE' with shape (16, 2, 2)
  wrote column 'TIME' with shape (16,)
  wrote column 'POLARIZATION_TYPE' with shape (16, 2)
  wrote column 'FEED_ID' with shape (16,)
  wrote column 'POSITION' with shape (16, 3)
  wrote column 'BEAM_ID' with shape (16,)
  wrote column 'ANTENNA_ID' with shape (16,)
  wrote column 'BEAM_OFFSET' with shape (16, 2, 2)
  closed successfully
Table POLARIZATION:
  opened successfully
  added 1 rows
  wrote column 'CORR_TYPE' with shape (1, 2)
  wrote column 'CORR_PRODUCT' with shape (1, 2, 2)
  wrote column 'NUM_CORR' with shape (1,)
  wrote column 'FLAG_ROW' with shape (1,)
  closed successfully
Table DATA_DESCRIPTION:
  opened successfully
  added 1 rows
  wrote column 'SPECTRAL_WINDOW_ID' with shape (1,)
  wrote column 'POLARIZATION_ID' with shape (1,)
  wrote column 'FLAG_ROW' with shape (1,)
  closed successfully
Table OBSERVATION:
  opened successfully
  added 1 rows
  wrote column 'TELESCOPE_NAME' with shape (1,)
  wrote column 'LOG' with shape (1, 1)
  wrote column 'OBSERVER' with shape (1,)
  wrote column 'SCHEDULE' with shape (1, 1)
  wrote column 'RELEASE_DATE' with shape (1,)
  wrote column 'SCHEDULE_TYPE' with shape (1,)
  wrote column 'PROJECT' with shape (1,)
  wrote column 'TIME_RANGE' with shape (1, 2)
  wrote column 'FLAG_ROW' with shape (1,)
  closed successfully
Table ANTENNA:
  opened successfully
  added 16 rows
  wrote column 'NAME' with shape (16,)
  wrote column 'MOUNT' with shape (16,)
  wrote column 'DISH_DIAMETER' with shape (16,)
  wrote column 'STATION' with shape (16,)
  wrote column 'OFFSET' with shape (16, 3)
  wrote column 'POSITION' with shape (16, 3)
  wrote column 'TYPE' with shape (16,)
  wrote column 'FLAG_ROW' with shape (16,)
  closed successfully
Traceback (most recent call last):
  File "/usr/local/bin/h5toms.py", line 548, in <module>
    raise RuntimeError("No usable data found in HDF5 file "
RuntimeError: No usable data found in HDF5 file (pick another reference antenna, maybe?)
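
As the error message hints, the workaround here is to pick a different reference antenna. katdal.open accepts a ref_ant keyword argument (it shows up in the mvftoms tracebacks elsewhere in these issues), so something along these lines should work; the antenna name below is just an example:

import katdal

# 'm046' is an arbitrary example - pick an antenna that tracked properly
d = katdal.open('1508095451.h5', ref_ant='m046')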

Log level for data read is too low

When data requested from the archive is not found, the event is logged at DEBUG level and the missing data is filled in with zeros. This should be logged at WARNING level instead.

Example:
DEBUG: Starting new HTTPS connection (1): archive-gw-1.kat.ac.za:443
DEBUG: https://archive-gw-1.kat.ac.za:443 "GET /1537611365_sdp_l0/correlator_data/00004_01920_00000.npy HTTP/1.1" 404 230
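
For illustration, here is roughly what the requested change would look like in the zero-fill path. The name get_chunk_or_zeros appears in katdal tracebacks further down this page, but this body is a sketch, not the actual implementation:

import logging

import numpy as np

logger = logging.getLogger('katdal.chunkstore')

def get_chunk_or_zeros(store, array_name, slices, dtype):
    """Return the requested chunk, or zeros (with a WARNING) if missing."""
    try:
        return store.get_chunk(array_name, slices, dtype)
    except KeyError:  # stand-in for katdal's ChunkNotFound
        logger.warning('Chunk %s%s not found - filling with zeros',
                       array_name, slices)
        return np.zeros([s.stop - s.start for s in slices], dtype)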

mvftoms creates incorrect ms file

Hi,

I'm trying to convert some recent UHF MeerKLASS data using the mvftoms tool (the RDB file path is /idia/raw/hi_im/SCI-20220822-MS-01/1675623808/1675623808/1675623808_sdp_l0.full.rdb). This is my first time using the mvftoms tool, so my apologies if I'm missing any important details here.

I did the following on a jupyter notebook:

fname=1675623808
rdb_link="file:///idia/raw/hi_im/SCI-20220822-MS-01/1675623808/1675623808/1675623808_sdp_l0.full.rdb"

!python3 /users/sourabh/OTF/katdal/scripts/mvftoms.py -a --flags cam --scans "" {rdb_link} -o /scratch3/users/sourabh/OTF/UHF/{fname}.ms

The error I get is:

Iterating through scans in dataset(s)...

Traceback (most recent call last):
  File "/users/sourabh/OTF/katdal/scripts/mvftoms.py", line 916, in <module>
    main()
  File "/users/sourabh/OTF/katdal/scripts/mvftoms.py", line 563, in main
    ms_dict['FEED'] = ms_extra.populate_feed_dict(len(dataset.ants), num_receptors_per_feed=2)
  File "/usr/local/lib/python3.8/dist-packages/katdal/ms_extra.py", line 543, in populate_feed_dict
    feed_dict['POL_RESPONSE'] = np.dstack([np.eye(2, dtype=np.complex64) for n in range(num_feeds)]).transpose()
  File "<__array_function__ internals>", line 200, in dstack
  File "/usr/local/lib/python3.8/dist-packages/numpy/lib/shape_base.py", line 723, in dstack
    return _nx.concatenate(arrs, 2)
  File "<__array_function__ internals>", line 200, in concatenate
ValueError: need at least one array to concatenate
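
The immediate cause is that the chosen options (presumably the empty --scans selection) leave dataset.ants empty, so populate_feed_dict builds its POL_RESPONSE from an empty list. The failure is easy to reproduce in plain NumPy:

import numpy as np

num_feeds = 0  # what len(dataset.ants) evaluates to after the empty selection
# The same construction as in ms_extra.populate_feed_dict - this raises
# ValueError: need at least one array to concatenate
np.dstack([np.eye(2, dtype=np.complex64) for n in range(num_feeds)]).transpose()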

I get no error if I remove all the options, and the MS file is created:

!python3 /users/sourabh/OTF/katdal/scripts/mvftoms.py {rdb_link} -o /scratch3/users/sourabh/OTF/UHF/{fname}.ms
However, when I run listobs on the output ms file, I get the following error:

RuntimeError: Exception: Illegal ANTENNA1 value 60 found in main table. /scratch3/users/sourabh/OTF/UHF/1675623808.ms/ANTENNA only has 0 rows (IDs).
... thrown by void casa::MSChecker::checkReferentialIntegrity() const at File: /source/casa6/casatools/src/code/msvis/MSVis/MSChecker.cc, line: 73

CASA applycal-able tables

So I have implemented a CASA caltable dumper in CubiCal - took a B table, stripped it of all metadata and refilled it - guess where I got the idea ;) See e.g. https://github.com/ratt-ru/CubiCal/blob/master/cubical/database/casa_db_adaptor.py

I can confirm that B-Jones viewing and applycal work. By the looks of it, it is not enough to set G or K Jones in the keywords of the calibration table - there must be some hidden magic somewhere in the table header that tells CASA which member of the Jones family it is dealing with. So I suggest that if we implement it here, we store 3 types of caltables with the repo?
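
For reference, one way to hunt for that hidden magic is to diff the keywords of a caltable CASA accepts against one it rejects, using python-casacore. This is only an inspection sketch; the paths and the CPARAM column name are examples for typical B/G tables:

import casacore.tables as tables

for name in ['cal.B', 'cal.G']:  # example caltable paths
    t = tables.table(name, ack=False)
    print(name, 'table keywords:', t.getkeywords())
    print(name, 'CPARAM keywords:', t.getcolkeywords('CPARAM'))
    t.close()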

missing source.timestamps in 1535789408_sdp_l0.full.rdb

I'm getting this error when converting this data set into an MS using mvftoms.py.

  File "/usr/local/bin/mvftoms.py", line 777, in <module>
    main()
  File "/usr/local/bin/mvftoms.py", line 254, in main
    dataset = katdal.open(open_args, ref_ant=options.ref_ant)
  File "/usr/local/lib/python2.7/dist-packages/katdal/__init__.py", line 336, in open
    ref_ant, time_offset, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/katdal/visdatav4.py", line 112, in __init__
    num_dumps = len(source.timestamps)
TypeError: object of type 'NoneType' has no len()

botocore 403 error with katdal installed on com07

Not sure why this is being raised (other datasets from the same period download successfully):
Observation: 1527694287 (2018/05/30 - commissioning dataset):

Wrote scan data (2349.207479 MB) in 37.662396 s (62.375412 MBps)

scan  43 (  37 samples) loaded. Target: 'PKS 1934-63'. Writing to disk...
Traceback (most recent call last):
  File "/usr/local/bin/mvftoms.py", line 777, in <module>
    main()
  File "/usr/local/bin/mvftoms.py", line 539, in main
    scan_vis_data, scan_weight_data, scan_flag_data)
  File "/usr/local/bin/mvftoms.py", line 69, in load
    [vis, weights, flags], lock=False)
  File "/usr/local/lib/python2.7/dist-packages/dask/array/core.py", line 949, in store
    result.compute(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dask/base.py", line 154, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dask/base.py", line 407, in compute
    results = get(dsk, keys, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dask/threaded.py", line 75, in get
    pack_exception=pack_exception, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dask/local.py", line 521, in get_async
    raise_exception(exc, tb)
  File "/usr/local/lib/python2.7/dist-packages/dask/local.py", line 290, in execute_task
    result = _execute_task(task, data)
  File "/usr/local/lib/python2.7/dist-packages/dask/local.py", line 270, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/usr/local/lib/python2.7/dist-packages/dask/local.py", line 267, in _execute_task
    return [_execute_task(a, cache) for a in arg]
  File "/usr/local/lib/python2.7/dist-packages/dask/local.py", line 271, in _execute_task
    return func(*args2)
  File "/usr/local/lib/python2.7/dist-packages/katdal/chunkstore.py", line 127, in func_returning_chunk
    value = func(array_name, slices, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/katdal/chunkstore_s3.py", line 167, in has_chunk
    self.client.head_object(Bucket=bucket, Key=key)
  File "/usr/local/lib/python2.7/dist-packages/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

Simple download does not work

The following script (tested on several public data sets, on different computers, and with different Python versions) causes an error:

#! /usr/bin/env python
import katdal

file = '1629930087_sdp_l0.full.rdb'
d = katdal.open(file)
a = d.vis[0,0,0]

This is Python 3.6.9 on an Ubuntu box
Distributor ID: Ubuntu
Description: Ubuntu 18.04.6 LTS
Release: 18.04
Codename: bionic
katdal has been installed from PyPI:

pip install katdal

This results in the error mirrored below. I'm not sure whether this is an error on the server side, on katdal's side, or even on my side (although I don't think so). Please help!

WARNING:katdal.dataset:Extending flux density model frequency range of 'J0408-6545' from 1410-8400 MHz to 855-8400 MHz
Traceback (most recent call last):
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 638, in get_chunk
    headers=headers, stream=True)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 594, in complete_request
    with self.request(method, url, chunk_name, **kwargs) as response:
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 543, in request
    raise S3ObjectNotFound(msg)
katdal.chunkstore_s3.S3ObjectNotFound: Chunk '1629930087-sdp-l0/correlator_data/00000_00000_00000': Store responded with HTTP error 404 (Not Found) to request: GET http://archive-gw-1.kat.ac.za/1629930087-sdp-l0/correlator_data/00000_00000_00000.npy
Details of server response: <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchKey</Code><BucketName>1629930087-sdp-l0</BucketName><RequestId>tx00000000000000df9d20a-006319b9e6-da08a2b-default</RequestId><HostId>da08a2b-default-default</HostId></Error>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./rfitest.py", line 6, in <module>
    a = d.vis[0,0,0]
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 558, in __getitem__
    return self.get([self], keep)[0]
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/lazy_indexer.py", line 591, in get
    da.store(kept, out, lock=False)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/array/core.py", line 1041, in store
    result.compute(**kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/base.py", line 283, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/base.py", line 565, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/threaded.py", line 84, in get
    **kwargs
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/local.py", line 487, in get_async
    raise_exception(exc, tb)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/local.py", line 317, in reraise
    raise exc
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore.py", line 145, in __getitem__
    return self.getter(self.array_name, slices, self.dtype, **self.kwargs)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore.py", line 325, in get_chunk_or_placeholder
    return self.get_chunk(array_name, slices, dtype)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 641, in get_chunk
    self._verify_bucket(url, err)
  File "/home/jozsa/software/virtualenv/g23/lib/python3.6/site-packages/katdal/chunkstore_s3.py", line 625, in _verify_bucket
    raise StoreUnavailable(msg) from chunk_error
katdal.chunkstore.StoreUnavailable: S3 bucket http://archive-gw-1.kat.ac.za/1629930087-sdp-l0 is empty - your data is not currently accessible
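
This looks like the object really is missing on the server side. The chunk URL from the traceback can be probed directly, independently of katdal (the URL is copied from the error above):

import requests

url = ('http://archive-gw-1.kat.ac.za/1629930087-sdp-l0/'
       'correlator_data/00000_00000_00000.npy')
# A 404 here would confirm that the object is gone server-side
print(requests.head(url).status_code)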

improve target_[xy] documentation

The current documentation in dataset.py does not state the sign convention adopted for target_x & target_y. Consider adding something like one of the following:

a) "Conventions for target_x & target_y are described in katpoint.projection.", or

b) "The (target_x, target_y) coordinates correspond to the (L, M) direction cosines calculated in [Gre1993a]_ and
[Gre1993b]_." (adapted from katpoint.projection), or

c) "The target_y coordinate axis in the plane points along the target's meridian of longitude towards the positive singularity of the sphere (in the direction of increasing elevation or declination, depending on coordinate system used). The target_x coordinate axis is perpendicular to it and points in the direction of increasing latitude (azimuth or right ascension)." (adapted from katpoint.projection).

It might also be useful to add simple statements like:
"The coordinate convention roughly translates to [antenna pointing]_x ~ [target centre]_x + [offset]target_x"
"The coordinate convention roughly translates to [antenna pointing]_y ~ [target centre]_y + [offset]target_y"

Cannot convert 1525016070 rdb dataset to ms

Unable to convert the
http://stgr1.sdp.mkat.chpc.kat.ac.za:7480/1525016070/1525016070_sdp_l0.full.rdb
dataset to MS, whereas it is possible to convert the datasets just before and after this one.
The error is pasted below:

Using 'pyrap' casacore binding to produce MS
Accessing http://archive-gw-1.kat.ac.za:7480//auth.html
Extract MS for spw 0: central frequency 1284.00 MHz
Will create MS output in 1525016070.ms

Producing a full polarisation MS (HH,HV,VH,VV)

Cross-correlations only.

Using m047 as the reference antenna. All targets and activity detection will be based on this antenna.

Iterating through scans in file(s)...

Writing static meta data...
scan 1 ( 8 samples) loaded. Target: 'J0742-1459'. Writing to disk...
Traceback (most recent call last):
  File "/usr/local/bin/mvftoms.py", line 800, in <module>
    main()
  File "/usr/local/bin/mvftoms.py", line 522, in main
    scan_vis_data, scan_weight_data, scan_flag_data)
  File "/usr/local/bin/mvftoms.py", line 64, in load
    [vis, weights, flags], lock=False)
  File "/usr/local/lib/python2.7/dist-packages/dask/array/core.py", line 949, in store
    result.compute(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dask/base.py", line 154, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dask/base.py", line 407, in compute
    results = get(dsk, keys, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dask/threaded.py", line 75, in get
    pack_exception=pack_exception, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dask/local.py", line 521, in get_async
    raise_exception(exc, tb)
  File "/usr/local/lib/python2.7/dist-packages/dask/local.py", line 290, in execute_task
    result = _execute_task(task, data)
  File "/usr/local/lib/python2.7/dist-packages/dask/local.py", line 270, in _execute_task
    args2 = [_execute_task(a, cache) for a in args]
  File "/usr/local/lib/python2.7/dist-packages/dask/local.py", line 271, in _execute_task
    return func(*args2)
  File "/usr/local/lib/python2.7/dist-packages/katdal/chunkstore_s3.py", line 140, in get_chunk
    response = self.client.get_object(Bucket=bucket, Key=key)
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/local/lib/python2.7/dist-packages/katdal/chunkstore.py", line 364, in _standard_errors
    raise StandardisedError(prefix + str(e))
katdal.chunkstore.ChunkNotFound: "Chunk u'1525016070_sdp_l0/weights_channel/00003_01024': An error occurred (NoSuchKey) when calling the GetObject operation: Unknown"

casacore missing as a dependency

Hi there,

Trying to run mvftoms --help and I get:

Traceback (most recent call last):
  File "/home/v_envs/katdal/bin/mvftoms.py", line 39, in <module>
    from katdal import averager, ms_async, ms_extra
  File "/home/v_envs/katdal/lib/python3.8/site-packages/katdal/ms_async.py", line 36, in <module>
    from . import ms_extra
  File "/home/v_envs/katdal/lib/python3.8/site-packages/katdal/ms_extra.py", line 27, in <module>
    import casacore
ModuleNotFoundError: No module named 'casacore'

Installing python-casacore seems to help. Perhaps it's a missing required dependency?

Tested on: python 3.8

mvftoms produced malformed MS (TiledStMan::headerFileGet: mismatch in #row)

Trying to use mvftoms.py to pull 32k data from the archive to IDIA. The connection frequently drops with:

socket.timeout: The read operation timed out

but when I restart it, it continues. It got most of the way there after a couple of days (11 TB downloaded), but now I'm getting the error below. I get the same error when I try to access the MS using other software. Any tips on how to recover this would be gratefully received. I'll restart the transfer in the meantime.

Cheers.

/users/ianh/venv/katdal/lib/python3.8/site-packages/katdal/applycal.py:156: RuntimeWarning: invalid value encountered in reciprocal
  corrections.append(ComparableArrayWrapper(np.reciprocal(bp)))
/users/ianh/venv/katdal/lib/python3.8/site-packages/katdal/applycal.py:204: RuntimeWarning: invalid value encountered in reciprocal
  return np.reciprocal(smooth_gains)
The following calibration products will be applied: l1.K, l1.B, l1.G, l2.GPHASE
Per user request the following antennas will be selected: 'm000', 'm001', 'm002', 'm003', 'm004', 'm005', 'm006', 'm007', 'm008', 'm010', 'm012', 'm013', 'm014', 'm015', 'm016', 'm017', 'm018', 'm019', 'm020', 'm021', 'm022', 'm023', 'm024', 'm025', 'm026', 'm027', 'm028', 'm029', 'm030', 'm031', 'm032', 'm033', 'm034', 'm035', 'm036', 'm037', 'm038', 'm039', 'm041', 'm042', 'm043', 'm044', 'm045', 'm046', 'm047', 'm048', 'm049', 'm050', 'm051', 'm052', 'm053', 'm054', 'm055', 'm056', 'm057', 'm058', 'm059', 'm060', 'm061', 'm062', 'm063'
Per user request the following target fields will be selected: 'COSMOS_3'
Per user request the following scans will be dumped: 34, 8, 42, 14, 48, 22, 28
Extract MS for spw 0: centre frequency 1284000000 Hz
Will create MS output in 1622376680_sdp_l2.full.ms

#### Producing a full polarisation MS (HH,HV,VH,VV) ####


Using array as the reference antenna. All targets and scans will be based on this antenna.


Iterating through scans in dataset(s)...

Traceback (most recent call last):
  File "/users/ianh/venv/katdal/bin/mvftoms.py", line 916, in <module>
    main()
  File "/users/ianh/venv/katdal/bin/mvftoms.py", line 545, in main
    with ms_extra.open_table(ms_name, verbose=options.verbose) as t:
  File "/users/ianh/venv/katdal/lib/python3.8/site-packages/katdal/ms_extra.py", line 42, in open_table
    return tables.table(name, readonly=readonly, ack=verbose, **kwargs)
  File "/users/ianh/venv/katdal/lib/python3.8/site-packages/casacore/tables/table.py", line 373, in __init__
    Table.__init__(self, tabname, lockopt, opt)
RuntimeError: Table DataManager error: Internal error: TiledStMan::headerFileGet: mismatch in #row; expected 0, found 5313710

mvftoms.py throws connection reset by peer error

I'm using katdal 0.15 in an ubuntu 18.04 docker container

scan   4 ( 599 samples) loaded. Target: 'J1939-6342'. Writing to disk...
Added new field 1: 'J1939-6342' 19:39:25.03 -63:42:45.6
Wrote scan data (201912.166424 MiB) in 2285.384612 s (88.349316 MiBps)

scan   5 ( 602 samples) loaded. Target: 'J1939-6342'. Writing to disk...
Traceback (most recent call last):
  File "/usr/local/bin/mvftoms.py", line 816, in <module>
    main()
  File "/usr/local/bin/mvftoms.py", line 566, in main
    scan_vis_data, scan_weight_data, scan_flag_data)
  File "/usr/local/bin/mvftoms.py", line 92, in load
    out=[vis, weights, flags])
  File "/usr/local/lib/python3.6/dist-packages/katdal/lazy_indexer.py", line 594, in get
    da.store(kept, out, lock=False)
  File "/usr/local/lib/python3.6/dist-packages/dask/array/core.py", line 951, in store
    result.compute(**kwargs)
  File "/usr/local/lib/python3.6/dist-packages/dask/base.py", line 166, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/dask/base.py", line 437, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/dask/threaded.py", line 84, in get
    **kwargs
  File "/usr/local/lib/python3.6/dist-packages/dask/local.py", line 486, in get_async
    raise_exception(exc, tb)
  File "/usr/local/lib/python3.6/dist-packages/dask/local.py", line 316, in reraise
    raise exc
  File "/usr/local/lib/python3.6/dist-packages/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/usr/local/lib/python3.6/dist-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/usr/local/lib/python3.6/dist-packages/katdal/chunkstore.py", line 243, in get_chunk_or_zeros
    return self.get_chunk(array_name, slices, dtype)
  File "/usr/local/lib/python3.6/dist-packages/katdal/chunkstore_s3.py", line 610, in get_chunk
    headers=headers, stream=True)
  File "/usr/local/lib/python3.6/dist-packages/katdal/chunkstore_s3.py", line 587, in complete_request
    result = process(response)
  File "/usr/local/lib/python3.6/dist-packages/katdal/chunkstore_s3.py", line 173, in _read_chunk
    chunk = read_array(data._fp)
  File "/usr/local/lib/python3.6/dist-packages/katdal/chunkstore_s3.py", line 151, in read_array
    bytes_read = fp.readinto(memoryview(data.view(np.uint8)))
  File "/usr/lib/python3.6/http/client.py", line 503, in readinto
    n = self.fp.readinto(b)
  File "/usr/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
  File "/usr/lib/python3.6/ssl.py", line 1012, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.6/ssl.py", line 874, in read
    return self._sslobj.read(len, buffer)
  File "/usr/lib/python3.6/ssl.py", line 631, in read
    v = self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer
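
Transient resets like this can often be ridden out by retrying the failing scan. A generic back-off wrapper, as a rough sketch (load stands in for mvftoms's own load function):

import time

def load_with_retries(load, *args, retries=5, initial_delay=10.0):
    """Retry a flaky network-backed load with exponential back-off."""
    delay = initial_delay
    for attempt in range(retries):
        try:
            return load(*args)
        except ConnectionResetError:
            if attempt == retries - 1:
                raise
            time.sleep(delay)
            delay *= 2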

add a WEIGHT_SPECTRUM column

Not having these weights may be limiting our dynamic range a bit. It was briefly mentioned in PR #62, but seems to have fallen through the cracks.
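
In the interim, an existing MS can be given the column with python-casacore. This is a rough sketch; the MS path and the num_chan/num_corr shape are example values:

import numpy as np
import casacore.tables as tables
from casacore.tables import makearrcoldesc, maketabdesc

ms = tables.table('myobs.ms', readonly=False)  # example MS path
num_chan, num_corr = 4096, 4                   # example data shape
desc = makearrcoldesc('WEIGHT_SPECTRUM', 0.0,
                      shape=[num_chan, num_corr], valuetype='float')
ms.addcols(maketabdesc(desc))
# Initialise it from the per-row WEIGHT column, repeated over channels
# (for a large MS, do this in row chunks rather than in one go)
weight = ms.getcol('WEIGHT')                   # shape (nrow, num_corr)
ms.putcol('WEIGHT_SPECTRUM', np.repeat(weight[:, np.newaxis, :], num_chan, axis=1))
ms.close()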

Getting corrupt MS from mvftoms

The MVF file I used is http://stgr1.sdp.mkat.chpc.kat.ac.za:7480/1535789408/1535789408_sdp_l0.full.rdb

I tried to inspect the resulting MS using the CASA listobs task and got the following error:

2018-09-25 13:57:32	INFO	listobs::::	
2018-09-25 13:57:32	INFO	listobs::::+	##########################################
2018-09-25 13:57:32	INFO	listobs::::+	##### Begin Task: listobs            #####
2018-09-25 13:57:32	INFO	listobs::::	listobs(vis="NGC4993_OFF_SEPT0118_RAW.ms",selectdata=True,spw="",field="",antenna="",
2018-09-25 13:57:32	INFO	listobs::::+	        uvrange="",timerange="",correlation="",scan="",intent="",
2018-09-25 13:57:32	INFO	listobs::::+	        feed="",array="",observation="",verbose=True,listfile="",
2018-09-25 13:57:32	INFO	listobs::::+	        listunfl=False,cachesize=50,overwrite=False)
2018-09-25 13:57:33	SEVERE	listobs::ms::close	Exception Reported: Exception: Illegal FIELD_ID value 27 found in main table. /data/sphe/NGC4993_SEPT01/msdir/NGC4993_OFF_SEPT0118_RAW.ms/FIELD only has 0 rows (IDs).
2018-09-25 13:57:33	SEVERE	listobs::ms::close+	... thrown by void casa::MSChecker::checkReferentialIntegrity() const at File: /var/rpmbuild/BUILD/casa-test/casa-test-5.0.101/code/msvis/MSVis/MSChecker.cc, line: 78
2018-09-25 13:57:33	SEVERE	listobs::::	*** Error *** Exception: Illegal FIELD_ID value 27 found in main table. /data/sphe/NGC4993_SEPT01/msdir/NGC4993_OFF_SEPT0118_RAW.ms/FIELD only has 0 rows (IDs).
2018-09-25 13:57:33	SEVERE	listobs::::+	... thrown by void casa::MSChecker::checkReferentialIntegrity() const at File: /var/rpmbuild/BUILD/casa-test/casa-test-5.0.101/code/msvis/MSVis/MSChecker.cc, line: 78
2018-09-25 13:57:33	SEVERE	ms::detached	ms is not attached to a file - cannot perform operation.
2018-09-25 13:57:33	SEVERE	ms::detached+	Call ms.open('filename') to reattach.
2018-09-25 13:57:33	INFO	listobs::::	##### End Task: listobs              #####
2018-09-25 13:57:33	INFO	listobs::::+	##########################################

I've been struggling to convert this file to an MS for almost a month now. Would it be possible for someone more familiar with katdal to get me the MS from it?

Error in van_vleck.py

Hi,

I've built katdal into an Ubuntu 20.04 container with all the prerequisites installed, and it seems to install fine via

pip3 install git+https://github.com/ska-sa/katdal.git

But when I come to run the mvftoms.py script (in the container), I get the following error:

File "/usr/local/bin/mvftoms.py", line 38, in
import katdal
File "/usr/local/lib/python3.8/dist-packages/katdal/init.py", line 22, in
from .datasources import open_data_source
File "/usr/local/lib/python3.8/dist-packages/katdal/datasources.py", line 31, in
from .vis_flags_weights import ChunkStoreVisFlagsWeights
File "/usr/local/lib/python3.8/dist-packages/katdal/vis_flags_weights.py", line 31, in
from .van_vleck import autocorr_lookup_table
File "/usr/local/lib/python3.8/dist-packages/katdal/van_vleck.py", line 26, in
def norm0_cdf(x, scale):
File "/usr/local/lib/python3.8/dist-packages/numba/np/ufunc/decorators.py", line 123, in wrap
vec = Vectorize(func, **kws)
File "/usr/local/lib/python3.8/dist-packages/numba/np/ufunc/decorators.py", line 38, in new
return imp(func, identity=identity, cache=cache, targetoptions=kws)
File "/usr/local/lib/python3.8/dist-packages/numba/np/ufunc/dufunc.py", line 82, in init
dispatcher = jit(_target='npyufunc',
File "/usr/local/lib/python3.8/dist-packages/numba/core/decorators.py", line 219, in wrapper
disp.enable_caching()
File "/usr/local/lib/python3.8/dist-packages/numba/np/ufunc/ufuncbuilder.py", line 101, in enable_caching
self.cache = FunctionCache(self.py_func)
File "/usr/local/lib/python3.8/dist-packages/numba/core/caching.py", line 610, in init
self._impl = self._impl_class(py_func)
File "/usr/local/lib/python3.8/dist-packages/numba/core/caching.py", line 347, in init
raise RuntimeError("cannot cache function %r: no locator available "
RuntimeError: cannot cache function 'norm0_cdf': no locator available for file '/usr/local/lib/python3.8/dist-packages/katdal/van_vleck.py'

In case it helps, here are the possibly relevant Python module versions via pip3 freeze:

cityhash==0.2.3.post9
dask==2021.9.0
katdal==0.19.dev1565+master.c3c57cd
katpoint==0.10
katsdptelstate==0.11
h5py==3.4.0
numba==0.54.0
numpy==1.17.4
PyJWT==2.1.0
python-casacore==3.2.0
requests==2.22.0

These all meet the requirements in setup.py, so I'm not sure what could be going wrong. Maybe there is an unstated numba version requirement?
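
This "no locator available" failure is a known Numba caching quirk that shows up when a cached ufunc lives somewhere Numba cannot map back to a source file (common in containers). Pointing Numba at an explicit cache directory before katdal is imported usually works around it:

import os

# Must be set before katdal (and hence numba) is imported
os.environ['NUMBA_CACHE_DIR'] = '/tmp/numba_cache'

import katdal  # noqa: E402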
