kyleaoman / martini

Mock spatially resolved spectral line observations of simulated galaxies.

Home Page: https://martini.readthedocs.io/

License: GNU General Public License v3.0

Language: Python 100.00%

Topics: astrophysics, radio-astronomy

martini's People

Contributors

kyleaoman

martini's Issues

json vs. html

It seems your ipynb notebook was accidentally stored in HTML format.

Improve mass insertion logging

Currently the total mass in the source is recorded in the log, and then the fraction of that mass that is inserted in the cube. This often leads to low fractions simply because much of the mass is outside of the datacube aperture. Should additionally record the mass in the source after it is pruned to the desired aperture, and also log fractions of this mass. It's useful to flag for the user both (i) when much of their source is outside of the aperture (which they might not have expected), and (ii) when significant amounts of mass that are in aperture don't end up in the cube (probably indicating that something has gone wrong in the approximations used in the SPH kernels or similar).
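A minimal sketch of the proposed diagnostics (the function and variable names are hypothetical, not martini's actual API):

import logging

logger = logging.getLogger("martini")


def log_mass_budget(m_total, m_aperture, m_cube):
    # m_total: mass in the source as loaded;
    # m_aperture: mass remaining after pruning to the datacube aperture;
    # m_cube: mass actually inserted in the cube.
    logger.info(f"Source mass: {m_total:.3e}")
    logger.info(
        f"Mass in aperture: {m_aperture:.3e} ({m_aperture / m_total:.1%} of source)"
    )
    logger.info(
        f"Mass inserted in cube: {m_cube:.3e} "
        f"({m_cube / m_aperture:.1%} of in-aperture mass)"
    )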

Channels in frequency mode by default

From a simulation point of view, velocity is the fundamental quantity that we convert to frequency. Realistic radio detectors, however, have frequency as the fundamental quantity, e.g. channels are evenly spaced in frequency. Martini supports switching back and forth between frequency and velocity channels for a DataCube object, but the current implementation initialises channels with an even velocity spacing, so that the frequency spacing is irregular. Instead, the cube should be initialised with a constant spacing in frequency and converted to velocity as needed. Either a velocity width or a frequency width should be accepted; if a velocity width is provided, it can be converted to a frequency spacing at the velocity of the central channel.
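A sketch of the velocity-to-frequency conversion using astropy (the example values are illustrative; in the radio convention a velocity width dv corresponds to a frequency width df = f_rest * dv / c, independent of channel position):

import astropy.units as U
from astropy.constants import c

f_HI = 1.420405751786 * U.GHz  # HI rest frequency

# Band centre: convert a systemic velocity to a frequency.
v_centre = 1000 * U.km / U.s
f_centre = v_centre.to(U.GHz, equivalencies=U.doppler_radio(f_HI))

# Channel width: a requested velocity width maps to a constant
# frequency spacing, evaluated at the central channel.
dv = 10 * U.km / U.s
df = (f_HI * dv / c).to(U.kHz)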

Source, DataCube objects can be in unusable state

The pruning mechanism of source modules and the padding mechanism of datacubes mean that they can be put into a state where they should not be passed to martini. For example, once a mock is made with a source (it has been pruned in the process), applying a rotation and trying to produce a second mock of the same source is invalid because the source has lost particles. Either the object should be flagged as no longer valid and martini modified to reject it, or pruned particles should be kept aside (or masked) within the object and restored after use (sketched below). Some treatment of datacube padding along the same lines would also be a good idea.
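A minimal sketch of the mask-and-restore idea (a hypothetical class, not martini's implementation):

import numpy as np


class MaskableParticles:
    def __init__(self, xyz):
        self.xyz = xyz
        self._mask = np.ones(len(xyz), dtype=bool)

    def prune(self, keep):
        # Narrow the active set instead of deleting particles.
        self._mask &= keep

    @property
    def active_xyz(self):
        return self.xyz[self._mask]

    def restore(self):
        # Undo all pruning so the source can be reused for another mock.
        self._mask[:] = True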

Tests for simulation-specific source modules

Implement unit tests? Hosting sample input data is the main hurdle. Consider using the Jupyter notebooks to fill this role; I think they can be used in automated testing, but the sample data remains the hurdle.

`save_current_rotation` ambiguous with translations

If a source undergoes a mix of rotations and translations, saving the "current rotation" with `SPHSource.save_current_rotation("cache.txt")` and then applying that result to an untransformed source with `SPHSource.rotate(rotmat=np.loadtxt("cache.txt"))` will (perhaps unintuitively) not necessarily give back the transformed source.

Options to remedy this include:

  • Raise an exception on trying to use save_current_rotation if a translation has been applied.
  • Emulate the 4x4 "transformation matrix" that combines arbitrary rotations and translations, as I used in swiftgalaxy, to enable writing out the transformation history of a source (sketched after this list).
  • More radically, could add an option to serialize the entire source object, but having the option to save a lightweight 3x3 or 4x4 matrix is probably still worthwhile.
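A sketch of the 4x4 homogeneous-matrix option (the standard construction, not swiftgalaxy's actual code):

import numpy as np


def homogeneous(rotmat=np.eye(3), translation=np.zeros(3)):
    # Pack a 3x3 rotation and a translation into one 4x4 matrix; a
    # history of mixed operations then composes by matrix multiplication.
    M = np.eye(4)
    M[:3, :3] = rotmat
    M[:3, 3] = translation
    return M


def apply_transform(M, xyz):
    # xyz has shape (N, 3): append a homogeneous coordinate, transform, drop it.
    xyzw = np.hstack([xyz, np.ones((len(xyz), 1))])
    return (xyzw @ M.T)[:, :3]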

Kernel helper

A helper function that makes histograms of particle smoothing lengths in pixel units, annotated with kernel validity intervals, would be useful. Error messages for kernel validation failures should hint at it.
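A sketch of such a helper (the names and validity interval are illustrative):

import matplotlib.pyplot as plt


def smoothing_length_histogram(h_px, valid_interval, bins=50):
    # h_px: particle smoothing lengths in units of the pixel scale.
    fig, ax = plt.subplots()
    ax.hist(h_px, bins=bins)
    # Shade the interval where the chosen kernel approximation is valid:
    ax.axvspan(*valid_interval, alpha=0.2, label="kernel valid")
    ax.set_xlabel("smoothing length [px]")
    ax.set_ylabel("N particles")
    ax.legend()
    return fig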

Parallelize calculation of spectra?

The evaluation of the particle spectra is the second most expensive calculation in martini and could potentially be done in parallel. The parallelization of the core source insertion loop could serve as a rough template. Currently everything happens in this one statement:

        raw_spectra = self.spectral_function(
            # lower channel edges, tiled to one copy per particle:
            (
                np.tile(
                    channel_edges.to_value(channel_edges_unit)[:-1], vmids.shape + (1,)
                )
                * channel_edges.unit
            ).astype(self.spec_dtype),
            # upper channel edges, tiled to one copy per particle:
            (
                np.tile(
                    channel_edges.to_value(channel_edges_unit)[1:], vmids.shape + (1,)
                )
                * channel_edges.unit
            ).astype(self.spec_dtype),
            # particle velocity centroids, tiled to one copy per channel:
            (
                np.tile(
                    vmids.to_value(vmids_unit),
                    np.shape(channel_edges[:-1]) + (1,) * vmids.ndim,
                ).T
                * vmids.unit
            ).astype(self.spec_dtype),
        ).to_value(U.dimensionless_unscaled)

Would just need to split vmids into chunks, evaluate the same thing in parallel, and assemble the result. The overhead could plausibly make the code slower for small to moderate numbers of particles, but could speed up particle-heavy use cases.
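A minimal sketch of the chunking idea, glossing over the unit handling and tiling above (multiprocess is the optional dependency martini already uses for parallel source insertion; evaluate_chunk stands in for the expression above applied to a subset of vmids):

import numpy as np
from multiprocess import Pool


def spectra_parallel(evaluate_chunk, vmids, nproc=4):
    # Split the particle velocities into ~equal chunks, evaluate each
    # chunk's spectra in a separate process, then restack the results
    # in the original particle order.
    chunks = np.array_split(vmids, nproc)
    with Pool(nproc) as pool:
        results = pool.map(evaluate_chunk, chunks)
    return np.concatenate(results, axis=0)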

Move source rotations and translations to a later step

Currently, as soon as the source is initialised, the particles are rotated and translated according to any provided rotation and to the (RA, Dec, D). Allowing a source to be manipulated after it is initialised (e.g. when consecutively making multiple mocks of one source from different viewing angles) could make some workflows more efficient. This feature is also important for simulation-specific source modules, where the user probably wants to inspect the obtained particles and orient them before making a mock, without having to separately extract the corresponding particles from the simulation and derive a rotation. The (RA, Dec, D) rotations and translations could be applied at the last minute by the Martini module when it obtains the particles from the source.

Improve log messages

Add log messages with the input HI mass and some diagnostics along the way: the HI mass that survives pruning, the mass inserted in the cube, the mass in the final cube, and the flux of the final cube.

Tests error out or fail without additional dependencies

Describe the bug

  • Tests do not run without installing h5py.
  • Tests do not pass without installing additional dependencies (matplotlib, multiprocess).

To Reproduce

$ conda create -n martini python=3.7 
$ conda activate martini
$ pip install pytest
$ pip install -e .
$ pytest
...
============================================== short test summary info ==============================================
ERROR tests/test_martini.py
ERROR tests/test_write.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 2 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================= 2 errors in 0.11s =================================================
$ pip install h5py
$ pytest
...
============================================== short test summary info ==============================================
FAILED tests/test_martini.py::TestMartini::test_preview - ModuleNotFoundError: No module named 'matplotlib'
FAILED tests/test_martini.py::TestParallel::test_parallel_consistent_with_serial[True] - ModuleNotFoundError: No module named 'multiprocess'
FAILED tests/test_martini.py::TestParallel::test_parallel_consistent_with_serial[False] - ModuleNotFoundError: No module named 'multiprocess'
FAILED tests/test_martini.py::TestGlobalProfile::test_preview - ModuleNotFoundError: No module named 'matplotlib'
FAILED tests/test_martini.py::TestGlobalProfile::test_view_spectrum - ModuleNotFoundError: No module named 'matplotlib'
FAILED tests/test_sources.py::TestSPHSource::test_preview - ModuleNotFoundError: No module named 'matplotlib'
FAILED tests/test_sources.py::TestSPHSource::test_preview_save[pdf] - ModuleNotFoundError: No module named 'matplotlib'
FAILED tests/test_sources.py::TestSPHSource::test_preview_save[png] - ModuleNotFoundError: No module named 'matplotlib'
FAILED tests/test_spectral_models.py::TestParallelSpectra::test_parallel_spectra[GaussianSpectrum] - ModuleNotFoundError: No module named 'multiprocess'
FAILED tests/test_spectral_models.py::TestParallelSpectra::test_parallel_spectra[DiracDeltaSpectrum] - ModuleNotFoundError: No module named 'multiprocess'
==================================== 10 failed, 887 passed, 7 xfailed in 57.08s =====================================

Expected behaviour

Tests should pass or be skipped if optional dependencies are not installed.
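One standard way to get the skip behaviour is pytest's importorskip helper, a minimal sketch:

import pytest

# At the top of a test module (or inside a test) that needs an optional
# dependency; the affected tests are reported as skipped instead of failing:
matplotlib = pytest.importorskip("matplotlib")
multiprocess = pytest.importorskip("multiprocess")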

Additional context
Review for pyOpenSci/software-submission#164.

Preview function for `Martini`

It would be good to extend the preview functionality to Martini, too. This can simply reuse the existing preview function from SPHSource, but draw a box or truncate the axes according to the datacube aperture and bandwidth. This could be done in the "observed" frame (RA, Dec, ...), but then a separate plotter is needed (or some extensive modification of the existing one). A simpler option is to let Martini call SPHSource.preview, grab the figure instance (we may want to return the axes, too), query the datacube and source to work out the conversions (arcsec -> kpc and so on), then draw the approximate cube target region in "physical" coordinates.

Option to serialize a source

It's probably reasonable to expect a user to be able to hold (1) the particle data of the source and (2) the datacube in memory simultaneously as a minimum memory requirement. As currently implemented, however, martini is more memory-hungry than this because it pre-computes and stores the spectra of all particles simultaneously, requiring nchannels * nparticles floating point values, which for some setups could be a very large number.

Pre-computing the spectra is done for speed, but it would be possible to cut down on memory by segmenting the source into a series of chunks of ~equal numbers of particles. The simplest implementation would be to run the source insertion loop once for each such chunk, which is a significant hit in terms of CPU time. Giving some flexibility to keep memory requirements under control would be good, though. Other options could also be explored, like optionally using a memmap to store the spectra on disk rather than in memory; this costs some access speed but is maybe better than running the main loop more than once.
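A sketch of the memmap option (the filename and sizes are illustrative):

import numpy as np

# Disk-backed array with the same interface as an in-memory ndarray;
# slices are read lazily, trading access speed for memory.
nparticles, nchannels = 1_000_000, 512
spectra = np.memmap(
    "martini_spectra.dat",
    dtype=np.float32,
    mode="w+",
    shape=(nparticles, nchannels),
)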

Make use of beam_angular_area equivalency

Astropy provides a beam_angular_area equivalency. Use this instead of our own implementation (currently in martini.beams.GaussianBeam.arcsec_to_beam). Each beam class can instead define its angular area in steradians for use with the astropy helper.
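For reference, a sketch of the astropy usage (the beam size is illustrative; the solid angle of a Gaussian beam is Omega = pi * bmaj * bmin / (4 ln 2)):

import numpy as np
import astropy.units as U

bmaj = bmin = 30 * U.arcsec
omega_B = (np.pi * bmaj * bmin / (4 * np.log(2))).to(U.sr)

# astropy's equivalency then converts Jy/beam <-> Jy/sr directly:
flux_density = 1.0 * U.Jy / U.beam
print(flux_density.to(U.Jy / U.sr, equivalencies=U.beam_angular_area(omega_B)))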

Document source inspection, add preview tool

Some example code showing how to inspect and manipulate source particles before making the data cube would be a good addition to the docs. The following could be a starting point (the plot_yz function could also be added to the source module code as a "preview" function):

import numpy as np
import unyt as u
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
from astropy import units as U
from swiftgalaxy import SWIFTGalaxy, Velociraptor
from velociraptor import load as load_catalogue
from os import path
from martini.sources import SWIFTGalaxySource
from martini.sources._L_align import L_align

np.seterr(divide="ignore", invalid="ignore")

snapnum = 23
base_dir = "106e3_104b2_norm_0p3_new_cooling_L006N188"
velociraptor_filebase = path.join(base_dir, f"halo_{snapnum:04d}")
snapshot_filename = path.join(base_dir, f"colibre_{snapnum:04d}.hdf5")

catalogue = load_catalogue(f"{velociraptor_filebase}.properties")
# select centrals with vmax > 80km/s
target_mask = np.logical_and.reduce(
    (
        catalogue.ids.hosthaloid == -1,
        catalogue.velocities.vmax > 80 * u.km / u.s,
    )
)
target_halo_indices = np.argwhere(target_mask).flatten()
# pick the first one from the list, why not...
target_halo_index = target_halo_indices[0]

sg = SWIFTGalaxy(
    snapshot_filename=snapshot_filename,
    halo_finder=Velociraptor(
        velociraptor_filebase=velociraptor_filebase,
        halo_index=target_halo_index,
    ),
)

source = SWIFTGalaxySource(
    sg,
    distance=3 * U.Mpc,
    vpeculiar=0 * U.km / U.s,
    rotation=dict(rotmat=np.eye(3)),  # identity matrix, we'll rotate "by hand"
    ra=0 * U.deg,
    dec=0 * U.deg,
)
# source.coordinates_g is an astropy CartesianRepresentation object.
# It has an attached CartesianDifferential object for the velocities.
# See https://docs.astropy.org/en/stable/coordinates/representations.html


def plot_yz(source, pdffile, prune=np.s_[::25], title=""):
    # prune: by default plot every 25th particle.
    fig = plt.figure(1)
    plt.clf()
    sp = fig.add_subplot(1, 1, 1, aspect="equal")
    scatter = sp.scatter(
        source.coordinates_g.y[prune].to_value(U.kpc),
        source.coordinates_g.z[prune].to_value(U.kpc),
        c=source.coordinates_g.differentials["s"].d_x[prune].to_value(U.km / U.s),
        marker="o",
        s=1,
        vmin=-150,
        vmax=150,
        cmap="RdBu",
    )
    sp.set_xlabel(r"$y\,[\mathrm{kpc}]$")
    sp.set_ylabel(r"$z\,[\mathrm{kpc}]$")
    sp.set_xlim((-30, 30))
    sp.set_ylim((-30, 30))
    sp.set_title(title)
    cb = fig.colorbar(mappable=scatter)
    cb.set_label(r"$v_x\,[\mathrm{km}\,\mathrm{s}^{-1}]$")
    plt.savefig(pdffile, format="pdf")


with PdfPages("forKatharina.pdf") as pdffile:
    plot_yz(source, pdffile, title="unrotated")
    # This is the same as you would get using
    # source.rotate(L_align=(0 * U.deg, 0 * U.deg, 270 * U.deg))
    # but you can adapt as needed:
    rotmat = L_align(
        source.coordinates_g.xyz,
        source.coordinates_g.differentials["s"].d_xyz,
        source.mHI_g,
        frac=0.3,
        Laxis="x",
    )
    source.rotate(rotmat=rotmat)
    plot_yz(source, pdffile, title="face-on")
    # rotation about the galactic pole:
    source.rotate(axis_angle=("x", 45 * U.deg))
    plot_yz(source, pdffile, title="face-on, polar rotation 45deg")
    # inclination:
    source.rotate(axis_angle=("y", 60 * U.deg))
    plot_yz(source, pdffile, title="inclined 60deg")

    # SWIFTGalaxy has already re-centred (x,y,z)=(0,0,0) on the velociraptor
    # centre of potential, and the (vx,vy,vz)=(0,0,0) on the velociraptor
    # subhalo bulk velocity. The centre of potential is already what we want,
    # but the bulk velocity probably does not result in the central gas
    # particles having 0 net velocity, so we can adjust like this:
    r = np.sqrt(np.sum(np.power(source.coordinates_g.xyz.to_value(U.kpc), 2), axis=0))
    rsort = np.argsort(r)
    # we'll reject particles with small HI fractions with this:
    HI_mask = (source.mHI_g / source.mHI_g.max()).to_value(U.one) > 0.3
    # LoS velocities:
    vxyz = source.coordinates_g.differentials["s"].d_xyz
    # mean velocity of 100 innermost HI-rich gas particles:
    vxyz0 = np.mean(vxyz[:, rsort][:, HI_mask[rsort]][:, :100], axis=1)
    # In principle we only need to adjust the LoS velocity, but if we adjust
    # the 3D velocity the correction will be ok for any orientation.
    source.boost(-vxyz0)
    plot_yz(source, pdffile, title="inclined 60deg, with velocity recentred")

Streamline SPH kernels

The recommended best practice is to use something like AdaptiveKernel((CubicSplineKernel, DiracDeltaKernel, GaussianKernel)), so why not just implement this directly? Conceptually: CubicSplineKernel = _AdaptiveKernel((_CubicSplineKernel, _DiracDeltaKernel, _GaussianKernel)). This would remove some technical burden from users.

Add a global profile mode

Sometimes you just want a spectrum. This calls for simplified output formats and simplified/optimised source insertion: only a single "pixel" is needed, with the convolution of the beam and SPH kernel calculated for each particle. This could be done via a new "dataspectrum" module mirroring the datacube module and a new "insert_source_in_spectrum" function mirroring insert_source_in_cube.

Support for multiple source insertion

Inserting more than one source into a DataCube seems like a reasonable workflow to support. Need to consider best implementation. Some options:

  • Martini accepts a list of sources instead of a single source and simply inserts each one in turn? Since iteration is over pixels, this is probably inefficient.
  • Martini accepts a list of sources but combines them into a single source before inserting into cube. A CombinedSource module to facilitate this? Questions arise like what is the notional position or vsys of a combined source.
  • Don't add any dedicated features but support with example workflows in documentation.

Source insertion into existing cube

Might be good to support inserting a source into a cube that is read in from file. This implies initialising a datacube from file, which is easy enough. However if the cube is already beam-convolved (which seems the most likely case) then we need a separate cube internally until convolution is done.

Given the above, maybe it makes sense not to load the file into a DataCube but to use it to guide the creation of one (matching channel and pixel sizes; a subregion could be accepted as an argument to specify a target insertion location?). Some functions to mostly automate the process, plus some cookbook examples, should about do it.

Insert noise by desired RMS

Currently the RMS of the noise to be generated is the raw (pre-convolution) RMS. Once the cube is smoothed by the beam, the RMS is lower. However, the user usually knows the RMS desired in the final mock observation (i.e. after convolution with the beam). It's straightforward to calculate the raw RMS required to produce a desired RMS after convolution with a Gaussian smoothing kernel, so the desired RMS should be accepted as the parameter instead of the raw RMS.
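The inversion is general for any convolution kernel: white noise with per-pixel RMS sigma_raw convolved with a kernel k has output RMS sigma_raw * sqrt(sum(k**2)). A minimal sketch (names are illustrative):

import numpy as np


def raw_rms_for_target(target_rms, kernel):
    # Invert sigma_out = sigma_raw * sqrt(sum(k**2)) for sigma_raw.
    return target_rms / np.sqrt(np.sum(np.asarray(kernel) ** 2))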

is there a standard way to store HDF5 image cubes?

At ADASS recently, the CARTA visualization tool was shown using a new IDIA standard (?) scheme to efficiently store large cubes, but it serves just as well for normal cubes. It of course looks much like your HDF5 files, but if we're going to write cubes in HDF5 instead of FITS, we might as well use some kind of standard in astronomy. I just don't know if there is one.
See https://github.com/idia-astro/cpp_hdf5converter for the code they use to create these HDF5 files.

TypeDict not found in numpy

Nadine reports that on Ilifu the TypeDict class (used in the martini demo) is not found. We probably need to specify a stricter version dependency for numpy (?).

two classes EAGLESource ?

I could not reproduce the martini_eagle notebook, but then continued with the Python-converted one. The first error is:
TypeError: __init__() got an unexpected keyword argument 'db_user'
so I commented this out, and then got another error from which it appears that there is a second EAGLESource class, now appearing in the installed sources.py, which seems to use local files. The parent sources.py in the GitHub repo is very different. I guess this is some Python trick that newbies like me are not used to, but it is very confusing. Is this related to the optional pip installs mentioned in the installation notes?

Problem with the IllustrisTNG example

I am trying to run the Illustris TNG example in the Illustris server, but I found the following error:

from martini import tngdemo; tngdemo()

AttributeError                            Traceback (most recent call last)
----> 1 from martini import tngdemo; tngdemo()

/opt/conda/lib/python3.6/site-packages/martini/_tngdemo.py in tngdemo(cubefile, beamfile, hdf5file)
    123 # pixel scale) and consulting the documentation of the sph_kernels sub-
    124 # module to select an appropriate kernel.
--> 125 sph_kernel = GaussianKernel.mimic(
    126     CubicSplineKernel(rescale_sph_h=.5),
    127     truncate=3

AttributeError: type object 'GaussianKernel' has no attribute 'mimic'

Thanks for your help!

Toggle for Stokes axis

Add a keyword argument to DataCube initialisation, stokes=True/False; the Stokes axis can often be omitted for convenience.

DataCube.load_state unsafe with change in beam?

I think that as currently implemented, a datacube could be loaded with DataCube.load_state with a padding region that is too small for the beam. Roughly:

m = Martini(beam=small_beam, ...)
m.insert_source_in_cube()
m.datacube.save_state(cachefile)
del m
m = Martini(beam=big_beam, ...)
m.load_state(cachefile)  # I'll just save some compute time to make a lower angular resolution mock...
m.convolve_beam()  # pad is too small!

Fourier filtering

Filter out emission on a range of scales according to a set of interferometer baselines.

provenance in fits files

It would be nice to have a record of the processing history (e.g. original filenames, etc.), such that one can read how the cube was made in the HISTORY or COMMENT section of the FITS file.
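A sketch of how this could look with astropy (the card contents are illustrative):

from astropy.io import fits

hdu = fits.PrimaryHDU()
# astropy appends one HISTORY or COMMENT card per call:
hdu.header.add_history("Cube produced by MARTINI")
hdu.header.add_history("Source snapshot: colibre_0023.hdf5")
hdu.header.add_comment("Beam: Gaussian, 30x30 arcsec")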
