realfastvla / realfast
Real-time interferometric data analysis for the VLA
Home Page: http://realfast.io
License: BSD 3-Clause "New" or "Revised" License
Nominal plan that needs to be finalized:
Open question:
We will generate visibility snippets to be archived. What metadata and format should they have?
Baseline plan is to write SDMs with sdmpy without dedispersion correction.
Is it adequate to modify metadata only for new time range?
We have the ability to simulate transients, but do not regularly use it in real-time observing. It would be valuable to implement an end-to-end plan for adding, finding, and tracking mock transients as a regular check of data quality.
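As a rough illustration of the "tracking" part of such a plan, the sketch below matches a log of injected mock transients against detected candidates and reports the fraction recovered. All names here (`Mock`, `Candidate`, `match_mock`, `recovery_fraction`) and the tolerance values are hypothetical and are not taken from the realfast code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mock:
    """Parameters of an injected mock transient (hypothetical record)."""
    time_s: float
    dm: float
    snr: float


@dataclass
class Candidate:
    """Parameters of a detected candidate (hypothetical record)."""
    time_s: float
    dm: float
    snr: float


def match_mock(mock: Mock, candidates: List[Candidate],
               dt_s: float = 0.01, ddm: float = 10.0) -> Optional[Candidate]:
    """Return the highest-SNR candidate within tolerance of the mock, or None."""
    near = [c for c in candidates
            if abs(c.time_s - mock.time_s) <= dt_s and abs(c.dm - mock.dm) <= ddm]
    return max(near, key=lambda c: c.snr) if near else None


def recovery_fraction(mocks: List[Mock], candidates: List[Candidate],
                      dt_s: float = 0.01, ddm: float = 10.0) -> float:
    """Fraction of injected mocks that have a matching candidate."""
    if not mocks:
        return 0.0
    found = sum(match_mock(m, candidates, dt_s, ddm) is not None for m in mocks)
    return found / len(mocks)
```

Logging the recovery fraction per observation would give one simple number to monitor over time; the tolerances would need to be tuned to the actual time and DM resolution of the search.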
Martin Pokorny has ideas.
R6: A more granular schedule, with definite targets, should be developed to track progress. This schedule shall be shared with affected aspects of NRAO to avoid priority collisions.
queue_monitor can accidentally parse the wrong mergepkl if two scans finish at the same time. In one case, the two scans were from different SBs, so the scans to archive were not correctly identified.
Offending code is:
    for jobid in jobids:
        ...
        if job == finishedjobs[-1]:
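One possible way to avoid this is to look up each merge pickle by the scheduling block and scan of the job that produced it, instead of relying on a job's position in `finishedjobs`. The sketch below assumes each finished job carries its SB name and scan number; `FinishedJob`, `mergepkl_name`, `mergepkls_to_parse`, and the file naming scheme are hypothetical and are not the actual queue_monitor code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FinishedJob:
    """A finished job tagged with the scan it processed (hypothetical record)."""
    jobid: str
    sb_name: str  # scheduling block the scan belongs to
    scan: int


def mergepkl_name(job: FinishedJob) -> str:
    """Build the merge pickle name for one job (naming scheme is assumed)."""
    return f"cands_{job.sb_name}_sc{job.scan}_merge.pkl"


def mergepkls_to_parse(finished: List[FinishedJob]) -> Dict[str, str]:
    """Map every finished job ID to the merge pickle of its own SB and scan."""
    return {job.jobid: mergepkl_name(job) for job in finished}
```

With this mapping, two scans that finish in the same polling cycle each resolve to their own file, whether or not they belong to the same SB.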
Need to finalize nominal plan for purchasing 64 GPUs:
Does choice of GPU affect/limit how we program for it?
Related question: is numba efficient for GPUs?
Currently the realfast system runs by setting the realfast@cbe-node-01 path to include @caseyjlaw software. Bad!
A group-controlled realfast user needs to be defined.
Work with the NRAO SSA group to create an ICD or requirements document to define the realfast / archive interface.
During a science run for 17A-396, we noticed a bdf remaining after archiving completed. It was the last scan and a calibrator scan. The SDM downloaded from the NRAO archive included the bdf, so that seems to have worked properly.
It is likely a bug related to bdf cleanup within the realfast code.
CBE will write SDMs for realfast to process. The metadata in those SDMs is not well defined yet, as they are a new data product.
The integration of realfast with the VLA observing system should be a prototype of a general system for third-party systems. As such, a general interface needs to be designed for systems like this.
R5: Identify and document (ICD) all external interfaces to the realfast system. Major subsystems within realfast should consider similar documentation.
Definition of data formats:
Can realfast operate commensally with VLA OTF mode?
Potential issues:
We have a working prototype that uses jupyterhub as a front end. Do we want to redevelop the front end for an internal portal or further develop notebooks for use with jupyterhub?
Development of the algorithm and science behind the concept of periodicity imaging. Open questions:
If all goes well, write it up!
Having "magic disk" and GPU hardware would make testing more effective and informative for the eventual cluster order.
VOEventdb is a way of searching a history of VOEvents.
Could use realfast server for this, too.
See https://github.com/timstaley/voeventdb.
R2: Elaborate the scope, design and operation of the portal. Include all applicable interfaces (both software and hardware). Discuss the design with the NRAO SSA group.
Would be nice to get some rudimentary version of rfpipe (using CPUs via dask distributed scheduler) on CBE for NRAO review.
End of scheduling block process will trip up if a bdf never arrives. A few potential issues here:
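The case of a BDF that never arrives could be guarded with a bounded wait, so the end-of-SB process can move on and report the missing file instead of blocking. This is a minimal sketch; `wait_for_bdf` and its default timeout and polling interval are assumptions, not part of the realfast code.

```python
import os
import time


def wait_for_bdf(path: str, timeout_s: float = 60.0, poll_s: float = 1.0) -> bool:
    """Return True once `path` exists, or False if it has not appeared in time."""
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False  # caller decides how to log or skip the missing scan
        time.sleep(poll_s)
```

A caller would treat a `False` return as "scan has no data" and continue with the remaining scans of the scheduling block.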
Prototypes and new features coming together:
A test would motivate integration and answer rudimentary questions about performance before making a big purchase.
Rudimentary co-observing working with LWA (from @xiggystardust), but how to manage this for:
R4: Develop a commissioning plan and explicitly identify the tests which will mitigate the risks identified.
There are several areas of risk to the operations of the VLA:
Many limits will be empirically determined, so a robust process to identify, monitor, and organize results must be defined.
R3: Work with the appropriate NRAO stakeholders to define the policy for realfast data, proprietary periods, access controls, etc.
Project plan graphically designed at: https://drive.google.com/open?id=0B7Md1PdoX8joQ2JGbTNGbmFHRFU. These times and issues need to be put into github.
September 2016 meeting notes at: https://docs.google.com/document/d/1sVUEs7_nO9kfFOml7mkK1W6e034KBGJu64Qsl_CxnZg.
Need detailed diagram of all systems and interfaces
First draft from years ago:
https://plus.google.com/photos/106898753996583382613/albums/5888271939578516929?authkey=CLjAy9PC38-HGg
Whiteboard exercise
Need to define path to get output visibility cutouts into archive.
Currently Paul's sdmpy library can create cutouts. Those products need to be ingested to archive system.
Key questions:
Get a new server with GPUs on an InfiniBand connection to the CBE.
NRAO PMD recommends some documents to support the realfast development process.
Candidate detections will create visibility data to be archived. However, many candidates will be bad and should be cut before archiving.
Initially, this will require human feedback ("Astronomer on duty", AOD) to remove bad candidates and reduce the archived data rate. It is expected that this can be done on a ~day cadence.
Data volumes and rates are large and may be limiting either in the correlator or within the realfast cluster. Using compression/decompression may help trade compute for data rate limitations.
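To make the compute-for-bandwidth trade concrete, the sketch below packs samples as 32-bit floats and deflates them with the standard-library `zlib` module. The function names are hypothetical, and the achievable ratio depends entirely on the data: noise-dominated visibilities may compress very little with a lossless scheme like this one.

```python
import zlib
from array import array
from typing import List, Sequence


def compress_vis(samples: Sequence[float], level: int = 1) -> bytes:
    """Pack samples as float32 and deflate them; a low level favors speed."""
    raw = array("f", samples).tobytes()
    return zlib.compress(raw, level)


def decompress_vis(blob: bytes) -> List[float]:
    """Inverse of compress_vis: inflate and unpack the float32 samples."""
    out = array("f")
    out.frombytes(zlib.decompress(blob))
    return out.tolist()
```

Timing `compress_vis` on representative data, and comparing the compressed size with the raw size, would show whether the CPU cost is worth the reduction in data rate.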
Prior art:
The current implementation uses Jupyterhub and elasticsearch, both hosted at http://realfast.berkeley.edu.
Need to define and plan for electrical upgrades required to support realfast.
James Robnett knows details.