j535d165 / coronawatchnl

Numbers concerning COVID-19 disease cases in The Netherlands by RIVM, LCPS, NICE, ECML, and Rijksoverheid.

License: Creative Commons Zero v1.0 Universal

Python 10.26% R 6.17% Shell 0.17% HTML 6.51% Jupyter Notebook 76.90%
coronavirus coronavirus-tracking coronavirus-globaloutbreak coronavirus-real-time rivm netherlands covid-19 utrecht-university dataset covid19-data

coronawatchnl's Introduction

Dear CoronaWatchers,

One year after the start of the CoronaWatchNL project, the coronavirus is still with us. As a community, we made an extensive collection of data on COVID-19 case counts Findable, Accessible, Interoperable, and Reusable (FAIR) (Wilkinson, M. D. et al., 2016). During the first wave of the COVID-19 outbreak in The Netherlands, our project was the primary source of structured open data for researchers, hospitals, (local) governments, and the public. In June 2020, RIVM started to publish their first open and structured data on COVID-19 case counts. More and more users of our project migrated to the RIVM open data.

It's essential to have data openly available and FAIR. Data is an important building block for today's research and policymaking. We see that many researchers and organizations struggle with making data and software FAIR. It is important to realize that FAIRness of data is a step-by-step process, and there is no such thing as a perfectly FAIR dataset. We see that many organizations with an important role in the COVID-19 pandemic in the Netherlands have started to publish their data openly and take their first steps in making data FAIR. It is still far from perfect, but we are moving forward.

For the award-winning CoronaWatchNL project, this implies we are no longer the connecting link between the user and the suppliers of the COVID-19 data. Users can now make use of the data of RIVM, LCPS, and NICE directly. We have become largely redundant, and therefore we decided to no longer update the project. Our main goal was to become redundant, so we are pleased with the outcome. We will keep an eye on the developments and stay in contact with the suppliers of COVID-19 data. The journey for them has only just begun.

Users who are still using our data should migrate. Our main sources of data in CoronaWatchNL were RIVM, LCPS, and NICE. Most of the datasets we offer are nowadays available on their websites. See the following sources for more information:

Now that we are no longer updating the data, I would like to thank the CoronaWatchNL community and users. This project was an open community project from the start. Without the help of more than 50 CoronaWatchers, it wouldn't have been possible to collect this amount of data for more than a year. The importance and quality of the data collection were widely recognized in academia, and we were awarded the Dutch Data Prize 2020. Hopefully, we will see each other in the future in a new project!

Feel free to contact me with questions: [email protected].

Best regards,

Jonathan de Bruin

corona_artwork.jpg

Dataset: COVID-19 case counts in The Netherlands

CoronaWatchNL collects numbers on COVID-19 disease case counts in The Netherlands. The numbers are collected on a daily basis from various sources, like RIVM (National Institute for Public Health and the Environment), LCPS (Landelijk Coördinatiecentrum Patiënten Spreiding), NICE (Nationale Intensive Care Evaluatie), and the National Corona Dashboard. This project standardizes and publishes data and makes it Findable, Accessible, Interoperable, and Reusable (FAIR). We aim to collect a complete time series and prepare a dataset for reproducible analysis and academic use.

Dutch:

CoronaWatchNL verzamelt ziektecijfers over COVID-19 in Nederland. Dagelijks worden de cijfers verzameld van het RIVM (Rijksinstituut voor Volksgezondheid en Milieu), LCPS (Landelijk Coördinatiecentrum Patiënten Spreiding), NICE (Nationale Intensive Care Evaluatie) en het Nationale Corona Dashboard. Dit project standaardiseert en publiceert de gegevens en maakt ze vindbaar, toegankelijk, interoperabel en herbruikbaar (FAIR). We streven ernaar om een dataset beschikbaar te stellen voor reproduceerbare analyses en wetenschappelijk gebruik.

Datasets

The datasets available on CoronaWatchNL are updated on a daily basis. Availability depends on the publication by the respective sources (N.B. since July 1st, the epidemiological reports published by RIVM will be released on a weekly instead of a daily basis). The CoronaWatchNL project divides the datasets into four main categories:

For (interactive) applications based on these datasets, have a look at the applications folder. For predictive models based on these datasets, check out the parallel repository CoronaWatchNL Extended. Please note that the intention of these (too) simplistic models - made by CoronaWatchNL volunteers - is to show how the data can be used for modelling, not to answer specific hypotheses or follow scientific protocol.

Please see the Remarks document for notes about the datasets. Do you have remarks? Please let us know.

Geographical datasets

Reference time: 10:00 AM

These datasets describe the new and cumulative number of confirmed, hospitalized, and deceased COVID-19 cases. Every day, RIVM retrieves the data from the central database OSIRIS at 10:00 AM. Here, the datasets are categorized by their geographical level (i.e., national, provincial, municipal).

For more detail about the specific structure of these geographical datasets, have a look at the data-geocodebook.

Dataset | Source | Variables
Reported case counts by date in NL | RIVM | Date, Type (Total, hospitalized and deceased COVID-19 cases), (Cumulative) Count
Reported case counts by date in NL per province | RIVM | Date, Province, Type (Total, hospitalized and deceased COVID-19 cases), (Cumulative) Count
Reported case counts by date in NL per municipality | RIVM | Date, Municipality, Province, Type (Total, hospitalized and deceased COVID-19 cases), (Cumulative) Count

Reference time: by day (0:00 AM)

These datasets describe the new and cumulative number of confirmed, hospitalized, and deceased COVID-19 cases per day. The data is retrieved from the central database OSIRIS and counts the number of cases per day (0:00 AM) by RIVM. The dataset concerns numbers on a national level.

For more detail about the specific structure of this geographical dataset, have a look at the data-geocodebook.

Dataset | Source | Variables
Case counts by date in NL | RIVM | Date, Type (Total, hospitalized and deceased patients), (Cumulative) Count

Visualizations geographical data

To get a better picture of the content of the geographical datasets, have a look at the following visuals. These visuals show the development of the COVID-19 disease outbreak on a national level.

plots/map_province.png

Descriptive datasets

The datasets in this section describe the new and cumulative number of confirmed, hospitalized, and deceased COVID-19 cases per day and contain variables like age and sex.

For more detail about the specific structure of these descriptive datasets, have a look at the data-desccodebook.

Dataset | Source | Variables
Case counts in NL per age | RIVM | Date, Age group, Type (Total, hospitalized and deceased COVID-19 cases), (Cumulative) Count
Case counts in NL per sex | RIVM | Date, Sex, Type (Total, hospitalized and deceased COVID-19 cases), (Cumulative) Count
Deceased case counts in NL per sex and age group | RIVM | Date, Age group, Sex, (Cumulative) Count of deceased cases

Visualizations descriptive data

The graphs below visualize the development of the COVID-19 disease outbreak per sex and age group.

Intensive care datasets

The intensive care datasets describe the new and cumulative number of COVID-19 intensive care unit (ICU) admissions per day. The datasets are categorized by their source. Whereas RIVM reports COVID-19 hospital admissions, CoronaWatchNL collects the COVID-19 related intensive care data from LCPS and NICE.

  • RIVM reports hospitalized COVID-19 cases, including - but not limited to - the intensive care unit (ICU) admissions. These are the largest numbers and most inclusive counts.
  • NICE only reports COVID-19 cases that are admitted to the ICU.
  • LCPS, similarly to NICE, reports COVID-19 ICU admissions. However, LCPS tries to compensate for the reporting lag, by estimating its size and adding it to the numbers reported by NICE.

For more detail about the specific structure of the intensive care datasets, have a look at the data-iccodebook.

Dataset | Source | Variables
COVID-19 intensive care patient counts in NL | Stichting NICE | Date, New, Total and Cumulative ICU admissions per day, Number of ICUs with at least one COVID-19 case, New and Cumulative fatal, survived and discharged ICU admissions
COVID-19 intensive care patient counts with country of hospitalisation | LCPS | Date, Country of Hospitalization, Total COVID-19 ICU admissions

Visualizations intensive care

The first two graphs show the number of new (Nieuw), total (Actueel), cumulative (Cumulatief), deceased (Overleden), and survived (Overleefd) COVID-19 ICU admissions per day, as declared by NICE. The total number of ICU admissions per day as reported by LCPS is also shown.

Dashboard datasets

The datasets underlying the National Dashboard are listed in this folder. These datasets concern various topics, such as an overview of the number and age distribution of hospitalized, positively tested, and suspected cases, an estimate of the number of contagious people, the reproduction index, the number of (deceased) infected nursing home residents, and the amount of virus particles measured in the sewage water.

For more detail about the specific structure of the dashboard datasets, have a look at the data-dashboardcodebook.

Dataset | Source | Variables
Reported case counts in NL | National Dashboard | Date, Type of measure, (Cumulative) Count
Age distribution of reported cases in NL | National Dashboard | Date, Age group, Count
Suspected patients in NL | National Dashboard | Date, Type of measure, Count
COVID-19 particles in sewage | National Dashboard | Date, Type of measure, Count, Measurement units
Reproduction index COVID-19 virus | National Dashboard | Date, Type of measure, Value
Contagion estimate COVID-19 virus | National Dashboard | Date, Type of measure, Value
Number of infected and deceased nursing home cases | National Dashboard | Date, Type of measure, (Cumulative) Count

Visualizations dashboard data

These visuals show the development of the COVID-19 disease outbreak on a national level as reported by the National Dashboard and by the RIVM reports.

Below, the number of suspected COVID-19 patients as registered by GPs and the amount of COVID-19 particles per milliliter of sewage water are depicted.

The reproduction index and the estimated number of contagious people are plotted with their corresponding minimum and maximum values. The reproduction index indicates how quickly the COVID-19 virus is spreading in the Netherlands. The estimated number of contagious people is the number of people with COVID-19 per 100,000 inhabitants who are contagious to others.

The number of (deceased) nursing home residents infected with COVID-19 is shown here.

Miscellaneous datasets

This folder contains datasets describing various miscellaneous topics, such as the number of (positively) tested people, the underlying conditions and/or pregnancy of deceased cases younger than 70, an overview of the reinforced measures and press releases in the Netherlands, and a list of companies that requested and received an advance on their reimbursement.

For more detail about the specific structure of the miscellaneous datasets, have a look at the data-misccodebook.

Dataset | Source | Variables
COVID-19 tests in NL per week | RIVM | Year, Calendar week, Start date (Monday), End date (Sunday), Included labs, Type (Total and positive tests), Count
COVID-19 tests in NL per week by GGD-GHOR | GGD-GHOR | Year, Calendar week, Start date (Monday), End date (Sunday), Type (Total), Count
Underlying conditions and/or pregnancy in deceased COVID-19 cases under the age of 70 | RIVM | Date, Type of condition, Cumulative count
COVID-19 measures by the government | European Commission Joint Research Centre | Various variables on governmental measures (in English)
RIVM press releases | RIVM | Date and Time, Content of press release
NOW registry | UWV | Company, Location, Advance

Visualizations miscellaneous data

These graphs display the number of (positively) tested people per week. The end date of each week (Sunday) is used as the indicator for the respective week.
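As a minimal sketch of how the Monday start date and Sunday end date of a calendar week can be derived: the function name `week_bounds` is hypothetical, and the sketch assumes that the "Calendar week" column follows ISO 8601 week numbering, which the source does not state explicitly.

```python
from datetime import date, timedelta


def week_bounds(year, week):
    """Return (Monday, Sunday) for the given ISO year and ISO week number."""
    # date.fromisocalendar is available from Python 3.8 onwards.
    start = date.fromisocalendar(year, week, 1)  # day 1 of an ISO week is Monday
    end = start + timedelta(days=6)              # six days later is Sunday
    return start, end


start, end = week_bounds(2020, 15)
print(start.isoformat(), end.isoformat())
```

The Sunday returned here is the value that would serve as the label for the week in a weekly plot.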

Below, the cumulative number of deceased COVID-19 cases younger than 70 with and without underlying conditions and/or pregnancy is displayed per notification date.

The cumulative number of specific conditions found in these deceased COVID-19 cases is shown here.

Inactive/deprecated datasets

Deprecated (pending)

The following datasets are awaiting deprecation. They are (being) replaced by new datasets.

Dataset | Source | Variables | Alternative
COVID-19 disease case counts in NL | RIVM | Date, Number of positive COVID-19 disease cases in NL | COVID-19 case counts in NL
COVID-19 fatalities in NL | RIVM | Date, Number of COVID-19 fatalities in NL | COVID-19 case counts in NL
COVID-19 hospitalizations in NL | RIVM | Date, Number of COVID-19 hospitalized patients in NL | COVID-19 case counts in NL
Newly reported relative case counts by date in NL per municipality (PDF maps)* | RIVM | Date, Type, Number of positive COVID-19 disease cases, hospitalizations and fatalities per 100,000 people, Municipality, Province | Reported case counts by date in NL per municipality
COVID-19 age distribution | RIVM | Date, Type, Age, number of cases | data-desc#age
COVID-19 sex distribution | RIVM | Date, Type, Sex, number of cases | data-desc#sex

* This dataset is extracted from the maps in the PDFs. The values are relative counts per 100,000 residents in the municipality.
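To illustrate what a relative count per 100,000 residents means, here is a small sketch of the conversion in both directions. The function names and the population figure in the example are hypothetical placeholders, not values from the dataset.

```python
def per_100k(count, population):
    """Relative count: cases per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


def absolute_from_per_100k(rate, population):
    """Approximate absolute count recovered from a relative count."""
    return round(rate * population / 100_000)


# Hypothetical municipality with 200,000 residents and 50 reported cases.
rate = per_100k(50, 200_000)
print(rate, absolute_from_per_100k(rate, 200_000))
```

Because the published values are relative, recovering absolute counts requires a population figure per municipality from a separate source, and rounding in the maps limits the precision of the result.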

Inactive

The following datasets are no longer appended with new data (because RIVM is no longer providing the data).

Dataset | URL | Source | Variables | Expire date
COVID-19 disease case counts in NL* | [long format] [wide format] | RIVM | Date, Number of positive COVID-19 disease cases in NL, Municipality of residence, Municipality code (2019), Province | 2020-03-30
Test count (before 2020-04-20) | Test count | RIVM | PublicatieDatum, Datum, Labs, Type, Aantal | 2020-04-20

* Nowadays, the data is published again. Please use dataset data-geo#municipal.

Raw data

CoronaWatchNL collects copies of the raw data such that data collection is verifiable. Copies of the collected data can be found in the folder raw_data. The data isn't standardised.

Data collection sources

The following sources are used for data collection.

Source | Institute | Variables
https://www.rivm.nl/coronavirus-covid-19/actueel | RIVM | National cumulative numbers and press releases
https://www.rivm.nl/coronavirus-covid-19/grafieken | RIVM | Case counts per day
https://www.rivm.nl/coronavirus-covid-19/actueel/wekelijkse-update-epidemiologische-situatie-covid-19-in-nederland | RIVM | Epidemiological report
https://ggdghor.nl/actueel-bericht/ | GGD-GHOR | Test data
https://www.stichting-nice.nl/ | Stichting NICE | Intensive care numbers on COVID-19 patients
https://www.lcsp.nu/ | LCPS | Intensive care numbers on COVID-19 patients
https://coronadashboard.rijksoverheid.nl/ | National Dashboard | Various variables and estimations like Reproduction Index
https://covid-statistics.jrc.ec.europa.eu/ | European Commission Joint Research Centre | Governmental measures database
https://www.uwv.nl/overuwv/pers/documenten/2020/gegevens-ontvangers-now-1-0-regeling.aspx/ | Employee Insurance Agency | NOW registry

License and academic use

The graphs and data are licensed CC0. The original data is copyright RIVM.

For academic use, use the persistent data from the DOI. This is a persistent copy of the data. Version numbers refer to the date. Please cite:

De Bruin, J. (2020). Number of diagnoses with coronavirus disease (COVID-19) in The Netherlands (Version v2020.3.15) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3711575

Image from iXimus via Pixabay

CoronaWatchNL

CoronaWatchNL is a collective of researchers and volunteers in The Netherlands. We aim to make the reported numbers on COVID-19 disease in The Netherlands FAIR. The project is initiated and maintained by Utrecht University Research Data Management Support and receives support from Utrecht University Applied Data Science.

Help on this project is appreciated. We are looking for new datasets, data updates, graphs and maps. Please report issues in the Issue Tracker. Want to contribute? Please check out the help wanted tag in the Issue Tracker. Do you wish to share an application based on these datasets? Have a look at the applications folder. For predictive models, check out the parallel repository CoronaWatchNL Extended.

Please send an email to [email protected] and/or [email protected]

coronawatchnl's People

Contributors

blogem, ghostleyjim, goedzo, henriterhofte, hungrxyz, j535d165, japhir, jeroenr1, jpvandervelden, rvoor, sebastiaanbekker, shadkam, sikerdebaard, timvosch, userlandkernel, vega-s, vmenger

coronawatchnl's Issues

Improve sigmoid simulations

Do you have any suggestions how to improve the simulations of the sigmoid fitting? Currently it samples known data points (with replacement), but the range is almost certainly way too low.

I have also tried inferring data for the next two days, sampled from the distribution of the growth factor of the last five days. But I doubt the resulting image is still useful/readable:

sigmoid-with-two-inferred-days

Any suggestions are very welcome.
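For discussion, here is a minimal sketch of the "infer the next two days" idea described in this issue. It is not the code used in the repository: the function names and the example series are hypothetical, and it assumes that "growth factor" means the day-over-day ratio of the cumulative counts.

```python
import random

# Hypothetical cumulative case counts, for illustration only.
CUMULATIVE_EXAMPLE = [100, 130, 160, 200, 240, 280]


def recent_growth_factors(cumulative, window=5):
    """Day-over-day ratios of the cumulative counts over the last `window` steps."""
    recent = cumulative[-(window + 1):]
    return [b / a for a, b in zip(recent, recent[1:]) if a > 0]


def simulate_next_days(cumulative, days=2, window=5, n_sim=1000, seed=0):
    """Project `days` extra values per simulation by resampling recent growth factors."""
    factors = recent_growth_factors(cumulative, window)
    if not factors:
        raise ValueError("not enough data to compute growth factors")
    rng = random.Random(seed)
    paths = []
    for _ in range(n_sim):
        value = cumulative[-1]
        path = []
        for _ in range(days):
            value *= rng.choice(factors)  # sample with replacement
            path.append(value)
        paths.append(path)
    return paths


paths = simulate_next_days(CUMULATIVE_EXAMPLE)
print(len(paths), min(p[-1] for p in paths), max(p[-1] for p in paths))
```

Each simulated path could then be appended to the observed data before refitting the sigmoid, which widens the range of fitted curves compared with resampling only the known points.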

Fix rendering of the maps in Actions

In the GitHub Actions workflow, the rendering of the maps fails. The error is unknown to me, and it works locally. It might be an idea to replace this with Python code.

Extract info from pdf maps

I think it is quite doable to extract the color values from the maps in an automated way.

Outline:

  • Screenshot/image the relevant pdf page.
  • Compute the center of mass of each municipality with CBS 2019 Wijk en Buurtkaart maps
  • Make this an overlay for the image.
  • Extract the color values and map them to the corresponding values.

Is someone interested in giving this one a try?
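The last step of the outline, mapping an extracted color to a value, could be sketched as a nearest-color lookup. The legend colors and value ranges below are hypothetical placeholders, not the actual legend of the RIVM maps, and reading the pixel at each municipality's center is left out.

```python
# Hypothetical legend: RGB color -> value range shown on the map.
LEGEND = {
    (255, 255, 255): "0",
    (255, 200, 200): "1-10",
    (200, 0, 0): ">10",
}


def nearest_legend_value(pixel, legend):
    """Return the legend value whose color is closest to `pixel` in RGB space."""
    def squared_distance(color):
        return sum((a - b) ** 2 for a, b in zip(color, pixel))

    closest = min(legend, key=squared_distance)
    return legend[closest]


# A slightly off-color pixel, as can happen after rendering a PDF page to an image.
print(nearest_legend_value((250, 205, 198), LEGEND))
```

Matching on the nearest color rather than on exact equality makes the lookup tolerant of anti-aliasing and compression artifacts in the rendered page.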

Another attempt at animating new cases / help with collecting data?

Hi there,
I created https://corona-map-nl.web.app/
I've separated the new hospital intakes and the new cases; it looks a bit like what you guys made.
You can also click on a municipality for historic data.
And if you click on the country icon, you see a complete national history.

I've also got quite a lot more structured data by parsing the RIVM JavaScript, so I don't have to deal with the PDF files. I also process the data from NICE and LCPS.

At the moment it takes me about 10 minutes to extract all the data and convert it to the format my app uses. Maybe I can help with collecting part of the data?

Also, did anybody notice that the chart "Aantal overledenen naar datum van overlijden" at https://www.rivm.nl/coronavirus-covid-19/grafieken has the value 109 for the 25th of March, while the PDF shows 108 for that date? The total used by RIVM is the total in the PDF, so I assume that's the correct total. I noticed they have an off-by-one error every day for that date...

Let me know if I can be of help,

Best regards,
Jeroen

Contact with RIVM

Hi all,

First of all, thank you all for contributing to this repository. It is a challenge to keep up-to-date with all the daily changes on the RIVM website. Without your contributions, it isn't possible to keep this up-to-date and add new datasets.

There are quite a few questions regarding our contacts with RIVM and our offer to help them set up a proper data repository. We, CoronaWatchNL and Utrecht University, have been in contact with several people at RIVM. Unfortunately, there is no progress yet. More and more (prominent) researchers are teaming up with this project. They help us convince RIVM to set up a data repository and use our/your expertise.

If there is anything to share, please drop it in this thread or send me an email at [email protected].

Reconstruct data of March 2

Since 3 March, RIVM reports the number of diagnoses with the coronavirus and their municipality of residence on a daily basis. This issue is used to reconstruct the data of March 1 based on RIVM media reports.

Result:

  • 1 person from Tilburg
  • 1 person from Diemen
  • ...

Reconstruct data of February 29

Since 3 March, RIVM reports the number of diagnoses with the coronavirus and their municipality of residence on a daily basis. This issue is used to reconstruct the data of February 29 based on RIVM media reports.

Result:

  • 3 persons from Tilburg
  • 3 persons from Diemen
  • 1 from Delft

Sigmoid plotting fails

The plotting of the sigmoid fails with the latest data @vmenger

Sys.setlocale(category="LC_ALL", locale = "en_US.UTF-8")
Inflection expected after 15.6 days, at date 13/03/2020 13:00
/Users/jonathan/.pyenv/versions/anaconda3-2018.12/lib/python3.7/site-packages/pandas/plotting/_matplotlib/converter.py:103: FutureWarning: Using an implicitly registered datetime converter for a matplotlib plotting method. The converter was registered by pandas on import. Future versions of pandas will require you to explicitly register matplotlib converters.

To register the converters:
	>>> from pandas.plotting import register_matplotlib_converters
	>>> register_matplotlib_converters()
  warnings.warn(msg, FutureWarning)
Traceback (most recent call last):
  File "/Users/jonathan/Dropbox/Projects/Corona/CoronaWatchNL/python_plots.py", line 287, in <module>
    inflection_y = compute_inflection_cases(df, inflection_x)
  File "/Users/jonathan/Dropbox/Projects/Corona/CoronaWatchNL/python_plots.py", line 158, in compute_inflection_cases
    upper_bound = df[df['Dag'] == math.ceil(inflection_x)].iloc[0]['Aantal']
  File "/Users/jonathan/.pyenv/versions/anaconda3-2018.12/lib/python3.7/site-packages/pandas/core/indexing.py", line 1424, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/Users/jonathan/.pyenv/versions/anaconda3-2018.12/lib/python3.7/site-packages/pandas/core/indexing.py", line 2157, in _getitem_axis
    self._validate_integer(key, axis)
  File "/Users/jonathan/.pyenv/versions/anaconda3-2018.12/lib/python3.7/site-packages/pandas/core/indexing.py", line 2088, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
[Finished in 3.5s with exit code 1]
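The traceback suggests that no row matches `math.ceil(inflection_x)`, for example when the fitted inflection point lies beyond the last observed day. A guarded lookup could look like the sketch below. It is not the repository's `compute_inflection_cases`: the function name, the plain dict in place of the DataFrame, the example counts, and the linear interpolation between the two neighboring days are all assumptions.

```python
import math


def inflection_cases_or_none(counts_by_day, inflection_x):
    """Interpolate the case count at `inflection_x`, or return None when a
    neighboring day is not in the observed data."""
    lower_day = math.floor(inflection_x)
    upper_day = math.ceil(inflection_x)
    if lower_day not in counts_by_day or upper_day not in counts_by_day:
        return None  # inflection point falls outside the observed range
    lower = counts_by_day[lower_day]
    upper = counts_by_day[upper_day]
    fraction = inflection_x - lower_day
    return lower + fraction * (upper - lower)


# Hypothetical counts for days 14 and 15 only.
observed = {14: 800, 15: 900}
print(inflection_cases_or_none(observed, 14.5))
print(inflection_cases_or_none(observed, 15.6))
```

With pandas, the equivalent guard would be to check whether the filtered frame is empty before calling `.iloc[0]`.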

Amsterdam and Eindhoven always have the same number of infected

In https://raw.githubusercontent.com/J535D165/CoronaWatchNL/master/data/rivm_NL_covid19_municipality_range.csv the number of infections for Amsterdam and Eindhoven is always the same. Even though these numbers are reported in ranges, this still seems a bit suspicious.

These are the rows for the last couple of days, the values in the last two columns are always the same:

Datum,Gemeentenaam,Gemeentecode,Provincienaam,Type,Aantal_min,Aantal_max
2020-04-07,Amsterdam,363,Noord-Holland,Totaal,70.0,115.0
2020-04-07,Eindhoven,772,Noord-Brabant,Totaal,70.0,115.0
2020-04-08,Eindhoven,772,Noord-Brabant,Totaal,70.0,115.0
2020-04-09,Amsterdam,363,Noord-Holland,Totaal,70.0,115.0
2020-04-09,Eindhoven,772,Noord-Brabant,Totaal,70.0,115.0
2020-04-10,Amsterdam,363,Noord-Holland,Totaal,115.0,185.0
2020-04-10,Eindhoven,772,Noord-Brabant,Totaal,115.0,185.0
2020-04-11,Amsterdam,363,Noord-Holland,Totaal,115.0,185.0
2020-04-11,Eindhoven,772,Noord-Brabant,Totaal,115.0,185.0
2020-04-12,Amsterdam,363,Noord-Holland,Totaal,115.0,185.0
2020-04-12,Eindhoven,772,Noord-Brabant,Totaal,115.0,185.0
2020-04-13,Amsterdam,363,Noord-Holland,Totaal,90.0,160.0
2020-04-13,Eindhoven,772,Noord-Brabant,Totaal,90.0,160.0
2020-04-14,Amsterdam,363,Noord-Holland,Totaal,90.0,160.0
2020-04-14,Eindhoven,772,Noord-Brabant,Totaal,90.0,160.0
2020-04-15,Amsterdam,363,Noord-Holland,Totaal,90.0,160.0
2020-04-15,Eindhoven,772,Noord-Brabant,Totaal,90.0,160.0
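A check like the following sketch can confirm the observation programmatically. It uses a few of the rows quoted above as inline sample data. The function names are hypothetical, and running it on the full file would require downloading the CSV first.

```python
import csv
import io
from collections import defaultdict

# A subset of the rows quoted in this issue.
SAMPLE = """Datum,Gemeentenaam,Gemeentecode,Provincienaam,Type,Aantal_min,Aantal_max
2020-04-09,Amsterdam,363,Noord-Holland,Totaal,70.0,115.0
2020-04-09,Eindhoven,772,Noord-Brabant,Totaal,70.0,115.0
2020-04-10,Amsterdam,363,Noord-Holland,Totaal,115.0,185.0
2020-04-10,Eindhoven,772,Noord-Brabant,Totaal,115.0,185.0
2020-04-13,Amsterdam,363,Noord-Holland,Totaal,90.0,160.0
2020-04-13,Eindhoven,772,Noord-Brabant,Totaal,90.0,160.0
"""


def ranges_by_date(text):
    """Map date -> municipality -> (Aantal_min, Aantal_max)."""
    result = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        result[row["Datum"]][row["Gemeentenaam"]] = (
            float(row["Aantal_min"]),
            float(row["Aantal_max"]),
        )
    return result


def dates_with_identical_ranges(text, first, second):
    """Dates on which both municipalities are present with the same range."""
    matches = []
    for day, by_municipality in sorted(ranges_by_date(text).items()):
        if first in by_municipality and second in by_municipality:
            if by_municipality[first] == by_municipality[second]:
                matches.append(day)
    return matches


print(dates_with_identical_ranges(SAMPLE, "Amsterdam", "Eindhoven"))
```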

Exclude 'buitenland' in total stats

Seems like RIVM excludes the (single?) person living abroad in their statistics. Therefore, our totals are too high. Need to remove this one from our counts imo.

Data missing in rivm_NL_covid19_hosp_municipality.csv and rivm_NL_covid19_total_municipality.csv?

Hi,

Previously, in 'rivm_NL_covid19_hosp_municipality.csv' and 'rivm_NL_covid19_total_municipality.csv', data for unknown municipalities/provinces was reported in a row with blank location fields.

For example:
Datum,Gemeentecode,Gemeentenaam,Provincienaam,Provinciecode,Aantal
2020-04-16,-1,,,,431

I noticed that in today's extract (28 April) this is no longer reported for either table.
Datum,Gemeentecode,Gemeentenaam,Provincienaam,Provinciecode,Aantal
2020-04-28,-1,,,,

Is this missing data, or will this be the new data format, in which unknown municipalities are no longer reported?

Thanks in advance!
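A small sketch of how such gaps could be detected automatically, using the two rows quoted above as sample input. The function name is hypothetical, and the sketch assumes that a Gemeentecode of -1 marks the unknown-location row.

```python
import csv
import io

# The two example rows quoted in this issue.
SAMPLE = """Datum,Gemeentecode,Gemeentenaam,Provincienaam,Provinciecode,Aantal
2020-04-16,-1,,,,431
2020-04-28,-1,,,,
"""


def unknown_location_counts(text):
    """Map date -> count for the unknown-location row; None when the count is blank."""
    counts = {}
    for row in csv.DictReader(io.StringIO(text)):
        if row["Gemeentecode"] == "-1":
            counts[row["Datum"]] = int(row["Aantal"]) if row["Aantal"] else None
    return counts


counts = unknown_location_counts(SAMPLE)
missing_dates = [day for day, value in counts.items() if value is None]
print(counts, missing_dates)
```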

Provide Geocoded data

Compliments on this project!

For many applications that produce maps and/or apply spatial analysis, geocoded COVID-19 data (data with coordinate attributes/columns) would be very helpful. This sounds more complex than it actually is:

  • many of the data CSVs produced here have columns like Municipality (Gemeente)/Province code and/or name
  • the Dutch government, via Kadaster-PDOK, provides open datasets for Administrative Borders (Bestuurlijke Grenzen) with those same names/codes.
  • GeoJSON is an ideal format for supplying geospatial data
  • there is ample open source software to convert/simplify these (GML) datasets. A quick search turned up e.g. this script
  • "geocoding" is mainly a matter of JOIN-ing on Municipality/Province names/codes (instead of using geocoding backends like Nominatim); GeoPandas can possibly be of help, e.g. https://geopandas.org/mergingdata.html

I hope this triggers interest. Eventually I could foresee some extra/derived GeoJSON files generated from the CSVs under /data/ like under /data/geocoded. Is all data here generated via GitHub Workflows/Actions? Then contributors could also add the required geocoding steps there.
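As a rough sketch of the JOIN described in the list above, the following turns per-municipality counts into a GeoJSON FeatureCollection of points. The function name is hypothetical, the counts are made up for illustration, and the coordinates are approximate city locations standing in for centroids that would really come from the PDOK border datasets.

```python
import json

# Hypothetical counts keyed by municipality code (Gemeentecode); -1 = unknown location.
COUNTS = {363: 10, 772: 5, -1: 2}

# Approximate (longitude, latitude) placeholders for Amsterdam (363) and Eindhoven (772).
CENTROIDS = {363: (4.90, 52.37), 772: (5.47, 51.44)}


def to_geojson(counts, centroids):
    """Join counts to coordinates on municipality code and build a FeatureCollection."""
    features = []
    for code, count in counts.items():
        if code not in centroids:
            continue  # no geometry for this code, e.g. the unknown-location row
        lon, lat = centroids[code]
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"Gemeentecode": code, "Aantal": count},
        })
    return {"type": "FeatureCollection", "features": features}


collection = to_geojson(COUNTS, CENTROIDS)
print(json.dumps(collection, indent=2))
```

Joining on the code rather than the name avoids mismatches caused by spelling variants of municipality names.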

The next step would be OpenAPI endpoints for this GeoJSON data. These could be served directly from GitHub. We have a project based on OGC OpenAPI REST standards, pygeoapi, where we are working on providing an open endpoint for COVID-19 data: https://demo.pygeoapi.io/covid-19/collections?f=html. Some collections there, like Italy, are already served directly from GitHub repos. For NL we use/proxy ESRI endpoints, but we would rather serve directly from a/this GH repo.

In theory I could do the work on this issue, but I am already quite occupied with the pygeoapi part...

Fit a logistic regression as well?

Hi Jonathan,

After watching this epic video by 3blue1brown https://www.youtube.com/watch?v=Kas0tIxDvrg I thought you could perhaps fit a sigmoid line as well as the exponential.

Or even better (will probably only make sense if you've seen the video)

  1. calculate the change per time step (current value - previous value)
  2. calculate the growth rate (ratio of the current and previous change)
  3. estimate when the above will be 1 (with another kind of fit?)
  4. fit the sigmoid with that time as the inflection point 😲?
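Steps 1 and 2 of this list can be sketched in a few lines. The series is hypothetical, the function names are made up for illustration, and step 3 is simplified to finding the first observed step at which the growth rate has dropped to 1 or below, rather than fitting a curve to estimate it.

```python
# Hypothetical cumulative case counts, for illustration only.
CUMULATIVE = [10, 20, 40, 70, 100, 120]


def changes(cumulative):
    """Step 1: change per time step (current value - previous value)."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def change_ratios(cumulative):
    """Step 2: ratio of the current change to the previous change (None if undefined)."""
    deltas = changes(cumulative)
    return [cur / prev if prev else None for prev, cur in zip(deltas, deltas[1:])]


def first_step_at_or_below_one(ratios):
    """Simplified step 3: index of the first ratio that is <= 1, or None."""
    for index, ratio in enumerate(ratios):
        if ratio is not None and ratio <= 1:
            return index
    return None


ratios = change_ratios(CUMULATIVE)
print(changes(CUMULATIVE), ratios, first_step_at_or_below_one(ratios))
```

On noisy real data the ratio fluctuates around 1, so smoothing the changes before taking ratios would likely be needed before using the result as an inflection estimate.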

Missing daterange rivm_NL_covid19_total_municipality.csv

Hi Jonathan,

First of all: fantastic effort by pulling all of this together. It helps many people from many industries (like mine: publishing) to make the data accessible.

Question about rivm_NL_covid19_total_municipality.csv: the dataset seems to be jumping from 31-mrt to 8-apr. Do you know a way how to get the missing data available? I tried to copy/paste from different dataset but there always seems to be a problem (day count vs. cum. count etc.)

Include data with reporting lag per province

The RIVM reports nicely show the data that is added per day, to make it clear that there is a significant lag in the reporting. However I did not find the source data published anywhere, a set that shows the reported cases/deaths/hospitalizations per reporting date and per date.

I took the liberty of extracting the data from the pdf reports. Data is here

Please note that no individual reports are available before march 27th, so this shows in the graphs below.



So now that I have made this, I don't really know what to do with it. I can make a PR of course, but I am not sure how to structure the data for this repository. Updating this requires a few minutes to export the data from the PDF graphs. Also, I have only done this for hospitalizations so far, not for the deaths/confirmed cases.

difference lcps and nice data

Does anyone know what the difference is between the intakeCount in the NICE dataset and the number of cases in the LCPS dataset? The media use the LCPS data and say it is the number of corona patients in the ICU. The RIVM daily epidemiological report calls it the confirmed COVID-19 cases. Does this mean that the LCPS data also includes suspected cases?

Error in dataset on e.g. row 18

In the file rivm_corona_in_nl.csv I noticed some lines which don't specify a "gemeente", e.g. row 18:
2020-03-02 |  | -1 |  | 8

There are a few more of these.

How should I interpret these lines, please?

Thanks
Patrick

Concerns about missing locations

On 30 March, we had 310 missing municipalities. Based on the RIVM report, there are no missing provinces today. Is this possible? Are they imputing with the GGD region? Or do they no longer include missing locations? That would be interesting. Any thoughts on this?

12595 - (126 + 166 + 130 + 1475 + 161 + 1426 + 3412 + 1845 + 698 + 1046 + 161 + 1949) = 0
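The arithmetic above can be reproduced with a short check. The numbers are copied from the line above, and the variable names are only for illustration.

```python
# National total and the twelve per-province counts quoted in this issue.
NATIONAL_TOTAL = 12595
PROVINCE_COUNTS = [126, 166, 130, 1475, 161, 1426, 3412, 1845, 698, 1046, 161, 1949]

# Cases that are in the national total but not attributed to any province.
unattributed = NATIONAL_TOTAL - sum(PROVINCE_COUNTS)
print(unattributed)
```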


Importing error Python

Hi all,

When I try to import the data into Python with Pandas, I keep hitting the same error:

ParserError: Error tokenizing data. C error: Expected 1 fields in line 49, saw 2

Any suggestions on how I can bypass this? It might be an easy question to answer, but I haven't been able to find a solution that actually works :(
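The message means the parser inferred a single column from the first lines and then met a line with two fields. One possible cause, which is an assumption since the issue does not show the code or URL used, is that the input is not the raw CSV, for example an HTML page instead of the raw file. The following stdlib sketch, with a hypothetical function name, lists the lines whose field count differs from the first line, which helps to inspect what was actually read.

```python
import csv
import io


def inconsistent_lines(text, delimiter=","):
    """Return (line_number, field_count) for rows whose field count differs
    from that of the first row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    expected = None
    problems = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if expected is None:
            expected = len(row)
        elif len(row) != expected:
            problems.append((reader.line_num, len(row)))
    return problems


# Hypothetical non-CSV input: the first lines have one field, a later one has two.
NOT_A_CSV = "<!DOCTYPE html>\n<html>\n<title>a, b</title>\n"
print(inconsistent_lines(NOT_A_CSV))
```

If the first reported line looks like markup or free text rather than data, the fix is usually to point the reader at the raw file rather than to change parser options.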

Intensive Care data from NICE

First of all, thank you so much for the effort! It is the best attempt at getting all the numbers together that I have seen.

Regarding the IC data from NICE: I understand you said it was experimental, but I noticed that it has not been updated for 4 days. What's the plan for this? Are there temporary issues with getting the data, or is this dataset being deprecated?

Thanks!

How to interpret the number of hospitalizations?

The maps for hospitalizations and infections look very different. With hospitalizations the big cities light up:

Screenshot from 2020-04-16 09-21-07

With infections the areas where Corona is most widespread ATM light up:

Screenshot from 2020-04-16 09-21-15

I first thought that the hospitalizations are recorded for the municipality in which the hospital is located. But this is not possible, because there are municipalities with no hospital that have a non-zero number.

What is the best way to interpret the number of hospitalizations?

Data changes by RIVM - 31 March 2020

Hello Jonathan,

Quick heads up! As you will find on the RIVM website, they changed the data from confirmed cases to the number of people hospitalized in the municipalities.

Kind regards,
Jim

Our website uses data from CoronaWatchNL

For our website (covid-analytics.nl), RIVM and LCPS data obtained through CoronaWatchNL is used. How would you like me to reference CoronaWatchNL? Currently all charts that use RIVM and LCPS data through CoronaWatchNL point to RIVM and LCPS directly. Would a mention on the charts explanation page or on some sort of about-us page be acceptable? Or perhaps some sort of CoronaWatchNL logo somewhere on the page?

coronawatch website

Would it be an idea to generate a website from all the data/graphs in this repo?
For example, like http://covid19.healthdata.org/

I was thinking of a Gatsby site hosted on S3,
with the datasets, graphs, maps, etc. on it.

We have the domain caard.nl "left over", which could surely be turned into an acronym. But another domain is of course possible too.

If there is interest, I can of course send a PR.

Add row with unknown location

A couple of records don't have a location (or the sum of all cases doesn't sum up to the reported total). We might want to add an additional row/column to the data with unknown locations.

test data not updated

I noticed that the test data was last updated on the 20th and has not been updated since then. I also noticed that the format in the PDF was changed slightly, to weekly, so I am not sure if this was the cause, but it might be related. Thanks!

Create overview page with dashboards/website using CoronaWatchNL data

Thank you all for sending us emails with dashboards, visualizations, and gifs made with CoronaWatchNL data. We love to see them.

At the moment, we are receiving a lot of emails on this and are having a hard time answering them. It might be an idea to make a separate page with an overview of these initiatives (and integrate the Interesting links section of the README). Is there anyone interested in setting this up?
