belgianbiodiversityplatform / natagora-occurrences
🦆 Observations.be - Species occurrence datasets published by Natagora
License: MIT License
Record https://observations.be/waarneming/view/143238934 has `found as tracks | found as tracks` as `occurrenceRemarks`. Can this duplication be avoided?
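One possible way to avoid the duplication is to de-duplicate the values before they are joined. The sketch below is only an illustration in Python, not the actual mapping code; the function name `dedupe_values` is hypothetical, and it assumes values are joined with ` | `.

```python
def dedupe_values(field: str, sep: str = " | ") -> str:
    """Drop repeated entries from a separator-joined field, keeping first-seen order."""
    seen = []
    # Split on the bare separator character so uneven spacing is tolerated.
    for part in field.split(sep.strip()):
        value = part.strip()
        if value and value not in seen:
            seen.append(value)
    return sep.join(seen)


print(dedupe_values("found as tracks | found as tracks"))
```

Applied to the record above, this would leave a single `found as tracks`.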
I inspected the content of the current field `samplingProtocol`, originally mapped from `typeActid`. Some suggestions for change:
samplingProtocol | decision | source | new Darwin Core term |
---|---|---|---|
planted | remove | typeActID | occurrenceRemarks |
unknown | remove | typeActID | |
escaped | remove | typeActID | occurrenceRemarks |
indigenous | remove | typeActID | |
sown | remove | typeActID | occurrenceRemarks |
present | remove | typeActID | |
adventive | remove | typeActID | accidentally introduced in occurrenceRemarks |
micr_examined_mat_present | remove | typeActID | microscopic examination in identificationRemarks |
catch_by_cat | remove | typeActID | occurrenceRemarks (= catch by cat ) |
washed_ashore | remove | typeActID | occurrenceRemarks (= washed ashore ) |
`taxonID` currently contains links such as http://observations.be/soort/view/110. `soort/view` should be corrected to `species`, so that we have links such as https://observations.be/species/110/
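A minimal sketch of that rewrite, assuming the old links always end in a numeric id as in the example above. The function name `fix_taxon_id` is hypothetical, and switching to `https` is taken from the example target link.

```python
import re

# Old pattern: http(s)://observations.be/soort/view/<numeric id>
_OLD_TAXON_URL = re.compile(r"^https?://observations\.be/soort/view/(\d+)/?$")


def fix_taxon_id(url: str) -> str:
    """Rewrite an old soort/view link to the species/<id>/ form; leave anything else untouched."""
    match = _OLD_TAXON_URL.match(url.strip())
    if match is None:
        return url
    return f"https://observations.be/species/{match.group(1)}/"


print(fix_taxon_id("http://observations.be/soort/view/110"))
```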
The limit is 25 MB. Should I slice it?
This field (for which no documentation can be found on the web) currently only contains information from kleed: 'queen', 'worker', 'winged gyne' and 'unwinged gyne'.
I suggest we could also add to this field the following information that is now sent to other fields:
1/ now sent to behavior: 'territorial behavior', 'copulating', 'laying egg', 'transporting feed or faeces', 'courtship/mating', 'nest building', 'distraction display'
2/ now sent to occurrenceRemarks: 'adult in territory', 'near nest', 'colony in trees', 'colony', 'found as nest', 'occupied nest', 'occupied nest with eggs', 'occupied nest with young', 'with broodpatch', 'pair in territory', 'probably nesting place', 'recently hatched young', 'recently used nest', and also 'found as substrate with miner damage', 'found as gall', 'found as egg mass', 'found as cocoon' and 'abandoned nest'
Do you find it useful or inappropriate?
I went through the mapping of the following Darwin Core terms: `behavior`, `lifeStage`, `occurrenceRemarks`, `reproductiveCondition` and `samplingProtocol`. For the following terms, some adaptations are required to fit the natuurpunt vocabularies:
- `collected` is not integrated in the vocabulary for `behavior` and should be `NA` instead (thus `obsbe_act` = `COLLECTED` has no value for `behavior`).
- `collected` is not integrated in the vocabulary for `occurrenceRemarks` and should be `NA` instead (thus `obsbe_act` = `COLLECTED` has no value for `occurrenceRemarks`).
- Use ` | ` as a separator for multiple values, rather than `;` (note: there is a space before and after the pipe).
- `samplingProtocol`: now, the field often contains a combination of `casual observation` and another sampling protocol. Instead, `casual observation` should be the default value, used only when no alternative is present. This is why I suggest mapping `obsbe_act` and `obsbe_method` directly to `samplingProtocol`, rather than joining the content of the intermediary columns `act_samplingProtocol` and `met_samplingProtocol`. The mapping should look like this (use a `case_when` statement):

obsbe_act | obsbe_method | samplingprotocol |
---|---|---|
CAMERATRAP | (...) | camera trap |
CATCH | (...) | catch |
CATCH_ELECTRIC | (...) | catch by electrofishing |
CATCH_POLE | (...) | catch by fishing rod |
CATCH_NET | (...) | catch by net |
COLLECTED | (...) | specimen collected |
WITH_DETECTORHUNTING | (...) | observation with bat detector |
FLASHLIGHT_NIGHT_OBSERVATION | (...) | observation with flashlight |
IN_PELLET | (...) | pellet examination |
(...) | BATDETECTOR | observation with bat detector |
(...) | CAMERATRAP | camera trap |
(...) | CAUGHT | catch |
(...) | CAUGHT_ELECTRIC | catch by electrofishing |
(...) | CAUGHT_BY_HAND | catch by hand |
(...) | CAUGHT_BY_HAND_AND_COLLECTED | catch by hand and collected |
(...) | CAUGHT_NET | catch by net |
(...) | CAUGHT_POLE | catch by pole |
(...) | BEATING_SCREEN | catch by screen |
(...) | COLOURTRAP | colour trap |
(...) | HEARD | heard |
(...) | LIGHTTRAP | light trap |
(...) | IN_PELLET | pellet examination |
(...) | SEEN | seen |
(...) | SEEN_AND_HEARD | seen and heard |
(...) | INDOORS | seen indoors |
(...) | SOUNDTRAPPED | sound trap |
(...) | SPOTLIGHT_NIGHT_OBSERVATION | spotlight |
(...) | TRACK_BED | track bed |
(...) | (...) | casual observation |
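For illustration, the table can be sketched as a first-match-wins lookup, which is how a `case_when` statement behaves. This is a Python sketch rather than the R code of the actual mapping; the names `ACT_TO_PROTOCOL`, `METHOD_TO_PROTOCOL` and `sampling_protocol` are hypothetical, and giving `obsbe_act` precedence over `obsbe_method` is an assumption based on the row order of the table.

```python
from typing import Optional

# Rows of the table where obsbe_act decides the outcome.
ACT_TO_PROTOCOL = {
    "CAMERATRAP": "camera trap",
    "CATCH": "catch",
    "CATCH_ELECTRIC": "catch by electrofishing",
    "CATCH_POLE": "catch by fishing rod",
    "CATCH_NET": "catch by net",
    "COLLECTED": "specimen collected",
    "WITH_DETECTORHUNTING": "observation with bat detector",
    "FLASHLIGHT_NIGHT_OBSERVATION": "observation with flashlight",
    "IN_PELLET": "pellet examination",
}

# Rows of the table where obsbe_method decides the outcome.
METHOD_TO_PROTOCOL = {
    "BATDETECTOR": "observation with bat detector",
    "CAMERATRAP": "camera trap",
    "CAUGHT": "catch",
    "CAUGHT_ELECTRIC": "catch by electrofishing",
    "CAUGHT_BY_HAND": "catch by hand",
    "CAUGHT_BY_HAND_AND_COLLECTED": "catch by hand and collected",
    "CAUGHT_NET": "catch by net",
    "CAUGHT_POLE": "catch by pole",
    "BEATING_SCREEN": "catch by screen",
    "COLOURTRAP": "colour trap",
    "HEARD": "heard",
    "LIGHTTRAP": "light trap",
    "IN_PELLET": "pellet examination",
    "SEEN": "seen",
    "SEEN_AND_HEARD": "seen and heard",
    "INDOORS": "seen indoors",
    "SOUNDTRAPPED": "sound trap",
    "SPOTLIGHT_NIGHT_OBSERVATION": "spotlight",
    "TRACK_BED": "track bed",
}

DEFAULT_PROTOCOL = "casual observation"


def sampling_protocol(obsbe_act: Optional[str], obsbe_method: Optional[str]) -> str:
    """Return a single samplingProtocol value; the first matching rule wins."""
    if obsbe_act in ACT_TO_PROTOCOL:  # assumption: act rows are checked first
        return ACT_TO_PROTOCOL[obsbe_act]
    if obsbe_method in METHOD_TO_PROTOCOL:
        return METHOD_TO_PROTOCOL[obsbe_method]
    return DEFAULT_PROTOCOL  # used only when nothing else matched


print(sampling_protocol("COLLECTED", "SEEN"))
print(sampling_protocol(None, None))
```

Because only one branch can fire, `casual observation` is never combined with another protocol.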
Can you get me the DOI entry for URL https://ipt.biodiversity.be/resource?r=natagora-alien-occurrences ?
Thank you.
`references` should have a link like https://observations.be/observation/193304784/. Currently the values are like https://observations.be/waarneming/view/193304784.
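A small sketch of how the old links could be converted, assuming every value ends in the numeric observation id as in the example. The function name `fix_reference` is hypothetical.

```python
import re

# Old pattern: http(s)://observations.be/waarneming/view/<numeric id>
_OLD_OBSERVATION_URL = re.compile(r"^https?://observations\.be/waarneming/view/(\d+)/?$")


def fix_reference(url: str) -> str:
    """Rewrite an old waarneming/view link to the observation/<id>/ form; leave other values as they are."""
    match = _OLD_OBSERVATION_URL.match(url.strip())
    if match is None:
        return url
    return f"https://observations.be/observation/{match.group(1)}/"


print(fix_reference("https://observations.be/waarneming/view/193304784"))
```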
"migrating" and "resting" should go to behavior
Thanks for this first data input. I browsed through the file, here are my remarks:
The following terms are not Darwin Core terms:
- `typekid`
- `typeactid`
- `typemid`
- `act_occrem`
- `kleed_occrem`
- `met_occrem`
- `act_samplingprotocol`
- `met_samplingprotocol`

`typekid`, `typeactid` and `typemid` are the respective IDs from database tables `type_kleed`, `type_activiteit` and `type_determination_method`. These IDs have been translated to a vocabulary for the mapping of the Darwin Core fields shown in the table below. The content of a Darwin Core term can be a compilation of IDs from `type_kleed` and/or `type_activiteit` and/or `type_determination_method`; this is why `act_occrem`, `kleed_occrem`, `met_occrem` (for `occurrenceRemarks`), `act_samplingprotocol` and `met_samplingprotocol` (for `samplingProtocol`) have been created.
id | identificationRemarks | lifeStage | occurrenceRemarks | samplingProtocol | reproductiveCondition | behavior |
---|---|---|---|---|---|---|
typekid | x | x | x | |||
typeactid | x | x | x | |||
typemid | x | x | x |
accessRights: the current link https://www.natagora.be/usage-des-donnees does not work (already discussed this)
There should be no dataGeneralizations in the datasets: all locations should be point coordinates and not generalized to a 5x5km UTM grid
`decimalLatitude` and `decimalLongitude` are not in the correct format. Values like `507098773` and `567654132843018` are now present; they should be something like `50.7098` and `5.6765`.
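A quick range check would catch values like these before export. The sketch below only tests that a pair falls inside the valid range for decimal degrees; it does not attempt to repair the values, since where the decimal separator was lost is not known. The function name `is_valid_coordinate` is hypothetical.

```python
def is_valid_coordinate(latitude: float, longitude: float) -> bool:
    """True when the pair lies within the valid range for decimal degrees."""
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0


print(is_valid_coordinate(50.7098, 5.6765))
print(is_valid_coordinate(507098773, 567654132843018))
```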
Observations with `samplingProtocol` = `sound trap` or `camera trap` should have `basisOfRecord` = `MachineObservation`. That is currently not the case for all camera trap observations.
Affects 28010 records
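To find the affected records, a check along these lines could be run on the export. It is a sketch that assumes each record is available as a dict keyed by Darwin Core term; the names `MACHINE_PROTOCOLS` and `needs_machine_observation` are hypothetical.

```python
MACHINE_PROTOCOLS = {"sound trap", "camera trap"}


def needs_machine_observation(record: dict) -> bool:
    """True when the protocol implies MachineObservation but basisOfRecord says otherwise."""
    protocol = (record.get("samplingProtocol") or "").strip().lower()
    basis = (record.get("basisOfRecord") or "").strip()
    return protocol in MACHINE_PROTOCOLS and basis != "MachineObservation"


records = [
    {"samplingProtocol": "camera trap", "basisOfRecord": "HumanObservation"},
    {"samplingProtocol": "sound trap", "basisOfRecord": "MachineObservation"},
    {"samplingProtocol": "casual observation", "basisOfRecord": "HumanObservation"},
]
to_fix = [r for r in records if needs_machine_observation(r)]
print(len(to_fix))
```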
I inspected the content of the current field `occurrenceRemarks`, a compilation of the information contained in `typeActid`, `typeKid` and `typeMid`. Some suggestions for change:
occurrenceRemarks | decision | source | assign to new Darwin Core term |
---|---|---|---|
recently hatched young | remove | typeActID | |
unknown | remove | typeActID | |
indigenous | remove | typeActID | |
present | remove | typeActID | |
occupied nest with eggs | remove | typeActID | |
adventive | remove | typeActID | accidentally introduced |
micr_examined_mat_present | remove | typeActID | microscopic examination |
catch_by_cat | remove | typeActID | leave in occurrenceRemarks, but write as catch by cat |
washed_ashore | remove | typeActID | leave in occurrenceRemarks, but write as washed ashore |
with broodpatch | remove | typeActID | with brood patch |
microscopic_examined | remove | typeActID | microscopic examination |
seen while diving | remove | typeMid | samplingProtocol |
I reviewed the whole Natagora dataset; below are my remarks. Could you please tick off the boxes when the changes are integrated and send me a new export afterwards?
- `en` (now `e`)
- Remove `animal` from title: the dataset is applicable to animals, plants and fungi
- `collected` should be `NA`, see #8
- `collected` should be `NA`, see #8
- `collected` should be `specimen collected`, see #8
- `BE` (now `B`)
- `ICZN` for Animalia and `ICN` for plants and fungi

The following values in the Natagora alien species dataset are incorrectly mapped. @LouisNatagora can you correct this:
value | behavior | occurrenceRemarks | samplingProtocol |
---|---|---|---|
dead | (empty) | found dead | (default) |
destroyed_nest | (empty) | found as destroyed nest | (default) |
drowning_victim | (empty) | drowning victim | (default) |
eating | feeding | (empty) | (default) |
laying_egg | laying egg | (empty) | (default) |
prey_dead | (empty) | found dead | (default) |
tagged | (empty) | tagged | (default) |
tracks | (empty) | found as tracks | (default) |
unknown | (empty) | (empty) | (default) |
Note that `(default)` is the default mapping to `casual observation`.
Whip specifications have been started for this dataset but are far from complete.
If we plan to get regular updates, then a complete whip specification can be very useful. @LienReyserhove are you intending to write one? If so, I can help.
Can the `test-upload-SQL-dump` branch be deleted?
A selection should be made based on the following criteria:
category | status | details |
---|---|---|
1b | Incidental / Vagrant / Migrant | Regular sightings in this country |
2a | Naturalized | Introduced by man, now autonomously reproducing |
2b | Naturalizing | Introduced by man, autonomous populations for 10-100 years. |
2c | Exotic | Introduced by man, no autonomous populations for more than 10 years. |
2d | Incidental import | Introduced by man, no autonomous population. |
Two important questions:
If option one is true, then all data can be obtained by filtering on observations from the Walloon provinces `brabant wallon`, `hainaut`, `liège`, `luxembourg`, `namur`.
This raw selection can be uploaded in the branch upload-SQL-dump where you can create a new folder `raw`.
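The province filter could look roughly like the sketch below. It assumes each observation is a dict with a `province` key, which is a hypothetical field name; the real column in the SQL dump may differ.

```python
WALLOON_PROVINCES = {"brabant wallon", "hainaut", "liège", "luxembourg", "namur"}


def in_wallonia(record: dict, field: str = "province") -> bool:
    """True when the record's province (case-insensitive) is one of the Walloon provinces."""
    value = record.get(field) or ""
    return value.strip().lower() in WALLOON_PROVINCES


observations = [
    {"id": 1, "province": "Namur"},
    {"id": 2, "province": "Antwerpen"},
    {"id": 3, "province": "Liège"},
]
selection = [o for o in observations if in_wallonia(o)]
print([o["id"] for o in selection])
```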
For `license`, use `https://creativecommons.org/publicdomain/zero/1.0/` (with `https`, not `http`).
This repository will be used for all Natagora data publication. @silenius can you update the repository name (and thus its URL) to `natagora-occurrences`?
I know we discussed this issue before (and to be honest, I can't remember why we concluded to keep them in), but in hindsight, I believe we should remove the generalized records (those coordinates generalized to a 4x4km IFBL grid). This because:
I also wonder why `georeferenceRemarks` is set to `coordinates are centroid of used grid square` for 2745 records. I suppose this is due to reasons other than secrecy?
Many `samplingProtocol` values are empty. They should have the default value `casual observation` instead.
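Filling in the default could be as simple as the sketch below. It assumes that "empty" means a missing or blank value; the function name `with_default_protocol` is hypothetical.

```python
from typing import Optional

DEFAULT_PROTOCOL = "casual observation"


def with_default_protocol(value: Optional[str]) -> str:
    """Return the value unchanged, or the default when it is missing or blank."""
    if value is None or not value.strip():
        return DEFAULT_PROTOCOL
    return value


print(with_default_protocol(""))
print(with_default_protocol("camera trap"))
```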
2 questions:
@LienReyserhove as per other datasets, can you complete the README for this dataset? I'm especially missing links to the dataset on IPT and GBIF.
I would also rename the `src` directory to `sql`.
I inspected the content of the current field `behavior`, originally mapped from `typeActid`. Some suggestions for change:
behavior | decision | source | new Darwin Core term |
---|---|---|---|
planted | remove | typeActID | occurrenceRemarks |
escaped | remove | typeActID | occurrenceRemarks |
indigenous | remove | typeActID | establishmentMeans |
sown | remove | typeActID | occurrenceRemarks |
present | remove | typeActID | occurrenceStatus |
I need the list of the different possible values for field "Identification verification status". Where can I find it?