Update florabank SQL #75

Open
23 of 32 tasks
peterdesmet opened this issue Oct 21, 2015 · 3 comments

Comments

@peterdesmet
Member

peterdesmet commented Oct 21, 2015

Note: Data verified for version 45.5 on 2015-10-21. Metadata is still under review (see #76)

Changes to apply for occurrence core:

  • Use lowerCamelCase field names
  • Map occurrenceID instead of GUID
  • Remove modified
  • Update language to en
  • Rename rights to license
  • Update license
  • Update accessRights
  • Verify references: spelling, correct field, etc. Currently an error for wrong data type
  • Use dataset DOI in datasetID
  • Remove collectionCode
  • Update datasetName to the actual title (not CAPITALIZED)
  • Verify basisOfRecord values
  • Remove catalogNumber
  • Update recordedBy to use | as separator (currently Danny Minnebo;Karel De Waele;William Sierens;J. De Troyer;...).
  • Add samplingProtocol
  • Verify eventDate: all ISO?
  • Keep verbatimDate: that one has date ranges in ISO format, whereas eventDate is a single ISO date.
  • Remove country
  • Update verbatimCoordinateSystem to IFBL 1x1km or similar
  • Add verbatimSRS. Is it Belgian Datum 1972?
  • Update coordinateUncertaintyInMeters to 707
  • Verify kingdom: some are incertae sedis
  • Verify taxonRank: are those all species?
  • Use lowercase for taxonRank
  • Add vernacularName
  • Remove unmapped columns: CatalogNumberNumeric
  • Apply changes in production
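
Two of the value-level changes above (the recordedBy separator and lowercase taxonRank) can be sketched as small transformations. This is only an illustration in Python; the actual change would presumably be made in the florabank SQL query itself, and the function names `normalize_recorded_by` and `normalize_taxon_rank` are hypothetical.

```python
def normalize_recorded_by(value: str) -> str:
    """Turn a ';'-separated list of collectors into a ' | '-separated one."""
    names = [name.strip() for name in value.split(";")]
    # Drop empty fragments left by trailing or doubled separators.
    return " | ".join(name for name in names if name)


def normalize_taxon_rank(value: str) -> str:
    """Lowercase a taxonRank value, e.g. 'Species' becomes 'species'."""
    return value.strip().lower()


if __name__ == "__main__":
    # Names taken from the example in the task list above.
    print(normalize_recorded_by(
        "Danny Minnebo;Karel De Waele;William Sierens;J. De Troyer"
    ))
    print(normalize_taxon_rank("Species aggregate"))
```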

Questions:

  • Can we add rightsHolder = INBO?
  • Do we keep bibliographicCitation? We don't have it for VIS and the DOI is in datasetID.
  • Shouldn't we move dataGeneralizations to georeferenceRemarks? It currently reads "The centroid coördinates (sic) of the IFBL square containing the occurence (sic) were given", but that is just a calculation. Maybe rephrase as "coordinates are centroid of 1x1km IFBL square".
  • Should we have all names in recordedBy?
  • Can we add a samplingEffort?
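
The proposed coordinateUncertaintyInMeters of 707 is consistent with coordinates being the centroid of a 1x1km IFBL square: the farthest an occurrence can lie from the centroid is half the diagonal of the square. A short check of that arithmetic, assuming an exactly square 1000 m cell (the variable names are illustrative only):

```python
import math

SQUARE_SIDE_M = 1000  # assumed side length of a 1x1km IFBL square

# Distance from the centroid to a corner: half the diagonal of the square.
half_side = SQUARE_SIDE_M / 2
half_diagonal = math.hypot(half_side, half_side)

# Rounded to whole meters, as used for coordinateUncertaintyInMeters.
uncertainty_m = round(half_diagonal)
print(uncertainty_m)
```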
@peterdesmet peterdesmet added this to the Update SQL milestone Oct 21, 2015
@stijnvanhoey
Contributor

Regarding the references, these are the distinct values in the data set:

Streeplijst Limburg mod. 1995
Streeplijst Limburg mod. 1973
IGO leuven
Streeplijst Limburg mod. 1962
Parkeninventaris KULeuven
Hoogveenrelicten provincie Antwerpen
Herbarium Albert Vermeijen
databank BIM
Detailkartering doelsoorten Grensmaas
Herbarium Dirk De Beer
doctoraat Martin Hermy
Herbarium Filip Verloove
Herbarium Leo Andriessen
Project indicatoren bosvitaliteit
ecosyteemvisies bovenlopen Dender
Gewestelijke Bosinventarisatie
Streeplijst flower 1995
Streeplijst IFFB. Mod. 1972
Bestrijding invasieve exoten provincie West-Vlaanderen
Ecohydrologie IN
Herbarium Universiteit Luik
Herbarium Universiteit Namen
Atlas van de Belgische en Luxemburgse flora 1978 (tekstgedeelte)
Streeplijst IFFB. Mod. 1973
Doctoraat Kris Verheyen
Detailkartering doelsoorten duinen
Herbarium Universiteit Louvain-La-Neuve
digitale streeplijst
Typologie Stilstaande wateren IN
Prodrome de la Flora Belge
Streeplijst Ivan Hoste (planteninventarisatie Bellem)
Detailkartering Rode-Lijstsoorten
Herbarium Nationale Plantentuin van België
Streeplijst Natuurpunt West-Vlaanderen
kaartjes Filip Verloove
Streeplijst polders Leo Vanhecke
Streeplijst IFB. Mod. 1956
literatuur
Strandinventarisatie BEST 2003
Herbarium Universiteit Gent
Natuur inrichtingsproject van Uitkerkse Polder
Typologie Waterlopen / UIA
Atlas van de Belgische en Luxemburgse flora 1972
Streeplijst IFB. Mod. 1940
Kaartje Schelde-Leie (Karel De Waele)
Streeplijst flower 2001 (nederlandstalig)
Herbarium BRVU
Kaderrichtlijn water macrofyten
ecosysteemvisie Zwarte Beek
Streeplijst IFFB. Mod. 1978
Biologische Waarderingskaart
Streeplijst IFB. Mod. 1948
Inv. Autochtone bomen en struiken
Limburgse Atlas
Vlaamse Vegetatie Databank
Waarnemingen.be
Natuurtechnische verwerking van bermmaaisel
Streeplijst IFB. Mod. 1952
Ecologische typologie stilstaande wateren
Herbarium Van Heurck
Streeplijst Floron mod. 1996
Bestrijding aquatische neofyten provincie Antwerpen
Monitoring bosreservaten
HPG polderkartering
Atlas van de Flora van het Brussels Hoofdstedelijk Gewest (1994)
losse waarneming
Vlaamse Landmaatschappij
Streeplijst IFB. Mod. 1962

Not completely sure what defines the 'error for wrong data type', but how could this be improved?

@stijnvanhoey
Contributor

stijnvanhoey commented Sep 21, 2016

  • Verify eventDate: all ISO?: Dates range from 1792-01-01 to 2016-02-23; all are convertible to datetime and were checked with a regex.
  • Add verbatimSRS: Is it Belgian Datum 1972? The .prj file of the GIS data refers to Lambert 1972.
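
For reference, an eventDate check of the kind described (regex plus datetime conversion) could look like the sketch below. It is not the code that was actually run; it assumes eventDate values are plain `YYYY-MM-DD` strings, and `is_iso_date` is a hypothetical helper name.

```python
import re
from datetime import date

# Shape check only: four-digit year, two-digit month, two-digit day.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_iso_date(value: str) -> bool:
    """Return True if value is a valid calendar date written as YYYY-MM-DD."""
    if not ISO_DATE.fullmatch(value):
        return False
    try:
        # Rejects impossible dates such as 2016-02-30.
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ["1792-01-01", "2016-02-23", "2016-02-30", "23/02/2016"]:
        print(candidate, is_iso_date(candidate))
```

The regex alone would accept impossible dates, so the conversion step is what catches values that look right but are not real calendar dates.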

@stijnvanhoey
Contributor

stijnvanhoey commented Sep 21, 2016

  • Verify taxonRank: are those all species?: I counted the records per taxonRank value:

    taxonRank          kingdom    count
    Cultivar           Plantae        7
    Form               Plantae        3
    Generic hybrid     Plantae        3
    Genus              Plantae    26410
    Nothosubspecies    Plantae        6
    Species            Plantae  3545162
    Species aggregate  Plantae     2489
    Species group      Plantae    22731
    Species hybrid     Plantae    15259
    Subspecies         Plantae   155616
    Variety            Plantae    12024

Should we only select the species and exclude the others from the data set?
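
To put that question in numbers, the counts above can be totalled to see how many records a species-only filter would drop. The figures are copied from the table; the variable names are illustrative.

```python
# Record counts per taxonRank, copied from the table above.
rank_counts = {
    "Cultivar": 7,
    "Form": 3,
    "Generic hybrid": 3,
    "Genus": 26410,
    "Nothosubspecies": 6,
    "Species": 3545162,
    "Species aggregate": 2489,
    "Species group": 22731,
    "Species hybrid": 15259,
    "Subspecies": 155616,
    "Variety": 12024,
}

total = sum(rank_counts.values())
species_only = rank_counts["Species"]
excluded = total - species_only  # records dropped by a species-only filter

print(f"total records:    {total}")
print(f"species records:  {species_only}")
print(f"would be dropped: {excluded} ({excluded / total:.1%})")
```

Keeping only Species would drop 234,548 of 3,779,710 records (about 6.2%), most of them Subspecies.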


2 participants