Measurements for materials and more #103

Open
lhmarsden opened this issue Mar 30, 2023 · 18 comments

Comments

@lhmarsden

The extendedMeasurementOrFact extension is a very useful way to record measurements or facts related to an occurrence or event in a standardised, potentially machine-readable way.

However, one might have measurements or facts related to a range of different things. For example, I work with many biologists who take measurements of materials or samples they are logging in a Material Sample Extension.

In pull request #102, I am suggesting that the relatedResourceID term be added to the extendedMeasurementOrFact extension, after some discussion with @dagendresen.

One could use this to record measurements related not only to material samples, but anything else, without the need of a resourceRelationship extension.

@timrobertson100
Member

Thanks for opening this @lhmarsden
I'll ping the OBIS group to comment

@albenson-usgs
Contributor

One other thing to consider: if we make this change to EMoF, should we also make the same change to MoF?

@albenson-usgs
Contributor

Haven't thought it through, just thinking out loud, would this be a workable solution for this request tdwg/dwc#362 ?

@tucotuco
Collaborator

> Haven't thought it through, just thinking out loud, would this be a workable solution for this request tdwg/dwc#362 ?

This request [tdwg/dwc#362] has passed public review and is being prepared for an Executive decision.

@pieterprovoost

This sounds like a generalization of occurrenceID in ExtendedMeasurementOrFact. If we add this, then maybe occurrenceID should be retired or deprecated? I agree with @albenson-usgs regarding tdwg/dwc#362, but I'm not sure how to reconcile the two proposals.

@tucotuco
Collaborator

If there are doubts, I urge you to jump in and question tdwg/dwc#362 before it goes to the ratification process. In the Unified Model, we are proposing to allow Assertions on anything by declaring both the type of thing the Assertion is about (which "table") and the key for the record for that type (the equivalent of relatedResourceID).
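
As a rough illustration of "type of thing plus key", an Assertion can be pictured as a record that carries both. This is only a sketch of the idea described in this comment, not the Unified Model schema: the class, its field names, the helper `assertions_about`, and the sample identifiers and values are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    target_type: str      # which "table" the assertion is about (hypothetical field name)
    target_id: str        # key of the record in that table (plays the role of relatedResourceID)
    assertion_type: str   # what was measured or stated
    assertion_value: str  # the measured or stated value


def assertions_about(assertions, target_type, target_id):
    """Return the assertions attached to one record of one type."""
    return [
        a for a in assertions
        if a.target_type == target_type and a.target_id == target_id
    ]


# Made-up records: one assertion on an occurrence, one on a material sample.
assertions = [
    Assertion("Occurrence", "occ_001", "body length", "27 cm"),
    Assertion("MaterialSample", "sample_001", "wet weight", "1.2 g"),
]

sample_assertions = assertions_about(assertions, "MaterialSample", "sample_001")
```

Because the target type travels with the key, the same structure can point at an occurrence, an event, a material sample, or anything else without a separate relationship table.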

@ymgan

ymgan commented Mar 31, 2023

AntOBIS supports this proposal.

We have some use cases. For example, the stomach content of a predator in an Occurrence is assessed to determine the fraction of the predator's diet (by weight) that each prey type makes up. Having this term in eMoF would let us establish a predator-prey relationship more easily, by using the occurrenceID of the prey as the relatedResourceID for the measurement of the predator. So we might still need occurrenceID in eMoF here (I think), unless the dataset has to be published as Occurrence core.

edit: after talking to @pieterprovoost, the relationship probably should be established at Occurrence level (e.g. associatedTaxa or associatedOccurrence)

@dagendresen

dwc:ResourceRelationship
dwc:resourceID is the subject, and
dwc:relatedResourceID is the object

Would the resource in a measurement (or fact) be the subject or the object? Would the eMoF document something the resource is doing or something done to the resource?

I have been thinking of the occurrenceID resource of the eMoF as the subject of the measurement. Would it then be better replaced by adding the resourceID term to the eMoF extension?

@ymgan

ymgan commented Apr 11, 2023

For AntOBIS example:

  • 1 predator eats multiple prey.
  • The stomach of the predator was analyzed (regurgitate content, stomach flushing, etc.) to identify the prey.
  • The fraction of diet by weight was measured.

So

  • resourceID/occurrenceID is the predator's occurrence, because the measurement is of its stomach content.
  • relatedResourceID is the prey's occurrence.
  • Each prey item that makes up part of the predator's diet (there are multiple prey) gets its own record in eMoF for this measurement.

I think it is nicer to specify it here than to put a list of prey occurrences under the predator's occurrence (associatedOccurrence). And of course, alternatively, we can use the resourceRelationship extension.

I hope our example makes sense?

@lhmarsden
Author

Hi,

Do you know how long it is likely to be before I can use (if accepted) resourceID in the emof extension? I have some data to publish, and am wondering if I should proceed with a resourceRelationship extension instead.

Thanks!

@pieterprovoost

@ymgan Would you mind writing out an example? I'm not clear on how the predator/prey problem relates to this proposal. This is how I interpret the current proposal:

| | subject | predicate | object |
|---|---|---|---|
| ResourceRelationship | resourceID | relationshipOfResourceID | relatedResourceID |
| eMoF | occurrenceID | measurementTypeID | measurementValueID |
| eMoF change proposal | resourceID | measurementTypeID | measurementValueID |
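
The subject/predicate/object reading in the table above can be sketched as a small function. This is only an illustration under assumptions: `Triple` and `emof_row_to_triple` are hypothetical names, resourceID in eMoF is the proposed term rather than an existing one, and the fallback to occurrenceID simply mirrors the current extension.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def emof_row_to_triple(row: dict) -> Optional[Triple]:
    """Read one eMoF row as a subject/predicate/object triple.

    Prefers the proposed resourceID and falls back to occurrenceID,
    which is the subject in the current extension. Returns None when
    the row names no subject at all.
    """
    subject = row.get("resourceID") or row.get("occurrenceID")
    if subject is None:
        return None
    return Triple(subject, row["measurementTypeID"], row["measurementValueID"])
```

A row that carries only occurrenceID is read exactly as it is today, so existing datasets would not need to change under this reading.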

@lhmarsden
Author

I think your interpretation of the proposal is correct, @pieterprovoost. A way of recording measurements related to a materialSample or any other resource.

@ymgan

ymgan commented May 2, 2023

occurrence

| occurrenceID | scientificName | associatedOccurrences |
|---|---|---|
| occ_001 | Pachyptila belcheri | "predator of" : ["occ_002", "occ_003"] |
| occ_002 | Crustacea | |
| occ_003 | Euphausia vallentini | |

eMoF

| occurrenceID | relatedResourceID | measurementType | measurementValue |
|---|---|---|---|
| occ_001 | occ_002 | fraction diet by prey items based on regurgitate content | 0.997 |
| occ_001 | occ_003 | fraction diet by prey items based on regurgitate content | 0.002 |

It is the measurement of the stomach content of the predator (occ_001), so I think the eMoF records should point to occ_001. Without the relatedResourceID, the information of the prey established based on stomach content of the bird is lost unless I use the resourceRelationship extension.

| occurrenceID | measurementType | measurementValue |
|---|---|---|
| occ_001 | fraction diet by prey items based on regurgitate content | 0.997 |
| occ_001 | fraction diet by prey items based on regurgitate content | 0.002 |

That is how I look at it, but please correct me if my understanding is wrong.


Edit: looking at this after thinking a little more based on Guillaume's comment:

occurrence

| occurrenceID | scientificName | basisOfRecord | preparations | associatedOccurrences |
|---|---|---|---|---|
| occ_001 | Pachyptila belcheri | HumanObservation | | "predator of" : ["occ_002", "occ_003"] |
| occ_002 | Crustacea | MaterialSample | regurgitate content | |
| occ_003 | Euphausia vallentini | MaterialSample | regurgitate content | |

eMoF

| occurrenceID | relatedResourceID | measurementType | measurementValue |
|---|---|---|---|
| occ_002 | occ_001 | fraction diet based on regurgitate content | 0.997 |
| occ_003 | occ_001 | fraction diet based on regurgitate content | 0.002 |
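
To make the second layout concrete (eMoF rows keyed on the prey occurrence, with relatedResourceID pointing back to the predator), here is a minimal sketch of how a consumer could rebuild the diet composition. It assumes relatedResourceID is available in eMoF, which is only what this issue proposes; the function name `diet_by_predator` is hypothetical, and the rows are the ones from the tables above.

```python
from collections import defaultdict

# occurrenceID -> scientificName, taken from the occurrence table above.
names = {
    "occ_001": "Pachyptila belcheri",
    "occ_002": "Crustacea",
    "occ_003": "Euphausia vallentini",
}

# eMoF rows from the second layout: the prey is the occurrence,
# the predator is referenced through the proposed relatedResourceID.
emof = [
    {"occurrenceID": "occ_002", "relatedResourceID": "occ_001",
     "measurementType": "fraction diet based on regurgitate content",
     "measurementValue": 0.997},
    {"occurrenceID": "occ_003", "relatedResourceID": "occ_001",
     "measurementType": "fraction diet based on regurgitate content",
     "measurementValue": 0.002},
]


def diet_by_predator(emof_rows, occurrence_names):
    """Group diet fractions as {predator name: {prey name: fraction}}."""
    diet = defaultdict(dict)
    for row in emof_rows:
        predator = occurrence_names[row["relatedResourceID"]]
        prey = occurrence_names[row["occurrenceID"]]
        diet[predator][prey] = row["measurementValue"]
    return dict(diet)


diet = diet_by_predator(emof, names)
```

Without relatedResourceID on the rows, the same grouping would need a lookup through associatedOccurrences or a resourceRelationship extension.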

@pieterprovoost

@lhmarsden Replacing occurrenceID with resourceID has considerable impact on our indexing and is not something that can be achieved in the short term. What we could do is add resourceID, keep occurrenceID for now, and keep indexing as we do now, taking only occurrenceID into account.

@lhmarsden
Author

lhmarsden commented May 3, 2023 via email

@guillaumebody

Hi,

> occurrence
>
> | occurrenceID | scientificName | associatedOccurrences |
> |---|---|---|
> | occ_001 | Pachyptila belcheri | "predator of" : ["occ_002", "occ_003"] |
> | occ_002 | Crustacea | |
> | occ_003 | Euphausia vallentini | |
>
> eMoF
>
> | occurrenceID | relatedResourceID | measurementType | measurementValue |
> |---|---|---|---|
> | occ_001 | occ_002 | fraction diet by prey items based on regurgitate content | 0.997 |
> | occ_001 | occ_003 | fraction diet by prey items based on regurgitate content | 0.002 |
>
> It is the measurement of the stomach content of the predator (occ_001), so I think the eMoF records should point to occ_001. Without the relatedResourceID, the information of the prey established based on stomach content of the bird is lost unless I use the resourceRelationship extension.
>
> | occurrenceID | measurementType | measurementValue |
> |---|---|---|
> | occ_001 | fraction diet by prey items based on regurgitate content | 0.997 |
> | occ_001 | fraction diet by prey items based on regurgitate content | 0.002 |
>
> That is how I look at it, but please correct me if my understanding is wrong.

We actually proposed another way to deal with such an example. We had similar issues while identifying pathogens within another species (Applying Darwin Core data standard to wildlife disease – advancements toward a new data model). See also #413.

Using this parentOccurenceID term, it would result in

occurrence

| occurrenceID | parentOccurenceID | scientificName | basisOfRecord | preparation |
|---|---|---|---|---|
| occ_001 | | Pachyptila belcheri | human observation | |
| occ_002 | occ_001 | Crustacea | material sample | regurgitate content |
| occ_003 | occ_001 | Euphausia vallentini | material sample | regurgitate content |

eMoF

| measurementID | occurrenceID | measurementType | measurementValue |
|---|---|---|---|
| mea_001 | occ_002 | fraction diet | 0.997 |
| mea_002 | occ_003 | fraction diet | 0.002 |
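
With this layout the eMoF rows stay exactly as they are today, and the link to the predator is resolved through the occurrence table. A minimal sketch of that lookup follows, using the rows from the tables above. Note that parentOccurenceID is a proposed term (see #413), not an accepted DwC term, and the function name `measurements_with_parent` is hypothetical.

```python
# Occurrence rows from the table above; parentOccurenceID is the proposed term.
occurrences = [
    {"occurrenceID": "occ_001", "parentOccurenceID": None,
     "scientificName": "Pachyptila belcheri"},
    {"occurrenceID": "occ_002", "parentOccurenceID": "occ_001",
     "scientificName": "Crustacea"},
    {"occurrenceID": "occ_003", "parentOccurenceID": "occ_001",
     "scientificName": "Euphausia vallentini"},
]

# eMoF rows from the table above; no extra linking column is needed.
emof = [
    {"measurementID": "mea_001", "occurrenceID": "occ_002",
     "measurementType": "fraction diet", "measurementValue": 0.997},
    {"measurementID": "mea_002", "occurrenceID": "occ_003",
     "measurementType": "fraction diet", "measurementValue": 0.002},
]


def measurements_with_parent(emof_rows, occurrence_rows):
    """Attach the parent occurrence's name to each measurement.

    Returns tuples of (measurementID, measured taxon, parent taxon, value);
    the parent taxon is None when the occurrence has no parent.
    """
    by_id = {o["occurrenceID"]: o for o in occurrence_rows}
    result = []
    for m in emof_rows:
        child = by_id[m["occurrenceID"]]
        parent = by_id.get(child["parentOccurenceID"])
        parent_name = parent["scientificName"] if parent else None
        result.append((m["measurementID"], child["scientificName"],
                       parent_name, m["measurementValue"]))
    return result


resolved = measurements_with_parent(emof, occurrences)
```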

@ymgan

ymgan commented May 4, 2023

That seems to work!! Thank you very much for taking your time to write this down @guillaumebody !! I appreciate it!

@guillaumebody

You're welcome @ymgan, but please indicate in #413 that this is a relevant solution for your situation. The parentOccurenceID is currently not an accepted DwC term.

@tucotuco this situation pleads for having a "parentAssertionID" concept alongside the "relatedAssertionID" concept in the new GBIF model.
