added relatedResourceID to eMoF #102

Draft
wants to merge 2 commits into
base: master
from

lhmarsden

The extendedMeasurementOrFact extension is a very useful way to record measurements or facts related to an occurrence or event in a standardised, potentially machine-readable way.

However, one might have measurements or facts related to a range of different things. For example, I work with many biologists who take measurements of materials or samples they are logging in a Material Sample Extension.

In this pull request, after some discussion with @dagendresen, I suggest adding the relatedResourceID term to the extendedMeasurementOrFact extension.

One could use this to record measurements related not only to material samples, but anything else, without the need of a resourceRelationship extension.
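
As a rough illustration of the proposal (a minimal sketch: the rows, the `id` column, and the helper names are hypothetical and not taken from the PR), the Python below groups eMoF measurements by the material sample they point to through `relatedResourceID`:

```python
import csv
import io

# Hypothetical material sample extension rows (illustrative values only).
MATERIAL_SAMPLES = """id,materialSampleID,materialSampleType
event-1,sample-A,sediment core
event-1,sample-B,water sample
"""

# Hypothetical eMoF rows; relatedResourceID is the column proposed in this PR.
EMOF = """id,measurementType,measurementValue,measurementUnit,relatedResourceID
event-1,core length,35,cm,sample-A
event-1,volume,2,L,sample-B
event-1,air temperature,4.5,degrees Celsius,
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def measurements_by_sample(emof_rows, sample_rows):
    """Group measurement types by the material sample they reference."""
    known = {row["materialSampleID"] for row in sample_rows}
    grouped = {sample_id: [] for sample_id in known}
    for row in emof_rows:
        target = row["relatedResourceID"]
        # Rows with an empty or unknown relatedResourceID are skipped here.
        if target in known:
            grouped[target].append(row["measurementType"])
    return grouped


result = measurements_by_sample(read_rows(EMOF), read_rows(MATERIAL_SAMPLES))
```

Rows with an empty `relatedResourceID`, such as the air temperature row, remain ordinary event-level measurements.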

@timrobertson100
Member

Thank you @lhmarsden!

Can you please open an issue describing the rationale for this (i.e. copying from above), so those who are using this extensively have the opportunity to comment (mainly OBIS) before we move to implement it?

@pieterprovoost

@lhmarsden Would you mind updating this PR to resourceID?

@lhmarsden
Author

@pieterprovoost Done

@timrobertson100
Member

@pieterprovoost - thanks for approving this request.

@MattBlissett @ManonGros @tucotuco - please can you comment if you have concerns, or thumb up that you agree to merge this? It's an extension used by OBIS primarily (exclusively?) and has come from them.

timrobertson100 marked this pull request as draft on May 15, 2023, 14:34
@timrobertson100
Member

timrobertson100 commented May 15, 2023

I've converted this to draft as we will need to implement this as a new edition (filename and issued tags) but will fix that up as we merge this.

I've pinged a couple of people for a final possibility to comment before merging.

@ManonGros

ManonGros commented May 15, 2023

@lhmarsden I don't think I understand the idea.
When would someone be using the relatedResourceID?
Would it be in addition to an occurrenceID or eventID?
If there is one eMoF in a dataset with occurrences and a materialSample extension, how would users know where to look for the resource based on the ID? Would the relatedResourceID always refer to the extensions and never the core?

[EDIT] This thread (#103) answers some questions (it wouldn't replace the occurrenceID and eventID, and I can see examples of how it can be used for occurrences).
I am not sure I understand how this would work though:

> One could use this to record measurements related not only to material samples, but anything else, without the need of a resourceRelationship extension.

How would users know what the measurement refers to?

@tucotuco
Collaborator

OBIS is definitely not the only group to be using the extension. I have recommended to others to use it many times.

Regardless of who might be using it, I would like to echo @ManonGros's questions about implementation. I like the added capacity this reflects, but how will anyone know which file to look in for the connection without scanning until they find the matching identifier? And then, having found one, what happens if it is in a one-to-many relationship? What happens if identifiers are not GUIDs, but identifiers unique only within the scope of the classes they belong to?
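
The lookup problem can be sketched as follows. This is a hypothetical example, not an existing tool: the table names, the `TABLES` layout, and the `resolve` helper are assumptions made for illustration. With no hint about the target class, a consumer has to scan every table, and a locally scoped identifier such as `1` can match more than one of them:

```python
# Hypothetical tables in an archive, each keyed by its own identifier column.
TABLES = {
    "occurrence": {
        "id_field": "occurrenceID",
        "rows": [{"occurrenceID": "1"}, {"occurrenceID": "2"}],
    },
    "materialsample": {
        "id_field": "materialSampleID",
        "rows": [{"materialSampleID": "1"}, {"materialSampleID": "S-9"}],
    },
}


def resolve(resource_id, tables):
    """Return (table name, row) pairs whose identifier equals resource_id."""
    matches = []
    for name, table in tables.items():
        for row in table["rows"]:
            if row[table["id_field"]] == resource_id:
                matches.append((name, row))
    return matches


unique = resolve("S-9", TABLES)    # found in exactly one table
ambiguous = resolve("1", TABLES)   # same local identifier used in two classes
missing = resolve("nope", TABLES)  # possibly external to the data set
```

Only globally unique identifiers make the first case the norm; the other two outcomes leave the consumer guessing.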

I think it would be more robust to add specific identifiers for the classes that there is demand to support, e.g., materialSampleID. And even in that case, the MaterialSample task group is hoping to recommend phasing out the term MaterialSample in favor of MaterialEntity. If they both begin to come into use, there is going to be some confusion. My best guess is that MaterialEntity and materialEntityID will be ratified as new DwC terms. You could anticipate that, with the associated risk, by including materialEntityID in the EMoF extension.

These aren't objections, they are suggestions about potential consequences of the proposed way forward.

@lhmarsden
Author

> If there is one eMoF in a dataset with occurrences and a materialSample extension, how would users know where to look for the resource based on the ID?

> How would users know what the measurement refers to?

> Regardless of who might be using it, I would like to echo @ManonGros's questions about implementation. I like the added capacity this reflects, but how will anyone know which file to look in for the connection without scanning until they find the matching identifier? And then, having found one, what happens if it is in a one-to-many relationship? What happens if identifiers are not GUIDs, but identifiers unique only within the scope of the classes they belong to?

How does this currently work for the resourceRelationship extension? I guess the same problem is encountered here.

I have no objection to using materialSampleID or materialEntityID either.

But then what about other IDs? taxonID? organismID? geologicalContextID? There are many scenarios where it would be useful to include measurements for which existing terms are not available/suitable. Maybe all the relevant ID terms should be added?

@timrobertson100
Member

timrobertson100 commented May 16, 2023

> How does this currently work for the resourceRelationship extension? I guess the same problem is encountered here.

Yes, it appears to be the same problem to me.

> There are many scenarios where it would be useful to include measurements for which existing terms are not available/suitable. Maybe all the relevant ID terms should be added?

This is precisely one of the challenges posed by the DwC-A star schema and its forced denormalization. That's a reason we are exploring more expressive models using Frictionless Data schemas, such as the partial Material model found here, which we are researching alongside the IPT v3 branch (some months out still).
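
For comparison, a normalized model can declare the link explicitly. The sketch below is not the model under development; it assumes a hypothetical `material` resource and a `materialEntityID` field, expresses the link in the `foreignKeys` layout of the Frictionless Table Schema, and checks it with a small hand-written helper instead of a library:

```python
# Hypothetical schema fragment using the Table Schema "foreignKeys" layout.
MEASUREMENT_SCHEMA = {
    "fields": [
        {"name": "measurementID", "type": "string"},
        {"name": "materialEntityID", "type": "string"},
        {"name": "measurementType", "type": "string"},
        {"name": "measurementValue", "type": "string"},
    ],
    "foreignKeys": [
        {
            "fields": "materialEntityID",
            "reference": {"resource": "material", "fields": "materialEntityID"},
        }
    ],
}

# Hypothetical data: one referenced resource and two measurement rows.
RESOURCES = {"material": [{"materialEntityID": "M-1"}]}
MEASUREMENTS = [
    {"measurementID": "m1", "materialEntityID": "M-1",
     "measurementType": "mass", "measurementValue": "12"},
    {"measurementID": "m2", "materialEntityID": "M-404",
     "measurementType": "mass", "measurementValue": "7"},
]


def dangling_references(rows, schema, resources):
    """List (field, value) pairs that point at no row in the referenced resource."""
    problems = []
    for key in schema["foreignKeys"]:
        ref = key["reference"]
        valid = {r[ref["fields"]] for r in resources[ref["resource"]]}
        for row in rows:
            value = row.get(key["fields"])
            if value and value not in valid:
                problems.append((key["fields"], value))
    return problems


problems = dangling_references(MEASUREMENTS, MEASUREMENT_SCHEMA, RESOURCES)
```

Because the schema names the target resource, a consumer knows exactly which table to consult, which is the context a bare identifier column lacks.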

It's not ideal, but considering the circumstances, I believe it's reasonable to include the requested term in this pull request. I want to acknowledge the limitations highlighted by @tucotuco, the upcoming proper fix with the model, and the fact that neither GBIF.org (nor OBIS?) will attempt to interpret this. However, it's important to remember that extensions were designed to allow sub-communities to add additional elements according to their specific needs.

@pieterprovoost

@timrobertson100 I can confirm that OBIS will not interpret this either.

@ManonGros

Thanks Luke! I was under the impression that the Resource Relationship extension was only to express relationships between the records of the core. I see now that I was mistaken:

> Support for relationships between resources in the Core, in an extension, or external to the data set

I guess it does include the same limitations as having a relatedResourceID. I would have expected the Resource Relationship extension to have some additional field specifying where to find the related resource.

I find that relatedResourceID without any context, both in the eMoF and in the Resource Relationship extension, is difficult for users to interpret. That being said, the new model should address those limitations. In the meantime, maybe it is good to have relatedResourceID in the eMoF.

5 participants