Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Current non-compliance with JSON:API - fetching paginated relationships #420

Open
ml-evs opened this issue Jul 6, 2022 · 8 comments
Open

Comments

@ml-evs
Copy link
Member

ml-evs commented Jul 6, 2022

Below is the text of an issue I wrote but never posted (one of a few...), this has come up again in the discussion on #386. Alternative discussion in the context of collections:

The Fetching relationships part of the JSON:API spec states that we should also serve relationships at e.g., https://example.com/v1/collections/42/relationships/structures. This would then be a normal entry listing endpoint that could be paginated. For such one-to-many relationships this is probably preferable to using included, which is primarily for condensing many-to-one relationships into a single response (e.g., all structures pointing at the same reference).

This is actually part of a bigger non-compliance, as JSON:API mandates in Fetching resources that you should be able to pick out particular attributes of an entry via e.g., https://example.com/v1/structures/123/elements. This would break our allowed IDs (which can have slashes, as I currently do with odbx/1... hence why I never posted the issue 😅).

Orignally posted in #386 (review)


I've just been flicking through the JSON:API specification (as one does) and I noticed that there's a mandatory aspect of fetching data that we do not support in OPTIMADE v1.

From the section Fetching Resources, it is clear that, for compliance with JSON:API, we would need to support 3 levels of URLs for an endpoint, namely (rebranded for OPTIMADE):

For example, the following request fetches a collection of structures:

GET /structures HTTP/1.1
Accept: application/vnd.api+json

The following request fetches a structure:

GET /structures/1 HTTP/1.1
Accept: application/vnd.api+json

EDIT: see #420 (comment) for clarification on the next example.

And the following request fetches a structures's elements:

GET /structures/1/elements HTTP/1.1
Accept: application/vnd.api+json

To add support for this, we would need to ban "/" from appearing in the entry id field, and require that all implementations add these additional /{entry_type}/{entry_id}/{field}-style endpoints.

I don't see these endpoints being useful for OPTIMADE itself as we have so few entry types, so we may decide to diverge from JSON:API, but in that case we should probably add a note in the specification, or make these endpoints optional (which would still require banning "/" from IDs, I guess).

NB: we also have the same problem with relationships, which "MUST" be fetchable via e.g. /structures/<entry_id>/relationships/references.

@rartino
Copy link
Contributor

rartino commented Jul 7, 2022

The argument has been that we want to support all mandatory features of JSON:API so that we can use libraries and tools developed for that standard. If we leave out significant parts, that is likely to not work and we could just as well completely abandon JSON:API. (On the other hand, I don't think any of the implementations I've seen actually uses any such libraries or tools.)

I'll add a slightly edited version of my post from #386:

The Fetching relationships part of the JSON:API spec states that we should also serve relationships at e.g., https://example.com/v1/collections/42/relationships/structures.

Reading Section 1 I'd say that strictly speaking, the specification already requires this construct to be supported:


"The API specification described in this document builds on top of the JSON API v1.0 specification. In particular, the JSON API specification is assumed to apply wherever it is stricter than what is formulated in this document. Exceptions to this rule are stated explicitly (e.g. non-compliant responses are tolerated if a non-standard response format is explicitly requested)."


This is actually part of a bigger non-compliance, as JSON:API mandates in Fetching resources that you should be able to pick out particular attributes of an entry via e.g., https://example.com/v1/structures/123/elements. This would break our allowed IDs (which can have slashes, as I currently do with odbx/1... hence why I never posted the issue 😅).

Based on the quote above, I think the implication is also that we MUST support this.

It doesn't necessary "break" IDs with slashes, just makes the implementation more difficult. How your implementation works out the separation between your "/"-based IDs and these fields is up to you, but you can do it by, e.g., relying on that your ids have a fixed format so you know the first slash is always part of the id. It isn't very helpful of us to give examples showing "/"-based ids though.

@merkys
Copy link
Member

merkys commented Jul 7, 2022

I think we can restore the JSON:API compliance. I recall we removed all restrictions on IDs, but reserved URL characters still need URL-encoding (ID cannot have unencoded ?, for instance). Slash is among the reserved URL characters, thus it has to be URL-encoded when standing for a part of a value. Thus our current specification is OK as is here.

A bit more problematic is the support for 3rd level of URLs. One may argue that current OPTIMADE specification says it MUST be supported as per

API specification described in this document builds on top of the JSON API v1.0 specification.

However, I doubt any implementation supports it now, rendering all of them non-compliant. But this could be fixed (maybe validator could poke implementations towards that 😅).

In short, I find nothing contradictory with JSON:API in the current specification with this regard. It could although be more verbose about transitive JSON:API requirements.

@rartino
Copy link
Contributor

rartino commented Jul 7, 2022

@merkys

Excellent points - so, the only thing we need to resolve this issue is to clarify as part of the JSON API output format documentation that these 3rd level URLs are supposed to be supported. And perhaps clarify your finding that there is no conflict for IDs containing slashes due to the URL encoding.

(Surely our commitment to JSON API makes it too late to back off an integral feature in that standard at this point?)

@merkys
Copy link
Member

merkys commented Jul 7, 2022

@rartino

(Surely our commitment to JSON API makes it too late to back off an integral feature in that standard at this point?)

We may as well actually claim that this was a bug in our specification all along and roll out a patch release stating that 3rd level URL support is the sole exception where we do not follow JSON:API. This would render current implementations valid as they are.

I personally like that OPTIMADE follows JSON:API. I have used some 3rd party tools to interact with OPTIMADE providers as JSON:API-compliant servers, and that was nice. I even mentored a student implementing JSON:API client in C++, as a side advantage being also an OPTIMADE client. Another point is that if we uncouple from JSON:API, we will essentially have to rewrite its specification in OPTIMADE, possibly doubling its current size.

@ml-evs
Copy link
Member Author

ml-evs commented Jul 7, 2022

Agree with both your comments, but not sure of the best way to proceed.

In terms of the feature in its own merits, to me /structures/1/elements feels a bit unnatural (does it just return ["O", "Si"], or is it equivalent to /structures/1?response_fields=elements (i.e. retaining the ID and other OPTIMADE metadata?). Given the example under Fetching Resources section I linked, its not actually super-clear to me if this is distinct from the Fetching Relationships bit. The example given is /articles/1/author, yet the example data for /articles/1 only lists author as a relationship, not an attribute.

The relationships approach, with /structures/1/relationships/references returning the equivalent of our current two-step process of aggregating IDs, seems to me to be very useful. Since most providers do not currently use relationships (just COD and me, I think...) perhaps we could feasibly introduce this as a requirement.

@merkys
Copy link
Member

merkys commented Jul 8, 2022

In terms of the feature in its own merits, to me /structures/1/elements feels a bit unnatural (does it just return ["O", "Si"], or is it equivalent to /structures/1?response_fields=elements (i.e. retaining the ID and other OPTIMADE metadata?).

Interestingly indeed, JSON:API specification (even the upcoming v1.1 version) seems silent on this. I think the second option (/structures/1/elements = /structures/1?response_fields=elements) fits better the spirit of JSON:API (top-level has to be an object, have data/meta/errors fields etc.).

@rartino
Copy link
Contributor

rartino commented Jul 8, 2022

After reading these sections in the JSON:API specification a bit more carefully, I think we are barking up the wrong tree regarding the third level URLs for resources (but the third level URL for relationsships remains relevant.)

The JSON:API specification makes a distinction between Attributes and Resources. In OPTIMADE we have stuffed all OPTIMADE properties into JSON:API attributes. Maybe we shouldn't have done that - in hindsight it seems a bit against the spirit of their design to have large data structures as attributes. But, on the other hand, the JSON:API "subresources" are proper JSON API resource objects, so then every OPTIMADE property would have been a resource object, which I think would have led to a degree of overhead we do not want.

To clarify, look at the following example from the JSON:API specification of the response to requesting article 1. It just indicates a relationship to an author which then points to the author "subresource" of article 1:

{
  "type": "articles",
  "id": "1",
  "attributes": {
    "title": "Rails is Omakase"
  },
  "relationships": {
    "author": {
      "links": {
        "self": "http://example.com/articles/1/relationships/author",
        "related": "http://example.com/articles/1/author"
      },
      "data": { "type": "people", "id": "9" }
    }
  },
  "links": {
    "self": "http://example.com/articles/1"
  }
}

But, this is a different thing from expecting to be able to request attributes on the format: /{entry_type}/{id}/{attribute}, e.g., http://example.com/articles/1/title. I don't see any support for that in the JSON:API specification. Since we use attributes, my conclusion is that we are all fully within the JSON:API specification not supporting any "third level" URLs.

That said, for JSON:API implementations that uses these "subresources", I think the above implies that the response must be the same as if you fetch a "top level" resource, because it is the same thing. Their example of fetching a missing author field at least makes it explicitly clear that they intend a JSON:API-formatted response:

HTTP/1.1 200 OK
Content-Type: application/vnd.api+json

{
  "links": {
    "self": "http://example.com/articles/1/author"
  },
  "data": null
}

@ml-evs
Copy link
Member Author

ml-evs commented Jul 8, 2022

Agreed, I think it was my example in the OP that was misleading. I will rename the issue to shift the discussion onto relationships only.

@ml-evs ml-evs changed the title Current non-compliance with JSON:API - fetching a single field from a resource Current non-compliance with JSON:API - fetching paginated relationships Jul 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants