Add an MDS Metrics API #485

Closed
whereissean opened this issue Apr 27, 2020 · 22 comments · Fixed by #487 or #587
Labels
Agency Specific to the Agency API Metrics Related to the Metrics API and related topics privacy Implications around privacy for the attention of the OMF Privacy Committee Provider Specific to the Provider API

@whereissean
Contributor

whereissean commented Apr 27, 2020

Is your feature request related to a problem? Please describe.

There is currently no standard way to retrieve metrics calculated from MDS data (provider or agency) or to define a standard set of useful MDS-based data aggregations.  We have heard from the OMF community that this leads to a number of problems:

  • Cities need clearly defined, best-practice metrics for operating, measuring, and managing emerging micro-mobility and other transportation programs using MDS data.
  • There is currently no standard counting methodology that mobility providers and cities have agreed upon. This consequently causes friction when establishing a new mobility program and evaluating its impact.
  • Cities need trusted data sources on which to base longer-term studies of citizen impact.
  • Mobility providers need consistent rules and measures between city deployments in order to make their operations more scalable.

Describe the solution you'd like

The proposed Metrics API is intended to give users of MDS (cities, mobility service providers, and third-party ecosystem services) a standard way to describe available metrics consistently, and an extensible interface for querying core MDS metrics and future metrics still to be defined. It should be a framework to describe how different API users and hosts can:

  1. Define and communicate available metrics;
  2. Request these metrics across multiple dimensions and filters;
  3. Serve these metrics either to external parties or to other MDS API consumers, without requiring the transmission of the underlying raw data;
  4. Ensure that multiple parties can reliably reproduce the same metrics, given the same data.

The goal is to be able to define “Metric X” and then ensure that when “X” is calculated by the city, authorized parties, or transportation providers, the result will be identical. For example, while n different methods may exist to calculate the utilization of a vehicle or a fleet for a given time range, the Metrics API is intended to ensure that for given method k, the same result will be produced regardless of who conducts the calculation, and there is a standard interface for authorized users to receive this data without requiring access to underlying raw data.
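As a rough illustration of "method k always gives the same result", the sketch below registers two utilization methods under explicit names and computes them from the same trip list. Everything here is an assumption made for illustration: the `Trip` fields, the method names, and the two formulas are hypothetical and are not taken from the proposed specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Trip:
    # Hypothetical minimal trip record, not the MDS trip schema.
    vehicle_id: str
    start_s: int  # trip start, seconds since epoch
    end_s: int    # trip end, seconds since epoch


def utilization_time_share(trips: List[Trip], fleet_size: int,
                           window_start_s: int, window_end_s: int) -> float:
    """Hypothetical method: share of fleet-time spent on trips in the window."""
    window = window_end_s - window_start_s
    if fleet_size <= 0 or window <= 0:
        raise ValueError("fleet_size and window length must be positive")
    on_trip_s = 0
    for trip in trips:
        # Clip each trip to the window so partial overlaps count partially.
        overlap = min(trip.end_s, window_end_s) - max(trip.start_s, window_start_s)
        on_trip_s += max(0, overlap)
    return on_trip_s / (fleet_size * window)


def utilization_trips_per_vehicle(trips: List[Trip], fleet_size: int,
                                  window_start_s: int, window_end_s: int) -> float:
    """Hypothetical method: trips starting in the window, per fleet vehicle."""
    if fleet_size <= 0:
        raise ValueError("fleet_size must be positive")
    started = sum(1 for t in trips if window_start_s <= t.start_s < window_end_s)
    return started / fleet_size


# Each method is addressed by an explicit name, so two parties asking for the
# same name on the same data run the same calculation.
METHODS: Dict[str, Callable[[List[Trip], int, int, int], float]] = {
    "utilization.time_share": utilization_time_share,
    "utilization.trips_per_vehicle": utilization_trips_per_vehicle,
}


def compute(method: str, trips: List[Trip], fleet_size: int,
            window_start_s: int, window_end_s: int) -> float:
    return METHODS[method](trips, fleet_size, window_start_s, window_end_s)
```

The two methods give different numbers for the same data, which is expected; the point is that each named method is deterministic and does not depend on who runs it or on the order of the input records.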

The Metrics API is intended to be useful for future MDS use cases, best practices and requirements.  Particularly notable is that it provides the foundation to implement data anonymization best practices, such as k-anonymity.  It also represents an important component needed to enable new MDS policy types and compliance evaluation as well as operations management use cases that can only be achieved by linking MDS metrics and MDS policy.
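To make the anonymization point concrete, here is a minimal sketch of small-count suppression with a threshold `k`, applied to already aggregated trip counts. This is a simplified technique in the spirit of k-anonymity, not a full k-anonymity implementation, and the function name, cell keys, and threshold are assumptions for illustration only.

```python
from typing import Dict, Optional, Tuple

# A cell is identified by (geography label, time bucket label); both labels
# are hypothetical and only serve this example.
Cell = Tuple[str, str]


def suppress_small_counts(counts: Dict[Cell, int], k: int) -> Dict[Cell, Optional[int]]:
    """Return counts with every cell below k replaced by None (suppressed)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return {cell: (n if n >= k else None) for cell, n in counts.items()}
```

For example, with `k = 5` a cell holding 12 trips is returned unchanged while a cell holding 3 trips is returned as `None`. Because the metric is served already aggregated, this kind of rule can be applied without the consumer ever seeing the underlying raw records.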

This proposed specification is not intended to represent a complete data pipeline or analytics service. It is also not meant to define the complete set of MDS metrics, only a useful starting point.

Is this a breaking change

  • No, not breaking

Impacted Spec

  • agency
  • provider

Describe alternatives you've considered

It is hoped that this work can be complementary to other projects working to define, develop, and implement metrics services or metrics processing pipelines for MDS data.  Much of this proposal was inspired by excellent work done by OMF member cities and SharedStreets with their SharedStreets Mobility Metrics.

This proposal represents work done without full visibility into the efforts of the Mobility Data Collaborative (MDC).  We hope to bring the metrics defined in the Metrics Definitions PR to alignment with those MDC describes, once they become public.

Additional context

This specification received initial input from a variety of OMF contributors, representing city transportation departments (LADOT), ecosystem services stakeholders (Blue Systems, Lacuna, Ellis & Associates), and mobility service providers (Bird).  We hope it encourages discussion and creation in the OMF on this important subject.  A reference implementation of this API is not included at this time, but hopefully will be developed and contributed following additional community feedback.

Specific thanks to @bhandzo and @HenriJ.

Proposal consists of the following PRs
Metrics API PR #486 and Metrics Definitions PR #487

@Retzoh
Contributor

Retzoh commented Apr 30, 2020

Charles Noling, please spell out here the use cases for the API that you had in mind during the city-services working group call.

@Retzoh
Contributor

Retzoh commented Apr 30, 2020

@whereissean, it would be great if you could join the next city-services working group session on May 14th, since we plan to discuss the Metrics API in detail.

@whereissean
Contributor Author

@Retzoh absolutely will be there. Apologies that I couldn't join today.

@schnuerle
Member

I wanted to add some examples of what cities are asking for in their reporting requirements for providers. Some of it could be derived from MDS (if we can agree on how to calculate things) and some is outside of MDS (and should be).

Here is Louisville, KY's example from their dockless policy (page 16):

The operator shall provide a monthly report by the end of the first full week of the
following month that is in a format acceptable to Metro that includes, but is not be limited
to, the following:

  1. [* trips] Total number of rides for the previous month and total miles ridden.
  2. [* status_changes] Total number of vehicles in service for the previous month.
  3. [* trips] Number of rides per vehicle per day.
  4. [* status_changes/trips] Location and performance of all preferred and designated parking areas.
  5. [* status_changes] Number of vehicles removed from service
  6. Operator staffing levels
  7. Customer Service Cases, including complaints registered
  8. Vandalism Incidents
  9. Crash reports (to include injury/fatalities)
  10. If available to the Operator, an aggregated breakdown of customers by gender
    and age monthly. Gender must be reported as male, female, and non‐binary. Age
    must be reported using these eight age groups: under 5, 5‐17, 18‐24, 25‐34, 35‐
    44, 45‐54, 55‐64, 65 and over.

Items tagged [* api] (tags added by me) can be derived from the MDS feed, though the methodology is not always agreed upon. The other items cannot be derived from MDS.
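As a sketch of how the trip-derived items (1 and 3) might be computed, the function below aggregates a list of trip records. The field names `vehicle_id`, `trip_distance` (meters), and `start_time` (milliseconds since epoch) are assumed here, modeled on the Provider trips feed; bucketing days in UTC is also an assumption, and a city would more likely want local time, which is exactly the kind of methodology choice that needs agreeing on.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, Tuple

METERS_PER_MILE = 1609.344


def monthly_trip_summary(trips: Iterable[dict]) -> dict:
    """Total rides, total miles, and rides per vehicle per day for the given trips."""
    total_rides = 0
    total_meters = 0.0
    rides_per_vehicle_day: Dict[Tuple[str, str], int] = defaultdict(int)
    for trip in trips:
        total_rides += 1
        total_meters += trip["trip_distance"]
        # Assumption: a trip belongs to the UTC calendar day on which it starts.
        day = datetime.fromtimestamp(trip["start_time"] / 1000, tz=timezone.utc).date()
        rides_per_vehicle_day[(trip["vehicle_id"], day.isoformat())] += 1
    return {
        "total_rides": total_rides,
        "total_miles": total_meters / METERS_PER_MILE,
        "rides_per_vehicle_day": dict(rides_per_vehicle_day),
    }
```

The caller is expected to pass only the trips for the reporting month; how a trip that spans midnight or a month boundary is attributed is left open here on purpose.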

Interesting ones here that could be part of Metrics and are not in MDS are:

  1. Trips with complaints/customer service calls.
  2. Trips with vandalism incidents.
  3. Trips with crash reports to operator.
  4. Trip counts broken down by sex (though the categories may need to be aligned with Fenway Institute or Williams Institute recommendations, or with current best practices).
  5. Trip counts broken down by age bracket (though I'd recommend 5-year buckets, as in the US Census, or 10-year buckets).

It would be good to collect other examples of this from current city policy documents.

@thekaveman
Collaborator

thekaveman commented May 20, 2020

Here are Santa Monica's examples from the Shared Mobility Device Pilot Program Administrative Regulations pages 14-15 (last updated April 2019):

3.16.2 Reporting

Operators must provide accurate weekly summaries to the City describing customer and staff incidents, injuries, system operation, system use, reported complaints, customer service responses, system maintenance, and education and outreach efforts. Reports will be provided to the City in the format defined by the City.

A monthly dynamic cap report must be submitted to the City on the second business day of each month following the program launch to allow the City to assess and potentially adjust fleet deployment quantities.

...

3.16.3 System Reports

Anonymized data reports to the City are required weekly for the following municipal-level data:

(a) Total users in system by month
(b) Trip number by day, week and month
(c) Detailed, aggregate trip origin/destination information
(d) Trip length and time
(e) Hourly fleet utilization with trip origin or destination in Santa Monica and within the Downtown area*
(f) Hourly device quantities within Santa Monica and within the Downtown area*

@joshuaandrewjohnson1

The Mobility Data Collaborative recently published its Data Sharing Glossary and Metrics document, which is referenced above and which OMF reviewed and contributed to. It should be utilized here.

MDCGlossaryMetrics02202004.pdf

@sharades

sharades commented May 21, 2020

DC requires 7 additional monthly reports within 10 days of the end of the month. I've included the overarching concepts below but for specific fields please see the document attached.
2020.2.24 Attatchment C 2020 Dockless Permit Reporting V2 .pdf

These include:

  • Aggregated user data
  • Aggregated vehicle data
  • Summary report
  • Customer service report (interactions with customers)
  • Customer summary report (low-income customer plan ridership)
  • Staging areas
  • Unmet needs: the first location, by census block, where a user opened the application to search for a vehicle and did not unlock one

@Retzoh
Contributor

Retzoh commented May 22, 2020

Subjects discussed during 2020-05-14 city-services working group call:

Presentation by @whereissean: https://docs.google.com/presentation/d/1bg36oyQhZlBCQb07JCUFyVe97WsCAeRCDeXjaFMXYM8/edit?usp=sharing

Presentations of reports by @dirkdk (Spin): https://docs.google.com/document/d/1qZvmJzoWrnOVZeaubqxOLNVYKzueWQEaU3C7H3kQ1dw/edit?pli=1#

Short-term actions:

  • @whereissean: add a section about data sensitivity, authentication and reservations against open data.

  • @whereissean: make the compatibility with MDC metrics explicit (@jfh01 as official contact person for MDC).

  • Use cases:

    • @schnuerle: let cities and providers agree on the interpretation of MDS data.
  • Technical issues:

    • @billdirks: how could we version metrics to reflect definition changes / historical data changes?
    • @whereissean @schnuerle: what is the best way to control the period on which metrics should be returned?
    • @billdirks: How should response lines be ordered?

@schnuerle
Member

For discussion around how to request the time period.

Should a request specify a start date plus a count of intervals (no end date), or a start date and end date plus an interval length? The former is more machine-readable; the latter is more human-readable and closer to what a data analyst would use.

I think start and end dates are more consistent with MDS and other kinds of APIs where you request data over a time range. And you can specify the interval (minute, hour, day, week, month) over that time range and those values would be returned in an array.
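A minimal sketch of the start/end plus interval option follows: it splits the requested range into buckets and returns one count per bucket as an array. The function names and the `INTERVALS` table are hypothetical; `month` is left out because it is not a fixed-length interval and would need calendar-aware handling.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

# Fixed-length intervals only; "month" would need calendar arithmetic.
INTERVALS = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
}


def interval_buckets(start: datetime, end: datetime,
                     interval: str) -> List[Tuple[datetime, datetime]]:
    """Split [start, end) into consecutive buckets; the last one may be shorter."""
    if end <= start:
        raise ValueError("end must be after start")
    step = INTERVALS[interval]
    buckets = []
    cursor = start
    while cursor < end:
        buckets.append((cursor, min(cursor + step, end)))
        cursor += step
    return buckets


def counts_per_interval(event_times: Iterable[datetime], start: datetime,
                        end: datetime, interval: str) -> List[int]:
    """Count events per bucket; events outside [start, end) are ignored."""
    step = INTERVALS[interval]
    counts = [0] * len(interval_buckets(start, end, interval))
    for t in event_times:
        if start <= t < end:
            # Dividing two timedeltas with // yields the integer bucket index.
            counts[(t - start) // step] += 1
    return counts
```

Treating the range as half-open (start included, end excluded) keeps adjacent requests from double-counting an event that falls exactly on a boundary.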

@schnuerle schnuerle mentioned this issue Jul 16, 2020
@schnuerle schnuerle linked a pull request Aug 13, 2020 that will close this issue
@schnuerle schnuerle added this to the 1.1.0 milestone Aug 27, 2020
@schnuerle schnuerle added Agency Specific to the Agency API Provider Specific to the Provider API labels Aug 27, 2020
@johnclary
Contributor

@whereissean would you mind making your slide deck publicly viewable?

@schnuerle
Member

Some notes from our WG call yesterday:

  • Issue 486 was written prior to the MDC proposal. A more thorough analysis of the MDC proposal would be needed. Jascha and Sean will reach out to MDC. Try to cover existing metrics out there in specs where relevant: SharedStreets, SAE/MDC, NUMO
  • Need to determine whether the calculation methodologies themselves are part of the spec.
  • Metrics are useful if 1) we create a flexible framework and 2) define the metrics
  • Could have Core Metrics defined in the spec that are more universal, easy and agreed upon, and Supplemental Metrics in a doc/guide external to the spec that are more optional or variable.

See also new issue #569 from folks at Spin to cross reference use cases.

@schnuerle schnuerle added the Metrics Related to the Metrics API and related topics label Aug 28, 2020
@whereissean
Contributor Author

@whereissean would you mind making your slide deck publicly viewable? @johnclary

Sorry, appears that permissions were changed on the document. Until I can resolve, here is a new public version:
https://docs.google.com/presentation/d/1rVwGSYb4d8myGSN9VJrDl1AOGtmdbqvAbXL8-a5VA-o/edit?usp=sharing

@johnclary
Contributor

thanks @whereissean. @schnuerle would you mind adding the Privacy label to this?

@johnclary
Contributor

johnclary commented Aug 28, 2020

Re: this bit from the doc:

For the fields that involve special_users, we propose an x number of subcategories like low_income, student or unbanked

😵 this is the first time I'm seeing mention of this in MDS. is this information that is held by providers?

update: will move this discussion to #569

@johnclary
Contributor

johnclary commented Aug 28, 2020

It looks like this spec might support operational use cases in a way that would avoid the need for agencies and providers to exchange telemetry data. I.e., it might be a drop-in replacement for /status_changes or /trips.

For example, as an agency, I'd like to query for the number of vehicles in service in x geography during the last hour.

Are there limitations that would prohibit such a use case as the spec is currently proposed? The use case above requires fairly high temporal and spatial resolution, and minimal latency.
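To make the use case concrete, here is a rough sketch of one possible reading of it: a snapshot count of devices whose most recent status event, as of a given time, is an in-service state located inside a bounding box. The event fields (`device_id`, `timestamp`, `state`, `lon`, `lat`), the set of in-service states, and the use of a bounding box in place of a real geography are all assumptions for illustration.

```python
from typing import Dict, Iterable, Tuple

# Assumption: which states count as "in service" is a methodology choice.
IN_SERVICE_STATES = {"available", "reserved", "on_trip"}


def vehicles_in_service(events: Iterable[dict],
                        bbox: Tuple[float, float, float, float],
                        as_of_ms: int) -> int:
    """Count devices whose latest event at or before as_of_ms is in service inside bbox."""
    min_lon, min_lat, max_lon, max_lat = bbox
    latest: Dict[str, dict] = {}
    for event in events:
        if event["timestamp"] > as_of_ms:
            continue  # ignore anything after the snapshot time
        previous = latest.get(event["device_id"])
        if previous is None or event["timestamp"] >= previous["timestamp"]:
            latest[event["device_id"]] = event
    return sum(
        1
        for event in latest.values()
        if event["state"] in IN_SERVICE_STATES
        and min_lon <= event["lon"] <= max_lon
        and min_lat <= event["lat"] <= max_lat
    )
```

Other readings are just as plausible, for instance counting every device that was in service at any point during the hour, which is part of why the methodology would need to be pinned down before such a query could replace the raw feeds.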

@schnuerle schnuerle added the privacy Implications around privacy for the attention of the OMF Privacy Committee label Aug 28, 2020
@whereissean
Contributor Author

whereissean commented Sep 14, 2020

I've reviewed the MDC Glossary and the good news is that its methodology looks consistent with the proposed MDS metrics. There were a couple of MDC metrics that were not in the proposal, and a number of proposed metrics that are not in the MDC Glossary. I've added the ones (maximum/minimum average) that were not in the MDS dockless metrics. I also renamed a number of the proposed metrics to try to align.

I also attached a proposed metrics methodology document that discusses how to compute the metrics and compatibility with MDC Glossary definitions. Thanks @joanathan for putting this document together.

@joshuaandrewjohnson1 @jfh01 @schnuerle Please have a look in #487.

@schnuerle
Member

schnuerle commented Oct 2, 2020

We reviewed this issue as part of the second OMF Working Group Steering Committee release Checkpoint. Both WGSCs had some feedback and I'm documenting it here for discussion.

  1. Is this statement true to what you are proposing?
    The entire proposed Metrics API is meant to be published by cities to providers, after cities have ingested MDS data from providers. So the city is doing the data processing.

If so, how much value is this to cities, and will they be able to justify the heavy lift implementing an API for this? Why not just pull CSV reports from a city database and share those with providers like they do now? Does an API provide enough benefit?

If not, can you clarify how a city can use it and how a provider can use it, both in the issue description and the PR details?

  2. One use case mentioned in the original description is that a city could make this endpoint public. Making the endpoint itself public does not seem like a good idea; instead, data derived from the API could be published by the city.

  3. Maybe just creating a defined methodology that cities (and providers) can use to calculate reports from MDS is enough, vs creating an endpoint?

These questions could be explored with a city survey to gauge interest if needed.

@schnuerle schnuerle pinned this issue Oct 2, 2020
@johnclary
Contributor

johnclary commented Oct 2, 2020

Ah, I completely missed that this was being proposed as a city endpoint. As such, it cannot serve as an alternative to consuming raw trip data, and in fact this proposal necessitates adding more attributes to trip records.

That answers my own question.

@marie-x
Collaborator

marie-x commented Oct 2, 2020

@johnclary @schnuerle @jfh01 I think there's a misunderstanding here.

The Metrics API is not just for Agencies; it could be implemented by Providers. And the consumers of an Agency implementation of metrics are not necessarily (only) Providers, in fact the main use cases are for city-internal consumption by analytics and visualization tools.

Yes, this could be an alternative to consuming raw trip data, although in the absence of such data, it makes the metrics essentially impossible to verify.

@dirkdk
Contributor

dirkdk commented Oct 9, 2020

@johnclary @schnuerle @jfh01 I think there's a misunderstanding here.

The Metrics API is not just for Agencies; it could be implemented by Providers. And the consumers of an Agency implementation of metrics are not necessarily (only) Providers, in fact the main use cases are for city-internal consumption by analytics and visualization tools.

Yes, this could be an alternative to consuming raw trip data, although in the absence of such data, it makes the metrics essentially impossible to verify.

That is how I saw the Metrics API as well. It is a standard that can be implemented by an Agency, a Provider, or even a third-party data aggregator, using input data either from other MDS endpoints or from different sources (like Special Groups data that would only be available to the Provider).

@schnuerle
Member

Note that for 1.1.0 we have merged with #582 the new Geography API to the 'dev' branch. Please update this pull request with the latest code, resolve any conflicts, and make references to the Geography API where appropriate, e.g. with UUIDs.

We will be discussing Metrics at this week's Working Group meeting, so if available please come prepared to talk about your latest updates and ideas.

@schnuerle
Member

The contents of the two Metrics pull requests, #486 and #487, have been merged to the new [feature-metrics](https://github.com/openmobilityfoundation/mobility-data-specification/tree/feature-metrics/metrics) feature branch for everyone to review in context with MDS and the new Geography API, and to make PRs against.

We will leave this issue open until that branch is ready to be merged to dev so please continue to leave feedback/ideas here, or on the new feature branch PR #587.

@schnuerle schnuerle linked a pull request Oct 16, 2020 that will close this issue
@schnuerle schnuerle unpinned this issue Jan 27, 2021