Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat 267: route.long_name should not contain route.short_name #1325

Conversation

briandonahue
Copy link
Collaborator

@briandonahue briandonahue commented Feb 2, 2023

Summary:

Closes #267. Based on implementation suggestion by @aababilov here

check that long name does not start from the short name followed by ' ', '-' or '('.

Expected behavior:

Should generate WARNING level notice that route.long_name should not contain route.short_name as referenced in Best Practices: https://gtfs.org/schedule/best-practices/#routestxt

Please make sure these boxes are checked before submitting your pull request - thanks!

  • Run the unit tests with gradle test to make sure you didn't break anything
  • Format the title like "feat: [new feature short description]". Title must follow the Conventional Commit Specification(https://www.conventionalcommits.org/en/v1.0.0/).
  • Linked all relevant issues
  • Include screenshot(s) showing how this pull request works and fixes the issue(s) (N/A)

@isabelle-dr isabelle-dr linked an issue Feb 10, 2023 that may be closed by this pull request
Copy link
Contributor

@isabelle-dr isabelle-dr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this @briandonahue !
Your new validator notice replaces https://github.com/MobilityData/gtfs-validator/blob/master/RULES.md#route_short_and_long_name_equal. Can you remove it in this PR, so that we can close #267?

We also need the update in RULES.md :)

@briandonahue briandonahue marked this pull request as draft February 10, 2023 20:27
@briandonahue
Copy link
Collaborator Author

@isabelle-dr changing this to a draft to avoid accidental merging - realizing we could use a few more test cases and document good/bad data examples

@briandonahue
Copy link
Collaborator Author

@bdferris-v2 Any thoughts on using parameterized tests where they make sense? For this where there are many ways the short_name can be contained in the long_name, it seems like a perfect candidate. Plus some way off day in the future you might be able to use tests params as sample good/bad data in the generated documentation :)

I see this library which looks useful, but wondering if you had any other recommendations or concerns with doing this.

@briandonahue
Copy link
Collaborator Author

briandonahue commented Feb 10, 2023

@isabelle-dr @bdferris-v2 I've made some significant changes. Documenting the rule made me realize it was missing some use cases and that it may not be perfectly clear and/or covering all potential issues.

So far, I've updated to account for short/long are equal (case insensitive), long cannot start with short and be followed by a space, or disallowed character (, -. I added ) and am wondering if : or other characters should be added?

Here are my current test cases for bad data:

           // short, long
            {"L1", "L1 Long Name"},
            {"L1", "L1-Long Name"},
            {"L1", "L1- Long Name"},
            {"L1", "L1 - Long Name"},
            {"L1", "L1)Long Name"},
            {"L1", "L1) Long Name"},
            {"L1", "L1 ) Long Name"},
            {"L1", "L1(Long Name)"},
            {"L1", "L1 (Long Name)"},
            {"L1", "L1( Long Name)"},

Please let me know if any should be removed or added.

RULES.md Outdated Show resolved Hide resolved
@briandonahue briandonahue marked this pull request as ready for review February 10, 2023 22:55
Copy link

@julianharty julianharty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know enough of this codebase to approve this PR. The code looks clean, I've added some comments/suggestions. Let me know if I can help further.

RULES.md Outdated Show resolved Hide resolved
RULES.md Outdated Show resolved Hide resolved
RULES.md Outdated Show resolved Hide resolved
RULES.md Outdated Show resolved Hide resolved
RULES.md Outdated Show resolved Hide resolved
Copy link
Contributor

@isabelle-dr isabelle-dr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments in-line, thank you so much for working on this @briandonahue!
A great addition to this validator :)

RULES.md Outdated Show resolved Hide resolved
RULES.md Outdated Show resolved Hide resolved
RULES.md Outdated Show resolved Hide resolved
RULES.md Outdated Show resolved Hide resolved
RULES.md Outdated Show resolved Hide resolved
RULES.md Outdated Show resolved Hide resolved
RULES.md Outdated Show resolved Hide resolved
@briandonahue briandonahue changed the title feat: route.long_name should not contain route.short_name feat 267: route.long_name should not contain route.short_name Feb 23, 2023
RULES.md Outdated Show resolved Hide resolved
Copy link
Collaborator

@bdferris-v2 bdferris-v2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modulo the small changes requested, this LGTM.

@briandonahue
Copy link
Collaborator Author

@bdferris-v2 Made those fixes, thanks.

RULES.md Outdated Show resolved Hide resolved
@isabelle-dr
Copy link
Contributor

One last small change before we merge :)

@bdferris-v2
Copy link
Collaborator

@isabelle-dr because the acceptance test workflow only has permission to post the comment with the report summary if the PR is coming from a branch of the gtfs-validator repo itself, as opposed to a fork of the repo, as is the case with this PR (and many coming from external contributors). This is discussed in more detail here. Unfortunately, I've not seen a good way around this.

@davidgamez
Copy link
Member

@bdferris-v2 Any thoughts on using parameterized tests where they make sense? For this where there are many ways the short_name can be contained in the long_name, it seems like a perfect candidate. Plus some way off day in the future you might be able to use tests params as sample good/bad data in the generated documentation :)

I see this library which looks useful, but wondering if you had any other recommendations or concerns with doing this.

That's a good call. Definitely the bad list test case is a great example for parameterizing a test. We can have this "feature" for free if we decide to upgrade the JUnit version. JUnit 5 Parameterized Tests
Let's keep the topic alive and have the conversion out of scope of this PR.

Copy link
Member

@davidgamez davidgamez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@briandonahue
Copy link
Collaborator Author

@isabelle-dr because the acceptance test workflow only has permission to post the comment with the report summary if the PR is coming from a branch of the gtfs-validator repo itself, as opposed to a fork of the repo, as is the case with this PR (and many coming from external contributors). This is discussed in more detail here. Unfortunately, I've not seen a good way around this.

@isabelle-dr @bdferris-v2 This was created before I had access to the main repo. Should I close this PR and open a new one from a local branch?

@bdferris-v2
Copy link
Collaborator

Two things. The acceptance test still ran and you can access the logs in the Checks tab to see the result:

✅ Rule acceptance tests passed.
New Errors: 0 out of 1419 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
Dropped Errors: 0 out of 1419 datasets (~0%) are invalid due to code change, which is less than the provided threshold of 1%.
0 out of 1419 sources (~0 %) are corrupted.
Commit: 4a956ff
Download the full acceptance test report here (report will disappear after 90 days).
✅ Rule acceptance tests passed.

That said, per #1349, the acceptance test only tracks changes to ERROR notices, not warnings. As this PR only changes a warning notice, it wouldn't show up as a change in the acceptance test. That's something I'd like to fix.

@isabelle-dr
Copy link
Contributor

@briandonahue good to merge?

@briandonahue
Copy link
Collaborator Author

@briandonahue good to merge?

@isabelle-dr Yes! 😄

@fredericsimard fredericsimard merged commit aeaecb7 into MobilityData:master Mar 8, 2023
@briandonahue briandonahue deleted the issue/267/route_long_name_contains_short_name branch March 8, 2023 16:34
ryon pushed a commit to JarvusInnovations/gtfs-validator that referenced this pull request Apr 1, 2023
…tyData#1325)

* fix: 267 - route.long_name should not contain route.short_name

* formatting

* Remove unnecessary comment

* Remove unescaped regex, do startsWith check then regex on remainder

* update RULES.md

* update RULES.md examples

* Add more test examples and remove redundant warning

* remove redundant test data

* fix typo

* Remove redundant rule

* 'should not', not 'can not'

* simplify test data

* Missed a spot to remove redundant rule

* Update RULES.md

Better examples!

Co-authored-by: isabelle-dr <[email protected]>

* Missed a spot to remove redundant rule

* Make requested updates to rules docs

* Address PR feedback

* Correct field name

---------

Co-authored-by: Kevin Clough <[email protected]>
Co-authored-by: isabelle-dr <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Smarter check for if route_long_name contains route_short_name
7 participants