Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: add docstrings and examples for String functions #4166

Merged

Conversation

austinletson
Copy link
Contributor

@austinletson austinletson commented May 14, 2024

Add docstrings, usage examples, and doc tests for String.prev, .front, .back, .atEnd.

Improve docstring examples for String.next based on discussion examples for String.prev.

@github-actions github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label May 14, 2024
@leanprover-community-mathlib4-bot
Copy link
Collaborator

leanprover-community-mathlib4-bot commented May 14, 2024

Mathlib CI status (docs):

  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 367b97885a484a2e17e4cdade8dcd7fe05dbfb4c --onto 91244b2dd9d223006227648659203373f5a46b0b. (2024-05-14 22:15:31)
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 367b97885a484a2e17e4cdade8dcd7fe05dbfb4c --onto 3035d2f8f689b52963f49b2414414913ca296953. (2024-05-16 02:38:19)
  • ❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 367b97885a484a2e17e4cdade8dcd7fe05dbfb4c --onto 3a457e6ad60272f03c3555e443379dbde10507ea. (2024-05-18 13:09:42)

@semorrison semorrison added the will-merge-soon …unless someone speaks up label May 16, 2024
Comment on lines 203 to 206
Examples:
* `"abc".prev ⟨2⟩ = String.Pos.mk 1`
* `"abc".prev ⟨0⟩ = String.Pos.mk 0`
* `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These examples all worry me, because they seem to suggest that users should be counting bytes and creating a String.Pos from scratch. Additionally, the use of two separate notations for String.Pos.mk risks throwing off some users, and I think the String API is more likely to attract new users than many other areas. Finally, it shows off an unspecified result, which I think is better left unwritten.

I also think that byte 4 is valid, based on:

#eval let i := ⟨0⟩; let str := "L∃∀N"; str.next i

yielding 1 and

#eval let i := ⟨0⟩; let str := "L∃∀N"; str.next <| str.next i

yielding 4.

What about:

Suggested change
Examples:
* `"abc".prev ⟨2⟩ = String.Pos.mk 1`
* `"abc".prev ⟨0⟩ = String.Pos.mk 0`
* `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character
Examples:
* `"abc".prev ⟨2⟩ = 1`
* `"abc".prev ⟨0⟩ = 0`
* `"L∃∀N".prev ⟨3⟩` is unspecified, since `'∃'` is a multi-byte UTF-8 character and byte 3 is in the middle of it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that byte 4 is valid, based on:

I intended for "L∃∀N".prev ⟨4⟩ = String.Pos.mk 1 to illustrate a valid result in the case where the string contains multi-byte characters. I didn't include an unspecified example here (unlike String.next) but can definitely add one.

More generally, I think your concerns regarding byte counts from scratch is spot on. What about the following modification modeled after your suggestion for String.atEnd to encourage more natural usage:

Suggested change
Examples:
* `"abc".prev ⟨2⟩ = String.Pos.mk 1`
* `"abc".prev ⟨0⟩ = String.Pos.mk 0`
* `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character
Examples:
Given `def abc := "abc"` and `def lean := "L∃∀N"`,
* `abc.endPos |> abc.prev = ⟨2⟩`
* `lean.endPos |> lean.prev |> lean.prev |> lean.prev = ⟨1⟩`
* `"L∃∀N".prev ⟨3⟩` is unspecified, since `'∃'` is a multi-byte UTF-8 character and byte 3 is in the middle of it

The other advantage here is demonstrating that if you call String.prev with a constructed String.pos, instead of with an iterative pattern, you run the risk of unspecified results.

Here is an alternative version with String.get which is more explicit at the cost of complicating the example code:

Suggested change
Examples:
* `"abc".prev ⟨2⟩ = String.Pos.mk 1`
* `"abc".prev ⟨0⟩ = String.Pos.mk 0`
* `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character
Examples:
Given `def abc := "abc"` and `def lean := "L∃∀N"`,
* `abc.get <| abc.endPos |> abc.prev = 'c'`
* `lean.get <| lean.endPos |> lean.prev |> lean.prev |> lean.prev = '∃'`
* `"L∃∀N".prev ⟨3⟩` is unspecified, since `'∃'` is a multi-byte UTF-8 character and byte 3 is in the middle of it

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it, except for the mixing of <| and |> - the precedence of them is not necessarily obvious. What about parens around the chains of |> instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I changed these examples to use parens instead of <|.

Additionally, I converted the examples for the similar String.next which was documented a few weeks ago in #4001 to this iterative style instead of constructing positions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks!

@austinletson
Copy link
Contributor Author

@semorrison Thank you for adding the tests! After all the threads are resolved, I will go back and update the test to reflect the final docstrings.

@austinletson
Copy link
Contributor Author

I updated the tests based on the updated examples.

Let me know if there is any additional areas for improvement. If not, this should be ready to go!

Copy link
Contributor

@david-christiansen david-christiansen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good now!

Sorry for the long turnaround, this was a holiday weekend in Denmark.

@david-christiansen david-christiansen added this pull request to the merge queue May 21, 2024
Merged via the queue into leanprover:master with commit 2faa81d May 21, 2024
12 checks passed
@austinletson
Copy link
Contributor Author

Thanks, David! I hope you had a nice holiday!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN will-merge-soon …unless someone speaks up
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants