doc: add docstrings and examples for String functions #4166
src/Init/Data/String/Basic.lean
Outdated
Examples:
* `"abc".prev ⟨2⟩ = String.Pos.mk 1`
* `"abc".prev ⟨0⟩ = String.Pos.mk 0`
* `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character
These examples all worry me, because they seem to suggest that users should be counting bytes and creating a `String.Pos` from scratch. Additionally, the use of two separate notations for `String.Pos.mk` risks throwing off some users, and I think the `String` API is more likely to attract new users than many other areas. Finally, it shows off an unspecified result, which I think is better left unwritten.
I also think that byte 4 is valid, based on:
#eval let i := ⟨0⟩; let str := "L∃∀N"; str.next i
yielding 1 and
#eval let i := ⟨0⟩; let str := "L∃∀N"; str.next <| str.next i
yielding 4.
What about:
- Examples:
- * `"abc".prev ⟨2⟩ = String.Pos.mk 1`
- * `"abc".prev ⟨0⟩ = String.Pos.mk 0`
- * `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character
+ Examples:
+ * `"abc".prev ⟨2⟩ = ⟨1⟩`
+ * `"abc".prev ⟨0⟩ = ⟨0⟩`
+ * `"L∃∀N".prev ⟨3⟩` is unspecified, since `'∃'` is a multi-byte UTF-8 character and byte 3 is in the middle of it
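For readers who want to check the byte arithmetic behind this comment, here is a minimal sketch. It assumes a Lean 4 toolchain from around the time of this PR, where `String.Pos` is a structure wrapping a byte index with a `byteIdx` field; later toolchains may name things differently. Positions are constructed by hand here only to inspect offsets, which is exactly the style the comment argues the docstrings should not encourage.

```lean
-- Assumption: `String.Pos` wraps a UTF-8 byte offset (field `byteIdx`).
#eval "L".utf8ByteSize      -- 1
#eval "∃".utf8ByteSize      -- 3 (so '∃' occupies bytes 1, 2 and 3 of "L∃∀N")
#eval "L∃∀N".utf8ByteSize   -- 8

-- Stepping forward from the start lands on character boundaries only.
#eval ("L∃∀N".next ⟨0⟩).byteIdx                  -- 1
#eval ("L∃∀N".next ("L∃∀N".next ⟨0⟩)).byteIdx    -- 4
```

Byte 4 is therefore the start of `'∀'`, while byte 3 falls inside `'∃'`.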
> I also think that byte 4 is valid, based on:

I intended for `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1` to illustrate a valid result in the case where the string contains multi-byte characters. I didn't include an unspecified example here (unlike `String.next`) but can definitely add one.
More generally, I think your concerns regarding byte counts from scratch are spot on. What about the following modification, modeled after your suggestion for `String.atEnd`, to encourage more natural usage:
- Examples:
- * `"abc".prev ⟨2⟩ = String.Pos.mk 1`
- * `"abc".prev ⟨0⟩ = String.Pos.mk 0`
- * `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character
+ Examples:
+ Given `def abc := "abc"` and `def lean := "L∃∀N"`,
+ * `abc.endPos |> abc.prev = ⟨2⟩`
+ * `lean.endPos |> lean.prev |> lean.prev |> lean.prev = ⟨1⟩`
+ * `"L∃∀N".prev ⟨3⟩` is unspecified, since `'∃'` is a multi-byte UTF-8 character and byte 3 is in the middle of it
The other advantage here is demonstrating that if you call `String.prev` with a constructed `String.Pos`, instead of with an iterative pattern, you run the risk of unspecified results.
Here is an alternative version with `String.get`, which is more explicit at the cost of complicating the example code:
- Examples:
- * `"abc".prev ⟨2⟩ = String.Pos.mk 1`
- * `"abc".prev ⟨0⟩ = String.Pos.mk 0`
- * `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character
+ Examples:
+ Given `def abc := "abc"` and `def lean := "L∃∀N"`,
+ * `abc.get <| abc.endPos |> abc.prev = 'c'`
+ * `lean.get <| lean.endPos |> lean.prev |> lean.prev |> lean.prev = '∃'`
+ * `"L∃∀N".prev ⟨3⟩` is unspecified, since `'∃'` is a multi-byte UTF-8 character and byte 3 is in the middle of it
What do you think?
I like it, except for the mixing of `<|` and `|>` - the precedence of them is not necessarily obvious. What about parens around the chains of `|>` instead?
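One possible reading of this suggestion, written as a runnable sketch rather than as the final docstring text: the `|>` chain is wrapped in parentheses and passed directly to `String.get`, so the reader never has to work out how `<|` and `|>` interact. The definitions `abc` and `lean` come from the proposed examples; the toolchain assumption (a Lean 4 version from around the time of this PR) is mine.

```lean
def abc := "abc"
def lean := "L∃∀N"

-- Walk back from the end of the string, then read the character there.
#eval abc.get (abc.endPos |> abc.prev)                             -- 'c'
#eval lean.get (lean.endPos |> lean.prev |> lean.prev |> lean.prev) -- '∃'
```

Each position is derived from `endPos` by calling `prev`, so every intermediate position sits on a character boundary.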
👍 I changed these examples to use parens instead of `<|`.
Additionally, I converted the examples for the similar `String.next`, which was documented a few weeks ago in #4001, to this iterative style instead of constructing positions.
This looks great, thanks!
Co-authored-by: Kim Morrison <[email protected]>
@semorrison Thank you for adding the tests! After all the threads are resolved, I will go back and update the tests to reflect the final docstrings.
I updated the tests based on the updated examples. Let me know if there are any additional areas for improvement. If not, this should be ready to go!
This looks really good now!
Sorry for the long turnaround, this was a holiday weekend in Denmark.
Thanks, David! I hope you had a nice holiday!
Add docstrings, usage examples, and doc tests for `String.prev`, `.front`, `.back`, `.atEnd`. Improve docstring examples for `String.next` based on discussion examples for `String.prev`.
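As a rough illustration of the other functions named in this description, the following sketch evaluates `String.front`, `String.back`, and `String.atEnd` on a small string. It is not the PR's actual doc tests, which are not shown in this thread, and it assumes a Lean 4 toolchain from around the time of this PR.

```lean
-- Illustrative checks only; the PR's own doc tests may be written differently.
#eval "abc".front                    -- 'a'
#eval "abc".back                     -- 'c'
#eval "abc".atEnd ("abc".endPos)     -- true
#eval "abc".atEnd ⟨0⟩                -- false
```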