doc: add docstrings and examples for String functions #4166

austinletson · 2024-05-14T21:57:40Z

Add docstrings, usage examples, and doc tests for String.prev, .front, .back, .atEnd.

Improve docstring examples for String.next based on discussion examples for String.prev.

leanprover-community-mathlib4-bot · 2024-05-14T22:15:31Z

Mathlib CI status (docs):

❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 367b97885a484a2e17e4cdade8dcd7fe05dbfb4c --onto 91244b2dd9d223006227648659203373f5a46b0b. (2024-05-14 22:15:31)
❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 367b97885a484a2e17e4cdade8dcd7fe05dbfb4c --onto 3035d2f8f689b52963f49b2414414913ca296953. (2024-05-16 02:38:19)
❗ Batteries/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 367b97885a484a2e17e4cdade8dcd7fe05dbfb4c --onto 3a457e6ad60272f03c3555e443379dbde10507ea. (2024-05-18 13:09:42)

src/Init/Data/String/Basic.lean

david-christiansen · 2024-05-16T10:34:39Z

src/Init/Data/String/Basic.lean

+Examples:
+* `"abc".prev ⟨2⟩ = String.Pos.mk 1`
+* `"abc".prev ⟨0⟩ = String.Pos.mk 0`
+* `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character


These examples all worry me, because they seem to suggest that users should be counting bytes and creating a String.Pos from scratch. Additionally, the use of two separate notations for String.Pos.mk risks throwing off some users, and I think the String API is more likely to attract new users than many other areas. Finally, it shows off an unspecified result, which I think is better left unwritten.

I also think that byte 4 is valid, based on:

#eval let i := ⟨0⟩; let str := "L∃∀N"; str.next i

yielding 1 and

#eval let i := ⟨0⟩; let str := "L∃∀N"; str.next <| str.next i

yielding 4.

What about:

Suggested change

Examples:

* `"abc".prev ⟨2⟩ = String.Pos.mk 1`

* `"abc".prev ⟨0⟩ = String.Pos.mk 0`

* `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character

Examples:

* `"abc".prev ⟨2⟩ = ⟨1⟩`

* `"abc".prev ⟨0⟩ = ⟨0⟩`

* `"L∃∀N".prev ⟨3⟩` is unspecified, since `'∃'` is a multi-byte UTF-8 character and byte 3 is in the middle of it

I also think that byte 4 is valid, based on:

I intended for "L∃∀N".prev ⟨4⟩ = String.Pos.mk 1 to illustrate a valid result in the case where the string contains multi-byte characters. I didn't include an unspecified example here (unlike String.next) but can definitely add one.

More generally, I think your concerns regarding byte counts from scratch is spot on. What about the following modification modeled after your suggestion for String.atEnd to encourage more natural usage:

Suggested change

Examples:

* `"abc".prev ⟨2⟩ = String.Pos.mk 1`

* `"abc".prev ⟨0⟩ = String.Pos.mk 0`

* `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character

Examples:

Given `def abc := "abc"` and `def lean := "L∃∀N"`,

* `abc.endPos |> abc.prev = ⟨2⟩`

* `lean.endPos |> lean.prev |> lean.prev |> lean.prev = ⟨1⟩`

* `"L∃∀N".prev ⟨3⟩` is unspecified, since `'∃'` is a multi-byte UTF-8 character and byte 3 is in the middle of it

The other advantage here is demonstrating that if you call String.prev with a constructed String.pos, instead of with an iterative pattern, you run the risk of unspecified results.

Here is an alternative version with String.get which is more explicit at the cost of complicating the example code:

Suggested change

Examples:

* `"abc".prev ⟨2⟩ = String.Pos.mk 1`

* `"abc".prev ⟨0⟩ = String.Pos.mk 0`

* `"L∃∀N".prev ⟨4⟩ = String.Pos.mk 1`, since `'∃'` is a multi-byte UTF-8 character

Examples:

Given `def abc := "abc"` and `def lean := "L∃∀N"`,

* `abc.get <| abc.endPos |> abc.prev = 'c'`

* `lean.get <| lean.endPos |> lean.prev |> lean.prev |> lean.prev = '∃'`

* `"L∃∀N".prev ⟨3⟩` is unspecified, since `'∃'` is a multi-byte UTF-8 character and byte 3 is in the middle of it

What do you think?

I like it, except for the mixing of <| and |> - the precedence of them is not necessarily obvious. What about parens around the chains of |> instead?

👍 I changed these examples to use parens instead of <|.

Additionally, I converted the examples for the similar String.next which was documented a few weeks ago in #4001 to this iterative style instead of constructing positions.

This looks great, thanks!

src/Init/Data/String/Basic.lean

Co-authored-by: Kim Morrison <[email protected]>

austinletson · 2024-05-16T13:13:49Z

@semorrison Thank you for adding the tests! After all the threads are resolved, I will go back and update the test to reflect the final docstrings.

austinletson · 2024-05-18T13:03:17Z

I updated the tests based on the updated examples.

Let me know if there is any additional areas for improvement. If not, this should be ready to go!

david-christiansen

This looks really good now!

Sorry for the long turnaround, this was a holiday weekend in Denmark.

austinletson · 2024-05-21T11:13:52Z

Thanks, David! I hope you had a nice holiday!

doc: add docstrings and examples for String functions

7226b54

austinletson requested a review from semorrison as a code owner May 14, 2024 21:57

github-actions bot added the toolchain-available A toolchain is available for this PR, at leanprover/lean4-pr-releases:pr-release-NNNN label May 14, 2024

semorrison reviewed May 16, 2024

View reviewed changes

src/Init/Data/String/Basic.lean Outdated Show resolved Hide resolved

semorrison reviewed May 16, 2024

View reviewed changes

src/Init/Data/String/Basic.lean Outdated Show resolved Hide resolved

semorrison reviewed May 16, 2024

View reviewed changes

src/Init/Data/String/Basic.lean Outdated Show resolved Hide resolved

add examples as tests

3dd82fd

semorrison added the will-merge-soon …unless someone speaks up label May 16, 2024

david-christiansen reviewed May 16, 2024

View reviewed changes

src/Init/Data/String/Basic.lean Outdated Show resolved Hide resolved

austinletson and others added 3 commits May 16, 2024 09:02

remove extra white space

a641cf7

Co-authored-by: Kim Morrison <[email protected]>

Remove = 'A' from (default : Char) examples

c1a0b9e

Co-authored-by: Kim Morrison <[email protected]>

doc: update String.atEnd with more iterative suggestion

a9aaa84

doc: switch to iterative examples for .next and .prev

300c722

semorrison requested a review from david-christiansen May 20, 2024 01:05

david-christiansen approved these changes May 21, 2024

View reviewed changes

david-christiansen added this pull request to the merge queue May 21, 2024

Merged via the queue into leanprover:master with commit 2faa81d May 21, 2024
12 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

doc: add docstrings and examples for String functions #4166

doc: add docstrings and examples for String functions #4166

austinletson commented May 14, 2024 •

edited

Loading

leanprover-community-mathlib4-bot commented May 14, 2024 •

edited

Loading

david-christiansen May 16, 2024

austinletson May 16, 2024

david-christiansen May 17, 2024

austinletson May 18, 2024

david-christiansen May 21, 2024

austinletson commented May 16, 2024

austinletson commented May 18, 2024

david-christiansen left a comment

austinletson commented May 21, 2024

doc: add docstrings and examples for String functions #4166

doc: add docstrings and examples for String functions #4166

Conversation

austinletson commented May 14, 2024 • edited Loading

leanprover-community-mathlib4-bot commented May 14, 2024 • edited Loading

david-christiansen May 16, 2024

Choose a reason for hiding this comment

austinletson May 16, 2024

Choose a reason for hiding this comment

david-christiansen May 17, 2024

Choose a reason for hiding this comment

austinletson May 18, 2024

Choose a reason for hiding this comment

david-christiansen May 21, 2024

Choose a reason for hiding this comment

austinletson commented May 16, 2024

austinletson commented May 18, 2024

david-christiansen left a comment

Choose a reason for hiding this comment

austinletson commented May 21, 2024

austinletson commented May 14, 2024 •

edited

Loading

leanprover-community-mathlib4-bot commented May 14, 2024 •

edited

Loading