Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Additional methods for Set type #497

Merged
merged 12 commits into from
Aug 29, 2023

Conversation

SamWheating
Copy link
Contributor

Re: #492

Added set.add(x), set.pop(), set.remove(x), set.discard(x), set.clear(), as well as updates to the Spec and tests.

Demo:

>>> a = set([1,2,3])
>>> a.add(4)
>>> a
set([1, 2, 3, 4])
>>> a.remove(1)
>>> a
set([2, 3, 4])
>>> a.remove(1)
Traceback (most recent call last):
  <stdin>:1:9: in <expr>
Error in remove: remove: missing key
>>> a.discard(2)
>>> a
set([3, 4])
>>> a.discard(2)
>>> a.pop()
3
>>> a
set([4])
>>> a.pop()
4
>>> a.pop()
Traceback (most recent call last):
  <stdin>:1:6: in <expr>
Error in pop: pop: empty set
>>> a.add(1)
>>> a.add(2)
>>> a
set([1, 2])
>>> a.clear()
>>> a
set([])

Copy link
Collaborator

@adonovan adonovan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Sam, many thanks for this contribution. I've made a number of comments but the necessary changes should all be fairly straightforward.

doc/spec.md Show resolved Hide resolved
doc/spec.md Outdated Show resolved Hide resolved
doc/spec.md Outdated Show resolved Hide resolved
doc/spec.md Outdated Show resolved Hide resolved
doc/spec.md Outdated

`S.remove(x)` removes `x` from the set and returns None.

`remove` fails if the set does not contain `x` or is frozen.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the mutating methods should ideally mention what happens when the set is frozen, especially given the subtlety that add may not always attempt to mutate the set (if the element is already present), so it does not always fail on a frozen dict. The spec should probably take a position on this, by stating that the lookup precedes the attempt at mutation (that's what dict.setdefault does, although the document is not particularly explicit).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, all of the frozen checks defer to the underlying hashtable so I wasn't thinking about this case. I'll verify and document this behaviour.

Related - how do you feel about calling clear() on an empty, frozen dict or set? Should that succeed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Landed on the following semantics around frozen sets:

  • set.clear() will succeed on an empty frozen set
  • set.discard(x) will succeed on a frozen set which does not contain x
  • set.add(x) will succeed on a frozen set which already contains x
  • pop() and remove() will always fail on frozen sets with cannot delete from frozen hash table.

Two edge cases to consider -

  • Should pop() on an empty frozen set fail because its empty, or because its frozen?
  • Should remove(x) on a frozen set not containing x fail because of missing key, or because its frozen?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should pop() on an empty frozen set fail because its empty, or because its frozen?
Should remove(x) on a frozen set not containing x fail because of missing key, or because its frozen?

Generally we don't specify which error you get when more than one applies. What's more important is that we specify whether successful shortcuts trump slow-path errors (as in the case of discarding an unhashable from an empty set: does it fail because unhashable or succeed because empty?). So what you have looks fine.

return nil, err
}
if found, err := b.Receiver().(*Set).Delete(k); err != nil {
return nil, nameErr(b, err) // dict is frozen or key is unhashable
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If key is unhashable then it can't be a member of the set. Shouldn't it simply return in that case? I wonder what Python does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - looks like Python 3.7.13 will throw a typeError when trying to remove (or discard) an unhashable type from a set.

>>> a = set([1,2,3])
>>> a.remove(1)
>>> a.remove([4,5,6])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

I think that this makes sense, especially in the case of remove where a successful return indicates that the item was found and removed. Thoughts?

starlark/library.go Outdated Show resolved Hide resolved
starlark/testdata/set.star Show resolved Hide resolved
@@ -116,3 +116,27 @@ assert.eq(iter(), [1, 2, 3])

# sets are not indexable
assert.fails(lambda : x[0], "unhandled.*operation")

# adding and removing
x.add(4)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add x = set() before each of these stanzas so that there is no residual effect across them.

Copy link
Contributor Author

@SamWheating SamWheating Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starlark doesn't permit this because of global variable reassignment, but I can just create a unique set for each block - any objections?

starlark/testdata/set.star Outdated Show resolved Hide resolved
@SamWheating
Copy link
Contributor Author

Thanks for the super thorough review - I think I have addressed all of your feedback now and I left a few discussions open where I need some additional confirmation.

Copy link
Collaborator

@adonovan adonovan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. More nitpicks, but we're nearly there. Thanks again.

doc/spec.md Outdated Show resolved Hide resolved
doc/spec.md Outdated Show resolved Hide resolved
doc/spec.md Outdated Show resolved Hide resolved
doc/spec.md Outdated Show resolved Hide resolved
starlark/library.go Outdated Show resolved Hide resolved
return nil, err
}
if found, err := b.Receiver().(*Set).Has(k); err != nil {
return nil, nameErr(b, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For discard, unhashable => not found, so I think this should not be an error. Can we add a test case?

Copy link
Contributor Author

@SamWheating SamWheating Aug 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had some thoughts about this here:
#497 (comment)

If we're aiming for consistency with the python implementation we should be throwing an error when discarding with an unhashable key. Thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems fine. Thanks for checking.

starlark/library.go Outdated Show resolved Hide resolved
@SamWheating
Copy link
Contributor Author

Thanks for the review, got everything addressed aside from a few open discussions.

While you're here - how would you feel about adding built-in set parsing and set comprehension? I'd be interested in working on the implementation but also think this might be too much divergence from the official spec.

@adonovan
Copy link
Collaborator

how would you feel about adding built-in set parsing and set comprehension? I'd be interested in working on the implementation but also think this might be too much divergence from the official spec.

I would rather not implement those since they change the dialect accepted by the parser. If you can persuade the Bazel folks to accept set as a standard feature, then these expressions would be appropriate, but I don't think they feel much need for it.

Thanks again for contributing this, and for your patience during the review.

return nil, err
}
if found, err := b.Receiver().(*Set).Has(k); err != nil {
return nil, nameErr(b, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems fine. Thanks for checking.

doc/spec.md Outdated

`S.remove(x)` removes `x` from the set and returns None.

`remove` fails if the set does not contain `x` or is frozen.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should pop() on an empty frozen set fail because its empty, or because its frozen?
Should remove(x) on a frozen set not containing x fail because of missing key, or because its frozen?

Generally we don't specify which error you get when more than one applies. What's more important is that we specify whether successful shortcuts trump slow-path errors (as in the case of discarding an unhashable from an empty set: does it fail because unhashable or succeed because empty?). So what you have looks fine.

@adonovan adonovan merged commit 68633c9 into google:master Aug 29, 2023
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants