-
Notifications
You must be signed in to change notification settings - Fork 223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added doc on simulator API. #565
base: main
Are you sure you want to change the base?
Conversation
Gen: func(*rand.Rand) int { ... }, | ||
Func: func(ctx context.Context, replies map[int]int, x int, a A) error { | ||
if y, err := a.A(ctx, x); err == nil { | ||
replies[x] = y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this option because it's difficult to explain what instance of replies
will be passed to the function. In reality, for each "workload", you'll be creating a state, and running a series of ops on it. But this isn't as encapsulated as it is in Option 3
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree.
return nil | ||
} | ||
|
||
func (*Workload) GenCallA(r *rand.Rand) int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed possibly defining a generator as a different type, like we did for routing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replied to your other comment with an example of this proposal.
docs/randomized_testing.md
Outdated
|
||
### Decision | ||
|
||
NOTE(mwhittaker): I don't love any of these options. I have a slight preference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote for Option 3
or a tweaked version of it, because it encapsulates a "run" or a "workload" really well, along with its state etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are a few cosmetic tweaks to Option 3 that might make it more appealing. First, we can move the generator to a separate struct, as you suggested. Second, we can change the Init
method to return all fakes using the existing weavertest.Fake
function. Note that I'm showing the simulator in a sim
package, but it will probably end up in the weavertest
package as well.
type Workload struct {
a weaver.Ref[A]
b weaver.Ref[B]
replies map[int]int
}
func (w *Workload) Init(context.Context) ([]weavertest.FakeComponent, error) {
w.replies = map[int]int{}
return []weavertest.FakeComponent{weavertest.Fake[B](&fakeb{})}, nil
}
func (w *Workload) CallA(ctx context.Context, x int) error {
if y, err := w.a.Get().A(ctx, x); err == nil {
w.replies[x] = y
}
return nil
}
func (w *Workload) CallB(ctx context.Context) error {
w.b.Get().B(ctx)
return nil
}
type Generator struct {}
func (Generator) CallA(r *rand.Rand) int {
...
}
func TestWorkload(t *testing.T) {
s, err := sim.New[Workload, Generator](sim.Options{...})
if err != nil {
t.Fatal(err)
}
results, err := s.Simulate(10 * time.Second)
if err != nil {
t.Fatal(err)
}
}
88a81be
to
0a1412d
Compare
@ghemawat @spetrovic77 Added the options we discussed today to the doc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clear writeup of the options and the pros and cons. Very helpful.
|
||
**Cons.** | ||
Magic struct tags. Also, it is awkward to fake a component to which we don't | ||
need a direct reference. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we change to something like:
type Workload struct {
a weaver.Ref[A]
b weavertest.Fake[B,fakeb]
replies map[int]int
}
Would Fake[I,F] be something applicable to weavertest as well, which might make it more palatable.
E.g., chain_test.go's fake could instead by supplied as follows:
runner.Test(t, func(t *testing.T, a weavertest.Fake[A, fake]) {
f := a.Fake() // type *fake
i := a.Component() // type A
...
})
We can then get rid of weavertest.Fake and weavertest.FakeComponent. We should try writing some user documentation for this and see how it looks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this option to the doc. I can work on writing user docs.
- The `Generator[T]` interface is clunky. A value of type `T` generates a value | ||
of type `T`, which is weird. It means that op arguments are doubling as | ||
generators and generated values. Plus, the user sometimes has to typecast | ||
these arguments. We saw that in the `CallA` method above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of type-casting, we should perhaps have a method. That will provide two benefits:
- The representation of Generator[T] can hold more state than T.
- It will reduce the chance of typos like the user saying byte(x) instead of int(x).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that typecasting is both un-ergonomic and error-prone. I described one approach to a Get
method in Option 8 which separates the Generate
method and Get
method across two different interfaces, but we can maybe merge them into a single interface.
- If a workload has many ops that don't re-use types much, this approach is more | ||
onerous that defining generator methods. | ||
- The `Generator[T]` interface is clunky. A value of type `T` generates a value | ||
of type `T`, which is weird. It means that op arguments are doubling as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we renamed the Generate() method to Init() and have it modify the receiver. E.g.,
func (i *myInt) Init(r *rand.Rand) {
*i = myInt(r.Intn(100)) }
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still working on addressing the other comments, but I've been thinking more about how to specify generators. The approach I like the most is a Frankenstein combination of the existing proposals. First, we introduce a handful of interfaces:
type Generator[T any] interface {
Generate(r *rand.Rand) T
}
type Shrinker[T any] interface {
Shrink(T) []T
}
type Formatter[T any] interface {
Format(T) string
}
Generator[T]
generates values of typeT
.Shrinker[T]
shrinks values, which is used for minimization.Formatter[T]
pretty prints values, which is used for visualization.
I think these interfaces are as simple as possible, without any subtleties or oddities of some of the other proposals.
Next, workload methods have plain argument types:
type Workload struct { ... }
func (w *Workload) Foo(ctx context.Context, x int, y string) error { ... }
func (w *Workload) Bar(ctx context.Context, z bool) error { ... }
There is no casting, no Get
method, and no confusion about a type doubling as its generator. The user then has to define a separate struct with one method for every op in their workload struct. Methods return the Generator
s used to generate arguments to the ops.
type WorkloadGenerator struct {}
func (WorkloadGenerator) Foo() (Generator[int], Generator[string]) { ... }
func (WorkloadGenerator) Bar() (Generator[bool]) { ... }
To make implementing these methods easier, we provide a family of general purpose Generator
s. For example, sim.Int
, sim.String
, sim.ReadableString
, and so on. We can also provide basic functions to build Generator
s. For example, Pick[T any](xs ...T) Generator[T]
returns a Generator
that randomly returns the provided elements. Here's how we might implement WorkloadGenerator
:
type WorkloadGenerator struct {}
func (WorkloadGenerator) Foo() (Generator[int], Generator[string]) {
return sim.NegativeInt, sim.Pick("a", "b", "c")
}
func (WorkloadGenerator) Bar() (Generator[bool]) {
return sim.BiasedFlip(0.75)
}
Of course, users are free to implement their own custom instances of Generator[T]
as well. Moreover, a returned Generator[T]
may optionally implement the Shrinker[T]
or Formatter[T]
interfaces. The general purpose Generator
s we provide all implement the Shrinker
and Formatter
interfaces automatically.
If a user finds themselves re-using the same Generator
across a bunch of methods, they can use helper functions to avoid some of the repetition. Imagine a key-value store based workload, for example:
func key() Generator[string] { return sim.Pick("a", "b", "c") }
func value() Generator[string] { return sim.ReadableString }
type StoreGenerator struct {}
func (StoreGenerator) Get() Generator[string] { return key() }
func (StoreGenerator) Set() (Generator[string], Generator[string]) { return key(), value() }
func (StoreGenerator) Swap() (Generator[string], Generator[string]) { return key(), key() }
func (StoreGenerator) Delete() Generator[string] { return key() }
The main downside of this approach is that it requires by far the most typing out of any approach. The more I thought about this though, the less it bothered me. I started to think of it similar to how Go implements errors and how code is littered with if err != nil { ... }
code. It's more typing, but there's not really any subtleties or gotchas. I started to feel that this approach is the easiest to understand, but the most annoying to type out.
Finally, a small but related point. We've discussed the idea of generators having state. I started to think this is a bad idea and muddies the otherwise simple abstraction of a generator. I think all state should be in the workload, and the generator is left as a stateless and relatively independent entity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatter
can we use the normal "func (...) String() string" API instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Sanjay!
- If a workload has many ops that don't re-use types much, this approach is more | ||
onerous that defining generator methods. | ||
- The `Generator[T]` interface is clunky. A value of type `T` generates a value | ||
of type `T`, which is weird. It means that op arguments are doubling as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still working on addressing the other comments, but I've been thinking more about how to specify generators. The approach I like the most is a Frankenstein combination of the existing proposals. First, we introduce a handful of interfaces:
type Generator[T any] interface {
Generate(r *rand.Rand) T
}
type Shrinker[T any] interface {
Shrink(T) []T
}
type Formatter[T any] interface {
Format(T) string
}
Generator[T]
generates values of typeT
.Shrinker[T]
shrinks values, which is used for minimization.Formatter[T]
pretty prints values, which is used for visualization.
I think these interfaces are as simple as possible, without any subtleties or oddities of some of the other proposals.
Next, workload methods have plain argument types:
type Workload struct { ... }
func (w *Workload) Foo(ctx context.Context, x int, y string) error { ... }
func (w *Workload) Bar(ctx context.Context, z bool) error { ... }
There is no casting, no Get
method, and no confusion about a type doubling as its generator. The user then has to define a separate struct with one method for every op in their workload struct. Methods return the Generator
s used to generate arguments to the ops.
type WorkloadGenerator struct {}
func (WorkloadGenerator) Foo() (Generator[int], Generator[string]) { ... }
func (WorkloadGenerator) Bar() (Generator[bool]) { ... }
To make implementing these methods easier, we provide a family of general purpose Generator
s. For example, sim.Int
, sim.String
, sim.ReadableString
, and so on. We can also provide basic functions to build Generator
s. For example, Pick[T any](xs ...T) Generator[T]
returns a Generator
that randomly returns the provided elements. Here's how we might implement WorkloadGenerator
:
type WorkloadGenerator struct {}
func (WorkloadGenerator) Foo() (Generator[int], Generator[string]) {
return sim.NegativeInt, sim.Pick("a", "b", "c")
}
func (WorkloadGenerator) Bar() (Generator[bool]) {
return sim.BiasedFlip(0.75)
}
Of course, users are free to implement their own custom instances of Generator[T]
as well. Moreover, a returned Generator[T]
may optionally implement the Shrinker[T]
or Formatter[T]
interfaces. The general purpose Generator
s we provide all implement the Shrinker
and Formatter
interfaces automatically.
If a user finds themselves re-using the same Generator
across a bunch of methods, they can use helper functions to avoid some of the repetition. Imagine a key-value store based workload, for example:
func key() Generator[string] { return sim.Pick("a", "b", "c") }
func value() Generator[string] { return sim.ReadableString }
type StoreGenerator struct {}
func (StoreGenerator) Get() Generator[string] { return key() }
func (StoreGenerator) Set() (Generator[string], Generator[string]) { return key(), value() }
func (StoreGenerator) Swap() (Generator[string], Generator[string]) { return key(), key() }
func (StoreGenerator) Delete() Generator[string] { return key() }
The main downside of this approach is that it requires by far the most typing out of any approach. The more I thought about this though, the less it bothered me. I started to think of it similar to how Go implements errors and how code is littered with if err != nil { ... }
code. It's more typing, but there's not really any subtleties or gotchas. I started to feel that this approach is the easiest to understand, but the most annoying to type out.
Finally, a small but related point. We've discussed the idea of generators having state. I started to think this is a bad idea and muddies the otherwise simple abstraction of a generator. I think all state should be in the workload, and the generator is left as a stateless and relatively independent entity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Sanjay! I addressed the rest of your comments.
Gen: func(*rand.Rand) int { ... }, | ||
Func: func(ctx context.Context, replies map[int]int, x int, a A) error { | ||
if y, err := a.A(ctx, x); err == nil { | ||
replies[x] = y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree.
return nil | ||
} | ||
|
||
func (*Workload) GenCallA(r *rand.Rand) int { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replied to your other comment with an example of this proposal.
- The `Generator[T]` interface is clunky. A value of type `T` generates a value | ||
of type `T`, which is weird. It means that op arguments are doubling as | ||
generators and generated values. Plus, the user sometimes has to typecast | ||
these arguments. We saw that in the `CallA` method above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that typecasting is both un-ergonomic and error-prone. I described one approach to a Get
method in Option 8 which separates the Generate
method and Get
method across two different interfaces, but we can maybe merge them into a single interface.
|
||
**Cons.** | ||
Magic struct tags. Also, it is awkward to fake a component to which we don't | ||
need a direct reference. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this option to the doc. I can work on writing user docs.
- People can easily forget to register a generator for every type. | ||
- People can register two generators for the same type. | ||
|
||
### Decision |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the decision we discussed on offline. I'll start implementing things, and we can tweak stuff as we go.
No description provided.