introduce coroutines to the language #1249

Draft: wants to merge 45 commits into base: devel

Collaborator

@zerbina zerbina commented Mar 18, 2024

Summary

Add first-class coroutines to the language. A coroutine is a stack-less, resumable procedure, which is similar to .closure iterators, but with a lower-level and more flexible interface.

The main inspiration for this feature is the CPS library.

Important: this PR is a work-in-progress. It exists for the purpose of tracking progress and to have something concrete that discussion can focus on. Nothing about the specification nor implementation is final, not even the name "coroutine".

Details

Current status:

  • a basic specification exists
  • the compiler implements a subset of the specification

The specification evolved from an earlier draft implementation, which can be found here. For turning procedures into resumable procedures, the existing closureiters machinery is re-used.


To-Do

  • iterate on the specification until a release candidate emerges
  • implement the specification (can start concurrently with iterating on the specification)

@zerbina added the enhancement, spec, compiler, and language-design labels Mar 18, 2024
discard """
"""

## Except for `openArray` and `var` types, there are no restrictions on the

Collaborator

@saem saem Mar 19, 2024

Is the var parameter restriction because we don't know how to handle them in the environment? Because technically we can just treat it as a ptr to that location (out parameter), although perhaps we're inheriting this limitation from closure iterators (haven't poked at those much).

Collaborator Author

Disallowing var is not so much a limitation with closure iterators, rather it's for safety. Capturing a var as a ptr is indeed possible, but then it's very easy to create a situation of dangling pointers. Consider:

proc coro(x: var int) {.coroutine.} =
  x = 1
  echo x

proc test(): Coroutine[void] =
  var y = 1
  result = launch coro(y)

# upon returning from `test`, the `y` local is gone, so reading/writing from/to
# `x` would be a use-after-(stack)-free  
resume(test())

Callsite escape analysis doesn't help, since the coroutine could save the instance in some global, for example.

Collaborator

Oh, so we couldn't require var T params be treated as lent T?

Collaborator Author

Due to ref's shared-ownership semantics, and Coroutine being a ref, it would not work, yeah.

If Coroutine were a unique-ownership type (e.g., something akin to C++'s std::unique_ptr), then storing the var T params as lent T could work, but the borrow checker would need to understand such a unique-ownership type.

Collaborator

Hmm, even if a coroutine callable was proc coro(self: sink Coroutine): Coroutine[T], sink doesn't work because it might be consumed (not strong enough) and that doesn't play well with the fact that we need to return it. I guess we'd have to have {.noalias.} become first class, as opposed to restrict in codegen.

Collaborator Author

I don't think it's a good idea, but we could introduce a CoroutinePtr, which is a heap-allocated object that has value semantics. Making this type as ergonomic to use as a ref type would be quite tricky, however.

Collaborator

I agree, I don't think it'd be a good idea. If anything that capability should develop separately (guided by a few more use cases) and if successful used here. Otherwise, I'm guessing there are better solutions.

@blackmius

add tests where you passing lambda functions into coroutine, tests calling coroutine inside other coroutine (or you designed it to call launch in all cases?), and running lambdas inside coroutines that returns other coroutine

proc a(cb: proc()) {.coroutine.}
proc b() {.coroutine.} =
  b() # recursively
proc c(coro: proc() {.coroutine.}) {.coroutine.}

## ``self`` parameter is made available.

# XXX: not a good solution, either the ``self`` parameter should be explicit
# (somehow), or there should be a magic procedure

@blackmius blackmius Mar 19, 2024

declaring self explicitly is better imo

proc coro(self: Coroutine) {.coroutine.} =
  echo typeof(self)
  echo self.status

Contributor

Agreed; if we're not going to have colorless functions, we should go all the way into color. Otherwise, it's just confusing to both programmers and metaprogrammers.

Collaborator

@saem saem Mar 19, 2024

I don't know if it is, proc c(a: int) {.coroutine.} describes at least two things, one is the coroutine constructor, and the other is all the fields (parameter list) that are part of the backing closure environment. The body of c assumes both self and an unpacked a.

We could have proc c(self: Coroutine, a: int) {.coroutine.}, but besides being redundant, it only provides an optional rename of the self parameter and specification of an alternative Coroutine base type. With that said:

  1. self's type isn't Coroutine, that's actually the base type
  2. renaming self is likely not helpful for most readers (misfeature)
  3. we can do named parameter shenanigans?

If you really want to be explicit, then it'd be more like this:

proc coro(#[constructor params go here]#) {.coroutine.} =
  proc (self: Coroutine) =
    echo typeof(self)
    echo self.status
  # `coroutine` pragma makes the return type a `void proc closure/Coroutine[void]`

Collaborator Author

@zerbina zerbina Mar 19, 2024

I had considered this, but I'm not sure if it's better. The first parameter having special meaning seems inconsistent with the rest of the language, where parameter position doesn't have special meaning.

I think it's also somewhat confusing for the programmer:

proc coro(self: Coroutine) {.coroutine.} =
  discard

# why is there no argument passed to the coroutine constructor?
discard launch(coro())

Edit: this boils down to what @saem said, but I missed said comment due to not refreshing the page

Collaborator Author

if we're not going to have colorless functions

I'm not sure I understand this remark. The feature as currently specified doesn't introduce "color", in that the coroutines can be launched and resumed from everywhere, without tainting the caller.

@blackmius

also how try/except/finally and defer will work in coroutines?

Contributor

@disruptek disruptek left a comment

I'd prefer to reduce/eliminate the status use; it feels like API that doesn't actually add much other than explicit boilerplate. Having to check the status of a coroutine feels like it makes what should be a simple function call into a fiddly instrument not unlike recovering an Option or Result value.

When should I choose a coroutine over a closure iterator and why?

Comment on lines 17 to 18
let instance = launch coro()
discard resume(instance) # the echo will be executed

Contributor

What does it mean that an apparently immutable coroutine can change state?
Why do we need a launch operator?
Why can't we simply call instance() to resume the coroutine and recover the argument of a yield?
What purpose does returning the coroutine from resume serve?

Collaborator

@saem saem Mar 19, 2024

What purpose does returning the coroutine from resume serve?

The point of returning a coroutine is to allow one coroutine call to "forever" yield control to another coroutine in its stead. This allows two way communication between coroutines, where it's not so much caller-callee, but "symmetric".

I recommend the Wikipedia article on coroutines. The "Definition and Types" section is really good for mapping the design space (stackless/full, a/symmetric, and first-class vs constrained), and the comparison section with subroutines illustrates the peering/symmetric relationship between a producer-consumer pair.

Collaborator Author

What does it mean that an apparently immutable coroutine can change state?

I'm not sure I understand the question. A coroutine instance is not immutable, or at least I don't consider them to be.

Why do we need a launch operator?

To construct/create an instance of a coroutine without blocking use of the coroutine(...) standalone invocation syntax for other purposes. At present, coroutine(...) is expanded to trampoline(launch coroutine()).

Why can't we simply call instance() to resume the coroutine and recover the argument of a yield?

Having an explicit resume relies on fewer less-tested compiler features, but I agree that having a proc `()`[T: Coroutine](c: T): T that's an alias for resume would make sense.

Comment on lines +17 to +19
## If a coroutine instance is cancelled, the exception that caused
## cancellation is stored in the coroutine instance object. It can be
## extracted by using the built-in ``unwrap``.

Contributor

Why is this preferable to simply raising the exception in the code path where it's running?

Collaborator

Because presently the exception is doing double duty for exceptions + cancellation. At the moment, I think cancellation needs its own thing, but that needs syntax/some operator and still jive with exceptions, as those can be raised by calls subordinate to the coroutine.

What should be possible is, assuming there is an executing coroutine (coro) and it encounters an error: it should be able to signal to the caller that an error occurred, the caller should be able to recover, and then it can resume execution of coro.

Comment on lines 6 to 11
proc trampoline(c: sink Coroutine[int]): int =
  # an example implementation
  var c = c
  while c.status == csSuspended:
    c = c.resume()
  result = c.finish()

Contributor

It would be a deep disappointment to have to trampoline coroutines implemented in the compiler.

Collaborator Author

Hm, why? In the earlier revision I linked, the trampoline action was fixed and inserted by the compiler, but that took away control from the programmer while unnecessarily complicating the compiler.

I'm imagining the responsibilities to be the following:

  • the compiler provides a low-level but still easy to use and build upon interface for interacting with coroutines
  • some sugar and often useful functionality (such as a general trampoline) is provided by the standard library
  • libraries can then build their own abstractions on-top, if wanted

In short, the compiler does the heavy lifting and exposes a low-level interface; sugar and other convenience is provided by the standard (or external) library code. In my mind, this should help with keeping the compiler itself lean(er) and make further evolution of the feature easier.
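
Editorial sketch of the split described above, written in Python because the proposed `.coroutine.` feature cannot be compiled yet. Python generators stand in for coroutine instances here: `next` plays the role of `resume`, a bare `yield` plays the role of `suspend()`, and the generator's return value plays the role of `finish`. The names `trampoline` and `count_to` are illustrative only and are not the PR's API.

```python
from typing import Any, Generator


def trampoline(coro: Generator[None, None, Any]) -> Any:
    """Library-level helper: resume `coro` until it finishes, return its result."""
    while True:
        try:
            next(coro)             # low-level primitive, analogous to `resume`
        except StopIteration as done:
            return done.value      # analogous to `finish`


def count_to(n: int) -> Generator[None, None, int]:
    """A coroutine-like routine that suspends after every step."""
    total = 0
    for i in range(1, n + 1):
        total += i
        yield                      # analogous to `suspend()`
    return total


result = trampoline(count_to(4))
print(result)  # 10
```

The runtime only supplies the resumable object; the loop that drives it is ordinary library code that a caller could replace with a different scheduling policy.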

Comment on lines +2 to +3
## A coroutine can be an inner procedure that closes over locals of the outer
## procedure.

Contributor

In that case, why can't it accept mutable arguments?

Comment on lines 5 to 8
## The ``.coroutine`` pragma can be supplied with a type, which the internal
## coroutine environment object then derives from. The type must be a non-
## final ``ref`` sub-type of ``Coroutine[T]``, where `T` is the return type
## of the coroutine.

Contributor

I feel this should evaporate with the explicit first argument defining the coroutine's type.

Comment on lines 8 to 9
if cancel:
  raise CatchableError.newException("cancel")

Collaborator

@zerbina did you consider/think of an alternative approach to cancellation, I'm trying to sort out how we might do:

  1. coroutine: signals error
  2. caller: recovers and resumes
  3. coroutine: resumes post-recovery

I imagine that'd be pretty key for enabling something like effect handling.

It could also be that raise is a coroutine entirely giving up (irrecoverable) vs "something else" where a coroutine needs input/something to continue or it's cancelled ('graceful' cancellation). This distinction might be entirely off base.

Collaborator Author

did you consider/think of an alternative approach to cancellation

I did consider an explicit cancel, but the way I imagined it seemed redundant with raising an exception, so I left it out.

So, with the current specification, one can do two things:

  • raise, and not handle, an exception from within the coroutine (terminal; coroutine instance cannot be resumed afterwards)
  • suspend and expect the caller to perform some action (like changing some coroutine instance state)

I think what you're saying, please correct me if I'm misunderstanding, is that coroutines should provide a built-in middle-ground, where the coroutine instance can signal that it can be resumed, but some action must be taken first; not doing so would raise a CoroutineError. How this could look like, I'm not sure yet.


Aside: whichever solution we pick, I think an escaping exception should abort the coroutine. My reasoning is that I think the programmer and compiler should be able to reason about control-flow within a coroutine the same as they do everywhere else (minus the suspending, of course, but that should ideally be more or less opaque).

Put differently, I think that it should be possible to turn a normal procedure into a coroutine without having to change anything else, and the body continues to behave exactly the same.

If a coroutine can continue after a raise (or a normal call that raised), then the above no longer holds, and whether raise is a terminator for structured control-flow becomes context and run-time dependent.

In short, I think raise should be "entirely giving up".

Collaborator

@saem saem Mar 21, 2024

So, with the current specification, one can do two things:

  • raise, and not handle, an exception from within the coroutine (terminal; coroutine instance cannot be resumed afterwards)
  • suspend and expect the caller to perform some action (like changing some coroutine instance state)

I think what you're saying, please correct me if I'm misunderstanding, is that coroutines should provide a built-in middle-ground, where the coroutine instance can signal that it can be resumed, but some action must be taken first; not doing so would raise a CoroutineError. How this could look like, I'm not sure yet.

Yup, I think a middle ground makes sense. To share my motivation: I want to see if we can drop the status field, ideally more, and I believe the middle ground will be required to achieve that.

I'll pick up dropping the state field below the section break.

Aside: whichever solution we pick, I think an escaping exception should abort the coroutine. My reasoning is that I think the programmer and compiler should be able to reason about control-flow within a coroutine the same as they do everywhere else (minus the suspending, of course, but that should ideally be more or less opaque).

I also agree that raising exceptions, and their impact on control flow, should remain as expected (hah!) and therefore predictable based on existing reasoning/intuition.

Put differently, I think that it should be possible to turn a normal procedure into a coroutine without having to change anything else, and the body continues to behave exactly the same.

This is a good property, because any background CPS transform, or the reverse, should be equivalent -- otherwise we've got bigger problems. 😬

If a coroutine can continue after a raise (or a normal call that raised), then the above no longer holds, and whether raise is a terminator for structured control-flow becomes context and run-time dependent.

In short, I think raise should be "entirely giving up".

Yup, on the same page; raise is an 'abort', instead of a 'fail/error'.


Dropping the state Field

I'd like to drop the state field, for possibly the same reasons @disruptek suggested removing it, mine are:

  • it reduces the amount of mutable state
  • the state of a coroutine is a derivation of its internal control flow, so the field is really a cache rather than the reality
  • it invites the temptation of fiddling with it
  • smaller coroutine instances; moar fast!

Instead, at least at a low level, a coroutine should yield those state values (statically knowable within a transformed coroutine definition body). For completion, where completion might have a non-void return value, there would be a finishedWithResult state that informs the receiver that a 'result' field both exists and contains a value of type T for a Coroutine[T]. This path might allow us to shrink CoroutineBase by removing other fields as well.

Otherwise, where state needs to be queried later, that burden can be carried by any runner, out of band, or by extended/more stateful coroutine instance types.
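
Editorial sketch of the idea of reporting state through the result of each resume instead of through a stored status field. This is a Python model, not the PR's design: generators stand in for coroutine instances, and `Suspended`, `Finished`, `resume`, and `work` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Generator, Union


@dataclass
class Suspended:
    """The coroutine stopped at a suspension point and can be resumed."""


@dataclass
class Finished:
    """The coroutine ran to completion; `result` holds its return value."""
    result: Any


def resume(coro: Generator[None, None, Any]) -> Union[Suspended, Finished]:
    """Run `coro` to its next suspension point and report what happened."""
    try:
        next(coro)
        return Suspended()
    except StopIteration as done:
        return Finished(done.value)


def work() -> Generator[None, None, int]:
    yield
    yield
    return 42


instance = work()
outcomes = []
while True:
    outcome = resume(instance)
    outcomes.append(outcome)
    if isinstance(outcome, Finished):
        break

print(outcomes)  # [Suspended(), Suspended(), Finished(result=42)]
```

In this model the instance carries no queryable status at all; a runner that needs the state later has to remember the last outcome itself, which matches the "out of band" option mentioned above.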

@saem
Collaborator

saem commented Mar 19, 2024

The Wikipedia article on coroutines does a good job outlining the design space. Except for stackful vs stackless, where the decision is sorta made for us (it's stackless), symmetric vs asymmetric and constrained vs first-class are still open for discussion.

I think we should aim for symmetric coroutines, because without that I believe we end up with generators (a limited form), which IIUC we already have via first class iterators. Major open questions here:

  • call vs pass/yield: how do we call (pass control with the expectation of control coming back) vs yield/pass to (transfer control with no expectation of getting control back) a coroutine -- this is of course mostly within coroutines
  • success vs cancellation, and recovery: if a computation (coroutine) fails, how do we recover and resume? how does this mesh with exceptions?

Constrained vs First-Class, I don't have a good intuition about it, I lean towards first-class where one can manipulate coroutine objects, but I also understand this could wreak havoc with analysis/safety properties, not to mention future composition and extension.


Quick explainer of terms, but reading the 'Definition and Types' section (it's very short) is likely much better:

  • Stackful vs Stackless: if any ol' proc can yield, then it's stackful, otherwise stackless (IIUC limitation of closure iterators means we're stackless)
  • Constrained vs First-Class: whether coroutines are objects that can freely be manipulated by the programmer (caveat: flexibility now means less flexibility later)
  • Asymmetric vs Symmetric: If a coroutine can choose where to yield control to, then it's symmetric, allowing for one coroutine to set up another, yield directly to it, with no expectation that it's a call (give control, and then expect it back)
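
Editorial sketch of the symmetric case, modelled in Python rather than in the PR's language. Each generator yields the coroutine that should run next, which mirrors the idea of `resume` returning the instance to continue with. The dispatcher `run`, the `peers` table, and the producer/consumer pair are hypothetical names for illustration; a real symmetric implementation would not necessarily need a dispatcher loop.

```python
from collections import deque

queue = deque()   # items handed from producer to consumer
log = []          # records the order in which the two coroutines ran
peers = {}        # lets each coroutine name the other one


def run(start):
    """Resume a coroutine, then switch to whichever coroutine it names."""
    current = start
    while current is not None:
        try:
            current = next(current)    # the coroutine picks its successor
        except StopIteration:
            current = None             # nobody left to transfer control to


def producer():
    for item in ("a", "b"):
        queue.append(item)
        log.append(f"produced {item}")
        yield peers["consumer"]        # transfer control directly to the consumer


def consumer():
    while True:
        item = queue.popleft()
        log.append(f"consumed {item}")
        yield peers["producer"]        # transfer control back to the producer


peers["producer"] = producer()
peers["consumer"] = consumer()
run(peers["producer"])
print(log)  # ['produced a', 'consumed a', 'produced b', 'consumed b']
```

Neither routine calls the other; each only says who goes next, so the relationship is between peers rather than caller and callee.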

@zerbina
Collaborator Author

zerbina commented Mar 20, 2024

Thank you for the overview, @saem!

My current opinions/thoughts on the major questions you raised:

I think we should aim for symmetric coroutines

I agree.

  • call vs pass/yield: how do we call (pass control with the expectation of control coming back) vs yield/pass to (transfer control with no expectation of getting control back) a coroutine -- this is of course mostly within coroutines

I do think both should be possible. That is, one should be able to yield to another coroutine (without any expectation of getting control back; at the moment), while also being able to invoke another coroutine (with the expectation of getting control back).

For what it's worth, the current specification allows for both:

proc coro() {.coroutine.} =
  let other = launch coro2()
  resume(other)
  # ^^ works just like it would outside of a coroutine, i.e., it's guaranteed the `coro`
  # gets control back
  suspend(other)
  # ^^ yield to other coroutine. No guarantees that `coro` ever continues

  • success vs cancellation, and recovery: if a computation (coroutine) fails, how do we recover and resume? how does this mesh with exceptions?

I don't have a fully thought-through opinion on this, but, building on what I said earlier, I do think that a coroutine should not be able to resume after an unhandled exception escapes it.

A mock example of how I think handling an exception and continuing the coroutine afterwards could look:

proc coro() {.coroutine.} =
  cancel CatchableError.newException("error")
  echo "here"

var instance = launch coro()
resume(instance)
# sugar implemented in the standard library could provide a more convenient
# abstraction for this kind of handling
if instance.status == csAborted:
  instance.handle(proc (e: sink Exception): bool =
    echo "handled: ", e.msg
    result = true # error was handled
  )
  # if returning 'false', then the `cancel` call within the coroutine would `raise`
  # the exception
# now the instance can resume
resume(instance)
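
Editorial sketch of the same recover-then-resume flow as a runnable Python model. It is an analogue, not the proposed API: the coroutine yields a `Cancel` request instead of using a `cancel` keyword, and the hypothetical driver `drive` asks a handler whether the error was dealt with. If the handler returns false, the error is raised at the suspension point inside the coroutine, as in the mock above.

```python
class Cancel:
    """Request yielded by a coroutine to signal a recoverable error."""

    def __init__(self, error: Exception) -> None:
        self.error = error


def coro(log):
    yield Cancel(ValueError("error"))  # stands in for `cancel ...` in the mock
    log.append("here")                 # only reached if the error was handled


def drive(instance, handler):
    """Resume `instance` to completion, consulting `handler` on each Cancel."""
    try:
        request = next(instance)
        while True:
            if isinstance(request, Cancel) and not handler(request.error):
                # Unhandled: raise the error at the suspension point.
                request = instance.throw(request.error)
            else:
                request = next(instance)
    except StopIteration:
        pass  # the coroutine ran to completion


handled_log = []


def accept(error):
    handled_log.append(f"handled: {error}")
    return True  # error was handled, the coroutine may continue


drive(coro(handled_log), accept)
print(handled_log)  # ['handled: error', 'here']

rejected_log = []
try:
    drive(coro(rejected_log), lambda error: False)
except ValueError:
    rejected_log.append("raised")
print(rejected_log)  # ['raised']
```

Note that an ordinary `raise` inside the coroutine body still terminates it in this model; only the explicit request can be recovered from.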

Constrained vs First-Class

My opinion is that they should be constrained, with coroutine objects consisting of internal, not freely modifiable state, and state that can be freely modified by the programmer (the custom coroutine types provide this).

  • (IIUC limitation of closure iterators means we're stackless)

It depends on how stackful coroutines should work. For example, they could be (and without being visible to the programmer) emulated by promoting every routine that yields to a coroutine to a coroutine. However, such coroutines would only be "stackful" in terms of semantics, not in the sense that there's a "real" stack memory backing them.

As a minor aside, (experimental) stackful coroutine support did exist as a library solution at one point, but it was removed (refer to #45).

@zerbina
Collaborator Author

zerbina commented Mar 20, 2024

Thanks for taking interest and providing feedback, @blackmius.

add tests where you passing lambda functions into coroutine

The current set of tests are meant as specification tests, which describe what works and what doesn't. If something isn't explicitly tested for or mentioned, then that means it's not special and should just work.

tests calling coroutine inside other coroutine (or you designed it to call launch in all cases?)

Either you'd launch one and then yield/resume, or you'd use the trampoline feature, yeah. The current set of tests should already cover this.

and running lambdas inside coroutines that returns other coroutine

Closures returning coroutines work just like any other closure. As for a proc () {.coroutine.} type, I'm not entirely sure what they should represent, nor whether adding them is a good idea.

Should it be the type of the internal launch procedure for a coroutine with the given signature? Should it be something else? How to use the type?

They could work like this:

proc coro() {.coroutine.} =
  discard

let c = coro
# `c` can now be used in all places where `coro` can be used
var instance = launch c()
c() # invokes `trampoline`

also how try/except/finally and defer will work in coroutines?

Early implementation limitations aside (closureiters has some issues), those constructs would work the same as everywhere else. For me, a guiding principle is that the programmer should be able to reason about the body of a coroutine in much the same way as they would do for any other routine.

@disruptek
Contributor

Is this thread about coroutines or is it about something else which is currently named coroutine?

@zerbina
Collaborator Author

zerbina commented Mar 20, 2024

Is this thread about coroutines or is it about something else which is currently named coroutine?

I think Coroutine is an adequate name for the feature as it's currently specified. What name do you think would describe/fit the current behaviour better?

@saem
Collaborator

saem commented Mar 20, 2024

  • call vs pass/yield: how do we call (pass control with the expectation of control coming back) vs yield/pass to (transfer control with no expectation of getting control back) a coroutine -- this is of course mostly within coroutines

I do think both should be possible. That is, one should be able to yield to another coroutine (without any expectation of getting control back; at the moment), while also being able to invoke another coroutine (with the expectation of getting control back).

... snip ...

  • success vs cancellation, and recovery: if a computation (coroutine) fails, how do we recover and resume? how does this mesh with exceptions?

I don't have a fully thought-through opinion on this, but, building on what I said earlier, I do think that a coroutine should not be able to resume after an unhandled exception escapes it.

The above facilities might provide the means to allow hooks that raise, because we should be able to handle the raised exception within a coroutine with a try/except all within the hook, then pass control back to the hook.

As a minor aside, (experimental) stackful coroutine support did exist as a library solution at one point, but it was removed (refer to #45).

I forgot about this, we can restore it at some point, although not as a dialect.

@saem
Collaborator

saem commented Mar 20, 2024

Is this thread about coroutines or is it about something else which is currently named coroutine?

Fair point, they really look like coroutines, and if we get them symmetric then we're pretty much at one-shot/delimited continuations. I'm just having a really hard time getting to the essence of what we're after without establishing a definition of the properties we want it to exhibit, and along the way sort out an interface that allows the user to best exploit those properties.

@blackmius

blackmius commented Mar 20, 2024

@zerbina i suggested such tests because there were problems with these cases in CPS and async/await implementations.

Closures returning coroutines work just like any other closure. As for a proc () {.coroutine.} type, I'm not entirely sure what they should represent, nor whether adding them is a good idea.

just like regular closures they represents either callback or function returning value, but now they can suspend and be awaited with value. and needs for composition

here are examples

# coroutine callback (also as closure)
proc httpServe(handler: proc(req: Request, res: Response)  {.coroutine.}) {.coroutine.}
httpServe(proc (req: Request, res: Response) {.coroutine.} =
  resume launch res.end("hello", 200)
)

# coroutine closure
proc socks5Req(url: string, auth: proc (sock: Socket): bool {.coroutine.}): Socks5 {.coroutine.} =
  let sock = newSocket(url)
  # initiate socks5 protocol
  if not resume(launch auth(sock)):
    # wait authenticated
    raise Socks5AuthException()
  return Socks5(sock)

# maybe some pipeline operations makings requests what needs to await values to continue

maybe it is not proc () {.coroutine.} but proc (): Coroutine[void]

@zerbina
Collaborator Author

zerbina commented Mar 20, 2024

just like regular closures they represents either callback or function returning value, but now they can suspend and be awaited with value. and needs for composition

Okay, so if I understand you correctly, you argue that proc () {.coroutine.} should be a first-class coroutine (a Coroutine represents a coroutine instance), which makes sense, and overlaps with what I suggested above. Having them seems like a good idea, yeah, I'll write the specification tests.

One note about your example code: if not resume(launch auth(sock)) won't work, at least with the current specification, since resume always returns a (.discardable) Coroutine. Assuming that an applicable trampoline overload exists, you'd have to use if not auth(sock) (which the compiler expands to if not trampoline(launch auth(sock))).

The tests are about custom *instance* types, not *coroutine* types.
System integration:
* add a first revision of the compiler interface and low-level API
* add an example of how a generic `trampoline` could look like

Compiler integration:
* register the magics
* register the `coroutine` word
* add a condition symbol for the "coroutine" feature
@@ -0,0 +1,25 @@

# XXX: maybe coroutine values should be named "first-class coroutine symbol"?

Collaborator

if proc foo() = discard is a proc def, then shouldn't proc bar() {.coroutine.} = discard be a coroutine def, although a better term might be coroutine-constructor?

Collaborator Author

The current terminology I'm using (it changed a bit since writing the comment):

  • proc c() {.coroutine.} = discard is a coroutine definition
  • c is the name of a coroutine
  • the c() in launch c() is a coroutine instance construction
  • proc() {.coroutine.} is a coroutine type
  • launch returns a coroutine instance
  • with let v = c, v stores a coroutine value (coroutine alone probably suffices)

## with them.

type
  CoroutineBase {.compilerproc.} = ref object of RootObj

There seems to be a lack of transitions from child to parent, like the mom field in CPS. At least, I don't get how a coroutine can return control back to its parent after several suspensions without storing a parent ref in self.

@blackmius blackmius Mar 21, 2024

proc a() {.coroutine.} =
  echo 2
  suspend()
  echo 3
proc b() {.coroutine.} =
  echo 1
  a()
  echo 4
b()

With the current trampoline implementation this will work, because there are no real suspends. I think it is more like closure iterators right now, but with only a single return at the end (so even less useful).

But if we save a coroutine somewhere and try to continue it later, it will lose its parent, because it has lost the stack on which it was all running before:

var wait: Coroutine

proc a() {.coroutine.} =
  echo 2
  wait = self
  suspend()
  echo 3
proc b() {.coroutine.} =
  echo 1
  launch a()
  echo 4
launch b()
resume wait
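The concern can be reproduced with plain Python generators, which also have no notion of a parent. This is only a model of the intent of the snippet above: `b` is a plain function here for brevity, and `log` replaces `echo` so the order can be inspected.

```python
log = []
wait = None                # global slot holding the suspended child


def a():
    log.append(2)
    yield                  # suspend: control returns to whoever resumed us
    log.append(3)


def b():
    global wait
    log.append(1)
    wait = a()             # construct the child and remember it
    next(wait)             # resume it once; it suspends after logging 2
    log.append(4)          # the parent carries on and finishes


b()
next(wait, None)           # later resume: the child finishes, nobody is notified
```

The recorded order is 1, 2, 4, 3: once the child is resumed from somewhere else, nothing ties its completion back to `b`.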

Collaborator Author

There seems to be a lack of transitions from child to parent, like the mom field in CPS.

Correct, at the moment, the basic instance type has no concept of coroutine instance relationship.

At least, I don't get how a coroutine can return control back to its parent after several suspensions without storing a parent ref in self.

It is currently possible to implement tail calls, which I think is what you're after, by using a custom instance type that stores the parent instance. If knowing the parent coroutine instance (if any) is something that's necessary often enough, then I think it would make sense to have a mom-like field part of Coroutine[T].

With the current trampoline implementation this will work, because there are no real suspends.

Yep, but keep in mind that the trampoline implementation is only a showcase implementation. It's there to have a working example of what an implementation could look like.

In addition, the trampoline is looked up in the context of a(), meaning that it's possible to override/overload the default trampoline (I'm still not sure whether there should be an always-present default trampoline at all).
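As a sketch of the "custom instance type that stores the parent instance" idea, modeled in Python with generators: `Instance`, its `parent` field, and `run` are hypothetical names invented for this illustration, not part of the specification. A coroutine hands over control by yielding another instance, and the driver follows the stored parent link when an instance finishes.

```python
class Instance:
    """Hypothetical custom instance type carrying a mom-like parent link."""

    def __init__(self, gen):
        self.gen = gen
        self.parent = None


def run(inst):
    """Resume `inst`; returns the instance that suspended, or None when all finished."""
    while inst is not None:
        try:
            target = next(inst.gen)
        except StopIteration:
            inst = inst.parent       # finished: continue with the stored parent
            continue
        if target is None:
            return inst              # plain suspend: report who is pending
        target.parent = inst         # remember where to return to
        inst = target                # pass control to the child
    return None


log = []


def child():
    log.append(2)
    yield None                       # suspend
    log.append(3)


def parent():
    log.append(1)
    yield Instance(child())          # hand control to the child
    log.append(4)


pending = run(Instance(parent()))    # runs 1, 2 and stops at the child's suspend
first = list(log)
finished = run(pending)              # runs 3, then returns into the parent for 4
```

Resuming the saved child later still ends up back in the parent, because the link lives in the instance rather than on the stack.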

Collaborator

At least, I don't get how a coroutine can return control back to its parent after several suspensions without storing a parent ref in self.

It is currently possible to implement tail calls, which I think is what you're after, by using a custom instance type that stores the parent instance. If knowing the parent coroutine instance (if any) is something that's necessary often enough, then I think it would make sense to have a mom-like field part of Coroutine[T].

Hmm, given that the current implementation doesn't remember the parent, and given the sorta unconventional control flow (at least based on coroutine "standards"), would a potentially better name for this more stripped-down concept be "suspendable"? It's a suspendable routine, as in it can suspend its execution, but that's really the only guarantee it provides. It doesn't have the state (i.e., mom) to honour returning control back to the parent.

I'm sorta borrowing terminology from Kotlin here. Don't mean for this to be out of the blue, been thinking about it for a while and this occurred to me just now. I might be putting too hard a requirement on how coroutines should behave, but given that subroutines are meant to be a special case of coroutine, I think the parent requirement seems reasonable (subroutines take control from and return control to the parent).

@Varriount
Contributor

Keep in mind that symmetric coroutines can be implemented via asymmetric coroutines, and vice-versa.
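One direction of that equivalence can be sketched briefly. The model below is Python, with generators playing the asymmetric coroutines; `dispatch`, `ping` and `pong` are names made up for the illustration. Each coroutine yields the instance it wants to transfer to, and a small dispatcher loop performs the transfer, which gives symmetric behaviour on top of an asymmetric primitive.

```python
def dispatch(start):
    """Emulate symmetric transfers: every yield names the next coroutine to run."""
    current = start
    while current is not None:
        try:
            current = next(current)
        except StopIteration:
            current = None           # the running coroutine finished: stop


log = []


def ping(n):
    for i in range(n):
        log.append(f"ping {i}")
        yield pong_inst              # transfer control to pong


def pong(n):
    for i in range(n):
        log.append(f"pong {i}")
        yield ping_inst              # transfer control back to ping


ping_inst = ping(2)
pong_inst = pong(2)
dispatch(ping_inst)
```

Neither coroutine ever returns to a caller in the usual sense; the dispatcher is the only code that resumes anything.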

zerbina and others added 6 commits March 26, 2024 00:31
The special-handling for the result variable wasn't employed.
`Exception` is an object type, not a `ref` type.
A catchable exception escaping a coroutine now properly cancels the
coroutine. (The `t06_cancel_coroutine` test succeeds)
It's implemented as a separately injected cursor local. When used in
the coroutine, it gets lifted into the instance -- it being a cursor
prevents reference cycles in this case.

For attaching the symbol to the procedure, the (otherwise unused)
dispatcher slot is re-used.
User-code is most likely going to need to refer to *all* instance types
in some way.
Co-authored-by: Saem Ghani <[email protected]>
@blackmius

suspend lifted differently than yield, maybe it is better to check these cases too

proc a() {.coroutine.} =
  for i in 1..5:
    echo i
    suspend(self)
  var q = 5
  while q > 0:
    suspend(self)
    echo q
    q -= 1
trampoline a()

proc b(a: int) {.coroutine.} =
  if a > 2:
    suspend(self)
    echo 1
  case a:
  of 1:
    suspend(self)
    echo 2
  else:
    suspend(self)

proc c() {.coroutine.} =
  defer:
    suspend(self)
    echo 4
  try:
    suspend(self)
    echo 1
    raise newException(ValueError, "a")
  except:
    echo 2
    suspend(self)
  finally:
    echo 3
    suspend(self)

* suspending in for loops doesn't work (misbehaves at run-time)
* `try`/`except`/`finally` results in C compiler errors
A `result` symbol was unnecessarily created twice for the constructor.
The full environment type was used to compute the offset, but this is
wrong! Its fields need to be patched, so the environment's *base* type
has to be used for computing the field offset.

The incorrect field positions made the VM crash.
The previous implementation had one major flaw: it didn't account for
`closureiters` adding new fields to the environment type, which happens
when there's an `except` or `finally` in the body.

To properly support that situation, the pass is reworked to be a bit less
clever:
* `coroutines` doesn't dispatch to `lambdalifting`
* instead, the inner procedure is a lambda expression
* the inner procedure is transformed during the core `transf` pass for
  the constructor (requires recursion, but is much easier)

When the `closureiters` pass is now run, the hidden parameter still has the
original type, allowing for the addition of new fields.
Temporaries spawned as part of `transf` weren't lifted into the
environment (due to a phase ordering issue), affecting inlining. Same
as with closure iterators, the spawned temporaries need to be added to
the environment directly.
`suspend` accepted a generic `Coroutine` type as the parameter, which
resulted in a run-time type error when an argument not of type
`CoroutineBase` was passed to `suspend`. Only the VM, due to its strict
checks, failed because of this.
@zerbina
Collaborator Author

zerbina commented Mar 26, 2024

@blackmius: Thank you for testing the branch! I've reduced the test you provided to the problematic parts and added them to the test suite.

suspend lifted differently than yield

suspend is lowered to yield; they're not transformed differently. The problem was that not all temporaries were properly lifted into the environment.
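For readers unfamiliar with the lowering, here is a hand-written model, in Python, of what "lifting into the environment" means for a loop that suspends on every iteration. `CounterEnv` and `step` are illustrative names only; the compiler's actual output looks different. The point is that any local (or temporary) that lives across a suspension has to be a field of the environment, since the stack frame is gone between resumes.

```python
class CounterEnv:
    """Environment: the resume state plus every local that lives across a suspend."""

    def __init__(self, n):
        self.state = 0
        self.n = n
        self.i = 0                   # the lifted loop variable


def step(env, out):
    """Run until the next suspension. True while suspended, False once finished."""
    while True:
        if env.state == 0:           # entry: initialise the loop
            env.i = 1
            env.state = 1
        elif env.state == 1:         # loop head and body up to the suspend
            if env.i > env.n:
                env.state = -1
                return False
            out.append(env.i)
            env.state = 2
            return True              # corresponds to suspend()
        elif env.state == 2:         # code after the suspend
            env.i += 1
            env.state = 1
        else:                        # already finished
            return False


out = []
env = CounterEnv(3)
resumes = 0
while step(env, out):
    resumes += 1
```

If `i` were kept in a local variable of `step` instead of in `env`, its value would be lost at every `return True`, which is the kind of misbehaviour an unlifted temporary causes.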

In many cases, `suspend` is only called with `self`. Apart from
reducing visual noise, not having to explicitly pass `self` also allows
for a more efficient implementation of yield-to-self.
For coroutines with a result variable, the compiler crashes when
there's an error in the body.

This was because `addResult` didn't account for the result slot
existing but being empty, *appending* the result symbol to the AST,
thus leaving the actual result slot uninitialized and `buildErrorList`
then crashing due to a traversed node being nil.
@zerbina
Collaborator Author

zerbina commented Mar 26, 2024

On Matrix, the suggestion was made to have suspend() mean the same as suspend(self). I think this is a good idea, so I've changed the specification for suspend and updated the implementation accordingly.

@zerbina
Collaborator Author

zerbina commented Apr 5, 2024

Regarding whether the built-in coroutine should know about continuations (i.e., the coroutine to continue with on exit of the current one), after spending some time on thinking about it, my current opinion is that it should.

Upsides

  • a baseline coroutine is more powerful, so coroutines compose better
  • less reliance on abstraction (could also be seen as a downside)
  • the name "coroutine" would be more fitting (in their present shape, @saem suggested that "suspendable procedure" might be a better name for coroutines)
  • no new hook (for providing the continuation) is required

Downsides

  • slightly more code in the compiler
  • a coroutine instance is larger by default (by one ref)
  • the continuation is managed by the runtime/compiler, so there's somewhat less freedom for custom abstractions

What's still missing either way is a tail routine, which allows for tail-calling a coroutine. This would be similar to suspend(other), but with the important difference that:

  • control doesn't get passed back to the caller of resume
  • exceptions and return value unpacking happen in the coroutine

A practical demonstration:

proc other(): int {.coroutine.} =
  echo "2"
  result = 3

proc coro() {.coroutine.} =
  echo "1"
  echo tail(other())
  echo "4"

let c = resume coro()
doAssert c.status == csPending
# echoes 1 2 3 4

If continuation support is built into coroutines, this, more or less, works as is. If going the library route, the standard library needs to provide a Continuation instance type, which other would have to use.
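For comparison, Python's `yield from` has roughly the two properties listed for `tail`: when the delegated generator finishes, control continues in the delegating coroutine rather than returning to the resumer, and the return value (or exception) is unpacked inside the coroutine. The sketch below mirrors the demonstration with `log` in place of `echo`; it is an analogy, not a statement about how `tail` would be implemented.

```python
def other(log):
    log.append("2")
    return 3
    yield                            # unreachable; only marks this as a generator


def coro(log):
    log.append("1")
    value = yield from other(log)    # delegate; the result is unpacked right here
    log.append(str(value))
    log.append("4")


log = []
for _ in coro(log):                  # drive to completion; nothing suspends here
    pass
```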

@zerbina
Collaborator Author

zerbina commented Apr 10, 2024

Regarding the interface, I've been thinking about whether implicit coroutine constructors should be removed. At the moment, the signature of a coroutine describes the constructor, not the coroutine itself:

proc coro(a, b: int): int {.coroutine.} =
  discard

# coro is the *constructor* for a coroutine instance of type `Coroutine[int]`, not
# the coroutine itself  

What I've been thinking about is making the constructor explicit, by re-introducing launch:

proc coro(self: Coroutine) {.coroutine.} =
  discard

resume(launch coro)

# for capturing parameters, a dedicated constructor procedure is required
proc construct(x, y: int): Coroutine[void] =
  launch(proc (self: Coroutine) {.coroutine.} =
    echo x, y
  )

resume(construct(1, 2))

The Upsides

  • there's less magic going on, making the code easier to reason about for the programmer
  • the self parameter becomes explicit, which would allow for removing the .coroutine pragma parameter
  • the implementation in the compiler becomes simpler

In addition, removing implicit coroutine constructors might allow for the following to work, but there are some open questions regarding how it'd be implemented:

proc coro(self: Coroutine, a, b: int) {.coroutine.} =
  echo (a, b)
  suspend()
  echo (a, b)

let c = launch coro
resume c, 1, 2 # echoes "(1, 2)"
resume c, 3, 4 # echoes "(3, 4)"
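As one possible reading of "arguments passed on every resume", Python generators offer `send`, which delivers a value to the suspended `yield`. The sketch below models the snippet above that way; `launch` and `resume` are small helpers written for this illustration and say nothing about how the open implementation questions would be resolved.

```python
def coro(out):
    a, b = yield                     # wait for the first resume's arguments
    out.append((a, b))
    a, b = yield                     # suspend(); the next resume rebinds a and b
    out.append((a, b))


def launch(fn, *args):
    """Construct an instance and run it up to its first suspension point."""
    inst = fn(*args)
    next(inst)
    return inst


def resume(inst, *args):
    """Resume `inst`, passing `args` to the suspended yield."""
    try:
        inst.send(args)
    except StopIteration:
        pass                         # the coroutine ran to completion


out = []
c = launch(coro, out)
resume(c, 1, 2)                      # records (1, 2)
resume(c, 3, 4)                      # records (3, 4)
```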

The Downsides

  • capturing parameters becomes less efficient, at least with how lambda-lifting currently works
  • it's not possible to turn a procedure into a coroutine by just adding the .coroutine pragma
  • macros cannot support being coroutines

The overlap (in terms of interface) between closure iterators and coroutines would also increase significantly.

@saem
Collaborator

saem commented Apr 11, 2024

Regarding the interface, I've been thinking about whether implicit coroutine constructors should be removed. At the moment, the signature of a coroutine describes the constructor, not the coroutine itself

I briefly thought about this too, this is where my head went, sharing for sharing's sake, I don't consider this any sort of final interface, but merely showing how I was thinking about it if trending towards a more explicit interface.

proc cc(a: int): Coroutine[int] =
  ## `cc` is the coroutine constructor
  result.coroutine: # template to lift/assign `c` to the constructed coroutine
    proc c(self: Coroutine) =
      ## `c` is the "computation-sequence"/state-machine definition
      echo a
      self.suspend()
      self.result = a

let coro = cc(10)
resume coro  # first state of `c`
resume coro  # second state of `c`

I think the more explicit style of the declaration is better, as it's quite honest wrt what a coroutine is; but the calling/consumption side of the interface might leave something to be desired. An alternative approach for calling could be:

let coro = launch(cc(10))
resume coro
resume coro

# where `cc(10)` could then be construct + run to the end

* a coroutine can now pass control to another coroutine instance
  without `resume` returning
* the `suspend` overload that allows specifying a destination coroutine
  is removed
The continuation is stored directly within `CoroutineBase`.
@zerbina
Collaborator Author

zerbina commented Apr 12, 2024

I briefly thought about this too, this is where my head went, sharing for sharing's sake, I don't consider this any sort of final interface, but merely showing how I was thinking about it if trending towards a more explicit interface.

Thank you for sharing, I appreciate it! I do quite like the idea of having a magic routine/operator that performs the lifting/assignment; it's very close to what currently happens internally, but within user-space. It's very honest, like you said, and there's a lot less hidden processing, symbol introduction, etc. going on, which I personally consider a significant plus.

If going with explicit constructors, I think an important design/interface decision is whether coroutine procedures (for lack of a better name) should be definable everywhere, or, going with the interface from your idea, only as a nameless argument to the coroutine magic.

Also, without explicit constructors, in the ideal case, I think the .coroutine pragma should not be a built-in pragma, as it would make the name available for custom, programmer-defined macro/template pragmas. Whether it's really possible to not need the built-in pragma, I'm not sure yet.

An alternative approach for calling could be:

let coro = launch(cc(10))
resume coro
resume coro

# where `cc(10)` could then be construct + run to the end

Being able to "call" a coroutine as if it were a normal procedure is certainly nice, and it's why, with the early draft, there was the "coroutine constructor invocation outside of launch calls expands to trampoline cc(...)" rule.

However, without implicit coroutine constructors, I'm not sure how it could work, since the explicit constructor would just return a Coroutine[T] value. What could work is designating coroutine constructors as such, likely via a pragma, in order to allow the compiler to special-case them, though I'm still undecided whether that's a good idea or not.

@saem
Collaborator

saem commented Apr 12, 2024

I'll take a peek at Zig, they were focused on async/await, but they did address running the same routine in cooperative mode or simply linearly with blocking.

Maybe there are some ideas there.

@@ -0,0 +1,8 @@
## An anonymous routine be turned into a coroutine too.
Collaborator

Suggested change
## An anonymous routine be turned into a coroutine too.
## An anonymous routine can be turned into a coroutine too.

@saem
Collaborator

saem commented Apr 17, 2024

  • the name "coroutine" would be more fitting (in their present shape, @saem suggested that "suspendable procedure" might be a better name for coroutines)

I think this is a connotative vs denotative meaning issue. As far as coroutines have an "official definition", they really aren't more than a suspendable routine (denotative). The modern expectation and understanding extends that to blur the line by including the continuation (connotative).

I'm less bullish on combining the two concepts by default. Suspendable routines being called that is a concession to the now somewhat popularized understanding of coroutine (we don't have to concede). While keeping the continuation passing separate lets us better explore structured concurrency and continuation context. Otherwise I suspect we'll end up adding another field or two (maybe that's not an issue?). Also, I think we lose an important bit of generality, it just adds suspendability to any routine.

I might be overly romanticizing the simplicity of only layering in suspension support, and it might be practically entirely useless. I consider unstructured concurrency that way, it all sounds straightforward, but it becomes untenable so fast that it's sorta pointless.

I'll admit I'm on the fence about it, with a preference of not mixing the two because that might be easier to change later.
