- =====================================
- Coroutines in LLVM
- =====================================
- .. contents::
- :local:
- :depth: 3
- .. warning::
- This is a work in progress. Compatibility across LLVM releases is not
- guaranteed.
- Introduction
- ============
- .. _coroutine handle:
- LLVM coroutines are functions that have one or more `suspend points`_.
- When a suspend point is reached, the execution of a coroutine is suspended and
- control is returned back to its caller. A suspended coroutine can be resumed
- to continue execution from the last suspend point or it can be destroyed.
- In the following example, we call function `f` (which may or may not be a
- coroutine itself) that returns a handle to a suspended coroutine
- (**coroutine handle**) that is used by `main` to resume the coroutine twice and
- then destroy it:
- .. code-block:: llvm
- define i32 @main() {
- entry:
- %hdl = call i8* @f(i32 4)
- call void @llvm.coro.resume(i8* %hdl)
- call void @llvm.coro.resume(i8* %hdl)
- call void @llvm.coro.destroy(i8* %hdl)
- ret i32 0
- }
- .. _coroutine frame:
- In addition to the function stack frame which exists when a coroutine is
- executing, there is an additional region of storage that contains objects that
- keep the coroutine state when a coroutine is suspended. This region of storage
- is called the **coroutine frame**. It is created when a coroutine is called
- and destroyed when a coroutine either runs to completion or is destroyed
- while suspended.
- LLVM currently supports two styles of coroutine lowering. These styles
- support substantially different sets of features, have substantially
- different ABIs, and expect substantially different patterns of frontend
- code generation. However, the styles also have a great deal in common.
- In all cases, an LLVM coroutine is initially represented as an ordinary LLVM
- function that has calls to `coroutine intrinsics`_ defining the structure of
- the coroutine. The coroutine function is then, in the most general case,
- rewritten by the coroutine lowering passes to become the "ramp function",
- the initial entrypoint of the coroutine, which executes until a suspend point
- is first reached. The remainder of the original coroutine function is split
- out into some number of "resume functions". Any state which must persist
- across suspensions is stored in the coroutine frame. The resume functions
- must somehow be able to handle either a "normal" resumption, which continues
- the normal execution of the coroutine, or an "abnormal" resumption, which
- must unwind the coroutine without attempting to suspend it.
- Switched-Resume Lowering
- ------------------------
- In LLVM's standard switched-resume lowering, signaled by the use of
- `llvm.coro.id`, the coroutine frame is stored as part of a "coroutine
- object" which represents a handle to a particular invocation of the
- coroutine. All coroutine objects support a common ABI allowing certain
- features to be used without knowing anything about the coroutine's
- implementation:
- - A coroutine object can be queried to see if it has reached completion
- with `llvm.coro.done`.
- - A coroutine object can be resumed normally if it has not already reached
- completion with `llvm.coro.resume`.
- - A coroutine object can be destroyed, invalidating the coroutine object,
- with `llvm.coro.destroy`. This must be done separately even if the
- coroutine has reached completion normally.
- - "Promise" storage, which is known to have a certain size and alignment,
- can be projected out of the coroutine object with `llvm.coro.promise`.
- The coroutine implementation must have been compiled to define a promise
- of the same size and alignment.
- In general, interacting with a coroutine object in any of these ways while
- it is running has undefined behavior.
- The coroutine function is split into three functions, representing three
- different ways that control can enter the coroutine:
- 1. the ramp function that is initially invoked, which takes arbitrary
- arguments and returns a pointer to the coroutine object;
- 2. a coroutine resume function that is invoked when the coroutine is resumed,
- which takes a pointer to the coroutine object and returns `void`;
- 3. a coroutine destroy function that is invoked when the coroutine is
- destroyed, which takes a pointer to the coroutine object and returns
- `void`.
- Because the resume and destroy functions are shared across all suspend
- points, suspend points must store the index of the active suspend in
- the coroutine object, and the resume/destroy functions must switch over
- that index to get back to the correct point. Hence the name of this
- lowering.
- Pointers to the resume and destroy functions are stored in the coroutine
- object at known offsets which are fixed for all coroutines. A completed
- coroutine is represented with a null resume function.
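The common ABI described above can be modeled in plain C++ as a two-pointer header at the start of every coroutine object. The sketch below is only an illustration under that assumption: `CoroHeader`, `coro_done`, `coro_resume`, `coro_destroy`, and the toy `OneShot` coroutine are hypothetical names, not LLVM APIs, and they stand in for what the corresponding intrinsics are lowered to.

```cpp
#include <cassert>

// Hypothetical model of the fixed prefix shared by all switched-resume
// coroutine objects: two function pointers at known offsets.
struct CoroHeader {
  void (*resume)(CoroHeader *);   // null once the coroutine has completed
  void (*destroy)(CoroHeader *);
};

// Models llvm.coro.done: a completed coroutine has a null resume pointer.
inline bool coro_done(const CoroHeader *hdl) { return hdl->resume == nullptr; }

// Models llvm.coro.resume: an indirect call through the stored pointer.
inline void coro_resume(CoroHeader *hdl) {
  assert(!coro_done(hdl) && "resuming a completed coroutine is undefined");
  hdl->resume(hdl);
}

// Models llvm.coro.destroy: an indirect call through the destroy pointer.
inline void coro_destroy(CoroHeader *hdl) { hdl->destroy(hdl); }

// A toy coroutine object. The header comes first, so a pointer to the object
// is also a pointer to the header.
struct OneShot {
  CoroHeader header;
  int runs;
  bool destroyed;
};

void one_shot_resume(CoroHeader *hdl) {
  OneShot *self = reinterpret_cast<OneShot *>(hdl);
  ++self->runs;
  hdl->resume = nullptr;  // mark completion
}

void one_shot_destroy(CoroHeader *hdl) {
  reinterpret_cast<OneShot *>(hdl)->destroyed = true;
}
```

Because callers only touch the header, they can resume, query, or destroy the object without knowing anything about the fields that follow it.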
- There is a somewhat complex protocol of intrinsics for allocating and
- deallocating the coroutine object. It is complex in order to allow the
- allocation to be elided due to inlining. This protocol is discussed
- in further detail below.
- The frontend may generate code to call the coroutine function directly;
- this will become a call to the ramp function and will return a pointer
- to the coroutine object. The frontend should always resume or destroy
- the coroutine using the corresponding intrinsics.
- Returned-Continuation Lowering
- ------------------------------
- In returned-continuation lowering, signaled by the use of
- `llvm.coro.id.retcon` or `llvm.coro.id.retcon.once`, some aspects of
- the ABI must be handled more explicitly by the frontend.
- In this lowering, every suspend point takes a list of "yielded values"
- which are returned back to the caller along with a function pointer,
- called the continuation function. The coroutine is resumed by simply
- calling this continuation function pointer. The original coroutine
- is divided into the ramp function and then an arbitrary number of
- these continuation functions, one for each suspend point.
- LLVM actually supports two closely-related returned-continuation
- lowerings:
- - In normal returned-continuation lowering, the coroutine may suspend
- itself multiple times. This means that a continuation function
- itself returns another continuation pointer, as well as a list of
- yielded values.
- The coroutine indicates that it has run to completion by returning
- a null continuation pointer. Any yielded values will be `undef` and
- should be ignored.
- - In yield-once returned-continuation lowering, the coroutine must
- suspend itself exactly once (or throw an exception). The ramp
- function returns a continuation function pointer and yielded
- values, but the continuation function simply returns `void`
- when the coroutine has run to completion.
- The coroutine frame is maintained in a fixed-size buffer that is
- passed to the `coro.id` intrinsic, which guarantees a certain size
- and alignment statically. The same buffer must be passed to the
- continuation function(s). The coroutine will allocate memory if the
- buffer is insufficient, in which case it will need to store at
- least that pointer in the buffer; therefore the buffer must always
- be at least pointer-sized. How the coroutine uses the buffer may
- vary between suspend points.
- In addition to the buffer pointer, continuation functions take an
- argument indicating whether the coroutine is being resumed normally
- (zero) or abnormally (non-zero).
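The shape of normal returned-continuation lowering can be sketched by hand in C++. This is a minimal model, not what LLVM emits: the names `Yield`, `Continuation`, `count_ramp`, `count_resume1`, and `count_resume2` are hypothetical, the coroutine yields a single `int`, and its state is assumed to fit in the caller-provided buffer so no extra allocation is modeled.

```cpp
#include <cassert>
#include <cstring>

// What a ramp or continuation function returns in this model: the next
// continuation (null when the coroutine has completed) and one yielded value.
struct Yield;
using Continuation = Yield (*)(void *buffer, bool abnormal);
struct Yield {
  Continuation next;
  int value;  // meaningless when next is null
};

// The frame of this toy coroutine is a single int stored in the buffer.
static int load_state(void *buffer) {
  int n;
  std::memcpy(&n, buffer, sizeof n);
  return n;
}
static void store_state(void *buffer, int n) {
  std::memcpy(buffer, &n, sizeof n);
}

// Continuation for the second suspend point: the coroutine runs to completion.
Yield count_resume2(void *, bool) { return {nullptr, 0}; }

// Continuation for the first suspend point.
Yield count_resume1(void *buffer, bool abnormal) {
  if (abnormal) return {nullptr, 0};  // unwind without suspending again
  int n = load_state(buffer) + 1;
  store_state(buffer, n);
  return {count_resume2, n};
}

// Ramp function: runs until the first suspend point and yields n.
Yield count_ramp(void *buffer, int n) {
  store_state(buffer, n);
  return {count_resume1, n};
}
```

A caller drives the coroutine by repeatedly calling the returned pointer with the same buffer until a null continuation comes back.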
- LLVM is currently ineffective at statically eliminating allocations
- after fully inlining returned-continuation coroutines into a caller.
- This may be acceptable if LLVM's coroutine support is primarily being
- used for low-level lowering and inlining is expected to be applied
- earlier in the pipeline.
- Coroutines by Example
- =====================
- The examples below are all of switched-resume coroutines.
- Coroutine Representation
- ------------------------
- Let's look at an example of an LLVM coroutine with the behavior sketched
- by the following pseudo-code.
- .. code-block:: c++
- void *f(int n) {
- for(;;) {
- print(n++);
- <suspend> // returns a coroutine handle on first suspend
- }
- }
- This coroutine calls some function `print` with value `n` as an argument and
- suspends execution. Every time this coroutine resumes, it calls `print` again
- with an argument one bigger than the last time. This coroutine never completes
- by itself and must be destroyed explicitly. If we use this coroutine with the
- `main` shown in the previous section, it will call `print` with values 4, 5,
- and 6, after which the coroutine will be destroyed.
- The LLVM IR for this coroutine looks like this:
- .. code-block:: llvm
- define i8* @f(i32 %n) {
- entry:
- %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
- %size = call i32 @llvm.coro.size.i32()
- %alloc = call i8* @malloc(i32 %size)
- %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
- br label %loop
- loop:
- %n.val = phi i32 [ %n, %entry ], [ %inc, %loop ]
- %inc = add nsw i32 %n.val, 1
- call void @print(i32 %n.val)
- %0 = call i8 @llvm.coro.suspend(token none, i1 false)
- switch i8 %0, label %suspend [i8 0, label %loop
- i8 1, label %cleanup]
- cleanup:
- %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
- call void @free(i8* %mem)
- br label %suspend
- suspend:
- %unused = call i1 @llvm.coro.end(i8* %hdl, i1 false)
- ret i8* %hdl
- }
- The `entry` block establishes the coroutine frame. The `coro.size`_ intrinsic is
- lowered to a constant representing the size required for the coroutine frame.
- The `coro.begin`_ intrinsic initializes the coroutine frame and returns the
- coroutine handle. The second parameter of `coro.begin` is given a block of memory
- to be used if the coroutine frame needs to be allocated dynamically.
- The `coro.id`_ intrinsic serves as the coroutine identity, which is useful when the
- `coro.begin`_ intrinsic gets duplicated by optimization passes such as
- jump-threading.
- The `cleanup` block destroys the coroutine frame. The `coro.free`_ intrinsic,
- given the coroutine handle, returns a pointer to the memory block to be freed or
- `null` if the coroutine frame was not allocated dynamically. The `cleanup`
- block is entered when the coroutine runs to completion by itself or is destroyed
- via a call to the `coro.destroy`_ intrinsic.
- The `suspend` block contains code to be executed when the coroutine runs to
- completion or is suspended. The `coro.end`_ intrinsic marks the point where
- a coroutine needs to return control back to the caller if it is not an initial
- invocation of the coroutine.
- The `loop` block represents the body of the coroutine. The `coro.suspend`_
- intrinsic in combination with the following switch indicates what happens to
- control flow when a coroutine is suspended (default case), resumed (case 0) or
- destroyed (case 1).
- Coroutine Transformation
- ------------------------
- One of the steps of coroutine lowering is building the coroutine frame. The
- def-use chains are analyzed to determine which objects need to be kept alive across
- suspend points. In the coroutine shown in the previous section, the use of virtual register
- `%n.val` is separated from its definition by a suspend point. Therefore, it
- cannot reside in the stack frame, since the latter goes away once the coroutine
- is suspended and control is returned back to the caller. An i32 slot is
- allocated in the coroutine frame and `%n.val` is spilled and reloaded from that
- slot as needed.
- We also store addresses of the resume and destroy functions so that the
- `coro.resume` and `coro.destroy` intrinsics can resume and destroy the coroutine
- when its identity cannot be determined statically at compile time. For our
- example, the coroutine frame will be:
- .. code-block:: llvm
- %f.frame = type { void (%f.frame*)*, void (%f.frame*)*, i32 }
- After resume and destroy parts are outlined, function `f` will contain only the
- code responsible for creation and initialization of the coroutine frame and
- execution of the coroutine until a suspend point is reached:
- .. code-block:: llvm
- define i8* @f(i32 %n) {
- entry:
- %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
- %alloc = call noalias i8* @malloc(i32 24)
- %0 = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
- %frame = bitcast i8* %0 to %f.frame*
- %1 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 0
- store void (%f.frame*)* @f.resume, void (%f.frame*)** %1
- %2 = getelementptr %f.frame, %f.frame* %frame, i32 0, i32 1
- store void (%f.frame*)* @f.destroy, void (%f.frame*)** %2
-
- %inc = add nsw i32 %n, 1
- %inc.spill.addr = getelementptr inbounds %f.frame, %f.frame* %frame, i32 0, i32 2
- store i32 %inc, i32* %inc.spill.addr
- call void @print(i32 %n)
-
- ret i8* %0
- }
- Outlined resume part of the coroutine will reside in function `f.resume`:
- .. code-block:: llvm
- define internal fastcc void @f.resume(%f.frame* %frame.ptr.resume) {
- entry:
- %inc.spill.addr = getelementptr %f.frame, %f.frame* %frame.ptr.resume, i64 0, i32 2
- %inc.spill = load i32, i32* %inc.spill.addr, align 4
- %inc = add i32 %inc.spill, 1
- store i32 %inc, i32* %inc.spill.addr, align 4
- tail call void @print(i32 %inc.spill)
- ret void
- }
- Whereas function `f.destroy` will contain the cleanup code for the coroutine:
- .. code-block:: llvm
- define internal fastcc void @f.destroy(%f.frame* %frame.ptr.destroy) {
- entry:
- %0 = bitcast %f.frame* %frame.ptr.destroy to i8*
- tail call void @free(i8* %0)
- ret void
- }
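To make the split concrete, the three outlined functions can be modeled by hand in C++. This is a sketch of the intended behavior rather than compiler output: `FFrame`, `f_resume`, and `f_destroy` are hypothetical names chosen to mirror `%f.frame`, `@f.resume`, and `@f.destroy`, and `print` is replaced by a stand-in that records its arguments.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Stand-in for the external @print function; it records its arguments so the
// behavior can be checked.
static std::vector<int> printed;
static void print(int v) { printed.push_back(v); }

// Mirrors %f.frame: resume pointer, destroy pointer, then the spilled value.
struct FFrame {
  void (*resume)(FFrame *);
  void (*destroy)(FFrame *);
  int inc_spill;
};

// Mirrors @f.resume: reload the spilled value, spill the next one, and print.
static void f_resume(FFrame *frame) {
  int n = frame->inc_spill;
  frame->inc_spill = n + 1;
  print(n);
}

// Mirrors @f.destroy: release the frame.
static void f_destroy(FFrame *frame) { std::free(frame); }

// Mirrors the ramp function @f: allocate and initialize the frame, then run
// up to the first suspend point.
FFrame *f(int n) {
  FFrame *frame = static_cast<FFrame *>(std::malloc(sizeof(FFrame)));
  assert(frame && "allocation failed");
  frame->resume = f_resume;
  frame->destroy = f_destroy;
  frame->inc_spill = n + 1;
  print(n);
  return frame;
}
```

Calling `f(4)`, resuming twice through the stored pointer, and then destroying the frame reproduces the behavior of the `main` function from the introduction.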
- Avoiding Heap Allocations
- -------------------------
-
- A particular coroutine usage pattern, illustrated by the `main` function in the
- overview section, is one where a coroutine is created, manipulated, and
- destroyed by the same calling function. This pattern is common for coroutines
- implementing the RAII idiom and is suitable for the allocation elision
- optimization, which avoids dynamic allocation by storing the coroutine frame as
- a static `alloca` in its caller.
- In the entry block, we will call the `coro.alloc`_ intrinsic, which returns `true`
- when dynamic allocation is required, and `false` if dynamic allocation is
- elided.
- .. code-block:: llvm
- entry:
- %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
- %need.dyn.alloc = call i1 @llvm.coro.alloc(token %id)
- br i1 %need.dyn.alloc, label %dyn.alloc, label %coro.begin
- dyn.alloc:
- %size = call i32 @llvm.coro.size.i32()
- %alloc = call i8* @CustomAlloc(i32 %size)
- br label %coro.begin
- coro.begin:
- %phi = phi i8* [ null, %entry ], [ %alloc, %dyn.alloc ]
- %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %phi)
- In the cleanup block, we will make freeing the coroutine frame conditional on
- the `coro.free`_ intrinsic. If allocation is elided, `coro.free`_ returns `null`,
- thus skipping the deallocation code:
- .. code-block:: llvm
- cleanup:
- %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
- %need.dyn.free = icmp ne i8* %mem, null
- br i1 %need.dyn.free, label %dyn.free, label %if.end
- dyn.free:
- call void @CustomFree(i8* %mem)
- br label %if.end
- if.end:
- ...
- With allocations and deallocations represented as described above, after the
- coroutine heap allocation elision optimization, the resulting main will be:
- .. code-block:: llvm
- define i32 @main() {
- entry:
- call void @print(i32 4)
- call void @print(i32 5)
- call void @print(i32 6)
- ret i32 0
- }
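The conditional allocation and deallocation protocol can be imitated at run time in C++. The sketch below is a loose analogy under stated assumptions: LLVM resolves `coro.alloc` and `coro.free` statically during lowering, whereas this model carries a hypothetical `heap_allocated` flag in the frame, and `coro_begin`, `coro_free`, `coro_cleanup`, and `live_heap_frames` are illustrative names only.

```cpp
#include <cassert>
#include <cstdlib>

struct Frame {
  bool heap_allocated;  // hypothetical flag; LLVM decides this at compile time
  int value;
};

static int live_heap_frames = 0;  // bookkeeping for the example only

// Models the entry block. `caller_storage` is non-null when the allocation has
// been elided and the frame lives in the caller's stack frame.
Frame *coro_begin(Frame *caller_storage, int value) {
  bool need_dyn_alloc = (caller_storage == nullptr);  // models coro.alloc
  Frame *frame = caller_storage;
  if (need_dyn_alloc) {
    frame = static_cast<Frame *>(std::malloc(sizeof(Frame)));
    assert(frame && "allocation failed");
    ++live_heap_frames;
  }
  frame->heap_allocated = need_dyn_alloc;
  frame->value = value;
  return frame;
}

// Models coro.free: returns the memory to release, or null if elided.
void *coro_free(Frame *frame) {
  return frame->heap_allocated ? frame : nullptr;
}

// Models the cleanup block: deallocation is skipped when coro.free is null.
void coro_cleanup(Frame *frame) {
  if (void *mem = coro_free(frame)) {
    std::free(mem);
    --live_heap_frames;
  }
}
```

The point of the shape is that both the allocation and the deallocation sit behind a check, so removing the heap path leaves a frame that is just a local variable of the caller.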
- Multiple Suspend Points
- -----------------------
- Let's consider a coroutine that has more than one suspend point:
- .. code-block:: c++
- void *f(int n) {
- for(;;) {
- print(n++);
- <suspend>
- print(-n);
- <suspend>
- }
- }
- Matching LLVM code would look like (with the rest of the code remaining the same
- as the code in the previous section):
- .. code-block:: llvm
- loop:
- %n.addr = phi i32 [ %n, %entry ], [ %inc, %loop.resume ]
- call void @print(i32 %n.addr)
- %2 = call i8 @llvm.coro.suspend(token none, i1 false)
- switch i8 %2, label %suspend [i8 0, label %loop.resume
- i8 1, label %cleanup]
- loop.resume:
- %inc = add nsw i32 %n.addr, 1
- %sub = xor i32 %n.addr, -1
- call void @print(i32 %sub)
- %3 = call i8 @llvm.coro.suspend(token none, i1 false)
- switch i8 %3, label %suspend [i8 0, label %loop
- i8 1, label %cleanup]
- In this case, the coroutine frame would include a suspend index that will
- indicate at which suspend point the coroutine needs to resume. The resume
- function will use an index to jump to an appropriate basic block and will look
- as follows:
- .. code-block:: llvm
- define internal fastcc void @f.Resume(%f.Frame* %FramePtr) {
- entry.Resume:
- %index.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i64 0, i32 2
- %index = load i8, i8* %index.addr, align 1
- %switch = icmp eq i8 %index, 0
- %n.addr = getelementptr inbounds %f.Frame, %f.Frame* %FramePtr, i64 0, i32 3
- %n = load i32, i32* %n.addr, align 4
- br i1 %switch, label %loop.resume, label %loop
- loop.resume:
- %sub = xor i32 %n, -1
- call void @print(i32 %sub)
- br label %suspend
- loop:
- %inc = add nsw i32 %n, 1
- store i32 %inc, i32* %n.addr, align 4
- tail call void @print(i32 %inc)
- br label %suspend
- suspend:
- %storemerge = phi i8 [ 0, %loop ], [ 1, %loop.resume ]
- store i8 %storemerge, i8* %index.addr, align 1
- ret void
- }
- If different cleanup code needs to get executed for different suspend points,
- a similar switch will be in the `f.destroy` function.
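The role of the suspend index can also be shown with a hand-written C++ state machine for the same two-suspend loop. This is a simplified model, not the generated code: `Frame`, `start`, and `resume` are hypothetical names, `print` records its arguments instead of printing, and the frame stores `n` after the increment rather than before it.

```cpp
#include <cassert>
#include <vector>

// Stand-in for @print; records its arguments so the order can be checked.
static std::vector<int> printed;
static void print(int v) { printed.push_back(v); }

struct Frame {
  unsigned char index;  // which suspend point the coroutine is parked at
  int n;                // value of n as seen after `n++`
};

// Ramp: runs `print(n++)` and parks at the first suspend point (index 0).
void start(Frame &frame, int n) {
  print(n);
  frame.n = n + 1;
  frame.index = 0;
}

// Resume: switch over the stored index to get back to the right place.
void resume(Frame &frame) {
  switch (frame.index) {
  case 0:  // after the first <suspend>: print(-n), park at the second
    print(-frame.n);
    frame.index = 1;
    break;
  case 1:  // after the second <suspend>: loop around to print(n++)
    print(frame.n);
    frame.n += 1;
    frame.index = 0;
    break;
  default:
    assert(false && "invalid suspend index");
  }
}
```

Starting with `n = 4` and resuming three times records 4, -5, 5, and -6, which matches the pseudo-code above.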
- .. note ::
- Using a suspend index in the coroutine state and having a switch in `f.resume` and
- `f.destroy` is one of the possible implementation strategies. We explored
- another option where distinct `f.resume1`, `f.resume2`, etc. functions are created for
- every suspend point, and instead of storing an index, the resume and destroy
- function pointers are updated at every suspend. Early testing showed that the
- current approach is easier on the optimizer than the latter, so it is the
- lowering strategy implemented at the moment.
- Distinct Save and Suspend
- -------------------------
- In the previous example, setting a resume index (or some other state change that
- needs to happen to prepare a coroutine for resumption) happens at the same time as
- the suspension of the coroutine. However, in certain cases, it is necessary to control
- separately when a coroutine is prepared for resumption and when it is suspended.
- In the following example, a coroutine represents some activity that is driven
- by completions of asynchronous operations `async_op1` and `async_op2` which get
- a coroutine handle as a parameter and resume the coroutine once async
- operation is finished.
- .. code-block:: text
- void g() {
- for (;;) {
- if (cond()) {
- async_op1(<coroutine-handle>); // will resume once async_op1 completes
- <suspend>
- do_one();
- }
- else {
- async_op2(<coroutine-handle>); // will resume once async_op2 completes
- <suspend>
- do_two();
- }
- }
- }
- In this case, the coroutine should be ready for resumption prior to the call to
- `async_op1` or `async_op2`. The `coro.save`_ intrinsic is used to indicate the
- point when the coroutine should be ready for resumption (namely, when a resume index
- should be stored in the coroutine frame, so that it can be resumed at the
- correct resume point):
- .. code-block:: llvm
- if.true:
- %save1 = call token @llvm.coro.save(i8* %hdl)
- call void @async_op1(i8* %hdl)
- %suspend1 = call i8 @llvm.coro.suspend(token %save1, i1 false)
- switch i8 %suspend1, label %suspend [i8 0, label %resume1
- i8 1, label %cleanup]
- if.false:
- %save2 = call token @llvm.coro.save(i8* %hdl)
- call void @async_op2(i8* %hdl)
- %suspend2 = call i8 @llvm.coro.suspend(token %save2, i1 false)
- switch i8 %suspend2, label %suspend [i8 0, label %resume2
- i8 1, label %cleanup]
- .. _coroutine promise:
- Coroutine Promise
- -----------------
- A coroutine author or a frontend may designate a distinguished `alloca` that can
- be used to communicate with the coroutine. This distinguished alloca is called
- the **coroutine promise** and is provided as the second parameter to the
- `coro.id`_ intrinsic.
- The following coroutine designates a 32-bit integer `promise` and uses it to
- store the current value produced by the coroutine.
- .. code-block:: llvm
- define i8* @f(i32 %n) {
- entry:
- %promise = alloca i32
- %pv = bitcast i32* %promise to i8*
- %id = call token @llvm.coro.id(i32 0, i8* %pv, i8* null, i8* null)
- %need.dyn.alloc = call i1 @llvm.coro.alloc(token %id)
- br i1 %need.dyn.alloc, label %dyn.alloc, label %coro.begin
- dyn.alloc:
- %size = call i32 @llvm.coro.size.i32()
- %alloc = call i8* @malloc(i32 %size)
- br label %coro.begin
- coro.begin:
- %phi = phi i8* [ null, %entry ], [ %alloc, %dyn.alloc ]
- %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %phi)
- br label %loop
- loop:
- %n.val = phi i32 [ %n, %coro.begin ], [ %inc, %loop ]
- %inc = add nsw i32 %n.val, 1
- store i32 %n.val, i32* %promise
- %0 = call i8 @llvm.coro.suspend(token none, i1 false)
- switch i8 %0, label %suspend [i8 0, label %loop
- i8 1, label %cleanup]
- cleanup:
- %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
- call void @free(i8* %mem)
- br label %suspend
- suspend:
- %unused = call i1 @llvm.coro.end(i8* %hdl, i1 false)
- ret i8* %hdl
- }
- A coroutine consumer can rely on the `coro.promise`_ intrinsic to access the
- coroutine promise.
- .. code-block:: llvm
- define i32 @main() {
- entry:
- %hdl = call i8* @f(i32 4)
- %promise.addr.raw = call i8* @llvm.coro.promise(i8* %hdl, i32 4, i1 false)
- %promise.addr = bitcast i8* %promise.addr.raw to i32*
- %val0 = load i32, i32* %promise.addr
- call void @print(i32 %val0)
- call void @llvm.coro.resume(i8* %hdl)
- %val1 = load i32, i32* %promise.addr
- call void @print(i32 %val1)
- call void @llvm.coro.resume(i8* %hdl)
- %val2 = load i32, i32* %promise.addr
- call void @print(i32 %val2)
- call void @llvm.coro.destroy(i8* %hdl)
- ret i32 0
- }
- After the example in this section is compiled, the result of the compilation will be:
- .. code-block:: llvm
- define i32 @main() {
- entry:
- tail call void @print(i32 4)
- tail call void @print(i32 5)
- tail call void @print(i32 6)
- ret i32 0
- }
- .. _final:
- .. _final suspend:
- Final Suspend
- -------------
- A coroutine author or a frontend may designate a particular suspend to be final,
- by setting the second argument of the `coro.suspend`_ intrinsic to `true`.
- Such a suspend point has two properties:
- * it is possible to check whether a suspended coroutine is at the final suspend
- point via the `coro.done`_ intrinsic;
- * a resumption of a coroutine stopped at the final suspend point leads to
- undefined behavior. The only possible action for a coroutine at a final
- suspend point is destroying it via the `coro.destroy`_ intrinsic.
- From the user perspective, the final suspend point represents the idea of a
- coroutine reaching the end. From the compiler perspective, it is an optimization
- opportunity for reducing the number of resume points (and therefore switch cases) in
- the resume function.
- The following is an example of a function that keeps resuming the coroutine
- until the final suspend point is reached after which point the coroutine is
- destroyed:
- .. code-block:: llvm
- define i32 @main() {
- entry:
- %hdl = call i8* @f(i32 4)
- br label %while
- while:
- call void @llvm.coro.resume(i8* %hdl)
- %done = call i1 @llvm.coro.done(i8* %hdl)
- br i1 %done, label %end, label %while
- end:
- call void @llvm.coro.destroy(i8* %hdl)
- ret i32 0
- }
- Usually, the final suspend point is a frontend-injected suspend point that does not
- correspond to any explicitly authored suspend point of the high-level language.
- For example, for a Python generator that has only one suspend point:
- .. code-block:: python
- def coroutine(n):
- for i in range(n):
- yield i
- A Python frontend would inject two more suspend points, so that the actual code
- looks like this:
- .. code-block:: c
- void* coroutine(int n) {
- int current_value;
- <designate current_value to be coroutine promise>
- <SUSPEND> // injected suspend point, so that the coroutine starts suspended
- for (int i = 0; i < n; ++i) {
- current_value = i; <SUSPEND>; // corresponds to "yield i"
- }
- <SUSPEND final=true> // injected final suspend point
- }
- and the Python iterator `__next__` would look like:
- .. code-block:: c++
- int __next__(void* hdl) {
- coro.resume(hdl);
- if (coro.done(hdl)) throw StopIteration();
- return *(int*)coro.promise(hdl, 4, false);
- }
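The generator protocol above can be modeled end to end with a small hand-written state machine. The sketch assumes a frame with one `int` promise and three states (initial suspend, parked at a yield, final suspend). `Generator`, `make_coroutine`, `done`, `resume`, and `generator_next` are hypothetical names, and a `false` return stands in for raising `StopIteration`.

```cpp
#include <cassert>
#include <vector>

// Hand-written model of the lowered generator.
struct Generator {
  int promise;  // current_value, designated as the coroutine promise
  int state;    // 0 = initial suspend, 1 = parked at a yield, 2 = final suspend
  int i;        // loop counter
  int n;        // loop bound
};

// Models the ramp function: the coroutine starts out suspended.
Generator make_coroutine(int n) { return Generator{0, 0, 0, n}; }

// Models coro.done: true only at the final suspend point.
bool done(const Generator &g) { return g.state == 2; }

// Models coro.resume.
void resume(Generator &g) {
  assert(!done(g) && "resuming at the final suspend point is undefined");
  if (g.state == 1) ++g.i;  // finish the loop iteration that yielded
  if (g.i < g.n) {
    g.promise = g.i;        // current_value = i
    g.state = 1;            // <SUSPEND>
  } else {
    g.state = 2;            // <SUSPEND final=true>
  }
}

// Models __next__: returns false where Python would raise StopIteration.
bool generator_next(Generator &g, int &out) {
  resume(g);
  if (done(g)) return false;
  out = g.promise;
  return true;
}
```

The injected initial suspend is what lets the first call to `generator_next` run the loop up to the first `yield`, and the injected final suspend is what lets the caller detect exhaustion before reading the promise.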
- Intrinsics
- ==========
- Coroutine Manipulation Intrinsics
- ---------------------------------
- Intrinsics described in this section are used to manipulate an existing
- coroutine. They can be used in any function which happens to have a pointer
- to a `coroutine frame`_ or a pointer to a `coroutine promise`_.
- .. _coro.destroy:
- 'llvm.coro.destroy' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- Syntax:
- """""""
- ::
- declare void @llvm.coro.destroy(i8* <handle>)
- Overview:
- """""""""
- The '``llvm.coro.destroy``' intrinsic destroys a suspended
- switched-resume coroutine.
- Arguments:
- """"""""""
- The argument is a coroutine handle to a suspended coroutine.
- Semantics:
- """"""""""
- When possible, the `coro.destroy` intrinsic is replaced with a direct call to
- the coroutine destroy function. Otherwise it is replaced with an indirect call
- based on the function pointer for the destroy function stored in the coroutine
- frame. Destroying a coroutine that is not suspended leads to undefined behavior.
- .. _coro.resume:
- 'llvm.coro.resume' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare void @llvm.coro.resume(i8* <handle>)
- Overview:
- """""""""
- The '``llvm.coro.resume``' intrinsic resumes a suspended switched-resume coroutine.
- Arguments:
- """"""""""
- The argument is a handle to a suspended coroutine.
- Semantics:
- """"""""""
- When possible, the `coro.resume` intrinsic is replaced with a direct call to the
- coroutine resume function. Otherwise it is replaced with an indirect call based
- on the function pointer for the resume function stored in the coroutine frame.
- Resuming a coroutine that is not suspended leads to undefined behavior.
- .. _coro.done:
- 'llvm.coro.done' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i1 @llvm.coro.done(i8* <handle>)
- Overview:
- """""""""
- The '``llvm.coro.done``' intrinsic checks whether a suspended
- switched-resume coroutine is at the final suspend point or not.
- Arguments:
- """"""""""
- The argument is a handle to a suspended coroutine.
- Semantics:
- """"""""""
- Using this intrinsic on a coroutine that does not have a `final suspend`_ point
- or on a coroutine that is not suspended leads to undefined behavior.
- .. _coro.promise:
- 'llvm.coro.promise' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i8* @llvm.coro.promise(i8* <ptr>, i32 <alignment>, i1 <from>)
- Overview:
- """""""""
- The '``llvm.coro.promise``' intrinsic obtains a pointer to a
- `coroutine promise`_ given a switched-resume coroutine handle and vice versa.
- Arguments:
- """"""""""
- The first argument is a handle to a coroutine if `from` is false. Otherwise,
- it is a pointer to a coroutine promise.
- The second argument is the alignment requirement of the promise.
- If a frontend designated `%promise = alloca i32` as a promise, the alignment
- argument to `coro.promise` should be the alignment of `i32` on the target
- platform. If a frontend designated `%promise = alloca i32, align 16` as a
- promise, the alignment argument should be 16.
- This argument only accepts constants.
- The third argument is a boolean indicating the direction of the transformation.
- If `from` is true, the intrinsic returns a coroutine handle given a pointer
- to a promise. If `from` is false, the intrinsic returns a pointer to a promise
- given a coroutine handle. This argument only accepts constants.
- Semantics:
- """"""""""
- Using this intrinsic on a coroutine that does not have a coroutine promise
- leads to undefined behavior. It is possible to read and modify the coroutine
- promise of the coroutine that is currently executing. The coroutine author and
- the coroutine user are responsible for making sure there are no data races.
- Example:
- """"""""
- .. code-block:: llvm
- define i8* @f(i32 %n) {
- entry:
- %promise = alloca i32
- %pv = bitcast i32* %promise to i8*
- ; the second argument to coro.id points to the coroutine promise.
- %id = call token @llvm.coro.id(i32 0, i8* %pv, i8* null, i8* null)
- ...
- %hdl = call noalias i8* @llvm.coro.begin(token %id, i8* %alloc)
- ...
- store i32 42, i32* %promise ; store something into the promise
- ...
- ret i8* %hdl
- }
- define i32 @main() {
- entry:
- %hdl = call i8* @f(i32 4) ; starts the coroutine and returns its handle
- %promise.addr.raw = call i8* @llvm.coro.promise(i8* %hdl, i32 4, i1 false)
- %promise.addr = bitcast i8* %promise.addr.raw to i32*
- %val = load i32, i32* %promise.addr ; load a value from the promise
- call void @print(i32 %val)
- call void @llvm.coro.destroy(i8* %hdl)
- ret i32 0
- }
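The two directions of `coro.promise` amount to adding or subtracting a fixed offset. The C++ sketch below assumes the promise is placed right after a two-pointer header, rounded up to the promise alignment; that layout and the helper names `align_to`, `promise_offset`, and `coro_promise_model` are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Rounds `value` up to a multiple of `align` (which must be a power of two).
static std::size_t align_to(std::size_t value, std::size_t align) {
  return (value + align - 1) & ~(align - 1);
}

// Assumed header: a resume and a destroy function pointer.
static std::size_t promise_offset(std::size_t alignment) {
  return align_to(2 * sizeof(void*), alignment);
}

// from == false: coroutine handle -> promise pointer.
// from == true:  promise pointer  -> coroutine handle.
static void* coro_promise_model(void* ptr, std::size_t alignment, bool from) {
  char* p = static_cast<char*>(ptr);
  std::size_t off = promise_offset(alignment);
  return from ? p - off : p + off;
}
```

Because the offset depends on the alignment, both directions must be given the same alignment constant for the round trip to land back on the original handle.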
- .. _coroutine intrinsics:
- Coroutine Structure Intrinsics
- ------------------------------
- Intrinsics described in this section are used within a coroutine to describe
- the coroutine structure. They should not be used outside of a coroutine.
- .. _coro.size:
- 'llvm.coro.size' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i32 @llvm.coro.size.i32()
- declare i64 @llvm.coro.size.i64()
- Overview:
- """""""""
- The '``llvm.coro.size``' intrinsic returns the number of bytes
- required to store a `coroutine frame`_. This is only supported for
- switched-resume coroutines.
- Arguments:
- """"""""""
- None
- Semantics:
- """"""""""
- The `coro.size` intrinsic is lowered to a constant representing the size of
- the coroutine frame.
- .. _coro.begin:
- 'llvm.coro.begin' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i8* @llvm.coro.begin(token <id>, i8* <mem>)
- Overview:
- """""""""
- The '``llvm.coro.begin``' intrinsic returns the address of the coroutine frame.
- Arguments:
- """"""""""
- The first argument is a token returned by a call to '``llvm.coro.id``'
- identifying the coroutine.
- The second argument is a pointer to a block of memory where the coroutine frame
- will be stored if it is allocated dynamically. This pointer is ignored
- for returned-continuation coroutines.
- Semantics:
- """"""""""
- Depending on the alignment requirements of the objects in the coroutine frame
- and on code-size considerations, the pointer returned from `coro.begin`
- may be at an offset from the `%mem` argument. (This can be beneficial if
- instructions that express relative access to data can be encoded more compactly
- with small positive and negative offsets.)
- A frontend should emit exactly one `coro.begin` intrinsic per coroutine.
- .. _coro.free:
- 'llvm.coro.free' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i8* @llvm.coro.free(token %id, i8* <frame>)
- Overview:
- """""""""
- The '``llvm.coro.free``' intrinsic returns a pointer to the block of memory where
- the coroutine frame is stored, or `null` if this instance of the coroutine did not
- use dynamically allocated memory for its coroutine frame. This intrinsic is not
- supported for returned-continuation coroutines.
- Arguments:
- """"""""""
- The first argument is a token returned by a call to '``llvm.coro.id``'
- identifying the coroutine.
- The second argument is a pointer to the coroutine frame. This should be the same
- pointer that was returned by the prior `coro.begin` call.
- Example (custom deallocation function):
- """""""""""""""""""""""""""""""""""""""
- .. code-block:: llvm
- cleanup:
- %mem = call i8* @llvm.coro.free(token %id, i8* %frame)
- %mem_not_null = icmp ne i8* %mem, null
- br i1 %mem_not_null, label %if.then, label %if.end
- if.then:
- call void @CustomFree(i8* %mem)
- br label %if.end
- if.end:
- ret void
- Example (standard deallocation functions):
- """"""""""""""""""""""""""""""""""""""""""
- .. code-block:: llvm
- cleanup:
- %mem = call i8* @llvm.coro.free(token %id, i8* %frame)
- call void @free(i8* %mem)
- ret void
- .. _coro.alloc:
- 'llvm.coro.alloc' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i1 @llvm.coro.alloc(token <id>)
- Overview:
- """""""""
- The '``llvm.coro.alloc``' intrinsic returns `true` if dynamic allocation is
- required to obtain memory for the coroutine frame and `false` otherwise.
- This is not supported for returned-continuation coroutines.
- Arguments:
- """"""""""
- The first argument is a token returned by a call to '``llvm.coro.id``'
- identifying the coroutine.
- Semantics:
- """"""""""
- A frontend should emit at most one `coro.alloc` intrinsic per coroutine.
- The intrinsic is used to suppress dynamic allocation of the coroutine frame
- when possible.
- Example:
- """"""""
- .. code-block:: llvm
- entry:
- %id = call token @llvm.coro.id(i32 0, i8* null, i8* null, i8* null)
- %dyn.alloc.required = call i1 @llvm.coro.alloc(token %id)
- br i1 %dyn.alloc.required, label %coro.alloc, label %coro.begin
- coro.alloc:
- %frame.size = call i32 @llvm.coro.size.i32()
- %alloc = call i8* @MyAlloc(i32 %frame.size)
- br label %coro.begin
- coro.begin:
- %phi = phi i8* [ null, %entry ], [ %alloc, %coro.alloc ]
- %frame = call i8* @llvm.coro.begin(token %id, i8* %phi)
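The interplay between `coro.alloc`, `coro.begin`, and `coro.free` can be modeled in C++ with a flag standing in for the value that `coro.alloc` is eventually replaced with. All names here (`coro_begin_model`, `coro_free_model`, `frame_lives_in_caller`) are hypothetical, and the caller-provided storage is an assumption about where an elided frame would live.

```cpp
#include <cassert>
#include <cstdlib>

struct Frame { int state; };

// `dyn_alloc_required` plays the role of the i1 returned by coro.alloc.
// `caller_storage` is used only when dynamic allocation is suppressed.
static Frame* coro_begin_model(bool dyn_alloc_required, Frame* caller_storage) {
  void* mem = nullptr;                    // the phi's [ null, %entry ] input
  if (dyn_alloc_required)
    mem = std::malloc(sizeof(Frame));     // the coro.alloc block
  Frame* frame = mem ? static_cast<Frame*>(mem) : caller_storage;
  frame->state = 0;
  return frame;
}

// Model of coro.free: the block to deallocate, or null if nothing was
// allocated dynamically.
static void* coro_free_model(bool dyn_alloc_required, Frame* frame) {
  return dyn_alloc_required ? frame : nullptr;
}

// Returns true when the frame ended up in the caller's storage.
static bool frame_lives_in_caller(bool dyn_alloc_required) {
  Frame local;
  Frame* frame = coro_begin_model(dyn_alloc_required, &local);
  bool in_caller = (frame == &local);
  // free(nullptr) is a no-op, so no null check is needed here.
  std::free(coro_free_model(dyn_alloc_required, frame));
  return in_caller;
}
```

The unconditional `std::free` mirrors the standard-deallocation example for `coro.free`: a deallocation function that tolerates null does not need the explicit comparison that a custom deallocator might.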
- .. _coro.noop:
- 'llvm.coro.noop' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i8* @llvm.coro.noop()
- Overview:
- """""""""
- The '``llvm.coro.noop``' intrinsic returns an address of the coroutine frame of
- a coroutine that does nothing when resumed or destroyed.
- Arguments:
- """"""""""
- None
- Semantics:
- """"""""""
- This intrinsic is lowered to refer to a private constant coroutine frame. The
- resume and destroy handlers for this frame are empty functions that do nothing.
- Note that `llvm.coro.noop` may return different pointers in different translation units.
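A rough C++ analogue of the no-op coroutine frame is a file-local constant holding two pointers to an empty function. The struct layout and the names `NoopFrame`, `noop`, `kNoopFrame`, and `coro_noop_model` are assumptions for illustration.

```cpp
#include <cassert>

// Hypothetical frame header with only the two handlers.
struct NoopFrame {
  void (*resume_fn)(const void*);
  void (*destroy_fn)(const void*);
};

// Handler that does nothing; shared by resume and destroy in this sketch.
static void noop(const void*) {}

// `static` gives every translation unit its own copy, which is why the
// address can differ from one translation unit to another.
static const NoopFrame kNoopFrame = {noop, noop};

static const void* coro_noop_model() { return &kNoopFrame; }
```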
- .. _coro.frame:
- 'llvm.coro.frame' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i8* @llvm.coro.frame()
- Overview:
- """""""""
- The '``llvm.coro.frame``' intrinsic returns an address of the coroutine frame of
- the enclosing coroutine.
- Arguments:
- """"""""""
- None
- Semantics:
- """"""""""
- This intrinsic is lowered to refer to the `coro.begin`_ instruction. This is
- a frontend convenience intrinsic that makes it easier to refer to the
- coroutine frame.
- .. _coro.id:
- 'llvm.coro.id' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare token @llvm.coro.id(i32 <align>, i8* <promise>, i8* <coroaddr>,
- i8* <fnaddrs>)
- Overview:
- """""""""
- The '``llvm.coro.id``' intrinsic returns a token identifying a
- switched-resume coroutine.
- Arguments:
- """"""""""
- The first argument provides information on the alignment of the memory returned
- by the allocation function and passed to `coro.begin` as its second argument. If
- this argument is 0, the memory is assumed to be aligned to 2 * sizeof(i8*).
- This argument only accepts constants.
- The second argument, if not `null`, designates a particular alloca instruction
- to be a `coroutine promise`_.
- The third argument is `null` coming out of the frontend. The CoroEarly pass sets
- this argument to point to the function this coro.id belongs to.
- The fourth argument is `null` before the coroutine is split, and is later replaced
- with a pointer to a private global constant array containing function pointers to
- the outlined resume and destroy parts of the coroutine.
- Semantics:
- """"""""""
- The purpose of this intrinsic is to tie together the `coro.id`, `coro.alloc` and
- `coro.begin` calls belonging to the same coroutine, to prevent optimization passes
- from duplicating any of these instructions unless the entire body of the coroutine
- is duplicated.
- A frontend should emit exactly one `coro.id` intrinsic per coroutine.
- .. _coro.id.retcon:
- 'llvm.coro.id.retcon' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare token @llvm.coro.id.retcon(i32 <size>, i32 <align>, i8* <buffer>,
- i8* <continuation prototype>,
- i8* <alloc>, i8* <dealloc>)
- Overview:
- """""""""
- The '``llvm.coro.id.retcon``' intrinsic returns a token identifying a
- multiple-suspend returned-continuation coroutine.
- The 'result-type sequence' of the coroutine is defined as follows:
- - if the return type of the coroutine function is ``void``, it is the
- empty sequence;
- - if the return type of the coroutine function is a ``struct``, it is the
- element types of that ``struct`` in order;
- - otherwise, it is just the return type of the coroutine function.
- The first element of the result-type sequence must be a pointer type;
- continuation functions will be coerced to this type. The rest of
- the sequence are the 'yield types', and any suspends in the coroutine
- must take arguments of these types.
- Arguments:
- """"""""""
- The first and second arguments are the expected size and alignment of
- the buffer provided as the third argument. They must be constant.
- The fourth argument must be a reference to a global function, called
- the 'continuation prototype function'. The type, calling convention,
- and attributes of any continuation functions will be taken from this
- declaration. The return type of the prototype function must match the
- return type of the current function. The first parameter type must be
- a pointer type. The second parameter type must be an integer type;
- it will be used only as a boolean flag.
- The fifth argument must be a reference to a global function that will
- be used to allocate memory. It may not fail, either by returning null
- or throwing an exception. It must take an integer and return a pointer.
- The sixth argument must be a reference to a global function that will
- be used to deallocate memory. It must take a pointer and return ``void``.
- 'llvm.coro.id.retcon.once' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare token @llvm.coro.id.retcon.once(i32 <size>, i32 <align>, i8* <buffer>,
- i8* <prototype>,
- i8* <alloc>, i8* <dealloc>)
- Overview:
- """""""""
- The '``llvm.coro.id.retcon.once``' intrinsic returns a token identifying a
- unique-suspend returned-continuation coroutine.
- Arguments:
- """"""""""
- As for ``llvm.coro.id.retcon``, except that the return type of the
- continuation prototype must be `void` instead of matching the
- coroutine's return type.
- .. _coro.end:
- 'llvm.coro.end' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i1 @llvm.coro.end(i8* <handle>, i1 <unwind>)
- Overview:
- """""""""
- The '``llvm.coro.end``' marks the point where execution of the resume part of
- the coroutine should end and control should return to the caller.
- Arguments:
- """"""""""
- The first argument should refer to the coroutine handle of the enclosing
- coroutine. A frontend is allowed to supply null as the first parameter; in this
- case the `coro-early` pass will replace the null with an appropriate coroutine
- handle value.
- The second argument should be `true` if this coro.end is in the block that is
- part of the unwind sequence leaving the coroutine body due to an exception and
- `false` otherwise.
- Semantics:
- """"""""""
- The purpose of this intrinsic is to allow frontends to mark the cleanup and
- other code that is only relevant during the initial invocation of the coroutine
- and should not be present in resume and destroy parts.
- In returned-continuation lowering, ``llvm.coro.end`` fully destroys the
- coroutine frame. If the second argument is `false`, it also returns from
- the coroutine with a null continuation pointer, and the next instruction
- will be unreachable. If the second argument is `true`, it falls through
- so that the following logic can resume unwinding. In a yield-once
- coroutine, reaching a non-unwind ``llvm.coro.end`` without having first
- reached a ``llvm.coro.suspend.retcon`` has undefined behavior.
- The remainder of this section describes the behavior under switched-resume
- lowering.
- This intrinsic is lowered when a coroutine is split into
- the start, resume and destroy parts. In the start part, it is a no-op;
- in the resume and destroy parts, it is replaced with a `ret void` instruction and
- the rest of the block containing the `coro.end` instruction is discarded.
- In landing pads it is replaced with an appropriate instruction to unwind to the
- caller. The handling of `coro.end` differs depending on whether the target
- uses the landingpad or the WinEH exception model.
- For the landingpad-based exception model, the frontend is expected to use the
- `coro.end`_ intrinsic as follows:
- .. code-block:: llvm
- ehcleanup:
- %InResumePart = call i1 @llvm.coro.end(i8* null, i1 true)
- br i1 %InResumePart, label %eh.resume, label %cleanup.cont
- cleanup.cont:
- ; rest of the cleanup
- eh.resume:
- %exn = load i8*, i8** %exn.slot, align 8
- %sel = load i32, i32* %ehselector.slot, align 4
- %lpad.val = insertvalue { i8*, i32 } undef, i8* %exn, 0
- %lpad.val29 = insertvalue { i8*, i32 } %lpad.val, i32 %sel, 1
- resume { i8*, i32 } %lpad.val29
- The `CoroSplit` pass replaces `coro.end` with ``True`` in the resume functions,
- thus leading to an immediate unwind to the caller, whereas in the start function it
- is replaced with ``False``, thus allowing control to proceed to the rest of the
- cleanup code that is only needed during the initial invocation of the coroutine.
- For the Windows exception handling model, a frontend should attach a funclet bundle
- referring to an enclosing cleanuppad as follows:
- .. code-block:: llvm
- ehcleanup:
- %tok = cleanuppad within none []
- %unused = call i1 @llvm.coro.end(i8* null, i1 true) [ "funclet"(token %tok) ]
- cleanupret from %tok unwind label %RestOfTheCleanup
- The `CoroSplit` pass, if the funclet bundle is present, will insert
- ``cleanupret from %tok unwind to caller`` before
- the `coro.end`_ intrinsic and will remove the rest of the block.
- The following table summarizes the handling of the `coro.end`_ intrinsic.
- +--------------------------+-------------------+-------------------------------+
- |                          | In Start Function | In Resume/Destroy Functions   |
- +--------------------------+-------------------+-------------------------------+
- |unwind=false              | nothing           |``ret void``                   |
- +------------+-------------+-------------------+-------------------------------+
- |            | WinEH       | nothing           |``cleanupret unwind to caller``|
- |unwind=true +-------------+-------------------+-------------------------------+
- |            | Landingpad  | nothing           | nothing                       |
- +------------+-------------+-------------------+-------------------------------+
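The table can also be read as a small decision function. The C++ sketch below encodes it directly; the enum and function names are hypothetical, and an empty string stands for "nothing".

```cpp
#include <cassert>
#include <string>

enum class Part { Start, ResumeOrDestroy };
enum class EHModel { Landingpad, WinEH };

// Returns the instruction that takes the place of coro.end, or "" when
// nothing is emitted. `eh` is ignored when `unwind` is false.
static std::string lower_coro_end(Part part, bool unwind, EHModel eh) {
  if (part == Part::Start)
    return "";
  if (!unwind)
    return "ret void";
  return eh == EHModel::WinEH ? "cleanupret unwind to caller" : "";
}

// Under the landingpad model, the i1 result of coro.end steers the branch:
// true in the resume and destroy functions, false in the start function.
static bool coro_end_result(Part part) {
  return part == Part::ResumeOrDestroy;
}
```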
- .. _coro.suspend:
- .. _suspend points:
- 'llvm.coro.suspend' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i8 @llvm.coro.suspend(token <save>, i1 <final>)
- Overview:
- """""""""
- The '``llvm.coro.suspend``' intrinsic marks the point where execution of a
- switched-resume coroutine is suspended and control is returned back
- to the caller. Conditional branches consuming the result of this
- intrinsic lead to the basic blocks where the coroutine should proceed when
- suspended (-1), resumed (0) or destroyed (1).
- Arguments:
- """"""""""
- The first argument refers to the token of a `coro.save` intrinsic that marks the
- point when the coroutine state is prepared for suspension. If a `none` token is
- passed, the intrinsic behaves as if there were a `coro.save` immediately preceding
- the `coro.suspend` intrinsic.
- The second argument indicates whether this suspension point is `final`_.
- The second argument only accepts constants. If more than one suspend point is
- designated as final, the resume and destroy branches should lead to the same
- basic blocks.
- Example (normal suspend point):
- """""""""""""""""""""""""""""""
- .. code-block:: llvm
- %0 = call i8 @llvm.coro.suspend(token none, i1 false)
- switch i8 %0, label %suspend [i8 0, label %resume
- i8 1, label %cleanup]
- Example (final suspend point):
- """"""""""""""""""""""""""""""
- .. code-block:: llvm
- while.end:
- %s.final = call i8 @llvm.coro.suspend(token none, i1 true)
- switch i8 %s.final, label %suspend [i8 0, label %trap
- i8 1, label %cleanup]
- trap:
- call void @llvm.trap()
- unreachable
- Semantics:
- """"""""""
- If a coroutine that was suspended at the suspend point marked by this intrinsic
- is resumed via `coro.resume`_, control transfers to the basic block
- of the 0-case. If it is resumed via `coro.destroy`_, it proceeds to the
- basic block indicated by the 1-case. To suspend, the coroutine proceeds to the
- default label.
- If the suspend intrinsic is marked as final, the optimizer can consider the resume
- branch (the 0-case) unreachable and can perform optimizations that take advantage
- of that fact.
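The three outcomes of a suspend point can be mimicked with a hand-written C++ dispatcher. This is a sketch under assumptions: the `Frame` type, the trace strings, and the functions `ramp`, `resume`, and `destroy` are made up for illustration and do not reflect the code CoroSplit generates.

```cpp
#include <cassert>
#include <string>

struct Frame {
  std::string trace;  // records which block ran after the suspend point
};

// Value of coro.suspend as seen by the switch that consumes it.
enum SuspendResult { kSuspend = -1, kResume = 0, kDestroy = 1 };

// The switch that follows the single suspend point of this toy coroutine.
static void after_suspend(Frame* f, SuspendResult r) {
  switch (r) {
  case kResume:  f->trace += "resume;";  break;  // 0-case
  case kDestroy: f->trace += "cleanup;"; break;  // 1-case
  default:       f->trace += "suspend;"; break;  // default label
  }
}

// Initial invocation: runs up to the suspend point and returns to the caller.
static void ramp(Frame* f) { after_suspend(f, kSuspend); }
// Entered through coro.resume.
static void resume(Frame* f) { after_suspend(f, kResume); }
// Entered through coro.destroy.
static void destroy(Frame* f) { after_suspend(f, kDestroy); }
```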
- .. _coro.save:
- 'llvm.coro.save' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare token @llvm.coro.save(i8* <handle>)
- Overview:
- """""""""
- The '``llvm.coro.save``' intrinsic marks the point where a coroutine needs to
- update its state so that it is considered suspended (and thus eligible
- for resumption).
- Arguments:
- """"""""""
- The first argument points to a coroutine handle of the enclosing coroutine.
- Semantics:
- """"""""""
- Whatever coroutine state changes are required to enable resumption of
- the coroutine from the corresponding suspend point should be done at the point
- of `coro.save` intrinsic.
- Example:
- """"""""
- Separate save and suspend points are necessary when a coroutine is used to
- represent an asynchronous control flow driven by callbacks representing
- completions of asynchronous operations.
- In such a case, a coroutine should be ready for resumption prior to a call to
- `async_op` function that may trigger resumption of a coroutine from the same or
- a different thread possibly prior to `async_op` call returning control back
- to the coroutine:
- .. code-block:: llvm
- %save1 = call token @llvm.coro.save(i8* %hdl)
- call void @async_op1(i8* %hdl)
- %suspend1 = call i8 @llvm.coro.suspend(token %save1, i1 false)
- switch i8 %suspend1, label %suspend [i8 0, label %resume1
- i8 1, label %cleanup]
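The ordering requirement can be demonstrated with a small C++ model in which the asynchronous operation completes inline and resumes the coroutine before returning. The suspend index stored in the frame and the names `Frame`, `resume`, `async_op1`, and `run` are assumptions of this sketch.

```cpp
#include <cassert>
#include <string>

struct Frame {
  int index = 0;      // suspend point to continue from on resume
  std::string trace;  // records which continuation ran
};

static void resume(Frame* f) {
  switch (f->index) {
  case 0: f->trace += "resume0;"; break;  // stale: an earlier suspend point
  case 1: f->trace += "resume1;"; break;  // the point paired with %save1
  }
}

// Hypothetical asynchronous operation that completes inline and resumes
// the coroutine before returning to it.
static void async_op1(Frame* f) { resume(f); }

// Runs the save/async_op/suspend sequence with or without an early save.
static std::string run(bool save_before_call) {
  Frame f;
  if (save_before_call)
    f.index = 1;  // coro.save: publish the new state first
  async_op1(&f);
  if (!save_before_call)
    f.index = 1;  // too late: the resumption has already happened
  return f.trace;
}
```

With the save performed first, the early resumption continues at the intended point; without it, the resumption observes the state of an earlier suspend point.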
- .. _coro.suspend.retcon:
- 'llvm.coro.suspend.retcon' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i1 @llvm.coro.suspend.retcon(...)
- Overview:
- """""""""
- The '``llvm.coro.suspend.retcon``' intrinsic marks the point where
- execution of a returned-continuation coroutine is suspended and control
- is returned back to the caller.
- ``llvm.coro.suspend.retcon`` does not support separate save points;
- they are not useful when the continuation function is not locally
- accessible. That would be a more appropriate feature for a ``passcon``
- lowering that is not yet implemented.
- Arguments:
- """"""""""
- The types of the arguments must exactly match the yield types
- of the coroutine. They will be turned into return values from the ramp
- and continuation functions, along with the next continuation function.
- Semantics:
- """"""""""
- The result of the intrinsic indicates whether the coroutine should resume
- abnormally (non-zero).
- In a normal coroutine, it is undefined behavior if the coroutine executes
- a call to ``llvm.coro.suspend.retcon`` after resuming abnormally.
- In a yield-once coroutine, it is undefined behavior if the coroutine
- executes a call to ``llvm.coro.suspend.retcon`` after resuming in any way.
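The returned-continuation protocol can be imitated by hand in C++: each call returns the next continuation together with the yielded value, and a null continuation signals that the coroutine has ended. Everything below is a sketch under assumptions; the `Yield` struct, the opaque function-pointer coercion, and the `counter_*` and `sum_yields` names are made up, and the ramp simply forwards to the continuation for brevity.

```cpp
#include <cassert>
#include <new>

// Opaque pointer type that continuations are coerced to.
typedef void (*OpaqueFn)();

// Result-type sequence {pointer, i32}: next continuation plus one yield value.
struct Yield {
  OpaqueFn cont;  // null once the coroutine has ended
  int value;
};

// Shape of the continuation prototype: (buffer, abnormal flag) -> Yield.
typedef Yield (*ContFn)(void* buffer, bool abnormal);

struct State { int next; int last; };

static Yield counter_cont(void* buffer, bool abnormal) {
  State* s = static_cast<State*>(buffer);
  if (abnormal || s->next > s->last)
    return Yield{nullptr, 0};  // end of coroutine: null continuation
  int v = s->next++;
  return Yield{reinterpret_cast<OpaqueFn>(&counter_cont), v};  // suspend, yield v
}

// Ramp function: sets up the state in the caller-provided buffer.
static Yield counter_ramp(void* buffer, int first, int last) {
  new (buffer) State{first, last};
  return counter_cont(buffer, false);
}

// Sums the yielded values; resumes abnormally once `limit` values were taken.
static int sum_yields(int first, int last, int limit) {
  alignas(State) char buffer[sizeof(State)];
  int sum = 0, taken = 0;
  Yield y = counter_ramp(buffer, first, last);
  while (y.cont) {
    sum += y.value;
    bool abnormal = (++taken == limit);
    y = reinterpret_cast<ContFn>(y.cont)(buffer, abnormal);
  }
  return sum;
}
```

After an abnormal resume the model never yields again, which matches the rule that suspending after resuming abnormally is undefined behavior.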
- .. _coro.param:
- 'llvm.coro.param' Intrinsic
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- ::
- declare i1 @llvm.coro.param(i8* <original>, i8* <copy>)
- Overview:
- """""""""
- The '``llvm.coro.param``' is used by a frontend to mark up the code used to
- construct and destruct copies of the parameters. If the optimizer discovers that
- a particular parameter copy is not used after any suspends, it can remove the
- construction and destruction of the copy by replacing corresponding coro.param
- with `i1 false` and replacing any use of the `copy` with the `original`.
- Arguments:
- """"""""""
- The first argument points to an `alloca` storing the value of a parameter to a
- coroutine.
- The second argument points to an `alloca` storing the value of the copy of that
- parameter.
- Semantics:
- """"""""""
- The optimizer is free to always replace this intrinsic with `i1 true`.
- The optimizer is also allowed to replace it with `i1 false` provided that the
- parameter copy is only used prior to control flow reaching any of the suspend
- points. The code that would be DCE'd if the `coro.param` is replaced with
- `i1 false` is not considered to be a use of the parameter copy.
- The frontend can emit this intrinsic if its language rules allow for this
- optimization.
- Example:
- """"""""
- Consider the following example. A coroutine takes two parameters `a` and `b`
- whose type has a destructor and a move constructor.
- .. code-block:: c++
- struct A { ~A(); A(A&&); bool foo(); void bar(); };
- task<int> f(A a, A b) {
- if (a.foo())
- co_return 42;
- a.bar();
- co_await read_async(); // introduces suspend point
- b.bar();
- }
- Note that `b` is used after a suspend point and thus must be copied
- into the coroutine frame, whereas `a` does not have to be, since it is never used
- after a suspend point.
- A frontend can create parameter copies for `a` and `b` as follows:
- .. code-block:: text
- task<int> f(A a', A b') {
- a = alloca A;
- b = alloca A;
- // move parameters to their copies
- if (coro.param(a', a)) A::A(a, A&& a');
- if (coro.param(b', b)) A::A(b, A&& b');
- ...
- // destroy parameter copies
- if (coro.param(a', a)) A::~A(a);
- if (coro.param(b', b)) A::~A(b);
- }
- The optimizer can replace `coro.param(a', a)` with `i1 false` and replace all uses
- of `a` with `a'`, since `a` is not used after the suspend point.
- The optimizer must replace `coro.param(b', b)` with `i1 true`, since `b` is used
- after the suspend point and therefore has to reside in the coroutine frame.
- Coroutine Transformation Passes
- ===============================
- CoroEarly
- ---------
- The CoroEarly pass lowers coroutine intrinsics that hide the details of the
- structure of the coroutine frame but otherwise do not need to be preserved to
- help later coroutine passes. This pass lowers the `coro.frame`_, `coro.done`_,
- and `coro.promise`_ intrinsics.
- .. _CoroSplit:
- CoroSplit
- ---------
- The CoroSplit pass builds the coroutine frame and outlines the resume and destroy
- parts into separate functions.
- CoroElide
- ---------
- The CoroElide pass examines whether the inlined coroutine is eligible for the heap
- allocation elision optimization. If so, it replaces the
- `coro.begin` intrinsic with the address of a coroutine frame placed in its caller
- and replaces the `coro.alloc` and `coro.free` intrinsics with `false` and `null`
- respectively to remove the allocation and deallocation code.
- This pass also replaces `coro.resume` and `coro.destroy` intrinsics with direct
- calls to resume and destroy functions for a particular coroutine where possible.
- CoroCleanup
- -----------
- This pass runs late to lower all coroutine related intrinsics not replaced by
- earlier passes.
- Areas Requiring Attention
- =========================
- #. A coroutine frame is bigger than it could be. Adding optimizations similar to
- stack packing and stack coloring to the coroutine frame will result in tighter
- coroutine frames.
- #. Take advantage of the lifetime intrinsics for the data that goes into the
- coroutine frame. Leave lifetime intrinsics as is for the data that stays in
- allocas.
- #. The CoroElide optimization pass relies on the coroutine ramp function being
- inlined. It would be beneficial to split the ramp function further to
- increase the chance that it will get inlined into its caller.
- #. Design a convention that would make it possible to apply coroutine heap
- elision optimization across ABI boundaries.
- #. Cannot handle coroutines with `inalloca` parameters (used in x86 on Windows).
- #. Alignment is ignored by coro.begin and coro.free intrinsics.
- #. Make required changes to make sure that coroutine optimizations work with
- LTO.
- #. More tests, more tests, more tests