>That is not an accurate summary of that comment.

How is that so, if he explicitly writes:

> Suggestions that we should revisit our underlying futures model are suggestions that we should revert back to the state we were in 3 or 4 years ago, and start over from that point. <..> Trying to provide answers to these questions would be off-topic for this thread; the point is that answering them, and proving the answers correct, is work. What amounts to a solid decade of labor-years between the different contributors so far would have to be redone again.

How should I read this except as "we did the work on the poll-based model, so we don't want the results to go down the drain if the completion-based model turns out to be superior"?

I don't agree with your assertion regarding cyclic structures and the need for dynamic allocations in the completion-based model. Both models result in approximately the same cyclicity of task states, which is no wonder, since task states are effectively size-bounded stacks. In both models you have more or less the same finite state machines. The only difference is in how those FSMs interact with the runtime, and in the fact that in the completion-based model you usually pass ownership of part of the task state to the runtime during task suspension. So you cannot simply drop a task if you no longer need its results; you have to explicitly request its cancellation from the runtime.



> How is that so, if he explicitly writes:

There's a difference between "we decided this 3 years ago" and "we rushed the decision". At this point, it's no longer possible to weigh the two models on a neutral scale, because changing the model would cause a huge amount of ecosystem churn. But that doesn't mean they weren't properly weighed in the first place.

Regarding cyclicity… well, consider something like a task running two sub-tasks at the same time. That works out quite naturally in a polling-based model, but in a completion-based model you have to worry about things like 'what if both completion handlers are called at the same time', or even 'what if one of the completion handlers ends up calling the other one'.
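
To make that concrete, here's a minimal sketch of racing two sub-futures in the polling model (assuming Unpin children for simplicity; this is not any real library's select implementation). Both children are polled sequentially on one stack, and the losing branch is simply dropped:

    use std::future::Future;
    use std::pin::Pin;
    use std::task::{Context, Poll};

    // Hypothetical hand-written "race two sub-tasks" future. The parent
    // owns both children inline and polls them one after another, so
    // "both completions fire at once" and "one handler calls the other"
    // simply cannot happen.
    struct Race<A, B> {
        a: A,
        b: B,
    }

    impl<A, B> Future for Race<A, B>
    where
        A: Future + Unpin,
        B: Future<Output = A::Output> + Unpin,
    {
        type Output = A::Output;

        fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
            let this = self.get_mut(); // ok: Race is Unpin when A and B are
            if let Poll::Ready(v) = Pin::new(&mut this.a).poll(cx) {
                return Poll::Ready(v); // `b` is dropped along with `self`
            }
            Pin::new(&mut this.b).poll(cx)
        }
    }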

Regarding dynamic allocations… well, what kind of desugaring are you thinking of? If you have

    async fn foo(input: u32) -> String;
then a simple desugaring could be

    fn foo(input: u32, completion: Arc<dyn FnOnce(String)>);
but then the function has to be responsible for allocating its own memory.

Sure, there are alternatives. We could do...

    struct Foo { /* state */ }
    impl Foo {
        fn call(self: Arc<Self>, input: u32, completion: Arc<dyn FnOnce(String)>);
    }
Which by itself is no better; it still implies separate allocations. But then I suppose we could have an `ArcDerived<T>` which acts like `Arc<T>` but can point to a part of a larger allocation, so that `self` and `completion` could be parts of the same object.

However, in that case, how do you deal with borrowed arguments? You could rewrite them to Arc, I suppose. But if you must use Arc, performance-wise, ideally you want to be moving references around rather than actually bumping reference counts. You can usually do that if there's just `self` and `completion`, but not if there are a bunch of other Arcs.

Also, what if the implementation misbehaved and called `completion` without giving up the reference to `self`? That would imply that any further async calls by the caller could not use the same memory. It's possible to work around this, but I think it would start to make the interface relatively ugly, less ergonomic to implement manually.

Also, `ArcDerived` would have to consist of two pointers and there would have to be at least one `ArcDerived` in every nested future, bloating the future object. But really you don't want to mandate one particular implementation of Arc, so you need a vtable, but that means indirect calls and more space waste.
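
For concreteness, a sketch of what such a hypothetical `ArcDerived<T>` might look like (no such type exists today; this is just the idea above made explicit):

    use std::any::Any;
    use std::sync::Arc;

    // Hypothetical `ArcDerived<T>`: like Arc<T>, but able to point at a
    // field inside a larger Arc-managed allocation. It needs at least two
    // pointers: one keeping the whole allocation alive, and one to the
    // field actually handed out (with a type-erased owner it's fatter
    // still). That overhead lands in every nested future.
    struct ArcDerived<T: ?Sized> {
        _owner: Arc<dyn Any + Send + Sync>, // keeps the allocation alive
        ptr: *const T,                      // the field within that allocation
    }

    impl<T: ?Sized> ArcDerived<T> {
        fn get(&self) -> &T {
            // Sound only because `_owner` keeps the allocation alive for
            // `self`'s lifetime; a real design would also need to prove
            // that `ptr` points inside `_owner`'s allocation.
            unsafe { &*self.ptr }
        }
    }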

Most of those problems could be solved by making the interface unsafe and using something with more complex correctness requirements than Arc. But the fact that current async fns desugar to a safe interface is a significant upside. (...Even if the safety must be provided with a bunch of macros, thanks to Pin not being built into the language.)


>There's a difference between "we decided this 3 years ago" and "we rushed the decision".

As far as I understand the situation, the completion-based API simply was not on the table 3 years ago. io-uring was not a thing and there was negligible interest in properly supporting IOCP. So when a viable alternative appeared right before stabilization of the epoll-centric API that had been developed, the 3-year-old decision was not properly reviewed in light of the changed environment; instead the team pushed forward with stabilization.

>because changing the model would cause a huge amount of ecosystem churn.

No, the discussion happened before stabilization (it's literally in the stabilization issue). Most of the ecosystem at the time was on futures 0.2.

Regarding your examples, I think you simply look at the problem from the wrong angle. In my opinion the compiler should not desugar async fns into ordinary functions; instead it should construct explicit FSMs out of them. So there is no need for Arcs: the String would be stored directly in the "output" FSM state generated for foo. Yes, this approach is harder for the compiler, but it opens up optimization opportunities, e.g. around the trade-off between FSM "stack" size and the number of copies the state transition functions have to perform. AFAIK right now Rust uses "dumb" enums, which can be quite sub-optimal: they always minimize the "stack" size at the expense of additional data copies, and they do not reorder fields in the enum variants to minimize copies.
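
To make this concrete, here is a rough sketch of the kind of FSM I have in mind for your `foo` example (purely hypothetical generated code, not what rustc actually emits):

    // Hypothetical compiler-generated FSM for `async fn foo(input: u32) -> String`.
    // No Arc anywhere: the I/O buffer and the output String live inline in
    // the states, and the whole enum is one flat, compile-time-sized value.
    enum FooState {
        Start { input: u32 },
        WaitingOnIo { buf: [u8; 64] }, // I/O buffer is part of the state
        Done { output: String },       // result stored inline in the FSM
    }

    impl FooState {
        // Transition function the runtime invokes when the I/O request
        // registered in `Start` completes; `n` is the completed byte count.
        fn on_io_complete(&mut self, n: usize) {
            if let FooState::WaitingOnIo { buf } = self {
                let output = String::from_utf8_lossy(&buf[..n]).into_owned();
                *self = FooState::Done { output };
            }
        }
    }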

In your example with two sub-tasks, a generated FSM could look like this (each item is a transition function):

1) initialization [0 -> init_state]: create requests A and B

2) request A is complete [init_state -> state_a]: if request B is complete do nothing; otherwise mark that request A is complete and request cancellation of task B, but do not change the layout of the buffer used by request B.

3) cancellation of B is complete [state_a -> state_c]: process data from A, perform the data processing common to branches A and B, create request C. It's safe to overwrite the memory behind buffer B in this handler.

4) request B is complete [init_state -> state_b]: if request A is complete do nothing; otherwise mark that request B is complete and request cancellation of task A, but do not change the layout of the buffer used by request A.

5) cancellation of A is complete [state_b -> state_c]: process data from B, perform the data processing common to branches A and B, create request C. It's safe to overwrite the memory behind buffer A in this handler.

(This FSM assumes that it's legal to request cancellation of a completed task)

Note that handlers 2 and 4 cannot be called at the same time, since they are bound to the same ring and thus executed on the same thread. And one completion handler simply cannot call another, since they are parts of the same FSM and only one FSM transition function can be executed at a time. At first glance all those states and transitions look like unnecessary complexity, but I think this is how a proper select should work under the hood.
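
As plain Rust, the sketch above could look roughly like this (hypothetical generated code; ring submissions are elided, and the FSM is assumed to be fixed in memory, since the kernel holds pointers into its buffers):

    // The buffers live at fixed offsets in the FSM for its whole lifetime,
    // so no transition ever moves memory the kernel may still write into.
    struct SelectFsm {
        buf_a: [u8; 64],
        buf_b: [u8; 64],
        state: SelectState,
    }

    enum SelectState {
        Init,   // requests A and B submitted
        StateA, // A done; cancel-B in flight, buf_b still owned by the kernel
        StateB, // B done; cancel-A in flight, buf_a still owned by the kernel
        StateC, // common continuation running; request C submitted
    }

    impl SelectFsm {
        // Transition 2: completion for request A arrived while in Init.
        fn on_a_complete(&mut self) {
            if let SelectState::Init = self.state {
                // (submit a cancel-B request here)
                self.state = SelectState::StateA;
            } // otherwise B already won the race: do nothing
        }

        // Transition 3: B's cancellation confirmed; buf_b is ours again.
        fn on_b_cancel_done(&mut self) {
            if let SelectState::StateA = self.state {
                let _a = &self.buf_a[..]; // process A's data + common code
                self.state = SelectState::StateC; // and submit request C
            }
        }

        // Transitions 4 and 5 mirror the two above with A and B swapped.
    }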


> As far as I understand the situation, the completion-based API simply was not on the table 3 years ago

Completion APIs were always considered. They are just significantly harder for Rust to support.


Can you provide any public sources for that? From what I've seen, the Rust async story was always developed primarily around epoll.


Alex Crichton started with a completion-based Future struct in 2015. It was even in std (unstable) in 1.0.0:

https://doc.rust-lang.org/1.0.0/std/sync/struct.Future.html

Our async IO model was based on the Linux industry standard (then and now) epoll, but that is not at all what drove the switch to a polling-based model, and the polling-based model presents no issues whatsoever with io-uring. You do not know what you are talking about.


>Our async IO model was based on the Linux industry standard (then and now) epoll, but that is not at all what drove the switch to a polling-based model

Can you provide a link to a design document or at the very least to a discussion with motivation for this switch outside of the desire to be as compatible as possible with the "Linux industry standard"?

>the polling-based model presents no issues whatsoever with io-uring

There are no issues with io-uring compatibility, to such an extent that you wrote a whole blog post about those very issues: https://boats.gitlab.io/blog/post/io-uring/

IIUC the best solutions right now are either to copy data around (bye-bye zero-cost) or to use another Pin-like awkward hack with executor-based buffer management, instead of using simple and familiar buffers which are part of the future's state.


https://aturon.github.io/blog/2016/09/07/futures-design/

The completion-based futures that Alex started with were also based on epoll. The performance issues they presented had nothing to do with any sort of impedance mismatch between a completion-based future and epoll, because there is no impedance issue. You are confused.


Thank you for the link! But immediately we can see the false equivalence: a completion-based API does not imply the callback-based approach. The article critiques the latter, but not the former. Earlier in this thread I've described how I see a completion-based model built on top of FSMs generated by the compiler from async fns. In other words, the arguments presented in that article do not apply to this discussion.

>The performance issues they presented had nothing to do with any sort of impedance mismatch between a completion-based future and epoll

Sorry, but what? Even aturon's article states zero cost as one of the 3 main goals. So performance issues with strong roots in the selected model are a very big problem in my book.

>You do not know what you are talking about.

>You are confused.

Please, tone down your replies.


> Please, tone down your replies.

You cannot make extremely inflammatory comments about people's work, and accuse them of all sorts of things, and then get upset when they are mad about it. You've made a bunch of very serious accusations against multiple people's hard work, with no evidence, and with arguments that are shaky at best, on one of the largest and most influential forums in the world.

I mean, you can get mad about it, but I don't think it's right.


I found it highly critical but not inflammatory - though I'm not sure if I'd've felt the same way had they been similarly critical of -my- code.

However, either way, responding with condescension (which is how the 'industry standard' thing came across) and outright aggression is never going to be constructive, and if that's the only response one is able to formulate then it's time to either wait a couple hours or ask somebody else to answer on your behalf instead (I have a number of people who are kind enough to do that for me when my reaction is sufficiently exothermic to make posting a really bad idea).

boats-of-a-year-ago handled a similar situation much more graciously here - https://news.ycombinator.com/item?id=22464629 - so it's entirely possible this is a lockdown fatigue issue - but responding to calmly phrased criticism with outright aggression is still pretty much never a net win, and defending that behaviour seems contrary to the tone the rust team normally tries to set for discussions.


Of course I was more gracious to pornel - that remark was uncharacteristically flippant from a contributor who is normally thoughtful and constructive. pornel is not in the habit of posting that my work is fatally flawed because I did not pursue some totally unviable vaporware proposal.


I am not mad; it was nothing more than an attempt to urge a more civil tone from boats. If you both think that such a tone is warranted, then so be it. But it does affect my (really high) opinion of you.

I do understand the pain of having your dear work harshly criticized. I have experienced it many times in my career. But my critique was intended as tough love for a language in which I am heavily invested. If you see my comments as only "extremely inflammatory"... Well, it's a shame I guess, since this is not the first case of the Rust team unnecessarily rushing something (see the 2018 edition debacle), so I guess such an attitude only increases the rate at which Rust accumulates mistakes.


I do not doubt that you care about Rust. Civility, though, is a two-way street. Just because you phrase something in a way that has a more neutral tone does not mean that the underlying meaning cannot be inflammatory.

"Instead of carefully weighing advantages and disadvantages of both models," may be written in a way that more people would call "civil," but is in practice a direct attack on both the work, and the people doing the work. It is extremely difficult to not take this as a slightly more politely worded "fuck you," if I'm being honest. In some sense, that it is phrased as being neutral and "civil" makes it more inflammatory.

You can have whatever opinion that you want, of course. But you should understand that the stuff you've said here is exactly that. It may be politely worded, but is ultimately an extremely public direct attack.


>Earlier in this thread I've described how I see a completion-based model built on top of FSMs generated by the compiler from async fns. In other words, the arguments presented in that article do not apply to this discussion.

I've been following your responses, but now I'm confused. If you are not using a callback-based approach, then what are you using? Rust's FSM approach is predicated on polling; in other words, if you aren't using callbacks, how do you know that Future A has finished? If the answer is to use Rust's current systems, then the FSM is "polled" periodically, you still have the "async Drop" problem described in withoutboats' notorious article, and furthermore you haven't really changed Rust's design.

Edit: As I've seen you mention in other threads, you need a sound design for async Drop for this to work. I'm not sure this is possible in Rust 1.0 (as Drop isn't currently guaranteed to run in safe Rust). That said, it's unfair to call async "rushed" when your proposed design wouldn't even work in Rust 1.0. I'd be hesitant to call the design of the entire language rushed just because it didn't include linear types.


I meant the callback-based approach described in the article; for example, take this line from it:

>Unfortunately, this approach nevertheless forces allocation at almost every point of future composition, and often imposes dynamic dispatch, despite our best efforts to avoid such overhead.

It clearly does not apply to the model which I've described earlier.

Of course, the described FSM state transition functions can be rightfully called callbacks, which adds a certain amount of confusion.

I can agree with the argument that a proper async Drop cannot be implemented in Rust 1.0, so we have to settle for a compromise solution. Same with proper self-referential structs vs Pin. But I would like to see this argument explicitly stated, with sufficient backing for the impossibility claims.


>Of course, the described FSM state transition functions can be rightfully called callbacks, which adds a certain amount of confusion.

No, I'm not talking about the state transition functions. I'm talking about the runtime - the thing that will call the state transition function. In the current design, abstractly, the runtime polls/checks every future to see if it's in a runnable state, and if so executes it. In a completion-based design the future itself tells the runtime that the value is ready (either driven by a kernel thread, another thread, or some other callback). (Conceptually the difference is: in a poll-based design the future calls waker.wake(), and in a completion-based one the future just calls the callback fn.) Aaron has already described why that is a problem.
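
Schematically (a sketch of the two shapes, not either design's real API):

    use std::task::Waker;

    // Poll-based: the I/O source merely wakes the task; the runtime later
    // re-polls the future from the top, on its own stack and schedule.
    fn io_ready_poll_model(waker: &Waker) {
        waker.wake_by_ref(); // no user code runs here
    }

    // Completion-based: the I/O source calls straight into user code, so
    // re-entrancy and "who owns the task state right now" become the
    // notifier's problem at the moment of notification.
    fn io_ready_completion_model(completion: Box<dyn FnOnce(usize)>, bytes_read: usize) {
        completion(bytes_read); // user code runs inside the event source
    }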

The confusion I have is that both would have problems integrating io_uring into Rust (due to the Drop problem, as Rust has no concept of the kernel owning a buffer), but your proposed solution seems strictly worse, as it requires async Drop to be sound, which Rust does not guarantee; that would make it useless for programs being written today. As a result, I'm having trouble accepting that your criticism is actually valid: what you seem to be arguing is that async/await should never have been stabilized in Rust 1.0, which I believe is a fair criticism, but it isn't one that indicates that the current design was rushed.

Upon further thought, I think your design ultimately requires futures to be implemented as a language feature rather than a library (e.g. having the future itself expose multiple state transition functions without allocating is not possible with the current trait system), which wouldn't have worked without forking Rust during the prototype stage.


>In a completion-based design the future itself tells the runtime that the value is ready

I think there is a misunderstanding. In a completion-based model (read io-uring, but I think IOCP behaves similarly, though I am less familiar with it) it's the runtime that "notifies" tasks about completed IO requests. In io-uring you have two queues represented by ring buffers shared with the OS. You add submission queue entries (SQEs), which describe what you want the OS to do, to the first buffer; the OS reads them, performs the requested job, and places completion queue events (CQEs) for completed requests into the second buffer.

So in this model a task (Future in your terminology) registers an SQE (the registration process may be proxied via a user-space runtime) and suspends itself. Let's assume for simplicity that only one SQE was registered for the task. After the OS sends the CQE for the request, the runtime finds the correct state transition function (via meta-information embedded into the SQE, which gets mirrored to the relevant CQE) and simply executes it; the requested data (if it was a read) will already have been filled into a buffer which is part of the FSM state, so there is no need for additional syscalls or interactions with the runtime to read this data!
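
For illustration, here is roughly what a single request/completion round trip looks like with the tokio-rs io-uring crate (I'm writing the calls from memory, so treat the exact API as approximate; the user_data tag plays the role of the meta-information mentioned above):

    use io_uring::{opcode, types, IoUring};
    use std::os::unix::io::AsRawFd;

    fn main() -> std::io::Result<()> {
        let mut ring = IoUring::new(8)?;
        let file = std::fs::File::open("/etc/hostname")?;
        let mut buf = [0u8; 64]; // in the model above: part of the FSM state

        // SQE: ask the kernel to read into `buf`, tagged with "task id" 7.
        let sqe = opcode::Read::new(types::Fd(file.as_raw_fd()), buf.as_mut_ptr(), buf.len() as u32)
            .build()
            .user_data(7);
        // unsafe: we must keep `buf` alive and in place until the CQE arrives
        unsafe { ring.submission().push(&sqe).expect("submission queue full") };
        ring.submit_and_wait(1)?;

        // CQE: the kernel mirrors `user_data` back, so a runtime would look
        // up FSM #7 and run its transition function; the data is already in
        // `buf`, no further syscall needed to fetch it.
        let cqe = ring.completion().next().expect("no completion");
        assert_eq!(cqe.user_data(), 7);
        println!("read {} bytes", cqe.result()); // negative result = -errno
        Ok(())
    }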

If you are familiar with embedded development, this should sound quite familiar, since it's roughly how hardware interrupts work as well! You register a job (e.g. a DMA transfer), a dedicated hardware block does it and notifies a registered callback after the job is done. Of course, this is quite an oversimplification, but the fundamental similarity is there.

>I think your design ultimately requires futures to be implemented as a language feature, rather than a library

I am not sure if this design would have had a Future type at all, but you are right, the advocated approach requires a deeper integration with the language compared to the stabilized solution. Though I disagree with the opinion that it would've been impossible to do in Rust 1.


Doesn't work because it relies on caller-managed buffers. See withoutboats' post: https://without.boats/blog/io-uring/


It does not work in the current version of Rust, but it's not a given that a backwards-compatible solution for it could not have been designed, e.g. by using a deeper integration of async tasks with the language or by adding proper linear types; hence all the discussions around a reliable async Drop. The linked blog post takes it as a given that we should be able to drop futures at any point in time, which, while convenient, has a lot of implications.


What happens if you drop the task between 1 and 2? Does dropping block until the cancellation of both tasks is complete?


As I've mentioned several times, in this model you cannot simply "drop the task" without running its asynchronous Drop. Each state in the FSM will be generated with a "drop" transition function, which may include asynchronous cancellation requests (i.e. cleanup can be bigger than one transition function and may represent a mini sub-FSM). This would require introducing more fundamental changes to the language (same as with proper self-referential types), be it some kind of linear type capabilities or a deeper integration of runtimes with the language (so you would not be able to manipulate FSM states directly like any other data structure), since right now it's safe to forget anything and destructors are not guaranteed to run. IMO such changes would've made Rust a better language in the end.
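
A sketch of what such a generated drop transition could look like (again hypothetical, and assuming a language where this cleanup is guaranteed to run, which Rust today does not promise):

    // Drop is not a single synchronous call: it parks the task in a
    // `Cancelling` sub-state and only releases the buffer once the kernel
    // confirms it no longer touches it.
    struct Task {
        buf: [u8; 64], // kernel may hold a pointer into this until Gone
        state: TaskState,
    }

    enum TaskState {
        Running,    // read request in flight; kernel may write into `buf`
        Cancelling, // drop requested: cancel submitted, awaiting its CQE
        Gone,       // kernel confirmed; `buf` may now be freed or reused
    }

    impl Task {
        // The "drop" transition: submit a cancellation (elided) and wait.
        fn start_async_drop(&mut self) {
            if let TaskState::Running = self.state {
                self.state = TaskState::Cancelling;
            }
        }
        // Second half of the mini drop sub-FSM: the cancel CQE arrived.
        fn on_cancel_done(&mut self) {
            self.state = TaskState::Gone;
        }
    }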


“Rust would have been a better language by breaking its stability guarantees” is just saying “Rust would have been a better language by not being Rust.” Maybe true, but not relevant to the people whose work you’ve blanket criticized. Rust language designers have to work within the existing language and your arguments are in bad faith if you say “async could have been perfect with all this hindsight and a few breaking language changes”.


I do not think that the impossibility of a reliable async Drop in Rust 1 was a proven thing (prior to the stabilization of async in its current form). Yes, it might require some unpleasant additions, such as making Futures and async fns more special than they are right now, and implementing it would very probably have required a lot of work (at least on the same scale as was invested into the poll-based model), but that does not automatically make it impossible.


I don’t agree with this analysis TBH - async drop has been revisited multiple times recently with no luck. Without a clear path there I don’t know why that would seem like an option for async/await two years ago. Do you actually think the language team should have completely exhausted that option in order to try to require an allocator for async/await?

Async drop would still not address the single-allocation-per-state-machine advantage of the current design that you’ve mostly not engaged with in this thread.


>I don’t agree with this analysis TBH

No worries, I like when someone disagrees with me and argues his or her position well, since it's a chance for me to learn.

>async drop has been revisited multiple times recently with no luck

The key word is "recently", meaning "after the stabilization". That's exactly my point: this problem was not sufficiently explored, in my opinion, prior to stabilization. I would've been fine with a well-argued position of "async Drop is impossible without breaking language changes, so we will not care about it", but instead we now try to shoehorn async Drop on top of the stabilized feature.

>Async drop would still not address the single-allocation-per-state-machine advantage of the current design that you’ve mostly not engaged with in this thread.

I don't think you are correct here, please see this comment: https://news.ycombinator.com/item?id=26408524



