There's a workaround, but it's unidiomatic, requires more traits, and requires inefficient copying of data if you want to adapt from one to the other.

However, I wouldn't call this a problem with a polling-based model.

At least part of the goal here must be to avoid allocations and reference counting. If you don't care about that, then the design could have been to 'just' pass around atomically-reference-counted buffers everywhere, including as the buffer arguments to AsyncRead/AsyncWrite. That would avoid the need for AsyncBufRead to be separate from AsyncRead. It wouldn't eliminate the awkwardness entirely – you still couldn't, say, have an async function do a read into a Vec, because a Vec is not reference counted – but if the entire async ecosystem used reference-counted buffers, the ergonomics would be pretty decent.
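For illustration only, a minimal sketch of what that buffer argument could look like; SharedBuf and ArcBufRead are made-up names, not anything from an existing crate:

    use std::io;
    use std::pin::Pin;
    use std::sync::{Arc, Mutex};
    use std::task::{Context, Poll};

    // Hypothetical shared buffer: cloning it just bumps a refcount, so the
    // reactor/kernel side could keep it alive even if the caller's future
    // is dropped mid-operation.
    #[derive(Clone)]
    struct SharedBuf(Arc<Mutex<Vec<u8>>>);

    // Hypothetical read trait for a "reference-counted buffers everywhere"
    // world: the buffer is passed by value (a cheap clone), not as &mut [u8].
    trait ArcBufRead {
        fn poll_read(
            self: Pin<&mut Self>,
            cx: &mut Context<'_>,
            buf: SharedBuf,
        ) -> Poll<io::Result<usize>>;
    }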

But we do care about avoiding allocations and reference counting, resulting in this problem. However, that means a completion-based model wouldn't really help, because a completion-based model essentially requires allocations and reference counting for the futures themselves.

To me, the question is whether Rust could have avoided this with a different polling-based model. It definitely could have avoided it with a model where the allocations for async functions are always managed by the system, just like the stacks used for regular functions are. But that would lose the elegance of async fns being 'just' a wrapper over a state machine. Perhaps, though, Rust could also have avoided it with just some tweaks to how Pin works [1]… but I am not sure whether this is actually viable. If it is, then that might be one motivation for eventually replacing Pin with a different construct, albeit a weak motivation by itself.

[1] https://www.reddit.com/r/rust/comments/dtfgsw/iou_rust_bindi...



> I am not sure whether this is actually viable.

Having investigated this myself, I would be very surprised to discover that it is.

The only viable solution to make AsyncRead zero-cost for io-uring would have been to require futures to be polled to completion before they are dropped. That means giving up select and most of the other concurrency primitives we need; you really do want to be able to stop running futures you no longer need, after all.

If you want the kernel to own the buffer, you should just let the kernel own the buffer. Therefore, AsyncBufRead. This will require the ecosystem to shift where the buffer is owned, of course, and that's a cost of moving to io-uring. Tough, but those are the cards we were dealt.
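For reference, here are the two trait shapes, simplified from the futures-io definitions (with the lifetime in poll_fill_buf written out explicitly to make the ownership point visible):

    use std::io;
    use std::pin::Pin;
    use std::task::{Context, Poll};

    trait AsyncRead {
        // The caller owns `buf` and only lends it for a single poll_read
        // call; an io-uring backend can't hand that borrow to the kernel
        // without either copying or risking a use-after-free on early drop.
        fn poll_read(
            self: Pin<&mut Self>,
            cx: &mut Context<'_>,
            buf: &mut [u8],
        ) -> Poll<io::Result<usize>>;
    }

    trait AsyncBufRead: AsyncRead {
        // The IO object owns the buffer and lends out a filled slice; the
        // buffer's lifetime is tied to the IO object, not to a single call.
        fn poll_fill_buf<'a>(
            self: Pin<&'a mut Self>,
            cx: &mut Context<'_>,
        ) -> Poll<io::Result<&'a [u8]>>;

        fn consume(self: Pin<&mut Self>, amt: usize);
    }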


Well, you can still have select; it "just" has to react to one of the futures becoming ready by cancelling all the other ones and waiting (asynchronously) for the cancellation to be complete. Future doesn't currently have a "cancel" method, but I guess it would just be represented as async drop. So this requires some way of enforcing that async drop is called, which is hard, but I believe it's equally hard as enforcing that futures are polled to completion: either way you're requiring that some method on the future be called, and polled on, before the memory the future refers to can be reused. For the sake of this post I'll assume it's somehow possible.
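To make the contract concrete, here is an entirely hypothetical sketch; nothing like CancellableFuture exists today, and the racing logic is elided, since the point is only the shape of the guarantee:

    use std::future::Future;

    // Hypothetical: a future that can be asked to stop and then awaited until
    // it really has stopped (e.g. until the kernel confirms it will no longer
    // write into any borrowed buffers). Async drop would presumably play this
    // role; it's spelled as an explicit method here.
    trait CancellableFuture: Future {
        fn cancel(self) -> impl Future<Output = ()>;
    }

    enum Winner<A, B> {
        Left(A),
        Right(B),
    }

    // The shape a cancellation-aware select would take: the losing future is
    // not silently dropped, its cancellation is awaited before we return, so
    // the caller knows its borrows are over.
    async fn select2<A, B>(a: A, b: B) -> Winner<A::Output, B::Output>
    where
        A: CancellableFuture,
        B: CancellableFuture,
    {
        // 1. Race `a` and `b` until one completes (elided).
        // 2. Call `.cancel()` on the loser and await that cancellation.
        // 3. Only then hand back the winner's output.
        let _ = (a, b);
        unimplemented!("sketch only")
    }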

Having to wait for cancellation does sound expensive, especially if the end goal is to pervasively use APIs like io_uring where cancellation can be slow.

But then, in a typical use of select, you don't actually want to cancel the I/O operations represented by the other futures. Rather, you're running select in a loop in order to handle each completed operation as it comes.

So I think the endgame of this hypothetical world is to encourage having the actual I/O be initiated by a Future or Stream created outside the loop. Then within the loop you would poll on `&mut future` or `stream.next()`. This already exists and is already cheaper in some cases, but it would be significantly cheaper when the backend is io_uring.
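Concretely, the pattern looks something like this (using tokio's select! and pin! purely for illustration; `incoming`, `shutdown`, and `handle` are placeholders):

    use futures::StreamExt;

    async fn run(
        mut incoming: impl futures::Stream<Item = Vec<u8>> + Unpin,
        shutdown: impl std::future::Future<Output = ()>,
    ) {
        // The long-lived future is created and pinned once, outside the loop...
        tokio::pin!(shutdown);

        loop {
            tokio::select! {
                // ...and each iteration only polls a *borrow* of it, so losing
                // a round of select! drops the borrow, not the operation.
                _ = &mut shutdown => break,
                msg = incoming.next() => match msg {
                    Some(buf) => handle(buf),
                    None => break,
                },
            }
        }
    }

    fn handle(_buf: Vec<u8>) {
        // process one completed read
    }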


> But then, in a typical use of select, you don't actually want to cancel the I/O operations represented by the other futures. Rather, you're running select in a loop in order to handle each completed operation as it comes.

You often do want to cancel them in some branches of the code that handles the result (for example, if they error). And it may indeed be prohibitively expensive to wait until cancellation is complete: io-uring cancellation requires a full round trip through the interface. The IORING_OP_ASYNC_CANCEL op is just a hint to the kernel to cancel any blocking work; you still have to wait for a completion to come back before you know the kernel will not touch the buffer you passed in.

And this doesn't even get into the much better buffer management strategies io-uring has baked into it, like registered buffers and buffer pre-allocation. I'm really skeptical of making those work with AsyncRead (now you need to define buffer types that deref to slices while tracking these things independently of the IO object), but since AsyncBufRead lets the IO object own the buffer, it is trivial.
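(To gesture at what that would mean, here is a made-up buffer type of the kind AsyncRead would force on you: something that derefs to a byte slice while also dragging io-uring registration bookkeeping around with it, independently of the IO object.)

    use std::ops::{Deref, DerefMut};

    // Made-up sketch; field layout and naming are illustrative only.
    struct RegisteredBuf {
        buf_index: u16,  // index into the ring's table of registered buffers
        data: Box<[u8]>, // memory that was pre-registered with the kernel
    }

    impl Deref for RegisteredBuf {
        type Target = [u8];
        fn deref(&self) -> &[u8] {
            &self.data
        }
    }

    impl DerefMut for RegisteredBuf {
        fn deref_mut(&mut self) -> &mut [u8] {
            &mut self.data
        }
    }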

Moving the ecosystem that cares about io-uring to AsyncBufRead (a trait that already exists) and letting the low-level IO code handle the buffer is a strictly better solution than requiring futures to run until they're fully, truly cancelled. Protocol libraries should already expose the ability to parse the protocol from an arbitrary stream of buffers, instead of directly owning an IO handle. I'm sure some libraries don't, but that's a mistake that this will course-correct.
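(As a concrete picture of "parse the protocol from an arbitrary stream of buffers": a parser along these lines never touches an IO handle; whoever owns the buffer, whether an AsyncBufRead implementation or an io-uring completion handler, just feeds it. The newline-delimited framing is made up for brevity.)

    // Illustrative sans-IO-style parser: it is handed byte slices and returns
    // events; it neither owns nor knows about the IO handle or the buffer.
    struct Parser {
        pending: Vec<u8>, // accumulated partial frame, if any
    }

    enum Event {
        Frame(Vec<u8>),
        NeedMoreData,
    }

    impl Parser {
        fn new() -> Self {
            Parser { pending: Vec::new() }
        }

        // Feed the next chunk, wherever it lives, and get back any complete
        // frames. A "frame" here is just a newline-terminated chunk.
        fn feed(&mut self, input: &[u8]) -> Vec<Event> {
            let mut events = Vec::new();
            self.pending.extend_from_slice(input);
            while let Some(pos) = self.pending.iter().position(|&b| b == b'\n') {
                let frame: Vec<u8> = self.pending.drain(..=pos).collect();
                events.push(Event::Frame(frame));
            }
            if events.is_empty() {
                events.push(Event::NeedMoreData);
            }
            events
        }
    }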


> Well, you can still have select; it "just" has to react to one of the futures becoming ready by cancelling all the other ones and waiting (asynchronously) for the cancellation to be complete.

Right. Which is more or less what the structured concurrency primitives in Kotlin, Trio, and soon Swift are doing.


Wouldn't a more 'correct' implementation be moving the buffer into the thing that initiates the future (and thus, abstractly, into the future), rather than refcounting? At least with IOCP you aren't really supposed to even touch the memory region given to the completion port until it's signaled completion iirc.

I.e., to me, an implementation of read() that would work for a completion model could be basically:

    async fn read<T: IntoSystemBufferSomehow>(&self, buf: T) -> Result<T, Error>
I recognize this doesn't resolve the early-drop issues outlined, and it obviously does require copying to adapt it to the existing AsyncRead trait, or if you want to, say, update a buffer in an already-allocated object. It's just what I would expect an API working against IOCP to look like, and I feel like it avoids many of the issues you're talking about.
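Spelled out slightly more (still with made-up names, and with a byte count added to the return type so the usage example works), the ownership-passing version might look like:

    use std::io;

    // Purely illustrative stand-ins; the trait name comes from the sketch
    // above, while the impl and the File type here are made up.
    trait IntoSystemBufferSomehow {}
    impl IntoSystemBufferSomehow for Vec<u8> {}

    struct File;

    impl File {
        // The buffer moves *into* the operation and only moves back out once
        // the kernel is done with it, so there's no window where the caller
        // can touch memory the kernel still owns.
        async fn read<T: IntoSystemBufferSomehow>(&self, buf: T) -> io::Result<(usize, T)> {
            // Submit, await the completion, return the filled buffer.
            // Elided: this sketch is only about the signature.
            let _ = buf;
            unimplemented!("sketch only")
        }
    }

    // Usage: the caller doesn't lend the buffer; it gives it away and gets it back.
    async fn example(file: &File) -> io::Result<Vec<u8>> {
        let (n, mut buf) = file.read(vec![0u8; 4096]).await?;
        buf.truncate(n);
        Ok(buf)
    }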


I'm not a Rust expert, so I'm not sure how close this proposal is to Composita:

http://concurrency.ch/Content/publications/Blaeser_Component...

Essentially each component has a buffered interface (an interface message queue), which static analysis sizes at compile time. This buffer can act as a daemon, ref counter, offline dropbox, cache, cancellation check, and can probably help with cycle checking.

Is this the sort of model which would be useful here?



