New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Task system overhaul #436
Task system overhaul #436
Conversation
I should elaborate a bit. This change should also enable to implement a tokio-core server that can run without allocations. Given that |
If this ends up being a direction that we want to pursue, I think that it would be safe to keep this in a |
This looks pretty slick, I really like it how it's making Your code snippet of how behavior changes was intially troubling to me, but in practice I don't think it will matter. Each So to make sure I get this straight in my head:
Does that sound like a correct interpretation? I can't really see many downsides to this! Some other random thoughts:
|
Re: cycles, I don't think they are possible. Re: how to hook this into tokio-core, you could most likely get rid of the |
Ok yeah, that both makes sense to me. Before merging I'd want to see tokio-core working (ideally something like sccache compiling llvm) just to be sure though. |
Seems fine to me, there is still work to be done though! |
Ok, I have a tweaked proposal.
pub trait Notify: Send + Sync {
/// Indicates that an associated future and/or task are ready to make
/// progress.
///
/// Typically this means that the receiver of the notification should
/// arrange for the future to get poll'd in a prompt fashion.
fn notify(&self, id: u64);
/// Called when an instance of `NotifyHandle` has been cloned.
fn ref_inc(&self);
/// Called when an instance of `NotifyHandle` has been dropped.
fn ref_dec(&self);
/// A new `Task` handle referencing `id` has been created.
fn id_ref_inc(&self, id: u64);
/// A `Task` handle referencing `id` has been dropped.
fn id_ref_dec(&self, id: u64);
}
struct NotifyHandle {
ptr: *const NotifyHandle,
}
impl NotifyHandle {
unsafe pub fn from_ptr(ptr: *const NotifyHandle) -> NotifyHandle {
unimplemented!();
}
pub fn noop() -> NotifyHandle {
unimplemented!();
}
pub fn notify_fn<F: Fn() + Sync + Send>(f: F) -> NotifyHandle {
unimplemented!();
}
}
On one hand, Another option would be to provide The one unknown here is how to add IMO, fn poll_future(&mut self) -> Poll<F::Item, F::Error> {
self.poll_future_notify(&NotifyHandle::noop())
} |
The previous implementation of the task system required a number of allocations per task instance. Each task required a dedicated `Arc<Unpark>` handle which means that executors require at least two allocations per task. Things get worse when using `with_unpark_event` as nested calls to `with_unpark_event` result in Vec allocation and cloning during each call to `task::park`. This commit provides an overhaul to the task system to work around these problems. The `Unpark` trait is changed so that only one instance is required per executor. In order to identify which task is being unparked, `Unpark::unpark` takes an `unpark_id: u64` argument. `with_unpark_event` is removed in favor of `UnparkContext` which satisfies a similar end goal, but requires only a single allocation per lifetime of the `UnparkContext`.
The old task system is depreated and updated to run implicitly on the new task system.
* Move task2 back into task * Rename new trait to `Notify` * Add `NotifyHandle` which doesn't embed `Arc` * Add new `Spawn::*_notify` APIs * Rename `UnparkContext` to `NotifyContext` * Deprecate `Unpark` and associated machinery * Reimplement tasks to keep the same performance of `Unpark` as before.
So, it seems that |
`ReadyQueue` is a structure into which an arbitrary number of futures can be pushed. `ReadyQueue` implements `Stream`, yielding the inner futures' resolved values as they become ready. This structure can handle large numbers of futures efficiently.
Switch ReadyQueue to use UnsafeNotify. This is needed because Notify is implemented on `Inner<T>`. UnsafeNotify requires 'static but T may not be static. Even though T may not be 'static, it is safe for Inner to implement Notify because the implementation will never read `T`.
I think to merge we need an up-to-date summary of what this PR does, how it changes things, and why the changes are all justified. Basically just a correct PR description and/or correct commit message. |
Other than that though I believe I'm comfortable landing everything here. I can try to get around to the docs next week if it's not done already. |
Ok I started writing up documentation for this but ended up finding what I believe is some lingering unsafety and memory leaks. |
If |
Why is |
@leodasvacas the intention is to take @carllerche can speak more though to the choice of |
A nice way of working with tasks without allocations.
I opted for Picking u64 definitely gears towards optimizing for 64 bit platforms... I'm not sure how common it is to have 32 bit platforms and care about the extra perf hit. THAT said, it occurs to me that hardcoding to u64 would be a hazard if there ever are 128bit CPUs... though I don't know how realistic that is :) (especially since ptrs these days are limited to 48bit). If there is strong preference to switching to |
@alexcrichton said he wanted to keep the various |
Avoids twiddling reference counts ourselves and makes it clear who owns what, also led to the discovery of a memory leak... At the same time this also renames `ref_inc` to `clone_id` and `ref_dec` to `drop_id` for closer mirroring to ownership in Rust.
There were a few independent bugs here which were fixed: * Nodes left in the mpsc queue after the `FuturesUnordered` was dropped were previously leaked. * The `drop_raw` calls were incorrect, not actually freeing anything. * Memory unsafety was possible if polling a future panicked, for a number of reasons. Memory management is now entirely done through `Arc` and `Weak` and code was restructured slightly to work around panics in polling future.
Ok here's my updated summary of the PR: @carllerche mind checking? Major ChangesThe primary change made in this PR is to restructure memory management and One of the largest changes being made is an update to the memory management of In addition to supporting more forms of memory management, the API Additions
Compatibility NotesAs with all 0.1.x releases this PR is intended to be 100% backwards compatible.
|
@alexcrichton seems good to me. |
This commit enables usage of the standard task system in `no_std` contexts. This essentially piggy-backs the `use_std` feature already have and, when deactivated, will provide the `futures::{task, executor}` modules. Building on all the previous refactorings for types like `NotifyHandle` this should enable usage of the task system in these contexts. This does, however, come with a caveat. A relatively large dependency of the task system is currently its usage of thread-local storage, unavailable in task contexts! To work around this an initialization hook for the futures library is provided, and this initialization hook is required to be called before working with the task system. This hook takes just two pointers, a `get` and a `set` for a pointer read/write into some "local storage", depending on the context of the system that we're running on. Usage of this crate from a `std` context (e.g. with default features enabled) should not be impacted by this change. Everything previously present is still there and no explicit initialization is required.
Alright after some discussion with @aturon I reached the conclusion that I'd personally at least be much more comfortable landing this if we could achieve one of the original goals ( The crux here is that no_std contexts must explicitly initialize the task system with a getter/setter for thread-local-storage-like information. This is lazily provided when running in the context of libstd but when disabling the default features of this crate it's required for usage. |
The initial implementation of no-std introduced a performance regression when using std. This was caused by a few factors: * Additional calls to TLS accessors. * Addition of dynamic dispatch. * Addition of atomic ops. * Less inlining Most of these additions have been removed, leaving a single relaxed atomic load in the `task::current()` path as well as limiting the `call_once` to when the task is entered vs. in all task functions.
Clean up a few more hot spots as well: * Inline some trivial functions that are called across crates * Avoid cross-crate TLS access as it's a vtable hit right now which is a bummer
Using u64 isn't a clear win and it will usually represent a pointer anyway.
As far as I know, everything that is needed for this PR has happened... |
Alright, let's do it! |
Updated description
The primary change made in this PR is to restructure memory management and
notifications throughout the "task system" in the
futures
crate. It isintended that this will have very little impact, if any, on consumers of
futures. Implementations of runtimes of futures (e.g. crates like tokio-core)
are the target for this series of changes, enabling a suite of possible
optimizations that were not previously feasible. One major use case that is
now enabled is usage of the
task
andexecutor
modules in theno_std
ecosystem.This means that bare-metal applications of futures should be able to use the same
task system that the std-based futures ecosystem uses.
One of the largest changes being made to support this is an update to the memory management of
objects behind
Task
handles. Previously it was required thatArc<Unpark>
instances were passed into the various
Spawn::poll_*
functions, but newfunctions
Spawn::poll_*_notify
were added which operate with aNotifyHandle
instead. A
NotifyHandle
is conceptually very similar to anArc<Unpark>
instance, but it works through an
unsafe trait UnsafeNotify
to manage memoryinstead of requiring
Arc
is used. You can still useArc
safely, however, ifyou'd like.
In addition to supporting more forms of memory management, the
poll_*_notify
functions also take a new
id
parameter. This parameter is intended to be anopaque bag-of-bits to the
futures
crate itself but runtimes can use this toidentify the future being notified. This is intended to enable situations where
the same instance of a
NotifyHandle
can be used for all futures executed byusing the
id
field to distinguish which future is ready when it getsnotified.
API Additions
FuturesUnordered::push
method was added and theFuturesUnordered
typeitself was completely rewritten to efficiently track a large number of
futures.
Task::will_notify_current
method was added with a slightly differentimplementation than
Task::is_current
but with stronger guarantees anddocumentation wording about its purpose.
Compatibility Notes
As with all 0.1.x releases this PR is intended to be 100% backwards compatible.
All code that previously compiled should continue to do so with these changes.
As with other changes, though, there are also some updates to be aware of:
task::park
function has been renamed totask::current
.Task::unpark
function has been renamed toTask::notify
, and in generalterminology around "unpark" has shifted to terminology around "notify"
Unpark
trait has been deprecated in favor of theNotify
traitmentioned above.
UnparkEvent
structure has been deprecated. It currently should performthe same as it used to, but it's planned that in a future 0.1.x release the
performance will regress for crates that have not transitioned away. The
primary primitive to replace this is the addition of a
push
function on theFuturesUnordered
type. If this does not help implement your use case though,please let us know!
Task::is_current
method is now deprecated, and you likely want to useTask::will_notify_current
instead, but let us know if this doesn't suffice!Original description
This PR is a work in progress
I'm submitting this PR early to hopefully try to illustrate my thoughts and get some early feedback.
Checklist
Arc<Unpark>
toUnparkHandle
Consider switching Arc<Unpark> to an opaque type #432wait()
behavior #360 (deferring this until later)Task::is_current
Notify::is_current
, I don't think this is needed anymore.GetNotifyHandle
Consider passing the Task that is driving the future to Future::poll and renaming the function #129ref_inc
->ref_dec
be moved toUnsafeNotify
(@alexcrichton says no).poll_*_notify
onStream
andSink
. Also, maybe name itpoll_notify
if it is only forFuture
.u64
vs.usize
Overview
The previous implementation of the task system required a number of allocations per task instance. Each task required a dedicated
Arc<Unpark>
handle which means that executors require at least two allocations per task.Things get worse when using
with_unpark_event
as nested calls towith_unpark_event
result in Vec allocation and cloning during each call totask::park
.This commit provides an overhaul to the task system to work around these problems. The
Unpark
trait is changed so that only one instance is required per executor. In order to identify which task is being unparked,Unpark::unpark
takes anunpark_id: u64
argument.with_unpark_event
is removed in favor ofUnparkContext
which satisfies a similar end goal, but requires only a single allocation per lifetime of theUnparkContext
.The new
Unpark
traitIn general, tasks are driven to completion by executors and executors are able to handle large number of tasks. As such, the
Unpark
trait has been tweaked to require a single allocation per executor instead of one per task. The idea is that the executor creates oneArc<Unpark>
handle and uses theunpark_id: u64
to identify which task is unparked.In the case of tokio-core, each task is stored in a slab and the the
unpark_id
is the slab index. Now, given that an Arc is no longer used, it can be that a slab slot is released and repurposed for a different task while there are still outstanding Task handles referencing the now released task.There are two potential ways to deal with this.
a) Not care. Futures need to be able to handle spurious wake ups already. Spurious wake ups can be reduced by splitting the
u64
into 28 bits for the slab offset and use the rest of the u64 as a slot usage counter.b) Use
Unpark::ref_inc
andUnpark::ref_dec
to allow the executor implementation to handle its own reference counting.Option b) would allow an executor implementation to store a pointer as the
unpark_id
and theref_inc
andref_dec
allow for atomic reference counting. This could be used in cases where using a slab is not an option.UnparkContext
This is quite similar in spirit to
with_unpark_event
except it requires significantly less allocations when used in a nested situation.It does have some different behavior. Given the following:
Task B will be the root task that is notified.