Scheduler for Shadow discrete-event simulations.
In Shadow, each host has a queue of events it must process, and within a given scheduling round the host can process these events independently of all other hosts. This means that Shadow can process each host in parallel.
For a given list of hosts, the scheduler must tell each host to run and process its events. This must occur in parallel with minimal overhead. With a typical thread pool you might create a new task for each host and run all of the tasks on the thread pool, but this is too slow for Shadow and results in a huge runtime performance loss (simulation run time increases by over 10x). Most thread pools also don’t have a method of specifying which task (and therefore which host) runs on which CPU core, which is an important performance optimization on NUMA architectures.
The scheduler in this library uses a thread pool optimized for running the same task across all threads. This means that the scheduler takes a single function/closure and runs it on each thread simultaneously (and sometimes repeatedly) until all of the hosts have been processed. The implementation details depend on which scheduler is in use (ThreadPerCoreSched or ThreadPerHostSched), but all schedulers share a common interface so that they can easily be switched out.
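The "same task on every thread" idea can be sketched with the standard library alone. This is a simplified illustration, not Shadow's implementation: `DemoHost` and `run_round` are hypothetical names, and the single mutex-protected queue is an assumption made for brevity rather than the data structure the real thread pools use.

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for a simulated host (not Shadow's host type).
struct DemoHost {
    id: u32,
    pending_events: u32,
}

/// Runs the same `task` on `num_threads` threads at once. Each thread keeps
/// pulling hosts from a shared queue until the queue is empty, so the task
/// may run several times per thread. Returns the hosts once all are processed.
fn run_round<F>(hosts: Vec<DemoHost>, num_threads: usize, task: F) -> Vec<DemoHost>
where
    F: Fn(usize, &mut DemoHost) + Sync,
{
    let queue = Mutex::new(hosts);
    let done = Mutex::new(Vec::new());

    thread::scope(|s| {
        for thread_idx in 0..num_threads {
            let (queue, done, task) = (&queue, &done, &task);
            s.spawn(move || loop {
                // hold the lock only long enough to take one host
                let next = queue.lock().unwrap().pop();
                let Some(mut host) = next else { break };
                task(thread_idx, &mut host);
                done.lock().unwrap().push(host);
            });
        }
    });

    done.into_inner().unwrap()
}
```

Calling `run_round(hosts, 2, |thread_idx, host| { ... })` with three hosts invokes the closure three times in total, spread across the two threads. A single locked queue like this would likely add contention at scale, which is part of why a purpose-built pool is used instead.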
The Scheduler provides a simple wrapper to make it easier to support both schedulers, which is useful if you want to choose one at runtime. The schedulers use a “scoped threads” design to simplify the calling code. This helps the calling code share data with the scheduler without requiring the caller to use locking or “unsafe” to do so.
// a simulation with three hosts
let hosts = [Host::new(0), Host::new(1), Host::new(2)];
// a scheduler with two threads (no cpu pinning) and three hosts
let mut sched: ThreadPerCoreSched<Host> =
    ThreadPerCoreSched::new(&[None, None], hosts, false);
// the counter is owned by this main thread with a non-static lifetime, but
// because of the "scoped threads" design it can be accessed by the task in
// the scheduler's threads
let counter = AtomicU32::new(0);
// run one round of the scheduler
sched.scope(|s| {
    s.run_with_hosts(|thread_idx, hosts| {
        hosts.for_each(|mut host| {
            println!("Running host {} on thread {thread_idx}", host.id());
            host.run_events();
            counter.fetch_add(1, Ordering::Relaxed);
            host
        });
    });
    // we can do other processing here in the main thread while we wait for the
    // above task to finish running
    println!("Waiting for the task to finish on all threads");
});
println!("Finished processing the hosts");
// the `counter.fetch_add(1)` was run once for each host
assert_eq!(counter.load(Ordering::Relaxed), 3);
// we're done with the scheduler, so join all of its threads
sched.join();
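The "scoped threads" design in the example above follows the same pattern as the standard library's `std::thread::scope`. As a minimal sketch of why the non-static `counter` can be borrowed without locking or unsafe code, the following uses only `std`; the function name `count_with_scoped_threads` is hypothetical and nothing here is part of the scheduler's API.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Spawns `num_threads` scoped threads that each increment a counter living
/// on this function's stack, then returns the final count.
fn count_with_scoped_threads(num_threads: u32) -> u32 {
    // owned by the calling thread, with a non-static lifetime
    let counter = AtomicU32::new(0);

    thread::scope(|s| {
        for _ in 0..num_threads {
            // the closure borrows `counter`; this is allowed because the
            // scope joins every spawned thread before it returns
            s.spawn(|| {
                counter.fetch_add(1, Ordering::Relaxed);
            });
        }
        // the calling thread could do other work here while the threads run
    });

    // all threads have finished, so every increment is visible
    counter.load(Ordering::Relaxed)
}
```

Because the scope guarantees that all threads have been joined before it returns, the borrow checker can accept references to local data inside the spawned closures.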
The ThreadPerCoreSched scheduler is generally much faster and should be preferred over the ThreadPerHostSched scheduler. If no one finds a situation where ThreadPerHostSched is faster, it should probably be removed in the future.
It’s probably a good idea to box the host: the schedulers move hosts frequently, and it’s faster to move a pointer than the entire host object.
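To make the boxing advice concrete, the sketch below compares how many bytes a move has to copy for an inline host versus a boxed one. `LargeHost` and its 4 KiB `state` field are hypothetical; real host objects will differ in size, and the compiler may elide some copies.

```rust
use std::mem::size_of;

/// Hypothetical host with a large inline state buffer.
struct LargeHost {
    id: u32,
    state: [u8; 4096],
}

/// Returns (bytes moved for an inline host, bytes moved for a boxed host).
fn move_sizes() -> (usize, usize) {
    (size_of::<LargeHost>(), size_of::<Box<LargeHost>>())
}

/// Boxes a host so that later moves only copy the pointer.
fn boxed_host(id: u32) -> Box<LargeHost> {
    Box::new(LargeHost { id, state: [0; 4096] })
}
```

Moving a `Box<LargeHost>` between queues or threads copies one pointer-sized value, while moving a `LargeHost` by value may copy the whole struct.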
Unsafe code should only be written in the thread pools. The schedulers themselves should be written in only safe code using the safe interfaces provided by the thread pools. If new features are needed in the scheduler, it’s recommended to try to add them to the scheduler itself and not modify any of the thread pools. The thread pools are complicated and have delicate lifetime sub-typing/variance handling, which is easy to break and would enable the user of the scheduler to invoke undefined behaviour.
If the scheduler uses CPU pinning, the task can get the CPU it’s pinned to using core_affinity.
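One way such a per-thread lookup could work is with a thread-local value that each worker thread sets on startup. The sketch below is an assumption for illustration only: `DEMO_CORE_AFFINITY`, `demo_core_affinity`, and `spawn_worker` are hypothetical names, the `Option<u32>` type is assumed, and no actual CPU pinning is performed.

```rust
use std::cell::Cell;
use std::thread;

thread_local! {
    /// Hypothetical per-thread record of the CPU this thread was assigned.
    static DEMO_CORE_AFFINITY: Cell<Option<u32>> = Cell::new(None);
}

/// Returns the CPU recorded for the current thread, or `None` if the thread
/// was never assigned one (for example, a thread not owned by the pool).
fn demo_core_affinity() -> Option<u32> {
    DEMO_CORE_AFFINITY.with(|c| c.get())
}

/// Spawns a worker that records its assigned CPU and reports what a task
/// running on that worker would observe.
fn spawn_worker(cpu: Option<u32>) -> Option<u32> {
    thread::spawn(move || {
        // a real scheduler would also pin the OS thread to `cpu` here
        DEMO_CORE_AFFINITY.with(|c| c.set(cpu));
        demo_core_affinity()
    })
    .join()
    .unwrap()
}
```

With this arrangement, a task sees `Some(cpu)` only on worker threads that were given a CPU, and `None` everywhere else, which matches the behaviour described for core_affinity.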
Modules§
- A thread-per-core host scheduler.
- A thread-per-host host scheduler.
Enums§
- Supports iterating over all hosts assigned to this thread.
- A wrapper for different host schedulers. It would have been nice to make this a trait, but that would require support for GATs.
- A scope for any task run on the scheduler.
Traits§
Functions§
- Get the core affinity of the current thread, as set by the active scheduler. Will be None if the scheduler is not using CPU pinning, or if called from a thread not owned by the scheduler.