Mapping#
Mapping tasks that create buffers during execution#
From the 25.01 release, a task variant that creates temporary or output buffers during execution requires the following two steps for correct mapping:
- The variant should be registered with a VariantOptions with the has_allocations field set to true (the default value is false).
- The mapper should return an upper bound of the total size of allocations in the allocation_pool_size() call. The mapper can choose to give an “unbounded” allocation pool by returning std::nullopt. This is always a sound answer to give from the mapper, but it incurs a performance penalty: the mapping of any downstream tasks that create fresh allocations is blocked.
The allocation pool size is specific to each kind of memory to which the executing processor has affinity, and the mapper is queried once for each memory kind. The runtime does not call allocation_pool_size() for task variants registered with has_allocations set to false.
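The per-memory-kind query described above can be sketched as follows. This is a minimal, self-contained illustration rather than the Legate API itself: StoreTarget is redeclared locally as a stand-in, and pool_size_for and kScratchBytes are hypothetical names. In a real mapper, logic of this shape would presumably live in the override of allocation_pool_size().

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Local stand-in for legate::mapping::StoreTarget so the sketch compiles on its own.
enum class StoreTarget : std::uint8_t { SYSMEM, FBMEM, ZCMEM, SOCKETMEM };

// Hypothetical upper bound (in bytes) on the scratch buffers one task creates.
constexpr std::size_t kScratchBytes = 1 << 20;

// Hypothetical helper: the value a mapper could return for each memory kind.
std::optional<std::size_t> pool_size_for(StoreTarget memory_kind)
{
  switch (memory_kind) {
    case StoreTarget::SYSMEM:
    case StoreTarget::FBMEM: return kScratchBytes;     // known upper bound
    case StoreTarget::ZCMEM: return std::size_t{0};    // task allocates nothing here
    case StoreTarget::SOCKETMEM: return std::nullopt;  // unbounded: sound, but blocks downstream mapping
  }
  return std::nullopt;
}
```

Returning a tight bound where one is known, and std::nullopt only where it is not, keeps the performance penalty confined to the memory kinds that need it.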
- group mapping
Classes and utilities to control the placement and allocation of tasks on the machine.
Enums
-
enum class TaskTarget : std::uint8_t#
An enum class for task targets.
The enumerators of TaskTarget are ordered by their precedence; i.e., GPU, if available, is chosen over OMP or CPU, and OMP, if available, is chosen over CPU.
Values:
-
enumerator GPU#
Indicates the task be mapped to a GPU.
-
enumerator OMP#
Indicates the task be mapped to an OpenMP processor.
-
enumerator CPU#
Indicates the task be mapped to a CPU.
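The precedence rule for TaskTarget can be illustrated with a small sketch. The enum below is a local stand-in, and pick_target is a hypothetical helper, not a Legate function; the precedence is spelled out explicitly instead of relying on the enumerators' numeric values.

```cpp
#include <algorithm>
#include <cstdint>
#include <initializer_list>
#include <optional>
#include <vector>

// Local stand-in for legate::mapping::TaskTarget.
enum class TaskTarget : std::uint8_t { GPU, OMP, CPU };

// Hypothetical helper: returns the highest-precedence target that is available,
// or std::nullopt when no target is available at all.
std::optional<TaskTarget> pick_target(const std::vector<TaskTarget>& available)
{
  // Candidates are tried in precedence order: GPU, then OMP, then CPU.
  for (TaskTarget candidate : {TaskTarget::GPU, TaskTarget::OMP, TaskTarget::CPU}) {
    if (std::find(available.begin(), available.end(), candidate) != available.end()) {
      return candidate;
    }
  }
  return std::nullopt;
}
```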
-
enum class StoreTarget : std::uint8_t#
An enum class for store targets.
Values:
-
enumerator SYSMEM#
Indicates the store be mapped to the system memory (host memory)
-
enumerator FBMEM#
Indicates the store be mapped to the GPU framebuffer.
-
enumerator ZCMEM#
Indicates the store be mapped to the pinned memory for zero-copy GPU accesses.
-
enumerator SOCKETMEM#
Indicates the store be mapped to the host memory closest to the target CPU.
Functions
- std::ostream &operator<<(std::ostream &stream, const ProcessorRange &range)
- std::ostream &operator<<(std::ostream &stream, const Machine &machine)
- std::ostream &operator<<(std::ostream &stream, const TaskTarget &target)
- std::ostream &operator<<(std::ostream &stream, const StoreTarget &target)
-
class NodeRange#
- #include <legate/mapping/machine.h>
A class to represent a range of nodes.
NodeRanges are half-open intervals of logical node IDs.
-
class ProcessorRange#
- #include <legate/mapping/machine.h>
A class to represent a range of processors.
ProcessorRanges are half-open intervals of logical processor IDs.
Public Functions
-
std::uint32_t count() const noexcept#
Returns the number of processors in the range.
- Returns:
Processor count
-
bool empty() const noexcept#
Checks if the processor range is empty.
- Returns:
true The range is empty
- Returns:
false The range is not empty
-
ProcessorRange slice(std::uint32_t from, std::uint32_t to) const#
Slices the processor range for a given sub-range.
- Parameters:
from – Starting index
to – End index
- Returns:
Sliced processor range
-
NodeRange get_node_range() const#
Computes a range of node IDs for this processor range.
- Returns:
Node range in a pair
-
std::string to_string() const#
Converts the range to a human-readable string.
- Returns:
Processor range in a string
-
constexpr ProcessorRange() = default#
Creates an empty processor range.
- constexpr ProcessorRange(
- std::uint32_t low_id,
- std::uint32_t high_id,
- std::uint32_t per_node_proc_count,
Creates a processor range.
- Parameters:
low_id – Starting processor ID
high_id – End processor ID
per_node_proc_count – Number of per-node processors
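The half-open interval semantics of count(), empty() and slice() can be sketched with a stand-in type. Range below is hypothetical and omits the per-node processor count; the slicing rule (offsets taken relative to the start of the range and clamped to its end) is an assumption made for illustration, not something this page specifies.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for a half-open range [low, high) of logical processor IDs.
struct Range {
  std::uint32_t low{0};
  std::uint32_t high{0};

  // Number of IDs in [low, high); zero when the interval is empty or inverted.
  std::uint32_t count() const { return high > low ? high - low : 0; }
  bool empty() const { return count() == 0; }

  // Assumed semantics: `from` and `to` are offsets relative to `low`,
  // clamped so the result never extends past `high`.
  Range slice(std::uint32_t from, std::uint32_t to) const
  {
    const std::uint32_t new_low  = std::min(low + from, high);
    const std::uint32_t new_high = std::min(low + to, high);
    return new_low < new_high ? Range{new_low, new_high} : Range{};
  }
};
```

Under these assumptions, slicing the range [4, 12) with from = 2 and to = 5 gives [6, 9), and slicing entirely past the end gives an empty range.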
-
class Machine#
- #include <legate/mapping/machine.h>
Machine descriptor class.
A Machine object describes the machine resource that should be used for a given scope of execution. By default, the scope is given the entire machine resource configured for this process. Then, the client can limit the resource by extracting a portion of the machine and setting it for the scope using MachineTracker. Configuring the scope with an empty machine raises a std::runtime_error exception.
Public Functions
-
TaskTarget preferred_target() const#
Preferred processor type of this machine descriptor.
- Returns:
Task target
-
ProcessorRange processor_range() const#
Returns the processor range for the preferred processor type in this descriptor.
- Returns:
A processor range
-
ProcessorRange processor_range(TaskTarget target) const#
Returns the processor range for a given processor type.
If the processor type does not exist in the descriptor, an empty range is returned.
- Parameters:
target – Processor type to query
- Returns:
A processor range
-
const std::vector<TaskTarget> &valid_targets() const#
Returns the valid task targets within this machine descriptor.
- Returns:
Task targets
- std::vector<TaskTarget> valid_targets_except(
- const std::set<TaskTarget> &to_exclude,
Returns the valid task targets excluding a given set of targets.
-
std::uint32_t count(TaskTarget target) const#
Returns the number of processors of a given type.
-
std::string to_string() const#
Converts the machine descriptor to a human-readable string.
- Returns:
Machine descriptor in a string
-
Machine only(TaskTarget target) const#
Extracts the processor range for a given processor type and creates a fresh machine descriptor with it.
If the target does not exist in the machine descriptor, an empty descriptor is returned.
-
Machine only(const std::vector<TaskTarget> &targets) const#
Extracts the processor ranges for a given set of processor types and creates a fresh machine descriptor with them.
Any of the targets that does not exist will be mapped to an empty processor range in the returned machine descriptor.
- Machine slice(
- std::uint32_t from,
- std::uint32_t to,
- TaskTarget target,
- bool keep_others = false,
Slices the processor range for a given processor type.
- Machine slice(
- std::uint32_t from,
- std::uint32_t to,
- bool keep_others = false,
Slices the processor range for the preferred processor type of this machine descriptor.
- Parameters:
from – Starting index
to – End index
keep_others – Optional flag to keep unsliced ranges in the returned machine descriptor
- Returns:
Machine descriptor with the preferred processor range sliced
-
Machine operator[](TaskTarget target) const#
Selects the processor range for a given processor type and constructs a machine descriptor with it.
This yields the same result as .only(target).
-
Machine operator[](const std::vector<TaskTarget> &targets) const#
Selects the processor ranges for a given set of processor types and constructs a machine descriptor with them.
This yields the same result as .only(targets).
-
Machine operator&(const Machine &other) const#
Computes an intersection between two machine descriptors.
-
bool empty() const#
Indicates whether the machine descriptor is empty.
A machine descriptor is empty when all its processor ranges are empty
- Returns:
true The machine descriptor is empty
- Returns:
false The machine descriptor is non-empty
-
class DimOrdering#
- #include <legate/mapping/mapping.h>
A descriptor for dimension ordering.
Public Types
-
enum class Kind : std::uint8_t#
An enum class for kinds of dimension ordering.
Values:
-
enumerator C#
Indicates the instance has C layout (i.e., the last dimension is the leading dimension in the instance)
-
enumerator FORTRAN#
Indicates the instance has Fortran layout (i.e., the first dimension is the leading dimension in the instance)
-
enumerator CUSTOM#
Indicates the order of dimensions of the instance is manually specified.
Public Functions
-
void set_c_order()#
Sets the dimension ordering to C.
-
void set_fortran_order()#
Sets the dimension ordering to Fortran.
-
void set_custom_order(std::vector<std::int32_t> dims)#
Sets a custom dimension ordering.
- Parameters:
dims – A vector that stores the order of dimensions.
-
std::vector<std::int32_t> dimensions() const#
Dimension list. Used only when the kind is CUSTOM.
Public Static Functions
-
static DimOrdering c_order()#
Creates a C ordering object.
- Returns:
A
DimOrdering
object
-
static DimOrdering fortran_order()#
Creates a Fortran ordering object.
- Returns:
A
DimOrdering
object
-
static DimOrdering custom_order(std::vector<std::int32_t> dims)#
Creates a custom ordering object.
- Parameters:
dims – A vector that stores the order of dimensions.
- Returns:
A
DimOrdering
object
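The difference between the C, Fortran and custom orderings can be made concrete by computing element strides for a dense instance. The helpers below are hypothetical and do not use the Legate API; in particular, the convention that an order vector lists dimension indices from fastest-varying to slowest-varying is an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Computes element strides for a dense instance with the given extents.
// `order` lists dimension indices from fastest-varying to slowest-varying
// (an assumed convention for this sketch).
std::vector<std::size_t> strides_for(const std::vector<std::size_t>& extents,
                                     const std::vector<std::int32_t>& order)
{
  std::vector<std::size_t> strides(extents.size(), 0);
  std::size_t stride = 1;
  for (std::int32_t dim : order) {
    const auto d = static_cast<std::size_t>(dim);
    strides[d] = stride;
    stride *= extents[d];
  }
  return strides;
}

// C layout: the last dimension varies fastest, so the order is ndim-1, ..., 0.
std::vector<std::int32_t> c_order_dims(std::int32_t ndim)
{
  std::vector<std::int32_t> order;
  for (std::int32_t d = ndim - 1; d >= 0; --d) { order.push_back(d); }
  return order;
}

// Fortran layout: the first dimension varies fastest, so the order is 0, ..., ndim-1.
std::vector<std::int32_t> fortran_order_dims(std::int32_t ndim)
{
  std::vector<std::int32_t> order;
  for (std::int32_t d = 0; d < ndim; ++d) { order.push_back(d); }
  return order;
}
```

For extents 2 x 3 x 4, the C order gives strides (12, 4, 1) and the Fortran order gives (1, 2, 6); a custom order is any other permutation passed directly to strides_for.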
-
class InstanceMappingPolicy#
- #include <legate/mapping/mapping.h>
A descriptor for instance mapping policy.
Public Functions
-
inline InstanceMappingPolicy &with_target(StoreTarget target) &#
Changes the store target.
- Parameters:
target – A new store target
- Returns:
This instance mapping policy
- inline InstanceMappingPolicy &with_allocation_policy(
- AllocPolicy allocation,
Changes the allocation policy.
- Parameters:
allocation – A new allocation policy
- Returns:
This instance mapping policy
- inline InstanceMappingPolicy &with_instance_layout(
- InstLayout layout,
Changes the instance layout.
- Parameters:
layout – A new instance layout
- Returns:
This instance mapping policy
-
inline InstanceMappingPolicy &with_ordering(DimOrdering ordering) &#
Changes the dimension ordering.
- Parameters:
ordering – A new dimension ordering
- Returns:
This instance mapping policy
-
inline InstanceMappingPolicy &with_exact(bool exact) &#
Changes the value of exact.
- Parameters:
exact – A new value for the exact field
- Returns:
This instance mapping policy
-
inline InstanceMappingPolicy &with_redundant(bool redundant) &#
Changes the value of redundant.
- Parameters:
redundant – A new value for the redundant field
- Returns:
This instance mapping policy
-
inline void set_target(StoreTarget target)#
Changes the store target.
- Parameters:
target – A new store target
-
inline void set_allocation_policy(AllocPolicy allocation)#
Changes the allocation policy.
- Parameters:
allocation – A new allocation policy
-
inline void set_instance_layout(InstLayout layout)#
Changes the instance layout.
- Parameters:
layout – A new instance layout
-
inline void set_ordering(DimOrdering ordering)#
Changes the dimension ordering.
- Parameters:
ordering – A new dimension ordering
-
inline void set_exact(bool exact)#
Changes the value of exact.
- Parameters:
exact – A new value for the exact field
-
inline void set_redundant(bool redundant)#
Changes the value of redundant.
- Parameters:
redundant – A new value for the redundant field
-
bool subsumes(const InstanceMappingPolicy &other) const#
Indicates whether this policy subsumes a given policy.
Policy A subsumes policy B if every instance created under B satisfies A as well.
- Parameters:
other – Policy to check the subsumption against
- Returns:
true If this policy subsumes other
- Returns:
false Otherwise
Public Members
-
StoreTarget target = {StoreTarget::SYSMEM}#
Target memory type for the instance.
-
AllocPolicy allocation = {AllocPolicy::MAY_ALLOC}#
Allocation policy.
-
InstLayout layout = {InstLayout::SOA}#
Instance layout for the instance.
-
DimOrdering ordering = {}#
Dimension ordering for the instance. C order by default.
-
bool exact = {false}#
If true, the instance must be tight to the store(s); i.e., the instance must not have any extra elements not included in the store(s).
-
bool redundant = {false}#
If true, the runtime treats the instance as a redundant copy and marks it as collectible as soon as the consumer task is done using it. When the program accesses a store through several different partitions, setting this flag helps reduce the memory footprint by allowing the runtime to collect redundant instances eagerly.
This flag has no effect when the instance is not freshly created for the task or is used for updates.
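The with_* setters are declared with a trailing & qualifier and return a reference to the policy, which allows calls to be chained on a named object. The sketch below reproduces that pattern on a stand-in type: Policy and make_gpu_policy are hypothetical, only a subset of the fields is mirrored, and the defaults are copied from the member list above.

```cpp
#include <cstdint>

// Local stand-in for legate::mapping::StoreTarget.
enum class StoreTarget : std::uint8_t { SYSMEM, FBMEM, ZCMEM, SOCKETMEM };

// Hypothetical stand-in mirroring a subset of InstanceMappingPolicy.
struct Policy {
  StoreTarget target{StoreTarget::SYSMEM};
  bool exact{false};
  bool redundant{false};

  // Lvalue-qualified setters: each returns *this so that calls can be chained.
  Policy& with_target(StoreTarget t) & { target = t; return *this; }
  Policy& with_exact(bool e) & { exact = e; return *this; }
  Policy& with_redundant(bool r) & { redundant = r; return *this; }
};

// Builds a policy that requests an exact instance in GPU framebuffer memory.
Policy make_gpu_policy()
{
  Policy policy;  // a named object, because the setters require an lvalue
  policy.with_target(StoreTarget::FBMEM).with_exact(true);
  return policy;
}
```

Fields that are not set explicitly keep their defaults, so redundant remains false in the policy returned here.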
-
class StoreMapping#
- #include <legate/mapping/mapping.h>
A mapping policy for stores.
Public Functions
-
InstanceMappingPolicy &policy()#
Returns the instance mapping policy of this StoreMapping object.
- Returns:
A reference to the InstanceMappingPolicy object
-
const InstanceMappingPolicy &policy() const#
Returns the instance mapping policy of this StoreMapping object.
- Returns:
A reference to the InstanceMappingPolicy object
-
Store store() const#
Returns the store for which this StoreMapping object describes a mapping policy.
If the policy is for multiple stores, the first store added to this policy will be returned.
- Returns:
A Store object
-
std::vector<Store> stores() const#
Returns all the stores for which this StoreMapping object describes a mapping policy.
- Returns:
A vector of Store objects
-
void add_store(const Store &store)#
Adds a store to this StoreMapping object.
- Parameters:
store – Store to add
Public Static Functions
- static StoreMapping default_mapping(
- const Store &store,
- StoreTarget target,
- bool exact = false,
Creates a mapping policy for the given store following the default mapping policy.
- Parameters:
store – Target store
target – Kind of the memory to which the store should be mapped
exact – Indicates whether the instance should be exact
- Returns:
A store mapping
- static StoreMapping create(
- const Store &store,
- InstanceMappingPolicy &&policy,
Creates a mapping policy for the given store using the instance mapping policy.
- Parameters:
store – Target store for the mapping policy
policy – Instance mapping policy to apply
- Returns:
A store mapping
- static StoreMapping create(
- const std::vector<Store> &stores,
- InstanceMappingPolicy &&policy,
Creates a mapping policy for the given set of stores using the instance mapping policy.
- Parameters:
stores – Target stores for the mapping policy
policy – Instance mapping policy to apply
- Returns:
A store mapping
-
class ReleaseKey#
-
class MachineQueryInterface#
- #include <legate/mapping/mapping.h>
An abstract class that defines machine query APIs.
Subclassed by legate::mapping::detail::BaseMapper
Public Functions
-
virtual const std::vector<Processor> &cpus() const = 0#
Returns local CPUs.
- Returns:
A vector of processors
-
virtual const std::vector<Processor> &gpus() const = 0#
Returns local GPUs.
- Returns:
A vector of processors
-
virtual const std::vector<Processor> &omps() const = 0#
Returns local OpenMP processors.
- Returns:
A vector of processors
-
virtual std::uint32_t total_nodes() const = 0#
Returns the total number of nodes.
- Returns:
Total number of nodes
-
class Mapper#
- #include <legate/mapping/mapping.h>
An abstract class that defines Legate mapping APIs.
The APIs give Legate libraries high-level control over task and store mappings.
Subclassed by legate::experimental::io::detail::Mapper, legate::mapping::detail::CoreMapper, legate::mapping::detail::DefaultMapper
Public Functions
- virtual std::vector<StoreMapping> store_mappings(
- const Task &task,
- const std::vector<StoreTarget> &options,
Chooses mapping policies for the task’s stores.
Store mappings can be underspecified; any store of the task that doesn’t have a mapping policy will fall back to the default one.
- Parameters:
task – Task to map
options – Types of memories to which the stores can be mapped
- Returns:
A vector of store mappings
- virtual std::optional<std::size_t> allocation_pool_size(
- const Task &task,
- StoreTarget memory_kind,
Returns an upper bound for the amount of memory (in bytes), of a particular memory type, allocated by a task via Legate allocators.
All buffers created by create_buffer or create_output_buffer calls are drawn from this allocation pool, and their aggregate size cannot exceed the upper bound returned from this call (the program will crash otherwise). Any out-of-band memory allocations (e.g., those created by malloc or cudaMalloc) invisible to Legate are not subject to this pool bound.
This callback is invoked only for task variants that are registered with has_allocations set to true.
-
class Task#
- #include <legate/mapping/operation.h>
A metadata class for tasks.
Public Functions
-
std::vector<Array> inputs() const#
Returns metadata for the task’s input arrays.
- Returns:
Vector of array metadata objects
-
std::vector<Array> outputs() const#
Returns metadata for the task’s output arrays.
- Returns:
Vector of array metadata objects
-
std::vector<Array> reductions() const#
Returns metadata for the task’s reduction arrays.
- Returns:
Vector of array metadata objects
-
std::vector<Scalar> scalars() const#
Returns the vector of the task’s by-value arguments. Unlike mapping::Array objects that have no access to data in the arrays, the returned Scalar objects contain valid arguments to the task.
- Returns:
Vector of Scalar objects
-
Array input(std::uint32_t index) const#
Returns metadata for the task’s input array.
- Parameters:
index – Index of the input array
- Returns:
Array metadata object
-
Array output(std::uint32_t index) const#
Returns metadata for the task’s output array.
- Parameters:
index – Index of the output array
- Returns:
Array metadata object
-
Array reduction(std::uint32_t index) const#
Returns metadata for the task’s reduction array.
- Parameters:
index – Index of the reduction array
- Returns:
Array metadata object
-
Scalar scalar(std::uint32_t index) const#
Returns a by-value argument of the task.
- Parameters:
index – Index of the scalar
- Returns:
-
std::size_t num_inputs() const#
Returns the number of task’s inputs.
- Returns:
Number of arrays
-
std::size_t num_outputs() const#
Returns the number of task’s outputs.
- Returns:
Number of arrays
-
std::size_t num_reductions() const#
Returns the number of task’s reductions.
- Returns:
Number of arrays
-
bool is_single_task() const#
Indicates whether the task is a single task rather than one in a set of multiple parallel tasks.
- Returns:
true The task is a single task
- Returns:
false The task is one in a set of multiple parallel tasks
-
class Store#
- #include <legate/mapping/store.h>
A metadata class that mirrors the structure of legate::PhysicalStore but contains only the data relevant to mapping.
Public Functions
-
bool is_future() const#
Indicates whether the store is backed by a future.
- Returns:
true The store is backed by a future
- Returns:
false The store is backed by a region field
-
bool unbound() const#
Indicates whether the store is unbound.
- Returns:
true The store is unbound
- Returns:
false The store is a normal store
-
bool is_reduction() const#
Indicates whether the store is a reduction store.
- Returns:
true The store is a reduction store
- Returns:
false The store is either an input or output store
-
GlobalRedopID redop() const#
Returns the reduction operator id for the store.
- Returns:
Reduction operator id
-
bool can_colocate_with(const Store &other) const#
Indicates whether the store can colocate in an instance with a given store.
- Parameters:
other – Store against which the colocation is checked
- Returns:
true The store can colocate with the input
- Returns:
false The store cannot colocate with the input