Mapping#

Mapping tasks that create buffers during execution#

From the 25.01 release, a task variant that creates temporary or output buffers during execution requires the following two steps for correct mapping:

  • The variant should be registered with a VariantOptions with the has_allocations field set to true (the default value is false).

  • The mapper should return an upper bound of the total size of allocations in the allocation_pool_size() call. The mapper can choose to give an “unbounded” allocation pool by returning std::nullopt. This is always a sound answer for the mapper to give, but it incurs a performance penalty: the mapping of any downstream tasks that create fresh allocations is blocked.

The allocation pool size is specific to each kind of memory to which the executing processor has affinity, and the mapper is queried once for each memory kind. The runtime does not call allocation_pool_size() for task variants registered with has_allocations being false.
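
As a minimal sketch of the second step, the decision a mapper makes in allocation_pool_size() can be modeled with a free function. The StoreTarget enum below is a local stand-in that mirrors the enumerators documented in this section, and pool_size_for, scratch_bytes, and bound_known are hypothetical names used only for illustration; a real mapper would derive the bound from the task being mapped. The sketch also assumes that the task allocates nothing in zero-copy memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Local stand-in mirroring the documented enum; this is not the Legate header.
enum class StoreTarget : std::uint8_t { SYSMEM, FBMEM, ZCMEM, SOCKETMEM };

// Hypothetical helper that an allocation_pool_size() override could delegate to.
// `scratch_bytes` is an assumed, task-specific upper bound on the buffers the
// task creates; `bound_known` says whether such a bound exists at all.
std::optional<std::size_t> pool_size_for(StoreTarget memory_kind,
                                         std::size_t scratch_bytes,
                                         bool bound_known)
{
  // No bound available: request an unbounded pool. This is always sound, but
  // it blocks the mapping of downstream tasks that create fresh allocations.
  if (!bound_known) return std::nullopt;

  // The query arrives once per memory kind, so answer for each kind separately.
  switch (memory_kind) {
    case StoreTarget::SYSMEM:
    case StoreTarget::SOCKETMEM:
    case StoreTarget::FBMEM: return scratch_bytes;
    case StoreTarget::ZCMEM: return 0;  // assumption: nothing allocated here
  }
  return std::nullopt;  // unreachable for valid enumerators
}
```

Returning a tight bound whenever one is known keeps downstream mapping unblocked; std::nullopt is the fallback when the size depends on data the mapper cannot see.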

group mapping

Classes and utilities to control the placement and allocation of tasks on the machine.

Enums

enum class TaskTarget : std::uint8_t#

An enum class for task targets.

The enumerators of TaskTarget are ordered by their precedence; i.e., GPU, if available, is chosen over OMP or CPU, and OMP, if available, is chosen over CPU.

Values:

enumerator GPU#

Indicates the task be mapped to a GPU.

enumerator OMP#

Indicates the task be mapped to an OpenMP processor.

enumerator CPU#

Indicates the task be mapped to a CPU.
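
The precedence rule can be illustrated with a short sketch. It uses a local stand-in for TaskTarget and assumes that the enumerator values follow the listing order above (GPU, then OMP, then CPU), so that a smaller value means higher precedence. The helper pick_preferred is hypothetical and not part of the documented API.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <vector>

// Local stand-in for the documented enum. Assumption: values follow the
// listing order, so a smaller value means higher precedence.
enum class TaskTarget : std::uint8_t { GPU, OMP, CPU };

// Hypothetical helper: returns the highest-precedence target among those
// available, or std::nullopt when no target is available.
std::optional<TaskTarget> pick_preferred(const std::vector<TaskTarget>& available)
{
  if (available.empty()) return std::nullopt;
  // Scoped enums compare by underlying value, so the minimum wins.
  return *std::min_element(available.begin(), available.end());
}
```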

enum class StoreTarget : std::uint8_t#

An enum class for store targets.

Values:

enumerator SYSMEM#

Indicates the store be mapped to the system memory (host memory).

enumerator FBMEM#

Indicates the store be mapped to the GPU framebuffer.

enumerator ZCMEM#

Indicates the store be mapped to the pinned memory for zero-copy GPU accesses.

enumerator SOCKETMEM#

Indicates the store be mapped to the host memory closest to the target CPU.

enum class AllocPolicy : std::uint8_t#

An enum class for instance allocation policies.

Values:

enumerator MAY_ALLOC#

Indicates the store can reuse an existing instance.

enumerator MUST_ALLOC#

Indicates the store must be mapped to a fresh instance.

enum class InstLayout : std::uint8_t#

An enum class for instance layouts.

Values:

enumerator SOA#

Indicates the store must be mapped to an SOA instance.

enumerator AOS#

Indicates the store must be mapped to an AOS instance. This is no different from SOA in a store mapping for a single store.

Functions

std::ostream &operator<<(
std::ostream &stream,
const ProcessorRange &range,
)#
std::ostream &operator<<(
std::ostream &stream,
const Machine &machine,
)#
std::ostream &operator<<(
std::ostream &stream,
const TaskTarget &target,
)#
std::ostream &operator<<(
std::ostream &stream,
const StoreTarget &target,
)#
class NodeRange#
#include <legate/mapping/machine.h>

A class to represent a range of nodes.

NodeRanges are half-open intervals of logical node IDs.

class ProcessorRange#
#include <legate/mapping/machine.h>

A class to represent a range of processors.

ProcessorRanges are half-open intervals of logical processor IDs.

Public Functions

std::uint32_t count() const noexcept#

Returns the number of processors in the range.

Returns:

Processor count

bool empty() const noexcept#

Checks if the processor range is empty.

Returns:

true The range is empty

Returns:

false The range is not empty

ProcessorRange slice(std::uint32_t from, std::uint32_t to) const#

Slices the processor range for a given sub-range.

Parameters:
  • from – Starting index

  • to – End index

Returns:

Sliced processor range

NodeRange get_node_range() const#

Computes a range of node IDs for this processor range.

Returns:

Node range in a pair

std::string to_string() const#

Converts the range to a human-readable string.

Returns:

Processor range in a string

constexpr ProcessorRange() = default#

Creates an empty processor range.

constexpr ProcessorRange(
std::uint32_t low_id,
std::uint32_t high_id,
std::uint32_t per_node_proc_count,
) noexcept#

Creates a processor range.

Parameters:
  • low_id – Starting processor ID

  • high_id – End processor ID

  • per_node_proc_count – Number of per-node processors

Public Members

std::uint32_t low = {0}#

Starting processor ID.

std::uint32_t high = {0}#

End processor ID.

std::uint32_t per_node_count = {1}#

Number of per-node processors.
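
The half-open interval semantics can be sketched with a simplified stand-in. RangeSketch below is not the Legate implementation; in particular, it assumes that the from and to arguments of slice() are offsets relative to the start of the range and that they are clamped to the end of the range.

```cpp
#include <algorithm>
#include <cstdint>

// Simplified stand-in for the documented class; not the Legate implementation.
struct RangeSketch {
  std::uint32_t low{0};             // first processor ID in the range
  std::uint32_t high{0};            // one past the last processor ID
  std::uint32_t per_node_count{1};  // processors per node

  constexpr RangeSketch() = default;
  constexpr RangeSketch(std::uint32_t low_id,
                        std::uint32_t high_id,
                        std::uint32_t per_node_proc_count) noexcept
    : low{low_id}, high{high_id}, per_node_count{per_node_proc_count}
  {
  }

  // A half-open interval [low, high) holds high - low processors.
  constexpr std::uint32_t count() const noexcept { return high > low ? high - low : 0; }
  constexpr bool empty() const noexcept { return high <= low; }

  // Assumption: `from` and `to` are offsets from `low`, clamped to `high`.
  RangeSketch slice(std::uint32_t from, std::uint32_t to) const
  {
    const std::uint32_t new_low  = std::min(low + from, high);
    const std::uint32_t new_high = std::min(low + to, high);
    return RangeSketch{new_low, std::max(new_low, new_high), per_node_count};
  }
};
```

Under these assumptions, RangeSketch{4, 12, 4} holds 8 processors, and slice(2, 5) on it yields the interval [6, 9).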

class Machine#
#include <legate/mapping/machine.h>

Machine descriptor class.

A Machine object describes the machine resource that should be used for a given scope of execution. By default, the scope is given the entire machine resource configured for this process. Then, the client can limit the resource by extracting a portion of the machine and setting it for the scope using MachineTracker. Configuring the scope with an empty machine raises a std::runtime_error exception.

Public Functions

TaskTarget preferred_target() const#

Preferred processor type of this machine descriptor.

Returns:

Task target

ProcessorRange processor_range() const#

Returns the processor range for the preferred processor type in this descriptor.

Returns:

A processor range

ProcessorRange processor_range(TaskTarget target) const#

Returns the processor range for a given processor type.

If the processor type does not exist in the descriptor, an empty range is returned.

Parameters:

target – Processor type to query

Returns:

A processor range

const std::vector<TaskTarget> &valid_targets() const#

Returns the valid task targets within this machine descriptor.

Returns:

Task targets

std::vector<TaskTarget> valid_targets_except(
const std::set<TaskTarget> &to_exclude,
) const#

Returns the valid task targets excluding a given set of targets.

Parameters:

to_exclude – Task targets to exclude from the query

Returns:

Task targets

std::uint32_t count() const#

Returns the number of preferred processors.

Returns:

Processor count

std::uint32_t count(TaskTarget target) const#

Returns the number of processors of a given type.

Parameters:

target – Processor type to query

Returns:

Processor count

std::string to_string() const#

Converts the machine descriptor to a human-readable string.

Returns:

Machine descriptor in a string

Machine only(TaskTarget target) const#

Extracts the processor range for a given processor type and creates a fresh machine descriptor with it.

If the target does not exist in the machine descriptor, an empty descriptor is returned.

Parameters:

target – Processor type to select

Returns:

Machine descriptor with the chosen processor range

Machine only(const std::vector<TaskTarget> &targets) const#

Extracts the processor ranges for a given set of processor types and creates a fresh machine descriptor with them.

Any target that does not exist will be mapped to an empty processor range in the returned machine descriptor.

Parameters:

targets – Processor types to select

Returns:

Machine descriptor with the chosen processor ranges

Machine slice(
std::uint32_t from,
std::uint32_t to,
TaskTarget target,
bool keep_others = false,
) const#

Slices the processor range for a given processor type.

Parameters:
  • from – Starting index

  • to – End index

  • target – Processor type to slice

  • keep_others – Optional flag to keep unsliced ranges in the returned machine descriptor

Returns:

Machine descriptor with the chosen processor range sliced

Machine slice(
std::uint32_t from,
std::uint32_t to,
bool keep_others = false,
) const#

Slices the processor range for the preferred processor type of this machine descriptor.

Parameters:
  • from – Starting index

  • to – End index

  • keep_others – Optional flag to keep unsliced ranges in the returned machine descriptor

Returns:

Machine descriptor with the preferred processor range sliced

Machine operator[](TaskTarget target) const#

Selects the processor range for a given processor type and constructs a machine descriptor with it.

This yields the same result as .only(target).

Parameters:

target – Processor type to select

Returns:

Machine descriptor with the chosen processor range

Machine operator[](const std::vector<TaskTarget> &targets) const#

Selects the processor ranges for a given set of processor types and constructs a machine descriptor with them.

This yields the same result as .only(targets).

Parameters:

targets – Processor types to select

Returns:

Machine descriptor with the chosen processor ranges

Machine operator&(const Machine &other) const#

Computes an intersection between two machine descriptors.

Parameters:

other – Machine descriptor to intersect with this descriptor

Returns:

Machine descriptor

bool empty() const#

Indicates whether the machine descriptor is empty.

A machine descriptor is empty when all its processor ranges are empty.

Returns:

true The machine descriptor is empty

Returns:

false The machine descriptor is non-empty
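
The documented behavior of only() and empty() can be modeled with a toy descriptor. MachineSketch below is a hypothetical stand-in that stores one half-open [low, high) range of processor IDs per target; it is not how Legate represents a Machine, and it exists only to show the two rules stated above: selecting a missing target yields an empty descriptor, and a descriptor is empty when all of its ranges are empty.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Local stand-in for the documented enum; this is not the Legate header.
enum class TaskTarget : std::uint8_t { GPU, OMP, CPU };

// Toy model: each target maps to a half-open [low, high) range of processor IDs.
using Range = std::pair<std::uint32_t, std::uint32_t>;

struct MachineSketch {
  std::map<TaskTarget, Range> ranges;

  // Mirrors the documented only(): keeps a single target. A target that is
  // not present yields a descriptor with no ranges, which is empty.
  MachineSketch only(TaskTarget target) const
  {
    MachineSketch result;
    const auto it = ranges.find(target);
    if (it != ranges.end()) result.ranges.insert(*it);
    return result;
  }

  // Mirrors the documented empty(): true when every range is empty.
  bool empty() const
  {
    for (const auto& entry : ranges) {
      if (entry.second.first < entry.second.second) return false;
    }
    return true;
  }
};
```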

class DimOrdering#
#include <legate/mapping/mapping.h>

A descriptor for dimension ordering.

Public Types

enum class Kind : std::uint8_t#

An enum class for kinds of dimension ordering.

Values:

enumerator C#

Indicates the instance has C layout (i.e., the last dimension is the leading dimension in the instance).

enumerator FORTRAN#

Indicates the instance has Fortran layout (i.e., the first dimension is the leading dimension in the instance).

enumerator CUSTOM#

Indicates the order of dimensions of the instance is manually specified.

Public Functions

void set_c_order()#

Sets the dimension ordering to C.

void set_fortran_order()#

Sets the dimension ordering to Fortran.

void set_custom_order(std::vector<std::int32_t> dims)#

Sets a custom dimension ordering.

Parameters:

dims – A vector that stores the order of dimensions.

Kind kind() const#

Dimension ordering type.

std::vector<std::int32_t> dimensions() const#

Dimension list. Used only when the kind is CUSTOM.

Public Static Functions

static DimOrdering c_order()#

Creates a C ordering object.

Returns:

A DimOrdering object

static DimOrdering fortran_order()#

Creates a Fortran ordering object.

Returns:

A DimOrdering object

static DimOrdering custom_order(std::vector<std::int32_t> dims)#

Creates a custom ordering object.

Parameters:

dims – A vector that stores the order of dimensions.

Returns:

A DimOrdering object
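
To make the difference between the C and Fortran kinds concrete, the following sketch computes element strides for a dense array under each layout. It is independent of Legate and interprets "leading dimension" as the dimension with unit stride, which is an assumption about the wording above. The function name dense_strides is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Strides (in elements) for a dense array of the given shape.
// C order: the last dimension has stride 1.
// Fortran order: the first dimension has stride 1.
std::vector<std::size_t> dense_strides(const std::vector<std::size_t>& shape, bool fortran)
{
  const std::size_t ndim = shape.size();
  std::vector<std::size_t> strides(ndim, 1);
  std::size_t running = 1;
  // Visit dimensions from fastest-varying to slowest-varying.
  for (std::size_t i = 0; i < ndim; ++i) {
    const std::size_t dim = fortran ? i : ndim - 1 - i;
    strides[dim] = running;
    running *= shape[dim];
  }
  return strides;
}
```

For a shape of {2, 3, 4}, this gives strides {12, 4, 1} in C order and {1, 2, 6} in Fortran order.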

class InstanceMappingPolicy#
#include <legate/mapping/mapping.h>

A descriptor for instance mapping policy.

Public Functions

inline InstanceMappingPolicy &with_target(StoreTarget target) &#

Changes the store target.

Parameters:

target – A new store target

Returns:

This instance mapping policy

inline InstanceMappingPolicy &with_allocation_policy(
AllocPolicy allocation,
) &#

Changes the allocation policy.

Parameters:

allocation – A new allocation policy

Returns:

This instance mapping policy

inline InstanceMappingPolicy &with_instance_layout(
InstLayout layout,
) &#

Changes the instance layout.

Parameters:

layout – A new instance layout

Returns:

This instance mapping policy

inline InstanceMappingPolicy &with_ordering(DimOrdering ordering) &#

Changes the dimension ordering.

Parameters:

ordering – A new dimension ordering

Returns:

This instance mapping policy

inline InstanceMappingPolicy &with_exact(bool exact) &#

Changes the value of exact.

Parameters:

exact – A new value for the exact field

Returns:

This instance mapping policy

inline InstanceMappingPolicy &with_redundant(bool redundant) &#

Changes the value of redundant.

Parameters:

redundant – A new value for the redundant field

Returns:

This instance mapping policy

inline void set_target(StoreTarget target)#

Changes the store target.

Parameters:

target – A new store target

inline void set_allocation_policy(AllocPolicy allocation)#

Changes the allocation policy.

Parameters:

allocation – A new allocation policy

inline void set_instance_layout(InstLayout layout)#

Changes the instance layout.

Parameters:

layout – A new instance layout

inline void set_ordering(DimOrdering ordering)#

Changes the dimension ordering.

Parameters:

ordering – A new dimension ordering

inline void set_exact(bool exact)#

Changes the value of exact.

Parameters:

exact – A new value for the exact field

inline void set_redundant(bool redundant)#

Changes the value of redundant.

Parameters:

redundant – A new value for the redundant field

bool subsumes(const InstanceMappingPolicy &other) const#

Indicates whether this policy subsumes a given policy.

Policy A subsumes policy B, if every instance created under B satisfies A as well.

Parameters:

other – Policy to check the subsumption against

Returns:

true If this policy subsumes other

Returns:

false Otherwise

Public Members

StoreTarget target = {StoreTarget::SYSMEM}#

Target memory type for the instance.

AllocPolicy allocation = {AllocPolicy::MAY_ALLOC}#

Allocation policy.

InstLayout layout = {InstLayout::SOA}#

Instance layout for the instance.

DimOrdering ordering = {}#

Dimension ordering for the instance. C order by default.

bool exact = {false}#

If true, the instance must be tight to the store(s); i.e., the instance must not have any extra elements not included in the store(s).

bool redundant = {false}#

If true, the runtime treats the instance as a redundant copy and marks it as collectible as soon as the consumer task is done using it. When the program accesses a store through several different partitions, setting this flag helps reduce the memory footprint by allowing the runtime to collect redundant instances eagerly.

This flag has no effect when the instance is not freshly created for the task or is used for updates.
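
The with_* member functions return a reference to the policy, so calls can be chained on a named object. The sketch below shows the pattern with a reduced stand-in: PolicySketch models only three of the documented fields with their documented defaults, the enums are local copies of the documented enumerators, and make_framebuffer_policy is a hypothetical helper.

```cpp
#include <cstdint>

// Local stand-ins mirroring the documented enums; not the Legate headers.
enum class StoreTarget : std::uint8_t { SYSMEM, FBMEM, ZCMEM, SOCKETMEM };
enum class AllocPolicy : std::uint8_t { MAY_ALLOC, MUST_ALLOC };

// Reduced stand-in with the documented defaults; only three fields are modeled.
struct PolicySketch {
  StoreTarget target{StoreTarget::SYSMEM};
  AllocPolicy allocation{AllocPolicy::MAY_ALLOC};
  bool exact{false};

  // The trailing & restricts these overloads to lvalues, as in the signatures above.
  PolicySketch& with_target(StoreTarget t) & { target = t; return *this; }
  PolicySketch& with_allocation_policy(AllocPolicy a) & { allocation = a; return *this; }
  PolicySketch& with_exact(bool e) & { exact = e; return *this; }
};

// Hypothetical helper: a policy for a tight, freshly allocated framebuffer instance.
PolicySketch make_framebuffer_policy()
{
  PolicySketch policy;
  policy.with_target(StoreTarget::FBMEM)
    .with_allocation_policy(AllocPolicy::MUST_ALLOC)
    .with_exact(true);
  return policy;
}
```

The set_* functions documented above change the same fields but return void, so they cannot be chained.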

class StoreMapping#
#include <legate/mapping/mapping.h>

A mapping policy for stores.

Public Functions

InstanceMappingPolicy &policy()#

Returns the instance mapping policy of this StoreMapping object.

Returns:

A reference to the InstanceMappingPolicy object

const InstanceMappingPolicy &policy() const#

Returns the instance mapping policy of this StoreMapping object.

Returns:

A reference to the InstanceMappingPolicy object

Store store() const#

Returns the store for which this StoreMapping object describes a mapping policy.

If the policy is for multiple stores, the first store added to this policy will be returned.

Returns:

A Store object

std::vector<Store> stores() const#

Returns all the stores for which this StoreMapping object describes a mapping policy.

Returns:

A vector of Store objects

void add_store(const Store &store)#

Adds a store to this StoreMapping object.

Parameters:

store – Store to add

Public Static Functions

static StoreMapping default_mapping(
const Store &store,
StoreTarget target,
bool exact = false,
)#

Creates a mapping policy for the given store following the default mapping policy.

Parameters:
  • store – Target store

  • target – Kind of the memory to which the store should be mapped

  • exact – Indicates whether the instance should be exact

Returns:

A store mapping

static StoreMapping create(
const Store &store,
InstanceMappingPolicy &&policy,
)#

Creates a mapping policy for the given store using the instance mapping policy.

Parameters:
  • store – Target store for the mapping policy

  • policy – Instance mapping policy to apply

Returns:

A store mapping

static StoreMapping create(
const std::vector<Store> &stores,
InstanceMappingPolicy &&policy,
)#

Creates a mapping policy for the given set of stores using the instance mapping policy.

Parameters:
  • stores – Target stores for the mapping policy

  • policy – Instance mapping policy to apply

Returns:

A store mapping

class ReleaseKey#
class MachineQueryInterface#
#include <legate/mapping/mapping.h>

An abstract class that defines machine query APIs.

Subclassed by legate::mapping::detail::BaseMapper

Public Functions

virtual const std::vector<Processor> &cpus() const = 0#

Returns local CPUs.

Returns:

A vector of processors

virtual const std::vector<Processor> &gpus() const = 0#

Returns local GPUs.

Returns:

A vector of processors

virtual const std::vector<Processor> &omps() const = 0#

Returns local OpenMP processors.

Returns:

A vector of processors

virtual std::uint32_t total_nodes() const = 0#

Returns the total number of nodes.

Returns:

Total number of nodes

class Mapper#
#include <legate/mapping/mapping.h>

An abstract class that defines Legate mapping APIs.

The APIs give Legate libraries high-level control over task and store mappings.

Subclassed by legate::experimental::io::detail::Mapper, legate::mapping::detail::CoreMapper, legate::mapping::detail::DefaultMapper

Public Functions

virtual std::vector<StoreMapping> store_mappings(
const Task &task,
const std::vector<StoreTarget> &options,
) = 0#

Chooses mapping policies for the task’s stores.

Store mappings can be underspecified; any store of the task that doesn’t have a mapping policy will fall back to the default one.

Parameters:
  • task – Task to map

  • options – Types of memories to which the stores can be mapped

Returns:

A vector of store mappings

virtual std::optional<std::size_t> allocation_pool_size(
const Task &task,
StoreTarget memory_kind,
) = 0#

Returns an upper bound for the amount of memory (in bytes), of a particular memory type, allocated by a task via Legate allocators.

All buffers created by create_buffer or create_output_buffer calls are drawn from this allocation pool, and their aggregate size cannot exceed the upper bound returned from this call (the program will crash otherwise). Any out-of-band memory allocations (e.g., those created by malloc or cudaMalloc) invisible to Legate are not subject to this pool bound.

This callback is invoked only for task variants that are registered with has_allocations being true.

Parameters:
  • task – Task to map

  • memory_kind – Type of memory in which the memory pool is created

Returns:

A memory pool size; returning std::nullopt means the total size is unknown.

virtual Scalar tunable_value(TunableID tunable_id) = 0#

Returns a tunable value.

Parameters:

tunable_id – a tunable value id

Returns:

A tunable value in a Scalar object

class Task#
#include <legate/mapping/operation.h>

A metadata class for tasks.

Public Functions

LocalTaskID task_id() const#

Returns the task id.

Returns:

Task id

std::vector<Array> inputs() const#

Returns metadata for the task’s input arrays.

Returns:

Vector of array metadata objects

std::vector<Array> outputs() const#

Returns metadata for the task’s output arrays.

Returns:

Vector of array metadata objects

std::vector<Array> reductions() const#

Returns metadata for the task’s reduction arrays.

Returns:

Vector of array metadata objects

std::vector<Scalar> scalars() const#

Returns the vector of the task’s by-value arguments. Unlike mapping::Array objects that have no access to data in the arrays, the returned Scalar objects contain valid arguments to the task.

Returns:

Vector of Scalar objects

Array input(std::uint32_t index) const#

Returns metadata for the task’s input array.

Parameters:

index – Index of the input array

Returns:

Array metadata object

Array output(std::uint32_t index) const#

Returns metadata for the task’s output array.

Parameters:

index – Index of the output array

Returns:

Array metadata object

Array reduction(std::uint32_t index) const#

Returns metadata for the task’s reduction array.

Parameters:

index – Index of the reduction array

Returns:

Array metadata object

Scalar scalar(std::uint32_t index) const#

Returns a by-value argument of the task.

Parameters:

index – Index of the scalar

Returns:

Scalar

std::size_t num_inputs() const#

Returns the number of the task’s inputs.

Returns:

Number of arrays

std::size_t num_outputs() const#

Returns the number of the task’s outputs.

Returns:

Number of arrays

std::size_t num_reductions() const#

Returns the number of the task’s reductions.

Returns:

Number of arrays

std::size_t num_scalars() const#

Returns the number of Scalars.

Returns:

Number of Scalars

bool is_single_task() const#

Indicates whether the task is a single task, i.e., not one of a set of parallel tasks.

Returns:

true The task is a single task

Returns:

false The task is one in a set of multiple parallel tasks

const Domain &get_launch_domain() const#

Returns the launch domain.

Returns:

Launch domain

class Store#
#include <legate/mapping/store.h>

A metadata class that mirrors the structure of legate::PhysicalStore but contains only the data relevant to mapping.

Public Functions

bool is_future() const#

Indicates whether the store is backed by a future.

Returns:

true The store is backed by a future

Returns:

false The store is backed by a region field

bool unbound() const#

Indicates whether the store is unbound.

Returns:

true The store is unbound

Returns:

false The store is a normal store

std::uint32_t dim() const#

Returns the store’s dimension.

Returns:

Store’s dimension

bool is_reduction() const#

Indicates whether the store is a reduction store.

Returns:

true The store is a reduction store

Returns:

false The store is either an input or output store

GlobalRedopID redop() const#

Returns the reduction operator id for the store.

Returns:

Reduction operator id

bool can_colocate_with(const Store &other) const#

Indicates whether the store can colocate in an instance with a given store.

Parameters:

other – Store against which the colocation is checked

Returns:

true The store can colocate with the input

Returns:

false The store cannot colocate with the input

template<std::int32_t DIM>
Rect<DIM> shape() const#

Returns the store’s domain.

Returns:

Store’s domain

Domain domain() const#

Returns the store’s domain in a dimension-erased domain type.

Returns:

Store’s domain in a dimension-erased domain type