partitioning#

group Partitioning

Enums

enum class ImageComputationHint : std::uint8_t#

Hints to the runtime about how an image computation may be performed.

Values:

enumerator NO_HINT#

A precise image of the function is needed

enumerator MIN_MAX#

An approximate image of the function using bounding boxes is sufficient

enumerator FIRST_LAST#

Elements in the function store are sorted, so the bounding box can be computed using only the first and last elements

Functions

Constraint align(Variable lhs, Variable rhs)#

Creates an alignment constraint on two variables.

An alignment constraint between variables x and y indicates to the runtime that the PhysicalStores (leaf-task-local portions, typically equal-size tiles) of the LogicalStores corresponding to x and y must have the same global indices (i.e. the Stores must “align” with one another).

This is commonly used for element-wise operations. For example, consider an element-wise addition (z = x + y), where each array is 100 elements long. Each leaf task must receive the same local tile of all 3 arrays: with four leaf tasks, task 0 receives indices 0 - 24, task 1 receives 25 - 49, task 2 receives 50 - 74, and task 3 receives 75 - 99.

Parameters:
  • lhs – LHS variable

  • rhs – RHS variable

Returns:

Alignment constraint
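
As a minimal sketch, the constraint might be attached to a manually created task like this (the library handle, task ID, and stores are hypothetical; the task-creation calls are documented elsewhere in this reference):

  // Hypothetical element-wise addition: z = x + y.
  auto* runtime = legate::Runtime::get_runtime();
  auto task     = runtime->create_task(library, ELEMWISE_ADD);

  auto part_x = task.add_input(x);   // partition symbol for x
  auto part_y = task.add_input(y);   // partition symbol for y
  auto part_z = task.add_output(z);  // partition symbol for z

  // All three stores must be split into identical tiles.
  task.add_constraint(legate::align(part_x, part_y));
  task.add_constraint(legate::align(part_x, part_z));

  runtime->submit(std::move(task));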

ProxyConstraint align(
std::variant<ProxyArrayArgument, ProxyInputArguments, ProxyOutputArguments, ProxyReductionArguments> left,
std::variant<ProxyArrayArgument, ProxyInputArguments, ProxyOutputArguments, ProxyReductionArguments> right
)#

Construct an alignment constraint descriptor from a pair of proxy objects.

This routine may be used to describe an alignment constraint between prospective arguments to a task. For example:

legate::align(legate::proxy::inputs, legate::proxy::outputs[0])

dictates that all inputs should be aligned with output 0. Similarly,

legate::align(legate::proxy::inputs[0], legate::proxy::inputs[1])

dictates that inputs 0 and 1 of the task should be aligned.

Parameters:
  • left – The left operand to the alignment constraint.

  • right – The right operand to the alignment constraint.

Returns:

The alignment descriptor.
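
As a sketch of where these descriptors are consumed, a proxy constraint can be attached to a task's signature at declaration time. The task class, task ID, and argument counts below are hypothetical; TaskSignature and TaskConfig are documented elsewhere in this reference:

  class ElemwiseAdd : public legate::LegateTask<ElemwiseAdd> {
   public:
    static inline const auto TASK_CONFIG =
      legate::TaskConfig{legate::LocalTaskID{0}}.with_signature(
        legate::TaskSignature{}
          .inputs(2)
          .outputs(1)
          // All inputs must be aligned with output 0.
          .constraints({{legate::align(legate::proxy::inputs,
                                       legate::proxy::outputs[0])}}));

    static void cpu_variant(legate::TaskContext context);
  };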

ProxyConstraint align(ProxyInputArguments proxies)#

Construct an alignment constraint descriptor for all input arguments.

The returned constraint aligns all input arguments with each other.

Parameters:

proxies – The input arguments.

Returns:

The alignment descriptor.

ProxyConstraint align(ProxyOutputArguments proxies)#

Construct an alignment constraint descriptor for all output arguments.

The returned constraint aligns all output arguments with each other.

Parameters:

proxies – The output arguments.

Returns:

The alignment descriptor.

Constraint broadcast(Variable variable)#

Creates a broadcast constraint on a variable.

A broadcast constraint informs the runtime that the variable should not be split among the leaf tasks; instead, each leaf task should get a full copy of the underlying store. In other words, the store should be “broadcast” in its entirety to all leaf tasks in a task launch.

In effect, this constraint prevents all dimensions of the store from being partitioned.

Parameters:

variable – Partition symbol to constrain

Returns:

Broadcast constraint

Constraint broadcast(Variable variable, tuple<std::uint32_t> axes)#

Creates a broadcast constraint on a variable.

A modified form of the broadcast constraint, which applies the broadcast only to a subset of the axes of the LogicalStore corresponding to variable. The Store will be partitioned on all other axes.

Parameters:
  • variable – Partition symbol to constrain

  • axes – List of dimensions to broadcast

Throws:

std::invalid_argument – If the list of axes is empty

Returns:

Broadcast constraint
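
A minimal sketch of both overloads, reusing the hypothetical task setup from the align() example above (the braced tuple literal assumes legate::tuple is constructible from an initializer list):

  auto part_mat = task.add_input(matrix);  // a 2-D store
  auto part_vec = task.add_input(vector);  // a small store needed everywhere

  // Every leaf task receives the vector in its entirety.
  task.add_constraint(legate::broadcast(part_vec));

  // Axis 0 of the matrix is kept whole; axis 1 may still be partitioned.
  task.add_constraint(legate::broadcast(part_mat, {0}));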

ProxyConstraint broadcast(
std::variant<ProxyArrayArgument, ProxyInputArguments, ProxyOutputArguments, ProxyReductionArguments> value,
std::optional<tuple<std::uint32_t>> axes = std::nullopt
)#

Construct a broadcast constraint descriptor.

This routine may be used to describe a broadcast constraint for prospective arguments to a task. For example:

legate::broadcast(legate::proxy::inputs[0])

dictates that the first input argument should be broadcast to all leaf tasks, while

legate::broadcast(legate::proxy::outputs)

dictates that all outputs should be broadcast to all leaf tasks.

See legate::broadcast() for more information on the precise semantics of broadcasting arguments.

Parameters:
  • value – The proxy value to apply the broadcast constraint to.

  • axes – Optional axes to specify when broadcasting.

Returns:

The broadcast descriptor.

Constraint image(
Variable var_function,
Variable var_range,
ImageComputationHint hint = ImageComputationHint::NO_HINT
)#

Creates an image constraint between partitions.

The elements of var_function are treated as pointers to elements in var_range. Each sub-store s of var_function is paired with a sub-store t of var_range, such that every element in s finds the element of var_range it points to inside t.

Currently, the precise image computation can be performed only on CPUs. As a result, the function store is copied to system memory if it was last updated by GPU tasks. The approximate image computation has no such limitation and is fully GPU accelerated.

Note

An approximate image of a function potentially contains extra points not in the function’s image. For example, if a function sub-store contains the two 2-D points (0, 0) and (1, 1), the corresponding sub-store of the range would contain only the elements at points (0, 0) and (1, 1) if it was constructed from a precise image computation, whereas an approximate image computation would yield a sub-store with elements at points (0, 0), (0, 1), (1, 0), and (1, 1) (two extra elements).

Parameters:
  • var_function – Partition symbol for the function store

  • var_range – Partition symbol of the store whose partition should be derived from the image

  • hint – Optional hint to the runtime describing how the image computation can be performed. If no hint is given (which is the default), the runtime falls back to the precise image computation. Otherwise, the runtime computes a potentially approximate image of the function.

Returns:

Image constraint
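
A minimal sketch of a gather-style use, reusing the hypothetical task setup from the align() example above (indices is assumed to be a store of point-typed elements indexing into source):

  auto part_idx = task.add_input(indices);  // the "function" store
  auto part_src = task.add_input(source);   // the "range" store

  // Partition `source` so each leaf task holds at least the elements that
  // its local slice of `indices` points to; a bounding-box image suffices.
  task.add_constraint(
    legate::image(part_idx, part_src, legate::ImageComputationHint::MIN_MAX));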

ProxyConstraint image(
std::variant<ProxyArrayArgument, ProxyInputArguments, ProxyOutputArguments, ProxyReductionArguments> var_function,
std::variant<ProxyArrayArgument, ProxyInputArguments, ProxyOutputArguments, ProxyReductionArguments> var_range,
std::optional<ImageComputationHint> hint = std::nullopt
)#

Construct an image constraint descriptor.

This routine may be used to describe an image constraint for prospective arguments to a task.

See legate::image() for more information on the precise semantics of image constraints.

Parameters:
  • var_function – The proxy symbol for the function store.

  • var_range – The proxy symbol for the range store.

  • hint – The optional hint given to the runtime describing how the image computation will be performed.

Returns:

The image descriptor.

Constraint scale(
tuple<std::uint64_t> factors,
Variable var_smaller,
Variable var_bigger
)#

Creates a scaling constraint between partitions.

A scaling constraint is similar to an alignment constraint, except that the sizes of the aligned tiles are first scaled by factors.

For example, this may be used in compacting a 5x56 array of bools to a 5x7 array of bytes, treated as a bitfield. In this case var_smaller would be the byte array, var_bigger would be the array of bools, and factors would be [1, 8] (a 2x3 tile on the byte array corresponds to a 2x24 tile on the bool array).

Formally: if two stores A and B are constrained by a scaling constraint

legate::scale(S, pA, pB)

where pA and pB are partition symbols for A and B, respectively, A and B will be partitioned such that each pair of sub-stores Ak and Bk satisfy the following property:

\(\mathtt{S} \cdot \mathit{dom}(\mathtt{Ak}) \cap \mathit{dom}(\mathtt{B}) \subseteq \mathit{dom}(\mathtt{Bk})\)

Parameters:
  • factors – Scaling factors

  • var_smaller – Partition symbol for the smaller store (i.e., the one whose extents are scaled)

  • var_bigger – Partition symbol for the bigger store

Returns:

Scaling constraint
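
A minimal sketch of the bitfield-compaction example above, with the same hypothetical task setup (and the same assumption about braced tuple literals):

  auto part_bytes = task.add_output(byte_store);  // 5x7 bytes
  auto part_bools = task.add_input(bool_store);   // 5x56 bools

  // Each tile of the byte store corresponds to a tile 8x wider
  // (along axis 1) on the bool store.
  task.add_constraint(legate::scale({1, 8}, part_bytes, part_bools));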

ProxyConstraint scale(
tuple<std::uint64_t> factors,
std::variant<ProxyArrayArgument, ProxyInputArguments, ProxyOutputArguments, ProxyReductionArguments> var_smaller,
std::variant<ProxyArrayArgument, ProxyInputArguments, ProxyOutputArguments, ProxyReductionArguments> var_bigger
)#

Construct a scaling constraint descriptor.

This routine may be used to describe a scaling constraint for prospective arguments to a task.

See legate::scale() for more information on the precise semantics of scaling constraints.

Parameters:
  • factors – The scaling factors.

  • var_smaller – The proxy argument for the smaller store (that which should be scaled).

  • var_bigger – The proxy argument for the bigger store.

Returns:

The scale descriptor.

Constraint bloat(
Variable var_source,
Variable var_bloat,
tuple<std::uint64_t> low_offsets,
tuple<std::uint64_t> high_offsets
)#

Creates a bloating constraint between partitions.

This is typically used in stencil computations, to instruct the runtime that the tiles on the “private + ghost” partition (var_bloat) must align with the tiles on the “private” partition (var_source), but also include a halo of additional elements off each end.

For example, if var_source and var_bloat correspond to 10-element vectors, var_source is split into 2 tiles (0-4 and 5-9), low_offsets == 1, and high_offsets == 2, then var_bloat will be split into 2 tiles, 0-6 and 4-9.

Formally, if two stores A and B are constrained by a bloating constraint

legate::bloat(pA, pB, L, H)

where pA and pB are partition symbols for A and B, respectively, A and B will be partitioned such that each pair of sub-stores Ak and Bk satisfy the following property:

\(\forall p \in \mathit{dom}(\mathtt{Ak}).\ \forall \delta \in [-\mathtt{L}, \mathtt{H}].\ p + \delta \in \mathit{dom}(\mathtt{Bk}) \lor p + \delta \notin \mathit{dom}(\mathtt{B})\)

Parameters:
  • var_source – Partition symbol for the source store

  • var_bloat – Partition symbol for the target store

  • low_offsets – Offsets to bloat towards the negative direction

  • high_offsets – Offsets to bloat towards the positive direction

Returns:

Bloating constraint
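
A minimal sketch of the 10-element example above, with the same hypothetical task setup (and the same assumption about braced tuple literals):

  auto part_private = task.add_output(result);  // the "private" partition
  auto part_ghost   = task.add_input(field);    // the "private + ghost" partition

  // Each tile of `field` extends 1 element below and 2 above the matching
  // tile of `result`, clamped to the bounds of the store.
  task.add_constraint(legate::bloat(part_private, part_ghost, {1}, {2}));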

ProxyConstraint bloat(
std::variant<ProxyArrayArgument, ProxyInputArguments, ProxyOutputArguments, ProxyReductionArguments> var_source,
std::variant<ProxyArrayArgument, ProxyInputArguments, ProxyOutputArguments, ProxyReductionArguments> var_bloat,
tuple<std::uint64_t> low_offsets,
tuple<std::uint64_t> high_offsets
)#

Construct a bloat constraint descriptor.

This routine may be used to describe a bloat constraint for prospective arguments to a task.

See legate::bloat() for more information on the precise semantics of bloat constraints.

Parameters:
  • var_source – The proxy source store.

  • var_bloat – The proxy target store.

  • low_offsets – Offsets to bloat towards the negative direction.

  • high_offsets – Offsets to bloat towards the positive direction.

Returns:

The bloat descriptor.

Variables

ProxyInputArguments inputs = {}#

A proxy object that models the input arguments to a task as a whole.

ProxyOutputArguments outputs = {}#

A proxy object that models the output arguments to a task as a whole.

ProxyReductionArguments reductions = {}#

A proxy object that models the reduction arguments to a task as a whole.

class Variable
#include <legate/partitioning/constraint.h>

Class for partition symbols.

class Constraint
#include <legate/partitioning/constraint.h>

A base class for partitioning constraints.

class ProxyArrayArgument
#include <legate/partitioning/proxy.h>

An object that models a specific array argument to a task.

Public Types

enum class Kind : std::uint8_t

The kind of argument.

Values:

enumerator INPUT
enumerator OUTPUT
enumerator REDUCTION

Public Members

Kind kind = {}

The selected kind of the argument.

std::uint32_t index = {}

The index into the argument list (as returned e.g. by TaskContext::inputs()) corresponding to the argument.

class ProxyInputArguments : public legate::proxy_detail::TaskArgsBase<ProxyInputArguments, ProxyArrayArgument::Kind::INPUT>
#include <legate/partitioning/proxy.h>

A class that models the input arguments to a task.

class ProxyOutputArguments : public legate::proxy_detail::TaskArgsBase<ProxyOutputArguments, ProxyArrayArgument::Kind::OUTPUT>
#include <legate/partitioning/proxy.h>

A class that models the output arguments to a task.

class ProxyReductionArguments : public legate::proxy_detail::TaskArgsBase<ProxyReductionArguments, ProxyArrayArgument::Kind::REDUCTION>
#include <legate/partitioning/proxy.h>

A class that models the reduction arguments to a task.

class ProxyConstraint
#include <legate/partitioning/proxy.h>

The base proxy constraint class.

Public Functions

explicit ProxyConstraint(SharedPtr<detail::ProxyConstraint> impl)

Construct a proxy constraint.

Parameters:

impl – The pointer to the private implementation.

inline const SharedPtr<detail::ProxyConstraint> &impl() const
Returns:

The pointer to the private implementation.

namespace detail#

Typedefs

using StoreAnalyzable = std::variant<RegionFieldArg, OutputRegionArg, ScalarStoreArg, ReplicatedScalarStoreArg, WriteOnlyScalarStoreArg>#
using ArrayAnalyzable = std::variant<BaseArrayArg, ListArrayArg, StructArrayArg>#
using Analyzable = variant_detail::variant_concat_t<StoreAnalyzable, ArrayAnalyzable>#
using Restrictions = tuple<Restriction>#
typedef BasicZStringView<char, std::char_traits<char>> ZStringView#
template<typename Default, template<typename...> typename Op, typename ...Args>
using detected_or = detected_detail::detector<Default, void, Op, Args...>#
template<template<typename...> typename Op, typename ...Args>
using is_detected = detected_or<detected_detail::nonesuch, Op, Args...>#
template<template<typename...> class Op, typename ...Args>
using is_detected_t = typename is_detected<Op, Args...>::type#
template<typename T>
using type_identity_t = typename type_identity<T>::type#
template<typename T>
using has_shared_from_this = decltype(std::declval<T*>()->shared_from_this())#

Enums

enum class ArrayKind : std::uint8_t#

Values:

enumerator BASE#
enumerator LIST#
enumerator STRUCT#
enum class AccessMode : std::uint8_t#

Values:

enumerator READ#
enumerator REDUCE#
enumerator WRITE#
enum class Restriction : std::uint8_t#

Enum to describe partitioning preference on dimensions of a store.

Values:

enumerator ALLOW#

The dimension can be partitioned

enumerator AVOID#

The dimension can be partitioned, but other dimensions are preferred

enumerator FORBID#

The dimension must not be partitioned

enum class ExceptionKind : std::uint8_t#

Values:

enumerator CPP#
enumerator PYTHON#
enum class CoreProjectionOp : std::int32_t#

Values:

enumerator DELINEARIZE#
enumerator FIRST_DYNAMIC_FUNCTOR#
enumerator MAX_FUNCTOR#
enum class CoreShardID : std::underlying_type_t<CoreProjectionOp>#

Values:

enumerator TOPLEVEL_TASK#
enumerator LINEARIZE#
enum class CoreTransform : std::int8_t#

Values:

enumerator INVALID#
enumerator SHIFT#
enumerator PROMOTE#
enumerator PROJECT#
enumerator TRANSPOSE#
enumerator DELINEARIZE#
enum class TaskPriority : std::int8_t#

Values:

enumerator DEFAULT#

Functions

void show_progress(
const Legion::Task *task,
Legion::Context ctx,
Legion::Runtime *runtime
)#
void check_alignment(std::size_t alignment)#
void register_array_tasks(Library &core_lib)#
inline InternalSharedPtr<StoragePartition> create_storage_partition(
const InternalSharedPtr<Storage> &self,
InternalSharedPtr<Partition> partition,
std::optional<bool> complete
)#
inline InternalSharedPtr<Storage> slice_storage(
const InternalSharedPtr<Storage> &self,
tuple<std::uint64_t> tile_shape,
tuple<std::int64_t> offsets
)#
inline InternalSharedPtr<LogicalStore> slice_store(
const InternalSharedPtr<LogicalStore> &self,
std::int32_t dim,
Slice sl
)#
inline InternalSharedPtr<LogicalStorePartition> partition_store_by_tiling(
const InternalSharedPtr<LogicalStore> &self,
tuple<std::uint64_t> tile_shape
)#
inline InternalSharedPtr<LogicalStorePartition> create_store_partition(
const InternalSharedPtr<LogicalStore> &self,
InternalSharedPtr<Partition> partition,
std::optional<bool> complete = std::nullopt
)#
inline StoreAnalyzable store_to_launcher_arg(
const InternalSharedPtr<LogicalStore> &self,
const Variable *variable,
const Strategy &strategy,
const Domain &launch_domain,
const std::optional<SymbolicPoint> &projection,
Legion::PrivilegeMode privilege,
GlobalRedopID redop = GlobalRedopID{-1}
)#
inline RegionFieldArg store_to_launcher_arg_for_fixup(
const InternalSharedPtr<LogicalStore> &self,
const Domain &launch_domain,
Legion::PrivilegeMode privilege
)#
std::ostream &operator<<(
std::ostream &out,
const Transform &transform
)#
template<typename T>
inline decltype(auto) canonical_value_of(
T &&v
) noexcept#
inline std::uint64_t canonical_value_of(std::size_t v) noexcept#
template<typename ...T>
variant_detail::VariantProxy<T...> variant_cast(
std::variant<T...> v
)#
InternalSharedPtr<Alignment> align(
const Variable *lhs,
const Variable *rhs
)#
InternalSharedPtr<Broadcast> broadcast(const Variable *variable)#
InternalSharedPtr<Broadcast> broadcast(
const Variable *variable,
tuple<std::uint32_t> axes
)#
InternalSharedPtr<ImageConstraint> image(
const Variable *var_function,
const Variable *var_range,
ImageComputationHint hint
)#
InternalSharedPtr<ScaleConstraint> scale(
tuple<std::uint64_t> factors,
const Variable *var_smaller,
const Variable *var_bigger
)#
InternalSharedPtr<BloatConstraint> bloat(
const Variable *var_source,
const Variable *var_bloat,
tuple<std::uint64_t> low_offsets,
tuple<std::uint64_t> high_offsets
)#
inline bool operator==(const Variable &lhs, const Variable &rhs)#
InternalSharedPtr<NoPartition> create_no_partition()#
InternalSharedPtr<Tiling> create_tiling(
tuple<std::uint64_t> tile_shape,
tuple<std::uint64_t> color_shape,
tuple<std::int64_t> offsets
)#
InternalSharedPtr<Tiling> create_tiling(
tuple<std::uint64_t> tile_shape,
tuple<std::uint64_t> color_shape,
tuple<std::int64_t> offsets,
tuple<std::uint64_t> strides
)#
InternalSharedPtr<Weighted> create_weighted(
const Legion::FutureMap &weights,
const Domain &color_domain
)#
InternalSharedPtr<Image> create_image(
InternalSharedPtr<detail::LogicalStore> func,
InternalSharedPtr<Partition> func_partition,
mapping::detail::Machine machine,
ImageComputationHint hint
)#
std::ostream &operator<<(
std::ostream &out,
const Partition &partition
)#
void register_partitioning_tasks(Library &core_lib)#
template<typename OP, typename T>
void wrap_with_cas(
OP op,
T &lhs,
T rhs
)#
Restriction join(Restriction lhs, Restriction rhs)#
Restrictions join(const Restrictions &lhs, const Restrictions &rhs)#
void join_inplace(Restrictions &lhs, const Restrictions &rhs)#
template<typename T>
std::ostream &operator<<(
std::ostream &os,
const Scaled<T> &arg
)#
template<typename T>
std::ostream &operator<<(
std::ostream &os,
const Argument<T> &arg
)#
std::string compose_legion_default_args(const ParsedArgs &parsed)#

Compose the contents of LEGION_DEFAULT_ARGS.

This routine does not actually set LEGION_DEFAULT_ARGS; it only computes what the new value should be.

This is technically a private function, but it is exposed so that it can be tested.

Parameters:

parsed – The parsed command-line arguments.

Returns:

The new value of LEGION_DEFAULT_ARGS.

void configure_legion(const ParsedArgs &parsed)#

Configure Legion based on parsed command-line flags.

This function sets LEGION_DEFAULT_ARGS.

Parameters:

parsed – The parsed command-line arguments.

void configure_realm(const ParsedArgs &parsed)#

Configure Realm based on the command-line flags.

Parameters:

parsed – The command-line flags.

void configure_cpus(
bool auto_config,
const Realm::ModuleConfig &core,
const Argument<std::int32_t> &omps,
const Argument<std::int32_t> &util,
const Argument<std::int32_t> &gpus,
Argument<std::int32_t> *cpus
)#
void configure_cuda_driver_path(
const Argument<std::string> &cuda_driver_path
)#
void configure_fbmem(
bool auto_config,
const Realm::ModuleConfig *cuda,
const Argument<std::int32_t> &gpus,
Argument<Scaled<std::int64_t>> *fbmem
)#
void configure_gpus(
bool auto_config,
const Realm::ModuleConfig *cuda,
Argument<std::int32_t> *gpus,
Config *cfg
)#
std::string convert_log_levels(std::string_view log_levels)#

Convert text-based logging levels to the numeric logging levels that Legion expects.

Parameters:

log_levels – The logging string specification.

Returns:

The converted log levels.

std::string logging_help_str()#
void configure_numamem(
bool auto_config,
Span<const std::size_t> numa_mems,
const Argument<std::int32_t> &omps,
Argument<Scaled<std::int64_t>> *numamem
)#
void configure_ompthreads(
bool auto_config,
const Realm::ModuleConfig &core,
const Argument<std::int32_t> &util,
const Argument<std::int32_t> &cpus,
const Argument<std::int32_t> &gpus,
const Argument<std::int32_t> &omps,
Argument<std::int32_t> *ompthreads,
Config *cfg
)#
void configure_omps(
bool auto_config,
const Realm::ModuleConfig *openmp,
Span<const std::size_t> numa_mems,
const Argument<std::int32_t> &gpus,
Argument<std::int32_t> *omps
)#
void configure_sysmem(
bool auto_config,
const Realm::ModuleConfig &core,
const Argument<Scaled<std::int64_t>> &numamem,
Argument<Scaled<std::int64_t>> *sysmem
)#
std::string_view get_parsed_LEGATE_CONFIG()#
Returns:

The value of LEGATE_CONFIG that was parsed.

Config handle_legate_args()#

Parse LEGATE_CONFIG and generate a Config database from it.

Returns:

The configuration of Legate.

ParsedArgs parse_args(std::vector<std::string> args)#

Parse the given command-line flags and return their values.

args must not be empty.

Parameters:

args – A list of command-line flags.

Returns:

The parsed command-line values.

template<typename StringType>
std::vector<StringType> string_split(
std::string_view command,
const char sep
)#
bool multi_node_job()#
Returns:

true when Legate is being invoked as a multi-node job, false otherwise.

std::vector<std::string> deduplicate_command_line_flags(
Span<const std::string> args
)#

De-duplicate a series of command-line flags, preserving the relative ordering of the flags.

Given:

  ["--foo", "--bar", "--baz", "bop", "--foo=1"]

this routine returns:

  ["--bar", "--baz", "bop", "--foo=1"]

Note that the relative ordering of the arguments is preserved.

Parameters:

args – The arguments to de-duplicate.

Returns:

The de-duplicated flags.

void set_mpi_wrapper_libraries()#
ProjectionFunction *find_projection_function(
Legion::ProjectionID proj_id
)#
void register_affine_projection_functor(
std::uint32_t src_ndim,
const proj::SymbolicPoint &point,
Legion::ProjectionID proj_id
)#
void register_delinearizing_projection_functor(
const tuple<std::uint64_t> &color_shape,
Legion::ProjectionID proj_id
)#
void register_compound_projection_functor(
const tuple<std::uint64_t> &color_shape,
const proj::SymbolicPoint &point,
Legion::ProjectionID proj_id
)#
Logger &log_legate()#
Logger &log_legate_partitioner()#
void register_legate_core_tasks(Library &core_lib)#
void register_exception_reduction_op(const Library &context)#
bool has_started()#
bool has_finished()#
void register_legate_core_sharding_functors(
const detail::Library &core_library
)#
Legion::ShardingID find_sharding_functor_by_projection_functor(
Legion::ProjectionID proj_id
)#
void create_sharding_functor_using_projection(
Legion::ShardID shard_id,
Legion::ProjectionID proj_id,
const mapping::ProcessorRange &range
)#
void create_sharding_functor_using_projection(
Legion::ShardingID shard_id,
Legion::ProjectionID proj_id,
const mapping::ProcessorRange &range
)#
template<typename REDOP>
void register_reduction_callback(
const Legion::RegistrationCallbackArgs &args
)#
void inline_task_body(
const Task &task,
VariantCode variant_code,
VariantImpl variant_impl
)#
void legion_task_body(
VariantImpl variant_impl,
VariantCode variant_kind,
std::optional<std::string_view> task_name,
const void *args,
std::size_t arglen,
Processor p
)#
void show_progress(
const DomainPoint &index_point,
std::string_view task_name,
std::string_view provenance,
Legion::Context ctx,
Legion::Runtime *runtime
)#
bool operator==(const TaskConfig &lhs, const TaskConfig &rhs)#
bool operator!=(const TaskConfig &lhs, const TaskConfig &rhs)#
bool operator==(
const TaskSignature::Nargs &lhs,
const TaskSignature::Nargs &rhs
)#
bool operator!=(
const TaskSignature::Nargs &lhs,
const TaskSignature::Nargs &rhs
)#
bool operator==(const TaskSignature &lhs, const TaskSignature &rhs)#
bool operator!=(const TaskSignature &lhs, const TaskSignature &rhs)#
void task_wrapper(
VariantImpl variant_impl,
VariantCode variant_kind,
std::optional<std::string_view> task_name,
const void *args,
std::size_t arglen,
const void*,
std::size_t,
Processor p
)#
void task_wrapper(
VariantImpl,
VariantCode,
std::optional<std::string_view>,
const void*,
std::size_t,
const void*,
std::size_t,
Legion::Processor
)#
template<VariantImpl variant_fn, VariantCode variant_kind>
inline void task_wrapper_dyn_name(
const void *args,
std::size_t arglen,
const void *userdata,
std::size_t userlen,
Legion::Processor p
)#
LEGATE_SELECTOR_SPECIALIZATION(CPU, cpu)#
LEGATE_SELECTOR_SPECIALIZATION(OMP, omp)#
LEGATE_SELECTOR_SPECIALIZATION(GPU, gpu)#
InternalSharedPtr<Type> primitive_type(Type::Code code)#
InternalSharedPtr<Type> string_type()#
InternalSharedPtr<Type> binary_type(std::uint32_t size)#
InternalSharedPtr<FixedArrayType> fixed_array_type(
InternalSharedPtr<Type> element_type,
std::uint32_t N
)#
InternalSharedPtr<StructType> struct_type(
std::vector<InternalSharedPtr<Type>> field_types,
bool align
)#
InternalSharedPtr<ListType> list_type(
InternalSharedPtr<Type> element_type
)#
InternalSharedPtr<Type> bool_()#
InternalSharedPtr<Type> int8()#
InternalSharedPtr<Type> int16()#
InternalSharedPtr<Type> int32()#
InternalSharedPtr<Type> int64()#
InternalSharedPtr<Type> uint8()#
InternalSharedPtr<Type> uint16()#
InternalSharedPtr<Type> uint32()#
InternalSharedPtr<Type> uint64()#
InternalSharedPtr<Type> float16()#
InternalSharedPtr<Type> float32()#
InternalSharedPtr<Type> float64()#
InternalSharedPtr<Type> complex64()#
InternalSharedPtr<Type> complex128()#
InternalSharedPtr<FixedArrayType> point_type(std::uint32_t ndim)#
InternalSharedPtr<StructType> rect_type(std::uint32_t ndim)#
InternalSharedPtr<Type> null_type()#
InternalSharedPtr<Type> domain_type()#
bool is_point_type(const InternalSharedPtr<Type> &type)#
bool is_point_type(
const InternalSharedPtr<Type> &type,
std::uint32_t ndim
)#
std::int32_t ndim_point_type(const InternalSharedPtr<Type> &type)#
bool is_rect_type(const InternalSharedPtr<Type> &type)#
bool is_rect_type(
const InternalSharedPtr<Type> &type,
std::uint32_t ndim
)#
std::int32_t ndim_rect_type(const InternalSharedPtr<Type> &type)#
void abort_handler(
std::string_view file,
std::string_view func,
int line,
std::stringstream *ss
)#
template<typename ...T>
void abort_handler_tpl(
std::string_view file,
std::string_view func,
int line,
T&&... args
)#
std::string demangle_type(const std::type_info &ti)#
std::pair<void*, std::size_t> align_for_unpack_impl(
void *ptr,
std::size_t capacity,
std::size_t bytes,
std::size_t align
)#
std::size_t round_up_to_multiple(
std::size_t value,
std::size_t round_to
)#
template<typename T>
std::pair<void*, std::size_t> align_for_unpack(
void *ptr,
std::size_t capacity,
std::size_t bytes = sizeof(T),
std::size_t align = alignof(T)
)#
template<typename T>
std::size_t max_aligned_size_for_type()#
template<typename T>
zip_detail::Zipper<zip_detail::ZiperatorShortest, Enumerator, T> enumerate(
T &&iterable,
typename Enumerator::value_type start = {}
)#

Enumerate an iterable.

The enumerator is classed as a bidirectional iterator, so it can be both incremented and decremented. Decrementing the enumerator decreases the count. However, this only applies if iterable is itself at least bidirectional; if iterable does not satisfy bidirectional iteration, the returned enumerator assumes the iterator category of iterable.

  std::vector<int> my_vector{1, 2, 3, 4, 5};

  // Enumerate a vector starting from index 0
  for (auto&& [idx, val] : legate::detail::enumerate(my_vector)) {
    std::cout << "accessing element " << idx << " of vector: " << val << '\n';
    // a sanity check
    EXPECT_EQ(my_vector[idx], val);
  }

  // Enumerate the vector, but enumerator starts at index 3. Note that the enumerator start has
  // no bearing on the thing being enumerated. The vector is still iterated over from start to
  // finish!
  auto enum_start = 3;
  for (auto&& [idx, val] : legate::detail::enumerate(my_vector, enum_start)) {
    std::cout << "enumerator has value: " << idx << '\n';
    std::cout << "accessing element " << idx - enum_start << " of vector: " << val << '\n';
    EXPECT_EQ(my_vector[idx - enum_start], val);
  }
Parameters:
  • iterable – The iterable to enumerate

  • start – [optional] Set the starting value for the enumerator

Returns:

The enumerator iterator adaptor

LEGATE_DEFINE_ENV_VAR(bool, LEGATE_TEST)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_SHOW_USAGE)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_AUTO_CONFIG)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_SHOW_CONFIG)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_SHOW_PROGRESS)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_EMPTY_TASK)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_SYNC_STREAM_VIEW)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_LOG_MAPPING)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_LOG_PARTITIONING)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_WARMUP_NCCL)#
LEGATE_DEFINE_ENV_VAR(std::string, LEGION_DEFAULT_ARGS)#
LEGATE_DEFINE_ENV_VAR(std::uint32_t, LEGATE_MAX_EXCEPTION_SIZE)#
LEGATE_DEFINE_ENV_VAR(std::int64_t, LEGATE_MIN_CPU_CHUNK)#
LEGATE_DEFINE_ENV_VAR(std::int64_t, LEGATE_MIN_GPU_CHUNK)#
LEGATE_DEFINE_ENV_VAR(std::int64_t, LEGATE_MIN_OMP_CHUNK)#
LEGATE_DEFINE_ENV_VAR(std::uint32_t, LEGATE_WINDOW_SIZE)#
LEGATE_DEFINE_ENV_VAR(std::uint32_t, LEGATE_FIELD_REUSE_FRAC)#
LEGATE_DEFINE_ENV_VAR(std::uint32_t, LEGATE_FIELD_REUSE_FREQ)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_CONSENSUS)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_DISABLE_MPI)#
LEGATE_DEFINE_ENV_VAR(std::string, LEGATE_CONFIG)#
LEGATE_DEFINE_ENV_VAR(std::string, LEGATE_MPI_WRAPPER)#
LEGATE_DEFINE_ENV_VAR(std::string, LEGATE_CUDA_DRIVER)#
LEGATE_DEFINE_ENV_VAR(bool, LEGATE_IO_USE_VFD_GDS)#
LEGATE_DEFINE_ENV_VAR(std::string, REALM_UCP_BOOTSTRAP_MODE)#
std::string make_error_message(Span<const ErrorDescription> errs)#
template<typename T, typename U>
void typed_malloc(
T **ret,
U num_elems
) noexcept#
template<typename El, typename Ex, typename L, typename A>
FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>>::difference_type operator-(
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &self,
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &other
) noexcept#
template<typename El, typename Ex, typename L, typename A>
FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> operator-(
FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> self,
typename FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>>::difference_type n
) noexcept#
template<typename El, typename Ex, typename L, typename A>
FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> operator+(
FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> self,
typename FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>>::difference_type n
) noexcept#
template<typename El, typename Ex, typename L, typename A>
FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> operator+(
typename FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>>::difference_type n,
FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> self
) noexcept#
template<typename El, typename Ex, typename L, typename A>
bool operator==(
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &lhs,
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &rhs
) noexcept#
template<typename El, typename Ex, typename L, typename A>
bool operator!=(
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &lhs,
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &rhs
) noexcept#
template<typename El, typename Ex, typename L, typename A>
bool operator<(
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &lhs,
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &rhs
) noexcept#
template<typename El, typename Ex, typename L, typename A>
bool operator>(
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &lhs,
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &rhs
) noexcept#
template<typename El, typename Ex, typename L, typename A>
bool operator<=(
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &lhs,
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &rhs
) noexcept#
template<typename El, typename Ex, typename L, typename A>
bool operator>=(
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &lhs,
const FlatMDSpanIterator<::cuda::std::mdspan<El, Ex, L, A>> &rhs
) noexcept#
template<typename T>
FlatMDSpanView(T span) -> FlatMDSpanView<T>#
template<typename T>
std::pair<void*, std::size_t> pack_buffer(
void *buf,
std::size_t remaining_cap,
T &&value
)#
template<typename T>
std::pair<void*, std::size_t> pack_buffer(
void *buf,
std::size_t remaining_cap,
std::size_t nelem,
const T *value
)#
template<typename T>
std::pair<const void*, std::size_t> unpack_buffer(
const void *buf,
std::size_t remaining_cap,
T *value
)#
template<typename T>
std::pair<const void*, std::size_t> unpack_buffer(
const void *buf,
std::size_t remaining_cap,
std::size_t nelem,
T *const *value
)#
std::size_t processor_id()#
void throw_invalid_proc_local_storage_access(
const std::type_info &value_type
)#
template<typename U, typename Alloc, typename P, typename ...Args>
U *construct_from_allocator_(
Alloc &allocator,
P *hint,
Args&&... args
)#
LEGATE_PRAGMA_PUSH()#
LEGATE_PRAGMA_POP()#
template<typename T = long long>
T safe_strtoll(
const char *env_value,
char **end_ptr = nullptr
)#
bool install_terminate_handler() noexcept#

Install the Legate std::terminate() handler.

This routine is thread-safe and may be called multiple times; however, only the first invocation has any effect, and subsequent calls do nothing. The caller may inspect the return value to determine whether the handler was installed.

The installed handler will pretty-print any thrown exceptions, adding a traceback showing where the exception was thrown.

Returns:

true if the handler was installed by this call, false otherwise.
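
A minimal usage sketch of the idempotence described above:

  // Install once near program start; only the first call has any effect.
  const bool installed = legate::detail::install_terminate_handler();
  // Subsequent calls are safe no-ops and return false.
  const bool again = legate::detail::install_terminate_handler();  // false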

Domain to_domain(Span<const std::uint64_t> shape)#
Domain to_domain(const tuple<std::uint64_t> &shape)#
DomainPoint to_domain_point(const tuple<std::uint64_t> &shape)#
tuple<std::uint64_t> from_domain(const Domain &domain)#
void assert_valid_mapping(
std::size_t tuple_size,
const std::vector<std::int32_t> &mapping
)#
void throw_invalid_tuple_sizes(
std::size_t lhs_size,
std::size_t rhs_size
)#
void assert_in_range(std::size_t tuple_size, std::int32_t pos)#
template<typename T>
std::underlying_type_t<T> to_underlying(
T e
) noexcept#
template<typename ...T>
Overload(T...) -> Overload<T...>#
template<typename ...T>
zip_detail::Zipper<zip_detail::ZiperatorShortest, T...> zip_shortest(
T&&... args
)#

Zip a set of containers together.

The adaptor returned by this routine implements a “zip shortest” zip operation. That is, the returned zipper stops when at least one object or container has reached the end. Iterating past that point results in undefined behavior.

The iterators returned by the adaptor support the lowest common denominator of all containers when it comes to iterator functionality. For example, if all containers’ iterators support std::random_access_iterator_tag, then the returned iterator will as well.

Parameters:

args – The set of containers to zip.

Returns:

A zipper constructed from the set of containers. Calling begin() or end() on the zipper returns the corresponding iterators.
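
A minimal sketch, in the style of the zip_equal() example below:

  std::vector<int> vec{1, 2, 3, 4, 5};
  std::array<int, 3> arr{10, 20, 30};

  // Stops after 3 iterations, the length of the shorter container.
  for (auto&& [v, a] : legate::detail::zip_shortest(vec, arr)) {
    std::cout << v + a << ", ";
  }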

template<typename ...T>
zip_detail::Zipper<zip_detail::ZiperatorEqual, T...> zip_equal(
T&&... args
)#

Zip a set of containers of equal length together.

The adaptor returned by this routine implements a “zip equal” zip operation. That is, the returned zipper assumes all inputs are of equal size. Debug builds will attempt to verify this invariant upfront, by calling (if applicable) std::size() on the inputs. Iterating past the end results in undefined behavior.

The iterators returned by the adaptor support the lowest common denominator of all containers when it comes to iterator functionality. For example, if all containers’ iterators support std::random_access_iterator_tag, then the returned iterator will as well.

  std::vector<float> vec{1, 2, 3, 4, 5};
  std::list<int> list{5, 4, 3, 2, 1};

  // Add all elements of a list to each element of a vector
  for (auto&& [vi, li] : legate::detail::zip_equal(vec, list)) {
    vi = static_cast<float>(li + 10);
    std::cout << vi << ", ";
  }
Parameters:

args – The set of containers to zip.

Returns:

A zipper constructed from the set of containers of equal size. Calling begin() or end() on the zipper returns the corresponding iterators.

template<typename C, typename T>
std::basic_ostream<C, T> &operator<<(
std::basic_ostream<C, T> &os,
BasicZStringView<C, T> sv
)#
template<typename C, typename T>
bool operator==(
BasicZStringView<C, T> lhs,
BasicZStringView<C, T> rhs
)#
template<typename C, typename T>
bool operator!=(
BasicZStringView<C, T> lhs,
BasicZStringView<C, T> rhs
)#
template<typename C, typename T>
bool operator==(
typename BasicZStringView<C, T>::base_view_type lhs,
BasicZStringView<C, T> rhs
)#
template<typename C, typename T>
bool operator!=(
typename BasicZStringView<C, T>::base_view_type lhs,
BasicZStringView<C, T> rhs
)#
template<typename C, typename T>
bool operator==(
BasicZStringView<C, T> lhs,
typename BasicZStringView<C, T>::base_view_type rhs
)#
template<typename C, typename T>
bool operator!=(
BasicZStringView<C, T> lhs,
typename BasicZStringView<C, T>::base_view_type rhs
)#
void throw_unsupported_dim(std::int32_t dim)#
void throw_unsupported_type_code(legate::Type::Code code)#
void throw_bad_internal_weak_ptr()#
template<typename T>
T *to_address(T *p) noexcept#
template<typename T, typename = std::void_t<decltype(std::declval<T>().operator->())>>
auto *to_address(
const T &p
) noexcept#

Variables

template<typename T>
bool is_pure_move_constructible_v = is_pure_move_constructible<T>::value#
template<typename T>
bool is_pure_move_assignable_v = is_pure_move_assignable<T>::value#
template<typename From, typename To>
bool is_ptr_compat_v = is_ptr_compat<From, To>::value#
template<typename T, typename ...Ts>
bool is_same_as_one_of_v = is_same_as_one_of<T, Ts...>::value#
template<template<typename...> typename Op, typename ...Args>
bool is_detected_v = is_detected<Op, Args...>::value#
template<typename T>
bool shared_from_this_enabled_v = is_detected_v<has_shared_from_this, T>#
template<typename T>
bool is_container_v = is_container<T>::value#
namespace proxy_detail#