util#
- group Utilities
General utilities.
Defines
-
LEGATE_CONCAT_(x, ...)#
Concatenate a series of tokens without macro expansion.
This macro will NOT macro-expand any tokens passed to it. If this behavior is undesirable, and the user wishes to have all tokens expanded before concatenation, use LEGATE_CONCAT() instead. For example:
#define FOO 1
#define BAR 2
LEGATE_CONCAT_(FOO, BAR) // expands to FOOBAR
- Parameters:
x – The first parameter to concatenate.
... – The remaining parameters to concatenate.
-
LEGATE_CONCAT(x, ...)#
Concatenate a series of tokens.
This macro will first macro-expand any tokens passed to it. If this behavior is undesirable, use LEGATE_CONCAT_() instead. For example:
#define FOO 1
#define BAR 2
LEGATE_CONCAT(FOO, BAR) // expands to 12
- Parameters:
x – The first parameter to concatenate.
... – The remaining parameters to concatenate.
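The difference between the expanding and non-expanding forms follows the usual two-level macro idiom: the outer macro forwards its arguments (which expands them), while the inner macro pastes the tokens exactly as written. The following is a minimal sketch using hypothetical two-argument macros MY_CONCAT_ and MY_CONCAT; these names are assumptions for illustration and are not the Legate definitions, which are variadic.

```cpp
// Hypothetical two-argument stand-ins; the real Legate macros are variadic.
#define MY_CONCAT_(x, y) x##y             // pastes the tokens exactly as written
#define MY_CONCAT(x, y) MY_CONCAT_(x, y)  // arguments are expanded first, then pasted

#define FOO 1
#define BAR 2

// FOO and BAR are expanded before pasting, so this is the integer literal 12.
constexpr int expanded = MY_CONCAT(FOO, BAR);

// No expansion happens here, so this is the identifier FOOBAR. A variable with
// that name must therefore exist for the line to compile.
constexpr int FOOBAR = 7;
constexpr int pasted = MY_CONCAT_(FOO, BAR);

static_assert(expanded == 12, "tokens were expanded before concatenation");
static_assert(pasted == 7, "tokens were concatenated verbatim");
```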
-
LEGATE_STRINGIZE_(...)#
Stringize a series of tokens.
This macro will turn its arguments into compile-time constant C strings.
This macro will NOT macro-expand any tokens passed to it. If this behavior is undesirable, and the user wishes to have all tokens expanded before stringification, use LEGATE_STRINGIZE() instead. For example:
#define FOO 1
#define BAR 2
LEGATE_STRINGIZE_(FOO, BAR) // expands to "FOO, BAR" (note the "")
- Parameters:
... – The tokens to stringize.
-
LEGATE_STRINGIZE(...)#
Stringize a series of tokens.
This macro will turn its arguments into compile-time constant C strings.
This macro will first macro-expand any tokens passed to it. If this behavior is undesirable, use LEGATE_STRINGIZE_() instead. For example:
#define FOO 1
#define BAR 2
LEGATE_STRINGIZE(FOO, BAR) // expands to "1, 2" (note the "")
- Parameters:
... – The tokens to stringize.
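The same two-level idiom separates the two stringizing macros. The sketch below uses hypothetical macros MY_STRINGIZE_ and MY_STRINGIZE (assumed names, not the Legate definitions) to show both behaviors side by side.

```cpp
#include <cstring>

// Hypothetical stand-ins for illustration only.
#define MY_STRINGIZE_(...) #__VA_ARGS__                // stringizes the tokens as written
#define MY_STRINGIZE(...) MY_STRINGIZE_(__VA_ARGS__)   // expands the tokens first

#define FOO 1
#define BAR 2

// Returns the unexpanded spelling of the arguments.
inline const char* raw_text() { return MY_STRINGIZE_(FOO, BAR); }

// Returns the spelling after FOO and BAR have been macro-expanded.
inline const char* expanded_text() { return MY_STRINGIZE(FOO, BAR); }

// Small helper so callers can compare the results without pulling in <string>.
inline bool same_text(const char* a, const char* b) { return std::strcmp(a, b) == 0; }
```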
-
LEGATE_DEFINED_ENABLED_FORM_1#
-
LEGATE_DEFINED_ENABLED_FORM_#
-
LEGATE_DEFINED_PRIVATE_3_(ignored, val, ...)#
-
LEGATE_DEFINED_PRIVATE_2_(args)#
-
LEGATE_DEFINED_PRIVATE_1_(...)#
-
LEGATE_DEFINED_PRIVATE(x)#
-
LEGATE_DEFINED(x)#
Determine if a preprocessor definition is positively defined.
LEGATE_DEFINED() returns 1 if and only if x expands to the integer literal 1, or is defined (but empty). In all other cases, LEGATE_DEFINED() returns the integer literal 0. Therefore this macro should not be used if its argument may expand to a non-empty value other than 1. The only exception is if the argument is defined but expands to 0, in which case LEGATE_DEFINED() will also expand to 0:
#define FOO_EMPTY
#define FOO_ONE 1
#define FOO_ZERO 0
// #define FOO_UNDEFINED
static_assert(LEGATE_DEFINED(FOO_EMPTY) == 1);
static_assert(LEGATE_DEFINED(FOO_ONE) == 1);
static_assert(LEGATE_DEFINED(FOO_ZERO) == 0);
static_assert(LEGATE_DEFINED(FOO_UNDEFINED) == 0);
Conceptually, LEGATE_DEFINED() is equivalent to
#if defined(x) && (x == 1 || x == *empty*)
// "return" 1
#else
// "return" 0
#endif
As a result this macro works both in preprocessor statements:
#if LEGATE_DEFINED(FOO_BAR)
foo_bar_is_defined();
#else
foo_bar_is_not_defined();
#endif
And in regular C++ code:
if (LEGATE_DEFINED(FOO_BAR)) {
  foo_bar_is_defined();
} else {
  foo_bar_is_not_defined();
}
Note that in the C++ example above both arms of the if statement must compile. If this is not desired, then, since LEGATE_DEFINED() produces a compile-time constant expression, the user may use C++17’s if constexpr to block out one of the arms:
if constexpr (LEGATE_DEFINED(FOO_BAR)) {
  foo_bar_is_defined();
} else {
  foo_bar_is_not_defined();
}
- Parameters:
x – The legate preprocessor definition.
- Returns:
1 if the argument is defined and true, 0 otherwise.
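A macro with this behavior can be built from the well-known "placeholder comma" trick: the expanded argument is pasted onto a prefix, and only the results for 1 and for empty name helper macros that inject an extra comma, which shifts a 1 into the selected argument position. The sketch below is one possible construction under that assumption. All MY_* names are hypothetical; they mirror the shape of the helper macros listed above but are not copied from the Legate sources.

```cpp
// Hypothetical stand-ins (MY_*), not the Legate definitions.
#define MY_ENABLED_FORM_1 ignored,  // matches an argument that expanded to 1
#define MY_ENABLED_FORM_ ignored,   // matches an argument that expanded to nothing
#define MY_PRIVATE_3_(ignored, val, ...) val
#define MY_PRIVATE_2_(args) MY_PRIVATE_3_ args
// If the pasted name was one of the two macros above, the argument list becomes
// (ignored, 1, 0, ~) and val is 1. Otherwise it is (<junk> 1, 0, ~) and val is 0.
#define MY_PRIVATE_1_(...) MY_PRIVATE_2_((__VA_ARGS__ 1, 0, ~))
#define MY_PRIVATE(x) MY_PRIVATE_1_(MY_ENABLED_FORM_##x)
// The extra level lets x be macro-expanded before it is pasted.
#define MY_DEFINED(x) MY_PRIVATE(x)

#define FOO_EMPTY
#define FOO_ONE 1
#define FOO_ZERO 0
// FOO_UNDEFINED is deliberately left undefined.

static_assert(MY_DEFINED(FOO_EMPTY) == 1, "defined but empty counts as enabled");
static_assert(MY_DEFINED(FOO_ONE) == 1, "defined to 1 counts as enabled");
static_assert(MY_DEFINED(FOO_ZERO) == 0, "defined to 0 counts as disabled");
static_assert(MY_DEFINED(FOO_UNDEFINED) == 0, "undefined counts as disabled");

// The same macro is also usable in a preprocessor conditional.
#if MY_DEFINED(FOO_ONE)
constexpr bool foo_one_enabled = true;
#else
constexpr bool foo_one_enabled = false;
#endif
```

Because the result is always the literal 0 or 1, the same expression is valid in both `#if` and ordinary C++ contexts.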
-
LEGATE_SCOPE_GUARD(...)#
Construct an unnamed legate::ScopeGuard from the contents of the macro arguments.
It is impossible to enable or disable the legate::ScopeGuard constructed by this macro.
This macro is useful if the user need only define some action to be executed on scope exit, but doesn’t care to name the legate::ScopeGuard and/or has no need to enable/disable it after construction.
For example:
auto *mem = static_cast<int *>(std::malloc(10 * sizeof(int)));
LEGATE_SCOPE_GUARD(std::free(mem));
// use mem...
// scope exits, and mem is free'd.
Multi-line statements are also supported:
auto *mem = static_cast<int *>(std::malloc(10 * sizeof(int)));
LEGATE_SCOPE_GUARD(
  if (frobnicate()) {
    std::free(mem);
  }
);
// use mem...
// scope exits, and mem is free'd depending on return value of frobnicate()
If the body of the guard should only be executed on failure, use LEGATE_SCOPE_FAIL instead.
See also
ScopeGuard
- Parameters:
... – The body of the constructed legate::ScopeGuard.
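To illustrate how a macro can wrap arbitrary statements into an unnamed guard, here is a minimal sketch. It assumes a simple always-enabled guard class and generates a unique variable name from `__LINE__`; the names sketch::Guard, MY_CAT_ and MY_SCOPE_GUARD are hypothetical, and the actual Legate macro may be implemented differently.

```cpp
#include <utility>

namespace sketch {

// Minimal always-enabled guard: runs the stored callable when destroyed.
template <typename F>
class Guard {
 public:
  explicit Guard(F&& fn) noexcept : fn_{std::move(fn)} {}
  Guard(const Guard&)            = delete;
  Guard& operator=(const Guard&) = delete;
  ~Guard() noexcept { fn_(); }

 private:
  F fn_;
};

}  // namespace sketch

#define MY_CAT_IMPL_(a, b) a##b
#define MY_CAT_(a, b) MY_CAT_IMPL_(a, b)
// __LINE__ gives each guard a distinct variable name that user code never refers to.
// The macro arguments become the body of a lambda capturing by reference.
#define MY_SCOPE_GUARD(...) \
  const auto MY_CAT_(my_scope_guard_, __LINE__) = ::sketch::Guard{[&]() noexcept { __VA_ARGS__; }}

// Returns how many times the cleanup ran after leaving the inner scope.
inline int demo()
{
  int cleaned = 0;
  {
    MY_SCOPE_GUARD(++cleaned);
    // cleaned is still 0 here; the guard fires at the closing brace
  }
  return cleaned;
}
```

Since the guard variable has no user-visible name, there is nothing to call enable or disable on, which matches the restriction described above.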
-
LEGATE_SCOPE_FAIL(...)#
Construct an unnamed legate::ScopeFail from the contents of the macro arguments.
This macro behaves identically to LEGATE_SCOPE_GUARD, except that it creates a legate::ScopeFail instead of a legate::ScopeGuard. Please refer to its documentation for further discussion.
See also
ScopeFail
- Parameters:
... – The body of the constructed legate::ScopeFail.
Typedefs
-
using VariantImpl = void (*)(TaskContext)#
Function signature for task variants. Each task variant must be a function of this type.
-
template<typename T = void>
using LegionVariantImpl = T (*)(const Legion::Task*, const std::vector<Legion::PhysicalRegion>&, Legion::Context, Legion::Runtime*)#
Function signature for direct-to-legion task variants. Users should usually prefer VariantImpl instead.
-
using ShutdownCallback = std::function<void(void)>#
Signature for a callable to be executed right before the runtime shuts down.
Enums
-
enum class VariantCode : Legion::VariantID#
An enum describing the kind of variant.
Note
The values don’t start at 0. This is to match Legion, where 0 is the ‘None’ variant.
Values:
-
enumerator CPU#
A CPU variant.
-
enumerator GPU#
A GPU variant.
-
enumerator OMP#
An OpenMP variant.
-
enum class LocalTaskID : std::int64_t#
Integer type representing a Library-local task ID.
All tasks are uniquely identifiable via a “task ID”. These task IDs come in 2 flavors: global and local. When a task is registered to a Library, the task must declare a unique “local” task ID (LocalTaskID) within that Library. This task ID must not coincide with any other task ID within that Library. After registration, the task is also assigned a “global” ID (GlobalTaskID) which is guaranteed to be unique across the entire program. GlobalTaskIDs may therefore be used to refer to tasks registered to other Libraries or to refer to the task when interfacing with Legion.
For example, consider a task Foo:
class Foo : public legate::LegateTask<Foo> {
 public:
  // Foo declares a local task ID of 10
  static inline const auto TASK_CONFIG =  // NOLINT(cert-err58-cpp)
    legate::TaskConfig{legate::LocalTaskID{10}};

  static void cpu_variant(legate::TaskContext /* ctx */)
  {
    // some very useful work...
  }
};
And two Libraries, bar and baz:
legate::Library bar_lib = runtime->create_library(BAR_LIBNAME);
legate::Library baz_lib = runtime->create_library(BAZ_LIBNAME);

// Foo registers itself with bar, claiming the bar-local task ID of 10.
Foo::register_variants(bar_lib);
// Retrieve the global task ID after registration.
legate::GlobalTaskID gid_bar = bar_lib.get_task_id(Foo::TASK_CONFIG.task_id());
// This should be false, Foo has not registered itself to baz yet.
ASSERT_FALSE(baz_lib.valid_task_id(gid_bar));
// However, we can query information from Legion about this task (such as its name), since
// the global task ID has been assigned.
const char* legion_task_name{};
Legion::Runtime::get_runtime()->retrieve_name(static_cast<Legion::TaskID>(gid_bar), legion_task_name);
ASSERT_STREQ(legion_task_name, "example::Foo");
// We can get the same information using the local ID from the Library
auto task_name = bar_lib.get_task_name(Foo::TASK_CONFIG.task_id());
ASSERT_EQ(task_name, legion_task_name);
See also
GlobalTaskID Library Library::get_task_id()
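Because both ID kinds are scoped enumerations over an integer type, they act as distinct "strong" integer types that cannot be mixed up silently. The sketch below demonstrates this with hypothetical types sketch::LocalId and sketch::GlobalId. The offset-based to_global() mapping is purely an assumption for illustration and is not a description of how Legate assigns global IDs.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {

// Hypothetical stand-ins: scoped enums with no enumerators behave like opaque integers.
enum class LocalId : std::int64_t {};
enum class GlobalId : std::uint32_t {};

// Neither ID converts implicitly to the other, nor to a plain integer.
static_assert(!std::is_convertible_v<LocalId, GlobalId>, "IDs are distinct types");
static_assert(!std::is_convertible_v<LocalId, std::int64_t>, "no implicit integer conversion");

// Assumed mapping for illustration: a library owns a block of global IDs that
// starts at library_base, and the local ID is an offset into that block.
constexpr GlobalId to_global(std::uint32_t library_base, LocalId local)
{
  return static_cast<GlobalId>(library_base + static_cast<std::uint32_t>(local));
}

}  // namespace sketch
```

Every crossing between the local and global ID spaces then requires an explicit call or cast, which makes accidental mixing a compile-time error.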
Values:
-
enum class GlobalTaskID : Legion::TaskID#
Integer type representing a global task ID.
GlobalTaskIDs may be used to refer to tasks registered to other Libraries or to refer to the task when interfacing with Legion. See LocalTaskID for further discussion on task IDs and task registration.
See also
LocalTaskID Library Library::get_local_task_id()
Values:
-
enum class LocalRedopID : std::int64_t#
Integer type representing a Library-local reduction operator ID.
All reduction operators are uniquely identifiable via a “reduction ID”, which serves as a proxy task ID for the reduction meta-tasks. When a reduction operator is registered with a Library, the reduction must declare a unique “local” ID (LocalRedopID) within that Library. The Library then assigns a globally unique ID to the reduction operator, which may be used to refer to the operator across the entire program.
See also
GlobalRedopID Library Library::get_reduction_op_id()
Values:
-
enum class GlobalRedopID : Legion::ReductionOpID#
Integer type representing a global reduction operator ID.
GlobalRedopIDs may be used to refer to reduction operators registered to other Libraries, or to refer to the reduction operator when interfacing with Legion. See LocalRedopID for further discussion on reduction operator IDs.
Values:
Functions
-
Time measure_microseconds()#
Returns a timestamp at the resolution of microseconds.
The returned timestamp indicates the time at which all preceding Legate operations finish. This timestamp generation is a non-blocking operation, and the blocking happens when the value wrapped within the returned Time object is retrieved.
- Returns:
A Time object
-
Time measure_nanoseconds()#
Returns a timestamp at the resolution of nanoseconds.
The returned timestamp indicates the time at which all preceding Legate operations finish. This timestamp generation is a non-blocking operation, and the blocking happens when the value wrapped within the returned Time object is retrieved.
- Returns:
A Time object
-
template<typename T, int DIM>
std::string print_dense_array(
)#
Converts the dense array into a string.
- Parameters:
base – Array to convert
extents – Extents of the array
strides – Strides for dimensions
- Returns:
A string expressing the contents of the array
-
template<int DIM, typename ACC>
std::string print_dense_array(
)#
Converts the dense array into a string using an accessor.
- Parameters:
accessor – Accessor to an array
rect – Sub-rectangle within which the elements should be retrieved
- Returns:
A string expressing the contents of the array
- std::size_t linearize(
- const DomainPoint &lo,
- const DomainPoint &hi,
- const DomainPoint &point
Given an N-Dimensional shape and a point inside that shape, compute the “linearized” index of the point within the shape.
This routine is often used to determine the “local”, 0-based position of a point within a task, that will be in the range [0, shape.volume()). This may be used to e.g. copy a sub-store into a temporary 1D buffer, in which case linearize() would map each point in the shape to an index within the buffer:
auto shape = store.shape<DIM>();
auto *buf = new int[shape.volume()];

for (auto it = legate::PointInRectIterator<DIM>{shape}; it.valid(); ++it) {
  auto local_idx = legate::linearize(shape.lo, shape.hi, *it);
  // local_idx contains the 0-based index of *it, regardless of how the task was
  // parallelized
  buf[local_idx] = accessor[*it];
}
For example, given a 3x3 shape with bounds lo of (0, 0) and hi of (2, 2), then for each point the linearized indices would be as follows:
Point -> idx
(0, 0) -> 0
(0, 1) -> 1
(0, 2) -> 2
(1, 0) -> 3
(1, 1) -> 4
(1, 2) -> 5
(2, 0) -> 6
(2, 1) -> 7
(2, 2) -> 8
Similarly, with a lo of (2, 2) and hi of (4, 4):
Point -> idx
(2, 2) -> 0
(2, 3) -> 1
(2, 4) -> 2
(3, 2) -> 3
(3, 3) -> 4
(3, 4) -> 5
(4, 2) -> 6
(4, 3) -> 7
(4, 4) -> 8
- Parameters:
lo – The lowest point in the shape.
hi – The highest point in the shape.
point – The point whose position in the shape you wish to linearize.
- Returns:
The linear index of the point.
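The index tables above are consistent with a row-major ordering (last dimension varies fastest) over inclusive bounds. The following standalone sketch reproduces that arithmetic. It uses a hypothetical sketch::Point alias over std::array rather than DomainPoint, so it is an illustration of the computation and not the Legate implementation.

```cpp
#include <array>
#include <cstddef>

namespace sketch {

// Hypothetical point type for illustration; Legate uses DomainPoint.
template <std::size_t N>
using Point = std::array<long long, N>;

// Row-major position of `point` inside the inclusive box [lo, hi].
template <std::size_t N>
constexpr std::size_t linearize(const Point<N>& lo, const Point<N>& hi, const Point<N>& point)
{
  std::size_t idx = 0;
  for (std::size_t d = 0; d < N; ++d) {
    // Bounds are inclusive, so each dimension holds hi - lo + 1 points.
    const auto extent = static_cast<std::size_t>(hi[d] - lo[d] + 1);
    idx = idx * extent + static_cast<std::size_t>(point[d] - lo[d]);
  }
  return idx;
}

}  // namespace sketch
```

Subtracting lo in every dimension is what makes the result 0-based regardless of where the shape sits in the global index space.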
- DomainPoint delinearize(
- const DomainPoint &lo,
- const DomainPoint &hi,
- std::size_t idx
Given an N-Dimensional shape and an index corresponding to a point inside that shape, compute the point corresponding to the index.
This routine is often used to convert a “local” 1d index, in the range [0, shape.volume()), to a point within the “local” shape. For example, this is often used to convert a thread ID in a CUDA kernel or OpenMP loop to the corresponding point within the shape:
// e.g. in an OpenMP loop
auto shape = store.shape<DIM>();

#pragma omp parallel for
for (std::size_t i = 0; i < shape.volume(); ++i) {
  auto local_pt = legate::delinearize(shape.lo, shape.hi, i);
  // local_pt now contains the local point corresponding to index i
}
For example, given a 3x3 shape with bounds lo of (0, 0) and hi of (2, 2), then for each idx, the delinearized points would be as follows:
idx -> Point
0 -> (0, 0)
1 -> (0, 1)
2 -> (0, 2)
3 -> (1, 0)
4 -> (1, 1)
5 -> (1, 2)
6 -> (2, 0)
7 -> (2, 1)
8 -> (2, 2)
- Parameters:
lo – The lowest point in the shape.
hi – The highest point in the shape.
idx – The linearized index of the point.
- Returns:
The point inside the shape.
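The inverse mapping peels off one dimension at a time with a modulus and a division, starting from the fastest-varying (last) dimension. The sketch below is a standalone illustration with a hypothetical sketch::Point alias over std::array; it is not the Legate implementation.

```cpp
#include <array>
#include <cstddef>

namespace sketch {

// Hypothetical point type for illustration; Legate uses DomainPoint.
template <std::size_t N>
using Point = std::array<long long, N>;

// Inverse of a row-major linearization over the inclusive box [lo, hi].
template <std::size_t N>
constexpr Point<N> delinearize(const Point<N>& lo, const Point<N>& hi, std::size_t idx)
{
  Point<N> point{};
  // Walk from the last dimension to the first: the remainder is the offset in
  // this dimension, the quotient carries over to the slower dimensions.
  for (std::size_t d = N; d-- > 0;) {
    const auto extent = static_cast<std::size_t>(hi[d] - lo[d] + 1);
    point[d] = lo[d] + static_cast<long long>(idx % extent);
    idx /= extent;
  }
  return point;
}

}  // namespace sketch
```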
-
template<typename Functor, typename ...Fnargs>
decltype(auto) double_dispatch(
)#
Converts the runtime dimension and type code into compile time constants and invokes the functor with them.
The functor’s operator() should take a dimension and a type code as template parameters.
- Parameters:
dim – Dimension
code – Type code
f – Functor to dispatch
args – Extra arguments to the functor
- Returns:
The functor’s return value
-
template<typename Functor, typename ...Fnargs>
decltype(auto) double_dispatch(
)#
Converts the runtime dimensions into compile time constants and invokes the functor with them.
The functor’s operator() should take exactly two integers as template parameters.
- Parameters:
dim1 – First dimension
dim2 – Second dimension
f – Functor to dispatch
args – Extra arguments to the functor
- Returns:
The functor’s return value
-
template<typename Functor, typename ...Fnargs>
decltype(auto) dim_dispatch(
)#
Converts the runtime dimension into a compile time constant and invokes the functor with it.
The functor’s operator() should take an integer as its sole template parameter.
- Parameters:
dim – Dimension
f – Functor to dispatch
args – Extra arguments to the functor
- Returns:
The functor’s return value
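The general technique behind such a dispatcher is a switch over the runtime value in which each case instantiates the functor's templated operator() with the matching constant. The sketch below shows that pattern for dimensions 1 to 3 only; sketch::dim_dispatch and sketch::PointBytes are hypothetical names, and the supported range is an assumption for brevity.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>

namespace sketch {

// Turns a runtime dimension into a template argument of the functor's operator().
template <typename Functor, typename... Args>
decltype(auto) dim_dispatch(int dim, Functor&& f, Args&&... args)
{
  switch (dim) {
    case 1: return f.template operator()<1>(std::forward<Args>(args)...);
    case 2: return f.template operator()<2>(std::forward<Args>(args)...);
    case 3: return f.template operator()<3>(std::forward<Args>(args)...);
    default: break;
  }
  // This sketch only handles 1 to 3 dimensions.
  throw std::invalid_argument{"unsupported dimension"};
}

// Example functor: DIM is a compile-time constant inside operator().
struct PointBytes {
  template <int DIM>
  std::size_t operator()(std::size_t coord_size) const
  {
    return static_cast<std::size_t>(DIM) * coord_size;
  }
};

}  // namespace sketch
```

All cases must return the same type, since the return type is deduced from every return statement.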
-
template<typename Functor, typename ...Fnargs>
decltype(auto) type_dispatch(
)#
Converts the runtime type code into a compile time constant and invokes the functor with it.
The functor’s operator() should take a type code as its sole template parameter.
- Parameters:
code – Type code
f – Functor to dispatch
args – Extra arguments to the functor
- Returns:
The functor’s return value
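Dispatching on a type code works the same way, with a trait that maps each code to a C++ type. The sketch below assumes a hypothetical two-member sketch::TypeCode enum and a sketch::type_of trait; neither is the Legate type code enumeration, which covers many more types.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>

namespace sketch {

// Hypothetical type codes for illustration only.
enum class TypeCode { INT32, FLOAT64 };

// Maps each code to the C++ type it stands for.
template <TypeCode CODE>
struct type_of;
template <>
struct type_of<TypeCode::INT32> {
  using type = std::int32_t;
};
template <>
struct type_of<TypeCode::FLOAT64> {
  using type = double;
};

// Turns a runtime type code into a template argument of the functor's operator().
template <typename Functor, typename... Args>
decltype(auto) type_dispatch(TypeCode code, Functor&& f, Args&&... args)
{
  switch (code) {
    case TypeCode::INT32:
      return f.template operator()<TypeCode::INT32>(std::forward<Args>(args)...);
    case TypeCode::FLOAT64:
      return f.template operator()<TypeCode::FLOAT64>(std::forward<Args>(args)...);
  }
  throw std::invalid_argument{"unsupported type code"};
}

// Example functor: recovers the element type from the compile-time code.
struct ElementSize {
  template <TypeCode CODE>
  std::size_t operator()() const
  {
    return sizeof(typename type_of<CODE>::type);
  }
};

}  // namespace sketch
```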
-
template<typename Element, typename Extent, typename Layout, typename Accessor>
detail::FlatMDSpanView<::cuda::std::mdspan<Element, Extent, Layout, Accessor>> flatten(
) noexcept#
Create a flattened view of an mdspan that allows efficient random elementwise access.
The returned view object supports all the usual iterator semantics.
Unfortunately, flattening mdspan into a linear iterator ends up with inefficient code-gen as compilers are unable to untangle the internal state required to make this work. This is not really an “implementation quality” issue so much as a fundamental constraint. In order to implement iterators, you need to solve the problem of mapping a linear index to a N-dimensional point in space. This linearization is done via the following:
std::array<std::size_t, DIM> point;

for (auto dim = DIM; dim-- > 0;) {
  point[dim] = index % span.extent(dim);
  index /= span.extent(dim);
}
The problem is the modulus and division operations. Modern compilers are seemingly unable to hoist those computations out of the loop and vectorize the code. So an equivalent loop over the extents “normally”:
for (std::size_t i = 0; i < span.extent(0); ++i) {
  for (std::size_t j = 0; j < span.extent(1); ++j) {
    span(i, j) = ...
  }
}
will be fully vectorized by optimizers, but the following (which is more or less what this iterator expands to):
for (std::size_t i = 0; i < PROD(span.extents()...); ++i) {
  std::array<std::size_t, DIM> point = delinearize(i);

  span(point) = ...
}
defeats all known modern optimizing compilers. Therefore, unless this iterator is truly required, the user is strongly encouraged to iterate over their mdspan normally.
- Parameters:
span – The mdspan to flatten.
- Returns:
The flat view.
-
template<typename IndexType, std::size_t... Extents, typename F>
void for_each_in_extent(
)#
Execute a function fn for each i, j, k, ...-th point in an extent extents.
Invoking this method is roughly equivalent to
for (std::size_t i = 0; i < extents.extent(0); ++i) {
  for (std::size_t j = 0; j < extents.extent(1); ++j) {
    // ...
    fn(i, j, ...);
  }
}
where the number of nested loops generated is equal to the rank of the extent.
The utility of this function is multi-fold:
1. It allows efficient iteration over an mdspan of variable dimension.
2. It separates the iteration from the container. For example, if the user wanted to iterate over the intersection of multiple mdspans, they could compute the intersection of their extents, and use this function to generate the loops.
- Parameters:
extents – The extents to iterate over.
fn – The function to execute.
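One way to generate a loop nest whose depth depends on the rank is compile-time recursion: each level opens one loop and appends its index to a growing argument pack until every dimension is covered. The sketch below assumes the extents are given as a std::array instead of an mdspan extents object, and the names sketch::loop_nest and sketch::for_each_in_extent are used here for illustration only.

```cpp
#include <array>
#include <cstddef>

namespace sketch {

// Opens the loop for dimension Dim, then recurses. Once every dimension has an
// index (Dim == Rank), the accumulated indices are passed to fn.
template <std::size_t Dim, std::size_t Rank, typename F, typename... Idx>
void loop_nest(const std::array<std::size_t, Rank>& extents, F& fn, Idx... idx)
{
  if constexpr (Dim == Rank) {
    fn(idx...);
  } else {
    for (std::size_t i = 0; i < extents[Dim]; ++i) {
      loop_nest<Dim + 1>(extents, fn, idx..., i);
    }
  }
}

// Calls fn(i, j, ...) for every index tuple within the given extents, with the
// last dimension varying fastest.
template <std::size_t Rank, typename F>
void for_each_in_extent(const std::array<std::size_t, Rank>& extents, F&& fn)
{
  loop_nest<0>(extents, fn);
}

}  // namespace sketch
```

Because the recursion is resolved at compile time, the optimizer sees ordinary nested loops rather than an index-to-point conversion per element.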
-
template<std::int32_t DIM, typename F>
void for_each_in_extent(
)#
Execute a function fn for each i, j, k, ...-th index in point point.
This routine treats point as an “extent”, where each index of point gives the 0-based extent for that dimension. So given a 2D point <1, 1>, then this routine would generate the following calls:
fn(0, 0)
fn(0, 1)
fn(1, 0)
fn(1, 1)
- Parameters:
point – The Point to iterate over.
fn – The function to execute.
-
template<std::int32_t DIM, typename F>
void for_each_in_extent(
)#
Execute a function fn for each i, j, k, ...-th index in rect rect.
This routine is similar to the Point overload, except that the extents are given by the difference between rect[i].lo and rect[i].hi. The indices are then converted to 0-based indices before being passed to fn. So given a 2D rect: [<1, 1>, <2, 2>], then this routine would generate the following calls:
fn(0, 0)
fn(0, 1)
fn(0, 2)
fn(1, 0)
fn(1, 1)
fn(1, 2)
fn(2, 0)
fn(2, 1)
fn(2, 2)
- Parameters:
rect – The Rect to iterate over.
fn – The function to execute.
-
template<typename F>
ScopeGuard<F> make_scope_guard( - F &&fn
Create a ScopeGuard from a given functor.
- Parameters:
fn – The functor to create the ScopeGuard with.
- Template Parameters:
F – The type of fn, usually inferred from the argument itself.
- Returns:
The constructed ScopeGuard
-
class Time
- #include <legate/timing/timing.h>
Deferred timestamp class.
Public Functions
-
class Impl
-
template<typename T>
class ProcLocalStorage
- #include <legate/utilities/proc_local_storage.h>
A helper data structure to store processor-local objects.
Oftentimes, users need to create objects, usually some library handles, each of which is associated with only one processor (GPU, most likely). For those cases, users can create a ProcLocalStorage<T> that holds a unique singleton object of type T for each processor thread. The object can be retrieved simply by the get() method and internally the calls are distinguished by IDs of the processors invoking them.
Two parallel tasks running on the same processor will get the same object if they query the same ProcLocalStorage. Atomicity of access to the storage is guaranteed by the programming model running parallel tasks atomically on each processor; in other words, no synchronization is needed to call the get() method on a ProcLocalStorage even when it’s shared by multiple tasks.
Despite the name, the values that are stored in this storage don’t have static storage duration, but they are alive only as long as the owning ProcLocalStorage object is.
This example uses a ProcLocalStorage<int> to count the number of task invocations on each processor:
static void cpu_variant(legate::TaskContext context)
{
  static legate::ProcLocalStorage<int> counter{};

  if (!counter.has_value()) {
    // If this is the first visit, initialize the counter
    counter.emplace(1);
  } else {
    // Otherwise, increment the counter by 1
    ++counter.get();
  }
}
- Template Parameters:
T – Type of values stored in this ProcLocalStorage.
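As a mental model, the storage can be pictured as one optional slot per processor, selected by the ID of the processor making the call. The sketch below captures only that model. It assumes a hypothetical class sketch::PerProcStorage with a fixed slot count and takes the processor ID as an explicit argument, whereas the real class determines the executing processor itself.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <utility>

namespace sketch {

// Conceptual model only: one optional slot per processor, indexed by a
// processor ID that the caller supplies explicitly (an assumption of this sketch).
template <typename T, std::size_t MAX_PROCS = 8>
class PerProcStorage {
 public:
  bool has_value(std::size_t proc) const { return slots_.at(proc).has_value(); }

  // Constructs (or overwrites) the value for the given processor.
  template <typename... Args>
  T& emplace(std::size_t proc, Args&&... args)
  {
    return slots_.at(proc).emplace(std::forward<Args>(args)...);
  }

  // Throws std::logic_error if the processor has no value yet.
  T& get(std::size_t proc)
  {
    auto& slot = slots_.at(proc);
    if (!slot.has_value()) {
      throw std::logic_error{"no value for this processor"};
    }
    return *slot;
  }

 private:
  std::array<std::optional<T>, MAX_PROCS> slots_{};
};

// Mirrors the counting example: the first visit initializes, later visits increment.
inline void visit(PerProcStorage<int>& counter, std::size_t proc)
{
  if (!counter.has_value(proc)) {
    counter.emplace(proc, 1);
  } else {
    ++counter.get(proc);
  }
}

}  // namespace sketch
```

Each processor only ever touches its own slot, which is why the model needs no locking as long as tasks on one processor do not run concurrently.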
Public Types
-
using value_type = T
The type of stored objects.
Public Functions
-
bool has_value() const noexcept
Checks if the value has been created for the executing processor.
- Returns:
true if the value exists, false otherwise.
-
template<typename ...Args>
value_type &emplace(Args&&... args)
Constructs a new value for the executing processor.
The existing value will be overwritten by the new value.
- Parameters:
args – Arguments to the constructor of type T.
- Returns:
A reference to the newly constructed element.
-
value_type &get()
Returns the value for the executing processor.
- Throws:
std::logic_error – If no value exists for this processor (i.e., if has_value() returns false), or if the method is invoked outside a task.
- Returns:
The value for the executing processor.
-
const value_type &get() const
Returns the value for the executing processor.
- Throws:
std::logic_error – If no value exists for this processor (i.e., if has_value() returns false), or if the method is invoked outside a task.
- Returns:
The value for the executing processor.
-
template<typename F>
class ScopeGuard
- #include <legate/utilities/scope_guard.h>
A simple wrapper around a callable that automatically executes the callable on exiting the scope.
- Template Parameters:
F – The type of the callable to execute.
Public Types
-
using value_type = F
The type of callable stored within the ScopeGuard.
Public Functions
-
explicit ScopeGuard(value_type &&fn, bool enabled = true) noexcept
Construct a ScopeGuard.
On destruction, a ScopeGuard will execute fn if and only if it is in the enabled state. fn will be invoked with no arguments, and any return value discarded. fn must be no-throw move-constructible, and must not throw any exceptions when invoked.
- Parameters:
fn – The function to execute.
enabled – Whether the ScopeGuard should start in the “enabled” state.
-
ScopeGuard(ScopeGuard &&other) noexcept
Move-construct a ScopeGuard.
other will be left in the “disabled” state, and will not execute its held functor upon destruction. Furthermore, the held functor is moved into the receiving ScopeGuard, so other's functor may be in an indeterminate state. It is therefore not advised to re-enable other.
- Parameters:
other – The ScopeGuard to move from.
-
ScopeGuard &operator=(ScopeGuard &&other) noexcept
Construct a ScopeGuard via move-assignment.
This routine has no effect if other and this are the same.
other will be left in the “disabled” state, and will not execute its held functor upon destruction. Furthermore, the held functor is moved into the receiving ScopeGuard, so other's functor may be in an indeterminate state. It is therefore not advised to re-enable other.
- Parameters:
other – The ScopeGuard to move from.
- Returns:
A reference to this.
-
~ScopeGuard() noexcept
Destroy a ScopeGuard.
If the ScopeGuard is currently in the enabled state, executes the held functor, otherwise does nothing.
-
bool enabled() const
Query a ScopeGuard’s state.
- Returns:
true if the ScopeGuard is enabled, false otherwise.
-
void disable()
Disable a ScopeGuard.
This routine prevents a ScopeGuard from executing its held functor on destruction. On return, ScopeGuard::enabled() will return false.
Calling this routine on an already disabled ScopeGuard has no effect.
-
void enable()
Enable a ScopeGuard.
This routine makes a ScopeGuard execute its held functor on destruction. On return, ScopeGuard::enabled() will return true.
Calling this routine on an already enabled ScopeGuard has no effect.
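The enable and disable switches are what make the common commit-or-rollback pattern possible: the guard is armed with an undo action and disabled once the operation has succeeded. The sketch below is a minimal standalone guard written for illustration under the semantics described above. sketch::ScopeGuard and sketch::demo are hypothetical and do not reproduce the Legate class.

```cpp
#include <utility>

namespace sketch {

// Minimal guard with an enabled flag; runs fn on destruction only if enabled.
template <typename F>
class ScopeGuard {
 public:
  explicit ScopeGuard(F&& fn, bool enabled = true) noexcept
    : fn_{std::move(fn)}, enabled_{enabled}
  {
  }

  // The moved-from guard is left disabled so the functor runs at most once.
  ScopeGuard(ScopeGuard&& other) noexcept
    : fn_{std::move(other.fn_)}, enabled_{std::exchange(other.enabled_, false)}
  {
  }

  ScopeGuard(const ScopeGuard&)            = delete;
  ScopeGuard& operator=(const ScopeGuard&) = delete;

  ~ScopeGuard() noexcept
  {
    if (enabled_) {
      fn_();
    }
  }

  bool enabled() const noexcept { return enabled_; }
  void disable() noexcept { enabled_ = false; }
  void enable() noexcept { enabled_ = true; }

 private:
  F fn_;
  bool enabled_;
};

// Commit-or-rollback: the guard undoes the change unless it was disabled first.
inline int demo(bool commit)
{
  int value = 1;
  {
    ScopeGuard rollback{[&]() noexcept { value = 0; }};
    value = 42;
    if (commit) {
      rollback.disable();
    }
  }
  return value;
}

}  // namespace sketch
```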
-
template<typename F>
class ScopeFail
- #include <legate/utilities/scope_guard.h>
Similar to ScopeGuard, except that the callable is only executed if the scope is exited due to an exception.
- Template Parameters:
F – The type of the callable to execute.
Public Functions
-
explicit ScopeFail(value_type &&fn) noexcept
Construct a ScopeFail.
On destruction, a ScopeFail will execute fn if and only if the scope is being exited due to an uncaught exception. Therefore, unlike ScopeGuard, it is not possible to “disable” a ScopeFail.
fn will be invoked with no arguments, and any return value discarded. fn must be no-throw move-constructible, and must not throw any exceptions when invoked.
- Parameters:
fn – The function to execute.
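A common way to detect that a scope is being left because of an exception is to compare std::uncaught_exceptions() at destruction with the count recorded at construction. The sketch below uses that approach. Whether Legate does the same is an assumption, and sketch::ScopeFail and sketch::cleanup_ran are hypothetical names for illustration.

```cpp
#include <exception>
#include <stdexcept>
#include <utility>

namespace sketch {

// Runs fn on destruction only while an exception is unwinding through the scope.
template <typename F>
class ScopeFail {
 public:
  explicit ScopeFail(F&& fn) noexcept : fn_{std::move(fn)} {}

  ScopeFail(const ScopeFail&)            = delete;
  ScopeFail& operator=(const ScopeFail&) = delete;

  ~ScopeFail() noexcept
  {
    // More in-flight exceptions than at construction means this scope is unwinding.
    if (std::uncaught_exceptions() > count_at_entry_) {
      fn_();
    }
  }

 private:
  F fn_;
  int count_at_entry_{std::uncaught_exceptions()};
};

// Reports whether the failure handler ran for a scope that does or does not throw.
inline bool cleanup_ran(bool fail)
{
  bool ran = false;
  try {
    ScopeFail guard{[&]() noexcept { ran = true; }};
    if (fail) {
      throw std::runtime_error{"boom"};
    }
  } catch (const std::exception&) {
    // Swallow the exception; only the side effect of the guard matters here.
  }
  return ran;
}

}  // namespace sketch
```

Comparing counts rather than testing for any in-flight exception keeps the guard correct when it is itself created inside a destructor that runs during unwinding.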