Legate.STL Reference

Concepts

group Concepts

Functions

template<typename StoreLike> LogicalStore get_logical_store( StoreLike &&store_like )

Get the logical store from a logical store-like object.

See also

logical_store_like

template<typename StoreLike> concept logical_store_like

#include <legate/experimental/stl/detail/stlfwd.hpp>

A type StoreLike satisfies logical_store_like when it exposes a legate::LogicalStore via the get_logical_store customization point.

requires(StoreLike& storeish,
         legate::LogicalStore& lstore,
         stl::mdspan_for_t<StoreLike> span,
         legate::PhysicalStore& pstore) {
    { get_logical_store(storeish) } -> std::same_as<LogicalStore>;
    { StoreLike::policy::logical_view(lstore) } -> std::ranges::range;
    { StoreLike::policy::physical_view(span) } -> std::ranges::range;
    { StoreLike::policy::size(pstore) } -> legate::coord_t;
    { StoreLike::policy::partition_constraints(iteration_kind{}) } -> tuple-like;
    { StoreLike::policy::partition_constraints(reduction_kind{}) } -> tuple-like;
  };

See also

get_logical_store

template<typename Reduction> concept legate_reduction

#include <legate/experimental/stl/detail/stlfwd.hpp>

A concept describing the requirements of a reduction operation that can be used with the reduce and transform_reduce algorithms.

A reduction is characterized by the following three things:

An apply operation
A fold operation
An identity value

apply is used to apply the reduction operation to a pair of values, modifying the first value in-place. fold is used to combine two values into an accumulator that can then be passed as the second argument to the apply operation. The fold operation must be reflexive, transitive, and symmetric. fold, like apply, modifies the first parameter in-place.

The following relations must hold for the three reduction components:

apply(apply(x, y), z) is functionally equivalent to apply(x, fold(y, z)).
apply(x, identity) leaves x unchanged.
fold(x, identity) leaves x unchanged.

A type Reduction satisfies legate_reduction if the requires clause below is true:

requires (Reduction red, typename Reduction::LHS& lhs, typename Reduction::RHS& rhs) {
  { Reduction::template apply<true>(lhs, std::move(rhs)) } -> std::same_as<void>;
  { Reduction::template apply<false>(lhs, std::move(rhs)) } -> std::same_as<void>;
  { Reduction::template fold<true>(rhs, std::move(rhs)) } -> std::same_as<void>;
  { Reduction::template fold<false>(rhs, std::move(rhs)) } -> std::same_as<void>;
  typename std::integral_constant<typename Reduction::LHS, Reduction::identity>;
  typename std::integral_constant<int, Reduction::REDOP_ID>;
}
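The three relations above can be checked on a small example. The following is a minimal sketch in plain C++ with no Legate headers: `SumReduction`, the placeholder `REDOP_ID` value, and the helper functions are hypothetical names introduced for illustration, and the meaning given to the boolean template parameter (exclusive versus shared access) is an assumption that this single-threaded sketch ignores.

```cpp
#include <cassert>

// Hypothetical sum reduction that mirrors the shape of the requires clause.
// It is an illustration only and is not taken from Legate.
struct SumReduction {
  using LHS = long;
  using RHS = long;

  static constexpr LHS identity = 0;  // adding 0 leaves a value unchanged
  static constexpr int REDOP_ID = 1;  // placeholder ID, assumed for this sketch

  // The bool parameter is assumed to select exclusive vs. shared access;
  // this single-threaded sketch treats both cases the same way.
  template <bool Exclusive>
  static void apply(LHS& lhs, RHS rhs) { lhs += rhs; }

  // fold combines two right-hand values into the first one, in place.
  template <bool Exclusive>
  static void fold(RHS& acc, RHS rhs) { acc += rhs; }
};

// Computes apply(apply(x, y), z).
inline long apply_twice(long x, long y, long z) {
  SumReduction::apply<true>(x, y);
  SumReduction::apply<true>(x, z);
  return x;
}

// Computes apply(x, fold(y, z)).
inline long fold_then_apply(long x, long y, long z) {
  SumReduction::fold<true>(y, z);
  SumReduction::apply<true>(x, y);
  return x;
}

// Checks the three relations for one set of sample values.
inline void check_reduction_laws() {
  assert(apply_twice(10, 2, 3) == fold_then_apply(10, 2, 3));
  long x = 7;
  SumReduction::apply<false>(x, SumReduction::identity);
  assert(x == 7);
  SumReduction::fold<false>(x, SumReduction::identity);
  assert(x == 7);
}
```

For a sum, apply and fold happen to be the same operation. They differ when LHS and RHS are different types, which this sketch does not cover.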

See also

reduce
transform_reduce

Containers

Views

group Views

Typedefs

template<typename ElementType, std::int32_t Dim> using mdspan_t = ::cuda::std::mdspan<ElementType, ::cuda::std::dextents<coord_t, Dim>, ::cuda::std::layout_right, detail::MDSpanAccessor<ElementType, Dim>>

An alias for std::mdspan with a custom accessor that allows elementwise access to a legate::PhysicalStore.

Template Parameters:

ElementType – The element type of the mdspan.
Dim – The dimensionality of the mdspan.

using element_policy = detail::ElementPolicy: A policy for use with legate::experimental::stl::slice_view that creates a flat view of all the elements of the store.

using row_policy = detail::RowPolicy: A policy for use with legate::experimental::stl::slice_view that slices a logical store along the row (0th) dimension.

Note

The elements of the resulting range are logical stores with one fewer dimension than the original store.

using column_policy = detail::ColumnPolicy: A policy for use with legate::experimental::stl::slice_view that slices a logical store along the column (1st) dimension.

Note

The elements of the resulting range are logical stores with one fewer dimension than the original store.

template<std::int32_t... ProjDims> using projection_policy = detail::ProjectionPolicy<ProjDims...>: A policy for use with legate::experimental::stl::slice_view that slices a logical store along ProjDims... dimensions.

Note

The elements of the resulting range are logical stores with N fewer dimensions than the original store, where N is sizeof...(ProjDims).

template<typename ElementType, std::int32_t Dim, typename SlicePolicy> using slice_view = detail::slice_view_t<ElementType, Dim, SlicePolicy>

A view of a logical store, sliced along some specified dimension(s), resulting in a 1-dimensional range of logical stores.

Template Parameters:

ElementType – The element type of the underlying logical store.
Dim – The dimensionality of the underlying logical store.
SlicePolicy – A type that determines how the logical store is sliced into a range. Choices include element_policy, row_policy, column_policy, and projection_policy.
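To make the effect of the slice policies concrete, the sketch below models them on an ordinary row-major array in plain C++. `Matrix`, `slice_rows`, `slice_columns`, and `slice_elements` are hypothetical stand-ins, not Legate.STL names, and they copy data where the real views refer to the underlying store.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A dense 2-D array in row-major order; a stand-in for a 2-D logical store.
struct Matrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<int> data;  // holds rows * cols elements

  int at(std::size_t r, std::size_t c) const {
    assert(r < rows && c < cols);
    return data[r * cols + c];
  }
};

// Models row_policy: one 1-D slice per index of dimension 0.
inline std::vector<std::vector<int>> slice_rows(const Matrix& m) {
  std::vector<std::vector<int>> out(m.rows);
  for (std::size_t r = 0; r < m.rows; ++r)
    for (std::size_t c = 0; c < m.cols; ++c) out[r].push_back(m.at(r, c));
  return out;
}

// Models column_policy: one 1-D slice per index of dimension 1.
inline std::vector<std::vector<int>> slice_columns(const Matrix& m) {
  std::vector<std::vector<int>> out(m.cols);
  for (std::size_t c = 0; c < m.cols; ++c)
    for (std::size_t r = 0; r < m.rows; ++r) out[c].push_back(m.at(r, c));
  return out;
}

// Models element_policy: a flat sequence of every element.
inline std::vector<int> slice_elements(const Matrix& m) { return m.data; }
```

For a 3 x 4 input, `slice_rows` yields 3 slices of length 4 and `slice_columns` yields 4 slices of length 3. In both cases each slice has one fewer dimension than the input.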

Algorithms

group Algorithms

Functions

template<typename Range> void fill( Range &&output, value_type_of_t<Range> val )

Fills the given range with the specified value.

This function fills the elements in the range [begin, end) with the specified value. The range must be a logical store-like object, meaning it supports the necessary operations for storing values. The value to be filled is specified by the val parameter.

Example:

// Declare a 3-dimensional logical store and fill it with the value 42.
stl::logical_store<int, 3> store{{100, 200, 300}};
stl::fill(store, 42);
// store's elements are now all 42

Parameters:

output – The range to be filled.
val – The value to fill the range with.

template<typename Function, typename ...Inputs> void for_each_zip( Function &&fn, Inputs&&... ins )

Applies the given function fn with elements of each of the input sequences ins as function arguments.

This function launches a Legate task that applies the provided function fn element-wise to the pack of ranges ins.

Examples:

// Element-wise addition of two logical stores.
stl::logical_store<int, 2> store1 = {{1, 2, 3, 4},  //
                                     {2, 3, 4, 5},
                                     {3, 4, 5, 6}};
stl::logical_store<int, 2> store2 = {{3, 4, 5, 6},  //
                                     {2, 3, 4, 5},
                                     {1, 2, 3, 4}};

// `a` and `b` refer to the elements of `store1` and `store2`.
auto fn = [] LEGATE_HOST_DEVICE(int& a, int& b) { a += b; };
stl::for_each_zip(fn, store1, store2);

// store1 now contains:
// {
//   {4, 6, 8, 10},
//   {4, 6, 8, 10},
//   {4, 6, 8, 10}
// }

// Row-wise operation on two logical stores.
stl::logical_store<int, 2> store1 = {{1, 2, 3, 4},  //
                                     {2, 3, 4, 5},
                                     {3, 4, 5, 6}};
stl::logical_store<int, 2> store2 = {{3, 4, 5, 6},  //
                                     {2, 3, 4, 5},
                                     {1, 2, 3, 4}};

// `a` and `b` are `mdspan` objects referring to the rows of `store1`
// and `store2`.
auto fn = [] LEGATE_HOST_DEVICE(auto&& a, auto&& b) {
  for (std::ptrdiff_t i = 0; i < a.extent(0); ++i) {
    a(i) += b(i);
  }
};
stl::for_each_zip(fn, stl::rows_of(store1), stl::rows_of(store2));

// store1 now contains:
// {
//   {4, 6, 8, 10},
//   {4, 6, 8, 10},
//   {4, 6, 8, 10}
// }

Parameters:

fn – The function object to apply with each set of elements.
ins – The input sequences to iterate over.

Pre:

The number of input sequences must be greater than 0.
The input sequences must satisfy the logical_store_like concept.
The input sequences must have the same shape.
The function object fn must be callable with the same number of arguments as the number of input sequences.
The function object fn must be trivially copyable.
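The preconditions above can be illustrated without Legate. In the sketch below, `AddConstant` and `for_each_zip_sketch` are hypothetical names: the sketch runs sequentially on `std::vector<int>` and only mirrors the same-shape and trivially-copyable requirements. It does not launch a task.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <vector>

// A function object that holds only an int, so it is trivially copyable.
struct AddConstant {
  int delta;
  void operator()(int& a, const int& b) const { a += b + delta; }
};

static_assert(std::is_trivially_copyable_v<AddConstant>);

// std::function has a user-provided copy constructor and destructor,
// so it does not meet the trivially-copyable requirement.
static_assert(!std::is_trivially_copyable_v<std::function<void(int&, const int&)>>);

// Sequential stand-in for for_each_zip over two equally sized vectors.
template <typename Fn>
void for_each_zip_sketch(Fn fn, std::vector<int>& a, const std::vector<int>& b) {
  static_assert(std::is_trivially_copyable_v<Fn>, "fn must be trivially copyable");
  assert(a.size() == b.size());  // the inputs must have the same shape
  for (std::size_t i = 0; i < a.size(); ++i) fn(a[i], b[i]);
}
```

Calling `for_each_zip_sketch` with a `std::function` fails to compile because of the static_assert, which is the kind of error the precondition is meant to prevent.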

template<typename Input, typename Function> void for_each( Input &&input, Function &&fn )

Applies the given function to each element in the input range.

This function launches a Legate task that applies the provided function fn to each element in the input range input.

Examples:

// Element-wise increment of a logical store.
stl::logical_store<int, 2> store = {{1, 2, 3, 4},  //
                                    {2, 3, 4, 5},
                                    {3, 4, 5, 6}};

// `a` refers to the elements of `store`.
auto fn = [] LEGATE_HOST_DEVICE(int& a) { ++a; };
stl::for_each(store, fn);

// store now contains:
// {
//   {2, 3, 4, 5},
//   {3, 4, 5, 6},
//   {4, 5, 6, 7}
// }

// Row-wise operation on a logical store.
stl::logical_store<int, 2> store = {{1, 2, 3, 4},  //
                                    {2, 3, 4, 5},
                                    {3, 4, 5, 6}};

// `a` is an `mdspan` object referring to the rows of `store`.
auto fn = [] LEGATE_HOST_DEVICE(auto&& a) { a(0) = 42; };
stl::for_each(stl::rows_of(store), fn);

// store now contains:
// {
//   {42, 2, 3, 4},
//   {42, 3, 4, 5},
//   {42, 4, 5, 6}
// }

Parameters:

input – The input range to iterate over.
fn – The function to apply to each element.

Pre:

The input range input must satisfy the logical_store_like concept.
The function object fn must be callable with the element type of the input range.
The function object fn must be trivially copyable.

template<typename InputRange, typename Init, typename ReductionOperation> logical_store<element_type_of_t<InputRange>, dim_of_v<Init>> reduce( InputRange &&input, Init &&init, ReductionOperation op )

Reduces the elements of the input range using the given reduction operation.

Examples:

auto store = stl::create_store({5}, std::int64_t{1});
auto init  = stl::create_store({}, std::int64_t{1});

// fill the store with data
auto elems = stl::elements_of(store);
std::iota(elems.begin(), elems.end(), std::int64_t{1});

auto result = stl::reduce(store, init, std::plus<>());

auto result_span = stl::as_mdspan(result);
auto&& value     = result_span();
static_assert(std::is_same_v<decltype(value), const std::int64_t&>);
EXPECT_EQ(16, value);

stl::logical_store<std::int64_t, 2> store = {{0, 0, 0, 0},  //
                                             {1, 1, 1, 1},
                                             {2, 2, 2, 2}};

// Reduce by rows
{
  auto init        = stl::create_store({4}, std::int64_t{0});
  auto result      = stl::reduce(stl::rows_of(store), init, stl::elementwise(std::plus<>()));
  auto result_span = stl::as_mdspan(result);
  EXPECT_EQ(result_span.rank(), 1);
  EXPECT_EQ(result_span.extent(0), 4);
  EXPECT_EQ(result_span(0), 3);
  EXPECT_EQ(result_span(1), 3);
  EXPECT_EQ(result_span(2), 3);
  EXPECT_EQ(result_span(3), 3);
}

// Reduce by columns
{
  auto init        = stl::create_store({3}, std::int64_t{0});
  auto result      = stl::reduce(stl::columns_of(store), init, stl::elementwise(std::plus<>()));
  auto result_span = stl::as_mdspan(result);
  EXPECT_EQ(result_span.rank(), 1);
  EXPECT_EQ(result_span.extent(0), 3);
  EXPECT_EQ(result_span(0), 0);
  EXPECT_EQ(result_span(1), 4);
  EXPECT_EQ(result_span(2), 8);
}

Parameters:

input – The input range to reduce.
init – The initial value of the reduction.
op – The reduction operation to apply to the elements of the input range. op can be a type that satisfies the legate_reduction concept or one of the standard functional objects std::plus, std::minus, std::multiplies, std::divides, etc.; or an elementwise operation created by passing any of the above to stl::elementwise.

Pre:

InputRange must satisfy the logical_store_like concept.
Init must satisfy the logical_store_like concept.
The value type of the input range must be the same as the value type of the initial value.
The dimension of the input range must be one greater than the dimension of the initial value.

Returns:

An instance of logical_store with the same value type and shape as init.
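The dimension rule (the input has one more dimension than init) is easiest to see on the row and column examples above. The following sketch reproduces their arithmetic in plain C++. `Slice` and `reduce_slices` are hypothetical names, and the loop runs sequentially rather than as a Legate task.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Slice = std::vector<long>;

// Combines every slice into a copy of init, element by element.
// The input is a range of 1-D slices (2-D overall); init and the result are 1-D.
template <typename Op>
Slice reduce_slices(const std::vector<Slice>& slices, Slice init, Op op) {
  for (const Slice& s : slices) {
    assert(s.size() == init.size());  // each slice must match the shape of init
    for (std::size_t i = 0; i < init.size(); ++i) init[i] = op(init[i], s[i]);
  }
  return init;
}
```

Passing the three rows {0,0,0,0}, {1,1,1,1}, {2,2,2,2} with a zero init of length 4 gives {3,3,3,3}. Passing the four columns {0,1,2} with a zero init of length 3 gives {0,4,8}. Both match the expectations in the examples.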

template<typename InputRange, typename OutputRange, typename UnaryOperation> void transform( InputRange &&input, OutputRange &&output, UnaryOperation op )

Applies a unary operation to each element in the input range and stores the result in the output range.

The input range and the output range may be the same.

Example:

stl::logical_store<std::int64_t, 2> input = {{0, 1, 2, 3},  //
                                             {4, 5, 6, 7},
                                             {8, 9, 10, 11}};

// Transform by rows
auto result = stl::create_store({3, 4}, std::int64_t{0});
stl::transform(stl::rows_of(input),  //
               stl::rows_of(result),
               stl::elementwise(Square()));

// `result` now contains the squares of the elements:
//     [[0   1   4   9]
//      [16 25  36  49]
//      [64 81 100 121]]

Parameters:

input – The input range. Must satisfy the logical_store_like concept.
output – The output range. Must satisfy the logical_store_like concept.
op – The unary operation to apply.

Pre:

The input and output ranges must have the same shape.
The unary operation must be trivially relocatable.

template<typename InputRange1, typename InputRange2, typename OutputRange, typename BinaryOperation> void transform( InputRange1 &&input1, InputRange2 &&input2, OutputRange &&output, BinaryOperation op )

Applies a binary operation to each element in two input ranges and stores the result in the output range.

The output range may be the same as one of the input ranges.

Example:

std::size_t extents[] = {4, 5};
auto store1           = stl::create_store<std::int64_t>(extents);
auto store2           = stl::create_store<std::int64_t>(extents);
auto store3           = stl::create_store<std::int64_t>(extents);

// Stateless extended lambdas work with both clang CUDA and nvcc
auto shift = [] LEGATE_HOST_DEVICE(std::int64_t a, std::int64_t b) {  //
  return a << b;
};

stl::fill(store1, 2);
stl::fill(store2, 4);
stl::transform(store1, store2, store3, shift);

// `store3` now contains the elements:
//     [[32 32 32 32 32]
//      [32 32 32 32 32]
//      [32 32 32 32 32]
//      [32 32 32 32 32]]

Parameters:

input1 – The first input range. Must satisfy the logical_store_like concept.
input2 – The second input range. Must satisfy the logical_store_like concept.
output – The output range. Must satisfy the logical_store_like concept.
op – The binary operation to apply.

Pre:

The input and output ranges must all have the same shape.
The binary operation must be trivially relocatable.

template<typename InputRange, typename Init, typename Reduction, typename UnaryTransform> logical_store<value_type_of_t<Init>, dim_of_v<Init>> transform_reduce( InputRange &&input, Init &&init, Reduction &&reduction_op, UnaryTransform &&transform_op )

Transform the elements of a store with a unary operation and reduce them using a binary operation with an initial value.

stl::transform_reduce(input, init, reduction_op, transform_op) is semantically equivalent (with the caveat noted below) to:

auto result = stl::create_store<TransformResult>( <extents-of-input> );
stl::transform(input, result, transform_op);
return stl::reduce(result, init, reduction_op);

TransformResult is the result type of the unary transform operation. The input store arguments can be logical_store instances, or they can be views created with one of the view adaptors. If the input store is a view, the result store used to hold the results of the transformation step will be a view of the same shape as the input store.

Example:

stl::logical_store<std::int64_t, 1> store{{5}};

// fill the store with data. The store will contain {1, 2, 3, 4, 5}
auto elems = stl::elements_of(store);
std::iota(elems.begin(), elems.end(), std::int64_t{1});

// a host/device lambda to square the elements
auto square = [] LEGATE_HOST_DEVICE(std::int64_t x) { return x * x; };

// sum the squared elements
auto result = stl::transform_reduce(store, stl::scalar<std::int64_t>(0), std::plus<>{}, square);

auto result_span  = stl::as_mdspan(result);
auto result_value = result_span();  // index into the 0-D mdspan
// result_value is 55

Parameters:

input – The input range to transform.
init – The initial value of the reduction.
reduction_op – The reduction operation to apply to the transformed elements of the input range. Reduction can be a type that satisfies the legate_reduction concept or one of the standard functional objects std::plus, std::minus, std::multiplies, std::divides, etc.; or an elementwise operation created by passing any of the above to stl::elementwise.
transform_op – The unary operation to apply to the elements of the input prior to the reduction step.

Pre:

InputRange must satisfy the logical_store_like concept.
Init must satisfy the logical_store_like concept.
The result type of the unary transform must be the same as the value type of the reduction’s initial value.
The dimension of the input range must be one greater than the dimension of the initial value.

Returns:

An instance of logical_store with the same value type and shape as init.
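The stated equivalence between transform_reduce and a transform followed by a reduce has a direct analogue in the C++ standard library. The sketch below uses only standard algorithms on a `std::vector`, not Legate.STL. The function names are hypothetical and the data matches the sum-of-squares example above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Two-step form: transform into a temporary, then reduce the temporary.
inline std::int64_t sum_of_squares_two_step(const std::vector<std::int64_t>& input,
                                            std::int64_t init) {
  std::vector<std::int64_t> squared(input.size());
  std::transform(input.begin(), input.end(), squared.begin(),
                 [](std::int64_t x) { return x * x; });
  return std::accumulate(squared.begin(), squared.end(), init);
}

// Fused form: the standard-library counterpart of a single transform_reduce call.
inline std::int64_t sum_of_squares_fused(const std::vector<std::int64_t>& input,
                                         std::int64_t init) {
  return std::transform_reduce(input.begin(), input.end(), init, std::plus<>{},
                               [](std::int64_t x) { return x * x; });
}

// Self-check using the data from the example: 1 + 4 + 9 + 16 + 25 = 55.
inline void check_sum_of_squares() {
  const std::vector<std::int64_t> v{1, 2, 3, 4, 5};
  assert(sum_of_squares_two_step(v, 0) == 55);
  assert(sum_of_squares_fused(v, 0) == 55);
}
```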

template<typename InputRange1, typename InputRange2, typename Init, typename Reduction, typename BinaryTransform> logical_store<element_type_of_t<Init>, dim_of_v<Init>> transform_reduce( InputRange1 &&input1, InputRange2 &&input2, Init &&init, Reduction &&reduction_op, BinaryTransform &&transform_op )

Transform the elements of two stores with a binary operation and reduce them using a binary operation with an initial value.

stl::transform_reduce(input1, input2, init, reduction_op, transform_op) is semantically equivalent (with the caveat noted below) to:

auto result = stl::create_store<TransformResult>( <extents-of-input1> );
stl::transform(input1, input2, result, transform_op);
return stl::reduce(result, init, reduction_op);

TransformResult is the result type of the binary transform operation. The input store arguments can be logical_store instances, or they can be views created with one of the view adaptors. If the input stores are views, the result store used to hold the results of the transformation step will be a view of the same shape as the first input store.

Parameters:

input1 – The first input range to transform.
input2 – The second input range to transform.
init – The initial value of the reduction.
reduction_op – The reduction operation to apply to the transformed elements of the input ranges. Reduction can be a type that satisfies the legate_reduction concept or one of the standard functional objects std::plus, std::minus, std::multiplies, std::divides, etc.; or an elementwise operation created by passing any of the above to stl::elementwise.
transform_op – The binary operation to apply to the elements of the two input ranges prior to the reduction step.

Pre:

InputRange1 and InputRange2 must satisfy the logical_store_like concept.
Init must satisfy the logical_store_like concept.
The shape of the input ranges must be the same.
The result type of the binary transform must be the same as the value type of the initial reduction value.
The dimensionality of the input ranges must be one greater than the dimension of the reduction initial value.

Returns:

An instance of logical_store with the same value type and shape as init.

Utilities

group Utilities

Functions

template<typename Function> unspecified elementwise(Function &&fn)

A functional adaptor that, given a callable object fn, returns another callable object g that applies fn element-wise to its arguments.

The arguments to g must be mdspan objects or models of the logical_store_like concept. The shapes of the input arguments must all match. The element-wise application of fn is performed lazily; i.e., the result is not computed until the elements of the result are accessed.

Example:

// Perform element-wise addition of the rows of two logical stores,
// assigning the result element-wise into the rows of the first.
stl::logical_store<int, 2> store1 = {
  {1, 2, 3, 4},  // row 0
  {2, 3, 4, 5},  // row 1
  {3, 4, 5, 6}   // row 2
};
stl::logical_store<int, 2> store2 = {
  {10, 20, 30, 40},  // row 0
  {20, 30, 40, 50},  // row 1
  {30, 40, 50, 60}   // row 2
};
stl::transform(stl::rows_of(store1),
               stl::rows_of(store2),
               stl::rows_of(store1),
               stl::elementwise(std::plus<>{}));

// store1 now contains:
// [[11 22 33 44]   // row 0
//  [22 33 44 55]   // row 1
//  [33 44 55 66]]  // row 2

Note

The Legate.STL algorithms recognize the return type of \(\mathtt{elementwise(fn)(}A^1,A^2\cdots,A^n\mathtt{)}\) such that assigning its result to an mdspan object will perform an element-wise assignment. The element-wise assignment is done with thrust::copy and will be accelerated if CUDA support is enabled.

Parameters:

fn – The callable object to apply element-wise.

Returns:

A callable object \(\mathtt{g}\) such that, given multi-dimensional arguments \(A^1,A^2\cdots,A^n\), the expression \(\mathtt{g(}A^1,A^2\cdots,A^n\mathtt{)}\) returns a multi-dimensional view \(\mathtt{V}\) where \(\mathtt{V}_{i,j,\ldots}\) is the result of calling \(\mathtt{fn(}{A^1}_{i,j,\ldots}, {A^2}_{i,j,\ldots}, \cdots, {A^n}_{i,j,\ldots}\mathtt{)}\).
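The lazy behaviour described for elementwise can be sketched in plain C++. Everything below is a hypothetical model restricted to two `std::vector<int>` arguments: `LazyZipView`, `elementwise_sketch`, and `assign` are illustrative names, and the real adaptor works on mdspan objects and store-like types instead.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// A lazy element-wise view over two equally sized vectors.
template <typename Fn>
struct LazyZipView {
  Fn fn;
  const std::vector<int>* lhs;
  const std::vector<int>* rhs;

  std::size_t size() const { return lhs->size(); }

  // fn runs only here, when an element is actually requested.
  int operator[](std::size_t i) const { return fn((*lhs)[i], (*rhs)[i]); }
};

// Hypothetical stand-in for stl::elementwise: returns a callable g such that
// g(a, b) is a view V with V[i] == fn(a[i], b[i]).
template <typename Fn>
auto elementwise_sketch(Fn fn) {
  return [fn](const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());  // the argument shapes must match
    return LazyZipView<Fn>{fn, &a, &b};
  };
}

// Assigning the view into a destination evaluates it element by element.
// The destination may be one of the inputs, since element i is read before it is written.
template <typename Fn>
void assign(std::vector<int>& dst, const LazyZipView<Fn>& view) {
  assert(dst.size() == view.size());
  for (std::size_t i = 0; i < dst.size(); ++i) dst[i] = view[i];
}
```

Calling `elementwise_sketch(std::plus<>{})(a, b)` performs no additions by itself. The additions happen when the view is indexed or passed to `assign`.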

template<LaunchParam... Params> void launch_task(Params... params)

A function that launches a task with the given inputs, outputs, scalars, and constraints.

Launch parameter arguments can be one of the following in any order:

legate::experimental::stl::inputs - specifies the input stores for the task
- Example:
```
inputs(store1, store2, store3)
```
legate::experimental::stl::outputs - specifies the output stores for the task
- Example:
```
outputs(store1, store2, store3)
```
legate::experimental::stl::scalars - specifies the scalar arguments for the task
- Example:
```
scalars(42, 3.14f)
```
legate::experimental::stl::function - specifies the function to be applied iteratively to the inputs.
- The function will take as arguments the current elements of the input stores, in order, followed by the current elements of the output stores. The elements of a stl::logical_store are lvalue references to the elements of the physical store it represents. The elements of a view such as stl::rows_of(store) are mdspans denoting the rows of store.
- The function must be bitwise copyable.
- Only one of function or reduction can be specified in a call to launch_task
- Example:
```
function([](const auto& in, auto& out) { out = in * in; })
```
legate::experimental::stl::reduction - specifies the reduction store and the reduction function to be applied to the inputs.
- The function must be bitwise copyable.
- The reduction function must take as arguments mdspans referring to parts of the input stores.
- The reduction store can be a logical_store or some view of a store, such as rows_of(store). When operating on a view, the arguments to the reduction function will be the elements of the view. For example, if the reduction store is rows_of(store), the arguments passed to the reduction function will be mdspans denoting rows of store.
- Only one of function or reduction can be specified in a call to launch_task
- Example:
```
stl::reduction(stl::rows_of(store), stl::elementwise(std::plus{}))
```
legate::experimental::stl::constraints - specifies the constraints for the task.
- A constraint is a callable that takes a legate::AutoTask& and the input, output, and reduction stores as arguments. Its function signature must be:
```
void(legate::AutoTask&,                // the task to add the constraints to
     const std::vector<LogicalStore>&, // the input stores
     const std::vector<LogicalStore>&, // the output stores
     const LogicalStore&)              // the reduction store
```
- Legate.STL provides one constraint generator, legate::experimental::stl::align, for specifying the alignment constraints for the task. It can be used in many different ways:
  - align(inputs[0], inputs[1]) - aligns the first input with the second input
  - align(inputs[0], outputs[0]) - aligns the first input with the first output
  - align(outputs[0], inputs) - aligns the first output with all the inputs
  - align(outputs, inputs[1]) - aligns all the outputs with the second input
  - align(reduction, inputs[0]) - aligns the reduction store with the first input
  - align(reduction, inputs) - aligns the reduction store with all the inputs
  - align(inputs) - aligns all the inputs with each other
  - align(outputs) - aligns all the outputs with each other

Example

The following use of launch_task is equivalent to stl::transform(input, output, op):

stl::launch_task(stl::function(detail::UnaryTransform{std::move(op)}),
                 stl::inputs(std::forward<InputRange>(input)),
                 stl::outputs(std::forward<OutputRange>(output)),
                 stl::constraints(stl::align(stl::inputs[0], stl::outputs[0])));

Variables

ResourceConfig LEGATE_STL_RESOURCE_CONFIG = {1024, 1024, 64, 0, 0}

Configuration for the Legate STL resource.

This constant represents the configuration for the Legate STL resource. It specifies (in order):

the maximum number of tasks,
the maximum number of dynamic tasks,
the maximum number of reduction operations,
the maximum number of projections, and
the maximum number of shardings

that can be used in a program using Legate.STL.

See also

initialize_library

class launch_task

#include <legate/experimental/stl/detail/launch_task.hpp>

A class that represents a task launcher.

The launch_task class provides a convenient interface for launching tasks in the Legate framework. It supports both iteration tasks and reduction tasks. The tasks are created and submitted to the runtime using the provided inputs, outputs, scalars, and constraints.

class initialize_library

#include <legate/experimental/stl/detail/registrar.hpp>

A class that initializes the Legate runtime and creates the legate.stl library instance.

The initialize_library class is responsible for creating a library instance. The initialization fails if the runtime has not started. If the initialization is successful, it creates a library with the name "legate.stl".

The library instance is automatically destroyed when the initialize_library object goes out of scope.

It is harmless to create multiple initialize_library objects in the same program.

See also

LEGATE_STL_RESOURCE_CONFIG

Public Functions

inline initialize_library()

Constructs an initialize_library object.

This constructor creates a library instance for Legate STL. The initialization fails if the runtime has not started. If the initialization is successful, it creates a library with the name "legate.stl".

Throws:

std::runtime_error – If the runtime has not started.