Operations#

Operations in Legate are by default automatically parallelized. Legate extracts parallelism from an operation by partitioning its store arguments. Operations usually require the partitions to be aligned in some way; e.g., partitioning vectors across multiple addition tasks requires the vectors to be partitioned in the same way. Legate provides APIs for developers to control how stores are partitioned via partitioning constraints.
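The alignment requirement for vector addition can be illustrated without Legate at all. The following is a minimal plain-Python sketch, not Legate code: `tile_bounds` and `aligned_add` are hypothetical helpers, and lists stand in for stores. It shows why both inputs and the output need the same partition when each piece is processed independently.

```python
def tile_bounds(length, num_tiles):
    """Split range(length) into num_tiles contiguous (start, stop) pairs."""
    base, extra = divmod(length, num_tiles)
    bounds, start = [], 0
    for i in range(num_tiles):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def aligned_add(x, y, num_tiles):
    """Add two vectors tile by tile, using one shared partition for both."""
    if len(x) != len(y):
        raise ValueError("aligned vectors must have the same length")
    out = [None] * len(x)
    # Each (start, stop) pair stands in for one parallel task instance.
    # Because x, y, and out share the same bounds, every instance sees
    # matching elements of all three vectors.
    for start, stop in tile_bounds(len(x), num_tiles):
        out[start:stop] = [a + b for a, b in zip(x[start:stop], y[start:stop])]
    return out


result = aligned_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], num_tiles=2)
print(result)
```

If `x` and `y` were split at different boundaries, a task instance holding one tile of `x` would not have the matching elements of `y`, which is the situation an alignment constraint rules out.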

When an operation needs a store to be partitioned in more than one way, the operation can create partition symbols and use them in partitioning constraints. In that case, a partition symbol must be passed along with the store when the store is added. Stores can be partitioned in multiple ways when they are used only for read accesses or reductions.

AutoTask#

AutoTask is a type of task that is automatically parallelized. Each Legate task is associated with a task id that uniquely names the task to invoke. The actual task implementation resides on the C++ side.

AutoTask.add_input(store[, partition])

Adds a store as input to the task

AutoTask.add_output(store[, partition])

Adds a store as output to the task

AutoTask.add_reduction(store, redop[, partition])

Adds a store to the task for reduction

AutoTask.add_scalar_arg(value, dtype)

Adds a by-value argument to the task

AutoTask.declare_partition(store[, ...])

Creates a partition symbol for the store

AutoTask.add_constraint(constraint)

Adds a partitioning constraint to the operation

AutoTask.add_alignment(store1, store2)

Sets an alignment between stores.

AutoTask.add_broadcast(store[, axes])

Sets a broadcasting constraint on the store.

AutoTask.throws_exception(exn_type)

Declares that the task can raise an exception.

AutoTask.can_raise_exception

Indicates whether the task can raise an exception

AutoTask.add_nccl_communicator()

Adds an NCCL communicator to the task

AutoTask.add_cpu_communicator()

Adds a CPU communicator to the task

AutoTask.side_effect

Indicates whether the task has side effects

AutoTask.set_concurrent(concurrent)

Sets whether the task needs a concurrent task launch.

AutoTask.set_side_effect(side_effect)

Sets whether the task has side effects or not.

AutoTask.execute()

Submits the operation to the runtime.
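As a rough illustration of how the methods listed above fit together, the sketch below configures a task object for a hypothetical element-wise computation on two inputs and one output. It uses only method names from this listing. `RecordingTask` is a hypothetical stand-in that merely records calls so the example runs without Legate; how a real AutoTask is obtained from the runtime is not covered here, and what the task actually computes is determined by its C++ implementation.

```python
class RecordingTask:
    """Hypothetical stand-in that records calls; not part of Legate."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any method that is not defined on this class is recorded
        # instead of being executed.
        def record(*args):
            self.calls.append((name, args))

        return record


def build_elementwise_task(task, x, y, out, alpha, dtype):
    """Configure `task` with two inputs, one output, and one scalar."""
    task.add_input(x)
    task.add_input(y)
    task.add_output(out)
    task.add_scalar_arg(alpha, dtype)
    # Request that all three stores be partitioned in the same way.
    task.add_alignment(x, y)
    task.add_alignment(x, out)
    task.execute()


task = RecordingTask()
build_elementwise_task(task, "x", "y", "out", 2.0, "float64")
for name, args in task.calls:
    print(name, args)
```

The strings passed in place of stores and the dtype are placeholders; with a real task they would be Legate stores and a Legate-compatible type.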

Copy#

Copy is a special kind of operation for copying data from one store to another. Unlike tasks that are mapped to and run on application processors, copies are performed by the DMA engine in the runtime. Also, unlike tasks that are user-defined, copies have well-defined semantics and come with predefined partitioning assumptions on stores. Hence, copies need not take partitioning constraints from developers.

A copy can optionally take an indirection store holding the indices to use when accessing the source or target. With an indirection store on the source, the copy performs a gather operation, and with an indirection store on the target, the copy performs a scatter; when indirections exist for both the source and target, the copy becomes a full gather-scatter copy. Out-of-bounds indices are not checked and can produce undefined behavior. The caller is therefore responsible for ensuring that the indices are within bounds.
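The three indirect copy modes can be sketched in plain Python, with lists standing in for one-dimensional stores. This is only a model of the usual gather and scatter semantics, not Legate code; the function names are hypothetical.

```python
def gather(source, source_indirect):
    """Model of a gather: target[i] = source[source_indirect[i]]."""
    return [source[j] for j in source_indirect]


def scatter(source, target, target_indirect):
    """Model of a scatter: target[target_indirect[i]] = source[i].

    Returns a new list instead of modifying `target` in place.
    """
    result = list(target)
    for value, j in zip(source, target_indirect):
        result[j] = value
    return result


def gather_scatter(source, target, source_indirect, target_indirect):
    """Model of target[target_indirect[i]] = source[source_indirect[i]]."""
    return scatter(gather(source, source_indirect), target, target_indirect)


source = [10, 20, 30, 40]
gathered = gather(source, [3, 0, 0])
scattered = scatter([7, 8], [0, 0, 0, 0], [2, 0])
both = gather_scatter(source, [0, 0, 0, 0], [1, 3], [0, 2])
print(gathered, scattered, both)
```

Unlike the copies described above, Python lists raise `IndexError` for indices past the end and accept negative indices, so this model does not reproduce the unchecked out-of-bounds behavior.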

Copy.add_input(store)

Adds a store as a source of the copy

Copy.add_output(store)

Adds a store as a target of the copy.

Copy.add_reduction(store, redop)

Adds a store as a reduction target of the copy.

Copy.add_source_indirect(store)

Adds an indirection for sources.

Copy.add_target_indirect(store)

Adds an indirection for targets.

Copy.execute()

Submits the operation to the runtime.

Fill#

Fill is a special kind of operation for filling a store with constant values. Like copies, fills are performed by the DMA engine and their partitioning constraints are predefined.

Fill.execute()

Submits the operation to the runtime.

Manually Parallelized Tasks#

On some occasions, tasks are unnatural or even impossible to write in the auto-parallelized style. For those occasions, Legate provides explicit control over how tasks are parallelized via ManualTask. Each manual task requires the caller to provide a launch domain that determines the degree of parallelism and also names the task instances initiated by the task. Direct store arguments to a manual task are assumed to be replicated across task instances, and it’s the developer’s responsibility to partition stores. The mapping between points in the launch domain and colors in the color space of a store partition is assumed to be an identity mapping by default, but it can be configured with a projection function, a Python function on tuples of coordinates. (See StorePartition for definitions of color, color space, and store partition.)
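The relationship between launch-domain points, colors, and projection functions can be modeled in a few lines of plain Python. This is a sketch under simplifying assumptions, not Legate code: a dictionary keyed by color stands in for a store partition, and `launch_points`, `identity_projection`, and `assign_tiles` are hypothetical helpers.

```python
from itertools import product


def launch_points(launch_shape):
    """Enumerate the points of a dense launch domain with the given shape."""
    return list(product(*(range(extent) for extent in launch_shape)))


def identity_projection(point):
    """Default mapping: a launch point uses the color with the same coordinates."""
    return point


def assign_tiles(tiles, launch_shape, proj=None):
    """Map each launch point to the tile whose color is proj(point)."""
    proj = proj or identity_projection
    return {point: tiles[proj(point)] for point in launch_points(launch_shape)}


# A 2x2 color space; each color names one tile of a partitioned store.
tiles = {(0, 0): "A", (0, 1): "B", (1, 0): "C", (1, 1): "D"}

identity = assign_tiles(tiles, (2, 2))
# A projection function is an ordinary function on coordinate tuples;
# this one swaps the two coordinates.
transposed = assign_tiles(tiles, (2, 2), proj=lambda p: (p[1], p[0]))
print(identity[(0, 1)], transposed[(0, 1)])
```

With the identity mapping, the task instance at point `(0, 1)` receives tile `B`; with the swapping projection it receives tile `C` instead.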

ManualTask.side_effect

Indicates whether the task has side effects

ManualTask.set_concurrent(concurrent)

Sets whether the task needs a concurrent task launch.

ManualTask.set_side_effect(side_effect)

Sets whether the task has side effects or not.

ManualTask.add_input(arg[, proj])

Adds a store as input to the task

ManualTask.add_output(arg[, proj])

Adds a store as output to the task

ManualTask.add_reduction(arg, redop[, proj])

Adds a store to the task for reduction

ManualTask.add_scalar_arg(value, dtype)

Adds a by-value argument to the task

ManualTask.throws_exception(exn_type)

Declares that the task can raise an exception.

ManualTask.can_raise_exception

Indicates whether the task can raise an exception

ManualTask.add_nccl_communicator()

Adds an NCCL communicator to the task

ManualTask.add_cpu_communicator()

Adds a CPU communicator to the task

ManualTask.execute()

Submits the operation to the runtime.