Machine and Resource Scoping

By default, each Legate operation is allowed to use the entire machine for parallelization, but client programs often want control over the machine resources assigned to each section of the program. Legate provides a programmatic way to control resource assignment, called resource scoping.

Scoping resources takes two steps. First, the client queries the machine available in the current scope and shrinks it to the subset it wants to use. Then, the client installs that subset as the machine of a nested scope with an ordinary with statement. All Legate operations issued within that with block are subject to the resource scoping. The steps look like the following pseudocode:

# Retrieve the machine of the current scope
machine = legate.get_machine()
# Extract a subset to assign to a nested scope; extract_subset is a
# placeholder for any Machine operation that narrows the machine
subset = extract_subset(machine)
# Install the subset as the machine of the nested scope
with subset:
    ...

The machine available to a nested scope is always a subset of the machine of the outer scope. If the machine given to a scope contains resources that are not part of the outer scope's machine, those resources are removed during resource scoping. The machine used for scoping must not be empty; otherwise, an EmptyMachineError is raised.
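The nesting rule above can be illustrated with a small, self-contained toy model. This is only a sketch and does not use Legate: `ToyMachine` is a hypothetical stand-in that represents a machine as a plain set of processor IDs, and the `EmptyMachineError` defined here is a local class that merely borrows the name from the text.

```python
class EmptyMachineError(Exception):
    """Local stand-in for the error named in the text (not Legate's class)."""


class ToyMachine:
    """Hypothetical toy machine: just a set of processor IDs."""

    _scopes = []  # stack of effective machines for the active scopes

    def __init__(self, procs):
        self.procs = frozenset(procs)

    def __enter__(self):
        # Clip to the enclosing scope, if any, so nesting can only shrink.
        outer = ToyMachine._scopes[-1] if ToyMachine._scopes else None
        procs = self.procs if outer is None else self.procs & outer.procs
        if not procs:
            raise EmptyMachineError("scoped machine has no processors")
        effective = ToyMachine(procs)
        ToyMachine._scopes.append(effective)
        return effective

    def __exit__(self, exc_type, exc, tb):
        ToyMachine._scopes.pop()
        return False  # never swallow exceptions


with ToyMachine(range(8)) as outer:
    # IDs 8 and 9 are not part of the outer scope, so they are dropped.
    with ToyMachine(range(6, 10)) as inner:
        inner_procs = sorted(inner.procs)

print(inner_procs)
```

In this model, entering a scope whose machine does not overlap the outer scope at all raises the stand-in `EmptyMachineError`, mirroring the rule that a scoped machine must not be empty.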

When a machine has more than one kind of processor, the parallelization heuristic prefers processor kinds in the following order: GPU > OpenMP > CPU.
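As a rough illustration of that ordering, the sketch below picks the preferred kind from a table of processor counts. It is a toy model, not Legate code: the string labels, the `PREFERENCE` tuple, and the `preferred_kind` helper are all hypothetical names, and the assumption that a kind with zero processors is skipped is ours.

```python
# Preference order as stated in the text; labels are illustrative only.
PREFERENCE = ("GPU", "OMP", "CPU")


def preferred_kind(available):
    """Return the most preferred kind that has at least one processor.

    `available` maps a kind label to a processor count.
    """
    for kind in PREFERENCE:
        if available.get(kind, 0) > 0:
            return kind
    raise ValueError("no processors available")


gpu_node = preferred_kind({"CPU": 16, "OMP": 2, "GPU": 4})
no_gpu_node = preferred_kind({"CPU": 16, "OMP": 2, "GPU": 0})
cpu_only = preferred_kind({"CPU": 16})
print(gpu_node, no_gpu_node, cpu_only)
```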

Metadata about the machine is stored in a Machine object, and the Machine class provides APIs for querying and subdividing resources.

Machine

`Machine.preferred_target`	TaskTarget
`Machine.valid_targets`	tuple[TaskTarget, ...]
`Machine.get_processor_range`(self[, target])	Returns the processor range of a given task target.
`Machine.get_node_range`(self[, target])	Returns the node range for the processors of a given task target.
`Machine.only`(self, targets)	Returns a machine that contains only the processors of the given kinds.
`Machine.count`(self[, target])	Returns the number of processors of a given task target.
`Machine.empty`	bool
`Machine.__and__`(self, Machine other)	Computes an intersection with a given machine.
`Machine.__len__`(self)	Returns the number of preferred processors.
`Machine.__getitem__`(self, key)	Slices the machine with a given slicer.

ProcessorRange

A ProcessorRange is a half-open interval of global processor IDs.

`ProcessorRange.low`	uint32_t
`ProcessorRange.high`	uint32_t
`ProcessorRange.per_node_count`	uint32_t
`ProcessorRange.empty`	bool
`ProcessorRange.get_node_range`(self)	Returns the range of node IDs for this processor range.
`ProcessorRange.slice`(self, slice sl)	Slices the processor range by a given `slice`.
`ProcessorRange.__and__`(self, ...)	Computes an intersection with a given processor range.
`ProcessorRange.__len__`(self)	Returns the number of processors in the range.
`ProcessorRange.__getitem__`(self, key)	Slices the processor range with a given slicer.
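To make the half-open interval semantics concrete, here is a minimal pure-Python sketch. `ToyRange` is a hypothetical stand-in, not Legate's `ProcessorRange`: it assumes that slice indices are relative to `low`, that only unit-step slices are supported, and that processors are numbered node by node so that node IDs follow from integer division by `per_node_count`. The real class may differ on any of these points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRange:
    """Hypothetical half-open interval [low, high) of processor IDs."""

    low: int
    high: int  # exclusive upper bound
    per_node_count: int = 1

    def __len__(self):
        return max(0, self.high - self.low)

    @property
    def empty(self):
        return len(self) == 0

    def __and__(self, other):
        # Overlap of two half-open intervals; disjoint inputs give [0, 0).
        lo, hi = max(self.low, other.low), min(self.high, other.high)
        if lo >= hi:
            return ToyRange(0, 0, self.per_node_count)
        return ToyRange(lo, hi, self.per_node_count)

    def __getitem__(self, key):
        if not isinstance(key, slice) or key.step not in (None, 1):
            raise TypeError("only unit-step slices are supported")
        # Assumption: slice indices are offsets from `low`.
        start, stop, _ = key.indices(len(self))
        stop = max(start, stop)
        return ToyRange(self.low + start, self.low + stop, self.per_node_count)

    def node_range(self):
        # Assumption: IDs are assigned node by node, per_node_count each.
        first = self.low // self.per_node_count
        last = (self.high + self.per_node_count - 1) // self.per_node_count
        return (first, last)  # also half-open


r = ToyRange(4, 12, per_node_count=4)
size = len(r)                        # processors 4..11
middle = r[2:6]                      # offsets 2..5 relative to low
overlap = r & ToyRange(10, 20, 4)    # shared processors 10 and 11
nodes = r.node_range()               # nodes 1 and 2 under the assumption
print(size, middle, overlap, nodes)
```

Returning a canonical empty range for disjoint inputs keeps equality checks on empty results simple; a real implementation may represent emptiness differently.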