Changes: 25.10

General

  • Add support for CUDA 13.

  • Define LEGION_DISABLE_DEPRECATED_ENUMS in legate_defines.h, which disables the use of deprecated Legion enum values. Uses of these values will now result in compile-time errors.

C++

General

  • Add config option --profile-name to customize the base filename for profiling output.

  • Add new UCX networking backend. On supporting builds (LEGATE_DEFINED(LEGATE_USE_UCX) being 1), CPU collectives can be routed through UCX by disabling MPI (LEGATE_CONFIG=--disable-mpi).

  • Remove support for Cal communicator.

  • Remove LEGATE_USE_CAL macro definition.

  • Remove support for requesting the cal communicator via legate.AutoTask.add_communicator("cal") and legate.ManualTask.add_communicator("cal").

  • Remove legate/cuda/cuda.h.

  • Remove LEGATE_CHECK_CUDA() and LegateCheckCUDA().

  • Remove LEGATE_CHECK_CUDA_STREAM() and LegateCheckCUDAStream().

  • Remove LEGATE_THREADS_PER_BLOCK, LEGATE_MIN_CTAS_PER_SM, LEGATE_MAX_REDUCTION_CTAS, and LEGATE_WARP_SIZE. These were internal symbols accidentally exposed via legate/cuda/cuda.h and have been privatized as part of its removal.

Data

  • Make legate::ScopedAllocator copy-constructible.

  • Add legate::ScopedAllocator::allocate_aligned().

  • Add legate::ScopedAllocator::allocate_type().

  • Add legate::LogicalArray::as_struct_array().

  • Add legate::StructLogicalArray.

Mapping

Partitioning

  • Add overload of legate::align() that takes a span of variables to align as a convenience for aligning multiple task arguments.

  • Add overload of legate::broadcast() that takes a span of variables to broadcast as a convenience for broadcasting multiple task arguments.

  • Add overload of legate::broadcast() that takes a span of pairs of variables and axes to broadcast as a convenience for broadcasting multiple task arguments.

  • Add legate::LogicalStore::get_partition(). This method allows users to access the partition used by the runtime for optimizing task launches and data movement.

Tasks

Types

Tuning

  • Change legate::ParallelPolicy to take an enum type legate::StreamingMode instead of a boolean flag for streaming task execution.

Runtime

  • Add legate::Runtime::create_struct_array().

Utilities

  • Switch legate::Span to be an alias for cuda::std::span instead of a homegrown implementation. Because legate::Span mirrored the interface of std::span (as cuda::std::span does), this change should be invisible to users.

  • Remove previously deprecated classes legate::cuda::StreamPool and legate::cuda::StreamView.

I/O

  • Change HDF5 Virtual File Driver (VFD) GPUDirectStorage (GDS) to be enabled by default if Legate was built with support for it and Legate determines that GDS is likely to work. Previously this feature was disabled by default but could be toggled on via the --io-use-vfd-gds LEGATE_CONFIG option.

    Users should note that there is presently no way to reliably know ahead of time (i.e. before attempting cuFile calls) whether the filesystem supports GDS. Legate employs several heuristics to determine viability; in rare cases these can produce false positives or false negatives. Users relying on I/O performance who want this feature enabled should ensure it is on via the flag (as before), while users who encounter false positives should disable it via the flag and raise a bug report at nv-legate/legate#issues.

  • Add legate::io::hdf5::to_file() to write a legate::LogicalArray to a file using HDF5.
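
For users who want to pin the GDS setting explicitly rather than rely on the heuristics described above, the following is a minimal sketch of composing the LEGATE_CONFIG environment variable before a Legate program is launched. The helper name add_legate_flag is hypothetical, and the sketch assumes LEGATE_CONFIG holds whitespace-separated options; only the flag names --io-use-vfd-gds and --disable-mpi come from these notes. The exact syntax for turning the flag off is not shown here because these notes do not spell it out.

```python
import shlex


def add_legate_flag(env, flag):
    """Return a copy of `env` with `flag` present once in LEGATE_CONFIG.

    Hypothetical helper: assumes LEGATE_CONFIG is a whitespace-separated
    list of options, as in LEGATE_CONFIG=--disable-mpi.
    """
    updated = dict(env)
    flags = shlex.split(updated.get("LEGATE_CONFIG", ""))
    if flag not in flags:
        flags.append(flag)
    updated["LEGATE_CONFIG"] = shlex.join(flags)
    return updated


# Start from an environment that already disables MPI, then request GDS.
env = add_legate_flag({"LEGATE_CONFIG": "--disable-mpi"}, "--io-use-vfd-gds")
print(env["LEGATE_CONFIG"])  # --disable-mpi --io-use-vfd-gds
```

The resulting mapping could be passed as the env argument of subprocess.run when launching the program, which keeps the parent process environment untouched.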

Python

General

  • Remove legate.core.AutoTask.add_cal_communicator().

  • Remove legate.core.ManualTask.add_cal_communicator().

  • Remove support for passing "cal" to legate.core.AutoTask.add_communicator() and legate.core.ManualTask.add_communicator().

Data

  • Add legate.core.LogicalArray.as_struct_array().

  • Add legate.core.StructLogicalArray.

Mapping

  • Add legate.core.DimOrdering and legate.core.DimOrderingKind for dimension ordering.

  • Add an optional argument of type DimOrderingKind to legate.core.Runtime.create_from_buffer() that denotes the dimension ordering kind.

Partitioning

  • Change legate.core.align() to return an iterable of constraints instead of a single value.

  • Change legate.core.broadcast() to return an iterable of constraints instead of a single value.

  • Add legate.core.LogicalStore.partition. This property allows users to access the partition used by the runtime for optimizing task launches and data movement.
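
Because align() and broadcast() now return an iterable of constraints, code that must run against both older and newer releases may want to normalize the return value. The following is a minimal sketch of such a shim. The helper as_constraints and the FakeConstraint stand-in are hypothetical and exist only to keep the example self-contained; it also assumes that a single constraint object from an older release is not itself iterable.

```python
from collections.abc import Iterable


def as_constraints(result):
    """Return a list of constraints from either a single value or an iterable.

    Hypothetical compatibility helper, not part of Legate.
    """
    if isinstance(result, Iterable):
        return list(result)
    return [result]


class FakeConstraint:
    """Hypothetical stand-in for a constraint object returned by align()."""

    def __init__(self, name):
        self.name = name


# Older behaviour: a single constraint. Newer behaviour: an iterable of them.
single = FakeConstraint("align(a, b)")
many = (FakeConstraint("align(a, b)"), FakeConstraint("align(b, c)"))

for constraint in as_constraints(single) + as_constraints(many):
    print(constraint.name)
```

Code that only targets this release or later can iterate over the return value directly and does not need a shim like this.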

Tasks

  • Change registration of tasks to now be performed lazily. It is no longer necessary to call task.complete_registration().

  • Accessing the task ID of a task (i.e. invoking task.task_id) now registers the task if it wasn’t already.

  • Remove the register optional argument from legate.core.task.task decorator.

Types

Tuning

  • Change legate.core.ParallelPolicy to take an enum type legate.core.StreamingMode instead of a boolean flag for streaming task execution.

Runtime

  • Expose profiling range functions to Python: legate.core.start_profiling_range() and legate.core.stop_profiling_range().

  • Add legate.core.runtime.config property to access runtime configuration. Note that this API is considered an implementation detail and has no guarantee of stability.

  • Add legate.core.Runtime.create_struct_array().

Utilities

I/O

  • Add legate.io.hdf5.from_file_batched() to read an HDF5 file in batches.

  • Remove legate.io.hdf5.kerchunk_read(). Legate has had first-class support for HDF5 reads for a while, making this function obsolete.

  • Add legate.io.hdf5.to_file() to write a legate.core.LogicalArray to file using HDF5.