io-kvikio

group KVikIO

I/O operations backed by KVikIO.

Functions

LogicalStore from_file(
const std::filesystem::path &file_path,
const Type &type
)

Read a LogicalStore from a file.

The store contained in file_path must have been written by a call to to_file(const std::filesystem::path&, const LogicalStore&).

This routine expects the file to contain nothing but the raw data, laid out linearly and starting at offset 0. The file must not contain metadata, padding, or any other data: every byte in it is interpreted as data to be read into the store.
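The raw layout described above can be illustrated without Legate at all. The following is a minimal sketch using only the C++ standard library; write_raw and read_raw are hypothetical helper names introduced for illustration, and the element type is assumed to be std::int32_t. It is not how from_file() or to_file() are implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <vector>

// Hypothetical helper: write the elements back-to-back starting at offset 0,
// with no header, padding, or trailing metadata.
inline void write_raw(const std::filesystem::path& path,
                      const std::vector<std::int32_t>& data)
{
  std::ofstream out{path, std::ios::binary | std::ios::trunc};
  out.write(reinterpret_cast<const char*>(data.data()),
            static_cast<std::streamsize>(data.size() * sizeof(std::int32_t)));
}

// Hypothetical helper: interpret every byte of the file as element data.
inline std::vector<std::int32_t> read_raw(const std::filesystem::path& path)
{
  const auto num_bytes = std::filesystem::file_size(path);
  // A raw file of int32 elements must hold a whole number of elements.
  assert(num_bytes % sizeof(std::int32_t) == 0);

  std::vector<std::int32_t> data(num_bytes / sizeof(std::int32_t));
  std::ifstream in{path, std::ios::binary};
  in.read(reinterpret_cast<char*>(data.data()),
          static_cast<std::streamsize>(num_bytes));
  return data;
}
```

With this layout, the file size in bytes is exactly the number of elements times the element size, which is why the datatype has to be supplied separately when reading.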

Warning

This API is experimental. A future release may change or remove this API without warning, deprecation period, or notice. The user is nevertheless encouraged to use this API, and submit any feedback to legate@nvidia.com.

Parameters:
  • file_path – The path to the file.

  • type – The datatype of the store.

Throws:

std::system_error – If file_path does not exist.

Returns:

LogicalStore The loaded store.

void to_file(
const std::filesystem::path &file_path,
const LogicalStore &store
)

Write a LogicalStore to a file.

The store must be linear, i.e. have dimension of 1.

Warning

This API is experimental. A future release may change or remove this API without warning, deprecation period, or notice. The user is nevertheless encouraged to use this API, and submit any feedback to legate@nvidia.com.

Parameters:
  • file_path – The path to the file.

  • store – The store to serialize.

Throws:

std::invalid_argument – If the dimension of store is not 1.

LogicalStore from_file(
const std::filesystem::path &file_path,
const Shape &shape,
const Type &type,
const std::vector<std::uint64_t> &tile_shape,
std::optional<std::vector<std::uint64_t>> tile_start = {}
)

Load a LogicalStore from a file in tiles.

The file must have been written by a call to to_file(). If tile_start is not given, it is initialized with zeros.

tile_start and tile_shape must have the same size.

The resulting store must have the same number of dimensions as the tiles. In effect, store.dim() must equal tile_shape.size().

The store shape must be divisible by the tile shape.

Given some store stored on disk as:

[1, 2, 3, 4, 5, 6, 7, 8, 9]

tile_shape sets the leaf-task launch group size. For example, tile_shape = [3] would result in each leaf-task getting assigned a contiguous triplet of the store:

  task_0     task_1     task_2
____|____  _____|___  ____|____
[1, 2, 3], [4, 5, 6], [7, 8, 9]

tile_start is a local offset into each tile from which to begin reading. In the above example, tile_start = [1] would mean that the resulting store is read as:

// First, split into tile_shape shapes.
[1, 2, 3], [4, 5, 6], [7, 8, 9]
// Then apply the offset (1) to each subgroup
   [2, 3],    [5, 6],    [8, 9]

Such that the resulting store would contain:

[2, 3, 5, 6, 8, 9]
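The selection rule in the example above can be modelled on an in-memory vector. This is only a sketch of the 1-D case under stated assumptions: read_tiled_1d is a hypothetical name, it operates on a std::vector rather than a file, and the requirement that tile_start not exceed tile_shape is an assumption of the sketch rather than something the documentation states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical model of the 1-D example: split `data` into tiles of
// `tile_shape` elements, then keep each tile's elements from local index
// `tile_start` onwards.
inline std::vector<std::int32_t> read_tiled_1d(const std::vector<std::int32_t>& data,
                                               std::uint64_t tile_shape,
                                               std::uint64_t tile_start = 0)
{
  if (tile_shape == 0 || data.size() % tile_shape != 0) {
    throw std::invalid_argument{"store shape must be divisible by the tile shape"};
  }
  // Assumption of this sketch: the local offset stays within the tile.
  assert(tile_start <= tile_shape);

  std::vector<std::int32_t> result;
  for (std::size_t tile_begin = 0; tile_begin < data.size(); tile_begin += tile_shape) {
    for (std::size_t i = tile_start; i < tile_shape; ++i) {
      result.push_back(data[tile_begin + i]);
    }
  }
  return result;
}
```

Calling read_tiled_1d with the nine values above, a tile shape of 3, and a tile start of 1 reproduces the six-element result shown in the example.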

Warning

This API is experimental. A future release may change or remove this API without warning, deprecation period, or notice. The user is nevertheless encouraged to use this API, and submit any feedback to legate@nvidia.com.

Parameters:
  • file_path – The path to the dataset.

  • shape – The shape of the resulting store.

  • type – The datatype of the store.

  • tile_shape – The shape of each tile.

  • tile_start – The offsets into each tile from which to read.

Throws:
  • std::system_error – If file_path does not exist.

  • std::invalid_argument – If tile_shape and tile_start are not the same size.

  • std::invalid_argument – If the store dimension does not match the tile shape.

  • std::invalid_argument – If the store shape is not divisible by the tile shape.

Returns:

LogicalStore The loaded store.

void to_file(
const std::filesystem::path &file_path,
const LogicalStore &store,
const std::vector<std::uint64_t> &tile_shape,
std::optional<std::vector<std::uint64_t>> tile_start = {}
)

Write a LogicalStore to file in tiles.

If tile_start is not given, it is initialized with zeros.

tile_start and tile_shape must have the same size.

store must have the same number of dimensions as tiles. In effect store.dim() must equal tile_shape.size().

The store shape must be divisible by the tile shape.

See from_file() for further discussion on the arguments.
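The three argument constraints listed above can be checked up front in caller code. The following is a sketch only: validate_tiling is a hypothetical helper, the store shape is passed as a plain vector of extents, and the order of the checks and the treatment of zero-sized tiles are assumptions that need not match what the library does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical pre-flight check mirroring the documented constraints.
// Returns the tile_start that would effectively be used.
inline std::vector<std::uint64_t> validate_tiling(
  const std::vector<std::uint64_t>& store_shape,
  const std::vector<std::uint64_t>& tile_shape,
  const std::optional<std::vector<std::uint64_t>>& tile_start = std::nullopt)
{
  // A missing tile_start behaves like zeros, one per tile dimension.
  auto start = tile_start.value_or(std::vector<std::uint64_t>(tile_shape.size(), 0));

  if (start.size() != tile_shape.size()) {
    throw std::invalid_argument{"tile_shape and tile_start must have the same size"};
  }
  if (store_shape.size() != tile_shape.size()) {
    throw std::invalid_argument{"store dimension must match the tile shape"};
  }
  for (std::size_t d = 0; d < store_shape.size(); ++d) {
    // Assumption of this sketch: a zero extent in tile_shape is rejected too.
    if (tile_shape[d] == 0 || store_shape[d] % tile_shape[d] != 0) {
      throw std::invalid_argument{"store shape must be divisible by the tile shape"};
    }
  }
  // Both size checks above imply this.
  assert(start.size() == store_shape.size());
  return start;
}
```

Running such a check before calling to_file() or from_file() turns a std::invalid_argument raised inside the library into an error reported at a point the caller controls.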

Warning

This API is experimental. A future release may change or remove this API without warning, deprecation period, or notice. The user is nevertheless encouraged to use this API, and submit any feedback to legate@nvidia.com.

Parameters:
  • file_path – The base path of the dataset to write.

  • store – The store to serialize.

  • tile_shape – The shape of the tiles.

  • tile_start – The offsets into each tile from which to write.

Throws:
  • std::invalid_argument – If tile_shape and tile_start are not the same size.

  • std::invalid_argument – If the store dimension does not match the tile shape.

  • std::invalid_argument – If the store shape is not divisible by the tile shape.

LogicalStore from_file_by_offsets(
const std::filesystem::path &file_path,
const Shape &shape,
const Type &type,
const std::vector<std::uint64_t> &offsets,
const std::vector<std::uint64_t> &tile_shape
)

Load a LogicalStore from a file in tiles.

The resulting store must have the same number of dimensions as the tiles. In effect, store.dim() must equal tile_shape.size().

This routine should be used if each leaf task should read from an offset that is potentially non-uniform, i.e. one that does not follow the same pattern as the offsets of the other tasks. If the offset is uniform (i.e. can be deduced from the leaf task index and the tile shape), then from_file() should be preferred.

For example, given some store (of int32’s) stored on disk as:

[1, 2, 3, 4, 5, 6, 7, 8, 9]

tile_shape sets the leaf-task launch group size. For example, tile_shape = {3} would result in each leaf-task getting assigned a contiguous triplet of the store:

  task_0     task_1     task_2
____|____  _____|___  ____|____
[1, 2, 3], [4, 5, 6], [7, 8, 9]

It also sets the number of elements to read. Each leaf-task will read tile_shape.volume() * type.size() bytes from the file.
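The byte count above is the product of the tile extents multiplied by the element size. A minimal sketch of that arithmetic follows; tile_volume and bytes_per_tile are hypothetical helper names, and the tile shape is taken as a plain vector of extents.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Number of elements in one tile: the product of the tile extents.
inline std::uint64_t tile_volume(const std::vector<std::uint64_t>& tile_shape)
{
  return std::accumulate(
    tile_shape.begin(), tile_shape.end(), std::uint64_t{1}, std::multiplies<>{});
}

// Number of bytes one leaf task reads for a tile of the given shape.
inline std::uint64_t bytes_per_tile(const std::vector<std::uint64_t>& tile_shape,
                                    std::size_t type_size)
{
  assert(type_size > 0);
  return tile_volume(tile_shape) * type_size;
}
```

For the example above, a tile shape of {3} with 4-byte elements gives 12 bytes per leaf task.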

offsets encodes the per-leaf-task global offset in bytes into the store for each tile. Crucially, these offsets need not (and by definition shall not) be the same for each leaf task. For example, assuming sizeof(std::int32_t) = 4:

std::vector<std::uint64_t> offsets = {
  // task_0 reads from byte index 0 of the file (i.e. starting from the first value, 1)
  0,
  // task_1 reads from byte index 4 * 3 = 12 of the file (i.e. starting from the value 4)
  3 * sizeof(std::int32_t),
  // task_2 reads from byte index 4 * 7 = 28 of the file (i.e. starting from the value 8)
  7 * sizeof(std::int32_t),
};

Note that the final offset is arbitrary: with uniform offsets, task_2 would start from the value 7 (byte index 4 * 6 = 24) instead. The resulting store would then contain:

[1, 2, 3, 4, 5, 6, 8, 9]
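For comparison, the uniform offsets mentioned above are the ones that can be derived from the task index and the tile size alone. The sketch below computes them for the 1-D case; uniform_offsets is a hypothetical helper, not part of the API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: byte offsets for tiles laid out back-to-back, i.e.
// task i starts at i * elems_per_tile * type_size.
inline std::vector<std::uint64_t> uniform_offsets(std::uint64_t num_tiles,
                                                  std::uint64_t elems_per_tile,
                                                  std::uint64_t type_size)
{
  assert(type_size > 0);
  std::vector<std::uint64_t> offsets(num_tiles);
  for (std::size_t i = 0; i < offsets.size(); ++i) {
    offsets[i] = i * elems_per_tile * type_size;
  }
  return offsets;
}
```

For three tiles of three 4-byte elements this yields 0, 12, and 24, so the example's offsets differ from the uniform ones only in the last entry (28 rather than 24).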

If the data is multi-dimensional, the task IDs for the purposes of indexing into offsets are linearized in C order. For example, if we have 2x2 tiles (tile_shape = {2, 2}), the task IDs would be linearized as follows:

(0, 0) -> 0
(0, 1) -> 1
(1, 0) -> 2
(1, 1) -> 3
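The mapping above is ordinary C-order (row-major) linearization, in which the last index varies fastest. A minimal sketch follows; linearize_c_order is a hypothetical helper, and the grid argument is assumed to hold the number of tasks along each dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: map a multi-dimensional task index to its linear ID
// in C (row-major) order. `grid` holds the number of tasks per dimension.
inline std::uint64_t linearize_c_order(const std::vector<std::uint64_t>& index,
                                       const std::vector<std::uint64_t>& grid)
{
  assert(index.size() == grid.size());
  std::uint64_t id = 0;
  for (std::size_t d = 0; d < index.size(); ++d) {
    assert(index[d] < grid[d]);
    // Shift the ID accumulated so far by the extent of this dimension.
    id = id * grid[d] + index[d];
  }
  return id;
}
```

The linear ID returned for a task index is the position of that task's entry in offsets.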

Warning

This API is experimental. A future release may change or remove this API without warning, deprecation period, or notice. The user is nevertheless encouraged to use this API, and submit any feedback to legate@nvidia.com.

Parameters:
  • file_path – The path to the file to read.

  • shape – The shape of the resulting store.

  • type – The datatype of the store.

  • offsets – The per-leaf-task global offsets (in bytes) into the file from which to read.

  • tile_shape – The shape of each tile.

Throws:
  • std::system_error – If file_path does not exist.

  • std::invalid_argument – If offsets.size() does not equal the number of partitioned store tiles.

Returns:

LogicalStore The loaded store.
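The offsets.size() requirement listed under Throws can be checked before the call. The sketch below assumes that the number of partitioned tiles is the product, over all dimensions, of the store extent divided by the tile extent, and that the store shape is divisible by the tile shape; num_tiles and check_offsets are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: number of tiles, assuming the store shape is evenly
// divisible by the tile shape in every dimension.
inline std::uint64_t num_tiles(const std::vector<std::uint64_t>& shape,
                               const std::vector<std::uint64_t>& tile_shape)
{
  assert(shape.size() == tile_shape.size());
  std::uint64_t count = 1;
  for (std::size_t d = 0; d < shape.size(); ++d) {
    assert(tile_shape[d] != 0 && shape[d] % tile_shape[d] == 0);
    count *= shape[d] / tile_shape[d];
  }
  return count;
}

// Hypothetical helper: one byte offset is expected per tile.
inline void check_offsets(const std::vector<std::uint64_t>& shape,
                          const std::vector<std::uint64_t>& tile_shape,
                          const std::vector<std::uint64_t>& offsets)
{
  if (offsets.size() != num_tiles(shape, tile_shape)) {
    throw std::invalid_argument{"offsets.size() must equal the number of tiles"};
  }
}
```

A store of nine elements with a tile shape of {3} has three tiles under this assumption, so it needs exactly three offsets.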