Readers
DataAxesFormats.Readers
—
Module
The
DafReader
interface specifies a high-level API for reading
Daf
data. This API is implemented here, on top of the low-level
FormatReader
API. The high-level API provides thread safety so the low-level API can (mostly) ignore this issue.
Each data set is given a name to use in error messages etc. You can explicitly set this name when creating a
Daf
object. Otherwise, when opening an existing data set, if it contains a scalar "name" property, it is used. Otherwise some reasonable default is used. In all cases, object names are passed through
unique_name
to avoid ambiguity.
Data properties are identified by a unique name given the axes they are based on. That is, there is a separate namespace for scalar properties, vector properties for each specific axis, and matrix properties for each unordered pair of axes.
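The namespace rule above can be pictured with plain dictionaries. This is only a toy model using Base Julia, not how the package stores data; the names `scalars`, `vectors`, `matrices` and `matrix_key` are hypothetical and exist only for this illustration.

```julia
# Toy namespaces. NOT the Daf implementation.

# One namespace for scalar properties.
scalars = Dict("age" => 3)

# One namespace per axis for vector properties, so "age" may exist on several axes.
vectors = Dict(("cell", "age") => [1, 2], ("donor", "age") => [30, 40, 50])

# Matrix properties are keyed by an unordered pair of axes, so we normalize the pair.
function matrix_key(first_axis::String, second_axis::String, name::String)
    return first_axis <= second_axis ? (first_axis, second_axis, name) :
           (second_axis, first_axis, name)
end

matrices = Dict(matrix_key("gene", "cell", "UMIs") => [1 2 3; 4 5 6])

# Both axis orders map to the same key, hence to the same property name.
same_key = matrix_key("cell", "gene", "UMIs") == matrix_key("gene", "cell", "UMIs")
```

The same property name ("age") lives in three separate namespaces here without any conflict.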
For matrices, we keep careful track of their
MatrixLayouts
. Returned matrices are always in column-major layout, using
relayout!
if necessary. As this is an expensive operation, we'll cache the result in memory. Similarly, we cache the results of applying a query to the data. We allow clearing the cache to reduce memory usage, if necessary.
The data API is the high-level API intended to be used from outside the package, and is therefore re-exported from the top-level
Daf
namespace. It provides additional functionality on top of the low-level
FormatReader
implementation, accepting more general data types and automatically dealing with
relayout!
when needed. In particular, it enforces single-writer multiple-readers for each data set, so the format code can ignore multi-threading and still be thread-safe.
In the APIs below, when getting a value, specifying a
default
of
undef
means that it is an
error
for the value not to exist. In contrast, specifying a
default
of
nothing
means it is OK for the value not to exist, returning
nothing
. Specifying an actual value for
default
means it is OK for the value not to exist, returning the
default
instead. This is in the spirit of, but not identical to,
undef
being used as a flag for array construction saying "there is no initializer". If you feel this is an abuse of the
undef
value, take some comfort in that it is the default value for the
default
, so you almost never have to write it explicitly in your code.
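The `default` convention can be illustrated with a small stand-alone helper. This is a sketch in Base Julia under the assumption that a plain dictionary stands in for the data set; `lookup_scalar` and `Scalars` are hypothetical names, not part of the package.

```julia
# Toy stand-in for a data set's scalar storage. NOT the Daf implementation.
const Scalars = Dict{String, Any}

# Hypothetical helper mirroring the `default` convention described above.
function lookup_scalar(scalars::Scalars, name::AbstractString; default = undef)
    if haskey(scalars, name)
        return scalars[name]
    elseif default === undef
        # `undef` (the default for `default`) means a missing value is an error.
        error("missing scalar: $(name)")
    else
        # `nothing` or an actual value is returned as-is when the value is missing.
        return default
    end
end

scalars = Scalars("version" => 1)

stored_value = lookup_scalar(scalars, "version")
missing_as_nothing = lookup_scalar(scalars, "organism"; default = nothing)
missing_with_fallback = lookup_scalar(scalars, "organism"; default = "human")
# Calling lookup_scalar(scalars, "organism") would throw, since `default` is `undef`.
```

Because `undef` is the default, the common "this value must exist" case needs no keyword argument at all.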
DataAxesFormats.Readers.description
—
Function
description(daf::DafReader[; deep::Bool = false, cache::Bool = false])::AbstractString
Return a (multi-line) description of the contents of
daf
. This tries to hit a sweet spot between usefulness and terseness. If
cache
, also describes the content of the cache. If
deep
, also describes the data sets nested inside this one (if any).
Scalar properties
DataAxesFormats.Readers.has_scalar
—
Function
has_scalar(daf::DafReader, name::AbstractString)::Bool
Check whether a scalar property with some
name
exists in
daf
.
DataAxesFormats.Readers.scalars_set
—
Function
scalars_set(daf::DafReader)::AbstractSet{<:AbstractString}
The names of the scalar properties in
daf
.
Julia has no immutable set type for us to return, so treat the result as read-only. Modifying the returned set is unsupported and may break the data set.
DataAxesFormats.Readers.get_scalar
—
Function
get_scalar(
daf::DafReader,
name::AbstractString;
[default::Union{StorageScalar, Nothing, UndefInitializer} = undef]
)::Maybe{StorageScalar}
Get the value of a scalar property with some
name
in
daf
.
If
default
is
undef
(the default), this first verifies the
name
scalar property exists in
daf
. Otherwise
default
will be returned if the property does not exist.
Readers axes
DataAxesFormats.Readers.has_axis
—
Function
has_axis(daf::DafReader, axis::AbstractString)::Bool
Check whether some
axis
exists in
daf
.
DataAxesFormats.Readers.axes_set
—
Function
axes_set(daf::DafReader)::AbstractSet{<:AbstractString}
The names of the axes of
daf
.
Julia has no immutable set type for us to return, so treat the result as read-only. Modifying the returned set is unsupported and may break the data set.
DataAxesFormats.Readers.axis_array
—
Function
axis_array(
daf::DafReader,
axis::AbstractString;
[default::Union{Nothing, UndefInitializer} = undef]
)::Maybe{AbstractVector{<:AbstractString}}
The array of unique names of the entries of some
axis
of
daf
. This is similar to doing
get_vector
for the special
name
property, except that it returns a simple vector (array) of strings instead of a
NamedVector
.
If
default
is
undef
(the default), this verifies the
axis
exists in
daf
. Otherwise, the
default
is
nothing
, which is returned if the
axis
does not exist.
DataAxesFormats.Readers.axis_dict
—
Function
axis_dict(daf::DafReader, axis::AbstractString)::AbstractDict{<:AbstractString, <:Integer}
Return a dictionary converting axis entry names to their integer index.
DataAxesFormats.Readers.axis_indices
—
Function
axis_indices(daf::DafReader, axis::AbstractString, entries::AbstractVector{<:AbstractString})::AbstractVector{<:Integer}
Return a vector of the indices of the
entries
in the
axis
.
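The relationship between the axis entries, the name-to-index dictionary and the indices of a selection can be shown with plain Base Julia. This is a toy model, assuming the usual 1-based positions; the variable names are hypothetical and no package function is called.

```julia
# Toy axis: a plain vector of unique entry names, like the result described for `axis_array`.
entries = ["cell_A", "cell_B", "cell_C"]

# Entry name => position along the axis, analogous to the result described for `axis_dict`.
index_of_entry = Dict(name => index for (index, name) in enumerate(entries))

# Positions of a selection of entries, analogous to the result described for `axis_indices`.
selected = ["cell_C", "cell_A"]
selected_indices = [index_of_entry[name] for name in selected]  # [3, 1]
```

The indices follow the order of the requested entries, not the order of the axis.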
DataAxesFormats.Readers.axis_length
—
Function
axis_length(daf::DafReader, axis::AbstractString)::Int64
The number of entries along the
axis
in
daf
.
This first verifies the
axis
exists in
daf
.
Vector properties
DataAxesFormats.Readers.has_vector
—
Function
has_vector(daf::DafReader, axis::AbstractString, name::AbstractString)::Bool
Check whether a vector property with some
name
exists for the
axis
in
daf
. This is always true for the special
name
property.
This first verifies the
axis
exists in
daf
.
DataAxesFormats.Readers.vectors_set
—
Function
vectors_set(daf::DafReader, axis::AbstractString)::AbstractSet{<:AbstractString}
The names of the vector properties for the
axis
in
daf
,
not
including the special
name
property.
This first verifies the
axis
exists in
daf
.
Julia has no immutable set type for us to return, so treat the result as read-only. Modifying the returned set is unsupported and may break the data set.
DataAxesFormats.Readers.get_vector
—
Function
get_vector(
daf::DafReader,
axis::AbstractString,
name::AbstractString;
[default::Union{StorageScalar, StorageVector, Nothing, UndefInitializer} = undef]
)::Maybe{NamedVector}
Get the vector property with some
name
for some
axis
in
daf
. The names of the result are the names of the vector entries (same as returned by
axis_array
). The special property
name
returns an array whose values are also the (read-only) names of the entries of the axis.
This first verifies the
axis
exists in
daf
. If
default
is
undef
(the default), this first verifies the
name
vector exists in
daf
. Otherwise, if
default
is
nothing
, it will be returned. If it is a
StorageVector
, it has to be of the same size as the
axis
, and is returned. If it is a
StorageScalar
, a new
Vector
is created of the correct size containing the
default
, and is returned.
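The rules for a missing vector property can be summarized as a small decision function. This is a sketch only: `resolve_vector_default` is a hypothetical helper, and it returns plain vectors rather than the `NamedVector` the real function is documented to return.

```julia
# Hypothetical helper mirroring the documented handling of `default`
# when the requested vector property does not exist.
function resolve_vector_default(default, axis_length::Integer)
    if default === undef
        # `undef` means the vector must exist.
        error("missing vector")
    elseif default === nothing
        return nothing
    elseif default isa AbstractVector
        # A vector default must match the number of entries of the axis.
        length(default) == axis_length ||
            error("default length $(length(default)) differs from axis length $(axis_length)")
        return default
    else
        # A scalar default is expanded into a new vector of the correct size.
        return fill(default, axis_length)
    end
end

no_value = resolve_vector_default(nothing, 3)
given_vector = resolve_vector_default([1, 2, 3], 3)
filled_vector = resolve_vector_default(0.0, 3)  # [0.0, 0.0, 0.0]
```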
Matrix properties
DataAxesFormats.Readers.has_matrix
—
Function
has_matrix(
daf::DafReader,
rows_axis::AbstractString,
columns_axis::AbstractString,
name::AbstractString;
[relayout::Bool = true]
)::Bool
Check whether a matrix property with some
name
exists for the
rows_axis
and the
columns_axis
in
daf
. Since this is Julia, this means a column-major matrix. A daf may contain two copies of the same data, in which case it would report the matrix under both axis orders.
If
relayout
(the default), this will also check whether the data exists in the other layout (that is, with flipped axes).
This first verifies the
rows_axis
and
columns_axis
exist in
daf
.
DataAxesFormats.Readers.matrices_set
—
Function
matrices_set(
daf::DafReader,
rows_axis::AbstractString,
columns_axis::AbstractString;
[relayout::Bool = true]
)::AbstractSet{<:AbstractString}
The names of the matrix properties for the
rows_axis
and
columns_axis
in
daf
.
If
relayout
(the default), then this will include the names of matrices that exist in the other layout (that is, with flipped axes).
This first verifies the
rows_axis
and
columns_axis
exist in
daf
.
Julia has no immutable set type for us to return, so treat the result as read-only. Modifying the returned set is unsupported and may break the data set.
DataAxesFormats.Readers.get_matrix
—
Function
get_matrix(
daf::DafReader,
rows_axis::AbstractString,
columns_axis::AbstractString,
name::AbstractString;
[default::Union{StorageReal, StorageMatrix, Nothing, UndefInitializer} = undef,
relayout::Bool = true]
)::Maybe{NamedMatrix}
Get the column-major matrix property with some
name
for some
rows_axis
and
columns_axis
in
daf
. The names of the result axes are the names of the relevant axes entries (same as returned by
axis_array
).
If
relayout
(the default) and the matrix is only stored in the other memory layout (that is, with flipped axes), then automatically call
relayout!
to compute the result. If
daf
isa
DafWriter
, then store the result for future use; otherwise, just cache it as
MemoryData
. This may lock up very large amounts of memory; you can call
empty_cache!
to release it.
This first verifies the
rows_axis
and
columns_axis
exist in
daf
. If
default
is
undef
(the default), this first verifies the
name
matrix exists in
daf
. Otherwise, if
default
is
nothing
, it is returned. If
default
is a
StorageMatrix
, it has to be of the same size as the
rows_axis
and
columns_axis
, and is returned. Otherwise, a new
Matrix
is created of the correct size containing the
default
, and is returned.
Utilities
DataAxesFormats.Readers.axis_version_counter
—
Function
axis_version_counter(daf::DafReader, axis::AbstractString)::UInt32
Return the version number of the axis. This is incremented every time
delete_axis!
is called. It is used by interfaces to other programming languages to minimize copying data.
This is purely in-memory per-instance, and
not
a global persistent version counter. That is, the version counter starts at zero even if opening a persistent disk
daf
data set.
DataAxesFormats.Readers.vector_version_counter
—
Function
vector_version_counter(daf::DafReader, axis::AbstractString, name::AbstractString)::UInt32
Return the version number of the vector. This is incremented every time
set_vector!
,
empty_dense_vector!
or
empty_sparse_vector!
is called. It is used by interfaces to other programming languages to minimize copying data.
This is purely in-memory per-instance, and
not
a global persistent version counter. That is, the version counter starts at zero even if opening a persistent disk
daf
data set.
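One way a client could use such a counter to avoid needless copies is sketched below. This is an assumption about usage, not the package's or any binding's actual code; `ToyVector`, `Mirror`, `set_values!` and `refresh!` are hypothetical names built on Base Julia only.

```julia
# Toy data holder with an in-memory version counter. NOT the Daf implementation.
mutable struct ToyVector
    values::Vector{Float64}
    version::UInt32  # starts at zero for every new in-memory instance
end

# Hypothetical setter: replacing the data increments the version.
function set_values!(toy::ToyVector, values::Vector{Float64})
    toy.values = values
    toy.version += UInt32(1)
    return nothing
end

# Hypothetical client-side mirror that re-copies only when the version has changed.
mutable struct Mirror
    data::Vector{Float64}
    seen_version::UInt32
    copies_made::Int
end

function refresh!(mirror::Mirror, toy::ToyVector)
    if mirror.seen_version != toy.version
        mirror.data = copy(toy.values)
        mirror.seen_version = toy.version
        mirror.copies_made += 1
    end
    return mirror.data
end

toy = ToyVector([1.0, 2.0], UInt32(0))
mirror = Mirror(copy(toy.values), toy.version, 1)

refresh!(mirror, toy)            # version unchanged, so no new copy is made
set_values!(toy, [3.0, 4.0])     # the version is incremented
refresh!(mirror, toy)            # version changed, so the data is copied again
```

Since the counter is per-instance and in-memory, a mirror like this is only valid for the lifetime of the object it was created from.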
DataAxesFormats.Readers.matrix_version_counter
—
Function
matrix_version_counter(
daf::DafReader,
rows_axis::AbstractString,
columns_axis::AbstractString,
name::AbstractString
)::UInt32
Return the version number of the matrix. The order of the axes does not matter. This is incremented every time
set_matrix!
,
empty_dense_matrix!
or
empty_sparse_matrix!
is called. It is used by interfaces to other programming languages to minimize copying data.
This is purely in-memory per-instance, and
not
a global persistent version counter. That is, the version counter starts at zero even if opening a persistent disk
daf
data set.
Index
- DataAxesFormats.Readers
- DataAxesFormats.Readers.axes_set
- DataAxesFormats.Readers.axis_array
- DataAxesFormats.Readers.axis_dict
- DataAxesFormats.Readers.axis_indices
- DataAxesFormats.Readers.axis_length
- DataAxesFormats.Readers.axis_version_counter
- DataAxesFormats.Readers.description
- DataAxesFormats.Readers.get_matrix
- DataAxesFormats.Readers.get_scalar
- DataAxesFormats.Readers.get_vector
- DataAxesFormats.Readers.has_axis
- DataAxesFormats.Readers.has_matrix
- DataAxesFormats.Readers.has_scalar
- DataAxesFormats.Readers.has_vector
- DataAxesFormats.Readers.matrices_set
- DataAxesFormats.Readers.matrix_version_counter
- DataAxesFormats.Readers.scalars_set
- DataAxesFormats.Readers.vector_version_counter
- DataAxesFormats.Readers.vectors_set