Media¶
Media resolution and anchor-aware slicing. resolve_media resolves a
media record to a media handle, fetching bytes lazily through injected
ports; decoding is a separate step. resolve_anchor dispatches an
anchor to the slice of the target it points at. The audio, video, and
neural decode paths require the matching lairs[...] extra at runtime,
but the millisecond-to-sample math, slicing, and box interpolation are
pure Python.
Resolution¶
Dispatches on blob versus external URI, fetching lazily through injected fetcher and cache ports.
lairs.media.resolve ¶
Media resolution: a media record resolves to a media handle; decoding is a separate step.
resolve_media dispatches on blob versus external URI, fetches lazily, and
caches by content identifier. The returned MediaHandle is a didactic model
that carries the raw bytes in an opaque field with typed metadata alongside.
The transport (blob fetch) and the on-disk cache are owned by other components,
so they are injected through the small BlobFetcher and BlobCache
protocols rather than implemented here; an HTTP fetcher for externalUri is
likewise injected. When no fetcher is supplied the handle is returned with
typed metadata only and data is left empty, so the caller decides when to fetch.
BlobFetcher ¶
Bases: Protocol
A port that fetches blob bytes for a repository by content identifier.
Component B (the ATProto client) supplies a concrete implementation; the media layer only depends on this shape.
get_blob ¶
Return the bytes of a blob.
| PARAMETER | DESCRIPTION |
|---|---|
| did | The DID of the repository holding the blob. |
| cid | The content identifier of the blob. |
| RETURNS | DESCRIPTION |
|---|---|
| bytes | The blob bytes. |
UriFetcher ¶
Bases: Protocol
A port that fetches bytes for an externally hosted media URI.
get_uri ¶
Return the bytes of an external resource.
| PARAMETER | DESCRIPTION |
|---|---|
| uri | The URI of the external resource. |
| RETURNS | DESCRIPTION |
|---|---|
| bytes | The fetched bytes. |
BlobCache ¶
Bases: Protocol
A content-addressed cache port for resolved bytes.
Component C (the store) supplies a concrete implementation; the media layer only depends on this shape.
Blob bytes are cached under their content identifier (CID), which is content-addressed. External-URI bytes are cached under the URI itself, which is not content-addressed: if the resource at a URI changes, a cached entry can serve stale bytes until it is evicted. The media layer does not verify fetched bytes against a CID; integrity checking, if required, is the responsibility of the injected cache or fetcher.
exists ¶
Return whether a content identifier is cached.
| PARAMETER | DESCRIPTION |
|---|---|
| cid | The content identifier to check. |
| RETURNS | DESCRIPTION |
|---|---|
| bool | |
get ¶
Return cached bytes for a content identifier.
| PARAMETER | DESCRIPTION |
|---|---|
| cid | The content identifier to read. |
| RETURNS | DESCRIPTION |
|---|---|
| bytes | The cached bytes. |
put ¶
Store bytes under a content identifier.
| PARAMETER | DESCRIPTION |
|---|---|
| cid | The content identifier to write under. |
| data | The bytes to store. |
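Because the three ports are protocols, any object with matching methods satisfies them; no subclassing is needed. The following is a minimal sketch of that idea, not the library's own code: the `BlobCache` class here is a local stand-in that mirrors the documented method names, the `str` key types are an assumption, and `InMemoryBlobCache` and `StaticBlobFetcher` are hypothetical helpers invented for illustration.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class BlobCache(Protocol):
    """Local stand-in mirroring the documented cache port (str keys assumed)."""

    def exists(self, cid: str) -> bool: ...
    def get(self, cid: str) -> bytes: ...
    def put(self, cid: str, data: bytes) -> None: ...


class InMemoryBlobCache:
    """Hypothetical dict-backed cache; it satisfies the port structurally."""

    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}

    def exists(self, cid: str) -> bool:
        return cid in self._store

    def get(self, cid: str) -> bytes:
        return self._store[cid]

    def put(self, cid: str, data: bytes) -> None:
        self._store[cid] = data


class StaticBlobFetcher:
    """Hypothetical fetcher serving blobs from a fixed (did, cid) table."""

    def __init__(self, blobs: dict[tuple[str, str], bytes]) -> None:
        self._blobs = blobs

    def get_blob(self, did: str, cid: str) -> bytes:
        return self._blobs[(did, cid)]


cache = InMemoryBlobCache()
fetcher = StaticBlobFetcher({("did:example:alice", "cid-1"): b"RIFF"})
cache.put("cid-1", fetcher.get_blob("did:example:alice", "cid-1"))
```

A dict-backed cache like this is convenient in tests, where neither the ATProto client nor the on-disk store is available.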
MediaHandle ¶
Bases: Model
A resolved media handle holding raw bytes and typed metadata.
The raw media bytes live in an opaque field; the modality, MIME type, and
duration are typed metadata so callers never inspect the payload blindly.
When data is empty the handle is metadata-only; the bytes must be fetched
before the handle is decoded, because the decode functions raise ValueError
on a handle that carries no bytes.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| cid | The content identifier of the resolved media. |
| mime_type | The MIME type of the media. |
| modality | The modality ( |
| duration_ms | The media duration in milliseconds, when known. |
| external_uri | The external URI, when the media is externally hosted. |
| data | The raw media bytes, carried as an opaque payload. |
resolve_media ¶
resolve_media(
    media: Model,
    *,
    did: str | None = None,
    blob_fetcher: BlobFetcher | None = None,
    uri_fetcher: UriFetcher | None = None,
    cache: BlobCache | None = None,
) -> MediaHandle
Resolve a media record to a media handle, fetching bytes lazily.
Dispatches on whether the record carries a blob or an externalUri.
A cached payload is returned directly; otherwise, when a matching fetcher is
supplied, the bytes are fetched and cached. With no fetcher the handle is
metadata-only (empty data) so callers can decide when to fetch.
| PARAMETER | DESCRIPTION |
|---|---|
| media | A |
| did | The DID of the repository holding the blob, required to fetch a blob. |
| blob_fetcher | The injected blob transport (Component B). |
| uri_fetcher | The injected external-URI transport. |
| cache | The injected content-addressed cache (Component C). |
| RETURNS | DESCRIPTION |
|---|---|
| MediaHandle | The resolved handle, with bytes populated when a fetch succeeded. |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If the record carries neither a blob nor an external URI. |
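The lookup order described above (cache first, then fetch and cache, otherwise metadata-only) can be sketched as follows. This is an illustrative re-derivation under stated assumptions, not the library's implementation: `resolve_bytes` is a hypothetical name, the record is modelled as a plain dict whose `blob` entry holds a `cid` key, a plain dict stands in for the cache port, and the sketch returns only the payload rather than a full MediaHandle.

```python
from typing import Callable, Optional


def resolve_bytes(
    record: dict,
    *,
    did: Optional[str] = None,
    get_blob: Optional[Callable[[str, str], bytes]] = None,
    get_uri: Optional[Callable[[str], bytes]] = None,
    cache: Optional[dict] = None,
) -> bytes:
    """Return the payload for a record, or b"" when nothing can be fetched."""
    if "blob" in record:
        key = record["blob"]["cid"]  # blobs are cached under their CID
        can_fetch = get_blob is not None and did is not None
        fetch = (lambda: get_blob(did, key)) if can_fetch else None
    elif "externalUri" in record:
        key = record["externalUri"]  # external bytes are cached under the URI
        fetch = (lambda: get_uri(key)) if get_uri is not None else None
    else:
        raise ValueError("record carries neither a blob nor an external URI")

    if cache is not None and key in cache:
        return cache[key]  # a cached payload is returned directly
    if fetch is None:
        return b""  # metadata-only: the caller decides when to fetch
    data = fetch()
    if cache is not None:
        cache[key] = data
    return data


calls = []


def fake_get_blob(did: str, cid: str) -> bytes:
    """Hypothetical transport that records each call."""
    calls.append((did, cid))
    return b"abc"


store: dict = {}
record = {"blob": {"cid": "cid-1"}}
metadata_only = resolve_bytes(record)
fetched = resolve_bytes(
    record, did="did:example:alice", get_blob=fake_get_blob, cache=store
)
cached = resolve_bytes(record, cache=store)  # served from the cache, no fetch
```

The third call passes no fetcher at all and still returns the bytes, which is the behaviour that makes lazy resolution cheap on repeat access.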
Anchors¶
Unified anchor resolution over byte spans, token refs, temporal spans, bounding boxes, and spatio-temporal anchors.
lairs.media.anchors ¶
Unified anchor resolution over all anchor kinds.
resolve_anchor dispatches over byte spans, token refs, temporal spans,
page anchors, external targets, bounding boxes, and spatio-temporal anchors,
returning the corresponding slice or view of the right target (text, tokens,
audio, video frame, or signal). It is the single API the dataset layer calls
for the data an annotation points at.
The Layers anchor is an object whose optional variant fields select the
anchor kind. The generated Anchor model carries seven variants
(externalTarget, pageAnchor, spatioTemporalAnchor, temporalSpan,
textSpan, tokenRef, tokenRefSequence); every one is dispatched here.
Because callers are not required to use the generated record models, dispatch is structural:
the wrapper's set variant is found and the variant model's own fields are
probed, tolerating both the camelCase lexicon names and the snake_case
generated names.
A bounding box (BoundingBox) is never a top-level Anchor variant: it
only appears nested inside pageAnchor and inside the keyframes of
spatioTemporalAnchor. resolve_anchor therefore reaches a bounding box
through those variants, but also accepts a bare bounding-box model directly so
callers holding one can crop with it.
AnchorTarget ¶
AnchorTarget = (
    str
    | tuple[str, ...]
    | AudioBuffer
    | SignalBuffer
    | VideoFrame
    | BoundingBox
)
The kinds of slice an anchor can resolve to across the supported targets.
resolve_anchor ¶
resolve_anchor(
    anchor: Model, target: AnchorTarget
) -> AnchorTarget
Resolve an anchor to the slice of the target it points at.
| PARAMETER | DESCRIPTION |
|---|---|
| anchor | An |
| target | The data the anchor selects into: expression text ( |
| RETURNS | DESCRIPTION |
|---|---|
| AnchorTarget | The resolved slice or view, dispatched on the anchor kind. An |
| RAISES | DESCRIPTION |
|---|---|
| TypeError | If the anchor kind does not match the supplied target type. |
| ValueError | If the anchor kind cannot be determined. |
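Structural dispatch of the kind described for this module can be sketched as a probe over the seven variant fields. This is a hedged illustration only: `find_variant` is a hypothetical helper, the snake_case spellings are derived mechanically from the camelCase lexicon names and are an assumption, and the sketch stops at identifying the variant rather than slicing a target.

```python
from types import SimpleNamespace
from typing import Any

# (camelCase lexicon name, assumed snake_case generated name) per variant.
VARIANTS = (
    ("externalTarget", "external_target"),
    ("pageAnchor", "page_anchor"),
    ("spatioTemporalAnchor", "spatio_temporal_anchor"),
    ("temporalSpan", "temporal_span"),
    ("textSpan", "text_span"),
    ("tokenRef", "token_ref"),
    ("tokenRefSequence", "token_ref_sequence"),
)


def find_variant(anchor: Any) -> tuple[str, Any]:
    """Return (lexicon name, variant value) for the first variant that is set."""
    for camel, snake in VARIANTS:
        for name in (camel, snake):
            value = getattr(anchor, name, None)
            if value is not None:
                return camel, value
    raise ValueError("anchor kind cannot be determined")


# Either spelling is tolerated and normalised to the lexicon name.
generated_style = SimpleNamespace(temporal_span=SimpleNamespace(start=0, end=250))
lexicon_style = SimpleNamespace(textSpan=SimpleNamespace(byteStart=0, byteEnd=5))
kind_a, _ = find_variant(generated_style)
kind_b, _ = find_variant(lexicon_style)
```

Probing attributes instead of checking `isinstance` keeps the dispatcher independent of whichever model classes the caller happens to hold.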
Audio¶
Audio decoding and temporal-span slicing. The decode path requires the
lairs[audio] extra (soundfile).
lairs.media.audio ¶
Audio decoding and temporal-span slicing.
Decodes audio into a sample buffer and slices it by temporal-span anchors,
converting milliseconds to sample indices in a rate-aware way. The buffer is a
didactic model carrying the samples in an opaque field. The decode path
requires the lairs[audio] extra (soundfile) at runtime, but the
millisecond-to-sample math and slicing are pure Python and need no extra.
AudioBuffer ¶
Bases: Model
A decoded audio buffer.
Samples are stored interleaved by channel as a flat tuple of floats: for a
two-channel buffer the layout is (l0, r0, l1, r1, ...). The payload
lives in an opaque field so callers go through the typed helpers rather than
inspecting it blindly.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| sample_rate | The sample rate in hertz. |
| channels | The channel count. |
| samples | The interleaved samples, carried as an opaque payload. |
ms_to_sample ¶
Convert a millisecond offset to a per-channel sample index.
| PARAMETER | DESCRIPTION |
|---|---|
| ms | The offset in milliseconds. |
| sample_rate | The sample rate in hertz. |
| RETURNS | DESCRIPTION |
|---|---|
| int | The per-channel sample index, floored to a whole sample. |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If |
sample_to_ms ¶
Convert a per-channel sample index to a millisecond offset.
| PARAMETER | DESCRIPTION |
|---|---|
| sample | The per-channel sample index. |
| sample_rate | The sample rate in hertz. |
| RETURNS | DESCRIPTION |
|---|---|
| int | The offset in milliseconds, floored to a whole millisecond. |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If |
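The two conversions reduce to floored integer arithmetic. The sketch below re-derives them under assumptions and is not the library source: the functions reuse the documented names for readability, and the ValueError condition (a non-positive sample rate) is a guess, since this page does not show which inputs are rejected.

```python
def ms_to_sample(ms: int, sample_rate: int) -> int:
    """Floor a millisecond offset to a per-channel sample index."""
    if sample_rate <= 0:  # assumed error condition
        raise ValueError("sample_rate must be positive")
    return (ms * sample_rate) // 1000


def sample_to_ms(sample: int, sample_rate: int) -> int:
    """Floor a per-channel sample index to a millisecond offset."""
    if sample_rate <= 0:  # assumed error condition
        raise ValueError("sample_rate must be positive")
    return (sample * 1000) // sample_rate


index = ms_to_sample(1500, 44100)    # 1.5 s at 44.1 kHz
back = sample_to_ms(index, 44100)    # exact here, because 1500 ms is a whole sample
lossy = sample_to_ms(ms_to_sample(1, 44100), 44100)  # flooring twice loses the 1 ms
```

Because both directions floor, a round trip is exact only when the offset lands on a whole sample; sub-sample offsets round down.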
decode_audio ¶
decode_audio(handle: MediaHandle) -> AudioBuffer
Decode a media handle into an audio buffer.
Decoding uses soundfile (the lairs[audio] extra), imported lazily
so importing this module never pulls in the heavy dependency.
| PARAMETER | DESCRIPTION |
|---|---|
| handle | The resolved media handle to decode. |
| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | The decoded audio buffer with interleaved samples. |
| RAISES | DESCRIPTION |
|---|---|
| ModuleNotFoundError | If the |
| ValueError | If the handle carries no bytes to decode. |
slice_by_temporal ¶
slice_by_temporal(
    buffer: AudioBuffer, start_ms: int, end_ms: int
) -> AudioBuffer
Slice an audio buffer by a temporal span in milliseconds.
The span is converted to per-channel sample indices in a rate-aware way and the interleaved payload is sliced accordingly. This is pure Python and does not require the audio extra.
| PARAMETER | DESCRIPTION |
|---|---|
| buffer | The buffer to slice. |
| start_ms | The start of the span in milliseconds. |
| end_ms | The end of the span in milliseconds. |
| RETURNS | DESCRIPTION |
|---|---|
| AudioBuffer | A new buffer holding only the samples in the span. |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If the span is reversed ( |
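For an interleaved payload, a per-channel sample index has to be multiplied by the channel count before slicing the flat tuple. The following is a small sketch of that step on plain tuples, assuming a half-open span (the sample at end_ms is excluded); `slice_interleaved` is a hypothetical helper, not the library function.

```python
def slice_interleaved(
    samples: tuple,
    sample_rate: int,
    channels: int,
    start_ms: int,
    end_ms: int,
) -> tuple:
    """Slice a flat, channel-interleaved sample tuple by a millisecond span."""
    if end_ms < start_ms:
        raise ValueError("span is reversed")
    start = (start_ms * sample_rate) // 1000  # per-channel index
    end = (end_ms * sample_rate) // 1000      # per-channel index, exclusive
    # One frame holds `channels` values, so scale to flat positions.
    return samples[start * channels : end * channels]


# Stereo at 1000 Hz: layout is (l0, r0, l1, r1, l2, r2, l3, r3).
stereo = (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5)
span = slice_interleaved(stereo, 1000, 2, start_ms=1, end_ms=3)
```

Scaling both bounds by the channel count keeps left and right samples paired, so the result is still a valid interleaved payload.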
forced_alignment_segments ¶
forced_alignment_segments(
    buffer: AudioBuffer,
    spans: Iterable[tuple[int, int, str]],
) -> Iterator[tuple[str, AudioBuffer]]
Yield labelled waveform segments for a forced-alignment layer.
Each input span is a (start_ms, end_ms, label) triple, mirroring an
aligned annotation; the corresponding waveform slice is produced lazily.
| PARAMETER | DESCRIPTION |
|---|---|
| buffer | The buffer to segment. |
| spans | The |
| YIELDS | DESCRIPTION |
|---|---|
| tuple of (str, AudioBuffer) | Each label paired with its waveform segment. |
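Lazy production here means a generator: no slice is computed until the caller iterates. A minimal sketch on a mono tuple, assuming half-open spans; `labelled_segments` is a hypothetical stand-in for the documented function and works on raw tuples rather than AudioBuffer models.

```python
from typing import Iterable, Iterator


def labelled_segments(
    samples: tuple,
    sample_rate: int,
    spans: Iterable[tuple[int, int, str]],
) -> Iterator[tuple[str, tuple]]:
    """Yield (label, slice) pairs for (start_ms, end_ms, label) spans."""
    for start_ms, end_ms, label in spans:
        start = (start_ms * sample_rate) // 1000
        end = (end_ms * sample_rate) // 1000
        yield label, samples[start:end]


mono = tuple(float(i) for i in range(10))  # 10 samples at 1000 Hz
segments = labelled_segments(mono, 1000, [(0, 3, "h"), (3, 7, "e")])
first = next(segments)   # only the first span has been sliced so far
rest = list(segments)    # the remaining spans are sliced on demand
```

For a long recording with many aligned words, this avoids materialising every segment up front.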
Video¶
Video frame access, bounding-box cropping, and keyframe interpolation.
The decode path requires the lairs[video] extra (av).
lairs.media.video ¶
Video decoding, frame access, and bounding-box cropping.
Decodes video frames by time or index, crops frames to bounding boxes, and
resolves spatio-temporal anchors to dense per-frame boxes through keyframe
interpolation. Frames are didactic models carrying pixels in an opaque field.
The decode path requires the lairs[video] extra (av) at runtime, but
the keyframe-interpolation and box math are pure Python and need no extra.
Interpolation ¶
The supported keyframe interpolation modes.
BoundingBox ¶
Bases: Model
An axis-aligned bounding box in pixel coordinates.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| x | The left coordinate in pixels. |
| y | The top coordinate in pixels. |
| width | The box width in pixels. |
| height | The box height in pixels. |
Keyframe ¶
Bases: Model
A timed bounding box used as a spatio-temporal keyframe.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| time_ms | The keyframe time in milliseconds. |
| box | The bounding box at this time. |
VideoFrame ¶
Bases: Model
A single decoded video frame.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| index | The frame index. |
| width | The frame width in pixels. |
| height | The frame height in pixels. |
| time_ms | The frame presentation time in milliseconds, set during decode. This is the temporal position spatio-temporal anchors interpolate against; it is not the same as |
| pixels | The frame pixels, carried as an opaque payload. |
interpolate_box ¶
interpolate_box(
    keyframes: Sequence[Keyframe],
    time_ms: int,
    interpolation: Interpolation = "linear",
) -> BoundingBox
Resolve the bounding box at a time by interpolating keyframes.
Keyframes are assumed to be ordered by time_ms. Times before the first
or after the last keyframe clamp to the nearest keyframe box.
| PARAMETER | DESCRIPTION |
|---|---|
| keyframes | The ordered keyframes to interpolate between. |
| time_ms | The query time in milliseconds. |
| interpolation | The interpolation mode between adjacent keyframes. |
| RETURNS | DESCRIPTION |
|---|---|
| BoundingBox | The interpolated bounding box at |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If |
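Linear interpolation with clamping can be sketched as below. This covers the "linear" mode only and rests on assumptions: `Box`, `Key`, and `lerp_box` are hypothetical stand-ins for the documented models and function, coordinates are treated as floats, and raising on an empty keyframe list is a guess at the error condition, which this page does not spell out.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Box:
    x: float
    y: float
    width: float
    height: float


@dataclass(frozen=True)
class Key:
    time_ms: int
    box: Box


def lerp_box(keyframes: Sequence[Key], time_ms: int) -> Box:
    """Linearly interpolate a box at time_ms; clamp outside the keyframes."""
    if not keyframes:  # assumed error condition
        raise ValueError("at least one keyframe is required")
    if time_ms <= keyframes[0].time_ms:
        return keyframes[0].box
    if time_ms >= keyframes[-1].time_ms:
        return keyframes[-1].box
    for a, b in zip(keyframes, keyframes[1:]):
        if a.time_ms <= time_ms <= b.time_ms:
            # Fraction of the way from keyframe a to keyframe b.
            t = (time_ms - a.time_ms) / (b.time_ms - a.time_ms)
            return Box(
                a.box.x + (b.box.x - a.box.x) * t,
                a.box.y + (b.box.y - a.box.y) * t,
                a.box.width + (b.box.width - a.box.width) * t,
                a.box.height + (b.box.height - a.box.height) * t,
            )
    raise ValueError("keyframes are not ordered by time_ms")


keys = [Key(0, Box(0, 0, 10, 10)), Key(100, Box(100, 50, 20, 30))]
midway = lerp_box(keys, 50)
before = lerp_box(keys, -10)   # clamps to the first keyframe
after = lerp_box(keys, 200)    # clamps to the last keyframe
```

Evaluating this once per decoded frame, using each frame's time_ms, yields the dense per-frame boxes a spatio-temporal anchor resolves to.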
frame_at_ms ¶
frame_at_ms(
    handle: MediaHandle, time_ms: int
) -> VideoFrame
Decode the video frame at a given time.
Decoding uses av (the lairs[video] extra), imported lazily so
importing this module never pulls in the heavy dependency.
| PARAMETER | DESCRIPTION |
|---|---|
| handle | The resolved media handle to decode. |
| time_ms | The frame time in milliseconds. |
| RETURNS | DESCRIPTION |
|---|---|
| VideoFrame | The decoded frame nearest the requested time. |
| RAISES | DESCRIPTION |
|---|---|
| ModuleNotFoundError | If the |
| ValueError | If the handle carries no bytes to decode, |
crop_to_bbox ¶
crop_to_bbox(
    frame: VideoFrame, box: BoundingBox
) -> VideoFrame
Crop a frame to a bounding box.
The crop math is pure Python (it only adjusts the frame dimensions and slices the row-major RGB payload), so it does not require the video extra.
| PARAMETER | DESCRIPTION |
|---|---|
| frame | The frame to crop. |
| box | The crop region in pixel coordinates. |
| RETURNS | DESCRIPTION |
|---|---|
| VideoFrame | The cropped frame. |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If the box falls outside the frame bounds, or the pixel payload does not match a row-major 3-bytes-per-pixel layout of the frame dimensions. |
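Cropping a row-major RGB payload amounts to taking one contiguous byte run per row. The sketch below shows that arithmetic on raw bytes under stated assumptions: `crop_rgb` is a hypothetical helper that takes plain integers instead of the VideoFrame and BoundingBox models, and treating a zero-sized box as out of bounds is an assumption.

```python
def crop_rgb(
    pixels: bytes, width: int, height: int, x: int, y: int, w: int, h: int
) -> bytes:
    """Crop a row-major, 3-bytes-per-pixel payload to the box (x, y, w, h)."""
    if len(pixels) != width * height * 3:
        raise ValueError("payload does not match a row-major RGB layout")
    if x < 0 or y < 0 or w <= 0 or h <= 0 or x + w > width or y + h > height:
        raise ValueError("box falls outside the frame bounds")
    rows = []
    for row in range(y, y + h):
        start = (row * width + x) * 3          # byte offset of the row's first pixel
        rows.append(pixels[start : start + w * 3])
    return b"".join(rows)


# A 4x2 frame whose bytes are 0..23, so offsets are easy to read off.
frame_bytes = bytes(range(24))
cropped = crop_rgb(frame_bytes, width=4, height=2, x=1, y=0, w=2, h=2)
```

The cropped payload is itself row-major RGB with dimensions w by h, so it can be cropped again or passed on unchanged.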
Neural¶
Multi-channel signal windowing for neural and sensor data. The decode
path requires the lairs[neural] extra (mne).
lairs.media.neural ¶
Neural and time-series signal windowing.
Treats neural and sensor signals as sampling-rate-aware, multi-channel buffers
referenced by a media record, and windows them by temporal-span anchors. The
signal buffer is a didactic model carrying per-channel samples in an opaque
field. The decode path requires the lairs[neural] extra (mne) at
runtime, but the millisecond-to-window math and slicing are pure Python and
need no extra.
decode_signal dispatches on the handle's MIME type to the matching mne
reader and temp-file suffix (FIF, EDF, BDF, EEGLAB SET, BrainVision). A MIME
type mne cannot read raises a clear error rather than being mis-read as FIF.
SignalBuffer ¶
Bases: Model
A decoded multi-channel signal buffer.
Samples are stored per channel as a tuple of per-channel sample tuples,
aligned with channels by position. The payload lives in an opaque field
so callers go through the typed helpers rather than inspecting it blindly.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| sample_rate | The sample rate in hertz. |
| channels | The ordered channel labels. |
| samples | The per-channel samples, carried as an opaque payload. |
ms_to_sample ¶
Convert a millisecond offset to a sample index for a given rate.
| PARAMETER | DESCRIPTION |
|---|---|
| ms | The offset in milliseconds. |
| sample_rate | The sample rate in hertz. |
| RETURNS | DESCRIPTION |
|---|---|
| int | The sample index, floored to a whole sample. |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If |
decode_signal ¶
decode_signal(handle: MediaHandle) -> SignalBuffer
Decode a media handle into a multi-channel signal buffer.
Decoding uses mne (the lairs[neural] extra), imported lazily so
importing this module never pulls in the heavy dependency. The handle's MIME
type selects the matching mne reader (FIF, EDF, BDF, EEGLAB SET, or
BrainVision) and temp-file suffix; the raw bytes are written to a temporary
file because mne readers operate on paths.
The temporary file is created, written, closed, read, and then removed
explicitly (rather than read while still open) so the path can be reopened by
mne on platforms that do not allow concurrent reopen of an open temp file.
| PARAMETER | DESCRIPTION |
|---|---|
| handle | The resolved media handle to decode. |
| RETURNS | DESCRIPTION |
|---|---|
| SignalBuffer | The decoded signal buffer with per-channel samples. |
| RAISES | DESCRIPTION |
|---|---|
| ModuleNotFoundError | If the |
| ValueError | If the handle carries no bytes to decode, or its MIME type names a format no |
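The write, close, read, remove sequence for a path-based reader can be sketched independently of mne. Everything named here is illustrative: `read_via_temp_path` is a hypothetical helper, the MIME strings in `SUFFIX_BY_MIME` are placeholders (this page does not list the MIME types the library matches on), and the reader is injected as a callable so the sketch runs without the neural extra. In real use a path-based reader such as `mne.io.read_raw_edf` would presumably take the place of `read_path`.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")

# Placeholder MIME-to-suffix table; the real keys are an assumption.
SUFFIX_BY_MIME = {
    "application/x-fif": ".fif",
    "application/x-edf": ".edf",
    "application/x-bdf": ".bdf",
}


def read_via_temp_path(
    data: bytes, mime_type: str, read_path: Callable[[str], T]
) -> T:
    """Write bytes to a closed temp file, hand its path to a reader, clean up."""
    if not data:
        raise ValueError("handle carries no bytes to decode")
    if mime_type not in SUFFIX_BY_MIME:
        raise ValueError(f"unsupported MIME type: {mime_type}")
    tmp = tempfile.NamedTemporaryFile(
        suffix=SUFFIX_BY_MIME[mime_type], delete=False
    )
    try:
        tmp.write(data)
        tmp.close()  # close first so the reader can reopen the path
        return read_path(tmp.name)
    finally:
        tmp.close()           # harmless if already closed
        os.unlink(tmp.name)   # explicit removal, since delete=False


seen = {}


def fake_reader(path: str) -> bytes:
    """Hypothetical reader that records the path and returns the file bytes."""
    seen["path"] = path
    with open(path, "rb") as fh:
        return fh.read()


payload = read_via_temp_path(b"\x00\x01", "application/x-edf", fake_reader)
```

Rejecting an unknown MIME type up front, rather than falling back to a default suffix, is what prevents an unsupported file from being mis-read as FIF.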
window_by_temporal ¶
window_by_temporal(
    buffer: SignalBuffer, start_ms: int, end_ms: int
) -> SignalBuffer
Window a signal buffer by a temporal span in milliseconds.
The span is converted to sample indices in a rate-aware way and every channel is sliced to the same window. This is pure Python and does not require the neural extra.
| PARAMETER | DESCRIPTION |
|---|---|
| buffer | The buffer to window. |
| start_ms | The window start in milliseconds. |
| end_ms | The window end in milliseconds. |
| RETURNS | DESCRIPTION |
|---|---|
| SignalBuffer | A new buffer holding only the samples in the window. |
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If the window is reversed ( |
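Unlike the interleaved audio layout, per-channel storage lets the same index pair be applied to every channel directly. A minimal sketch on nested tuples, assuming a half-open window; `window_channels` is a hypothetical helper rather than the library function.

```python
def window_channels(
    samples: tuple, sample_rate: int, start_ms: int, end_ms: int
) -> tuple:
    """Slice every channel of a per-channel sample tuple to the same window."""
    if end_ms < start_ms:
        raise ValueError("window is reversed")
    start = (start_ms * sample_rate) // 1000
    end = (end_ms * sample_rate) // 1000  # exclusive
    return tuple(channel[start:end] for channel in samples)


# Two channels of ten samples each at 500 Hz (one sample every 2 ms).
signal = (tuple(range(10)), tuple(range(10, 20)))
windowed = window_channels(signal, 500, start_ms=4, end_ms=12)
```

At 500 Hz the span from 4 ms to 12 ms maps to sample indices 2 through 5, and both channels keep the same length afterwards.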
select_channels ¶
select_channels(
    buffer: SignalBuffer, names: Sequence[str]
) -> SignalBuffer
Select a subset of channels by label, preserving the requested order.
| PARAMETER | DESCRIPTION |
|---|---|
| buffer | The buffer to subset. |
| names | The channel labels to keep, in the desired output order. |
| RETURNS | DESCRIPTION |
|---|---|
| SignalBuffer | A new buffer holding only the named channels. |
| RAISES | DESCRIPTION |
|---|---|
| KeyError | If a requested channel label is not present in the buffer. |
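Selecting by label while preserving the requested order is a lookup from label to position. The following sketch shows one way to do it on plain tuples; `pick_channels` is a hypothetical helper, and it returns a (labels, samples) pair instead of a SignalBuffer.

```python
from typing import Sequence


def pick_channels(
    channels: Sequence[str], samples: Sequence[tuple], names: Sequence[str]
) -> tuple:
    """Return (labels, samples) for the named channels, in the requested order."""
    position = {label: i for i, label in enumerate(channels)}
    picked = [position[name] for name in names]  # KeyError on an unknown label
    return tuple(names), tuple(samples[i] for i in picked)


labels = ("Cz", "Fz", "Pz")
data = ((1.0, 1.1), (2.0, 2.1), (3.0, 3.1))
subset_labels, subset = pick_channels(labels, data, ["Pz", "Cz"])
```

The output follows the order of `names`, not the order in the source buffer, so callers can reorder channels and subset them in one step.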
align_events_to_windows ¶
align_events_to_windows(
    buffer: SignalBuffer,
    events: Iterable[tuple[int, int, str]],
) -> Iterator[tuple[str, SignalBuffer]]
Yield labelled signal windows for a sequence of annotation events.
Each event is a (start_ms, end_ms, label) triple, mirroring an aligned
annotation (a stimulus onset, an epoch); the corresponding multi-channel
window is produced lazily.
| PARAMETER | DESCRIPTION |
|---|---|
| buffer | The buffer to window. |
| events | The |
| YIELDS | DESCRIPTION |
|---|---|
| tuple of (str, SignalBuffer) | Each label paired with its signal window. |