Records

The record surface: the BlobRef value type, the generated-safe view helpers, and the generated pub.layers.* record namespaces. The namespace modules under lairs.records._generated are emitted by lairs gen from the vendored lexicons and must not be hand-edited. See code generation.

Blob reference

lairs.records.blobref.BlobRef

Bases: Model

An immutable reference to an ATProto blob.

ATTRIBUTE DESCRIPTION
cid

The content identifier of the blob.

TYPE: str

mime_type

The MIME type of the blob, when known.

TYPE: (str or None, optional)

size

The size of the blob in bytes, when known.

TYPE: (int or None, optional)
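As a rough illustration of the shape described above, the sketch below uses a frozen dataclass as a stand-in. `BlobRefSketch` is a hypothetical name, not the library class, and the `None` defaults and the example CID string are assumptions made for the example.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)
class BlobRefSketch:
    """Hypothetical stand-in mirroring the three documented attributes."""

    cid: str
    mime_type: Optional[str] = None  # assumed default: unknown MIME type
    size: Optional[int] = None       # assumed default: unknown size in bytes


# The CID below is a placeholder string, not a real content identifier.
ref = BlobRefSketch(cid="example-cid", mime_type="audio/wav", size=1024)

# An immutable reference rejects attribute assignment after construction.
try:
    ref.size = 2048
    mutated = True
except FrozenInstanceError:
    mutated = False
```

The real BlobRef derives from Model, so its construction and validation behavior may differ from a plain dataclass.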

View helpers

These helpers add behavior on top of the generated models; they never replace them.

lairs.records.views.anchor_kind

anchor_kind(anchor: Model) -> str

Return the kind of an anchor model.

PARAMETER DESCRIPTION
anchor

An anchor model from the generated records. Its mutually-exclusive optional properties name the anchor variant.

TYPE: Model

RETURNS DESCRIPTION
str

The set anchor property name (for example "textSpan" or "temporalSpan"), or "none" when no anchor property is set.
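The dispatch described above can be sketched without the library. `AnchorSketch` and `anchor_kind_sketch` are hypothetical stand-ins. Only the property names "textSpan" and "temporalSpan" come from the description; the dict payloads are placeholders, and returning the first set property is an assumption about how the variants are resolved.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class AnchorSketch:
    """Hypothetical anchor with mutually exclusive optional properties."""

    textSpan: Optional[dict] = None
    temporalSpan: Optional[dict] = None


def anchor_kind_sketch(anchor: AnchorSketch) -> str:
    """Return the name of the first set property, or "none" if all are unset."""
    for field in fields(anchor):
        if getattr(anchor, field.name) is not None:
            return field.name
    return "none"


text_anchor = AnchorSketch(textSpan={"start": 0, "end": 5})  # placeholder payload
empty_anchor = AnchorSketch()
```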

lairs.records.views.explode_layer

explode_layer(
    layer: Model,
) -> Iterator[dict[str, JsonValue]]

Explode an annotation layer into one row per annotation.

PARAMETER DESCRIPTION
layer

An annotationLayer model instance. Its annotations field is the tuple of per-annotation models to flatten.

TYPE: Model

YIELDS DESCRIPTION
dict

One row per annotation, carrying the annotation index, the layer's kind and subkind context, the resolved anchor_kind of the annotation, and the annotation's own dumped fields. The row is JSON-valued so it feeds the Arrow materialisation without further conversion.
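The flattening can be illustrated over plain dictionaries. This is a minimal sketch, not the library implementation: `explode_layer_sketch`, the row keys ("index", "kind", "subkind", "anchor_kind"), and the sample layer values are all assumptions chosen for the example.

```python
from typing import Any, Iterator, Optional


def _anchor_kind(anchor: Optional[dict]) -> str:
    """Name of the first populated anchor property, or "none"."""
    for name, value in (anchor or {}).items():
        if value is not None:
            return name
    return "none"


def explode_layer_sketch(layer: dict) -> Iterator[dict]:
    """Yield one flat row per annotation, repeating the layer context."""
    for index, annotation in enumerate(layer.get("annotations", ())):
        yield {
            "index": index,
            "kind": layer.get("kind"),
            "subkind": layer.get("subkind"),
            "anchor_kind": _anchor_kind(annotation.get("anchor")),
            **annotation,  # the annotation's own fields
        }


# Hypothetical layer payload; field values are illustrative only.
layer = {
    "kind": "span",
    "subkind": "ner",
    "annotations": [
        {"label": "PER", "anchor": {"textSpan": {"start": 0, "end": 5}}},
        {"label": "LOC", "anchor": None},
    ],
}
rows = list(explode_layer_sketch(layer))
```

Because the function is a generator, rows are produced lazily; wrapping it in `list` materialises them all at once.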

Generated record namespaces

One module per pub.layers.* namespace. Each class mirrors a lexicon record, object, or union definition. Unions render as dx.TaggedUnion families with their discriminator. Permission-set (OAuth scope) lexicons and method-only namespaces (query, procedure, subscription) contribute no record types and emit no module.

The shared provenance models that the produce records embed live in defs: Licensing (an optional SPDX expression plus an array of LicenseRef, covering single, dual, multi, composite, and component-scoped licensing) and ReproducibilityInfo (code URI, commit, command, environment, seed). The bibliographic models live in eprint: Citation (a raw string and/or structured CSL-JSON / DataCite fields), Creator (CSL name parts plus DataCite nameType / affiliation and ORCID / ROR grounding), and Date (structured or free-form, CSL style).

alignment

lairs.records._generated.alignment

Generated models for the pub.layers.alignment lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

Alignment

Bases: Model

An alignment between two parallel sequences. The sequences can be tokenizations, annotation layers, expressions (for parallel text), or tiers. Links establish many-to-many correspondence between elements indexed by position.
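To make the many-to-many, position-indexed correspondence concrete, the sketch below models each link as a pair of index tuples. The tuple layout and the `resolve_links` helper are hypothetical and do not reflect the generated model's field names.

```python
from typing import Sequence

# Two parallel token sequences (illustrative data).
source = ["Ich", "habe", "es", "gesehen"]
target = ["I", "saw", "it"]

# Each link pairs source positions with target positions.
# The second link is two-to-one: "habe ... gesehen" aligns with "saw".
links = [((0,), (0,)), ((1, 3), (1,)), ((2,), (2,))]


def resolve_links(
    links: Sequence[tuple],
    source: Sequence[str],
    target: Sequence[str],
) -> list:
    """Replace positions with the elements they index in each sequence."""
    return [
        ([source[i] for i in src], [target[j] for j in tgt])
        for src, tgt in links
    ]


resolved = resolve_links(links, source, target)
```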

annotation

lairs.records._generated.annotation

Generated models for the pub.layers.annotation lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

ArgumentRef

Bases: Model

A role/argument reference in a predicate-argument structure. Uses the composable objectRef to point to another annotation, either locally (same layer, by UUID) or remotely (cross-layer or cross-record, by AT-URI + UUID).

Annotation

Bases: Model

A single abstract annotation. The fields populated depend on the layer's kind/subkind. For token-tags: tokenIndex + label. For spans: anchor + label. For trees: anchor + label + parentId/childIds. For relations: anchor + arguments. For graphs: anchor + arguments or headIndex/targetIndex. This single type replaces the former tag, spanAnnotation, entityMention, situationMention, dependencyArc, parseNode, etc.

AnnotationLayer

Bases: Model

A named layer of annotations over an expression. All annotation types use this single record type. The combination of kind, subkind, and formalism tells the appview how to render. Multiple layers can coexist for the same expression.

Cluster

Bases: Model

A cluster of annotations (e.g., coreferent entity mentions, situation mentions referring to the same situation).

ClusterSet

Bases: Model

Groups annotations into equivalence classes. Used for coreference resolution (entity clusters, situation clusters), bridging anaphora grouping, and any annotation clustering task.

changelog

lairs.records._generated.changelog

Generated models for the pub.layers.changelog lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

ChangeItem

Bases: Model

An individual change entry. The targets field uses objectRef for machine-readable sub-record targeting, allowing a change item to point at specific objects within the subject record.

ChangeSection

Bases: Model

A group of changes under a single category.

SemanticVersion

Bases: Model

A semantic version following the major.minor.patch convention.
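One reason to store the three components as separate integers rather than a string is ordering. The sketch below uses a hypothetical `SemVerSketch` dataclass; the attribute names follow the major.minor.patch convention, but the `parse` helper is an assumption and not part of the generated model.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class SemVerSketch:
    """Hypothetical stand-in; compares component-wise as integers."""

    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "SemVerSketch":
        """Parse a plain "major.minor.patch" string (no pre-release tags)."""
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


older = SemVerSketch.parse("1.9.3")
newer = SemVerSketch.parse("1.10.0")
```

Comparing the raw strings would order "1.10.0" before "1.9.3"; comparing the parsed integers gives the intended result.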

Entry

Bases: Model

A changelog entry describing changes to any Layers record.

corpus

lairs.records._generated.corpus

Generated models for the pub.layers.corpus lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

AdjudicationSpec

Bases: Model

How disagreements between annotators are resolved into a final annotation.

QualityCriterion

Bases: Model

An acceptance criterion for annotation quality.

RedundancySpec

Bases: Model

How many annotators work on each item and how they are assigned.

AnnotationDesign

Bases: Model

Annotation project design parameters: annotator assignment, adjudication, and quality criteria.

Corpus

Bases: Model

A corpus: a curated collection of expressions.

Membership

Bases: Model

A record indicating that an expression belongs to a corpus, with optional split assignment.

defs

lairs.records._generated.defs

Generated models for the pub.layers.defs lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

KnowledgeRef

Bases: Model

A reference to an external knowledge base entry. Supports ATProto-native KBs (e.g., chive.pub with AT-URI nodes), external KBs (e.g., Wikidata with QIDs), and user/persona-specific KBs (AT-URIs in user PDSes).

AgentRef

Bases: Model

A composable reference to any agent (human annotator, ML model, crowd worker, expert panel, etc.) that produced data. Separates the identity of the producer from the interpretive framework (persona) and the software used (tool). Consumers dispatch on which field(s) are populated: did for ATProto-native agents, id for anonymized or platform-specific identifiers, knowledgeRef for externally grounded agents (ORCID, HuggingFace model card, Wikidata).

Feature

Bases: Model

A single key-value feature.

FeatureMap

Bases: Model

An open-ended set of typed key-value features that can be attached to any annotation. Provides maximum extensibility without committing to any label set or linguistic theory.

Bases: Model

A single link in an alignment between two parallel sequences. Maps element(s) in a source sequence to element(s) in a target sequence. Supports many-to-many correspondence for interlinear glossing, parallel text alignment, cross-tokenization mapping, etc.

FragmentSelector

Bases: Model

W3C FragmentSelector: selects by URI fragment identifier.

TextPositionSelector

Bases: Model

W3C TextPositionSelector adapted for ATProto: selects by UTF-8 byte offsets. Semantically equivalent to pub.layers.defs#span but named for W3C compatibility with at.margin.

TextQuoteSelector

Bases: Model

W3C TextQuoteSelector: selects text by quoting it with surrounding context. Compatible with at.margin.annotation and the W3C Web Annotation Data Model.

ExternalTargetSelector

Bases: TaggedUnion

W3C selector for identifying the specific segment within the resource.

ExternalTargetSelectorTextQuoteSelector

Bases: ExternalTargetSelector

The 'textQuoteSelector' member of ExternalTargetSelector.

ExternalTargetSelectorTextPositionSelector

Bases: ExternalTargetSelector

The 'textPositionSelector' member of ExternalTargetSelector.

ExternalTargetSelectorFragmentSelector

Bases: ExternalTargetSelector

The 'fragmentSelector' member of ExternalTargetSelector.
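Consumers of a tagged union typically branch on the discriminator. The sketch below works on raw JSON payloads and assumes the discriminator is the ATProto "$type" field with values of the form `pub.layers.defs#<member>`; that assumption, and both helper names, are illustrative rather than taken from the generated code.

```python
def selector_member(payload: dict) -> str:
    """Return the member name after '#' in the payload's "$type" value."""
    _, sep, member = payload.get("$type", "").partition("#")
    if not sep or not member:
        raise ValueError(f"payload has no usable $type: {payload!r}")
    return member


def describe_selector(payload: dict) -> str:
    """Branch on the union member; unknown members are rejected."""
    member = selector_member(payload)
    if member == "textQuoteSelector":
        return "selects by quoted text"
    if member == "textPositionSelector":
        return "selects by byte offsets"
    if member == "fragmentSelector":
        return "selects by URI fragment"
    raise ValueError(f"unknown selector member: {member}")


# Hypothetical payload carrying only the assumed discriminator.
description = describe_selector({"$type": "pub.layers.defs#fragmentSelector"})
```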

ExternalTarget

Bases: Model

Target for annotating external resources (web pages, documents, etc.). Compatible with at.margin's target model and the W3C Web Annotation Data Model.

BoundingBox

Bases: Model

A spatial bounding box for image or video frame annotation.

Span

Bases: Model

A contiguous span of text defined by UTF-8 byte offsets.
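UTF-8 byte offsets differ from Python string indices whenever the text contains non-ASCII characters. The helper below is a small sketch (`byte_span` is a hypothetical name) that converts character positions into byte offsets by encoding the prefix.

```python
def byte_span(text: str, char_start: int, char_end: int) -> tuple:
    """Convert character positions into UTF-8 byte offsets (end exclusive)."""
    start = len(text[:char_start].encode("utf-8"))
    end = len(text[:char_end].encode("utf-8"))
    return start, end


text = "naïve café"
# Characters 6..10 cover "café"; "ï" and "é" each take two bytes in UTF-8.
start, end = byte_span(text, 6, 10)
recovered = text.encode("utf-8")[start:end].decode("utf-8")
```

Here the character range (6, 10) becomes the byte range (7, 12), and slicing the encoded bytes with those offsets recovers "café".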

PageAnchor

Bases: Model

Anchor to a specific page and region in a paged document (PDF, etc.). Compatible with chive.pub's page-level annotation model.

Keyframe

Bases: Model

A spatial annotation at a specific time point.

TemporalSpan

Bases: Model

A temporal span within a media source, defined by start and end times in milliseconds.

SpatioTemporalAnchor

Bases: Model

Combined spatial and temporal anchor for video annotation with keyframe-based tracking.

Uuid

Bases: Model

A universally unique identifier for cross-referencing annotation objects.

TokenRef

Bases: Model

A reference to a specific token within a tokenization, by index.

TokenRefSequence

Bases: Model

A sequence of token references, possibly non-contiguous, within a single tokenization.

Anchor

Bases: Model

Abstract anchor: how an annotation attaches to its source data. This is a polymorphic type; at least one anchoring field should be present. Consumers dispatch on which field(s) are populated.

ObjectRef

Bases: Model

A composable reference to any Layers object, whether local (same record, by UUID), remote (different record, by AT-URI + optional object UUID), or external (knowledge graph entry). This is the universal cross-referencing primitive; consumers dispatch on which field(s) are populated. Used by argumentRef, graphNode, alignment endpoints, and any other cross-object pointer.

AnnotationMetadata

Bases: Model

Metadata about who or what produced an annotation, when, and with what confidence. The three key provenance fields are: agent (who did it), personaRef (under what framework), and tool (with what software).

Constraint

Bases: Model

An abstract constraint expression. Used for type constraints on role slots, slot-level constraints in templates, cross-slot agreement constraints, and any other declarative restriction. The expression field holds a DSL string whose format is identified by expressionFormat/expressionFormatUri.

LicenseRef

Bases: Model

Detail for a single license. Follows the URI+slug pattern (spdxUri is the canonical knowledge-graph node, spdx is the human-readable fallback) and mirrors one entry of a DataCite rightsList (rightsIdentifier + rightsURI).

Licensing

Bases: Model

Complete licensing terms for a released artifact. Represents single, dual/multi (choose-one), composite (all-apply), exception (WITH), and component-scoped licensing. The SPDX license expression encodes the boolean relationship between licenses; the licenses array carries per-license detail.

ReproducibilityInfo

Bases: Model

Information about how to reproduce a dataset or the data produced from an eprint. Shared by data-producing records (corpus, annotation layers, experiments) and eprint data links.

SpatialEntity

Bases: Model

A normalized spatial value representing a point, region, line, or complex geometry. Parallel to temporalEntity. Subsumes GeoJSON geometry types, WKT primitives, and ISO 19107 spatial schema. Consumers dispatch on which fields are populated: bbox only (pixel bounding box), geometry+type (parsed geometry string), geometry+geometryFormat (format-specific parsing).

SpatialModifier

Bases: Model

Qualitative modification of a spatial value. Parallel to temporalModifier. Indicates precision, derivation method, or processing applied to a spatial entity.

SpatialExpression

Bases: Model

A complete spatial annotation packaging the expression type, normalized value, modifier, anchoring, and document function. Parallel to temporalExpression. Subsumes ISO-Space place annotations (ISO 24617-7), SpatialML PLACE elements, and general spatial semantic annotation. Attach to annotation objects via the spatial field.

TemporalEntity

Bases: Model

A normalized temporal value representing a point, interval, duration, or uncertain range in calendar/clock time. Subsumes OWL-Time TemporalEntity (Instant, Interval, Duration) and TimeML TIMEX3 value. Consumers dispatch on which fields are populated: instant only (point), intervalStart+intervalEnd (bounded interval), duration only (pure duration), earliest+latest (uncertain bounds), recurrence (repeating pattern).

TemporalModifier

Bases: Model

Qualitative modification of a temporal value. Subsumes TimeML TIMEX3 mod attribute and OWL-Time DateTimeDescription qualifiers.

TemporalExpression

Bases: Model

A complete temporal annotation packaging the expression type, normalized value, modifier, anchoring, and document function. Subsumes TimeML TIMEX3 and OWL-Time GeneralDateTimeDescription. Attach to annotation objects via the temporal field.

eprint

lairs.records._generated.eprint

Generated models for the pub.layers.eprint lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

Creator

Bases: Model

A bibliographic creator (author, editor, etc.). Name parts follow CSL-JSON; nameType and affiliation follow DataCite. ATProto-native creators additionally ground identity via agent (DID) or knowledgeRef (ORCID, ROR, OpenAlex).

Date

Bases: Model

A date in CSL-JSON style: structured year/month/day and/or a free-form literal. Maps to CSL 'date-parts' (when year/month/day are set) or 'literal'/'raw' (when only literal is set).
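The mapping described above can be sketched as a small function. `to_csl_date` is a hypothetical helper, not part of the generated module; it emits the CSL-JSON `date-parts` form when a year is present and falls back to `literal` otherwise.

```python
from typing import Optional


def to_csl_date(
    year: Optional[int] = None,
    month: Optional[int] = None,
    day: Optional[int] = None,
    literal: Optional[str] = None,
) -> dict:
    """Build a CSL-JSON date object from structured or free-form input."""
    if year is not None:
        parts = [year]
        if month is not None:
            parts.append(month)
            if day is not None:  # a day is only meaningful with a month
                parts.append(day)
        return {"date-parts": [parts]}
    if literal is not None:
        return {"literal": literal}
    return {}


structured = to_csl_date(year=2024, month=5, day=17)
free_form = to_csl_date(literal="Spring 2024")
```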

Citation

Bases: Model

A bibliographic citation, expressible as a raw formatted string (raw) and/or structured fields following CSL-JSON and DataCite conventions. Consumers prefer the structured fields when present and fall back to raw. Populate at least raw or title.

Bases: Model

A link from an eprint to the Layers data it produced or is associated with. Generalizes the former chive-specific eprintDataLink.

Eprint

Bases: Model

A link between a Layers data record and an eprint.

expression

lairs.records._generated.expression

Generated models for the pub.layers.expression lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

Expression

Bases: Model

An expression record representing a linguistic data source or unit at any granularity, from full documents down to individual morphemes.

graph

lairs.records._generated.graph

Generated models for the pub.layers.graph lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

GraphEdge

Bases: Model

A single directed typed edge between any two Layers objects. Supports multidigraphs and cycles. Source and target use objectRef, which can point to local UUIDs, remote AT-URIs, or external knowledge graph nodes.

GraphEdgeEntry

Bases: Model

A single directed edge entry within a graphEdgeSet.

GraphEdgeSet

Bases: Model

A batch of typed, directed edges between Layers objects. Use for bulk edge creation when many edges share the same provenance and context.

GraphNode

Bases: Model

A standalone node in the property graph. Represents entities, concepts, situations, claims, or any domain object that does not have another Layers record. Existing Layers records (expressions, annotations, typeDefs) are implicitly nodes via objectRef.

judgment

lairs.records._generated.judgment

Generated models for the pub.layers.judgment lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

AgreementReport

Bases: Model

An inter-annotator agreement report summarizing agreement metrics across judgment sets.

ListConstraint

Bases: Model

A constraint on item list construction for an experiment.

ExperimentDesign

Bases: Model

Experiment design parameters for item distribution, ordering, and timing.

PresentationSpec

Bases: Model

How stimuli are displayed to participants.

RecordingMethod

Bases: Model

A data capture instrument used in an experiment.

ExperimentDef

Bases: Model

Definition of an annotation or judgment experiment.

Judgment

Bases: Model

A single judgment about a linguistic item.

JudgmentSet

Bases: Model

A set of judgments from a single annotator for an experiment.

media

lairs.records._generated.media

Generated models for the pub.layers.media lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

AudioInfo

Bases: Model

Composable audio metadata. Attach to any media record representing audio content: standalone audio files, audio tracks in video, etc.

DocumentInfo

Bases: Model

Composable document/image metadata. Attach to any media record representing scanned documents, manuscripts, printed text, or other page-based media for OCR/HTR annotation workflows.

VideoInfo

Bases: Model

Composable video metadata. Attach to any media record representing video content.

Media

Bases: Model

A media source record (audio, video, image, or document) that can be referenced by expressions and annotations. Modality-specific metadata lives in composable audioInfo/videoInfo/documentInfo objects.

ontology

lairs.records._generated.ontology

Generated models for the pub.layers.ontology lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

Ontology

Bases: Model

An annotation ontology: a collection of typed definitions (entity types, situation types, role types, relation types) that together form a complete annotation framework.

RoleSlot

Bases: Model

A role/argument slot in a frame or situation type definition. Structurally parallel to pub.layers.resource#slot: both represent named positions with type constraints. roleSlot is ontology-level (what roles a frame type allows); resource slot is template-level (what variables a template exposes).

TypeDef

Bases: Model

A type definition within an ontology. Covers entity types, situation types, role types, and relation types in a single unified model.

persona

lairs.records._generated.persona

Generated models for the pub.layers.persona lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

Persona

Bases: Model

A persona representing an annotator's role, expertise, and interpretive framework.

resource

lairs.records._generated.resource

Generated models for the pub.layers.resource lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

Collection

Bases: Model

A named collection of linguistic resource entries. Abstract enough to represent bead Lexicons, FrameNet frame inventories, PropBank frame files, WordNet synsets, morphological paradigm tables, gazetteers, stop-word lists, etc.

CollectionMembership

Bases: Model

Links an entry to a collection. Separate record enables many-to-many relationships (an entry can belong to multiple collections) and decentralized curation (anyone can propose membership).

MweComponent

Bases: Model

A component of a multi-word expression entry.

Entry

Bases: Model

A linguistic resource entry: a lexical item, frame element filler, morphological paradigm cell, or any atomic unit in a structured linguistic collection. Abstract enough to represent bead LexicalItems, FrameNet lexical units, PropBank rolesets, WordNet synset members, morphological paradigm cells, etc.

SlotFilling

Bases: Model

A single slot→filler mapping in a filled template. The filler can be an entry reference (AT-URI to a resource entry), a literal value, or both (entry reference with rendered surface form).

Filling

Bases: Model

A filled template: a template with all slots mapped to specific fillers, producing a rendered text. Generalizes bead's FilledTemplate and Item. The rendered text can optionally be materialized as a pub.layers.expression for annotation. Fillings are composable: they reference templates, entries, and communications via AT-URIs.

Slot

Bases: Model

A named variable slot in a template. Generalizes bead's Slot (template variable position with constraints and defaults), ontology roleSlots (argument positions with filler type constraints), and similar parameterized positions in any structured linguistic pattern. Slots are composable: they can reference collections of allowed fillers, ontology types, or express arbitrary constraints.

Template

Bases: Model

A parameterized text template with named variable slots. Generalizes bead's Template (text pattern with {slotName} placeholders, slot definitions, and cross-slot constraints) and similar pattern structures used for stimulus generation, item construction, data augmentation, and controlled natural language. Templates are composable: they can reference ontologies, collections, and other templates.
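The {slotName} placeholder syntax happens to match Python's format-string syntax, so the idea can be sketched with the standard library alone. The template text and fillers below are invented for illustration, and nothing here reflects how the library itself renders templates.

```python
from string import Formatter

template = "The {agent} {verb} the {patient}."

# Discover the slot names declared by the placeholders, in order.
slot_names = [
    name for _, name, _, _ in Formatter().parse(template) if name
]

# Map each slot to a filler and render the text.
fillers = {"agent": "linguist", "verb": "annotated", "patient": "corpus"}
rendered = template.format_map(fillers)

# A missing filler surfaces as a KeyError instead of a silent gap.
try:
    template.format_map({"agent": "linguist"})
    complete = True
except KeyError:
    complete = False
```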

TemplateMember

Bases: Model

A member in a template composition. References either a template or a nested composition.

TemplateComposition

Bases: Model

A composition of templates (sequence, tree, or other structure). Used for multi-part stimuli, template hierarchies, and complex item construction.

segmentation

lairs.records._generated.segmentation

Generated models for the pub.layers.segmentation lexicon namespace.

This module is emitted by lairs gen from the vendored lexicons and must not be edited by hand. Each class mirrors a lexicon record, object, or union definition.

Token

Bases: Model

A single token within a tokenization.

Tokenization

Bases: Model

An ordered sequence of tokens for an expression or sub-expression. Multiple tokenizations can coexist for the same expression (e.g., whitespace vs. BPE vs. morphological), enabling interlinear glossing, alternative segmentation strategies, or multi-granularity analysis.

Segmentation

Bases: Model

A segmentation of an expression into tokenizations. Structural hierarchy (sections, sentences, paragraphs) is expressed via expression records with parentRef; this record provides the token-level decomposition.