Development

This section is for people working on lairs itself: setting up an environment, running the checks that gate every change, and understanding how the codebase is organized. If you are here to use the library, start with the Tutorial instead.

The short version lives in CONTRIBUTING.md at the repository root. This section expands on it.

Environment

lairs targets Python 3.14+ and uses uv for environments and dependencies.

git clone https://github.com/layers-pub/lairs
cd lairs
uv sync

uv sync creates .venv and installs the project together with the dev dependency group. The dev group pulls in every optional extra that has a cp314 wheel, so the test suite exercises the integrations rather than skipping them. TensorFlow has no stable cp314 wheel yet; install the nightly (uv pip install tf-nightly) to exercise the tfdata exporter, which CI does on every run. decord and label-studio-sdk have no cp314 wheel at all, so a few of their tests skip cleanly until upstream publishes one.

Run tools through uv run, or activate the environment and call them directly:

source .venv/bin/activate

The checks

Continuous integration runs these on every push and pull request. Run them locally before you push.

uv run ruff format --check lairs tests           # formatting
uv run ruff check lairs tests                    # lint, the ruff "ALL" ruleset
uvx ty check --python .venv --error-on-warning   # static type checking
uv run pytest                                    # the default suite

To fix what the tools can fix automatically:

uv run ruff format lairs tests
uv run ruff check --fix lairs tests
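
For anyone who prefers a single entry point, the four checks can be chained from a small script. The sketch below is a hypothetical wrapper, not something shipped in the repository: the command lines are copied from the list above, while `CHECKS`, `run_command`, and `failed_checks` are names invented for illustration.

```python
import subprocess
from collections.abc import Callable, Sequence

# Command lines copied from the checks listed above.
# The wrapper around them is hypothetical.
CHECKS: tuple[tuple[str, ...], ...] = (
    ("uv", "run", "ruff", "format", "--check", "lairs", "tests"),
    ("uv", "run", "ruff", "check", "lairs", "tests"),
    ("uvx", "ty", "check", "--python", ".venv", "--error-on-warning"),
    ("uv", "run", "pytest"),
)


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def failed_checks(
    checks: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = run_command,
) -> list[str]:
    """Run every check, even after a failure, and return the failing command lines."""
    return [" ".join(check) for check in checks if runner(check) != 0]
```

Calling `failed_checks(CHECKS)` from the repository root returns an empty list when everything passes. Running all four checks instead of stopping at the first failure is a design choice here: it surfaces a lint error and a type error in the same pass.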

CI runs one more gate: a search that fails if Any or a bare object appears in a type-annotation position anywhere under lairs/ or tests/. Annotate precisely instead, with a protocol, a TypeVar, or a concrete union.
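
To see locally what such a gate looks for, the following is a minimal sketch built on the standard-library `ast` module. It is an assumption-laden stand-in, not the search CI actually runs: it reads "bare object" as an annotation that is exactly `object`, it flags `Any` anywhere inside an annotation, and it does not look inside quoted (string) annotations. The function names are hypothetical.

```python
import ast


def _annotations(tree: ast.AST) -> list[ast.expr]:
    """Collect every expression that sits in a type-annotation position."""
    found: list[ast.expr] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.arg) and node.annotation is not None:
            found.append(node.annotation)  # parameters, *args, **kwargs
        elif isinstance(node, ast.AnnAssign):
            found.append(node.annotation)  # annotated assignments
        elif (
            isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
            and node.returns is not None
        ):
            found.append(node.returns)  # return annotations
    return found


def forbidden_annotations(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, name) pairs for Any and bare object annotations."""
    hits: list[tuple[int, str]] = []
    for annotation in _annotations(ast.parse(source)):
        # Assumption: "bare" means the whole annotation is just `object`.
        if isinstance(annotation, ast.Name) and annotation.id == "object":
            hits.append((annotation.lineno, "object"))
        for node in ast.walk(annotation):
            if isinstance(node, ast.Name) and node.id == "Any":
                hits.append((node.lineno, "Any"))
            elif isinstance(node, ast.Attribute) and node.attr == "Any":
                # Qualified spelling such as typing.Any.
                hits.append((node.lineno, "Any"))
    return sorted(hits)
```

Feeding each file under lairs/ and tests/ through `forbidden_annotations` and failing on any non-empty result approximates the gate described above.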

The Testing page covers the suite, its markers, and the local Personal Data Server used by the integration tests.

Project layout

lairs/
├── records/         generated pub.layers.* models, BlobRef, normalization
├── atproto/         PDS access: XRPC, CAR/DAG-CBOR decode, firehose, handles
├── store/           the schema-aware content-addressed repository
├── data/            the Dataset and Corpus API, Arrow/Parquet materialisation
├── author/          builders, blob upload, dependency-ordered publishing
├── media/           audio/video/time-series resolution and anchor resolution
├── discovery/       network crawl, the searchable index, the DuckDB accelerator
├── integrations/    codecs, exporters, knowledge bases, experiment tracking
├── tui/             the Textual explorer (Explore, Browse, Query)
├── codegen/         the lexicon-to-model generator behind `lairs gen`
├── lexicons/        the vendored Layers lexicon tree and MANIFEST.toml
└── cli.py           the `lairs` command

Tests mirror this tree under tests/.

Conventions

The codebase is deliberately uniform. New code matches the code around it.

didactic models for all structured data. No dataclasses, TypedDict, or pydantic for record-shaped values.
No Any in annotations (enforced in CI via ruff ANN401). A sound object annotation, narrowed before use, is allowed.
Imports at module top level. Function- or method-level imports are a ruff error (PLC0415). The only exception is a lazy import of a heavy optional extra that must not load unless its extra is installed; never silence the rule for a stdlib, core-dependency, or first-party import.
Numpy-style docstrings on every public module, class, and function. didactic models take **kwargs, so document their fields under Attributes, not Parameters, for mkdocstrings to render them.
Public API through __all__, named exports preferred.

Documentation

The docs are built with MkDocs and mkdocstrings (numpy docstring style).

uv run --group docs mkdocs serve            # live preview
uv run --group docs mkdocs build --strict   # the gate: zero warnings

mkdocs build --strict must finish with no warnings. When you add or rename a public symbol, confirm its docstring renders, and wire any new page into the nav in mkdocs.yml.

Record models

The pub.layers.* models under lairs/records/_generated/ are generated, not written. Regenerate them rather than editing them:

uv run lairs gen           # regenerate from the vendored lexicons
uv run lairs gen --check   # drift gate: fail if the committed models are stale
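
Conceptually, a drift gate compares a fresh regeneration against what is committed. The sketch below shows that comparison in isolation, assuming the generator has already written its output to a second directory. It is not how `lairs gen --check` is implemented, and `snapshot` and `drifted_files` are hypothetical names.

```python
from pathlib import Path


def snapshot(root: Path) -> dict[str, bytes]:
    """Map each file's path, relative to root, onto its contents."""
    return {
        path.relative_to(root).as_posix(): path.read_bytes()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def drifted_files(committed: Path, regenerated: Path) -> list[str]:
    """List files that are missing, extra, or different between the two trees."""
    old = snapshot(committed)
    new = snapshot(regenerated)
    # A name present on only one side compares against None and counts as drift.
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

An empty result means the committed models are current. Any entry names a file to regenerate and commit.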

Adopting a new Layers lexicon version is a re-vendor followed by a regenerate. The Code generation page covers the pipeline; the user-facing walkthrough is Vendoring and codegen.

Releasing

The end-to-end release procedure, from version bump to PyPI upload and docs deploy, is on the Releasing page.
