Knowledge bases

This page documents the shared entity, candidate, and edge models, together with the Wikidata, reconciliation, and glazing connectors that bind to the KnowledgeBase port. For usage, see Guides > Knowledge bases.

Value models

lairs.integrations.kb.Entity

Bases: Model

A resolved knowledge-base entity.

ATTRIBUTE DESCRIPTION
ref

The canonical identifier or URI of the entity.

TYPE: str

label

The primary label.

TYPE: str

aliases

Alternative surface forms.

TYPE: tuple of str, optional

types

The entity's type identifiers.

TYPE: tuple of str, optional

description

A short description, when available.

TYPE: str or None, optional

same_as

Cross-references to the same entity in other knowledge bases.

TYPE: tuple of str, optional

lairs.integrations.kb.Candidate

Bases: Model

A ranked entity-linking candidate.

ATTRIBUTE DESCRIPTION
ref

The candidate identifier or URI.

TYPE: str

label

The candidate label.

TYPE: str

score

The ranking score.

TYPE: float

lairs.integrations.kb.Edge

Bases: Model

A directed edge in a knowledge-base neighbourhood.

ATTRIBUTE DESCRIPTION
source

The source entity identifier or URI.

TYPE: str

relation

The relation identifier.

TYPE: str

target

The target entity identifier or URI.

TYPE: str

Wikidata

lairs.integrations.kb.wikidata

Wikidata knowledge-base connector.

Resolves and links against Wikidata, the hub that other knowledge bases reconcile to. The transport is the public Wikidata REST, action, and SPARQL endpoints over httpx (a core dependency), so no extra is required. The connector creates its own httpx.Client with a descriptive User-Agent, because the public Wikidata Query Service and action API return HTTP 403 for requests that carry a default user agent. Inject a client to override headers or supply a mock transport.

The lairs[wikidata] extra (qwikidata / SPARQLWrapper) is declared for callers who prefer those clients in their own code. This connector does not import or use them; the httpx transport above is the only path here.

DEFAULT_USER_AGENT module-attribute

DEFAULT_USER_AGENT = f"lairs/{__version__} (https://github.com/aaronstevenwhite/lairs)"

The default User-Agent sent to the public Wikidata endpoints.

The Wikidata Query Service and action API respond with HTTP 403 to requests that carry the default httpx user agent, so the connector's own client identifies itself in line with their user-agent policy.

DEFAULT_SPARQL_ENDPOINT module-attribute

DEFAULT_SPARQL_ENDPOINT = (
    "https://query.wikidata.org/sparql"
)

The public Wikidata Query Service SPARQL endpoint.

DEFAULT_API_ENDPOINT module-attribute

DEFAULT_API_ENDPOINT = 'https://www.wikidata.org/w/api.php'

The public Wikidata action API endpoint (search).

DEFAULT_ENTITY_ENDPOINT module-attribute

DEFAULT_ENTITY_ENDPOINT = (
    "https://www.wikidata.org/wiki/Special:EntityData"
)

The public Wikidata linked-data entity endpoint (resolve).

DEFAULT_SEARCH_LIMIT module-attribute

DEFAULT_SEARCH_LIMIT = 10

The default number of candidates requested from the search API.

DEFAULT_NEIGHBOR_LIMIT module-attribute

DEFAULT_NEIGHBOR_LIMIT = 100

The default number of edges requested from the SPARQL endpoint.

WikidataError

Bases: ValueError

A Wikidata reference or property was not a well-formed identifier.

Raised when a value destined for direct interpolation into a SPARQL query (an entity QID or a property identifier) does not match the Wikidata identifier grammar, so a malformed or injected token never reaches the endpoint.
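The validation step described above can be sketched as follows. This is a minimal illustration, not the connector's code: the regular expression (a `Q` or `P` followed by digits), the helper name `check_identifier`, and the stand-in `WikidataError` class are assumptions, since the source only states that entity and property identifiers must match the Wikidata identifier grammar.

```python
import re


class WikidataError(ValueError):
    """Stand-in for the connector's error type (assumed shape)."""


# Assumed grammar: an entity QID or a property PID, such as Q42 or P31.
_IDENTIFIER = re.compile(r"[QP][1-9][0-9]*")


def check_identifier(value: str) -> str:
    """Return the value unchanged if it is well formed, otherwise raise."""
    if _IDENTIFIER.fullmatch(value) is None:
        raise WikidataError(f"not a well-formed Wikidata identifier: {value!r}")
    return value
```

Because `fullmatch` must consume the whole string, a token such as `Q42 } UNION { ?s ?p ?o` is rejected before it can be interpolated into a query.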

WikidataKB

WikidataKB(
    endpoint: str = DEFAULT_SPARQL_ENDPOINT,
    *,
    api_endpoint: str = DEFAULT_API_ENDPOINT,
    entity_endpoint: str = DEFAULT_ENTITY_ENDPOINT,
    lang: str = "en",
    limit: int = DEFAULT_SEARCH_LIMIT,
    client: Client | None = None,
)

A connector to Wikidata over its public REST, action, and SPARQL APIs.

PARAMETER DESCRIPTION
endpoint

The SPARQL endpoint used by neighbors.

TYPE: str DEFAULT: DEFAULT_SPARQL_ENDPOINT

api_endpoint

The action API endpoint used by search.

TYPE: str DEFAULT: DEFAULT_API_ENDPOINT

entity_endpoint

The linked-data entity endpoint used by resolve.

TYPE: str DEFAULT: DEFAULT_ENTITY_ENDPOINT

lang

The default label language.

TYPE: str DEFAULT: 'en'

limit

The default number of candidates requested from the search API. This also scales the synthetic rank-decay score that search assigns (see that method), so changing it alters both the size of the result set and the step between consecutive scores (1 / limit).

TYPE: int DEFAULT: DEFAULT_SEARCH_LIMIT

client

An injected HTTP client. When omitted, a private client is created with DEFAULT_USER_AGENT and closed with this connector. Injecting one lets a caller override headers or supply a mock transport.

TYPE: Client or None DEFAULT: None

close

close() -> None

Close the underlying HTTP client if this connector owns it.
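The ownership rule (close only a client the connector created itself) can be sketched with stand-in classes. `FakeClient` and `Connector` are hypothetical names used for illustration only; the real connector wraps an httpx client, and its internals may differ.

```python
from typing import Optional


class FakeClient:
    """Hypothetical stand-in for an HTTP client; it only records closure."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class Connector:
    """Illustrative connector that closes only a client it created itself."""

    def __init__(self, client: Optional[FakeClient] = None) -> None:
        # Ownership is decided once, at construction time.
        self._owns_client = client is None
        self._client = client if client is not None else FakeClient()

    def close(self) -> None:
        if self._owns_client:
            self._client.close()


injected = FakeClient()
Connector(injected).close()  # the injected client is left open for its owner
owned = Connector()
owned.close()  # the private client is closed with the connector
```

Under this pattern a caller who injects a shared client remains responsible for closing it.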

resolve

resolve(ref: str) -> Entity

Resolve a Wikidata identifier to an entity.

PARAMETER DESCRIPTION
ref

The Wikidata identifier or URI (for example Q42).

TYPE: str

RETURNS DESCRIPTION
Entity

The resolved entity.

RAISES DESCRIPTION
HTTPStatusError

If the entity endpoint returns a non-success status.

search

search(
    text: str,
    *,
    lang: str | None = None,
    types: Sequence[str] | None = None,
) -> list[Candidate]

Search Wikidata for candidate entities.

Uses the wbsearchentities action API, which returns hits already ranked by relevance but carries no per-hit relevance score. Each candidate therefore receives a synthetic rank-decay score 1 - rank / limit (the top hit scores near 1.0, later hits less), clamped to 0.0 so a result set larger than limit never yields a negative score. The ordering is faithful; the magnitudes are not calibrated probabilities.

Type constraints are not expressible in this API and are ignored; use ReconciliationKB against the Wikidata reconciliation endpoint for type-filtered search.

PARAMETER DESCRIPTION
text

The surface text to link.

TYPE: str

lang

A language filter; defaults to the connector language.

TYPE: str or None DEFAULT: None

types

Ignored; the action API cannot express type constraints.

TYPE: collections.abc.Sequence of str or None DEFAULT: None

RETURNS DESCRIPTION
list of lairs.integrations.kb.Candidate

The ranked candidates.

RAISES DESCRIPTION
HTTPStatusError

If the action API returns a non-success status.
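The rank-decay score that search assigns can be sketched as a one-line function. This sketch assumes a 0-based rank, which the source does not state; with a 1-based rank every score would be lower by 1 / limit. The function name `rank_decay_score` is hypothetical.

```python
def rank_decay_score(rank: int, limit: int) -> float:
    """Score a hit by position: 1 - rank / limit, clamped at 0.0.

    Assumes rank is 0-based (an assumption, not confirmed by the source).
    """
    return max(0.0, 1.0 - rank / limit)


# Twelve hits scored against a limit of ten: the last two are clamped.
scores = [rank_decay_score(rank, limit=10) for rank in range(12)]
```

The scores decrease monotonically with rank, so sorting by score reproduces the API's ordering, but the values should not be compared across queries or treated as probabilities.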

neighbors

neighbors(
    ref: str, *, rels: Sequence[str] | None = None
) -> list[Edge]

Expand a Wikidata entity's neighbourhood via SPARQL.

PARAMETER DESCRIPTION
ref

The Wikidata identifier or URI to expand.

TYPE: str

rels

Property identifiers (for example P31) to restrict the expansion to; all direct statements are returned when omitted.

TYPE: collections.abc.Sequence of str or None DEFAULT: None

RETURNS DESCRIPTION
list of lairs.integrations.kb.Edge

The neighbouring edges.

RAISES DESCRIPTION
WikidataError

If ref or any rels entry is not a well-formed identifier.

HTTPStatusError

If the SPARQL endpoint returns a non-success status.
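One way a neighbourhood query could be assembled from validated identifiers is sketched below. The query shape, the helper name `build_neighbors_query`, and the identifier patterns are assumptions for illustration; the source does not show the query the connector sends. The `wd:` and `wdt:` prefixes are predefined on the Wikidata Query Service.

```python
import re
from collections.abc import Sequence
from typing import Optional

# Assumed identifier patterns for entities and properties.
_QID = re.compile(r"Q[1-9][0-9]*")
_PID = re.compile(r"P[1-9][0-9]*")


def build_neighbors_query(
    ref: str, rels: Optional[Sequence[str]] = None, limit: int = 100
) -> str:
    """Build a SPARQL query for an entity's direct statements (assumed shape)."""
    if _QID.fullmatch(ref) is None:
        raise ValueError(f"malformed entity identifier: {ref!r}")
    for rel in rels or ():
        if _PID.fullmatch(rel) is None:
            raise ValueError(f"malformed property identifier: {rel!r}")
    if rels:
        # Restrict the expansion to the requested properties.
        restriction = "VALUES ?p { " + " ".join(f"wdt:{rel}" for rel in rels) + " }"
    else:
        # Keep only predicates in the direct-statement (wdt:) namespace.
        restriction = (
            'FILTER(STRSTARTS(STR(?p), "http://www.wikidata.org/prop/direct/"))'
        )
    return f"SELECT ?p ?o WHERE {{ wd:{ref} ?p ?o . {restriction} }} LIMIT {limit}"


query = build_neighbors_query("Q42", ["P31"])
```

Validation runs before any string interpolation, so a malformed `ref` or `rels` entry raises instead of producing a query.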

Reconciliation

lairs.integrations.kb.reconciliation

Generic W3C/OpenRefine reconciliation knowledge-base connector.

A single reconciliation adapter speaks to any endpoint exposing the W3C / OpenRefine reconciliation service API (Wikidata, VIAF, Getty, ORCID, ...), so the entity-linking path is unified. Transport is :mod:httpx, a core dependency, so this connector needs no optional extra.

The reconciliation service API is request/response over a single base URL. A queries POST returns ranked candidates per query; the optional data extension, suggest, and preview services let resolve and neighbors recover an entity and its properties where the endpoint advertises them. A connector that points at an endpoint missing a needed service fails with a clear, actionable message rather than silently returning nothing.
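The request/response exchange described above can be sketched with the standard library alone. This assumes the form-encoded `queries` field and the keyed-result response shape of the reconciliation service API; the function names are hypothetical, and the sample response is hand-written rather than real endpoint output.

```python
import json
from collections.abc import Sequence
from typing import Optional


def build_queries_form(
    text: str, types: Optional[Sequence[str]] = None, limit: int = 10
) -> dict[str, str]:
    """Encode one reconciliation query as the form body of a queries POST."""
    query: dict[str, object] = {"query": text, "limit": limit}
    if types:
        query["type"] = list(types)
    # Queries are batched as a JSON object keyed by arbitrary query ids.
    return {"queries": json.dumps({"q0": query})}


def parse_candidates(payload: dict) -> list[tuple[str, str, float]]:
    """Extract (id, name, score) triples for query q0, in the order returned."""
    return [
        (hit["id"], hit["name"], float(hit["score"]))
        for hit in payload.get("q0", {}).get("result", [])
    ]


form = build_queries_form("Douglas Adams", types=["Q5"])
# A hand-written response in the assumed shape, not real endpoint output.
sample = {
    "q0": {
        "result": [
            {"id": "Q42", "name": "Douglas Adams", "score": 100, "match": True}
        ]
    }
}
candidates = parse_candidates(sample)
```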

DEFAULT_SEARCH_LIMIT module-attribute

DEFAULT_SEARCH_LIMIT = 10

The default number of candidates requested per reconciliation query.

ReconciliationError

Bases: RuntimeError

A reconciliation endpoint did not support a requested capability.

Raised when an endpoint omits a service (data extension, suggest, preview) that a method needs, so the caller sees an actionable message instead of an empty result.
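The capability check can be sketched as a lookup in the endpoint's service manifest. The helper name `require_service`, the stand-in error class, and the message wording are assumptions; the sketch relies only on the manifest advertising optional services under keys such as `extend`, `suggest`, and `preview`.

```python
class ReconciliationError(RuntimeError):
    """Stand-in for the connector's error type (assumed shape)."""


def require_service(manifest: dict, service: str, endpoint: str) -> dict:
    """Return the settings for a manifest service, or raise if it is absent."""
    settings = manifest.get(service)
    if settings is None:
        raise ReconciliationError(
            f"{endpoint} does not advertise a {service!r} service in its manifest"
        )
    return settings


# A hand-written manifest fragment with no data-extension service.
manifest = {"name": "Example", "suggest": {"entity": {}}}
```

Calling `require_service(manifest, "extend", "https://example.org/reconcile")` on this manifest raises, whereas asking for `"suggest"` returns its settings.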

ReconciliationKB

ReconciliationKB(
    endpoint: str,
    client: Client | None = None,
    *,
    limit: int = DEFAULT_SEARCH_LIMIT,
)

A connector to any reconciliation-service endpoint.

PARAMETER DESCRIPTION
endpoint

The reconciliation service base URL.

TYPE: str

client

An injected HTTP client. When omitted, a private client is created and closed with this connector. Injecting a client lets a caller carry auth headers or a mock transport.

TYPE: Client or None DEFAULT: None

limit

The default number of candidates requested per query.

TYPE: int DEFAULT: DEFAULT_SEARCH_LIMIT

close

close() -> None

Close the underlying HTTP client if this connector owns it.

search

search(
    text: str,
    *,
    lang: str | None = None,
    types: Sequence[str] | None = None,
) -> list[Candidate]

Reconcile surface text to candidate entities.

PARAMETER DESCRIPTION
text

The surface text to reconcile.

TYPE: str

lang

A language filter, passed through to the endpoint when given.

TYPE: str or None DEFAULT: None

types

Type constraints, passed through to the endpoint when given.

TYPE: collections.abc.Sequence of str or None DEFAULT: None

RETURNS DESCRIPTION
list of lairs.integrations.kb.Candidate

The ranked candidates.

RAISES DESCRIPTION
HTTPStatusError

If the endpoint returns a non-success status.

resolve

resolve(ref: str) -> Entity

Resolve an identifier to an entity via the reconciliation service.

Resolution uses the optional data-extension service to recover the entity's sameAs properties, so the endpoint must advertise an extend service in its manifest. The label is fetched independently, on a best-effort basis, from the suggest service (see _preview_label). An entity with a resolvable label but no cross-references therefore still carries its label, and an endpoint without a suggest service yields an empty label rather than failing.

PARAMETER DESCRIPTION
ref

The identifier to resolve.

TYPE: str

RETURNS DESCRIPTION
Entity

The resolved entity.

RAISES DESCRIPTION
ReconciliationError

If the endpoint does not advertise a data-extension service.

HTTPStatusError

If the endpoint returns a non-success status.

neighbors

neighbors(
    ref: str, *, rels: Sequence[str] | None = None
) -> list[Edge]

Expand an entity's neighbourhood via the data-extension service.

Each extended property whose cells carry entity identifiers becomes an edge ref -> property -> target.

PARAMETER DESCRIPTION
ref

The identifier to expand.

TYPE: str

rels

Relation filters; when given, only these property identifiers are requested.

TYPE: collections.abc.Sequence of str or None DEFAULT: None

RETURNS DESCRIPTION
list of lairs.integrations.kb.Edge

The neighbouring edges.

RAISES DESCRIPTION
ReconciliationError

If the endpoint does not advertise a data-extension service.

HTTPStatusError

If the endpoint returns a non-success status.
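The mapping from a data-extension response to edges, as described for neighbors above, might look like the following. The response shape (a `rows` object keyed by entity identifier, then by property, holding lists of cells) is assumed from the reconciliation data-extension service; the function name and the plain tuples standing in for Edge are hypothetical.

```python
def edges_from_extension(ref: str, response: dict) -> list[tuple[str, str, str]]:
    """Turn entity-valued cells into (source, relation, target) triples."""
    edges = []
    for prop, cells in response.get("rows", {}).get(ref, {}).items():
        for cell in cells:
            # Literal cells carry keys such as "str" or "date"; only cells
            # that reference another entity carry an "id".
            if "id" in cell:
                edges.append((ref, prop, cell["id"]))
    return edges


# A hand-written response in the assumed shape, not real endpoint output.
response = {
    "meta": [{"id": "P31", "name": "instance of"}, {"id": "P569", "name": "date of birth"}],
    "rows": {
        "Q42": {
            "P31": [{"id": "Q5", "name": "human"}],
            "P569": [{"date": "1952-03-11T00:00:00+00:00"}],
        }
    },
}
edges = edges_from_extension("Q42", response)
```

Here the date-valued P569 cell produces no edge, because it does not point at another entity.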

glazing

lairs.integrations.kb.glazing

Lexical-semantic knowledge-base connector backed by glazing.

Grounds lemmas, senses, frames, and rolesets against FrameNet, PropBank, VerbNet, and WordNet through the glazing library's unified, type-safe interface, with SemLink-style cross-reference resolution. Requires the lairs[lexical] extra (glazing>=0.2) at runtime; glazing is imported lazily inside the connector, never at module import, so importing this module never pulls in the optional dependency.

The connector maps glazing's typed search results onto Candidate, resolved entries onto Entity, and cross-reference links onto Edge. Because Edge has no separate weight field, link confidences are folded into the relation label. glazing's objects are consumed through narrow typing.Protocol shims, so this module stays strictly typed without importing the library or using Any.

XrefLinks = dict[
    str, str | list[str] | dict[str, dict[str, float]]
]

The shape of a glazing cross-reference resolution.

Most keys map a target-resource name to a list of resolved identifiers (or a single identifier string for a one-to-one relation); the confidence_scores key maps each relation to a per-target score mapping.

GlazingNotInstalledError

Bases: ImportError

The optional glazing library is not installed.

Raised, with an actionable install hint, when a glazing-backed method is called but the lairs[lexical] extra is absent. Subclasses ImportError so callers can catch it as an import problem.
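The lazy-import pattern behind this error can be sketched as below. The helper name `load_optional`, the stand-in error class, and the exact wording of the install hint are assumptions; only the lairs[lexical] extra name comes from the source.

```python
import importlib
from types import ModuleType


class GlazingNotInstalledError(ImportError):
    """Stand-in for the connector's error type (assumed shape)."""


def load_optional(module_name: str = "glazing") -> ModuleType:
    """Import an optional dependency at call time, not at module import."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an install hint; the hint text is illustrative.
        raise GlazingNotInstalledError(
            f"{module_name} is not installed; try: pip install 'lairs[lexical]'"
        ) from exc
```

Because the import happens inside the function, merely importing the surrounding module never requires the optional dependency, and existing `except ImportError` handlers still catch the failure.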

GlazingKB

GlazingKB(
    data_dir: str | None = None,
    *,
    default_source: str = "propbank",
)

A lexical-semantic connector over glazing's four resources.

PARAMETER DESCRIPTION
data_dir

The local glazing data directory; defaults to the glazing default established by glazing init.

TYPE: str or None DEFAULT: None

default_source

The resource a bare identifier (one without a resource: prefix) is attributed to in resolve, neighbors, and the resource-prefix split of search. Defaults to propbank; set it to verbnet, framenet, or wordnet when working with bare identifiers from another resource.

TYPE: str DEFAULT: 'propbank'

RAISES DESCRIPTION
GlazingNotInstalledError

Lazily, when a glazing-backed method is first called and the lairs[lexical] extra is not installed. Construction never imports glazing.
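The resource-prefix split governed by default_source can be sketched as follows. The function name `split_ref` is hypothetical, and the sketch assumes the first colon separates the resource from the identifier; how the connector treats identifiers that themselves contain colons is not stated in the source.

```python
def split_ref(ref: str, default_source: str = "propbank") -> tuple[str, str]:
    """Split resource:id into its parts; a bare id gets the default source."""
    resource, sep, identifier = ref.partition(":")
    if not sep:
        # No prefix present: attribute the identifier to the default source.
        return default_source, ref
    return resource, identifier


prefixed = split_ref("propbank:give.01")
bare = split_ref("give.01")
other = split_ref("give.01", default_source="verbnet")
```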

search

search(
    text: str,
    *,
    lang: str | None = None,
    types: Sequence[str] | None = None,
) -> list[Candidate]

Search the lexical resources for candidate entries.

PARAMETER DESCRIPTION
text

The lemma or surface form to search.

TYPE: str

lang

Ignored; glazing's resources are English-only.

TYPE: str or None DEFAULT: None

types

Resource names (framenet, propbank, verbnet, wordnet) to restrict the results to; all resources when omitted.

TYPE: collections.abc.Sequence of str or None DEFAULT: None

RETURNS DESCRIPTION
list of lairs.integrations.kb.Candidate

The ranked candidates, identifiers prefixed by their resource.

RAISES DESCRIPTION
GlazingNotInstalledError

If the glazing library is not installed.

resolve

resolve(ref: str) -> Entity

Resolve a lexical identifier to an entity.

The entity's same_as carries the resolved cross-references so a single resolve doubles as a SemLink lookup.

PARAMETER DESCRIPTION
ref

The lexical identifier, optionally resource:id (for example propbank:give.01); a bare id is attributed to the connector's default_source.

TYPE: str

RETURNS DESCRIPTION
Entity

The resolved lexical entity.

RAISES DESCRIPTION
GlazingNotInstalledError

If the glazing library is not installed.

neighbors

neighbors(
    ref: str, *, rels: Sequence[str] | None = None
) -> list[Edge]

Expand a lexical entry's cross-references.

Each resolved link becomes an edge whose relation is the target-resource key. When glazing reports a confidence below one, that confidence is appended to the relation (for example verbnet_classes@0.85).

PARAMETER DESCRIPTION
ref

The lexical identifier, optionally resource:id.

TYPE: str

rels

Target-resource keys (for example verbnet_classes) to restrict the expansion to.

TYPE: collections.abc.Sequence of str or None DEFAULT: None

RETURNS DESCRIPTION
list of lairs.integrations.kb.Edge

The cross-reference edges (for example VerbNet class links).

RAISES DESCRIPTION
GlazingNotInstalledError

If the glazing library is not installed.
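Folding confidences into the relation, as neighbors does, can be sketched over a value of the XrefLinks shape. The function name, the plain tuples standing in for Edge, the number formatting after the `@`, and the `framenet_frames` key are assumptions for illustration; `verbnet_classes` and `confidence_scores` come from the source.

```python
from collections.abc import Sequence
from typing import Optional, Union

XrefLinks = dict[str, Union[str, list[str], dict[str, dict[str, float]]]]


def edges_from_links(
    ref: str, links: XrefLinks, rels: Optional[Sequence[str]] = None
) -> list[tuple[str, str, str]]:
    """Turn cross-reference links into (source, relation, target) triples."""
    scores = links.get("confidence_scores", {})
    if not isinstance(scores, dict):
        scores = {}
    edges = []
    for key, value in links.items():
        if key == "confidence_scores" or (rels is not None and key not in rels):
            continue
        # A one-to-one relation holds a single string rather than a list.
        targets = [value] if isinstance(value, str) else list(value)
        for target in targets:
            confidence = scores.get(key, {}).get(target, 1.0)
            # Append the confidence only when it is below one.
            relation = key if confidence >= 1.0 else f"{key}@{confidence}"
            edges.append((ref, relation, target))
    return edges


# Hand-written links in the documented shape, not real glazing output.
links: XrefLinks = {
    "verbnet_classes": ["give-13.1"],
    "framenet_frames": "Giving",
    "confidence_scores": {"verbnet_classes": {"give-13.1": 0.85}},
}
edges = edges_from_links("propbank:give.01", links)
```

A consumer that needs the numeric weight can recover it by splitting the relation on `@`.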