Health

grelmicro.health

Health Check Registry.

HealthCheckFunc module-attribute

HealthCheckFunc: TypeAlias = (
    SyncHealthCheckFunc | AsyncHealthCheckFunc
)

Any callable acceptable as a health check.

Returns:

  • None: healthy, no details.
  • HealthDetails: healthy, with a details dict.

Raises:

  • HealthError: unhealthy. The message surfaces in the response.
  • Any other exception: unhealthy with a generic message. The traceback is logged server-side.

HealthDetails module-attribute

HealthDetails: TypeAlias = dict[str, JSONEncodable]

Per-check details payload. JSON-serializable dict keyed by string.

CheckResult

Bases: TypedDict

Result of a single health check.

status instance-attribute

status: HealthStatus

critical instance-attribute

critical: bool

error instance-attribute

error: str | None

details instance-attribute

details: HealthDetails | None

HealthError

HealthError(
    message: str, *, details: HealthDetails | None = None
)

Bases: GrelmicroError

Signal a check failure. The message is exposed in the response.

Pass details to include a diagnostic payload alongside the error, visible under details on the check entry in /healthz (subject to show_details).

Initialize with a message and optional details dict.

details instance-attribute

details = details

HealthRegistry

HealthRegistry(
    *,
    timeout: PositiveFloat | None = None,
    cache_ttl: NonNegativeFloat | None = None,
    env_prefix: str | None = None,
    read_env: bool = True,
)

Registry that manages health checks and runs them concurrently.

Checks are plain async functions. Register them with the check decorator or the add method. All registered checks are executed in parallel via an anyio task group. Each check has its own timeout (falling back to the registry default) and its own cached result. Concurrent requests for the same check share a single execution via an anyio.Event.

Initialize the health registry.

PARAMETER DESCRIPTION
timeout

Default per-check timeout in seconds. Checks that exceed this duration are reported as error.

Default: 5.0. When unset, resolves from the environment variable GREL_HEALTH_TIMEOUT if present, otherwise falls back to the HealthRegistryConfig default.

TYPE: PositiveFloat | None DEFAULT: None

cache_ttl

Per-check cache TTL in seconds. Set to 0 to disable.

Default: 1.0. When unset, resolves from the environment variable GREL_HEALTH_CACHE_TTL if present, otherwise falls back to the HealthRegistryConfig default.

TYPE: NonNegativeFloat | None DEFAULT: None

env_prefix

Override the auto-derived environment variable prefix.

Default: GREL_HEALTH_.

TYPE: str | None DEFAULT: None

read_env

Whether to read environment variables.

Default: True. Set to False when every field is already supplied via kwargs and the environment must not influence construction.

TYPE: bool DEFAULT: True

from_config classmethod

from_config(config: HealthRegistryConfig) -> Self

Construct a HealthRegistry from a pre-built HealthRegistryConfig.

PARAMETER DESCRIPTION
config

The pre-built health registry configuration.

Use this path when the configuration is assembled at startup from a settings tree (for example YAML, Vault, or a pydantic-settings aggregator). The environment path is bypassed and the config is used as-is.

TYPE: HealthRegistryConfig

add

add(
    name: str,
    func: HealthCheckFunc,
    *,
    critical: bool = True,
    timeout: PositiveFloat | None = None,
) -> None

Register a health check function.

PARAMETER DESCRIPTION
name

Unique name identifying this check.

TYPE: str

func

Async function: returns None or a details dict on success, raises on failure.

TYPE: HealthCheckFunc

critical

Whether this check affects the aggregate status and HTTP response code. Critical failures flip the aggregate to error and cause both /readyz and /healthz to return 503. Non-critical failures are visible in the /healthz body but do not flip the aggregate.

TYPE: bool DEFAULT: True

timeout

Per-check timeout override. Falls back to the registry default when omitted.

TYPE: PositiveFloat | None DEFAULT: None

RAISES DESCRIPTION
ValueError

If name is already registered, or does not match ^[a-z0-9][a-z0-9:_-]*$ (max 64 chars). Colon is allowed for namespacing, e.g. "weather:circuitbreaker".
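The naming rule above can be sketched with the standard `re` module. `validate_name` is a hypothetical helper and the error messages are assumptions; only the pattern and the 64-character limit come from the description.

```python
import re

NAME_PATTERN = re.compile(r"^[a-z0-9][a-z0-9:_-]*$")
MAX_NAME_LENGTH = 64


def validate_name(name: str, registered: set[str]) -> None:
    """Raise ValueError for duplicate or malformed check names."""
    if name in registered:
        raise ValueError(f"check {name!r} is already registered")
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    if len(name) > MAX_NAME_LENGTH or NAME_PATTERN.fullmatch(name) is None:
        raise ValueError(f"invalid check name: {name!r}")
```

Under this rule `"weather:circuitbreaker"` and `"db-1"` are accepted, while `"Database"` (uppercase) and `"_db"` (leading underscore) are rejected.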

check

check(
    name: str,
    *,
    critical: bool = True,
    timeout: PositiveFloat | None = None,
) -> Callable[[HealthCheckFunc], HealthCheckFunc]

Decorate an async function to register it as a health check.

PARAMETER DESCRIPTION
name

Unique name identifying this check.

TYPE: str

critical

Whether this check affects the aggregate status.

TYPE: bool DEFAULT: True

timeout

Per-check timeout override.

TYPE: PositiveFloat | None DEFAULT: None

Example

@registry.check("database")
async def check_db() -> dict | None:
    return None

run async

run(
    *,
    critical_only: bool = False,
    exclude: Iterable[str] | None = None,
) -> HealthReport

Run the selected checks concurrently and aggregate.

Each check runs with its own timeout. Results are cached per check for cache_ttl seconds. Concurrent calls for the same check coalesce via single-flight.

PARAMETER DESCRIPTION
critical_only

If True, only run critical checks.

TYPE: bool DEFAULT: False

exclude

Check names to skip.

TYPE: Iterable[str] | None DEFAULT: None

RETURNS DESCRIPTION
HealthReport

A HealthReport with the aggregate status and per-check results.

HealthRegistryConfig

Bases: BaseModel

Health Registry Config.

timeout class-attribute instance-attribute

timeout: PositiveFloat = 5.0

Default per-check timeout in seconds. Checks that exceed this duration are reported as error. Can be overridden per check on registration.

cache_ttl class-attribute instance-attribute

cache_ttl: NonNegativeFloat = 1.0

Per-check cache TTL in seconds. Each check's last result is reused until it is older than cache_ttl. Concurrent calls coalesce via single-flight. Set to 0 to disable caching.

HealthReport

Bases: TypedDict

Aggregated health report across all registered checks.

status instance-attribute

status: HealthStatus

checks instance-attribute

checks: dict[str, CheckResult]

HealthStatus

Bases: StrEnum

Binary health status for a component or aggregate report.

  • OK: the check passed. At the aggregate level: every critical check passed (non-critical failures do not flip the aggregate).
  • ERROR: the check failed. At the aggregate level: at least one critical check failed.
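The aggregate rule can be sketched as a pure function over per-check results. `aggregate` and `status_code` are hypothetical helpers, and plain strings stand in for the HealthStatus members; the 200/503 mapping follows the description of critical failures.

```python
from typing import Any


def aggregate(checks: dict[str, dict[str, Any]]) -> str:
    """Return "error" if any critical check failed, otherwise "ok"."""
    failed_critical = any(
        result["critical"] and result["status"] == "error"
        for result in checks.values()
    )
    return "error" if failed_critical else "ok"


def status_code(aggregate_status: str) -> int:
    """Map the aggregate status to the HTTP code used by the probes."""
    return 503 if aggregate_status == "error" else 200
```

For example, a failing non-critical check next to a passing critical one still aggregates to `"ok"`, and an empty set of checks aggregates to `"ok"` as well in this sketch.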

OK class-attribute instance-attribute

OK = 'ok'

ERROR class-attribute instance-attribute

ERROR = 'error'

get_health_registry

get_health_registry(
    name: str = DEFAULT_NAME,
) -> HealthRegistry

Resolve a health registry by name.

RAISES DESCRIPTION
BackendNotLoadedError

If no registry resolves.

grelmicro.health.fastapi

FastAPI Health Check Router.

health_router

health_router(
    registry: HealthRegistry | None = None,
    *,
    prefix: str = "",
    show_details: bool | Depends = False,
    healthz_dependencies: list[Depends] | None = None,
) -> APIRouter

Create a FastAPI router with health check endpoints.

Provides three endpoints:

  • GET/HEAD {prefix}/livez: Liveness probe. Never runs checkers. Always returns 200 with an empty body.
  • GET/HEAD {prefix}/readyz: Readiness probe. Runs critical checkers only. Returns 200 or 503 with an empty body.
  • GET/HEAD {prefix}/healthz: Aggregate JSON report.

All responses set Cache-Control: no-store.

PARAMETER DESCRIPTION
registry

Health registry whose checks the router runs. When omitted, the router resolves the global registry (the registry installed via health.use_registry or entered as an async context manager).

TYPE: HealthRegistry | None DEFAULT: None

prefix

URL prefix for health endpoints (e.g. '/api/v1').

TYPE: str DEFAULT: ''

show_details

Whether /healthz includes each check's verbose details field (versions, hostnames, pool stats, ...):

  • False (default): details are stripped. Safe for public endpoints.
  • True: details are always included. Use only if /healthz is private.
  • Depends(fn) where fn returns bool: wires fn into FastAPI's DI graph, so Depends chains, yield cleanup, Security, Request injection, and async all work naturally. Return True to show details, False to strip them. Raising HTTPException blocks the endpoint, so return False instead when you want a soft strip.

TYPE: bool | Depends DEFAULT: False
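The effect of show_details on the /healthz body can be sketched as a post-processing step on the report. `strip_details` is a hypothetical helper, and representing a stripped field as `None` (rather than removing the key) is an assumption; the router's actual response shape may differ.

```python
import copy
from typing import Any


def strip_details(report: dict[str, Any], show_details: bool) -> dict[str, Any]:
    """Return the report unchanged, or a copy with every details field cleared."""
    if show_details:
        return report
    stripped = copy.deepcopy(report)  # never mutate the cached report
    for result in stripped["checks"].values():
        result["details"] = None
    return stripped
```

Working on a deep copy matters in a design like this because the same cached report may be served to another caller who is allowed to see details.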

healthz_dependencies

FastAPI dependencies applied to /healthz. A failing dependency blocks the entire endpoint (401/403). Use to hide /healthz from the public while leaving /livez and /readyz open to orchestrators and load balancers. Independent of show_details.

TYPE: list[Depends] | None DEFAULT: None

RAISES DESCRIPTION
DependencyNotFoundError

If fastapi is not installed.

TypeError

If show_details is neither a bool nor a Depends(...) value.