Engine Module¶
The engine module implements the inference runtime pillar. All backends
implement the InferenceEngine ABC with generate(), stream(),
list_models(), and health() methods. The discovery subsystem probes
running engines and selects the best available backend based on
configuration and health checks.
Abstract Base Class¶
InferenceEngine¶
Bases: ABC
Base class for all inference engine backends.
Subclasses must be registered via @EngineRegistry.register("name") to become discoverable.
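The registry itself is not documented on this page. As a rough sketch of how a decorator-based registry like EngineRegistry.register could work (class and method names beyond `register` are assumptions, not the real implementation):

```python
# Hypothetical sketch of a decorator-based engine registry;
# the real EngineRegistry in openjarvis may differ.
from typing import Dict


class EngineRegistry:
    """Maps string keys to engine classes so discovery can enumerate them."""

    _engines: Dict[str, type] = {}

    @classmethod
    def register(cls, name: str):
        """Class decorator: file the decorated class under `name`."""
        def decorator(engine_cls: type) -> type:
            cls._engines[name] = engine_cls
            return engine_cls
        return decorator

    @classmethod
    def all(cls) -> Dict[str, type]:
        """Snapshot of everything registered so far."""
        return dict(cls._engines)


@EngineRegistry.register("dummy")
class DummyEngine:
    pass
```

After the decorator runs at import time, discovery can iterate `EngineRegistry.all()` and probe each registered class.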
Functions¶
generate (abstractmethod)¶
generate(messages: Sequence[Message], *, model: str, temperature: float = 0.7, max_tokens: int = 1024, **kwargs: Any) -> Dict[str, Any]
Synchronous completion — returns a dict with content and usage.
Source code in src/openjarvis/engine/_stubs.py
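To illustrate the generate() contract, here is a toy backend that echoes input instead of calling a model. The dict-based message type and the exact usage fields are assumptions inferred from the docstring, not the real Message type or return schema:

```python
# Toy backend illustrating the documented generate() contract:
# a synchronous call returning a dict with "content" and "usage".
from typing import Any, Dict, Sequence


class EchoEngine:
    """Echoes the last user message instead of calling a real model."""

    def generate(self, messages: Sequence[Dict[str, str]], *, model: str,
                 temperature: float = 0.7, max_tokens: int = 1024,
                 **kwargs: Any) -> Dict[str, Any]:
        last = messages[-1]["content"]
        n = len(last.split())  # crude stand-in for real token counting
        return {
            "content": f"echo: {last}",
            "usage": {"prompt_tokens": n, "completion_tokens": n},
        }


result = EchoEngine().generate(
    [{"role": "user", "content": "hello there"}], model="toy-model"
)
```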
stream (async, abstractmethod)¶
stream(messages: Sequence[Message], *, model: str, temperature: float = 0.7, max_tokens: int = 1024, **kwargs: Any) -> AsyncIterator[str]
Yield token strings as they are generated.
Source code in src/openjarvis/engine/_stubs.py
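Because stream() is an async generator, callers consume it with `async for`. A minimal sketch, with a stub engine standing in for a real backend:

```python
# Consuming stream() with `async for`; StubEngine yields canned tokens
# in place of a real backend.
import asyncio
from typing import AsyncIterator, Dict, Sequence


class StubEngine:
    """Stand-in engine whose stream() yields three canned tokens."""

    async def stream(self, messages: Sequence[Dict[str, str]], *,
                     model: str, **kwargs) -> AsyncIterator[str]:
        for token in ("Hel", "lo", "!"):
            yield token


async def collect() -> str:
    engine = StubEngine()
    chunks = []
    async for token in engine.stream(
        [{"role": "user", "content": "hi"}], model="toy"
    ):
        chunks.append(token)  # in a UI, each token would be rendered live
    return "".join(chunks)


text = asyncio.run(collect())
```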
list_models (abstractmethod)¶
health (abstractmethod)¶
close¶
EngineConnectionError¶
Bases: Exception
Raised when an engine is unreachable.
messages_to_dicts¶
messages_to_dicts(messages: Sequence[Message]) -> List[Dict[str, Any]]
Convert Message objects to OpenAI-format dicts.
Source code in src/openjarvis/engine/_base.py
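A plausible implementation, assuming Message carries `role` and `content` attributes (the real Message type may hold more fields):

```python
# Plausible shape of messages_to_dicts; the Message dataclass here is an
# assumption standing in for the real openjarvis Message type.
from dataclasses import dataclass
from typing import Any, Dict, List, Sequence


@dataclass
class Message:
    role: str
    content: str


def messages_to_dicts(messages: Sequence[Message]) -> List[Dict[str, Any]]:
    """Convert Message objects to OpenAI chat-format dicts."""
    return [{"role": m.role, "content": m.content} for m in messages]


msgs = [Message("system", "Be brief."), Message("user", "Hi")]
converted = messages_to_dicts(msgs)
```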
Engine Implementations¶
OllamaEngine¶
Bases: InferenceEngine
Ollama backend via its native HTTP API.
Source code in src/openjarvis/engine/ollama.py
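Ollama's native chat endpoint is `POST /api/chat`. A sketch of the request body such a backend would send; the endpoint and field names come from Ollama's public API, but how OllamaEngine actually builds the payload is an assumption:

```python
# Request payload shape for Ollama's native POST /api/chat endpoint.
# How OllamaEngine maps generate() kwargs onto it is an assumption.
import json

payload = {
    "model": "llama3",
    "messages": [{"role": "user", "content": "Hi"}],
    "stream": False,  # True would make Ollama stream newline-delimited JSON
    "options": {
        "temperature": 0.7,
        "num_predict": 1024,  # Ollama's name for a max-token cap
    },
}
body = json.dumps(payload)
```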
VLLMEngine¶
Bases: _OpenAICompatibleEngine
vLLM backend — thin wrapper over the shared OpenAI-compatible base.
Source code in src/openjarvis/engine/_openai_compat.py
LlamaCppEngine¶
Bases: _OpenAICompatibleEngine
llama.cpp server — OpenAI-compatible base.
Source code in src/openjarvis/engine/_openai_compat.py
SGLangEngine¶
Bases: _OpenAICompatibleEngine
SGLang backend — thin wrapper over the shared OpenAI-compatible base.
Source code in src/openjarvis/engine/_openai_compat.py
CloudEngine¶
estimate_cost¶
Estimate USD cost based on the hardcoded pricing table.
Source code in src/openjarvis/engine/cloud.py
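A minimal sketch of what a hardcoded-pricing estimator can look like; the model names, prices, and function signature here are illustrative assumptions, not the real table in cloud.py:

```python
# Hypothetical cost estimator over a hardcoded pricing table.
# Model names and USD rates are made up for illustration.
from typing import Dict, Tuple

# model -> (USD per 1M input tokens, USD per 1M output tokens)
PRICING: Dict[str, Tuple[float, float]] = {
    "example-small": (0.15, 0.60),
    "example-large": (3.00, 15.00),
}


def estimate_cost(model: str, prompt_tokens: int,
                  completion_tokens: int) -> float:
    """Estimate USD cost from token counts and the pricing table."""
    in_rate, out_rate = PRICING[model]
    return (prompt_tokens * in_rate + completion_tokens * out_rate) / 1_000_000
```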
Engine Discovery¶
Functions for probing running engines, aggregating available models, and selecting the best engine for a given configuration.
get_engine¶
get_engine(config: JarvisConfig, engine_key: str | None = None) -> Tuple[str, InferenceEngine] | None
Get a specific engine by key, or the default with fallback.
Returns (key, engine_instance) or None if no engine is available.
Source code in src/openjarvis/engine/_discovery.py
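The selection order the docstring implies — a specific key if requested, else the configured default, else any healthy engine — can be sketched like this. JarvisConfig and the health probing are stubbed out as a plain dict of already-healthy engines, so this is an illustration of the fallback logic, not the real get_engine:

```python
# Sketch of get_engine's implied key/default/fallback selection.
# `healthy` stands in for the result of probing; the real function
# takes a JarvisConfig and does the probing itself.
from typing import Dict, Optional, Tuple


def pick_engine(healthy: Dict[str, object], default_key: str,
                engine_key: Optional[str] = None
                ) -> Optional[Tuple[str, object]]:
    """Specific key if requested, else the default, else any healthy engine."""
    if engine_key is not None:
        inst = healthy.get(engine_key)
        return (engine_key, inst) if inst is not None else None
    if default_key in healthy:
        return (default_key, healthy[default_key])
    # Fall back to the first healthy engine, or None if there are none.
    return next(iter(healthy.items()), None)
```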
discover_engines¶
discover_engines(config: JarvisConfig) -> List[Tuple[str, InferenceEngine]]
Probe registered engines and return [(key, instance)] for healthy ones.
Results are sorted with the config default engine first.
Source code in src/openjarvis/engine/_discovery.py
discover_models¶
discover_models(engines: List[Tuple[str, InferenceEngine]]) -> Dict[str, List[str]]
Call list_models() on each engine and return a mapping from engine key to the model names that engine reports.
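Given the signature, a plausible shape of the aggregation (the FakeEngine here is a stand-in; only list_models() is part of the documented interface):

```python
# Plausible shape of discover_models: fan out list_models() across the
# healthy engines and collect the results per engine key.
from typing import Dict, List, Tuple


class FakeEngine:
    """Stand-in engine that reports a fixed model list."""

    def __init__(self, models: List[str]):
        self._models = models

    def list_models(self) -> List[str]:
        return self._models


def discover_models(engines: List[Tuple[str, FakeEngine]]
                    ) -> Dict[str, List[str]]:
    """Map each engine key to the models it reports."""
    return {key: engine.list_models() for key, engine in engines}


catalog = discover_models([
    ("ollama", FakeEngine(["llama3"])),
    ("vllm", FakeEngine(["qwen2"])),
])
```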