Local Inference Engine — On-Device AI
Local AI inference via Ollama on the host machine: Qwen 2.5 3B for structured classification and nomic-embed-text for embeddings. CPU-only, zero API costs, zero network latency. Use for bulk classification and other high-volume tasks.
5 tools · local
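The tools below assume an Ollama server listening on its default port (11434) with both models already pulled. A minimal sketch of an availability check against Ollama's `/api/tags` endpoint, assuming the `qwen2.5:3b` and `nomic-embed-text` model tags, might look like this:

```python
import requests

OLLAMA_URL = "http://localhost:11434"          # default Ollama port; adjust if the host differs
REQUIRED = {"qwen2.5:3b", "nomic-embed-text"}  # assumed local model tags

def missing_models() -> set[str]:
    """Return the set of required models not yet pulled into the local Ollama store."""
    resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
    resp.raise_for_status()
    installed = {m["name"] for m in resp.json().get("models", [])}
    # Ollama tags may carry an implicit ":latest" suffix, so compare on the base name too.
    bases = {name.split(":")[0] for name in installed}
    return {m for m in REQUIRED if m not in installed and m.split(":")[0] not in bases}

if __name__ == "__main__":
    absent = missing_models()
    print("all models available" if not absent else f"missing: {', '.join(sorted(absent))}")
```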
Classify
Classify text into categories using the local Qwen 2.5 3B model. Returns structured JSON. Use for seriousness classification and similar labeling tasks.
Parameters: text, categories
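A minimal sketch of how such a classify call can be driven through Ollama's `/api/generate` endpoint; the `qwen2.5:3b` tag, prompt wording, and output shape are assumptions, not the tool's actual implementation. Setting `format: "json"` asks the model to emit valid JSON:

```python
import json
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

def classify(text: str, categories: list[str], model: str = "qwen2.5:3b") -> dict:
    """Ask the local model to pick one of `categories` for `text`, returning parsed JSON."""
    prompt = (
        "Classify the text into exactly one of these categories: "
        f"{', '.join(categories)}.\n"
        'Respond as JSON: {"category": "<one of the categories>", "confidence": <0-1>}\n\n'
        f"Text:\n{text}"
    )
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False, "format": "json"},
        timeout=120,
    )
    resp.raise_for_status()
    return json.loads(resp.json()["response"])  # "response" holds the model's raw output

print(classify("The patient reported severe chest pain.", ["serious", "routine"]))
```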
Embed
Generate a 768-dimensional embedding vector for text using the local nomic-embed-text model. Use for semantic similarity and concept matching.
Parameters: text
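A sketch of the underlying embedding call, assuming Ollama's `/api/embeddings` endpoint on the default localhost port; cosine similarity over the returned 768-dimensional vectors gives a semantic-similarity score:

```python
import math
import requests

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # Ollama embeddings endpoint

def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Return the embedding vector (768 floats for nomic-embed-text) for `text`."""
    resp = requests.post(OLLAMA_URL, json={"model": model, "prompt": text}, timeout=60)
    resp.raise_for_status()
    return resp.json()["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

v1, v2 = embed("myocardial infarction"), embed("heart attack")
print(f"dimensions: {len(v1)}, similarity: {cosine(v1, v2):.3f}")
```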
Summarize
Summarize text using the local Qwen 2.5 3B model. Use for case narrative summarization and literature abstract compression.
Parameters: text
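A hedged sketch of a summarize call over the same `/api/generate` endpoint; the prompt wording and sentence budget are illustrative, and the long timeout reflects CPU-only generation on large inputs:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

def summarize(text: str, max_sentences: int = 3, model: str = "qwen2.5:3b") -> str:
    """Return a short plain-text summary of `text` from the local model."""
    prompt = (
        f"Summarize the following text in at most {max_sentences} sentences. "
        "Keep only the essential facts.\n\n" + text
    )
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,  # CPU-only generation can be slow on long inputs
    )
    resp.raise_for_status()
    return resp.json()["response"].strip()

print(summarize("Long case narrative text goes here ..."))
```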
Extract Json
Extract structured JSON from unstructured text using the local Qwen 2.5 3B model. Provide a JSON schema and the model extracts matching fields.
Parameters: text, schema
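A sketch of JSON extraction, assuming the schema is embedded in the prompt while `format: "json"` enforces valid JSON output; recent Ollama releases can also accept a JSON schema object directly as the `format` value for stricter structured output:

```python
import json
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

def extract_json(text: str, schema: dict, model: str = "qwen2.5:3b") -> dict:
    """Extract fields described by `schema` (a JSON Schema dict) from free text."""
    prompt = (
        "Extract data from the text below and return JSON that conforms to this JSON Schema:\n"
        f"{json.dumps(schema)}\n\nText:\n{text}"
    )
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False, "format": "json"},
        timeout=120,
    )
    resp.raise_for_status()
    return json.loads(resp.json()["response"])

schema = {
    "type": "object",
    "properties": {"drug": {"type": "string"}, "dose_mg": {"type": "number"}},
    "required": ["drug", "dose_mg"],
}
print(extract_json("Patient was given 200 mg of ibuprofen twice daily.", schema))
```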
Status
Check local inference engine status — model availability, memory usage, and latency.
Parameters: none
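A sketch of a status probe, assuming Ollama's `/api/tags` and `/api/ps` endpoints (the latter lists loaded models and their memory footprint on recent Ollama versions) plus a one-token generation to measure round-trip latency:

```python
import time
import requests

OLLAMA_URL = "http://localhost:11434"  # default Ollama port

def status(model: str = "qwen2.5:3b") -> dict:
    """Report installed models, loaded models with memory use, and a one-token latency probe."""
    installed = [m["name"] for m in requests.get(f"{OLLAMA_URL}/api/tags", timeout=5).json()["models"]]
    # /api/ps lists currently loaded models and their memory footprint (recent Ollama versions).
    loaded = requests.get(f"{OLLAMA_URL}/api/ps", timeout=5).json().get("models", [])
    start = time.monotonic()
    requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": model, "prompt": "ping", "stream": False, "options": {"num_predict": 1}},
        timeout=120,
    ).raise_for_status()
    return {
        "installed": installed,
        "loaded": [{"name": m["name"], "size_bytes": m.get("size")} for m in loaded],
        "latency_s": round(time.monotonic() - start, 2),
    }

print(status())
```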