Most AI assistants depend on paid API services. For an independent studio like AJ APPLICATIONS, paying per token is not sustainable. So we built AJ AI — a fully local, self-hosted AI assistant that runs on your own hardware with zero dependency on external AI providers.
In Development — Not Yet Hosted
AJ AI is functional locally but requires a powerful server (GPU preferred) to host publicly. Server costs exceed our current budget. Support us to get it live.
Why Local AI?
With AJ AI, the model runs locally via Ollama — an open-source runtime that serves LLaMA, Mistral, and other open models directly on a machine you control. Every conversation stays on your hardware: no token costs, no privacy concerns.
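Ollama exposes the model through a small HTTP API on localhost (port 11434 by default). A minimal sketch of how an application can call it from Python using only the standard library — the model name `llama3` is an illustrative assumption, not necessarily what AJ AI ships with:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(model: str, prompt: str) -> dict:
    # Minimal request body for /api/generate; stream=False asks for a
    # single JSON response instead of a stream of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def ask(model: str, prompt: str) -> str:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Ollama returns the generated text in the "response" field.
        return json.loads(resp.read())["response"]
```

Because the endpoint is plain HTTP on localhost, any backend language can talk to it — no SDK or API key involved.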
Core Features
Local Model Runtime
Ollama and GPT4All run LLaMA, Mistral, and other open-source models.
Internet Learning
Fetches live data from Wikipedia, DuckDuckGo, SerpAPI, and YouTube transcripts.
Voice Chat UI
Browser speech APIs handle voice input and output.
Persistent Memory
Remembers previous sessions and preferences across conversations.
Dynamic Roles
Switch between assistant, tutor, creative writer, coder, and more.
Fully Private
Nothing leaves your server. No API keys required for core AI.
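Role switching can be as simple as prepending a system-style instruction to the user's message before it reaches the model. A sketch of the idea — the role names come from the feature list above, but the exact prompt wording here is an assumption:

```python
# Hypothetical role prompts; AJ AI's actual wording may differ.
ROLE_PROMPTS = {
    "assistant": "You are a helpful general-purpose assistant.",
    "tutor": "You are a patient tutor. Explain concepts step by step.",
    "coder": "You are an expert programmer. Prefer concise, working code.",
    "creative writer": "You are a creative writer with a vivid, original voice.",
}

def inject_role(role: str, message: str) -> str:
    # Unknown roles fall back to the plain assistant persona.
    system = ROLE_PROMPTS.get(role, ROLE_PROMPTS["assistant"])
    return f"{system}\n\nUser: {message}\nAssistant:"
```

Switching roles mid-session is then just a matter of changing which key is used for the next request.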
The Stack
- Ollama — runs the language model locally, exposes a REST API on localhost
- Flask (Python) — API layer handling requests, memory, role injection, and web searches
- Wikipedia API and DuckDuckGo — real-time knowledge augmentation
- SerpAPI — structured search results on demand
- YouTube Transcript API — so AJ AI can summarize YouTube videos by URL
- Browser Speech API — voice input and output in the frontend
AJ AI uses Retrieval-Augmented Generation (RAG) — the same pattern used by enterprise AI systems — implemented entirely without paid infrastructure.
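The core of RAG is the "augmented" step: retrieved text is placed directly into the prompt so the model answers from fresh sources rather than stale training data. A minimal sketch using Wikipedia's public REST summary endpoint as one retrieval source — the prompt template itself is an assumption:

```python
import json
import urllib.parse
import urllib.request

def fetch_wikipedia_summary(topic: str) -> str:
    # Wikipedia's public REST summary endpoint; one possible retrieval source.
    url = ("https://en.wikipedia.org/api/rest_v1/page/summary/"
           + urllib.parse.quote(topic))
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read()).get("extract", "")

def build_rag_prompt(question: str, snippets: list[str]) -> str:
    # Number the retrieved snippets and instruct the model to stay
    # grounded in them — the essence of retrieval-augmented generation.
    context = "\n\n".join(f"Source {i + 1}: {s}" for i, s in enumerate(snippets))
    return ("Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}")
```

The same builder works for snippets from DuckDuckGo, SerpAPI, or YouTube transcripts; only the retrieval function changes.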
Hosting Challenge
Running a local LLM requires at least 8–16 GB of RAM and ideally a GPU. Cloud instances powerful enough for Ollama cost significantly more than standard hosting. The code is complete — hosting is purely a funding issue.