Privacy-First AI Infrastructure
A fully self-hosted personal AI assistant: no cloud, no API keys, no data leakage. It runs entirely on local hardware with a production-grade architecture.
Explore the project →
The Project
JuanaIA is a fully self-hosted personal AI assistant built on a microservices architecture with local LLM inference, semantic memory, RAG pipelines, and an autonomous agent loop. It depends on no external services: everything runs on an NVIDIA RTX 5090.
Inference
Qwen3-Omni 30B
vLLM · AWQ 4-bit · 134K ctx
NVIDIA RTX 5090 · 28GB VRAM
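As a rough sanity check on the figures above, the sketch below estimates how much memory the 4-bit weights of a 30B-parameter model would need. It is a back-of-the-envelope illustration only. It counts raw weight storage and ignores quantization scales, any layers left unquantized, activations, and the KV cache. The function name `weight_memory_gb` is hypothetical and not part of the project.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Raw weight storage in decimal gigabytes (assumption: no per-group overhead)."""
    return n_params * bits_per_weight / 8 / 1e9


# 30B parameters at 4 bits each, using the figures quoted on this page.
weights_gb = weight_memory_gb(30e9, 4)

# Whatever the weights do not use is left for KV cache, activations, and runtime.
headroom_gb = 28 - weights_gb

print(f"weights: {weights_gb:.1f} GB, headroom: {headroom_gb:.1f} GB")
```

Under these assumptions the weights take about 15 GB, leaving roughly 13 GB of the quoted 28 GB for the KV cache and runtime overhead.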
Memory
Semantic + Episodic
pgvector · BGE-M3 embeddings
Hybrid search · ParadeDB
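Hybrid search usually means merging a vector-similarity ranking with a keyword ranking. The page does not say how JuanaIA fuses the two, so the sketch below uses reciprocal rank fusion (RRF), a common choice, purely as an assumption. The memory IDs and the function name `reciprocal_rank_fusion` are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one ranking.

    Each list contributes 1 / (k + rank) to every ID it contains,
    so items ranked highly by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; break ties by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["m3", "m1", "m7"]   # hypothetical IDs from an embedding search
keyword_hits = ["m1", "m9", "m3"]  # hypothetical IDs from a keyword search

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Items that appear in both lists (`m1`, `m3`) end up ahead of items found by only one retriever, which is the behaviour a hybrid search is after.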
Agent Loop
Autonomous Planner
LangGraph · Kafka backbone
Replanning · Tool execution
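The plan, execute, replan cycle named above can be illustrated in plain Python. This is a simplified stand-in, not the project's LangGraph or Kafka code: `run_agent`, `planner`, and both tools are hypothetical, and a real planner would call the LLM rather than return a hard-coded plan.

```python
def run_agent(goal, plan_fn, tools, max_replans=2):
    """Run plan steps in order; ask for a fresh plan when a tool call fails."""
    results = []
    plan = plan_fn(goal, results, None)
    replans = 0
    while plan:
        tool_name, arg = plan.pop(0)
        try:
            results.append((tool_name, tools[tool_name](arg)))
        except Exception as err:
            if replans >= max_replans:
                raise RuntimeError(f"gave up after {replans} replans") from err
            replans += 1
            # Hand the failure back to the planner so it can pick another route.
            plan = plan_fn(goal, results, (tool_name, str(err)))
    return results


def planner(goal, results, failure):
    """Toy planner: try 'search' first, fall back to 'lookup' after a failure."""
    if failure is None:
        return [("search", goal)]
    return [("lookup", goal)]


def failing_search(query):
    raise TimeoutError("search backend unavailable")


tools = {"search": failing_search, "lookup": lambda q: f"notes about {q}"}
outcome = run_agent("coffee", planner, tools)
print(outcome)
```

Capping the number of replans keeps a persistently failing tool from looping forever, at the cost of surfacing an error to the caller.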
Interfaces
Multi-Channel
REST · SSE streaming · PWA
Voice: Whisper STT + XTTS-v2
Infrastructure
Self-Hosted Stack
Spring Cloud · Keycloak · n8n
OpenBao · Grafana · Docker
Roadmap
Coming in R4–R5
FLUX.1 image gen · Neo4j graph
Avatar (VRM) · TensorRT-LLM
The Builder
Senior Software Engineer and AI Infrastructure Architect based in Buenos Aires, Argentina. Sole developer, architect, and PM of JuanaIA — designing and building production-grade AI systems end to end.
Open to remote Senior Software Engineer, AI Infrastructure Engineer, or Backend Architect roles.
Find me at