The journey from Gemma 3 4B (17.5 tok/s CPU) to Gemma 4 E2B (25.5 tok/s GPU) on the Jetson Orin Nano. Covers model testing, QAT quantization, the JetPack CUDA rabbit hole, CMA traps, and the keepalive architecture that makes it all work.
Beyond ChatGPT: a curated list of 20+ AI chatbot platforms covering frontier models, research tools, and developer-focused interfaces, each with its own strengths.
Run local LLMs with Ollama, manage conversations via Open WebUI, and get AI code completion in VS Code with Continue. A complete local AI stack setup guide.
How to run Fabric with local models through LM Studio for custom AI patterns and workflows. Setup, integration, and practical use cases for prompt-based automation.