<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Llm on Julien.cloud</title><link>https://julien.cloud/tags/llm/</link><description>Recent content in Llm on Julien.cloud</description><generator>Hugo -- gohugo.io</generator><language>en</language><copyright>© 2026 Julien</copyright><lastBuildDate>Tue, 09 Jun 2026 12:00:00 +0200</lastBuildDate><atom:link href="https://julien.cloud/tags/llm/index.xml" rel="self" type="application/rss+xml"/><item><title>Running Ollama on a Jetson Orin Nano: From Gemma 3 to Gemma 4 with GPU Acceleration</title><link>https://julien.cloud/blog/jetson-nano-ollama-edge-inference/</link><pubDate>Tue, 09 Jun 2026 12:00:00 +0200</pubDate><guid>https://julien.cloud/blog/jetson-nano-ollama-edge-inference/</guid><description>The journey from Gemma 3 4B (17.5 tok/s CPU) to Gemma 4 E2B (25.5 tok/s GPU) on the Jetson Orin Nano. Covers model testing, QAT quantization, the JetPack CUDA rabbit hole, CMA traps, and the keepalive architecture that makes it all work.</description></item><item><title>LLM Gateway for OpenCode: Building a Local LiteLLM Router</title><link>https://julien.cloud/blog/llm-gateway-for-opencode-building-a-local-litellm-router/</link><pubDate>Sun, 07 Jun 2026 12:00:00 +0200</pubDate><guid>https://julien.cloud/blog/llm-gateway-for-opencode-building-a-local-litellm-router/</guid><description>27 models from 5 providers in LiteLLM, exposed to OpenCode through smart routers that pick the right model tier by prompt content, not context size. Runs locally via Docker with caching, spend tracking, and one endpoint.</description></item><item><title>OpenCode Go: Can $10/Month Open Models Replace Frontier APIs?</title><link>https://julien.cloud/blog/opencode-go-models-2026/</link><pubDate>Sat, 30 May 2026 23:30:00 +0200</pubDate><guid>https://julien.cloud/blog/opencode-go-models-2026/</guid><description>12 open coding models benchmarked against Claude and GPT-5.5.
DeepSeek V4 Flash handles 70% of tasks at one-twelfth the cost of DeepSeek V4 Pro. MiMo-V2.5 is now the cheapest high-volume option at 30,100 req/5h. Qwen3.7 Max leads on SWE-bench Pro (60.6%). Kimi K2.6 leads on agentic coding. Here's how to route between them.</description></item><item><title>Unveiling the World of AI Chatbots: A Diverse Exploration</title><link>https://julien.cloud/blog/unveiling-the-world-of-ai-chatbots-a-diverse-exploration/</link><pubDate>Mon, 03 Mar 2025 20:16:48 +0000</pubDate><guid>https://julien.cloud/blog/unveiling-the-world-of-ai-chatbots-a-diverse-exploration/</guid><description>Beyond ChatGPT: a curated list of 20+ AI chatbot platforms covering frontier models, research tools, and developer-focused interfaces with their unique strengths.</description></item><item><title>Boost Your AI Workflow: A Guide to Using Ollama, OpenwebUI, and Continue</title><link>https://julien.cloud/blog/boost-your-ai-workflow-with-ollama-openwebui-and-continue/</link><pubDate>Thu, 25 Jul 2024 18:56:33 +0000</pubDate><guid>https://julien.cloud/blog/boost-your-ai-workflow-with-ollama-openwebui-and-continue/</guid><description>Run local LLMs with Ollama, manage conversations via OpenwebUI, and get AI code completion in VS Code with Continue. A complete local AI stack setup guide.</description></item><item><title>Leveraging Fabric and LM Studio for Advanced AI</title><link>https://julien.cloud/blog/leveraging-fabric-and-lm-studio-for-advanced-ai/</link><pubDate>Thu, 06 Jun 2024 20:28:43 +0000</pubDate><guid>https://julien.cloud/blog/leveraging-fabric-and-lm-studio-for-advanced-ai/</guid><description>How to run Fabric with local models through LM Studio for custom AI patterns and workflows. Setup, integration, and practical use cases for prompt-based automation.</description></item></channel></rss>