Lemonade by AMD: a fast and open source local LLM server using GPU and NPU

https://news.ycombinator.com/rss Hits: 9
Summary

Refreshingly fast images LLMs on GPUs and NPUs Open source. Private. Ready in minutes on any PC. Chat What can I do with 128 GB of unified RAM? Load up models like gpt-oss-120b or Qwen-Coder-Next for advanced tool use. What should I tune first? You can use --no-mmap to speed up load times and increase context size to 64 or more. Image Generation A pitcher of lemonade in the style of a renaissance painting Speech Hello, I am your AI assistant. What can I do for you today? Open Source Built by the local AI community for every PC. Lemonade exists because local AI should be free, open, fast, and private. Join the community Built on the best inference engines Ecosystem Works with great apps. Lemonade is integrated in many apps and works out-of-box with hundreds more thanks to the OpenAI API standard. Tech Specs Built for practical local AI workflows. Everything from install to runtime is optimized for fast setup, broad compatibility, and local-first execution. Native C++ Backend Lightweight service that is only 2MB. One Minute Install Simple installer that sets up the stack automatically. OpenAI API Compatible Works with hundreds of apps out-of-box and integrates in minutes. Auto-configures for your hardware Configures dependencies for your GPU and NPU. Multi-engine compatibility Works with llama.cpp, Ryzen AI SW, FastFlowLM, and more. Multiple Models at Once Run more than one model at the same time. Cross-platform A consistent experience across Windows, Linux, and macOS (beta). Built-in app A GUI that lets you download, try, and switch models quickly. Unified API One local service for every modality. Point your app at Lemonade and get chat, vision, image gen, transcription, speech gen, and more with standard APIs. Chat Vision Image Gen Transcription Speech Gen POST /api/v1/chat/completions Latest Release Always improving. Track the newest improvements and highlights from the Lemonade release stream.

First seen: 2026-04-02 12:58

Last seen: 2026-04-02 21:04