From zero to a RAG system: successes and failures


A few months ago I was tasked with creating an internal tool for the company's engineers: a chat application backed by a local LLM. Nothing extraordinary so far. Then the requirements came in: it had to respond fast (and I mean fast!), and it also had to answer questions about every project the company has done throughout its entire history, almost a decade's worth. They didn't want a traditional search engine, but a tool where you could ask questions in natural language and get answers with references to the original documents, with an emphasis on information from OrcaFlex files (OrcaFlex is a simulation package for floating-body dynamics, cables, and similar systems, widely used in the offshore industry).

It already seemed complex, and that was confirmed when I was given access to 1 TB of projects mixed with technical documentation, reports, analyses, regulations, CSVs, and more. The emotional roller coaster had begun. I'll tell you upfront that it was neither a quick nor an easy process, which is exactly why I'd like to share it: from the first attempts and mistakes to the final architecture that ended up in production. I should also point out that I had never done anything like this before and didn't know how a RAG system worked either. We'll go problem by problem, with the solution I applied to each one.

Problem 1: selecting the right technology

The first step was to define the stack. I needed a local language model, with no reliance on external APIs, for confidentiality reasons. Ollama emerged as the most mature and easiest-to-use option for running LLaMA models locally. I tried several embedding models, and nomic-embed-text offered good performance and quality on technical documents. Next came a RAG engine to orchestrate document indexing, embedding generation, vector database storage, and queries. Without it, no matter how fast the language model is, we couldn't retrieve the relevant information from the documents.
Think of it like a book's index: without it, you'd have to read the entire book to find the information.
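To make the index analogy concrete, here is a minimal sketch of the retrieval half of a RAG pipeline. This is illustrative, not the production code: `embed_fn`, `index_chunks`, and `retrieve` are names I'm using here, and `embed_fn` stands in for whatever embedding backend you plug in (in my case, nomic-embed-text served locally by Ollama). The demo uses a toy bag-of-words "embedding" so the snippet runs on its own.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]

def cosine_similarity(a: Vector, b: Vector) -> float:
    """Angle-based similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def index_chunks(chunks: Sequence[str],
                 embed_fn: Callable[[str], Vector]) -> List[Tuple[str, Vector]]:
    """Embed every chunk once, up front -- this is the 'book index'."""
    return [(chunk, embed_fn(chunk)) for chunk in chunks]

def retrieve(query: str,
             index: List[Tuple[str, Vector]],
             embed_fn: Callable[[str], Vector],
             k: int = 3) -> List[str]:
    """Embed the query and return the k most similar chunks."""
    q = embed_fn(query)
    ranked = sorted(index, key=lambda cv: cosine_similarity(q, cv[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# Toy demo: a bag-of-words stand-in for a real embedding model,
# just so the sketch is self-contained.
vocab = ["orcaflex", "mooring", "report", "cable"]
toy_embed = lambda text: [float(text.lower().count(w)) for w in vocab]

chunks = ["OrcaFlex mooring analysis for project A",
          "Quarterly financial report",
          "Cable dynamics study in OrcaFlex"]
index = index_chunks(chunks, toy_embed)
print(retrieve("orcaflex mooring", index, toy_embed, k=1))
# → ['OrcaFlex mooring analysis for project A']
```

In the real system, `embed_fn` calls the embedding model through Ollama's local HTTP API, and the (chunk, vector) pairs live in a vector database instead of an in-memory list, but the shape of the retrieval step is exactly this.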
