So, you want to chunk really fast?


we’ve been working on chonkie, a chunking library for RAG pipelines, and at some point we started benchmarking on wikipedia-scale datasets. that’s when things started feeling… slow. not unbearably slow, but slow enough that we started wondering: what’s the theoretical limit here? how fast can text chunking actually get if we throw out all the abstractions and go straight to the metal? this post is about that rabbit hole, and how we ended up building memchunk.

what even is chunking?

if you’re building anything with LLMs and retrieval, you’ve probably dealt with this: you have a massive pile of text, and you need to split it into smaller pieces that fit into embedding models or context windows. the naive approach is to split every N characters. but that’s dumb: you end up cutting sentences in half, and your retrieval quality tanks. the smart approach is to split at semantic boundaries: periods, newlines, question marks. stuff that actually indicates “this thought is complete.”

    "Hello world. How are you?" → ["Hello world.", " How are you?"]

why delimiters are enough

there are fancy chunking strategies out there: sentence splitters, recursive chunkers, semantic chunkers that use embeddings. but for most use cases, the key thing is just not cutting sentences in half. token-based and character-based chunkers don’t care where sentences end. they just split at N tokens or N bytes. that means you get chunks like:

    "The quick brown fox jumps over the la"
    "zy dog."

the embedding for that first chunk is incomplete. it’s a sentence fragment. delimiter-based chunking avoids this. if you split on . and ? and \n, your chunks end at natural boundaries. you don’t need NLP pipelines or embedding models to find good split points, just byte search. simple concept. but doing it fast? that’s where things get interesting.

enter memchr

the memchr crate by Andrew Gallant is the foundation here. it’s a byte search library with multiple layers of optimization.
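to make the delimiter idea concrete, here’s a minimal std-only sketch. this is not memchunk’s actual code, and `chunk_at_delimiters` is a name made up for illustration; a real implementation would replace the naive byte loop with `memchr::memchr3(b'.', b'?', b'\n', haystack)`.

```rust
/// Split `text` after every '.', '?', or '\n', so chunks end at
/// natural sentence boundaries. ASCII delimiters are single bytes,
/// so slicing at these indices is always UTF-8 safe.
fn chunk_at_delimiters(text: &str) -> Vec<&str> {
    let bytes = text.as_bytes();
    let mut chunks = Vec::new();
    let mut start = 0;
    for (i, &b) in bytes.iter().enumerate() {
        if b == b'.' || b == b'?' || b == b'\n' {
            chunks.push(&text[start..=i]); // keep the delimiter with its chunk
            start = i + 1;
        }
    }
    if start < text.len() {
        chunks.push(&text[start..]); // trailing text with no final delimiter
    }
    chunks
}

fn main() {
    let chunks = chunk_at_delimiters("Hello world. How are you?");
    assert_eq!(chunks, vec!["Hello world.", " How are you?"]);
}
```

the point of the post is that even this loop, which looks trivially cheap, is far from the hardware’s limit: it inspects one byte per iteration.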
the fallback: SWAR

even without...
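the idea behind a SWAR (SIMD-within-a-register) fallback is to test 8 bytes per comparison using plain integer math instead of vector instructions. here’s a sketch of the classic zero-byte trick it builds on; `swar_contains` is a hypothetical helper for illustration, not memchr’s actual API.

```rust
const LO: u64 = 0x0101_0101_0101_0101; // 0x01 broadcast to every byte lane
const HI: u64 = 0x8080_8080_8080_8080; // high bit of every byte lane

/// Returns true if any of the 8 bytes in `word` equals `needle`.
fn swar_contains(word: u64, needle: u8) -> bool {
    // XOR against the broadcast needle: matching bytes become 0x00.
    let x = word ^ (LO * needle as u64);
    // Classic zero-byte detection: a byte of `x` is zero iff the
    // subtraction borrows into its high bit while `x`'s own high bit is clear.
    x.wrapping_sub(LO) & !x & HI != 0
}

fn main() {
    let w = u64::from_le_bytes(*b"Hello. w");
    assert!(swar_contains(w, b'.'));
    assert!(!swar_contains(w, b'?'));
}
```

loading 8 bytes into a `u64` and running this test means one branch per word instead of one per byte, which is why it beats the naive loop even with no SIMD at all.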
