Semantic Chunking
If you've ever built a RAG (Retrieval-Augmented Generation) pipeline, you know that chunking is one of the hardest parts to get right. If you cut a sentence in half, each fragment loses part of its meaning, and retrieval quality suffers.
Infinite Chunking Engine
Libro abstracts this away. You can pass a 100,000-word document to ingest() in a single call, and the engine takes care of the following:
- Sentence Boundary Detection: We split the text precisely at periods, newlines, and semantic shifts.
- Overlap Buffering: We automatically overlap chunks by 15% so context isn't lost between boundaries.
- Parallel Vectorization: The chunks are vectorized in parallel across our edge nodes.
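
To make the first two steps concrete, here is a minimal sketch of sentence-aligned chunking with overlap. It is not Libro's internal implementation: the helper names (splitSentences, chunkWithOverlap), the word-based size limit, and the default maxWords value are assumptions made for illustration. The sketch only splits at sentence punctuation and newlines (it does not detect semantic shifts, and it will also split inside decimals such as "3.14"), and the overlap is approximated by carrying whole trailing sentences into the next chunk, up to about 15% of the word budget.

```typescript
// Illustrative only: hypothetical helpers, not part of the Libro API.

/** Split text into sentences at '.', '!', '?' and at newlines. */
function splitSentences(text: string): string[] {
  // Each match is a run of non-terminator characters plus any trailing punctuation.
  const parts: string[] = text.match(/[^.!?\n]+[.!?]*/g) ?? [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

/**
 * Pack whole sentences into chunks of roughly `maxWords` words. Each new chunk
 * starts with trailing sentences carried over from the previous chunk, as long
 * as they fit within `overlapRatio` of the word budget.
 */
function chunkWithOverlap(
  text: string,
  maxWords: number = 200, // assumed default, chosen for the example
  overlapRatio: number = 0.15
): string[] {
  const countWords = (s: string): number =>
    s.split(/\s+/).filter((w) => w.length > 0).length;

  const overlapBudget = Math.floor(maxWords * overlapRatio);
  const chunks: string[] = [];
  let current: string[] = [];
  let currentWords = 0;
  let fresh = 0; // sentences in `current` that are not carried-over overlap

  for (const sentence of splitSentences(text)) {
    const n = countWords(sentence);

    // Close the chunk once the next sentence would exceed the budget.
    if (fresh > 0 && currentWords + n > maxWords) {
      chunks.push(current.join(" "));

      // Walk backwards and carry over whole sentences within the overlap budget.
      const carried: string[] = [];
      let carriedWords = 0;
      for (const prev of [...current].reverse()) {
        const w = countWords(prev);
        if (carriedWords + w > overlapBudget) break;
        carried.unshift(prev);
        carriedWords += w;
      }
      current = carried;
      currentWords = carriedWords;
      fresh = 0;
    }

    current.push(sentence);
    currentWords += n;
    fresh++;
  }

  // Emit the final chunk only if it holds something beyond the overlap.
  if (fresh > 0) chunks.push(current.join(" "));
  return chunks;
}
```

Because sentences are never cut, a chunk can run slightly over maxWords when the carried-over overlap plus the next sentence exceeds the budget, and a single sentence longer than maxWords becomes its own oversized chunk. A production chunker would also need a fallback for those cases.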
Example
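// `ctx` is assumed to be an already-initialized Libro context, and
// `massiveStringChapterOne` a string holding the full chapter text.
// You can pass a whole book chapter in one call.
await ctx.ingest({
  userId: "author_1",
  content: massiveStringChapterOne
});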