Overview

Chunkwise is an open-source platform for evaluating document chunking strategies and deploying Extract, Transform, Load (ETL) pipelines for Retrieval-Augmented Generation (RAG) systems. It lets developers systematically compare multiple industry-standard chunking strategies through chunk visualization, distribution statistics, and retrieval-based evaluation metrics. It also lets teams deploy an ingestion pipeline that chunks the documents in a knowledge base using a selected strategy, generates vector embeddings for the chunks, and stores the embeddings with their metadata in a vector database.

Figure: Overview of Chunkwise. Documents --> Chunking Insights --> Chunking Evaluation; Documents --> AI-Focused ETL Pipeline --> Vector Database.
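
To make the idea of comparing strategies by their distribution statistics concrete, here is a minimal sketch in plain Python. It is not Chunkwise's API: the function names `fixed_size_chunks` and `chunk_stats`, the character-based splitting, and the sample configurations are all assumptions made for illustration. It splits one document with two fixed-size configurations and reports basic chunk-length statistics for each.

```python
from statistics import mean, median


def fixed_size_chunks(text: str, size: int, overlap: int = 0) -> list[str]:
    """Split text into character windows of `size` characters.

    Consecutive windows share `overlap` characters. Hypothetical helper,
    not part of Chunkwise.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        # Stop once a window reaches the end of the text, so no trailing
        # chunk is fully contained in the previous one.
        if start + size >= len(text):
            break
        start += step
    return chunks


def chunk_stats(chunks: list[str]) -> dict[str, float]:
    """Summarize the chunk-length distribution (lengths in characters)."""
    if not chunks:
        return {"count": 0}
    lengths = [len(c) for c in chunks]
    return {
        "count": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
        "median": median(lengths),
    }


if __name__ == "__main__":
    # Illustrative document and configurations (assumed values).
    document = "Retrieval-Augmented Generation systems retrieve chunks. " * 20
    configs = {"small, no overlap": (100, 0), "large, 50 overlap": (300, 50)}
    for name, (size, overlap) in configs.items():
        print(name, chunk_stats(fixed_size_chunks(document, size, overlap)))
```

Statistics like these only describe chunk sizes. Judging which strategy actually retrieves better would still require the retrieval-based evaluation described above.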