Walkthrough

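Chunkwise includes a CLI tool that automates infrastructure setup and deployment. Clone the repository, install the CLI's dependencies with Poetry, and activate the virtual environment: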

git clone https://github.com/Chunkwise/chunkwise.git
cd chunkwise/cli
poetry install
eval $(poetry env activate)

Deploy the infrastructure and run the client:

typer cli.py run deploy
typer cli.py run client-run

The Chunkwise client provides an experimentation platform for testing, comparing, and evaluating chunking strategies before deploying them to production.

Users create a workflow, upload a document, select a chunker, and adjust its chunking configuration. The interface then displays how the document is split into chunks, showing chunk boundaries and distribution statistics.
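To make "chunk boundaries and distribution statistics" concrete, here is a minimal sketch of a fixed-size character chunker and the kind of summary a visualization could report. It is an illustration only: the function names `chunk_fixed` and `chunk_stats`, the `chunk_size` and `overlap` parameters, and the chosen statistics are assumptions, not Chunkwise's actual chunkers or metrics.

```python
from statistics import mean, median


def chunk_fixed(text: str, chunk_size: int, overlap: int = 0) -> list[tuple[int, int]]:
    """Return (start, end) character offsets for fixed-size chunks (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    boundaries = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        boundaries.append((start, end))
        if end == len(text):
            break  # the last chunk reached the end of the document
        start += step
    return boundaries


def chunk_stats(boundaries: list[tuple[int, int]]) -> dict:
    """Summarize the chunk size distribution, measured in characters."""
    sizes = [end - start for start, end in boundaries]
    if not sizes:
        return {"count": 0}
    return {
        "count": len(sizes),
        "min": min(sizes),
        "max": max(sizes),
        "mean": mean(sizes),
        "median": median(sizes),
    }


boundaries = chunk_fixed("a" * 25, chunk_size=10, overlap=2)
print(boundaries)  # [(0, 10), (8, 18), (16, 25)]
print(chunk_stats(boundaries))
```

Returning offsets rather than chunk text keeps the boundaries easy to overlay on the original document, which is what a boundary view needs.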

Users can then evaluate the workflow’s retrieval performance: the platform benchmarks the chunking strategy against a ground truth dataset.
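The text does not specify which metrics the benchmark reports. As one plausible illustration, the sketch below scores retrieved chunks against ground-truth excerpts by character overlap, yielding recall and precision. The function names and the span-based representation are assumptions made for this example, not the platform's implementation.

```python
def spans_to_chars(spans: list[tuple[int, int]]) -> set[int]:
    """Expand (start, end) spans into the set of character positions they cover."""
    covered: set[int] = set()
    for start, end in spans:
        covered.update(range(start, end))
    return covered


def retrieval_scores(
    retrieved: list[tuple[int, int]], relevant: list[tuple[int, int]]
) -> dict:
    """Character-level recall and precision of retrieved chunks for one query.

    retrieved: spans of the chunks returned for the query (hypothetical input)
    relevant:  ground-truth spans that answer the query (hypothetical input)
    """
    retrieved_chars = spans_to_chars(retrieved)
    relevant_chars = spans_to_chars(relevant)
    overlap = retrieved_chars & relevant_chars
    recall = len(overlap) / len(relevant_chars) if relevant_chars else 0.0
    precision = len(overlap) / len(retrieved_chars) if retrieved_chars else 0.0
    return {"recall": recall, "precision": precision}


# Two retrieved chunks; the ground-truth excerpt overlaps only the first one.
print(retrieval_scores(retrieved=[(0, 10), (20, 30)], relevant=[(5, 15)]))
# {'recall': 0.5, 'precision': 0.25}
```

Scoring at the character level rather than per chunk means a strategy is not rewarded for returning large chunks that contain only a sliver of relevant text.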

Users can create multiple workflows with different chunking strategies. The comparison view displays differences in chunk metrics and retrieval performance across all selected workflows.

After evaluating workflows, users can deploy them as ETL pipelines.

Users select a workflow to deploy and provide their S3 bucket credentials. The platform provisions the ETL pipeline to ingest and process documents from the specified bucket. Once the database table is created in RDS, the platform returns an ARN that users can use to retrieve the RDS credentials required to access the chunks and embeddings.
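The text does not say which AWS service the returned ARN belongs to. Assuming it identifies an AWS Secrets Manager secret, which is a common place to store RDS credentials, retrieval could look roughly like the sketch below. The helper name `fetch_rds_credentials` is hypothetical, and the keys in the returned dictionary depend on how the secret was created.

```python
import json


def fetch_rds_credentials(secrets_client, secret_arn: str) -> dict:
    """Fetch and decode a JSON secret (assumption: the ARN is a Secrets Manager secret).

    secrets_client is expected to behave like boto3.client("secretsmanager").
    """
    response = secrets_client.get_secret_value(SecretId=secret_arn)
    secret_string = response.get("SecretString")
    if secret_string is None:
        raise ValueError("secret has no SecretString; binary secrets are not handled here")
    return json.loads(secret_string)


# Usage sketch (requires boto3 and AWS credentials, so it is left commented out):
#   import boto3
#   client = boto3.client("secretsmanager")
#   creds = fetch_rds_credentials(client, "<ARN returned by the platform>")
#   # creds typically includes "username" and "password"; other keys may vary.
```

Passing the client in as an argument keeps the helper easy to test with a stub and leaves region and credential configuration to the caller.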