Walkthrough
Deployment
Chunkwise includes a CLI tool that automates infrastructure setup and deployment.
    git clone https://github.com/Chunkwise/chunkwise.git
    cd chunkwise/cli
    poetry install
    eval $(poetry env activate)
Deploy the infrastructure and run the client:
    typer cli.py run deploy
    typer cli.py run client-run
Client
The Chunkwise client provides an environment for comparing and evaluating chunking strategies before deploying them to production.
Experimentation Platform
The experimentation platform allows users to test and evaluate different chunking strategies before deployment.
Visualization
Users create a workflow, upload a document, select a chunker, and adjust its chunking configuration. The interface then displays how the document is split into chunks, showing chunk boundaries and distribution statistics.
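As a rough illustration of the kind of output this view summarizes, the Python sketch below computes chunk boundaries and size statistics for a document. It is not Chunkwise's implementation: the fixed-size character chunker and the names `chunk_fixed` and `chunk_stats` are assumptions made for the example, standing in for whichever chunker a workflow selects.

```python
from statistics import mean, pstdev


def chunk_fixed(text: str, size: int, overlap: int = 0) -> list[tuple[int, int]]:
    """Return (start, end) character offsets for fixed-size chunks.

    Hypothetical helper: a simple stand-in for a configurable chunker.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap
    bounds = []
    start = 0
    while start < len(text):
        end = min(start + size, len(text))
        bounds.append((start, end))
        if end == len(text):
            break  # the last chunk reached the end of the document
        start += step
    return bounds


def chunk_stats(bounds: list[tuple[int, int]]) -> dict:
    """Summarize chunk lengths (in characters) for a list of boundaries."""
    lengths = [end - start for start, end in bounds]
    if not lengths:
        return {"count": 0}
    return {
        "count": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
        "stdev": pstdev(lengths),
    }
```

Calling `chunk_stats(chunk_fixed(document, size=512, overlap=64))` with different `size` and `overlap` values shows how a configuration change shifts the boundaries and the size distribution.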
Evaluation
Users can evaluate the workflow’s retrieval performance. The platform benchmarks the chunking strategy against a ground truth dataset.
Comparison
Users can create multiple workflows with different chunking strategies. The comparison view displays differences in chunk metrics and retrieval performance across all selected workflows.
ETL Pipeline Deployment
After evaluating workflows, users can deploy them as ETL pipelines.
Deploy Workflow
Users select a workflow to deploy and provide their S3 bucket credentials. The platform provisions the ETL pipeline to ingest and process documents from the specified bucket. Once the database table is created in RDS, the platform returns an ARN that users can use to retrieve the RDS credentials required to access the chunks and embeddings.
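The text does not name the service behind the returned ARN. The sketch below assumes it identifies an AWS Secrets Manager secret whose value is a JSON document containing at least `username` and `password`. Both the service and the field names are assumptions, and `fetch_db_credentials` is a hypothetical helper, not part of Chunkwise.

```python
import json


def fetch_db_credentials(secret_arn: str, client=None) -> dict:
    """Fetch and parse database credentials stored under the given ARN.

    Assumption: the ARN refers to an AWS Secrets Manager secret holding JSON.
    """
    if client is None:
        # Imported lazily so the parsing logic can be exercised without AWS access.
        import boto3

        client = boto3.client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_arn)
    creds = json.loads(response["SecretString"])
    # Assumed field names; adjust to whatever the deployed secret contains.
    missing = {"username", "password"} - creds.keys()
    if missing:
        raise KeyError(f"secret is missing expected fields: {sorted(missing)}")
    return creds
```

With AWS credentials configured locally, `fetch_db_credentials(arn)` would return a dictionary that can be passed to a database client to query the chunks and embeddings. Passing a `client` explicitly makes the function easy to test with a stub.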