Splitters Crate

Intelligent text splitting and parsing for Recoco.

This crate implements sophisticated text splitting strategies, primarily leveraging Tree-sitter to perform syntax-aware chunking of source code and structured documents.

Standard text splitters often break code in the middle of functions or classes, destroying context. recoco-splitters understands the syntax of the language it is processing, ensuring that chunks respect logical boundaries (e.g., keeping a whole function together).
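
The difference between the two approaches can be sketched in plain Rust. This is an illustration only, not the crate's implementation or API: `split_fixed`, `split_on_units`, and `line_units` are hypothetical names, and `line_units` is a deliberately crude stand-in (one top-level item per line) for the node ranges a Tree-sitter parse would provide.

```rust
/// Naive splitter: cuts every `max_len` bytes, ignoring structure.
/// Assumes ASCII input so every byte offset is a valid char boundary.
fn split_fixed(text: &str, max_len: usize) -> Vec<&str> {
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < text.len() {
        let end = (start + max_len).min(text.len());
        chunks.push(&text[start..end]);
        start = end;
    }
    chunks
}

/// Hypothetical stand-in for a parser: treats each non-empty line as one
/// syntax unit and returns its byte range. A real syntax-aware splitter
/// would take these ranges from a parsed syntax tree instead.
fn line_units(text: &str) -> Vec<(usize, usize)> {
    let mut units = Vec::new();
    let mut offset = 0;
    for line in text.split('\n') {
        if !line.trim().is_empty() {
            units.push((offset, offset + line.len()));
        }
        offset += line.len() + 1; // +1 for the '\n' consumed by split
    }
    units
}

/// Boundary-aware splitter: packs whole units into chunks of at most
/// `max_len` bytes and never cuts inside a unit. A unit longer than
/// `max_len` is emitted as its own oversized chunk in this sketch.
fn split_on_units<'a>(text: &'a str, units: &[(usize, usize)], max_len: usize) -> Vec<&'a str> {
    let mut chunks = Vec::new();
    let mut current: Option<(usize, usize)> = None;
    for &(start, end) in units {
        current = match current {
            // The unit still fits: extend the current chunk to cover it.
            Some((cs, _)) if end - cs <= max_len => Some((cs, end)),
            // It does not fit: flush the current chunk and start a new one.
            Some((cs, ce)) => {
                chunks.push(&text[cs..ce]);
                Some((start, end))
            }
            None => Some((start, end)),
        };
    }
    if let Some((cs, ce)) = current {
        chunks.push(&text[cs..ce]);
    }
    chunks
}

fn main() {
    let src = "fn a() { 1 }\nfn b() { 2 }\nfn c() { 3 }\n";
    // The fixed-size split cuts through the middle of `fn c`.
    println!("{:?}", split_fixed(src, 30));
    // The unit-aware split keeps every function intact.
    println!("{:?}", split_on_units(src, &line_units(src), 30));
}
```

With a 30-byte limit, the fixed-size split ends its first chunk partway through the third function, while the unit-aware split moves that function whole into the next chunk.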

To minimize binary size, Recoco feature-gates every language parser. Enable only what you need in your Cargo.toml.

[dependencies]
recoco-splitters = { version = "...", features = ["python", "rust"] }
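
| Feature | Language |
| --- | --- |
| all | all languages |
| c | C |
| c-sharp | C# |
| cpp | C++ |
| css | CSS |
| fortran | Fortran |
| go | Go |
| html | HTML |
| java | Java |
| javascript | JavaScript |
| json | JSON |
| kotlin | Kotlin |
| markdown | Markdown |
| php | PHP |
| python | Python |
| r | R |
| ruby | Ruby |
| rust | Rust |
| scala | Scala |
| solidity | Solidity |
| sql | SQL |
| swift | Swift |
| toml | TOML |
| typescript | TypeScript |
| xml | XML |
| yaml | YAML |
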
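
For readers unfamiliar with Cargo feature gating, the general mechanism can be sketched as follows. This shows how `#[cfg(feature = "...")]` removes code from a build; it is not taken from the crate's source. The function `parser_for` and the extension mapping are hypothetical; only the feature names `python` and `rust` come from the list above.

```rust
/// Hypothetical lookup from file extension to a compiled-in language.
/// Each match arm exists only when its Cargo feature is enabled, so a
/// build without the `python` feature contains no Python-specific code.
fn parser_for(extension: &str) -> Option<&'static str> {
    match extension {
        #[cfg(feature = "python")]
        "py" => Some("python"),
        #[cfg(feature = "rust")]
        "rs" => Some("rust"),
        // Anything not compiled in falls through to `None`.
        _ => None,
    }
}

fn main() {
    for ext in ["py", "rs", "txt"] {
        println!("{ext}: {:?}", parser_for(ext));
    }
}
```

Because the gated arms are removed at compile time, a language whose feature is disabled costs nothing in the final binary; the trade-off is that requests for that language must be handled as "unsupported" at runtime.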

The crate provides two splitters:

  • Recursive Character Splitter: Standard splitting on a hierarchy of separators (paragraphs, newlines, etc.).
  • Recursive Syntax Splitter: Tree-sitter-based splitting that respects code blocks and syntax nodes.
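
The recursive character strategy can be illustrated with a minimal sketch. It shows the general technique (try the coarsest separator first, recurse with finer ones only for pieces that are still too long) and is not the crate's implementation; `split_recursive` and `hard_cut` are hypothetical names. As a simplification, separators are dropped and small pieces are not merged back together.

```rust
/// Last resort when no separators remain: cut at `max_len` bytes,
/// backing up to the nearest UTF-8 character boundary.
fn hard_cut(text: &str, max_len: usize) -> Vec<&str> {
    let mut out = Vec::new();
    let mut rest = text;
    while rest.len() > max_len {
        let mut cut = max_len.max(1);
        while !rest.is_char_boundary(cut) {
            cut -= 1;
        }
        if cut == 0 {
            // `max_len` is smaller than the first character: keep it whole.
            cut = rest.chars().next().map_or(rest.len(), char::len_utf8);
        }
        let (head, tail) = rest.split_at(cut);
        out.push(head);
        rest = tail;
    }
    if !rest.is_empty() {
        out.push(rest);
    }
    out
}

/// Splits `text` into chunks of at most `max_len` bytes. `seps` is ordered
/// from coarsest to finest, e.g. paragraph break, newline, space.
fn split_recursive<'a>(text: &'a str, seps: &[&str], max_len: usize) -> Vec<&'a str> {
    if text.len() <= max_len {
        // Small enough already; empty pieces are discarded.
        return if text.is_empty() { Vec::new() } else { vec![text] };
    }
    match seps.split_first() {
        None => hard_cut(text, max_len),
        Some((sep, finer)) => text
            .split(*sep)
            .flat_map(|piece| split_recursive(piece, finer, max_len))
            .collect(),
    }
}

fn main() {
    let text = "alpha beta\n\ngamma delta epsilon\nzeta";
    let chunks = split_recursive(text, &["\n\n", "\n", " "], 12);
    println!("{chunks:?}");
}
```

In this example the first paragraph fits within the limit and stays whole, while the longer line in the second paragraph is broken down to individual words because no coarser separator yields small enough pieces.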

Licensed under Apache-2.0. See the main repository for details.