Hands-On LangChain for LLM Applications Development: Documents Splitting [Part 2]

Dec 01, 2023


Once you’ve loaded documents, you’ll often want to transform them to better suit your application. The simplest example is splitting a long document into smaller chunks that fit into your model’s context window.

When you work with long pieces of text, you need to split that text into chunks. As simple as this sounds, there is a lot of potential complexity here. Ideally, you want to keep semantically related pieces of text together.
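
To make the idea concrete, here is a minimal plain-Python sketch of chunking that tries to keep related text together. It is not LangChain's implementation: the function name `split_text`, the `chunk_size` parameter, and the assumption that a blank line marks a paragraph boundary are all illustrative choices. Paragraphs are packed greedily into chunks, and a paragraph is only cut mid-text when it is longer than the limit on its own.

```python
def split_text(text: str, chunk_size: int = 200) -> list[str]:
    """Greedily pack paragraphs into chunks of at most chunk_size characters.

    Illustrative helper (hypothetical, not a LangChain API).
    """
    # Assumption: paragraphs are separated by blank lines.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        # Try to keep this paragraph in the same chunk as the previous ones.
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        # The paragraph does not fit: close the current chunk first.
        if current:
            chunks.append(current)
            current = ""
        # Fallback: a paragraph longer than chunk_size is cut at fixed offsets.
        while len(para) > chunk_size:
            chunks.append(para[:chunk_size])
            para = para[chunk_size:]
        current = para
    if current:
        chunks.append(current)
    return chunks


sample = "First topic, sentence one.\n\nFirst topic, sentence two.\n\nA second topic."
for chunk in split_text(sample, chunk_size=60):
    print(repr(chunk))
```

Measuring size in characters is a simplification; a model's context window is counted in tokens, so a real splitter would usually measure length with a tokenizer instead.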

LangChain has a number of built-in document transformers that make it easy to split, combine, filter, and otherwise manipulate documents. In this two-part practical article, we explore why document splitting matters, survey the available LangChain text splitters, and look at four of them in depth.

Why do we need document splitting? [Covered in Part 1]
Different types of LangChain splitters [Covered in Part 1]
Introduction to recursive character text splitter & the character text splitter [Covered in Part 1]
Diving deep into recursive splitting [Covered in Part 1]
PDF loading & splitting
Token splitting
Context-aware splitting

To Data & Beyond

