4 Books to Deepen Your Understanding of LLMs: Theory & Engineering
Large Language Models (LLMs) have transformed natural language processing, powering applications from chatbots to text generation and beyond.
Whether you’re looking to build an LLM from scratch, fine-tune existing models, or integrate them into real-world applications, understanding both the theoretical and engineering aspects is crucial.
This article explores four essential books that provide a deep dive into LLMs:
Build a Large Language Model (from Scratch) — A hands-on guide by Sebastian Raschka that walks you through designing, pretraining, and fine-tuning an LLM step by step.
Hands-on Large Language Models — A practical resource for leveraging pretrained LLMs for text classification, retrieval-augmented generation (RAG), and multimodal applications.
LLM Engineering Handbook — A deep dive into data engineering, model fine-tuning, and deploying LLMs with MLOps, covering real-time inference and optimization.
AI Engineering — A comprehensive look at building AI applications with foundation models, discussing prompt engineering, dataset preparation, and model evaluation.
Whether you’re a researcher, developer, or AI enthusiast, these books provide the knowledge and tools needed to navigate the evolving landscape of LLMs and apply them effectively in various domains.
My New E-Book: LLM Roadmap from Beginner to Advanced Level
I am pleased to announce that I have published my new ebook, LLM Roadmap from Beginner to Advanced Level. It provides all the resources you need to start your journey toward mastering LLMs.
1. Build a Large Language Model (from Scratch)
In Build a Large Language Model (from Scratch), bestselling author Sebastian Raschka guides you step by step through creating your own LLM. Each stage is explained with clear text, diagrams, and examples. You’ll go from the initial design and creation to pretraining on a general corpus, and on to fine-tuning for specific tasks.
Table of Contents:
Understanding Large Language Models
Working with Text Data
Coding Attention Mechanisms
Implementing a GPT Model from Scratch to Generate Text
Pretraining on Unlabeled Data
Fine-Tuning for Classification
Fine-Tuning to Follow Instructions
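The book implements these components in PyTorch; as a taste of what the attention chapter covers, here is a minimal NumPy sketch of single-head scaled dot-product attention. The function name and toy dimensions are my own, not the book's code:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (seq_len, seq_len) similarities
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                            # weighted sum of value vectors

# Toy self-attention: 3 tokens, 4-dimensional embeddings, Q = K = V = x
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (3, 4)
```

A full GPT-style block adds learned Q/K/V projections, a causal mask, and multiple heads; this sketch only shows the core weighted-averaging idea.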
2. Hands-on Large Language Models
Hands-on Large Language Models takes a visually driven approach, teaching readers the practical tools and concepts they need to put these capabilities to use today.
You’ll understand how to use pretrained large language models for use cases like copywriting and summarization; create semantic search systems that go beyond keyword matching; and use existing libraries and pretrained models for text classification, search, and clustering.
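The semantic search systems the book describes rank documents by embedding similarity rather than keyword overlap. A minimal sketch of that ranking step, using hand-made toy vectors in place of a real embedding model (in practice the vectors would come from a model such as one from the sentence-transformers library):

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_search(query_vec, doc_vecs, top_k=2):
    """Rank documents by cosine similarity to the query embedding."""
    scores = [cosine_sim(query_vec, d) for d in doc_vecs]
    order = np.argsort(scores)[::-1][:top_k]
    return [(int(i), scores[i]) for i in order]

# Toy 2-D "embeddings" for three documents and one query
docs = np.array([[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]])
query = np.array([0.9, 0.1])
print(semantic_search(query, docs))  # doc 0 ranks first
```

The same nearest-neighbor idea underlies the clustering and retrieval-augmented generation chapters; production systems swap the Python loop for an approximate-nearest-neighbor index.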
Table of Contents:
Part I: Understanding Language Models
1. An Introduction to Large Language Models
2. Tokens and Embeddings
3. Looking Inside Large Language Models
Part II: Using Pretrained Language Models
4. Text Classification
5. Text Clustering and Topic Modeling
6. Prompt Engineering
7. Advanced Text Generation Techniques and Tools
8. Semantic Search and Retrieval-Augmented Generation
9. Multimodal Large Language Models
Part III: Training and Fine-Tuning Language Models
10. Creating Text Embedding Models
11. Fine-Tuning Representation Models for Classification
12. Fine-Tuning Generation Models
3. LLM Engineering Handbook
Throughout the LLM Engineering Handbook, you will learn data engineering, supervised fine-tuning, and deployment. The hands-on approach to building the LLM Twin use case will help you implement MLOps components in your own projects.
You will also explore cutting-edge advancements in the field, including inference optimization, preference alignment, and real-time data processing, making this a vital resource for anyone applying LLMs in production.
By the end of this book, you will be proficient in deploying LLMs that solve practical problems while maintaining low-latency and high-availability inference capabilities.
Whether you are new to artificial intelligence or an experienced practitioner, this book delivers guidance and practical techniques that will deepen your understanding of LLMs and sharpen your ability to implement them effectively.
Table of Contents:
Understanding the LLM Twin Concept and Architecture
Tooling and Installation
Data Engineering
RAG Feature Pipeline
Supervised Fine-Tuning
Fine-Tuning with Preference Alignment
Evaluating LLMs
Inference Optimization
RAG Inference Pipeline
Inference Pipeline Deployment
MLOps and LLMOps
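The RAG feature and inference pipeline chapters combine retrieval with prompt construction. As a rough illustration of that flow (not the book's LLM Twin code), here is a self-contained sketch where a bag-of-characters counter stands in for a real embedding model and the final LLM call is omitted:

```python
import numpy as np

def embed(text):
    """Stand-in embedding: normalized bag-of-characters counts.
    A real pipeline would call an embedding model here."""
    vec = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query, corpus, top_k=1):
    """Return the top_k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: float(q @ embed(doc)), reverse=True)
    return ranked[:top_k]

def build_prompt(query, context_docs):
    """Assemble the augmented prompt that would be sent to the LLM."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = ["the cat sat", "llm inference"]
top = retrieve("cat", corpus, top_k=1)
print(build_prompt("Where is the cat?", top))
```

In the book's architecture, the retrieval step runs against a vector database populated by the feature pipeline, and the assembled prompt goes to a deployed, fine-tuned model.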
4. AI Engineering
In this book, author Chip Huyen discusses AI engineering: the process of building applications with readily available foundation models.
The book starts with an overview of AI engineering, explaining how it differs from traditional ML engineering and discussing the new AI stack.
The more AI is used, the more opportunities there are for catastrophic failures, and therefore, the more important evaluation becomes. This book discusses different approaches to evaluating open-ended models, including the rapidly growing AI-as-a-judge approach.
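In the AI-as-a-judge approach, a strong model scores another model's open-ended outputs against a rubric. A hedged sketch of the two pieces around the judge call (the prompt template and score format are illustrative, not the book's; the actual chat-completion request is left out):

```python
import re

# Illustrative rubric prompt; real judge prompts are more detailed
JUDGE_TEMPLATE = """You are an impartial judge. Rate the answer below
on a 1-5 scale for correctness and helpfulness.
Question: {question}
Answer: {answer}
Reply with a line of the form: Score: <1-5>"""

def build_judge_prompt(question, answer):
    """Fill the rubric template for one (question, answer) pair."""
    return JUDGE_TEMPLATE.format(question=question, answer=answer)

def parse_score(judge_reply):
    """Extract the 1-5 score from the judge model's free-text reply."""
    m = re.search(r"Score:\s*([1-5])", judge_reply)
    return int(m.group(1)) if m else None

prompt = build_judge_prompt("Is water wet?", "Yes, by common usage.")
print(parse_score("Score: 4"))  # 4
```

The robustness of `parse_score` matters in practice: judge models sometimes deviate from the requested format, so evaluations typically log and retry unparseable replies.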
AI application developers will discover how to navigate the AI landscape, including models, datasets, evaluation benchmarks, and the seemingly infinite number of use cases and application patterns.
You’ll learn a framework for developing an AI application, starting with simple techniques and progressing toward more sophisticated methods, and discover how to efficiently deploy these applications.
Table of Contents:
Introduction to Building AI Applications with Foundation Models
Understanding Foundation Models
Evaluation Methodology
Evaluate AI Systems
Prompt Engineering
RAG and Agents
Finetuning
Dataset Engineering
Inference Optimization
AI Engineering Architecture and User Feedback
Are you looking to start a career in data science and AI but don’t know how? I offer data science mentoring sessions and long-term career mentoring:
Mentoring sessions: https://lnkd.in/dXeg3KPW
Long-term mentoring: https://lnkd.in/dtdUYBrM