Training Your Agent
Understand how training works — from crawling to embedding to indexing your content.
How Training Works
Training is the process of converting your raw content (web pages, files, Q&A pairs) into a searchable knowledge base that powers your agent's responses.
The Training Pipeline
Content Extraction
Fetchply crawls your sources and extracts clean text content. HTML tags, scripts, and styles are stripped. Only readable text is kept.
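To make the extraction step concrete, here is a minimal sketch using only Python's standard library. It is not Fetchply's extractor: the `TextExtractor` class and `extract_text` function are hypothetical names, and a production crawler would also handle navigation, boilerplate, and encoding issues that this sketch ignores.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects readable text and skips the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._parts.append(data.strip())

    def text(self):
        # Joining with a single space is a simplification: inline tags
        # next to punctuation can leave a stray space.
        return " ".join(self._parts)


def extract_text(html: str) -> str:
    """Return the readable text of an HTML document (illustrative only)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Calling `extract_text` on a page's HTML returns one string of readable text with the markup, scripts, and styles removed.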
Chunking
Long documents are split into segments of up to 512 tokens (~400 words). Each chunk is a self-contained piece of information that the agent can retrieve and reference.
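A rough sketch of fixed-size chunking follows. It assumes whitespace-separated words as a stand-in for tokens, which is only an approximation: a real tokenizer counts differently, and the source does not say whether Fetchply aligns chunk boundaries to sentences or headings. The function name `chunk_text` is hypothetical.

```python
from typing import List


def chunk_text(text: str, max_tokens: int = 512) -> List[str]:
    """Split text into consecutive chunks of at most max_tokens tokens.

    Assumption: one whitespace-separated word counts as one token.
    """
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]
```

Under this approximation, a 1,100-word document yields three chunks: two full chunks of 512 words and a final chunk of 76.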
Embedding
Each chunk is converted into a 1024-dimensional vector using Voyage AI. These vectors capture the semantic meaning of the text — similar concepts produce similar vectors.
Embeddings are processed in batches of 128 for efficiency.
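The batching described above can be sketched as follows. The `embed_batch` argument is a hypothetical callable standing in for whatever embedding client is used; it is not the Voyage AI SDK, and the only details taken from this page are the batch size of 128 and the 1024-dimensional output.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def embed_in_batches(
    chunks: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[Vector]],
    batch_size: int = 128,
) -> List[Vector]:
    """Embed chunks in groups of batch_size, preserving input order.

    embed_batch is a placeholder: it must return one vector per input text.
    """
    vectors: List[Vector] = []
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        vectors.extend(embed_batch(batch))
    return vectors
```

With 300 chunks, this makes three calls to `embed_batch`, with 128, 128, and 44 texts respectively.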
Indexing
Vectors are stored in a high-performance vector database. When a visitor asks a question, Fetchply searches for the most similar vectors and returns the corresponding text chunks as context for the AI response.
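As an illustration of similarity search, the sketch below ranks stored chunks by cosine similarity to a query vector. Two assumptions are worth stating: this page does not name the similarity metric or the database, so cosine similarity is simply a common choice, and the brute-force scan here replaces the approximate nearest-neighbor index a real vector database would use. The names `cosine_similarity` and `top_k` are hypothetical, and the example vectors are 3-dimensional rather than 1024-dimensional for readability.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vector: Sequence[float],
    index: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[str]:
    """Return the text of the k chunks most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy index of (chunk text, vector) pairs.
index = [
    ("refund policy", [1.0, 0.0, 0.0]),
    ("shipping times", [0.0, 1.0, 0.0]),
    ("returns", [0.9, 0.1, 0.0]),
]
nearest = top_k([1.0, 0.0, 0.0], index, k=2)
```

Here `nearest` contains "refund policy" followed by "returns", the two chunks whose vectors point most nearly in the same direction as the query.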
Training Duration
| Content Size | Estimated Time |
|---|---|
| 10–50 pages | 1–2 minutes |
| 50–200 pages | 2–5 minutes |
| 200+ pages | 5–15 minutes |
You can navigate away from the training page — training continues in the background. You'll be notified when it completes.
Monitoring Progress
The training page shows real-time progress:
- Pages discovered and crawled
- Chunks created and embedded
- Errors encountered (if any)
- Estimated time remaining
Starting Training via API
curl -X POST https://fetchply.com/api/v1/agents/YOUR_AGENT_ID/train \
  -H "Authorization: Bearer fp_your_api_key"

Only one training job can run per agent at a time. Starting a new training job while one is active returns a 409 Conflict response.
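The same request can be issued from Python, with the 409 Conflict case handled explicitly. This is a sketch using the standard library: the endpoint, the POST method, and the bearer header come from the curl command above, while the function names, the boolean return value, and the injectable `send` parameter are assumptions made for illustration. The shape of a successful response body is not documented here, so the sketch does not parse it.

```python
import urllib.error
import urllib.request

BASE_URL = "https://fetchply.com/api/v1"


def build_train_request(agent_id: str, api_key: str) -> urllib.request.Request:
    """Build the POST request that starts training for one agent."""
    return urllib.request.Request(
        f"{BASE_URL}/agents/{agent_id}/train",
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def start_training(agent_id: str, api_key: str, send=urllib.request.urlopen) -> bool:
    """Return True if the request was accepted, False on 409 Conflict.

    send is injectable so the function can be exercised without network
    access; by default it is urllib.request.urlopen.
    """
    request = build_train_request(agent_id, api_key)
    try:
        with send(request):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 409:
            # A training job is already running for this agent.
            return False
        raise  # Any other HTTP error is left to the caller.
```

A caller might treat a `False` result as "wait and try again later" rather than as a failure, since it only means a job is already in progress.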
Source-Scoped Training
When you add a single URL or upload a file, Fetchply can train just that source without rebuilding the entire knowledge base. This is called additive training: it is faster and leaves the rest of the knowledge base untouched.
Full retraining, which rebuilds the entire knowledge base, runs when you click Retrain or when automatic retraining is set up.
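The difference between the two modes can be pictured with a toy model in which the knowledge base is a dictionary from source to its processed chunks. This is a conceptual sketch only, not a description of Fetchply's storage: `train_source`, `retrain_all`, and the `process` callable are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

KnowledgeBase = Dict[str, List[str]]


def train_source(
    kb: KnowledgeBase, source: str, process: Callable[[str], List[str]]
) -> KnowledgeBase:
    """Additive training: process one source and leave the others as they are."""
    kb[source] = process(source)
    return kb


def retrain_all(
    sources: Iterable[str], process: Callable[[str], List[str]]
) -> KnowledgeBase:
    """Full retraining: process every source and build a fresh knowledge base."""
    return {source: process(source) for source in sources}
```

In this model, adding one source to a knowledge base of 200 costs a single `process` call with `train_source`, while `retrain_all` costs 201.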