Train Your AI
Teach your chatbot about your business using websites, documents, and custom content
Training is what makes your chatbot actually useful. NexaDesk processes your content, breaks it into semantic chunks, creates vector embeddings, and uses retrieval-augmented generation (RAG) to answer questions accurately.
Training Sources
NexaDesk supports four types of training data: websites, PDF files, custom text, and Q&A pairs.
Enter a URL and NexaDesk will crawl the page (and optionally follow links) to index the content.
- Go to Knowledge Base > Add Source > Website
- Enter the URL (e.g., https://yoursite.com/products)
- Choose crawl depth:
- Single page — Only the specified URL
- Crawl subpages — Follow links within the same domain (up to 50 pages)
- Click Start Training
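To make the "Crawl subpages" option more concrete, here is a minimal sketch of a same-domain crawl with a page cap. It is not NexaDesk's actual crawler: the function `crawl_same_domain`, the `get_links` callback, and the toy `SITE` link graph are all hypothetical, and real fetching and HTML parsing are left out so the example runs offline.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

MAX_PAGES = 50  # page cap taken from the crawl-depth description above


def crawl_same_domain(start_url, get_links, max_pages=MAX_PAGES):
    """Breadth-first crawl that stays on the start URL's domain.

    get_links is a hypothetical callable: given a page URL, it returns
    the href values found on that page (absolute or relative).
    """
    domain = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for href in get_links(url):
            # Resolve relative links and drop any #fragment.
            absolute = urljoin(url, href).split("#", 1)[0]
            if urlparse(absolute).netloc != domain:
                continue  # skip links that leave the domain
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# Toy link graph standing in for real HTTP fetches (assumed data).
SITE = {
    "https://yoursite.com/products": [
        "/products/a", "/products/b", "https://other.com/x",
    ],
    "https://yoursite.com/products/a": ["/products", "/products/b#specs"],
    "https://yoursite.com/products/b": [],
}

pages = crawl_same_domain(
    "https://yoursite.com/products", lambda url: SITE.get(url, [])
)
print(pages)
```

In this sketch the external link to other.com is skipped and duplicate links are visited only once, which is the behavior the "same domain, up to 50 pages" description suggests.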
Upload PDF files containing product catalogs, manuals, or policy documents.
- Go to Knowledge Base > Add Source > File Upload
- Drag and drop or browse for PDF files (max 10MB each)
- NexaDesk extracts the text, splits it into chunks, and indexes the content
Paste or type custom content directly — useful for FAQs, policies, or anything not available as a URL or file.
- Go to Knowledge Base > Add Source > Text
- Give the entry a title
- Paste or write your content
- Save
Add specific question-answer pairs for precise control over responses.
- Go to Knowledge Base > Add Source > Q&A
- Enter the question and the desired answer
- The chatbot will match similar visitor questions to your answer
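As a rough illustration of how a visitor question might be matched to a stored Q&A pair, the sketch below scores similarity by word overlap (Jaccard similarity). This is a stand-in technique chosen to keep the example dependency-free; NexaDesk most likely matches on embeddings instead. The sample pairs, the `match_answer` function, and the 0.4 threshold are all assumptions made for illustration.

```python
import re

# Hypothetical Q&A pairs, as they might be entered under Add Source > Q&A.
QA_PAIRS = {
    "What is your return policy?": "You can return items within 30 days.",
    "Do you ship internationally?": "Yes, we ship to most countries.",
}


def tokens(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two strings, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def match_answer(question, qa_pairs, threshold=0.4):
    """Return the answer of the most similar stored question, or None
    when nothing scores at or above the (assumed) threshold."""
    best = max(qa_pairs, key=lambda q: jaccard(question, q), default=None)
    if best is None or jaccard(question, best) < threshold:
        return None
    return qa_pairs[best]


print(match_answer("what's your return policy", QA_PAIRS))
print(match_answer("How much does the moon weigh?", QA_PAIRS))
```

A threshold like this is what keeps an unrelated question from being forced onto the nearest Q&A pair; unmatched questions would then fall through to the regular retrieval flow.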
Managing Training Data
In the Knowledge Base section, you can:
- View all sources — See the status of each training source (active, processing, failed)
- Re-train — Update a source after the original content has changed
- Delete — Remove a source and its associated embeddings
- View chunks — Inspect how NexaDesk split your content into training segments
Training Tips
- Be specific — Product pages with detailed descriptions produce better answers than generic landing pages
- Cover edge cases — Add Q&A pairs for questions the AI gets wrong
- Update regularly — Re-train sources when your content changes (pricing, features, policies)
- Check quality — Test your chatbot after training to verify answer accuracy
How Training Works Internally
- Content is fetched and cleaned (HTML stripped, boilerplate removed)
- Text is split into overlapping chunks (~500 tokens each)
- Each chunk is converted into a vector by an embedding model
- Embeddings are stored in a vector index
- At query time, the visitor's question is embedded and the most relevant chunks are retrieved
- The AI generates an answer using the retrieved chunks as context
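The steps above can be sketched end to end in a few lines. This is a simplified illustration under stated assumptions, not NexaDesk's implementation: tokens are approximated by whitespace-separated words, the 50-token overlap is an assumed value (the overlap size is not documented here), and `embed` is a toy word-count vector standing in for a real embedding model. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping chunks of about chunk_size tokens.

    Tokens are approximated by whitespace-separated words.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding: word counts. A real system calls an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, index, top_k=2):
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


# Assumed sample content; small chunk sizes make the overlap visible.
document = (
    "Our standard plan costs 20 dollars per month. "
    "Refunds are available within 30 days of purchase. "
    "Support is open Monday through Friday."
)
chunks = chunk_text(document, chunk_size=8, overlap=2)
index = [(chunk, embed(chunk)) for chunk in chunks]  # the "vector index"

context = retrieve("How do refunds work?", index, top_k=1)
prompt = "Answer using only this context:\n" + "\n".join(context)
print(prompt)
```

The overlap exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk. The final `prompt` string is what would be handed to the language model in the generation step.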

