How to Train a Chatbase Bot on Your Website Content

Point Chatbase at your sitemap, let it crawl your pages, and ship an AI support bot that answers from your own content in under fifteen minutes.

8 min | Beginner

Chatbase builds a retrieval chatbot from sources you give it: web pages, PDFs, plain text, or a Q&A list. The most common starting point is your own website, because the bot can then answer product and support questions in your wording instead of making things up. This guide takes you from a fresh account to a working bot trained on your site.

What you need

  • A free or paid Chatbase account
  • Your website URL or, better, your sitemap URL (often /sitemap.xml)
  • About 10 minutes for the first crawl and a few test questions

Step 1: Create a new agent and add a source

From the dashboard click New AI agent, then open the Sources tab. Choose the Website source type. Paste your sitemap URL into the Crawl field so Chatbase discovers every page in one pass instead of you adding URLs one at a time.

Chatbase — Sources
Sources: Files | Text | Website | Q&A | Notion
------------------------------------------------
Crawl: https://yoursite.com/sitemap.xml [Fetch]
Discovered links (24)
  [x] /                       12 KB
  [x] /pricing                 8 KB
  [x] /docs/getting-started   31 KB
  [ ] /blog/2023-archive      88 KB
Total selected: 23 links, ~310 KB

Adding a sitemap so the crawler finds every page.

Trim noisy pages
Deselect archive pages, tag pages, and anything with thin or duplicate text. A smaller, cleaner source set gives sharper answers than dumping the whole site in.
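
If you want to see what the crawler is likely to discover before you paste the sitemap, you can list its URLs locally. The following is a minimal sketch, not part of Chatbase: it assumes a flat sitemap that follows the sitemaps.org format (no nested sitemap index), and the function names, the NOISY_MARKERS list, and the sample XML are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical markers for low-value pages; adjust to your own URL scheme.
NOISY_MARKERS = ("archive", "/tag/", "/tags/")


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Return the <loc> URL of every <url> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def split_noisy(urls: list[str]) -> tuple[list[str], list[str]]:
    """Split URLs into (keep, review) depending on whether a marker matches."""
    keep, review = [], []
    for url in urls:
        is_noisy = any(marker in url for marker in NOISY_MARKERS)
        (review if is_noisy else keep).append(url)
    return keep, review


# Made-up sitemap for illustration; in practice, save your own /sitemap.xml
# and read it from disk.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yoursite.com/</loc></url>
  <url><loc>https://yoursite.com/pricing</loc></url>
  <url><loc>https://yoursite.com/blog/2023-archive</loc></url>
  <url><loc>https://yoursite.com/tag/news</loc></url>
</urlset>"""

keep, review = split_noisy(urls_from_sitemap(SAMPLE))
print("keep:", keep)
print("review before selecting:", review)
```

The "review" list is only a hint about which boxes to untick in the discovered links list; a substring match can misfire, so skim it by hand.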

Step 2: Crawl and check the character count

Click Fetch links, review the discovered list, then start the crawl. Chatbase shows a character count for each source. Plans cap total training characters, so if you exceed the limit, remove your longest low-value pages first.
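
The "longest low-value pages first" rule can be worked through on paper or in a few lines of code. This is a rough sketch under stated assumptions: the page sizes, the 100,000-character limit, and the set of low-value pages are all hypothetical, and pages_to_drop is an illustrative helper, not a Chatbase feature. Check your own plan for the real cap.

```python
def pages_to_drop(page_chars: dict[str, int], char_limit: int, low_value: set[str]):
    """Drop the longest low-value pages until the total fits under char_limit.

    Returns (dropped_urls, remaining_total). Pages outside low_value are never
    dropped, so the remaining total can still exceed the limit.
    """
    total = sum(page_chars.values())
    dropped = []
    # Longest low-value pages first, as the step above suggests.
    candidates = sorted(
        (url for url in low_value if url in page_chars),
        key=lambda url: page_chars[url],
        reverse=True,
    )
    for url in candidates:
        if total <= char_limit:
            break
        total -= page_chars[url]
        dropped.append(url)
    return dropped, total


# Hypothetical character counts per page and a hypothetical plan limit.
PAGES = {
    "/": 12_000,
    "/pricing": 8_000,
    "/docs/getting-started": 31_000,
    "/blog/2023-archive": 88_000,
    "/blog/tags": 20_000,
}
dropped, remaining = pages_to_drop(
    PAGES, char_limit=100_000, low_value={"/blog/2023-archive", "/blog/tags"}
)
print(dropped, remaining)  # ['/blog/2023-archive'] 71000
```

Here dropping the single longest archive page is enough, so the shorter tag page stays selected.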

Step 3: Set the system prompt

Open the Settings tab for the agent and edit the instructions. Tell the bot who it is, what it should refuse, and what to do when it does not know. Together with clean sources, a tight prompt is one of the biggest levers on answer quality.

system prompt
You are the support assistant for Acme.
Answer only from the provided sources.
If the answer is not in the sources, say:
"I'm not sure, here is how to reach a human: support@acme.com".
Keep replies under 4 sentences. Never invent prices or dates.

Step 4: Test in the playground

Use the chat preview to ask the questions your real users ask. If an answer is wrong, find the page it should have used, confirm the text is actually there, and re-crawl that one URL. Bad answers are almost always a missing or messy source, not the model.

Chatbase — Playground
You: Do you offer a free trial?
Agent: Yes. The Starter plan includes a 14-day free trial with no card required. You can upgrade any time from Settings > Billing.
You: What is your CEO's home address?
Agent: I'm not sure, here is how to reach a human: support@acme.com.

Testing the trained bot before embedding it.
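
When an answer is wrong, the "confirm the text is actually there" check from this step can be done by saving the page's HTML and searching its visible text. The sketch below uses only the Python standard library; visible_text, page_mentions, and the sample HTML are illustrative names and data, and it says nothing about how Chatbase's own crawler extracts text. Text that only appears after JavaScript runs will not be in the saved HTML.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the page's text with whitespace collapsed to single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def page_mentions(html: str, phrase: str) -> bool:
    """Case-insensitive check that the phrase appears in the visible text."""
    return phrase.lower() in visible_text(html).lower()


# Made-up page source for illustration; in practice, use the saved HTML of
# the page the bot should have answered from.
SAMPLE_HTML = """<html><head><script>var promo = "30-day trial";</script></head>
<body><h1>Pricing</h1>
<p>The Starter plan includes a 14-day  free trial.</p></body></html>"""

print(page_mentions(SAMPLE_HTML, "14-day free trial"))  # True
print(page_mentions(SAMPLE_HTML, "30-day trial"))       # False, only in a script
```

If the phrase is missing from the visible text, fix the page first and then re-crawl that URL.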

Result

You now have a bot that answers from your own pages and gracefully bails out when it does not know. Next steps are embedding it on your site (covered in a separate guide) and reviewing the chat logs each week to spot pages worth adding or fixing.

Tags
#chatbase #support #rag #website #no-code