Incorporate RAG Into Your Workflows
You've built a workflow, created a collection, and populated it with knowledge. Let's connect them together to create an AI assistant that can answer questions using your specific documents and data.
Build Your First RAG Workflow: Connecting Knowledge to Intelligence 🧠
- Now for the magic moment! Let's connect those pieces into an AI assistant that can answer questions using your specific documents and data.
🎯 What We're Building
- A smart Q&A system that searches your collection for relevant information and uses that context to generate accurate, source-backed answers. No more generic AI responses, your assistant will know your specific knowledge!
📋 Step-by-Step: Your First RAG Workflow
Step 1: Create Your RAG Workflow
- Navigate to Workflows in the main menu
- Click "+ New Workflow"
- Name it "Knowledge Base Assistant" or "Smart Q&A Bot"
Step 2: Configure Your Input
- Remember, your workflow starts with a Trigger block. For this workflow we'll use an Input Block: on your workflow canvas, click "Set a trigger" and select Input. Your Input Block will have an input field titled "Message."
Step 3: Add a Search Table Block
- Click the "+" button to add a new block
- Select "Search Table Block"
- Connect it to your Input Block
- Configure the Search Table Block:
- Collection: Select the collection you just created
- Table: Choose your table (likely "Untitled" unless you renamed it)
- Search Term (Required): {{inputs.message}} (this searches for the user's question passed in via the Input Block)
- Limit: 5 (returns the top 5 most relevant results)
- Similarity Score: Default value is 0.35
Understanding Similarity Scores
- When configuring your Search Table Block, you'll see a "Minimum Similarity" setting. This controls how closely related the search results need to be to your question.
How It Works:
- Range: 0.0 to 1.0 (where 1.0 is a perfect match)
- Lower values (0.2-0.4): Cast a wider net, returning more loosely related content
- Higher values (0.6-0.8): Only return very relevant, closely matching content
- Default: Usually around 0.35 for balanced results
When to Adjust:
- Too few results? Decrease the similarity threshold (try 0.2-0.3)
- Too many irrelevant results? Increase the threshold (try 0.5-0.7)
- Getting off-topic answers? Raise the threshold for more precise matching
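To make the threshold behavior concrete, here is a minimal Python sketch of thresholded semantic search. The function names, the tiny two-dimensional embeddings, and the document fields are illustrative assumptions, not Scout's actual implementation:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query_vec, documents, limit=5, min_similarity=0.35):
    """Return up to `limit` documents whose similarity clears the threshold."""
    scored = [(cosine_similarity(query_vec, d["embedding"]), d["text"])
              for d in documents]
    # Raising min_similarity keeps only the closest matches;
    # lowering it lets in more loosely related content.
    matches = [(score, text) for score, text in scored if score >= min_similarity]
    matches.sort(reverse=True)
    return matches[:limit]

docs = [
    {"text": "Reset your password from Settings.", "embedding": [0.9, 0.1]},
    {"text": "Our office dog is named Biscuit.",   "embedding": [0.1, 0.9]},
]
print(search([1.0, 0.0], docs, min_similarity=0.35))
```

With the default 0.35 threshold, only the password document survives the filter; drop the threshold near zero and the unrelated dog fact sneaks in too.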
Step 4: Add Your LLM Block
- Click "+" to add another block
- Select "LLM Block"
- Connect it to your Search Table Block
- Configure for RAG:
- LLM Block Configuration:
- Model: Here you can select from the wide range of LLM options on Scout. Default model is GPT-4o.
- Temperature: 0.3 (lower for more factual, consistent answers)
- Token Limit: 500 (adjust based on desired response length)
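The temperature setting deserves a quick illustration. Roughly speaking, generation samples each token from a probability distribution, and temperature rescales that distribution before sampling. The sketch below is a simplified model of that effect, not any provider's exact implementation:

```python
import math

def apply_temperature(logits, temperature):
    """Convert raw token scores to probabilities, rescaled by temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                  # raw scores for three candidate tokens
sharp = apply_temperature(logits, 0.3)    # low temperature: top token dominates
flat = apply_temperature(logits, 1.5)     # high temperature: more variety
```

At 0.3 the top-scoring token takes almost all of the probability mass, which is why lower temperatures produce the consistent, factual-sounding answers you want in a RAG workflow.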
System Message:
- Set up your system message to guide the LLM. This message should clearly explain the LLM's role and how it should respond.
You are a helpful AI assistant that answers questions based on provided documentation. Use the documentation to provide accurate, detailed answers. If the documentation doesn't contain relevant information, say so clearly.
Documentation: {{collections_v2_tables_query.output}}
Remember to replace collections_v2_tables_query with the ID of the Search Table Block in your own workflow.
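If you're curious how `{{...}}` placeholders like these resolve at run time, here is a minimal sketch of template substitution. The rendering function and the context keys (which mirror the workflow's block IDs) are illustrative assumptions, not Scout's internals:

```python
import re

def render(template, context):
    """Replace each {{path.to.value}} with the matching value from context."""
    def lookup(match):
        value = context
        for key in match.group(1).strip().split("."):
            value = value[key]
        return str(value)
    return re.sub(r"\{\{(.*?)\}\}", lookup, template)

context = {
    "inputs": {"message": "How do I reset my password?"},
    "collections_v2_tables_query": {"output": "Passwords can be reset from Settings."},
}
print(render("Documentation: {{collections_v2_tables_query.output}}", context))
# → Documentation: Passwords can be reset from Settings.
```

This is why the placeholder name must match your Search Table Block's ID exactly: the lookup walks the context by those keys, and a mismatched ID leaves the LLM with no documentation to work from.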
This workflow also uses a User Message: the input or question that the end user sends to the AI. Scroll down to Message 2 and select "User".
- For this workflow, your user message will be {{inputs.message}}
Step 5: Test Your RAG Workflow
- Click "Publish" in the top right once all changes have been made
- Enter a question related to your uploaded content in the workflow console on the left-hand side of the screen.
- Watch the magic happen:
- Search Table Block searches your collection
- Finds relevant documents
- LLM generates an answer using that context
🧠 What Just Happened? The RAG Process
🔍 Retrieve:
The Search Table Block searched your documents using semantic similarity, finding content related to your question (even if exact keywords don't match).
📄 Augment:
The retrieved context was passed to the LLM Block as additional information to inform its response.
✍️ Generate:
The LLM created an answer based on both your question AND the relevant context from your knowledge base.
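The three stages above can be sketched as one function. Here `search_collection` and `call_llm` are hypothetical stand-ins for the Search Table and LLM Blocks, not real Scout APIs:

```python
def answer(question, search_collection, call_llm, limit=5, min_similarity=0.35):
    # Retrieve: semantic search over the collection
    chunks = search_collection(question, limit=limit, min_similarity=min_similarity)
    # Augment: fold the retrieved context into the system message
    system = (
        "You are a helpful AI assistant that answers questions based on "
        "provided documentation. If the documentation doesn't contain "
        "relevant information, say so clearly.\n\n"
        "Documentation: " + "\n".join(chunks)
    )
    # Generate: the LLM answers using both the question and the context
    return call_llm(system=system, user=question)

# Toy stand-ins to show the data flow end to end
fake_search = lambda q, limit, min_similarity: ["Passwords can be reset from Settings."]
fake_llm = lambda system, user: f"Q: {user} | Context: {system.split('Documentation: ')[1]}"
print(answer("How do I reset my password?", fake_search, fake_llm))
```

Swapping in a real vector search and a real LLM call is all that separates this toy from the workflow you just published.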
🎉 Congratulations!
You've just built a complete RAG workflow!