AI Support Copilot - Solution Overview
What this project solves
A SaaS product had scattered support knowledge across docs, FAQs, runbooks, release notes, and resolved tickets. Users and support agents wasted time hunting for answers or escalating issues that should have been solved instantly. The goal was to build an AI support copilot that could answer product questions accurately, cite its sources, respect tenant and role permissions, and slot into an existing product without turning into a hallucinating chatbot.
What had to be true
- Only authorised users could query documents they were allowed to see
- Answers needed source citations
- New docs had to become searchable quickly
- The system had to work for both end users and internal support staff
- Every AI response needed logging, feedback, and cost visibility
- The hot path had to stay fast enough for normal in-app chat usage
Stack
- Go
- AWS
- PostgreSQL with pgvector
- Redis
- S3
- OpenAI or Anthropic
- Cognito
- ECS Fargate
- CloudWatch
Solution in plain English
The system was split into two flows:
- Document ingestion
  - Documents were uploaded or synced into S3
  - Text was extracted, chunked, embedded, and stored in PostgreSQL
  - Each chunk carried tenant, product-area, and permission metadata
- Question answering
  - A user asked a question in the app
  - The backend authenticated the user, resolved their permissions, retrieved the most relevant chunks, built a grounded prompt, called the LLM, and returned an answer with citations
  - The whole exchange was logged for analytics, debugging, and feedback
High-level architecture
Ingestion flow
Chat request flow
Core components
api-service
Handles:
- authentication
- conversation endpoints
- retrieval
- prompt building
- LLM calls
- answer logging
- feedback capture
ingestion-worker
Handles:
- file fetch / sync
- text extraction
- chunking
- embedding creation
- metadata enrichment
- reindexing
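The chunking step could look roughly like the sketch below. The source does not say which chunking strategy was used, so the fixed-size word window with overlap, the `ChunkWords` name, and its `size` and `overlap` parameters are all assumptions made for illustration.

```go
package main

import "strings"

// ChunkWords splits text into chunks of at most size words. Each chunk
// repeats the last overlap words of the previous chunk so that sentences cut
// at a boundary still appear whole in one of the two neighbours.
// Hypothetical helper: the real chunking strategy is not described.
func ChunkWords(text string, size, overlap int) []string {
	words := strings.Fields(text)
	if size <= 0 || len(words) == 0 {
		return nil
	}
	if overlap < 0 || overlap >= size {
		overlap = 0 // guard against a window that never advances
	}
	step := size - overlap

	var chunks []string
	for start := 0; start < len(words); start += step {
		end := start + size
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, strings.Join(words[start:end], " "))
		if end == len(words) {
			break // the final window reached the end of the text
		}
	}
	return chunks
}
```

Calling `ChunkWords("a b c d e", 3, 1)` yields two chunks, `"a b c"` and `"c d e"`. A production worker would more likely split on headings or sentences and count tokens rather than words.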
admin-ui
Handles:
- content sync status
- failed ingestions
- prompt/answer inspection
- user feedback review
- cost tracking
Data model
| Table | Purpose |
|---|---|
| documents | Source file or page metadata |
| document_chunks | Searchable chunks with embeddings |
| document_permissions | Tenant / role / product scoping |
| conversations | Chat sessions |
| messages | User and assistant messages |
| message_citations | Which chunks supported which answer |
| feedback | Helpful / not helpful ratings |
| ingestion_runs | Sync and indexing runs |
| audit_logs | Admin and system audit trail |
Key design decisions
1. Retrieval first, generation second
The model never answered from memory alone. It answered from retrieved chunks.
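One way a grounded prompt could be assembled is sketched below. The instruction wording, the `Chunk` fields, and the numbered-source layout are assumptions; only the `BuildSupportPrompt(question, chunks)` call shape comes from the example code in this document.

```go
package main

import (
	"fmt"
	"strings"
)

// Chunk is a hypothetical, trimmed-down view of a retrieved document chunk.
type Chunk struct {
	Title   string
	Content string
}

// supportInstructions tells the model to stay within the retrieved sources.
// The exact wording is illustrative only.
const supportInstructions = "Answer using only the numbered sources below. " +
	"Cite sources as [n]. If the sources do not contain the answer, say so."

// BuildSupportPrompt lists each chunk as a numbered source and appends the
// user's question, so the model can cite sources by number.
func BuildSupportPrompt(question string, chunks []Chunk) string {
	var b strings.Builder
	b.WriteString(supportInstructions)
	b.WriteString("\n\nSources:\n")
	for i, c := range chunks {
		fmt.Fprintf(&b, "[%d] %s: %s\n", i+1, c.Title, c.Content)
	}
	b.WriteString("\nQuestion: ")
	b.WriteString(question)
	return b.String()
}
```

Numbering the sources gives the model a stable way to refer to them, which makes it easier to map its answer back to citations afterwards.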
2. Permissions were applied inside retrieval, not after it
Tenant and scope filters were part of the search itself rather than a post-processing step. If a chunk was not visible to the user, it never entered the prompt.
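A minimal sketch of what permission-scoped retrieval might look like against pgvector follows. The column names (`tenant_id`, `scope`, `embedding`, `content`), the single-`scope`-per-chunk layout, and the helper names are assumptions; the source only states that chunks carried tenant and permission metadata. The `<=>` operator is pgvector's cosine-distance operator.

```go
package main

import (
	"errors"
	"strconv"
	"strings"
)

// chunkSearchSQL filters by tenant and scope in the WHERE clause, so rows the
// user cannot see are never candidates for ranking. Schema is hypothetical.
const chunkSearchSQL = `SELECT id, document_id, content
FROM document_chunks
WHERE tenant_id = $1
  AND scope = ANY($2)
ORDER BY embedding <=> $3::vector
LIMIT $4`

// ErrNoAccess is returned instead of running an unscoped query.
var ErrNoAccess = errors.New("missing tenant or readable scopes")

// VectorLiteral renders an embedding in pgvector's text form, e.g. "[0.5,1]".
func VectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, f := range v {
		parts[i] = strconv.FormatFloat(float64(f), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

// BuildChunkSearch returns the query and its arguments. It fails closed:
// a user with no tenant or no scopes gets an error, not an unfiltered search.
func BuildChunkSearch(tenantID string, scopes []string, queryVec []float32, limit int) (string, []any, error) {
	if tenantID == "" || len(scopes) == 0 {
		return "", nil, ErrNoAccess
	}
	args := []any{tenantID, scopes, VectorLiteral(queryVec), limit}
	return chunkSearchSQL, args, nil
}
```

How the `[]string` scopes are bound to `$2` depends on the PostgreSQL driver in use, so that detail is left out here. The point of the sketch is that the filter lives in the query, which means a bug in later prompt-building code cannot leak a chunk that was never fetched.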
3. Citations were mandatory
Every answer had links back to source docs or titled references.
4. Conversation history was trimmed
Only the useful recent messages plus the retrieved context were sent to the model.
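History trimming can be sketched as keeping the newest messages that fit a token budget. The source does not define "useful", so the recency-only rule, the `Message` type, and the four-characters-per-token estimate below are assumptions rather than the project's actual logic.

```go
package main

// Message is a hypothetical, minimal chat message.
type Message struct {
	Role    string
	Content string
}

// approxTokens is a crude size estimate (about four bytes per token).
// A real system would use the model provider's tokenizer instead.
func approxTokens(s string) int {
	return len(s)/4 + 1
}

// TrimHistory returns the longest suffix of history whose estimated token
// count fits within budget, preserving chronological order.
func TrimHistory(history []Message, budget int) []Message {
	used := 0
	start := len(history)
	for i := len(history) - 1; i >= 0; i-- {
		cost := approxTokens(history[i].Content)
		if used+cost > budget {
			break // adding this older message would exceed the budget
		}
		used += cost
		start = i
	}
	return history[start:]
}
```

Walking backwards from the newest message means the most recent turns are always kept first, and older context is dropped as soon as the budget is reached.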
5. Hot paths were cached carefully
Redis cached:
- recent conversation context
- top document fragments for popular queries
- tenant-scoped config
- prompt templates
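Caching retrieved fragments is only safe if entries cannot cross tenant or permission boundaries. The sketch below shows one possible key scheme; the `rag:chunks:` prefix, the key layout, and the `RetrievalCacheKey` name are assumptions, and the source does not describe how keys were actually built.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
	"strings"
)

// RetrievalCacheKey builds a Redis key for cached retrieval results.
// The tenant ID is kept readable in the key, and the caller's scopes are
// hashed together with the normalized question, so users with different
// permissions never share an entry. Hypothetical scheme for illustration.
func RetrievalCacheKey(tenantID string, scopes []string, question string) string {
	// Sort a copy so that scope order does not change the key.
	sorted := append([]string(nil), scopes...)
	sort.Strings(sorted)

	// Lowercase and collapse whitespace so trivially different phrasings match.
	normalized := strings.Join(strings.Fields(strings.ToLower(question)), " ")

	sum := sha256.Sum256([]byte(strings.Join(sorted, ",") + "\x00" + normalized))
	return "rag:chunks:" + tenantID + ":" + hex.EncodeToString(sum[:8])
}
```

Including the scopes in the key lowers the hit rate compared with a per-tenant key, but it keeps the cache consistent with the rule that invisible chunks never reach the prompt.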
Minimal API shape
User endpoints
- POST /chat/ask
- GET /chat/conversations/:id
- POST /chat/feedback
- GET /chat/sources/:messageId
Admin endpoints
- POST /admin/documents/sync
- GET /admin/ingestion-runs
- GET /admin/feedback
- GET /admin/prompts/:conversationId
Example retrieval code
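// Ask answers a question using only chunks the caller is allowed to read.
func (s *ChatService) Ask(ctx context.Context, user User, req AskRequest) (*Answer, error) {
	// Resolve permissions first so that retrieval is scoped from the start.
	allowedScopes := s.PermissionService.ResolveScopes(user)

	// Retrieve the most relevant chunks within the user's tenant and scopes.
	chunks, err := s.Search.FindRelevantChunks(ctx, SearchQuery{
		TenantID: user.TenantID,
		Scopes:   allowedScopes,
		Query:    req.Question,
		Limit:    6,
	})
	if err != nil {
		return nil, err
	}

	// Build a grounded prompt from the retrieved chunks and call the LLM.
	prompt := s.Prompts.BuildSupportPrompt(req.Question, chunks)
	llmResp, err := s.LLM.Generate(ctx, prompt)
	if err != nil {
		return nil, err
	}

	answer := &Answer{
		Text:      llmResp.Text,
		Citations: mapChunksToCitations(chunks),
	}

	// Persist the exchange and token usage for analytics, feedback, and cost tracking.
	if err := s.Conversations.Store(ctx, user.ID, req.ConversationID, req.Question, answer, llmResp.Usage); err != nil {
		return nil, err
	}
	return answer, nil
}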
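The `mapChunksToCitations` helper used above is not shown in the source. A plausible sketch, assuming each chunk knows its parent document's ID, title, and URL (all hypothetical field names), groups chunks by document so that each source appears once in the citation list:

```go
package main

// Chunk is a hypothetical retrieved chunk with its parent document's details.
type Chunk struct {
	ID            string
	DocumentID    string
	DocumentTitle string
	SourceURL     string
}

// Citation is a hypothetical citation entry: one per source document.
type Citation struct {
	DocumentID string
	Title      string
	URL        string
	ChunkIDs   []string
}

// mapChunksToCitations groups chunks by document, keeping the order in which
// documents first appear (that is, retrieval rank order).
func mapChunksToCitations(chunks []Chunk) []Citation {
	index := make(map[string]int) // document ID -> position in out
	var out []Citation
	for _, c := range chunks {
		i, seen := index[c.DocumentID]
		if !seen {
			i = len(out)
			index[c.DocumentID] = i
			out = append(out, Citation{
				DocumentID: c.DocumentID,
				Title:      c.DocumentTitle,
				URL:        c.SourceURL,
			})
		}
		out[i].ChunkIDs = append(out[i].ChunkIDs, c.ID)
	}
	return out
}
```

Keeping the chunk IDs on each citation would also make it straightforward to populate a table like `message_citations`, which records which chunks supported which answer. Note that this cites every retrieved chunk, whether or not the model actually drew on it.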