In the previous tutorial, we explored the multi-agent architecture. Now we'll build the data foundation that powers intelligent customer support: a hybrid approach combining relational databases for operational data with vector databases for knowledge retrieval.

What You'll Learn

  • Database schema design optimized for customer support workflows
  • Vector database setup for RAG (Retrieval-Augmented Generation)
  • Data population strategies with realistic test scenarios
  • Integration testing to verify system functionality

Hybrid Database Architecture

The power of our system comes from combining two complementary data storage approaches:

┌─────────────────┐    ┌─────────────────┐
│ Relational DB   │    │ Vector Database │
│ (SQLite/PG)     │    │ (Embeddings)    │
├─────────────────┤    ├─────────────────┤
│ • Customers     │    │ • Policy Docs   │
│ • Bookings      │    │ • FAQ Content   │
│ • Transactions  │    │ • Knowledge Base│
│ • Train Data    │    │ • Help Articles │
└─────────────────┘    └─────────────────┘

Why this hybrid approach?

  • Relational DB: Perfect for structured operational data (bookings, customers, transactions)
  • Vector DB: Ideal for semantic search through unstructured knowledge (policies, FAQs)
  • Combined: Enables both precise data operations and intelligent knowledge retrieval
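
To make this concrete, here is a minimal sketch of a single support query touching both stores. The db, vector_db, and llm objects are stand-ins for interfaces built later in this series, not the project's actual API:

# Hypothetical sketch: precise lookup + semantic retrieval in one flow
def answer_refund_question(booking_id: str, db, vector_db, llm) -> str:
    booking = db.get_booking(booking_id)   # relational DB: exact record lookup
    policy_chunks = vector_db.search(      # vector DB: semantic policy search
        "refund eligibility and cancellation fees", top_k=3
    )
    # Grounded answer: precise booking facts + relevant policy text
    return llm.generate(query="Can I get a refund?",
                        context=policy_chunks, booking=booking)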

Relational Database Schema Design

Our system uses four core tables optimized for customer support operations:

1. Customers Table

CREATE TABLE customers (
    customer_id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    email TEXT UNIQUE NOT NULL,
    address TEXT,
    phone TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Design rationale: Simple customer profiles with essential contact information. The address field will be used for location intelligence in later tutorials.

2. Bookings Table

CREATE TABLE bookings (
    booking_id TEXT PRIMARY KEY,
    customer_id TEXT,
    departure_station TEXT,
    arrival_station TEXT,
    departure_datetime TIMESTAMP,
    ticket_type TEXT,
    price DECIMAL(10,2),
    status TEXT,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);

Key features:

  • Human-readable booking IDs (e.g., "UKC001")
  • Flexible status field for booking lifecycle management
  • Precise pricing with decimal support
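
For example, a support tool can pull a customer's upcoming journeys with a simple join against the schema above (an illustrative query, not code from the project):

-- Illustrative query: active bookings for one customer, soonest first
SELECT b.booking_id, b.departure_station, b.arrival_station,
       b.departure_datetime, b.price
FROM bookings b
JOIN customers c ON c.customer_id = b.customer_id
WHERE c.email = 'james.thompson@email.co.uk'
  AND b.status = 'Active'
ORDER BY b.departure_datetime;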

3. Transactions Table

CREATE TABLE transactions (
    transaction_id TEXT PRIMARY KEY,
    booking_id TEXT,
    amount DECIMAL(10,2),
    transaction_type TEXT,
    payment_method TEXT,
    status TEXT,
    processed_at TIMESTAMP,
    FOREIGN KEY (booking_id) REFERENCES bookings(booking_id)
);

Supports: Payment processing, refund tracking, financial reporting
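
A refund, for instance, can be recorded as a new transaction row plus a booking status update, wrapped in one SQL transaction so both writes succeed or fail together (an illustrative sketch; the IDs and values are invented):

-- Illustrative refund flow: one atomic unit of work
BEGIN TRANSACTION;

INSERT INTO transactions
    (transaction_id, booking_id, amount, transaction_type,
     payment_method, status, processed_at)
VALUES
    ('TXN0042', 'UKC001', -85.50, 'refund',
     'card', 'completed', CURRENT_TIMESTAMP);

UPDATE bookings SET status = 'Cancelled' WHERE booking_id = 'UKC001';

COMMIT;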

4. Fares Table

CREATE TABLE fares (
    route_id INTEGER PRIMARY KEY,
    departure_station TEXT,
    arrival_station TEXT,
    distance_miles INTEGER,
    standard_price DECIMAL(10,2),
    flexible_price DECIMAL(10,2)
);

Enables: Dynamic pricing, route planning, availability checking
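
Pricing a journey then reduces to a single lookup (illustrative query):

-- Illustrative fare lookup: both price tiers for a route
SELECT standard_price, flexible_price, distance_miles
FROM fares
WHERE departure_station = 'London Euston'
  AND arrival_station = 'Manchester Piccadilly';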

Database Initialization

Setting up your database is straightforward with our automated scripts:

# Navigate to the project directory
cd production-ai-customer-support

# Create database schema
python -c "from utils.create_schema import create_database_schema; create_database_schema()"

# Populate with realistic sample data
python -c "from utils.populate_data import populate_data; populate_data()"

This creates a SQLite database with complete sample data including 15 realistic customers and 25+ varied booking scenarios.

Vector Database for RAG Implementation

The Challenge of Knowledge Retrieval

Traditional keyword search fails when customers ask questions like:

  • "Can I get my money back if I cancel?" (instead of "refund policy")
  • "What happens if my train is delayed?" (instead of "compensation terms")

Solution: Semantic search using vector embeddings that understand meaning, not just keywords.
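
Under the hood, "understanding meaning" comes down to comparing embedding vectors. A minimal NumPy sketch of cosine similarity, the measure the retrieval step relies on later in this tutorial:

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity of two embedding vectors: closer to 1.0 = closer in meaning."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# "Can I get my money back?" and "refund policy" embed to nearby vectors,
# so their similarity is high even though they share no keywords.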

Vector Database Architecture

Our RAG system follows this pipeline:

Policy Documents → Text Chunking → Embeddings → Vector Storage → Semantic Search

Implementation Details

1. Document Processing

We start with company policy documents stored in database/UKConnect_policy.txt:

# Document contains:
- Refund policies and procedures
- Terms & conditions
- Fare rules and regulations
- Compensation policies
- Special circumstances handling

2. Intelligent Chunking Strategy

# From utils/chunking_data.py
chunk_config = {
    "chunk_size": 500-1000,  # characters per chunk
    "overlap": 100,          # character overlap between chunks
    "preserve_boundaries": True  # Don't break sentences/paragraphs
}

Why chunking matters:

  • Manageable size: Fits LLM context windows efficiently
  • Overlap: Ensures context continuity across chunk boundaries
  • Boundary preservation: Maintains semantic coherence
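
The real logic lives in utils/chunking_data.py; here is a simplified sketch of a boundary-preserving chunker along these lines (assumed behavior, not the file's actual code):

def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into ~chunk_size character chunks, cutting at sentence ends."""
    chunks, start = [], 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        # Prefer to cut at the last sentence boundary inside the window
        boundary = text.rfind(". ", start, end)
        if boundary > start and end < len(text):
            end = boundary + 1
        chunks.append(text[start:end].strip())
        if end >= len(text):
            break
        start = max(end - overlap, start + 1)  # step back for overlap, always advance
    return chunks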

3. Embedding Generation

# From database/vector_db.py
import os
import google.generativeai as genai

# The API key comes from the .env configuration described below
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

def generate_embeddings(text_chunks):
    """Embed each text chunk with Gemini's embedding model."""
    embeddings = []
    for chunk in text_chunks:
        result = genai.embed_content(
            model="models/embedding-001",
            content=chunk
        )
        embeddings.append(result['embedding'])
    return embeddings

4. Vector Storage

For development, we use pickle storage. Production systems would use dedicated vector databases like Pinecone, Weaviate, or ChromaDB.

# Initialize vector database
python -c "from database.vector_db import setup_vector_database; setup_vector_database()"

Configuration Setup

Environment Variables

Create a .env file with your configuration:

# API Configuration
GOOGLE_API_KEY=your_google_api_key_here

# Database Configuration  
DATABASE_URL=sqlite:///database/ukconnect_rail.db

# Vector Database
VECTOR_DB_PATH=database/vector_db.pkl

Project Structure

production-ai-customer-support/
├── database/
│   ├── ukconnect_rail.db          # SQLite operational database
│   ├── vector_db.pkl              # Vector embeddings storage
│   ├── UKConnect_policy.txt       # Source policy documents
│   ├── ukconnect_qa_pairs.json    # Pre-processed Q&A pairs
│   └── ukconnect_rag_chunks.json  # Processed text chunks
├── utils/
│   ├── create_schema.py           # Database schema creation
│   ├── populate_data.py           # Sample data generation
│   └── chunking_data.py           # Document processing
└── run_test_scenarios.py          # End-to-end test scenarios

Sample Data Population

Realistic Customer Profiles

Our sample data includes diverse, realistic customer profiles:

# Example customer data
{
    "customer_id": "CUS001",
    "name": "James Thompson", 
    "email": "james.thompson@email.co.uk",
    "address": "45 Baker Street, London, W1U 7EW",
    "phone": "+44 20 7946 0958"
}

15 customers covering:

  • Different geographic locations (London, Manchester, Birmingham, Glasgow)
  • Various demographics and communication preferences
  • Mixed booking histories and support interactions

Complex Booking Scenarios

# Example booking data
{
    "booking_id": "UKC001",
    "customer_id": "CUS001", 
    "departure_station": "London Euston",
    "arrival_station": "Manchester Piccadilly",
    "departure_datetime": "2024-08-15 09:30:00",
    "ticket_type": "Standard",
    "price": 85.50,
    "status": "Active"
}

25+ bookings including:

  • Various routes and pricing tiers
  • Different booking statuses (Active, Cancelled, Modified)
  • Edge cases for testing (same-day changes, refunds, etc.)

RAG System Deep Dive

The Retrieval Process

When a customer asks a question, our RAG system follows these steps:

  1. Query Analysis: Extract semantic meaning from the customer's question
  2. Vector Search: Find the most relevant policy documents using cosine similarity
  3. Context Preparation: Combine retrieved information with query context
  4. Response Generation: LLM creates a response using the retrieved knowledge

Example RAG Flow

# Customer query: "Can I cancel my ticket and get a refund?"

# 1. Generate query embedding
query_embedding = embed_content("Can I cancel my ticket and get a refund?")

# 2. Search vector database
relevant_chunks = vector_db.search(query_embedding, top_k=3)
# Returns: [
#   "UKConnect Refund Policy: Tickets can be cancelled up to...",
#   "Cancellation fees apply as follows: Standard tickets...", 
#   "Refund processing typically takes 3-5 business days..."
# ]

# 3. Generate contextual response
response = llm.generate(
    query="Can I cancel my ticket and get a refund?",
    context=relevant_chunks,
    customer_data=customer_profile
)

Performance Optimization Strategies

1. Chunk Overlap Management

overlap_config = {
    "size": 100,  # characters
    "purpose": "context_continuity"
}

Ensures important information isn't lost at chunk boundaries.

2. Relevance Filtering

similarity_threshold = 0.7  # Only return highly relevant results
filtered_results = [r for r in results if r.similarity > similarity_threshold]

3. Caching Strategy

# Cache frequent queries for faster response times
cache_config = {
    "ttl": 3600,  # 1 hour cache lifetime
    "max_size": 1000,  # Maximum cached queries
}
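
A minimal in-process cache matching that config might look like this (a sketch; production systems typically reach for Redis or similar, and this version is not thread-safe):

import time

class QueryCache:
    """Tiny TTL cache for retrieval results."""

    def __init__(self, ttl: int = 3600, max_size: int = 1000):
        self.ttl, self.max_size = ttl, max_size
        self._store = {}  # query -> (timestamp, results)

    def get(self, query: str):
        entry = self._store.get(query)
        if entry and time.time() - entry[0] < self.ttl:
            return entry[1]
        self._store.pop(query, None)  # drop expired entry, if any
        return None

    def put(self, query: str, results) -> None:
        if len(self._store) >= self.max_size:
            self._store.pop(next(iter(self._store)))  # evict oldest insertion
        self._store[query] = (time.time(), results)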

4. Batch Processing

# Process multiple documents efficiently
batch_size = 32
embeddings = process_embeddings_batch(text_chunks, batch_size)
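
process_embeddings_batch isn't defined in the snippets above; a straightforward version, reusing the generate_embeddings helper from earlier, could be:

def process_embeddings_batch(text_chunks, batch_size=32):
    """Embed chunks batch by batch to bound memory use and API burst size."""
    embeddings = []
    for i in range(0, len(text_chunks), batch_size):
        batch = text_chunks[i:i + batch_size]
        embeddings.extend(generate_embeddings(batch))
    return embeddings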

System Testing and Validation

Database Health Verification

# Test database connectivity and data integrity
python -c "
from database.database import DatabaseManager
db = DatabaseManager()
print('Customers loaded:', len(db.get_all_customers()))
print('Bookings loaded:', len(db.get_all_bookings()))
print('Database health:', 'OK' if db.health_check() else 'ERROR')
"

Vector Database Testing

# Verify vector search functionality
python -c "
from database.vector_db import VectorDatabase
vdb = VectorDatabase()
results = vdb.search('refund policy', top_k=3)
print('Vector search working:', len(results) > 0)
print('Sample result:', (results[0][:100] + '...') if results else 'No results')
"

End-to-End Integration Test

# Run a complete system test with sample customer interaction
python run_test_scenarios.py --session 1

This executes a full customer support scenario, testing both database operations and RAG retrieval.

Monitoring and Maintenance

Database Health Monitoring

def check_database_health():
    """Monitor database performance and integrity"""
    checks = {
        "table_integrity": verify_table_schemas(),
        "data_consistency": check_foreign_key_constraints(),
        "query_performance": measure_average_query_time(),
        "storage_usage": get_database_size_metrics()
    }
    return all(checks.values())
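
With SQLite, two of those checks map directly onto built-in pragmas, for example (a self-contained sketch):

import sqlite3

def sqlite_integrity_checks(db_path: str = "database/ukconnect_rail.db") -> bool:
    """Run SQLite's built-in integrity and foreign-key checks."""
    conn = sqlite3.connect(db_path)
    try:
        ok = conn.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
        fk_violations = conn.execute("PRAGMA foreign_key_check").fetchall()
        return ok and not fk_violations
    finally:
        conn.close()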

Vector Database Maintenance

def update_vector_database():
    """Maintain vector database accuracy and performance"""
    tasks = [
        reprocess_modified_documents(),
        regenerate_outdated_embeddings(),
        update_search_indices(),
        cleanup_unused_vectors()
    ]
    return execute_maintenance_tasks(tasks)

Production Deployment Considerations

Database Scaling

For production environments, consider:

  • PostgreSQL: Replace SQLite for concurrent access and advanced features
  • Connection pooling: Manage database connections efficiently (see the sketch after this list)
  • Read replicas: Distribute read queries for better performance
  • Backup strategy: Automated backups and point-in-time recovery
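
With SQLAlchemy, for example, pooling is mostly engine configuration (an illustrative sketch; the pool numbers are starting points to tune, not tested values):

from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://support_user:password@db-host/ukconnect",  # hypothetical URL
    pool_size=10,        # persistent connections kept open
    max_overflow=20,     # extra connections allowed under burst load
    pool_timeout=30,     # seconds to wait for a free connection
    pool_pre_ping=True,  # detect and replace stale connections
)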

Vector Database Scaling

Production vector storage options:

  • Pinecone: Managed vector database with excellent performance
  • Weaviate: Open-source with GraphQL interface
  • ChromaDB: Simple, powerful embeddings database (sketched below)
  • Qdrant: High-performance vector search engine
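
As a taste of how small the migration can be, here is a minimal ChromaDB sketch replacing the pickle store, reusing the chunks and embeddings produced earlier (the collection name and storage path are assumptions):

import chromadb

client = chromadb.PersistentClient(path="database/chroma")  # hypothetical path
collection = client.get_or_create_collection("ukconnect_policies")

# Index the chunks with the embeddings we already generated
collection.add(
    ids=[f"chunk-{i}" for i in range(len(text_chunks))],
    documents=text_chunks,
    embeddings=embeddings,
)

# Query with an embedding from the same model used for indexing
query_embedding = generate_embeddings(["refund policy"])[0]
results = collection.query(query_embeddings=[query_embedding], n_results=3)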

Common Troubleshooting Issues

Issue: Vector Search Returns Poor Results

Solutions:

  • Verify embedding model consistency
  • Check chunk size and overlap settings
  • Adjust similarity thresholds
  • Review document preprocessing quality

Issue: Database Connection Timeouts

Solutions:

  • Implement connection pooling
  • Optimize query performance with indexes
  • Consider read/write splitting
  • Monitor concurrent connection limits

Issue: RAG Responses Lack Context

Solutions:

  • Increase chunk overlap
  • Include more context in retrieval
  • Improve document chunking boundaries
  • Enhance query preprocessing

Next Steps in This Series

With your data foundation solidly established, we're ready to build the intelligent agents that will use this infrastructure:

Coming up in Part 3: Policy Agent Build:

  • RAG-powered policy specialist implementation
  • Vector search integration with LLM responses
  • Testing policy retrieval accuracy
  • Handling complex policy queries

Key Takeaways

  1. Hybrid architecture maximizes the strengths of both structured and unstructured data storage
  2. Proper document chunking is critical for effective RAG performance
  3. Realistic sample data enables comprehensive testing of edge cases and complex scenarios
  4. Health monitoring ensures consistent system reliability and performance
  5. Environment configuration facilitates smooth deployment across development, staging, and production

Your intelligent data foundation is now ready to power sophisticated customer support interactions. The combination of precise operational data and semantic knowledge retrieval creates the backbone for truly intelligent customer service automation.


Need help implementing enterprise AI data architectures? Contact twentytwotensors for expert consulting on AI system design and deployment.