AI Automation

Experiments

A/B test AI responses to find what works best

Experiments let you test different AI approaches and automatically adopt what works best.

What Are Experiments?

Experiments are A/B tests for AI responses:

  • Test different response styles
  • Measure customer outcomes
  • Automatically use winners
  • Continuously improve

How Experiments Work

Setup

Define an experiment:

  1. Choose what to test (response style, length, etc.)
  2. Set the variants (A vs B)
  3. Define success metrics
  4. Set traffic allocation
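The setup above maps naturally to a small data structure. The sketch below is purely illustrative; the field names are assumptions, not the product's actual schema:

```python
from dataclasses import dataclass

@dataclass
class Experiment:
    """Hypothetical experiment definition; field names are illustrative."""
    name: str
    hypothesis: str
    variant_a: str      # description of the control approach
    variant_b: str      # description of the challenger approach
    metric: str         # e.g. "resolution_rate"
    traffic_pct: int    # % of tickets included in the experiment
    duration_days: int

exp = Experiment(
    name="Short vs Detailed Responses",
    hypothesis="Shorter responses resolve tickets faster",
    variant_a="Standard response length",
    variant_b="Concise responses (50% shorter)",
    metric="resolution_rate",
    traffic_pct=20,
    duration_days=14,
)
```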

Running

During the experiment:

  • Tickets are randomly assigned variant A or B
  • AI generates responses accordingly
  • Outcomes are tracked

Analysis

After enough data is collected:

  • Statistical significance calculated
  • Winner determined
  • Results reported

Application

The winning approach is:

  • Automatically adopted
  • Applied to future tickets
  • Stored in Client Brain

Creating an Experiment

  1. Go to Settings → AI → Experiments
  2. Click New Experiment

Configure the Experiment

Field        Description
Name         Descriptive name
Hypothesis   What you're testing
Variants     A and B approaches
Metric       What determines success
Traffic      % of tickets to include
Duration     How long to run

Example: Response Length

Name: Short vs Detailed Responses
Hypothesis: Shorter responses resolve tickets faster
 
Variant A: Standard response length
Variant B: Concise responses (50% shorter)
 
Success Metric: Resolution rate
Traffic: 20%
Duration: 2 weeks

Start the Experiment

Click Start Experiment to begin.

Experiment Types

Tone Testing

Test different communication styles:

  • Formal vs casual
  • Technical vs simple
  • Empathetic vs direct

Length Testing

Test response length:

  • Concise vs comprehensive
  • Bullet points vs paragraphs
  • With vs without context

Structure Testing

Test response formats:

  • Question-first vs answer-first
  • With vs without greeting
  • Signature styles

Content Testing

Test what information to include:

  • Links vs inline content
  • Step-by-step vs summary
  • With vs without images

Success Metrics

Available Metrics

Metric                      Description
Resolution rate             % of tickets resolved
First-contact resolution    Resolved in one reply
Customer satisfaction       CSAT score
Response time               Speed of resolution
Escalation rate             % escalated to a human
Reopening rate              % reopened after close
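As an illustration of how a per-variant metric such as resolution rate is computed from tracked outcomes (the record shape here is hypothetical):

```python
def resolution_rate(outcomes: list[dict]) -> float:
    """Fraction of tickets marked resolved.

    `outcomes` is a hypothetical list of {"variant": ..., "resolved": bool}
    records; the real tracking schema is not documented here.
    """
    if not outcomes:
        return 0.0
    return sum(1 for o in outcomes if o["resolved"]) / len(outcomes)

outcomes = [
    {"variant": "A", "resolved": True},
    {"variant": "A", "resolved": False},
    {"variant": "B", "resolved": True},
    {"variant": "B", "resolved": True},
]
rate_a = resolution_rate([o for o in outcomes if o["variant"] == "A"])
rate_b = resolution_rate([o for o in outcomes if o["variant"] == "B"])
print(rate_a, rate_b)  # 0.5 1.0
```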

Choosing Metrics

Pick metrics that:

  • Align with your goals
  • Are measurable
  • Have enough volume

Monitoring Experiments

Experiment Dashboard

View running experiments:

  • Current performance
  • Sample size
  • Estimated completion
  • Statistical significance
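The estimated-completion figure can be derived from the required sample size and the rate of in-experiment tickets. A hypothetical sketch (the dashboard's actual calculation is not documented here):

```python
import math

def estimated_days_remaining(needed_per_variant: int,
                             collected_per_variant: int,
                             tickets_per_day: int,
                             traffic_pct: int) -> int:
    """Remaining samples divided by the daily rate of in-experiment tickets."""
    remaining = max(0, (needed_per_variant - collected_per_variant) * 2)
    daily_in_experiment = tickets_per_day * traffic_pct / 100
    if daily_in_experiment == 0:
        raise ValueError("no traffic allocated to the experiment")
    return math.ceil(remaining / daily_in_experiment)

# 100 tickets/day, 20% in the experiment, need 357 per variant, have 150
print(estimated_days_remaining(357, 150, 100, 20))  # 21
```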

Pausing Experiments

If a variant performs poorly:

  1. Click Pause
  2. All traffic goes to the other variant
  3. Review results
  4. Decide to resume or end

Interpreting Results

Statistical Significance

Results need 95% confidence to be meaningful:

  • < 95% - Not enough data, continue
  • ≥ 95% - Results are reliable
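The 95% threshold typically comes from a hypothesis test such as a two-proportion z-test. The sketch below shows that standard test; the product's exact statistical method is not specified here:

```python
import math

def significance(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-proportion z-test: returns the two-sided confidence level (0-1)
    that the variants truly differ on a rate metric."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    z = abs(p_a - p_b) / se
    # Two-sided p-value from the normal CDF; confidence = 1 - p
    p_value = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))
    return 1 - p_value

# 120/200 tickets resolved for variant A vs 150/200 for variant B
conf = significance(120, 200, 150, 200)
print(f"{conf:.0%}")
```

With these example counts the difference clears the 95% bar; identical rates would score near 0%.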

Winning Variant

The winner is the variant with:

  • Better metric performance
  • Statistical significance
  • Practical difference (not just statistical)

Inconclusive Results

If no clear winner:

  • Variants may be equivalent
  • Consider testing something else
  • Or refine the hypothesis

Applying Results

Auto-Apply

If enabled, winning approaches automatically:

  • Update AI behavior
  • Store in Client Brain
  • Apply to future tickets

Manual Apply

Review results first:

  1. See experiment results
  2. Click Apply Winner
  3. Confirm the change

Rollback

If issues arise after applying:

  1. Go to the experiment results
  2. Click Rollback
  3. The AI reverts to its previous behavior

Best Practices

Test One Thing

Change only one variable per experiment:

  • ✅ Short vs long responses
  • ❌ Short formal vs long casual

Adequate Sample Size

Need enough tickets for reliable results:

  • Minimum ~100 tickets per variant
  • More for small effect sizes
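The ~100-ticket minimum can be sanity-checked with the standard sample-size approximation for comparing two proportions at 95% confidence and 80% power. This is textbook statistics, not a documented product formula:

```python
import math

def sample_size_per_variant(baseline_rate: float, min_lift: float,
                            z_alpha: float = 1.96,   # 95% confidence
                            z_beta: float = 0.84) -> int:  # 80% power
    """Rough per-variant sample size to detect `min_lift` over `baseline_rate`."""
    p = baseline_rate + min_lift / 2  # average rate if the lift is real
    n = 2 * (z_alpha + z_beta) ** 2 * p * (1 - p) / min_lift ** 2
    return math.ceil(n)

# Detecting a 10-point lift from a 60% baseline resolution rate
print(sample_size_per_variant(0.60, 0.10))  # 357
```

Note how halving the detectable lift roughly quadruples the required sample, which is why small effect sizes need far more tickets.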

Run Long Enough

Give experiments time:

  • At least 1-2 weeks
  • Account for weekly patterns

Document Learnings

Record what you learned:

  • Why the winner won
  • Insights for future tests
  • Failed hypotheses

Scheduled Experiments

Set up recurring experiments:

  1. Click Schedule
  2. Define recurrence
  3. AI automatically tests variations

Good for:

  • Continuous improvement
  • Seasonal optimization
  • Changing customer needs