Overview

A/B testing (also called split testing) lets you compare multiple prompt versions to see which performs better. Switchport provides deterministic routing so users get a consistent experience.

How It Works

  1. Create multiple versions of a prompt in the dashboard
  2. Set up a traffic config to distribute users across versions
  3. Execute prompts with user identification to get deterministic version assignment
  4. Record metrics to measure performance
  5. Analyze results in the dashboard to identify winners

Creating Multiple Versions

In the Switchport dashboard:

  1. Navigate to your prompt: go to Prompts and select the prompt you want to test.
  2. Create the first version:
     • Click Add Version
     • Name it v1 or formal-tone
     • Set the model and prompt template
     • Click Save and Publish
  3. Create a second version:
     • Click Add Version again
     • Name it v2 or casual-tone
     • Use different wording or a different approach
     • Click Save and Publish
  4. Create a traffic config:
     • Click Traffic Config
     • Set the distribution (e.g., 50% v1, 50% v2)
     • Click Activate

Executing Prompts with A/B Testing

The key to A/B testing is the subject parameter, which Switchport uses for deterministic routing:

```python
from switchport import Switchport

client = Switchport()

# User 1 gets assigned a version
response_1 = client.prompts.execute(
    prompt_key="product-pitch",
    subject={"user_id": "user_001"},
    variables={"product": "Pro Plan"}
)
print(f"User 1 got: {response_1.version_name}")

# User 2 might get a different version
response_2 = client.prompts.execute(
    prompt_key="product-pitch",
    subject={"user_id": "user_002"},
    variables={"product": "Pro Plan"}
)
print(f"User 2 got: {response_2.version_name}")

# User 1 will ALWAYS get the same version
response_1_again = client.prompts.execute(
    prompt_key="product-pitch",
    subject={"user_id": "user_001"},
    variables={"product": "Pro Plan"}
)
print(f"User 1 again: {response_1_again.version_name}")

# These will be equal:
assert response_1.version_id == response_1_again.version_id
```

The same subject always returns the same version, which ensures users get a consistent experience across sessions.
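Deterministic routing like this is typically implemented by hashing the subject into a stable bucket. The following is a conceptual sketch of that idea (not Switchport's actual algorithm; the function name and details are illustrative):

```python
import hashlib

def assign_version(subject_id: str, distribution: dict[str, int]) -> str:
    """Deterministically map a subject to a version by hashing its ID."""
    # Hash the subject ID to a stable bucket in [0, 100)
    digest = hashlib.sha256(subject_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100
    # Walk the cumulative distribution to find the assigned version
    cumulative = 0
    for version, weight in distribution.items():
        cumulative += weight
        if bucket < cumulative:
            return version
    return version  # fallback if weights under-cover due to rounding

# The same subject always lands in the same bucket, so assignment is stable
assert assign_version("user_001", {"v1": 50, "v2": 50}) == \
       assign_version("user_001", {"v1": 50, "v2": 50})
```

Because the bucket depends only on the subject, no per-user state needs to be stored for assignments to stay consistent.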

Recording Metrics for A/B Tests

To compare versions, record metrics with the same subject:

```python
# Execute prompt
response = client.prompts.execute(
    prompt_key="welcome-email",
    subject={"user_id": "user_123"},
    variables={"name": "Alice"}
)

# Send email to user...
send_email(user.email, response.text)

# Later, track if they converted
client.metrics.record(
    metric_key="conversion",
    value=True,
    subject={"user_id": "user_123"}  # Same subject!
)
```

Switchport automatically:
  • Links the metric to the version that subject saw
  • Aggregates metrics per version
  • Calculates averages, success rates, and distributions
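The per-version rollup Switchport performs can be illustrated locally. Assuming boolean conversion records already joined to version names (hypothetical data, purely for illustration), the aggregation amounts to:

```python
from collections import defaultdict

# Hypothetical metric records as (version_name, converted) pairs
records = [
    ("v1", True), ("v1", False), ("v1", True),
    ("v2", True), ("v2", True), ("v2", False), ("v2", True),
]

# Group by version and count successes
totals = defaultdict(lambda: {"n": 0, "successes": 0})
for version, converted in records:
    totals[version]["n"] += 1
    totals[version]["successes"] += converted  # True counts as 1

for version, t in sorted(totals.items()):
    rate = t["successes"] / t["n"]
    print(f"{version}: {t['successes']}/{t['n']} = {rate:.0%}")
# v1: 2/3 = 67%
# v2: 3/4 = 75%
```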

Traffic Distribution

You can distribute traffic in various ways:

50/50 Split (Classic A/B)

```json
{
  "v1": 50,
  "v2": 50
}
```

Multivariate Testing (A/B/C)

```json
{
  "v1": 33,
  "v2": 33,
  "v3": 34
}
```

Gradual Rollout

Start with a small percentage on the new version:

```json
{
  "v1": 90,
  "v2": 10
}
```

If metrics look good, increase the new version's share:

```json
{
  "v1": 50,
  "v2": 50
}
```

Finally, roll out completely:

```json
{
  "v2": 100
}
```
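In every case the weights should cover 100% of traffic. The dashboard presumably enforces this when you activate a config, but if you assemble distributions in code, a local sanity check (hypothetical helper, not part of the Switchport SDK) is cheap:

```python
def validate_distribution(distribution: dict[str, int]) -> dict[str, int]:
    """Raise if version weights do not sum to 100 percent."""
    total = sum(distribution.values())
    if total != 100:
        raise ValueError(f"Traffic weights must sum to 100, got {total}")
    return distribution

validate_distribution({"v1": 90, "v2": 10})   # ok
# validate_distribution({"v1": 60, "v2": 30})  # raises ValueError (sums to 90)
```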

Complete Example: Email A/B Test

```python
from switchport import Switchport

client = Switchport()

def send_welcome_email(user):
    """Send welcome email with A/B testing."""

    # Execute prompt (user gets assigned to a version)
    response = client.prompts.execute(
        prompt_key="welcome-email",
        subject={"user_id": user.id},
        variables={
            "name": user.name,
            "signup_date": user.created_at
        }
    )

    # Send the email
    send_email(user.email, response.text)

    # Track which version they saw (for debugging)
    log_event("email_sent", {
        "user_id": user.id,
        "version": response.version_name
    })


def track_email_opened(user_id):
    """Track when user opens the email."""
    client.metrics.record(
        metric_key="email_opened",
        value=True,
        subject={"user_id": user_id}
    )


def track_email_clicked(user_id):
    """Track when user clicks link in email."""
    client.metrics.record(
        metric_key="email_clicked",
        value=True,
        subject={"user_id": user_id}
    )


def track_conversion(user_id):
    """Track when user converts (e.g., makes purchase)."""
    client.metrics.record(
        metric_key="conversion",
        value=True,
        subject={"user_id": user_id}
    )
```

Analyzing Results

In the Switchport dashboard, you can view:
  • Metric averages per version: See which version has higher satisfaction scores
  • Conversion rates: Compare success rates for boolean metrics
  • Sample sizes: Ensure statistical significance
  • Confidence intervals: Understand the reliability of results
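The dashboard surfaces significance for you, but the underlying idea can be sketched with a standard two-proportion z-test (the numbers below are illustrative, not real data):

```python
import math

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Z-statistic for comparing two conversion rates (pooled standard error)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# v1: 100/1000 conversions vs v2: 150/1000 conversions
z = two_proportion_z(100, 1000, 150, 1000)
print(f"z = {z:.2f}")  # |z| > 1.96 implies p < 0.05 (two-sided)
```

With equal sample sizes of 1,000 this difference is clearly significant; with only a few dozen samples per version it would not be, which is why sample size matters before declaring a winner.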

Best Practices

  • Use consistent subjects: always use the same subject (e.g., user ID) for a given user across all prompt executions and metric recordings.
  • Start with even splits: for initial A/B tests, even splits gather data faster.
  • Wait for significance: ensure you have enough data for statistical significance before declaring a winner.
  • Change one variable at a time: this makes it clear what drives performance differences.
  • Roll out new versions gradually: start with a small percentage to minimize risk.
  • Define success up front: decide what metrics matter before running the test to avoid cherry-picking results.
  • Watch guardrail metrics: look for unexpected drops in other metrics when optimizing for one specific metric.

Common Use Cases

Email Marketing

Test subject lines, tone, and call-to-action wording:

```python
# Different email styles for different user segments
response = client.prompts.execute(
    prompt_key="marketing-email",
    subject={"user_id": user.id, "segment": user.segment},
    variables={"product": "Summer Sale"}
)
```

Metrics: open rate, click rate, conversion rate

Customer Support Chatbot

Test different conversation styles:
```python
response = client.prompts.execute(
    prompt_key="support-bot",
    subject={"user_id": user.id},
    variables={"issue": user_message}
)
```

Metrics: resolution rate, satisfaction score, escalation rate

Product Descriptions

Test different description styles:
```python
response = client.prompts.execute(
    prompt_key="product-description",
    subject={"product_id": product.id, "user_segment": segment},
    variables={"product_name": product.name}
)
```

Metrics: conversion rate, time on page, add-to-cart rate

Next Steps

  • Examples: see complete A/B testing examples
  • Metrics Reference: learn more about recording metrics