
Parameters Reference

Complete reference for all @pytest.mark.repeated parameters.

Parameter Overview

All parameters are specified as keyword arguments to the @pytest.mark.repeated() decorator:

import pytest


@pytest.mark.repeated(times=100, threshold=95)
def test_example():
    pass

Common Parameters

times or n 🔄

Type: int
Required: Yes (for all approaches)
Aliases: times, n

Number of times to repeat the test.

@pytest.mark.repeated(times=100, threshold=95)  # Using 'times'
@pytest.mark.repeated(n=100, threshold=95)      # Using 'n' - exactly the same

Recommendations:
- Minimum: 20-30 for basic threshold testing
- Typical: 50-200 for most use cases
- Large: 500-1000+ for precise statistical inference


Basic Threshold Approach

threshold

Type: int
Required: Yes (for basic approach)
Range: 0 to times

Minimum number of passing runs required for the test to succeed overall.

@pytest.mark.repeated(times=100, threshold=95)
# Test passes if ≥95 out of 100 runs succeed

Examples:

# 95% success rate
times=100, threshold=95

# 90% success rate
times=50, threshold=45

# 99% success rate (strict)
times=200, threshold=198

# 100% success rate (no failures tolerated)
times=10, threshold=10

Notes:
- threshold must be ≤ times
- Setting threshold=times means every run must pass, as in an ordinary deterministic test
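The threshold values in the examples above follow directly from the target success rate. As a minimal sketch of that arithmetic (the helper names min_passes_for_rate and threshold_met are illustrative and are not part of pytest-repeated):

```python
import math


def min_passes_for_rate(times: int, rate: float) -> int:
    """Smallest pass count that reaches at least `rate` over `times` runs."""
    # Round before ceil so float noise (0.95 * 100 is slightly below 95)
    # does not change the result.
    return math.ceil(round(times * rate, 9))


def threshold_met(passes: int, threshold: int) -> bool:
    """Basic rule from this section: succeed when passes >= threshold."""
    return passes >= threshold


print(min_passes_for_rate(100, 0.95))  # 95
print(min_passes_for_rate(50, 0.90))   # 45
print(min_passes_for_rate(200, 0.99))  # 198
print(threshold_met(passes=96, threshold=95))  # True
```

Passing the computed value as threshold keeps the decorator arguments in sync with the success rate you actually intend.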

stop_if_threshold_met

Type: bool
Required: No
Default: False
Compatible with: Basic threshold approach only

If True, stops running tests as soon as the threshold is met, potentially saving time.

@pytest.mark.repeated(times=1000, threshold=10, stop_if_threshold_met=True)
# Stops as soon as 10 runs have passed (after exactly 10 runs if none fail)
# instead of running all 1000 times

Examples:

# Early stopping enabled - stops at 5 passes
@pytest.mark.repeated(times=100, threshold=5, stop_if_threshold_met=True)

# Default behavior - runs all 100 times
@pytest.mark.repeated(times=100, threshold=5)

# Explicit default - runs all 100 times
@pytest.mark.repeated(times=100, threshold=5, stop_if_threshold_met=False)

Notes:
- Only compatible with the basic threshold approach; combining it with frequentist (H0/null) or Bayesian parameters raises ValueError
- Useful for expensive tests where you want to stop early on success
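Early stopping can be pictured as a loop that exits once enough runs have passed. The following is a simplified sketch, not the plugin's implementation; run_repeated and its test_fn argument are hypothetical names used only for illustration:

```python
from itertools import cycle
from typing import Callable, Tuple


def run_repeated(test_fn: Callable[[], bool], times: int, threshold: int,
                 stop_if_threshold_met: bool = False) -> Tuple[int, int]:
    """Run test_fn up to `times` times and return (runs_executed, passes)."""
    runs = 0
    passes = 0
    for _ in range(times):
        runs += 1
        if test_fn():
            passes += 1
        # Exit early only when the option is on and enough runs have passed.
        if stop_if_threshold_met and passes >= threshold:
            break
    return runs, passes


# A fake test that passes twice, then fails once, repeatedly.
outcomes = cycle([True, True, False])
print(run_repeated(lambda: next(outcomes), times=100, threshold=5,
                   stop_if_threshold_met=True))   # (7, 5)

outcomes = cycle([True, True, False])
print(run_repeated(lambda: next(outcomes), times=100, threshold=5))  # (100, 67)
```

Note that the number of runs executed depends on when the passes occur: here the fifth pass arrives on run 7, not run 5, because two runs failed along the way.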


Frequentist Approach

H0 or null

Type: float
Required: Yes (for frequentist approach)
Range: 0.0 to 1.0
Aliases: H0, null

The null hypothesis proportion - the baseline success rate to test against.

@pytest.mark.repeated(times=100, H0=0.90, ci=0.95)    # Using 'H0'
@pytest.mark.repeated(times=100, null=0.90, ci=0.95)  # Using 'null' - exactly the same

Interpretation:
- Test passes if H₀ can be rejected at the chosen confidence level (i.e., the data support a success rate > H₀)
- Test fails if H₀ cannot be rejected

Examples:

# High bar: must exceed 95% success rate
H0=0.95

# Moderate bar: must exceed 80% success rate
null=0.80

# Low bar: must exceed 60% success rate
H0=0.60

ci

Type: float
Required: Yes (for frequentist approach)
Range: 0.0 to 1.0

Confidence level for the confidence interval.

@pytest.mark.repeated(times=100, H0=0.85, ci=0.95)
# 95% confidence interval

Common values:
- 0.90 - 90% confidence (less strict)
- 0.95 - 95% confidence (standard, recommended)
- 0.99 - 99% confidence (very strict)

Notes:
- Higher ci = stricter test = need more passes to reject H₀
- 0.95 is standard in most scientific fields
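This page does not say which statistical procedure the plugin uses internally. One common way to make this kind of decision is an exact one-sided binomial test, sketched below under that assumption. The function names are illustrative, and the plugin's own calculation may differ (for example, it may compare a confidence-interval bound against H₀ instead).

```python
from math import comb
from typing import Optional


def binomial_upper_tail(passes: int, times: int, h0: float) -> float:
    """P(X >= passes) for X ~ Binomial(times, h0): the one-sided p-value."""
    return sum(comb(times, k) * h0**k * (1 - h0)**(times - k)
               for k in range(passes, times + 1))


def rejects_h0(passes: int, times: int, h0: float, ci: float) -> bool:
    """Reject H0 when the p-value is at most the significance level 1 - ci."""
    return binomial_upper_tail(passes, times, h0) <= 1 - ci


def min_passes_to_reject(times: int, h0: float, ci: float) -> Optional[int]:
    """Smallest pass count that rejects H0, or None if no count does."""
    for passes in range(times + 1):
        if rejects_h0(passes, times, h0, ci):
            return passes
    return None


print(rejects_h0(100, 100, h0=0.90, ci=0.95))  # True
print(rejects_h0(90, 100, h0=0.90, ci=0.95))   # False
# A higher ci never lowers the number of passes needed.
print(min_passes_to_reject(100, 0.90, 0.95), min_passes_to_reject(100, 0.90, 0.99))
```

Observing exactly 90 passes out of 100 is not enough to reject H0=0.90: matching the baseline is not evidence of exceeding it.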


Bayesian Approach

success_rate_threshold

Type: float
Required: Yes (for Bayesian approach)
Range: 0.0 to 1.0

Minimum success rate your code must achieve.

@pytest.mark.repeated(
    times=100,
    success_rate_threshold=0.85,  # Code must succeed ≥85% of the time
    posterior_threshold_probability=0.95
)

Examples:

# Strict: 95% success rate required
success_rate_threshold=0.95

# Moderate: 80% success rate required
success_rate_threshold=0.80

# Lenient: 60% success rate required
success_rate_threshold=0.60

posterior_threshold_probability

Type: float
Required: Yes (for Bayesian approach)
Range: 0.0 to 1.0

How confident you need to be that success_rate_threshold is met.

@pytest.mark.repeated(
    times=100,
    success_rate_threshold=0.85,
    posterior_threshold_probability=0.95  # 95% confidence required
)

Common values:
- 0.90 - 90% confidence (easier to pass)
- 0.95 - 95% confidence (standard, recommended)
- 0.99 - 99% confidence (very strict)

Interpretation: Test passes if P(success_rate ≥ success_rate_threshold | data) ≥ posterior_threshold_probability

prior_passes or prior_alpha

Type: int
Required: Yes (for Bayesian approach)
Range: > 0 (typically ≥ 1)
Aliases: prior_passes, prior_alpha

Prior belief: number of successes in your Beta prior.

# Using 'prior_passes'
@pytest.mark.repeated(times=100, prior_passes=9, prior_failures=1, ...)

# Using 'prior_alpha' - exactly the same
@pytest.mark.repeated(times=100, prior_alpha=9, prior_beta=1, ...)

Common patterns:

# Uninformative prior (let data decide)
prior_passes=1, prior_failures=1  # Beta(1,1) = uniform

# Weak prior favoring 80% success
prior_alpha=8, prior_beta=2  # Beta(8,2), mean=0.8, weak

# Strong prior favoring 90% success
prior_passes=90, prior_failures=10  # Beta(90,10), mean=0.9, strong

Notes:
- Higher values = stronger prior belief
- Prior mean = prior_passes / (prior_passes + prior_failures)
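To see what "stronger prior belief" means in practice, the sketch below applies the textbook Beta-binomial update, in which observed passes and failures are added to the prior counts. It assumes the plugin combines prior and data this way; beta_mean and update are illustrative helper names.

```python
def beta_mean(alpha: int, beta: int) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def update(prior_passes: int, prior_failures: int,
           passes: int, failures: int) -> tuple:
    """Conjugate update: add observed counts to the prior counts."""
    return prior_passes + passes, prior_failures + failures


observed = (70, 30)  # 70 passes and 30 failures in 100 runs
for name, prior in [("weak", (8, 2)), ("strong", (90, 10))]:
    posterior = update(*prior, *observed)
    print(name, round(beta_mean(*prior), 3), "->", round(beta_mean(*posterior), 3))
# weak 0.8 -> 0.709
# strong 0.9 -> 0.8
```

With the same data, the weak prior ends up close to the observed 70% rate, while the strong prior keeps the estimate noticeably higher.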

prior_failures or prior_beta

Type: int
Required: Yes (for Bayesian approach)
Range: > 0 (typically ≥ 1)
Aliases: prior_failures, prior_beta

Prior belief: number of failures in your Beta prior.

# Using 'prior_failures'
@pytest.mark.repeated(times=100, prior_passes=85, prior_failures=15, ...)

# Using 'prior_beta' - exactly the same
@pytest.mark.repeated(times=100, prior_alpha=85, prior_beta=15, ...)

Examples:

# Optimistic prior (expect high success)
prior_passes=19, prior_failures=1  # 95% expected success

# Balanced prior
prior_alpha=50, prior_beta=50  # 50% expected success, strong belief

# Pessimistic prior (expect low success)
prior_passes=3, prior_failures=7  # 30% expected success

Notes:
- Prior strength = prior_passes + prior_failures
- Sum of 2-10 = weak prior, 10-50 = moderate, 50+ = strong
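The Bayesian pass rule, P(success_rate ≥ success_rate_threshold | data) ≥ posterior_threshold_probability, can be evaluated with the standard library alone when the prior counts are integers. The sketch below assumes a Beta posterior obtained by adding observed passes and failures to the prior counts; the function names are illustrative, and the plugin may compute the probability differently.

```python
from math import comb


def prob_rate_at_least(rate: float, alpha: int, beta: int) -> float:
    """P(p >= rate) for p ~ Beta(alpha, beta) with integer alpha, beta >= 1."""
    n = alpha + beta - 1
    # For integer parameters the Beta tail equals a binomial CDF:
    # P(p >= rate) = P(X <= alpha - 1) with X ~ Binomial(n, rate).
    return sum(comb(n, j) * rate**j * (1 - rate)**(n - j) for j in range(alpha))


def bayesian_decision(passes: int, times: int,
                      success_rate_threshold: float,
                      posterior_threshold_probability: float,
                      prior_passes: int = 1, prior_failures: int = 1) -> bool:
    """Apply the pass rule to a Beta posterior built from prior + data."""
    alpha = prior_passes + passes
    beta = prior_failures + (times - passes)
    posterior = prob_rate_at_least(success_rate_threshold, alpha, beta)
    return posterior >= posterior_threshold_probability


print(bayesian_decision(100, 100, 0.85, 0.95))  # True
print(bayesian_decision(80, 100, 0.85, 0.95))   # False
```

The defaults here use the uninformative Beta(1,1) prior only to keep the example short; the decorator itself requires both prior parameters.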


Parameter Aliases Summary

pytest-repeated supports multiple names for the same parameter to match different naming conventions:

| Primary Name | Alias(es) | Description |
|---|---|---|
| times | n | Number of repetitions |
| H0 | null | Null hypothesis proportion (frequentist) |
| prior_passes | prior_alpha | Beta prior successes (Bayesian) |
| prior_failures | prior_beta | Beta prior failures (Bayesian) |

Use whichever naming convention your team prefers - they're functionally identical.


Parameter Combinations

Valid Combinations

Basic Threshold Testing:

@pytest.mark.repeated(times=100, threshold=95)
Required: times (or n), threshold

Frequentist Testing:

@pytest.mark.repeated(times=100, H0=0.90, ci=0.95)
# or
@pytest.mark.repeated(n=100, null=0.90, ci=0.95)
Required: times/n, H0/null, ci

Bayesian Testing:

@pytest.mark.repeated(
    times=100,
    success_rate_threshold=0.85,
    posterior_threshold_probability=0.95,
    prior_passes=10,
    prior_failures=2
)
# or
@pytest.mark.repeated(
    n=100,
    success_rate_threshold=0.85,
    posterior_threshold_probability=0.95,
    prior_alpha=10,
    prior_beta=2
)
Required: times/n, success_rate_threshold, posterior_threshold_probability, prior_passes/prior_alpha, prior_failures/prior_beta

Invalid Combinations

Don't mix approaches:

# WRONG - mixing basic and frequentist
@pytest.mark.repeated(times=100, threshold=95, H0=0.90, ci=0.95)

Don't mix aliases:

# WRONG - using both times and n
@pytest.mark.repeated(times=100, n=50, threshold=95)

Best practice: Use parameters for one statistical approach per test.
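The rules above amount to a small validation step: resolve aliases, then check that the remaining parameters belong to exactly one approach. The sketch below is a hypothetical helper written for illustration, not the plugin's own validation code; detect_approach, ALIASES, and APPROACH_PARAMS are assumed names, and the error messages are invented.

```python
ALIASES = {"n": "times", "null": "H0",
           "prior_alpha": "prior_passes", "prior_beta": "prior_failures"}

APPROACH_PARAMS = {
    "basic": {"threshold"},
    "frequentist": {"H0", "ci"},
    "bayesian": {"success_rate_threshold", "posterior_threshold_probability",
                 "prior_passes", "prior_failures"},
}


def detect_approach(**kwargs) -> str:
    """Return 'basic', 'frequentist' or 'bayesian', or raise ValueError."""
    canonical = {}
    for name, value in kwargs.items():
        key = ALIASES.get(name, name)  # map each alias to its primary name
        if key in canonical:
            raise ValueError(f"{key!r} was given twice (primary name and alias)")
        canonical[key] = value
    if "times" not in canonical:
        raise ValueError("times (or n) is required")
    given = set(canonical)
    used = [name for name, params in APPROACH_PARAMS.items() if params & given]
    if len(used) != 1:
        raise ValueError(f"expected parameters for exactly one approach, got {used}")
    missing = APPROACH_PARAMS[used[0]] - given
    if missing:
        raise ValueError(f"missing for {used[0]} approach: {sorted(missing)}")
    if canonical.get("stop_if_threshold_met") and used[0] != "basic":
        raise ValueError("stop_if_threshold_met requires the basic approach")
    return used[0]


print(detect_approach(times=100, threshold=95))       # basic
print(detect_approach(n=100, null=0.90, ci=0.95))     # frequentist
```

Calling detect_approach(times=100, threshold=95, H0=0.90, ci=0.95) or detect_approach(times=100, n=50, threshold=95) raises ValueError in this sketch, mirroring the two invalid combinations shown above.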


Type Specifications

| Parameter | Type | Valid Range | Default |
|---|---|---|---|
| times / n | int | > 0 | None (required) |
| threshold | int | 0 to times | None |
| H0 / null | float | 0.0 to 1.0 | None |
| ci | float | 0.0 to 1.0 | None |
| success_rate_threshold | float | 0.0 to 1.0 | None |
| posterior_threshold_probability | float | 0.0 to 1.0 | None |
| prior_passes / prior_alpha | int | > 0 | None |
| prior_failures / prior_beta | int | > 0 | None |

Examples by Use Case

LLM Testing (Basic)

@pytest.mark.repeated(times=50, threshold=48)
def test_llm_accuracy():
    response = call_llm("What is 2+2?")
    assert "4" in response

LLM Testing (Frequentist)

@pytest.mark.repeated(times=100, H0=0.90, ci=0.95)
def test_llm_exceeds_90_percent():
    response = call_llm("What is the capital of France?")
    assert "Paris" in response

LLM Testing (Bayesian)

@pytest.mark.repeated(
    times=100,
    success_rate_threshold=0.85,
    posterior_threshold_probability=0.95,
    prior_alpha=17,  # Previous testing showed ~85% success
    prior_beta=3
)
def test_llm_maintains_quality():
    response = call_llm("Translate 'hello' to French")
    assert "bonjour" in response.lower()

ML Model Testing (Frequentist)

@pytest.mark.repeated(n=500, null=0.80, ci=0.99)
def test_model_exceeds_baseline():
    sample = get_test_sample()
    prediction = model.predict(sample.features)
    assert prediction == sample.label

Randomized Algorithm (Bayesian with uninformative prior)

@pytest.mark.repeated(
    times=200,
    success_rate_threshold=0.90,
    posterior_threshold_probability=0.95,
    prior_passes=1,      # Uninformative prior
    prior_failures=1
)
def test_new_random_algorithm():
    result = random_algorithm(get_input())
    assert validate_result(result)

Quick Decision Guide

Choose your approach:

  1. Need simple pass/fail? → Use threshold
  2. Need statistical rigor with hypothesis testing? → Use H0 + ci
  3. Have prior knowledge to incorporate? → Use success_rate_threshold + posterior_threshold_probability + priors

Number of repetitions (times):
- Quick tests: 20-50
- Standard tests: 100-200
- Precise tests: 500-1000+


Next Steps