Parameters Reference

Complete reference for all @pytest.mark.repeated parameters.

Parameter Overview

All parameters are specified as keyword arguments to the @pytest.mark.repeated() decorator:

import pytest

@pytest.mark.repeated(times=100, threshold=95)
def test_example():
    pass

Common Parameters

`times` or `n` 🔄

Type: int
Required: Yes (for all approaches)
Aliases: times ≡ n

Number of times to repeat the test.

@pytest.mark.repeated(times=100, threshold=95)  # Using 'times'
@pytest.mark.repeated(n=100, threshold=95)      # Using 'n' - exactly the same

Recommendations:
- Minimum: 20-30 for basic threshold testing
- Typical: 50-200 for most use cases
- Large: 500-1000+ for precise statistical inference
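To see why larger values of times support more precise inference, the sketch below estimates the margin of error of an observed pass rate for several repetition counts. It uses the normal approximation to the binomial purely as an illustration; this is an assumption, not necessarily the interval pytest-repeated computes, and margin_of_error is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(pass_rate: float, times: int, confidence: float = 0.95) -> float:
    """Half-width of a two-sided normal-approximation interval for a pass rate."""
    # z-score for the two-sided confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(pass_rate * (1 - pass_rate) / times)


# The margin shrinks roughly with the square root of the number of runs.
for times in (20, 100, 500, 1000):
    print(times, round(margin_of_error(0.90, times), 3))
```

Because the margin shrinks with the square root of times, quadrupling the number of runs only halves the uncertainty.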

Basic Threshold Approach

`threshold`

Type: int
Required: Yes (for basic approach)
Range: 0 to times

Minimum number of passes required for the test to succeed overall.

@pytest.mark.repeated(times=100, threshold=95)
# Test passes if ≥95 out of 100 runs succeed

Examples:

# 95% success rate
times=100, threshold=95

# 90% success rate
times=50, threshold=45

# 99% success rate (strict)
times=200, threshold=198

# 100% success rate (no randomness)
times=10, threshold=10

Notes:
- threshold must be ≤ times
- Setting threshold=times means all runs must pass (deterministic)
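Since threshold is a count and not a percentage, it can help to derive it from a target pass rate. The helper below is a minimal sketch; threshold_for is a hypothetical name and is not part of pytest-repeated. It rounds up with integer arithmetic so the required rate is never undershot.

```python
def threshold_for(times: int, percent: int) -> int:
    """Smallest number of passes that reaches `percent` % of `times` runs."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Ceiling division on integers avoids floating-point rounding surprises.
    return -(-times * percent // 100)


print(threshold_for(100, 95))  # 95
print(threshold_for(50, 90))   # 45
print(threshold_for(30, 95))   # 29, because 28.5 is rounded up
```

The result can then be passed as threshold=... alongside the matching times value.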

`stop_if_threshold_met`

Type: bool
Required: No
Default: False
Compatible with: Basic threshold approach only

If True, stops running tests as soon as the threshold is met, potentially saving time.

@pytest.mark.repeated(times=1000, threshold=10, stop_if_threshold_met=True)
# Stops as soon as 10 runs have passed (after 10 runs if none fail)
# instead of running all 1000 times

Examples:

# Early stopping enabled - stops at 5 passes
@pytest.mark.repeated(times=100, threshold=5, stop_if_threshold_met=True)

# Default behavior - runs all 100 times
@pytest.mark.repeated(times=100, threshold=5)

# Explicit default - runs all 100 times
@pytest.mark.repeated(times=100, threshold=5, stop_if_threshold_met=False)

Notes:
- Only compatible with the basic threshold approach; combining it with frequentist (H0/null) or Bayesian parameters raises ValueError
- Useful for expensive tests where you want to stop early on success
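The early-stopping behavior can be pictured with a small standalone loop. This is only a sketch of the idea under simplifying assumptions, not the plugin's implementation; run_repeated and run_once are hypothetical names, and run_once stands in for a single test execution that returns True on a pass.

```python
from typing import Callable, Tuple


def run_repeated(run_once: Callable[[], bool], times: int, threshold: int,
                 stop_if_threshold_met: bool = False) -> Tuple[int, int]:
    """Return (passes, runs_executed) for a repeated test."""
    passes = 0
    runs = 0
    for _ in range(times):
        runs += 1
        if run_once():
            passes += 1
        # With early stopping, leave the loop once enough runs have passed.
        if stop_if_threshold_met and passes >= threshold:
            break
    return passes, runs


# A test that always passes stops after exactly `threshold` runs.
print(run_repeated(lambda: True, times=1000, threshold=10, stop_if_threshold_met=True))
# Without early stopping, all 1000 runs execute.
print(run_repeated(lambda: True, times=1000, threshold=10))
```

If some runs fail, the loop continues until the tenth pass, so the number of executed runs can exceed the threshold.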

Frequentist Approach

`H0` or `null`

Type: float
Required: Yes (for frequentist approach)
Range: 0.0 to 1.0
Aliases: H0 ≡ null

The null hypothesis proportion - the baseline success rate to test against.

@pytest.mark.repeated(times=100, H0=0.90, ci=0.95)    # Using 'H0'
@pytest.mark.repeated(times=100, null=0.90, ci=0.95)  # Using 'null' - exactly the same

Interpretation:
- Test passes if we can confidently reject H₀ (i.e., the data support a success rate > H₀ at the chosen confidence level)
- Test fails if we cannot reject H₀

Examples:

# High bar: must exceed 95% success rate
H0=0.95

# Moderate bar: must exceed 80% success rate
null=0.80

# Low bar: must exceed 60% success rate
H0=0.60

`ci`

Type: float
Required: Yes (for frequentist approach)
Range: 0.0 to 1.0

Confidence level for the confidence interval.

@pytest.mark.repeated(times=100, H0=0.85, ci=0.95)
# 95% confidence interval

Common values:
- 0.90 - 90% confidence (less strict)
- 0.95 - 95% confidence (standard, recommended)
- 0.99 - 99% confidence (very strict)

Notes:
- Higher ci = stricter test = need more passes to reject H₀
- 0.95 is standard in most scientific fields
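To make the relationship between H0, ci, and the number of passes concrete, here is a minimal sketch using a one-sided exact binomial test. This is an illustrative assumption: pytest-repeated may compute its confidence interval differently, and binomial_tail and rejects_null are hypothetical helper names.

```python
from math import comb


def binomial_tail(passes: int, times: int, p0: float) -> float:
    """P(X >= passes) for X ~ Binomial(times, p0)."""
    return sum(comb(times, k) * p0**k * (1 - p0) ** (times - k)
               for k in range(passes, times + 1))


def rejects_null(passes: int, times: int, h0: float, ci: float) -> bool:
    """True if the observed passes are unlikely enough under H0 to reject it."""
    # Significance level is the complement of the confidence level.
    return binomial_tail(passes, times, h0) < 1 - ci


def min_passes_to_reject(times: int, h0: float, ci: float):
    """Smallest pass count that rejects H0, or None if even all passes do not."""
    for passes in range(times + 1):
        if rejects_null(passes, times, h0, ci):
            return passes
    return None


print(min_passes_to_reject(100, 0.90, 0.95))
print(min_passes_to_reject(100, 0.90, 0.99))
```

Under this exact test, 95 passes out of 100 are not enough to reject H0=0.90 at ci=0.95, while 96 are; raising ci to 0.99 pushes the requirement higher still.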

Bayesian Approach

`success_rate_threshold`

Type: float
Required: Yes (for Bayesian approach)
Range: 0.0 to 1.0

Minimum success rate your code must achieve.

@pytest.mark.repeated(
    times=100,
    success_rate_threshold=0.85,  # Code must succeed ≥85% of the time
    posterior_threshold_probability=0.95
)

Examples:

# Strict: 95% success rate required
success_rate_threshold=0.95

# Moderate: 80% success rate required
success_rate_threshold=0.80

# Lenient: 60% success rate required
success_rate_threshold=0.60

`posterior_threshold_probability`

Type: float
Required: Yes (for Bayesian approach)
Range: 0.0 to 1.0

How confident you need to be that success_rate_threshold is met.

@pytest.mark.repeated(
    times=100,
    success_rate_threshold=0.85,
    posterior_threshold_probability=0.95  # 95% confidence required
)

Common values:
- 0.90 - 90% confidence (easier to pass)
- 0.95 - 95% confidence (standard, recommended)
- 0.99 - 99% confidence (very strict)

Interpretation: Test passes if P(success_rate ≥ success_rate_threshold | data) ≥ posterior_threshold_probability
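The pass condition above can be evaluated by hand with a Beta posterior. The sketch below assumes a Beta prior updated with the observed passes and failures, and relies on the identity that, for integer parameters, the Beta tail probability equals a binomial CDF. The function names are hypothetical, and the plugin's own computation may differ.

```python
from math import comb


def posterior_prob_at_least(rate: float, alpha: int, beta: int) -> float:
    """P(p >= rate) for p ~ Beta(alpha, beta) with positive integer parameters."""
    # For integer parameters the Beta tail equals a binomial CDF:
    # P(p >= rate) = P(X <= alpha - 1), X ~ Binomial(alpha + beta - 1, rate).
    n = alpha + beta - 1
    return sum(comb(n, k) * rate**k * (1 - rate) ** (n - k) for k in range(alpha))


def bayesian_verdict(passes: int, failures: int,
                     prior_passes: int, prior_failures: int,
                     success_rate_threshold: float,
                     posterior_threshold_probability: float) -> bool:
    """True if the posterior probability of meeting the rate is high enough."""
    # Conjugate update: add observed counts to the prior pseudo-counts.
    alpha = prior_passes + passes
    beta = prior_failures + failures
    prob = posterior_prob_at_least(success_rate_threshold, alpha, beta)
    return prob >= posterior_threshold_probability


# 100 passes out of 100 with a uniform prior clears an 85% bar easily.
print(bayesian_verdict(100, 0, 1, 1, 0.85, 0.95))
# 85 passes out of 100 sits right at the bar, so the posterior is not confident.
print(bayesian_verdict(85, 15, 1, 1, 0.85, 0.95))
```

Observing a pass rate equal to success_rate_threshold is generally not enough: the posterior then puts only about half of its mass above the threshold.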

`prior_passes` or `prior_alpha`

Type: int
Required: Yes (for Bayesian approach)
Range: > 0 (typically ≥ 1)
Aliases: prior_passes ≡ prior_alpha

Prior belief: number of successes in your Beta prior.

# Using 'prior_passes'
@pytest.mark.repeated(times=100, prior_passes=9, prior_failures=1, ...)

# Using 'prior_alpha' - exactly the same
@pytest.mark.repeated(times=100, prior_alpha=9, prior_beta=1, ...)

Common patterns:

# Uninformative prior (let data decide)
prior_passes=1, prior_failures=1  # Beta(1,1) = uniform

# Weak prior favoring 80% success
prior_alpha=8, prior_beta=2  # Beta(8,2), mean=0.8, weak

# Strong prior favoring 90% success
prior_passes=90, prior_failures=10  # Beta(90,10), mean=0.9, strong

Notes:
- Higher values = stronger prior belief
- Prior mean = prior_passes / (prior_passes + prior_failures)

`prior_failures` or `prior_beta`

Type: int
Required: Yes (for Bayesian approach)
Range: > 0 (typically ≥ 1)
Aliases: prior_failures ≡ prior_beta

Prior belief: number of failures in your Beta prior.

# Using 'prior_failures'
@pytest.mark.repeated(times=100, prior_passes=85, prior_failures=15, ...)

# Using 'prior_beta' - exactly the same
@pytest.mark.repeated(times=100, prior_alpha=85, prior_beta=15, ...)

Examples:

# Optimistic prior (expect high success)
prior_passes=19, prior_failures=1  # 95% expected success

# Balanced prior
prior_alpha=50, prior_beta=50  # 50% expected success, strong belief

# Pessimistic prior (expect low success)
prior_passes=3, prior_failures=7  # 30% expected success

Notes:
- Prior strength = prior_passes + prior_failures
- Sum of 2-10 = weak prior, 10-50 = moderate, 50+ = strong
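The effect of prior strength is easiest to see by comparing two priors with the same mean against the same data. The following sketch uses the standard Beta-binomial update; prior_summary and posterior_mean are hypothetical helper names used only for illustration.

```python
def prior_summary(prior_passes: int, prior_failures: int):
    """Return (prior mean, prior strength) for a Beta prior."""
    strength = prior_passes + prior_failures
    return prior_passes / strength, strength


def posterior_mean(prior_passes: int, prior_failures: int,
                   passes: int, failures: int) -> float:
    """Mean of the Beta posterior after adding observed counts to the prior."""
    total = prior_passes + prior_failures + passes + failures
    return (prior_passes + passes) / total


# Two priors with the same mean (0.9) but different strength.
weak = (9, 1)
strong = (90, 10)
print(prior_summary(*weak), prior_summary(*strong))

# Same observed data for both: 70 passes, 30 failures.
print(round(posterior_mean(*weak, 70, 30), 3))
print(round(posterior_mean(*strong, 70, 30), 3))
```

With the weak prior the posterior mean lands close to the observed 70% pass rate, while the strong prior pulls it noticeably toward 90%.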

Parameter Aliases Summary

pytest-repeated supports multiple names for the same parameter to match different naming conventions:

Primary Name	Alias(es)	Description
`times`	`n`	Number of repetitions
`H0`	`null`	Null hypothesis proportion (frequentist)
`prior_passes`	`prior_alpha`	Beta prior successes (Bayesian)
`prior_failures`	`prior_beta`	Beta prior failures (Bayesian)

Use whichever naming convention your team prefers - they're functionally identical.

Parameter Combinations

Valid Combinations

Basic Threshold Testing:

@pytest.mark.repeated(times=100, threshold=95)

Required: times (or n), threshold

Frequentist Testing:

@pytest.mark.repeated(times=100, H0=0.90, ci=0.95)
# or
@pytest.mark.repeated(n=100, null=0.90, ci=0.95)

Required: times/n, H0/null, ci

Bayesian Testing:

@pytest.mark.repeated(
    times=100,
    success_rate_threshold=0.85,
    posterior_threshold_probability=0.95,
    prior_passes=10,
    prior_failures=2
)
# or
@pytest.mark.repeated(
    n=100,
    success_rate_threshold=0.85,
    posterior_threshold_probability=0.95,
    prior_alpha=10,
    prior_beta=2
)

Required: times/n, success_rate_threshold, posterior_threshold_probability, prior_passes/prior_alpha, prior_failures/prior_beta

Invalid Combinations

❌ Don't mix approaches:

# WRONG - mixing basic and frequentist
@pytest.mark.repeated(times=100, threshold=95, H0=0.90, ci=0.95)

❌ Don't mix aliases:

# WRONG - using both times and n
@pytest.mark.repeated(times=100, n=50, threshold=95)

Best practice: Use parameters for one statistical approach per test.
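The combination rules above can be expressed as a small check. This is a hypothetical sketch for reasoning about which parameters belong together, not pytest-repeated's actual validation logic; detect_approach and the error messages are assumptions.

```python
APPROACH_PARAMS = {
    "basic": {"threshold", "stop_if_threshold_met"},
    "frequentist": {"H0", "null", "ci"},
    "bayesian": {
        "success_rate_threshold", "posterior_threshold_probability",
        "prior_passes", "prior_alpha", "prior_failures", "prior_beta",
    },
}
ALIAS_PAIRS = [("times", "n"), ("H0", "null"),
               ("prior_passes", "prior_alpha"), ("prior_failures", "prior_beta")]


def detect_approach(**params) -> str:
    """Return the single approach the given parameters belong to."""
    # A parameter and its alias must not be supplied together.
    for primary, alias in ALIAS_PAIRS:
        if primary in params and alias in params:
            raise ValueError(f"use either {primary!r} or {alias!r}, not both")
    if "times" not in params and "n" not in params:
        raise ValueError("'times' (or 'n') is required")
    # Collect every approach that has at least one of its parameters present.
    used = [name for name, keys in APPROACH_PARAMS.items() if keys & set(params)]
    if len(used) != 1:
        raise ValueError(f"expected parameters for exactly one approach, got {used}")
    return used[0]


print(detect_approach(times=100, threshold=95))
print(detect_approach(n=100, null=0.90, ci=0.95))
```

Calling detect_approach(times=100, threshold=95, H0=0.90, ci=0.95) raises ValueError in this sketch because parameters from two approaches are present.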

Type Specifications

Parameter	Type	Valid Range	Default
`times` / `n`	`int`	> 0	None (required)
`threshold`	`int`	0 to `times`	None
`stop_if_threshold_met`	`bool`	True or False	False
`H0` / `null`	`float`	0.0 to 1.0	None
`ci`	`float`	0.0 to 1.0	None
`success_rate_threshold`	`float`	0.0 to 1.0	None
`posterior_threshold_probability`	`float`	0.0 to 1.0	None
`prior_passes` / `prior_alpha`	`int`	> 0	None
`prior_failures` / `prior_beta`	`int`	> 0	None

Examples by Use Case

LLM Testing (Basic)

@pytest.mark.repeated(times=50, threshold=48)
def test_llm_accuracy():
    response = call_llm("What is 2+2?")
    assert "4" in response

LLM Testing (Frequentist)

@pytest.mark.repeated(times=100, H0=0.90, ci=0.95)
def test_llm_exceeds_90_percent():
    response = call_llm("What is the capital of France?")
    assert "Paris" in response

LLM Testing (Bayesian)

@pytest.mark.repeated(
    times=100,
    success_rate_threshold=0.85,
    posterior_threshold_probability=0.95,
    prior_alpha=17,  # Previous testing showed ~85% success
    prior_beta=3
)
def test_llm_maintains_quality():
    response = call_llm("Translate 'hello' to French")
    assert "bonjour" in response.lower()

ML Model Testing (Frequentist)

@pytest.mark.repeated(n=500, null=0.80, ci=0.99)
def test_model_exceeds_baseline():
    sample = get_test_sample()
    prediction = model.predict(sample.features)
    assert prediction == sample.label

Randomized Algorithm (Bayesian with uninformative prior)

@pytest.mark.repeated(
    times=200,
    success_rate_threshold=0.90,
    posterior_threshold_probability=0.95,
    prior_passes=1,      # Uninformative prior
    prior_failures=1
)
def test_new_random_algorithm():
    result = random_algorithm(get_input())
    assert validate_result(result)

Quick Decision Guide

Choose your approach:

Need simple pass/fail? → Use threshold
Need statistical rigor with hypothesis testing? → Use H0 + ci
Have prior knowledge to incorporate? → Use success_rate_threshold + posterior_threshold_probability + priors

Number of repetitions (times):
- Quick tests: 20-50
- Standard tests: 100-200
- Precise tests: 500-1000+

Next Steps

Basic Usage - Learn threshold-based testing
Frequentist - Hypothesis testing approach
Bayesian - Prior incorporation approach
API Reference - Low-level API documentation
