The workflow defines a template named main with a single step that runs the unstable-step template. The container script derives a pseudo-random number from the current epoch seconds and succeeds only when that number is zero, which happens on roughly one attempt in three (about 33%).
Workflow (no retries)
- RANDOM_NUM is computed from the current epoch seconds modulo 3, producing 0, 1, or 2.
- If RANDOM_NUM == 0 the script prints “Success!” and exits 0.
- Otherwise the script prints “Failed, will retry…” and exits 1.
- Because of this logic, each attempt has about a 33% chance to succeed.
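The behaviour described in the list above can be sketched as the following manifest. This is a reconstruction rather than the original listing: the `generateName`, the step name `run-unstable-step`, the `alpine:3.19` image, and the use of `main` as the entrypoint are assumptions chosen for illustration.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: unstable-step-        # hypothetical name
spec:
  entrypoint: main                    # assumed: main is the entrypoint
  templates:
    - name: main
      steps:
        - - name: run-unstable-step   # hypothetical step name
            template: unstable-step
    - name: unstable-step
      container:
        image: alpine:3.19            # assumed image; any image with sh and date works
        command: [sh, -c]
        args:
          - |
            # Epoch seconds modulo 3 yields 0, 1, or 2.
            RANDOM_NUM=$(( $(date +%s) % 3 ))
            if [ "$RANDOM_NUM" -eq 0 ]; then
              echo "Success!"
              exit 0
            else
              echo "Failed, will retry…"
              exit 1
            fi
```

Without a retryStrategy, a single failed attempt fails the whole workflow, despite the "will retry" wording in the script's message.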
To give the step more than one chance to succeed, add a retryStrategy to the template.
Workflow with retryStrategy (exponential backoff)
The same workflow can be given a retry strategy that limits the number of retries and uses exponential backoff.
retryStrategy fields: quick reference
| Field | Purpose | Example / Notes |
|---|---|---|
| limit | Maximum number of retries (does not include the initial attempt) | 3 |
| retryPolicy | When to retry | Always, OnFailure, OnError, OnTransientError |
| backoff.duration | Initial wait time before the first retry | "5s" |
| backoff.factor | Multiplier applied to the wait each retry (exponential backoff) | 2 |
| backoff.maxDuration | Maximum total time allowed for retrying under the backoff strategy; no further retries start once it has elapsed | "1m" |
- Always: retry on both failures and errors.
- OnFailure: retry only when the step's main container fails (for example, exits with a non-zero code).
- OnError: retry on internal engine errors.
- OnTransientError: retry only for transient errors (for example, temporary network/TLS issues).
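Putting the fields above together, the unstable-step template might carry a retryStrategy like the sketch below. The values are the examples from the table; the `alpine:3.19` image is an assumption, and the wait times in the comments assume the usual schedule in which the first retry waits `duration` and each later wait is multiplied by `factor`.

```yaml
# Sketch of the unstable-step entry under spec.templates, with retries enabled.
- name: unstable-step
  retryStrategy:
    limit: 3                  # up to 3 retries after the initial attempt
    retryPolicy: "Always"     # retry on failures and errors
    backoff:
      duration: "5s"          # wait before the first retry
      factor: 2               # assumed schedule: 5s, 10s, 20s
      maxDuration: "1m"       # overall time budget for retrying
  container:
    image: alpine:3.19        # assumed image; any image with sh and date works
    command: [sh, -c]
    args:
      - |
        # Same script as before: succeed only when epoch seconds % 3 == 0.
        RANDOM_NUM=$(( $(date +%s) % 3 ))
        if [ "$RANDOM_NUM" -eq 0 ]; then
          echo "Success!"
          exit 0
        else
          echo "Failed, will retry…"
          exit 1
        fi
```

With a limit of 3 the step runs at most four times in total, so the workflow can still fail if every attempt happens to land on a non-zero remainder.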
Argo template variables can be used in the container command/args to show which attempt is running.
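One variable that fits this purpose is `{{retries}}`, which Argo documents as the retry number of a container or script template that has a retryStrategy. Whether the original example used this exact variable is an assumption; the fragment below is a minimal sketch with an assumed image.

```yaml
# Fragment of a template that has a retryStrategy set.
container:
  image: alpine:3.19          # assumed image
  command: [sh, -c]
  args:
    # Argo substitutes {{retries}} before the container starts;
    # the first attempt is numbered 0.
    - 'echo "Attempt {{retries}}"'
```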
Example log flow with retries
- First attempt (Attempt 0) fails → workflow waits according to backoff → second attempt (Attempt 1) runs.
- If Attempt 1 succeeds, its log shows the script's “Success!” message and no further retries are scheduled.
When to use retries
Use retries for steps that are prone to transient or intermittent failures, such as:
- Intermittent network issues when calling external APIs.
- Temporary TLS or certificate handshake failures.
- Short-lived service outages in dependent systems.
References
- Argo Workflows — RetryStrategy: https://argoproj.github.io/argo-workflows/workflows/retries/
- Argo Workflows Documentation: https://argoproj.github.io/argo-workflows/