Can a Retry Wrapper Make LLM Tool Calls Less Reliable Instead of More?

Yes. A retry wrapper on a non-idempotent tool call retries an operation that may have already succeeded. The model called the tool, the tool executed, the response was lost due to a timeout or network error, and the retry calls the tool again. You now have a duplicate database write, a duplicate email send, or a duplicate payment charge. The retry wrapper caught the error and created a worse one.

Analysis Briefing

Topic: LLM tool call retries, idempotency, and retry logic failure modes
Analyst: Mike D (@MrComputerScience)
Context: An adversarial analysis prompted by Claude Sonnet 4.6
Source: Pithy Cyborg | AI News Made Simple
Key Question: Under what conditions does adding a retry wrapper to an AI tool call make the system less reliable?

The Idempotency Problem With AI Tool Calls

Idempotency means that calling a function multiple times with the same arguments leaves the system in the same state, and returns the same result, as calling it once. A read operation is idempotent. A write, send, or create operation is often not.
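
A minimal sketch of the difference, using hypothetical in-memory state (`settings`, `outbox`) and made-up function names rather than any real service:

```python
# Hypothetical in-memory state, used only for illustration.
settings = {}
outbox = []

def set_theme(user_id: str, theme: str) -> None:
    # Idempotent: repeating the call leaves the same final state.
    settings[user_id] = theme

def queue_email(user_id: str, body: str) -> None:
    # Not idempotent: every call appends another message.
    outbox.append((user_id, body))

# Call each function twice with identical arguments.
for _ in range(2):
    set_theme("u1", "dark")
    queue_email("u1", "Your report is ready")

print(settings)     # {'u1': 'dark'}
print(len(outbox))  # 2
```

Repeating `set_theme` is harmless. Repeating `queue_email` is exactly the duplicate a blind retry produces.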

Standard retry logic assumes the underlying operation is idempotent or that the caller can detect a duplicate and handle it. With LLM tool calls, neither assumption is reliably true. The model generates a tool call based on the conversation state. If that call is retried by a wrapper layer without the model’s knowledge, the model’s next response is based on a conversation state that does not reflect the duplicate execution.

The model called send_email with specific arguments. The send succeeded but the acknowledgment timed out. The retry wrapper calls send_email again. The recipient receives two emails. The model receives a success response from the retry and proceeds as though one email was sent. The duplicate is invisible to the AI system that caused it.
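
The scenario above can be reproduced in a few lines. This is a simulation under stated assumptions: `LostAck`, `make_flaky_send_email`, and `naive_retry` are hypothetical names, and the "network" is just a function that performs its side effect and then raises on the first call.

```python
class LostAck(Exception):
    """Raised when the tool ran but its response never arrived."""

sent = []  # what the recipient actually receives

def make_flaky_send_email():
    calls = {"n": 0}

    def send_email(to: str, subject: str) -> str:
        calls["n"] += 1
        sent.append((to, subject))  # the side effect happens first
        if calls["n"] == 1:
            # Simulate a send that succeeded but whose acknowledgment timed out.
            raise LostAck("timeout waiting for acknowledgment")
        return "ok"

    return send_email

def naive_retry(tool, *args, attempts: int = 3):
    last_error = None
    for _ in range(attempts):
        try:
            return tool(*args)
        except Exception as exc:  # catches everything, including LostAck
            last_error = exc
    raise last_error

send_email = make_flaky_send_email()
result = naive_retry(send_email, "ana@example.com", "Invoice")

print(result)     # ok
print(len(sent))  # 2
```

The caller sees a single `ok`, while the recipient's inbox holds two messages.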

When Retries on Rate Limits Make Things Worse

Rate limit retries are the most common retry use case in LLM applications. The API returns a 429 and the wrapper waits and retries. This is correct for simple completion calls, which have no side effects. It becomes problematic in agentic workflows where the model has already taken irreversible actions in the current turn: a wrapper that retries by re-running the whole turn repeats those actions along with the rejected request.

The related piece "AI function call silent failures" documents the problem of failures that are never surfaced. Retry wrappers that catch every exception and retry add a second layer of obscurity: the first attempt failed and the retry succeeded, but the first attempt may have partially executed, and nothing in the success response says so.
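
One way to avoid the catch-all pattern is to retry only errors that are known to mean "nothing ran". The sketch below assumes two hypothetical exception classes: `RateLimited` for a request rejected before execution (for example an HTTP 429), and `AmbiguousFailure` for a timeout or dropped connection after which the tool may have run. How real errors map onto these two classes depends on the client library and is not shown here.

```python
import time

class RateLimited(Exception):
    """Hypothetical: request rejected before the tool executed."""

class AmbiguousFailure(Exception):
    """Hypothetical: timeout or dropped connection; the tool may have run."""

def call_with_policy(tool, *args, attempts=3, base_delay=0.01, sleep=time.sleep):
    for attempt in range(attempts):
        try:
            return tool(*args)
        except RateLimited:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # exponential backoff
        # AmbiguousFailure is deliberately not caught: it propagates to the caller.

attempts_seen = {"limited": 0, "ambiguous": 0}

def limited_tool():
    attempts_seen["limited"] += 1
    if attempts_seen["limited"] < 3:
        raise RateLimited()
    return "done"

def ambiguous_tool():
    attempts_seen["ambiguous"] += 1
    raise AmbiguousFailure("timed out")

print(call_with_policy(limited_tool))  # done
try:
    call_with_policy(ambiguous_tool)
except AmbiguousFailure:
    print(attempts_seen["ambiguous"])  # 1
```

The rate-limited call is retried until it succeeds. The ambiguous one runs exactly once and the error reaches the caller intact.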

The Right Architecture: Idempotency Keys and Explicit Retry Policies

Make tools idempotent by design rather than relying on retry logic to be safe. An idempotency key is a unique identifier, generated once per logical tool call and reused on every retry of that call, that the tool implementation uses to deduplicate. If the same key is received twice, the second call returns the result of the first without re-executing.

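import threading
from functools import wraps

def idempotent_tool(func):
    # In-memory, per-process cache for illustration; a production tool would
    # need a shared, persistent store so retries on other workers hit it too.
    executed = {}
    lock = threading.Lock()

    @wraps(func)
    def wrapper(*args, idempotency_key: str, **kwargs):
        # The caller must generate the key once per logical tool call and reuse
        # it on every retry; a fresh key per attempt would defeat deduplication.
        with lock:  # serializes calls so concurrent retries cannot both execute
            if idempotency_key not in executed:
                executed[idempotency_key] = func(*args, **kwargs)
            return executed[idempotency_key]
    return wrapper

@idempotent_tool
def send_notification(user_id: str, message: str):
    # Safe to retry: a second call with the same key returns the cached result.
    # notification_service is assumed to be defined elsewhere.
    return notification_service.send(user_id, message)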
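
The decorator above only deduplicates if every retry of one logical call carries the same key. A sketch of one way to guarantee that, assuming the agent framework exposes a stable identifier for each model-issued tool call: `conversation_id`, `tool_call_id`, and `key_for_tool_call` are hypothetical names, and the tool here is a stand-in with its own small dedupe table.

```python
import hashlib

def key_for_tool_call(conversation_id: str, tool_call_id: str) -> str:
    # Deterministic: the same model-issued call always maps to the same key.
    raw = f"{conversation_id}:{tool_call_id}".encode()
    return hashlib.sha256(raw).hexdigest()

delivered = []  # notifications that actually went out
seen = {}       # idempotency key -> cached result

def send_notification(user_id: str, message: str, idempotency_key: str) -> str:
    # Hypothetical tool that deduplicates on the key.
    if idempotency_key in seen:
        return seen[idempotency_key]
    delivered.append((user_id, message))
    seen[idempotency_key] = "sent"
    return "sent"

# The key is computed once, outside the retry loop.
key = key_for_tool_call("conv-42", "call-7")
for _ in range(3):  # simulate three delivery attempts of the same call
    send_notification("u1", "Build finished", idempotency_key=key)

print(len(delivered))  # 1
```

Deriving the key from the tool call's identity, rather than generating a random one per attempt, means a retry issued by any layer of the stack maps to the same key.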

For tool calls that cannot be made idempotent, use explicit rather than automatic retries: surface the failure to the model and let it decide whether to retry given the current conversation state. Automatic retries bypass the model’s judgment about whether retrying is appropriate given what has already happened.
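
A sketch of what surfacing the failure might look like. The message shape (`role`, `name`, `content`) and the `status` values are assumptions made for illustration, not any particular vendor's API, and `charge_card` is a hypothetical tool that always times out.

```python
import json

def run_tool_for_model(tool, name: str, arguments: dict) -> dict:
    """Execute one tool call once and return a message-like dict for the model."""
    try:
        output = tool(**arguments)
        payload = {"status": "ok", "output": output}
    except Exception as exc:
        # No retry here: report the uncertainty and let the model decide.
        payload = {
            "status": "unknown",
            "error": f"{type(exc).__name__}: {exc}",
            "note": "The call may have executed. Verify before retrying.",
        }
    return {"role": "tool", "name": name, "content": json.dumps(payload)}

def charge_card(amount_cents: int) -> str:
    raise TimeoutError("no response from payment provider")

message = run_tool_for_model(charge_card, "charge_card", {"amount_cents": 1999})
print(json.loads(message["content"])["status"])  # unknown
```

With this result in the conversation, the model can check whether the charge went through before deciding to call the tool again.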

What This Means For You

Never add automatic retries to non-idempotent tool calls without implementing idempotency keys at the tool level, because retrying a write that succeeded but reported failure is a duplicate write, and retrying a send that succeeded but reported failure is a duplicate send.
Implement idempotency keys for all write, create, and send tools before deploying any agentic workflow to production, because tool failures are not rare events in production AI systems and the retry behavior needs to be safe by design.
Surface non-idempotent tool failures to the model explicitly rather than retrying automatically, because the model has context about whether retrying makes sense that the retry wrapper does not.

