● Covered by 1 source · 1 report · Medium impact

New Claude Models Emit Malformed Tool Calls in Edits

Aggregated by BrevFeed ai · updated 1h ago


Newer Claude models, specifically Opus 4.8 and Sonnet 5, are generating malformed tool calls that Pi's edit tool rejects. The issue is notable because it appears to be a regression: earlier models worked correctly with the same schema.

Key points

Opus 4.8 and Sonnet 5 issue malformed tool calls.
Older models do not exhibit this problem.
Malformed calls lead to re-attempts by the model.

Issue Overview

Recent observations indicate that Claude models from Anthropic, particularly Opus 4.8, are intermittently sending malformed requests to Pi's edit tool. These requests contain extra, non-standard keys that violate schema expectations, causing Pi to reject them.
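To make the failure mode concrete, here is a minimal sketch of strict argument validation that rejects unknown keys. The key names (`path`, `old_text`, `new_text`, `replace_all`) and the function `validate_edit_call` are hypothetical and chosen for illustration; they are not taken from Pi's actual schema or code.

```python
# Hypothetical schema: these key names are illustrative, not Pi's real ones.
REQUIRED_KEYS = {"path", "old_text", "new_text"}


def validate_edit_call(args: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the call is accepted."""
    errors = []
    present = set(args)
    # Every required key must be supplied.
    for key in sorted(REQUIRED_KEYS - present):
        errors.append(f"missing required key: {key}")
    # A strict schema also rejects keys it does not know about.
    for key in sorted(present - REQUIRED_KEYS):
        errors.append(f"unexpected key: {key}")
    # Known keys must carry string values in this sketch.
    for key in sorted(REQUIRED_KEYS & present):
        if not isinstance(args[key], str):
            errors.append(f"key {key} must be a string")
    return errors


well_formed = {"path": "a.txt", "old_text": "foo", "new_text": "bar"}
# Same call with one extra, non-standard key (hypothetical name).
malformed = {**well_formed, "replace_all": True}

print(validate_edit_call(well_formed))  # []
print(validate_edit_call(malformed))    # ['unexpected key: replace_all']
```

Under this kind of strict check, a call whose required fields are all correct is still rejected if it carries even one extra key, which matches the behavior described in the report.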

Comparison with Older Models

The issue appears specific to the latest models; none of the preceding versions encountered it. Older models produced tool calls that matched the expected schema.

Technical Background

Models generate tool calls from the transcript and prompt they are given. The invocation format is delimited by specific markers, and the arguments must have the correct structure to pass validation. When a call fails schema validation, as seen with the new models, the rejection prompts the model to retry.
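The validate-then-retry cycle can be sketched as a small loop. This is an assumption-laden illustration, not Pi's implementation: `check`, `run_with_retries`, `fake_model`, and the key names are all hypothetical, and the sketch assumes the rejection message is passed back to the model before its next attempt.

```python
from typing import Callable, Optional

# Hypothetical key names; not taken from Pi's real edit tool schema.
ALLOWED_KEYS = {"path", "old_text", "new_text"}


def check(args: dict) -> Optional[str]:
    """Return an error message for a malformed call, or None if it is valid."""
    extra = sorted(set(args) - ALLOWED_KEYS)
    missing = sorted(ALLOWED_KEYS - set(args))
    if extra or missing:
        return f"rejected: unexpected={extra} missing={missing}"
    return None


def run_with_retries(
    propose: Callable[[Optional[str]], dict], max_attempts: int = 3
) -> tuple[dict, int]:
    """Ask `propose` for a tool call until one validates; return (args, attempts)."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        args = propose(feedback)   # the model sees the previous rejection, if any
        feedback = check(args)
        if feedback is None:
            return args, attempt
    raise RuntimeError(f"no valid call after {max_attempts} attempts: {feedback}")


def fake_model(feedback: Optional[str]) -> dict:
    """Stand-in for a model: adds an extra key first, then corrects itself."""
    base = {"path": "a.txt", "old_text": "foo", "new_text": "bar"}
    if feedback is None:
        return {**base, "replace_all": True}  # malformed first attempt
    return base


args, attempts = run_with_retries(fake_model)
print(attempts)  # 2
```

Each rejected call costs an extra round trip, so even an intermittent malformation rate adds latency and token usage to an editing session.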

Potential Implications

This regression raises questions about the training methodologies for the latest models, particularly in maintaining consistent tool integration and quality. Further investigation may be needed to understand these discrepancies and improve tool call reliability.

This summary was generated by AI from the outlets' reporting listed below. It is not independently verified and may contain errors; check the original sources.

Primary sources

GitHub earendil-works/pi
GitHub openai/gpt-oss
GitHub openai/harmony

Reporting from

Hacker News Front Page, "Better Models: Worse Tools", 2h ago