
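This article discusses a recent string of service outages, some of them linked to the use of AI coding tools, and notes that these problems are arriving just as predicted, with some incidents having a very wide impact.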
Remember how I warned you a year ago that maintaining GenAI code would be harder than writing code with GenAI?
Any coder with any chops at all knows that it is one thing to write code, and another to debug it (and still another to maintain it, a year or a decade later, which is even harder).
And remember how Nathan Hamiel and I warned you in August that
Right on cue, big problems have indeed started to arrive. FT just reported that “Amazon holds engineering meeting following AI-related outages”:
§
A new study from Sun Yat-sen University and Alibaba reports similar observations, using a benchmark that focuses on long-term maintainability.

As Chris Laub summarized the study on X,
Alibaba tested 18 AI coding agents on 100 real codebases, spanning 233 days each. they failed spectacularly. [It] turns out passing tests once is easy. maintaining code for 8 months without breaking everything is where AI completely collapses.
In fairness, some of the latest systems have done better than earlier ones, but for mission-critical systems, even a small number of errors can be deadly, as Amazon is discovering in the real world.
We may well move to a regime in which AI writes most code — but for a long time to come we are going to need humans to fix the mess.