- cross-posted to:
- kazkassukompais@group.lt
36 commits by ‘tridge and claude’
https://www.youtube.com/watch?v=V0EAo9jo-U4&list=UU9rJrMVgcXTfa8xuMnbhAEA - video
https://pivottoai.libsyn.com/20260603-rsync-goes-ai-slop-breaks-your-backups - podcast
time: 7 min 34 sec



If what he claims is true then he’s using LLMs for test coverage with significant editing by hand. I hate LLMs, but even I have to admit this seems like one of the few valid use cases of LLM-assisted coding. Unless “slop” has become one of those words that’s just lost all meaning.
It’s a perfect example of how “using LLMs for test coverage” can also be harmful. He expected the tests to prevent the introduction of said regressions, probably based on a combination of the quantity of tests and their style (they look like what decent human-written tests look like). But the tests are AI slop, and so they give a lot less value per line of code than he expects, hence a significant regression.
It is literally useful to call these tests AI slop, and the problem is in part caused by not calling them AI slop, and having consequent inflated expectations. LLMs are not any better at writing tests than at writing other code! It is merely that the bar for tests can, legitimately, be a lot lower (in projects where there would otherwise be no tests at all). Making an exception to calling AI-generated tests “slop” is thus counterproductive, because it leads people to act as if LLMs are actually better at writing tests than at writing other code, when really it is only that the bar for tests is frequently very low.
edit: actually, scratch that. I looked at the PR and those tests look like dogshit, even worse than the tests I’ve seen Claude write at a workplace that was into vibecoding (which I have since quit).
I commend to you jonny’s thread on the tests:
https://neuromatch.social/@jonny/116666900898570791
It keeps turning out that when you look at the AI output, it’s shit.
I don’t know anything about rsync aside from as a user, but I am pretty experienced with Python and I admit those tests look really bizarre. If he did “slot machine” code it (a term I wasn’t familiar with) then yeah, I agree that’s slop. If he didn’t, I don’t understand why he made these changes. OK yeah, that’s a bad sign.
every vibe coder insists they’re shooting up krokodil responsibly
krokodil is such a good analogy goddamn