ChatGPT Answers Programming Questions Incorrectly 52% of the Time: Study

ForgottenFlux@lemmy.world · 5 months ago

NotMyOldRedditName@lemmy.world · edit-2 5 months ago

My experience with an AI coding tool today.

Me: Can you optimize this method?

AI: Okay, here’s an optimized method.

Me seeing the AI completely removed a critical conditional check.

Me: Hey, you completely removed this check with variable xyz.

AI: Oops, you’re right, here you go, I fixed it.

It did this 3 times on 3 different optimization requests.

It was 0 for 3

Although there were some good suggestions in there once you get past the blatant first error.

Zos_Kia@lemmynsfw.com · 5 months ago

Don’t mean to victim blame, but I don’t understand why you would use ChatGPT for hard problems like optimization. And I say this as a heavy ChatGPT/Copilot user.

From what I’ve seen, LLMs handle the linguistic / syntactic side of code, not what the code actually does when it runs.

NotMyOldRedditName@lemmy.world · edit-2 5 months ago

Because I had some methods I thought were too complex and I wanted to see what it’d come up with?

In one case part of the method was checking if a value was within one of 4 ranges and it just dropped 2 of the ranges in the output.

I don’t think that’s asking too much of it.
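For what it’s worth, this kind of dropped range is easy to catch mechanically. Here’s a rough sketch, assuming the check can be written as a list of inclusive (low, high) pairs. The ranges and function names below are made up for illustration (they are not my real code), and the "rewritten" function is just a stand-in for the kind of output I got back. The idea is to run the old and new versions over the same inputs and list where they disagree.

```python
# Hypothetical ranges: placeholders only, not the real values from my method.
RANGES = [(0, 10), (20, 30), (40, 50), (60, 70)]


def in_any_range(value, ranges=RANGES):
    """Return True if value falls inside any inclusive (low, high) range."""
    return any(low <= value <= high for low, high in ranges)


def in_any_range_rewritten(value):
    """Stand-in for an 'optimized' rewrite that silently kept only two ranges."""
    return 0 <= value <= 10 or 20 <= value <= 30


def find_mismatches(original, rewritten, candidates):
    """Return the inputs on which the two implementations disagree."""
    return [v for v in candidates if original(v) != rewritten(v)]


# Compare both versions over a span that covers every range plus the gaps.
mismatches = find_mismatches(in_any_range, in_any_range_rewritten, range(-5, 80))
print(f"{len(mismatches)} inputs disagree, first: {mismatches[0]}, last: {mismatches[-1]}")
```

A non-empty mismatch list means the rewrite changed behavior, whatever else it improved.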

Zos_Kia@lemmynsfw.com · 5 months ago

“I don’t think that’s asking too much of it.”

Apparently it was :D I mean, the tool has pretty tight limits, despite what the Devin.ai cult would like to believe.

cassie 🐺@lemmy.blahaj.zone · 5 months ago

That’s been my experience with GPT - every answer is a hallucination to some extent, so nearly every answer I receive is inaccurate in some ways. However, the same applies if I were asking a human colleague unfamiliar with a particular system to help me debug something - their answers will be quite inaccurate too, but I’m not expecting them to be accurate, just to have helpful suggestions of things to try.

I still prefer the human colleague in most situations, but if that’s not possible or convenient GPT sometimes at least gets me on the right path.

eatthecake@lemmy.world · 5 months ago

I’m curious about what percentage of programmers would give error free answers to these questions in seconds.

NotMyOldRedditName@lemmy.world · 5 months ago

Probably fewer than the number of developers whose code runs on the first try.

NotMyOldRedditName@lemmy.world · 5 months ago

And ya, it did provide some useful info, so it’s not like it was all wrong.

I’m more just surprised that it was wrong in that way.

piecat@lemmy.world · 5 months ago

My favorite is when I ask for something and it gets stuck in a loop, pasting the same comment over and over