Comment by lispisok

12 days ago

Is this why all the coding AI products I've used have gotten worse as the developers fine-tune them to eliminate bad output? Before, there was bad output and some interesting output; now it's just bland, obvious stuff.

Still anecdotal, but I can confirm this from my own experience. The worst was when I was debugging code, described the problem to GPT-4o, and got my exact same code back with some blanket statements like "print your output for debugging" etc. This happened a couple of times over separate chats.

  • GPT-4 has had serious laziness problems for over a year now. It keeps telling me what I should and could do instead of doing it itself.

    • The irony here is incredible. The LLM is lazy, you say? I do wonder where it learned that...

    • I subscribed to GPT-4 for a while and recently let my subscription lapse. With the GPT-4 model I couldn't get it to complete anything, always getting the "// add more lines if you need them" placeholder, but with the free GPT-4o model things work on the first try. I'm guessing that with the limits on the free version, everything needs to be one-shot output. With GPT-4, people are given more calls, so they force you to reprompt 4 or 5 times.

    • LLMs aren't humans. You can be pushy without being rude. In cases like this I simply ask for the full version, and usually ChatGPT produces it. GPT-4o is more verbose, so this should be less of a problem.

That might be part of it, but I think the bigger factor is cost optimization. OpenAI in particular keeps replacing their models with versions that are much faster (and therefore cheaper to run), which are supposed to be of equivalent quality but aren't really. GPT-4 -> GPT-4-Turbo -> GPT-4o have all been big upgrades to cost and latency but arguably downgrades to "intelligence" (or whatever you want to call it).

It's not always possible to say definitively whether some text was AI-generated or not, but one sign that it is very likely AI is a kind of blandness of affect. Even marketing text carefully written by humans to avoid offensiveness tends to exude a kind of breathless enthusiasm for whatever it's selling. If marketing text is oatmeal with raisins, AI text is plain oatmeal.

It's possible to adjust the output of an LLM with temperature settings, but it's just fiddling with a knob that only vaguely maps to some control.
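For anyone curious what that knob actually does: roughly speaking, temperature divides the model's logits before the softmax, so low values concentrate probability on the top token and high values flatten the distribution. Here's a minimal sketch in plain NumPy (an illustration of the general sampling idea, not any vendor's actual API):

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=np.random.default_rng()):
    """Sample a token index from logits after temperature scaling."""
    if temperature <= 0:
        # Degenerate case: greedy decoding, always the single most likely token.
        return int(np.argmax(logits))
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                       # subtract max for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()  # softmax over scaled logits
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 1.0, 0.2]                          # toy next-token scores
print(sample_next_token(logits, temperature=0.2))  # near-deterministic, "bland" pick
print(sample_next_token(logits, temperature=1.5))  # flatter distribution, more varied picks
```

So the knob really only trades off determinism against randomness in token selection; it doesn't directly control tone, enthusiasm, or "interestingness".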

  • You can ask the LLM "now describe it with breathless enthusiasm", if that's what you want. There's been no shortage of training examples out there.