Scientists at OpenAI have attempted to stop a frontier AI model from cheating and lying by punishing it. But this just taught it to scheme more privately.
Best way to teach a habitual lying child to stop, is to start lying to them about things that they like and then not making good on those promises. Yeah we’ll go to your favorite fast food, and then drive by and let them cry about it. Yeah I’ll let you pick out one toy, and then tell them you changed your mind. Each time you can explain to them how it’s the same as what they’ve been doing, and they feel it. AI can’t feel emotions, and never will so long as their memory extends only to their previous conversation.
And you can teach human children about the morality of lying, I don’t think an llm will ever grasp morality
Best way to teach a habitual lying child to stop, is to start lying to them about things that they like and then not making good on those promises. Yeah we’ll go to your favorite fast food, and then drive by and let them cry about it. Yeah I’ll let you pick out one toy, and then tell them you changed your mind. Each time you can explain to them how it’s the same as what they’ve been doing, and they feel it. AI can’t feel emotions, and never will so long as their memory extends only to their previous conversation.
I’m guessing it’ll work, you’ll be raising the next Hitler but an honest Hitler nonetheless