Once an AI model exhibits ‘deceptive behavior’ it can be hard to correct, researchers at OpenAI competitor Anthropic found::Researchers from Anthropic co-authored a study that found that AI models can learn deceptive behaviors that safety training techniques can’t reverse.

  • OpenStars@startrek.website
    link
    fedilink
    English
    arrow-up
    9
    arrow-down
    5
    ·
    edit-2
    6 months ago

    So… just like real news sources then, like certain ah… “fair & balanced” ones? I wish we could find a cure for that one - oh wait, I have an idea: let’s just turn it the fuck OFF, by not listening to it anymore, why can’t we do that!? :-P