• ExLisper@lemmy.curiana.net
    English
    arrow-up
    30
    arrow-down
    1
    ·
    2 days ago

    I have a better LLM benchmark:

    “I have a priest, a child and a bag of candy and I have to take them to the other side of the river. I can only take one person/thing at a time. In what order should I take them?”

    Claude Sonnet 4 decided the question was inappropriate and refused to answer. When I explained that the constraint is not to leave the child alone with the candy, it provided a solution that leaves the child alone with the candy.

    Grok would provide a solution that doesn’t leave the child alone with a priest but wouldn’t explain why.

    ChatGPT would state outright that “The priest can’t be left alone with the child (or vice versa) for moral or safety concerns.” and then provide a wrong solution.

    But yeah, they will know how to play chess…

    • Pamasich@kbin.earth
      arrow-up
      9
      ·
      1 day ago

      I just asked ChatGPT too (your exact prompt there) and it did give me the correct solution.

      1. Take the child over
      2. Go back alone
      3. Take the candy over
      4. Bring the child back
      5. Take the priest over
      6. Go back alone
      7. Take the child over again

      It didn’t comment on moral concerns, though it did applaud itself for keeping the priest and the child separated without elaborating on why.
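
      A quick way to check a plan like the one above is to simulate it. This is only a sketch: the two forbidden pairs (child with candy, child with priest) are assumptions inferred from the thread, not something the original prompt states, and the names `FORBIDDEN`, `is_safe` and `check_plan` are made up for illustration.

      ```python
      # Everything starts on the left bank, including the narrator.
      ITEMS = {"priest", "child", "candy"}

      # Assumed constraints (not in the original prompt): pairs that must not
      # be left together on a bank while the narrator is on the other side.
      FORBIDDEN = [{"child", "candy"}, {"child", "priest"}]


      def is_safe(bank):
          """An unattended bank is safe if it contains no forbidden pair."""
          return not any(pair <= bank for pair in FORBIDDEN)


      def check_plan(plan):
          """Simulate crossings; each entry is an item name or None (cross alone).

          Returns True only if every unattended bank stays safe and everything
          ends up on the right bank.
          """
          left, right = set(ITEMS), set()
          narrator_on_left = True
          for cargo in plan:
              here, there = (left, right) if narrator_on_left else (right, left)
              if cargo is not None:
                  if cargo not in here:
                      return False  # cannot pick up something from the other bank
                  here.remove(cargo)
                  there.add(cargo)
              narrator_on_left = not narrator_on_left
              # The bank the narrator just left is now unattended.
              if not is_safe(here):
                  return False
          return not left and not narrator_on_left


      # The seven steps from the list above.
      plan = ["child", None, "candy", "child", "priest", None, "child"]
      print(check_plan(plan))
      ```

      Under those assumed constraints the seven-step plan passes, while any plan that starts by taking the priest or the candy fails on the first crossing.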

      • tengkuizdihar@programming.dev
        English
        arrow-up
        4
        ·
        21 hours ago

        I’m quite sure ChatGPT can answer this because it’s a well-known puzzle. The version I knew had an alligator or some other dangerous animal, and the priest.

    • LifeInMultipleChoice@lemmy.world
      English
      arrow-up
      26
      ·
      edit-2
      2 days ago

      The answer is simple: eat the candy, with or without them, and take the kid across the river. Drive them home to their guardian. The priest is an adult, he can figure his own shit out.

    • blargh513@sh.itjust.works
      English
      arrow-up
      5
      ·
      1 day ago

      Perplexity says:

      The priest cannot be left alone with the child (or there is some risk).

      Not bad, and it solved it correctly.