• brucethemoose@lemmy.world
    22 days ago

    That it’s even an issue is a sign of how insanely insecure agent frameworks are.

    Users don’t even do the most basic checks to (say) verify and clean bot actions, limit them, containerize them, anything. That’s “getting fired” unacceptable in pretty much any other field.
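
    As a rough illustration of what "verify and limit bot actions" could look like, here is a minimal sketch of a gate that checks each shell command an agent proposes before anything runs. The allowlist, the metacharacter set, the per-session cap, and the name `verify_action` are all assumptions made up for this example, not part of any real agent framework. It does not replace running the agent in a container.

```python
import shlex

# Hypothetical policy values, chosen only for illustration.
ALLOWED_COMMANDS = {"ls", "cat", "grep", "git"}
SHELL_METACHARS = set(";|&`$><")
MAX_ACTIONS_PER_SESSION = 20


def verify_action(command: str, actions_so_far: int) -> tuple[bool, str]:
    """Return (allowed, reason) for a command proposed by the agent."""
    # Limit: cap how many actions one session may perform.
    if actions_so_far >= MAX_ACTIONS_PER_SESSION:
        return False, "action limit reached"
    # Clean: refuse chaining, piping, redirection and substitution outright.
    if any(ch in SHELL_METACHARS for ch in command):
        return False, "shell metacharacters are not allowed"
    try:
        argv = shlex.split(command)
    except ValueError:
        return False, "unparseable command"
    if not argv:
        return False, "empty command"
    # Verify: only explicitly allowlisted programs may run.
    if argv[0] not in ALLOWED_COMMANDS:
        return False, f"command not allowlisted: {argv[0]}"
    return True, "ok"


if __name__ == "__main__":
    for cmd in ["ls -la", "rm -rf /", "ls; rm -rf /"]:
        print(cmd, "->", verify_action(cmd, actions_so_far=0))
```

    The check is deny-by-default on purpose: anything the policy does not recognize is rejected instead of being passed through.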

    It’s also insane how susceptible the bots are to prompt injections. It’s not just that they’re dumb, or that they ignore licenses and dev requests, but that they’re trained to be sycophantic until they’re deep fried, without any pushback or sense of reason against obvious adversarial instructions.

    • boonhet@sopuli.xyz
      22 days ago

      It’s an issue of how insanely insecure giving an agent a blank check for everything is.

      I’ve tested Claude Code, Codex, and Mistral Vibe. By default, they all prompt you before any write, before any other tool call that could be destructive, and before any read from outside the current working directory.

      But then if you have to answer “yes” to everything you want to allow, you have to be at the keyboard! Such horrible! Let’s give the agent permission to do “bash *” and “python *” and “rm *” and…
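
      To make the difference concrete, here is a minimal sketch of how wildcard allow rules swallow destructive commands. The rule format (glob patterns matched against the whole command string) and the name `needs_prompt` are assumptions for illustration only; real tools each have their own permission syntax.

```python
from fnmatch import fnmatchcase


def needs_prompt(command: str, allow_patterns: list[str]) -> bool:
    """True if no allow rule matches, so the user would still be asked."""
    return not any(fnmatchcase(command, pattern) for pattern in allow_patterns)


# Hypothetical rule sets: one narrow, one "blank check".
narrow_rules = ["git status", "git diff *"]
broad_rules = ["bash *", "python *", "rm *"]

destructive = "rm -rf ~/projects"

if __name__ == "__main__":
    # Narrow rules still stop and ask before the destructive command.
    print(needs_prompt(destructive, narrow_rules))
    # Broad rules wave it through without a prompt.
    print(needs_prompt(destructive, broad_rules))
```

      With the narrow rules the user is still asked about the `rm`; with the broad ones it runs unattended, which is exactly the trade being made for not having to sit at the keyboard.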

      I’m blaming this one on the users, not the frameworks. Anyone using these tools should know that they’re non-deterministic and that giving them full access to everything can be incredibly destructive.

      Incidentally that’s why we’re not all completely replaced by non-technical people vibe coding entire applications just yet, even if Opus with xhigh/max thinking settings can outperform a lot of developers. It’s because if you let a non-technical person give all this power to an agent or even just hit yes without reading the commands being prompted for, it’s gonna bite the entire company in the ass hard.