Summary

Court records in an ongoing lawsuit reveal that Meta staff allegedly downloaded 81.7TB of pirated books from shadow libraries like Z-Library and LibGen to train its AI models.

Internal messages show employees raising ethical concerns, with one saying, “Torrenting from a corporate laptop doesn’t feel right.”

Meta reportedly took steps to hide the activity.

The case is part of a broader debate on AI data sourcing, with similar lawsuits against OpenAI and Nvidia.

    • bean@lemmy.world
      link
      fedilink
      arrow-up
      1
      ·
      1 year ago

      It’s fucked these guys can pirate all this shit and make money off it. But if the masses access it, shut it the fuck down! Break encryption! Curb the laws! Penalize! Penalize! Penalize!

  • meowmeowbeanz@sh.itjust.works
    link
    fedilink
    arrow-up
    0
    ·
    1 year ago

    So the “don’t be evil” crowd casually torrented 82TB of shadow library data through corporate hardware. Internal messages show researchers knew it crossed ethical lines, yet Zuck personally greenlit circumventing copyright. The cognitive dissonance of building AI empires on pirated foundations would be poetic if it weren’t so predictably dystopian.

    This isn’t oversight—it’s systemic rot. Fines become tax-deductible line items while lobbyists ensure regulatory capture. When your legal team costs more than the penalties, infringement transforms into R&D strategy. The only surprise is anyone still pretending capital understands “ethics” beyond PR gymnastics.

    Meanwhile indie authors get demonetized for quoting haikus. But sure, let’s investigate if open models borrowed a few ChatGPT outputs. Nothing accelerates innovation quite like megacorps rewriting IP law through sheer audacity.