Mastodon has updated its terms of service to explicitly prohibit scraping its platform to train artificial intelligence models. The new rules, which kick in on July 1, make it perfectly clear that using automated tools to slurp up user data from its main server, Mastodon.social, for LLM training is a big no-no.
Neowin received a copy of an email sent to users, notifying them of the change, which introduces new language prohibiting the "scraping of user data for unauthorized purposes, e.g., archival or large language model (LLM) training." Here's a snippet from the updated terms of service:
You are prohibited from using the Instance for the commission of harmful or illegal activities. Accordingly, you may not, or assist any other person to (or attempt to): ...
- Use, launch, develop, or distribute any automated system, including without limitation, any spider, robot, cheat utility, scraper, offline reader, or any data mining or similar data gathering extraction tools to access the Instance, except in each case as may be the result of standard search engine or Internet browser and local caching or for human review and interaction with Content on the Instance;
This policy change comes at a time when users are getting increasingly pissed off about their public posts becoming free fuel for the AI gold rush. In fact, this is probably good news for the same crowd over on Bluesky that freaked out after a massive, user-traceable dataset of their public posts was compiled and uploaded for AI research.
AI bot scraping has become a huge problem for everyone, from giants like Reddit, which is now suing Anthropic, the maker of Claude, for training on its posts without a license, to Neowin readers like Gerowen. He noted how a swarm of bots, including one Claudebot (you don't say!), hammered his personal server with over 700,000 requests in 24 hours, putting a huge strain on his "home NAS running on an old PC tower in the back woods of Kentucky."
It is important to remember that Mastodon is a federated network. These new terms apply specifically to the Mastodon.social server, which is operated directly by Mastodon gGmbH. This means that while users on the main instance are now covered, those on other independent servers in the "fediverse" will only get the same protection if their instance administrators adopt similar terms.
Separately, the company is also enforcing a new global minimum age requirement of 16 for all users, up from the previous limit of 13.