On 26 November, Daniel van Strien, a machine learning librarian at Hugging Face, uploaded a dataset of 1m public posts and ...
Bluesky's Firehose is known for being an open API, but that openness is also its flaw, as anyone can scrape its data for the likes of AI ...
Bluesky user posts and user information were scraped by an AI researcher, built into a dataset, and published on open ...
Bluesky is already facing its first major AI scrape, despite the stance of its owners that it will never train generative AI ...
A Hugging Face librarian released and later removed a dataset of 1 million Bluesky posts, sparking concerns over data transparency and consent. Daniel van Strien extracted the posts using the ...
Bluesky, the social media platform often seen as a rival to Twitter, is at the center of a controversy after one million of ...
Although Bluesky itself doesn’t train AI models on user data, it doesn’t prevent others from using its data for training ...
As 404 Media reported on Nov. 26, one million public Bluesky posts, complete with identifying user information, were crawled and then uploaded to AI company Hugging Face. The dataset was ...
Bluesky is facing its first major controversy over data scraping after a dataset containing one million public posts appeared ...
Daniel van Strien, a machine learning librarian at Hugging Face, took a million Bluesky posts and turned them into a dataset ...