
Bluesky, a social networking startup building a decentralized alternative to X (formerly Twitter), provided an update on Wednesday on how it is approaching trust and safety issues on its platform. The company is in various stages of developing and piloting a range of initiatives focused on tackling bad actors, harassment, spam, fake accounts, video safety, and more.
To address malicious users or those who harass others, Bluesky says it is developing a new tool that can detect when multiple new accounts are created and managed by the same person. This could help reduce harassment, where malicious actors create multiple personas to target their victims.
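Bluesky has not described how this detection works, but in spirit it amounts to clustering new accounts by a shared signal and flagging large clusters. The sketch below is a minimal, hypothetical illustration: the `signup_signal` field (e.g., a hashed fingerprint), the threshold, and the handles are all assumptions, not Bluesky's actual signals.

```python
from collections import defaultdict

def flag_shared_operators(accounts, min_cluster_size=3):
    """Group new accounts by a shared signup signal and flag large clusters.

    `accounts` is a list of dicts with hypothetical fields:
    `handle` and `signup_signal` (e.g., a hashed fingerprint).
    """
    clusters = defaultdict(list)
    for account in accounts:
        clusters[account["signup_signal"]].append(account["handle"])
    # Flag any signal shared by several brand-new accounts.
    return {
        signal: handles
        for signal, handles in clusters.items()
        if len(handles) >= min_cluster_size
    }

accounts = [
    {"handle": "troll-1.example", "signup_signal": "fp:abc"},
    {"handle": "troll-2.example", "signup_signal": "fp:abc"},
    {"handle": "troll-3.example", "signup_signal": "fp:abc"},
    {"handle": "alice.example", "signup_signal": "fp:xyz"},
]
print(flag_shared_operators(accounts))  # flags the cluster sharing "fp:abc"
```

A real system would combine many weak signals rather than one exact match, but the grouping-and-threshold structure is the same.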
Another new experiment will help detect “rude” replies and flag them for server admins. Similar to Mastodon, Bluesky supports a network where self-hosters and other developers can run their own servers that connect to Bluesky’s servers and to one another. This federation feature is still in early access, but going forward, server admins will be able to decide how to take action against people who post rude replies. Bluesky will eventually reduce the visibility of these replies in the app, and accounts whose content is repeatedly labeled as rude will face account-level labels and suspensions, Bluesky says.
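The escalation Bluesky describes — hide individual rude replies first, then move to account-level action for repeat offenders — can be sketched as a simple per-account counter. The thresholds, action names, and DID below are hypothetical; Bluesky has not published its actual criteria.

```python
from collections import Counter

# Hypothetical thresholds; Bluesky has not published exact numbers.
ACCOUNT_LABEL_THRESHOLD = 3
SUSPEND_THRESHOLD = 5

class RudeReplyTracker:
    """Track per-account 'rude' labels and escalate repeat offenders."""

    def __init__(self):
        self.rude_counts = Counter()

    def record_rude_reply(self, did: str) -> str:
        """Record one rude-reply label and return the resulting action."""
        self.rude_counts[did] += 1
        count = self.rude_counts[did]
        if count >= SUSPEND_THRESHOLD:
            return "suspend-account"
        if count >= ACCOUNT_LABEL_THRESHOLD:
            return "label-account"
        return "hide-reply"  # reduce visibility of this single reply

tracker = RudeReplyTracker()
actions = [tracker.record_rude_reply("did:plc:example") for _ in range(5)]
print(actions)  # escalates from per-reply hiding to account-level action
```

The key design point is that labels attach to individual posts first, and only repeated labels roll up into account-level consequences.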
To reduce the use of lists to harass others, Bluesky removes individual users from a list if they block the list’s creator. A similar feature was recently introduced to Starter Packs, shareable lists that help new users find people to follow on the platform (check out the TechCrunch Starter Pack).
Bluesky also scans lists for abusive names or descriptions, to cut down on people harassing others by adding them to public lists with toxic titles. Lists that violate Bluesky’s community guidelines are hidden from the app until the list owner edits them to comply with Bluesky’s rules. Users who continue to create abusive lists face further action, though the company did not provide details, adding that lists are still an active area of discussion and development.
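At its simplest, scanning a list's name and description for policy-violating language is a text-matching problem. The sketch below is a deliberately naive illustration with a hypothetical term set; a production system would use a maintained policy list and more robust matching than substring checks.

```python
# Hypothetical terms; a real system would use a maintained policy list.
ABUSIVE_TERMS = {"idiots", "losers"}

def list_violates_guidelines(name: str, description: str) -> bool:
    """Flag a user list whose name or description contains abusive terms."""
    text = f"{name} {description}".lower()
    return any(term in text for term in ABUSIVE_TERMS)

print(list_violates_guidelines("Block these idiots", ""))          # True
print(list_violates_guidelines("Cool astronomers", "space fans"))  # False
```

A flagged list would then be hidden from the app, per the policy described above, until its owner renames it.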
In the coming months, Bluesky will transition from communicating about report reviews over email to processing them in the app, using notifications.
To combat spam and other fake accounts, Bluesky is launching a pilot that automatically detects when an account is fake, fraudulent, or spamming users. The company says the goal is to be able to take action on an account “within seconds of receiving a report,” alongside review of those reports.
One of the more interesting developments involves how Bluesky will comply with local laws while still allowing free speech. Bluesky uses geographic labels to hide content from users in certain areas, allowing it to comply with laws.
“This will allow Bluesky’s moderation service to maintain the flexibility to create space for free expression, while ensuring legal compliance so Bluesky can continue to operate as a service in those regions,” the company shared in a blog post. “This feature will be rolled out on a country-by-country basis, with the goal of informing users about the source of legal requests wherever legally possible.”
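Mechanically, region-based hiding can be thought of as a label-filtering step at read time: posts carry labels, and the client or moderation service drops posts labeled as hidden in the viewer's region. The label format (`geo-hide:<country-code>`) and post shapes below are hypothetical illustrations, not Bluesky's actual label vocabulary.

```python
def visible_posts(posts, viewer_region: str):
    """Filter out posts labeled as hidden in the viewer's region.

    Assumes a hypothetical "geo-hide:<country-code>" label format;
    Bluesky's actual label vocabulary may differ.
    """
    hidden_label = f"geo-hide:{viewer_region.lower()}"
    return [p for p in posts if hidden_label not in p.get("labels", [])]

posts = [
    {"uri": "at://example/post/1", "labels": []},
    {"uri": "at://example/post/2", "labels": ["geo-hide:de"]},
]
print([p["uri"] for p in visible_posts(posts, "DE")])  # post/2 hidden in DE
print([p["uri"] for p in visible_posts(posts, "US")])  # both visible in US
```

The design fits the blog post's framing: content stays on the network globally, and only its visibility is restricted in the regions where a legal request applies.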
To address potential trust and safety concerns with recently added videos, the team is adding features such as the ability to turn off autoplay, label videos, and report videos. It says it is still evaluating which other features to add, and will prioritize them based on user feedback.
When it comes to abuse, the company says its overall framework is “asking how often something happens versus how harmful it is.” The company focuses on solving high-risk, high-frequency issues while also “tracking edge cases that could cause significant harm to a few users.” The latter, it argues, affect only a small percentage of people but cause enough “ongoing harm” that Bluesky takes action to prevent the abuse.
Users can raise concerns by filing reports, sending emails, or mentioning the @safety.bsky.app account.