Meta’s latest AI improves its terrible content moderation, just a little

https://www.theregister.com/headlines.atom
Summary

Meta has revealed it has tested AI on content moderation chores and found it does better than humans.
The social networking giant on Thursday announced it has started a global rollout of its Meta AI support, a tool that handles tasks such as resetting passwords, reporting dodgy content, explaining content takedowns and allowing appeals, and managing privacy settings.
The company also said: “Over the next few years, we will deploy more advanced AI systems across our apps to transform our approach to content enforcement, more accurately finding and removing severe content violations like scams and illegal content, so people see less of them.”
Early experiments have delivered promising results: one AI tool detected and mitigated 5,000 attempts per day to scam users into revealing their passwords. Meta says its human teams could not detect those scams. Another AI helped to reduce the number of reports users lodged about fake celebrity profiles by over 80 percent. Other tests doubled detection of adult sexual solicitation content that violates Meta’s rules.
Meta says its AI can also “Prevent an account takeover by noticing it was suddenly accessed from a new location, the password was changed, and edits were made to the profile.” The company says those changes “look harmless to a person reviewing the account, but AI was able to recognize as a threat.”
That’s an odd observation given that numerous enterprise security products can detect “impossible travel”, such as a single user logging in from London and an hour later requesting a password reset from San Francisco, and flag it as a likely attack.
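For readers unfamiliar with the term, an "impossible travel" check can be sketched in a few lines of Python. This is an illustration only, not Meta's method or any vendor's product: the `AccountEvent` type, the 900 km/h speed ceiling, the 50 km jitter allowance, and the approximate city coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
# Assumed ceiling: roughly the cruising speed of a commercial airliner.
MAX_PLAUSIBLE_SPEED_KMH = 900.0
# Assumed allowance for imprecise IP geolocation; shorter hops are ignored.
MIN_DISTANCE_KM = 50.0


@dataclass
class AccountEvent:
    """Hypothetical record of one account action with a geolocated origin."""
    user: str
    action: str
    when: datetime
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_impossible_travel(prev: AccountEvent, curr: AccountEvent,
                         max_speed_kmh: float = MAX_PLAUSIBLE_SPEED_KMH) -> bool:
    """Flag two events whose implied travel speed exceeds max_speed_kmh."""
    distance = haversine_km(prev.lat, prev.lon, curr.lat, curr.lon)
    if distance < MIN_DISTANCE_KM:
        return False
    hours = (curr.when - prev.when).total_seconds() / 3600
    if hours <= 0:
        # Same moment (or out-of-order timestamps) from distant places.
        return True
    return distance / hours > max_speed_kmh


# The article's scenario: a London login, then a San Francisco reset an hour later.
login = AccountEvent("alice", "login", datetime(2026, 3, 19, 9, 0), 51.5074, -0.1278)
reset = AccountEvent("alice", "password_reset", datetime(2026, 3, 19, 10, 0), 37.7749, -122.4194)
print(is_impossible_travel(login, reset))  # True
```

A real product would also have to cope with VPNs, mobile carriers and shared IP ranges, all of which make geolocation unreliable, so a speed check like this is usually one signal among several rather than a verdict.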
Meta also enthused that AI can “Detect a fake site spoofing a legitimate web address and pretending to be a popular sporting goods store by noticing the real logo being used with unusually low prices and a suspicious web address,” and said AI “drove down views of ads with scams and other serious violations by seven percent, offering promising results and better protections for users...

First seen: 2026-03-20 05:17

Last seen: 2026-03-24 09:23