From Facebook’s data breach to rising identity theft around the world and strict European GDPR rules, Trust & Safety is becoming a critical function at major data-driven global companies. In more than two years as a Trust & Safety Data Analyst, I have built data models that are highly effective at surfacing anomalies among platform actors, worked with data scientists and engineers to put these models into a scalable production environment that runs 24/7, and continuously improved their efficiency and expanded their functionality.
A key element that keeps our bad-actor removal process accurate and bias-free is the ‘human-in-the-loop’ model. While the concept and application of human-in-the-loop in machine learning have been around for a while, applying it to Trust & Safety initiatives has proven especially effective and beneficial in my experience. Every time the data models surface a platform actor as suspicious, the data submitted by that actor is sent for review to a group of highly capable and trustworthy human validators known as ‘Top Tier Validators’. These Top Tier Validators are educated individuals who speak and understand English well, and most of them are located in developing economies across different continents.
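As a rough illustration of this flow, here is a minimal Python sketch. It assumes every actor already has an anomaly score between 0 and 1 from some upstream model. The names `Submission`, `ReviewQueue`, and `route_flagged`, as well as the 0.8 threshold, are hypothetical and chosen for the example only; they do not describe any particular production system.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Submission:
    """One piece of data submitted by a platform actor (hypothetical shape)."""
    actor_id: str
    payload: str


@dataclass
class ReviewQueue:
    """Holds submissions that are waiting for a human verdict."""
    pending: deque = field(default_factory=deque)

    def enqueue(self, submissions):
        self.pending.extend(submissions)


def route_flagged(scores, submissions, queue, threshold=0.8):
    """Queue every submission from actors scoring at or above the threshold.

    `scores` maps actor_id -> anomaly score (assumed to be in [0, 1]).
    Returns the sorted list of flagged actor ids. Nothing is removed here:
    the model only surfaces candidates, and the decision is left to the
    human validators.
    """
    flagged = sorted(actor for actor, score in scores.items() if score >= threshold)
    flagged_set = set(flagged)
    queue.enqueue(s for s in submissions if s.actor_id in flagged_set)
    return flagged
```

The point of the design is that the model's output is a queue entry, not a ban: an actor is acted on only after a human has looked at the actual submissions.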
Here are some of the benefits of combining the ‘human-in-the-loop’ model with Trust & Safety functions:
- High accuracy
  - Statistical and machine learning models are not perfect, so human review acts as a check on what they flag
- Cost-effective, and it helps contributors in developing economies support their families
- Availability and parallelism
  - Availability: Having human validators on different continents around the world ensures there will always be people available to review data for us at any given point in time
  - Parallelism (isolation): Multiple tasks can be performed at the same time. A Top Tier Validator who is available can review a unit as long as no other Top Tier Validator is reviewing that same unit
- Complies with the ACID (atomicity, consistency, isolation, durability) properties of a database management system:
  - While this is mostly achieved through effective database design and management, the availability and parallelism enabled by a global pool of Top Tier Validators arguably contribute to it
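The rule that a unit can be reviewed only while no other Top Tier Validator is reviewing it can be sketched as a simple claim table. This is a minimal in-memory illustration under my own assumptions: the class `UnitClaims` and its methods are hypothetical, and a real system would more likely enforce the same rule with a database transaction or a unique constraint rather than a Python lock.

```python
import threading


class UnitClaims:
    """Tracks which validator currently holds each review unit."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner_by_unit = {}

    def claim(self, unit_id, validator_id):
        """Try to claim a unit; return True only if nobody else holds it."""
        with self._lock:
            if unit_id in self._owner_by_unit:
                return False
            self._owner_by_unit[unit_id] = validator_id
            return True

    def release(self, unit_id, validator_id):
        """Release a unit; only the validator holding it may do so."""
        with self._lock:
            if self._owner_by_unit.get(unit_id) == validator_id:
                del self._owner_by_unit[unit_id]
                return True
            return False
```

Because the check and the write happen under one lock, two validators asking for the same unit at the same moment cannot both succeed, while validators working on different units never block each other's reviews.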
Of course, the nature of Trust & Safety and the problems it tries to solve can vary greatly depending on the industry or even the function within a company, and fraudster behavior varies accordingly. For instance, fraudsters targeting fintech companies such as PayPal and Visa are likely to commit identity theft, stealing credentials and private information from customers. Fraudsters on social media sites such as Facebook and Twitter may steal personal information from users, disseminate fake news through status updates, or both. Crowdsourcing companies, on the other hand, whose main focus is enriching client data with help from global contributors (taskers), are likely to face fraudsters from within that contributor pool who cheat on tasks by exploiting loopholes in task design, or by deploying bots or algorithms to harvest the test questions in a task so that they can submit random answers to the non-test questions. That being said, regardless of your industry and the business problems you face, if precision and promptness are of utmost importance in your organization, I am confident that combining human-in-the-loop with Trust & Safety techniques is likely to yield considerable gains in efficiency and precision at a reasonable sunk cost and ongoing expense, and can be a huge win for the organization in the long run.