2023.09 - 2023.12

Context
What is data labeling?
Data labeling is the process of preparing labeled data for AI model training. eBay's Payment & Risk team uses AI models to detect potential fraudulent transactions. Updating these models requires a continuous supply of high-quality labeled data, and data labeling is the process that produces this essential input.

A sample data labeling file
Problem
Average completion time for labeling jobs is too long
The PM frequently receives complaints from Data Scientists about the long average completion time for labeling tasks, which slows down AI model updates. This delay impacts the model’s accuracy in detecting fraud and increases the risk of financial losses.
But Why?
Gaining Insights from the stakeholders
1. Workflow Interruption
Besides Excel, stakeholders have to juggle multiple tools throughout the process, which interrupts their workflow.
2. Collaboration Barriers
Without real-time visibility into progress, roles waste time on back-and-forth, which increases communication costs.
3. Unsuitable Tool
Excel, the primary tool for labeling jobs, is poorly suited to the task, leading to inefficiencies & errors.
Workflow Map 🗺️ & Insights 🔍

Making our hypothesis
Shipping an all-in-one tool will enhance the overall experience and effectively improve the data labeling speed
Our research showed that all roles involved in the process frequently switch between multiple tools, many of which are not well suited to their tasks. In addition, a lack of transparency, such as not being able to track annotators' progress, leads to communication issues. Based on these findings, I formed this hypothesis.
Brainstorming & Prioritization
Deciding on a Minimum Viable Product
With research insights in hand, we faced a long list of potential features. To build the right product and minimize risks, we focused on the most critical features that balance user needs and development effort. Given the tight timeline, we prioritized high-value, low-complexity features to efficiently address key challenges.
Data Scientist's HMW…

Constructing Information Architecture
Align with the team on feature structure and organization
I created information architectures for different roles to align the team on feature structure and organization before designing the interface. This also improved navigation, helping users find information and complete tasks more efficiently.
Data Scientist's Portal Architecture

Design & Validation
Validating my idea as fast as possible
I invited members from different roles to take part in several rounds of usability testing. Our aim was always to validate ideas as fast as possible. Whether it was roughly drawn wireframes or high-fidelity prototypes, I opted for whichever method let me get something tested quickly.


Result
Data Scientist Portal
Data Scientists previously had to rely on multiple tools—SQL for data sampling, Excel for managing data and questions, Docs for instructions, and Tableau or Python for post-processing. We streamlined this fragmented process by consolidating all steps into a unified platform. Each step was also standardized to ensure labeling jobs were consistent, reducing friction for annotators.
Admin Portal
Admins act as the bridge between Data Scientists & Annotators. They need real-time visibility into tasks and annotators' statuses to ensure clear communication and proper assignment of labeling jobs.
Annotator Portal
The annotator experience needed the most improvement and has the greatest impact on overall labeling efficiency. When designing their workspace, I focused on creating a simple, user-friendly interface that minimizes distractions and helps them stay focused on each individual task.
Result
A Great Success in User Experience and Business Goals!
With the introduction of Label Studio, data labeling evolved into a fully integrated end-to-end platform. The PM & I defined success metrics and worked with the dev team to gather the stats needed to validate the effectiveness of the tool.
-29% weighted average completion time
9.8/10.0 satisfaction rate from post-launch survey
I could sense that my efforts not only improved the overall process efficiency but also had a positive impact on employees' daily work.
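
As a rough illustration of how a metric like the weighted average completion time can be computed, here is a minimal sketch. It assumes each job is weighted by its number of labeled items; the case study does not say which weighting was actually used. The `LabelingJob` record, its field names, and all numbers are hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class LabelingJob:
    # Hypothetical record; field names are illustrative, not from the real tool.
    items: int               # number of labeled items in the job (assumed weight)
    completion_days: float   # time from job creation to completion


def weighted_avg_completion(jobs: list[LabelingJob]) -> float:
    """Average completion time, weighted by job size (an assumed weighting)."""
    total_items = sum(job.items for job in jobs)
    if total_items == 0:
        raise ValueError("no labeled items to weight by")
    return sum(job.items * job.completion_days for job in jobs) / total_items


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100


# Made-up numbers, purely to demonstrate the calculation.
jobs_before = [LabelingJob(items=1000, completion_days=14.0),
               LabelingJob(items=3000, completion_days=6.0)]
jobs_after = [LabelingJob(items=1000, completion_days=9.0),
              LabelingJob(items=3000, completion_days=5.0)]

baseline = weighted_avg_completion(jobs_before)   # 8.0 days
current = weighted_avg_completion(jobs_after)     # 6.0 days
print(f"{percent_change(baseline, current):+.0f}%")  # -25%
```

Weighting by job size keeps a handful of small, fast jobs from masking slow turnaround on the large jobs that matter most for model updates.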
