2023.09 - 2023.12

Context
What is data labeling?
Data labeling is the process of preparing labeled data for AI model training. eBay's Payment & Risk team uses AI models to detect potentially fraudulent transactions. Updating these models requires a continuous supply of high-quality labeled data, and data labeling is the process that produces this essential input.

A sample data labeling file
Problem
Average completion time for labeling jobs is too long
PMs frequently receive complaints from Data Scientists that data labeling jobs take too long to complete, which slows AI model updates. With 3,000+ fraud cases daily on eBay, these delays reduce model accuracy and significantly increase financial risk.
But Why?
Gaining Insights from the stakeholders
1. Workflow Interruption
Besides Excel, stakeholders need to juggle multiple tools throughout the process, which interrupts their workflow.
2. Collaboration Barriers
Without real-time information visibility, the roles waste time on back-and-forth, increasing communication costs.
3. Unsuitable Tool
Excel, the primary tool for labeling jobs, is poorly suited to the task, leading to inefficiencies and errors.
Workflow Map 🗺️ & Insights 🔍

Making our hypothesis
Defining Our Approach: Role-Specific Portals
After understanding user needs, we decided to develop a new internal labeling tool. Given the distinct responsibilities of the three roles in the workflow—along with data privacy considerations—we structured the tool into three dedicated portals, each tailored to its specific user group.
How could we address the challenges faced by each role in the process? To guide our design, we made our hypotheses:
1. Workflow Interruption
Besides Excel, stakeholders need to juggle multiple tools throughout the process, which interrupts their workflow.
2. Collaboration Barriers
Without real-time information visibility, the roles waste time on back-and-forth, increasing communication costs.
3. Unsuitable Tool
Excel, the primary tool for labeling jobs, is poorly suited to the task, leading to inefficiencies and errors.
Brainstorming & Prioritization
Deciding on a Minimum Viable Product
With research insights in hand, we faced a long list of potential features. To build the right product and minimize risks, we focused on the most critical features that balance user needs and development effort. Given the tight timeline, we prioritized high-value, low-complexity features to efficiently address key challenges.
Data Scientist's HMW…

Constructing Information Architecture
Align with the team on feature structure and organization
I created information architectures for different roles to align the team on feature structure and organization before designing the interface. This also improved navigation, helping users find information and complete tasks more efficiently.
Data Scientist's Portal Architecture

Design & Validation
Validating my idea as fast as possible
I invited members from each role to several rounds of usability testing. The aim was always to validate ideas as fast as possible: whether the artifact was a roughly drawn wireframe or a high-fidelity prototype, I opted for whichever method let me get something tested quickly.
Solution
Data Scientist Portal
Highlight 1
Optimized Repetitive Job Launching
Highlight 2
An All-in-One Platform
Admin Portal
Annotator Portal
Result
A Great Success in User Experience and Business Goals!
With the introduction of Label Studio, data labeling evolved into a fully integrated end-to-end platform. The PM and I defined success metrics and worked with the dev team to gather the necessary stats, which let us validate the tool's effectiveness.
-29% weighted average completion time
9.8/10.0 satisfaction rate from post-launch survey
I could sense that my efforts not only improved overall process efficiency but also had a positive impact on employees' daily work.
