Label Studio

Timeline

2023.09 - 2023.12

Team

1 designer (me), 1 PM, 6 SDEs

Company

eBay, Payment & Risk

Role

Research, Design, Prototyping, Testing

The Payment & Risk team, responsible for protecting eBay's financial security, faced a significant challenge—slow data labeling processes were delaying updates to the detection AI model. To address this, our team developed Label Studio, a tool designed to streamline and accelerate data labeling workflows.





Impact:

  • -29% weighted average completion time

  • 9.8 out of 10.0 satisfaction rate


Context

What is data labeling?

Data labeling is the process of annotating raw data so that it can be used to train AI models. eBay's Payment & Risk team uses AI models to detect potentially fraudulent transactions, and updating these models requires a continuous supply of high-quality labeled data.

A sample data labeling file

Problem

Average completion time for labeling jobs is too long

The PM frequently receives complaints from Data Scientists about the long average completion time for labeling tasks, which slows down AI model updates. This delay impacts the model’s accuracy in detecting fraud and increases the risk of financial losses.

"Even though we start preparing for data labeling as early as possible, it still takes longer than expected."

- Jimmy, Applied Scientist

“Current data labeling speed is really risking our models’ effectiveness. Is it possible to hire more agents for it?”

- Pan, the head of the Data team

But Why?

Gaining Insights from the stakeholders

To better understand the workflow and pain points, we conducted an observational study to spot inefficiencies, followed by user interviews to validate our assumptions and explore user needs. Three things stood out to us:

1. Workflow Interruption

Besides Excel, stakeholders need to juggle multiple tools throughout the process, which interrupts their workflow.

2. Collaboration Barriers

Without real-time visibility into information, the roles waste time on back-and-forth, increasing communication costs.

3. Unsuitable Tool

Excel, the primary tool for labeling jobs, is poorly suited to the task, leading to inefficiencies and errors.

Workflow Map 🗺️ & Insights 🔍

Making our hypothesis

Shipping an all-in-one tool will enhance the overall experience and effectively improve the data labeling speed

Our research showed that all roles involved in the process frequently switch between multiple tools, many of which are poorly suited to their tasks. Additionally, a lack of transparency, such as not being able to track annotators' progress, leads to communication issues. Based on these findings, I formed this hypothesis.

Brainstorming & Prioritization

Deciding on a Minimum Viable Product

With research insights in hand, we faced a long list of potential features. To build the right product and minimize risks, we focused on the most critical features that balance user needs and development effort. Given the tight timeline, we prioritized high-value, low-complexity features to efficiently address key challenges.

Data Scientist's HMW…

Constructing Information Architecture

Align with the team on feature structure and organization

I created information architectures for different roles to align the team on feature structure and organization before designing the interface. This also improved navigation, helping users find information and complete tasks more efficiently.

Data Scientist's Portal Architecture

Design & Validation

Validating my idea as fast as possible

I invited members from different roles to take part in several rounds of usability testing. Our aim was always to validate ideas as fast as possible. Whether it was roughly drawn wireframes or high-fidelity prototypes, I opted for whichever method let me get something tested quickly.

Result

Data Scientist Portal

Data Scientists previously had to rely on multiple tools—SQL for data sampling, Excel for managing data and questions, Docs for instructions, and Tableau or Python for post-processing. We streamlined this fragmented process by consolidating all steps into a unified platform. Each step was also standardized to ensure labeling jobs were consistent, reducing friction for annotators.

Data Page

Design Page

Job Page

Data tab within a Project

1

User Scenario: Data Scientists need a centralized view to track job progress, monitor dataset status, and quickly take necessary actions.

Our Solution: We designed a project dashboard that combines key project metrics with essential action buttons, enabling Data Scientists to efficiently manage their workflow.

2

User Scenario: Data Scientists often need to modify data to improve structure and integration with other tools.

Our Solution: We introduced a column attributes panel, allowing users to rename, split, and enhance columns with hyperlinks—streamlining data organization and accessibility.

Design tab within a Project

1

User Scenario: Creating questions in spreadsheets was frustrating—options had to be on a separate sheet, and setting dependencies was difficult.

Our Solution: We built a question editor that keeps options in place and simplifies dependency settings, making the process seamless.

2

User Scenario: Without a standardized format, instructions were often unclear and failed to properly explain tasks.

Our Solution: We introduced a preset narrative structure, guiding users to write clear and effective instructions.

Job tab within a Project

1

User Scenario: Previously, Data Scientists had to constantly ask Admins for progress updates.

Our Solution: We provided real-time tracking, allowing Data Scientists to monitor job progress independently.

2

User Scenario: Data Scientists used to rely on external tools, such as Tableau, for data visualization.

Our Solution: We integrated built-in visualization tools, enabling quick insights without leaving the platform.

Admin Portal

Admins act as the bridge between Data Scientists and Annotators. They need real-time visibility into tasks and annotators' statuses to ensure clear communication and proper assignment of labeling jobs.

Job Management Page

Annotator Detail Page

Job Management

1

User Scenario: An Admin’s primary task is job assignment, which requires quick access to job requests and relevant details.

Our Solution: We placed job requests in the most prominent position, displaying all key information to support informed decision-making.

2

3

User Scenario: Admins review related past jobs when assigning new ones, checking who worked on them and their completion status.

Our Solution: We added filters to quickly find relevant jobs, streamlining the assignment process.

Annotator Detail

Annotator Portal

The annotator experience was the most in need of improvement and had the greatest impact on overall labeling efficiency. When designing their workspace, I focused on creating a simple, user-friendly interface that minimizes distractions and helps them stay focused on one task at a time.

1

User Scenario: While working in spreadsheets, users struggle with dense data, making it easy to misread rows and leading to errors or slower processing.

Our Solution: We display one set of data at a time to reduce distractions and ensure clarity, with hyperlinks for quick access to additional details.

2

User Scenario: Answering questions in spreadsheets was frustrating—options were hard to display, and long explanations didn’t fit well in cells.

Our Solution: We introduced a structured question set layout, ensuring clear visibility and ample space for responses.

Workspace

Result

A Great Success in User Experience and Business Goals!

With the introduction of Label Studio, data labeling evolved into a fully integrated end-to-end platform. The PM and I defined success metrics and worked with the dev team to gather the necessary data, which let us validate the tool's effectiveness.

-29% weighted average completion time

9.8/10.0 satisfaction rate from post-launch survey
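For readers curious how a "weighted average completion time" change can be computed, here is a minimal sketch. It assumes each job's completion time is weighted by the number of items in the job; this case study does not state the actual weighting, so treat that as an assumption. The function names and all numbers are made-up placeholders for illustration, not eBay data.

```python
from typing import Sequence


def weighted_average(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted mean of `values`, using `weights` as the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def percent_change(before: float, after: float) -> float:
    """Return the relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Hypothetical jobs: completion time in hours, weighted by item count (assumed weighting).
before_hours = [10.0, 20.0]
after_hours = [9.0, 15.0]
items_per_job = [100, 300]

before_avg = weighted_average(before_hours, items_per_job)
after_avg = weighted_average(after_hours, items_per_job)
change = percent_change(before_avg, after_avg)

print(f"before: {before_avg} h, after: {after_avg} h, change: {change:.1f}%")
```

Weighting by job size keeps a handful of tiny jobs from dominating the average; a different weight, such as the number of questions per item, would work with the same function.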

I could sense that my efforts not only improved overall process efficiency but also had a positive impact on employees' daily work.

© Liandong Zhou, 2025
