Label Studio

Timeline

2023.09 - 2023.12

Team

1 designer (me), 1 PM, 6 SDEs

Company

eBay, Payment & Risk

Role

Research, Design, Prototyping, Testing

The Payment & Risk team, responsible for protecting eBay's financial security, faced a significant challenge: slow data labeling was delaying updates to its fraud detection AI model. To address this, our team developed Label Studio, a tool designed to streamline and accelerate data labeling workflows.

Impact:

  • -29% weighted average completion time

  • 9.8 out of 10.0 satisfaction rate

Context

What is data labeling?

Data labeling is the process of preparing labeled data for AI model training. eBay's Payment & Risk team uses AI models to detect potential fraudulent transactions. Updating these models requires a continuous supply of high-quality labeled data, and data labeling is the process that produces this essential input.

A sample data labeling file

Problem

Average completion time for labeling jobs is too long

PMs frequently receive complaints from Data Scientists that labeling jobs take too long to complete, which slows AI model updates. With 3,000+ fraud cases daily on eBay, these delays reduce model accuracy and significantly increase financial risk.

"Even though we start preparing for data labeling as early as possible, it still takes longer than expected."

- Jimmy, Applied Scientist

“Current data labeling speed is really risking our models’ effectiveness. Is it possible to hire more agents for it?”

- Pan, the head of the Data team

But Why?

Gaining Insights from the stakeholders

To better understand the workflow and pain points, we conducted an observation study to spot inefficiencies, followed by user interviews to validate assumptions and explore needs. A few things stood out to us:

1. Workflow Interruption

Besides Excel, stakeholders need to juggle multiple tools throughout the process, which interrupts their workflow.

2. Collaboration Barriers

Without real-time visibility into job information, roles waste time on back-and-forth, increasing communication costs.

3. Unsuitable Tool

Excel, the primary tool for labeling jobs, is poorly suited to the task, leading to inefficiencies & errors.

Workflow Map 🗺️ & Insights 🔍

Making our hypothesis

Defining Our Approach: Role-Specific Portals

After understanding user needs, we decided to develop a new internal labeling tool. Given the distinct responsibilities of the three roles in the workflow—along with data privacy considerations—we structured the tool into three dedicated portals, each tailored to its specific user group.

How could we address the challenges faced by each role in the process? To guide our design, we made our hypotheses:

1. Workflow Interruption

Besides Excel, stakeholders need to juggle multiple tools throughout the process, which interrupts their workflow.

2. Collaboration Barriers

Without real-time visibility into job information, roles waste time on back-and-forth, increasing communication costs.

3. Unsuitable Tool

Excel, the primary tool for labeling jobs, is poorly suited to the task, leading to inefficiencies & errors.

Brainstorming & Prioritization

Deciding on a Minimum Viable Product

With research insights in hand, we faced a long list of potential features. To build the right product and minimize risks, we focused on the most critical features that balance user needs and development effort. Given the tight timeline, we prioritized high-value, low-complexity features to efficiently address key challenges.

Data Scientist's HMW…

Constructing Information Architecture

Align with the team on feature structure and organization

I created information architectures for different roles to align the team on feature structure and organization before designing the interface. This also improved navigation, helping users find information and complete tasks more efficiently.

Data Scientist's Portal Architecture

Design & Validation

Validating my idea as fast as possible

I invited members from different roles to conduct several rounds of usability testing. Our aim was always to validate ideas as fast as possible. Whether it was roughly drawn wireframes or high-fidelity prototypes, I opted for whichever method let me get something tested quickly.

Solution

Data Scientist Portal

The DS Portal gives Data Scientists a structured and efficient way to manage labeling jobs. It is organized into:

Project List – An overview of all projects, allowing quick access and organization.
Inside Each Project:
· Data Tab – Manage and prepare datasets for labeling.
· Design Tab – Create structured question sets & define instructions.
· Job Tab – Initiate and monitor labeling jobs within the project framework.

Project List Page

Project - Data Tab

Project - Design Tab

Project - Job Tab

Highlight 1

Optimized Repetitive Job Launching

USER SCENARIO

Our research shows that 80% of labeling jobs are repetitive, sharing the same instructions, questions, and data schema with previous jobs—only the data changes.

OUR SOLUTION

We optimized the workflow with two key improvements:

  1. Project-Job Structure – Enables Data Scientists to efficiently reuse setups, minimizing redundant work and streamlining repetitive job creation.

  2. Key Action Panel – Elevates high-frequency actions like uploading data and launching jobs, keeping them accessible across all tabs.

With these enhancements, users can now launch a repetitive job in under 30 seconds, significantly improving efficiency.
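To make the Project-Job idea concrete, here is a minimal sketch of how a project could hold the reusable setup while each job only brings new data. It is an illustration under assumptions, not the shipped implementation: the names `Project`, `Job`, `launch_job`, and all fields are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Job:
    """One labeling run: new data, everything else inherited from the project."""
    job_id: int
    rows: list[dict[str, Any]]
    # Back-reference to the shared setup; excluded from repr/compare to keep
    # the objects cheap to print and compare.
    project: Project = field(repr=False, compare=False)


@dataclass
class Project:
    """Reusable setup shared by every job in the project (hypothetical model)."""
    name: str
    instructions: str
    questions: list[str]
    data_schema: list[str]  # expected column names for uploaded rows
    jobs: list[Job] = field(default_factory=list)

    def launch_job(self, rows: list[dict[str, Any]]) -> Job:
        """Create a job that reuses instructions, questions, and schema."""
        for row in rows:
            missing = [col for col in self.data_schema if col not in row]
            if missing:
                raise ValueError(f"row is missing columns: {missing}")
        job = Job(job_id=len(self.jobs) + 1, rows=rows, project=self)
        self.jobs.append(job)
        return job
```

In this sketch, launching a repetitive job is a single call such as `project.launch_job(new_rows)`: the instructions and questions are never re-entered, which mirrors the reuse the Project-Job structure is meant to provide.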

Quick-launching a repetitive job in under 30 seconds

Highlight 2

An All-in-One Platform

USER SCENARIO

Among the three roles, Data Scientists rely on external tools the most—using SQL for data sampling, Excel for data and question management, Docs for instructions, and Tableau or Python for post-processing. This fragmented workflow slows down efficiency and increases complexity.

OUR SOLUTION

To streamline their workflow, we built integrated features that eliminate the need for external tools, enabling Data Scientists to complete the entire process—from task creation to post-processing—within our platform.

Data Attributes Setting Panel

Data Sampling Tool

Question Editor

Result Visualization Tool

Admin Portal

Admins act as the bridge between Data Scientists & Annotators. They need real-time visibility into jobs and annotators' statuses to ensure clear communication and proper assignment of labeling jobs. In their portal, you’ll find:

Job Management Page – Track all pending, ongoing, and completed jobs, with quick approval and filtering options.

Annotator Detail Page – See annotator profiles, basic information, and their assigned jobs.

Job Management Page

Annotator Detail Page

Annotator Portal

The speed of data labeling heavily depends on human annotation efficiency. When designing the Annotator Portal, I constantly asked myself:


"How can we improve how fast annotators analyze and label data?"


This mindset drove my design decisions, ensuring a seamless, distraction-free labeling workspace that enables annotators to focus, process data efficiently, and complete tasks faster.

1. In the annotator workspace, only the current data row is displayed at a time, minimizing distractions.

2. Critical ID data is now hyperlinked, allowing users to access relevant info instantly. No more copying & pasting: just click to open.

3. Questions are presented clearly, with conditional logic ensuring only relevant ones appear, reducing cognitive load.
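The conditional logic behind the question list can be sketched roughly as follows. This is a simplified illustration, assuming each question may depend on the answer to one earlier question; the `Question` fields and the `visible_questions` helper are hypothetical names, not the tool's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Question:
    key: str
    text: str
    # Hypothetical rule: show this question only when the answer stored
    # under `depends_on` equals `show_if`. None means "always show".
    depends_on: Optional[str] = None
    show_if: Optional[str] = None


def visible_questions(questions, answers):
    """Return the questions an annotator should see, given answers so far."""
    visible = []
    for q in questions:
        if q.depends_on is None or answers.get(q.depends_on) == q.show_if:
            visible.append(q)
    return visible


# Example question set (illustrative wording only).
QUESTIONS = [
    Question("is_fraud", "Is this transaction fraudulent?"),
    Question("fraud_type", "Which type of fraud is it?",
             depends_on="is_fraud", show_if="yes"),
]
```

With this shape, the follow-up question stays hidden until the annotator answers "yes" to the first one, so the workspace never shows a question that cannot apply to the current row.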

Annotator's Labeling Process

In recent usability testing, annotators using the new tool saw a 37% increase in efficiency per data row compared to Excel, while the error rate dropped from 5% to 2%.

Result

A Great Success in User Experience and Business Goals!

With the introduction of Label Studio, data labeling evolved into a fully integrated end-to-end platform. The PM and I defined success metrics and worked with the dev team to gather the necessary stats, which let us validate the effectiveness of the tool.

-29% weighted average completion time

9.8/10.0 satisfaction rate from post-launch survey
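For readers unfamiliar with the metric, a weighted average completion time can be computed along these lines. The weighting scheme is an assumption made for illustration (each job weighted by its number of data rows); the case study does not state which weights were used, and the helper names are hypothetical.

```python
def weighted_average(times_hours, weights):
    """Weighted mean of job completion times.

    Assumption: `weights` holds one non-negative weight per job,
    for example the number of data rows in that job.
    """
    if len(times_hours) != len(weights) or not weights:
        raise ValueError("need one weight per completion time")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(t * w for t, w in zip(times_hours, weights)) / total_weight


def relative_change(before, after):
    """Fractional change from `before` to `after` (negative means faster)."""
    return (after - before) / before
```

Under this definition, a reported change of -29% would mean the weighted average after launch is 0.71 times the weighted average before launch.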

I could sense that my efforts not only improved overall process efficiency but also had a positive impact on employees' daily work.

© Liandong Zhou, 2025
