Case Study

Data Annotation to Optimize Searchability in E-Commerce

A community of 200+ TAUS contributors, selected for their affinity with product categories ranging from make-up and collectible coins to professional audio equipment, annotated data in several European languages. The annotated data was intended for training the client’s machine learning systems to optimize webshop functionalities such as searchability.
The Client
A multinational e-commerce corporation
Facilitates consumer-to-consumer and business-to-consumer sales through its website
Serves a worldwide market with its online shopping site, which is best known for its auctions
The Challenge
Data annotation is the categorization and labeling of data to be used in the training of AI applications. Training datasets must be carefully categorized and annotated for each specific use case. High-quality, human-powered data annotation allows companies to build and improve AI implementations, which results in enhanced customer experience solutions such as product recommendations, relevant search engine results, computer vision, speech recognition, chatbots, and more.
Our client aimed to find an efficient and scalable data annotation solution to train their ML systems and optimize their webshop functionalities, in particular to improve searchability and enhance the relevance of search results for customers. The client’s webshop features a wide array of products for a global customer base. The annotated data had to be highly accurate so that the resulting machine learning model could identify search terms and match them with the corresponding products. Achieving that required forming language-specific annotator teams and customizing the platform to maximize efficiency. To ensure accuracy and attention to nuance in the annotation process, contributors were selected not only for their language skills but also for their affinity with product categories ranging from make-up and collectible coins to professional audio equipment.
The Solution
The client’s trust in TAUS’ expertise in the data space and commitment to quality in all phases from community recruitment and training to platform customization led them to partner with us to design a custom solution.
In close collaboration with the client, the TAUS HLP (Human Language Project) operations manager designed a customized data annotation solution on our proprietary HLP platform:

- Scalability: The TAUS HLP Platform scales easily, allowing large volumes of data to be processed by a large number of data annotators in a controlled environment.

- Customization: The client requested customized data annotation features to maximize efficiency and effectiveness. Our teams were able to create a solution that exactly matched the client’s specific requirements.

- Community based on product affinity: A highly engaged community of HLP data annotators was recruited per language based on the workers’ product affinity

- Competitive cost structure: Thanks to the short supply chain in the TAUS HLP Platform, we were able to offer a competitive cost structure while ensuring fair compensation to the communities engaged, based on our Fair Cooperation Principle.

The Results
250+ Contributors
5+ Languages
150+ Hours of platform customization work
60+ Hours of recruitment work to form communities based on product affinity
The TAUS HLP Team gave our client access to a data annotation platform tailored to their specific needs and to a community of language-specific data annotators recruited after careful evaluation. The customizations implemented included the following:

- Customized data tokenization based on client specifications

- On-screen explanation of the tags to use to annotate the data, including client-specific examples (1)

- Easy one-click tagging experience to optimize tagging speed and effort (2)

- Intuitive tag dragging feature to group connected tokens with the same tag (3)

- Customizable workflow to add multiple annotation/review steps as per client’s requirements

- Project level QA checks to efficiently handle consistency checks across the whole project (4)
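To make two of these customizations more concrete, the following is a minimal Python sketch of how grouping connected tokens under one tag and a project-level consistency check could work in principle. It is not the TAUS HLP Platform's implementation, which this case study does not describe. The function names (`group_spans`, `find_inconsistencies`), the tag labels, and the sample listings are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_spans(tokens, tags):
    """Merge runs of adjacent tokens that share a tag into one span.

    Simplification (assumption): two neighbouring entities with the same
    tag are merged into a single span.
    """
    spans = []
    for token, tag in zip(tokens, tags):
        if spans and spans[-1][1] == tag:
            # Same tag as the previous token: extend the current span.
            spans[-1] = (spans[-1][0] + " " + token, tag)
        else:
            spans.append((token, tag))
    return spans


def find_inconsistencies(annotated_items):
    """Return span texts that received more than one tag across the project."""
    tags_by_text = defaultdict(set)
    for tokens, tags in annotated_items:
        for text, tag in group_spans(tokens, tags):
            tags_by_text[text.lower()].add(tag)
    return {
        text: sorted(tags)
        for text, tags in tags_by_text.items()
        if len(tags) > 1
    }


# Hypothetical annotated listings: (tokens, one tag per token).
items = [
    (["vintage", "silver", "dollar", "coin"],
     ["CONDITION", "MATERIAL", "PRODUCT", "PRODUCT"]),
    (["silver", "studio", "microphone"],
     ["COLOR", "PRODUCT", "PRODUCT"]),
]

for tokens, tags in items:
    print(group_spans(tokens, tags))

# Span texts tagged differently in different items, for a reviewer to check.
print(find_inconsistencies(items))
```

In this sketch, a span that is flagged as inconsistent is not necessarily wrong: the same word can legitimately carry different tags in different listings, so the output is best treated as a list for human review.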

Let's connect

Talk to our experts to see how you can utilize our Human Language Project and our worldwide communities to help you with any data services or enhancements you need.
