NeuralSpace has achieved exceptional results in a benchmarking experiment on Natural Language Understanding (NLU) intent classification and entity recognition across a range of languages, including Arabic, Hindi, Chinese (Mandarin) and many others. Our NLU technology surpasses industry leaders Google, IBM and Rasa in accurately identifying user intents, and surpasses Rasa in extracting entities. These results are a testament to our commitment to delivering reliable, best-in-class NLU solutions to our global community.
In this blog post, we will explore the significance of NLU benchmarking and our results, as well as the challenges associated with building highly accurate NLU models in multiple languages.
Natural Language Understanding (NLU) refers to the ability of a computer or machine to comprehend and interpret human language in a way that is similar to how humans understand it. It involves the analysis of various aspects of language, including syntax, semantics, and context, to derive meaning from text. NLU is a critical component of many applications such as chatbots, virtual assistants, sentiment analysis dashboards and (semantic) search engines, as it enables these systems to understand user queries and respond appropriately.
Intent classification and entity recognition are the two core tasks of NLU models. Intent classification refers to the ability of an NLU system to accurately understand the purpose or intention behind a user’s input or query. For example, if a user types “احجز رحلة إلى دبي في مساحة النادي” (in English, “book a flight to Dubai in the Club Space”) into a travel website’s virtual assistant, the intent of their message is “book flight”.
Entity recognition, on the other hand, refers to the ability of an NLU system to identify and extract relevant information from a user’s input or query. Entities are specific pieces of information that are relevant to the user’s intent, such as names of people, places, dates, or prices. In the above example, “دبي” (Dubai) is recognised as the entity “city” and “مساحة النادي” (Club Space) is recognised as a custom entity “seat type”. The accurate understanding of both intents and entities is crucial for a successful NLU model.
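To make this concrete, here is a rough sketch (in Python) of the kind of structured output an NLU model produces for the utterance above. The field names, intent label and confidence score are illustrative only and do not correspond to any particular provider’s API.

```python
# Illustrative only: the field names and confidence value below are
# hypothetical and not tied to any specific NLU provider's response format.
utterance = "احجز رحلة إلى دبي في مساحة النادي"  # "book a flight to Dubai in the Club Space"

prediction = {
    "text": utterance,
    "intent": {"name": "book_flight", "confidence": 0.97},
    "entities": [
        {"type": "city", "value": "دبي"},                # built-in entity
        {"type": "seat_type", "value": "مساحة النادي"},  # custom entity
    ],
}
```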
Our performance benchmarking was conducted using the open-source multilingual Amazon MASSIVE dataset. NeuralSpace’s model outperformed Google, IBM and Rasa in every language we tested! We trained each NLU model with the out-of-the-box methods available through the user interface or APIs of Google Dialogflow, IBM Watson, Rasa and NeuralSpace.
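For readers who want to reproduce the setup, the MASSIVE dataset is publicly available; the sketch below loads one locale through the Hugging Face `datasets` library. It assumes the `AmazonScience/massive` dataset identifier and the `utt`, `intent` and `annot_utt` field names used in the public release.

```python
# Minimal sketch: inspect one locale of the Amazon MASSIVE dataset.
# Depending on your `datasets` version you may need to pass trust_remote_code=True.
from datasets import load_dataset

massive_ar = load_dataset("AmazonScience/massive", "ar-SA")

example = massive_ar["train"][0]
print(example["utt"])        # raw utterance text
print(example["intent"])     # intent label (class index)
print(example["annot_utt"])  # utterance with slot/entity annotations
```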
Not all of the evaluated service providers support training custom entities, so our entity recognition benchmark is limited to Rasa and NeuralSpace.
We evaluated each model’s performance using the F1 score, the harmonic mean of precision and recall. The strict F1 score requires an exact match of both the entity’s surface-string boundaries and its type, while the partial F1 score credits a partial boundary match over the surface string, regardless of entity type.
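To illustrate how these scores are computed, the sketch below re-implements strict and partial entity matching over (type, start, end) spans. It is a simplified illustration for this post, not the exact evaluation script used in the benchmark.

```python
# Simplified illustration of strict vs. partial entity-level F1.
# Entities are represented as (type, start, end) character spans.

def f1(num_correct, num_predicted, num_gold):
    """F1 is the harmonic mean of precision and recall."""
    if num_correct == 0 or num_predicted == 0 or num_gold == 0:
        return 0.0
    precision = num_correct / num_predicted
    recall = num_correct / num_gold
    return 2 * precision * recall / (precision + recall)

def strict_matches(predicted, gold):
    # Exact boundary and entity-type match.
    return len(set(predicted) & set(gold))

def partial_matches(predicted, gold):
    # Any boundary overlap counts, regardless of entity type.
    return sum(
        any(p_start < g_end and g_start < p_end for _, g_start, g_end in gold)
        for _, p_start, p_end in predicted
    )

gold = [("city", 14, 17), ("seat_type", 21, 33)]
predicted = [("city", 14, 17), ("seat_type", 27, 33)]  # second span only partly recovered

print(f1(strict_matches(predicted, gold), len(predicted), len(gold)))   # 0.5 (strict)
print(f1(partial_matches(predicted, gold), len(predicted), len(gold)))  # 1.0 (partial)
```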
Every language has its own unique vocabulary, grammar, and sentence structure. Capturing the nuances of each language is essential for accurate NLU. Colloquialisms and idiomatic phrases can have entirely different meanings than their literal translations, making it difficult for NLU models to understand them.
Achieving high NLU accuracy requires understanding the context in which words or phrases are used. This can be particularly challenging when dealing with multiple languages and their cultural nuances.
For many languages, there is a lack of large-scale, high-quality training data, which hinders the development of accurate NLU models. This is especially true for languages that do not use the Latin alphabet, for which digital keyboards and text input tools were developed considerably later than for Latin scripts.
In many countries, speakers often switch between languages within a single conversation. In Saudi Arabia, for example, Arabic is often mixed with English, and in India, Hindi and English are often mixed. NLU models must be able to detect and understand content regardless of the language-mix to maintain high accuracy.
We collect proprietary data to train in-house models for better performance. These data come from a wide range of sources and aim to capture the nuances of different industries, regions, age groups, religious beliefs and many other cultural and contextual settings.
We fine-tune our models on mixed language datasets, making them more effective in many practical settings where users tend to use English, French or Spanish words within another language.
Our models are fine-tuned on domain and task-specific data to ensure high performance in NLU tasks, such as sentiment analysis, text categorisation and content analysis.
Our models are designed to handle multiple languages effectively, enabling seamless cross-lingual understanding and transfer learning across languages.
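As a rough illustration of cross-lingual transfer (and not a description of NeuralSpace’s proprietary models), the sketch below sets up a public multilingual encoder, XLM-RoBERTa, for intent classification. Because all languages share one tokenizer and one encoder, a model fine-tuned on utterances in one language can often be applied to utterances in others.

```python
# Illustrative sketch of cross-lingual intent classification with a public
# multilingual encoder; this is not NeuralSpace's actual model or pipeline.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NUM_INTENTS = 60  # e.g. the number of intent classes defined in MASSIVE

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=NUM_INTENTS
)

# Arabic and English utterances pass through the same tokenizer and encoder,
# which is what enables transfer across languages after fine-tuning.
batch = tokenizer(
    ["احجز رحلة إلى دبي", "book a flight to Dubai"],
    padding=True,
    return_tensors="pt",
)
logits = model(**batch).logits  # shape: (2, NUM_INTENTS)
```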
Highly accurate Natural Language Understanding (NLU) is crucial for AI-powered services to effectively interpret and respond to user inputs. When this technology underperforms, it can lead to frustrating user experiences, which hinders product adoption. Since its inception, NeuralSpace has focused on developing NLU models that capture the nuances of different languages and dialects. With this approach, we aim to unlock seamless AI-powered experiences for people around the world, irrespective of their language or location. We’re delighted to see this long-standing commitment lead to highly accurate models that deliver better outcomes for our customers.
Get in touch to learn more about NeuralSpace or visit our website.