Here at Luminoso, we’re constantly researching new ways to help our clients more quickly, easily, and accurately analyze their text-based data. Our Chief Research Officer, Rob Speer, leads the charge to ensure that Luminoso’s software incorporates the most recent developments in AI and natural language processing… and that we’re the ones helping to push those boundaries even further.
A few days ago, Rob announced our latest research breakthrough: ConceptNet 5.5, an updated version of the science we use to teach computers what words mean. After exhaustive testing, we’re proud to say that it performs better than competing systems, and just as well as an average college applicant. (Yes, seriously.)
Better science, better results
This technology upgrade makes Luminoso’s software even more accurate at interpreting – without needing a training data set or a list of keywords or ontologies – what words mean and what people are actually saying (behind all the nuance, misspellings, and slang that occur in language).
How exactly do we accomplish this, you might ask?
Luminoso’s proprietary methodology relies in part on ConceptNet, an open-source knowledge base developed at the MIT Media Lab that helps computers understand language in the same way that humans do.
When people communicate, we bring to the table our basic knowledge of how the world works. For example, we know that the sun is hot, that water is wet, and that most people spend way too many hours of their day watching cat videos. (Guilty as charged.) We don’t need to explain these things when we’re communicating with another person because we assume that they also understand. However, computers don’t know these basic facts unless we teach them – and this is exactly what ConceptNet does.
ConceptNet is important to our methodology because it increases accuracy and keeps computers from drawing incorrect or illogical conclusions about text, the kind that a human with common sense would never draw. This reduces the need to have a person devote precious hours to checking and verifying text analytics outputs, as was necessary in the past.
Making ConceptNet as smart as an average college applicant
With ConceptNet 5.5, we’ve updated the way that we identify different forms of the same word. This ensures that the nuanced meanings of those terms are better represented, and also makes Luminoso’s output compatible with other systems, like Google’s word2vec.
The great news? After testing ConceptNet 5.5 for quality and accuracy, and comparing our results against those of similar systems, we found that ConceptNet 5.5 is not only more accurate than other systems, but is also on par with the performance of an average college applicant.
No big deal, right? 😄
For all you developers out there, we’ve also updated ConceptNet’s API to provide output in JSON-LD format, which will make it easier to link ConceptNet to other knowledge bases you may want to use.
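If you want a feel for what working with that output could look like, here is a minimal Python sketch. It is not official client code: the base URL, the `/c/<language>/<term>` path, and the field names (`edges`, `start`, `rel`, `end`, `label`) are our assumptions about the response layout, and the sample document is hand-written for illustration rather than real API output. Check the ConceptNet API documentation for the authoritative format.

```python
import json
from urllib.request import urlopen

# Assumed base URL of the public ConceptNet web API (not stated in this post).
API_ROOT = "http://api.conceptnet.io"


def fetch_concept(term, lang="en"):
    """Download the JSON-LD document for one concept (needs network access).

    The /c/<lang>/<term> path is an assumption about the API's URL scheme.
    """
    with urlopen(f"{API_ROOT}/c/{lang}/{term}") as response:
        return json.load(response)


def summarize_edges(document):
    """Return (start, relation, end) label triples from a document's edge list."""
    triples = []
    for edge in document.get("edges", []):
        triples.append(
            (edge["start"]["label"], edge["rel"]["label"], edge["end"]["label"])
        )
    return triples


# Hand-written sample in the assumed response shape, so the sketch runs offline.
SAMPLE = {
    "@id": "/c/en/sun",
    "edges": [
        {
            "@id": "/a/[/r/HasProperty/,/c/en/sun/,/c/en/hot/]",
            "start": {"@id": "/c/en/sun", "label": "sun"},
            "rel": {"@id": "/r/HasProperty", "label": "HasProperty"},
            "end": {"@id": "/c/en/hot", "label": "hot"},
        }
    ],
}

if __name__ == "__main__":
    for start, relation, end in summarize_edges(SAMPLE):
        print(f"{start} --{relation}--> {end}")
```

Because every node and relation in a JSON-LD document carries an `@id`, those identifiers can serve as join keys when you combine ConceptNet with another linked-data source. To try the sketch against the live service, pass the result of `fetch_concept("sun")` to `summarize_edges` instead of `SAMPLE`.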
These upgrades have been integrated into Luminoso’s products and are already in place to help you understand what’s important in your text-based data.