Luminoso Software Updates

What a time we’ve been having over the last few weeks! Aside from the public launch of our newest solution, Compass, our development team continued to plow forward and improve on everything we offer. Check out what our team has been working on here, including:

  • Project “branch” creation (API)
  • Project name duplication fix (API)
  • Japanese one-character fix (Compass)

As always, please feel free to reach us at support@luminoso.com if you have any questions or comments!

Natural language can be such an ass headache

It was exciting to see Luminoso’s new product for streaming text analytics, Compass, get an article in Wired. Skimming past the picture of Catherine and me looking ridiculous at SXSW long ago, there’s an image of our “concept cloud” visualizer looking at what people say on Twitter when they’re sick:

Luminoso's concept cloud, showing words, phrases, and emoji people use when they're feeling sick.

Wait a minute. Zoom in. Enhance.

The text "ass headache" appears in the word cloud, near "biggest headache" and "got the worst headache".

The article’s screenshot contains a natural-language glitch that’s already caused a lot of amusement around the office.

Here’s what’s going on. One important thing that Luminoso does is to identify relevant phrases that contain more information than the sum of their parts. When looking at text from people who are feeling sick, the phrase “throat hurts so bad” is much more informative than the words “throat”, “hurts”, “so”, and “bad” in isolation.

Usually, these informative phrases end up being reasonable phrases of natural language, or at least close enough (“headache is killing” is missing the object, but we all get the idea).

One case where this missed slightly is the phrase “ass headache”. This is not an affliction that people would usually complain of. And yet it looks entirely reasonable to the computer, given the source data, which contains many phrases such as:

  • “I got this crazy ass headache”
  • “I have a biggg ass headache”
  • “I gotta mean ass headache bruh”

Statistically, it looks like an “ass headache” is a thing you can have. You can have a crazy one, or a mean one, or simply a biggg one, but lots of people have one.
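To make the statistics concrete, here is a minimal sketch that counts adjacent word pairs in the three example tweets above. It is only an illustration of the general idea, not Luminoso's actual phrase-detection method; the function name `bigram_counts` and the plain whitespace tokenization are assumptions made for this example.

```python
from collections import Counter


def bigram_counts(sentences):
    """Count adjacent word pairs (bigrams) across a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        # Naive tokenization: lowercase and split on whitespace.
        tokens = sentence.lower().split()
        # Pair each token with the one that follows it.
        counts.update(zip(tokens, tokens[1:]))
    return counts


# The three example tweets quoted in the post.
tweets = [
    "I got this crazy ass headache",
    "I have a biggg ass headache",
    "I gotta mean ass headache bruh",
]

counts = bigram_counts(tweets)
top_bigram, top_count = counts.most_common(1)[0]
print(" ".join(top_bigram), top_count)  # prints "ass headache 3"
```

Every other pair, such as "crazy ass" or "mean ass", shows up only once, so a purely count-based view sees "ass headache" as the one stable phrase in this tiny sample.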

Because we’re actual speakers of the language, as opposed to computers stumbling through it to the best of their ability, we know how these phrases should really be interpreted. We understand that the word “ass”, for whatever reason, can be a modifier for the adjective before it. (That doesn’t stop us from humorously reinterpreting it as a modifier for the noun after it, as an early XKCD comic encourages us to, which is essentially what Luminoso’s analytics did!)

XKCD #37, by Randall Munroe.


Phrases that come up in our everyday conversation can contain surprising grammatical quirks. And that’s why natural language is such an ass headache.

Luminoso Software Updates

Hey y’all, another few weeks, much more work from our development team, and many more improvements to our solutions!

Take a look here for the most recent updates, including:

  • Language support in…wait for it…RUSSIAN!
  • Improvements to negation handling
  • New doc_fields parameters to request documents in API

Please feel free, as always, to reach out to us with any questions, comments or feedback. The best way to do that is to send us an e-mail at support@luminoso.com.

Edelman v. Sichuan Garden & Yelp Reviews!

Our own Alice Kaanta, an analytics engineer at Luminoso, provides an interesting take on the high-profile social media eruption of Edelman v. Sichuan Garden.

 


In recent news, Ben Edelman, an HBS professor, made a fully-loaded verbal assault against Woburn-based restaurant Sichuan Garden over a $4.00 discrepancy in his check.

You might have observed the blowback across social media as this story went viral, largely in defense of the Sichuan Garden, though defenders of Edelman’s argument exist. Since no one has time to read all of the commentary strewn across social media, we decided to analyze some of the surrounding feedback using Luminoso.

People have taken to Yelp to defend/decry Sichuan Garden, so we’ve loaded 204 of Sichuan Garden’s reviews into our solution.

Sichuan Garden - All reviews (1)

All Reviews (Color Representation: Positive Sentiment Negative Sentiment Sichuan Garden Edelman, Harvard Food Service)

 

It seems that most of the reviews about Sichuan Garden are about the food and the authenticity of the restaurant. However, “Edelman”, “Ben Edelman”, and “Harvard” are a definite presence.

Sichuan Garden - Five Star Edelman (1)

All Five-Star Reviews (Color Representation: Very related to “Edelman” Moderately related to “Edelman” Slightly related to “Edelman” Not related to “Edelman”)

 

We found that people who discussed “Edelman” also gave the restaurant five stars. What they had to say about Professor Edelman was not flattering, as the top term related to “Edelman” was “bully”.

Sichuan Garden - Recommended Reviews (1)

Recommended Reviews
(Color Representation: Positive Sentiment Negative Sentiment Sichuan Garden Edelman, Harvard Food Service)

Sichuan Garden - Not Recommended (1)

Not Recommended Reviews
(Color Representation: Positive Sentiment Negative Sentiment Sichuan Garden Edelman, Harvard Food Service)

 

If you’re familiar with Yelp, you might have noticed that some reviews are quietly hidden, having been sent into the “not recommended” reviews section. According to Yelp’s FAQ, this might be because “… the review might have been posted by a less established user, or it may seem like an unhelpful rant or rave.” “Not recommended” reviews do not contribute to a restaurant’s star rating, nor are they easily viewable.

We found that all of the reviews mentioning “Edelman” appear to have been marked “not recommended”, which suggests that Yelp understandably does not wish to host a flame war.  It also suggests that well-meaning Sichuan Garden supporters aren’t actually contributing to the restaurant’s star rating or visible positive reviews…

Sichuan Garden - Graph (1)

…that is, unless they are avoiding the Edelman commentary entirely, and sticking to talking about the food.

Concept-Based vs. Keyword-Based Text Analytics: What’s the difference and why does it matter?

A parting gift in the form of a blog post, by our MBA intern, Saman Djabbari.


The average person is thinking one of two things after reading the title of this post. “What in the world is text analytics?” or “You have my attention, what’s the difference?”

Leading industry analyst Seth Grimes, who has in fact posted on this site before, defines text analytics as “software and transformational processes that uncover business value in ‘unstructured’ text”. Make sense? Not quite? Well, Grimes continues the definition to state that “text analytics applies statistical, linguistic, machine learning and data analysis and visualization techniques to identify and extract salient information and insights. The goal is to inform decision making and support business optimization.” To simplify that, text analytics is here to help you run your business more effectively, ideally saving you time as you uncover insights and ideas in your data that you might never have been aware of.

Now that you get the basics, let’s highlight two of the existing methodologies for text analytics: keyword-based and concept-based. But, before I go into the details of the two, I’m going to provide you with an analogy to help make understanding these methodologies as clear as possible.

Think of a common decision people have to make in everyday life: whether to make dinner or order in. If you make dinner, you have to purchase ingredients and follow a recipe, which can take a good portion of time. If you order in, you typically tell someone over the phone what you want, or you place an order via an app or website. And, are you picking up or do they deliver? For one meal, the cost of each option is often comparable, which is why this is such a common decision, but the true differentiator and pain alleviator is time and convenience.

Now let’s go into the text analytics methodologies.

In keyword-based text analytics you need to tell the software exactly what to look for. A good example of this is Boolean logic, which involves typing out a string of words for the software you’re using to detect. Have you ever used Boolean logic? It’s terrible. You have to type in each and every word, every misspelling, and any jargon you think might exist, separating them with conjunctions. There isn’t a greater waste of time than typing in every variation and synonym of a word, one after another, while separating them with “and” and “or”. Not only must you identify what you’re looking for, but also what you’re not looking for, again wasting valuable human hours that could instead be directed toward deriving insights from the data. Let’s call this the “cooking” method.
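For a feel of what that looks like in practice, here is a small sketch of a hand-written Boolean keyword filter. It is not any particular product's query syntax; the function `matches_boolean_query`, the term lists, and the second and third sample documents are all made up for illustration.

```python
import re


def matches_boolean_query(text, any_of, none_of=()):
    """Return True if text contains at least one any_of term and no none_of term."""
    # Extract lowercase words so that matching is whole-word, not substring.
    words = set(re.findall(r"[a-z']+", text.lower()))
    has_wanted = any(term in words for term in any_of)
    has_unwanted = any(term in words for term in none_of)
    return has_wanted and not has_unwanted


# Every spelling, plural, and synonym has to be listed by hand (hypothetical lists).
any_of = ["headache", "headaches", "headach", "migraine", "migraines"]
none_of = ["hangover"]

# Illustrative documents; only the first comes from the post above.
docs = [
    "I got this crazy ass headache",
    "My head is pounding",
    "Worst hangover headache ever",
]

hits = [doc for doc in docs if matches_boolean_query(doc, any_of, none_of)]
print(hits)  # prints ['I got this crazy ass headache']
```

Notice that "My head is pounding" is missed entirely because nobody thought to type "pounding" into the list, which is exactly the kind of upkeep the "cooking" method demands.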

Rest assured, there is an easier way. Concept-based text analytics allows you to upload the data to the solution, which begins to derive insights after a few minutes of processing. Sounds much easier, right? Concept-based text analytics saves you valuable time: instead of telling a system what to search for, you let the solution discover new insights for you. Let’s call this the “ordering-in” method.

A perfect example is the work that the Health Media Collaboratory at the University of Illinois at Chicago, one of our clients, performed using Luminoso’s concept-based text analytics. In order to analyze 140,000 tweets, they simply uploaded the data to the solution, without having to pre-program anything. The most important themes were delivered to them – “ordering in” their insights, if you will.

Stop telling your software what to do and stop wasting money. Go with a concept-based text analytics platform. You’ll save your business thousands of dollars, save the time and pain of labor hours, and be more effective in gaining the insight you’ve always been looking for. Don’t get me wrong, I love to cook sometimes…but you get the point.

Luminoso Software Updates

Another few weeks have passed, which means that our development team has made parts of our solution better and added some new features.

Check out our most recent software update here highlighting improvements on the following:

  • Improvement to upload of subsets
  • Dashboard CSV downloads per subset
  • Enhanced approach to numbers in documents

As always, if you ever have any questions, comments or feedback about your Luminoso experience, or want to know more about our solutions and software updates, please feel free to reach out to us at support@luminoso.com.