Introducing Audience Uploads
By Mercedes Lois Bull, Nov 18
It seems like we only just gifted you with a rich casket of wonders in the last Brandwatch update two weeks ago. Yet here we are again, ready to gently lather you with yet more wonderful improvements.
It’s less revolution than it is evolution this time round, starting with a much-awaited UK addition to our location operators, as well as a more radical upheaval of our sentiment analysis.
Fine-grained Location Data
We introduced this handy feature a few weeks ago, initially just for the United States of America, because, well, God Bless America and all that, and also for Austria, because, well, it was convenient. So stop asking questions.
We’ve added good old Blighty to the feature now too, meaning that you can define specific areas to track your data within the UK.
It works in the same way it does for other regions. Once you’ve decided the areas you would like to monitor, you can add them to your query in order to isolate specific regions.
Let’s say we’d like to look at what the Home Counties thought of the Jubilee. While we haven’t been astute enough to include the Home Counties as a region within Brandwatch’s operator list – look at the semantic difficulties in determining what constitutes a Home County – users will be able to put together their own Location Groups to do it manually.
As we’ve got over 60,000 location codes for all the towns, cities, states and counties that we’ve included so far, Brandwatch has a handy location code finder on the query creation screen. Using this, we can just add Kent, East Sussex and the other southern London-huggers to the query ourselves.
We’ve included thousands of places across the UK for you to isolate and play with and, perhaps interestingly, we used Wikipedia to find towns with populations of around 5k or more, which is our loose benchmark for what constitutes a town. So apologies to Speldhurst; you’re just too small to be included.
New Sentiment Analysis System
Now, technically this isn’t a brand new concept, but we are changing how our sentiment analysis works by default.
After listening to our clients, we no longer think the neutral mentions are anywhere near as important as those classified as either positive or negative. Therefore, we’ve decided to chuck ‘em in the bin. Actually, we haven’t; rather, we’ve tidied them onto the shelf, away from all our lovely guests when they enter the Brandwatch house.
When you open up a new dashboard in Brandwatch now, you’ll be greeted with the beaming face of positive mentions – and the grumpy face of negative ones.
More important than pure aesthetics, however, is the switch to our rules-based sentiment classifiers by default. This won’t affect any existing queries you have running, but all new ones made from this point onwards will be bracketed under this system instead of our usual hybrid classifiers.
As before, any client can swap which classifiers are being used on their queries whenever they like. The new default also means you won’t have to select the industry your query is about. A bit cleaner and clearer all round, wouldn’t you agree?
It wasn’t simply a decision driven by the pursuit of clarity, of course. We put a lot of thought and research into the topic and decided it would be the better choice.
We’ve got seven languages already supported using the rules-based system, with a further five in development (Chinese, Russian, Swedish, Italian and Dutch). For those languages that we haven’t made a rules-based classifier for yet, you’ll be put on the ‘classic’ hybrid classifier until we have time to build a rules-based one.
There are lots of different ways of measuring the accuracy of sentiment analysis, which can actually provide some red herrings if you don’t know what stats you’re looking at.
For the vast majority of queries, the bulk (80%+) of the mentions are neutral, so a classifier’s overall accuracy is very high provided it gets the neutral ones right (which is much easier than getting the positive and negative ones right). This means that when you hear numbers like ‘80% sentiment accuracy’, the figure isn’t actually that indicative of how useful the classifier is.
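To see why a headline accuracy figure can mislead, here is a rough Python sketch with entirely made-up numbers. It assumes a hypothetical data set of 100 mentions (80 neutral, 12 positive, 8 negative) and a deliberately useless "classifier" that calls everything neutral. None of this is Brandwatch code; the `accuracy` function and the data are purely illustrative.

```python
def accuracy(gold, predicted):
    """Fraction of mentions whose predicted label matches the true label."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Hypothetical data set: 80 neutral, 12 positive and 8 negative mentions.
gold = ["neutral"] * 80 + ["positive"] * 12 + ["negative"] * 8

# A useless "classifier" that labels every single mention as neutral.
always_neutral = ["neutral"] * len(gold)

# How many positive mentions did it actually find?
positives_found = sum(
    g == "positive" and p == "positive" for g, p in zip(gold, always_neutral)
)

print(accuracy(gold, always_neutral))  # 0.8
print(positives_found)                 # 0
```

So, under these assumptions, a classifier that never finds a single positive or negative mention still scores 80% accuracy.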
We found that our clients were much more interested in the accuracy of the positive and negative mentions in particular. There are two ways of measuring this:
Precision is the percentage of correct findings in the overall output. For example, a Positive Precision of 23% means that only 23% of all mentions classified by the classifier as positive were actually positive.
Recall is the percentage of positive mentions found out of all positive mentions in the data set. For example, a Positive Recall of 58% means that the classifier was able to find 58% of all positive mentions that exist in the data set.
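The two definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than how Brandwatch computes anything internally: the `precision_recall` function and the example data (10 truly positive mentions out of 100, with the classifier flagging 8 as positive, 6 of them correctly) are assumptions made up for the example.

```python
def precision_recall(gold, predicted, label):
    """Return (precision, recall) for one label, e.g. "positive"."""
    pairs = list(zip(gold, predicted))
    # Mentions the classifier labelled correctly as `label`.
    true_hits = sum(g == label and p == label for g, p in pairs)
    # Everything the classifier labelled as `label`, right or wrong.
    flagged = sum(p == label for _, p in pairs)
    # Everything that really is `label` in the data set.
    actual = sum(g == label for g, _ in pairs)
    precision = true_hits / flagged if flagged else 0.0
    recall = true_hits / actual if actual else 0.0
    return precision, recall


# Hypothetical data: 10 truly positive mentions, 90 neutral ones.
gold = ["positive"] * 10 + ["neutral"] * 90

# Hypothetical classifier output: finds 6 of the 10 positives,
# misses 4, and wrongly flags 2 neutral mentions as positive.
predicted = (
    ["positive"] * 6 + ["neutral"] * 4 + ["positive"] * 2 + ["neutral"] * 88
)

precision, recall = precision_recall(gold, predicted, "positive")
print(precision)  # 0.75 -> 6 of the 8 flagged mentions were really positive
print(recall)     # 0.6  -> 6 of the 10 real positives were found
```

The same function works for negative mentions by passing "negative" as the label.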
After hearing feedback from our users, we decided that precision was actually much more important than recall. Basically, this means that users are less bothered by the classifier accidentally missing some positive mentions, as long as the mentions it does find are genuinely positive.
The hybrid system was slightly better at finding all the positive mentions than the rules-based system, though the rules-based one is considerably better (22% better) at making sure the mentions it does classify are accurate. We think that’s a stronger characteristic to have.
Let us know what you think about sentiment analysis and how you think it should be approached. To find out more about how it works and what our process is here at Brandwatch, don’t hesitate to get in touch.