
Product Update: Random Tagging

This weekend will see another wave of improvements made to the Brandwatch app. As usual, we’ve compiled a quick rundown of how you can use them and why you might want to.

The smaller things we’ve been working on are in many ways just as significant as the larger features we have been introducing over the past few months.

This release speeds up a number of components, especially the Top Sites and History features.

We’ve also expanded our coverage of travel sites like TripAdvisor, Expedia and Travelocity, in particular so that reviews and comments on those sites are extracted properly. Our LinkedIn ‘answers’ coverage has also been enhanced.

However, the most visible feature added is the new random tagging option we’ve introduced to the mentions component.

This option lets users take a random sample from a data set. In this example, we have almost 26,000 mentions for Adobe, far too many to check manually for sentiment accuracy and other information.

After selecting all of the desired instances in the mentions tab, we can assign a tag to the selection. With the update, it is now possible to apply the tag to only a small portion of the data.

The next menu will prompt the user to tag only a random sample, sized either by percentage or by absolute volume.
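To make the two sizing options concrete, here is a minimal Python sketch of drawing a random sample by percentage or by absolute volume. It is purely illustrative and is not how the Brandwatch app is implemented; the function name random_sample and its parameters are hypothetical.

```python
import random


def random_sample(mention_ids, percentage=None, count=None, seed=None):
    """Return a random subset of mention_ids (hypothetical helper).

    Exactly one of `percentage` (0-100) or `count` (absolute volume)
    must be given. Sampling is without replacement, so no mention
    appears twice in the result.
    """
    if (percentage is None) == (count is None):
        raise ValueError("specify exactly one of percentage or count")

    ids = list(mention_ids)
    if percentage is not None:
        if not 0 <= percentage <= 100:
            raise ValueError("percentage must be between 0 and 100")
        count = round(len(ids) * percentage / 100)

    # Never ask for more mentions than exist, or fewer than zero.
    count = max(0, min(count, len(ids)))
    return random.Random(seed).sample(ids, count)


# Example with 26,000 placeholder mention IDs.
all_mentions = range(26000)
by_percentage = random_sample(all_mentions, percentage=2, seed=1)
by_volume = random_sample(all_mentions, count=400, seed=1)
```

Passing a seed makes the draw repeatable, which is handy if two people need to look at the same sample.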

A smaller data set is much easier to work with and, because it is drawn at random, is still representative of the larger data set. Despite our best attempts, and those of other companies, it’s no secret that automated categorisation isn’t perfect, and that robots haven’t overtaken humans in intelligence (yet!).

With this in mind, although it’s incredibly useful to look at large data sets for patterns, trends and other interesting insights, when the data needs to be 100% accurate at the mention level it can require manual validation.

This is when either users or our team of analysts go through the data and verify metadata, such as sentiment and topics, as well as manually looking for themes.

Sampling the data to a manageable volume before performing human mark-up will present users with a clean, accurate segment of data that will represent the total volume very closely.

Our report writing experts have found that around 400 mentions is the golden number for sample sizes. Regardless of whether a data set is one thousand or one million entries large, no more than 500 mentions will be required for a random sample to represent the original data set at a confidence level of around 95%, with a margin of error of roughly five percentage points.
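The figure of around 400 lines up with the standard sample-size formula for estimating a proportion. The sketch below is an illustration under stated assumptions, not the method our report writers necessarily use: it assumes a 95% confidence level (z = 1.96), a margin of error of five percentage points, and the most conservative proportion of 0.5. The name sample_size is hypothetical.

```python
import math


def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Sample size needed to estimate a proportion (hypothetical helper).

    Uses Cochran's formula with a finite population correction.
    Assumed defaults: z = 1.96 (95% confidence), margin = 0.05,
    p = 0.5 (the worst case, which gives the largest sample).
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Shrink it for a finite population.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


small = sample_size(1_000)       # one thousand mentions
large = sample_size(1_000_000)   # one million mentions
```

Under these assumptions the result is 278 for one thousand mentions and 385 for one million, so the required sample barely grows with the size of the data set and stays well under 500.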

While this feature is valuable to users looking to get a fair impression of a large query, or to examine a comfortable amount of individual mentions at random, it is important to exercise caution when using sampled data.

A graph of sampled data plotted over a long period of time will include gaps (there may not be a mention for every day in the period, for example), which could prove hazardous if you’re trying to spot chronological insights in the data. For this reason, it’s best to use random tagging only for general overviews and snapshots of a query, rather than in-depth analysis.
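One way to see how serious the gaps are is to count the days in a period that have no mentions left after sampling. This is a minimal sketch that assumes you have the publication dates of the sampled mentions to hand; the function name and the example dates are hypothetical.

```python
from datetime import date, timedelta


def days_without_mentions(sampled_dates, start, end):
    """List every day from start to end (inclusive) with no sampled mention."""
    present = set(sampled_dates)
    missing = []
    day = start
    while day <= end:
        if day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing


# Made-up dates for four sampled mentions in one week.
sampled = [date(2012, 3, 1), date(2012, 3, 2), date(2012, 3, 2), date(2012, 3, 5)]
gaps = days_without_mentions(sampled, date(2012, 3, 1), date(2012, 3, 7))
```

If a large share of the days come back empty, a daily chart of the sample is unlikely to be reliable and a weekly or monthly view may be safer.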

Remember that random tagging can also be done after an in-query search or after filtering, meaning you can get a random sample of a specific type of mention, not just a sample of all your data.

It’s also a handy way to split a workload evenly and at random if you’re validating or otherwise working with lots of data.
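The idea of an even random split can be sketched in a few lines of Python. This is an illustration only, with hypothetical names; in the app you would apply a separate random tag for each person instead.

```python
import random


def split_workload(mention_ids, reviewers, seed=None):
    """Shuffle mentions, then deal them out to reviewers in turn.

    Each mention goes to exactly one reviewer, and the share sizes
    differ by at most one.
    """
    ids = list(mention_ids)
    random.Random(seed).shuffle(ids)
    return {
        name: ids[i::len(reviewers)]
        for i, name in enumerate(reviewers)
    }


# Ten placeholder mentions shared between three hypothetical analysts.
shares = split_workload(range(10), ["Ana", "Ben", "Cat"], seed=42)
```

Shuffling before dealing means nobody ends up with only the oldest or only the newest mentions.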

All your tags, whether they are deliberate segmentations of the data or random samples you’ve made, are now available in the controls panel, allowing you to filter the data by tag(s) in any of the components in Brandwatch.

You can also rename and filter them in the mentions tab, simply by clicking on ‘Tag Filter’.

Filtering by tags also means you can use our other components, like charting and authors, with the sampled data.

If you think of any interesting ways to use random tagging, or have ideas about any of the other developments we work on at Brandwatch, please get in touch.