We have been developing and improving Brandwatch for several years, so as you’d expect, what goes on behind the scenes is extremely complex. This section is a general overview of how everything works: how we collect the data, how we clean and analyze it, and finally how you can access and use it.

For the super keen ones among you who want to know more, please do not hesitate to contact us – we love talking geek.

Gathering Data

Brandwatch is all about monitoring what people say about your brand, products, competitors, industry or any related topics. Queries are the Boolean search strings that tell Brandwatch what to look for. Our crawlers inspect every corner of the social web to find the data you’re interested in, and constantly revisit those sources to check for new data as it is generated.
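To give a feel for how a Boolean search string decides whether a piece of text is relevant, here is a minimal sketch in Python. It is an illustration only, not Brandwatch's query syntax or engine: the nested-tuple query format and the names `tokens`, `matches` and `QUERY` are assumptions made for this example.

```python
import re


def tokens(text):
    """Lower-case the text and return its words as a set."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def matches(query, text_tokens):
    """Evaluate a nested Boolean query against a set of word tokens.

    A query is either a plain keyword (str) or a tuple whose first
    element is "AND", "OR" or "NOT", followed by sub-queries.
    (Hypothetical format, chosen only to keep the sketch short.)
    """
    if isinstance(query, str):
        return query.lower() in text_tokens
    op, *args = query
    if op == "AND":
        return all(matches(arg, text_tokens) for arg in args)
    if op == "OR":
        return any(matches(arg, text_tokens) for arg in args)
    if op == "NOT":
        return not matches(args[0], text_tokens)
    raise ValueError(f"unknown operator: {op}")


# Roughly: jaguar AND (car OR dealership) AND NOT cat
QUERY = ("AND", "jaguar", ("OR", "car", "dealership"), ("NOT", "cat"))

print(matches(QUERY, tokens("Test drove the new Jaguar car today")))
print(matches(QUERY, tokens("A jaguar is a big cat, not a car")))
```

The first sentence satisfies the query, while the second is excluded by the `NOT cat` clause even though it mentions both "jaguar" and "car".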

Unlike many providers, we crawl and collect our own data with our own technology. Watch this video from the head of our backend development team to discover more about how Brandwatch crawls the web.

We crawl over 70 million sources including blogs, forums, news sites and, of course, major social networks across the world. Plus, we’re a Twitter Certified Product, meaning we have full Twitter coverage in real-time. Our Channels feature also allows the tracking of any public Facebook page with no need for admin rights.

Brandwatch can identify and deliver conversations in over 27 languages, and we’re constantly adding more to our crawling capabilities.

All of this means that whatever you’re tracking, Brandwatch gives you comprehensive coverage of the online content you need.

Cleaning Data

Clean data is critical to successful social media monitoring. Many other systems simply provide you with masses of data, taking little care over the relevancy of that data to your interests. This means your data needs to be manually validated and cleaned, costing your team precious time and resources.

To avoid this, we continue to innovate with our intelligent automated systems which address the sources of irrelevant data. We are dedicated to eliminating the following from your data:

Spam:
Several measures are in place to prevent spam from appearing in your datasets, including multiple layers of smart pattern-matching algorithms, keyword-density checks to detect SEO text, and the ability to report problem sites yourself and add them to our blacklist.
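As a rough idea of what a keyword-density check can look like, the sketch below flags text in which a single word accounts for an unusually large share of all words. The function names, the 30% threshold and the minimum length are assumptions for illustration; this is not the production check described above.

```python
import re
from collections import Counter


def keyword_density(text):
    """Return the share of all words taken up by the most frequent word."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    top_count = Counter(words).most_common(1)[0][1]
    return top_count / len(words)


def looks_like_seo_text(text, threshold=0.3, min_words=10):
    """Flag text where one word dominates (hypothetical threshold values)."""
    word_count = len(re.findall(r"[a-z0-9']+", text.lower()))
    return word_count >= min_words and keyword_density(text) > threshold


spam = ("cheap watches buy cheap watches online cheap watches "
        "best cheap watches deals")
normal = ("I bought a watch last week and the strap already "
          "feels a little loose")

print(looks_like_seo_text(spam))
print(looks_like_seo_text(normal))
```

A real system would combine a signal like this with others, since a short, genuine post can also repeat a word several times.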

Adverts and navigation text:
We developed a specific algorithm that recognizes which parts of a web page are real content, as opposed to navigation text and adverts.

Duplicates:
Our own fingerprinting technology ensures that the same mention is not picked up twice, so counts stay accurate and numbers are not inflated.

Combined, these measures make Brandwatch a best-in-class system for data quality.
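The general idea behind fingerprinting for duplicate detection can be sketched in a few lines of Python. This is a simplified stand-in that hashes a normalized copy of each mention; the actual fingerprinting technology is not described here, and the names `fingerprint` and `deduplicate` are assumptions for this example.

```python
import hashlib
import re


def fingerprint(text):
    """Hash a normalized copy of the text (case and punctuation ignored)."""
    normalized = re.sub(r"\W+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(mentions):
    """Keep only the first mention seen for each fingerprint."""
    seen = set()
    unique = []
    for mention in mentions:
        fp = fingerprint(mention)
        if fp not in seen:
            seen.add(fp)
            unique.append(mention)
    return unique


mentions = [
    "Love my new phone!",
    "love my new   phone",
    "Hate my new phone",
]
print(deduplicate(mentions))
```

Exact hashing only catches copies that are identical after normalization; catching near-duplicates would need a fuzzier technique, which is beyond this sketch.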

Analyzing Data

Each web page found by the Brandwatch crawler goes through a series of automated analysis processes.

Language detection:
Our Natural Language Processing algorithms recognize the language used in the pages we crawl; this improves the accuracy of the other analysis processes and means that customers can filter their data by language.

Date and time:
Our crawler applies several forms of logic to identify the date and time each piece of content was posted, meaning that you can accurately filter your brand’s mentions by date range.

Location:
Detecting the location of a source isn’t always as simple as tracking the location of the domain, so we use multiple intelligent techniques to look for evidence of location, down to the country, state, region and city level.

Sentiment analysis:
We maintain industry-standard sentiment analysis systems based on collections of pre-defined rules, as well as cutting-edge bespoke classifiers that we can manually train on bodies of text to tailor them to your queries. As with people, the more specialized a sentiment analysis classifier is, the better it will understand new information.
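For readers curious about what "pre-defined rules" can mean in practice, here is a deliberately tiny rule-based sketch: it counts positive and negative words and flips the polarity of a word that directly follows a negator. The word lists and the function name `rule_based_sentiment` are assumptions for illustration and are far simpler than any real rule set or trained classifier.

```python
import re

# Tiny illustrative word lists (assumptions, not a real lexicon).
POSITIVE = {"love", "great", "excellent", "happy", "good"}
NEGATIVE = {"hate", "terrible", "awful", "bad", "broken"}
NEGATORS = {"not", "never", "no"}


def rule_based_sentiment(text):
    """Classify text as positive, negative or neutral using simple rules."""
    words = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, word in enumerate(words):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (word in POSITIVE) - (word in NEGATIVE)
        # Rule: a negator directly before the word flips its polarity.
        if polarity and i > 0 and words[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(rule_based_sentiment("I love this great phone"))
print(rule_based_sentiment("The screen is not good"))
print(rule_based_sentiment("It arrived on Tuesday"))
```

Generic rules like these miss domain-specific language, which is the gap that classifiers trained on text relevant to a particular query are meant to close.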

Topic analysis:
Frequent topics of conversation are continuously highlighted within each query’s recent mentions, revealing new or growing trends among the query’s online coverage. This mechanism is much smarter than a simple ‘word cloud’, as it identifies the grammatical role of each word and joins words into common phrases, resulting in more relevant combinations.
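The difference between counting single words and counting phrases can be shown with a short sketch. Note the substitution: instead of identifying the grammatical role of each word, this example simply skips pairs containing common function words, using a small hand-written stopword list. The names `STOPWORDS` and `top_phrases` are assumptions for illustration.

```python
import re
from collections import Counter

# Small stand-in for grammatical analysis: words unlikely to be topics.
STOPWORDS = {
    "the", "a", "an", "is", "are", "be", "could", "on", "this",
    "that", "of", "to", "and", "in", "it", "my", "better",
}


def top_phrases(mentions, n=3):
    """Return the n most frequent two-word phrases across all mentions."""
    counts = Counter()
    for text in mentions:
        words = re.findall(r"[a-z']+", text.lower())
        # Pair each word with its successor and skip pairs with stopwords.
        for first, second in zip(words, words[1:]):
            if first not in STOPWORDS and second not in STOPWORDS:
                counts[f"{first} {second}"] += 1
    return counts.most_common(n)


mentions = [
    "The battery life is amazing",
    "Battery life could be better",
    "Great battery life on this phone",
]
print(top_phrases(mentions, n=1))
```

A plain word cloud would report "battery" and "life" separately; joining them into "battery life" gives a topic that is easier to act on.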

Presenting Data

Giving our customers the tools to make use of all this clever behind-the-scenes technology is of paramount importance to us here at Brandwatch.

The interface is built with the latest web app technology, making it incredibly snappy and light on your system resources. It runs in all modern browsers, and you don’t need any additional software or plugins.

Our dashboards are renowned for their intuitiveness and flexibility:

  • Simple, default dashboards show you a quick overview of key aspects of your dataset, ideal for reporting or ad hoc monitoring.
  • You can also create fully customized dashboards from scratch using our array of powerful components and filters, giving power users all the tools they need.
  • All data can be exported, either as a whole set or just the chunks you need. Exports come in Excel and CSV formats, and charts can also be exported directly in a variety of formats and sizes.

Access to our Application Programming Interface (API) is also available to clients, depending on your package. This is an incredibly powerful asset if you need to integrate Brandwatch data into your own system alongside other metrics and datasets.

Both the Brandwatch app and the API work by making calls to our data centre, where we store all our data on a large, distributed and redundant collection of servers. This guarantees availability and performance.

Listen, analyze and act with confidence

Find out how our technology can help you find more value from social data.