Tue 1 Dec 2015 09:40 AM

What’s in a Tweet? Much more than 140 characters of data

Tatjana de Kerros, founder of data science company Tagmemics, explains how businesses can mine and analyse huge amounts of information from a single Tweet

Companies that are seeking to build big data analytical capabilities should look no further than a Tweet.

A Tweet is not limited to 140 characters. On the contrary, it can yield a vast amount of data intelligence that goes far beyond the content of the Tweet. In fact, this data is far more valuable to companies seeking to extract and analyse social data than the actual content. The content of a Tweet is temporal: it has a sell-by date after which the data is no longer actionable. The same is not true of a Tweet’s metadata, on which big data capabilities are built.

Twitter metadata refers to unique pieces of information that can be extracted from a single Tweet. Did you know that a single Tweet holds 150 unique pieces of metadata that can be extracted through access to Twitter’s API? This library of data continues to grow: since 2013, third-party Twitter developers have been able to attach additional metadata to a post, such as data contained in images, videos, polls and tags.

Just as companies such as Google and SalesforceIQ are rolling out new products to integrate the metadata capabilities of email content, Tagmemics, a data science start-up that provides data mining and semantic analytics, is helping companies mine Twitter data and build actionable predictive insight from the volume of data generated by Tweets.

Put simply, the metadata of a single Tweet is a unique pool of information. Beyond the ‘content’ (text), it contains a unique user ID, creation date, timestamp, and the unique ID and username for all replies, favourites, mentions and RTs the Tweet has received. But the real seismic pool of data intelligence lies beyond what the eye can see. Each Tweet contains the date the account was created, the number of Tweets by the user, the number of followers and who the user is following, as well as the number of favourites this user has. With the right data science solution, this type of metadata can be used to cross-correlate user interests, affinities and, most importantly, their digital footprint.
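To make the fields listed above more concrete, here is a minimal sketch that reads them from a single Tweet held as a Python dictionary. The key names follow the Tweet JSON layout documented for Twitter's REST API v1.1. The sample values and the `summarise_tweet` helper are invented for illustration and are not a description of how Tagmemics works.

```python
from datetime import datetime

# Hypothetical sample Tweet: keys mirror the v1.1 Tweet JSON, values are invented.
sample_tweet = {
    "id_str": "1000000000000000001",
    "created_at": "Tue Dec 01 09:40:00 +0000 2015",
    "text": "Big data starts with a single Tweet #bigdata",
    "retweet_count": 12,
    "favorite_count": 7,
    "user": {
        "id_str": "42",
        "screen_name": "example_user",
        "created_at": "Mon Jan 05 08:00:00 +0000 2009",
        "statuses_count": 5400,    # number of Tweets by the user
        "followers_count": 1200,   # number of followers
        "friends_count": 300,      # number of accounts the user follows
        "favourites_count": 85,    # number of favourites the user has
    },
}

# Timestamp format used in v1.1 Tweet JSON, e.g. "Tue Dec 01 09:40:00 +0000 2015".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def summarise_tweet(tweet):
    """Collect the Tweet-level and account-level metadata into one flat record."""
    user = tweet["user"]
    posted = datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT)
    joined = datetime.strptime(user["created_at"], TWITTER_TIME_FORMAT)
    return {
        "tweet_id": tweet["id_str"],
        "posted_at": posted.isoformat(),
        "retweets": tweet.get("retweet_count", 0),
        "favourites_received": tweet.get("favorite_count", 0),
        "user_id": user["id_str"],
        "username": user["screen_name"],
        "account_created": joined.date().isoformat(),
        "account_age_days": (posted - joined).days,
        "tweets_by_user": user["statuses_count"],
        "followers": user["followers_count"],
        "following": user["friends_count"],
        "favourites_given": user["favourites_count"],
    }


if __name__ == "__main__":
    for field, value in summarise_tweet(sample_tweet).items():
        print(f"{field}: {value}")
```

A flat record of this kind is a convenient starting point for loading many Tweets into a table and comparing accounts against each other.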

What most don’t realize is that a Tweet is an important geo-location tool for user data. Whilst users can opt in or out of showing their location (if they opt in, it appears when extracting Twitter data), the increased use of mobile applications means that, more often than not, a geo-tag is provided.

Each Tweet also contains a timezone and offset, making it possible to identify region-specific user behaviours, sentiment and conversations. Coupled with language detection, a Tweet also provides the place ID, the country, and the type of location (i.e. city or neighbourhood).

Finally, Twitter data extraction tools such as the one developed by Tagmemics make it possible to extract the latitude and longitude of the location, as well as all tags used in a Tweet. Tagmemics then extracts and analyses this data to track and predict consumer behaviour, sentiment, market fluctuations and product development.
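The location fields described above can be sketched in the same way. The example below assumes the v1.1 Tweet JSON layout, in which `coordinates` is a GeoJSON point ordered longitude first, `place` carries the place ID, type and country, and the timezone and UTC offset sit on the `user` object. The sample values and the `extract_location` helper are hypothetical, and every field is treated as optional because users can opt out of sharing a location.

```python
# Hypothetical geo-tagged Tweet: keys mirror the v1.1 Tweet JSON, values are invented.
geo_tweet = {
    "lang": "en",
    "coordinates": {"type": "Point", "coordinates": [55.2708, 25.2048]},  # [lon, lat]
    "place": {
        "id": "hypothetical-place-id",
        "place_type": "city",
        "full_name": "Dubai, United Arab Emirates",
        "country": "United Arab Emirates",
    },
    "entities": {"hashtags": [{"text": "bigdata"}, {"text": "analytics"}]},
    "user": {"time_zone": "Abu Dhabi", "utc_offset": 14400},  # offset in seconds
}


def extract_location(tweet):
    """Return the location-related metadata of a Tweet, using None where absent."""
    point = tweet.get("coordinates") or {}
    place = tweet.get("place") or {}
    user = tweet.get("user") or {}
    entities = tweet.get("entities") or {}

    # GeoJSON stores positions as [longitude, latitude], not the other way round.
    longitude, latitude = point.get("coordinates") or (None, None)

    return {
        "latitude": latitude,
        "longitude": longitude,
        "place_id": place.get("id"),
        "place_type": place.get("place_type"),
        "country": place.get("country"),
        "language": tweet.get("lang"),
        "time_zone": user.get("time_zone"),
        "utc_offset_seconds": user.get("utc_offset"),
        "hashtags": [tag["text"] for tag in entities.get("hashtags", [])],
    }


if __name__ == "__main__":
    print(extract_location(geo_tweet))
```

Returning `None` for missing fields, rather than raising an error, keeps Tweets without a geo-tag in the data set so they can still be analysed by timezone or language.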

Now imagine having this data not for one user, but for the hundreds of thousands who make up the volume of a conversation. Imagine being able to extract Twitter metadata not just from a hashtag, but from products, brands, competitors, locations and keywords, and to monitor changes over time. This is what social networks are: networks of users that enable you to identify and measure networks, volumes and footprints. An important part of data science is to identify these networks and predict behaviour over time. All from a single Tweet.
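As a small illustration of monitoring changes over time, the sketch below counts how often each hashtag appears per day across a batch of Tweets. It assumes the v1.1 Tweet JSON layout for `created_at` and `entities.hashtags`. The three sample Tweets and the `hashtag_volume_by_day` helper are invented, and a real pipeline would of course work on far larger volumes.

```python
from collections import Counter
from datetime import datetime

# Timestamp format used in v1.1 Tweet JSON.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

# Hypothetical batch of Tweets, reduced to the two fields this sketch needs.
tweets = [
    {"created_at": "Tue Dec 01 09:40:00 +0000 2015",
     "entities": {"hashtags": [{"text": "BigData"}]}},
    {"created_at": "Tue Dec 01 17:05:00 +0000 2015",
     "entities": {"hashtags": [{"text": "bigdata"}, {"text": "analytics"}]}},
    {"created_at": "Wed Dec 02 08:15:00 +0000 2015",
     "entities": {"hashtags": [{"text": "bigdata"}]}},
]


def hashtag_volume_by_day(batch):
    """Count occurrences of each hashtag per calendar day (case-insensitive)."""
    counts = Counter()
    for tweet in batch:
        day = datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT).date()
        for tag in tweet.get("entities", {}).get("hashtags", []):
            counts[(day.isoformat(), tag["text"].lower())] += 1
    return counts


if __name__ == "__main__":
    volume = hashtag_volume_by_day(tweets)
    for (day, tag), count in sorted(volume.items()):
        print(day, tag, count)
```

The same grouping could be keyed on country, place type or language instead of hashtag to compare regions rather than topics.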

As Twitter continues to introduce new features to its direct messaging, content algorithms and character limits, Twitter mining and analytics are becoming key to reaping the value of the data the social network generates.

Social data is the missing link to big data, and the ability to use social metadata will continue to power products, brands and governments, and to organically build big data capabilities.

Tatjana de Kerros is the founder of Tagmemics, the first data science start-up in the Middle East to offer data mining and advanced semantic analytics in Arabic and English. Tagmemics provides the power to mine and apply sentiment and intent analysis across social and web data for brands, media, agencies and government, maximizing big data and monetization opportunities.