What’s in a Tweet? Much more than 140 characters of data

Tatjana de Kerros, founder of data science company Tagmemics.com, explains how businesses can mine and analyse huge amounts of information from a single Tweet

Companies that are seeking to build big data analytical capabilities should look no further than a Tweet.

A Tweet is not limited to 140 characters. On the contrary, it can yield a vast amount of data intelligence that goes far beyond the content of the Tweet. In fact, this data is far more valuable to companies seeking to extract and analyse social data than the actual content. The content of a Tweet is temporal: it has a sell-by date after which the data is no longer actionable. The same is not true of a Tweet’s metadata, on which big data capabilities are built.

Twitter metadata refers to unique pieces of information that can be extracted from a single Tweet. Did you know that a single Tweet holds 150 unique pieces of metadata that can be extracted through access to Twitter’s API? And this library of data continues to grow: since 2013, third-party Twitter developers have been able to add further metadata to a post, such as data contained in images, videos, polls and tags.

Just as companies such as Google and SalesforceIQ are rolling out new products to integrate the metadata capabilities of email content, Tagmemics.com – a data science start-up that provides data mining and semantic analytics – is helping companies mine Twitter data and build actionable predictive insight from the volume of data generated by Tweets.

Put simply, the metadata of a single Tweet is a unique pool of information. Beyond the ‘content’ (text), it contains a unique user ID, creation date, timestamp, and the unique ID and username for all replies, favourites, mentions and RTs the Tweet has received. But the real seismic pool of data intelligence lies beyond what the eye can see. Each Tweet contains the date the account was created, the number of Tweets by the user, the number of followers and who the user is following, as well as the number of favourites this user has. With the right data science solution, this type of metadata can be used to cross-correlate user interests, affinities and, most importantly, their digital footprint.
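As a rough illustration of the account-level fields described above, here is a minimal Python sketch. It assumes a Tweet has already been retrieved as a dictionary shaped like a Twitter REST API v1.1 Tweet object; the sample values and the helper names `parse_twitter_time` and `user_profile` are invented for this example and are not Tagmemics' method.

```python
from datetime import datetime

# Hypothetical sample, shaped like a Twitter REST API v1.1 Tweet object.
# Every value below is invented for illustration.
SAMPLE_TWEET = {
    "id_str": "1001",
    "created_at": "Mon May 29 09:21:00 +0000 2017",
    "text": "Trying the new #coffee place with @example_friend",
    "retweet_count": 3,
    "favorite_count": 5,
    "user": {
        "id_str": "42",
        "screen_name": "example_user",
        "created_at": "Tue Mar 21 20:50:14 +0000 2006",
        "statuses_count": 1200,
        "followers_count": 350,
        "friends_count": 180,
        "favourites_count": 75,
    },
}

# Timestamp layout assumed for v1.1-style "created_at" strings.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_twitter_time(value):
    """Parse a v1.1-style timestamp into a timezone-aware datetime."""
    return datetime.strptime(value, TWITTER_TIME_FORMAT)


def user_profile(tweet):
    """Collect the account-level metadata carried alongside the Tweet text."""
    user = tweet["user"]
    posted = parse_twitter_time(tweet["created_at"])
    joined = parse_twitter_time(user["created_at"])
    return {
        "tweet_id": tweet["id_str"],
        "user_id": user["id_str"],
        "username": user["screen_name"],
        "account_age_days": (posted - joined).days,
        "tweets": user["statuses_count"],
        "followers": user["followers_count"],
        "following": user["friends_count"],
        "favourites": user["favourites_count"],
    }


profile = user_profile(SAMPLE_TWEET)
print(profile)
```

None of these fields come from the 140 characters of text; they travel with the Tweet as metadata.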

What most people don’t realize is that a Tweet is an important geo-location tool for user data. Whilst users can opt in or out of showing their location (if they opt in, it appears when Twitter data is extracted), the increased use of mobile applications means that, more often than not, a geo-tag is provided.

Each Tweet also contains a timezone and offset, making it possible to identify region-specific user behaviours, sentiment and conversations. Coupled with language detection, a Tweet also provides the place ID, the country, and the type of location (e.g. city, neighbourhood).
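To show how a timezone offset turns a UTC timestamp into region-specific behaviour, here is a small Python sketch. It assumes a v1.1-style Tweet dictionary in which the user object carries `time_zone` and `utc_offset` (in seconds) and the Tweet carries a detected `lang`; the sample values and the function name `local_post_time` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sample; the values are invented for illustration.
sample = {
    "created_at": "Mon May 29 09:21:00 +0000 2017",
    "lang": "ar",
    "user": {"time_zone": "Abu Dhabi", "utc_offset": 14400},
}


def local_post_time(tweet):
    """Shift the UTC creation time into the user's own UTC offset."""
    utc_time = datetime.strptime(tweet["created_at"], "%a %b %d %H:%M:%S %z %Y")
    offset_seconds = tweet["user"].get("utc_offset")
    if offset_seconds is None:
        # No offset available: fall back to the UTC time.
        return utc_time
    return utc_time.astimezone(timezone(timedelta(seconds=offset_seconds)))


local = local_post_time(sample)
print(sample["lang"], sample["user"]["time_zone"], local.isoformat())
```

Grouping Tweets by local hour rather than UTC hour is what allows behaviour in one region to be compared with another.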

Finally, Twitter data mining extraction tools such as the one developed by Tagmemics make it possible to extract the latitude and longitude of the location, as well as all tags used in a Tweet. Tagmemics then extracts and analyses this data to track and predict consumer behaviour, sentiment, market fluctuations and product development.
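The location and tag fields mentioned above can be read in a similar way. The Python sketch below is an illustration only, not the Tagmemics tool: it assumes a v1.1-style Tweet dictionary with a GeoJSON `coordinates` point (longitude first, then latitude), a `place` object and an `entities.hashtags` list. The sample values, including the place ID, are invented.

```python
# Hypothetical sample; the values are invented for illustration.
sample = {
    "coordinates": {"type": "Point", "coordinates": [55.2708, 25.2048]},
    "place": {
        "id": "hypothetical-place-id",
        "place_type": "city",
        "name": "Dubai",
        "country": "United Arab Emirates",
        "country_code": "AE",
    },
    "entities": {"hashtags": [{"text": "coffee"}, {"text": "Dubai"}]},
}


def location_and_tags(tweet):
    """Return latitude, longitude, place details and hashtags, if present."""
    point = tweet.get("coordinates") or {}
    # GeoJSON points are ordered [longitude, latitude].
    lon, lat = point.get("coordinates", (None, None))
    place = tweet.get("place") or {}
    hashtags = (tweet.get("entities") or {}).get("hashtags", [])
    return {
        "latitude": lat,
        "longitude": lon,
        "place_id": place.get("id"),
        "place_type": place.get("place_type"),
        "country": place.get("country"),
        "tags": [tag["text"].lower() for tag in hashtags],
    }


geo = location_and_tags(sample)
print(geo)
```

Because users can opt out of sharing location, every lookup falls back to `None` or an empty list rather than failing.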

Now imagine having this data not for one user but for hundreds of thousands, depending on the volume of conversation. Imagine being able to extract Twitter metadata not just from a hashtag but from products, brands, competitors, locations and keywords, and to monitor changes over time. This is what social networks are: networks of users that enable you to identify and measure networks, volumes and footprints. An important part of data science is to identify these networks and predict behaviour over time. All from a single Tweet.
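Monitoring changes over time can start with something as simple as counting tags per day across a collection of Tweets. The Python sketch below assumes a list of v1.1-style Tweet dictionaries; the three sample Tweets and the function name `daily_tag_counts` are hypothetical, and a real pipeline would read far larger volumes from the API.

```python
from collections import Counter, defaultdict
from datetime import datetime

TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

# Hypothetical sample Tweets; the values are invented for illustration.
tweets = [
    {"created_at": "Sun May 28 10:00:00 +0000 2017",
     "user": {"id_str": "1"},
     "entities": {"hashtags": [{"text": "coffee"}]}},
    {"created_at": "Mon May 29 09:21:00 +0000 2017",
     "user": {"id_str": "2"},
     "entities": {"hashtags": [{"text": "Coffee"}, {"text": "Dubai"}]}},
    {"created_at": "Mon May 29 18:00:00 +0000 2017",
     "user": {"id_str": "1"},
     "entities": {"hashtags": [{"text": "coffee"}]}},
]


def daily_tag_counts(items):
    """Count hashtag mentions per calendar day (UTC), ignoring case."""
    counts = defaultdict(Counter)
    for tweet in items:
        day = datetime.strptime(tweet["created_at"], TIME_FORMAT).date().isoformat()
        for tag in (tweet.get("entities") or {}).get("hashtags", []):
            counts[day][tag["text"].lower()] += 1
    return counts


counts = daily_tag_counts(tweets)
unique_users = {tweet["user"]["id_str"] for tweet in tweets}

for day in sorted(counts):
    print(day, dict(counts[day]))
print("unique users:", len(unique_users))
```

The same grouping could be keyed on brand names, keywords or locations instead of hashtags.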

As Twitter continues to introduce new features to its direct messaging, content algorithms and character limits, Twitter mining and analytics are becoming key to reaping the value of the data the social network generates.

Social data is the missing link to big data, and the ability to use social metadata will continue to power products, brands and governments, and to organically build big data capabilities.

Tatjana de Kerros is the founder of Tagmemics.com, the first data science start-up in the Middle East to offer data mining & advanced semantic analytics in Arabic and English. Tagmemics provides the power to mine and apply sentiment and intent analysis across social and web data for brands, media, agencies and government, maximizing big data and monetization opportunities.
