Last month I attended a digital advertising conference here in NYC which was swarming with social media benchmarking vendors. If you wanted to learn more about software that measures how your company or brand is faring on Twitter, Facebook, or Pinterest, then this was the place to be. These buzz-monitoring apps make perfect sense for consumer-focused product companies (sneakers, clothing, soft drinks), but I didn’t necessarily connect the dots between social media content and big data for financial service firms.
That is, until I saw this article in American Banker on big data in the banking world. Specifically, BNY Mellon Bank ($1.4 trillion assets) is launching its own big data project, which will involve collecting and aggregating transactional information from customers across many different systems: their web site, ATM network, customer service, trading desks, and any other relevant interaction points.
The goal is to pull these separate data streams into a centralized data store, and then mine it to learn customer behaviors and preferences. The results will be fed back to their marketing department to help pinpoint customers who would most likely be interested in new bank offerings. BNY Mellon will also use this data to gain more complete awareness of customer needs in their future interactions with the bank.
It doesn’t stop there. BNY Mellon has extended the scope of its big data project beyond its own internal IT operations by harvesting content from the social world—blogs, Twitter, LinkedIn, and other online forums.
How much data can be found in Tweets and posts that would be useful for banks and financial companies?
This is hard to gauge. But according to an IDC report referenced in the American Banker piece, 1.8 trillion gigabytes of data was generated in 2012, with the majority of that considered unstructured social data.
These numbers for social data sound about right. Earlier in the year, Twitter reported its users were sending 340 million tweets per day. Doing a quick back-of-the-envelope calculation (340 million tweets x 140 characters x 365 days), I come up with at least 10,000 gigabytes of text a year from Twitter alone. Then if you start adding LinkedIn with its 175 million users, Facebook’s close to 1 billion users, and the millions of active blogs out there, it’s easy to see how unstructured text from social begins to reach the volumes in the IDC range.
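For anyone who wants to check the envelope math, here is a minimal sketch in Python. The tweet count and the 140-character limit come from the figures above; the one-byte-per-character rate and the decimal gigabyte (10^9 bytes) are my own simplifying assumptions, and the function name is just illustrative. Since 140 characters is the maximum tweet length, this is an estimate of tweet text only, ignoring metadata.

```python
# Back-of-the-envelope estimate of yearly tweet text volume.
TWEETS_PER_DAY = 340_000_000   # figure reported by Twitter, per the post
MAX_CHARS_PER_TWEET = 140      # Twitter's character limit at the time
DAYS_PER_YEAR = 365
BYTES_PER_CHAR = 1             # assumption: plain single-byte characters
BYTES_PER_GB = 10**9           # assumption: decimal gigabytes


def yearly_tweet_text_gb(
    tweets_per_day: int = TWEETS_PER_DAY,
    chars_per_tweet: int = MAX_CHARS_PER_TWEET,
    days: int = DAYS_PER_YEAR,
) -> float:
    """Return the estimated tweet text volume for the period, in gigabytes."""
    total_bytes = tweets_per_day * chars_per_tweet * days * BYTES_PER_CHAR
    return total_bytes / BYTES_PER_GB


if __name__ == "__main__":
    print(f"{yearly_tweet_text_gb():,.0f} GB per year")  # prints "17,374 GB per year"
```

Under these assumptions the total lands around 17,000 gigabytes a year, comfortably above the 10,000 gigabyte floor.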
For large financial firms with millions of their own customers, filtering out, processing, and storing what’s relevant clearly falls in the big data solution space. The larger point is that banks are looking at this public data as an auxiliary treasure trove from which they can supplement their existing records with more granular details about their own customers, and perhaps even find potential new markets. Like everyone else, they are also concerned about their brand and the buzz around it.
Lessons learned? Here’s one: even those companies most closely associated with large traditional fixed-field databases (in this case, a financial institution, but also consider, say, insurance, power utilities, and telecom carriers) will by necessity also have to deal with petabytes of unstructured content in order to complete the big data puzzle.