Last month at Gigaom Structure Data, an interesting panel of big data experts shared innovative ways they’re using millions, billions, and even trillions of data points to build new products and services. This led me to consider that the volume of data we’re able to collect and store today is HUGE. In a single day on the Internet, PayPal processes over $315 million in transactions [1], Facebook takes in about 350 million uploaded images [2], and Twitter blasts out 400 million tweets from its users [3]. Multiply these numbers by 365, account for some data for each interaction, and you’re well into the petabyte range and beyond for total yearly storage. And that’s not even accounting for extra metadata!
Visualizing big data in dollars
When we talk about big data, just how big are the numbers? Let’s try visualizing them as dollars, a common measure of financial success. It might be relatively easy to imagine hundreds, thousands, and even millions, but what about billions? Or trillions? One trillion one-dollar bills stacked would reach about 68,000 miles into space, more than a quarter of the way to the moon. In 2013, we made over 2 trillion searches [4] on Google, so if those searches were one-dollar bills, the stack would reach more than halfway to the moon.
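If you want to check the dollar-bill stack yourself, here is a minimal sketch. The bill thickness (about 0.0043 inches) and the average Earth-Moon distance (about 238,855 miles) are reference values I'm assuming, not numbers from the panel, and the function names are just illustrative.

```python
# Rough check of the dollar-bill stack. The thickness and lunar distance
# below are assumed reference values, not figures from this post's sources.
BILL_THICKNESS_INCHES = 0.0043      # approximate thickness of a US banknote
INCHES_PER_MILE = 12 * 5280         # 63,360 inches in a mile
MOON_DISTANCE_MILES = 238_855       # average Earth-Moon distance


def stack_height_miles(num_bills: int) -> float:
    """Height in miles of a stack of `num_bills` banknotes."""
    return num_bills * BILL_THICKNESS_INCHES / INCHES_PER_MILE


def fraction_of_moon_distance(num_bills: int) -> float:
    """Stack height as a fraction of the average distance to the moon."""
    return stack_height_miles(num_bills) / MOON_DISTANCE_MILES


for label, bills in [("1 trillion", 10**12), ("2 trillion", 2 * 10**12)]:
    miles = stack_height_miles(bills)
    share = fraction_of_moon_distance(bills)
    print(f"{label} bills: {miles:,.0f} miles, {share:.0%} of the way to the moon")
```

Swap in a different bill count to see how far any other "big data" number would reach.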
Let’s scale down time from a year to a day…
Last year, when Twitter turned 7 years old, over 200 million active users were sending 400 million tweets per day. If you had 400 million dollars and spent $1,000 every day, it would take you about 1,096 years to spend all the money.
$400,000,000 / $1,000 per day = 400,000 days; 400,000 days / 365 days per year ≈ 1,096 years
…to an hour
Facebook users share over 41 million pieces of content every hour [5]. If you had 41 million dollars and spent $1,000 every day, it would take you about 112 years to spend all the money.
$41,000,000 / $1,000 per day = 41,000 days; 41,000 days / 365 days per year ≈ 112 years
…to a minute
Google receives over 2 million search queries a minute [6]. If you had 2 million dollars and spent $1,000 every day, it would take you about 5.5 years to spend all the money.
$2,000,000 / $1,000 per day = 2,000 days; 2,000 days / 365 days per year ≈ 5.5 years
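The three spending calculations above all follow the same pattern, so they can be sketched as one small function. The name `years_to_spend` and the 365-day year are my own simplifications for illustration.

```python
DAYS_PER_YEAR = 365  # same simplification used in the post (no leap years)


def years_to_spend(total_dollars: float, dollars_per_day: float = 1_000) -> float:
    """Years needed to spend `total_dollars` at a fixed daily rate."""
    days = total_dollars / dollars_per_day
    return days / DAYS_PER_YEAR


# Dollar amounts mirror the per-day, per-hour, and per-minute counts above.
examples = {
    "400 million (tweets per day)": 400_000_000,
    "41 million (Facebook shares per hour)": 41_000_000,
    "2 million (Google searches per minute)": 2_000_000,
}

for label, amount in examples.items():
    print(f"{label}: {years_to_spend(amount):,.1f} years")
```

Changing `dollars_per_day` is a quick way to get a feel for the scale: even at $10,000 a day, the 400 million would last more than a century.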
Visualizing big data over time
Now that we know just how big the numbers are when we talk about big data, let’s visualize what big data looks like over time, since that might reveal new trends. I took three organizations (PayPal, Facebook, and Twitter) and charted the dollars processed, photos uploaded, and tweets amassed per year. Here are the results:
Big data gets even bigger after 2009
PayPal, founded in 1998, had a steady 10-year rise, but notice the steep line from 2009 to 2011, when dollars processed increased by 60%. Facebook’s steep line shows a 73% increase in the number of photos uploaded from 2011 to 2013. Twitter’s big data growth was even more extreme or, I should say, exponential. In 2009, there were about 10 billion tweets, and the count has more than doubled every year since then, reaching the 120 billion mark in 2012. Perhaps this spurt was pushed by Oprah’s blessing when she joined Twitter in 2009, but more likely network effects were taking over.
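To put a number on Twitter’s “more than doubling,” here is a rough sketch that works out the yearly multiplier implied by the two endpoints quoted above. It assumes a constant growth rate between 2009 and 2012, which is my simplification and not something the chart guarantees.

```python
def annual_growth_factor(start_value: float, end_value: float, years: int) -> float:
    """Constant yearly multiplier that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years)


tweets_2009 = 10e9    # "about 10 billion tweets" in 2009, per the post
tweets_2012 = 120e9   # "the 120 billion mark" in 2012, per the post

# Assumes steady compounding across the three years between the endpoints.
factor = annual_growth_factor(tweets_2009, tweets_2012, years=2012 - 2009)
print(f"Implied growth: about {factor:.2f}x per year")
```

A factor above 2 means the count more than doubles each year under this assumption.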
Dollars processed, photos uploaded, and tweets are not remotely comparable, but what I’m attempting to demonstrate is that big data got really, really big over the past couple of years, with Twitter, a pure unstructured-data broadcasting service, leading the charge.
In fact, if this trend continues, PayPal, Facebook, and Twitter’s big data numbers will help them get over halfway to the moon, just like Google searches. And perhaps we need a new benchmark for total data, e.g., “we’ve reached 10 Twitters of storage”?
So why is big data significant?
Amassing data isn’t inherently valuable. What’s significant is the metadata that accompanies big data, which enables Facebook to suggest pages to like and Twitter to recommend people or organizations to follow. These suggestions are powerful in that they provide a positive user experience, yield important insights into nearly anything you can imagine, and, most importantly, can be the basis for a revenue-generating business model.
In a future post, we’ll explore the types of metadata these platforms might have and the innovation that becomes possible as a result. Stay tuned!