Big data is digital data that is unusually large (beyond the capacity of a traditional database) or generated at very high velocity, such as the data collected from telescopes or by social media providers. Big data is often unstructured: it doesn’t fit neatly into a predefined database schema. The exciting promise of big data is that it can be collected first and then analyzed for insight, patterns, and connections without knowing in advance what to look for.
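The "collect first, analyze later" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records in `RAW_RECORDS` and the helper `field_frequencies` are hypothetical, and a real system would read from distributed storage rather than an in-memory list.

```python
import json
from collections import Counter

# Hypothetical raw records, stored as-is with no shared, predefined schema.
RAW_RECORDS = [
    '{"source": "telescope", "object": "M31", "magnitude": 3.4}',
    '{"source": "social", "user": "ana", "text": "loving the new phone"}',
    '{"source": "social", "user": "raj", "text": "phone battery died again", "likes": 12}',
]

def field_frequencies(raw_lines):
    """Count how often each field name appears across loosely structured records."""
    counts = Counter()
    for line in raw_lines:
        record = json.loads(line)      # structure is discovered at read time
        counts.update(record.keys())   # tally whatever fields this record has
    return counts

frequencies = field_frequencies(RAW_RECORDS)
print(frequencies.most_common())
```

Nothing about the fields was declared before the data was stored; the structure is discovered when the data is read, which is what makes open-ended exploration possible.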

As the cost of data storage has decreased, the amount of data being stored has skyrocketed and is growing at an unprecedented rate. Each of us generates digital data in many ways, including the more obvious forms of digital communication (email, text, phone calls, video conferencing, photo sharing, social media), transaction-based data (such as credit card statements, billing/payment systems, employee records), and vast amounts of sensor data (such as maps, location sensing, and tracking). Financial companies, medical and scientific research organizations, and city planners are examples of organizations with significant big data challenges.

To be useful, big data analytics must be flexible and as close to real-time as possible. Big data analytics lets businesses move from traditional database queries (Who sold the most this month?) to “what’s happening” analysis (Why are we suddenly flooded with customer complaints through multiple channels?) and on to the “what if” questions necessary for planning or product development (What impact will outsourcing have on our customer support issues?).

Today many vendors offer big data solutions. An open source solution, the Apache Hadoop* framework, provides scalable, distributed, reliable big data analysis.
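
One programming model associated with Hadoop is MapReduce, in which work is split into a map step, a grouping step, and a reduce step so it can be spread across many machines. The sketch below imitates that model in plain, single-process Python to show the idea. It does not use Hadoop's actual API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are assumptions chosen for illustration.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Group all emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    # In a real cluster, each document (or block) could be mapped on a different node.
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data is big", "data is everywhere"])
print(counts)
```

Because each map call looks at only one document and each reduce call at only one key, the same logic can in principle be distributed across many machines, which is where the scalability comes from.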

