What really is Big Data?

We hear about Big Data everywhere – it’s the latest buzzword. But what really is Big Data, and why should businesses care?

Big data is about two things:

  1. Large data sets of various types, mostly unstructured and often with time-sensitive requirements (the “three Vs”: high Volume, Variety, and Velocity)
  2. Some relatively new techniques and tools to deal with this kind of data.


The Big Data Challenge and Opportunity

The challenge for businesses trying to manage and analyze Big Data is that traditional data management tools do not work well with Big Data.

In general, the traditional way of handling data is with a relational database for online transaction processing, plus a separate data warehouse and business intelligence tools for analytics, which relieves the main database of the analytical workload. Large relational databases tend to be expensive propositions, as the costs of the processing units and the disks are very high.

Relational databases are based on what’s called “early structure binding.” This means you have to know what questions will be asked of the database before you can design the schema, tables, and relations. With big data, the questions are often not known in advance, so this assumption frequently does not hold.
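To make the contrast concrete, here is a minimal sketch in Python. It assumes a hypothetical `checkins` table and two made-up JSON events; neither comes from a real system. The first half fixes the structure up front with the standard-library `sqlite3` module, and the second half keeps raw records and applies structure only when a question is asked.

```python
import json
import sqlite3

# Early binding: the table layout is decided before any data arrives.
# The table and column names are hypothetical, chosen for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkins (user_name TEXT, venue TEXT)")
conn.execute("INSERT INTO checkins VALUES (?, ?)", ("ana", "cafe"))


def insert_with_new_field(connection):
    """Try to store a field the schema did not anticipate."""
    try:
        connection.execute(
            "INSERT INTO checkins (user_name, venue, device) VALUES (?, ?, ?)",
            ("ben", "gym", "phone"),
        )
        return True
    except sqlite3.OperationalError:
        # The column does not exist, so the database rejects the row.
        return False


early_binding_accepts_new_field = insert_with_new_field(conn)

# Late binding: keep the raw records and decide which fields matter at query time.
raw_events = [
    '{"user_name": "ana", "venue": "cafe"}',
    '{"user_name": "ben", "venue": "gym", "device": "phone"}',
]


def query_field(lines, field):
    """Return the values of `field` from every record that happens to have it."""
    return [record[field] for record in map(json.loads, lines) if field in record]


devices = query_field(raw_events, "device")
```

In the first half, a new question (which device was used?) requires a schema change before the data can even be stored. In the second half, the same question can be asked of data that was collected before anyone thought of it.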


Types of Big Data

Unlike the analysis of transactional data, the analysis of big data is much less predictable. Big data is often either (1) various types of online data (text, images, video), or (2) machine data.

The first one is what we call our digital footprint. This is all our emails, the blogs we read and possibly write, tweets, Foursquare check-ins, Facebook entries, etc.

The second type is machine data, such as the log files generated by all the computers supporting our digital footprint. There are other examples of machine data as well, such as the sensor data behind real-time flight tracking.

Most of this data is unstructured, which can be loosely defined as having a variable number of fields of variable size. Big data also tends to be large, very large. Additionally, this data tends not to be mission-critical.
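As a rough illustration of “a variable number of fields of variable size,” the sketch below parses a few invented machine-data lines written as `key=value` pairs. The line format, the field names, and the helper `parse_record` are assumptions made for this example, not a real log format.

```python
def parse_record(line):
    """Split a 'key=value key=value' line into a dict of whatever fields it has."""
    fields = {}
    for token in line.split():
        key, separator, value = token.partition("=")
        if separator:  # ignore tokens that are not key=value pairs
            fields[key] = value
    return fields


# Invented sample lines: each one carries a different set of fields.
sample_lines = [
    "ts=1700000000 host=web01 status=200",
    "ts=1700000005 host=web02 status=500 error=timeout retry=3",
    "ts=1700000009 sensor=alt-7 altitude_ft=35000",
]

records = [parse_record(line) for line in sample_lines]
field_counts = [len(record) for record in records]
all_fields = sorted({key for record in records for key in record})
```

No single fixed set of columns fits all three records without leaving many cells empty, which is why this kind of data sits awkwardly in a relational table.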


Analyzing Big Data: Today’s Big Data Tools

Big data tools such as Hadoop and Splunk are based on a different data management paradigm than traditional relational tools. They rely on distributed processing of data that is itself distributed, and their primary requirement is not the ACID properties (atomicity, consistency, isolation, durability) but the flexibility to do ad hoc analytics. These tools are designed to run on commodity hardware and are resilient enough to handle the failures expected from cheap hardware. In exchange, they have higher latency when processing data, and they have dropped support for many (or all) of the ACID properties. This is the price to pay for dealing with very large unstructured data.
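The idea of processing distributed data where it lives can be sketched with a map step and a reduce step. The example below is a single-process simulation in plain Python, not the Hadoop or Splunk API: the three `partitions` stand in for data held on three machines, and the log lines are invented.

```python
from collections import Counter
from functools import reduce

# Each inner list stands in for the data held on one machine (hypothetical sample).
partitions = [
    ["GET /home 200", "GET /cart 500"],
    ["GET /home 200", "POST /pay 200"],
    ["GET /cart 500"],
]


def map_partition(lines):
    """Map step: count status codes using only the lines in one partition."""
    return Counter(line.split()[-1] for line in lines)


def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


# Every partition is processed independently, so this step could run in parallel.
partial_counts = [map_partition(partition) for partition in partitions]

# Only the small partial results need to be brought together.
status_totals = reduce(merge_counts, partial_counts, Counter())
```

Because each partition is handled on its own, a failed machine only requires redoing that partition's map step, which is one reason this style of processing tolerates cheap hardware.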

This is what big data is all about – a different paradigm for processing data.


How is Big Data Important in Healthcare?

Nowhere are the benefits of Big Data more tangible than in the healthcare sector.

Experts have predicted a 20% decrease in patient mortality as the medical field gears up to analyze streaming patient data with large-scale software applications. That’s not only a great return on investment; that’s using information to save human lives.

The biggest risk in any healthcare situation is human error, such as missed warning signs, overlooked risk factors, and cursory assessments. That’s why healthcare providers are beginning to turn to computers to help them make quick and accurate decisions about patient health. The applications of Big Data in the future of healthcare include:

  • Reducing readmissions
  • Accessing data anywhere
  • Point-of-care decision-making
  • Innovative smart devices
  • Genome sequencing

There are many published examples of healthcare facilities using big data to improve patient outcomes, including reducing the incidence of fatal blood clots and identifying patient characteristics and symptoms that increase the likelihood of readmission.

Hospitals across the country have begun utilizing Big Data software applications to make dramatic improvements in the diagnosis of patient health. In many cases, new or existing data is collected, parsed, and applied to analyze small but significant pieces of evidence that would likely be overlooked by a doctor simply reading a chart.



How is Big Data Important in Social Media?

In many ways, social media is the unmined data frontier. Social media alone generates more information in a short period of time than existed in the entire world just a few generations ago. Popular sites like Facebook, Instagram, Foursquare, Twitter, and Pinterest create massive quantities of data that, when properly analyzed, can offer a brand a golden ticket into the minds of its consumers.

However, this is easier said than done.

Unfortunately, the data produced by social media is not only enormous—it’s unstructured. The task of capturing, processing, and managing this data is unquestionably beyond human scale. In fact, it’s beyond the scale of most common software used by marketers.

Luckily, a slew of Big Data applications have been created specifically to make sense of social media data. At Opallios, we work with our clients to use these tools to determine the impact of every tweet, tag, pin, check-in, and like on their brand, helping them to listen in on and make sense of the social media conversation. Employing Big Data applications to measure brand engagement is essential for businesses wanting to connect with modern consumers. The conversation is happening—it’s up to you to start listening.



Why Opallios for Big Data?

Opallios has literally written the book on Big Data. Our CTO, Peter Zadrozny, is the author of a number of IT books, including one on big data, and teaches a university course on big data analytics. Our team is made up of experienced system architects and developers who work at the forefront of the rapidly growing Big Data industry, with expertise in big data analytics and in the architecture and administration of big data solutions. We use the latest technologies, such as Hadoop, Hive, HBase, Pig, and Splunk, to meet our clients’ specific needs.

