Who are we?

Opallios is an innovative technology consulting firm with a unique combination of big data, Salesforce, strategy, and product development expertise.

We work with clients in many vertical industries, including technology, healthcare, financial services and hospitality.  We help our clients leverage the cloud and Salesforce to realize business value from the data revolution.

Peter Zadrozny of Opallios to Speak at the Palo Alto Data Science Association Meet-up

Santa Clara, CA – February 27, 2014 – Opallios, a company specializing in big data and cloud solutions, today announced that Peter Zadrozny, CTO and Founder of Opallios, will be speaking at the Palo Alto Data Science Association Meet-up to be held on March 6th, 2014 in Menlo Park, CA.

Mr. Zadrozny will be speaking about his experiences with projects that involve the use of sentiment analysis algorithms. He will walk through data acquisition and preparation, how to get good training data, which elements affect a Naïve Bayes model, and how Splunk served as a key tool in solving big data problems.

The Palo Alto Data Science Association (PADSA) Meet-up group was founded in November 2013 and has grown to over 400 data scientists, engineers, and technologists. Drawing primarily from the expertise of members, topics lean toward live demos, real-world applications, and innovative approaches to solving data science challenges.

About Opallios

Opallios specializes in outsourced product development, embracing emerging technologies in data and analytics to offer an unparalleled and innovative value proposition. From strategy through to implementation, Opallios follows a unique operational model to leverage the best talent pool across the world, deploying the latest viable technologies while maintaining focus on high quality and efficient total cost of ownership.

For more information on Opallios visit www.opallios.com or write to inquiries@opallios.com

For more information on the Palo Alto Data Science Association visit www.meetup.com/Palo-Alto-Data-Science-Association

What really is Big Data?

Big Data is the latest buzzword, and as such everybody uses it to mean whatever suits them. This has inevitably created some confusion. In this blog post we will try to clear it up.

Big data is about two things:  large sets of typically unstructured data and some relatively new techniques to deal with this kind of data. To get a good perspective we need to start by reviewing relational databases.

In general, the traditional way of handling data is by using a relational database. When a database is used for on-line transaction processing we tend to see a separate setup for analytics. This is commonly known as a data warehouse, which provides processing relief for the main database. It usually comes with an analytical, or so-called business intelligence, tool. Large relational databases tend to be expensive propositions, as the costs of the processing units and the disks are very high.

Relational databases are based on what’s called “early structure binding”. What this means is that you have to know what questions are going to be asked of the database so that you can design the schema, tables and relations. Any new question that doesn’t fit this schema requires modifying the schema, which usually takes a fair amount of time and good technical skills.
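To make early structure binding concrete, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns, and the coupon question are our own illustrative assumptions, not taken from any real system. It shows that the table must be defined before any data is stored, and that a new question means changing the schema first.

```python
import sqlite3

# Early binding: the schema has to exist before a single row is stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.execute("INSERT INTO orders VALUES (1, 'alice', 19.99)")

# A new question ("which coupon was used?") does not fit the schema,
# so the schema must be modified before the data can even be recorded.
conn.execute("ALTER TABLE orders ADD COLUMN coupon TEXT")
conn.execute("INSERT INTO orders VALUES (2, 'bob', 5.00, 'SPRING')")

# Rows stored before the change have no answer to the new question.
coupons = [row[0] for row in conn.execute("SELECT coupon FROM orders ORDER BY id")]
print(coupons)
```

In this toy case the change is one statement. In a large database with dependent applications, that schema modification is where the time and technical skill go.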

These restrictions can be considered the price to pay for having a fully transactional system, that is, one that fully complies with the ACID properties.

Let’s move to the actual “big data”. It can be broken down into two parts. The first one is what we call our digital footprint. This is all our emails, the blogs we read and possibly write, tweets, Foursquare check-ins, Facebook entries, etc. The second part is machine data, such as the log files generated by all those computers supporting our digital footprint. But there is also plenty of other machine data, such as the sensor data that gives us real-time flight tracking.

Most of this data is unstructured, which can be loosely defined as a variable number of fields of variable size, any of which may or may not be present. Big data also tends to be large, very large. Just think of the web access log files of a popular web site. Such a site can generate a few megabytes per day, maybe even per hour. Additionally, this data tends not to be mission critical. Not only that, in general it does not require the functionality offered by a fully transactional system. After all, most of the time all we do with it is run some analytics.
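A short sketch may help show what “a variable number of fields” looks like in practice. The key=value log format, the field names, and the sample lines below are hypothetical, chosen only for illustration. Each line is turned into a dictionary at read time, and a missing field is handled when the question is asked rather than when the data is stored.

```python
def parse_event(line):
    """Split a 'key=value key=value' line into a dict; any field may be absent."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


# Hypothetical machine data: every line carries a different set of fields.
raw_events = [
    "ts=2014-02-27T10:00:01 user=alice action=login",
    "ts=2014-02-27T10:00:05 action=view page=/pricing referrer=twitter",
    "ts=2014-02-27T10:00:09 user=bob action=purchase amount=19.99",
]

events = [parse_event(line) for line in raw_events]

# The question is decided at read time; absent fields get a default.
users = [event.get("user", "unknown") for event in events]
print(users)
```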

Now that we have a better understanding of big data, let’s flip the relational database paradigm from centralized, high performance, fully transactional processing to distributed processing with a higher latency that might comply with just one or two of the ACID properties, and sometimes none.

Big data tools such as Hadoop and Splunk are based on this other paradigm: distributed processing of data that is itself distributed. These tools are designed to work on commodity hardware, and are resilient enough to handle the failures expected from cheap hardware. But these tools have a higher latency when processing this data, and they have dropped support for many (or all) of the ACID properties. Just think of it as the price to pay for dealing with very large unstructured data.
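The idea of processing distributed data where it lives can be sketched in a few lines. This is a single-machine simulation in the map-and-reduce style associated with Hadoop; it does not use the actual APIs of Hadoop or Splunk, and the partitions and log lines are hypothetical. Each partition stands in for data held on a separate commodity machine.

```python
from collections import Counter
from functools import reduce

# Hypothetical access-log lines, already spread over three "machines".
partitions = [
    ["GET /home 200", "GET /pricing 200", "GET /home 500"],
    ["GET /home 200", "POST /login 302"],
    ["GET /pricing 404", "GET /home 200"],
]


def map_partition(lines):
    """Count HTTP status codes within one partition, next to the data."""
    return Counter(line.split()[-1] for line in lines)


def merge_counts(left, right):
    """Combine two partial results into one."""
    return left + right


# Map step: each partition is summarized independently.
partial_counts = [map_partition(lines) for lines in partitions]

# Reduce step: only the small partial results are brought together.
status_counts = reduce(merge_counts, partial_counts, Counter())
print(status_counts)
```

Because each partition is summarized on its own, a failed machine only means redoing that one partition, which is part of what makes cheap hardware workable.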

This is what big data is all about: a different paradigm for processing data.

One last thought: these big data tools can also handle structured data, and small data sets too, so don’t underestimate what they can do.

In the next blog post we will explain the big data tools and their underlying techniques in more detail.

Impact of Big Data in the Social Media

Social Media: The Unmined Frontier

IBM estimates that 2.5 quintillion bytes of new data are created every day. To put this into perspective, social media alone generates more information in a short period of time than existed in the entire world just several generations ago. Popular sites like Facebook, Instagram, Foursquare, Twitter, and Pinterest create massive quantities of data that—if translated properly by large-scale applications—would be any brand’s golden ticket into the minds of its consumers.

Unfortunately, the data produced by social media is not only enormous—it’s unstructured. The task of capturing, processing, and managing this data is unquestionably beyond human scale. In fact, it’s beyond the scale of most common software. Because of this, a glass wall exists between marketers and the data—they can see it, but they can’t harness it.

It’s easy to see how Big Data fits into the picture. The Big Data industry deals in sets of data that range from a few dozen terabytes to many hundreds of petabytes. A slew of Big Data applications have been created specifically to make sense of social media data. Savvy marketers use these tools to determine the impact of every tweet, tag, pin, check-in, and like on their brand. Read on to learn more about using Big Data to listen in on the social media conversation.

Making Sense of the Conversation

Companies used to be able to hire humans to separate the wheat from the chaff. For instance, Nielsen ratings were a straightforward way for businesses to analyze the effectiveness of their advertising on television. It was highly actionable data: a glance at the numbers gave executives a ballpark to work with when making costly advertising decisions.

Social media, on the other hand, is a lot like word of mouth, except that everything your consumers say is filed away in massive databases, and only a microscopic fraction of the conversation is even remotely relevant to your company’s brand. The trick to navigating these massive tracts of information is to know how to look for the “right” data. The “right” data is that which drives consumer actions. It’s one thing to have ten million impressions on a YouTube video; it’s another to understand why your brand is stimulating so much chatter. In order to harness the chatter and turn it into actionable information, companies need to start turning to Big Data for translation.

Even today, companies hire college interns and clueless marketers to babysit their Twitter feeds and guide their brands. They fail to recognize that social media is a different beast; they also fail to recognize that it is an untapped source of vital information about their brand. Big Data utilizes large-scale applications to instantly translate massive swaths of data into human-readable information. This information can then be acted upon to increase your bottom line. In other words, there is a way to prove that your YouTube video is driving sales. You just have to know how to look.

Harnessing Data to Drive Your Brand

Today’s consumer interacts with your brand in countless ways. They like your page on Facebook, watch your advertisements on YouTube, download your free apps, read your news on their tablets, chat online with your customer service representatives—the list goes on. Data is there for the mining, but there’s simply too much of it, and most of it is contradictory. Luckily, there are endless ways for you to utilize Big Data applications to make this information work for your brand:

• Hire the right people. The first step is getting the right team on board. Making use of all the data requires skills that your marketers may never have needed before. You can hire your own database engineers, computer scientists, and statisticians, or you can make use of the myriad Big Data solutions offered by budding startups and major corporations.

• Follow the keywords. Tuning in for specific words, and analyzing when and where they are used, can yield surprisingly clear results. Imagine being able to pick apart customer service calls and pinpoint the motivating factor behind a recent slew of terminated accounts. Armed with this information, you could identify at-risk customers and start meeting their needs before they jump ship.

• Measure brand engagement. Fostering emotional rapport between product and consumer has always been the end goal of marketing. Social media creates brand engagement on a level far beyond what anyone could have imagined, so learning how to have a dialogue with your customers is more important than ever.

• Correlate the facts. By comparing factors such as website traffic, product purchases, advertisement spending, and customer inquiries, it’s possible to track the effectiveness of your media investments, and to make informed adjustments when you’re spending too much or too little.

• Anticipate the future. Perhaps the most profitable application of Big Data is in anticipating the future. Companies can analyze customers’ behavioral data in order to arrive at informed conclusions regarding how new products will fare in the marketplace. It’s also possible to head off looming PR crises, and react in real time to customers’ evolving perceptions and needs.
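As a small illustration of following the keywords, the sketch below counts a watch list of words in customer service transcripts and groups the counts by month. The keyword list, the transcripts, and the month labels are all hypothetical, and a real deployment would run the same idea over far larger volumes with a Big Data tool rather than a single script.

```python
import re
from collections import Counter

# Hypothetical watch list of words that may signal an at-risk customer.
KEYWORDS = {"cancel", "refund", "slow", "billing"}

# Hypothetical (month, transcript) pairs from customer service calls.
calls = [
    ("2014-01", "I want to cancel, the app is too slow"),
    ("2014-01", "Billing error again, please refund me"),
    ("2014-02", "Slow, slow, slow. I will cancel my account"),
]


def keyword_hits(text):
    """Count how often each watched keyword appears in one transcript."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(word for word in words if word in KEYWORDS)


# Analyze when the keywords are used by grouping the counts per month.
by_month = {}
for month, text in calls:
    by_month.setdefault(month, Counter()).update(keyword_hits(text))

for month in sorted(by_month):
    print(month, dict(by_month[month]))
```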

The buzz around Big Data is considerable. Start-ups and corporations alike are touting solutions for data-mining your way to increased profits. Fortunately, this is one case where the solution lives up to the hype. Employing Big Data applications to measure brand engagement is essential for businesses wanting to connect with modern consumers. The conversation is happening—it’s up to you to start listening.

Impact of Big Data in the Healthcare Sector

The Big Players in Big Data

There is a good deal of buzz around Big Data—hardly a week goes by without a new startup or business preaching the virtues of mining massive tracts of information. Understandably, many have trouble seeing beyond the hype, but nowhere are the benefits of Big Data more tangible than in the healthcare sector.

IBM predicts a 20% decrease in patient mortality as the medical field gears up to analyze streaming patient data with large-scale software applications. That’s not just a return on investment (though it is)—that’s using information to save human lives. Even now, major corporations like Microsoft, Dell, IBM, and Oracle are pioneering data-mining platforms that will help medical professionals stay on top of patient data and provide improved medical care.

The Big Data industry deals primarily in collections of data that are beyond the ability of common software to capture, process, and manage in a timely manner. The amount of data processed can range from a few dozen terabytes to many petabytes of information. A plethora of Big Data tools have been released recently to make meaning of it all—you could say they’re finding new ways to make data earn its space. Read on to learn more about Big Data’s colossal impact on the healthcare industry.

Mining Data to Save Lives

The biggest risk any patient takes when admitted to a hospital is being examined by another human. Missed warning signs, overlooked risk factors, and cursory assessments are often part and parcel of hectic admission wards. That’s why healthcare providers are beginning to turn to computers to help them make quick and accurate decisions about patient health. There are endless exciting opportunities for Big Data in the future of healthcare:

• Reducing Readmissions. What if a pile of data could tell you how likely a patient was to be readmitted after treatment? Doctors would be able to judge whether patients would benefit from shorter or longer stays, as well as track specific treatment factors leading to the recurrence of ailments. That saves time (and money) for both the doctor and the patient.

• Accessing Data Anywhere. Using secure data querying technologies capable of parsing enormous amounts of information, it’s possible for medical professionals to access remote data for a more timely and complete understanding of their patient’s ailment.

• Point-Of-Care Decision-Making. Imagine tools and equipment with built-in data processing capable of helping doctors make instant, life-saving calls at the point of care. The most obvious application for this is in fast-paced areas such as the hospital emergency room.

• Innovative Smart Devices. We’re already surrounded by intelligent devices capable of funneling large amounts of data at light speed. Why not put a chip in an in-home diabetes monitor that can send back frequent and useful data about a patient’s in-home treatment? Take it a step further, and make smart toothbrushes, smart toilets, and smart scales capable of reporting instantly on a patient’s condition.

• Genome Sequencing. Though still a distant dream, whole genome sequencing may be the most intriguing item on this list. When the human genome was first decoded, it took over ten years to process the data (which is in petabytes); now it takes merely a week. As Big Data technology expands to process even larger amounts of data, everyday genome sequencing may become available to the private sector.

Many of these possibilities are already in the process of being implemented, but some possibilities are a little farther off. Either way, it’s encouraging to consider that the amassment of huge quantities of data—a practice frowned upon by many privacy-loving citizens—can and will have a profound effect on lives saved and improved by medical technology.


Big Data Making a Difference

According to Business Week, New York-Presbyterian began implementing Microsoft technology in 2010 to scan patient records—and they have reduced instances of fatal blood clots by nearly a third. Also featured was Seton Healthcare Family, which used data-mining to discover that a bulging jugular vein is a predictor that patients admitted for congestive heart failure will be arriving back through the hospital’s spinning doors almost as soon as they’ve been sent out.

In a recent article, fastcompany.com highlighted a bad case of readmissions at the Washington Hospital Center, a healthcare facility near D.C. Emergency room doctors were starting to see an unsettling trend in the number of patients returning to the ER with the same issues they had before. When it became clear that the center was going to be penalized for readmissions, it worked with a computer scientist at Microsoft Research to parse data from over 300,000 ER visits.

They found two major red flags: ER visits longer than 14 hours, and the word “fluid.” Readmission rates for American hospitals are as high as 20% within the first thirty days, and readmissions cost Medicare $17 billion a year. Washington Hospital Center hopes to use the information it gleaned to get at the root of its readmission woes.
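As a rough illustration of how two red flags like these could be checked against incoming records, consider the sketch below. The record fields (hours_in_er, notes) and the sample visits are hypothetical, and this simple rule check is not the analysis performed by the hospital or by Microsoft Research.

```python
def red_flags(visit):
    """Return the red flags raised by one ER visit record."""
    flags = []
    if visit["hours_in_er"] > 14:  # visit longer than 14 hours
        flags.append("long_stay")
    if "fluid" in visit["notes"].lower():  # the word "fluid" in the notes
        flags.append("fluid_mentioned")
    return flags


# Hypothetical visit records, for illustration only.
visits = [
    {"hours_in_er": 16.5, "notes": "Fluid retention noted on arrival"},
    {"hours_in_er": 3.0, "notes": "Sprained ankle, discharged"},
    {"hours_in_er": 9.0, "notes": "IV fluids given"},
]

flagged = [red_flags(visit) for visit in visits]
print(flagged)
```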

Hospitals across the country have begun utilizing Big Data software applications to make dramatic improvements in the diagnosis of patient health. In many cases, new or existing data is collected, parsed, and applied to analyze small but significant pieces of evidence that would be overlooked by a doctor simply reading a chart. In 2011 alone, Big Data generated over $30 billion in revenue, according to research from IDC. That number is only expected to go up as the healthcare sector warms up to new technologies.