What Will Happen With Big Data in 2017? Prepare for the Unexpected!
Quentin Gallivan, CEO | Pentaho
No, my computer hasn’t been hacked. The honest truth is that I can’t predict with certainty what surprises 2017 will bring. This year the UK voted to leave the EU and America voted for Donald Trump despite nearly all pre-election polls predicting different outcomes. The reason these events surprised us boils down to a data problem. Pollsters had access to huge, diverse data sets, but were relying on outdated methods. Their greatest failure was not factoring voter sentiment into their analyses, especially from the rural locations where this was not well understood. The fact that governments on both sides of the Atlantic were caught out by the same data problems also meant neither had adequate transition plans.
In the same way that those outdated polling methods no longer serve governments, the old ways of analyzing data don't cut it anymore in the business world. Every day I talk to more enterprises blending their corporate data with sentiment, location and sensor data for more precise insights that grow revenue, deliver a 360-degree view of their customers, mitigate risk and improve operational efficiency. As big data compounds with data from sensors and devices, it's clear there will be no slowdown in data volumes as IoT and big data merge. Meanwhile, the gap widens between innovators willing to get started, experiment and refine their implementations, and companies sitting on the fence.
And while next year will no doubt throw us more curveballs, I am confident enough to predict five ways in which big data and IoT systems will evolve in 2017 to help businesses prosper during uncertain times:
Self-service data prep will unlock big data's full value. Organizations building advanced big data deployments – like the ones needed to accurately predict election outcomes – are buckling under huge, diverse data volumes. Simply preparing data for analysis is overwhelming organizations short on resources and time, often consuming 50-70% of IT's time. The sentiment data I mentioned only exacerbates this problem: it must be continually ingested from a huge universe of social network feeds and prepared for analysis. Self-service visualization tools that can only analyze data after it's been prepared are diminishing in value. Our customer Sears Holdings spot-checks and visualizes its data throughout its lifecycle, which enables it to make more valuable data-driven decisions – in time for them to matter – while reducing costs. Expect more software vendors in 2017 to follow our lead and offer tools that bridge the gap between analytics and data prep with an integrated experience for both.
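To make the prep burden concrete, here is a minimal sketch of the kind of repetitive cleanup that eats analyst and IT time before any visualization can happen. The feed, field names and rules are hypothetical, not Pentaho's implementation – just the familiar trio of whitespace, inconsistent casing and missing values:

```python
import csv
import io

# Hypothetical raw feed: stray whitespace, mixed casing, missing values --
# the routine cleanup that consumes so much time before analysis can start.
RAW = """customer,region,sentiment_score
 Alice ,NORTH,0.82
Bob,south,
carol,North,0.35
"""

def prepare(raw_csv):
    """Normalize whitespace and casing, and coerce scores to floats."""
    rows = []
    for rec in csv.DictReader(io.StringIO(raw_csv)):
        rows.append({
            "customer": rec["customer"].strip().title(),
            "region": rec["region"].strip().lower(),
            # A missing sentiment score becomes None, not an empty string
            "sentiment_score": (float(rec["sentiment_score"])
                                if rec["sentiment_score"] else None),
        })
    return rows

clean = prepare(RAW)
```

Multiply rules like these across hundreds of feeds and formats, and the appeal of tools that fold prep into the analytics experience itself becomes obvious.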
Organizations are replacing self-service reporting with embedded analytics. As I first predicted in 2015, embedded analytics is becoming 'the new BI'. We are now really starting to see our vision of 'next-generation applications' mature and replace self-service reporting. Organizations recognize that analytics is an expectation and must be embedded at the point of impact, regardless of the end user's sophistication. In our customer CERN's case, this involves 15,000 users in various operational roles accessing Pentaho analytics from their normal line-of-business applications.
IoT's adoption and convergence with big data will make automated data onboarding a requirement. This year predictive maintenance became a marquee use case for IoT's ROI potential, and it will continue to gather speed in 2017. Everything from shipping containers to oil-drilling screws to train doors is being fitted with sensors to track location, operating status and power consumption. And speaking of trains, expect to hear more about our project with Hitachi Rail to build 'self-diagnosing' trains that detect when a problem is brewing, so a train can be taken out of service or repaired before a failure occurs. To ingest, blend and analyze the massive volumes of data all these sensors generate, more businesses will need to automatically detect and onboard any data type into their analytics pipelines. This is simply too big, complex, fast-moving and mind-numbing a job for overburdened IT teams to handle manually.
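The heart of automated onboarding is detecting what a new feed looks like without a human mapping every field. A toy sketch of that idea, assuming hypothetical sensor records (the field names and type rules are mine, not any product's):

```python
def infer_type(values):
    """Guess a column's type from sample string values (a toy onboarding step)."""
    def all_parse(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False
    if all_parse(int):
        return "int"
    if all_parse(float):
        return "float"
    return "string"

def infer_schema(records):
    """Map each field name to a type inferred across the sampled records."""
    fields = records[0].keys()
    return {f: infer_type([r[f] for r in records]) for f in fields}

# Hypothetical sample of incoming sensor readings, all arriving as strings
sample = [
    {"sensor_id": "42", "temp_c": "21.5", "status": "ok"},
    {"sensor_id": "17", "temp_c": "19.8", "status": "warn"},
]
schema = infer_schema(sample)
```

Production systems must also handle timestamps, units, drifting schemas and bad readings, which is exactly why this job overwhelms manual processes at IoT scale.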
2017's early adopters of AI and machine learning in analytics will gain a huge first-mover advantage in the digitalization of business. Big data and IoT use cases in business and industry are approaching the data variety, volume and velocity of the large-scale scientific models for which AI and machine learning were originally conceived. Early adopters will get a jump on the market in 2017 because they know that the sooner these systems begin learning about the contexts in which they operate, the sooner they will get to work mining data to make increasingly accurate predictions. This is just as true for an online retailer wanting to offer better recommendations to customers, a self-driving car manufacturer, or an airport seeking to prevent the next terrorist attack.
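The retailer example can be sketched in a few lines: a co-occurrence recommender that gets better as more purchase history accumulates, which is the essence of the first-mover advantage. The baskets and item names are invented for illustration; real recommenders use far richer models:

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets from an online retailer's order history
baskets = [
    {"laptop", "mouse"},
    {"laptop", "mouse", "keyboard"},
    {"mouse", "keyboard"},
    {"laptop", "keyboard"},
    {"laptop", "mouse"},
]

def recommend(item, baskets):
    """Recommend the product most often co-purchased with `item`."""
    pairs = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            pairs[(a, b)] += 1
    scores = Counter()
    for (a, b), n in pairs.items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return scores.most_common(1)[0][0] if scores else None

best = recommend("laptop", baskets)
```

Every new basket sharpens the counts, so the earlier a retailer starts collecting and learning, the better its recommendations when competitors are still getting organized.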
Cybersecurity will be the most prominent big data use case. As with election polls, detecting cybersecurity breaches depends on understanding the complexities of human behavior. Accurate predictions depend on blending structured data with sentiment, location and other data. BT's Assure Cyber service, for example, uses Pentaho to help detect and mitigate complex, sustained security threats by blending event data and telemetry from business systems, traditional security controls and advanced detection tools, among other sources.
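Once event data from those sources is blended into a common timeline, even a simple statistical test can surface behavior worth investigating. A minimal sketch using a z-score over hypothetical hourly failed-login counts (real services like Assure Cyber blend far more signals and use far more sophisticated detection):

```python
import statistics

def flag_anomalies(counts, threshold=2.5):
    """Flag time buckets whose counts deviate strongly from the mean.

    A simple z-score test; a sketch only, not a production detector.
    """
    mean = statistics.mean(counts)
    stdev = statistics.pstdev(counts)
    if stdev == 0:
        return []  # perfectly flat traffic: nothing stands out
    return [i for i, c in enumerate(counts)
            if abs(c - mean) / stdev > threshold]

# Hypothetical hourly failed-login counts blended from several event logs
hourly_failures = [4, 5, 3, 6, 4, 5, 4, 60, 5, 4]
suspicious = flag_anomalies(hourly_failures)
```

The blending is the hard part: the spike only becomes visible when authentication logs, network telemetry and business-system events land in one analyzable stream.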
So there you have it. I’ve ventured to make a few predictions, all inspired by you – our customers and community users who continue to demand greater ease of use, connectivity, automation, agility and practical solutions to the hardest data problems out there. We will never cease tackling these hardest of challenges head on and wish you great prosperity and success as you continue to prepare for the unexpected in 2017!