It’s exciting to see Big Data as a Service, or “BDaaS” as it’s inevitably abbreviated, start to take off. Numerous media outlets have covered the trend’s potential and challenges. Social media (namely, Twitter) continues to share a Forbes article about Big Data as a Service by Bernard Marr, author and industry expert. Marr estimates that the market will reach an impressive $30 billion valuation by 2021. Yet, he explained:
At the moment, BDaaS is a somewhat nebulous term, which is often used to describe a wide variety of outsourcing of various Big Data functions to the cloud. This can range from the supply of data, to the supply of analytical tools with which to interrogate the data (often through a web dashboard or control panel) to carrying out the actual analysis and providing reports. Some BDaaS providers also include consulting and advisory services within their BDaaS packages.
While there is not yet a precise definition of BDaaS, common themes are emerging and diverging. Here’s my take on some of the recent conversation.
Agree: BDaaS service providers all have slightly different capabilities. A host of services now use the BDaaS moniker. These services have both commonalities and key differences:
- Most, if not all, are cloud-based, though with differences in architecture (multi-tenant, single-tenant, etc.).
- Most are promoted as a managed service, though with varying levels of management.
- Some are based on Hadoop, or other open-source software.
- A few include access to external or third-party datasets.
- A few have their own visualization capabilities, though most support industry-standard tools.
Most BDaaS vendors essentially offer Big Data Analytics Processing as a Service (but BDAPaaS doesn’t really roll off the tongue). This definitional issue will persist, so buyers need to be diligent in their evaluations. Christian Prokopp, author and engineering director at Big Data Partnership, presciently blogged on BDaaS over a year ago:
Today, as a consequence of the Big Data trend, Businesses can turn to Big Data as a Service (BDaaS) solutions to bridge the storage and processing gap. Interestingly, a definition and classification of BDaaS is missing today and various types of services compete in the space with very different business models and foci.
Agree: Big Data as a Service improves enterprise agility. With big data’s well-documented potential, an outsider might wonder why more enterprises don’t have programs yet. To paraphrase a data-savvy friend, “This stuff is hard, yo!” There’s a plethora of complex new technology, legacy investments, many other IT priorities and a shortage of skills. BDaaS removes many of the technical and skill barriers to data processing, so enterprises can focus on using data effectively and be more adaptable to change. In a recent TDWI article about BDaaS, Raghuveeran Sowmyanarayanan, vice president at Accenture, described the benefits this way:
BDaaS enables agility, fluidity, and flexibility. For example, enterprises find it’s relatively easier and quicker to adapting changes at lesser cost. Fluidity is the degree to which your BDaaS can be rapidly and cost-effectively repurposed and reconfigured to respond to (and proactively drive) change in a dynamic world.
Agree: Security, privacy and integration are key issues for BDaaS. While many enterprises have developed skills to manage these vexing data challenges on-premises, the cloud is a very different environment. Enterprises tell us they still struggle to understand this new paradigm and to integrate it with significant legacy investments. BDaaS providers, many of which are digital natives, often take different approaches to issues like integration and security, which calls for careful scrutiny. Writing on Datafloq about BDaaS, Shridhar Revankar, data professional and industry expert, adds:
Implementation challenges can stem from lack of preparation, crunch in expertise and compromised data security, all of which will have to be addressed in the future management of BDaaS technology.
Disagree: Hadoop as a Service = Big Data as a Service. I parrot the chorus: big data is about more than just Hadoop. Not only are there many other technologies out there, but we’ve also watched the definition of big data morph over the last few years. While early definitions often narrowly categorized big data as newer formats like semi-structured data (logs, sensor data, etc.), in common usage, big data is now often shorthand for “all the kinds of data our company has to deal with.” To wit, a recent TechTarget Big Data Analytics survey asked respondents which of these formats their organizations consider “big data”:
- 79% include structured data (e.g., transactions, customer records)
- 59% include semi-structured data (log, social, sensor, clickstream)
- 52% include unstructured data (videos, images, audio)
Add to this a variety of analytic processing requirements for different data jobs (batch, real-time and advanced), and clearly Hadoop is not the appropriate engine for every big data job. That’s why forward-thinking BDaaS providers, including Cazena, offer multiple processing engines (Hadoop, MPP SQL and Spark) to address the full spectrum of data and analytical needs.
Disagree: Big Data as a Service threatens IT. Earlier this year, SearchCloudApplications.com put out a clever headline: “Beware the BDaaS double boomerang.” The article includes an interview with Nik Rouda, senior analyst with Enterprise Strategy Group, and describes the phenomenon of a marketing department sourcing a DaaS or BDaaS service independently, finding success and then coming back to IT.
“These DaaS projects always reach a transition point where they come back to IT. The marketing people say, ‘We built this and it’s working great, when can you roll it out companywide?’” ESG’s Rouda said. “IT puts its foot down, saying that it can neither support nor guarantee the solution, and asks how much it will cost and whether it is even secure.” The result is a second boomerang right back to marketing, “when IT says ‘We’re not Hadoop experts, maybe you are better off going with that third-party DaaS provider,’” he said. And just like that, IT is no longer in control.
It doesn’t have to be that way. BDaaS is poised to help IT leaders, enabling them to quickly offer big data capabilities within the scope of existing data and governance programs. Instead of messing with Hadoop for six months, IT can source BDaaS that meets enterprise requirements and then reallocate precious resources to helping the business analyze data more effectively. IT can implement BDaaS in days or weeks, giving the organization unprecedented data access, unlimited scale and new agile data analytics capabilities. Talk about a quick win!