Business Analytics.com

Big data analytics

By Scott Robinson

What is big data analytics?

Big data analytics is the often complex process of examining big data to uncover information -- such as hidden patterns, correlations, market trends and customer preferences -- that can help organizations make informed business decisions.

On a broad scale, data analytics technologies and techniques give organizations a way to analyze data sets and gather new information. Business intelligence (BI) queries answer basic questions about business operations and performance.

Big data analytics is a form of advanced analytics, which involves complex applications with elements such as predictive models, statistical algorithms and what-if analysis powered by analytics systems.

An example of big data analytics can be found in the healthcare industry, where millions of patient records, medical claims, clinical results, care management records and other data must be collected, aggregated, processed and analyzed. Big data analytics is used for accounting, decision-making, predictive analytics and many other purposes. This data varies greatly in type, quality and accessibility, presenting significant challenges but also offering tremendous benefits.

Why is big data analytics important?

Organizations can use big data analytics systems and software to make data-driven decisions that can improve their business-related outcomes. The benefits can include more effective marketing, new revenue opportunities, customer personalization and improved operational efficiency. With an effective strategy, these benefits can provide a competitive advantage.

How does big data analytics work?

Data analysts, data scientists, predictive modelers, statisticians and other analytics professionals collect, process, clean and analyze growing volumes of structured transaction data, as well as other forms of data not used by conventional BI and analytics programs.

The following is an overview of the four steps of the big data analytics process:

  1. Data professionals collect data from a variety of sources. Often, it's a mix of semistructured and unstructured data. While each organization uses different data streams, some common sources include the following:
  2. Data is prepared and processed. After data is collected and stored in a data warehouse or data lake, data professionals must organize, configure and partition the data properly for analytical queries. Thorough data preparation and processing results in higher performance from analytical queries. Sometimes this processing is batch processing, with large data sets analyzed over time after being received; other times it takes the form of stream processing, where small data sets are analyzed in near real time, which can increase the speed of analysis.
  3. Data is cleansed to improve its quality. Data professionals scrub the data using scripting tools or data quality software. They look for errors or inconsistencies, such as duplications or formatting mistakes, and organize and tidy the data.
  4. The collected, processed and cleaned data is analyzed using analytics software. This includes tools for the following:
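To make the cleansing step above concrete, here is a minimal Python sketch of the kind of scrubbing data professionals perform: trimming whitespace, normalizing formatting and dropping duplicate records. The field names and rules are hypothetical illustrations, not taken from any specific data quality tool.

```python
def clean_records(records):
    """Return records with whitespace trimmed, emails lowercased,
    and exact duplicates removed (first occurrence kept)."""
    seen = set()
    cleaned = []
    for rec in records:
        # Normalize formatting: strip stray whitespace, lowercase emails.
        normalized = {
            "name": rec.get("name", "").strip(),
            "email": rec.get("email", "").strip().lower(),
        }
        key = (normalized["name"], normalized["email"])
        if key not in seen:  # drop duplicates that match after normalization
            seen.add(key)
            cleaned.append(normalized)
    return cleaned

raw = [
    {"name": " Ada Lovelace ", "email": "ADA@EXAMPLE.COM"},
    {"name": "Ada Lovelace", "email": "ada@example.com"},  # duplicate once normalized
    {"name": "Alan Turing", "email": "alan@example.com"},
]
cleaned = clean_records(raw)
print(cleaned)
```

In practice this logic runs at much larger scale inside dedicated data quality software or distributed processing jobs, but the core operations, normalization and deduplication, are the same.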

Types of big data analytics

There are several different types of big data analytics, each with its own application within the enterprise.

Key big data analytics technologies and tools

Many different types of tools and technologies are used to support big data analytics processes, including the following:

Big data analytics applications often include data from both internal systems and external sources, such as weather data or demographic data on consumers compiled by third-party information services providers. In addition, streaming analytics applications are becoming more common in big data environments as users look to perform real-time analytics on data fed into Hadoop systems through stream processing engines, such as Spark, Flink and Storm.
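As a hedged illustration of the streaming pattern mentioned above, the short Python sketch below updates a rolling metric as each event arrives, rather than recomputing over the full data set batch-style. Production systems would use a stream processing engine such as Spark, Flink or Storm; this toy class only shows the per-event update idea, and its names are hypothetical.

```python
from collections import deque

class RollingAverage:
    """Stream-style metric: updated per event over a fixed-size window,
    instead of recomputed over the whole data set (batch-style)."""

    def __init__(self, window_size):
        # deque with maxlen evicts the oldest value automatically.
        self.window = deque(maxlen=window_size)

    def update(self, value):
        self.window.append(value)
        return sum(self.window) / len(self.window)

avg = RollingAverage(window_size=3)
results = [avg.update(v) for v in [10, 20, 30, 40]]
print(results)  # [10.0, 15.0, 20.0, 30.0]
```

Each call to `update` reflects only the most recent window of events, which is what lets streaming applications report near-real-time results while data is still arriving.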

Early big data systems were mostly deployed on premises, particularly in large organizations that collected, organized and analyzed massive amounts of data. But cloud platform vendors, such as Amazon Web Services (AWS), Google and Microsoft, have made it easier to set up and manage Hadoop clusters in the cloud. The same goes for Hadoop suppliers such as Cloudera, which support the distribution of the big data framework on AWS, Google and Microsoft Azure clouds. Users can spin up clusters in the cloud, run them for as long as they need and then take them offline with usage-based pricing that doesn't require ongoing software licenses.

Big data has become increasingly beneficial in supply chain analytics. Big supply chain analytics uses big data and quantitative methods to enhance decision-making processes across the supply chain. Specifically, big supply chain analytics expands data sets for increased analysis that goes beyond the traditional internal data found on enterprise resource planning and supply chain management systems. Also, big supply chain analytics implements highly effective statistical methods on new and existing data sources.

Big data analytics uses and examples

The following are some examples of how big data analytics can be used to help organizations:

Big data analytics benefits

The benefits of using big data analytics include the following:

Big data analytics challenges

Despite the wide-reaching benefits that come with using big data analytics, its use also comes with the following challenges:

History and growth of big data analytics

The term big data was first used to refer to increasing data volumes in the mid-1990s. In 2001, Doug Laney, then an analyst at consultancy Meta Group Inc., expanded the definition of big data. This expansion described the increase of the following:

Those three factors became known as the 3V's of big data. Gartner popularized this concept in 2005 after acquiring Meta Group and hiring Laney. Over time, the 3V's became the 5V's by adding value and veracity and sometimes a sixth V for variability.

Another significant development in the history of big data was the launch of the Hadoop distributed processing framework. Hadoop was launched in 2006 as an Apache open source project. This planted the seeds for a clustered platform built on top of commodity hardware that could run big data applications. The Hadoop framework of software tools is widely used for managing big data.

By 2011, big data analytics began to take a firm hold in organizations and the public eye, along with Hadoop and various related big data technologies.

Initially, as the Hadoop ecosystem took shape and started to mature, big data applications were primarily used by large internet and e-commerce companies such as Yahoo, Google and Facebook, as well as analytics and marketing services providers.

More recently, a broader variety of users have embraced big data analytics as a key technology driving digital transformation. Users include retailers, financial services firms, insurers, healthcare organizations, manufacturers, energy companies and other enterprises.


20 Dec 2023

All Rights Reserved, Copyright 2010 - 2024, TechTarget | Read our Privacy Statement