
Consider This: The Big Data Workout

Martyn Jones

To begin at the beginning

Miss Piggy said, “Never eat more than you can lift”. That statement is no less true today, especially when it comes to Big Data.


The biggest disadvantage of Big Data is that there is so much of it, and one of the biggest problems with Big Data is that few people can agree on what it is. Overcoming the disadvantage of size is possible; overcoming the problem of understanding may take some time.



As I mentioned in my piece Taming Big Data, “the best application of Big Data is in systems and methods that will significantly reduce the data footprint.” In that piece I also outlined three conclusions:

  • Taming Big Data is a business, management and technical imperative.
  • The best approach to taming the data avalanche is to ensure there is no data avalanche – that is, to move the problem upstream.
  • The use of smart ‘data governors’ will provide a practical way to control the flow of high volumes of data.

“Data Governors”, I hear you ask, “What are Data Governors?”

Let me address that question.

Simply stated, the Data Governor approach to Big Data obesity is this:

  • The Big Data Governor’s role is to help in the purposeful and meaningful reduction of the ever-expanding data footprint, especially as it relates to data volumes and velocity (see Gartner’s 3Vs).
  • The reduction techniques are exclusion, inclusion and exception.
  • Its implementation is delivered through a development environment that can target hardware, firmware, middleware and software forms of hosting, with continuously monitored execution.

In short, it is a comprehensive approach to reducing the Big Data footprint whilst maintaining data fidelity.
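The exclusion / inclusion / exception trio lends itself to a simple filter pipeline. What follows is a minimal sketch of the idea, not the author's implementation; all names and rules are hypothetical:

```python
# Minimal sketch of a Data Governor rule engine (illustrative names only).
# Three reduction techniques: exclusion, inclusion and exception.

def make_governor(exclude, include, exception):
    """Build a record filter from three rule predicates.

    exclude:   records matching it are always dropped.
    include:   records matching it always pass through.
    exception: remaining records pass only if they are exceptional.
    """
    def governor(records):
        for record in records:
            if exclude(record):
                continue                  # drop known noise outright
            if include(record) or exception(record):
                yield record              # pass mandated or exceptional data
    return governor

# Example: drop self-test records, always keep alarms,
# otherwise keep only out-of-range values.
govern = make_governor(
    exclude=lambda r: r["type"] == "self_test",
    include=lambda r: r["type"] == "alarm",
    exception=lambda r: not (10.0 <= r["value"] <= 90.0),
)

readings = [
    {"type": "measure", "value": 50.0},    # in range -> dropped
    {"type": "self_test", "value": 99.0},  # excluded
    {"type": "alarm", "value": 55.0},      # included
    {"type": "measure", "value": 95.0},    # exceptional -> kept
]
kept = list(govern(readings))  # only the alarm and the out-of-range reading
```

Only two of the four records reach the downstream store; the footprint shrinks while the significant data keeps its fidelity.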

Here are some examples:

Integrated Circuit Wafer Testing

What’s this all about? Here’s an answer the good folk at Wikipedia cooked up earlier:

“Wafer testing is a step performed during semiconductor device fabrication. During this step, performed before a wafer is sent to die preparation, all individual integrated circuits that are present on the wafer are tested for functional defects by applying special test patterns to them. The wafer testing is performed by a piece of test equipment called a wafer prober. The process of wafer testing can be referred to in several ways: Wafer Final Test (WFT), Electronic Die Sort (EDS) and Circuit Probe (CP) are probably the most common.” (Link / Wikipedia)

Fig.1 – IC Fab Testing and the CE Data Governor

This exhibit shows where the Data Governor is placed in the Integrated Circuit fabrication and testing/probing chain.

In large plants, the IC probing process generates very large volumes of data at high velocity rates.

Based on exception rules, the Data Governor reduces the flow of data to the centralised data store.

It also improves velocity and reduces time to analysis.

Greater speed and less volume mean that production showstoppers are spotted earlier, potentially leading to significant savings in production and recovery costs.

Let’s look at some of the technical details:

  • Taking our example of the IC Fab test/probe chain, a Data Governor should be able to handle a hierarchy or matrix of designation and exception.
  • For example, a top-level Data Governor actor could be the Production Run actor.
  • The Production Run actor could designate and assign exception rules to a Batch Analysis actor.
  • In turn, the Batch Analysis actor could designate and assign exception rules to a Wafer Instance Analysis actor.
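The actor hierarchy above can be sketched as a chain of governors, each able to designate exception rules to a subordinate actor. This is a rough illustration only; the class, rule and field names are invented:

```python
# Sketch of a hierarchy of Data Governor actors (hypothetical names).
# Each actor holds an exception rule and can designate rules to children.

class GovernorActor:
    def __init__(self, name, rule=None):
        self.name = name
        self.rule = rule or (lambda m: False)  # default: nothing is exceptional
        self.children = []

    def designate(self, child, rule):
        """Assign an exception rule to a subordinate actor."""
        child.rule = rule
        self.children.append(child)
        return child

    def evaluate(self, measurement):
        """A measurement is significant if any actor in the chain flags it."""
        if self.rule(measurement):
            return True
        return any(c.evaluate(measurement) for c in self.children)

# Production Run -> Batch Analysis -> Wafer Instance Analysis
run = GovernorActor("production_run", lambda m: m["yield"] < 0.60)
batch = run.designate(GovernorActor("batch_analysis"),
                      lambda m: m["defect_rate"] > 0.05)
wafer = batch.designate(GovernorActor("wafer_instance"),
                        lambda m: m["probe_fails"] > 100)

m = {"yield": 0.92, "defect_rate": 0.01, "probe_fails": 140}
significant = run.evaluate(m)  # the wafer-level rule fires
```

A measurement that trips no rule at any level is never forwarded, which is exactly where the volume reduction comes from.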

The Internet of Things – IoT

Intrinsically linked to Big Data and Big Data Analytics, the Internet of Things (IoT) is described as follows:

“The Internet of Things (IoT) is the network of physical objects or “things” embedded with electronics, software, sensors and connectivity to enable it to achieve greater value and service by exchanging data with the manufacturer, operator and/or other connected devices. Each thing is uniquely identifiable through its embedded computing system but is able to interoperate within the existing Internet infrastructure.” (Link / Wikipedia)

Fig.2 – The Internet of Things and the CE Data Governor

This exhibit shows where the Data Governor is placed in the Internet of Things data flow.

The Data Governor is embedded into an IoT device, and functions as a data exception engine.

Based on exception rules and triggers, the Data Governor reduces the flow of data to the centralised or regionalised data store.

It also improves velocity and reduces time to analysis.

Greater speed and less volume mean that important signals are spotted earlier, quite possibly leading to more effective analysis and quicker time to action.
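One way such an on-device exception engine might behave, purely as a sketch: transmit a reading only when it moves outside a deadband around the last reported value, or when a heartbeat interval elapses. The deadband-plus-heartbeat policy is my assumption, not something the article specifies:

```python
# Sketch of an IoT-embedded exception engine (assumed policy:
# deadband on change, plus a periodic heartbeat transmission).

class IoTGovernor:
    def __init__(self, deadband, heartbeat_every):
        self.deadband = deadband            # minimum change worth reporting
        self.heartbeat_every = heartbeat_every  # max readings between sends
        self.last_sent = None
        self.since_sent = 0

    def should_send(self, value):
        self.since_sent += 1
        first = self.last_sent is None
        moved = (not first) and abs(value - self.last_sent) > self.deadband
        heartbeat = self.since_sent >= self.heartbeat_every
        if first or moved or heartbeat:
            self.last_sent = value
            self.since_sent = 0
            return True
        return False

gov = IoTGovernor(deadband=0.5, heartbeat_every=10)
stream = [20.0, 20.1, 20.2, 20.9, 20.8, 25.0]
sent = [v for v in stream if gov.should_send(v)]
# the first reading and the two deadband breaches are transmitted
```

Six readings become three transmissions; the centralised store still sees every meaningful change.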

Net Activity

Much play is made of the possibility that we will all be extracting golden nuggets from web server logs sometime in the near future. I don’t want to get into the business-value argument here, but I would like to describe a way of getting Big Data to shed the excess web-server-log bloat.

Fig.3 – Web Server Activity Logging and the CE Data Governor

This exhibit shows where the Data Governor is placed in the capture and logging of interactive internet activity.

The Data Governor acts as a virtual device written to by standard and customised log writers, and functions as a data exception engine.

Based on exception rules and triggers, the Data Governor reduces the flow of data from internet activity logging.

It also improves velocity and reduces time to analysis.

Greater speed and significantly reduced data volumes may lead to more effective and focused analysis and quicker time to action.
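As a sketch of the virtual-device idea, a governor sitting behind the log writers might apply the three rule types like this. The paths, status codes and thresholds are invented for illustration:

```python
# Sketch: the Data Governor as a virtual log sink. Assumed rules:
# exclude static-asset hits, always include server errors,
# treat unusually slow responses as exceptions.

import re

STATIC = re.compile(r"\.(css|js|png|jpg|ico)(\?|$)")

def govern_log(entries, slow_ms=2000):
    for entry in entries:           # entry = (path, status, duration_ms)
        path, status, ms = entry
        if STATIC.search(path):
            continue                # exclusion: asset noise
        if status >= 500:
            yield entry             # inclusion: errors always pass
        elif ms > slow_ms:
            yield entry             # exception: unusually slow requests

entries = [
    ("/index.html", 200, 120),
    ("/app.js", 200, 30),
    ("/checkout", 500, 640),
    ("/search?q=x", 200, 3400),
]
kept = list(govern_log(entries))  # the error and the slow search survive
```

The ordinary page view and the asset hit never leave the web tier, which is the web-server-log bloat being shed at source.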

Signal Data

Signal data can be a continuous stream of data originating from devices such as temperature and proximity sensors. By its nature it can generate high volumes of data at high velocity – it adds lots of data, very quickly.

Fig.4 – Signal Data and the CE Data Governor

This exhibit shows where the Data Governor is placed in the stream of continuous signal data.

The Data Governor acts as an in-line data-exception engine.

Based on exception rules and triggers, the Data Governor reduces the flow of signal data.

It also improves velocity and reduces time to analysis.

Greater speed and significantly reduced data volumes may lead to more effective and focused analysis and quicker time to action.
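For a continuous stream, one plausible in-line design (a hypothetical sketch, not the article's specification) passes threshold breaches through raw and collapses everything else into periodic summary records:

```python
# Sketch of an in-line governor for a continuous sensor stream.
# Assumed design: raw pass-through for threshold breaches, and a
# (min, mean, max) summary per window of N ordinary samples.

def govern_signal(samples, threshold, window=4):
    buf = []
    for s in samples:
        if abs(s) > threshold:
            yield ("exception", s)    # raw value, passed immediately
            continue
        buf.append(s)
        if len(buf) == window:
            yield ("summary", (min(buf), sum(buf) / window, max(buf)))
            buf.clear()
    if buf:                           # flush the final partial window
        yield ("summary", (min(buf), sum(buf) / len(buf), max(buf)))

out = list(govern_signal([1.0, 2.0, 3.0, 2.0, 50.0, 1.0],
                         threshold=10, window=4))
# one summary for the first four samples, one exception, one partial summary
```

Six samples become three records, and the one genuinely interesting value (the 50.0 spike) is delivered without delay or loss of fidelity.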

Machine Data

“Machine-generated data is information which was automatically created from a computer process, application, or other machine without the intervention of a human.” (Link / Wikipedia)

Fig.5 – Machine Data and the CE Data Governor

This exhibit shows where the Data Governor is placed in the stream of continuous machine generated data.

The Data Governor acts as an in-line data analysis and exception engine.

Exception data is stored locally and periodically transferred to an analysis centre.

Analysis of the totality of data of the same class and origin can be used to drive ANN (artificial neural network) and statistical analysis, which in turn can support (for example) the automatic and semi-automatic generation of preventive-maintenance rules.

Greater speed and significantly reduced data volumes may lead to more effective and focused analysis and quicker time to proactivity.
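The local-store-and-periodic-transfer pattern described above could be sketched as follows. The batch size, error codes and transfer hook are illustrative assumptions:

```python
# Sketch: exceptional machine events accumulate in a local buffer and
# are shipped to the analysis centre in periodic batches.

class MachineDataGovernor:
    def __init__(self, is_exception, batch_size, transfer):
        self.is_exception = is_exception
        self.batch_size = batch_size
        self.transfer = transfer      # callable that ships a batch upstream
        self.local = []               # the local exception store

    def ingest(self, event):
        if self.is_exception(event):
            self.local.append(event)
        if len(self.local) >= self.batch_size:
            self.transfer(list(self.local))
            self.local.clear()

shipped = []
gov = MachineDataGovernor(
    is_exception=lambda e: e["error_code"] != 0,  # non-zero code = exception
    batch_size=2,
    transfer=shipped.append,
)
for e in [{"error_code": 0}, {"error_code": 7}, {"error_code": 0},
          {"error_code": 3}, {"error_code": 9}]:
    gov.ingest(e)
# one batch of two exceptions shipped; one exception still buffered locally
```

The analysis centre then works on compact batches of exceptions rather than the full machine-data firehose.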

Other Applications of the Data Governor

The options are not endless and the prizes are not rich beyond the dreams of avarice, but there are some exciting possibilities out there, including applications in the trading, plant-monitoring, sport and climate-change ‘spaces’.

Fig.6 – Other Applications in the Big Data ‘space’ and the CE Data Governor

Summary

To wrap up, this is what the CE Data Governor approach looks like at a high level of abstraction:

  • Data is generated, captured, created or invented.
  • It is stored to a real device or virtual device.
  • The Data Governor (in all its configurations) acts as a data discrimination and data exception manager and ensures that significant data is passed on.
  • Significant data is used for ‘business purposes’ and to potentially refine the rules of the CE Data Governor.
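Reduced to code, the four-step loop above, including the feedback that refines the governor's rules, might look like this toy sketch. The rule-tightening heuristic is entirely invented:

```python
# Toy sketch of the cycle: generate -> store -> govern -> use,
# with significant data feeding back to refine the governor's rule.

def run_cycle(data, threshold):
    # Governor: a single exception rule selects significant data.
    significant = [x for x in data if x > threshold]
    # "Business purpose" stands in here for any downstream use; the
    # feedback refines the rule: if too much data passes, tighten it.
    if len(significant) > len(data) // 2:
        threshold *= 1.5
    return significant, threshold

data = [1, 9, 8, 7]
sig, new_threshold = run_cycle(data, threshold=5)
# three of four values passed, so the threshold is tightened for next time
```

The point of the feedback arm is that the governor's rules are not static: the significant data itself tells you when the filter has become too loose.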

To summarise the drivers:

  • We should only generate data that is required, that has value, and that has a business purpose – whether management oriented, business oriented or technical in nature.
  • We should filter Big Data, early and often.
  • We should store, transmit and analyse Big Data only when there is a real business imperative that prompts us to do so.

Moreover, we have a set of clear and justifiable objectives:

  • Making data smaller reduces the data footprint – lower cost, less operational complexity and greater focus.
  • The earlier you filter data the smaller the data footprint is – lower costs, less operational complexity and greater focus.
  • A smaller data footprint accelerates the processing of the data that does have potential business value – lower cost, higher value, less complexity and best focus.

Many thanks for reading.

Can we help? Leave a comment below, contact me via LinkedIn, or write to martyn.jones@cambriano.es


© 2008-25 SmartData Collective. All Rights Reserved.