SmartData Collective

Here’s How Big Data Can Help Protect Infrastructure

Richard

Critical infrastructure (CI) is a term used by large businesses (and other organizations, like government agencies) to refer to the most important components within a given software architecture. Importance is usually defined by the needs of stakeholders – often a business’s customers, but sometimes finance or marketing departments that need to understand revenue or user behavior.

This is an extract from Artificial Intelligence for Big Data by Anand Deshpande and Manish Kumar.

As you can imagine, a digitized infrastructure produces a wealth of data. That’s great because this data can produce high-quality insights that aid decision making – but a digitized infrastructure can also be vulnerable to cyberattacks. For example, Stuxnet, a malicious worm discovered in 2010, targeted SCADA (Supervisory Control and Data Acquisition) systems and caused significant damage to fuel enrichment plants in Iran by interfacing with Programmable Logic Controllers (PLCs).

Fortunately, the wealth of data that modern digitized infrastructures provide can also be used to defend against cyberattacks. By using a combination of data from components within the infrastructure as well as data from outside of it, it’s possible to build a pretty robust defense mechanism.

Essentially, you need 4 things – ‘4 Vs’ – which together ensure that you can use big data to guard against cyberattacks:

● High volume of data
● A variety of data
● Data at velocity (speed of processing and availability)
● Valuable data – high quality data that is relevant

Those are the core components, but the process is important. We’ll look at that now.

Cybersecurity and big data: the process

A data-driven framework for cybersecurity has 3 elements, as the diagram below shows:

[Diagram: the three elements of the framework: data collection and analysis, anomaly detection, and corrective and preventive actions]

Let’s go through each of these steps.

Data collection and analysis

The systems that constitute critical infrastructure generate data assets in the form of event logs. The data collection stage is where you gather these logs from all the components, both software and hardware.

As well as data from components within your infrastructure, you also need contextual information, which you can only get by collecting data from outside your immediate infrastructure. Historical data is useful here, as it allows you to generate insights based on past events. With supervised learning techniques, for example, a model trained on past events can suggest the response that worked best in similar situations.

The data – or ‘log’ – that is generated by these various components can be categorized in a number of different ways:

Structured data: In a structured format, the individual elements (or attributes) of an entity are represented in a predefined and consistent manner across time periods. For example, the logs generated by web servers (HTTP logs) contain fields such as the IP address, the time the server finished processing the request, the HTTP method, the status code, and so on. All these attributes of a web request are represented consistently across requests. Structured data is relatively easy to process and does not require complex parsing and pre-processing before it is available for analysis, so processing is fast and efficient.
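
To make this concrete, here is a minimal sketch of parsing one structured HTTP log line in Python. It assumes the server writes the widely used Common Log Format; the pattern, the function name parse_http_log_line, and the sample line are illustrative assumptions, not something taken from the book.

```python
import re
from datetime import datetime

# Assumption: the server writes Common Log Format. A real deployment may be
# configured with a different layout, in which case the pattern must change.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_http_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert text fields into typed values so they are ready for analysis.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Hypothetical sample line using a documentation-only IP address.
sample = '203.0.113.7 - - [10/Oct/2018:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
record = parse_http_log_line(sample)
print(record["ip"], record["method"], record["status"])
```

Because every request is written with the same fields in the same order, a single pattern is enough, which is why structured logs are cheap to process.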

Unstructured data: This is a free-flowing application log format that does not follow any predefined structural rules. These logs are typically generated by applications and are meant to be read by someone who is troubleshooting an issue. The intention is to log events without an explicit goal of making the logs machine readable. These logs require extensive preprocessing, parsing, and some form of natural language processing before they are available for analysis.

Semi-structured data: This is a combination of structured and unstructured data where some of the attributes within structured format are represented in an unstructured manner. The information is organized into fields which can be easily parsed but the individual fields need additional preprocessing before being used in analysis.
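
A small sketch can show what that extra preprocessing looks like. The pipe-delimited layout, the field names, and the sample line below are hypothetical; the point is only that the outer fields split cleanly while the free-text message needs a second pass.

```python
import re

# Loose pattern for IPv4-looking tokens; it does not validate octet ranges.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def parse_semi_structured(line):
    """Split the structured fields, then mine the free-text message."""
    # Structured part: three delimited fields followed by a free-text message.
    timestamp, level, component, message = [
        part.strip() for part in line.split("|", 3)
    ]
    return {
        "timestamp": timestamp,
        "level": level,
        "component": component,
        "message": message,
        # Unstructured part: pull out details buried in the message text.
        "ips": IPV4_PATTERN.findall(message),
    }


# Hypothetical log line in an assumed "time | level | component | message" layout.
line = "2018-10-10T13:55:36Z | WARN | auth | 3 failed logins for user admin from 198.51.100.23"
event = parse_semi_structured(line)
print(event["level"], event["ips"])
```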

Anomaly detection

As you start gathering data, patterns begin to form within it. These patterns generally remain consistent, but there can be some fluctuations. For example, an online retailer might expect more orders during the holiday season.

However, when there is an unanticipated change in the regular pattern of the data, anomaly detection will trigger an alert and notification. One of the most important characteristics of reliable and accurate anomaly detection is that it generates the alert as soon as the event occurs, with minimum lag between the event time and the alert/notification time.
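
One simple way to approximate this behavior is a rolling z-score check that evaluates every reading the moment it arrives. The sketch below is an illustration only: the class name, window size, threshold, and sample readings are assumptions, and a production system would likely use a more robust model.

```python
from collections import deque
from statistics import mean, stdev


class RollingZScoreDetector:
    """Flag values that sit far outside the recent window of normal readings."""

    def __init__(self, window=20, threshold=3.0, min_history=5):
        self.values = deque(maxlen=window)
        self.threshold = threshold
        self.min_history = min_history

    def observe(self, value):
        """Return True if the value is anomalous relative to recent history."""
        is_anomaly = False
        if len(self.values) >= self.min_history:
            mu = mean(self.values)
            sigma = stdev(self.values)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        # Keep anomalies out of the baseline so they do not distort it.
        if not is_anomaly:
            self.values.append(value)
        return is_anomaly


# Hypothetical stream of readings with a single spike.
readings = [100, 102, 98, 101, 99, 100, 103, 97, 500, 101]
detector = RollingZScoreDetector()
alerts = [i for i, value in enumerate(readings) if detector.observe(value)]
print(alerts)
```

Since each reading is checked as it is observed, the lag between event and alert is limited to the time it takes the reading to reach the detector.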

The diagram below illustrates the ideal, reliable, and unreliable anomaly detection components based on the time difference between the event and alert time:

[Diagram: ideal, reliable, and unreliable anomaly detection, compared by the time difference between the event and the alert]

Corrective and preventive actions

When suspicious activity is detected, there are two ways to respond. In the first case, the alert/notification requires manual intervention in order to trigger the corrective action. In the second case, the system itself takes some corrective action based on the context and the acceptable threshold of the error margin.

For example, if an attack on thermostat circuitry starts raising the temperature of a cold storage unit unexpectedly, the system can switch control to an alternate thermostat and bring the temperature back to normal levels. This component can use supervised learning as well as reinforcement learning algorithms to trigger corrective actions on its own, based on historical data or a reward function. Once the correction is applied and the infrastructure’s state is restored to normal, the system needs to analyze the root cause and train itself to take preventive actions.
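
The two response modes can be sketched with a simple rule-based policy. This is a stand-in for the learned policy the text describes, not an implementation of it: the class, the thresholds, and the choice to escalate very large deviations to a human operator are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ColdStorage:
    """Hypothetical cold storage unit with a primary and a standby thermostat."""
    active_thermostat: str = "primary"
    standby_thermostat: str = "standby"

    def fail_over(self):
        # Swap control to the alternate thermostat.
        self.active_thermostat, self.standby_thermostat = (
            self.standby_thermostat,
            self.active_thermostat,
        )


def respond(storage, temperature_c, expected_c=-18.0,
            tolerance_c=2.0, auto_limit_c=5.0):
    """Choose a response based on how far the reading is from the expected value."""
    deviation = abs(temperature_c - expected_c)
    if deviation <= tolerance_c:
        return "no_action"
    if deviation <= auto_limit_c:
        # Within the assumed error margin for automatic handling.
        storage.fail_over()
        return "switched_thermostat"
    # Assumption: beyond the automatic limit, a human operator is alerted.
    return "manual_intervention"


storage = ColdStorage()
print(respond(storage, -17.5))
print(respond(storage, -14.0), storage.active_thermostat)
print(respond(storage, 5.0))
```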

Mapping the model on to a big data infrastructure

In typical Big Data environments you will find a layered architecture. Layers within the data processing pipeline help in decoupling various stages through which the data passes to protect the infrastructure.

How the data ‘flows’ through our schema can be seen in the diagram below, which also shows where different frameworks fit into the process:

[Diagram: data flow through the layered architecture, with the frameworks that fit at each stage]
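
The decoupling that layers provide can be sketched in a few lines of Python, with each stage written as an independent function that only depends on the output of the previous one. The stage names, the record format, and the CPU limit are hypothetical; in a real big data environment each stage would typically be backed by a dedicated framework.

```python
def ingest(raw_lines):
    """Collection layer: drop empty lines and pass the rest along."""
    for line in raw_lines:
        line = line.strip()
        if line:
            yield line


def parse(lines):
    """Processing layer: turn 'host,cpu' text into typed records."""
    for line in lines:
        host, value = line.split(",")
        yield {"host": host, "cpu": float(value)}


def detect(events, limit=90.0):
    """Analysis layer: mark records whose CPU usage exceeds an assumed limit."""
    for event in events:
        yield {**event, "anomaly": event["cpu"] > limit}


def act(events):
    """Action layer: collect the hosts that need a corrective response."""
    return [event["host"] for event in events if event["anomaly"]]


# Hypothetical raw input; each layer can be replaced without touching the others.
raw = ["web-1,35.0", "", "web-2,97.5", "db-1,42.1"]
flagged = act(detect(parse(ingest(raw))))
print(flagged)
```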

Clearly, big data is going to play a big part in cybersecurity in the future. As threats grow, using the data at our disposal to recognise those threats quickly is going to be vital.

If you want to learn more about the relationship between big data, artificial intelligence, and cybersecurity, check out Artificial Intelligence for Big Data from Packt.
Get the eBook for just $10 (limited period offer).

Richard Gall is co-editor of the Packt Hub. He’s interested in politics, tech culture, and how software is being used by modern businesses.
