Understanding the Compliance and Security Risks of Unstructured Data
Managing the Exponential Growth of Data
With the increased use of mobile technology such as smartphones and tablets, innovations in mobile networks and WiFi, and the growth of the Internet of Things (IoT) and smart devices, the creation and consumption of data are constantly accelerating. And they are doing so mind-bogglingly fast.
According to Domo, and the majority of search results after a quick Google investigation, an estimated 2.5 quintillion bytes of data are created each day, and 90 percent of the data in the entire world was generated over the last two years. That’s 1.7MB of data created every second by every individual throughout 2020. IDC predicts the total sum of the world’s data will be around 175 zettabytes by 2025, up from 33 zettabytes in 2018. That’s an awful lot of data.
Unstructured Data: Fast Facts
- 1.7MB of data is created every second by every person during 2020.
- In the last two years alone, an astonishing 90% of the world’s data has been created.
- 2.5 quintillion bytes of data are produced by humans every day.
- 463 exabytes of data will be generated each day by humans by 2025.
- 95 million photos and videos are shared every day on Instagram.
- By the end of 2020, 44 zettabytes will make up the entire digital universe.
- Every day, 306.4 billion emails are sent, and 500 million Tweets are made.
Source: https://techjury.net/blog/how-much-data-is-created-every-day/#gref
A More Focused View
While these global statistics present astronomical numbers, what if we take a more focused view?
How are you handling data growth at your own workplace?
Almost 90% of data owned by organisations is now estimated to be unstructured, and it is growing at 55-65% each year. And, as we know, unstructured data creates security and compliance risks.
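To put that 55-65% annual growth rate into perspective, here is a minimal sketch of how it compounds. The 100 TB starting volume and the five-year horizon are hypothetical figures chosen purely for illustration; they do not come from any of the sources quoted here.

```python
def project_growth(start_tb: float, annual_rate: float, years: int) -> float:
    """Compound a starting data volume by a fixed annual growth rate."""
    return start_tb * (1 + annual_rate) ** years


# Hypothetical starting point: 100 TB of unstructured data today.
START_TB = 100
YEARS = 5

for rate in (0.55, 0.65):
    projected = project_growth(START_TB, rate, YEARS)
    print(f"{rate:.0%} annual growth: {projected:,.0f} TB after {YEARS} years")
```

Under those assumptions, the same estate would be roughly nine to twelve times larger after five years.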
Consider all the documents, spreadsheets, photos, videos, audio, web pages, text files, social media, presentations and more that can contain sensitive or personally identifiable information (PII). You and your colleagues may have saved these as unencrypted files with no password protection and then stored them in a folder on OneDrive or SharePoint, because you think it’s secure. That’s a lot of unstructured data floating around the organisation.
Data Classification and Compliance Risks
And when we say floating, we mean floating. Why? Because unstructured data doesn’t live in a database or have a pre-defined data model or schema. Unstructured data is difficult to classify and difficult to manage, and it is difficult to determine exactly what it contains, especially when it’s video content or a spreadsheet.
Effectively, this situation presents a data governance risk – particularly in highly regulated industries such as financial and legal services, healthcare, education and other public sector organisations that have a duty to comply with specific data protection legislation and regulations.
And that’s not all: there is also a security risk. Many existing data classification tools can’t tell you if, for example, a Microsoft Word file is infected by a macro virus. So not only do you need to be able to classify your unstructured data across your cloud environment and identify any files containing PII or sensitive data, but you also need to be able to scan that information for security threats.
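To give a feel for what identifying PII in unstructured content involves at its simplest, here is a rough sketch that counts pattern matches in text that has already been extracted from a file. The patterns, the category names and the `classify_text` function are all hypothetical and deliberately simplistic. A real classification tool would need far broader coverage, would have to extract text from many file formats first, and would scan for malware separately.

```python
import re

# Illustrative patterns only (assumptions, not a complete PII definition).
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "uk_national_insurance": re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),
    "payment_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def classify_text(text: str) -> dict[str, int]:
    """Return the number of matches per PII category found in the text."""
    counts: dict[str, int] = {}
    for label, pattern in PII_PATTERNS.items():
        hits = pattern.findall(text)
        if hits:
            counts[label] = len(hits)
    return counts


sample = "Contact jane.doe@example.com, NI number QQ123456C, card 4111 1111 1111 1111."
print(classify_text(sample))
```

Even this toy version shows why the problem is hard: it only works on plain text, so video, images and spreadsheets would each need their own extraction step before any pattern could be applied.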
Cloud Data Governance Challenges
As hybrid working becomes the norm, migration from on-premises infrastructure to the cloud continues to grow, and it ultimately represents yet another challenge for data governance.
To adapt to the recent pandemic and the sudden increase in remote working, many organisations have rapidly undertaken a “lift and shift” operation, simply migrating all their unstructured data from on-premises systems to cloud platforms such as Office 365. Of course, visibility remains an issue. It’s just the same problem in a different place.
There are data classification tools available, but many were designed pre-cloud and have gaps in functionality and capability when it comes to meeting modern requirements. They are also typically not compatible with the latest file formats, and they are unable to redact personal or sensitive information as they classify in order to keep data secure.
The growth of unstructured data is not slowing down. Organisations must get to grips with data classification and governance across their cloud environments sooner rather than later, in order to identify and protect sensitive information and avoid costly or damaging security and compliance breaches.