We have all seen the explosive growth of data. According to International Data Corp (IDC), data is expected to grow at a five-year compound annual growth rate of 26% through 2024, reaching an estimated 149 zettabytes worldwide, up from 59 zettabytes in 2020. What are the challenges resulting from the amount of data retained, and what are the benefits of data minimization? How can organizations overcome such overwhelming growth to ensure that risks are reduced and data is used efficiently? An effective information governance program can help to overcome the challenges and provide benefits such as operational efficiencies and improved compliance. Let’s take a closer look at the challenges and how information governance (IG) can help solve them.
Challenge #1: Too Much Data, Not Enough Time
Most chief information security officers (CISOs) and records and information management (RIM) practitioners will agree that the sheer volume of data owned by their organizations poses risk, on many levels, to their companies. Large data volumes mean a wider surface area for cyberattacks, and IT must continually source and manage storage infrastructure. But large volumes also mean delayed response times to regulators, courts, and auditors, and increasing costs for storage and management. So it’s not just CISOs and RIM practitioners who struggle with data volumes:
- Privacy officers need to ensure that responses to data subject access requests are fulfilled in a timely manner
- The legal department needs to respond accurately to litigation and eDiscovery demands
- Risk and compliance must satisfy regulatory and audit requests, often on a tight timeline
One way to deal with data growth is to implement an information governance framework. A successful IG program covers all phases of the information lifecycle, including creation, protection, storage, access, use, and deletion, along with the operational activities and requirements associated with each phase. Ensuring that there are sound governance rules around an organization’s data retention (what needs to be kept and what can be deleted) will go a long way toward overcoming the challenges posed by growing data volumes.
Challenge #2: We Know We Have It, But How Do We Find It?
Finding the data you need when you need it is one of the critical challenges of data growth. Like the proverbial “needle in a haystack,” locating data that is responsive to a specific request may appear to be nearly impossible. One way to solve this challenge is to ensure that the data has been classified. Data can be classified by its purpose, use, or information type (such as accounting or financial data). Organizing the data and ensuring that it is appropriately identified with metadata (the “data about the data”) is critical.
A key IG tool for governing data is to create a data taxonomy, which organizes the data into categories and sub-categories that are relevant to the organization. The taxonomy shows the categories of data, while another tool, the data map, links these classifications to where the data is stored. Both are important to ensuring data can be located when it is needed.
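To make the relationship between the two tools concrete, here is a minimal Python sketch. It assumes a two-level taxonomy and a data map that is simply a list of classification-to-location entries; every category, sub-category, and repository name is a hypothetical placeholder, not a recommended scheme.

```python
from dataclasses import dataclass

# Hypothetical taxonomy: category -> sub-categories (illustrative names only).
TAXONOMY = {
    "Finance": ["Accounts Payable", "Tax Filings"],
    "Human Resources": ["Personnel Files", "Payroll"],
}


@dataclass(frozen=True)
class DataMapEntry:
    """Links one taxonomy classification to one place the data is stored."""
    category: str
    sub_category: str
    location: str  # hypothetical repository name


# Hypothetical data map: the same classification may live in several places.
DATA_MAP = [
    DataMapEntry("Finance", "Tax Filings", "finance-share/tax"),
    DataMapEntry("Finance", "Tax Filings", "erp-archive"),
    DataMapEntry("Human Resources", "Payroll", "hr-system"),
]


def locations_for(category, sub_category):
    """Return every mapped storage location for one taxonomy classification."""
    if sub_category not in TAXONOMY.get(category, []):
        raise KeyError(f"{category} / {sub_category} is not in the taxonomy")
    return [
        entry.location
        for entry in DATA_MAP
        if entry.category == category and entry.sub_category == sub_category
    ]


print(locations_for("Finance", "Tax Filings"))
# ['finance-share/tax', 'erp-archive']
```

A classification that is in the taxonomy but has no data map entry returns an empty list, which is itself a useful signal that the map is incomplete.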
Challenge #3: We Really Need to Keep It!
Deciding what needs to be retained can pose a significant challenge to an organization that is dealing with a crush of data. There can be a tension between those in records management and information security who advocate for only retaining what is legally required, and those in the business or IT who want to keep everything for data analytics and client data mining. Keeping everything is generally not the best approach, since it can cause delayed response times, an inability to locate what is needed, and other risks discussed in this article. Moreover, many data privacy regulations, such as the EU’s General Data Protection Regulation (GDPR), contain requirements for storage limitation, meaning that companies must establish retention periods for data types and disclose these to consumers at the time of collection.
Once a company has decided how to classify the data, the next step in an IG program is to apply retention requirements to the classifications in the taxonomy. Some data is legally required to be retained for a specific period by national, state, or local law. A company may also choose to set a retention period for certain business data that does not have a legal requirement but has value for a certain length of time. Creating a retention schedule that shows the classification of the data and the legal or business requirements for how long it must be kept imposes a structure on the data. That structure permits the organization to take the next critical step in governing the data: deleting or disposing of it when it has met its legal requirement or business purpose.
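As a rough illustration, a retention schedule can be thought of as a lookup from classification to a retention period and the basis for it. The Python sketch below assumes retention is counted in whole years from a trigger date (for example, the date a record was closed); the classifications and periods are made-up placeholders, not actual legal requirements.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionRule:
    classification: str
    retention_years: int
    basis: str  # "legal" or "business"


# Hypothetical schedule: the periods below are placeholders for illustration.
SCHEDULE = {
    "Tax Filings": RetentionRule("Tax Filings", 7, "legal"),
    "Marketing Drafts": RetentionRule("Marketing Drafts", 2, "business"),
}


def disposal_date(classification, trigger):
    """Return the date on which the retention period for a record ends."""
    rule = SCHEDULE[classification]
    target_year = trigger.year + rule.retention_years
    try:
        return trigger.replace(year=target_year)
    except ValueError:
        # Trigger was 29 February and the target year is not a leap year.
        return trigger.replace(year=target_year, day=28)


def is_eligible_for_disposal(classification, trigger, today):
    """True once the record has met its legal or business retention period."""
    return today >= disposal_date(classification, trigger)


print(disposal_date("Tax Filings", date(2017, 3, 15)))
# 2024-03-15
```

Keeping the basis ("legal" or "business") next to each period makes it easier to see which periods are fixed by law and which the organization is free to revisit.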
Challenge #4: What If We Need It Someday?
For data deletion to be effective, organizations must overcome their employees’ reluctance to get rid of data they “might need someday.” People often have pride of authorship or are concerned that they may not be able to recreate something that has been purged according to retention policy. While a retention schedule is a fine tool for indicating when data should be purged, it will only be effective when there are policies and processes in place to ensure the disposal is operationalized throughout the organization.
Technology can be applied to ensure data is deleted per retention requirements, provided the organization has an effective process for setting aside data responsive to regulatory, tax, or eDiscovery requests; in other words, a legal hold process. Speaking of legal holds, some institutions decide that the best way to respond to litigation is to retain everything, just in case. This approach is less effective than selectively segregating and holding only the data that is subject to the eDiscovery request, since retaining everything leads to much higher collection and review expenses. Integrating an effective legal hold process into the overall IG program ensures that only responsive data is held.
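The interplay between automated deletion and legal holds can be sketched in a few lines of Python. This is a simplified illustration, assuming each record carries a disposal date and a set of legal matter identifiers it has been tagged as responsive to; the `Record` fields and matter IDs are hypothetical, and no particular product works exactly this way.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    record_id: str
    disposal_date: date  # date the retention period ends
    matter_ids: frozenset = frozenset()  # hypothetical matters this record is responsive to


def plan_disposal(records, active_matters, today):
    """Split records past retention into (to_delete, held) lists of IDs.

    Only records tied to a currently active matter are held; everything
    else that has met its retention period is queued for deletion.
    """
    to_delete, held = [], []
    for record in records:
        if record.disposal_date > today:
            continue  # still inside its retention period, leave untouched
        if record.matter_ids & active_matters:
            held.append(record.record_id)
        else:
            to_delete.append(record.record_id)
    return to_delete, held


records = [
    Record("r1", date(2023, 1, 1)),
    Record("r2", date(2023, 1, 1), frozenset({"M-1"})),
    Record("r3", date(2030, 1, 1)),
]
print(plan_disposal(records, {"M-1"}, date(2024, 1, 1)))
# (['r1'], ['r2'])
```

Once matter M-1 closes and is removed from the active set, r2 becomes eligible for deletion on the next run, so the hold is selective rather than a blanket "retain everything."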
Challenge #5: Copies, Copies, Copies
Copies of data not only add to the corporate storage burden, but also complicate responsiveness to eDiscovery, regulatory inquiries, and privacy data subject access requests (DSARs) by greatly increasing the amount of data a company needs to sort through. Legacy data repositories where obsolete data is housed also pose risks and costs. Deploying technology to de-duplicate data and store only one copy can be instrumental in slimming down the data pool and reducing redundant data. Ensuring that outdated repositories are purged of stale data will also improve an organization’s data profile and operational efficiency.
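One common way de-duplication works is to compare a cryptographic hash of each file's content, so that byte-for-byte copies are detected regardless of file name. The Python sketch below illustrates that idea under simplifying assumptions: files are represented as an in-memory mapping of name to bytes, and the file names are hypothetical. It finds exact copies only and would not catch near-duplicates such as lightly edited versions.

```python
import hashlib


def deduplicate(files):
    """Identify exact duplicate files by SHA-256 content hash.

    `files` maps a file name to its content as bytes. Returns a tuple
    (kept, duplicates): the names retained (first occurrence of each
    distinct content) and a mapping of duplicate name -> retained name.
    """
    first_seen = {}  # content hash -> name of the copy that is kept
    duplicates = {}
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            duplicates[name] = first_seen[digest]
        else:
            first_seen[digest] = name
    return list(first_seen.values()), duplicates


# Hypothetical file set for illustration.
files = {
    "a.txt": b"quarterly report",
    "copy of a.txt": b"quarterly report",
    "b.txt": b"meeting notes",
}
kept, duplicates = deduplicate(files)
print(kept)        # ['a.txt', 'b.txt']
print(duplicates)  # {'copy of a.txt': 'a.txt'}
```

Recording which retained copy each duplicate maps to, rather than silently dropping it, leaves an audit trail for eDiscovery or regulatory questions about what was removed.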
How IG Can Help Reduce the Data Burden
An information governance program provides a structure for documenting and managing data collection, use, sharing, retention, and disposal policies across the organization. It identifies where the data is stored, making it easier to find the data you need. Policies, processes, and technologies implemented under an IG operating model can bring benefits such as improved data control, operational efficiency, and improved compliance with laws and regulations.
To learn more about how IG can help you minimize your growing pool of data, please contact us.