Dark Data Within Enterprises
- October 3, 2022
- Posted by: Aanchal Iyer
- Category: Data Science
Introduction
Let us first understand the concept of dark data. Dark data is all of the idle, unidentified, and unused data spread across an organization, created as a by-product of users’ daily interactions with countless systems and devices. It includes everything from server log files to machine data to unstructured data from social media. As big data continues to grow rapidly, so does the amount of dark data. Many organizations assume that this data cannot provide any value, is redundant or incomplete, or cannot be accessed with the tools available. Often, companies are not even aware that it exists. However, dark data could be one of an organization’s most significant resources. Data is increasingly a major organizational asset, and organizations need to tap into its full value.
According to a recent global survey of close to 1,300 business, IT, and executive leaders, dark data is estimated to make up 55% or more of the data held by most organizations.
More on Dark Data
Whatever the exact percentage, dark data hinders daily and strategic decision-making, increases organizational risk, reduces employee efficiency, and lengthens the timelines of critical initiatives. Many organizations are using data intelligence software to combat dark data and make full use of available data to gain better insights. Such applications help data governance and IT teams automatically harvest and enrich technical metadata from data sources across the enterprise into one central repository, add business context for clarity, introduce guardrails for protection and use, and offer a mechanism to share and leverage the data for tactical and strategic value. However, to generate value, data must meet a variety of daily needs and yield the same enterprise “data truths” when viewed through different organizational lenses. The following are a few examples:
Information Technology
IT data custodians and stewards need visibility to identify and manage all of the organization’s data assets competently. Data intelligence software reduces the amount of dark data hidden across the company and extends the reach of data custodians and stewards, helping the enterprise use and protect its data. By pairing data visibility with governance, custodians and stewards can operationalize policies and rules more accurately.
Data Governance Teams
To offer guidance and business context, governance teams need a good degree of data visibility. They also need the ability to quickly associate business data with the technical assets they manage. The AI-assisted discovery capabilities available in data intelligence applications enable data stewards to identify potential asset matches, saving valuable time in asset association and data classification.
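To make the idea of asset matching concrete, here is a minimal sketch in Python. It is not how any particular data intelligence product works: it uses plain string similarity from the standard library as a simple stand-in for AI-driven discovery. The glossary terms, asset names, the `suggest_matches` helper, and the 0.6 threshold are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase a name and turn separators into spaces ('CUST_EMAIL' -> 'cust email')."""
    return name.lower().replace("_", " ").replace("-", " ").strip()


def suggest_matches(business_terms, technical_assets, threshold=0.6):
    """Return {term: [(asset, score), ...]} for assets whose name resembles the term.

    Assumption of this sketch: assets are dotted names such as
    'schema.table.column', and only the last part is compared.
    """
    suggestions = {}
    for term in business_terms:
        scored = []
        for asset in technical_assets:
            column = asset.split(".")[-1]
            score = SequenceMatcher(None, normalize(term), normalize(column)).ratio()
            if score >= threshold:
                scored.append((asset, round(score, 2)))
        # Best candidates first, so a data steward reviews the likeliest match first
        suggestions[term] = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return suggestions


# Hypothetical business glossary and harvested technical assets
glossary = ["Customer Email", "Order Date"]
assets = [
    "crm.customers.customer_email",
    "sales.orders.order_date",
    "sales.orders.ship_cost",
]
matches = suggest_matches(glossary, assets)
for term, candidates in matches.items():
    print(term, "->", candidates)
```

The output is only a list of suggestions. In practice a data steward would still confirm or reject each proposed association.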
Within the Business
Business analysts and other business users need instant visibility into data. This visibility can point to the best data to use for certain decisions or analysis. It is essential that users can access a current enterprise data catalog that offers business context and suitable guidelines.
Conclusion
The need for enterprise data visibility depends significantly on the organizational perspective. Data intelligence can facilitate data identification and simplify navigation. It can also promote an understanding of data assets, data governance protocols, and business context. AI, machine learning, and RPA (robotic process automation) can help IT teams set the policies for appropriate data access. AI and RPA can also assist with:
- Automating data discovery.
- Shining light on dark data.
- Enabling businesses to make faster decisions.
- Delivering smarter insights and achieving better business outcomes.
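As a rough illustration of what automated data discovery can look like at its simplest, the Python sketch below walks a directory tree, harvests basic technical metadata (path, size, and time since last modification), and flags files that have sat untouched for a long time as dark data candidates. The `find_dark_candidates` function, the one-year idle threshold, and the example path are assumptions made for this sketch. Real data intelligence tools cover many more sources and signals.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_dark_candidates(root, max_idle_days=365, now=None):
    """Walk `root` and return metadata for files not modified within `max_idle_days`.

    Assumption of this sketch: 'idle' simply means an old modification time.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    candidates = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        stat = path.stat()
        if stat.st_mtime < cutoff:
            candidates.append({
                "path": str(path),
                "size_bytes": stat.st_size,
                "idle_days": int((now - stat.st_mtime) // SECONDS_PER_DAY),
            })
    # Longest-idle files first
    return sorted(candidates, key=lambda c: c["idle_days"], reverse=True)


# Example usage with a hypothetical log directory:
# for item in find_dark_candidates("/var/log/myapp", max_idle_days=180):
#     print(item["idle_days"], item["size_bytes"], item["path"])
```

A list like this is only a starting point. Whether a flagged file is valuable, redundant, or risky is still a governance decision.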