
Intelligent Data Management and the lifecycle of information

Satyen Vyas | 01 Jun, 2010
We are living in the age of ‘information overload’. Companies rely heavily on enterprise planning systems such as ERP, SCM and CRM to automate and manage their resources. These systems generate and host vast amounts of data, each structured in its own way to fulfil a specific need. Apart from this structured data, enterprises also store massive volumes of unstructured data in the form of e-mail, IM, documents and images. Both kinds of information must be stored and retained within these systems to meet various strategic, business and regulatory requirements.

As information accumulates in production servers, system performance deteriorates; storage needs grow; and disaster recovery, backups and upgrades take longer, leading to extended business outages. Today’s dynamic enterprises cannot afford these outages. No wonder Charlie Garry of META Group says, “Data is growing at 125 percent per year. In a typical enterprise, up to 80 percent of this data is inactive and yet remains in production systems to cripple performance.” This is what I call the data tsunami, and it can be avoided by intelligent data management.

Enter Intelligent Data Management (IDM)


Information by itself is a resource, and enterprises need to plan effectively, putting the right combination of strategy, software and hardware tools in place to avoid a data tsunami. Apart from the new data generated every day, strict data retention policies and legal regulations requiring transactional data to be retained over long periods are fuelling data growth. These ever-increasing volumes of inactive data, retained for compliance, hurt application performance, limit data access and strain the storage infrastructure. The result is growing complexity in mission-critical IT environments and a growing concern among businesses. This is where the increasingly popular concept of intelligent data management (IDM) comes into the picture.

IDM helps companies strategize how to manage data through its lifecycle, from the time the data is generated or captured to the time it is deleted from the systems.
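
As a rough illustration, that lifecycle can be modeled as a small set of stages with permitted transitions. The Python sketch below is purely illustrative; the stage names and transition rules are assumptions for this article, not the model of any particular IDM product.

from enum import Enum

class Stage(Enum):
    ACTIVE = "active"      # newly generated or captured, in production
    INACTIVE = "inactive"  # rarely accessed, but still on production storage
    ARCHIVED = "archived"  # moved off production, held read-only
    DELETED = "deleted"    # retention period expired, removed from all systems

# Illustrative transition rules: data moves forward through the lifecycle,
# except that archived data may be restored to production for editing.
TRANSITIONS = {
    Stage.ACTIVE: {Stage.INACTIVE},
    Stage.INACTIVE: {Stage.ARCHIVED},
    Stage.ARCHIVED: {Stage.ACTIVE, Stage.DELETED},
    Stage.DELETED: set(),
}

def move(current: Stage, target: Stage) -> Stage:
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target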

The value of information keeps changing with time, processes, and business and regulatory needs, which in turn affects the probability that the data will be used. Data reuse and data deduplication have been key metrics of IDM, guiding the placement of data on different tiers to cost-effectively optimize the storage infrastructure and enhance performance. A well-planned IDM strategy allows the enterprise to retain all its reporting and access capabilities as if the data were still on the same server.

Analysts have been sifting through these experiences to distill best practices that can guide companies through the changing landscape of IDM. The experiences of various organizations point to several such practices.

Classifying data

Data retention policies are of key significance. Data valuation, otherwise called data classification, forms the foundation of successful and efficient information management. Retention policies need buy-in from all the entities that own or use the data. Classification, which helps organize the data onto different tiers, is probably the most important step in IDM.
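
A minimal sketch of such a classification rule follows. The classes, thresholds and regulatory flag are hypothetical; in practice they would come from the business and regulatory owners of the data.

from datetime import datetime, timedelta

def classify(last_accessed: datetime, under_regulation: bool) -> str:
    """Assign an illustrative value class from access age and retention need."""
    age = datetime.now() - last_accessed
    if age < timedelta(days=90):
        return "hot"         # actively used: belongs on production storage
    if under_regulation:
        return "compliance"  # inactive but must be retained for regulators
    if age < timedelta(days=365):
        return "warm"        # occasionally used: lower-cost tier
    return "cold"            # candidate for archival or eventual deletion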

Choosing the right storage tier

At a recent conference in California, database administrators complained that their senior management had misread hierarchical storage management (HSM) and wanted to remove Tier 1 (the production tier) from the IT environment entirely. But Tier 2 storage cannot handle the data requests of a real-time production environment; it is meant only for data that is rarely accessed. Tiering the data should aim at eliminating unnecessary load on the production servers, improving performance and achieving optimized storage utilization.
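
Continuing the hypothetical classes from the previous sketch, tier placement could look like the following. The point it encodes is that only Tier 1 serves real-time production traffic, so demoted data must be promoted back before such use.

# Illustrative class-to-tier map; class names follow the sketch above.
TIER_FOR_CLASS = {
    "hot": "tier1",          # production: low latency, real-time requests
    "warm": "tier2",
    "compliance": "tier2",
    "cold": "tier2",
}

def serve(record_class: str, realtime: bool) -> str:
    tier = TIER_FOR_CLASS[record_class]
    if realtime and tier != "tier1":
        # Tier 2 cannot serve a real-time production workload; the data
        # must first be promoted back to the production tier.
        raise LookupError("promote data to tier1 before real-time access")
    return tier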

Data deduplication

Deduplication is a storage-optimization technology that reduces the data footprint by eliminating multiple copies of redundant data and storing only unique data. Copies of the redundant data are replaced by references to the original data, sometimes called “pointers”. Deduplication can occur at the file, sub-file or block level. At the sub-file and block levels, data is commonly divided into smaller segments, which can be analyzed for redundancies more easily than whole files and stored more efficiently.

Deduplication can occur in primary storage, such as file-sharing (NAS) devices like the Dell NF and NX products; the NX4, for example, can help reduce the footprint of large file workloads with redundant or static data. However, secondary storage (i.e. backup data), with its vast amounts of redundant data, is currently receiving the majority of industry focus, for example in backup-to-disk (B2D) implementations. B2D is especially attractive because of the nature of backup data: typically the bulk of an organization’s data has not changed considerably since the last backup job, so storing copy after copy of the same file or data unnecessarily consumes resources in storage capacity, power, cooling and management.
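
To make the pointer idea concrete, here is a minimal sketch of block-level deduplication, assuming fixed-size blocks and SHA-256 hashes as block identifiers (real products often use variable-size chunking):

import hashlib

BLOCK_SIZE = 4096  # fixed-size chunking, for simplicity

class DedupStore:
    """Store each unique block once; a file becomes a list of pointers."""

    def __init__(self):
        self.blocks = {}  # block hash -> block bytes, stored only once
        self.files = {}   # file name -> ordered list of block hashes

    def write(self, name: str, data: bytes) -> None:
        pointers = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # duplicate blocks skipped
            pointers.append(digest)
        self.files[name] = pointers

    def read(self, name: str) -> bytes:
        return b"".join(self.blocks[digest] for digest in self.files[name])

Writing today’s backup of a file that is mostly unchanged from yesterday’s adds only the blocks that actually differ, which is why B2D workloads deduplicate so well.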

Restoring data

Businesses need to expect the unexpected and be prepared for any eventuality. Archived data is always kept in read-only mode for compliance reasons, so the archiving software must also allow data to be de-archived back into the production database without losing data integrity. This becomes necessary when archived data has to be edited (for example, during a product recall).
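
One way to guard integrity on the way back is to verify, before restoring, a checksum recorded when the data was archived. The record layout and checksum scheme below are assumptions for illustration:

import hashlib
import json

def dearchive(archived_record: dict, production: list) -> None:
    """Restore one archived record into production, verifying integrity first."""
    payload = json.dumps(archived_record["data"], sort_keys=True).encode()
    digest = hashlib.sha256(payload).hexdigest()
    if digest != archived_record["checksum"]:
        raise ValueError("integrity check failed; refusing to restore")
    production.append(archived_record["data"])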

Data security and compliance

The need to set appropriate user- and management-level access privileges grows as data is classified into tiers based on its value. Users should be given access to production, archive or both only as their responsibilities require. Sensitive data (e.g. financial and health data) also needs to be protected in production, archive and non-production environments (testing, development and outsourcing).
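
In code, such tier-scoped privileges might reduce to a check like the following; the roles and field names are invented for the example:

# Hypothetical role-to-tier privileges.
PRIVILEGES = {
    "clerk":   {"production"},
    "auditor": {"archive"},                 # compliance review only
    "dba":     {"production", "archive"},
}

SENSITIVE = {"salary", "health_record"}     # protected in every environment

def can_access(role: str, tier: str, field: str) -> bool:
    if tier not in PRIVILEGES.get(role, set()):
        return False
    # Sensitive fields need extra protection in every environment,
    # including test, development and outsourced copies of the data.
    if field in SENSITIVE and role != "dba":
        return False
    return True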

One of the key drivers for IDM is compliance. Various regulatory bodies across the world have been coming out with their own rules governing data retention. For today’s global companies, archiving software should allow any number of regulations to be incorporated, without one overriding another, and help achieve compliance.

Data integrity

IDM requires that data of any value be available for immediate access for reporting and compliance purposes. A few regulatory bodies also require all the tiered data, say production and archived data, to be accessible through the same application that created it. Such seamless online availability can be achieved only if data integrity and referential integrity are maintained during the hierarchical staging of data.
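
Referential integrity during archiving usually means that related rows must move as one unit. A small sketch, with an invented order/line-item schema:

def archive_order(order_id, orders, order_lines, archive):
    """Move an order and all its line items to the archive as one unit."""
    parent = orders.pop(order_id)
    children = [line for line in order_lines if line["order_id"] == order_id]
    for line in children:
        order_lines.remove(line)
    # Parent and children travel together: leaving a dangling child behind
    # would break referential integrity and any cross-tier reporting.
    archive[order_id] = {"order": parent, "lines": children}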

Many vendors are attacking the archive market from a packaged-application perspective (e.g. Oracle Applications, PeopleSoft and SAP). But most companies will need to archive more than a single application, so users should evaluate the scope of packaged solutions carefully. What companies need is a comprehensive enterprise archiving solution that covers both structured data, as in packaged applications like Oracle Apps, and unstructured data such as e-mail, IM and documents.

* Satyen Vyas is Director, Advanced Systems Group, Small & Medium Business (SMB), Dell India.
* The views expressed by the author in this feature are entirely his own and do not necessarily reflect the views of SME Times.

 