By Hal Kreitzman
The dramatic growth in storage requirements has led vendors to refocus their tools, strategies and practices in order to help organisations manage data more efficiently
Storage requirements and capabilities have grown dramatically over the last 15 years, but while requirements have often outpaced storage capabilities, recent developments in technology and methodology may help to close that gap.
The huge increases in storage have made it necessary to refocus on the tools, strategies and practices that are used by enterprises to manage data resources and forced companies to pay more attention to the lifecycle of information.
The rise has been driven by the use of and reliance on electronic information, increasing pressure by legislative and regulatory agencies to keep and categorise data, as well as the increased risk of business outages due to unauthorised access, data tampering, environmental failures, meteorological disturbances, terrorist activity and pandemic situations.
Many technology characteristics of storage systems, such as scalability, capacity, performance and security, have improved in the last decade or so.
However, in certain key areas those improvements have not kept up with computational and access requirements. During this period storage systems’ scalability has expanded significantly. From a hardware perspective, there are many systems today where you can upgrade disk drives ‘on the fly’ with hot-swappable drives. Software enables scalability through the use of new methodologies like on-demand and virtualisation.
Capacity has also increased dramatically during this time.
As an example, a PC disk drive’s capacity has grown from considerably less than 1Gbyte in 1990 to over 300Gbytes in 2005, a more than 300-fold increase in 15 years. Some storage systems released by EMC can hold up to 1Tbyte of storage.
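As a rough sanity check on that capacity growth, the factor can be computed directly. The 1990 baseline below is an illustrative assumption (the article only says "considerably less than 1Gbyte"), not a figure from the text:

```python
# Rough check of the PC disk capacity growth cited above.
# Assumes a typical 1990 PC drive of ~40 Mbytes (an illustrative
# figure, not from the article) against a 300-Gbyte drive in 2005.
early_mb = 40          # 1990-era PC drive, in Mbytes (assumed)
late_mb = 300 * 1024   # 300 Gbytes, in Mbytes
growth_factor = late_mb / early_mb
print(f"{growth_factor:,.0f}x growth")  # prints 7,680x growth
```

Even with a generous baseline of a full gigabyte in 1990, the growth is over 300-fold, far more than a few hundred per cent.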
Security of storage systems has also undergone huge changes in an effort to combat today’s threats. Information needs to be appropriately secured to maintain availability, resiliency, and chain of ownership.
Storage is now subjected to all sorts of security challenges, including unauthorised data access, modification, destruction, or theft, denial of service (DoS) attacks, malware, and hardware theft.
As a result of the increase in security-related incidents, several vendors have recognised that security has to be integrated with their solutions.
For example, EMC has responded with its digital rights management (DRM) feature in its Documentum solution, while Sun Microsystems has integrated security at key points in its information lifecycle offerings.
However, the growth in scalability, capacity and security has not been matched by that of performance. The relative performance of data access across storage media has not kept pace with performance improvements in other computing areas.
Since 1977, CPU performance has increased around two million times. During the same time frame, data transfer rates have only increased 266 times for RAID and 30 times for disk. This problem continues to present challenges for network, hardware and software designers to meet business requirements.
Vendors and service providers are constantly looking for ways to improve technology. Some of the responses to the need for constant advances in scalability, capacity, security, and performance — such as information lifecycle management (ILM) — have been around for some time.
Other approaches have borrowed concepts developed in other technological areas, such as object-based storage devices’ (OSDs) adoption of ideas from application development.
Other approaches, such as storage grids, are the result of the cyclical shift from centralised to decentralised and back to centralised computing philosophies.
However, all of the ideas, concepts and products address specific issues raised by the massive growth and reliance on storage technology to remain scalable, meet and exceed capacity requirements and maintain high levels of performance and security.
ILM has been around for some time, but until recently has not gained much traction.
The increased focus on legislative and regulatory requirements to keep, monitor, audit, and classify information has made ILM important. At the same time, the increase in information volume has made it necessary to implement traditional tiered-storage structures, and hierarchical storage management (HSM).
ILM is a combination of processes and technologies that determines how data flows through an environment.
These processes include the creation and acquisition of data, its publication, and its retention and disposal, in terms of how long data is maintained and what happens after it is no longer needed.
ILM enables users to manage their data from inception to retirement. If implemented properly, ILM can minimise an enterprise’s costs, increase performance and increase resource utilisation, providing users with the right information at the right time.
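The retention-and-disposal side of ILM can be pictured as a simple tiering policy. The sketch below is a minimal illustration, assuming three storage tiers and retention periods that are purely hypothetical (the article does not prescribe any specific timings):

```python
from datetime import date, timedelta

# A minimal sketch of an ILM-style retention policy. The tiers and
# retention periods are illustrative assumptions, not from the article.
POLICY = [
    (timedelta(days=30),   "online"),    # recent data on fast disk
    (timedelta(days=365),  "nearline"),  # older data on cheaper storage
    (timedelta(days=2555), "archive"),   # ~7 years, e.g. regulatory retention
]

def place_data(created: date, today: date) -> str:
    """Return the tier a piece of data belongs in, or flag it for disposal."""
    age = today - created
    for limit, tier in POLICY:
        if age <= limit:
            return tier
    return "dispose"  # past the final retention period

print(place_data(date(2005, 1, 1), date(2005, 1, 15)))  # prints online
```

In practice such a policy would be driven by data classification and regulatory rules rather than age alone, but the flow from inception to retirement follows the same pattern.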
While ILM has been around for a few years, its use is by no means mature. Additional assistance from hardware and software vendors is required to keep pace with rapidly changing business environments.
As a result of the retention-focused legislation and the performance challenges that firms are facing, it is imperative that some type of ILM is implemented for the enterprise.
One technology that has been developed as a result of the explosion of storage systems associated with distributed systems is storage grids.
Much like grid computing, storage grids maximise the use of individual storage units by treating them as part of a much larger virtual storage device. This will lead to increased resource utilisation, decreased costs and ultimately increased return on investment. Storage grids take an object-based approach to produce a smarter disk; a file system is then layered on top to create a grid or cluster.
This type of approach makes storage systems easier to manage, less prone to faults, allows data access at file and block levels and delivers a high level of control and automation.
The ability to parallelise file functions will reduce or eliminate bottlenecks in disk-intensive applications. It also means that it should not matter where a firm places data, since all data will potentially be accessible by any program.
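The parallel-access idea can be sketched simply: a file is striped across several storage nodes and the pieces are fetched concurrently. The node layout below is simulated with in-memory dictionaries, purely as an illustration; it is not any vendor's actual grid interface:

```python
from concurrent.futures import ThreadPoolExecutor

# A minimal sketch of parallel access in a storage grid. The "nodes"
# are simulated in memory; in a real grid, fetch() would be a network call.
NODES = {
    "node-a": {"file.dat:0": b"The quick "},
    "node-b": {"file.dat:1": b"brown fox"},
}

def fetch(node: str, key: str) -> bytes:
    return NODES[node][key]

def read_striped(name: str) -> bytes:
    """Fetch both stripes of a file concurrently and reassemble them."""
    stripes = [("node-a", f"{name}:0"), ("node-b", f"{name}:1")]
    with ThreadPoolExecutor() as pool:
        parts = pool.map(lambda s: fetch(*s), stripes)
    return b"".join(parts)

print(read_striped("file.dat"))  # prints b'The quick brown fox'
```

Because no single node holds the whole file, no single disk becomes the bottleneck, which is the point of the grid approach for disk-intensive applications.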
Key vendors such as Oracle, HP, EMC and Network Appliance (NetApp) are creating storage grid products.
However, the methodologies used to implement storage grids are not standardised.
Therefore buyers should make sure they thoroughly understand how the grid is implemented, because this will affect the ability to scale transparently and cost efficiently.
A third emerging storage technology is one that has been borrowed from application development methodologies.
While traditional storage formats did nothing more than store bits of data and some basic addressing information linking pieces to form information, object-based storage involves storing not only the data, but the data attributes as well.
Basically, it makes the data itself more intelligent so many information functions can be handled by the device, eliminating the front-end.
This creates a unique opportunity to enable cross-platform data sharing and application-level security at the data layer. It also makes it possible to self manage certain storage management functions, like space utilisation.
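The core idea of pairing data with its attributes, so that the device can answer management questions itself, can be sketched as follows. The class and method names are illustrative only, not the real SNIA OSD command set:

```python
# A minimal sketch of the object-based storage idea: each stored object
# carries its own attributes alongside the data, so the device can handle
# management functions such as space utilisation itself. The names here
# are illustrative assumptions, not a real OSD API.
class StorageObject:
    def __init__(self, object_id: int, data: bytes, **attributes):
        self.object_id = object_id
        self.data = data
        self.attributes = attributes  # e.g. owner, content type, retention class

    def size(self) -> int:
        return len(self.data)

class ObjectStore:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.objects = {}

    def put(self, obj: StorageObject) -> None:
        self.objects[obj.object_id] = obj

    def utilisation(self) -> float:
        # The device self-reports space utilisation from its own metadata,
        # with no front-end file server involved.
        used = sum(o.size() for o in self.objects.values())
        return used / self.capacity

store = ObjectStore(capacity=1024)
store.put(StorageObject(1, b"patient record", owner="radiology"))
print(f"{store.utilisation():.2%}")
```

Because the attributes travel with the data rather than living in a separate file-system layer, any platform that speaks the object interface can share the same data and enforce the same security attributes.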
Within object-based storage, objects are storage containers with a file-like interface, effectively representing a convergence of the network attached storage (NAS) and storage area network (SAN) architectures.
Any type of data, such as files, database records, medical images or multimedia, can be stored and it is not limited to any particular storage device type.
Object-based storage allows cross-platform data sharing and application-level security, increasing accessibility among different platforms and improving security capabilities at the device level. However, currently there is a lack of products and it has proved a difficult concept for companies to get to grips with.
While standards are in the process of being developed by the Storage Networking Industry Association (SNIA), there still are not that many viable products.
The idea is so different from the way the storage industry and users of their products have traditionally dealt with and thought of storage, it will take some time before the idea fully gains traction.
Another approach to storage format is to increase the density of information by using not only the surface of the medium, but its depth as well.
This type of approach is termed holographic storage: a recording and access technology that exploits the medium’s depth, as opposed to only using the surface like most technologies.
This type of approach to data storage offers high storage densities, fast transfer rates, and a durable and reliable medium. Holographic storage is also cost effective and the technology can easily be applied to many types of devices from handheld to enterprise storage products.
Problems surrounding the take-up of holographic storage include that, historically, it has been difficult to develop a suitable storage format and also that the industry needs time to develop devices and build market acceptance of them.
This technology is expected to take hold in consumer electronics first, where there is a push for compact video recording devices. However, firms interested in holographic storage should keep their eyes on the development of this technology for enterprise storage requirements.
When it does take hold, a standards battle may ensue, much like the Beta versus VHS contest in video recording.
The latest tech
A technology that particularly targets the shortfall in the performance of storage devices is InfiniBand (IB). A few years ago, IB was introduced as a potential replacement for fibre channel (FC). It has the capability to eliminate bandwidth as an issue for storage devices in data centres, potentially increasing bandwidth by a factor of ten.
In addition to high-speed access, IB is a low-latency, high-performance serial I/O interconnect. It is designed for deployment primarily in server clusters ranging from two to thousands of nodes.
In addition to connecting servers, IB can also connect communications and storage fabrics in data centres.
Stumbling blocks to the widespread take-up of this format are that it is only good over short distances — within data centres for example — and technologies such as ethernet and FC are now seriously entrenched in storage infrastructures.
Unfortunately, the introduction of the technology was also poorly timed and many of the vendors went out of business.
Recently, major vendors, such as Cisco, have been bringing it to market again. In April 2005, Cisco acquired Topspin, which manufactured IB technology.
Cisco responded shortly after the acquisition with an IB-based server fabric switching (SFS) product. With this type of vendor commitment, IB could make a huge difference in the limits of storage technology.
While there are many traditional ways to protect data, such as encryption, these have negatively affected performance because they add an additional processing layer. Other approaches, such as fibre channel-security protocol (FC-SP), are being developed.
FC-SP is a security framework that includes protocols to enhance FC security in several areas, including authentication of fibre channel devices, cryptographically secure key exchange, and cryptographically secure communication between fibre channel devices.
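The device-authentication part of the framework rests on a challenge-response exchange over a shared secret. The toy sketch below illustrates only that basic idea; FC-SP's actual authentication protocols (such as DH-CHAP) combine it with a Diffie-Hellman exchange and are considerably more involved:

```python
import hashlib
import os

# A minimal sketch of challenge-response device authentication, the idea
# underlying FC-SP's authentication protocols. This is an illustration,
# not the actual FC-SP wire protocol.
def make_challenge() -> bytes:
    return os.urandom(16)  # the authenticator sends a random challenge

def respond(secret: bytes, challenge: bytes) -> bytes:
    # Both devices hold the shared secret; the responder proves possession
    # of it without ever sending the secret itself across the fabric.
    return hashlib.sha256(secret + challenge).digest()

secret = b"shared-device-secret"
challenge = make_challenge()
response = respond(secret, challenge)

# The authenticator recomputes the hash and compares.
assert response == hashlib.sha256(secret + challenge).digest()
print("device authenticated")
```

Because each exchange uses a fresh random challenge, a captured response cannot be replayed later, which is what makes this suitable for authenticating devices on a shared fabric.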
FC-SP offers a way of protecting data as it moves between devices on a fibre network, and while it does not address the security of data that is stored on the FC network, it does directly address the challenge of protecting information while it is being transferred between devices within a network.
This standard has been in development since 2004 by the technical committee T11 of the International Committee for Information Technology Standards (INCITS).
Unfortunately, it will be some time before all members and the vendor community ratify the standard. It will also be some time before viable products are available.
The importance of this approach to security is the recognition of the potential level of security breach and the industry response to deal with the issue.
While there are many challenges facing the information storage market, it is clear that vendors are responding in a timely and appropriate manner to address these issues.
Hal Kreitzman, research director, recent storage management developments, Experton Group
As you move up the chart, management complexity increases, requiring more in-house skills, greater storage capacity and a higher total cost.
On the other hand, the cost per Gbyte decreases as you increase the number of Gbytes.
Enterprises that do not have massive storage requirements, do not have sufficient IT staff to support sophisticated storage approaches, and are concerned about storage support costs may start by using the services of external service providers.
These service providers offer a full-range of storage, backup, restore and archive services. Typically, the enterprises are linked to these providers via the Internet or point-to-point network connections.
The next level up is where servers are directly connected to the storage devices via direct attached storage (DAS).
This puts additional pressure on the enterprise’s IT organisation to make sure the software has all of the latest patches, the data is backed up on a timely basis, data restore services are provided, and a storage archive is maintained. This type of approach is restrictive in that the storage resources are only available to that server.
The next level up, network attached storage (NAS), introduces shared storage via the network.
NAS separates data from applications by storing data on filers attached to the local area network (LAN). Filers can share files across multiple applications, platforms, and operating systems.
NAS allows multiple server access through a file-based protocol. This allows administrators to implement simple and low cost load-balancing and fault-tolerant systems.
Up from there are storage area networks (SANs): collections of computers and devices connected over a high-speed network, dedicated to the task of storing and protecting data.
Instead of storing data locally, each server sends data across the network to a shared pool of storage.
There are two levels of SANs.
The first is iSCSI-connected, allowing communication over internet protocol (IP); the second is fibre-connected. Fibre is faster than iSCSI, but also more costly and more complex.
The next level up from this is based on current technologies in development such as information lifecycle management (ILM) and InfiniBand (IB).
But whether these technologies will become part of the industry’s established storage infrastructure such as NAS and SAN will depend on customer take-up, vendor backing and how quickly industry standards can be established for the technologies.