Analysis of Integration Trends: From Hard-to-Integrate Storage Architectures to Storage Virtualization

To make more efficient use of disk resources, enterprise storage entered the networking era in the late 1990s. At the file access level, file servers using Ethernet with the NFS and CIFS protocols emerged, namely NAS (Network Attached Storage); at the block access level, the storage area network (SAN), typified by the Fibre Channel protocol, appeared.

By moving storage onto a network, the access requirements of multiple hosts can be centralized, allowing multiple front-end hosts to share the same back-end storage device. This solves the "storage island" problem created in the past by each host connecting to its own independent storage device: there is no longer any need to install a separate storage device for every front-end server or desktop, and both the flexibility of disk resource allocation and disk space utilization increase. In practice, however, the environment is rarely this ideal, and many factors have hindered these goals.

Storage Networking Constraints

For SAN applications, the purpose of adopting a SAN is to consolidate storage resources and increase disk space utilization. In reality, however, disk array controllers of different brands, or even different product families from the same vendor, are often incompatible, so it is difficult to allocate disk resources across storage devices of different brands or product families.

Constrained by procurement policies and the continual turnover of IT products, it is almost impossible for an entire IT environment to use storage equipment of a single brand and type. A user's SAN storage environment is usually composed of disk devices of various brands and models, which still form islands. This is much better than each host connecting to its own storage device, as before, but it is still a long way from full integration of the storage infrastructure.

This situation also affects the deployment of advanced applications such as remote backup and data migration. Many enterprise storage devices now provide functions such as clustering and remote replication, which help users establish high-availability and remote backup mechanisms. The problem is that the high-availability or remote replication features that ship with most storage devices work only between devices of the same product family. This effectively forces users to purchase two identical sets of storage devices; they cannot choose storage devices of different brands or grades to match the different workloads of the primary and backup sites, which increases the burden on users.

For data migration, because old and new devices are incompatible, users must take systems offline to move data onto new hardware. This interrupts business and raises operating costs, and it leads enterprises to regard system updates or upgrades, and the data migration they entail, as a daunting task.

In other words, although the SAN breaks the old one-to-one connection between front-end host and back-end storage device and allows more flexible connections and resource configuration between front end and back end, it cannot integrate back-end storage devices of different models. As a result, resource utilization is still not optimized, and the application of advanced functions remains limited.

Characteristics and Benefits of Storage Virtualization

To address the deficiencies of existing network storage architectures, some vendors have proposed the concept of "storage virtualization": decoupling front-end hosts from back-end storage devices and using an intermediary virtual layer to provide storage services between the front end and the back end.

By access model, storage virtualization products can be divided into two types, block access and file access, corresponding to SAN and NAS applications respectively.

SAN Virtualization

SAN virtualization products usually take the form of gateways placed between the front-end hosts and the back-end storage devices. The back-end storage device does not map its disk space directly to the front-end host; instead, it maps disk areas to the SAN virtualization gateway, which in turn maps them to the front-end host.

In the SAN virtualization architecture, the virtualization gateway therefore inserts a virtual layer between the front and back ends. To the back-end storage device, the virtualization gateway looks like a front-end host that mounts its disk space; to the front-end host, the virtualization gateway acts as a storage device that provides disk space. In other words, all access between front end and back end passes through the virtualization gateway as an intermediary.

With this intermediary in place, the SAN virtualization gateway can provide many useful access services through its own virtualization software:

(1) Unified storage pool:

Users can use the SAN virtualization gateway to bridge storage devices of different brands and models, mount the disk space provided by each of these devices, and then combine the disk areas from the different storage devices into a single storage pool for unified use. From this storage pool, virtual disk areas can be created as required and mapped to front-end hosts over different transmission channels.

Through the gateway's storage pool, users can allocate the space of the underlying heterogeneous storage devices to front-end hosts more flexibly, without needing to care which back-end storage device actually provides the disk space a given front-end host accesses.

Since all storage resources are used under the bridge of the gateway's virtual layer, the relationship between front-end servers and back-end storage devices changes from the fixed connections and static space mappings of a traditional SAN environment to dynamic bridging through the virtual layer. Management becomes more flexible, space utilization improves, and the old storage-island problem disappears.
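The pooling idea described above can be sketched in a few lines of Python. This is a conceptual model only, not any vendor's API; all class and method names are hypothetical. It shows how a gateway could combine free capacity from heterogeneous arrays and carve a virtual disk whose extents span more than one physical device.

```python
# Conceptual sketch (hypothetical names): a virtualization gateway pools
# capacity from heterogeneous back-end arrays and carves virtual disks
# out of the combined pool, hiding the physical layout from hosts.

class BackendArray:
    def __init__(self, vendor, capacity_gb):
        self.vendor = vendor
        self.free_gb = capacity_gb

class StoragePool:
    def __init__(self):
        self.arrays = []
        self.virtual_disks = {}   # name -> list of (array, size_gb) extents

    def add_array(self, array):
        self.arrays.append(array)

    def total_free_gb(self):
        return sum(a.free_gb for a in self.arrays)

    def create_virtual_disk(self, name, size_gb):
        """Allocate extents from whichever arrays have free space; the
        host never sees which physical array backs the disk."""
        if size_gb > self.total_free_gb():
            raise ValueError("pool exhausted")
        extents, remaining = [], size_gb
        for a in self.arrays:
            take = min(a.free_gb, remaining)
            if take:
                a.free_gb -= take
                extents.append((a, take))
                remaining -= take
            if remaining == 0:
                break
        self.virtual_disks[name] = extents
        return extents

pool = StoragePool()
pool.add_array(BackendArray("VendorA", 100))
pool.add_array(BackendArray("VendorB", 200))
extents = pool.create_virtual_disk("db-vol", 150)  # spans both arrays
```

The 150 GB virtual disk here is backed partly by VendorA and partly by VendorB, which is precisely the cross-brand allocation a traditional SAN cannot do.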

(2) More flexible connection architecture:

Since all front-end and back-end access passes through the intermediary virtual layer, connectivity for the front-end host is provided by the virtual layer rather than by the back-end storage device. This frees the host support of the whole storage environment from the limits of the back-end storage devices.

Under the SAN virtualization architecture, the types of front-end host that the storage environment can support are determined by the intermediary virtualization gateway. Users can map virtual disks in the storage pool to front-end hosts through any host interface the gateway provides, regardless of the host types supported by the underlying storage devices.

This gives users a more flexible storage connectivity architecture. For example, even if the host interface of the underlying storage device is FC, through the bridging of the virtualization gateway the virtual disks in the virtual layer's storage pool can be mapped to front-end hosts over different host interfaces such as iSCSI, FC, or even FCoE.
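A minimal sketch of this decoupling, with entirely hypothetical names: the transport used for each host mapping is a property of the virtual layer, not of the back-end array, so an FC-only array can still serve iSCSI or FCoE hosts.

```python
# Conceptual sketch (hypothetical names): the virtual layer, not the
# back-end array, decides which host interface a virtual disk is
# exported over.

SUPPORTED_TRANSPORTS = {"FC", "iSCSI", "FCoE"}

class VirtualDisk:
    def __init__(self, name):
        self.name = name
        self.mappings = {}        # host -> transport chosen at the gateway

    def map_to_host(self, host, transport):
        if transport not in SUPPORTED_TRANSPORTS:
            raise ValueError(f"unsupported transport: {transport}")
        self.mappings[host] = transport

vdisk = VirtualDisk("vol1")          # backed by an FC-only array
vdisk.map_to_host("web01", "iSCSI")  # yet exported to this host via iSCSI
vdisk.map_to_host("db01", "FC")
```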

(3) More flexible advanced applications:

In addition to more flexible space configuration and connectivity, more flexible advanced applications such as replication, snapshots, and clones can be implemented through the SAN virtual layer.

Remote Replication

Remote replication can be implemented at the host, storage, or network level. Many enterprise storage devices have built-in synchronous or asynchronous replication, allowing users to create local or remote data mirrors as a basis for disaster recovery at a local or remote site. The limitation is that replication can only run between storage devices of the same vendor and product series. In other words, users must make a double investment, purchasing two sets of the same storage device along with the replication licenses.

Host-based replication software is not limited by the type of back-end storage device, but it requires installing a software agent on every front-end host that needs mirror backup. This not only incurs substantial licensing fees; the agent also affects host performance.

SAN virtualization gateways have none of these problems. Under the SAN virtualization architecture, replication jobs are performed by the SAN virtual layer, without involving the front-end hosts or depending on the back-end devices. Replication runs between two SAN virtualization gateways, so the brands and models of the back-end storage devices no longer matter. Users simply deploy a SAN virtualization gateway at each site and consolidate each site's storage devices into that gateway's storage pool; a replication relationship can then be established between the two gateways, using the virtual disk areas in the storage pools as the unit of replication.
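The gateway-to-gateway arrangement can be sketched as follows. This toy model (all names assumed) shows the key point: the replication relationship is defined on virtual disks between two gateways, so the physical arrays behind each gateway can be from different vendors.

```python
# Toy model (hypothetical names): synchronous replication of a virtual
# disk between two virtualization gateways, independent of the back-end
# array brands at each site.

class Gateway:
    def __init__(self, site):
        self.site = site
        self.vdisks = {}          # virtual disk name -> content (toy model)

    def write(self, name, data):
        self.vdisks[name] = data

class ReplicationPair:
    """Replication relationship for one virtual disk between two gateways."""
    def __init__(self, primary, secondary, vdisk_name):
        self.primary, self.secondary = primary, secondary
        self.vdisk_name = vdisk_name

    def write(self, data):
        # Synchronous semantics: the write lands at both sites before
        # it is considered complete.
        self.primary.write(self.vdisk_name, data)
        self.secondary.write(self.vdisk_name, data)

site_a, site_b = Gateway("A"), Gateway("B")
pair = ReplicationPair(site_a, site_b, "erp-vol")
pair.write(b"payload")
```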

Snapshots and Clones

Many current enterprise storage devices provide disk snapshot and clone functions for backing up local disks, whether for data protection or for development and testing. However, if a user's environment contains multiple storage devices of different brands and types, the user must purchase snapshot or clone licenses separately for each model and configure the snapshot or clone job policies on each device separately, which makes management quite troublesome.

Under the SAN virtualization architecture, snapshot and clone operations can be performed in a unified way at the virtual layer. By purchasing the snapshot or clone function of the SAN virtualization gateway, snapshots and clones can be taken of the virtual disks in the storage pool. Users only need to bring the space of the back-end storage devices into the SAN virtual layer's storage pool, and they can then obtain disk backups through the virtual layer's snapshot and clone functions. This makes both deployment and management much easier.
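One common way such virtual-layer snapshots work is copy-on-write; the sketch below (a simplified toy, not any product's implementation) shows the idea: a snapshot initially shares all blocks with its source disk, and an old block is preserved only when the source overwrites it.

```python
# Toy copy-on-write snapshot at the virtual layer (conceptual sketch):
# a snapshot shares blocks with the source until the source overwrites
# them, so taking a snapshot is instant and initially consumes no space.

class CowDisk:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []       # each snapshot: dict of preserved blocks

    def snapshot(self):
        snap = {}
        self.snapshots.append(snap)
        return snap

    def write(self, index, value):
        for snap in self.snapshots:
            if index not in snap:           # preserve the original once
                snap[index] = self.blocks[index]
        self.blocks[index] = value

    def read_snapshot(self, snap, index):
        # Preserved copy if the block changed, else the shared live block.
        return snap.get(index, self.blocks[index])

disk = CowDisk(["a", "b", "c"])
snap = disk.snapshot()
disk.write(1, "B")   # live disk changes; snapshot still sees "b"
```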

Data Migration

Migrating data when updating storage devices has always been one of the most time-consuming and troublesome tasks in IT management, and it can seriously affect normal access by front-end hosts.

In the SAN virtualization architecture, the data migration required by a device update can be performed by the virtual layer. Because the SAN virtual layer removes the direct connection between front-end hosts and back-end storage devices, all storage devices sit under the control of the virtual layer and are bridged from there to the front-end servers. The access path of a front-end host can therefore be redirected through the virtual layer and combined with a background data-copy function: the virtual layer lets the old device's disk space continue serving the front-end servers while data is moved, piece by piece during off-peak hours, to the new device's disk space. Once the move is complete, the access path is switched to the new device, minimizing the downtime required for data migration.

Hierarchical Storage

Many storage devices are currently advertised as providing tiering capabilities, configuring disk space of different performance levels to match the access performance requirements of front-end hosts. The limitation is that only disks attached to the local controller can be tiered; the tier management cannot cover storage devices outside that unit. So when a user environment contains multiple storage devices of different brands and models, this tiering function leaves blind spots it cannot cover.

Adopting the SAN virtualization architecture solves these problems. Since all storage devices are bridged to the front-end servers under the control of the SAN virtualization layer, appropriate access-path settings at the virtual layer are enough to match front-end servers to their performance requirements: space from high-performance storage devices is allocated to critical application servers that need high performance, while ordinary-performance disk space is reserved for backup, archiving, and other undemanding applications.

Alternatively, data can be distinguished by the time it was generated, and data older than a certain period can be moved to low-cost storage media. Such moves are easily done with the assistance of the virtual layer.
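The migration and tiering behaviour described in the last few paragraphs can be sketched together. In this toy model (all names hypothetical), the virtual layer copies a virtual disk's data to the target tier in the background and only then switches the host's access path, which is why downtime is minimal.

```python
# Conceptual sketch (hypothetical names): the virtual layer migrates a
# virtual disk to another tier in the background, then atomically
# switches the access path, so the host sees almost no interruption.

class Tier:
    def __init__(self, name):
        self.name = name
        self.data = {}            # vdisk name -> content (toy model)

class VirtualLayer:
    def __init__(self):
        self.path = {}            # vdisk name -> Tier currently serving I/O

    def provision(self, vdisk, tier, content):
        tier.data[vdisk] = content
        self.path[vdisk] = tier

    def migrate(self, vdisk, target):
        source = self.path[vdisk]
        # 1. Background copy while the source keeps serving I/O.
        target.data[vdisk] = source.data[vdisk]
        # 2. Path switch once the copy is complete, then reclaim the source.
        self.path[vdisk] = target
        del source.data[vdisk]

fast, slow = Tier("ssd"), Tier("nearline")
vl = VirtualLayer()
vl.provision("archive-2019", fast, b"old records")
vl.migrate("archive-2019", slow)   # e.g. a policy: data older than N days
```

The same `migrate` call serves both cases the article describes: replacing an old array with a new one, and demoting aged data to cheaper media under a tiering policy.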

NAS Virtualization

Whereas SAN virtualization deals with access paths and disk space management at the block level, NAS virtualization operates at the file level; its main goal is to solve file and directory access management problems.

In a large NAS environment, because of the large number of shared files and front-end users, the access and connection relationships between file servers, the directories and files on the NAS, and client computers become very complicated. Besides being hard to manage, it is also difficult to change the connection structure or update devices: any change to a back-end NAS device forces modifications to many access paths.

One way to solve this problem is to insert a virtual layer between the client computers and the NAS, and manage the front-end to back-end access connections through that intermediary.

Traditional network file transfer and sharing applications rely on the Universal Naming Convention (UNC) to identify and confirm access paths between file servers or NAS devices and client computers; the directories and paths given by UNC let client computers access files on the network. In a NAS virtualization architecture, the front-end computer accesses space on the back-end NAS not through the physical location or name, but through a virtual location given by the virtual layer's global namespace.

Under the global namespace architecture, dependence on UNC disappears. All file storage resources are consolidated by the virtual layer into a unified virtual storage pool, so the "logical" name or location a user accesses is unrelated to the file's "actual" name or location: user access requests are redirected by the virtual layer to the configured location without the user ever knowing where the file physically resides, just as users need not know a server's IP address because DNS resolution automatically connects them to the right web server. If an access path fails, the NAS virtual layer can also fail over automatically to another path, improving the reliability of the file access service.

Through the intermediary of the NAS virtual layer, access paths are no longer tied to physical connections. Administrators can easily move data between different NAS devices or file servers without worrying that front-end users' existing access will be affected, which greatly reduces the difficulty of data migration. Administrators can also define policies that let the virtual layer automatically move files to storage devices of different tiers according to file attributes or age, achieving data archiving or hierarchical storage.

In practice, this is usually done by inserting an application server running global-namespace software into the network as an intermediary gateway. Like a DNS server on an IP network, this application server registers the physical access paths of all NAS devices and file servers, converts them into the global namespace, and then maps them to the front-end client computers. If any back-end storage device changes, only the access settings on the application server need to change; front-end users' computers are unaffected.
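The DNS-like mapping role of that application server can be sketched as a simple lookup table. This is a conceptual illustration with hypothetical names and example paths: clients hold a stable logical path, and relocating data only rewrites the mapping on the gateway.

```python
# Sketch of a global namespace (hypothetical names and example paths):
# clients use a stable logical path; the virtual layer resolves it to
# the current physical UNC path, much as DNS maps names to addresses.

class GlobalNamespace:
    def __init__(self):
        self.table = {}   # logical path -> physical UNC path

    def publish(self, logical, physical):
        self.table[logical] = physical

    def resolve(self, logical):
        # What a client lookup returns at access time.
        return self.table[logical]

    def relocate(self, logical, new_physical):
        # Data moved to another NAS: only the mapping changes;
        # clients keep using the same logical path.
        self.table[logical] = new_physical

gns = GlobalNamespace()
gns.publish("/corp/finance", r"\\nas-old\finance")
gns.relocate("/corp/finance", r"\\nas-new\finance")  # migration done
```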