How To Prevent Downtime In The Data Center

With the increasing computational requirements and complexity of data center systems, unplanned downtime is a major threat to organizations in terms of disrupted business processes, lost revenue and reputational risk.

Recently completed market research on data centers, conducted by data center experts, shows that an overwhelming majority of respondents (91 percent) experienced an unplanned data center outage in the past 24 months. As for frequency, respondents averaged two complete data center outages over the past two years. Partial outages, meaning those limited to a few racks, occurred an average of six times over the same period.

The results show that many companies are becoming more aware of the causes of downtime and are taking measures to minimize the risk. The study paid close attention to the high-performing data centers that experienced the least downtime and identified seven actions that are largely shared across different types of organizations.

Not every data center is able to take all seven of these actions. But implementing even several of them can significantly reduce the incidence of unplanned downtime and mitigate its impact.

1. Treat data center availability as a priority, even above cost minimization

Given the increasing budget pressures in the industry, this may be difficult for many organizations. However, as reliance on IT systems to support mission-critical business processes grows, downtime has the potential to significantly affect the profitability of the enterprise. For companies whose revenue models depend on the ability of the data center to deliver IT and network services to customers, downtime can be particularly expensive.

2. Use best practices for data center redundancy to increase availability

It all boils down to the basics. There are a number of proven best practices that serve as a good foundation for data center redundancy. These recommendations are established approaches to using cooling and power management technologies to improve the overall performance of the data center. They include everything from matching cooling capacity and air distribution to IT loads to using local expertise for the design and maintenance of equipment throughout its life.

3. Allocate sufficient resources for recovery in case of unplanned downtime

This involves more than just having enough people to throw switches and restart servers after a failure. It means having a wide variety of resources ready, such as food, accommodation, transportation and alternative arrangements for staff in case of a power outage resulting from a disaster. Hurricane Sandy taught Americans that having sufficient fuel for generators, along with a streamlined supply chain for replenishment during a disaster that may last for many days, is crucial for the sustained operation of certain facilities.

4. Enlist the full support of senior management for efforts to prevent and manage unplanned downtime

The study reflects a difference in perception that often exists between senior management and those who deliver the bad news to them when it comes to downtime. 48 percent of respondents in senior positions believe that leadership sufficiently supports efforts to prevent outages, while 71 percent of respondents at the department-head level believe that their organization is sacrificing availability to improve efficiency and reduce costs in its data centers. Department heads are also more likely than senior managers to believe that unplanned outages happen too often. This discrepancy highlights the importance of open discussion within companies about unplanned downtime and the level of support and investment needed to prevent and manage incidents.

5. Regularly test generators and distribution systems to ensure backup power in case of a power outage

The most rigorous form of this test is commonly referred to as a “pull the plug” test. Such routine testing is standard in some industries, such as healthcare. It confirms that the automatic transfer of the load from battery to generator and back works correctly during an outage. Such a test keeps the facility team in shape and supports its preparation for unplanned downtime. It also gives the team confidence that the data center will come through an outage without incident, and gives it time to deal with any difficulties in a controlled manner.
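The result of such a test can be checked against the expected transfer order. The following is only a minimal sketch in Python: the source labels, the expected sequence (utility, battery, generator, utility) and the time limit on battery are all assumptions made for illustration, not values from the study or from any particular transfer equipment.

```python
from dataclasses import dataclass

# Assumed order of power sources during a test; real sites may differ.
EXPECTED_SEQUENCE = ["utility", "battery", "generator", "utility"]


@dataclass
class SourceEvent:
    seconds: float  # time since the test began
    source: str     # source carrying the load from this moment on


def check_transfer_test(events, max_battery_seconds):
    """Return a list of problems found in a test log (empty if none)."""
    problems = []
    sources = [event.source for event in events]
    if sources != EXPECTED_SEQUENCE:
        problems.append(f"unexpected source sequence: {sources}")
        return problems
    # Time on battery is the gap between the battery and generator events.
    on_battery = events[2].seconds - events[1].seconds
    if on_battery > max_battery_seconds:
        problems.append(
            f"load stayed on battery for {on_battery:.0f}s, "
            f"limit is {max_battery_seconds:.0f}s"
        )
    return problems
```

An empty list means the logged test followed the assumed order and the generator picked up the load within the chosen limit; anything else is a finding for the team to investigate while conditions are still controlled.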

6. Regularly test or monitor UPS batteries

Having a dedicated battery monitoring system is important. Battery failure is the leading cause of power loss in UPS systems. Intelligent battery monitoring can provide early notification of a potential battery failure. It is best to implement a system that monitors the status of each individual battery.
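As a rough illustration of per-battery monitoring, the Python sketch below flags individual batteries whose readings fall outside set limits. The field names, the threshold values and the idea of comparing internal resistance with a recorded baseline are assumptions chosen for the example; actual limits should come from the battery manufacturer.

```python
from dataclasses import dataclass


@dataclass
class BatteryReading:
    battery_id: str
    voltage: float           # volts
    resistance_mohm: float   # internal resistance, milliohms
    temperature_c: float     # degrees Celsius


def find_suspect_batteries(readings, baseline_mohm, min_voltage=12.0,
                           max_temp_c=35.0, max_resistance_rise=0.25):
    """Return {battery_id: [reasons]} for batteries outside the example limits."""
    suspects = {}
    for reading in readings:
        reasons = []
        if reading.voltage < min_voltage:
            reasons.append("low voltage")
        if reading.temperature_c > max_temp_c:
            reasons.append("high temperature")
        # Compare against the baseline recorded for this battery, if any.
        baseline = baseline_mohm.get(reading.battery_id)
        if (baseline is not None
                and reading.resistance_mohm > baseline * (1 + max_resistance_rise)):
            reasons.append("internal resistance rising")
        if reasons:
            suspects[reading.battery_id] = reasons
    return suspects
```

Checking every battery separately, rather than only the string as a whole, is what allows a single weak unit to be found before it affects the rest.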

7. Implement data center infrastructure management (DCIM)

It is important to provide a framework for effective data center management, in the form of a visual model of the facility and centralized monitoring of its infrastructure systems. This includes deploying a DCIM platform that can provide a holistic view of data center operations based on real-time data, covering the interaction between the facility and its IT systems.
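The idea of a single view that combines facility and IT data can be sketched in a few lines of Python. This is not how any particular DCIM product works; the rack identifiers, metric names and the power budget are hypothetical and serve only to show the principle.

```python
from collections import defaultdict


def build_rack_view(facility_readings, it_readings):
    """Merge (rack, metric, value) readings from both sides into one record per rack."""
    view = defaultdict(dict)
    for rack, metric, value in facility_readings:
        view[rack][metric] = value
    for rack, metric, value in it_readings:
        view[rack][metric] = value
    return dict(view)


def racks_over_power_budget(view, budget_kw):
    """Return the sorted racks whose measured power draw exceeds the budget."""
    return sorted(
        rack for rack, metrics in view.items()
        if metrics.get("power_kw", 0.0) > budget_kw
    )
```

With facility readings (temperature, power) and IT readings (utilization) merged per rack, one query can relate the two, for example finding racks that draw more power than planned.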
