Data centers have been around for decades. They have held military secrets, carried essential commercial and banking information, and stored our private and personal data. Data centers, including those in airports, make the world go round.

Those old enough might remember painting wreaths made from perfectly folded data center punch cards. Those were the days! The technological transformation of data centers is akin to a mad scientist’s revolution. Typically hidden but profoundly impactful, this product of human creativity allows information to be stored securely in a centralized location.

Data center maintenance has come a long way, but the main idea remains: a center’s maintenance is essential. Data centers have their roots in the huge computer rooms of the 1940s, with ENIAC as one of the earliest examples.

Over time, computers began performing numerous important business tasks. As more corporate data assets migrated to the data center, downtime due to equipment malfunction became a serious threat to business growth and profitability. Manufacturers of data center equipment realized the need for an active maintenance program to ensure the operational quality of their products. The introduction of annual maintenance contracts gave data center owners peace of mind and improved levels of service.

Corporate data evolved into a critical asset for most companies, and proper maintenance of the IT equipment became a necessity for supporting the availability of key business applications. The preventive maintenance (PM) concept of today represents an evolution from a reactive maintenance mentality (“fix it when it breaks”) to a proactive approach (“check it, look for warning signs, and fix it before it breaks”) to maximize availability (24x7x365). No matter the size of a business, its data center simply needs the right program and strategies.

What is in a Data Center?

Wikipedia defines a data center as a building or a dedicated space inside a building or group of buildings used to house computer systems and associated components such as telecommunications and storage systems.

A data center concentrates computing and networking equipment for the purpose of collecting, storing, processing, distributing, or allowing access to large amounts of data. Data centers have been around since the advent of computers.

How Many Servers are in a Data Center?

The number of servers in a data center depends on the size and type of company, as well as the type of data center.

What Stores the Information in a Data Center?

There are basically two types of storage in a data center: disk storage and tape storage. Tape storage has been around for over fifty years. Currently, the most popular format is LTO; an LTO-8 cartridge holds 12TB of native data, or up to 30TB compressed. While tape was originally designed to load computer programs, it is now most commonly used for backup. There is no other storage medium that holds as much data as tape, and tape also delivers the lowest cost per GB. Tape is also desirable because it is removable and portable. Although disk storage is popular for backup, tape is still used for many reasons.

Often, the backup is initially made to disk and then moved to tape at a predetermined time. This is partly because of LTO tape reliability, but also because you can write somewhere around 100 times more data on tape than on disk. Disk backup, however, is faster to write and quicker to access, which makes it a good choice for the initial backup.

Disks are stored in rack units or in shelf modules. Tapes may be stored off-site, and they have the smallest footprint. Both have specific temperature and humidity requirements that ensure the longevity of the archive. Follow the specifications, and don’t store media in a greenhouse. Facility specifications include redundant power, dedicated cooling equipment that doesn’t shut down at the end of normal office hours, and an engine generator to protect IT functions from extended power outages.

The US Department of Energy reports there are three million data centers across US urban and rural areas. More than 90% of servers are housed in data centers owned or leased by small to medium-sized businesses. Fewer than 10% of servers are located in large data centers. Rack yield is the number of racks that can fit within a compute space; it is normally estimated at 25 sq. ft. per rack to allow for aisle and perimeter space around the server room.
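As a rough illustration of the rack yield estimate, here is a minimal sketch. It assumes the 25 sq. ft. figure is a per-rack floor allowance that already includes aisle and perimeter space; the function name and the 2,000 sq. ft. room are hypothetical.

```python
import math

# Planning allowance quoted in the text; read here as floor area per rack,
# including aisle and perimeter space (an assumption of this sketch).
SQ_FT_PER_RACK = 25.0


def rack_yield(room_area_sq_ft: float, sq_ft_per_rack: float = SQ_FT_PER_RACK) -> int:
    """Estimate how many racks fit in a compute space of the given floor area."""
    if room_area_sq_ft < 0 or sq_ft_per_rack <= 0:
        raise ValueError("area must be non-negative and allowance must be positive")
    # Round down: a fraction of a rack cannot be installed.
    return math.floor(room_area_sq_ft / sq_ft_per_rack)


if __name__ == "__main__":
    # Hypothetical 2,000 sq. ft. server room.
    print(rack_yield(2_000))  # 80
```

A real layout depends on column placement, cooling equipment, and local code requirements, so treat the result as a first-pass planning number only.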

According to the US Chamber of Commerce, the up-front capital investment for a data center’s initial construction includes land purchase, shell construction, and equipment installation. After the build, annual operating costs consist of power, staffing, taxes, maintenance, and other administrative costs. The Chamber of Commerce estimates that the annual expenditure for operating a data center amounts to 8.6% of the initial capital expenditure. Of that annual figure, 39.5% goes to administration, maintenance, and ‘other’ costs, a category that excludes power, staffing, real estate tax, and insurance.
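To make the two percentages concrete, here is a small worked calculation. It assumes the 39.5% applies to the annual operating figure rather than to the capital cost; the $10 million build cost and the function name are hypothetical.

```python
def annual_operating_costs(initial_capex: float,
                           opex_share: float = 0.086,
                           admin_share: float = 0.395) -> tuple[float, float]:
    """Return (annual operating cost, portion for administration, maintenance, and other).

    opex_share:  annual operating cost as a fraction of initial capital cost (8.6%).
    admin_share: fraction of the annual operating cost that goes to administration,
                 maintenance, and other costs (39.5%).
    """
    annual_opex = initial_capex * opex_share
    admin_maintenance_other = annual_opex * admin_share
    return annual_opex, admin_maintenance_other


if __name__ == "__main__":
    # Hypothetical build cost of $10 million.
    opex, admin = annual_operating_costs(10_000_000)
    print(f"Annual operating cost: ${opex:,.0f}")        # $860,000
    print(f"Admin, maintenance, other: ${admin:,.0f}")   # $339,700
```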

What is the Best Way to Maintain a Data Center?

The long-ago approach to data center maintenance was reactive. We’ve come a long way! Now we maintain equipment proactively so that problems don’t arise in the first place. According to a recent study by Ponemon Institute, a minute of data center downtime now costs $7,900. With an average reported incident length of 90 minutes, that is roughly $700,000 for an incident that could have been prevented. The huge cost reflects the fact that modern data centers support critical websites and software applications.
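The downtime arithmetic can be checked in a few lines. This sketch uses only the two figures quoted above; the function name is hypothetical. The exact product is $711,000, which the text rounds to roughly $700,000.

```python
COST_PER_MINUTE_USD = 7_900      # per-minute downtime cost quoted in the text
AVERAGE_INCIDENT_MINUTES = 90    # average incident length quoted in the text


def downtime_cost(minutes: int, cost_per_minute: int = COST_PER_MINUTE_USD) -> int:
    """Return the estimated cost in USD of an outage lasting `minutes`."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


if __name__ == "__main__":
    print(downtime_cost(AVERAGE_INCIDENT_MINUTES))  # 711000
```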


One purpose of preventive maintenance is to schedule inspections so problems can be spotted before they cause disruption. Data centers that do not perform planned preventive maintenance have an increased risk of asset failure. Here are the concepts to consider to ensure that third-party maintenance (TPM) keeps your data center working reliably through proactive and preventive steps. Remember that regularly scheduled maintenance can easily pay for itself by preventing unplanned problems and failures.

Who will carry out the maintenance?

The data center owner will receive a warranty, but who will meet service needs once the warranty runs out? There are four options. First, the data center owner can renew the warranty with the manufacturer. This is costly because the price of warranty coverage increases as the equipment ages.
Second, an authorized third party is less costly than extending the manufacturer’s warranty. Third, an unauthorized third party (not recommended) will be even less costly. Finally, the data center owner may decide to maintain the physical infrastructure equipment in-house, possibly by hiring extra staff. This option is limited by whether the internal IT staff is properly trained and sufficiently competent. Knowledge can diminish over time through lack of training and practice, and staff turnover can further reduce effectiveness.
Further, a TPM provider has knowledge, ability, and experience in the following areas, which are deemed critical to maintenance operations:

  • Spare parts
  • Product knowledge
  • Local support
  • Knowledge of data center environment
  • Training
  • Product updates
  • Documentation
  • Tools

You Decide to Work With a Trusted Third-Party Maintenance Provider. Now What?

First, the PM process needs to be well defined for both the provider and the data center owner. The PM statement of work (SOW), issued by the PM provider, should clearly describe the scope of the PM. Some elements that need to be included in the SOW are:

Dispatch provisions: PM visits are generally recommended one year after the installation and commissioning of the equipment. Some high-usage components, such as humidifiers, may require an earlier visit and constant monitoring.

In addition, a plan for equipment tuning for optimal performance is necessary. Also, proper protocols should be outlined and followed for easy access to the equipment at the center’s site. The owner’s operational constraints need to be accounted for.

Provisions for parts replacement: The SOW should include recommendations for which parts need preventative replacement or upgrade. Availability of stock, supply of tested and certified parts, contingency plans in the case of defective parts, and removal and disposal of used parts should be addressed in the SOW. This is a good time to introduce the idea of spares. The stock of spares available to the data center owner should be ISO certified.

Documentation: The SOW should specify a PM output report that documents the actions taken during every PM visit. If a vendor is involved, the report should automatically be reviewed by the vendor for any needed technical follow-up. Well-documented PM reports also ensure data is readily available when auditors perform inspections.

What Does Data Center Maintenance Entail?

Measure

Measure performance through KPIs such as PM compliance, availability, and reliability. This lets you optimize the PM program to maximize effectiveness and minimize costs.
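As an illustration, two of these KPIs can be computed from basic uptime records. This is a sketch under stated assumptions: availability is taken as uptime divided by total observed time, and reliability is represented by mean time between failures (MTBF), one common measure among several. The function names and sample numbers are hypothetical.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Fraction of the observed period during which the equipment was up."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime_hours / total


def mean_time_between_failures(uptime_hours: float, failure_count: int) -> float:
    """Average operating hours between failures over the observed period."""
    if failure_count <= 0:
        raise ValueError("need at least one failure to compute MTBF")
    return uptime_hours / failure_count


if __name__ == "__main__":
    hours_in_year = 365 * 24          # 8,760 hours
    downtime = 1.5                    # one hypothetical 90-minute incident
    uptime = hours_in_year - downtime
    print(f"{availability(uptime, downtime):.3%}")   # 99.983%
    print(mean_time_between_failures(uptime, 1))     # 8758.5
```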

Understand recommendations

Stay up to date by consulting equipment manuals. The hardware and equipment come with paperwork that offers recommended maintenance timelines and practices. Adhere to the recommendations; doing so keeps the warranty valid and helps system performance.

Safety and Cleanliness

Data centers can be hazardous. Technicians must be aware of potential hazards when performing preventative maintenance activities. Make certain technicians are familiar with the health and safety processes through documentation and safety training. Data center maintenance also involves cleaning tasks such as removing dust, which can block airflow and cause overheating, possibly leading to downtime or, worse, damage.

Enforce PM Compliance

Data center owners cannot afford downtime, so it is important to complete maintenance on time. Measuring and enforcing preventative maintenance compliance (PMC) accomplishes this. Once maintenance schedules have been made, the PMC score reflects the percentage of work that is done on time. Use the 10% rule, which specifies that an action should be completed within 10% of the scheduled interval. If PM is scheduled every 90 days, it needs to be completed within 9 days of the scheduled date to be in compliance.
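The 10% rule and the PMC score can be sketched as follows. The assumptions here are that "within 9 days" applies on either side of the scheduled date and that a work order with no completion date counts as not done on time. The function names and sample dates are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional


def is_on_time(scheduled: date, completed: Optional[date],
               interval_days: int, tolerance: float = 0.10) -> bool:
    """Apply the 10% rule: done within 10% of the interval around the scheduled date."""
    if completed is None:
        return False  # never completed, so not compliant (assumption of this sketch)
    window = timedelta(days=interval_days * tolerance)
    return abs(completed - scheduled) <= window


def pm_compliance(work_orders: list[tuple[date, Optional[date]]],
                  interval_days: int) -> float:
    """Return the PMC score: percentage of work orders completed on time."""
    if not work_orders:
        raise ValueError("no work orders to score")
    on_time = sum(is_on_time(s, c, interval_days) for s, c in work_orders)
    return 100.0 * on_time / len(work_orders)


if __name__ == "__main__":
    # Hypothetical quarterly PM schedule (90-day interval, 9-day window).
    orders = [
        (date(2024, 1, 1), date(2024, 1, 5)),    # 4 days late: on time
        (date(2024, 4, 1), date(2024, 4, 12)),   # 11 days late: missed
        (date(2024, 7, 1), None),                # never completed: missed
        (date(2024, 10, 1), date(2024, 9, 25)),  # 6 days early: on time
    ]
    print(pm_compliance(orders, interval_days=90))  # 50.0
```

A CMMS typically reports a score like this automatically; the sketch only shows what the number means.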

Keep detailed PM and work order records

Documentation, documentation, documentation! Well-documented PM reports ensure data is available for the data center owner, another maintenance professional, and most importantly, for audits. 

Make Sure to Have a Computerized Maintenance Management System (CMMS)

A CMMS is the best way to track, measure, and thus improve preventative maintenance management. The CMMS software ensures that preventative maintenance is performed regularly and according to protocols. It helps the data center cut the cost of maintenance, increase the life of assets, and improve reliability and productivity. A CMMS will reduce downtime, too.

Data Center Monitoring: Letting your fingertips do the work!

Data center monitoring cuts time, removes stress, and generally improves the health of a data center. It is the best way to eliminate reactive and redundant maintenance. Monitoring service experts can proactively watch from afar, noting and responding to patterns that might point to possible problems.


This type of monitoring does away with time on the phone with tech support trying to figure out the cause of an alarm that might point to a problem. Digital remote monitoring solutions allow you to generate a trouble ticket via your smartphone without having to explain the problem. The mission of data center monitoring is to spot the problem before the alarms sound in the first place.

Cybersecurity

It is important to focus on a Secure Development Lifecycle (SDL). The SDL process considers and evaluates security throughout the development lifecycle of products and solutions. Understanding a vendor's SDL helps a company evaluate that vendor's digital services over a period of time.

Resolve to reduce reactive and redundant maintenance

Eliminating reactive tasks unburdens the data center. Digital remote monitoring services give the business owner greater assurance that a team of service experts has their back. Digital remote monitoring solutions let you generate a trouble ticket through your phone without having to wait for tech support to get to the bottom of your concern or question.

Is Cloud Storage The Right Option For You?

Cloud storage offers some advantages. One is cost savings: cost and storage space can scale up or down depending on your growth and your current needs. Data redundancy and replication provide multiple copies of data, which can ease concern about data loss. Regulatory compliance throughout the world can be ensured when the business chooses vendors that provide worldwide compliance.


On the other hand, there are disadvantages of cloud storage that are worth noting. ‘Vendor lock-in’ can present a problem, especially for medium and large-sized businesses. A business that selects a certain cloud vendor for data storage can find itself locked in, unable to migrate its data to another vendor. If you experience a problem with the existing cloud vendor, migrating to another may not be feasible. The sheer volume of data can be a huge obstacle, and the complexity of migration can make it an unrealistic option.


Another issue in using cloud storage is with security and privacy. This is crucial. Cloud storage involves transferring control of confidential information to a third-party company. That company likely deals with various other businesses. You must have complete trust in your cloud vendor and feel assured that their other customers are also trustworthy.


Consider the Dropbox event of 2014. A security glitch caused the company to accidentally leak private mortgage applications and tax returns. Leaked information is threatening enough, and the National Security Agency has also allegedly spied on data stored by cloud vendors.


Sure, cloud storage offers hardware independence, but the advantages of cloud storage are limited. Another problem is that data centers require huge amounts of electricity and Internet service for their operations. When unexpected power or Internet failures occur (perhaps due to a natural disaster), it can be impossible to access your data. Also, while the data redundancy of cloud storage can be helpful, it does not rule out data loss in cases of unexpected hardware failure. The AWS outage of 2019 is a reminder: some customers’ data was never recovered.


We are productive beings, and a knowledgeable, trusted data management team makes our organizations more productive. Another benefit of proper data management is cost efficiency; the money you save can drive down costs for your customers, too. Excellent data management makes it easier for companies to respond quickly and efficiently to things like changes in the market; this includes reacting appropriately to competitors. At the intersection of all bold moves are the choices that businesses make every day. It is what drives action. Action may drive growth; it all depends on your strategic effectiveness.

ISC Group specializes in maintenance and repair services for post-warranty data storage systems, including disk arrays, tape libraries, and Hitachi VSP support. (877) 472-8273