UPS Monitoring: A Vital Tool for Maximizing Uptime
Robert Sember, UPS Product Line Manager, Eaton Corp.
When it comes to a mission-critical facility’s power infrastructure, the adage of ‘set it and forget it’ doesn’t apply, especially when the cost of downtime can run into the thousands and sometimes millions of dollars per second. Considering the financial impact of power disturbances, companies are depending on remote monitoring and diagnostic services to monitor the health of their uninterruptible power systems (UPSs) and batteries to help protect their critical operations. By implementing vigilant oversight of critical power equipment, managers can avert catastrophe and proactively plan for service and maintenance.
Remote monitoring and diagnostic services have emerged as a must-have solution for facility, IT and data center managers who are responsible for critical equipment. Since UPSs serve as the central nervous system of the power infrastructure, knowing the health of UPSs is paramount to achieving high nines of uptime. UPSs and batteries degrade over time due to age, use and environmental conditions, leaving an organization vulnerable to power failure in the event of an emergency. This reality is overlooked in many organizations until it is too late and their UPSs fail. Over the past three years, managers have increasingly adopted these services as they have evolved from basic monitoring to offer analysis of trends in equipment performance and, in some cases, 24-hour live phone support and access to technicians who can make on-site visits when unforeseen power events occur.
Smart Implementation
The process for installing a remote monitoring and diagnostic service is non-invasive and can be performed by either a service technician or the customer. With the correct hardware and software, installation can be completed in 10 minutes or less. Most monitoring systems operate via a direct connection with an off-site server, communicating data about daily conditions and critical events.
Online capabilities lay the foundation for the service to gather and communicate data. Information is e-mailed once daily, or upon a critical event, to the monitoring system’s server, where it is analyzed and stored in a central repository. With some services, a variety of UPS health and status reports can also be e-mailed to pre-designated contacts at the customer’s organization, depending on their preferences. Most services provide monitoring for dangerous fluctuations in environmental conditions such as temperature and humidity. Many failures of mission-critical equipment can be traced back to extreme conditions in the data center, even in controlled environments. Imagine your HVAC/CRAC unit going down after hours or over the weekend; if the equipment is not monitored and the area has little employee foot traffic, the conditions could go unnoticed and the UPS could fail.
Another part of the installation requires the customer’s firewall to be configured to allow the Web card in the UPS to transmit e-mail. This allows for one-way communication of data to the remote server for monitoring and for the delivery of critical events and daily “heartbeats.” One such service uses a Web card that communicates with the remote monitoring servers using Simple Mail Transfer Protocol (SMTP). This protocol offers reliability, pervasiveness and accessibility.
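As a rough illustration of the one-way, e-mail-based reporting described above, the following Python sketch composes a daily heartbeat message of the kind a Web card might send. The addresses, subject format and attachment layout are assumptions made for illustration only; they do not represent the message format of eNotify or any other service.

```python
from email.message import EmailMessage


def build_heartbeat(unit_id: str, readings: dict[str, float]) -> EmailMessage:
    """Compose a daily heartbeat e-mail for one UPS (illustrative format only)."""
    msg = EmailMessage()
    # Hypothetical addresses; a real service defines its own.
    msg["From"] = f"{unit_id}@customer.example.com"
    msg["To"] = "heartbeat@monitoring.example.com"
    msg["Subject"] = f"HEARTBEAT {unit_id}"
    msg.set_content(f"Daily status report for UPS {unit_id}.")
    # Attach the day's readings as simple "name,value" CSV rows.
    rows = "\n".join(f"{name},{value}" for name, value in readings.items())
    msg.add_attachment(rows, subtype="csv", filename="daily.csv")
    return msg


heartbeat = build_heartbeat("ups-01", {"battery_voltage": 54.2, "temperature_c": 24.5})
# Sending is outbound only, for example smtplib.SMTP(host).send_message(heartbeat)
# through a mail relay that the customer's firewall permits.
```

Because the message only travels from the customer to the vendor, the firewall needs to permit outbound SMTP from the Web card and nothing inbound.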
In many organizations, the facilities group is responsible for choosing and implementing a remote monitoring service while the IT group is responsible for granting access to the system. These groups also appreciate the non-invasiveness of most remote monitoring services. All SMTP communication flows from the customer to the vendor only. In most cases facilities and IT managers understand the common goal of having a consistent and reliable backup power supply.
When choosing a monitoring solution, it is best to look for a company that also offers live support and is available to assist with service issues, including guiding a customer through the installation process. Many times, a monitoring service will make a technician available to the customers to assist with installation, troubleshooting or managing critical events.
Eaton’s eNotify, one of the many services that currently offer remote monitoring, monitors more than 100 UPS and battery operating data trends. It also features monthly monitoring reports and 24x7 access to live support from Customer Reliability Center (CRC) analysts. Eaton has 300 field engineers, support specialists and technicians in North America with 24-hour access to factory design engineers during escalated emergencies. Technical resources can be dispatched immediately to resolve problems that could jeopardize critical operations.
Case in Point
Bexar Metro 9-1-1 Network District needed the ability to monitor UPS units at more than 20 locations in the metropolitan area of San Antonio. This was critical to their work in supporting the area’s 9-1-1 call centers. With the eNotify service, Bexar Metro has 24x7 monitoring and onsite support for its UPS network.
“Our IT staff can now check system availability, generate monthly reports and monitor the health of each UPS,” said Bill Buchholtz, executive director of Bexar Metro 9-1-1 Network District. “The eNotify service is an efficient way for us to manage and anticipate any critical events, especially when we have our UPSs in multiple locations across a large geographic area.”
With any monitoring service, there will most likely be some customer responsibilities to ensure connectivity between the UPS unit and remote monitoring servers. Such responsibilities may include confirming that the Ethernet cable is physically connected to the network identification or Web card, maintaining connectivity to the server (either internal or external) and keeping the monitoring service partner informed of any actions on the customer’s end which might affect the unit’s availability or the Web card’s connectivity. These small tasks are worth the effort to ensure the UPS is proactively being monitored.
Utilizing Health Reports
Perhaps the most popular feature of remote monitoring and diagnostic services is reporting. Built upon data that is gathered and analyzed by the system, reports are tailored for a variety of purposes, ranging from daily status e-mails and notification of critical issues to general summaries of equipment performance suitable for internal stakeholders.
Daily “heartbeat” e-mails summarize the status and activities of the UPS and attached external sensors based on the information gathered by the network identification or Web card. For example, daily e-mails that identify the unit in question and include the day’s data as an attachment are sent to analysts. In the event that the system does not receive a daily “heartbeat,” an e-mail can be automatically sent to the customer letting them know of a possible problem with the unit. These e-mails should undergo a rigorous process to ensure that they are valid before being stored in the database server. Once the data is stored on the server, all current and historical data is checked for anomalies. The rules for detecting anomalies should be defined on a per-model basis to ensure customized, accurate detection. In the event that an anomaly is detected, an analyst can be notified to determine whether the event necessitates further action.
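The two server-side checks described above, spotting a missed heartbeat and applying per-model anomaly rules, can be sketched in Python as follows. The 24-hour window, the model names and the parameter limits are all hypothetical; an actual service would define its own rules and would likely allow a grace period.

```python
from datetime import datetime, timedelta

# Assumption: one heartbeat is expected per unit every 24 hours.
HEARTBEAT_WINDOW = timedelta(hours=24)

# Hypothetical per-model limits: parameter name -> (minimum, maximum).
RULES = {
    "model-a": {"battery_voltage": (48.0, 56.0), "temperature_c": (10.0, 35.0)},
    "model-b": {"battery_voltage": (24.0, 28.0), "temperature_c": (10.0, 40.0)},
}


def missing_heartbeats(last_seen: dict[str, datetime], now: datetime) -> list[str]:
    """Return the units whose last heartbeat is older than the window."""
    return sorted(unit for unit, ts in last_seen.items() if now - ts > HEARTBEAT_WINDOW)


def find_anomalies(model: str, readings: dict[str, float]) -> list[str]:
    """Return the parameters that fall outside the limits for this model."""
    limits = RULES[model]
    return [
        name
        for name, value in readings.items()
        if name in limits and not (limits[name][0] <= value <= limits[name][1])
    ]


now = datetime(2024, 1, 2, 12, 0)
last_seen = {"ups-01": now - timedelta(hours=3), "ups-02": now - timedelta(hours=30)}
overdue = missing_heartbeats(last_seen, now)
flagged = find_anomalies("model-a", {"battery_voltage": 45.5, "temperature_c": 24.0})
```

Keeping the limits in a per-model table means the same reading can be normal for one UPS model and an anomaly for another, which is the point of defining rules on a per-model basis.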
If a potentially critical event is identified, the service deploys an alert based on the customer’s designated protocol. The alert may be posted on a Web user interface or may be delivered as a phone call or e-mail to a data center manager. With more comprehensive services, approximately 50 of 200 possible events can trigger an event e-mail. As with status e-mails, a triggered event e-mail is checked for authenticity by the remote monitoring servers. Out of the 50 possible event e-mails, only 10 may be considered critical events, such as a UPS hardware fault or a completely discharged UPS battery. Any event e-mail that is deemed critical is sent to the analysts, who decide whether the event warrants dispatching a technician to the site.
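The tiered handling described above (many possible events, a subset that triggers e-mail, and a smaller subset treated as critical) might be modeled as in the Python sketch below. The event names are invented for illustration, and rejecting messages that fail the authenticity check is an assumption; the article does not say what a real service does with them.

```python
from enum import Enum


class Route(Enum):
    LOG_ONLY = "log only"
    REJECTED = "rejected"
    EVENT_EMAIL = "event e-mail"
    ANALYST_REVIEW = "analyst review"


# Hypothetical event names. A full catalog would hold roughly 200 events,
# about 50 of which trigger an e-mail and about 10 of which are critical.
EMAIL_EVENTS = {"on_battery", "battery_low", "hardware_fault", "battery_discharged"}
CRITICAL_EVENTS = {"hardware_fault", "battery_discharged"}


def route_event(event: str, authentic: bool = True) -> Route:
    """Decide how a reported event is handled."""
    if event not in EMAIL_EVENTS:
        return Route.LOG_ONLY        # no event e-mail is generated
    if not authentic:
        return Route.REJECTED        # assumption: failed authenticity check
    if event in CRITICAL_EVENTS:
        return Route.ANALYST_REVIEW  # analyst decides whether to dispatch
    return Route.EVENT_EMAIL
```

For example, `route_event("hardware_fault")` goes to analyst review, while `route_event("on_battery")` produces an ordinary event e-mail.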
Another point of consideration when choosing a remote monitoring service is how information is disseminated. Many data center managers prefer a service that generates reports, a very important feature to assist them in proactively managing the health of their equipment in a time-efficient manner. Data center managers are also leveraging monthly reports to concisely relay pertinent information to internal stakeholders, helping them to improve on job performance.
Detailed reports can be generated monthly and reflect the previous month’s data per individual unit. These reports reflect data gathered by the system and provide a color-coded reading of the system’s overall health (green, yellow or red). The status of the battery is reflected, including how many times the unit went on battery, how long it was on battery and how many times the battery was completely discharged. A Relative Health Index (RHI) score is assigned to the unit and included in the report. This score falls within a range of 0 to 10 and is computed using a weighted average of the individual parametric data RHI values along with battery information. The report also includes a summary of critical event information and a link to the service support history Web page.
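To make the weighted-average idea concrete, the Python sketch below computes an overall score from individual parameter scores and maps it to a color. The parameter names, the weights, the color thresholds and the assumption that a higher score means better health are all illustrative; the actual RHI calculation is not described in detail here.

```python
def relative_health_index(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-parameter scores, each on a 0 to 10 scale."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return round(weighted_sum / total_weight, 1)


def status_color(rhi: float) -> str:
    """Map a score to a report color (hypothetical thresholds)."""
    if rhi >= 7.0:
        return "green"
    if rhi >= 4.0:
        return "yellow"
    return "red"


# Hypothetical inputs: battery information is weighted twice as heavily.
scores = {"battery": 8.0, "input_voltage": 10.0, "temperature": 6.0}
weights = {"battery": 2.0, "input_voltage": 1.0, "temperature": 1.0}
rhi = relative_health_index(scores, weights)  # (16 + 10 + 6) / 4 = 8.0
color = status_color(rhi)
```

With these example inputs the unit scores 8.0 and would be reported as green under the assumed thresholds.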
Proactive Maintenance: Worth the Effort
Research indicates that regular preventive maintenance (PM), which affords the opportunity to detect and repair potential problems before they become significant and costly, is crucial in order to achieve maximum performance from your equipment. In fact, studies show that routine preventive maintenance appreciably reduces the likelihood that a UPS will succumb to downtime. The 2007 Study of Root Causes of Load Losses compiled by Eaton revealed that customers without preventive maintenance visits were almost four times more likely to experience a UPS failure than those who complete the recommended two preventive maintenance visits per year.
To select the best coverage for your UPS and its application, consider the following questions:
1. What type of UPS service do I need?
A. Depot Exchange Repair or Replace: You contact the UPS service provider and then ship the UPS to a repair facility. The service provider returns the repaired unit or a refurbished unit to you.
B. Advance Swap Depot Exchange: You contact the UPS service provider who then ships a refurbished unit to you. The original UPS unit is returned to a repair facility.
C. On-Site Repair: You contact the UPS service provider and a factory-trained field technician arrives at your site to diagnose and repair electronic or battery-related problems.
Smaller UPS products (below 1,000 VA) generally can be repaired at a depot, while products over 1,000 VA and up to 15 kVA can either be repaired at a depot or serviced on-site. Larger UPSs that are either hardwired (cannot be unplugged) or too heavy to ship can only be serviced via on-site field technicians.
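The sizing guidance above can be expressed as a small Python helper. The function name and return values are hypothetical, and two boundary cases are assumptions: a unit rated at exactly 1,000 VA is treated as eligible for either option, and any unit above 15 kVA is treated as on-site only.

```python
def service_options(rating_va: float, hardwired: bool = False,
                    too_heavy_to_ship: bool = False) -> list[str]:
    """Return the repair options that typically apply to a UPS of this size."""
    # Hardwired or unshippable units cannot go to a depot.
    # Assumption: anything above 15 kVA falls into this group.
    if hardwired or too_heavy_to_ship or rating_va > 15_000:
        return ["on-site"]
    if rating_va < 1_000:
        return ["depot"]
    # 1,000 VA up to 15 kVA: either depot repair or on-site service.
    return ["depot", "on-site"]
```

For instance, `service_options(750)` returns only depot repair, while `service_options(5_000, hardwired=True)` returns only on-site service.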
2. Do I buy a support agreement, extended warranty or pay as I go?
A. Support agreements, or service contracts, usually combine parts and labor coverage (electronics, batteries or both), one or more UPS preventive maintenance inspections annually, and a combination of coverage hours and arrival response time. Plans can be tailored to meet almost any need. Special features like remote monitoring, battery replacement insurance and spare part kits may also be added.
B. Extended warranty (or basic warranty) may also be purchased for many UPS products. A warranty commonly covers specified parts and labor such as electronic components for a fixed period of time but will not include 24x7 coverage or arrival response times. Nor will warranties include preventive maintenance, although extra services can be purchased in addition to a warranty extension. The more additional services that are added to a warranty, the closer you are to a support agreement.
C. Time and Material (T&M) service is a pay-as-you-go approach in which the service provider conducts a repair when something breaks. T&M can be done either via depot repair or on-site, based on the type of product. T&M can be expensive depending on what needs to be repaired. In addition, the uncertainty of not knowing when a field technician will arrive can make T&M an unacceptable service solution for some customers. Support agreement (contract) customers always take priority, resulting in T&M response times of up to five days for non-contract customers, based on the product and location.
Remember that warranties cover repairs but do not promise when or how fast repairs will be made. Support agreements include repairs, time of repair and the speed of arrival (or advance swap exchange vs. waiting for a returned product). Pay special attention to which items are covered in a warranty or support agreement. Warranties or support agreements for large UPS models usually cover only electronics, with battery coverage available as an optionally purchased item. Twenty percent of customers purchase battery coverage on larger UPS models but most pay as they go.
3. What should be covered?
A. UPS Electronics Parts and Labor Coverage
B. UPS Batteries Parts and Labor Coverage: Often the leading cause of failure, batteries generally need to be replaced every five years or less. Batteries may need to be replaced more frequently, especially if they are discharged frequently or operate in a warm environment.
4. How long should I plan for a UPS to last and how much should service cost?
A. Large UPS products usually have a 15 to 20 year life span.
B. Small UPS products can last 10 or more years, but are often replaced much sooner.
C. All UPS product life expectancies can be maximized or extended via routine preventive service, part replacements and upgrade/modification kits. Batteries and capacitors can be replaced to rejuvenate a UPS and provide years of reliable power protection.
D. The total cost of ownership (TCO) varies widely based on the size of UPS, amount and type of batteries, quantity and type of services desired and application. For example, is the UPS frequently discharging its batteries? Very basic warranty coverage may cost five to 10 percent of the product purchase price and a comprehensive, premium support agreement could exceed 35 percent of product purchase price per year.
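As a quick worked example of the percentages above, the Python sketch below turns a purchase price into rough yearly service figures. The function name and the $20,000 price are hypothetical, and the sketch assumes the warranty percentages are annual, like the support agreement figure.

```python
def annual_service_cost(purchase_price: float) -> dict[str, float]:
    """Rough yearly service cost figures from the percentages quoted above."""
    return {
        "basic_warranty_low": purchase_price * 5 / 100,    # 5 percent
        "basic_warranty_high": purchase_price * 10 / 100,  # 10 percent
        # Premium agreements "could exceed" 35 percent, so this is a
        # reference level rather than a cap.
        "premium_agreement": purchase_price * 35 / 100,
    }


estimate = annual_service_cost(20_000)
```

For a $20,000 UPS, this works out to $1,000 to $2,000 for very basic warranty coverage and $7,000 or more per year for a premium support agreement.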
Another important question is whether your monitoring service offers ongoing maintenance. Conducting preventive maintenance ensures that equipment is running properly and can identify potential problems before there is an emergency. This may incorporate scheduled maintenance to upgrade firmware and update configuration settings. This also includes unscheduled maintenance in the event of a critical issue. Look for a service with a process that is structured to allow either the customer or the monitoring party to initiate action to correct any critical issues. Access to technicians who can travel on-site to address issues identified by the system is also important.
The Future: Knowing the Health of the Complete Power Chain
Over the next five years, providers of remote monitoring and diagnostic services will meet their clients’ needs by continuing to move toward increasingly automated solutions. While many companies are currently offering monitoring services, 24x7 accessibility to support and technicians is currently limited. The installation process will also continue this trend toward automation as the process for customer installations is simplified. Remote monitoring services will also continue to offer more trend analysis and develop more of a diagnostic capability.
For facilities, IT and data center managers, all aspects of their jobs are strained by increased demands and decreased resources. As power demands and utility costs continue to increase, data center managers are expected to maintain reliable power and security while meeting business objectives and corporate sustainability goals. While the automation of processes, including monitoring the health of mission-critical equipment, will provide some relief, a critical component will be the ability of IT and facilities departments to work together toward common business objectives. It makes sense to focus monitoring capabilities on the UPS because it is the most mission-critical piece of equipment in the data center, but monitoring services will expand in the future to include the health of the complete power chain. In today’s environment there is little room for error, which is why more data center managers are turning to remote monitoring and diagnostic services to manage their own time more efficiently and avoid costly downtime.
Bob Sember is a UPS product line manager for Eaton Corp. where his work encompasses battery services, single-phase product services and eNotify Remote Monitoring and Diagnostic Service.
For more information, please visit www.eaton.com/enotify.