Datacenter Monitoring… Is It the Gold at the End of the Rainbow?

by Visitor Siggy62 on ‎11-04-2011 10:46 AM


I don't think anyone would disagree that monitoring a data center from utility input to the server is a good idea. Certainly more data is better…right? Data enables us to see where there is an issue, forecast potential failures, plan for maintenance, and ultimately calculate PUE. While I'm certain that my short list of benefits does not do the value of data justice, it does bring me to my real question… to what end are we collecting all this data?

Over the past few years we've seen a flurry of activity as companies acquire others in the monitoring space. Some even speak of monitoring and "control" (which to me has its own frightening implications… the last thing I'd want is having non-technical personnel with access to my data center controls). Others speak of open architecture, middleware, DCIM, and other fancy terms.

What I'd like to know is: what's at the core of this flurry? Who is driving all this development? Is it the true data center operators, or the equipment suppliers looking for a way to differentiate themselves because they see the UPS being commoditized? I mean, let's face it… once a problem has been identified, someone still has to fix it…right? What benefit is all this fancy software when a breaker fails to trip? Ooops! Nothing replaces good old-fashioned maintenance plans, procedures and practices.

So maybe I'm missing something. I put this question to all of my data center readers...

  • What information do you want to see in a monitoring package?
  • How will you use this information or how does this information make your life easier?
  • What (if any) parts of your infrastructure do you want to control?
  • Who should have such access?
  • Where should this kind of access be available?

As you write your response, please include a little about the size of your data center so that we can appreciate your comments in the proper context.

by Visitor lcampbell
on ‎11-08-2011 10:49 AM

Lou, always a pleasure to read your blog. I maintain 2 data centers, either one being redundant. I have been factory trained by Liebert for their ACs, and as a first responder on their UPSs. What I like about your blog is that you are not encumbered by what is going on in the servers, but you realize that without us old-school techs keeping the infrastructure humming seamlessly, their virtual world cannot exist, or would definitely not be running all the time. I want to know how much my data centers are drawing in amps and kW, and what percentage of that my support equipment is using. Are my server racks balanced, so my heat load is evenly distributed? Infrastructure access, from the service entrance to the transfer switches to the gen sets to the distribution gear, and to some extent the UPSs, should be by good, caring electrical techs whose expertise keeps the power flowing to energy-hungry servers which, quite frankly, we know only enough about to get us in trouble. I have met some IT people who knew some about maintaining electrical infrastructure, but here anyway, they realize it's as big a job as what they do to maintain and monitor the servers, so they leave the neanderthal (that's what they call us) infrastructure to us.
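
A rough sketch of the numbers asked for above: the share of the total draw that goes to support equipment, plus PUE (total facility power divided by IT equipment power). The function name and the sample readings are made up for illustration, and it assumes both readings are in kW and taken at the same moment.

```python
def support_share(total_kw, it_kw):
    """Return (support_kw, support_pct, pue) from two simultaneous kW readings.

    total_kw: power at the utility input (hypothetical meter reading)
    it_kw:    power delivered to the IT load (hypothetical meter reading)
    """
    if it_kw <= 0 or total_kw < it_kw:
        raise ValueError("expected 0 < it_kw <= total_kw")
    support_kw = total_kw - it_kw               # cooling, UPS losses, lighting, etc.
    support_pct = 100.0 * support_kw / total_kw  # share of the total draw
    pue = total_kw / it_kw                       # PUE = total facility / IT power
    return support_kw, support_pct, pue


# Illustrative readings only, not measurements from any real site.
support_kw, support_pct, pue = support_share(total_kw=500.0, it_kw=400.0)
print(f"support: {support_kw:.0f} kW ({support_pct:.0f}% of total), PUE {pue:.2f}")
```

A single pair of readings is only a snapshot; a figure averaged over time says more about how the site actually runs.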

by Visitor KENatCPRNC
on ‎01-05-2012 08:09 AM

Good points from both! One area of great importance that is not getting enough attention is predictive maintenance, meaning having the tools to help diagnose potential problems before they actually occur. Yes, some DCIM products do raise alarms when assigned circuits approach threshold levels or when phases start to get heavily unbalanced, but more is needed. Some products, like those from EDSA, can do fault simulation and examine design options for failure analysis prior to implementation, and others, such as some work order management systems (e.g., FacilityOne), will bring attention to maintenance and/or repair activities that start to occur beyond normal expectations. Most well-trained and experienced "Neanderthals" (to use lcampbell's term) can spot potential areas of concern by examining increases in power demand, phase unbalances, increases in temperature, etc., but keeping watch over large sites (1000s of servers and multiple power paths) spread over multiple geographical locations can be challenging, to say the least. More aids to assist those folks would be very welcome. The DCIM market is predicted to explode in growth over the next 3-5 years, so hopefully the vendors will do more to address the issue of predictive maintenance and not just static reporting.
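
As a minimal sketch of the threshold and phase-unbalance alarms mentioned above (not how any particular DCIM product does it): unbalance is computed here by one common convention, the largest deviation from the three-phase average divided by that average. The function names and the 80% load and 10% unbalance limits are assumptions chosen for illustration.

```python
def phase_unbalance_pct(a, b, c):
    """Percent unbalance: max deviation from the average, over the average."""
    avg = (a + b + c) / 3.0
    if avg <= 0:
        raise ValueError("average phase current must be positive")
    return 100.0 * max(abs(x - avg) for x in (a, b, c)) / avg


def check_circuit(phase_amps, breaker_amps, load_limit=0.8, unbalance_limit_pct=10.0):
    """Return a list of alarm names for one three-phase circuit.

    load_limit and unbalance_limit_pct are hypothetical defaults,
    not values taken from any standard or product.
    """
    alarms = []
    if max(phase_amps) >= load_limit * breaker_amps:
        alarms.append("load")        # a phase is approaching the breaker rating
    if phase_unbalance_pct(*phase_amps) >= unbalance_limit_pct:
        alarms.append("unbalance")   # phases are drifting apart
    return alarms


# Illustrative readings only.
print(check_circuit((12.0, 8.0, 10.0), breaker_amps=20.0))
```

Static limits like these only flag a problem once it is visible; the predictive piece would come from trending the same readings over time.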

BTW: Happy New Year Lou!


by Visitor muralcr
on ‎03-08-2012 05:14 AM

Well, I was a CAT application engineer and am now an MEP consultant. At least in India, most IT end users specify 1.1 kVA (800 W) per 100 sq ft, and I have yet to find someone to enlighten me why 1.1 kVA, when the data center load I work out from end-user inputs doesn't go anywhere near 600 W/sq ft. The general comment from so-called software specialists is that it's beyond application engineers like me. Software engineering is a different ball game!
