Application Management System for Performance and Alert Data

Problem:

Deploying performance and fault monitoring tools to monitor and configure your complex infrastructure adds layers of complexity such as:

multiple products to log on to
multiple products to maintain
out-of-control alerts

Solution:

How do we solve the issue of application management overload? The easiest way is to use a single application or portal that presents all your performance and alert data in one place: a “single pane of glass,” if you will.

First: Let’s talk about the ease of management part.

Depending on the size and complexity of your environment, you are usually logging into products that manage or watch routers, switches, servers, storage, virtual infrastructure, backups, and antivirus. The first issue is that you have to log in to half a dozen or more products to ensure that your infrastructure is running smoothly. This is not only time-consuming, it also presents headaches of its own, the biggest being the management of several complex credentials. The management products are supposed to solve problems, not add to them.

Second: During an outage or similar crisis, you do not have a single place where you can log in to isolate the issue.

You are wasting precious moments logging into several management products to figure out where the issue is. Once you identify the issue, you have to log into the affected system (server, router, storage, etc.) and fix it. Having a single pane of glass simplifies this process and greatly reduces the mean time to repair (MTTR) by quickly and painlessly identifying the source of the problem.
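To make the idea concrete, here is a minimal sketch of what a single view does: it merges status feeds from several tools and lists the worst problems first. The tool names, device names, and the `Status` and `single_view` names are hypothetical, chosen only for illustration; they do not describe how any particular product works.

```python
from dataclasses import dataclass


@dataclass
class Status:
    source: str  # monitoring tool that reported the status (hypothetical name)
    device: str  # device the status refers to (hypothetical name)
    state: str   # one of "up", "warning", "down"


# Lower number sorts first, so "down" devices appear at the top of the view.
SEVERITY = {"down": 0, "warning": 1, "up": 2}


def single_view(*feeds):
    """Merge any number of per-tool status feeds into one list, worst first."""
    merged = [status for feed in feeds for status in feed]
    return sorted(merged, key=lambda s: (SEVERITY[s.state], s.device))


# Assumed sample feeds from three separate tools.
network_feed = [Status("network-tool", "core-switch", "up")]
server_feed = [Status("server-tool", "server-01", "warning")]
storage_feed = [Status("storage-tool", "san-01", "down")]

for status in single_view(network_feed, server_feed, storage_feed):
    print(f"{status.state:8} {status.device:12} (from {status.source})")
```

With every feed in one sorted list, the failing device is the first line you read, rather than something you find after logging in to each tool in turn.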

Third: There are the unintelligent alerts that multiple products generate.

During an outage, you can receive upwards of 300 alerts. This number is usually multiplied by 2 or 3, depending on whether you have email, text, and pager alerts set up. This not only confuses you during an emergency, it also renders your smartphone useless when you need it the most. You cannot call your NOC or network engineer when the phone is buzzing nineteen to the dozen.

Let’s dive a little deeper into why hundreds of alerts are generated. Let’s say that your core MPLS switch went down, either during a routine upgrade or in production. Let us also assume that this switch had a dozen servers and a SAN hanging off of it that are mission critical to your business. Now you have one alert that the switch is down, a dozen or more alerts that your servers are down, and another dozen alerts saying your application is non-responsive, in addition to storage and backup alerts. An intelligent alerting system will suppress all alerts except the one about the switch. When you get an alert saying your core switch is down, you work out the implications and the business impact yourself within a couple of seconds; the dependent alerts add nothing.
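The suppression logic described above can be sketched in a few lines. This is an illustrative sketch only, assuming a hand-written dependency map; the device names and the `root_cause_alerts` function are hypothetical and do not represent the implementation in any specific monitoring product.

```python
# Hypothetical topology: each device maps to the upstream device it depends on.
DEPENDS_ON = {
    "core-switch": None,
    "server-01": "core-switch",
    "server-02": "core-switch",
    "san-01": "core-switch",
    "app-01": "server-01",
}


def root_cause_alerts(down, depends_on):
    """Return only the down devices that have no down device upstream of them."""
    down = set(down)
    roots = []
    for device in sorted(down):
        parent = depends_on.get(device)
        visited = set()  # guards against accidental loops in the map
        suppressed = False
        while parent is not None and parent not in visited:
            if parent in down:
                # Something upstream is already down, so this alert is noise.
                suppressed = True
                break
            visited.add(parent)
            parent = depends_on.get(parent)
        if not suppressed:
            roots.append(device)
    return roots


down_devices = ["core-switch", "server-01", "server-02", "san-01", "app-01"]
print(root_cause_alerts(down_devices, DEPENDS_ON))  # prints ['core-switch']
```

Five devices are reported down, but only the switch produces an alert. If the switch were healthy and only `server-01` and `app-01` were down, the same function would report `server-01` alone.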

So basically it boils down to this: you need a management app for your management apps, one app to rule them all. One management app that delivers this is the Orion Enterprise Operations Console (EOC) from SolarWinds. The EOC provides a single-pane-of-glass view into your critical infrastructure and helps you manage runaway alerts. With the right products and modules from the Orion Suite, you can greatly simplify your performance and fault management while maintaining the SLAs you have promised to deliver to your internal (or external) customers.

Petri IT Knowledgebase Team Petri Contributor

