By Tony Hill
May 26, 2016
Whether your company hosts servers onsite or uses a third-party virtual server platform like Amazon Web Services, it’s your primary duty as the CTO or IT admin to establish strong monitoring strategies and ensure your company’s IT operations run smoothly. We’re here to give you a head start by highlighting the top IT server KPIs to incorporate into your daily operations.
KPI #1: Server Uptime
Server uptime is a great KPI to kick things off for IT server performance management. This indicator tracks the percentage of time your IT infrastructure is up and running, and most industry insiders will tell you that anything above 99% is considered favorable. To put that in perspective, 99% uptime still allows roughly 7.2 hours of downtime in a 30-day month. Generally, the remainder (known as downtime) is taken up by critical system maintenance or updates, periodic reboots and the unenviable system crash (more on how to prevent that in a bit).
In your role as IT admin, you first have to decide how often you will track this KPI: some companies opt for monthly monitoring, while others review overall statistics and average uptime on a quarterly or annual basis. Next, you can incorporate real-time monitoring to ensure that all servers remain at or above your desired benchmark, and receive an alert if any server drops below it. You’ll likely need to spend a little extra on custom IT dashboard software to set up a feature like this, but it will be well worth the investment to prevent a catastrophe.
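To make the arithmetic concrete, here is a minimal Python sketch of how uptime could be calculated for a reporting period and checked against a benchmark. The function names, the server names and the idea of feeding in total downtime per server are assumptions made for illustration; your monitoring tool will have its own way of exposing this data.

```python
from datetime import timedelta


def uptime_percent(period: timedelta, downtime: timedelta) -> float:
    """Return the percentage of `period` during which a server was up."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    if downtime < timedelta(0) or downtime > period:
        raise ValueError("downtime must be between zero and the period length")
    # Dividing one timedelta by another yields a plain float ratio.
    return 100.0 * (1 - downtime / period)


def servers_below_benchmark(downtime_by_server, period, benchmark=99.0):
    """Return the sorted names of servers whose uptime is under `benchmark`."""
    return sorted(
        name
        for name, downtime in downtime_by_server.items()
        if uptime_percent(period, downtime) < benchmark
    )


if __name__ == "__main__":
    # Hypothetical monthly figures for two made-up servers.
    month = timedelta(days=30)
    downtime = {"web-1": timedelta(hours=2), "db-1": timedelta(hours=10)}
    for name in servers_below_benchmark(downtime, month):
        print(f"ALERT: {name} is below the uptime benchmark")
```

The same calculation works for monthly, quarterly or annual tracking; only the `period` value changes.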
KPI #2: Average Outage Response Time
While most IT professionals would agree that server uptime is one of the most important KPIs, an equally important indicator is outage response time. Think about it this way: your company prides itself on being available 24/7 for your customers, and one misstep could cost you your competitive advantage and brand positioning. Customers place orders and expect consistent results, yet if one server or a group of servers crashes, orders could get delayed or, worse yet, lost in the clouds.
For these reasons, it is imperative that you put a comprehensive outage response plan in place in case your servers go down for any reason (e.g. weather or technical issues). Position your entire company to respond collectively to this situation, and develop a plan to minimize downtime. Next, decide on the desired response time for getting your servers back up and running. We suggest between 5 and 10 minutes on average, but this is only a starting point and could vary depending on your industry and your reliance on technology.
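As a rough illustration, average outage response time can be computed from a log of outages, each recorded as the moment the outage began and the moment service was restored. The sketch below assumes that data shape and uses the 10-minute upper bound suggested above as a default target; the function names and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def average_response_time(outages):
    """Return the mean time from outage start to restoration.

    `outages` is a list of (started_at, restored_at) datetime pairs.
    """
    if not outages:
        raise ValueError("at least one outage is required")
    total = sum(
        (restored - started for started, restored in outages),
        timedelta(0),
    )
    return total / len(outages)


def meets_target(outages, target=timedelta(minutes=10)):
    """Return True if the average response time is within `target`."""
    return average_response_time(outages) <= target


if __name__ == "__main__":
    # Hypothetical outage log: one 6-minute and one 12-minute outage.
    log = [
        (datetime(2016, 5, 2, 9, 0), datetime(2016, 5, 2, 9, 6)),
        (datetime(2016, 5, 17, 22, 30), datetime(2016, 5, 17, 22, 42)),
    ]
    print("Average response time:", average_response_time(log))
    print("Within target:", meets_target(log))
```

Because an average can hide a single very long outage, it may also be worth reviewing the longest individual response time alongside this figure.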
KPI #3: Security Penetration Attempts
These days, everyone is talking about site and server security. With more and more news of prominent companies’ websites and sensitive data being compromised, your ability to track the number of penetration attempts is critical in guarding against a serious external breach.
This is where your custom IT dashboard can really come in handy, as it allows you to measure this KPI daily, monthly or even in real time as the attempts take place. While we’d like to tell you that the benchmark for this indicator is zero, in reality, a security-related issue is probably going to occur every so often. Still, if your IT team jumps on the issue immediately and determines the root cause of the suspected breach, your company can remain agile and strategize about how to prevent these situations in the future.
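For a sense of what daily and monthly tracking might look like behind a dashboard, the following Python sketch tallies penetration attempts by day and by month. It assumes you already have a list of timestamps, one per detected attempt, exported from a firewall or intrusion detection log; what counts as an "attempt" and how it is detected is up to your security tooling, and the function names here are made up for the example.

```python
from collections import Counter
from datetime import datetime


def attempts_per_day(attempt_times):
    """Count detected attempts for each calendar day."""
    return Counter(t.date() for t in attempt_times)


def attempts_per_month(attempt_times):
    """Count detected attempts for each (year, month) pair."""
    return Counter((t.year, t.month) for t in attempt_times)


if __name__ == "__main__":
    # Hypothetical timestamps of detected attempts.
    detected = [
        datetime(2016, 5, 1, 3, 15),
        datetime(2016, 5, 1, 3, 16),
        datetime(2016, 5, 9, 14, 2),
        datetime(2016, 6, 2, 23, 50),
    ]
    for day, count in sorted(attempts_per_day(detected).items()):
        print(day, count)
    for month, count in sorted(attempts_per_month(detected).items()):
        print(month, count)
```

A sudden jump in the daily count is usually a more useful signal than the absolute number, so comparing each day against a recent average is a reasonable next step.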