Downtime, Outages, and Failures: Understand Your Real Costs
This content is brought to you by Evolven. Evolven Change Analytics is a unique AIOps solution that tracks and analyzes all actual changes made to the enterprise cloud environment. Evolven helps leading companies reduce the number of incidents, reduce problem resolution time and eliminate unauthorized changes.
When it comes to mission-critical applications or data center performance, companies are willing to invest heavily. Unfortunately, these investments do not always fully pay off.
Dealing with system downtime
Despite the effort invested in infrastructure resilience, many IT organizations continue to deal with database, hardware and software outages that last from a few minutes to several days, leaving systems completely unavailable to the business and causing enormous losses.
Sometimes the world of IT outages can feel contradictory.
Despite the array of advanced solutions and the growing amount of data collected by leading enterprise software vendors and IT departments - from ERP to CRM and beyond - outages remain a valid and dire threat to the industry.
On the other hand, IT outages have somehow become an inherently accepted, even expected, part of business life.
This is counterintuitive...
IT downtime review
While IT professionals face periods of downtime from time to time and focus their efforts on overcoming them, the business organization as a whole is impacted by "financial pain" which is often quite significant.
In the past, we've delved deep into the various ways IT downtime can impact bottom-line operations (you can read more about that here: Cost and scope of unplanned outages). In doing so, we considered different aspects, from direct sales losses and reputational damage to indirect effects such as reduced productivity.
Now I want to look at the problem and explore how organizations should address and assess threats to their IT operations, including systems, applications and data, using solid (and established) benchmarks that put a figure on the potential costs of downtime and disruption.
Measuring the failures of big brands
When should the industry start measuring the financial impact of major brand failures like the one that recently hit Facebook, the one that affected hundreds of thousands of Lloyds Bank customers, or the Jetstar failure that caused hundreds of flight delays?
In other words, at what point is an outage "significant enough" that a cost analysis becomes valuable for the industry to learn and predict the impact of future outage incidents?
Well, apparently, at some point, the outage creates an impact that PR cannot ignore. This is the point of no return, followed by financial impact estimates.
The cost of downtime varies significantly across industries. The size of the affected company is obviously a critical factor, but not the only important one. The role of IT systems in the company is also crucial.
To give an IT outage a numerical value, its impact on various organizational and business aspects must be predefined so that the entire industry can learn and optimize accordingly.
A failure of a critical application can result in two different types of losses:
- Application Service Outage: The impact of downtime varies by application and organization;
- Data Loss: Possible data loss due to a system failure can have significant legal and financial implications.
Well, I'm sure you'll agree that today's data centers should never go down; applications need to be available 24/7, and internal (let alone external) end users around the world need to rely on data center availability (for critical data and application availability) at all times.
Well, reality bites. This is not the case in the back office (i.e., inside the data center). No organization enjoys 100% uptime. Should you try to reach 100%? Sure. However, you also need to develop a deep understanding of the impact of downtime and ways to minimize it.
Worst outage nightmare ever? Probably the one that happened to you...
Some past outage incidents have turned into PR disasters, like the legendary 2010 Virgin Blue disaster or the more recent one that hit Facebook.
Why? The huge impact likely had something to do with it.
As a reminder, the Virgin Blue outage prevented passengers from boarding flights for 11 days (!!), resulting in negative press, reputational damage and millions of dollars in losses.
More specifically, Virgin Blue's reservation management company, Navitaire, ended up compensating Virgin Blue for over $20 million (Navitaire Booking Error Nets Virgin $20 Million in Compo).
There are many other incidents that still attract media attention. Here is just one recent USA Today article on the Wells Fargo power outage that prevented customers from accessing their accounts for many hours.
I can safely say that anyone in IT would agree that failures or downtime are VERY bad for business. They are undesirable, very harmful financially and must be fought with all available means.
Configuration errors are fundamental
The IT Process Institute's Visible Operations Handbook reported that "80% of unplanned outages are due to poorly planned changes made by administrators ('operations personnel') or developers" (Visible Operations).
Enterprise Management Associates reported that 60% of availability and performance failures are due to misconfigurations.
How much does it cost?
Downtime can cost organizations $5,600 per minute, or over $300,000 per hour, in web application downtime (according to a 2014 Gartner analysis).
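As a quick sanity check, the per-minute figure can be converted to an hourly one. This is a minimal sketch: the function name `per_hour` is hypothetical, and the only input taken from the text is the $5,600-per-minute estimate.

```python
def per_hour(cost_per_minute: float) -> float:
    """Convert a per-minute downtime cost into a per-hour cost."""
    return cost_per_minute * 60


hourly = per_hour(5_600)
print(f"${hourly:,.0f} per hour")
```

The result, $336,000 per hour, is in line with the roughly $300,000-per-hour order of magnitude quoted alongside the per-minute figure.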
[Figure: average hourly cost of corporate server downtime worldwide, 2017-2018]
Application maintenance costs are increasing by 20% per year. But that can't solve all your problems. Previous industry research found that at least a quarter of surveyed downtime was caused by configuration errors. (How much will you spend on app downtime this year?).
How common is downtime or outage?
Okay, so downtime can be a financial nightmare. That part is clear. However, if you want to properly assess the potential risk of disruption in your organization, the immediate question must be, "How likely is this to happen?"
Source: Data Center Knowledge
Okay, failures are too common to think "I probably won't have a major failure". Now the question arises how you can calculate your specific risk for your company.
Production and application downtime costs clarified
It falls to the IT department to fix unplanned outages. However, as I mentioned earlier, these outages ultimately affect the entire organization.
An important part of a thorough outage risk assessment process is estimating how much money you will lose per hour (or minute, or whatever time frame you choose) as a result of an outage.
For organizations that rely solely on the capacity of data centers to deliver network and IT services to customers, such as telecom service providers or e-commerce companies, downtime can be particularly costly, with the cost of a single event exceeding $1 million (over $11,000 per minute), according to expert estimates.
In a USA Today survey of 200 data center managers, more than 80% said their downtime costs exceeded $50,000 per hour. More than 25% reported downtime costs of more than $500,000 per hour (!!).
According to another survey, while companies cannot achieve zero downtime, one in ten say their availability needs to be greater than 99.999%.
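To make an availability target such as 99.999% concrete, it helps to translate it into permitted downtime per year. The sketch below assumes a 365-day year; the function name `allowed_downtime_minutes` is hypothetical and the list of targets is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * MINUTES_PER_YEAR


for target in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability -> {minutes:.2f} minutes of downtime per year")
```

At 99.999% the budget is roughly 5.26 minutes per year, which shows how demanding that target is compared with the multi-hour outages described in this article.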
To get a complete understanding of the impact of production and release downtime, let's take a look at how the consequences of downtime manifest.
Downtime costs: per year or per incident?
A 2017 study found that 46% of 400 IT decision makers experienced more than four hours of IT-related downtime in a 12-month period; 23% said they incur costs ranging from $12,000 to more than $1 million per hour.
More than 35% admitted they are unsure of the cost of a business interruption.
If you ask Delta Airlines, which had to cancel 280 flights in 2017 because of a power outage, the losses from a single power outage can reach more than $150 million.
A few years ago, Dun & Bradstreet reported that 59% of Fortune 500 companies experience at least 1.6 hours of downtime per week.
If you take an average Fortune 500 company (or a company with at least 10,000 employees) and assume that it pays IT staff members an average of $56 per hour, then (assuming all of IT is busy sorting out the downtime) the labor component of downtime costs alone for a company of this size would come to $896,000 per week, which is equivalent to over $46 million per year (Assessing the financial impact of downtime).
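The labor estimate above can be reproduced in a few lines. This is a sketch under the stated assumptions: 10,000 people, $56 per hour and 1.6 hours of downtime per week come from the text, while the 52-week year and the function name `weekly_labor_cost` are assumptions made here.

```python
def weekly_labor_cost(headcount: int, hourly_wage: float,
                      downtime_hours_per_week: float) -> float:
    """Wages paid for hours lost to downtime in one week."""
    return headcount * hourly_wage * downtime_hours_per_week


weekly = weekly_labor_cost(headcount=10_000, hourly_wage=56,
                           downtime_hours_per_week=1.6)
annual = weekly * 52  # assumes the same downtime every week of the year

print(f"Weekly labor cost: ${weekly:,.0f}")
print(f"Annual labor cost: ${annual:,.0f}")
```

This gives $896,000 per week and about $46.6 million per year. Note that the numbers only work out if all 10,000 people are idle for the full 1.6 hours, which is a generous upper bound.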
Of course, reality is more complicated, as many parameters have to be taken into account, such as the time of the event (weekday or weekend? day or night?) and much more. However, understanding the cost of downtime goes a long way in assessing your potential risk and the return on investment in tools that can help minimize the impact of downtime.
Has the industry been able to learn from the past and minimize collateral damage during an outage?
How have things changed from the past?
So, we already know that there are still downtime and disruptions that the industry has yet to eliminate. But how have costs changed over time? Are these incidents less harmful today?
In 2010, a survey by Coleman Parkes found that IT downtime costs companies a total of more than 127 million hours per year in employee productivity, an average of 545 hours per company.
In 2009, the average cost of downtime varied significantly by industry, from about $90,000 per hour in the media industry to about $6.48 million per hour for large online brokerage agencies (How to quantify downtime).
According to a survey of IT managers conducted during these years, companies are becoming increasingly aware of the direct financial costs of computer failures. The survey found that one in five companies are losing $12,000 per hour due to system outages (How to quantify downtime).
As mentioned above, a subsequent Gartner analysis in 2014 found average costs of $5,600 per minute and over $300,000 per hour.
As early as 2004, a conservative estimate by Gartner put the cost per hour of computer network downtime at $42,000. As a result, a company that suffers worse-than-average downtime of 175 hours per year can lose more than $7 million per year. However, each outage affects each business differently, so it's important to know how to calculate the exact financial impact (How to quantify downtime).
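The 2004 estimate is a simple product of hourly cost and annual downtime hours, sketched below. Both inputs come from the text; the function name `annual_downtime_cost` is hypothetical.

```python
def annual_downtime_cost(cost_per_hour: float, downtime_hours_per_year: float) -> float:
    """Yearly loss implied by an hourly downtime cost and total hours down."""
    return cost_per_hour * downtime_hours_per_year


loss = annual_downtime_cost(cost_per_hour=42_000, downtime_hours_per_year=175)
print(f"Estimated annual loss: ${loss:,.0f}")
```

The product is $7,350,000, matching the "more than $7 million" figure. Swapping in your own hourly cost and downtime history gives a first-order estimate for your organization.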
It is reasonable to expect the cost of disruption to keep rising over time (since we are all more reliant on data systems these days). The figures that follow show why past data should be scaled up by a meaningful factor to reflect current reality...
Every minute counts
More than a decade ago, the average cost of data center downtime across all industries was estimated at approximately $5,600 per minute (Unplanned IT outages cost over $5,000 per minute), a figure that, according to Gartner, remained the same until 2014. The earlier Ponemon Institute study mentioned above calculated the minimum, median, mean, and maximum cost per minute of unplanned outages based on information from 41 data centers. It turned out that the highest cost of an unplanned outage exceeds $11,000 per minute.
On average, the cost of an unplanned outage is likely to exceed $5,000 per minute.
It just makes more sense
A 2013 study saw an increase of more than 41% over the previous averages described above and an average cost of more than $7,900 per minute.
A 2015 ITIC survey clearly demonstrated that the cost per hour (compared to 2008 data) has increased by 25-30%.
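The size of the 2013 increase can be checked against the two per-minute figures quoted in this section. A minimal sketch, with the hypothetical helper name `percent_increase`; the inputs are the $5,600 and $7,900 estimates from the text.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, expressed in percent."""
    if old <= 0:
        raise ValueError("old value must be positive")
    return (new - old) / old * 100


change = percent_increase(old=5_600, new=7_900)
print(f"Increase in cost per minute: {change:.1f}%")
```

The result is about 41.1%, consistent with the "more than 41%" increase reported by the 2013 study.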
Impact of downtime per year
A previous Gartner analysis estimated that downtime could be as high as 87 hours per year. Obviously, this is the sum of many interruptions, from a few minutes to several hours (The average large enterprise experiences 87 hours of network downtime per year).
How have things changed?
A 2011 investigation revealed that while the industry has been successful in combating the downtime epidemic and reducing its incidence, we are still seeing significant downtime and huge revenue losses (source: one outage resulted in over 3 million users, apparently from WhatsApp, switching to Telegram).
The impact on reputation and loyalty
How much is your company's reputation worth? This can be extremely difficult to assess, as can the long-term impact of a damaged reputation and its impact on sales and profitability.
In this case, the cost of failure includes lost customers (both short and long term) and other tangible items that reflect the cost of reputational damage, such as a falling stock price, marketing hours (crisis management and brand recovery) and the media budget needed to restore and polish the organization's profile.
Which parameters should affect your calculation?
When trying to estimate the cost of downtime, there are obvious direct costs (e.g., lost business during downtime). However, many indirect costs must also be calculated, such as the staffing costs or reputation issues mentioned above.
Staffing costs derive from the cost of exhausting “war room” tasks focused on getting IT systems up and running again, the cost of being late on all other scheduled tasks, the cost of staff overtime (if applicable) and much more. Add to this the value of data loss, emergency maintenance fees (especially if the outage occurs outside of business hours), and additional repair costs that can persist long after service is restored.
It goes without saying that you should consider these costs when calculating the impact of downtime, as they are often very high. But even a rough estimate can be extremely helpful in understanding the risks and deciding what level of technology to rely on to combat them.
There is also the impact of lost sales. To get an accurate estimate of total lost sales, the impact percentage needs to be increased to reflect the true lifetime value of customers permanently switching to a competitor. Take, for example, the Facebook (and WhatsApp) outage mentioned above (Unconscious Costs: Denying the True Cost of Network Downtime). How much revenue is lost if these users generate fewer billable ad impressions?
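One way to keep the direct and indirect parameters discussed in this section from being overlooked is to record them side by side. The sketch below is only an illustration: the class name, the field names and every figure are placeholders chosen here, not data from any of the studies cited.

```python
from dataclasses import dataclass


@dataclass
class DowntimeCosts:
    """Cost components of one outage, all in the same currency."""
    lost_sales: float             # direct: business lost during the outage
    staff_costs: float            # war-room time, delayed tasks, overtime
    data_loss: float              # value of lost or corrupted data
    emergency_maintenance: float  # out-of-hours call-outs and fees
    follow_up_repairs: float      # work that continues after service is restored
    reputation: float             # rough allowance for lost customers and PR

    def indirect(self) -> float:
        """Everything except the sales lost during the outage itself."""
        return (self.staff_costs + self.data_loss + self.emergency_maintenance
                + self.follow_up_repairs + self.reputation)

    def total(self) -> float:
        return self.lost_sales + self.indirect()


# Placeholder figures only, chosen for illustration.
estimate = DowntimeCosts(
    lost_sales=40_000, staff_costs=12_000, data_loss=5_000,
    emergency_maintenance=3_000, follow_up_repairs=4_000, reputation=10_000)

print(f"Direct:   ${estimate.lost_sales:,.0f}")
print(f"Indirect: ${estimate.indirect():,.0f}")
print(f"Total:    ${estimate.total():,.0f}")
```

Even with rough inputs, separating the components makes it easier to see which assumption drives the total and where a better estimate would matter most.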
Stock fell 25%
Although it is difficult to quantify so many parameters, they are real and significant. For example, when Amazon.com was offline for several hours in its early days, its stock dropped 25% in a single day (Unconscious Costs: Denying the True Cost of Network Downtime)!
In the Amazon cloud outage, for example, the company continued to struggle to get its cloud services back online. As a result, many customers questioned the reliability of its cloud and Amazon's communications surrounding the outage. Other customers felt that they should be compensated for downtime as part of their SLA.
I know you're curious: in terms of the SLA, Amazon's EC2 SLA was not breached despite the nearly four-day outage (Seven Lessons from the Amazon Outage).
The cost of downtime: calculate it yourself
How much will you lose from unexpected server or business application downtime?
According to various sources, the easiest way to calculate the potential for lost revenue during an outage is to use this equation:
|LOST REVENUE||=||(GR/TH) x I x H|
|GR||=||gross annual revenue|
|TH||=||total annual business hours|
|I||=||percentage impact|
|H||=||number of hours of outage|
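The lost-revenue equation above can be sketched as a small function. Here GR is gross annual revenue, TH is total annual business hours, I is the fraction of revenue affected by the outage and H is the number of outage hours; the function name and the example values are hypothetical.

```python
def lost_revenue(gross_annual_revenue: float, total_annual_hours: float,
                 impact: float, outage_hours: float) -> float:
    """LOST REVENUE = (GR / TH) x I x H."""
    if total_annual_hours <= 0:
        raise ValueError("total_annual_hours must be positive")
    if not 0 <= impact <= 1:
        raise ValueError("impact must be a fraction between 0 and 1")
    revenue_per_hour = gross_annual_revenue / total_annual_hours
    return revenue_per_hour * impact * outage_hours


# Illustrative inputs: a business open around the clock (8,760 hours a year)
# earning $17.52 million a year, with 40% of revenue affected for 3 hours.
loss = lost_revenue(gross_annual_revenue=17_520_000, total_annual_hours=8_760,
                    impact=0.4, outage_hours=3)
print(f"Estimated lost revenue: ${loss:,.0f}")
```

With these inputs the hourly revenue is $2,000 and the estimated loss is $2,400. The formula covers lost revenue only, so staffing, data loss and reputational costs still need to be added separately.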
How to minimize the risk of interruptions and stoppages?
Downtime and crashes are catastrophic, but they don't have to be overly shocking. By using solutions that focus on getting to the root of the problem, failures can be prevented before they happen.
Evolven Change Analytics developed a unique AIOps solution that focuses on the changes that are the root cause of performance issues. Evolven helps the company's IT and cloud operations teams prevent and correct incidents before they start.
Contact us to see how we're helping leading companies dramatically reduce incidents and MTTR.
Downtime cost is defined as any profit that a company loses when its equipment or network stops functioning. The cost of downtime implies not only direct financial loss but can also affect your company in at least four other ways.
What is the real cost of downtime?
For the Fortune 1000, the average total cost of unplanned application downtime per year is $1.25 billion to $2.5 billion. The average cost of an infrastructure failure is $100,000 per hour.
What is the difference between downtime and outage?
Downtime occurs when a system can't complete its primary function. It can be broken up into two types: IT outages and brownouts. IT brownouts occur when a system is slowed or partially available. This might mean customers can access your site, but pages load slowly or dynamic features like "add to cart" don't function.
What is the meaning of outage cost in business?
Outage Costs means the actual increased costs of replacement energy incurred by Transmission Owner during an Outage calculated in accordance with this section and does not include costs that would have been incurred notwithstanding the Generating Facility interconnection.
Calculating Downtime Cost
The duration of the downtime and the cost incurred per minute you're offline are the two variables that most affect the financial impact of an outage.
The cost of downtime = downtime duration x per-minute cost.
You can use around $400 as a cost-per-minute figure for small enterprises. In the case of large and medium businesses, use $10,000. Many people only associate downtime costs with lost revenue.
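The rule of thumb above (duration times per-minute cost) can be sketched as follows. The two per-minute rates are the ones quoted in the text; the dictionary keys, the function name and the 90-minute example are assumptions for illustration.

```python
# Per-minute rates quoted in the text, in dollars.
COST_PER_MINUTE = {
    "small": 400,
    "medium_or_large": 10_000,
}


def downtime_cost(minutes: float, size: str) -> float:
    """Downtime cost = downtime duration x per-minute cost."""
    if minutes < 0:
        raise ValueError("minutes must not be negative")
    if size not in COST_PER_MINUTE:
        raise ValueError(f"unknown business size: {size!r}")
    return minutes * COST_PER_MINUTE[size]


for size in COST_PER_MINUTE:
    print(f"90-minute outage, {size}: ${downtime_cost(90, size):,.0f}")
```

A 90-minute outage comes to $36,000 at the small-enterprise rate and $900,000 at the medium-or-large rate. Replacing the default rates with a per-minute figure derived from your own revenue gives a more meaningful result.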
Common categories of downtime include excessive tool changeover, excessive job changeover, lack of operator, and unplanned machine maintenance.
What are some examples of downtime?
Downtime has many causes, including shutdowns for maintenance (known as scheduled downtime), human errors, software or hardware malfunctions, and environmental disasters such as power outages, fires, flooding or major temperature changes.
How do you solve downtime?
- Track Downtime. Before jumping into the steps of reducing downtime, it is critical to track it. ...
- Monitor Production. Having a system to monitor production can also help reduce downtime. ...
- Create a Preventative Maintenance Schedule. ...
- Provide Operator Decision Support. ...
- Perform DMAIC Analysis.
TDC is a methodology of analyzing all cost factors associated with downtime, and using this information for cost justification and day to day management decisions. Most likely, this data is already being collected in your facility, and need only be consolidated and organized according to the TDC guidelines.
Repeated downtime events can result in unhappy customers, which can quickly translate into bad customer reviews and tarnished brand image. Data Loss: Downtime affects not only your business but your clients as well. Downtime due to cyberattacks, server or network outage can result in corrupt, damaged or stolen data.
What are the two types of downtime?
Downtime falls into two categories: planned and unplanned. Planned downtime is notable because it offers advance warning and gives users a chance to prepare. Planned downtime is usually done for upgrades or maintenance to the network infrastructure.
How do you explain downtime?
A time during a regular working period when an employee is not actively productive; an interval during which a machine is not productive, as during repair, malfunction, maintenance.
How do you define an outage?
An interruption or failure in the supply of power, especially electricity; the period during which power is lost: a two-hour outage on the East Coast.
How much does 1 hour of downtime cost the average business?
About 98% of organizations claim only one hour of downtime costs over $100,000. Looking at each industry's breakdown, we'll find out if this is true. In the IT industry, downtime is typically calculated at about $5,600 per minute.
How do companies keep their costs down?
Cost cutting measures may include laying off employees, reducing employee pay, closing facilities, streamlining the supply chain, downsizing to a smaller office, or moving to a less expensive building or area, reducing or eliminating outside professional services, such as advertising agencies and contractors, etc.
How much does unplanned downtime cost?
That downtime comes at a cost, and it isn't cheap. For example, the average automotive manufacturer loses $22,000 per minute when the production line stops. That quickly adds up. Overall, unplanned downtime costs industrial manufacturers as much as $50 billion a year.
What is a high cost of downtime?
How Much Does Downtime Cost a Company? The average cost of downtime is significant. Each minute costs an average of $9,000, according to the Ponemon Institute, bringing the downtime cost per hour to over $500,000.
What is the industry standard for downtime?
World Class Standards For Downtime
Aim for unscheduled downtime to be 10% or less.
All manufacturing downtime reduces overall output by stopping production. Unplanned downtime can cost 15 times more than planned downtime. The loss of revenue during any type of asset maintenance can be as high as $3 million per incident.
The first way to measure your equipment downtime is in actual time. For a given asset (or set of assets), record the amount of time during each month that the asset is broken down. Keeping a running tally and comparing it to past months will help you know when an asset is having more issues than normal.
What is downtime in accounting?
Downtime is the period during which equipment is not operational. This situation is caused by such factors as maintenance, setup for a job, broken equipment, or missing inputs, such as raw materials or qualified operators.
What is the importance of downtime?
A little downtime is important for your brain health. Research has found that taking breaks can improve your mood, boost your performance and increase your ability to concentrate and pay attention. When you don't give your mind a chance to pause and refresh, it doesn't work as efficiently.
What is a downtime plan?
Planned downtime is scheduled time when production equipment is limited or shut down to allow for planned maintenance, repairs, upgrades or testing.
What is downtime for maintenance?
In manufacturing, “downtime” occurs when an unplanned event halts production for a period of time. This event can be a malfunction, repair, or changeover of tools or equipment. Maintenance downtime in particular is when a machine is not operating or being productive due to required maintenance work.
What is downtime behavior?
Downtime behavior determines how events related to a CI are handled when received while that CI was in downtime. To access: Administration > Event Processing > Automation > Downtime Behavior. Alternatively, click Downtime Behavior.
How can we minimize the risk of system downtime?
- Test Server Backups On A Regular Basis. When a server goes down, you can mitigate damage by restoring it quickly. ...
- Utilize Cloud Solutions. ...
- Keep Everything Up To Date. ...
- Invest In Reliable Equipment.
To get a quick estimate of your company's probable downtime costs, use the following formula, based on the size of your business and the number of minutes your most recent incident lasted: Downtime cost = minutes of downtime x cost-per-minute.
What is unplanned downtime?
Unplanned downtime occurs when there is an unexpected shutdown or failure of equipment or process. Unplanned downtime not only causes costly delays in maintenance, production schedules and order deliveries, but it also increases the chance of personnel injury, environmental incidents and emergency repairs.
How do you optimize maintenance and operation costs?
- Eliminate tasks that do not correspond to any failure mode.
- Instead of “fixing”, find a cure.
- Optimise work orders.
- Avoid reactive maintenance.
- Negotiate contracts with current suppliers.
- Know the life cycle of your assets.
- Cut down on day-to-day wastage.
- Optimise MRO inventory.
Consequences of unplanned downtime
Lost productivity and revenue: Every minute of downtime can result in lost productivity and revenue, affecting a business's bottom line. Decreased customer satisfaction: Unplanned downtime can lead to delayed deliveries, canceled orders, and frustrated customers.
“You can't have the high without the low. The better you are at resting, the better you will be at working.” Downtime is essential for increasing attention, boosting mood, unlocking creativity, and solving problems. It's also necessary for improving learning and memory and restoring mental health at work.
Is downtime a KPI?
Revenue is directly impacted by downtime because the less equipment is running, the fewer products are made and sold. Therefore, one of your maintenance KPIs is downtime. All sorts of quantifiable actions can influence downtime, such as the mean time to repair (MTTR) or planned maintenance percentage.
How do you handle downtime at work?
- Offer to help a colleague or manager. ...
- Organize and clean your workspace. ...
- Go for a walk. ...
- Clean your email inbox. ...
- Read industry news. ...
- Compile a list of contacts. ...
- Record your voicemail greeting. ...
- Write a note of appreciation.
On this page you'll find 11 synonyms, antonyms, and words related to outage, such as: blackout, brownout, disruption, interruption, dimout, and disconnection.
What is an outage problem?
An Internet outage or Internet blackout or Internet shutdown is the complete or partial failure of the internet services. It can occur due to censorship, cyberattacks, disasters, police or security services actions or errors.
What is a major outage?
More Definitions of Major Outage
Major Outage means any Power Outage that lasts for at least ten (10) consecutive minutes and/or any Temperature Irregularity, in each case causing inoperability of Customer's Equipment.
Combination: Bundle goods and services across an organization to reduce costs. Elimination: Remove unnecessary products, processes, benefits, and workflows. Optimization: Streamline processes and workflows to reduce bottlenecks and redundancies. Substitution: Use cheaper products or services.
How can you reduce costs without affecting quality?
- Renegotiate with Suppliers. ...
- Buy in Larger Quantities. ...
- Improve Efficiency. ...
- Reduce Wastage. ...
- Outsource Tasks. ...
- Review Employee Productivity. ...
- Cut Energy Usage. ...
- Review Finance Arrangements.
- Reduce Power Use. ...
- Analyze Costs and Minimize Spend. ...
- Negotiate With Suppliers. ...
- Restructure Your Rent. ...
- Maximize Productivity. ...
- Digital Everything. ...
- Reduce Wages & Reduce Hours. ...
It's important to give your brain a break numerous times throughout the day, experts say. While there's no hard-and-fast prescription, try aiming for a rest period about every 90 minutes or whenever you start to feel drained, are unable to concentrate, or are stuck on a problem, suggests Friedman.
Why is it called downtime?
downtime (n.) also down-time, 1952, "time when a machine or vehicle is out of service or otherwise unavailable;" from down (adj.) + time (n.). Of persons, "opportunity for rest and relaxation," by 1982.
What is 5 nines availability downtime?
Availability is normally expressed in 9's. For example, “5 nines uptime” means that a system is fully operational 99.999% of the time, an average of less than 6 minutes of downtime per year.
What are downtime metrics?
The most well-known downtime metric is Mean Time to Repair (MTTR). The MTTR metric reflects the average time it takes to troubleshoot and repair a failed piece of equipment.
What is maximum tolerable downtime?
The amount of time mission/business process can be disrupted without causing significant harm to the organization's mission.
Downtime in production is separated into two different categories: planned and unplanned. 'Planned downtimes' are scheduled and budgeted stops during production such as scheduled maintenance and product changeovers.
What is an example of a downtime cost?
For example, if your revenue is $2,000 per hour, your system goes down for 3 hours, and you depend on the Internet (uptime) for 40% of that revenue, then your estimated downtime loss would be $2,400 in total ($2,000 x 3 x 0.4).
What is the explanation of downtime?
The term downtime is used to refer to periods when a system is unavailable. The unavailability is the proportion of a time-span that a system is unavailable or offline. This is usually a result of the system failing to function because of an unplanned event, or because of routine maintenance (a planned event).
What is meant by the term downtime?
Time during which production is stopped, especially during setup for an operation or when making repairs; inactive time (such as time between periods of work), as in "napping during our downtime".
What is downtime and its causes?
Downtime is a period during which production or business processes come to a halt due to application unavailability, technical glitch, network outage or natural disaster.
- Know the best windows of time for planned downtime based on your company's production cycle. ...
- Prioritize all your assets and know which should be handled first. ...
- Implement clear guidelines and well-defined standard operating procedures (SOPs) for each repeated operation.
- Plan for Recovery. The best way to ensure a fast recovery is to plan ahead. ...
- Keep Everything Up to Date. ...
- Educate Your Workforce. ...
- Install a Backup Power System. ...
- Test Your Infrastructure. ...
- Consider Disaster Recovery as a Service.
What is downtime at work? It is a period during which a piece of equipment or a machine is not functional or cannot work. It may be due to technical failure, machine adjustment, maintenance, or non-availability of inputs such as materials, labor, power.
What is another word for downtime?
A break or intermission in work or activity. break. pause. intermission. interlude.