A single point of failure (SPOF) in the context of a website and a web server refers to any critical component whose failure would result in the entire system becoming unavailable.
It seems that A2 Hosting experienced a full network outage yesterday at around 8:22 PM (Eastern), which lasted for 1 hour and 45 minutes. Services were restored by approximately 10:07 PM (Eastern).
The issue was not isolated to any particular servers, but rather appeared to be a connectivity problem at the data center(s) in Michigan where your servers are located. The Arizona, Singapore and Netherlands data centers were not impacted.
A2 Hosting has been vague in providing details about the incident. You can review their updates here: https://a2status.com/incidents/4098
To reiterate, this was not a problem with your specific servers, but a connectivity issue at the Michigan data center.
Several of you have asked about ways to prevent this type of issue from impacting your businesses, and I completely understand the concern.
This situation is what we call a Single Point Of Failure or SPOF.
In information technology (IT), a single point of failure (SPOF) is a component or part of a system that, if it fails, causes the entire system or service to fail. This critical element is the only source providing a specific service, so when it goes down nothing else can take over, and operations halt or access is disrupted.
SPOFs can exist in various forms, such as:
Hardware: A single server, power supply, or network switch that isn’t backed up or redundant. If it fails, everything dependent on it goes offline.
Software: A single application or database that all users rely on. If that application crashes, users lose access.
Network: A single router or connection point that manages all traffic. If it goes down, network communications can cease.
Human: A key person with unique knowledge or access. If that person becomes unavailable, critical tasks may be delayed or impossible to complete.
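For the more technically minded, the categories above can be sketched as a simple inventory check: list each role your site depends on, count how many independent copies of it exist, and flag anything with only one. The role names and counts below are hypothetical, purely for illustration; they are not a description of A2 Hosting's actual setup.

```python
from typing import Dict, List

def find_spofs(inventory: Dict[str, int]) -> List[str]:
    """Return the roles that have exactly one instance, i.e. no backup."""
    return sorted(role for role, count in inventory.items() if count == 1)

# Hypothetical inventory for a small hosted website (illustrative numbers only).
site_inventory = {
    "web server": 1,
    "database": 1,
    "data center network uplink": 1,
    "power supply": 2,
    "person with admin access": 1,
}

for role in find_spofs(site_inventory):
    print(f"Single point of failure: {role}")
```

Anything the check prints is a component whose failure, on its own, would take the site offline.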
IT teams try to eliminate single points of failure through redundancy (backups, failover systems) and distributed architecture (spreading responsibilities across multiple systems) so that no single failure can bring down the entire system. This, however, can be cost-prohibitive for most of us.
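As a rough illustration of what failover means in practice, here is a minimal Python sketch. It tries a list of backends in order and returns the first one that answers. The two backend functions are hypothetical stand-ins that simulate one unreachable location and one healthy location; a real setup would involve duplicate servers, synchronized data, and traffic routing, which is where the cost comes from.

```python
from typing import Callable, Sequence

def fetch_with_failover(backends: Sequence[Callable[[], str]]) -> str:
    """Try each backend in order and return the first successful response."""
    errors = []
    for backend in backends:
        try:
            return backend()
        except ConnectionError as exc:
            # This backend is down; remember why and try the next one.
            errors.append(str(exc))
    raise ConnectionError(f"all {len(backends)} backends failed: {errors}")

# Hypothetical backends: the first simulates an unreachable data center.
def michigan_backend() -> str:
    raise ConnectionError("Michigan data center unreachable")

def arizona_backend() -> str:
    return "page served from Arizona"

# With a second location available, the outage is invisible to visitors.
print(fetch_with_failover([michigan_backend, arizona_backend]))
```

With only `michigan_backend` in the list, the same call raises an error, which is the single point of failure in miniature.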
For example, let’s look at Facebook (Meta). They have over 12,000 physical servers located in 55 distinct data centers globally. They can lose connectivity to many servers, or even entire data centers, and https://www.facebook.com will still load.
Though Google is very tight-lipped about its numbers, we can assume the same level of redundancy.
The bottom line is that, to keep your costs at a manageable level, a SPOF is almost unavoidable.
I hope this helps explain the system and how it works and sets an expectation for availability. Not pleasant. But factual.
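To put a rough number on that expectation, here is a small Python sketch that converts the 1 hour and 45 minutes of downtime into an availability percentage. It assumes a 30-day month and that this outage was the only downtime in that month; both are assumptions for illustration, not figures from A2 Hosting.

```python
def availability_percent(downtime_minutes: float, days_in_period: int = 30) -> float:
    """Percentage of the period during which the service was reachable."""
    total_minutes = days_in_period * 24 * 60
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes

# Outage length reported above: 1 hour and 45 minutes.
outage_minutes = 1 * 60 + 45

# Assumption: a 30-day month with no other downtime.
monthly = availability_percent(outage_minutes, days_in_period=30)
print(f"Availability for the month: {monthly:.3f}%")
```

Under those assumptions the month works out to roughly 99.76% availability.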
Lastly, changing hosting companies does not really solve the issue. Over the years I have worked with numerous hosting companies, including InMotion Hosting, Arvixe, HostGator, HostPapa, A Small Orange and several others. All hosting companies are built on the same basic structure. Changing to a different hosting company simply moves the SPOF to another provider. It changes nothing as far as the potential for failures.
Scott W
Web Tweeks, LLC