18-03-2009, 11:23 PM
|
#1 (permalink)
|
|
Registered User
Join Date: Dec 2008
Location: australia
Posts: 5
|
SLA
|
|
99% SLA on network uptime.
Can someone please explain what the above actually means? With a dedicated server, what is the normal or acceptable amount of time per month for the server to be down for whatever reason? I'm asking because I have only recently started paying for a managed dedicated server (found here) and am wondering what is and isn't acceptable.
I left my previous shared hosting because the site was down for an average of an hour or so a week. The new dedicated server was great the first month but now has been down several times in a matter of a week or more. I'm not sure if I am just getting nervous because of my previous bad experience, but I am also paying a lot more than before, so I expect more from the server. So what would you consider to be a normal amount of downtime?
|
|
|
19-03-2009, 12:06 AM
|
#2 (permalink)
|
|
Ozzie Web Hosting
Join Date: Oct 2006
Location: Hunter Valley, NSW
Posts: 586
|
Quote:
|
but now has been down several times in a matter of a week or more
|
If you can rule out any issues with your ISP, then that amount of downtime is not acceptable. I'd be seeking answers as to why the server was having issues, or whether the provider was experiencing network issues that would cause your server to be down.
|
|
|
19-03-2009, 12:41 AM
|
#3 (permalink)
|
|
Ozzie Web Hosting
Join Date: Oct 2006
Location: Hunter Valley, NSW
Posts: 586
|
99% SLA on network uptime: network uptime is the total time in a calendar month that the provider's network is available through the Internet.
It's a guarantee that the provider's network will be up and functioning 99% of the time per calendar month. However, you need to read the specific terms of the provider's service level agreement to determine what is and isn't included in this guarantee.
Hope this helps clarify your question.
|
|
|
19-03-2009, 01:18 AM
|
#4 (permalink)
|
|
Registered Provider
Join Date: Feb 2008
Location: Perth, Australia
Posts: 181
|
Just to clarify here, a network uptime SLA usually does not include your physical server. The physical server can quite often have a different SLA from the network (e.g. 99.98% on the network and 99.95% on the physical host).
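As a rough illustration of why the two figures matter together: a site is only reachable when both the network and the physical host are up. The sketch below assumes the two fail independently, which is a simplification and not something any provider's SLA states, and the function name is made up for this example.

```python
def combined_availability(*percentages):
    """Availability (in percent) when every listed component must be up at once.

    Assumption: failures are independent, so the availabilities multiply.
    """
    result = 1.0
    for percent in percentages:
        result *= percent / 100
    return result * 100


# Example figures from the post above: 99.98% network, 99.95% physical host.
print(f"{combined_availability(99.98, 99.95):.3f}%")
```

Under that assumption, the overall figure ends up slightly lower than either SLA on its own.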
|
|
|
19-03-2009, 08:04 AM
|
#5 (permalink)
|
|
Registered User
Join Date: Dec 2008
Location: australia
Posts: 5
|
Thanks for the responses. It appears last night's issue was that my IP was blocked by the firewall, so it was only me that could not access the server. The last downtime was a power outage in the datacentre so was out of anyone's control. I guess I will keep learning about dedicated servers so that I understand things a bit more.
|
|
|
19-03-2009, 10:31 AM
|
#6 (permalink)
|
|
New Sprout Hosting
Join Date: Mar 2009
Location: Australia
Posts: 9
|
Hi sparka,
To expand on this a little:
Quote:
Originally Posted by sparka
99% SLA on network uptime.
Can someone please explain what the above actually means?
|
It depends on the number of days in a particular month, for example,
31 days x 24 hours x 99% = 736.56 hours uptime out of a possible 744 hours, therefore a possible 7.44 hours (roughly 7.5 hours) of downtime each month.
If it were 99.9% uptime, only about 45 minutes of downtime would be acceptable.
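The same arithmetic can be sketched in a few lines of Python for any month and SLA percentage. The function name is only illustrative, and it assumes the SLA is measured per calendar month, as described earlier in the thread.

```python
import calendar


def allowed_downtime_minutes(sla_percent, year, month):
    """Minutes of downtime an uptime SLA permits in the given calendar month."""
    days = calendar.monthrange(year, month)[1]  # number of days in that month
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


for sla in (99.0, 99.9):
    minutes = allowed_downtime_minutes(sla, 2009, 3)
    print(f"{sla}% over March 2009: {minutes:.1f} min ({minutes / 60:.2f} h)")
```

A shorter month gives a smaller allowance: at 99%, a 28-day February permits 6.72 hours instead of 7.44.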
Quote:
Originally Posted by sparka
The last downtime was a power outage in the datacentre so was out of anyone's control
|
This should be something which can be controlled. You should be covered by the datacentre's UPSes (uninterruptible power supplies), which manage the power coming into the DC and should switch over seamlessly to battery and then generator power if there is a drop in mains power, thus avoiding any downtime whatsoever.
Cheers,
Gavin
|
|
|
19-03-2009, 01:08 PM
|
#7 (permalink)
|
|
Registered User
Join Date: Feb 2008
Location: My house!
Posts: 188
|
Quote:
Originally Posted by newsprout
This should be something which can be controlled. You should be covered by the datacentre's UPSes (uninterruptible power supplies), which manage the power coming into the DC and should switch over seamlessly to battery and then generator power if there is a drop in mains power, thus avoiding any downtime whatsoever.
|
Sadly if you have been following developments lately you'd know that doesn't always happen, with both Primus and Equinix suffering from power related outages in the past couple of months.
But in theory... UPSes are a wonderful thing.. when they work!! 
|
|
|
19-03-2009, 01:14 PM
|
#8 (permalink)
|
|
Registered Provider
Join Date: Feb 2008
Location: Perth, Australia
Posts: 181
|
Quote:
Originally Posted by Bendweb
Sadly if you have been following developments lately you'd know that doesn't always happen, with both Primus and Equinix suffering from power related outages in the past couple of months.
But in theory... UPSes are a wonderful thing.. when they work!! 
|
That's why in all our racks we use about 8-10RU just for our own UPSes. They aren't anything awesome, but 5-10 minutes is all we need, just in case something goes wrong with our DC's UPSes or generators... We learnt the hard way...
|
|
|
21-03-2009, 12:46 AM
|
#9 (permalink)
|
|
Cooking? Traderecipes.net
Join Date: Jun 2002
Location: Brisbane, QLD, Australia
Posts: 1,282
|
Quote:
Originally Posted by Bendweb
Sadly if you have been following developments lately you'd know that doesn't always happen, with both Primus and Equinix suffering from power related outages in the past couple of months.
But in theory... UPSes are a wonderful thing.. when they work!! 
|
Well, if you're a customer of the above companies then I presume they've supplied uptime guarantees as well. Not sure about Equinix, but from reports around, Primus had a cascading failure of UPS and then backup generator equipment. To me that sounds like a broken ATS (automatic transfer switch) or something that wasn't regularly checked and maintained (like generator tests etc).
Power issues are more avoidable than a lot of other issues and I don't understand why there's been a bit of a rash of them lately. Makes me wonder if it comes down to the crunch between higher density racks vs. higher than design spec installations.
Stu
__________________
GooFi - Google Maps WiFi!
Mobile ME?
|
|
|
23-03-2009, 09:57 AM
|
#10 (permalink)
|
|
Registered Provider
Join Date: Mar 2009
Location: Sydney
Posts: 17
|
Sparka,
A normal or acceptable amount of downtime to me would be no more than 45 minutes to 1.5 hours in any given month. You shouldn't expect to see this level of outage consistently month after month though. It should be quite reasonable to expect many months of the year without any outages at all.
There are a few things you can do to get a better picture of what is happening. Set up some external monitoring; there are a bunch of basic free ones. Be careful though: as you've already experienced, not all detected outages will be your host's fault. There are often 4-5 or more independent networks between you and your provider.
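For anyone who wants to see what such a monitor does under the hood, here is a minimal sketch using only the Python standard library. It is not a replacement for a proper external service: the function names are made up for this example, and it needs to run from somewhere other than your own connection, otherwise a local problem (such as a blocked IP) will look like a server outage.

```python
import http.client
import time
import urllib.error
import urllib.request


def is_reachable(url, timeout=10):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False


def monitor(url, checks, interval_seconds, probe=is_reachable, sleep=time.sleep):
    """Probe the URL `checks` times and return a list of (timestamp, ok) pairs."""
    results = []
    for i in range(checks):
        results.append((time.time(), probe(url)))
        if i < checks - 1:
            sleep(interval_seconds)
    return results


def uptime_percent(results):
    """Share of successful probes, as a percentage."""
    if not results:
        raise ValueError("no probe results to summarise")
    ok = sum(1 for _, up in results if up)
    return 100 * ok / len(results)
```

A call such as `monitor("http://www.example.com/", checks=12, interval_seconds=300)` would probe a placeholder URL every five minutes for an hour. Keeping the timestamps makes it easier to match failed probes against the outage reports your host sends you.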
For each outage, courteously request an outage report; if you don't get enough detail, ask for more. If the outage reports consistently relate to the same problem over a period of more than a few weeks, you should start to ask why the root cause of the issue is not being addressed. A good outage report should offer some explanation of what steps are being taken to prevent a similar type of outage occurring again. In reality outages happen, and they happen to everyone. The important thing is that the provider is learning from the problems, analysing them and doing things to stop the same thing causing another outage.
A 99% SLA sounds a little low; as Gavin pointed out, this allows about 7.5 hours of downtime per month. On the network level you should be looking for something around 99.8% or 99.9%.
An SLA by definition is supposed to represent the level of downtime that can occur before the provider has to start providing some form of rebate against the cost of the monthly service. It's basically a penalty that should act as an encouragement to keep outages down. It also serves to set expectations. If they are saying "we'll provide 99% uptime, and if we don't we'll give you your money back", you are right in expecting that on average your uptime should be somewhere around 99%.
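To check a month's outages against the guaranteed figure, something like the sketch below would do. The outage durations are invented for illustration, the function names are hypothetical, and whether a particular outage counts at all depends on the fine print of your provider's SLA.

```python
def achieved_uptime_percent(outage_minutes, days_in_month):
    """Uptime actually achieved in a month, given each outage's length in minutes."""
    total_minutes = days_in_month * 24 * 60
    return 100 * (total_minutes - sum(outage_minutes)) / total_minutes


def sla_breached(outage_minutes, days_in_month, sla_percent):
    """True if the achieved uptime fell below the guaranteed percentage."""
    return achieved_uptime_percent(outage_minutes, days_in_month) < sla_percent


# Hypothetical month: three outages of 60, 90 and 45 minutes in a 31-day month.
outages = [60, 90, 45]
print(f"achieved uptime: {achieved_uptime_percent(outages, 31):.2f}%")
print("breaches a 99% SLA:  ", sla_breached(outages, 31, 99.0))
print("breaches a 99.9% SLA:", sla_breached(outages, 31, 99.9))
```

With these made-up figures the month totals 195 minutes of downtime, which stays inside a 99% SLA but would breach a 99.9% one.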
In my view SLAs are of questionable value these days. There are often many caveats in the fine print that affect the way in which they are paid out. In reality, even if they are always paid out and paid automatically, if the outages are frequent enough and long enough that you are getting money, it's probably time to move. This loss of business will also speak louder to the provider than the cost of paying out on the SLA.
By definition, as a business you should be adding value to the services you buy from your suppliers; if you're not, you don't have a business. If a provider gives you back a few percent of the cost of a service, and the cost of that service is only a small percentage of your operating expenses, the money you get will do nothing to appease your unhappy customers. It won't go anywhere near the cost of paying out on your own SLA if you need to.
If you're signing a long-term contract for hosting, the termination clauses are perhaps of more interest. Make sure that you can exit the contract if the company fails to provide an acceptable level of service. If it's outage central and you need to move but you're still liable to pay out the value of the contract, then you have a problem.
Andrew Rogers
|
|
|
23-03-2009, 01:16 PM
|
#11 (permalink)
|
|
Australian Data Hosting
Join Date: Feb 2007
Location: Melbourne
Posts: 1,083
|
Quote:
Originally Posted by anchor
For each outage, courteously request an outage report; if you don't get enough detail, ask for more. If the outage reports consistently relate to the same problem over a period of more than a few weeks, you should start to ask why the root cause of the issue is not being addressed. A good outage report should offer some explanation of what steps are being taken to prevent a similar type of outage occurring again.
|
G'day Andrew and sparka,
The above is a pretty accurate summation of things. The only bit I would add is that you must first be sure the problem is not at your end. You'll only be frustrated by the WTF questions from your host if they know they have been on the air the whole time. There is a terrific website with a name like www.ismysitedown.com. That's not it and I can't remember the proper domain name. Perhaps someone else will be kind enough to post it. The other one I use is Central Ops to make sure it's not me or my ISP.
PS: The bit about being courteous is much appreciated.
__________________
Cheers,
Mike
I may be house trained but I still don't do windows!
|
|
|
23-03-2009, 01:51 PM
|
#13 (permalink)
|
|
Australian Data Hosting
Join Date: Feb 2007
Location: Melbourne
Posts: 1,083
|
Thank you Vee, that's the site.
I still love putting their own URL in to check. It cracks me up.
__________________
Cheers,
Mike
I may be house trained but I still don't do windows!
|
|
|
25-03-2009, 01:11 PM
|
#14 (permalink)
|
|
Ozzie Web Hosting
Join Date: Oct 2006
Location: Hunter Valley, NSW
Posts: 586
|
Quote:
Originally Posted by adhc
I still love putting their own URL in to check. It cracks me up.
|
It's not hard to crack you up, Mike, but you're right, it cracked me up too!
|
|
|
25-03-2009, 10:43 PM
|
#15 (permalink)
|
|
Registered User
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 4
|
Quote:
Originally Posted by newsprout
This should be something which can be controlled. You should be covered by the datacentre's UPSes (uninterruptible power supplies), which manage the power coming into the DC and should switch over seamlessly to battery and then generator power if there is a drop in mains power, thus avoiding any downtime whatsoever.
Cheers,
Gavin
|
I was actually at Equinix when this outage occurred. It was due to a 63A power rail failing (melting) because it was powering way too many racks. I think we counted 15 racks that were out due to this one breaker turning into soup.
Unfortunately this isn't something that either the network providers or the webhost providers can prevent, simply because it is Equinix's responsibility to monitor their power grid. I'm not aware of any data centers that will allow you to run around under the raised floor, looking at all the breakers and rails.
I've got some photos of the breaker if anybody is interested in looking at the damage that was caused.
|
|
|