Author Topic: 11/23/2015 16:45 Down again? UP 1322 28 Nov 2015!  (Read 189296 times)

Offline Kestryll

  • Administrator
  • Sr. Member
  • *****
  • Posts: 393
  • Karma: +88/-57
    • View Profile
    • Calguns.net
Re: 11/23/2015 16:45 Down again?
« Reply #180 on: November 24, 2015, 11:30:58 AM »
I have some new information, and it's pretty much a case of a series of unfortunate events.

A little background: data is stored on mirrored disks which are then mirrored to another pair, which in turn are mirrored to another pair, and so on. In addition there are several backup disks which come into play should one of the main disks fail.
And boy did it fail.

Apparently drive 2 of the primary pair failed, so the server called up backup drive A and started rebuilding the mirror with drive 1 and drive A.
At the completion of the rebuild drive 1 failed. Drive A, now the primary drive of the primary pair, called up backup drive B and started rebuilding the mirror.
Midway through the second rebuild drive A failed taking out the primary mirrored pair.
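
For anyone trying to follow the sequence, here is a rough Python sketch of it. This is only a toy model, assuming a two-disk mirror with a pool of hot spares; the `Mirror` class and its method names are made up for illustration and say nothing about how the real controller works.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Mirror:
    """Toy model of a two-disk mirror backed by a pool of hot spares."""
    members: List[str]
    spares: List[str] = field(default_factory=list)
    target: Optional[str] = None  # disk currently being rebuilt onto, if any

    def fail(self, disk: str) -> str:
        """Remove a failed disk and return the resulting state."""
        self.members.remove(disk)
        if self.target is not None and disk != self.target:
            # The rebuild source died before the copy finished, so no
            # member holds a complete copy of the data any more.
            self.members.clear()
            self.target = None
            return "lost"
        # Either no rebuild was running, or only the rebuild target died.
        self.target = None
        if not self.members:
            return "lost"
        if self.spares:
            # Pull in the next hot spare and start copying onto it.
            self.target = self.spares.pop(0)
            self.members.append(self.target)
            return "rebuilding"
        return "degraded"

    def finish_rebuild(self) -> None:
        """Mark the current rebuild as complete."""
        self.target = None


if __name__ == "__main__":
    pair = Mirror(["drive 1", "drive 2"], spares=["A", "B"])
    print(pair.fail("drive 2"))  # drive A is pulled in
    pair.finish_rebuild()
    print(pair.fail("drive 1"))  # drive B is pulled in
    print(pair.fail("A"))        # source dies mid-rebuild
```

The point of the sketch is that a mirror is only safe once a rebuild has finished; losing the source disk partway through leaves nothing complete to copy from.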

Somewhere in this process one of the controller boards died as well.

Some of you may be wondering why there was no alert of a predictive failure. Good question.
Apparently the data center failed to do a firmware update some time back that would have sent the predictive failure alert for all three drives, so there was no notice given until things started dying.

FedEx should be delivering the new hardware as I am typing this if they haven't already and it will all be installed as quickly as possible.
The real delay comes from restoring the terabytes of data from tape which takes hours and hours.

As I said, a series of unfortunate events but they are working on it as quickly as they can.
"The problem with ‘post-modern’ society is there are too many people with nothing meaningful to do, building ‘careers’ around controlling the lives of others and generally making social nuisances of themselves. They justify their meddling by discovering social ‘problems’ and getting the media to mag

Offline Kestryll

  • Administrator
  • Sr. Member
  • *****
  • Posts: 393
  • Karma: +88/-57
    • View Profile
    • Calguns.net
Re: 11/23/2015 16:45 Down again?
« Reply #181 on: November 24, 2015, 11:33:23 AM »
It could be their chassis cooling failed, causing several disks to die off - more than the RAID level they use could tolerate, necessitating a complete restore after they get equipment spares. It could have been the data center cooling too.  I'm just spitballing based on the limited info we have.

I think you hit it on the head with the data center/chassis cooling issue.
"The problem with ‘post-modern’ society is there are too many people with nothing meaningful to do, building ‘careers’ around controlling the lives of others and generally making social nuisances of themselves. They justify their meddling by discovering social ‘problems’ and getting the media to mag

Offline Red-Osier77

  • Sr. Member
  • ****
  • Posts: 334
  • Karma: +12/-83
  • karma:+69/-899
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #182 on: November 24, 2015, 11:33:54 AM »
Thanks for the update

Offline 00Medic

  • Newbie
  • *
  • Posts: 4
  • Karma: +0/-1
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #183 on: November 24, 2015, 11:49:15 AM »
Wow!!

The system didn't just crash, it went all China syndrome. That's unfortunate.

Offline Red-Osier77

  • Sr. Member
  • ****
  • Posts: 334
  • Karma: +12/-83
  • karma:+69/-899
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #184 on: November 24, 2015, 11:54:18 AM »
We burnt it to the ground.

Offline swalt

  • Full Member
  • ***
  • Posts: 162
  • Karma: +12/-29
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #185 on: November 24, 2015, 11:56:29 AM »
I'm guessing some folks over there are sweating bullets, with lots of cursing and swearing.  Can't they just slap the end of the rack a few times??  Mallet sometimes works too.
« Last Edit: November 24, 2015, 12:02:52 PM by swalt »

Offline supernachos

  • Newbie
  • *
  • Posts: 2
  • Karma: +0/-0
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #186 on: November 24, 2015, 11:58:40 AM »
Hmmm... Interesting that we cannot resolve www.calguns.net via DNS anymore. Even if the primary DNS servers (the SOA for calguns.net) were affected by the hardware failure, I would have expected DNS lookups to be cached for at least 3 days (the refresh of the DNS record is 3 hours, I think) by downstream DNS servers?! Please correct me if I am wrong on this assumption... :)
« Last Edit: November 24, 2015, 12:06:28 PM by supernachos »

Offline Casual Shooter

  • Jr. Member
  • **
  • Posts: 65
  • Karma: +4/-9
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #187 on: November 24, 2015, 12:02:18 PM »
Who's missing their red stapler?

Offline AAR

  • Newbie
  • *
  • Posts: 48
  • Karma: +9/-17
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #188 on: November 24, 2015, 12:04:24 PM »
Does this mean my post count will be lower over at .net:eek:?....

NOOOO

Offline omfgun

  • Newbie
  • *
  • Posts: 1
  • Karma: +0/-0
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #189 on: November 24, 2015, 12:05:38 PM »
... restoring the terabytes of data from tape which takes hours and hours....
How many hours of data loss will there be after the restore?  Just wondering which threads will need to be re-updated / recreated.

Offline Marin(DOWN)Range42

  • Newbie
  • *
  • Posts: 15
  • Karma: +0/-4
  • But first watch me pull a rabbit out of my hat!
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #190 on: November 24, 2015, 12:07:57 PM »
Sorry to hear that Kest,

You might be looking at different choices for the future, so I would just mention this.
Primary disks (active) (mirrored fine)
Second set of disks with full software load and blank database (located in a skeleton secondary site running somewhere else, e.g. at home, with a different provider).
Backup schedule: Monday (to onsite backup location or second pair of disks), Tuesday (main active set backing up to online storage, or to your home second set of backup disks, which back up online), Wednesday (both), Thursday, Friday, Saturday repeat, Sunday who cares, as nobody worthy posts that day anyway :)
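
The rotation above could be written down as a small lookup, sketched here in Python. The target names `onsite` and `offsite` and the function `backup_targets` are placeholders made up for illustration; the sketch assumes Thursday through Saturday simply repeat Monday through Wednesday and that Sunday is skipped.

```python
import datetime
from typing import Tuple

# Three-day cycle: onsite only, offsite only, then both.
ROTATION = {
    0: ("onsite",),
    1: ("offsite",),
    2: ("onsite", "offsite"),
}


def backup_targets(day: datetime.date) -> Tuple[str, ...]:
    """Return where the backup for the given day should go."""
    weekday = day.weekday()  # Monday == 0 ... Sunday == 6
    if weekday == 6:
        return ()  # no backup on Sunday
    return ROTATION[weekday % 3]


if __name__ == "__main__":
    start = datetime.date(2015, 11, 23)  # a Monday
    for offset in range(7):
        day = start + datetime.timedelta(days=offset)
        print(day.strftime("%A"), backup_targets(day))
```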

If hosting goes down and you're not able to log on or respond, I get it. (No ability to re-route DNS info, so that needs a fix), because, as I understand you to be a techie, if you could have re-routed, you likely had a second location with a temporary solution... i.e. the .org server/database.

You got hosed, and anyone in tech has been there. I'm sorry it happened. But there's little to no reason the hosting company should not have provided access to the edge server, at minimum to change the DNS direction. Good luck and oh I'd like to offer up Stilly to help out......

Offline supernachos

  • Newbie
  • *
  • Posts: 2
  • Karma: +0/-0
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #191 on: November 24, 2015, 12:09:09 PM »
... restoring the terabytes of data from tape which takes hours and hours....
How many hours of data loss will there be after the restore?  Just wondering which threads will need to be re-updated / recreated.

Many modern Data Centers don't use tape anymore, but back up to disk as well.

Offline FLIGHT762

  • Newbie
  • *
  • Posts: 2
  • Karma: +0/-0
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #192 on: November 24, 2015, 12:18:43 PM »
It could be their chassis cooling failed, causing several disks to die off - more than the RAID level they use could tolerate, necessitating a complete restore after they get equipment spares. It could have been the data center cooling too.  I'm just spitballing based on the limited info we have.

I think you hit it on the head with the data center/chassis cooling issue.

I agree, thermal agitation causing the shot effect in high gain.   :P

Offline Dyson

  • Newbie
  • *
  • Posts: 41
  • Karma: +2/-29
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #193 on: November 24, 2015, 12:34:02 PM »
sounds like i might actually have to hang out with relatives this thanksgiving :p

Offline FP562

  • Full Member
  • ***
  • Posts: 161
  • Karma: +17/-81
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #194 on: November 24, 2015, 01:05:36 PM »
Damn that is crazy.... Thanks Obama.

I am just watching Cops now; I was owning some noobs in Fallout 4. I have Calgunners texting me all day asking what happened lol

Offline Ubermcoupe

  • Jr. Member
  • **
  • Posts: 70
  • Karma: +11/-22
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #195 on: November 24, 2015, 01:09:50 PM »
Thanks for the update, Kes!



I have some new information, and it's pretty much a case of a series of unfortunate events.

A little background: data is stored on mirrored disks which are then mirrored to another pair, which in turn are mirrored to another pair, and so on. In addition there are several backup disks which come into play should one of the main disks fail.
And boy did it fail.

Apparently drive 2 of the primary pair failed, so the server called up backup drive A and started rebuilding the mirror with drive 1 and drive A.
At the completion of the rebuild drive 1 failed. Drive A, now the primary drive of the primary pair, called up backup drive B and started rebuilding the mirror.
Midway through the second rebuild drive A failed taking out the primary mirrored pair.

Somewhere in this process one of the controller boards died as well.

Some of you may be wondering why there was no alert of a predictive failure. Good question.
Apparently the data center failed to do a firmware update some time back that would have sent the predictive failure alert for all three drives, so there was no notice given until things started dying.

FedEx should be delivering the new hardware as I am typing this if they haven't already and it will all be installed as quickly as possible.
The real delay comes from restoring the terabytes of data from tape which takes hours and hours.

As I said, a series of unfortunate events but they are working on it as quickly as they can.
The information contained in this document is CONFIDENTIAL and LEGALLY PRIVILEGED, intended only for the recipient(s) named above. If the reader of this message is not the intended recipient, you are notified that any use, copying, disclosure, retention or distribution is unlawful and illegal.

Offline jctheguy

  • Newbie
  • *
  • Posts: 3
  • Karma: +0/-0
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #196 on: November 24, 2015, 01:19:41 PM »
IT derp here......I feel for you bro...(Shedding tears now)...

Offline kapache

  • Newbie
  • *
  • Posts: 4
  • Karma: +0/-0
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #197 on: November 24, 2015, 01:39:42 PM »
NA

Offline kapache

  • Newbie
  • *
  • Posts: 4
  • Karma: +0/-0
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #198 on: November 24, 2015, 01:42:46 PM »
Thanks for the update, Kes!



I have some new information, and it's pretty much a case of a series of unfortunate events.

A little background: data is stored on mirrored disks which are then mirrored to another pair, which in turn are mirrored to another pair, and so on. In addition there are several backup disks which come into play should one of the main disks fail.
And boy did it fail.

Apparently drive 2 of the primary pair failed, so the server called up backup drive A and started rebuilding the mirror with drive 1 and drive A.
At the completion of the rebuild drive 1 failed. Drive A, now the primary drive of the primary pair, called up backup drive B and started rebuilding the mirror.
Midway through the second rebuild drive A failed taking out the primary mirrored pair.

Somewhere in this process one of the controller boards died as well.

Some of you may be wondering why there was no alert of a predictive failure. Good question.
Apparently the data center failed to do a firmware update some time back that would have sent the predictive failure alert for all three drives, so there was no notice given until things started dying.

FedEx should be delivering the new hardware as I am typing this if they haven't already and it will all be installed as quickly as possible.
The real delay comes from restoring the terabytes of data from tape which takes hours and hours.

As I said, a series of unfortunate events but they are working on it as quickly as they can.


Firmware needing to be upgraded for alerts to be sent sounds odd. Why didn't they implement monitoring scripts that send alerts when a drive drops from the RAID? Seems they are making BS excuses for what happened. I bet a drive dropped from the RAID and never got replaced, and now that two drives from the array have died the whole system went poof!
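
As a rough idea of what such a monitoring script could look like, here is a minimal Python sketch. It assumes Linux software RAID (md), where `/proc/mdstat` reports array health; the host's actual setup is not known and may well be a hardware controller with its own tooling. The function name `degraded_arrays` and the sample text are illustrative only, and actually sending the email is left out.

```python
import re
from typing import List

# Illustrative /proc/mdstat contents: md0 is healthy, md1 has lost a member.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdc1[0]
      976630336 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text: str) -> List[str]:
    """Return the names of md arrays that are missing a member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status looks like "[2/1] [U_]": expected/active, then one flag per disk.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            expected = int(status.group(1))
            active = int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


if __name__ == "__main__":
    # On a real host this text would come from reading /proc/mdstat,
    # typically from a cron job that emails when the list is non-empty.
    for name in degraded_arrays(SAMPLE_MDSTAT):
        print(f"ALERT: array {name} is degraded")
```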
« Last Edit: November 24, 2015, 01:44:44 PM by kapache »

Offline NME_SNIPER

  • Newbie
  • *
  • Posts: 1
  • Karma: +0/-0
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #199 on: November 24, 2015, 01:43:22 PM »
OK, who will be the first to come up with a really good conspiracy theory as to why .net is down?

  • Pelosi
  • Øbama
  • chemtrails
  • .45 > 9mm
  • fluoride
  • mandatory vaccination
  • Harry Tanges
  • Moms Demand Don't Stop
  • bacon

It was Senator Leland Yee!!! He took Calguns down as revenge for them not coming to his aid regarding his second amendment rights to buy and sell firearms in California

Offline swalt

  • Full Member
  • ***
  • Posts: 162
  • Karma: +12/-29
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #200 on: November 24, 2015, 01:55:36 PM »
OK, who will be the first to come up with a really good conspiracy theory as to why .net is down?

  • Pelosi
  • Øbama
  • chemtrails
  • .45 > 9mm
  • fluoride
  • mandatory vaccination
  • Harry Tanges
  • Moms Demand Don't Stop
  • bacon

It was Senator Leland Yee!!! He took Calguns down as revenge for them not coming to his aid regarding his second amendment rights to buy and sell firearms in California

Nawww........it was Hillary......she has experience with data loss and taking servers out of service.  Add that to her recent anti 2A statements, and there ya go!  ;D

Offline Marauder2003

  • Newbie
  • *
  • Posts: 24
  • Karma: +1/-8
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #201 on: November 24, 2015, 02:05:00 PM »
Who's missing their red stapler?

I have 2.  :)
Stop Making Stupid People Famous

Offline cockedandglocked

  • Full Member
  • ***
  • Posts: 152
  • Karma: +18/-48
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #202 on: November 24, 2015, 02:07:36 PM »
Firmware needing to be upgraded for alerts to be sent sounds odd. Why didn't they implement monitoring scripts that send alerts when a drive drops from the RAID? Seems they are making BS excuses for what happened. I bet a drive dropped from the RAID and never got replaced, and now that two drives from the array have died the whole system went poof!

That was my thought too. It doesn't sound like an unfortunate series of coincidences; it sounds like they weren't diligent and are now paying the price. They're coming up with excuses that are extremely unlikely, but that we can't prove didn't happen, to try to save face and not sound incompetent. Did they not realize that their system was incapable of warning them that a drive failure was detected? Does that mean that, until now, they've never had a drive failure? Heck, my server at work only has 4 disks, and I have to replace a dead one at least once every 2 years. And guess what, I even get an email when one goes bad  8)
I'm only here for the timeshare presentation

Offline Loopwell

  • Newbie
  • *
  • Posts: 24
  • Karma: +1/-7
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #203 on: November 24, 2015, 02:21:22 PM »
Just checking in to this historic event, we will always remember.

Offline Kestryll

  • Administrator
  • Sr. Member
  • *****
  • Posts: 393
  • Karma: +88/-57
    • View Profile
    • Calguns.net
Re: 11/23/2015 16:45 Down again?
« Reply #204 on: November 24, 2015, 02:25:54 PM »
Firmware needing to be upgraded for alerts to be sent sounds odd. Why didn't they implement monitoring scripts that send alerts when a drive drops from the RAID? Seems they are making BS excuses for what happened. I bet a drive dropped from the RAID and never got replaced, and now that two drives from the array have died the whole system went poof!

That was my thought too. It doesn't sound like an unfortunate series of coincidences; it sounds like they weren't diligent and are now paying the price. They're coming up with excuses that are extremely unlikely, but that we can't prove didn't happen, to try to save face and not sound incompetent. Did they not realize that their system was incapable of warning them that a drive failure was detected? Does that mean that, until now, they've never had a drive failure? Heck, my server at work only has 4 disks, and I have to replace a dead one at least once every 2 years. And guess what, I even get an email when one goes bad  8)



I've known Michael, the owner of PE/GV, for over a decade. At one point he donated two years of free hosting, back when Calguns was small and needed time to grow. He's been a Calgunner for most of the time Calguns has existed and has helped dozens of Calgunners and 2A orgs get web stores and sites off the ground.

Nobody has to come up with excuses; for over ten years I've gotten straight answers whenever I've had a problem.

I don't care how unlikely it is; I don't need them to prove jack squat, because I've never been given a reason to mistrust Michael or PE/GV.
I have been given many reasons to trust them over the last decade, so when I get a call from my friend and owner of the hosting company to tell me 'This is what happened', I believe him that that is what happened.
"The problem with ‘post-modern’ society is there are too many people with nothing meaningful to do, building ‘careers’ around controlling the lives of others and generally making social nuisances of themselves. They justify their meddling by discovering social ‘problems’ and getting the media to mag

Offline My66quick

  • Jr. Member
  • **
  • Posts: 74
  • Karma: +4/-14
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #205 on: November 24, 2015, 02:47:47 PM »
IN. Don't want to miss the chance of being involved in a piece of history.

Offline readysetgo

  • Full Member
  • ***
  • Posts: 113
  • Karma: +6/-38
  • [applaud] [smite]
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #206 on: November 24, 2015, 03:13:42 PM »
hahaha, just read the comments on Facebook, maybe we should sic Randall on them idi... umm, fine gentlemen. :D

Quote
The temp. Calguns.org page is such a joke in the market place. Nothing but BS threads being posted!!! Nothing is being taken serious and needs to regulated seriously with a moderator if that's going to be up and running until this gets situated!!!

Here's another:
Quote
So, either the forum goes down so much that you need a backup, or the creators of the forum are so arrogant that they feel like no one can survive without their forums. Which one is it?

Hi Haters!

There is only ONE calguns.org

Offline skyhawk

  • Jr. Member
  • **
  • Posts: 58
  • Karma: +5/-18
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #207 on: November 24, 2015, 03:28:32 PM »
Hmmm... Interesting that we cannot resolve www.calguns.net via DNS anymore. Even if the primary DNS servers (the SOA for calguns.net) were affected by the hardware failure, I would have expected DNS lookups to be cached for at least 3 days (the refresh of the DNS record is 3 hours, I think) by downstream DNS servers?! Please correct me if I am wrong on this assumption... :)

Calguns DNS is hosted at the provider who is down. DNS records are cached according to a TTL value that is provided with each record. However, some recursive DNS servers (like the ones your ISP provides) ignore these values and cache according to their own timers.  In any case, the provider itself (ProfEdge) also hosts its own DNS. So lookups cannot be done, because recursive DNS servers (like the Google 8.8.8.8 server, for example) can't even reach ProfEdge to get IPs for their DNS servers, to then query them for Calguns hostnames.

But it doesn't much matter, because the info we have about the failure says that even if we could resolve an IP for host www.calguns.net, the web application server which front ends the database is down.
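
To illustrate the TTL point, here is a toy Python sketch of how a caching resolver holds a record only until its TTL runs out. The `TtlCache` class, the 3-hour TTL, and the 192.0.2.10 address (a documentation-only range) are assumptions for the example, not values taken from the real calguns.net zone.

```python
from typing import Dict, Optional, Tuple


class TtlCache:
    """Toy resolver cache: each record is kept only until its TTL expires."""

    def __init__(self) -> None:
        # name -> (address, absolute expiry time in seconds)
        self._records: Dict[str, Tuple[str, float]] = {}

    def put(self, name: str, address: str, ttl_seconds: float, now: float) -> None:
        """Store an answer along with the time at which it stops being valid."""
        self._records[name] = (address, now + ttl_seconds)

    def get(self, name: str, now: float) -> Optional[str]:
        """Return the cached address, or None if it is missing or expired."""
        entry = self._records.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if now >= expires_at:
            # Expired: a real resolver would now have to ask the
            # authoritative servers again, which fails if they are down.
            del self._records[name]
            return None
        return address


if __name__ == "__main__":
    cache = TtlCache()
    cache.put("www.calguns.net", "192.0.2.10", ttl_seconds=3 * 3600, now=0)
    print(cache.get("www.calguns.net", now=3600))       # still cached
    print(cache.get("www.calguns.net", now=4 * 3600))   # expired
```

Once the entry expires, the cache has nothing to serve, so with the authoritative servers unreachable the name stops resolving.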

Offline skyhawk

  • Jr. Member
  • **
  • Posts: 58
  • Karma: +5/-18
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #208 on: November 24, 2015, 03:31:52 PM »

Nobody has to come up with excuses; for over ten years I've gotten straight answers whenever I've had a problem.

I don't care how unlikely it is; I don't need them to prove jack squat, because I've never been given a reason to mistrust Michael or PE/GV.


I agree - sometimes s*it just happens. It sucks when it does, but I don't care if you are big or small, this kind of thing can bite anybody.  Now everyone go clean some guns or turkeys, maybe put some lead downrange, and let the geeks do what they do to get the train back on the rails.

Offline Ubermcoupe

  • Jr. Member
  • **
  • Posts: 70
  • Karma: +11/-22
    • View Profile
Re: 11/23/2015 16:45 Down again?
« Reply #209 on: November 24, 2015, 03:52:10 PM »
hahaha, just read the comments on Facebook, maybe we should sic Randall on them idi... umm, fine gentlemen. :D

Quote
The temp. Calguns.org page is such a joke in the market place. Nothing but BS threads being posted!!! Nothing is being taken serious and needs to regulated seriously with a moderator if that's going to be up and running until this gets situated!!!

Here's another:
Quote
So, either the forum goes down so much that you need a backup, or the creators of the forum are so arrogant that they feel like no one can survive without their forums. Which one is it?
...

I'll admit, I'm in CGN withdrawal - LOL - but those ^^^ fellas are literally dying.  ;D I can't help but laugh at their misfortune and realize I should just check back in 36 hours.

If it's down, it's down. I'm sure the powers that be are working hard to get a fix. NBD. :shrug:

Some of those posts though...
The information contained in this document is CONFIDENTIAL and LEGALLY PRIVILEGED, intended only for the recipient(s) named above. If the reader of this message is not the intended recipient, you are notified that any use, copying, disclosure, retention or distribution is unlawful and illegal.