Outage yesterday

Philippe Lewicki

19 Jul, 2016 10:33 PM

Yesterday our entire farm went down and recovered on its own very well (the good part).
A few hours later the old servers came back online and messed everything up without warning.

I thought I had fixed the database by moving the data from the newly recovered database to the original one that came back online.

Unfortunately I forgot to kill the second master, and for some reason it came online and tried to merge its slave with the others.

Overnight we received a lot of complaints about data inconsistency; I suspect both masters were active and load balanced.
I backed up both masters, killed one of them and all the slaves, and then put all the slaves back online.

A couple of questions:
1- How can we prevent this in the future, especially having 2 masters running?
2- I no longer see the auto-scaling features. When will they come back?

  1. Support Staff 1 Posted by Igor Savchenko on 19 Jul, 2016 11:00 PM

    Hi Philippe,

    Yes, yesterday Scalr had a major outage that created a mess with new and old servers (we're going to post details soon). This is the first major outage in Scalr's history (~9 years). We've learned a lot and are going to make sure that this never happens again.

    Yesterday Scalr lost track of some portion of instances (records were removed from the Scalr DB), so the desired state engine started to launch new servers according to farm configuration. This created a lot of duplicates, which we imported back into Scalr after some time.
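
To illustrate why lost records lead to duplicate servers, here is a minimal sketch of a naive desired-state check. It is not Scalr's actual engine; the function name `plan_launches`, the instance IDs, and the farm size of 3 are all hypothetical and exist only for this illustration.

```python
def plan_launches(desired_count, tracked_ids):
    """Return how many servers a naive desired-state loop would launch.

    Hypothetical helper: it compares the configured server count only
    against the instances the controller has records for, not against
    what is actually running in the cloud.
    """
    return max(desired_count - len(tracked_ids), 0)


# Three servers are really running (hypothetical IDs).
running_ids = {"i-1", "i-2", "i-3"}

# Records for two of them were removed from the controller's database.
tracked_ids = {"i-1"}

# The loop believes two servers are missing and launches replacements.
to_launch = plan_launches(3, tracked_ids)

# The originals never stopped, so the farm now holds duplicates.
total_running = len(running_ids) + to_launch
duplicates = total_running - 3
```

With intact records (`tracked_ids == running_ids`) the same check launches nothing, which is why the duplicates only appear once the tracking data is lost.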

    Answering your questions:

    1. This would never happen under normal circumstances. Yesterday was an uncommon issue that Scalr did not handle well from many perspectives.

    2. To avoid further disruption during the recovery period, we've turned off auto-scaling for some farms. You can turn it back on anytime by editing your Farm and switching Manual scaling to Automatic.

    Let me know if you have any other questions.

    Regards,
    Igor

  2. 2 Posted by marc on 19 Jul, 2016 11:47 PM

    Hello Philippe,

    We are closing this ticket as resolved as per the previous comments. If any issues persist or if you have any questions, please reopen this ticket by replying or open a new ticket and we can resume troubleshooting efforts if necessary. Thank you for your patience while we worked through these issues.

    Many thanks,
    Wm. Marc O'Brien
    Scalr Technical Support

  3. marc closed this discussion on 19 Jul, 2016 11:47 PM.
