Error with Scalarizr when launching new instances, must be launched manually

Nate Policar

25 Aug, 2018 08:51 AM

When launching new instances, the machines get stuck in 'Waiting for OS to boot'. If I do nothing, Scalr eventually reports that the instance has failed to launch. If I ssh in and manually restart the scalarizr service, the system continues booting and initializing as normal. Server ID 2877826e-472d-4def-b0b9-79559776ac96 is an example of an instance with this issue, if you need a live instance to investigate.

The following is what was reported by the system log on a failed instance:

Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,399-07:00 - DEBUG - agent.scalrinit.metadata.vmware - Does not fit: VMware Tools are not installed.
Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,399-07:00 - DEBUG - agent.scalrinit.metadata.configdrive_legacy - Does not fit: Disk drive with User-data not found (label: config-2)
Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,406-07:00 - DEBUG - agent.scalrinit.metadata.gce - Fitting...
Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,410-07:00 - DEBUG - agent.scalrinit.metadata.gce - Does not fit: HTTPConnectionPool(host='metadata.google.internal', port=80): Max retries exceeded with url: /computeMetadata/v1 (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fe2402517b8>: Failed to establish a new connection: [Errno -2] Name or service not known',))
Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,410-07:00 - DEBUG - agent.scalrinit.metadata.azure - Fitting...
Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,410-07:00 - DEBUG - agent.scalrinit.metadata.azure - Does not fit
Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,414-07:00 - DEBUG - agent.scalrinit.metadata.docker - Fitting...
Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,415-07:00 - DEBUG - agent.scalrinit.metadata.docker - Does not fit
Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,415-07:00 - DEBUG - agent.scalrinit.metadata.personality - Fitting...
Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,415-07:00 - DEBUG - agent.scalrinit.metadata.personality - Does not fit: User-data files ['/etc/scalr/private.d/.user-data', '/etc/.scalr-user-data', '/rootfs/etc/.scalr-user-data'] do not exist or not accessible.
Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,421-07:00 - DEBUG - agent.scalrinit.metadata.cloudstack - Does not fit: HTTPConnectionPool(host='10.0.0.1', port=80): Max retries exceeded with url: /latest/instance-id (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.HTTPConnection object at 0x7fe2419ad5c0>, 'Connection to 10.0.0.1 timed out. (connect timeout=1)'))
Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,422-07:00 - DEBUG - agent.scalrinit.metadata.cloudstack - Fitting...
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,398-07:00 - DEBUG - agent.scalrinit.metadata.configdrive - Fitting...
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,399-07:00 - DEBUG - agent.scalrinit.metadata.configdrive - Does not fit: Disk drive with User-data not found (label: config-2)
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,399-07:00 - DEBUG - agent.scalrinit.metadata.vmware - Fitting...
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,399-07:00 - DEBUG - agent.scalrinit.metadata.vmware - Does not fit: VMware Tools are not installed.
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,400-07:00 - DEBUG - agent.scalrinit.metadata.configdrive_legacy - Fitting...
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,400-07:00 - DEBUG - agent.scalrinit.metadata.configdrive_legacy - Does not fit: Disk drive with User-data not found (label: config-2)
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,407-07:00 - DEBUG - agent.scalrinit.metadata.gce - Fitting...
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,410-07:00 - DEBUG - agent.scalrinit.metadata.gce - Does not fit: HTTPConnectionPool(host='metadata.google.internal', port=80): Max retries exceeded with url: /computeMetadata/v1 (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fe240251fd0>: Failed to establish a new connection: [Errno -2] Name or service not known',))
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,411-07:00 - DEBUG - agent.scalrinit.metadata.azure - Fitting...
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,411-07:00 - DEBUG - agent.scalrinit.metadata.azure - Does not fit
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,415-07:00 - DEBUG - agent.scalrinit.metadata.docker - Fitting...
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,416-07:00 - DEBUG - agent.scalrinit.metadata.docker - Does not fit
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,416-07:00 - DEBUG - agent.scalrinit.metadata.personality - Fitting...
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,416-07:00 - DEBUG - agent.scalrinit.metadata.personality - Does not fit: User-data files ['/etc/scalr/private.d/.user-data', '/etc/.scalr-user-data', '/rootfs/etc/.scalr-user-data'] do not exist or not accessible.
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,424-07:00 - DEBUG - agent.scalrinit.metadata.cloudstack - Does not fit: HTTPConnectionPool(host='10.0.0.1', port=80): Max retries exceeded with url: /latest/instance-id (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.HTTPConnection object at 0x7fe2402516d8>, 'Connection to 10.0.0.1 timed out. (connect timeout=1)'))
Aug 25 01:06:39 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:39,425-07:00 - DEBUG - agent.scalrinit.metadata.cloudstack - Fitting...
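
To make an excerpt like the one above easier to scan, here is a minimal sketch that collects which metadata providers logged "Does not fit" and the reason each one gave. It is not part of Scalarizr; the function name `rejected_providers` and the regular expression are assumptions based only on the line format visible in this log.

```python
import re

# Hypothetical pattern, derived only from the log format shown in this thread:
#   ... - DEBUG - agent.scalrinit.metadata.<provider> - Does not fit[: <reason>]
PATTERN = re.compile(
    r"agent\.scalrinit\.metadata\.(\w+) - Does not fit(?:: (.*))?$"
)


def rejected_providers(lines):
    """Map each provider name to the last 'Does not fit' reason it logged."""
    reasons = {}
    for line in lines:
        match = PATTERN.search(line.rstrip())
        if match:
            # Some providers (azure, docker) log no reason at all.
            reasons[match.group(1)] = match.group(2) or "(no reason logged)"
    return reasons


if __name__ == "__main__":
    sample = [
        "Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,399-07:00 - DEBUG - agent.scalrinit.metadata.vmware - Does not fit: VMware Tools are not installed.",
        "Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,410-07:00 - DEBUG - agent.scalrinit.metadata.azure - Fitting...",
        "Aug 25 01:06:38 ip-10-0-0-12 scalr-upd-client: 2018-08-25 01:06:38,410-07:00 - DEBUG - agent.scalrinit.metadata.azure - Does not fit",
    ]
    for provider, reason in sorted(rejected_providers(sample).items()):
        print(provider, "->", reason)
```

Feeding it the full system log (for example, the lines of a saved log file) gives one entry per provider, which shows quickly whether any provider ever matched.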

Additionally, I get the following when I ssh in and try to start Scalarizr manually:

● scalarizr.service - Scalarizr - Guest agent for Scalr.
   Loaded: loaded (/usr/lib/systemd/system/scalarizr.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sat 2018-08-25 01:47:32 PDT; 33s ago
     Docs: https://github.com/Scalr/scalarizr
 Main PID: 10160 (code=exited, status=1/FAILURE)

Aug 25 01:47:32 ip-10-0-0-152.us-west-2.compute.internal scalarizr[10160]: y = [deepcopy(a, memo) for a in x]
Aug 25 01:47:32 ip-10-0-0-152.us-west-2.compute.internal scalarizr[10160]: File "/opt/scalarizr/embedded/lib/python3.5/copy.py", line 223, in <listcomp>
Aug 25 01:47:32 ip-10-0-0-152.us-west-2.compute.internal scalarizr[10160]: y = [deepcopy(a, memo) for a in x]
Aug 25 01:47:32 ip-10-0-0-152.us-west-2.compute.internal scalarizr[10160]: File "/opt/scalarizr/embedded/lib/python3.5/copy.py", line 146, in deepcopy
Aug 25 01:47:32 ip-10-0-0-152.us-west-2.compute.internal scalarizr[10160]: d = id(x)
Aug 25 01:47:32 ip-10-0-0-152.us-west-2.compute.internal scalarizr[10160]: KeyboardInterrupt
Aug 25 01:47:32 ip-10-0-0-152.us-west-2.compute.internal systemd[1]: scalarizr.service: main process exited, code=exited, status=1/FAILURE
Aug 25 01:47:32 ip-10-0-0-152.us-west-2.compute.internal systemd[1]: Stopped Scalarizr - Guest agent for Scalr..
Aug 25 01:47:32 ip-10-0-0-152.us-west-2.compute.internal systemd[1]: Unit scalarizr.service entered failed state.
Aug 25 01:47:32 ip-10-0-0-152.us-west-2.compute.internal systemd[1]: scalarizr.service failed.

  1. 1 Posted by Nate Policar on 25 Aug, 2018 11:04 PM

    Update: we were able to resolve this issue by updating scalr manually to 6.9.1 via yum and creating a new AMI.

    Would still appreciate some feedback on whether this issue was exclusive to 6.9.0 or if this will be a potential issue going forward. Thx!

  2. 2 Posted by Nate Policar on 25 Aug, 2018 11:06 PM

    Scalarizr status for newly launched AMIs still says "Update server denied" (farm id #25460 role #100433)

  3. Support Staff 3 Posted by Marat Komarov on 27 Aug, 2018 10:45 PM

    The posted stack trace shows that Scalarizr was stopped from the outside (Ctrl-C signal). There are no known issues in Scalarizr 6.9.0 that could cause such initialization problems.

    We need a live server, or the full scalarizr_update.log and scalarizr_debug.log from the terminated one. Unfortunately, 2877826e-472d-4def-b0b9-79559776ac96 is no longer running.

    Regards,
    Marat
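
For readers wondering why a `KeyboardInterrupt` at the bottom of a traceback points to an external stop rather than a bug, here is a minimal sketch, assuming a POSIX system. It is unrelated to Scalarizr's own code: the child script and the helper name `interrupt_child` are hypothetical. It starts a sleeping Python process, sends it SIGINT (the signal Ctrl-C produces), and captures the resulting traceback.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a long-running Python process.
# The handler is set explicitly so the demo does not depend on the
# signal disposition inherited from the parent.
CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGINT, signal.default_int_handler)\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)


def interrupt_child():
    """Start the child, send it SIGINT, return (exit status, stderr text)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    proc.stdout.readline()            # wait until the child reports 'ready'
    proc.send_signal(signal.SIGINT)   # same signal a terminal sends on Ctrl-C
    _, err = proc.communicate(timeout=10)
    return proc.returncode, err


if __name__ == "__main__":
    status, err = interrupt_child()
    print("exit status:", status)
    print(err)
```

The child's traceback ends in `KeyboardInterrupt` at whatever line happened to be executing when the signal arrived, which is why the frames in the posted trace (inside `copy.py`) say little about the underlying cause.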
