Reported Outages

Elevated error rates at CHI@UC

Resolved Posted by Michael Sherman on July 13, 2022
Outage start Wednesday, July 13, 2022 2:41 p.m.
Expected end Monday, July 18, 2022 2:41 p.m.

We're currently seeing elevated error rates when provisioning nodes at CHI@UC.

When deploying an instance at CHI@UC, if it moves to the "error" state with a message about "provisioning timed out", please contact the help desk and include the UUIDs of your instance and node(s).

We're currently working to narrow down which nodes are affected and reproduce the issue.
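The details requested above can be collected into a short report before contacting the help desk. The following is a minimal Python sketch, not an official Chameleon tool. The function name `build_helpdesk_report` and the dictionary keys (`id`, `status`, `fault_message`, `node_ids`) are hypothetical, and the exact wording of the fault message is assumed to contain "provisioning timed out".

```python
def build_helpdesk_report(instance):
    """Return a report string for a timed-out instance, or None otherwise.

    `instance` is a hypothetical dict with the keys id, status,
    fault_message, and node_ids; it is not a real API response shape.
    """
    status = (instance.get("status") or "").lower()
    fault = instance.get("fault_message") or ""
    # Assumption: the fault text contains this phrase, in any letter case.
    if status != "error" or "provisioning timed out" not in fault.lower():
        return None
    nodes = ", ".join(instance.get("node_ids") or []) or "unknown"
    return (
        "Site: CHI@UC\n"
        f"Instance UUID: {instance['id']}\n"
        f"Node UUID(s): {nodes}\n"
        f"Error: {fault}"
    )
```

The function returns None for instances that are not in the "error" state or that failed for another reason, so only affected instances produce a report.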

KVM@TACC API and WebUI down

Resolved Posted by Michael Sherman on July 11, 2022
Outage start Monday, July 11, 2022 12:40 p.m.
Expected end Monday, July 11, 2022 2:14 p.m.

2:14 PM: The issue is now resolved, and KVM is accessible again.

Upstream Network maintenance affecting CHI@UC and Jupyter

Resolved Posted by Michael Sherman on July 05, 2022
Outage start Monday, July 11, 2022 8 p.m.
Expected end Monday, July 11, 2022 9 p.m.

On July 11 at 8pm Central, the routers upstream from CHI@UC will undergo emergency maintenance, which should last between 15 minutes and 1 hour. During this period, all external connectivity to CHI@UC and Jupyterhub will be down. Your instances will keep running, but you won't be able to access them during this period.

Other sites will not be affected.

Floating IP Reservation Outage for CHI@TACC

Resolved Posted by Adam Cooper on July 01, 2022
Outage start Friday, July 01, 2022 5 p.m.
Expected end Wednesday, July 06, 2022 12:56 p.m.

Update July 6th: Reservations for floating IPs at CHI@TACC should work as normal now. 

Update July 5th:

Reservations for Floating IPs at CHI@TACC are still failing, and we are working on a fix.

This only affects reservations for floating IPs. You can work around this issue by using ad-hoc IPs instead of reserving IP addresses. Instructions for this method are here: https://chameleoncloud.readthedocs.io/en/latest/getting-started/index.html#associating-an-ip-address

Help Desk outage

Resolved Posted by Cody Hammock on June 30, 2022
Outage start Friday, July 01, 2022 12 p.m.
Expected end Saturday, July 02, 2022 9:32 a.m.

RESOLVED: Maintenance is complete.

The ticketing system for the Chameleon help desk will be down from 12:00 noon (CDT) on Friday, July 1, 2022 until 8:00 AM on Tuesday, July 5th for maintenance. For assistance during this time, please send an email to users@lists.chameleoncloud.org.

Provisioning failures for CHI@TACC

Resolved Posted by Cody Hammock on June 29, 2022
Outage start Tuesday, June 28, 2022 12 p.m.
Expected end Wednesday, June 29, 2022 4:15 p.m.

We discovered an issue that prevented DHCP from working when provisioning nodes. It has now been resolved.
If you saw error messages like:

Exceeded maximum number of retries. Failed to provision instance <uuid>: Timeout reached while waiting for callback for node <uuid>

or if your instance simply failed to start after a long wait at TACC during this period, please try again, as this issue may have been the cause.
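To check whether a past failure matches the message quoted above, the instance and node UUIDs can be pulled out of the error text. This is a minimal Python sketch: the function name `parse_provision_timeout` is hypothetical, and it assumes the message keeps exactly the wording shown above with standard 8-4-4-4-12 hexadecimal UUIDs.

```python
import re

# Standard textual UUID layout: 8-4-4-4-12 hexadecimal digits.
UUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"

# Assumption: the error text keeps the wording quoted in this notice.
TIMEOUT_RE = re.compile(
    r"Failed to provision instance (?P<instance>" + UUID + r"): "
    r"Timeout reached while waiting for callback for node (?P<node>" + UUID + r")"
)


def parse_provision_timeout(message):
    """Return {'instance': ..., 'node': ...} if the message matches, else None."""
    match = TIMEOUT_RE.search(message)
    if match is None:
        return None
    return {"instance": match.group("instance"), "node": match.group("node")}
```

A result of None means the failure had a different message, so the DHCP issue described here may not have been the cause.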

Out-of-band switch maintenance at UC

Resolved Posted by Michael Sherman on June 29, 2022
Outage start Thursday, June 30, 2022 10 a.m.
Expected end Thursday, June 30, 2022 10 a.m.

Between 10 AM and 11 AM Central Time, there will be brief interruptions of the out-of-band network for certain racks at UC.

Users may notice failures to power instances on/off, or to deploy new instances. Running instances are unaffected.

User Portal Maintenance

Resolved Posted by Adam Cooper on June 27, 2022
Outage start Tuesday, June 28, 2022 11 a.m.
Expected end Tuesday, June 28, 2022 11:30 a.m.

Scheduled TLS Certificate Maintenance

Storage system interruption at UC

Resolved Posted by Michael Sherman on June 13, 2022
Outage start Monday, June 13, 2022 3:08 p.m.
Expected end Wednesday, June 15, 2022 2:13 p.m.

Update 06/15/2022:

Provisioning of new instances at UC is now functional, with all Chameleon-supported images available.

Unfortunately, we were not able to restore all of the images. We are still restoring some of them, and we will contact users whose images could not be restored to work through the available options. If you don’t find the image you are looking for, please reach out to the help desk.