Reported Outages

KVM@TACC Unavailable April 22, 2022

Resolved Posted by Cody Hammock on April 22, 2022
Outage start Thursday, April 21, 2022 8 p.m.
Expected end Friday, April 22, 2022 4:08 p.m.

KVM@TACC was unavailable starting on the evening of April 21, 2022. The issue has been resolved.

CHI@UC down

Resolved Posted by Michael Sherman on March 24, 2022
Outage start Thursday, March 24, 2022 10:25 a.m.
Expected end Thursday, March 24, 2022 12 p.m.

Update: This has been resolved as of 11:42 AM, and the site is back up. Running nodes should not have been affected, aside from the temporary loss of network connectivity.


CHI@UC is currently down due to a failure of the controller node's load-balancer. We will update here with more information.

Network Switch failure at UC

Resolved Posted by Michael Sherman on March 01, 2022
Outage start Tuesday, March 01, 2022 4:04 p.m.
Expected end Sunday, May 01, 2022 4:04 p.m.

Update: Connectivity has been restored. The root cause was a software bug that prevented the creation of a PVST instance on the switch, due to the large number of configured VLANs. Using a single instance for all VLANs restored functionality.


The 1G switch serving out-of-band access for nodes in rack BG-41 has encountered a (so far) unrecoverable software error, preventing traffic to the out-of-band interface on nodes P3-CPU-020 to P3-CPU-038.

Networking outage at UC

Resolved Posted by Michael Sherman on February 22, 2022
Outage start Monday, February 21, 2022 3 p.m.
Expected end Tuesday, February 22, 2022 11:17 a.m.

Update 11:16 CST: This should now be resolved. A forwarding loop in the underlying network topology caused some ports to be shut down. Instance provisioning and floating IPs should now be working again. Please reach out if you're still seeing issues on the UC site.


We're currently observing networking issues at UC. New instances are failing to provision, and existing ones are unreachable. We're still investigating the root cause, but will update here when resolved. Other sites are unaffected.

TACC Network maintenance 6 March 2022

Resolved Posted by Cody Hammock on February 21, 2022
Outage start Sunday, March 06, 2022 10 a.m.
Expected end Sunday, March 06, 2022 4 p.m.

Update: The work completed at 12:00 PM (CST).

Network maintenance will be carried out at the TACC site between 10:00 AM and 4:00 PM (CST) on Sunday, March 6th. Access to all systems hosted at TACC will be unavailable during this time, including CHI@TACC, KVM@TACC, CHI@Edge, and the Chameleon Portal. Instances will continue to run, but users will have no access to TACC services and systems until the upgrade is complete.

 

Please submit any questions you may have via the Chameleon Helpdesk: https://chameleoncloud.org/user/help/

CHI@NU currently down

Resolved Posted by Michael Sherman on February 17, 2022
Outage start Thursday, February 17, 2022 4:50 p.m.
Expected end Friday, February 18, 2022 5:50 p.m.

This outage has been resolved, and CHI@NU is fully operational.


The CHI@NU site is currently inaccessible due to unexpected issues during a service upgrade. Any running nodes should be unaffected, but are currently inaccessible, along with the Horizon WebUI and API. Other sites are unaffected.

If this outage interrupts your work, feel free to use resources at another site, and please let us know via the helpdesk if you have a use-case that requires the CHI@NU site.

CHI@TACC, KVM@TACC, and CHI@EDGE networking outage

Resolved Posted by Francois Halbach on February 11, 2022
Outage start Thursday, February 10, 2022 4:14 p.m.
Expected end Thursday, February 10, 2022 5:38 p.m.

Outage Start: 2022-02-10 16:14

Outage End: 2022-02-10 17:38

Update: this outage has been resolved.


Due to a networking issue at TACC, CHI@TACC, KVM@TACC, and CHI@EDGE are currently unavailable. 

This affects site access as well as already running resources.

Site networking staff are investigating, but there is no ETA for resolution at this time.

CHI@UC Networking outage

Resolved Posted by Michael Sherman on February 03, 2022
Outage start Thursday, February 03, 2022 11:27 a.m.
Expected end Friday, February 04, 2022 3:10 p.m.

Update - 3:09 PM CST - The outage should now be resolved.


Due to an upstream hardware failure, L2 stitching connectivity and the creation of new instances are failing at CHI@UC.

Existing instances not using stitching or the SharedWAN network are not affected.

Site networking staff are investigating, but there is no ETA for resolution at this time.

CHI@UC network maintenance

Resolved Posted by Michael Sherman on January 28, 2022
Outage start Tuesday, February 01, 2022 9 a.m.
Expected end Tuesday, February 01, 2022 2 p.m.

Update: This maintenance has been rescheduled for Feb 25th. Phase2 nodes (all nodes with names starting with nc...) will be mostly unaffected.


On Tuesday, Feb 1st, there will be a short interruption to the management network for CHI@UC due to network upgrades.

During this time, it will not be possible to create new instances, or change the power state of existing ones. Running instances will not be interrupted.

Chameleon at TACC experiencing networking disruptions (January 10th 2022)

Resolved Posted by Francois Halbach on January 11, 2022
Outage start Monday, January 10, 2022 2:14 p.m.
Expected end Monday, January 10, 2022 6:32 p.m.
 
Dear Chameleon users,
 
The networking outage at TACC has been resolved. Access to Chameleon resources at TACC is now restored.
 
Kind regards,
 
François Halbach