Reported Outages

CHI@TACC Certificate Expiry

Resolved Posted by Cody Hammock on May 21, 2024
Outage start Monday, May 20, 2024 7:45 p.m.
Expected end Tuesday, May 21, 2024 9:40 a.m.

CHI@TACC was unavailable between 7:15PM CDT Monday May 20, 2024 and 9:40AM CDT Tuesday May 21, 2024 due to an automation failure issuing an updated SSL certificate. The issue has been corrected and we do not expect further disruption.

CHI@Edge outage

Resolved Posted by Mark Powers on May 16, 2024
Outage start Thursday, May 16, 2024 5 p.m.
Expected end Friday, May 17, 2024 5 p.m.

The outage on CHI@Edge is now resolved. Thank you for your patience.


CHI@Edge is experiencing issues launching and interacting with containers. Operators are looking into this issue.

 

Planned Help Desk Outage

Resolved Posted by Mark Powers on April 24, 2024
Outage start Monday, April 29, 2024 9 a.m.
Expected end Monday, April 29, 2024 1:14 p.m.

UPDATE: The ticketing system is now back online, and operating as normal.


The ticketing system for the Chameleon help desk will be down from 9:00 AM (CDT) on Monday, April 29 until 5:00 PM for maintenance. For assistance during this time, please send an email to users@lists.chameleoncloud.org.

April 4 - JupyterHub Outage

Resolved Posted by Mark Powers on April 04, 2024
Outage start Thursday, April 04, 2024 10 a.m.
Expected end Thursday, April 04, 2024 12:14 p.m.

The networking issue has now been resolved.


Chameleon's JupyterHub is currently experiencing an unexpected outage. Operators are investigating.

April 16th: Brief maintenance affecting CHI@UC network uplink

Resolved Posted by Michael Sherman on April 03, 2024
Outage start Tuesday, April 16, 2024 9:30 a.m.
Expected end Tuesday, April 16, 2024 10 a.m.

On the morning of April 16th, from 9:30 to 10:00 a.m., we have scheduled a maintenance window that will affect the CHI@UC network uplink.
This is needed to debug issues affecting the routers "upstream" from CHI@UC.

For the duration, you may be unable to reach the CHI@UC website, the API, or public IPs.

Partial Authentication Outage

Resolved Posted by Mark Powers on March 06, 2024
Outage start Wednesday, March 06, 2024 1:55 p.m.
Expected end Thursday, March 07, 2024 1:36 p.m.

UPDATE: The instability should be fixed, and logging into services should work as normal.

----

Some users are reporting issues authenticating to CHI@UC, CHI@TACC, and CHI@Edge dashboards. We are working on resolving this issue.

KVM@TACC System maintenance

Resolved Posted by Cody Hammock on March 06, 2024
Outage start Wednesday, March 27, 2024 8 a.m.
Expected end Wednesday, March 27, 2024 10:33 a.m.

Resolved: Work is complete and KVM@TACC is available.

On Wednesday, March 27, 2024, KVM@TACC will be unavailable between 8:00am CDT and 4:00pm CDT to perform necessary system maintenance. Access to the KVM@TACC web interface and API will be unavailable, and network access to instances will be interrupted. Instances will continue to run, even though they will not be reachable.

 

CHI@UC: Network Uplink Maintenance

Resolved Posted by Michael Sherman on March 04, 2024
Outage start Monday, March 11, 2024 9 a.m.
Expected end Tuesday, March 12, 2024 1 p.m.

5:00 PM: We have a workaround in place that should restore connectivity to all floating IPs; however, we are still investigating the root cause. There may be additional minor interruptions while we troubleshoot, but currently all floating IPs tested respond to ping and ssh.


1:30 PM: The maintenance is done, but we are receiving reports of failures to SSH to public IPs for some users and some instances, and are investigating. Access to the API and dashboard has been fully restored.

 

Chameleon Portal, CHI@Edge and JupyterHub

Resolved Posted by Mark Powers on January 30, 2024
Outage start Tuesday, January 30, 2024 8:30 a.m.
Expected end Wednesday, January 31, 2024 5:59 p.m.

Update 5:45 PM Jan 31: JupyterHub and CHI@Edge are back online.


This morning the VM cluster that hosts Portal, CHI@Edge, and JupyterHub went down. This affected the ability to create new leases at any Chameleon site, and access to the help desk.

CHI@UC Site Maintenance Feb 6th 2024

Resolved Posted by Michael Sherman on January 24, 2024
Outage start Tuesday, February 06, 2024 6 a.m.
Expected end Thursday, February 08, 2024 3:21 p.m.

Update: 3pm Feb 8: Outage is resolved. All skylake and rtx_6000 nodes now have two 10G network interfaces (up from one each).


Update: 7PM Feb 6th: The site is online, and all P3 nodes (everything except compute_skylake and gpu_rtx_6000) are accessible.


On Feb 6th, 2024, we'll be taking the CHI@UC site down for maintenance to replace some failing network hardware.