Reported Outages

Upcoming Maintenance window for Chameleon Auth server

Resolved Posted by Michael Sherman on October 18, 2023
Outage start Monday, October 23, 2023 8 a.m.
Expected end Monday, October 23, 2023 8:44 a.m.

This maintenance is now completed.


On the morning of Monday, October 23rd, there will be a brief outage affecting login to all Chameleon sites and services.

We expect a 5-10 minute outage while we apply updates to the service that handles federated login.
This won't affect any running workloads or nodes, but you may need to refresh your browser once the outage ends.

CHI@TACC Haswell+InfiniBand nodes extended unavailability

Resolved Posted by Cody Hammock on October 04, 2023
Outage start Wednesday, September 20, 2023 8 a.m.
Expected end Monday, October 09, 2023 8:32 a.m.

Resolved: the work is complete.

Since 9/20/2023, the Haswell+InfiniBand nodes have been unavailable while they are reconfigured so that a portion of them can be decommissioned. This work is taking longer than anticipated, but is expected to conclude by 10/06/2023.

We thank you for your patience.

CHI@NU offline due to water leak

Resolved Posted by Michael Sherman on September 18, 2023
Outage start Sunday, September 17, 2023 6 p.m.
Expected end Monday, September 18, 2023 8 p.m.

Due to a water leak in the StarLight datacenter, power has been cut to the CHI@NU racks.

We're monitoring the situation and will post an update when we have an estimated time to resolution.

CHI@NU down

Resolved Posted by Michael Sherman on September 13, 2023
Outage start Wednesday, September 13, 2023 8:31 a.m.
Expected end Wednesday, September 13, 2023 6 p.m.

CHI@NU is currently inaccessible due to a certificate issue. Site staff are investigating.

CHI@IIT down

Resolved Posted by Michael Sherman on August 15, 2023
Outage start Friday, July 14, 2023 6:13 p.m.
Expected end Wednesday, August 16, 2023 6:13 p.m.

CHI@IIT is currently offline.

We suspect a hardware failure in the controller node, and have escalated to site staff.

CHI@TACC Object Store Unavailable

Resolved Posted by Cody Hammock on August 15, 2023
Outage start Tuesday, August 15, 2023 9:35 a.m.
Expected end Tuesday, August 15, 2023 11:43 a.m.

Resolved: The Object Store is once again available.

The Object Store for CHI@TACC is currently unavailable. Staff is working to restore access.

TACC Network maintenance 20 August 2023

Resolved Posted by Cody Hammock on August 04, 2023
Outage start Sunday, August 20, 2023 8 a.m.
Expected end Sunday, August 20, 2023 11:45 a.m.

Resolved: TACC network maintenance concluded at 11:45 AM (CDT).

TACC network infrastructure will not be available from 8 AM to 2:00 PM (CDT) on Sunday, 20 August 2023. Network maintenance will be performed during this time.

CHI@TACC, KVM@TACC, and the Chameleon Portal will be affected. Baremetal instances and VMs will continue to run, but will be unreachable.

CHI@UC Datacenter outage July 28th-31st

Resolved Posted by Michael Sherman on July 17, 2023
Outage start Friday, July 28, 2023 12 p.m.
Expected end Monday, July 31, 2023 4:54 p.m.

CHI@UC is back online, and we're observing normal usage patterns so far.

Please reach out if you're encountering issues.

CHI@TACC Liqid nodes

Resolved Posted by Cody Hammock on July 05, 2023
Outage start Wednesday, July 05, 2023 11:13 a.m.
Expected end Wednesday, October 18, 2023 11:01 a.m.

RESOLVED: The faulty hardware has been identified and replaced. The Liqid subsystem is once again available to use.

The Liqid composable hardware subsystem for CHI@TACC is experiencing intermittent issues connecting PCI devices to their hosts.

CHI@UC Network maintenance July 5th

Resolved Posted by Michael Sherman on June 22, 2023
Outage start Wednesday, July 05, 2023 9 a.m.
Expected end Wednesday, July 05, 2023 11:30 a.m.

The maintenance is proceeding well, but will likely take an extra 30 minutes or so, finishing at 11:30 AM rather than 11 AM.

Afterwards, all instances will be accessible as normal.