YARN-10649
RMNodeImpl.updatedExistContainers did not remove entries for containers that had already finished executing. As finished containers accumulated, their entries stayed in memory indefinitely, so memory use grew steadily and eventually caused an out-of-memory (OOM) failure.
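The leak pattern described above can be illustrated with a minimal sketch. This is not the Hadoop code: the class NodeTrackerSketch, its methods, and the removeOnCompletion flag are hypothetical names chosen for illustration, and only the field name updatedExistContainers is borrowed from the description. It assumes a map keyed by container ID that is written on every update and, in the leaky variant, never cleaned up on completion.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a node that tracks per-container updates.
// This is NOT the Hadoop RMNodeImpl class; all names are illustrative.
public class NodeTrackerSketch {
    // Keyed by container ID; the value stands in for the tracked container object.
    private final Map<String, String> updatedExistContainers = new ConcurrentHashMap<>();
    // Assumption for the sketch: a flag that switches between leaky and fixed behavior.
    private final boolean removeOnCompletion;

    public NodeTrackerSketch(boolean removeOnCompletion) {
        this.removeOnCompletion = removeOnCompletion;
    }

    // Records an update for a running container.
    public void containerUpdated(String containerId, String update) {
        updatedExistContainers.put(containerId, update);
    }

    // Called when a container finishes. Without the removal, the entry
    // outlives the container and the map grows without bound.
    public void containerCompleted(String containerId) {
        if (removeOnCompletion) {
            updatedExistContainers.remove(containerId);
        }
    }

    public int trackedCount() {
        return updatedExistContainers.size();
    }

    // Runs the given number of containers to completion and reports
    // how many entries are still held afterwards.
    public static int simulate(boolean removeOnCompletion, int containers) {
        NodeTrackerSketch node = new NodeTrackerSketch(removeOnCompletion);
        for (int i = 0; i < containers; i++) {
            String id = "container_" + i;
            node.containerUpdated(id, "update");
            node.containerCompleted(id);
        }
        return node.trackedCount();
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + simulate(false, 1000)); // prints "leaky: 1000"
        System.out.println("fixed: " + simulate(true, 1000));  // prints "fixed: 0"
    }
}
```

In the leaky variant the number of retained entries equals the number of containers that have ever finished, which is why memory use keeps climbing on a long-running node. Removing the entry at completion time keeps the map bounded by the number of live containers.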