In our HDFS cluster we observed that an append operation can hold the write lock up to 10x longer than other write operations. The call to getDatanodeManager().getNumLiveDataNodes() is very expensive. The fix is to avoid invoking it unless it is really needed: add a shortcut that returns true if liveReplicas >= minReplication.
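The shortcut described above can be sketched as follows. This is a minimal illustration, not the actual HDFS patch: the class name `ReplicationCheck`, the method signature, and the use of an `IntSupplier` to stand in for getDatanodeManager().getNumLiveDataNodes() are all assumptions made for the example. It also assumes the original check compares liveReplicas against the smaller of minReplication and the live datanode count.

```java
import java.util.function.IntSupplier;

// Hypothetical stand-in for the replication check; not the real HDFS class.
final class ReplicationCheck {

    // numLiveDataNodes models the expensive getNumLiveDataNodes() call
    // (assumption: it is only needed when liveReplicas < minReplication).
    static boolean isSufficientlyReplicated(int liveReplicas,
                                            int minReplication,
                                            IntSupplier numLiveDataNodes) {
        // Fast path: enough live replicas already, so the expensive
        // live-datanode count cannot change the answer.
        if (liveReplicas >= minReplication) {
            return true;
        }
        // Slow path: fall back to the assumed original comparison against
        // the lesser of minReplication and the number of live datanodes.
        int required = Math.min(minReplication, numLiveDataNodes.getAsInt());
        return liveReplicas >= required;
    }
}
```

Because the supplier is evaluated only on the slow path, the common case of a fully replicated block never pays for the expensive call while the write lock is held.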

Mar. 23, 2024, 10:22 AM