FLINK-29412
This bug concerns a connection leak in the Flink Table Store, a data lake storage system developed under the umbrella of Apache Flink. In particular, the extract() method in the OrcFileStatsExtractor class creates a new reader configured with a HadoopReadOnlyFileSystem, and both leak connections: OrcFileStatsExtractor never closes the reader, and HadoopReadOnlyFileSystem fails to close the input streams it opens. As a result, any procedure or function that relies on an OrcFileStatsExtractor leaks connections.
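The general shape of this kind of leak can be sketched in plain Java. The example below does not use the real ORC or Table Store classes: CountingReader, extractLeaky, and extractClosed are hypothetical names introduced only for illustration, and the reader merely counts how many instances are still open.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: none of these names come from Flink Table Store.
class ReaderLeakSketch {
    // Number of readers that have been opened but not yet closed.
    static final AtomicInteger OPEN = new AtomicInteger();

    // Hypothetical stand-in for a reader that holds a connection.
    static class CountingReader implements AutoCloseable {
        CountingReader() {
            OPEN.incrementAndGet();
        }

        long numberOfRows() {
            return 42L; // placeholder statistic
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    // Leaky shape: the reader is created and used but never closed.
    static long extractLeaky() {
        CountingReader reader = new CountingReader();
        return reader.numberOfRows();
    }

    // Closed shape: try-with-resources closes the reader on every exit path.
    static long extractClosed() {
        try (CountingReader reader = new CountingReader()) {
            return reader.numberOfRows();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            extractLeaky();
        }
        System.out.println("open after leaky calls: " + OPEN.get());
        for (int i = 0; i < 3; i++) {
            extractClosed();
        }
        System.out.println("open after closed calls: " + OPEN.get());
    }
}
```

Each call to extractLeaky leaves one more reader open, while extractClosed leaves the count unchanged. Whether the actual patch uses try-with-resources or an explicit close() is not stated here, so treat the closed variant as one plausible form of the fix.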
This notebook reproduces the connection leak by modifying the unit tests contributed by the fix patch. Those unit tests introduce a TraceableFileSystem that keeps a List of the connections the file system currently has open. After each file system test, they assert that the List is empty to ensure all connections have been closed. This notebook instead adds a new test called testCloseConnections, which creates a new file store table, writes and commits to it 500 times, and records the number of open connections after each commit. Since each commit creates an ORC reader to read the table's metadata, we should see connections leak on every commit.
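The tracking idea behind the test can be sketched as follows. This is a simplified, self-contained analogue rather than the notebook's actual test: TracingFileSystem, commit, and recordOpenCounts are hypothetical names, the "commit" only opens and reads an in-memory stream, and 5 iterations stand in for the 500 commits.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: none of these names come from Flink Table Store.
class TraceableFsSketch {
    // Hypothetical file system that remembers every stream that is still open.
    static class TracingFileSystem {
        private final List<InputStream> openStreams = new ArrayList<>();

        InputStream open(byte[] content) {
            InputStream tracked = new ByteArrayInputStream(content) {
                @Override
                public void close() throws IOException {
                    openStreams.remove(this); // stop tracking once closed
                    super.close();
                }
            };
            openStreams.add(tracked);
            return tracked;
        }

        int openCount() {
            return openStreams.size();
        }
    }

    // Simulated commit: reads "metadata" and optionally forgets to close the stream.
    static void commit(TracingFileSystem fs, boolean closeStream) {
        try {
            InputStream in = fs.open(new byte[] {1, 2, 3});
            in.read();
            if (closeStream) {
                in.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Records the number of open streams after each commit.
    static List<Integer> recordOpenCounts(int commits, boolean closeStream) {
        TracingFileSystem fs = new TracingFileSystem();
        List<Integer> counts = new ArrayList<>();
        for (int i = 0; i < commits; i++) {
            commit(fs, closeStream);
            counts.add(fs.openCount());
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(recordOpenCounts(5, false)); // prints [1, 2, 3, 4, 5]
        System.out.println(recordOpenCounts(5, true));  // prints [0, 0, 0, 0, 0]
    }
}
```

A count that grows by one per commit is the signature of the leak, while a flat line at zero is what the assertion on an empty List checks for once the streams are closed.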