Comment 15 for bug 1943049

Lucas Kanashiro (lucaskanashiro) wrote :

After reading the comments from this bug and the upstream discussions, I tried to find an approach which would minimize the size of the patch we would need to carry. I tested the following scenarios:

1) There is a docker upstream PR where tianon is trying to backport the fix to the 20.10 branch (the one we currently have in the archive). During the discussion, one of the upstream maintainers mentioned that we might need just a newer version of runc to fix the issue:

https://github.com/moby/moby/pull/42836#issuecomment-916422920

After inspecting the runc git repo I found the following commit which seems to address the issue:

https://github.com/opencontainers/runc/commit/960182fdf03d99eb848c111ae791

I backported this patch to the runc package we currently have in Impish and ran the test case @athos-ribeiro provided in comment #2, using the docker.io package from the archive. It did not work; the failure was still reproducible.

2) Since the docker upstream maintainer said in that comment that we might need runc version 1.0.2 (in Impish we have 1.0.1), I imported this new version into our runc package and ran the same test case using docker.io from the archive. The issue was still there.

3) I kept runc/1.0.2 installed in my VM and added tianon's patch, which backports the fix, to the docker.io package:

https://github.com/moby/moby/pull/42836/files

With the patched docker.io and runc/1.0.2 I was still able to reproduce the issue.

4) I removed all the custom packages from my VM and built the source package from the PPA @juliank linked in comment #9, targeting Impish, and finally got the issue fixed (as others already mentioned). However, I am not happy about adding a patch of 1800+ lines containing a bunch of refactoring when the fix itself is less than 80 lines. I'd prefer to wait until the PR backporting the fix is merged:

https://github.com/moby/moby/pull/42836

For libpod, Sergio helped me investigate this issue and noticed that it works fine in Fedora (with glibc 2.34). Fedora ships a newer version of libpod than the one we have in Ubuntu. I looked through its upstream git repo and did not find any specific patch addressing this issue that we could backport. The other option would be to update libpod to Fedora's version, but that would likely require updating some dependencies as well. I do not believe we have time to do that, and we are also in Feature Freeze. Moreover, Reinhard, who is the libpod maintainer, would not be happy if we released Impish with a broken package.

With all that said, I believe the easiest and least bad solution would be to disable the clone3 syscall in glibc. WDYT?
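
For context, here is a minimal sketch of the fallback logic at the heart of this bug, under my understanding that glibc 2.34 tries clone3 first and falls back to clone only when clone3 fails with ENOSYS, while an older container seccomp profile rejects clone3 with EPERM. The function names (spawn_like_glibc, blocked_with, legacy_clone) are hypothetical stand-ins for illustration; this is not glibc or docker code.

```python
import errno


def spawn_like_glibc(try_clone3, try_clone):
    """Try clone3 first; fall back to clone only when clone3 reports ENOSYS."""
    try:
        return try_clone3()
    except OSError as exc:
        if exc.errno == errno.ENOSYS:
            # "No such syscall": the caller assumes an old kernel and uses clone.
            return try_clone()
        # Any other error (e.g. EPERM from a seccomp filter) is passed on,
        # so creating the thread or process fails.
        raise


def blocked_with(err):
    """Hypothetical stand-in for a clone3 call rejected with errno `err`."""
    def call():
        raise OSError(err, errno.errorcode[err])
    return call


def legacy_clone():
    """Hypothetical stand-in for a successful clone call."""
    return "clone"
```

With blocked_with(errno.ENOSYS) the sketch falls back to legacy_clone, while with blocked_with(errno.EPERM) the error reaches the caller. Disabling clone3 in glibc would correspond to always taking the legacy_clone path, regardless of what the seccomp profile returns.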