zstdchunked error deleting temp layers #3623
Comments
Do you have a link for that one?
Interesting. You are using nerdctl v1.7 (which does not have the fix mentioned).
It's one of your issues actually (#3509 and a few similar ones).
Oh, I thought it was already in the release. But looking at the code, I think it wouldn't help, because the zstd:chunked converter uses the same problematic module from the snapshotter repo (see nerdctl/pkg/cmd/image/convert.go, line 42, at commit 2a0b9f5).
I need zstd:chunked lazy loading specifically. With gzip, decompressing my 30 GB of layers pegs the CPU at 100% while barely reaching the disk speed limit for a very long time, so with a regular zstd image container startup takes roughly 8-10 minutes depending on S3 throughput. With estargz it's barely better: about 5 minutes total for decompression and disk writes. And since I need most of my 50 GB uncompressed image on the machine for ML inference anyway, it would still be disk-bound in any case, but I'd like to optimize it as much as possible :/
I see. So:
- This is definitely a stargz issue (since it happens with ctr as well).
- Maybe we can fix it here (just rewrite the func in nerdctl and bypass the stargz converter).

So, a few comments:
I need tests for this - at least a reproducer. Is there a chance you could share one of these images? Or provide a way for me to build one that would be close enough to what you have to trigger the issue?
Cc @ktock for stargz
Absolutely - you can use the NVIDIA Triton Inference Server image; it reproduces the error. I realized why it was not reported previously: it works as intended on smaller images. Also, I've hit a bug in nerdctl v2.0.0-rc3 where image convert cannot find images pulled from Docker Hub (even with a full URL), only from other registries. Here's my log:
To reproduce, pull the Triton image mentioned above. I hope it helps!
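For completeness, here is a hedged skeleton of a Go reproducer going through the same containerd converter path nerdctl uses. The image refs are placeholders, and uncompress.LayerConvertFunc is only a stand-in: to hit the actual bug you would swap in the zstd:chunked layer converter from stargz-snapshotter's nativeconverter/zstdchunked package.

```go
package main

import (
	"context"
	"log"

	"github.com/containerd/containerd"
	"github.com/containerd/containerd/images/converter"
	"github.com/containerd/containerd/images/converter/uncompress"
	"github.com/containerd/containerd/namespaces"
)

func main() {
	// Connect to containerd - the same path nerdctl takes under the hood.
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()
	ctx := namespaces.WithNamespace(context.Background(), "default")

	// uncompress.LayerConvertFunc is a stand-in; to reproduce the bug,
	// plug in stargz-snapshotter's zstd:chunked layer converter instead.
	// "src:latest" / "target:zstdchunked" are placeholder refs for a
	// large (30 GB+) multi-layer image, which is what triggers the race.
	if _, err := converter.Convert(ctx, client, "target:zstdchunked", "src:latest",
		converter.WithLayerConvertFunc(uncompress.LayerConvertFunc),
		converter.WithDockerToOCI(true),
	); err != nil {
		log.Fatal(err)
	}
}
```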
@GrigoryEvko on my side, I did a quick read of the source:
- Build does require a SectionReader - https://github.com/containerd/stargz-snapshotter/blob/a6b9bdb5a9e113277fa213e002e65bf1a761509c/estargz/build.go#L153 - although it is not clear why that is useful compared to just a Reader, especially if the content is not compressed.
- Furthermore, there is a fair amount of filesystem back and forth - https://github.com/containerd/stargz-snapshotter/blob/a6b9bdb5a9e113277fa213e002e65bf1a761509c/estargz/build.go#L652
- Finally, it seems that after compression we decompress the result (same here, not sure why - it appears we want the size of the uncompressed content) - https://github.com/containerd/stargz-snapshotter/blob/a6b9bdb5a9e113277fa213e002e65bf1a761509c/nativeconverter/zstdchunked/zstdchunked.go#L166

Anyhow, there is possibly room for improvement here.

Addendum: estargz.Build does need a ReaderAt for good reasons, although it is not clear how the current implementation could scale to that size (30 GB+).
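To illustrate the last point, here is a minimal sketch (not the stargz-snapshotter code; compressWithSize and countingWriter are hypothetical names) of how the uncompressed size and digest could be recorded in the same pass as compression via io.TeeReader, avoiding the extra decompression afterwards:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"os"

	"github.com/klauspost/compress/zstd"
)

// countingWriter tallies the number of bytes written through it.
// (Hypothetical helper; not part of stargz-snapshotter.)
type countingWriter struct {
	n int64
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += int64(len(p))
	return len(p), nil
}

// compressWithSize compresses src into dst and returns the uncompressed
// byte count and digest, gathered in the same pass via io.TeeReader,
// so no post-compression decompression pass is needed.
func compressWithSize(dst io.Writer, src io.Reader) (int64, string, error) {
	counter := &countingWriter{}
	hasher := sha256.New()
	// Every byte read from src is also fed to the counter and the hasher.
	tee := io.TeeReader(src, io.MultiWriter(counter, hasher))

	enc, err := zstd.NewWriter(dst)
	if err != nil {
		return 0, "", err
	}
	if _, err := io.Copy(enc, tee); err != nil {
		enc.Close()
		return 0, "", err
	}
	if err := enc.Close(); err != nil {
		return 0, "", err
	}
	return counter.n, fmt.Sprintf("sha256:%x", hasher.Sum(nil)), nil
}

func main() {
	n, dgst, err := compressWithSize(os.Stdout, os.Stdin)
	if err != nil {
		panic(err)
	}
	fmt.Fprintf(os.Stderr, "uncompressed: %d bytes, digest: %s\n", n, dgst)
}
```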
I will look into this one. |
PR incoming for this specifically: #3626 |
See containerd#3623 for details Signed-off-by: apostasie <[email protected]>
So:
I will assume this is safe to do (albeit it seems wasteful). More importantly, we defer deletion of the uncompressed layer inside the converter function, which may happen before other calls for the same desc are done. This looks like the culprit. Using mutexes to prevent the same desc from being processed in parallel does the trick - although this is ugly (rough sketch below).
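A minimal sketch of that workaround (hypothetical names; the actual fix is in the PR): one mutex per layer digest, so concurrent conversions of the same descriptor are serialized and the deferred temp-layer deletion cannot race a concurrent user.

```go
package converter

import (
	"sync"

	ocispec "github.com/opencontainers/image-spec/specs-go/v1"
)

// descLocks hands out one mutex per layer digest, so two goroutines
// converting the same descriptor never interleave. Hypothetical helper,
// not the actual nerdctl implementation.
var descLocks sync.Map // digest.Digest -> *sync.Mutex

// lockDesc locks the mutex for desc and returns the matching unlock.
func lockDesc(desc ocispec.Descriptor) func() {
	m, _ := descLocks.LoadOrStore(desc.Digest, &sync.Mutex{})
	mu := m.(*sync.Mutex)
	mu.Lock()
	return mu.Unlock
}

// Usage inside a layer convert function:
//
//	unlock := lockDesc(desc)
//	defer unlock()
//	// ... decompress, convert, defer deletion of the temp layer ...
```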
I sent a PR here on nerdctl, but I am not convinced this is enough and there may be more issues at play here (containerd gc-ing layers?). @ktock feel free to carry the PR over or use the info here to write a better patch on stargz.
Ohh, great!! Thank you so much! Can we instead decompress all layers only once? I think that's how it was supposed to be implemented, and some concurrency leaked into it. Anyway, really impressed by the quick fix! Thanks again 😄
I tried a few approaches - notably, with a ref counter and storing the uncompressed desc in a map (a rough sketch of that idea follows). Anyhow, the folks at stargz will probably have better ideas than me on this.
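For reference, a rough sketch of the ref-counting idea (hypothetical names, not the code from the PR): decompress each layer at most once, cache the uncompressed descriptor, and only delete the temp blob when the last user releases it. The single cache-wide lock is deliberately coarse for brevity.

```go
package converter

import (
	"sync"

	"github.com/opencontainers/go-digest"
	ocispec "github.com/opencontainers/image-spec/specs-go/v1"
)

// uncompressedEntry caches the uncompressed form of a layer plus the
// number of in-flight conversions still using it.
type uncompressedEntry struct {
	desc ocispec.Descriptor
	refs int
}

// uncompressedCache maps a compressed layer digest to its (single)
// uncompressed counterpart. Hypothetical helper, not the shipped fix.
type uncompressedCache struct {
	mu      sync.Mutex
	entries map[digest.Digest]*uncompressedEntry
}

func newUncompressedCache() *uncompressedCache {
	return &uncompressedCache{entries: map[digest.Digest]*uncompressedEntry{}}
}

// acquire returns the cached uncompressed descriptor, invoking the
// decompress callback at most once per digest.
func (c *uncompressedCache) acquire(d digest.Digest, decompress func() (ocispec.Descriptor, error)) (ocispec.Descriptor, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.entries[d]; ok {
		e.refs++
		return e.desc, nil
	}
	desc, err := decompress()
	if err != nil {
		return ocispec.Descriptor{}, err
	}
	c.entries[d] = &uncompressedEntry{desc: desc, refs: 1}
	return desc, nil
}

// release drops one reference; when the count hits zero, the temp blob
// is deleted via the cleanup callback.
func (c *uncompressedCache) release(d digest.Digest, cleanup func(ocispec.Descriptor) error) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[d]
	if !ok {
		return nil
	}
	e.refs--
	if e.refs == 0 {
		delete(c.entries, d)
		return cleanup(e.desc)
	}
	return nil
}
```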
Description
The issue seems to be related to the code imported from stargz-snapshotter, because it persists across both the nerdctl and stargz-snapshotter packages (reported here: containerd/stargz-snapshotter#1842).
Using both packages (nerdctl with

nerdctl image convert --zstdchunked --oci src target

) results in similar errors.

Also, on some steps I ran into a containerd bug that is present in the latest 1.7.22 but, it seems, not in the 2.0 RC; it was reported multiple times recently.
Is it just me? I'm using Ubuntu 22.04 with containerd installed from the apt repo (but I also tried several tarball releases), nerdctl from the latest release, and stargz-snapshotter both from a release and built from the latest git repo.
Also, it seems related to the recent fix in nerdctl #3079, because the zstdchunked converter looks almost identical to the pre-pull-request nerdctl code.
Thanks in advance! Maybe I need to use a dockerized environment for the conversion? Maybe it's my giant (30 GB) image? I use a heavy ML image with many Python packages in its layers, and some models in layers as well, because it's the image for a Kubernetes autoscaling deployment with registry storage on S3.
Steps to reproduce the issue
No response
Describe the results you received and expected
Zstdchunked converter working as expected
What version of nerdctl are you using?
v1.7.7
Are you using a variant of nerdctl? (e.g., Rancher Desktop)
None
Host information
No response