Create cache interface to store frequently accessed I/O bound data #1912

Open
kyujin-cho opened this issue Feb 15, 2024 · 3 comments
Labels
action:on hold Hold it. Wait for the restart. comp:manager Related to Manager component

@kyujin-cho
Member

kyujin-cho commented Feb 15, 2024

After #1911 is resolved, we need to think about implementing a cache interface to store frequently accessed, I/O-bound data. API responses from the Harbor API fall into this category. The interface must be designed so that it can use heterogeneous cache backends, but for now let's consider only one cache backend, Redis.
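
To make the idea concrete, here is a minimal sketch of what a backend-agnostic interface could look like. Everything in it is an assumption for illustration: `AbstractCache`, `InMemoryCache`, `get_or_fetch`, and the `harbor:projects` key are hypothetical names, the async style is a guess at what fits the manager, and the in-memory backend only stands in for Redis so the example runs without a server.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Awaitable, Callable, Dict, Optional


class AbstractCache(ABC):
    """Hypothetical backend-agnostic cache interface (all names are illustrative)."""

    @abstractmethod
    async def get(self, key: str) -> Optional[bytes]:
        """Return the cached value, or None on a miss."""

    @abstractmethod
    async def set(self, key: str, value: bytes) -> None:
        """Store a value under the key."""

    @abstractmethod
    async def delete(self, key: str) -> None:
        """Remove the key if it is present."""


class InMemoryCache(AbstractCache):
    """Stand-in backend so the sketch runs without a Redis server."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    async def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    async def set(self, key: str, value: bytes) -> None:
        self._data[key] = value

    async def delete(self, key: str) -> None:
        self._data.pop(key, None)


async def get_or_fetch(
    cache: AbstractCache,
    key: str,
    fetch: Callable[[], Awaitable[bytes]],
) -> bytes:
    """Read-through helper: call the I/O-bound fetch only on a cache miss."""
    cached = await cache.get(key)
    if cached is not None:
        return cached
    value = await fetch()
    await cache.set(key, value)
    return value
```

A Redis-backed class would implement the same three methods, so callers such as `get_or_fetch` would not need to change when the backend does.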

@kyujin-cho kyujin-cho added this to the 24.03 milestone Feb 15, 2024
@kyujin-cho kyujin-cho added the comp:manager Related to Manager component label Feb 15, 2024
@achimnol
Member

Just an idea: Could we consider some build-artifact caching solutions (like what Pantsbuild uses...)?

@jopemachine
Member

jopemachine commented Nov 14, 2024

I’m leaving a question as I don't fully understand the issue.

Problem Situation

When caching large responses from the Harbor API in Redis, using outdated responses becomes an issue if the registry is updated after caching. For instance, if you cache the results of an image rescan and then use this cache for the next image rescan, any new images uploaded to the registry in the meantime won’t be scanned.

Questions

Cache Invalidation Timing

To avoid using outdated responses, cache invalidation is necessary, but it’s unclear when this invalidation should occur. For example, if you clear the cache with every image rescan, you essentially won’t be using the cache for rescanning. This implies that the cache should serve other purposes. What tasks, then, actually use this cache?
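
For reference, the two invalidation mechanisms I can think of are time-based expiry and explicit invalidation (for example, dropping the relevant keys when a rescan starts). The sketch below only illustrates those mechanisms and does not answer which one fits. `TTLCache`, `invalidate_prefix`, and the key names are hypothetical, and the plain dict is a stand-in for the real backend.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical cache with per-entry expiry and prefix-based invalidation."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def set(self, key: str, value: bytes) -> None:
        # Record the absolute expiry time next to the value.
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Time-based invalidation: treat an expired entry as a miss.
            del self._entries[key]
            return None
        return value

    def invalidate_prefix(self, prefix: str) -> int:
        """Explicit invalidation: drop every key with the given prefix."""
        stale = [k for k in self._entries if k.startswith(prefix)]
        for k in stale:
            del self._entries[k]
        return len(stale)
```

With expiry alone, a rescan could still see data that is up to `ttl_seconds` old; with explicit invalidation at the start of every rescan, the rescan itself never benefits from the cache, which is the concern raised above.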

Criteria for Caching Responses

Which types of responses should be cached? Should every response from the Harbor API be cached if it exceeds a certain size, and should this size threshold be hardcoded or configurable? Alternatively, should caching be limited only to responses from specific tasks, such as image rescanning?
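
If both options were supported, the criteria might be expressed as a small configurable policy, sketched below. This is only an assumption about one possible shape: `CachePolicy`, `min_size_bytes`, `allowed_tasks`, the 64 KiB default, and the `"rescan"` task name are all hypothetical and not taken from the codebase.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class CachePolicy:
    """Hypothetical policy deciding whether a response is worth caching."""

    # Assumed default; the real threshold (if any) would come from configuration.
    min_size_bytes: int = 64 * 1024
    # An empty set means "no task restriction".
    allowed_tasks: FrozenSet[str] = frozenset()

    def should_cache(self, task: str, body: bytes) -> bool:
        if self.allowed_tasks and task not in self.allowed_tasks:
            return False
        return len(body) >= self.min_size_bytes
```

For example, `CachePolicy(min_size_bytes=4, allowed_tasks=frozenset({"rescan"}))` would cache only rescan responses of at least four bytes.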

@jopemachine
Member

Because this issue is unclear, I will put it on hold and work on #1913 first.

@jopemachine jopemachine added the action:on hold Hold it. Wait for the restart. label Nov 14, 2024