Caddy 2 caching layer and API endpoints #2820
Yep, this is a good idea, it definitely needs to be a part of Caddy's cache module. Right now, the cache key is just the request URI: caddy/modules/caddyhttp/httpcache/httpcache.go Lines 85 to 88 in 19e834c
Since groupcache has no explicit cache invalidation features, all we need to do is encode the information related to cache expiration into the key. And yes, an admin endpoint would be needed. These are pluggable, as admin endpoints are themselves Caddy modules: Lines 29 to 54 in 19e834c
Agreed! We've talked to some CDNs who would go crazy for this combination. Anyone want to post an end-to-end example of something being added to the cache (including what the key is), an API request that invalidates it, and then the next version being added to the cache and used in its place?
Also looping @maruel in on this conversation.
My 2¢ from having worked on fairly distributed systems: for scalable infrastructures, a requirement for cache invalidation is generally an anti-pattern. I'd recommend using one of the following in the key:
Using these patterns removes the need for cache coherence between nodes in the caching infrastructure. A good example of what failed for us:
Sure, you could use a TTL (my preference) and/or explicit cache invalidation to work around those issues once they are detected, but then the whole system becomes eventually consistent, especially since all the nodes in the system must have a coherent view of the cache data.
@maruel Interesting thoughts and experience. My original message did not mention TTL-based eviction since it seems to be a mandatory feature for supporting HTTP Cache-Control headers on an HTTP server. Do you plan to rely on the cache module to handle Cache-Control headers? What about a basic abstraction layer (maybe one for eviction policies and one for invalidation) with different implementations (e.g. local disk-based, local memory-based, distributed)? Not all use cases require a distributed setup, nor coordination between nodes. It would be interesting to get more feedback from Caddy users about their use cases. Maybe you already have? Just to emphasize: my main motivation for creating this issue was a cache invalidation API with modern, actionable purge methods (i.e. surrogate keys). That's a really common and recurring need when working with multi-tenant architectures in the SaaS world.
Even supporting max-age correctly is tricky: how long should the proxy cache it? Should the max-age value be reduced as time goes on? I'm not familiar with proxy implementations, but I tend to favor safety over performance: cache for a small amount of time (like 1% of the max-age value) so we don't need to answer the questions above.
Not sure if you folks have already looked at https://github.com/mailgun/groupcache. It is a fork of groupcache that supports TTL-based eviction and a way to remove entries (sort of).
@varun06 I did see that once upon a time, but the theoretical guarantees are less strong. I think if we went that route we'd have to do some careful profiling and benchmarking before deciding to go with it. It has to be better than simply encoding expiration into the keys themselves. Also, this issue is a little unsettling (a common gotcha that I've done myself, and something easy to fix, but let's see what the response time is): mailgun/groupcache#14 In the meantime, I've moved the cache handler to a separate repo for further development: https://github.com/caddyserver/cache-handler I might transfer this issue as well (or just link to it).
Thanks for the explanation @mholt. I am working on a similar exercise (a distributed cache, although we offload much of the caching to the CDN layer) for a reverse proxy used by a big retailer. It is in a very early phase. If I learn something that might help Caddy, I will contribute or let you folks know.
Moving discussion to the new repo dedicated to the caching layer: caddyserver/cache-handler#1 Would be happy to have people work on it!
1. What would you like to have changed?
The Caddy 2 caching layer is currently a WIP. It seems to be the right time to suggest a cache invalidation API that supports cache tags.
2. Why is this feature a useful, necessary, and/or important addition to this project?
A caching layer improves throughput by reducing bandwidth, latency, and workload. It can apply to less frequently updated content such as a blog, or to more critical components such as an API. The on-demand TLS feature coupled with a great caching and invalidation API would be a set of killer features for SaaS companies, and I think it would bring a lot of new adopters.
There are three main methods of cache invalidation:
The third method is the most interesting since it enables and supports basic to advanced use cases while not requiring a lot of extra work.
It would be really convenient to have a REST API exposed by Caddy for cache invalidation. Requests could be formulated using the non-standard but widely adopted HTTP PURGE method.
3. What alternatives are there, or what are you doing in the meantime to work around the lack of this feature?
Using a solution other than Caddy.
4. Please link to any relevant issues, pull requests, or other discussions.
Here are some resources for an introduction about surrogate keys/cache tags: