Distributed cache concurrency issue #83
Hi @Caskia, your observation is correct. How would you implement the Token Bucket algorithm in this particular case? Please feel free to contribute with details, pseudo-code, or open a new pull request. Thanks
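The Token Bucket algorithm asked about here could be sketched roughly like this (an illustrative, in-process-only sketch; the class and member names are mine, not the library's, and a distributed version would have to keep the token count and refill timestamp in Redis and update them atomically):

```csharp
using System;

// Illustrative token bucket: a bucket holds up to `capacity` tokens and is
// refilled continuously at `refillRatePerSecond`; each request consumes one
// token and is rejected when the bucket is empty.
public class TokenBucket
{
    private readonly double _capacity;
    private readonly double _refillRatePerSecond;
    private double _tokens;
    private DateTime _lastRefill;
    private readonly object _sync = new object();

    public TokenBucket(double capacity, double refillRatePerSecond)
    {
        _capacity = capacity;
        _refillRatePerSecond = refillRatePerSecond;
        _tokens = capacity;
        _lastRefill = DateTime.UtcNow;
    }

    public bool TryConsume(double tokens = 1)
    {
        lock (_sync)
        {
            // Lazily refill based on the time elapsed since the last call.
            var now = DateTime.UtcNow;
            var elapsed = (now - _lastRefill).TotalSeconds;
            _tokens = Math.Min(_capacity, _tokens + elapsed * _refillRatePerSecond);
            _lastRefill = now;

            if (_tokens < tokens)
                return false;

            _tokens -= tokens;
            return true;
        }
    }
}
```

A burst of up to `capacity` requests is allowed, after which requests are admitted at the refill rate; this avoids the hard reset at period boundaries that fixed-window counters have.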
the error: [image not captured]

…from it? Thanks for the report.

@cristipufu hi, could you check the error message please?
There's a workaround for the distributed cache issue, but it requires a little bit of coding (be aware that this is only a concept and it's not tested).

Override the rate limiting middleware part, `DistributedClientRateLimitMiddleware.cs`:

```csharp
public static IApplicationBuilder UseDistributedClientRateLimiting(this IApplicationBuilder builder)
{
    return builder.UseMiddleware<DistributedClientRateLimitMiddleware>();
}

public class DistributedClientRateLimitMiddleware : RateLimitMiddleware<DistributedClientRateLimitProcessor>
{
    private readonly ILogger<ClientRateLimitMiddleware> _logger;

    public DistributedClientRateLimitMiddleware(RequestDelegate next,
        IOptions<ClientRateLimitOptions> options,
        IRateLimitCounterStore counterStore,
        IClientPolicyStore policyStore,
        IRateLimitConfiguration config,
        RedLockFactory redLockFactory, // injected here so it can be handed to the processor
        ILogger<ClientRateLimitMiddleware> logger)
        : base(next, options?.Value, new DistributedClientRateLimitProcessor(options?.Value, counterStore, policyStore, config, redLockFactory), config)
    {
        _logger = logger;
    }

    protected override void LogBlockedRequest(HttpContext httpContext, ClientRequestIdentity identity, RateLimitCounter counter, RateLimitRule rule)
    {
        _logger.LogInformation($"Request {identity.HttpVerb}:{identity.Path} from ClientId {identity.ClientId} has been blocked, quota {rule.Limit}/{rule.Period} exceeded by {counter.Count}. Blocked by rule {rule.Endpoint}, TraceIdentifier {httpContext.TraceIdentifier}.");
    }
}
```

Now, we just have to override the processor, `DistributedClientRateLimitProcessor.cs`:

```csharp
public class DistributedClientRateLimitProcessor : ClientRateLimitProcessor
{
    private readonly IRateLimitCounterStore _counterStore;
    private readonly IRateLimitConfiguration _config;
    private readonly RedLockFactory _redLockFactory;

    public DistributedClientRateLimitProcessor(ClientRateLimitOptions options,
        IRateLimitCounterStore counterStore,
        IClientPolicyStore policyStore,
        IRateLimitConfiguration config,
        RedLockFactory redLockFactory)
        : base(options, counterStore, policyStore, config)
    {
        _counterStore = counterStore;
        _config = config;
        _redLockFactory = redLockFactory; // this should be injected as a singleton
    }

    public override async Task<RateLimitCounter> ProcessRequestAsync(ClientRequestIdentity requestIdentity, RateLimitRule rule, CancellationToken cancellationToken = default)
    {
        var counter = new RateLimitCounter
        {
            Timestamp = DateTime.UtcNow,
            Count = 1
        };

        var counterId = BuildCounterKey(requestIdentity, rule);
        var expiry = TimeSpan.FromSeconds(2);

        // serialize reads and writes on the same key across all nodes
        using (var redLock = await _redLockFactory.CreateLockAsync(counterId, expiry)) // there are also non-async Create() methods
        {
            // make sure we got the lock
            if (redLock.IsAcquired)
            {
                var entry = await _counterStore.GetAsync(counterId, cancellationToken);
                if (entry.HasValue)
                {
                    // entry has not expired
                    if (entry.Value.Timestamp + rule.PeriodTimespan.Value >= DateTime.UtcNow)
                    {
                        // increment request count (parentheses matter: without them
                        // the ?? would apply to the whole sum)
                        var totalCount = entry.Value.Count + (_config.RateIncrementer?.Invoke() ?? 1);
                        // deep copy
                        counter = new RateLimitCounter
                        {
                            Timestamp = entry.Value.Timestamp,
                            Count = totalCount
                        };
                    }
                }
                // stores: id (string) - timestamp (datetime) - total_requests (long)
                await _counterStore.SetAsync(counterId, counter, rule.PeriodTimespan.Value, cancellationToken);
            }
        }
        // the lock is automatically released at the end of the using block

        return counter;
    }
}
```

One can implement a distributed lock mechanism in .NET with a library such as RedLock.net (which provides the `RedLockFactory` used above).
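As a sketch of the "singleton injected" note above (assuming RedLock.net and StackExchange.Redis; the extension-method name and Redis endpoint are illustrative, not part of this library):

```csharp
using System.Collections.Generic;
using Microsoft.Extensions.DependencyInjection;
using RedLockNet.SERedis;
using RedLockNet.SERedis.Configuration;
using StackExchange.Redis;

public static class RedLockRegistration
{
    public static IServiceCollection AddRedLockFactory(this IServiceCollection services, string redisConfiguration)
    {
        // RedLockFactory wraps one or more Redis connections; register it as a
        // singleton so all requests coordinate locks through the same multiplexer.
        services.AddSingleton(_ =>
        {
            var multiplexer = ConnectionMultiplexer.Connect(redisConfiguration);
            return RedLockFactory.Create(new List<RedLockMultiplexer>
            {
                new RedLockMultiplexer(multiplexer)
            });
        });
        return services;
    }
}
```

With this in place, the `RedLockFactory` constructor parameter of the processor resolves automatically from the container.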
Then we just have to register the distributed counter store and the Redis cache:

```csharp
services.AddSingleton<IRateLimitCounterStore, DistributedCacheRateLimitCounterStore>();

services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = "localhost";
    options.InstanceName = "SampleInstance";
});
```
```csharp
services.AddDistributedRedisCache(options =>
{
    // e.g. contoso5.redis.cache.windows.net,abortConnect=false,ssl=true,password=...
    options.Configuration = Configuration["IpRateLimitingRedisString"];
    options.InstanceName = "ip_rate_limit_";
});

// needed to store rate limit counters and ip rules
services.AddMemoryCache();

// load general configuration from appsettings.json
services.Configure<IpRateLimitOptions>(Configuration.GetSection("IpRateLimiting"));

// load ip rules from appsettings.json
services.Configure<IpRateLimitPolicies>(Configuration.GetSection("IpRateLimitPolicies"));

// inject counter and rules stores
services.AddSingleton<IIpPolicyStore, DistributedCacheIpPolicyStore>();
services.AddSingleton<IRateLimitCounterStore, DistributedCacheRateLimitCounterStore>();
```

@cristipufu like this? But it does not work: the distributed cache concurrency problem is still there.
There's also another possible solution (also overriding the …)
Just wanted to throw in some more information. I run this plugin on an API that was getting about 1 million requests per 4 hours. After getting complaints about sluggishness during high peak times, I removed this limiter and response times went down from about 0.6 sec per request to 0.15 sec per request. I used a Redis backend for the cache as well. So there is definitely something that needs to be done to help performance under high loads. Otherwise, for small projects this plugin totally rocks.
@STRATZ-Ken were you using the 3+ version? (The previous versions had a simple lock, which means that each request would have throttled all the others.) The issue probably is this block of code:

```csharp
using (await AsyncLock.WriterLockAsync(counterId).ConfigureAwait(false))
{
    var entry = await _counterStore.GetAsync(counterId, cancellationToken);
    if (entry.HasValue)
    {
        // entry has not expired
        if (entry.Value.Timestamp + rule.PeriodTimespan.Value >= DateTime.UtcNow)
        {
            // increment request count
            var totalCount = entry.Value.Count + _config.RateIncrementer?.Invoke() ?? 1;
            // deep copy
            counter = new RateLimitCounter
            {
                Timestamp = entry.Value.Timestamp,
                Count = totalCount
            };
        }
    }
    // stores: id (string) - timestamp (datetime) - total_requests (long)
    await _counterStore.SetAsync(counterId, counter, rule.PeriodTimespan.Value, cancellationToken);
}
```

As you can see, this holds a lock per ip/client/etc. (depending on the plugin's configuration) that makes 2 Redis calls inside it. So if one would implement this atomic operation directly in Redis, replacing the lock + the 2 calls with a single Redis call, I think that the performance would improve a lot.
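That single atomic Redis call could be sketched like this (a hedged sketch using StackExchange.Redis; the class name and key handling are illustrative): a Lua script increments the counter and sets its expiry in one server-side step, so no application lock is needed.

```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis;

public class AtomicRedisCounter
{
    // INCR the counter and set its expiry atomically on the server,
    // replacing the in-process lock plus the two separate cache calls.
    private const string IncrementScript = @"
        local count = redis.call('INCR', KEYS[1])
        if count == 1 then
            redis.call('EXPIRE', KEYS[1], ARGV[1])
        end
        return count";

    private readonly IDatabase _db;

    public AtomicRedisCounter(IConnectionMultiplexer connection)
    {
        _db = connection.GetDatabase();
    }

    public async Task<long> IncrementAsync(string counterId, TimeSpan period)
    {
        var result = await _db.ScriptEvaluateAsync(
            IncrementScript,
            new RedisKey[] { counterId },
            new RedisValue[] { (long)period.TotalSeconds });
        return (long)result;
    }
}
```

Because the whole script runs atomically inside Redis, concurrent requests from any number of nodes see a strictly increasing count, and each request costs one round trip instead of two plus a lock.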
Any update on this?

Any plans to fix this?
Any plans to fix this?
We resolved this issue in a fork by refactoring the counter store. For each rate limit rule, time is divided into intervals that are the length of the rule's period. Requests are resolved to an interval, and the interval's counter is incremented. This is a slight change in semantics from the original implementation, but it works in our use case. This approach requires a direct dependency on Redis.
As far as I can see, there is still a race condition there, because the call to the Redis INCR command and the call to set the expiration are separate calls, and the counter may change in between. It looks like the only way to make it work correctly is to make it atomic using a Lua script, exactly as explained in the Redis documentation for the INCR command.
No race condition: it does not matter what the counter value is when the key expiry is set. The expiry only needs to be set once (but can be set any number of times), and the check for a counter value of 1 ensures this. This is why the algorithm divides time into intervals of the rule's period. The expiry time will be the same for each interval, and any client may set it, as long as it is set at least once.
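The interval scheme described here could look roughly like this (a sketch with StackExchange.Redis; the key format and method names are assumptions): the key embeds the interval index, every client INCRs it, and whichever client sees the count reach 1 sets the expiry.

```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis;

public static class FixedWindowCounter
{
    // Returns the request count within the current interval of the rule's period.
    public static async Task<long> IncrementAsync(IDatabase db, string ruleKey, TimeSpan period)
    {
        // Resolve "now" to an interval index so every node derives the same key.
        var interval = DateTimeOffset.UtcNow.ToUnixTimeSeconds() / (long)period.TotalSeconds;
        var key = $"{ruleKey}:{interval}";

        var count = await db.StringIncrementAsync(key);
        if (count == 1)
        {
            // Only needs to succeed once per interval; any client may set it,
            // and setting it again is harmless.
            await db.KeyExpireAsync(key, period);
        }
        return count;
    }
}
```

The remaining failure mode is a node that performs the INCR and then crashes before the EXPIRE, leaving a key with no expiry; the Lua variant discussed in this thread closes that gap by making the two steps one atomic operation.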
Whatever is non-atomic (and your code is non-atomic) will produce race conditions, and so on. That's why the Redis documentation says it needs to be done with a Lua script, which is atomic; the Redis developers wrote that for a reason.
@olegkv the right place to discuss the fork is over at its repo, so I've created an issue there for this.

I've only recently found this repo and was wondering if there are known instances of using it in a production environment and how successful that has been. Presumably the side effect of this issue is an incorrect request count; I think we can get by with that within a certain range. Are there any other consequences we should know about?

@nick-cromwell I maintain a fork whose master branch is used in production. It has a known issue in a distributed deployment where one node may set a Redis key and then leave the network (e.g. crash) before it sets the key to expire. In this case, access to the rate-limited resource is eventually blocked. This is an extremely rare case, but it is serious enough that it has been addressed in the atomic-lua branch, which is being tested and has not yet been used in production.

Thanks for that, @simonhaines. I'll keep a lookout for a pull request from the atomic-lua branch. I'm not familiar with Lua scripting, but based on your discussion here with @olegkv it seems that will resolve this issue once and for all.
@simonhaines Are you good to make a PR from your …

@nick-cromwell I don't think so. The fork changes the …
Hey @stefanprodan, what do you think of the changes proposed by @simonhaines? I want to use this package but am holding off until the concurrency issues are sorted.
I was reading a recent article about how GitHub solved their rate-limiting issues (link), with the actual Lua script at the end. Does anyone consider this a valid approach worth looking into, and possibly creating a pull request?
@validide there is a pull request #180 that incorporates distributed rate-limiting and it looks like a good way forward. However, the proposed changes require a major version bump, and it looks like the project structure is going to be refactored, with the distributed rate limiting split out into a separate project.
Wish it would get merged and the version bumped already, though... it's been sitting a while.

This should have high priority, since for high-load projects this issue is a blocker...

I realize this is an open-source project and everyone is contributing on their own time, so is there any way I could help?
We've pushed a new version, 4.0.0 (https://www.nuget.org/packages/AspNetCoreRateLimit/4.0.0), compatible with …
@cristipufu, sorry to revive this, but I wanted to ask something. I've read through all of this thread, the related PRs, and the general documentation, but something is still unclear to me: is the concurrency issue identified originally here still a problem if you're using SQL Server for the distributed store, since it still uses the same in-process locking?

If it is still an issue for SQL Server cases, is the expectation basically that as you scale up, you eventually need to switch over from SQL Server to Redis (and the new implementation that uses direct Redis calls)? Apologies again for reviving this just to ask, but I'm trying to understand the risk of using SQL Server so that I can make a recommendation to my team.
Yes.

I think you can follow the Redis processing strategy example and implement something similar in SQL.
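A hedged sketch of that idea for SQL Server (table and column names are hypothetical, and `Microsoft.Data.SqlClient` is an assumption): push the read-modify-write into a single atomic statement so the database, not the application, serializes concurrent increments.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

public static class SqlRateLimitCounter
{
    // One MERGE statement upserts and increments the counter atomically:
    // matched and unexpired -> increment; matched but expired -> reset the
    // window; no row -> insert a fresh one. HOLDLOCK prevents two concurrent
    // MERGEs from both taking the insert branch.
    private const string UpsertSql = @"
        MERGE RateLimitCounters WITH (HOLDLOCK) AS target
        USING (SELECT @id AS Id) AS source
        ON target.Id = source.Id
        WHEN MATCHED AND target.ExpiresAt > SYSUTCDATETIME() THEN
            UPDATE SET target.Count = target.Count + 1
        WHEN MATCHED THEN
            UPDATE SET target.Count = 1,
                       target.ExpiresAt = DATEADD(SECOND, @periodSeconds, SYSUTCDATETIME())
        WHEN NOT MATCHED THEN
            INSERT (Id, Count, ExpiresAt)
            VALUES (@id, 1, DATEADD(SECOND, @periodSeconds, SYSUTCDATETIME()))
        OUTPUT inserted.Count;";

    public static async Task<long> IncrementAsync(string connectionString, string counterId, TimeSpan period)
    {
        await using var connection = new SqlConnection(connectionString);
        await connection.OpenAsync();
        await using var command = new SqlCommand(UpsertSql, connection);
        command.Parameters.AddWithValue("@id", counterId);
        command.Parameters.AddWithValue("@periodSeconds", (int)period.TotalSeconds);
        var result = await command.ExecuteScalarAsync();
        return Convert.ToInt64(result);
    }
}
```

This mirrors the Redis Lua approach: one round trip, no application-side lock, with the database's own locking providing the atomicity.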
Can the docs be updated to provide an example of the in-memory fallback for the "New Version"? Thanks.
Hi @stefanprodan,

When deploying a lot of API services (using a distributed cache to store counters) behind a load balancer, every instance needs to fetch the counter and use an in-process lock to increment the count, as in the code below. I think this has concurrency issues: the counter will not work correctly when a lot of requests come in, because every instance counts by itself and then saves to the cache store, so the last write overwrites the previous ones.

Why not use the Token Bucket algorithm to implement this feature?