
Urgent Caching/DB Issues in Production #8639



Hi,

We're finally in production, but I'm having a very hard time keeping the system in a healthy state. It appears we're having caching issues, and they occur constantly.

  • ABP Framework version: v8.3.0
  • UI Type: Razor Pages
  • Database System: EF Core (SQL Server)
  • Tiered (for MVC) or Auth Server Separated (for Angular): Tiered, OpenIddict, yes

I've sent logs and the module classes to liming.ma@volosoft.com but let me know if I should send them somewhere else.

I'm seeing tons of these types of entries in the API logs:

[2025-01-14 00:03:03.768] [Warning] wn1ldwk0000OB (276) <Volo.Abp.Caching.DistributedCache> The operation was canceled.
System.OperationCanceledException: The operation was canceled.
   at System.Threading.CancellationToken.ThrowOperationCanceledException()
   at Microsoft.Extensions.Caching.StackExchangeRedis.RedisCache.GetAsync(String key, CancellationToken token)
   at Volo.Abp.Caching.DistributedCache`2.GetAsync(TCacheKey key, Nullable`1 hideErrors, Boolean considerUow, CancellationToken token)
[2025-01-14 00:03:03.770] [Error] wn1ldwk0000OB (276) <Microsoft.EntityFrameworkCore.Database.Connection> An error occurred using the connection to database '"(redacted)"' on server '"tcp:(redacted).database.windows.net,1433"'.

And

[2025-01-14 00:03:29.807] [Debug] wn1ldwk0000OB (277) <Volo.Abp.PermissionManagement.PermissionStore> PermissionStore.GetCacheItemAsync: pn:U,pk:de5c48e0-9804-4e05-b190-99304d28c0d8,n:CabMD.EDTAccount
[2025-01-14 00:03:29.811] [Warning] wn1ldwk0000OB (277) <Volo.Abp.Caching.DistributedCache> The operation was canceled.
System.OperationCanceledException: The operation was canceled.
   at System.Threading.CancellationToken.ThrowOperationCanceledException()
   at Microsoft.Extensions.Caching.StackExchangeRedis.RedisCache.RefreshAsync(IDatabase cache, String key, Nullable`1 absExpr, Nullable`1 sldExpr, CancellationToken token)
   at Microsoft.Extensions.Caching.StackExchangeRedis.RedisCache.GetAndRefreshAsync(String key, Boolean getData, CancellationToken token)
   at Microsoft.Extensions.Caching.StackExchangeRedis.RedisCache.GetAsync(String key, CancellationToken token)
   at Volo.Abp.Caching.DistributedCache`2.GetAsync(TCacheKey key, Nullable`1 hideErrors, Boolean considerUow, CancellationToken token)
[2025-01-14 00:03:29.813] [Debug] wn1ldwk0000OB (277) <Volo.Abp.PermissionManagement.PermissionStore> Not found in the cache: pn:U,pk:de5c48e0-9804-4e05-b190-99304d28c0d8,n:CabMD.EDTAccount
[2025-01-14 00:03:29.813] [Debug] wn1ldwk0000OB (277) <Volo.Abp.PermissionManagement.PermissionStore> Getting all granted permissions from the repository for this provider name,key: U,de5c48e0-9804-4e05-b190-99304d28c0d8
[2025-01-14 00:03:29.814] [Warning] wn1ldwk0000OB (277) <Volo.Abp.Caching.DistributedCache> The operation was canceled.
System.OperationCanceledException: The operation was canceled.
   at System.Threading.CancellationToken.ThrowOperationCanceledException()
   at Microsoft.Extensions.Caching.StackExchangeRedis.RedisCache.GetAsync(String key, CancellationToken token)
   at Volo.Abp.Caching.DistributedCache`2.GetAsync(TCacheKey key, Nullable`1 hideErrors, Boolean considerUow, CancellationToken token)
[2025-01-14 00:03:29.816] [Error] wn1ldwk0000OB (277) <Microsoft.EntityFrameworkCore.Database.Connection> An error occurred using the connection to database '"(redacted)"' on server '"tcp:(redacted).database.windows.net,1433"'.
[2025-01-14 00:03:29.825] [Error] wn1ldwk0000OB (277) <Volo.Abp.AspNetCore.Mvc.ExceptionHandling.AbpExceptionFilter> ---------- RemoteServiceErrorInfo ----------
{
  "code": null,
  "message": "An internal error occurred during your request!",
  "details": null,
  "data": {},
  "validationErrors": null
}

Testing went well, but now that this is live and we're getting lots of users, it's failing dramatically and the users are very frustrated.

I'm trying to understand:

  1. What the nature of this error is
  2. Why it happens so often
  3. What solutions there are for resolving the issue

4 Answer(s)
  • User Avatar
    0
    liangshiwei created
    Support Team Fullstack Developer

    Hi

    I only checked the logs briefly (it's late here now).

    It looks like a cache avalanche: the cache lookups fail or miss, so a large number of requests query the database directly, and that causes the timeouts.

    You can try:

    1. Increasing the cache expiration time
    2. Increasing the database timeout
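
    As a rough illustration of those two suggestions, here is a minimal sketch of how they might be configured in an ABP module. The module name `MyAppCacheTuningModule` and the specific values (one hour, 60 seconds) are assumptions for illustration only, not recommendations from this thread. In a typical solution the settings would go into the existing module classes, and the `UseSqlServer` call usually already exists in the EntityFrameworkCore module.

    ```csharp
    using System;
    using Volo.Abp.Caching;
    using Volo.Abp.EntityFrameworkCore;
    using Volo.Abp.Modularity;

    // Hypothetical module name; merge these calls into your own module classes.
    public class MyAppCacheTuningModule : AbpModule
    {
        public override void ConfigureServices(ServiceConfigurationContext context)
        {
            // 1. Longer default expiration for all distributed cache entries.
            //    The one-hour value is an assumed example.
            Configure<AbpDistributedCacheOptions>(options =>
            {
                options.GlobalCacheEntryOptions.SlidingExpiration = TimeSpan.FromHours(1);
            });

            // 2. Longer SQL command timeout, in seconds (assumed example value).
            Configure<AbpDbContextOptions>(options =>
            {
                options.UseSqlServer(sql => sql.CommandTimeout(60));
            });
        }
    }
    ```

    Longer timeouts only give the database more room under load; they do not remove the underlying load, so they are best treated as a stopgap while the cache failures themselves are investigated.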

  • User Avatar
    0
    liangshiwei created
    Support Team Fullstack Developer

    Hi,

    Has the problem in your project been solved?

  • User Avatar
    0

    We're trying to adjust things.

    One question: why would we constantly see all kinds of cache updates related to Volo.Abp.PermissionManagement.PermissionStore? Can we adjust how that's cached?

  • User Avatar
    0
    liangshiwei created
    Support Team Fullstack Developer

    Hi,

    Normally the cached permission entry is not changed; it is only deleted when the permission grant is changed.

    https://github.com/abpframework/abp/blob/dev/modules/permission-management/src/Volo.Abp.PermissionManagement.Domain/Volo/Abp/PermissionManagement/PermissionGrantCacheItemInvalidator.cs#L10

    If your application is under high load, consider upgrading the configuration of the Redis server.
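
    On the question of adjusting how the permission cache behaves, one possible approach is a per-cache configurator on `AbpDistributedCacheOptions`. The following is only a sketch under assumptions: the module name `MyAppPermissionCacheModule` and the two-hour sliding expiration are hypothetical, and it assumes the permission store caches `PermissionGrantCacheItem` entries under that type's default cache name.

    ```csharp
    using System;
    using Microsoft.Extensions.Caching.Distributed;
    using Volo.Abp.Caching;
    using Volo.Abp.Modularity;
    using Volo.Abp.PermissionManagement;

    // Hypothetical module name; merge this into your own module class.
    public class MyAppPermissionCacheModule : AbpModule
    {
        public override void ConfigureServices(ServiceConfigurationContext context)
        {
            // Resolve the cache name ABP derives for permission grant entries.
            var permissionCacheName =
                CacheNameAttribute.GetCacheName(typeof(PermissionGrantCacheItem));

            Configure<AbpDistributedCacheOptions>(options =>
            {
                options.CacheConfigurators.Add(cacheName =>
                {
                    if (cacheName == permissionCacheName)
                    {
                        // Assumed example value, applied only to this cache.
                        return new DistributedCacheEntryOptions
                        {
                            SlidingExpiration = TimeSpan.FromHours(2)
                        };
                    }

                    // null leaves every other cache on the global defaults.
                    return null;
                });
            });
        }
    }
    ```

    This changes only how long entries stay cached. Entries are still removed when a permission grant changes, as described above.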

Made with ❤️ on ABP v9.2.0-preview. Updated on January 16, 2025, 11:47