"AutomationPool_Checkout" stored procedure does not work for big Automation Pool


Description

When the AutomationPool table has a large number of records, the AutomationPool_Checkout stored procedure, which is used for retrieving data, might be executed slowly or fail. In this case, Marketing Automation activities (for example, enrolling contacts in a campaign, moving contacts to the states of the campaign) are not performed or performed with long delays. The following message can be found in the Marketing Automation Engine logs:

Error: Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.

Call Stack:
System.Data.SqlClient.SqlException:
   at System.Data.SqlClient.SqlCommand+<>c.<ExecuteDbDataReaderAsync>b__180_0 (System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Threading.Tasks.Task.Execute (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at Sitecore.Xdb.Sql.Common.Extensions.DbCommandExtensions+<>c__DisplayClass1_0+<<ExecuteReaderWithRetryAsync>b__0>d.MoveNext (Sitecore.Xdb.Sql.Common, Version=4.0.0.0, Culture=neutral, PublicKeyToken=null)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at Sitecore.Xdb.Sql.Common.Extensions.DbCommandExtensions+<ExecuteReaderWithRetryAsync>d__1.MoveNext (Sitecore.Xdb.Sql.Common, Version=4.0.0.0, Culture=neutral, PublicKeyToken=null)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at Sitecore.Xdb.Sql.Common.Extensions.DbCommandExtensions+<ExecuteReaderWithRetryAsync>d__0.MoveNext (Sitecore.Xdb.Sql.Common, Version=4.0.0.0, Culture=neutral, PublicKeyToken=null)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at Sitecore.Xdb.MarketingAutomation.SqlServer.Pool.Commands.CheckoutCommand`1+<ExecuteAndProcessResultsAsync>d__30.MoveNext (Sitecore.Xdb.MarketingAutomation.SqlServer, Version=3.0.0.0, Culture=neutral, PublicKeyToken=null)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089)
   at Sitecore.Xdb.MarketingAutomation.SqlServer.SqlServerCommand`2+<ExecuteAsync>d__17.MoveNext (Sitecore.Xdb.MarketingAutomation.SqlServer, Version=3.0.0.0, Culture=neutral, PublicKeyToken=null)
Inner exception System.ComponentModel.Win32Exception handled at System.Data.SqlClient.SqlCommand+<>c.<ExecuteDbDataReaderAsync>b__180_0:

The issue results in increased resource consumption for the Marketing Automation database.

Solution

To resolve the issue, you can update the AutomationPool_Checkout stored procedure to reduce the load on the AutomationPool table.

  1. Make a backup of the Marketing Automation database.
  2. Modify the AutomationPool_Checkout stored procedure as follows:
    ...
    FROM
        [xdb_ma_pool].[AutomationPool]
        WITH (READPAST, ROWLOCK)
    WHERE
        [Scheduled] <= @Now AND
        [Priority] >= @MinimumPriority AND
        [ContactId] NOT IN (
            SELECT [ContactId]
            FROM [xdb_ma_pool].[ContactWorkerAffinity] WITH (READUNCOMMITTED)
            WHERE [WorkerId] <> @WorkerId
        )
    ORDER BY [Randomizer] ASC
    --ORDER BY [Priority] DESC, [Scheduled] ASC, [Randomizer] ASC
    ...
    Note: Due to these changes, the records are processed in random order (not according to their priority and scheduled date, as usual).
  3. Wait for the value of SELECT COUNT(*) FROM [xdb_ma_pool].[AutomationPool] to decrease to 0.
  4. Revert the changes to the AutomationPool_Checkout stored procedure that have been applied at step 2.