How to rebuild Path Analyzer maps


Introduction

The units of data that power Path Analyzer map visualizations are called Trees. Trees are data structures that are built in memory on the Processing Servers and persisted in RDB as blobs, in the Trees table. Each Tree corresponds to a time period (for example, a day, month, or quarter) and a particular Tree Definition. A Tree Definition is a record in the "TreeDefinitions" table.
Each Tree Definition references a Map Item* under the Marketing Control Panel in the master database (by item ID).

*Map Item is a marketing user-friendly way of describing a Tree Definition. See Path Analyzer’s glossary for more info.

Sitecore XP 8.0 ships with 7 out-of-the-box maps: 5 visit maps and 2 goal maps.

There is a record in the "TreeDefinitions" table for each map item.

The Path Analyzer uses these records to build Trees and store them in the "Trees" table.

This mechanism is called Tree Construction. It involves streaming the Interaction data from the xDB and building the Tree structures from it in memory. This operation is performed on the Processing Servers and is implemented via various Agents, depending on the operation required. See the Agents section that follows for more information on how this process works.

Important to know:

Map Rebuild Mechanism

The Tree Construction process is built to run on auto-pilot. However, some scenarios require manual involvement from an admin user to complete the process.

It is important to know that the "Trees" table is empty out of the box when you install Sitecore XP 8.0. This is because the Trees reflect the interaction data from the customer's xDB instance.

Consider the following scenarios:

Scenario 1: You just installed Sitecore XP 8.0, and it references an empty xDB (*).

* An empty xDB means no data in the Interactions collection.

This is the simplest scenario. Because there is no historical data, the Path Analyzer does not perform Tree Construction for past dates. Instead, it builds maps for new interactions on the following day, after all sessions have been flushed to xDB, and the maps become available for analysis after that. This work is performed by the dailyMapAgent, described in the Agents section that follows.

Scenario 2: You recently installed Sitecore XP 8.0, and it references an existing xDB with some historic interaction data (*).

* The xDB data could have been migrated from previous versions of DMS, transferred from another xDB, or generated automatically by a script; the origin does not matter.

This case assumes that the connection string to the xDB with the historic data is in place at the moment of the first system use.

The Tree Construction process kicks off the historic data rebuild for the Path Analyzer approximately 5 minutes after the first system start-up.

After this process finishes, the state of the Properties table in RDB should change to the following:

Key                            Value
PathAnalyzer_newmaps           <empty>
PathAnalyzer_busy              False
PathAnalyzer_lastrunnewmaps    <timestamp in UTC>

This should also result in new records added to the Trees table. At this point, the Path Analyzer is capable of displaying maps in the UI, and it is possible to start the map analysis on the available data set.
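
The check against the Properties table described above can be sketched in a few lines of Python. This is only an illustration: it assumes the three rows have already been read into a plain dictionary of key to string value (how the rows are fetched from RDB is out of scope), and the function name is hypothetical.

```python
def historic_rebuild_finished(properties):
    """Return True if the Properties rows match the post-rebuild state.

    properties -- dict of Properties key -> string value (assumed shape).
    """
    new_maps = (properties.get("PathAnalyzer_newmaps") or "").strip()
    busy = str(properties.get("PathAnalyzer_busy", "")).strip().lower()
    last_run = (properties.get("PathAnalyzer_lastrunnewmaps") or "").strip()
    # Finished: no pending maps, busy flag cleared, and a last-run time stamp set.
    return new_maps == "" and busy == "false" and last_run != ""
```

The sketch only tests that a time stamp is present; it does not try to parse it, because the stored format is not specified here.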

Scenario 3: You installed Sitecore XP 8.0 with a reference to an empty xDB (*), but afterwards changed the connection string to another xDB with some historic data.

* An empty xDB means no data in the Interactions collection.

First, inspect the state of the Properties table in RDB. If it contains the following key values, rebuild the data for all maps by following the instructions in Scenario 4 below.

 

Key                            Value
PathAnalyzer_newmaps           <empty>
PathAnalyzer_busy              False
PathAnalyzer_lastrunnewmaps    <timestamp in UTC>


Scenario 4: You need to rebuild all Path Analyzer maps, for any reason, some time after installing Sitecore XP 8.0.

Prerequisites:

Your next actions depend on the following:

Agents

This section describes the Tree Construction process in detail. There are four main scenarios around the Tree Construction process:

  1. All currently deployed maps are kept up to date at a recurring (daily) interval.
  2. A new map is deployed via the Marketing Control Panel and needs to be populated with data from xDB.
  3. All maps need to be rebuilt from the historic data.
  4. Proactive map pre-cook.

Scenario 1: Updating all maps daily.

A special dailyMapAgent is in place (configured in the Sitecore.PathAnalyzer.Processing.config file).
It is triggered at an interval (every 10 minutes); however, it executes only once in 24 hours and only after 1 AM in the local server time zone.
The time stamp of its last execution is persisted in the Properties table (Key="PathAnalyzer_lastrundaily").

The agent will read all maps currently deployed to RDB, and for each of them it will schedule a process that will re-stream the data from xDB for the previous day.
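
The gating rule described above (triggered often, executed at most once in 24 hours and only after 1 AM local time) can be sketched as follows. This is one possible reading of the rule, not the agent's actual code: the function name and the strict 24-hour gap are assumptions, and the real agent may instead compare calendar days.

```python
from datetime import datetime, time, timedelta

EARLIEST_START = time(1, 0)      # 1 AM local server time, per the text
MIN_GAP = timedelta(hours=24)    # at most one execution per 24 hours (assumed strict)

def should_execute(now, last_run):
    """Decide whether a triggered agent should actually do its work.

    now      -- current local server time (naive datetime)
    last_run -- local time of the previous execution, or None if never run
    """
    if now.time() < EARLIEST_START:
        return False             # too early in the day
    if last_run is None:
        return True              # no time stamp recorded yet
    return now - last_run >= MIN_GAP
```

Under this reading, every trigger (for example, every 10 minutes) calls `should_execute`, and most calls return `False` without doing any work.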

Scenario 2: New map deployment.

If a new map is created in the Marketing Control Panel and deployed to RDB, a special newMapAgent is in place (configured in the Sitecore.PathAnalyzer.Processing.config file) to handle historic data propagation for this new map.

It is triggered at an interval (every 5 minutes). The time stamp of its last execution is persisted in the Properties table (Key="PathAnalyzer_lastrunnewmaps").

Similar to the dailyMapAgent, it executes only once in 24 hours and only after 1 AM in the local server time zone. The difference is that it executes only if new maps have been deployed to RDB since its last execution.
The list of newly deployed maps is maintained in the Properties table record with the key "PathAnalyzer_newmaps". Its value is expected to contain the GUIDs of the newly deployed maps, separated by a pipe ("|"). If that value is present, the newMapAgent parses it and performs the following actions:

At the end of the process, the in-memory Tree data structure is persisted in RDB as a blob in a record of the Trees table.
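
As an illustration of the pipe-separated GUID format used by the "PathAnalyzer_newmaps" value, the following Python sketch splits such a value into individual GUIDs. The function name is hypothetical, and whether the stored GUIDs carry braces is an assumption; `uuid.UUID` accepts both forms.

```python
import uuid

def parse_new_maps(value):
    """Split a pipe-separated GUID list into uuid.UUID objects.

    An empty or missing value means no newly deployed maps are pending.
    """
    if not value:
        return []
    # Ignore empty fragments, e.g. from a trailing pipe.
    return [uuid.UUID(part.strip()) for part in value.split("|") if part.strip()]
```

A malformed GUID raises `ValueError`, which makes a corrupted property value visible instead of silently skipping a map.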

Scenario 3: Historic map rebuild.

This process is handled by the same newMapAgent. The difference is that it is performed for all maps currently deployed in RDB, not only the new ones.

See the "Map Rebuild Mechanism" section above for more info.

Scenario 4: Proactive map pre-cook.

Since Trees are built on a daily basis, to optimize the performance of requests for the ad-hoc date ranges that could span multiple days and months, a special agent is in place to trigger the process of pre-cooking Trees for predefined intervals (weekly, monthly, quarterly, yearly). This process involves in-memory merging of multiple Trees into one, and persisting it to the Trees table.

The responsible agent is called smartMergeAgent.
Similar to the other agents, it is triggered at an interval (every 15 minutes); however, it executes only once in 24 hours and only after 1 AM in the local server time zone.
The time stamp of its last execution is persisted in the Properties table (Key="PathAnalyzer_lastrunmerge").
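
To make the idea of merging several daily Trees into one more concrete, here is a minimal Python sketch. The internal layout of a Tree is not described in this article, so the node shape used here (a dict with a `count` and a `children` mapping keyed by page path) and both function names are purely hypothetical.

```python
def merge_trees(trees):
    """Merge trees by summing the counts of nodes that share the same path."""
    merged = {"count": 0, "children": {}}
    for tree in trees:
        _merge_into(merged, tree)
    return merged

def _merge_into(target, source):
    """Add source's counts into target, creating missing child nodes."""
    target["count"] += source.get("count", 0)
    for name, child in source.get("children", {}).items():
        node = target["children"].setdefault(name, {"count": 0, "children": {}})
        _merge_into(node, child)
```

With this shape, a weekly Tree would be `merge_trees` applied to the seven daily Trees, which avoids re-streaming the interaction data for a range that has already been processed day by day.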

FAQ

How do I find out that all of my maps are rebuilt successfully?

Unfortunately, the system currently provides little feedback. You can monitor the number of records in the "Trees" table and inspect the logs for the current progress. Consult the "Troubleshooting" section on how to enable DEBUG mode for logging. If no new records are being added to the "Trees" table and no Path Analyzer activity related to the Tree Construction process appears in the log files, the process has finished.
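
Because Path Analyzer log entries are prefixed with "[Path Analyzer]", a small filter can help when inspecting large log files. The following Python sketch is only a convenience example; the function name is hypothetical, and the log file location depends on your installation.

```python
def path_analyzer_entries(lines):
    """Yield only the log lines that contain the "[Path Analyzer]" prefix."""
    for line in lines:
        if "[Path Analyzer]" in line:
            yield line.rstrip("\n")
```

To use it, open a log file and pass the file object to `path_analyzer_entries`; if repeated runs yield no new entries, that matches the "no activity" condition described above.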

How long does it take to rebuild all maps?

Depending on the processing power of your server and your xDB configuration, this could take from minutes to hours. Most of the time is spent transferring and deserializing xDB interaction data on the Processing Server.

For reference, it takes approximately 2 hours on an xDB with 7M interactions, running on a single server.

Troubleshooting

If you experience any issues with the Trees not being built on your system, use this checklist to verify the problem:

  1. To gain more visibility into the process, change the logging priority in web.config from the default INFO to DEBUG:

     <root>
       <priority value="DEBUG" />
       <appender-ref ref="LogFileAppender" />
     </root>

    This produces many additional entries in the log file. To make them easier to find, all Path Analyzer specific entries are prefixed with "[Path Analyzer]".

    Revert this setting after you have solved the issue, because DEBUG logging is not intended to run in production.

  2. All agents responsible for triggering the Tree Construction process are set up to be triggered at an interval (5, 10, or 15 minutes, depending on the agent) and to execute only once in 24 hours.

    This means that you may have to either adjust the default intervals in the Sitecore.PathAnalyzer.Processing.config file or wait a few minutes before the process starts.

    If the agent has already executed within the past 24 hours, you must clear the corresponding record in the Properties table, for example "PathAnalyzer_lastrunnewmaps", before the agent can execute again.

    If you use a single-server setup with Sitecore XP 8.0 Update-2 or later, you can trigger the agent from /Sitecore/admin/pathanalyzer.aspx.

  3. Using /sitecore/admin/showconfig.aspx, check that the agent is enabled and that its interval is not disabled.

    All agents are defined in this file by default: Sitecore.PathAnalyzer.Processing.config.

  4. In a setup where more than one Processing Server is enabled, the agents may be executed on any of those servers.

  5. The fact that the "PathAnalyzer_lastrunnewmaps" time stamp is updated in the Properties table indicates that the agent actually gets executed. The agent may also be executed without doing the actual work. If the agent finds that there is no data in the xDB interaction collection to build the maps for, it will skip processing.

  6. Check the value of the records in the Properties table with the key "PathAnalyzer_busy" or "sc_isrebuilding".

    If the value for any of these properties is True, all agents will be halted, since it indicates that either historic processing is taking place or that other agents are active at the moment.

    If these flags are not reset to False for a long time, this might indicate that one of the agents is stuck and needs to be reset by changing the value of the "PathAnalyzer_busy" record manually. We highly recommend that you do not interfere with the "sc_isrebuilding" record.
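
The flag check in step 6 can be expressed as a small helper. This Python sketch assumes the Properties rows are available as a dictionary of key to string value; the function name is hypothetical, and the sketch only reports flags, it does not change them.

```python
BLOCKING_FLAGS = ("PathAnalyzer_busy", "sc_isrebuilding")

def blocking_flags(properties):
    """Return the names of the flags that are currently set to True.

    properties -- dict of Properties key -> string value (assumed shape).
    A non-empty result means the agents are halted.
    """
    return [
        key for key in BLOCKING_FLAGS
        if str(properties.get(key, "")).strip().lower() == "true"
    ]
```

Reporting the flag names, rather than a single boolean, makes it clear whether the blocking flag is the one that may be reset manually ("PathAnalyzer_busy") or the one that should be left alone ("sc_isrebuilding").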