Polish language may cause indexing errors


Description

Sitecore XP automatically creates language-specific dynamic fields in the Solr schema for certain languages, but not for Polish. As a result, when you use the Solr search provider and index items in Polish, a message similar to the following might appear in the logs:

9524 2015:06:25 14:52:03 WARN Crawler : AddRecursive DoItemAdd failed - {6601373B-C31A-43D7-8DD8-6429ACA38298} Exception: SolrNet.Exceptions.SolrConnectionException Message: <?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">400</int><int name="QTime">98</int></lst>
<lst name="error"><str name="msg">ERROR: [doc=sitecore://master/{110d559f-dea5-42ea-9c1c-8a5df7e70ef9}?lang=pl-pl&amp;ver=1&amp;ndx=sitecore_master_index] unknown field 'title_t_pl'</str>
<int name="code">400</int></lst>
</response>

Solution

Option 1

  1. Decide whether language-specific analysis is required for content in Polish. The Solr schema does not contain a field type definition for analyzing Polish content. If language-specific analysis is not required, you can use the "text_general" type. Otherwise, create a new field type, using an existing type such as "text_da" as an example, and check the Solr documentation for any language-specific adjustments. A language-specific field type should be similar to the following:
    <fieldType name="text_pl" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StempelPolishStemFilterFactory"/>
      </analyzer>
    </fieldType>
  2. Create a new dynamic field:
    <dynamicField name="*_t_pl" type="text_pl" indexed="true" stored="true" />

    or (when not creating a language-specific field type):

    <dynamicField name="*_t_pl" type="text_general" indexed="true" stored="true" />
  3. Decide how to introduce the changes to the Solr schema. You can apply them directly to the Solr schema file. In this case, reload every Solr core that uses the modified schema, or restart the Solr server.
  4. Restart the Sitecore instances that reference the Solr server so that they pick up the updated Solr schema.

    Note: If you populate the Solr schema using Sitecore, the changes will be overwritten. Normally, schema population is only run during deployment of a new environment or after upgrading to a later version of the product. See Option 2 if you want to make sure that the changes cannot be accidentally overwritten.
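
The "text_pl" example in step 1 relies on solr.StempelPolishStemFilterFactory. In Solr distributions that ship a contrib directory, this filter is part of the analysis-extras contrib rather than the Solr core, so its .jar files may need to be added to the classpath before a core can load the new field type. A minimal sketch of a lib directive for solrconfig.xml is shown here. The directory path is an assumption and depends on the Solr version and installation layout:

    <!-- Assumed path: adjust to where the analysis-extras jars live in your Solr installation -->
    <lib dir="${solr.install.dir:../../../..}/contrib/analysis-extras/lucene-libs" regex=".*\.jar" />

If a core fails to load with a class-not-found error for the Stempel filter, a missing classpath entry of this kind is a likely cause. The "text_general" variant of the dynamic field needs no extra libraries.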

Option 2

Alternatively, changes to the Solr schema can be introduced by code.

  1. Customize the Solr schema population logic using the example provided in the article:
  2. Repopulate the schema.

Note

The steps given in the Solution can be used to add support for other languages.
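
As a sketch of how the same steps might carry over, suppose the log reports an unknown field ending in "_t_sk" for Slovak content. The suffix here is a hypothetical example; take the actual suffix from the field name in your own log message. Without language-specific analysis, the dynamic field from step 2 could become:

    <!-- Hypothetical suffix "_t_sk": replace it with the suffix reported in your log -->
    <dynamicField name="*_t_sk" type="text_general" indexed="true" stored="true" />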