This article provides a troubleshooting scenario for the following issue: the Apache Solr extraction module is vulnerable to XXE attacks via XFA content in PDFs. The issue pertains to the Apache Solr extraction module (Solr Cell) in versions 6.2 – 9.x and arises only if the Solr Cell extraction handler (/update/extract) is enabled and actively used for PDF content extraction.
Important: The issue affects Sitecore XP, XM, and XC (on-prem or Managed Cloud) only when Apache Solr is configured with the extraction module (Solr Cell) and processes untrusted PDFs. Standard Solr text-only indexing is not impacted.
To confirm that the solution is affected by this particular issue, you must verify that Solr Cell PDF extraction is enabled in your Solr configuration by taking these steps:
Note: Solr "collections" are also affected under the same conditions if the Solr server is running in SolrCloud mode.
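As a rough illustration of the check described above, the following Python sketch scans the contents of a solrconfig.xml file for request handlers backed by Solr's ExtractingRequestHandler class, which is the class behind the /update/extract handler. The function name find_extraction_handlers and the sample configuration are hypothetical and only for illustration. This is a minimal sketch, not an official verification procedure. It assumes the handler is declared directly in solrconfig.xml, so handlers registered by other means, such as the Solr Config API, would need to be checked separately.

```python
import xml.etree.ElementTree as ET


def find_extraction_handlers(solrconfig_xml: str) -> list[str]:
    """Return the names of request handlers that use ExtractingRequestHandler.

    Hypothetical helper: it only inspects <requestHandler> elements that are
    actually present in the given XML. Handlers that are commented out are
    not elements, so they are not reported.
    """
    root = ET.fromstring(solrconfig_xml)
    names = []
    for handler in root.iter("requestHandler"):
        # The class is typically written as
        # "solr.extraction.ExtractingRequestHandler" or with the full
        # "org.apache.solr.handler.extraction." package prefix.
        if handler.get("class", "").endswith("ExtractingRequestHandler"):
            names.append(handler.get("name", ""))
    return names


# Illustrative configuration fragment (an assumption, not a real deployment).
SAMPLE = """<config>
  <requestHandler name="/select" class="solr.SearchHandler"/>
  <requestHandler name="/update/extract"
                  class="solr.extraction.ExtractingRequestHandler"
                  startup="lazy"/>
</config>"""

if __name__ == "__main__":
    print(find_extraction_handlers(SAMPLE))
```

For a real core or collection, the file contents can be read from disk and passed to the function, or the file can be parsed directly with ET.parse. An empty result suggests that Solr Cell is not declared in that configuration file. It does not by itself prove that extraction is disabled everywhere.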
If confirmed, to mitigate the issue, follow the instructions provided here.