Replace string process plugin - Migrate API

Recently I worked on a migration that moved Drupal 7 content into Drupal 9. During the testing phase, we found that some Drupal 7 fields (HTML text fields edited with CKEditor) contained absolute URIs in the source content. As a result, the migrated content still pointed to the old domain (mainly in links and files).

This issue can be addressed in multiple ways, but we decided to fix it during the migration itself, using a Migrate API process plugin. Of course, if the site were already in production, this would not be so straightforward. I decided to share the plugin because I did not find any existing solution that allows replacing content during the migration. The plugin is as simple as it gets: specify what to search for as a regular expression, and what the replacement string should be. That's it.

Usage example

field_teaser_text:
   -
      plugin: mymodule_migrate_replace
      source: field_teaser
      search: 'https?\:\/\/some-hardcoded-domain\.net\/'
      replace: /

As the example shows, the plugin finds the hardcoded domain in the HTML and replaces it with a root-relative path (a single slash). The plugin wraps the search pattern in / delimiters, so every slash inside the pattern must be escaped. By default, the replacement uses a case-insensitive expression. You can change that, or add other regex modifiers, with the modifiers option.
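Here is a sketch of a configuration that uses both optional features: a list of search patterns (described in the plugin's docblock) and the modifiers option. The www. variant of the domain is a hypothetical second pattern added only for illustration, and iu is just one possible modifier string (case-insensitive plus UTF-8 mode).

```yaml
field_teaser_text:
   -
      plugin: mymodule_migrate_replace
      source: field_teaser
      # Each pattern is applied in turn; slashes are escaped because the
      # plugin adds the "/" delimiters itself.
      search:
         - 'https?:\/\/some-hardcoded-domain\.net\/'
         # Hypothetical second pattern, for illustration only.
         - 'https?:\/\/www\.some-hardcoded-domain\.net\/'
      replace: /
      # Replaces the default "i" modifier string.
      modifiers: iu
```

Setting modifiers to an empty string ('') makes the match case-sensitive, because the default i is only used when the option is absent.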

Full plugin code

<?php

namespace Drupal\mymodule_migrate\Plugin\migrate\process;

use Drupal\Core\Entity\EntityTypeManagerInterface;
use Drupal\Core\Plugin\ContainerFactoryPluginInterface;
use Drupal\migrate\MigrateExecutableInterface;
use Drupal\migrate\Plugin\Exception\BadPluginDefinitionException;
use Drupal\migrate\ProcessPluginBase;
use Drupal\migrate\Row;
use Symfony\Component\DependencyInjection\ContainerInterface;

/**
 * Replace string matched by pattern.
 *
 * @MigrateProcessPlugin(
 *   id = "mymodule_migrate_replace"
 * )
 *
 * If multiple patterns need to be replaced, define them as a list:
 * search:
 *  - 'pattern1'
 *  - 'pattern2' ...
 *
 * @code
 * field_teaser_text:
 *  -
 *    plugin: mymodule_migrate_replace
 *    source: field_base_teaser
 *    search: 'https?\:\/\/some-hardcoded-domain\.net\/'
 *    replace: /
 * @endcode
 */
class Replace extends ProcessPluginBase implements ContainerFactoryPluginInterface {

  /**
   * The entity type manager.
   *
   * @var \Drupal\Core\Entity\EntityTypeManagerInterface
   */
  protected $entityTypeManager;

  /**
   * Constructor.
   *
   * @param array $configuration
   *   A configuration array containing information about the plugin instance.
   * @param string $plugin_id
   *   The plugin ID.
   * @param mixed $plugin_definition
   *   The plugin implementation definition.
   * @param \Drupal\Core\Entity\EntityTypeManagerInterface $entityTypeManager
   *   The entity type manager.
   */
  public function __construct(
    array $configuration,
    $plugin_id,
    $plugin_definition,
    EntityTypeManagerInterface $entityTypeManager
  ) {
    parent::__construct($configuration, $plugin_id, $plugin_definition);
    $this->entityTypeManager = $entityTypeManager;
  }

  /**
   * {@inheritdoc}
   */
  public static function create(ContainerInterface $container, array $configuration, $plugin_id, $plugin_definition) {
    return new static(
      $configuration,
      $plugin_id,
      $plugin_definition,
      $container->get('entity_type.manager')
    );
  }

  /**
   * {@inheritdoc}
   */
  public function transform($value, MigrateExecutableInterface $migrateExecutable, Row $row, $destinationProperty) {
    if (!isset($this->configuration['search'])) {
      throw new BadPluginDefinitionException($this->pluginId, 'search');
    }
    if (!isset($this->configuration['replace'])) {
      throw new BadPluginDefinitionException($this->pluginId, 'replace');
    }

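    // Formatted text fields arrive as an array. Only the 'value' key is
    // processed and returned, so the text format is not carried over.
    if (is_array($value) && isset($value['value'])) {
      return $this->processValue($value['value']);
    }
    // Plain, non-empty strings are processed directly.
    if (is_string($value) && $value) {
      $value = $this->processValue($value);
    }

    // Any other value passes through unchanged.
    return $value;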
  }

  /**
   * Search and replace string.
   *
   * @param string $value
   *   Value to process.
   *
   * @return string
   *   Replaced string.
   */
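  protected function processValue(string $value) : string {
    // Default to case-insensitive matching unless modifiers are configured.
    $modifiers = $this->configuration['modifiers'] ?? 'i';
    // Accept either a single pattern or a list of patterns.
    $pattern = $this->configuration['search'];
    $pattern = is_array($pattern) ? $pattern : [$pattern];
    foreach ($pattern as $regexPattern) {
      // The pattern is wrapped in "/" delimiters here, so any "/" inside it
      // must already be escaped in the migration YAML.
      $result = preg_replace('/' . $regexPattern . '/' . $modifiers, $this->configuration['replace'], $value);
      // preg_replace() returns NULL on error (for example, an invalid
      // pattern), which would violate the string return type.
      if ($result === NULL) {
        throw new \Drupal\migrate\MigrateException(sprintf('Replacement failed for pattern "%s".', $regexPattern));
      }
      $value = $result;
    }

    return $value;
  }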

}
