When dealing with a site migration that has hundreds of thousands of nodes with larger than usual field values, you might notice some performance issues.
In one recent instance, I had to write a migration for nodes that had multiple fields containing huge JSON strings and parse them. The migration itself was solid, but I kept running into memory usage warnings that would stop the migration in its tracks.
Sometime during the migration, I would see these messages:
- Memory usage is 2.57 GB (85% of limit 3.02 GB), reclaiming memory.
[warning] - Memory usage is now 2.57 GB (85% of limit 3.02 GB), not enough reclaimed, starting new batch
[warning] - Processed 1007 items (1007 created, 0 updated, 0 failed, 0 ignored) - done with 'nodes_articles'
The migration would then cease to continue importing items as if it had finished, while there were still several hundred thousand nodes left to import. Running the import again would produce the same result.
I found a few issues on drupal.org that show others have been having similar issues:
https://www.drupal.org/node/2701335
https://www.drupal.org/node/2701121
The Drupal site was up to date and the patches provided in those issues weren't working. The ideal solution would be to solve the problem so that the migrations would start back up after memory was freed, but because there wasn't enough time to dig into the cause of the issue, I opted for another solution.
Oftentimes it can be useful to create a bash script to run your migrations for you. That way you don't have to chain drush migrate-import commands together. So writing a bash script like this:
#!/usr/bin/env bash
echo "Importing users";
drush mi users;
echo "Importing terms";
drush mi terms;
echo "Importing articles";
drush mi nodes_articles;
echo "Importing others";
drush mi other_nodes;
...can help save keystrokes.
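The same idea can be written as a loop over migration IDs. This is only a sketch: `import_all` is a hypothetical helper name, and it assumes `drush mi` exits non-zero when an import fails outright, which is worth verifying for your drush version.

```shell
#!/usr/bin/env bash

# Import each migration ID given as an argument, in order.
# Stops at the first drush command that exits non-zero
# (assumption: drush signals a failed import through its exit status).
import_all() {
  local id
  for id in "$@"; do
    echo "Importing $id"
    drush mi "$id" || return 1
  done
}
```

It would be called as `import_all users terms nodes_articles other_nodes`, which keeps the import order in one place.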
When I ran into the memory issues with these larger migration items I thought it might be easier to apply a solution to the bash script since there was nothing inherently wrong with the migrations themselves.
I came up with this bash method:
migration_loop()
{
  # Get the output of the drush status.
  drush_output=$(drush ms | grep $1);

  # Split output string into an array.
  output=( $drush_output );

  # Output the status items.
  for index in "${!output[@]}"
  do
    if [ $index == "0" ]
    then
      echo "Migration: ${output[index]}";
    fi
    if [ $index == "1" ]
    then
      echo "Status: ${output[index]}";
    fi
    if [ $index == "2" ]
    then
      echo "Total: ${output[index]}";
    fi
    if [ $index == "3" ]
    then
      echo "Imported: ${output[index]}";
    fi
    if [ $index == "4" ]
    then
      echo "Remaining: ${output[index]}";
    fi
  done

  # Check if all items were imported.
  if [ "${output[4]}" == "0" ]
  then
    echo "No items left to import.";
  else
    echo "There are ${output[4]} remaining ${output[0]} items to be imported.";
    echo "Running command: drush mi $1";
    echo "...";

    # Run the migration until it stops.
    drush mi $1;

    # Run the check on this migration again.
    migration_loop $1;
  fi
}
The loop is pretty simple. It reads the output of drush migrate-status for a given migration, using grep as a filter, and prints out some information about the migration.
Based on how many items drush reports as remaining, it will either run the migration again...
Migration: nodes_articles
Status: Idle
Total: 62294
Imported: 50672
Remaining: 11622
There are 11622 remaining nodes_articles items to be imported.
Running command: drush mi nodes_articles
or end the loop...
Migration: terms
Status: Idle
Total: 8536
Imported: 8536
Remaining: 0
No items left to import.
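One caveat with the grep filter: grep matches substrings, so an ID such as `nodes` would match both the `nodes_articles` and `other_nodes` rows, and the array would then hold fields from more than one migration. A minimal sketch of an exact-match alternative using awk follows. `print_migration_status` is a hypothetical name, and the function assumes the same five-column layout the script above relies on (Migration, Status, Total, Imported, Remaining); if your `drush ms` output has different columns, the field numbers need adjusting.

```shell
#!/usr/bin/env bash

# Print labelled status fields for exactly one migration.
# Reads `drush ms`-style rows on stdin; $1 is the migration ID.
# Assumed column layout (from this post, not verified against drush):
# Migration, Status, Total, Imported, Remaining.
print_migration_status() {
  awk -v id="$1" '
    # Compare the whole first column, not a substring of the line.
    $1 == id {
      print "Migration: " $1
      print "Status: " $2
      print "Total: " $3
      print "Imported: " $4
      print "Remaining: " $5
      found = 1
    }
    # Exit non-zero when no row matched the ID.
    END { exit found ? 0 : 1 }
  '
}
```

Usage would look like `drush ms | print_migration_status nodes_articles`. Because the function exits non-zero when the ID is not found, a typo in a migration name surfaces instead of being silently treated as an empty result.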
Here is a full example of the script:
#!/usr/bin/env bash

migration_loop()
{
  # Better readability with separation.
  echo "========================";

  # Get the output of the drush status.
  drush_output=$(drush ms | grep $1);

  # Split output string into an array.
  output=( $drush_output );

  # Output the status items.
  for index in "${!output[@]}"
  do
    if [ $index == "0" ]
    then
      echo "Migration: ${output[index]}";
    fi
    if [ $index == "1" ]
    then
      echo "Status: ${output[index]}";
    fi
    if [ $index == "2" ]
    then
      echo "Total: ${output[index]}";
    fi
    if [ $index == "3" ]
    then
      echo "Imported: ${output[index]}";
    fi
    if [ $index == "4" ]
    then
      echo "Remaining: ${output[index]}";
    fi
  done

  # Check if all items were imported.
  if [ "${output[4]}" == "0" ]
  then
    echo "No items left to import.";
  else
    echo "There are ${output[4]} remaining ${output[0]} items to be imported.";
    echo "Running command: drush mi $1";
    echo "...";

    # Run the migration until it stops.
    drush mi $1;

    # Run the check on this migration again.
    migration_loop $1;
  fi
}

migration_loop users;
migration_loop terms;
migration_loop nodes_articles;
migration_loop other_nodes;
With this, you can circumvent any memory issues you may encounter with large migrations if time is limited.
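One thing to watch for: the recursive function calls itself for as long as the remaining count is not zero, so a migration that stops making progress (for example, because the remaining rows keep failing) would loop indefinitely. Below is a sketch of an iterative variant that gives up when a run does not change the remaining count. The names `remaining_items` and `migrate_until_done` are hypothetical, and the column position of "Remaining" is assumed to match the layout used in the script above.

```shell
#!/usr/bin/env bash

# Print the "Remaining" column for exactly one migration.
# Assumes the five-column `drush ms` layout used in this post:
# Migration, Status, Total, Imported, Remaining.
remaining_items() {
  drush ms | awk -v id="$1" '$1 == id { print $5 }'
}

# Re-run a migration until nothing remains, but stop if a run makes
# no progress, so a stuck migration cannot loop forever.
migrate_until_done() {
  local id="$1" before after

  before=$(remaining_items "$id")
  if [ -z "$before" ]; then
    echo "Migration $id not found in drush ms output." >&2
    return 1
  fi

  while [ "$before" != "0" ]; do
    echo "There are $before remaining $id items to be imported."

    # Exit status is ignored on purpose: progress is judged from the
    # status output, since a memory-limit stop is expected here.
    drush mi "$id" || true

    after=$(remaining_items "$id")
    if [ -z "$after" ] || [ "$after" = "$before" ]; then
      echo "No progress on $id; stopping." >&2
      return 1
    fi
    before="$after"
  done

  echo "No items left to import."
}
```

Calling `migrate_until_done nodes_articles` behaves like the recursive version when each run imports something, and returns a non-zero status when it does not, which lets a wrapper script decide whether to continue with the next migration.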