1 of 30

AMA: Migrating 1M entities from D7 to D9 using Drush and Migrate API

JD Leonard, Freelance Senior Drupal Architect, Backend Developer, and Consultant

jdleonard.net/drupal

DrupalNYC Meetup

July 7, 2021

2 of 30

Overview

3 of 30

About JD

  • Lives in Jersey City, NJ
  • Long-time Drupal developer (over 15 years!)
  • Professional freelancer for over 9 years
  • Focuses on complex web application development using Drupal
  • Volunteers in the DrupalNYC and Drupal Event Organizers communities

4 of 30

Goals for Today

  • (20%) Explain the project requirements, my approach, and the lessons learned from rebuilding the Drupal 7 site GilderLehrman.org on Drupal 9 and migrating its data
  • (80%) Answer your questions (“Ask me anything”)
    • We can dive into:
      • Contrib modules used
      • Custom code
      • Custom migration configuration files
      • Drush command line usage
      • Staying organized
      • Whatever you can think of

5 of 30

The Gilder Lehrman Institute of American History

  • GLI
  • Founded 1994
  • Leading nonprofit dedicated to K–12 history education while also serving the general public
  • Mission
    • To promote the knowledge and understanding of American history through educational programs and resources
  • Provides access to primary source documents in the Gilder Lehrman Collection
    • Located online and on the lower level of the New-York Historical Society on the Upper West Side
  • Programs have been recognized by awards from
    • White House
    • National Endowment for the Humanities
    • Organization of American Historians
    • Council of Independent Colleges

6 of 30

Project Requirements

7 of 30

Overall Goals

  • Replatform from Drupal 7 + CiviCRM + Ubercart + Solr (all on DigitalOcean)
    • To Drupal 9 (on Pantheon) + Salesforce + Soapbox Engage + TalentLMS + OpenSolr
  • Minimize costs
    • Minimize changes to existing functionality to reduce scope
    • Move key functionality and technology to platforms that are easier for GLI staff to maintain
  • Allow GLI staff to focus on existing website and other operations during the upgrade
  • Minimize disruption to GLI programs and their participants
  • Embrace best practices

8 of 30

Needed Migration to Drupal 9

  • Blocks (manual)
  • Content types (14/33)
  • Entity view modes
  • Field collections
  • Fields (118/227)
  • Files (~100k, ~71 GB)
  • Filter formats
  • Image styles
  • Menus
  • Menu links
  • Nodes (~75k)
  • Path aliases
  • Pathauto patterns
  • Redirects (~130k)
  • Rules (manual)
  • Taxonomy terms (~475k)
  • Taxonomy vocabularies (19)
  • Text formats (3/4)

9 of 30

Needed Migration to Drupal 9

  • Users (~425k)
  • User roles (31/42)
  • View modes
  • Views (manual)
  • Google Analytics (manual)
  • Publishing workflow (manual)
  • Site search (manual)
  • SendGrid (manual)
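
As a quick check on the "1M entities" in the title, the approximate counts from the two lists above can be added up. This is only arithmetic on the rounded figures shown on the slides; whether the title was meant to include redirects is an assumption, so both totals are computed.

```python
# Approximate counts copied from the "Needed Migration to Drupal 9" lists.
approx_counts = {
    "files": 100_000,
    "nodes": 75_000,
    "redirects": 130_000,
    "taxonomy_terms": 475_000,
    "users": 425_000,
}

total = sum(approx_counts.values())
without_redirects = total - approx_counts["redirects"]

print(f"all five types:      ~{total:,}")
print(f"excluding redirects: ~{without_redirects:,}")
```

Either way the total lands a little above one million (about 1.2M with redirects, about 1.08M without).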

10 of 30

New Work

  • Improve user registration and user profile editing
  • Salesforce integration
    • New taxonomy vocabulary for schools (sync from Salesforce)
    • Users/Contacts (two-way sync)
  • Single sign-on (SSO) with TalentLMS using SimpleSAMLphp
  • New Paragraphs-based content type for landing pages
  • New content type and custom logic to pre-fill Soapbox Engage forms
  • OpenSolr for site search

11 of 30

New Work

  • Pantheon setup
  • Miscellaneous module configuration to replace functionality provided by D7 modules not available in D9
  • Best practices
    • Backups - Pantheon nightly
    • Caching (Pantheon Advanced Page Cache, Redis object cache)
    • CI/CD - GitHub Actions from GitHub to Pantheon
    • Config Split for per-environment configuration
    • Dependency and patch management using Composer
    • Spam prevention - Honeypot
    • Stage File Proxy

12 of 30

Out of Scope for Me

  • Self-paced courses
    • Moved to TalentLMS (a Learning Management System)
  • Shop
    • Purchase self-paced courses and subscriptions/memberships
    • Moved to Soapbox Engage Shop (integrates with Salesforce)
  • Webforms
    • Moved to Soapbox Engage Forms (integrates with Salesforce)
  • CiviCRM
    • Members, program participants, donors, event attendance, etc.
    • Migrated to Salesforce
  • Front-end theme

13 of 30

Our Approach

14 of 30

Sequencing

  • Continuous discovery
    • GLI team had a full plate of work
    • Concurrent discovery and implementation of CiviCRM to Salesforce migration
  • Identify contrib module compatibility
    • Input into whether to build on D8 or D9
    • Identify or write needed patches
    • 227 modules enabled in D7 -> 151 in D9 (144 in production)
  • Inventory site
    • Especially looking for things that don’t need to be migrated to reduce scope
    • Evaluate custom module logic that needs to be reimplemented
    • Which content types, fields, taxonomies, etc. need migration

15 of 30

Sequencing

  • Set up GitHub for source control and issue tracking
  • Set up Lando for D9 site
  • Begin D9 site build and CI/CD pipeline
  • Train GLI staff on D9 development workflows (e.g. configuration management)
  • Set up Lando for D7 site (migration source during development)
  • Set up a Pantheon MultiDev environment as a migration source for the real migration
  • Implement and test logic to migrate site structure/configuration (e.g. content types, field configurations)
  • Execute structure migration

16 of 30

Sequencing

  • Begin third-party integrations (Salesforce, TalentLMS SSO, Soapbox pre-fill redirects)
  • Onboard front-end contractors (e.g. Lando, Pantheon, GitHub, CI/CD)
  • Implement and test logic to migrate site content (e.g. nodes, taxonomy terms, users, files)
  • Execute temporary partial migration on a Pantheon MultiDev environment to keep front-end team unblocked
  • Execute initial content migration in the Live environment (or locally, where Pantheon's memory constraints required it)
  • Execute incremental content migrations
  • Implement and test logic to import CiviCRM contact data from Salesforce export to D9 user fields
  • Execute import of contact data from Salesforce to D9
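
Incremental runs like the ones listed above depend on the migration knowing which source rows are new or changed. Migrate API source plugins offer two mechanisms for this, `high_water_property` and `track_changes`; the slides do not say which one this project relied on. The following is a small Python illustration of the high-water idea only, using hypothetical row data rather than anything from the real site.

```python
from dataclasses import dataclass


@dataclass
class SourceRow:
    id: int
    changed: int  # last-modified timestamp on the source site (hypothetical values)


def rows_to_import(rows, high_water):
    """Return the rows changed after the stored mark, plus the new mark."""
    pending = sorted(
        (row for row in rows if row.changed > high_water),
        key=lambda row: row.changed,
    )
    new_mark = pending[-1].changed if pending else high_water
    return pending, new_mark


# First run: nothing has been imported yet, so every row qualifies.
rows = [SourceRow(1, 100), SourceRow(2, 250), SourceRow(3, 300)]
first_batch, mark = rows_to_import(rows, high_water=0)

# Later: one new row appears and row 1 is edited on the source site.
rows.append(SourceRow(4, 400))
rows[0] = SourceRow(1, 450)
second_batch, mark2 = rows_to_import(rows, high_water=mark)

print([row.id for row in first_batch], mark)
print([row.id for row in second_batch], mark2)
```

The second run touches only the new and edited rows, which is why an incremental pass over hundreds of thousands of users can finish in minutes.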

17 of 30

Sequencing

  • Execute final migration from D7
  • Go live! (June 7)
  • Post-launch support

18 of 30

Final Migration Steps (timings from an earlier incremental migration)

  • Started around 10am
  • [7m] Copy files from D7 production to local using rsync (incremental)
  • [1m] Copy files from local to D9 live (incremental) using rsync
  • [3m] Export database on D7 production using drush sql:dump
  • [1m] Copy database export from D7 production to local using rsync
  • Unzip database export
  • Drop database on Pantheon MultiDev environment
  • [39m] Import D7 database export to Pantheon MultiDev environment
  • Enable Maintenance Mode in Pantheon Live (D9) environment (D7 remains available throughout)
  • [22m] Execute users migration (incremental)

19 of 30

Final Migration Steps (timings from an earlier incremental migration)

  • [1m] Execute file migration (incremental)
  • [13m] Execute taxonomy term migration for vocabularies with few terms (full)
  • [8m] Execute node migration (incremental)
  • [1m] Execute path redirect, URL alias, and menu link migration (incremental)
  • Enable Salesforce integration
  • Disable Maintenance Mode
  • Index unindexed Solr items
  • Index unindexed XML Sitemap entries
  • Apply manual content updates
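
The ordering of the steps above (maintenance mode on, migrations in dependency order, maintenance mode off) could be scripted roughly as follows. This is a sketch, not the script used on the project: the migration IDs are hypothetical placeholders, and the Drush commands are assumed to run wherever the D9 site's Drush is available. `state:set system.maintenance_mode` and `migrate:import` are standard Drush and Migrate Tools commands.

```python
import shlex
import subprocess

# Hypothetical migration IDs; real IDs come from the generated migration config.
MIGRATIONS = [
    "users",
    "files",
    "taxonomy_terms_small_vocabularies",
    "nodes",
    "path_redirects",
    "url_aliases",
    "menu_links",
]


def maintenance_mode(enabled):
    """Build the Drush command that switches maintenance mode on or off."""
    value = "1" if enabled else "0"
    return ["drush", "state:set", "system.maintenance_mode", value,
            "--input-format=integer"]


def build_steps(migrations=MIGRATIONS):
    """Maintenance mode on, each migration in order, maintenance mode off."""
    steps = [maintenance_mode(True)]
    steps += [["drush", "migrate:import", migration] for migration in migrations]
    # Enabling the Salesforce integration would happen here, before reopening.
    steps.append(maintenance_mode(False))
    return steps


def run(steps, runner=subprocess.run):
    """Run steps in order; a failing command raises and stops the sequence."""
    for argv in steps:
        runner(argv, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in build_steps():
        print(shlex.join(argv))
```

Because `run` stops at the first failure, the site stays in maintenance mode if a migration breaks, which avoids exposing a half-migrated site.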

20 of 30

Final Migration Steps (timings from an earlier incremental migration)

  • Smoke testing
  • Cut over DNS
  • Final testing
  • Signed off for the night at 10:30pm
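
Adding up the bracketed timings from the steps above shows where the machine time went. The figures are the ones on the slides, which come from an earlier incremental run, so the final day's numbers may have differed.

```python
# (step, minutes) pairs copied from the bracketed timings on the slides.
timed_steps = [
    ("rsync files: D7 production -> local", 7),
    ("rsync files: local -> D9 live", 1),
    ("drush sql:dump on D7 production", 3),
    ("rsync database export to local", 1),
    ("import D7 database into MultiDev", 39),
    ("users migration", 22),
    ("file migration", 1),
    ("taxonomy term migration", 13),
    ("node migration", 8),
    ("redirect, alias, and menu link migration", 1),
]

total = sum(minutes for _, minutes in timed_steps)
slowest = max(timed_steps, key=lambda step: step[1])
share = slowest[1] / total

print(f"total timed work: {total} min ({total // 60}h {total % 60}m)")
print(f"slowest step: {slowest[0]} ({slowest[1]} min, {share:.0%} of the total)")
```

The timed steps come to 96 minutes, with the database import alone taking about 41% of that. The rest of the 10am to 10:30pm day went to the untimed steps: manual content updates, indexing, testing, and the DNS cutover.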

21 of 30

Timeline

  • Early November - Initial discovery and estimation
  • Mid November - Project kickoff
  • Early December - First git commits
  • Late January - Migrated structure/configuration
  • Late March - Partial content migration for front-end team
  • Late April - Initial content migration
  • Early June - Final migration and go-live

22 of 30

Hours per Week (~380 hours total, including Salesforce)

23 of 30

Tools

  • Lando
  • Pantheon’s Terminus CLI
  • Drush
  • Migrate Tools module for Drush support
  • Migrate Plus module for additional migrate plugins
  • Migrate Upgrade module for its drush command to generate migration configuration YAML files
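
To show how these tools fit together, here is a sketch that builds the relevant command lines as argument lists. The database key `migrate`, the legacy root URL, and the `mysite.live` site/environment name are hypothetical. The commands and options themselves (`migrate:upgrade` with `--legacy-db-key`, `--legacy-root`, and `--configure-only`; `migrate:status`; `migrate:import` with `--update`; `terminus remote:drush`) are the documented ones from Migrate Upgrade, Migrate Tools, and Terminus.

```python
import shlex


def generate_config(legacy_db_key, legacy_root):
    """Migrate Upgrade: write migration config YAML without importing anything."""
    return [
        "drush", "migrate:upgrade",
        f"--legacy-db-key={legacy_db_key}",
        f"--legacy-root={legacy_root}",
        "--configure-only",
    ]


def status():
    """Migrate Tools: list migrations with their imported/unprocessed counts."""
    return ["drush", "migrate:status"]


def import_migration(migration_id, update=False):
    """Migrate Tools: run one migration; --update re-imports existing rows."""
    argv = ["drush", "migrate:import", migration_id]
    if update:
        argv.append("--update")
    return argv


def on_pantheon(site_env, drush_argv):
    """Wrap a local Drush command so Terminus runs it on a Pantheon environment."""
    return ["terminus", "remote:drush", site_env, "--", *drush_argv[1:]]


if __name__ == "__main__":
    commands = [
        generate_config("migrate", "https://d7.example.org"),  # hypothetical values
        status(),
        import_migration("upgrade_d7_user"),  # ID assumes the default "upgrade_" prefix
        on_pantheon("mysite.live", status()),  # hypothetical site.env
    ]
    for argv in commands:
        print(shlex.join(argv))
```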

24 of 30

Lessons Learned

25 of 30

Salesforce Integrations are Difficult

  • The Salesforce Drupal module does a lot well
  • It doesn’t properly detect the deletion of Salesforce objects
  • It doesn’t support syncing multi-value fields
  • We decided on an ugly workaround to sync user roles being added/removed from a user in Drupal or a contact in Salesforce
  • Syncing structured validated address data (e.g. ISO state and country abbreviations) requires the Address module and enabling Salesforce’s state/country picklists, which complicates importing address data into Salesforce

26 of 30

Migrations on Pantheon are Difficult

  • Pantheon is optimized for serving web pages, not executing migration tasks
  • Executing the migrations locally took about one-third of the time
  • Encountered out-of-memory errors on Pantheon
    • Some I was able to work around
    • For others, I ended up performing the migration locally and pushing my local database up to Pantheon
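
The slides do not say which workarounds were used. One approach often tried for memory pressure is to split a large migration into several short Drush runs with Migrate Tools' `--limit` option, so each PHP process exits (and frees its memory) before the next starts. The sketch below only plans such runs; the migration ID and the batch size are hypothetical, and it assumes each run picks up the rows not yet imported.

```python
import math


def chunked_import_commands(migration_id, total_rows, limit):
    """Plan enough `migrate:import --limit=N` runs to cover total_rows."""
    runs = math.ceil(total_rows / limit)
    return [
        ["drush", "migrate:import", migration_id, f"--limit={limit}"]
        for _ in range(runs)
    ]


# ~425k users (from the slides) in hypothetical batches of 10,000 rows.
commands = chunked_import_commands("upgrade_d7_user", 425_000, 10_000)
print(f"{len(commands)} runs of: {' '.join(commands[0])}")
```

For the roughly 425k users this plans 43 runs. Whether that fits inside a hosting platform's per-process time and memory limits would still need testing.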

27 of 30

Sometimes the Wrong Way is the Right Way

  • Importing Salesforce contact data from CSV
    • Originally implemented using Feeds module
    • Took too long
    • Ended up crafting SQL INSERT statements using Excel formulas, which worked great
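
The same idea, turning each CSV row into an INSERT statement, can also be scripted. The sketch below is an alternative to the Excel formulas, not a record of them: the CSV columns, the field name `field_school_name`, and the `en` langcode are hypothetical, and the column list assumes the usual Drupal 8+ dedicated field table layout. Writing straight to field tables bypasses the entity API (no validation, hooks, or cache invalidation), so caches need rebuilding afterwards.

```python
import csv
import io

TABLE = "user__field_school_name"         # hypothetical field table
VALUE_COLUMN = "field_school_name_value"  # hypothetical value column


def sql_literal(value):
    """Quote a string as a MySQL literal, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return f"'{escaped}'"


def insert_statements(csv_text):
    """Yield one INSERT statement per CSV row (expects `uid` and `school` columns)."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        uid = int(row["uid"])  # int() raises on anything that is not an integer
        yield (
            f"INSERT INTO {TABLE} "
            f"(bundle, deleted, entity_id, revision_id, langcode, delta, {VALUE_COLUMN}) "
            f"VALUES ('user', 0, {uid}, {uid}, 'en', 0, {sql_literal(row['school'])});"
        )


sample = "uid,school\n12,Lincoln High School\n13,St. Mary's Academy\n"
statements = list(insert_statements(sample))
for statement in statements:
    print(statement)
```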

28 of 30

Test Cookie Behaviors

  • After go-live, users had trouble logging in due to cookies set by D7 on the primary domain
  • Solution was for users to clear their browser’s cookies
  • User (and technical support) stress could have been reduced had this been caught and worked around in advance
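
One way such a problem could be worked around ahead of launch is to have the new site expire the stale cookie itself. A browser drops a cookie when it receives a `Set-Cookie` header with the same name, domain, and path and an expiry in the past. The sketch below builds that header value with Python's standard library; the cookie name and domain are hypothetical, since the slides do not name the cookies involved (Drupal session cookie names start with `SESS` or `SSESS` followed by a hash).

```python
from http.cookies import SimpleCookie


def expire_cookie_header(name, domain, path="/"):
    """Build a Set-Cookie header value that tells the browser to drop a cookie."""
    cookie = SimpleCookie()
    cookie[name] = "deleted"
    cookie[name]["domain"] = domain  # must match the domain the old site used
    cookie[name]["path"] = path      # must match the original path as well
    cookie[name]["max-age"] = 0
    cookie[name]["expires"] = "Thu, 01 Jan 1970 00:00:00 GMT"
    return cookie[name].OutputString()


# Hypothetical D7 session cookie that was set on the parent domain.
header = expire_cookie_header("SESSabc123", ".example.org")
print("Set-Cookie:", header)
```

Testing a login on the new site with a browser profile that still holds the old site's cookies would show whether a header like this is needed.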

29 of 30

Question Time

30 of 30

Thank you!
