8.0.x Upgrade Guide (Azure)

Pre-Upgrade Process

DataForge

DataForge no longer allows certain type casts directly in output column mappings, to prevent data loss from mismatched data types. Reach out to DataForge support for a query that identifies the data type changes needed. These changes can be made before or after the upgrade, but output processes will fail until they are made.

Databricks

The previous SDK is no longer supported. All custom notebooks must now use the DataForge SDK. Custom processes that do not use the DataForge SDK will fail.

Follow the DataForge SDK migration guide for more information on switching notebook and cluster references.

Terraform

  • Add a Terraform variable for "postgres_update" and set the value to "yes"
  • Add a Terraform variable for "instanceType" and set the value to "Standard_DS3_v2"

The "postgres_update" variable ensures the Postgres upgrade to version 16.1 occurs. Do not remove this variable after the upgrade, or future Terraform runs will attempt to delete and recreate the database.
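Assuming the variables are supplied through a `terraform.tfvars` file (your deployment may pass them differently, e.g. via `-var` flags or environment variables), the additions might look like:

```hcl
# terraform.tfvars -- hypothetical layout; match your existing variable file

# Triggers the Postgres 16.1 upgrade; leave this in place after the upgrade
postgres_update = "yes"

# Azure VM size
instanceType = "Standard_DS3_v2"
```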

Terraform and Databricks

If you previously ran Terraform using Databricks Username and Password variables, you'll need to migrate to a Service Principal and secret. Databricks stopped supporting Databricks-managed passwords on July 10th. Follow the Updating Terraform for Databricks Authentication guide to complete this transition. If this is not completed, Terraform runs will fail to apply due to authentication issues.
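A minimal sketch of the provider change, assuming the Databricks Terraform provider with Azure service principal authentication and credentials passed in as variables (variable names here are placeholders; follow the linked guide for the authoritative steps):

```hcl
# Before: Databricks-managed username/password (no longer supported)
# provider "databricks" {
#   host     = var.databricks_host
#   username = var.databricks_username
#   password = var.databricks_password
# }

# After: Azure service principal and secret
provider "databricks" {
  host                = var.databricks_host
  azure_client_id     = var.client_id
  azure_client_secret = var.client_secret
  azure_tenant_id     = var.tenant_id
}
```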

Docker

The DataForge team will need to invite your original Docker user to a new Docker Hub. Accept the Docker invitation from the original email address used in Terraform, or sign up for a new Docker account and provide the new email address so the DataForge team can invite it.

Azure Resource Group

DataForge recommends turning off Azure resources for API, Core, and Agent prior to starting the upgrade. The Postgres metadata database will upgrade to version 16.1. Turning off Azure resources ensures no processes will attempt to run during the database upgrade.
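One way to stop these resources ahead of the upgrade, assuming the API and Core run as Azure Web Apps and the Agent as a VM (the resource group and resource names below are placeholders; substitute your own):

```shell
# Hypothetical names -- replace with your actual resource group and resource names
RG="dataforge-rg"

# Stop the API and Core web apps so no processes start mid-upgrade
az webapp stop --name dataforge-api  --resource-group "$RG"
az webapp stop --name dataforge-core --resource-group "$RG"

# Deallocate the Agent VM
az vm deallocate --name dataforge-agent --resource-group "$RG"
```

Restart the resources (`az webapp start`, `az vm start`) only after the database upgrade has completed.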

Upgrade Process

After completing the Pre-Upgrade Process steps, follow the standard upgrade guide below to complete the upgrade. Once the upgrade is complete, proceed to the post-upgrade steps below.

Azure Upgrade

Post-Upgrade Process

Open Cluster Configurations (System Configurations -> Cluster Configurations) and check for any clusters renamed to "CHECK DATABRICKS VERSION ...". For each such cluster, open Parameters -> Cluster Configuration, change the Spark Version to "14.3.x-scala2.12", and save the changes.
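For reference, the relevant field in the cluster configuration JSON looks like the following (other fields omitted; the exact layout depends on your cluster definition):

```json
{
  "spark_version": "14.3.x-scala2.12"
}
```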

Confirm the environment is up and functioning as usual. Submit a support request if anything is not working as intended.