
Allow in-place column drop for bigquery table #10170

Merged: 10 commits, Apr 4, 2024
@@ -256,19 +256,16 @@ func bigQueryTableNormalizePolicyTags(val interface{}) interface{} {

// Compares two existing schema implementations and decides if
// it is changeable. Pairs with a force new on not changeable
func resourceBigQueryTableSchemaIsChangeable(old, new interface{}) (bool, error) {
func resourceBigQueryTableSchemaIsChangeable(old, new interface{}, topLevel bool) (bool, error) {
switch old.(type) {
case []interface{}:
arrayOld := old.([]interface{})
arrayNew, ok := new.([]interface{})
droppedColumns := 0
if !ok {
// if not both arrays not changeable
return false, nil
}
if len(arrayOld) > len(arrayNew) {
// if not growing not changeable
return false, nil
}
if err := bigQueryTablecheckNameExists(arrayOld); err != nil {
return false, err
}
@@ -289,16 +286,21 @@ func resourceBigQueryTableSchemaIsChangeable(old, new interface{}) (bool, error)
}
}
for key := range mapOld {
// all old keys should be represented in the new config
// dropping top level columns can happen in-place
if _, ok := mapNew[key]; !ok {
return false, nil
if !topLevel {
return false, nil
}
droppedColumns += 1
continue
}
if isChangable, err :=
resourceBigQueryTableSchemaIsChangeable(mapOld[key], mapNew[key]); err != nil || !isChangable {
resourceBigQueryTableSchemaIsChangeable(mapOld[key], mapNew[key], false); err != nil || !isChangable {
return false, err
}
Member:

I thought we'd be updating this as we look through mapOld and comparing key to mapNew - could you explain how this is working as intended?

Contributor Author:

Not sure I understand the question correctly. We are going through the keys in mapOld and comparing to mapNew, but we're doing it recursively because of the nature of the schema (nested interfaces). We immediately return false (not changeable) when we encounter an unchangeable case, for example a column that had its type changed.

Member:

I was referring to this block:

if topLevel {
  sameNameColumns += 1
}

At a glance, in a single given level, top or nested, sameNameColumns will be 0 or 1. Additional comments on the variables and the logic here would help.

Contributor Author:

sameNameColumns is always 0 for any nested level, and could be 0 or more at the top level. I refactored this part of the code; it should still be the same logic, but I think it's more readable now and less prone to accidental logic-breaking changes.

}
return true, nil
// only pure column drops allowed
return (droppedColumns == 0) || (len(arrayOld) == len(arrayNew)+droppedColumns), nil
Member:

Do you think this logic would be more self-descriptive if we also tracked
e.g. "sameColumns" as we loop through the keys and just check old map size against the sum here instead?

Contributor Author (@obada-ab, Mar 22, 2024):

I think "changeableColumns" would be more precise, since changing a column's mode would be allowed alongside column drops. Let me know if you think we should change this to only column drops without other in-place changes (besides adding columns).

and just check old map size against the sum here instead?

Hmm, not sure if I get it. If the condition is something like

(droppedColumns == 0) || (len(arrayOld) == changeableColumns+droppedColumns)

then it wouldn't work correctly. For example, when changing {col1, col2, col3} to {col1, col4, col5} the condition would result in true (changeableColumns=1, droppedColumns=2).

I think we can do something like:

newColumns := len(arrayNew) - changeableColumns
return (droppedColumns == 0) || (newColumns == 0), nil

what do you think?

Member:

Let's still include other in-place changes. Naming is hard though - the columns would have the same name but maybe different other attributes. "SameNameColumns"?

Your updated logic looks good.
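The condition this thread converges on can be sketched in isolation. This is an illustrative helper, not the provider's actual function; `pureColumnDrop` and the column names are made up for the example:

```go
package main

import "fmt"

// pureColumnDrop mirrors the discussed check: a shrinking schema is only
// changeable in place when every missing name is a pure drop, i.e. the old
// length equals the new length plus the number of dropped columns, so no
// renames or additions are hiding among the removals.
func pureColumnDrop(oldCols, newCols []string) bool {
	newSet := map[string]bool{}
	for _, c := range newCols {
		newSet[c] = true
	}
	dropped := 0
	for _, c := range oldCols {
		if !newSet[c] {
			dropped++
		}
	}
	return dropped == 0 || len(oldCols) == len(newCols)+dropped
}

func main() {
	// Rename disguised as drop+add: dropped=2 but 3 != 3+2, so rejected.
	fmt.Println(pureColumnDrop([]string{"col1", "col2", "col3"}, []string{"col1", "col4", "col5"})) // false
	// Pure drop of col2 and col3: dropped=2 and 3 == 1+2, so allowed.
	fmt.Println(pureColumnDrop([]string{"col1", "col2", "col3"}, []string{"col1"})) // true
}
```

The rename example from the thread is exactly the case the length equality rules out.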

case map[string]interface{}:
objectOld := old.(map[string]interface{})
objectNew, ok := new.(map[string]interface{})
@@ -337,7 +339,7 @@ func resourceBigQueryTableSchemaIsChangeable(old, new interface{}) (bool, error)
return false, nil
}
case "fields":
return resourceBigQueryTableSchemaIsChangeable(valOld, valNew)
return resourceBigQueryTableSchemaIsChangeable(valOld, valNew, false)

// other parameters: description, policyTags and
// policyTags.names[] are changeable
@@ -376,7 +378,7 @@ func resourceBigQueryTableSchemaCustomizeDiffFunc(d tpgresource.TerraformResourc
// same as above
log.Printf("[DEBUG] unable to unmarshal json customized diff - %v", err)
}
isChangeable, err := resourceBigQueryTableSchemaIsChangeable(old, new)
isChangeable, err := resourceBigQueryTableSchemaIsChangeable(old, new, true)
if err != nil {
return err
}
Expand Down Expand Up @@ -1703,6 +1705,12 @@ func resourceBigQueryTableRead(d *schema.ResourceData, meta interface{}) error {
return nil
}

type TableInfo struct {
project string
datasetID string
tableID string
}

func resourceBigQueryTableUpdate(d *schema.ResourceData, meta interface{}) error {
config := meta.(*transport_tpg.Config)
userAgent, err := tpgresource.GenerateUserAgentString(d, config.UserAgent)
@@ -1725,13 +1733,62 @@ func resourceBigQueryTableUpdate(d *schema.ResourceData, meta interface{}) error
datasetID := d.Get("dataset_id").(string)
tableID := d.Get("table_id").(string)

tableInfo := &TableInfo{
project: project,
datasetID: datasetID,
tableID: tableID,
}

if err = resourceBigQueryTableColumnDrop(config, userAgent, table, tableInfo); err != nil {
return err
}

if _, err = config.NewBigQueryClient(userAgent).Tables.Update(project, datasetID, tableID, table).Do(); err != nil {
return err
}

return resourceBigQueryTableRead(d, meta)
}

func resourceBigQueryTableColumnDrop(config *transport_tpg.Config, userAgent string, table *bigquery.Table, tableInfo *TableInfo) error {
oldTable, err := config.NewBigQueryClient(userAgent).Tables.Get(tableInfo.project, tableInfo.datasetID, tableInfo.tableID).Do()
if err != nil {
return err
}

newTableFields := map[string]bool{}
for _, field := range table.Schema.Fields {
newTableFields[field.Name] = true
}

droppedColumns := []string{}
for _, field := range oldTable.Schema.Fields {
if !newTableFields[field.Name] {
droppedColumns = append(droppedColumns, field.Name)
}
}

if len(droppedColumns) > 0 {
droppedColumnsString := strings.Join(droppedColumns, ", DROP COLUMN ")

dropColumnsDDL := fmt.Sprintf("ALTER TABLE `%s.%s.%s` DROP COLUMN %s", tableInfo.project, tableInfo.datasetID, tableInfo.tableID, droppedColumnsString)
log.Printf("[INFO] Dropping columns in-place: %s", dropColumnsDDL)

useLegacySQL := false
req := &bigquery.QueryRequest{
Query: dropColumnsDDL,
UseLegacySql: &useLegacySQL,
}

_, err = config.NewBigQueryClient(userAgent).Jobs.Query(tableInfo.project, req).Do()
if err != nil {
return err
}
}

return nil
}

func resourceBigQueryTableDelete(d *schema.ResourceData, meta interface{}) error {
if d.Get("deletion_protection").(bool) {
return fmt.Errorf("cannot destroy instance without setting deletion_protection=false and running `terraform apply`")
@@ -403,7 +403,7 @@ func (testcase *testUnitBigQueryDataTableJSONChangeableTestCase) check(t *testin
if err := json.Unmarshal([]byte(testcase.jsonNew), &new); err != nil {
t.Fatalf("unable to unmarshal json - %v", err)
}
changeable, err := resourceBigQueryTableSchemaIsChangeable(old, new)
changeable, err := resourceBigQueryTableSchemaIsChangeable(old, new, true)
if err != nil {
t.Errorf("%s failed unexpectedly: %s", testcase.name, err)
}
@@ -428,7 +428,7 @@ func (testcase *testUnitBigQueryDataTableJSONChangeableTestCase) check(t *testin
}
}

var testUnitBigQueryDataTableIsChangableTestCases = []testUnitBigQueryDataTableJSONChangeableTestCase{
var testUnitBigQueryDataTableIsChangeableTestCases = []testUnitBigQueryDataTableJSONChangeableTestCase{
{
name: "defaultEquality",
jsonOld: "[{\"name\": \"someValue\", \"type\" : \"INTEGER\", \"mode\" : \"NULLABLE\", \"description\" : \"someVal\" }]",
@@ -445,7 +445,7 @@ var testUnitBigQueryDataTableIsChangableTestCases = []testUnitBigQueryDataTableJ
name: "arraySizeDecreases",
jsonOld: "[{\"name\": \"someValue\", \"type\" : \"INTEGER\", \"mode\" : \"NULLABLE\", \"description\" : \"someVal\" }, {\"name\": \"asomeValue\", \"type\" : \"INTEGER\", \"mode\" : \"NULLABLE\", \"description\" : \"someVal\" }]",
jsonNew: "[{\"name\": \"someValue\", \"type\" : \"INTEGER\", \"mode\" : \"NULLABLE\", \"description\" : \"someVal\" }]",
changeable: false,
changeable: true,
},
{
name: "descriptionChanges",
@@ -523,6 +523,24 @@ var testUnitBigQueryDataTableIsChangableTestCases = []testUnitBigQueryDataTableJ
jsonNew: "[{\"name\": \"value3\", \"type\" : \"BOOLEAN\", \"mode\" : \"NULLABLE\", \"description\" : \"newVal\" }, {\"name\": \"value1\", \"type\" : \"INTEGER\", \"mode\" : \"NULLABLE\", \"description\" : \"someVal\" }]",
changeable: false,
},
{
name: "renameRequiredColumn",
jsonOld: "[{\"name\": \"value1\", \"type\" : \"INTEGER\", \"mode\" : \"REQUIRED\", \"description\" : \"someVal\" }]",
jsonNew: "[{\"name\": \"value3\", \"type\" : \"INTEGER\", \"mode\" : \"REQUIRED\", \"description\" : \"someVal\" }]",
changeable: false,
},
{
name: "renameNullableColumn",
jsonOld: "[{\"name\": \"value1\", \"type\" : \"INTEGER\", \"mode\" : \"NULLABLE\", \"description\" : \"someVal\" }]",
jsonNew: "[{\"name\": \"value3\", \"type\" : \"INTEGER\", \"mode\" : \"NULLABLE\", \"description\" : \"someVal\" }]",
changeable: false,
},
{
name: "typeModeReqToNullAndColumnDropped",
jsonOld: "[{\"name\": \"someValue\", \"type\" : \"BOOLEAN\", \"mode\" : \"REQUIRED\", \"description\" : \"someVal\" }, {\"name\": \"someValue2\", \"type\" : \"BOOLEAN\", \"mode\" : \"NULLABLE\", \"description\" : \"someVal\" }]",
jsonNew: "[{\"name\": \"someValue\", \"type\" : \"BOOLEAN\", \"mode\" : \"NULLABLE\", \"description\" : \"some new value\" }]",
changeable: true,
},
{
name: "policyTags",
jsonOld: `[
@@ -548,16 +566,26 @@ var testUnitBigQueryDataTableIsChangableTestCases = []testUnitBigQueryDataTableJ
},
}

func TestUnitBigQueryDataTable_schemaIsChangable(t *testing.T) {
func TestUnitBigQueryDataTable_schemaIsChangeable(t *testing.T) {
t.Parallel()
for _, testcase := range testUnitBigQueryDataTableIsChangableTestCases {
for _, testcase := range testUnitBigQueryDataTableIsChangeableTestCases {
testcase.check(t)
testcaseNested := &testUnitBigQueryDataTableJSONChangeableTestCase{
testcase.name + "Nested",
fmt.Sprintf("[{\"name\": \"someValue\", \"type\" : \"INTEGER\", \"fields\" : %s }]", testcase.jsonOld),
fmt.Sprintf("[{\"name\": \"someValue\", \"type\" : \"INT64\", \"fields\" : %s }]", testcase.jsonNew),
testcase.changeable,
}
}

func TestUnitBigQueryDataTable_schemaIsChangeableNested(t *testing.T) {
t.Parallel()
// Only top level column drops are changeable
skipNested := map[string]bool{"arraySizeDecreases": true, "typeModeReqToNullAndColumnDropped": true}
for _, testcase := range testUnitBigQueryDataTableIsChangeableTestCases {
if !skipNested[testcase.name] {
testcaseNested := &testUnitBigQueryDataTableJSONChangeableTestCase{
testcase.name + "Nested",
fmt.Sprintf("[{\"name\": \"someValue\", \"type\" : \"INTEGER\", \"fields\" : %s }]", testcase.jsonOld),
fmt.Sprintf("[{\"name\": \"someValue\", \"type\" : \"INT64\", \"fields\" : %s }]", testcase.jsonNew),
testcase.changeable,
}
testcaseNested.check(t)
}
testcaseNested.check(t)
}
}
@@ -45,6 +45,39 @@ func TestAccBigQueryTable_Basic(t *testing.T) {
})
}

func TestAccBigQueryTable_DropColumns(t *testing.T) {
t.Parallel()

datasetID := fmt.Sprintf("tf_test_%s", acctest.RandString(t, 10))
tableID := fmt.Sprintf("tf_test_%s", acctest.RandString(t, 10))

acctest.VcrTest(t, resource.TestCase{
PreCheck: func() { acctest.AccTestPreCheck(t) },
ProtoV5ProviderFactories: acctest.ProtoV5ProviderFactories(t),
CheckDestroy: testAccCheckBigQueryTableDestroyProducer(t),
Steps: []resource.TestStep{
{
Config: testAccBigQueryTableTimePartitioningDropColumns(datasetID, tableID),
},
{
ResourceName: "google_bigquery_table.test",
ImportState: true,
ImportStateVerify: true,
ImportStateVerifyIgnore: []string{"deletion_protection"},
},
{
Config: testAccBigQueryTableTimePartitioningDropColumnsUpdate(datasetID, tableID),
},
{
ResourceName: "google_bigquery_table.test",
ImportState: true,
ImportStateVerify: true,
ImportStateVerifyIgnore: []string{"deletion_protection"},
},
},
})
}

func TestAccBigQueryTable_Kms(t *testing.T) {
t.Parallel()
resourceName := "google_bigquery_table.test"
@@ -1759,6 +1792,62 @@ EOH
`, datasetID, tableID, partitioningType)
}

func testAccBigQueryTableTimePartitioningDropColumns(datasetID, tableID string) string {
return fmt.Sprintf(`
resource "google_bigquery_dataset" "test" {
dataset_id = "%s"
}

resource "google_bigquery_table" "test" {
deletion_protection = false
table_id = "%s"
dataset_id = google_bigquery_dataset.test.dataset_id

schema = <<EOH
[
{
"name": "ts",
"type": "TIMESTAMP"
},
{
"name": "some_string",
"type": "STRING"
},
{
"name": "some_int",
"type": "INTEGER"
}
]
EOH

}
`, datasetID, tableID)
}

func testAccBigQueryTableTimePartitioningDropColumnsUpdate(datasetID, tableID string) string {
return fmt.Sprintf(`
resource "google_bigquery_dataset" "test" {
dataset_id = "%s"
}

resource "google_bigquery_table" "test" {
deletion_protection = false
table_id = "%s"
dataset_id = google_bigquery_dataset.test.dataset_id

schema = <<EOH
[
{
"name": "ts",
"type": "TIMESTAMP"
}
]
EOH

}
`, datasetID, tableID)
}

func testAccBigQueryTableKms(cryptoKeyName, datasetID, tableID string) string {
return fmt.Sprintf(`
resource "google_bigquery_dataset" "test" {