Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make backup and restore more universal and secure #1276

Merged

Conversation

onlymellb
Copy link
Contributor

What problem does this PR solve?

resolve #1217 and #1253

What is changed and how does it work?

Check List

Tests

  • Manual test (add detailed scripts or steps below)

Code changes

  • Has Go code change
  • Has documents change

Side effects

  • Breaking backward compatibility

Related changes

  • Need to update the documentation

Does this PR introduce a user-facing change?:

make backup and restore more universal and secure

name: demo2-restore
namespace: test2
spec:
from:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For restore, this field should be to.

gcs:
projectId: <your-project-id>
secretName: gcs-secret
storageType: gcs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting storageType, storageClassName and storageSize together is confusing.

Is the PV a temporary storage? Maybe we can rename either the storage backend or the temporary PV.

Also, I think the backup path is specific to each storage backend, it's better to put under the corresponding storage backend.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this?

spec:
  from: # storage backend info for backup file
    type: gcs
    gcs:
      projectId: <your-project-id>
      secretName: gcs-secret
      path: gcs://test1-demo1/backup-2019-11-11T16:06:05Z.tgz
  to: # downstream tidb cluster info
    host: 10.0.0.2
    port: 4000
    user: root
    secretName: backup-demo2-tidb-secret
  tempStorage: # use temporarily to download the backup file
    storageClassName: local-storage
    size: 1Gi

/cc @DanielZhangQD @shuijing198799

Copy link
Contributor Author

@onlymellb onlymellb Dec 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree putting storageType, storageClassName and storageSize together is confusing. These three fields all start with the word storage, which makes people think they have any relationship, but there is no relationship between them. Therefore I intend to remove storageType field, here is the related issue #768. I prefer the revised definition below.

spec:
  to:
    host: 10.0.0.2
    port: 4000
    user: root
    secretName: backup-demo2-tidb-secret
  gcs:
    projectId: <your-project-id>
    secretName: gcs-secret
    # "<bucket-name>/<path-to-backup-file>"
    path: test1-demo1/backup-2019-11-11T16:06:05Z.tgz
  storageClassName: local-storage
  storageSize: 1Gi

What do you think? @DanielZhangQD @shuijing198799 @weekface

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use from and to is more clear for users.
I agree to remove the storageType attribute, gcs: indicate the type already.

spec:
  storageClassName: local-storage
  storageSize: 1Gi
  from:
    gcs:
      projectId: <your-project-id>
      secretName: gcs-secret
      path: gcs://test1-demo1/backup-2019-11-11T16:06:05Z.tgz
  to:
    host: 10.0.0.2
    port: 4000
    user: root
    secretName: backup-demo2-tidb-secret

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that defining a remote storage configuration(here is gcs) in restore has already indicated that the backup data must come from here, and there is no need to use from to indicate this relationship. In addition, we should ensure that the spec is as simple as possible to reduce the cost of use for users.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are storage class and size defined, users may think the backup is in the storage defined with storageClassName and storageSize. Adding to makes it less confuse and more straight forward and also simple enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think no need to add from/to, users know they are doing backup or restore, so even if no from/to indication, they know that the tidb cluster are used for what.
For the storageClassName: local-storage storageSize: 1Gi, I prefer to make it clear in the doc that they are used for temporary storage whether to put them under tempStorage or not.

spec:
  cluster or tidb:
    host: 10.0.0.2
    port: 4000
    user: root
    secretName: backup-demo2-tidb-secret
  gcs:
    projectId: <your-project-id>
    secretName: gcs-secret
    # "<bucket-name>/<path-to-backup-file>"
    path: test1-demo1/backup-2019-11-11T16:06:05Z.tgz
  storageClassName: local-storage
  storageSize: 1Gi

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if users provide multiple cloud storage configurations if we omit the storageType field?

host: 10.0.0.1
port: 4000
user: root
secretName: backup-demo1-tidb-secret
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use passwordSecretName?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it makes sense to include the name of the key in the secretName. What if the secret stores more stuff?

@onlymellb onlymellb changed the title make backup and restore more universal and secure (WIP)make backup and restore more universal and secure Dec 3, 2019
pkg/backup/util/util.go Outdated Show resolved Hide resolved
pkg/backup/util/util.go Outdated Show resolved Hide resolved
// GetTidbUserAndPassword get the tidb user and password from specific secret
func GetTidbUserAndPassword(ns, name, tidbSecretName string, secretLister corelisters.SecretLister) (user, password, reason string, err error) {
// GenerateTidbPasswordEnv generate the password EnvVar
func GenerateTidbPasswordEnv(ns, name, tidbSecretName string, secretLister corelisters.SecretLister) ([]corev1.EnvVar, string, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can follow k8s builtin controller's behavior to not checking the referenced resources existence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of the check is to make it fail quickly when there is a problem. At the same time, the failure information and reason will be written into the backup CR status, which is helpful for upper-layer application integrated with us to diagnose the problem quickly. k8s only discovers problems when it encounters problems in the execution layer, and then records the problem to the underlying pod object, which is not very straightforward for diagnosing the problem of the upper-layer application

secret, err := secretLister.Secrets(ns).Get(gcsSecretName)
if err != nil {
err := fmt.Errorf("backup %s/%s get gcs secret %s failed, err: %v", ns, name, gcsSecretName, err)
err := fmt.Errorf("get gcs secret %s/%s failed, err: %v", ns, gcsSecretName, err)
return certEnv, "GetGcsSecretFailed", err
}

keyStr, exist := CheckAllKeysExistInSecret(secret, constants.GcsCredentialsKey)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already checked in GenerateGcsCertEnvVar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenerateGcsCertEnvVar does not check whether the secret exists and contains the specified key.

secret, err := secretLister.Secrets(ns).Get(s3SecretName)
if err != nil {
err := fmt.Errorf("backup %s/%s get s3 secret %s failed, err: %v", ns, name, s3SecretName, err)
err := fmt.Errorf("get s3 secret %s/%s failed, err: %v", ns, s3SecretName, err)
return certEnv, "GetS3SecretFailed", err
}

keyStr, exist := CheckAllKeysExistInSecret(secret, constants.S3AccessKey, constants.S3SecretKey)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already checked in GenerateS3CertEnvVar.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -506,7 +506,7 @@ type S3StorageProvider struct {
Provider S3StorageProviderType `json:"provider"`
// Region in which the S3 compatible bucket is located.
Region string `json:"region,omitempty"`
// Bucket in which to store the Backup.
// Bucket in which to store the backup data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not support specifying a prefix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by specifying a bucket prefix and adding a random string to construct a complete bucket name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users may want to store the backup file under a specific directory (prefix), though we have generated a prefix composed by cluster name and namespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this pr, users must specify the bucket name, since backup is not the same as before. There is no cluster and namespace, so the bucket name will not be generated automatically.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bucket name is different from prefix. I'm wondering if we should provide another parameter prefix, so users can store backup file in a bucket under a specific directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added one on #1280

// BackupPath is the location of the backup.
BackupPath string `json:"backupPath"`
// Type is the backup type for tidb cluster.
Type BackupType `json:"backupType,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can allow users to omit this field if there is only one storage provider to improve UX.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BackupType field is unused currently. Do you mean StorageType?

onlymellb and others added 2 commits December 3, 2019 21:50
Comment on lines 46 to 54
cmd.Flags().StringVarP(&bo.Namespace, "namespace", "n", "", "Backup CR's namespace")
cmd.Flags().StringVarP(&bo.Host, "host", "h", "", "Tidb cluster access address")
cmd.Flags().Int32VarP(&bo.Port, "port", "P", bkconstants.DefaultTidbPort, "Port number to use for connecting tidb cluster")
cmd.Flags().StringVarP(&bo.Bucket, "bucket", "k", "", "Bucket in which to store the backup data")
cmd.Flags().StringVarP(&bo.Password, bkconstants.TidbPasswordKey, "p", "", "Password to use when connecting to tidb cluster")
cmd.Flags().StringVarP(&bo.User, "user", "u", "", "User for login tidb cluster")
cmd.Flags().StringVarP(&bo.StorageType, "storageType", "S", "", "Backend storage type")
cmd.Flags().StringVarP(&bo.StorageType, "storageType", "s", "", "Backend storage type")
cmd.Flags().StringVarP(&bo.BackupName, "backupName", "b", "", "Backup CRD object name")
util.SetFlagsFromEnv(cmd.Flags(), bkconstants.BackupManagerEnvVarPrefix)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, backup-manager will get the backup from kube-apiserver and you only need to pass the name and namespace of the backup CR, then all the other info can be retrieved from that CR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, backup-manager can assemble its own parameters through backup & restore CR. But here I hope that the logic of these parameter checks can be done at the operator level, instead of waiting until the backup & restore job is executed to find that some parameters are problematic. We think that the backup-manager startup parameters obtained from the operator must be legal. The backup-manager only needs to focus on processing backup and restore tasks and updating the status of backup CR.

type: string
cluster:
description: Cluster represents the tidb cluster to be restored.
backupType:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a backupType in backupSpec, why add one more here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is generated automatically

@onlymellb onlymellb changed the title (WIP)make backup and restore more universal and secure make backup and restore more universal and secure Dec 10, 2019
tennix
tennix previously approved these changes Dec 10, 2019
Copy link
Member

@tennix tennix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@onlymellb
Copy link
Contributor Author

I have tested backup and restore locally and it works as expected. @tennix @weekface @DanielZhangQD @shuijing198799 PTAL again

@onlymellb
Copy link
Contributor Author

/run-e2e-tests

@onlymellb
Copy link
Contributor Author

/run-e2e-tests

@onlymellb onlymellb added the status/PTAL PR needs to be reviewed label Dec 16, 2019
Copy link
Contributor

@weekface weekface left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@onlymellb
Copy link
Contributor Author

/run-e2e-tests

Copy link
Contributor

@shuijing198799 shuijing198799 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@onlymellb onlymellb merged commit 63e81ca into pingcap:master Dec 16, 2019
@onlymellb onlymellb deleted the make-backup-and-restore-more-universal branch December 16, 2019 06:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/backup status/PTAL PR needs to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use env for rclone
5 participants