egose/database-tools

Extra MongoDB Tools

This repository provides additional MongoDB Tools with the following functionalities:

  • mongo-archive - dump MongoDB backups to disk and upload them to cloud storage.
  • mongo-unarchive - download MongoDB dump files from cloud storage and restore them to a live database.

Building Tools

To build the MongoDB Tools, follow these steps:

  1. Clone the repository:

    git clone https://github.com/egose/database-tools
    cd database-tools
  2. Install dependencies and build the Go binaries:

    go mod tidy
    make build

These commands install the required dependencies and then build the Go binaries into the dist directory.
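
As a quick sanity check after the build, the sketch below reports whether the two binaries are present in the output directory. It assumes the binaries are named after the tools listed above (mongo-archive and mongo-unarchive); the helper name check_dist is made up for illustration and is not part of the repository.

```shell
# Report which of the expected binaries exist and are executable in a directory.
# Assumption: the binary names match the tool names (mongo-archive, mongo-unarchive).
check_dist() {
  dir="${1:-dist}"
  missing=0
  for bin in mongo-archive mongo-unarchive; do
    if [ -x "$dir/$bin" ]; then
      echo "found: $dir/$bin"
    else
      echo "missing: $dir/$bin"
      missing=1
    fi
  done
  # Non-zero status when at least one binary is absent.
  return "$missing"
}
```

Running `check_dist dist` from the repository root prints one line per binary and returns a non-zero status if either one is missing.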

Binary Arguments and Environment Variables

The binaries in this repository call MongoDB Tools directly, so their command arguments need only minimal changes from what users already know. Their behavior and command structure closely follow MongoDB's native tools such as mongodump and mongorestore. Each flag has a matching environment variable, listed in the tables below.

mongo-archive

| Flag | Environment variable | Type | Description |
| --- | --- | --- | --- |
| uri | MONGOARCHIVE__URI | string | MongoDB URI connection string |
| db | MONGOARCHIVE__DB | string | database to use |
| collection | MONGOARCHIVE__COLLECTION | string | collection to use |
| host | MONGOARCHIVE__HOST | string | MongoDB host to connect to |
| port | MONGOARCHIVE__PORT | string | MongoDB port |
| ssl | MONGOARCHIVE__SSL | bool | connect to a mongod or mongos that has SSL enabled |
| ssl-ca-file | MONGOARCHIVE__SSL_CA_FILE | string | the .pem file containing the root certificate chain |
| ssl-pem-key-file | MONGOARCHIVE__SSL_PEM_KEY_FILE | string | the .pem file containing the certificate and key |
| ssl-pem-key-password | MONGOARCHIVE__SSL_PEM_KEY_PASSWORD | string | the password to decrypt the sslPEMKeyFile, if necessary |
| ssl-crl-file | MONGOARCHIVE__SSL_CRL_FILE | string | the .pem file containing the certificate revocation list |
| ssl-allow-invalid-certificates | MONGOARCHIVE__SSL_ALLOW_INVALID_CERTIFICATES | bool | bypass the validation for server certificates |
| ssl-allow-invalid-hostnames | MONGOARCHIVE__SSL_ALLOW_INVALID_HOSTNAMES | bool | bypass the validation for server name |
| ssl-fips-mode | MONGOARCHIVE__SSL_FIPS_MODE | bool | use FIPS mode of the installed OpenSSL library |
| username | MONGOARCHIVE__USERNAME | string | username for authentication |
| password | MONGOARCHIVE__PASSWORD | string | password for authentication |
| authentication-database | MONGOARCHIVE__AUTHENTICATION_DATABASE | string | database that holds the user's credentials |
| authentication-mechanism | MONGOARCHIVE__AUTHENTICATION_MECHANISM | string | authentication mechanism to use |
| gssapi-service-name | MONGOARCHIVE__GSSAPI_SERVICE_NAME | string | service name to use when authenticating using GSSAPI/Kerberos |
| gssapi-host-name | MONGOARCHIVE__GSSAPI_HOST_NAME | string | hostname to use when authenticating using GSSAPI/Kerberos |
| query | MONGOARCHIVE__QUERY | string | query filter, as a v2 Extended JSON string |
| query-file | MONGOARCHIVE__QUERY_FILE | string | path to a file containing a query filter (v2 Extended JSON) |
| read-preference | MONGOARCHIVE__READ_PREFERENCE | string | specify either a preference mode or a preference JSON object |
| force-table-scan | MONGOARCHIVE__FORCE_TABLE_SCAN | bool | force a table scan |
| verbose | MONGOARCHIVE__VERBOSE | string | more detailed log output |
| quiet | MONGOARCHIVE__QUIET | bool | hide all log output |
| az-account-name | MONGOARCHIVE__AZ_ACCOUNT_NAME | string | Azure Blob Storage account name |
| az-account-key | MONGOARCHIVE__AZ_ACCOUNT_KEY | string | Azure Blob Storage account key |
| az-container-name | MONGOARCHIVE__AZ_CONTAINER_NAME | string | Azure Blob Storage container name |
| aws-access-key-id | MONGOARCHIVE__AWS_ACCESS_KEY_ID | string | AWS access key associated with an IAM account |
| aws-secret-access-key | MONGOARCHIVE__AWS_SECRET_ACCESS_KEY | string | AWS secret key associated with the access key |
| aws-region | MONGOARCHIVE__AWS_REGION | string | AWS Region whose servers you want to send your requests to |
| aws-bucket | MONGOARCHIVE__AWS_BUCKET | string | AWS S3 bucket name |
| gcp-bucket | MONGOARCHIVE__GCP_BUCKET | string | GCP storage bucket name |
| gcp-creds-file | MONGOARCHIVE__GCP_CREDS_FILE | string | GCP service account's credentials file |
| gcp-project-id | MONGOARCHIVE__GCP_PROJECT_ID | string | GCP service account's project id |
| gcp-private-key-id | MONGOARCHIVE__GCP_PRIVATE_KEY_ID | string | GCP service account's private key id |
| gcp-private-key | MONGOARCHIVE__GCP_PRIVATE_KEY | string | GCP service account's private key |
| gcp-client-email | MONGOARCHIVE__GCP_CLIENT_EMAIL | string | GCP service account's client email |
| gcp-client-id | MONGOARCHIVE__GCP_CLIENT_ID | string | GCP service account's client id |
| cron | MONGOARCHIVE__CRON | bool | run a cron scheduler and block the current execution path |
| cron-expression | MONGOARCHIVE__CRON_EXPRESSION | string | a string that describes the details of the cron schedule |
| tz | MONGOARCHIVE__TZ | string | user-specified time zone |
| keep | MONGOARCHIVE__KEEP | bool | keep data dump |
| uri-prune | MONGOARCHIVE__URI_PRUNE | bool | prune MongoDB URI connection string |

mongo-unarchive

| Flag | Environment variable | Type | Description |
| --- | --- | --- | --- |
| uri | MONGOUNARCHIVE__URI | string | MongoDB URI connection string |
| db | MONGOUNARCHIVE__DB | string | database to use |
| collection | MONGOUNARCHIVE__COLLECTION | string | collection to use |
| ns-exclude | MONGOUNARCHIVE__NS_EXCLUDE | string | exclude matching namespaces |
| ns-include | MONGOUNARCHIVE__NS_INCLUDE | string | include matching namespaces |
| ns-from | MONGOUNARCHIVE__NS_FROM | string | rename matching namespaces, must have matching nsTo |
| ns-to | MONGOUNARCHIVE__NS_TO | string | rename matched namespaces, must have matching nsFrom |
| host | MONGOUNARCHIVE__HOST | string | MongoDB host to connect to |
| port | MONGOUNARCHIVE__PORT | string | MongoDB port |
| ssl | MONGOUNARCHIVE__SSL | bool | connect to a mongod or mongos that has SSL enabled |
| ssl-ca-file | MONGOUNARCHIVE__SSL_CA_FILE | string | the .pem file containing the root certificate chain |
| ssl-pem-key-file | MONGOUNARCHIVE__SSL_PEM_KEY_FILE | string | the .pem file containing the certificate and key |
| ssl-pem-key-password | MONGOUNARCHIVE__SSL_PEM_KEY_PASSWORD | string | the password to decrypt the sslPEMKeyFile, if necessary |
| ssl-crl-file | MONGOUNARCHIVE__SSL_CRL_FILE | string | the .pem file containing the certificate revocation list |
| ssl-allow-invalid-certificates | MONGOUNARCHIVE__SSL_ALLOW_INVALID_CERTIFICATES | bool | bypass the validation for server certificates |
| ssl-allow-invalid-hostnames | MONGOUNARCHIVE__SSL_ALLOW_INVALID_HOSTNAMES | bool | bypass the validation for server name |
| ssl-fips-mode | MONGOUNARCHIVE__SSL_FIPS_MODE | bool | use FIPS mode of the installed OpenSSL library |
| username | MONGOUNARCHIVE__USERNAME | string | username for authentication |
| password | MONGOUNARCHIVE__PASSWORD | string | password for authentication |
| authentication-database | MONGOUNARCHIVE__AUTHENTICATION_DATABASE | string | database that holds the user's credentials |
| authentication-mechanism | MONGOUNARCHIVE__AUTHENTICATION_MECHANISM | string | authentication mechanism to use |
| gssapi-service-name | MONGOUNARCHIVE__GSSAPI_SERVICE_NAME | string | service name to use when authenticating using GSSAPI/Kerberos |
| gssapi-host-name | MONGOUNARCHIVE__GSSAPI_HOST_NAME | string | hostname to use when authenticating using GSSAPI/Kerberos |
| drop | MONGOUNARCHIVE__DROP | bool | drop each collection before import |
| dry-run | MONGOUNARCHIVE__DRY_RUN | bool | view summary without importing anything; recommended with verbosity |
| write-concern | MONGOUNARCHIVE__WRITE_CONCERN | string | write concern options |
| no-index-restore | MONGOUNARCHIVE__NO_INDEX_RESTORE | bool | don't restore indexes |
| no-options-restore | MONGOUNARCHIVE__NO_OPTIONS_RESTORE | bool | don't restore collection options |
| keep-index-version | MONGOUNARCHIVE__KEEP_INDEX_VERSION | bool | don't update index version |
| maintain-insertion-order | MONGOUNARCHIVE__MAINTAIN_INSERTION_ORDER | bool | restore the documents in the order of the input source |
| num-parallel-collections | MONGOUNARCHIVE__NUM_PARALLEL_COLLECTIONS | string | number of collections to restore in parallel |
| num-insertion-workers-per-collection | MONGOUNARCHIVE__NUM_INSERTION_WORKERS_PER_COLLECTION | string | number of insert operations to run concurrently per collection |
| stop-on-error | MONGOUNARCHIVE__STOP_ON_ERROR | string | halt after encountering any error during insertion |
| bypass-document-validation | MONGOUNARCHIVE__BYPASS_DOCUMENT_VALIDATION | string | bypass document validation |
| preserve-uuid | MONGOUNARCHIVE__PRESERVE_UUID | string | preserve original collection UUIDs |
| verbose | MONGOUNARCHIVE__VERBOSE | string | more detailed log output |
| quiet | MONGOUNARCHIVE__QUIET | bool | hide all log output |
| az-account-name | MONGOUNARCHIVE__AZ_ACCOUNT_NAME | string | Azure Blob Storage account name |
| az-account-key | MONGOUNARCHIVE__AZ_ACCOUNT_KEY | string | Azure Blob Storage account key |
| az-container-name | MONGOUNARCHIVE__AZ_CONTAINER_NAME | string | Azure Blob Storage container name |
| aws-access-key-id | MONGOUNARCHIVE__AWS_ACCESS_KEY_ID | string | AWS access key associated with an IAM account |
| aws-secret-access-key | MONGOUNARCHIVE__AWS_SECRET_ACCESS_KEY | string | AWS secret key associated with the access key |
| aws-region | MONGOUNARCHIVE__AWS_REGION | string | AWS Region whose servers you want to send your requests to |
| aws-bucket | MONGOUNARCHIVE__AWS_BUCKET | string | AWS S3 bucket name |
| gcp-bucket | MONGOUNARCHIVE__GCP_BUCKET | string | GCP storage bucket name |
| gcp-creds-file | MONGOUNARCHIVE__GCP_CREDS_FILE | string | GCP service account's credentials file |
| gcp-project-id | MONGOUNARCHIVE__GCP_PROJECT_ID | string | GCP service account's project id |
| gcp-private-key-id | MONGOUNARCHIVE__GCP_PRIVATE_KEY_ID | string | GCP service account's private key id |
| gcp-private-key | MONGOUNARCHIVE__GCP_PRIVATE_KEY | string | GCP service account's private key |
| gcp-client-email | MONGOUNARCHIVE__GCP_CLIENT_EMAIL | string | GCP service account's client email |
| gcp-client-id | MONGOUNARCHIVE__GCP_CLIENT_ID | string | GCP service account's client id |
| object-name | MONGOUNARCHIVE__OBJECT_NAME | string | object name of the archived file in the storage |
| dir | MONGOUNARCHIVE__DIR | string | directory name that contains the dumped files |
| updates | MONGOUNARCHIVE__UPDATES | string | array of update specifications as a JSON string |
| updates-file | MONGOUNARCHIVE__UPDATES_FILE | string | path to a file containing an array of update specifications |
| keep | MONGOUNARCHIVE__KEEP | bool | keep data dump |
| uri-prune | MONGOUNARCHIVE__URI_PRUNE | bool | prune MongoDB URI connection string |
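
The environment variable names in both tables appear to follow one pattern: the tool prefix (MONGOARCHIVE or MONGOUNARCHIVE), two underscores, then the flag name upper-cased with hyphens turned into underscores. The helper below, flag_to_env, is a hypothetical convenience for scripts that derives a name under that assumption. It is inferred from the tables rather than taken from the tools' source, so treat the tables as the reference when in doubt.

```shell
# Derive the environment variable name for a flag, assuming the pattern
# <PREFIX>__<FLAG upper-cased, hyphens replaced by underscores>.
flag_to_env() {
  prefix="$1"
  flag="$2"
  printf '%s__%s\n' "$prefix" "$(printf '%s' "$flag" | tr 'a-z-' 'A-Z_')"
}

flag_to_env MONGOARCHIVE az-account-name   # prints MONGOARCHIVE__AZ_ACCOUNT_NAME
flag_to_env MONGOUNARCHIVE ns-include      # prints MONGOUNARCHIVE__NS_INCLUDE
```

For example, `export "$(flag_to_env MONGOARCHIVE db)=mydb"` sets MONGOARCHIVE__DB for the commands that follow in the same shell.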

Examples

Dump Database and Upload to Azure Storage

mongo-archive \
--uri="mongodb://<username>:<password>@cluster0.mongodb.net/" \
--db=<dbname> \
--az-account-name=<az_account_name> \
--az-account-key=<az_account_key> \
--az-container-name=<az_container_name>

This example demonstrates how to dump the data from a specified database and upload it to Azure storage. Replace <username>, <password>, <dbname>, <az_account_name>, <az_account_key>, and <az_container_name> with the appropriate values for your setup.

Run Persistent Server for Regular Database Archival

mongo-archive \
--uri="mongodb://<username>:<password>@cluster0.mongodb.net/" \
--db=<dbname> \
--az-account-name=<az_account_name> \
--az-account-key=<az_account_key> \
--az-container-name=<az_container_name> \
--cron \
--cron-expression="* * * * *"

This example demonstrates how to run a persistent server that regularly archives a database. With --cron set, the process keeps running and performs the archival on the schedule given by the cron expression; "* * * * *" runs it every minute. Replace <username>, <password>, <dbname>, <az_account_name>, <az_account_key>, <az_container_name>, and the cron expression with your own values.
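
The expression in the example uses the common five-field cron layout (minute, hour, day of month, month, day of week). The sketch below lists a few illustrative schedules and checks that each one has five fields before it is handed to mongo-archive. The helper cron_field_count is hypothetical, and whether the tool's scheduler accepts step values such as */15 is an assumption to verify against your build.

```shell
# Count whitespace-separated fields in a cron expression.
# The function body runs in a subshell with globbing disabled,
# so "*" is not expanded to file names while the string is split.
cron_field_count() (
  set -f
  set -- $1
  echo "$#"
)

# Illustrative schedules (minute hour day-of-month month day-of-week):
#   "* * * * *"     every minute
#   "0 2 * * *"     every day at 02:00
#   "*/15 * * * *"  every 15 minutes
#   "0 3 * * 0"     every Sunday at 03:00
for expr in "* * * * *" "0 2 * * *" "*/15 * * * *" "0 3 * * 0"; do
  if [ "$(cron_field_count "$expr")" -eq 5 ]; then
    echo "ok:      $expr"
  else
    echo "suspect: $expr"
  fi
done
```

Quoting the expression on the command line, as the example does, matters for the same reason: an unquoted * would be expanded by the shell before the tool sees it.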

Restore the Target Database from Azure Storage

mongo-unarchive \
--uri="mongodb://localhost:27017" \
--db=<dbname> \
--az-account-name=<az_account_name> \
--az-account-key=<az_account_key> \
--az-container-name=<az_container_name>

This example shows how to restore the target database from Azure storage. Replace <dbname>, <az_account_name>, <az_account_key>, and <az_container_name> with your own values. The database will be restored to the MongoDB instance running on localhost:27017.

Restore the Target Database from Azure Storage and Apply Changes

mongo-unarchive \
--uri="mongodb://localhost:27017" \
--db=<dbname> \
--az-account-name=<az_account_name> \
--az-account-key=<az_account_key> \
--az-container-name=<az_container_name> \
--updates-file=/home/nonroot/updates.json

This example demonstrates how to restore the target database from Azure storage and apply changes contained in an updates file. Replace <dbname>, <az_account_name>, <az_account_key>, <az_container_name>, and /home/nonroot/updates.json with your own values. The updates file should contain an array of update specifications, each naming a collection, a filter, and the update to apply to the restored database.

An example of updates.json:

[
  {
    "collection": "users",
    "filter": {
      "email": {
        "$exists": true
      }
    },
    "update": [
      {
        "$set": {
          "email": {
            "$replaceOne": {
              "input": "$email",
              "find": "@",
              "replacement": "_"
            }
          }
        }
      }
    ]
  }
]

This JSON file provides an example of updating the users collection in the restored database.
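
In the file above, the update is an array, which MongoDB treats as an aggregation pipeline: $set assigns email the result of $replaceOne, and $replaceOne replaces only the first occurrence of the find string. Assuming the tool passes the specification to MongoDB unchanged, every user document with an email field has its first "@" turned into "_". The shell function below mimics that transformation on a single string so the effect is easy to see; mask_email is a made-up name and the sample addresses are placeholders.

```shell
# Mimic the $replaceOne step on one value.
# sed without the "g" flag replaces only the first match on the line,
# which mirrors $replaceOne replacing only the first occurrence.
mask_email() {
  printf '%s\n' "$1" | sed 's/@/_/'
}

mask_email "alice@example.com"   # prints alice_example.com
mask_email "a@b@c"               # prints a_b@c
```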

Execute Binary Using Docker Container Image

To execute a binary using a Docker container image, you can use the following command:

docker run --rm \
    -v "$(pwd)/tmp:/tmp" \
    -e MONGOARCHIVE__DUMP_PATH=/tmp/datadump \
    ghcr.io/egose/database-tools:latest \
    mongo-archive \
    --uri="mongodb://<username>:<password>@cluster0.mongodb.net/" \
    --db=<dbname> \
    --az-account-name=<az_account_name> \
    --az-account-key=<az_account_key> \
    --az-container-name=<az_container_name> \
    --keep

Run Kubernetes CronJob with Mounted Volume

apiVersion: batch/v1
kind: CronJob
metadata:
  name: mongo-archive
spec:
  schedule: "0 12 * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      backoffLimit: 3
      template:
        spec:
          restartPolicy: Never
          initContainers:
            - name: backup-permission
              image: alpine:3.18
              imagePullPolicy: IfNotPresent
              command: ["/bin/sh", "-c"]
              args:
                - |
                  rm -rf /tmp/*;
                  adduser -D -u 1000 nonroot;
                  chown nonroot:nonroot /tmp;
              volumeMounts:
                - mountPath: /tmp
                  name: backup-volume
          containers:
            - name: backup-job
              image: ghcr.io/egose/database-tools:0.2.6
              imagePullPolicy: IfNotPresent
              command: ["/bin/sh", "-c"]
              args:
                - |
                  mongo-archive --db=mydb --read-preference=primary --force-table-scan
              env:
                - name: MONGOARCHIVE__URI
                  value: "mongodb+srv://user:<password>@<host>"
                - name: MONGOARCHIVE__AZ_ACCOUNT_NAME
                  value: mystorage
                - name: MONGOARCHIVE__AZ_ACCOUNT_KEY
                  value: myaccountkey
                - name: MONGOARCHIVE__AZ_CONTAINER_NAME
                  value: mybackup
              volumeMounts:
                - mountPath: /tmp
                  name: backup-volume
          volumes:
            - name: backup-volume
              persistentVolumeClaim:
                claimName: backup-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
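
Once the manifest above is saved to a file, standard kubectl commands can apply it and start a one-off run without waiting for the schedule. The wrappers below are a sketch: the file name cronjob.yaml and the job name manual-archive are placeholders chosen for illustration, while the CronJob name mongo-archive comes from the manifest.

```shell
# Apply the CronJob and PersistentVolumeClaim manifest.
# "cronjob.yaml" is a placeholder file name.
apply_archive_cronjob() {
  kubectl apply -f "${1:-cronjob.yaml}"
}

# Start a one-off Job from the CronJob's template instead of waiting
# for the next scheduled run. "manual-archive" is a placeholder job name.
trigger_archive_now() {
  kubectl create job "${1:-manual-archive}" --from=cronjob/mongo-archive
}

# Follow the logs of that one-off Job.
follow_archive_logs() {
  kubectl logs -f "job/${1:-manual-archive}"
}
```

A typical sequence would be `apply_archive_cronjob`, then `trigger_archive_now`, then `follow_archive_logs` to watch the first backup complete.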

Backlog

...