[dedicated-3.7] Cherry-picks from OCP 3.7 GA #6656

Merged

1,685 changes: 1,079 additions & 606 deletions _topic_map.yml

Large diffs are not rendered by default.

315 changes: 308 additions & 7 deletions admin_guide/backup_restore.adoc
@@ -38,8 +38,8 @@ nodes they get rescheduled to.
[[backup-restore-prerequisites]]
== Prerequisites

. Because the restore procedure involves a xref:cluster-restore[complete
reinstallation], save all the files used in the initial installation. This may
. Because the restore procedure involves a complete
reinstallation, save all the files used in the initial installation. This may
include:
+
- *_~/.config/openshift/installer.cfg.yml_* (from the
@@ -73,13 +73,17 @@ following sections), which depends on how *etcd* is deployed.
|all-in-one cluster
|*_/var/lib/openshift/openshift.local.etcd_*

|external etcd (not on master)
|external etcd (located either on a master or another host)
|*_/var/lib/etcd_*

|embedded etcd (on master)
|*_/var/lib/origin/etcd_*
|===

[WARNING]
====
Embedded etcd is no longer supported starting with {product-title} 3.7. See
xref:../install_config/upgrading/migrating_embedded_etcd.adoc#install-config-upgrading-etcd-data-migration[Migrating Embedded etcd to External etcd] for details.
====


[[cluster-backup]]
== Cluster Backup
@@ -132,8 +136,14 @@ For a container-based installation, you must use `docker exec` to run *etcdctl*
inside the container.
====

[[cluster-restore]]
== Cluster Restore
. Copy the *_db_* file over to the backup you created:
+
----
# cp "$ETCD_DATA_DIR"/member/snap/db "$ETCD_DATA_DIR.bak"/member/snap/db
----
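+
In this snippet, `$ETCD_DATA_DIR` and the *_$ETCD_DATA_DIR.bak_* copy are assumed to have been created by the preceding backup steps. If you need to set the variable by hand on an external etcd host, it would typically point at the directory listed in the table above, for example:
+
----
# ETCD_DATA_DIR=/var/lib/etcd
----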

[[registry-certificates-backup]]
=== Registry Certificates Backup

[NOTE]
====
@@ -166,6 +176,297 @@ that {product-title} was previously installed.
# chown -R etcd:etcd $ETCD_DATA_DIR
----

. Create the new single node cluster using etcd's `--force-new-cluster` option.
You can do this using the values from *_/etc/etcd/etcd.conf_*, or you can
temporarily modify the *systemd* unit file and start the service normally.
+
To do so, edit the *_/usr/lib/systemd/system/etcd.service_* file, and add
`--force-new-cluster`:
+
----
# sed -i '/ExecStart/s/"$/ --force-new-cluster"/' /usr/lib/systemd/system/etcd.service
# systemctl show etcd.service --property ExecStart --no-pager

ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd --force-new-cluster"
----
+
Then, restart the *etcd* service:
+
----
# systemctl daemon-reload
# systemctl start etcd
----

. Verify the *etcd* service started correctly, then re-edit the
*_/usr/lib/systemd/system/etcd.service_* file and remove the
`--force-new-cluster` option:
+
----
# sed -i '/ExecStart/s/ --force-new-cluster//' /usr/lib/systemd/system/etcd.service
# systemctl show etcd.service --property ExecStart --no-pager

ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd"
----

. Restart the *etcd* service, then verify the etcd cluster is running correctly
and displays {product-title}'s configuration:
+
----
# systemctl daemon-reload
# systemctl restart etcd
----
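+
To confirm that the cluster is serving the {product-title} configuration, a check such as the following can be used (the certificate paths match the defaults used elsewhere in this topic; replace the peer URL with your own etcd host):
+
----
# etcdctl --cert-file=/etc/etcd/peer.crt \
--key-file=/etc/etcd/peer.key \
--ca-file=/etc/etcd/ca.crt \
--peers="https://<etcd-host>:2379" \
ls /
----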

[[cluster-restore-multiple-member-etcd-clusters]]
== Cluster Restore for Multiple-member etcd Clusters

When using an external etcd host, you must first restore the etcd backup
by creating a new, single node etcd cluster. If using external etcd with
multiple members, you must then also add any additional etcd members to the
cluster one by one.

Choose a system to be the initial etcd member, and restore its etcd backup and
configuration:

. Run the following on the etcd host:
+
----
# ETCD_DIR=/var/lib/etcd/
# mv $ETCD_DIR /var/lib/etcd.orig
# cp -Rp /var/lib/origin/etcd-backup-<timestamp>/ $ETCD_DIR
# chcon -R --reference /var/lib/etcd.orig/ $ETCD_DIR
# chown -R etcd:etcd $ETCD_DIR
----

. Restore your *_/etc/etcd/etcd.conf_* file from backup or *_.rpmsave_*.
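+
For example, if the etcd RPM upgrade preserved the previous configuration, it can be copied back into place (the *_.rpmsave_* file name is the usual RPM convention; adjust the path to wherever your backup copy lives):
+
----
# cp /etc/etcd/etcd.conf.rpmsave /etc/etcd/etcd.conf
----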

. Depending on your environment, follow the instructions for
xref:backup-containerized-etcd-deployments[Containerized etcd Deployments] or
xref:backup-non-containerized-etcd-deployments[Non-Containerized etcd
Deployments].

[[backup-containerized-etcd-deployments]]
=== Containerized etcd Deployments

. Create the new single node cluster using etcd's `--force-new-cluster`
option. You can do this with a long, complex command using the values from
*_/etc/etcd/etcd.conf_*, or you can temporarily modify the *systemd* unit file
and start the service normally.
+
To do so, edit the *_/etc/systemd/system/etcd_container.service_* file, and add
`--force-new-cluster`:
+
----
# sed -i '/ExecStart=/s/$/ --force-new-cluster/' /etc/systemd/system/etcd_container.service

ExecStart=/usr/bin/docker run --name etcd --rm -v \
/var/lib/etcd:/var/lib/etcd:z -v /etc/etcd:/etc/etcd:ro --env-file=/etc/etcd/etcd.conf \
--net=host --entrypoint=/usr/bin/etcd rhel7/etcd:3.1.9 --force-new-cluster
----
+
Then, restart the *etcd* service:
+
----
# systemctl daemon-reload
# systemctl start etcd_container
----
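+
Before moving on to the next step, you can confirm that the containerized service came up, for example:
+
----
# systemctl status etcd_container --no-pager
----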

. Verify the *etcd* service started correctly, then re-edit the
*_/etc/systemd/system/etcd_container.service_* file and remove the
`--force-new-cluster` option:
+
----
# sed -i '/ExecStart=/s/ --force-new-cluster//' /etc/systemd/system/etcd_container.service

ExecStart=/usr/bin/docker run --name etcd --rm -v /var/lib/etcd:/var/lib/etcd:z -v \
/etc/etcd:/etc/etcd:ro --env-file=/etc/etcd/etcd.conf --net=host \
--entrypoint=/usr/bin/etcd rhel7/etcd:3.1.9
----

. Restart the *etcd* service, then verify the etcd cluster is running correctly
and displays {product-title}'s configuration:
+
----
# systemctl daemon-reload
# systemctl restart etcd_container
# etcdctl --cert-file=/etc/etcd/peer.crt \
--key-file=/etc/etcd/peer.key \
--ca-file=/etc/etcd/ca.crt \
--peers="https://172.16.4.18:2379,https://172.16.4.27:2379" \
ls /
----

. If you have additional etcd members to add to your cluster, continue to
xref:adding-addtl-etcd-members[Adding Additional etcd Members].
Otherwise, if you only want a single node external etcd, continue to
xref:bringing-openshift-services-back-online[Bringing {product-title}
Services Back Online].

[[backup-non-containerized-etcd-deployments]]
=== Non-Containerized etcd Deployments

. Create the new single node cluster using etcd's `--force-new-cluster`
option. You can do this with a long, complex command using the values from
*_/etc/etcd/etcd.conf_*, or you can temporarily modify the *systemd* unit file
and start the service normally.
+
To do so, edit the *_/usr/lib/systemd/system/etcd.service_* file, and add
`--force-new-cluster`:
+
----
# sed -i '/ExecStart/s/"$/ --force-new-cluster"/' /usr/lib/systemd/system/etcd.service
# systemctl show etcd.service --property ExecStart --no-pager

ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd --force-new-cluster"
----
+
Then restart the *etcd* service:
+
----
# systemctl daemon-reload
# systemctl start etcd
----

. Verify the *etcd* service started correctly, then re-edit the
*_/usr/lib/systemd/system/etcd.service_* file and remove the
`--force-new-cluster` option:
+
----
# sed -i '/ExecStart/s/ --force-new-cluster//' /usr/lib/systemd/system/etcd.service
# systemctl show etcd.service --property ExecStart --no-pager

ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd"
----

. Restart the *etcd* service, then verify the etcd cluster is running correctly
and displays {product-title}'s configuration:
+
----
# systemctl daemon-reload
# systemctl restart etcd
# etcdctl --cert-file=/etc/etcd/peer.crt \
--key-file=/etc/etcd/peer.key \
--ca-file=/etc/etcd/ca.crt \
--peers="https://172.16.4.18:2379,https://172.16.4.27:2379" \
ls /
----

. If you have additional etcd members to add to your cluster, continue to
xref:adding-addtl-etcd-members[Adding Additional etcd Members].
Otherwise, if you only want a single node external etcd, continue to
xref:bringing-openshift-services-back-online[Bringing {product-title}
Services Back Online].

[[adding-addtl-etcd-members]]
=== Adding Additional etcd Members

To add additional etcd members to the cluster, you must first adjust the default
*localhost* peer in the `*peerURLs*` value for the first member:

. Get the member ID for the first member using the `member list` command:
+
----
# etcdctl --cert-file=/etc/etcd/peer.crt \
--key-file=/etc/etcd/peer.key \
--ca-file=/etc/etcd/ca.crt \
--peers="https://172.18.1.18:2379,https://172.18.9.202:2379,https://172.18.0.75:2379" \
member list
----

. Update the value of `*peerURLs*` using the `etcdctl member update` command by
passing the member ID obtained from the previous step:
+
----
# etcdctl --cert-file=/etc/etcd/peer.crt \
--key-file=/etc/etcd/peer.key \
--ca-file=/etc/etcd/ca.crt \
--peers="https://172.18.1.18:2379,https://172.18.9.202:2379,https://172.18.0.75:2379" \
member update 511b7fb6cc0001 https://172.18.1.18:2380
----
+
Alternatively, you can use `curl`:
+
----
# curl --cacert /etc/etcd/ca.crt \
--cert /etc/etcd/peer.crt \
--key /etc/etcd/peer.key \
https://172.18.1.18:2379/v2/members/511b7fb6cc0001 \
-XPUT -H "Content-Type: application/json" \
-d '{"peerURLs":["https://172.18.1.18:2380"]}'
----

. Re-run the `member list` command and ensure the peer URLs no longer include
*localhost*.
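+
The output should now list the member's own peer URL, along the lines of the following (the ID, name, and addresses are illustrative):
+
----
511b7fb6cc0001: name=172.18.1.18 peerURLs=https://172.18.1.18:2380 clientURLs=https://172.18.1.18:2379
----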

. Now, add each additional member to the cluster one at a time.
+
[WARNING]
====
Each member must be fully added and brought online one at a time. When adding
each additional member to the cluster, the `*peerURLs*` list must be correct for
that point in time, so it will grow by one for each member added. The `etcdctl
member add` command will output the values that need to be set in the
*_etcd.conf_* file as you add each member, as described in the following
instructions.
====

.. For each member, add it to the cluster using the values that can be found in
that system's *_etcd.conf_* file:
+
----
# etcdctl --cert-file=/etc/etcd/peer.crt \
--key-file=/etc/etcd/peer.key \
--ca-file=/etc/etcd/ca.crt \
--peers="https://172.16.4.18:2379,https://172.16.4.27:2379" \
member add 10.3.9.222 https://172.16.4.27:2380

Added member named 10.3.9.222 with ID 4e1db163a21d7651 to cluster

ETCD_NAME="10.3.9.222"
ETCD_INITIAL_CLUSTER="10.3.9.221=https://172.16.4.18:2380,10.3.9.222=https://172.16.4.27:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
----

.. Using the environment variables provided in the output of the above `etcdctl
member add` command, edit the *_/etc/etcd/etcd.conf_* file on the member system
itself and ensure these settings match.
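+
For the sample output above, the resulting entries in *_/etc/etcd/etcd.conf_* on the new member would look like the following (note in particular that `ETCD_INITIAL_CLUSTER_STATE` must be `existing`, not `new`):
+
----
ETCD_NAME="10.3.9.222"
ETCD_INITIAL_CLUSTER="10.3.9.221=https://172.16.4.18:2380,10.3.9.222=https://172.16.4.27:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
----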

.. Now start etcd on the new member:
+
----
# rm -rf /var/lib/etcd/member
# systemctl enable etcd
# systemctl start etcd
----

.. Ensure the service starts correctly and the etcd cluster is now healthy:
+
----
# etcdctl --cert-file=/etc/etcd/peer.crt \
--key-file=/etc/etcd/peer.key \
--ca-file=/etc/etcd/ca.crt \
--peers="https://172.16.4.18:2379,https://172.16.4.27:2379" \
member list

51251b34b80001: name=10.3.9.221 peerURLs=https://172.16.4.18:2380 clientURLs=https://172.16.4.18:2379
d266df286a41a8a4: name=10.3.9.222 peerURLs=https://172.16.4.27:2380 clientURLs=https://172.16.4.27:2379

# etcdctl --cert-file=/etc/etcd/peer.crt \
--key-file=/etc/etcd/peer.key \
--ca-file=/etc/etcd/ca.crt \
--peers="https://172.16.4.18:2379,https://172.16.4.27:2379" \
cluster-health

cluster is healthy
member 51251b34b80001 is healthy
member d266df286a41a8a4 is healthy
----

.. Now repeat this process for the next member to add to the cluster.

. After all additional etcd members have been added, continue to
xref:bringing-openshift-services-back-online[Bringing {product-title}
Services Back Online].

[[backup-restore-adding-etcd-hosts]]
== Adding New etcd Hosts

2 changes: 1 addition & 1 deletion admin_guide/manage_nodes.adoc
@@ -116,7 +116,7 @@ installation] method for instructions on running the playbook directly.

ifdef::openshift-enterprise[]
Alternatively, if you used the quick installation method, you can
xref:../install_config/install/quick_install.adoc#adding-nodes-or-reinstalling-quick[re-run
xref:../install_config/adding_hosts_to_existing_cluster.adoc#adding-nodes-or-reinstalling-quick[re-run
the installer to add nodes], which performs the same steps.
endif::[]

12 changes: 11 additions & 1 deletion admin_guide/managing_networking.adoc
@@ -283,6 +283,16 @@ different, and the egress network policy may not be enforced as expected. In the
above example, suppose `www.foo.com` resolved to `10.11.12.13` and has a DNS TTL
of one minute, but was later changed to `20.21.22.23`. {product-title} will then
take up to one minute to adapt to these changes.
+
[NOTE]
====
The egress firewall always allows pods access to the external interface of the
node the pod is on for DNS resolution. If DNS resolution is not handled by
something on the local node and your pods use domain names, you must add egress
firewall rules that allow access to your DNS server's IP addresses. The
xref:../install_config/install/quick_install.adoc#install-config-install-quick-install[default installer]
sets up a local dnsmasq, so if you are using that setup, you do not need to add
extra rules.
====
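+
If, for example, your pods resolve names through an external DNS server at 10.11.12.53 (an illustrative address), the policy would need an additional `Allow` entry in its `egress` list, placed before any `Deny` rules, along these lines:
+
----
{
    "type": "Allow",
    "to": {
        "cidrSelector": "10.11.12.53/32"
    }
}
----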

. Use the JSON file to create an EgressNetworkPolicy object:
+
@@ -1232,4 +1242,4 @@ services to include this site in their HSTS preload lists. For example, sites
such as Google can construct a list of sites that have `preload` set. Browsers
can then use these lists to determine which sites to only talk to over HTTPS,
even before they have interacted with the site. Without `preload` set, they need
to have talked to the site over HTTPS to get the header.
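
For reference, a response header with `preload` set looks something like the following (the `max-age` value is illustrative):

----
Strict-Transport-Security: max-age=31536000;includeSubDomains;preload
----
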
5 changes: 3 additions & 2 deletions admin_solutions/master_node_config.adoc
@@ -29,7 +29,8 @@ files define a wide range of options that can be configured on the {product-title}
xref:../architecture/infrastructure_components/kubernetes_infrastructure.adoc#master[master] and
xref:../architecture/infrastructure_components/kubernetes_infrastructure.adoc#node[nodes]. These options include overriding the default plug-ins, connecting to etcd, automatically creating service accounts, building image names, customizing project requests, configuring volume plug-ins, and much more.

== How Many Masters Do I Need?
[[master-node-config-prereq]]
== Prerequisites
For testing environments deployed via the
xref:../install_config/install/quick_install.adoc#install-config-install-quick-install[quick install], one master should be sufficient. The quick installation method should not be used for production environments.

@@ -56,7 +57,7 @@ method using Ansible, then make your configuration changes
xref:../admin_solutions/master_node_config.adoc#master-node-config-ansible[in the Ansible playbook].
- xref:../install_config/install/quick_install.adoc#install-config-install-quick-install[Quick installation]
ifdef::openshift-origin[]
or https://docs.openshift.org/latest/getting_started/administrators.html[Manual installation]
or link:https://docs.openshift.org/latest/getting_started/administrators.html[Manual installation]
endif::openshift-origin[]
method, then make your changes
xref:../admin_solutions/master_node_config.adoc#master-node-config-manual[manually in the configuration files] themselves.