pkg/types: Add cross-platform Networking.MachineCIDR #983

Merged

wking
Member

@wking wking commented Jan 3, 2019

The concept was not platform-specific, and it's simpler to track the cross-platform value in a single, generic location.

Spun off from #792; see the thread starting here.

CC @abhinavdahiya.

@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 3, 2019
@wking wking force-pushed the platform-agnostic-machine-cidr branch from 544cf18 to 7156171 Compare January 3, 2019 09:27
@wking
Member Author

wking commented Jan 3, 2019

tf-fmt:

$ git fetch https://github.com/openshift/installer.git pull/983/head
fatal: unable to access 'https://github.com/openshift/installer.git/': Could not resolve host: github.com

/retest

@wking wking force-pushed the platform-agnostic-machine-cidr branch from 7156171 to f27c67b Compare January 3, 2019 20:03
@@ -96,7 +96,10 @@ type Networking struct {
// Type is the network type to install
Type netopv1.NetworkType `json:"type"`

// ServiceCIDR is the ip block from which to assign service IPs
// MachineCIDR is the IP address space from which to assign machine IPs.
MachineCIDR ipnet.IPNet `json:"machineCIDR"`
Contributor

What about MachineNetwork to match the convention that we are moving towards https://github.com/openshift/api/blob/master/config/v1/types_network.go#L26-L40

Member Author

> What about MachineNetwork to match the convention...

Are you ok with me breaking convention with the other properties we have here in Networking now (which use CIDR)? If we're going to make changes to the install-config, now would be the time to do that sort of thing. Personally, I think the fact that these are properties of a Networking structure is enough namespacing, so if we don't feel a need to point out CIDR-ness, I'd be fine with just:

type Networking struct {
  Clusters []netopv1.ClusterNetwork `json:"clusters,omitempty"`
  Machine ipnet.IPNet `json:"machine"`
  Service ipnet.IPNet `json:"service"`
  Type netopv1.NetworkType `json:"type"`
}
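
To make the naming question concrete, here is a minimal sketch of how the field name surfaces in the serialized install-config. It is not the installer's code: the `IPNet` wrapper below is a hypothetical stand-in for `ipnet.IPNet` (assumed to serialize as a CIDR string), `Type` is simplified to a plain string, and the `ExampleSDN` value is a placeholder. Only the `machineCIDR` tag comes from the diff above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
)

// IPNet is a hypothetical stand-in for the installer's ipnet.IPNet:
// a net.IPNet that is assumed to serialize as a CIDR string.
type IPNet struct {
	net.IPNet
}

// MarshalJSON encodes the network as a quoted CIDR string.
func (n IPNet) MarshalJSON() ([]byte, error) {
	return json.Marshal(n.IPNet.String())
}

// UnmarshalJSON decodes a quoted CIDR string into the network.
func (n *IPNet) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	_, parsed, err := net.ParseCIDR(s)
	if err != nil {
		return err
	}
	n.IPNet = *parsed
	return nil
}

// Networking is a simplified version of the structure under discussion.
// The JSON tag is the only place the "CIDR" suffix reaches users.
type Networking struct {
	Type        string `json:"type"`
	MachineCIDR IPNet  `json:"machineCIDR"`
}

// parseNetworking decodes a networking stanza from JSON.
func parseNetworking(data []byte) (Networking, error) {
	var n Networking
	err := json.Unmarshal(data, &n)
	return n, err
}

func main() {
	raw := []byte(`{"type": "ExampleSDN", "machineCIDR": "10.0.0.0/16"}`)
	n, err := parseNetworking(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(n.MachineCIDR.String()) // 10.0.0.0/16

	out, err := json.Marshal(n)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"type":"ExampleSDN","machineCIDR":"10.0.0.0/16"}
}
```

Renaming the property to `machine` or `machineNetwork` would, in a layout like this, only change the struct tag; any existing config that still uses the old key would stop populating the field.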

@wking wking force-pushed the platform-agnostic-machine-cidr branch from f27c67b to 7aee83e Compare January 3, 2019 23:40
@wking wking force-pushed the platform-agnostic-machine-cidr branch from 7aee83e to 31952f1 Compare January 4, 2019 19:41
@wking
Member Author

wking commented Jan 4, 2019

Rebased around #982 with 7aee83e -> 31952f1.

@crawford
Contributor

crawford commented Jan 4, 2019

This approach looks good to me.

/approve

@wking
Member Author

wking commented Jan 4, 2019

images:

info: Manifests will be extracted to /tmp/release-image-0.0.1-2019-01-04-195325372869831
error: unable to connect to image repository registry.svc.ci.openshift.org/ci-op-hlnzwq9h/stable@sha256:7e7e9074ae67ef2da6d80bb27880af271e0ca2043b563c9773933738a525adfd: Get https://registry.svc.ci.openshift.org/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2019/01/04 19:53:49 Container release in pod release-latest failed, exit code 1, reason Error
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:110: decoding init error from pipe caused \"read parent: connection reset by peer\""

/retest

@wking
Member Author

wking commented Jan 4, 2019

e2e-aws:

2019/01/04 21:09:09 Running pod e2e-aws
level=fatal msg="failed to fetch Terraform Variables: failed to load asset \"Install Config\": invalid \"install-config.yaml\" file: networking.machineCIDR: Invalid value: ipnet.IPNet{IPNet:net.IPNet{IP:net.IP(nil), Mask:net.IPMask(nil)}}: must use IPv4"

Looks like I need to wait for #902 or kludge something in...

wking added a commit to wking/openshift-release that referenced this pull request Jan 4, 2019
Preparing for openshift/installer@31952f15 (pkg/types: Add
cross-platform Networking.MachineCIDR, 2019-01-03,
openshift/installer#983).  Once that lands we can drop the
vpcCIDRBlock and NetworkCIDRBlock properties.
@sdodson
Member

sdodson commented Jan 4, 2019

Does this filter down to anything that would need to support multiple CIDR blocks?

@wking
Member Author

wking commented Jan 4, 2019

> Does this filter down to anything that would need to support multiple CIDR blocks?

The closest would probably be using this for the libvirt domain's network here, and choosing IPs from it here. But on all of our platforms, there's one network (e.g. the AWS VPC) that holds all the machines, so I expect we'll be fine with this until we get to stretch clusters or federation.

@wking
Member Author

wking commented Jan 4, 2019

openshift/release#2506 landed.

/retest

@wking
Member Author

wking commented Jan 4, 2019

Checking via https://api.ci.openshift.org/console/project/ci-op-bpcyl265/browse/pods/e2e-aws?tab=logs (because I'm the PR author), this is going to fail with at least:

fail [github.com/openshift/origin/test/extended/deployments/deployments.go:612]: Expected
    <string>: --> pre: Running hook pod ...
    test pre hook executed
    --> pre: Success
    --> Scaling custom-deployment-1 to 2
    --> Reached 50%
    Halfway
    --> pre: Hook pod already succeeded
to contain substring
  <string>: Finished
...

failed: (1m46s) 2019-01-04T23:24:47 "[Feature:DeploymentConfig] deploymentconfigs with custom deployments [Conformance] should run the custom deployment steps [Suite:openshift/conformance/parallel/minimal]"

From here, that test has been flaky for update payload promotion as well. I'll kick it again once it returns.

@crawford
Contributor

crawford commented Jan 7, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 7, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: crawford, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wking
Member Author

wking commented Jan 7, 2019

e2e-aws:

Flaky tests:

[sig-storage] Dynamic Provisioning DynamicProvisioner should test that deleting a claim before the volume is provisioned deletes the volume. [Suite:openshift/conformance/parallel] [Suite:k8s] [Suite:openshift/smoke-4]
[sig-storage] Dynamic Provisioning Invalid AWS KMS key should report an error and create no PV [Suite:openshift/conformance/parallel] [Suite:k8s]

Writing JUnit report to /tmp/artifacts/junit/junit_e2e_20190107-200258.xml

Error: 2 fail, 522 pass, 97 skip (27m37s)

/retest

@wking
Member Author

wking commented Jan 7, 2019

e2e-aws:

Flaky tests:

[Feature:DeploymentConfig] deploymentconfigs with test deployments [Conformance] should run a deployment to completion and then scale to zero [Suite:openshift/conformance/parallel/minimal]
[sig-auth] ServiceAccounts should allow opting out of API token automount  [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]

Writing JUnit report to /tmp/artifacts/junit/junit_e2e_20190107-212827.xml

Error: 2 fail, 522 pass, 97 skip (24m39s)

/retest

@wking
Member Author

wking commented Jan 8, 2019

e2e-aws:

Flaky tests:

[Feature:DeploymentConfig] deploymentconfigs with test deployments [Conformance] should run a deployment to completion and then scale to zero [Suite:openshift/conformance/parallel/minimal]
[sig-auth] ServiceAccounts should allow opting out of API token automount  [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]

Writing JUnit report to /tmp/artifacts/junit/junit_e2e_20190107-231944.xml

Error: 2 fail, 522 pass, 97 skip (25m39s)

Same failures as last time? Hmm.

/retest

@openshift-merge-robot openshift-merge-robot merged commit 87a44e7 into openshift:master Jan 8, 2019
@wking
Member Author

wking commented Jan 8, 2019

Hooray, finally got through CI 🎉

@wking wking deleted the platform-agnostic-machine-cidr branch January 8, 2019 02:02
@dgoodwin
Contributor

dgoodwin commented Jan 8, 2019

This change broke Hive, whenever possible please let us know if a non-backward compat change is made to install config. Will update to fix.

@wking
Member Author

wking commented Jan 8, 2019

> This change broke Hive, whenever possible please let us know if a non-backward compat change is made to install config.

I pinged @vrutkovs about the change here (although I could have done a better job emphasizing breaking-ness). Which other users/teams want to be CCed for future breaks? (although we're almost through the time when we're allowed to break this)

@dgoodwin
Contributor

dgoodwin commented Jan 8, 2019

Things are crazy and stuff gets missed, I can understand, but yeah hopefully all of this can stabilize soon.

tomassedovic added a commit to imain/ocp-doit that referenced this pull request Jan 8, 2019
This pull request removed the platform-specific
`openstack.NetworkCIDRBlock` and friends in favour of a cross-platform
`networking.machineCIDR`:

openshift/installer#983

Deployments using the old NetworkCIDRBlock are failing now.
wking added a commit to wking/openshift-release that referenced this pull request Jan 10, 2019
Now that openshift/installer@31952f15 (pkg/types: Add cross-platform
Networking.MachineCIDR, 2019-01-03, openshift/installer#983) has
landed, we can drop the vpcCIDRBlock and NetworkCIDRBlock properties.
This completes the transition started by 2a9a6a3
(ci-operator/templates/openshift: Set machineCIDR, 2019-01-04, openshift#2506).