Troubleshooting the Operator Cert Pipeline

Manual Pull Request (Red Hat Tested)

set-github-status-pending

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

set-env

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

checkout

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

validate-pr-title

Pull Request Title

When you create a pull request manually, its title must follow a predefined format.

| Prefix | Package Name | Version |
| --- | --- | --- |
| The word operator | Operator package name | Version in parentheses. DO NOT use a 'v' prefix. |

Note: The version string in your PR Title must match the version directory in your Operator Bundle.

Examples

operator simple-demo-operator (0.0.0)

operator hello-world-certified (1.2.3)

operator my-operator (3.2.1)
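
The title format above can be checked locally before opening a pull request. The sketch below is a hypothetical helper, not the pipeline's own check: the function name `valid_pr_title` and the regular expression are assumptions derived from the examples, and it follows the "no v prefix" guidance.

```shell
# Hypothetical helper: succeeds when a title looks like
# "operator <package-name> (<version>)" with no 'v' prefix.
# The regular expression is an assumption, not the pipeline's actual rule.
valid_pr_title() {
  printf '%s\n' "$1" \
    | grep -Eq '^operator [a-z0-9][a-z0-9.-]* \([0-9]+\.[0-9]+\.[0-9]+[^ ()]*\)$'
}

valid_pr_title "operator simple-demo-operator (0.0.0)" && echo "title accepted"
valid_pr_title "operator simple-demo-operator (v0.0.0)" || echo "title rejected"
```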

get-bundle-path

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

bundle-path-validation

Failures at this step are uncommon. Check that your package name is consistent with your PR title and that it matches your Operator folder name. If the package name is consistent and you still experience a failure or error at this step, contact Red Hat Support.

certification-project-check

This step confirms that your ci.yaml contains a cert_project_id value.

content-hash

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

create-support-link-for-pr

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

get-cert-project-related-data

Make sure the cert_project_id in your ci.yaml file is formatted correctly and references the right project in your Partner Connect account.

cert_project_id: 6804256accf2367227abc887612ffc5567

Note: Do not include the ospid- prefix, and remove any dashes/hyphens.
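
As a minimal sketch of that note, the hypothetical function below strips a leading ospid- prefix and any dashes from an ID. The function name and the dashed input value are made up for illustration; the output is the value you would put in ci.yaml.

```shell
# Hypothetical helper: strip the "ospid-" prefix and all dashes.
normalize_cert_project_id() {
  printf '%s\n' "${1#ospid-}" | sed 's/-//g'
}

# Illustrative input only; prints 6804256accf2367227abc887612ffc5567
normalize_cert_project_id "ospid-6804256a-ccf2-3672-27ab-c887612ffc5567"
```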

Make sure you have added the Authorized GitHub Usernames in Connect. See the Authorized GitHub Usernames section below for more detail.

submission-validation

Check the pipeline.log file for either of these two possible causes:

  1. You may have only one open pull request at a time. Opening a second pull request while another is still open results in a failure.
  2. The pull request was opened by a user who is not listed in the Authorized GitHub user accounts field on the Project settings page. Verify that all required usernames are specified in that field, separated by commas. Read the Authorized GitHub Usernames section below for more details.

Authorized GitHub Usernames

Any GitHub username or GitHub organization used to submit a pull request must be entered in the GitHub Authorized Users field on the Project settings page in connect.redhat.com.

| Fork URL | User or Org |
| --- | --- |
| https://github.com/my-github-user/certified-operators.git | my-github-user |
| https://github.com/my-github-organization/certified-operators.git | my-github-organization |
| https://github.com/my-github-user/redhat-marketplace-operators.git | my-github-user |
| https://github.com/my-github-organization/redhat-marketplace-operators.git | my-github-organization |
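
If you are unsure which name to enter, it is the owner segment of your fork URL. The sketch below extracts it; `fork_owner` is a hypothetical helper name, and handling of the SSH URL form is an added assumption.

```shell
# Hypothetical helper: print the user or organization that owns a fork.
fork_owner() {
  local url="${1#https://github.com/}"   # drop the HTTPS host part, if present
  url="${url#git@github.com:}"           # drop the SSH host part, if present
  printf '%s\n' "${url%%/*}"             # keep everything before the first "/"
}

# Prints my-github-user
fork_owner "https://github.com/my-github-user/certified-operators.git"
```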

Once you have identified your GitHub username or organization, follow the instructions below to add it to Connect:

  1. Navigate to connect.redhat.com
  2. Click the Login button
  3. Click the Log in for technology partners button
  4. Click Product Certification > Manage certification projects
  5. Click on the Project link for your Operator Bundle Image
  6. Click on the settings tab
  7. Add your GitHub users/organizations to the Authorized GitHub user accounts field.

merge-registry-credentials

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

update-cert-project-status

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

reserve-operator-name

This step makes sure that the package name of your Operator belongs to your cert_project_id. It fails if another cert_project_id has already claimed the package name used in this pull request.

get-supported-versions

If you also have a failure at the annotations-validation step, resolve the annotations-validation issues first and then retry. Doing so may resolve failures at this step as well.

Otherwise, failures at this step may be caused by several issues:

  • Your annotations.yaml file needs to include a com.redhat.openshift.versions annotation indicating the OpenShift versions supported by your Operator.
  • The filename for your clusterserviceversion.yaml must be prefixed with your Operator's package name.
  • The value of your operators.operatorframework.io.bundle.package.v1 annotation in the annotations.yaml must equal your Operator's package name.
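
The three conditions above can be checked locally with a short script. This is a sketch under assumptions, not what the pipeline runs: `check_bundle` is a hypothetical name, and it expects a version directory that contains metadata/annotations.yaml and a manifests folder.

```shell
# Hypothetical helper: check_bundle <package-name> <version-directory>
check_bundle() {
  local pkg="$1" bundle_dir="$2" rc=0
  local annotations="$bundle_dir/metadata/annotations.yaml"

  # 1. The OpenShift versions annotation must be present.
  grep -q 'com\.redhat\.openshift\.versions' "$annotations" 2>/dev/null \
    || { echo "missing com.redhat.openshift.versions annotation"; rc=1; }

  # 2. The package annotation must equal the package name.
  grep -Eq "operators\.operatorframework\.io\.bundle\.package\.v1:[[:space:]]*\"?${pkg}\"?[[:space:]]*$" \
    "$annotations" 2>/dev/null \
    || { echo "package annotation does not equal $pkg"; rc=1; }

  # 3. The CSV filename must be prefixed with the package name.
  ls "$bundle_dir"/manifests/"$pkg"*.clusterserviceversion.yaml >/dev/null 2>&1 \
    || { echo "no CSV file prefixed with $pkg"; rc=1; }

  return "$rc"
}

# Usage (paths are illustrative):
#   check_bundle simple-demo-operator operators/simple-demo-operator/1.2.3
```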

annotations-validation

The version used in your Pull Request title will be the version directory used to look up your annotations.yaml file. A common mistake is to use a v in the version of your PR title without also using a v prefix for the version folder in your Operator Bundle.

| Pull Request Title | Version directory name |
| --- | --- |
| operator simple-demo-operator (v1.2.3) | v1.2.3 |
| operator simple-demo-operator (1.2.3) | 1.2.3 |

Note: If you are certifying an operator bundle for the Red Hat Marketplace (Powered by IBM), you must add two more annotations to your annotations.yaml file.

| Annotation keyword | Required string value |
| --- | --- |
| marketplace.openshift.io/remote-workflow: | https://marketplace.redhat.com/en-us/operators/<package-name>/pricing?utm_source=openshift_console |
| marketplace.openshift.io/support-workflow: | https://marketplace.redhat.com/en-us/operators/<package-name>/support?utm_source=openshift_console |

Omitting these annotations is a common cause of failures where an annotations.yaml file worked fine in a Certified Operators Bundle but now fails Red Hat Marketplace bundle certification.

Also double-check that your package name remains consistent throughout the bundle metadata, particularly in the annotations.yaml file.

digest-pinning

If you also have a failure at the annotations-validation step, resolve the annotations-validation issues first and then retry. Doing so may resolve failures at this step as well.

All images referenced in your Operator Bundle must use SHA digests, not tags. Any tag in your bundle will cause a certification failure. Replace all image tags with image digests.

| Unpinned Example | Pinned Example |
| --- | --- |
| quay.io/my_repo/my_image:v1.0.0 | quay.io/my_repo/my_image@sha256:fd8d827d4d345ec327cb92d30086a17a2e08ba9c3163db4a25bfe2512123fd6a |
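
A quick local scan can surface unpinned references before you submit. The sketch below is a hypothetical filter, not the pipeline's check: it only inspects image: keys, so references held elsewhere (for example in environment variable values) would need a separate look.

```shell
# Hypothetical filter: read YAML on stdin and print every "image:" value
# that is not pinned to a sha256 digest.
unpinned_images() {
  grep -E '^[[:space:]]*(-[[:space:]]+)?image:' \
    | sed -E 's/^[[:space:]]*(-[[:space:]]+)?image:[[:space:]]*//; s/[[:space:]]+$//' \
    | grep -Ev '@sha256:[0-9a-f]{64}$' || true
}

# Prints only the tag-based reference.
printf '%s\n' \
  'image: quay.io/my_repo/my_image:v1.0.0' \
  'image: quay.io/my_repo/my_image@sha256:fd8d827d4d345ec327cb92d30086a17a2e08ba9c3163db4a25bfe2512123fd6a' \
  | unpinned_images
```

To scan a real file, pipe it in, for example `unpinned_images < manifests/simple-demo-operator.clusterserviceversion.yaml`.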

In addition, your clusterserviceversion.yaml must include a relatedImages section in a format similar to the one below.

Note: This section can be created at bundle generation time by adding the --use-image-digests flag to the operator-sdk generate bundle command. If the project was scaffolded with operator-sdk, running USE_IMAGE_DIGESTS=true make bundle will also generate this section.

...
spec:
  relatedImages:
    - name: etcd-operator
      image: quay.io/etcd-operator/operator@sha256:d134a9865524c29fcf75bbc4469013bc38d8a15cb5f41acfddb6b9e492f556e4
    - name: etcd-image
      image: quay.io/etcd-operator/etcd@sha256:13348c15263bd8838ec1d5fc4550ede9860fcbb0f843e48cbccec07810eebb68
...

verify-changed-directories

Your pull request should only add files and must not modify any files that have already been merged. Make sure all changed files reside in a single version directory that matches the version used in the title of your pull request.
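
One way to preview this rule locally is to filter the output of git diff --name-status. The sketch below is an approximation under assumptions: the function name is hypothetical, the pipeline's actual rules may differ, and files at the package level (such as ci.yaml) are reported as unexpected even where the pipeline may accept them.

```shell
# Hypothetical helper: read `git diff --name-status` output on stdin and
# succeed only if every entry is an added file under
# operators/<package>/<version>/.
only_adds_in_version_dir() {
  local pkg="$1" version="$2" status path rc=0 tab
  tab=$(printf '\t')
  while IFS="$tab" read -r status path; do
    case "$status:$path" in
      "A:operators/$pkg/$version/"*) ;;   # an added file in the right directory
      *) echo "unexpected change: $status $path"; rc=1 ;;
    esac
  done
  return "$rc"
}

# Usage (assumes a remote named "upstream" that points at the production repo):
#   git diff --name-status upstream/main...HEAD \
#     | only_adds_in_version_dir my-operator 1.2.3
```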

yaml-lint

If you also have a failure at the annotations-validation step, resolve the annotations-validation issues first and then retry. Doing so may resolve failures at this step as well.

Warnings at this step should be addressed if possible but won't result in a failure.
Errors at this step will need to be addressed. Often errors center around unexpected whitespace at the end of lines or missing newlines at the end of your yaml files.
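
The two common errors named above can be caught before pushing. This sketch covers only trailing whitespace and a missing final newline, not the full rule set of the lint step; `check_yaml_whitespace` is a hypothetical name.

```shell
# Hypothetical helper: flag trailing whitespace and a missing final newline.
check_yaml_whitespace() {
  local file rc=0
  for file in "$@"; do
    # grep -n prints the offending lines with their line numbers.
    if grep -nE '[[:space:]]+$' "$file"; then
      echo "$file: trailing whitespace on the lines above"
      rc=1
    fi
    # A non-empty file whose last byte is not a newline yields non-empty output.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
      echo "$file: no newline at end of file"
      rc=1
    fi
  done
  return "$rc"
}

# Usage (path is illustrative):
#   check_yaml_whitespace operators/my-operator/1.2.3/*/*.yaml
```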

verify-pinned-digest

See Digest Pinning

This step checks to ensure that all your container images are using SHA digests instead of tags.
This step also checks for the existence of a spec.relatedImages section in your Cluster Service Version (CSV).

More information on formatting clusterserviceversion.yaml files can be found here.

dockerfile-creation

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

build-bundle

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

generate-index

Failures at this step are uncommon.

If you are updating your Operator to be compatible with OpenShift v4.9, and your Operator was previously removed from the 4.9 Operator index, be sure you are following the guidance outlined here. A common cause of error is to use 'replaces' to replace a version of your Operator that was removed from the index.

If you still experience a failure or error at this step, contact Red Hat Support.

make-bundle-repo-public

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

build-index

Failures at this step are uncommon, and when they do occur they are often transient.

Click the Close pull request button in GitHub, then click the Reopen pull request button. Closing and reopening your pull request restarts the Pipeline. If your PR fails at this step twice in a row, contact Red Hat Support.

make-index-repo-public

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

get-ci-results-attempt

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

preflight-trigger

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

Note: There is a known issue if your Operator only supports OpenShift 4.7 or below. In this case we recommend using the CI Pipeline.

upload-artifacts

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

get-ci-results

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

link-pull-request

Failures at this step are uncommon. If you do experience a failure or error at this step, contact Red Hat Support.

verify-ci-results

Issues at this step typically point to failures in the Preflight checks, which confirm your Operator's adherence to the certification policy. Additional logs may be available at connect.redhat.com on the Test Results page for your Project.

Failures here may be caused by multiple issues:

  • In your CSV (clusterserviceversion.yaml), ensure that your alm-examples annotation is correct. Look out for incorrect spacing or tabs, which may inadvertently place other annotations under alm-examples.
  • In alm-examples, make sure all your CRs have a spec block.
  • A failure here may indicate that we were unable to deploy your Operator using the Operator Lifecycle Manager (OLM).
  • Ensure all of your related images are certified and published.

query-publishing-checklist

Failures here usually point to an incomplete checklist item. Log in to connect.redhat.com and complete all the items listed under the Project publishing checklist.

merge-pr

This step may fail if the pull request is a draft. Convert the draft to a regular pull request and then retry. If the problem persists, contact Red Hat Support.

verify-project-distribution

Failures at this step suggest a mismatch between the destination catalog (e.g., "Red Hat Certified" or "Red Hat Marketplace") selected during project setup on connect.redhat.com and the GitHub repository your pull request targets. The appropriate repositories are listed below:


Package Name

Your Operator's package name must be used consistently in three areas:

  • Name of your folder under the operators directory in your fork
  • Value of the operators.operatorframework.io.bundle.package.v1 annotation in annotations.yaml
  • Prefix of the filename for your clusterserviceversion.yaml file

Automated Pull Request (Tested on Partner Premise)

Make sure you are using the latest version of the Pipeline

The Pipeline is updated regularly with fixes and enhancements, so make sure you are running the latest version.

Soft reload of the Pipeline using oc apply

git clone https://github.com/redhat-openshift-ecosystem/operator-pipelines
cd operator-pipelines
oc apply -R -f ansible/roles/operator-pipeline/templates/openshift/pipelines
oc apply -R -f ansible/roles/operator-pipeline/templates/openshift/tasks

Hard reload of the Pipeline using oc delete and oc create

git clone https://github.com/redhat-openshift-ecosystem/operator-pipelines
cd operator-pipelines
oc delete -R -f ansible/roles/operator-pipeline/templates/openshift/pipelines
oc delete -R -f ansible/roles/operator-pipeline/templates/openshift/tasks
oc create -R -f ansible/roles/operator-pipeline/templates/openshift/pipelines
oc create -R -f ansible/roles/operator-pipeline/templates/openshift/tasks

Make sure we are using Production

Partners who were a part of the alpha testing and beta testing may still be using GitHub forks of the preprod repos. Make sure you are now using the production repo for your GitHub fork.

Prod Repo is https://github.com/redhat-openshift-ecosystem/certified-operators

In addition, your tkn command should reference the production environment:

... any other params
--param git_repo_url=<your fork of the prod repo>
--param git_branch=main
--param env=prod
--param upstream_repo_name=redhat-openshift-ecosystem/certified-operators
... any other params

Get a clean kubeconfig

The kubeconfig you are using may contain multiple contexts with certs for multiple clusters. Using that kubeconfig as is may cause issues on your OpenShift cluster. To make sure you are using a clean kubeconfig that targets your OpenShift CI cluster, use the script below.

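# --minify keeps only the current context, so switch to your CI cluster context first
oc config view --minify --flatten > new-kubeconfig
export KUBECONFIG="$PWD/new-kubeconfig"
oc delete secret kubeconfig --ignore-not-found
oc create secret generic kubeconfig --from-file=kubeconfig="$KUBECONFIG"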

Digest pinning fatal: could not read Username for 'https://github.com': No such device or address

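This error appears when git attempts an HTTPS operation that needs credentials and cannot prompt for them. Use the GitHub SSH URL for your repository, or set .gitconfig to enforce SSH.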
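
One common way to enforce SSH through .gitconfig is git's url.<base>.insteadOf setting, sketched below. Whether this is the exact configuration the authors intended is an assumption, and it presumes an SSH key with access to your fork is available wherever git runs.

```shell
# Rewrite GitHub HTTPS URLs to SSH for every git command run by this user.
git config --global url."git@github.com:".insteadOf "https://github.com/"

# Confirm that the rewrite rule is in place.
git config --global --get-regexp '^url\..*\.insteadof$'
```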

Add access to multiple container registries.

If you use multiple container registries, you will likely need to provide credentials for each. The sample script below can be modified to accommodate any number of registries.

This issue is most often seen in the Digest Pinning task and the Build Index task. If you are seeing errors indicating registry access issues, the following might resolve it.

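#!/bin/bash

# tr -d '\n' removes the line wrapping that base64 adds to long values such as tokens
CRED1=$(echo -n "myuser1:mypasswd1" | base64 | tr -d '\n')
CRED2=$(echo -n "myuser2:mypasswd2" | base64 | tr -d '\n')
CRED3=$(echo -n "myuser3:mypasswd3" | base64 | tr -d '\n')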


cat << EOF > auth.yml
{
        "auths": {
                "registry1": {
                        "auth": "$CRED1"
                },
                "registry2": {
                        "auth": "$CRED2"
                },
                "registry3": {
                        "auth": "$CRED3"
                }
        }

}
EOF

# Encode on a single line; wrapped base64 output would break the YAML below
CONFIGJSON=$(base64 < auth.yml | tr -d '\n')

cat << EOF > regcred.yml
apiVersion: v1
kind: Secret
metadata:
  name: registry-dockerconfig-secret
data:
  .dockerconfigjson: $CONFIGJSON
type: kubernetes.io/dockerconfigjson
EOF

#########################################
## If adding to a local OpenShift Cluster
# oc create -f regcred.yml
##########################################
# If adding to connect.redhat.com 
# copy the contents of the regcred.yml to the 
# OpenShift Object YAML field on the Project settings page

Cannot find CSV file

Make sure the package annotation in your annotations.yaml file matches the prefix of the clusterserviceversion.yaml filename.

For example, if your Operator is called simple-demo-operator, then the following should be set:

  1. metadata/annotations.yaml
...
operators.operatorframework.io.bundle.package.v1: simple-demo-operator
...
  2. manifests/simple-demo-operator.clusterserviceversion.yaml

Get a Red Hat registry service account token

Instructions

https://access.redhat.com/RegistryAuthentication#creating-registry-service-accounts-6

Create a Registry Service Account

https://access.redhat.com/terms-based-registry/

Upon creating a Registry Service Account you will be given a username similar to the one below

username: 123456789:my-sa-acctname

And a token.

eyJhbGc.................

Using this username, with the token as your password, you can follow the instructions above for adding access to multiple container registries.

Verify Pinned Digest Step Fails

Digest pinning will create a pinned version of your CSV that uses SHA Digest for all images. Your manually pinned CSV must match exactly, as tested by git diff --stat, what is created by the Digest Pinning tool. Sometimes the Digest Pinning Tool may add duplicate entries using different names in the relatedImages section.

If you are using the CI Pipeline, adding --param pin_digest=true will avoid this issue. If you are submitting a PR manually, you may hit this issue.

404 Error when attempting to open a PR with the CI Pipeline

  • Make sure you have a secret named github-api-secret that contains a GitHub personal access token
  • Make sure the GitHub personal access token has the Repo scope selected, which should also select all scopes under Repo
    repo                  Full control of private repositories
      repo:status         Access commit status
      repo_deployment     Access deployment status
      public_repo         Access public repositories
      repo:invite         Access repository invitations
      security_events     Read and write security events
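
For a classic personal access token, GitHub reports the granted scopes in the X-OAuth-Scopes response header, which gives a quick way to confirm the repo scope. The sketch below parses that header from stdin. `has_repo_scope` is a hypothetical helper and GITHUB_TOKEN is a placeholder; fine-grained tokens do not report scopes this way.

```shell
# Hypothetical helper: read HTTP response headers on stdin and succeed
# if the X-OAuth-Scopes header lists the full "repo" scope.
has_repo_scope() {
  grep -i '^x-oauth-scopes:' \
    | tr -d '\r' \
    | cut -d: -f2- \
    | tr ',' '\n' \
    | sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//' \
    | grep -qx 'repo'
}

# Usage (GITHUB_TOKEN is a placeholder for your personal access token):
#   curl -sI -H "Authorization: token $GITHUB_TOKEN" https://api.github.com/user \
#     | has_repo_scope && echo "repo scope present"
```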