Fix keyword to watch SM cluster provisioning logs and increase timeout (red-hat-data-services#1754)

* draft fix hive watcher

* do not print dots when process is not producing new outputs

* wait for ns to be created before looking for logs

* wait until logs are available to be fetched

* fix wait for pod

* fix wait kw call

* clean new code

* increase provisioning timeout

* extend label selector for cd deployments

* use job type label for cluster pool deployments

* fix catenation separator

* remove commented line

* fix robocop alerts

* minor change

* increase timeout further

* apply suggestions from reviews

* remove wrong return variables

* fix robocop alert

* remove debug lines
bdattoma authored and tonyxrmdavidson committed Sep 4, 2024
1 parent 670ec5e commit 1fe06e5
Showing 1 changed file with 21 additions and 37 deletions.
58 changes: 21 additions & 37 deletions ods_ci/tasks/Resources/Provisioning/Hive/provision.robot
@@ -150,43 +150,27 @@ Create Floating IPs
     Export Variables From File    ${fips_file_to_export}

 Watch Hive Install Log
-    [Arguments]    ${namespace}    ${install_log_file}    ${hive_timeout}=50m
-    WHILE    True    limit=${hive_timeout}    on_limit_message=Hive Install ${hive_timeout} Timeout Exceeded    # robotcode: ignore
-        ${old_log_data} =    Get File    ${install_log_file}
-        ${last_line_index} =    Get Line Count    ${old_log_data}
-        IF    ${use_cluster_pool}
-            ${pod} =    Oc Get    kind=Pod    namespace=${namespace}
-        ELSE
-            ${pod} =    Oc Get    kind=Pod    namespace=${namespace}
-            ...    label_selector=hive.openshift.io/cluster-deployment-name=${cluster_name}
-        END
-        TRY
-            ${new_log_data} =    Oc Get Pod Logs    name=${pod[0]['metadata']['name']}    container=installer    namespace=${namespace}
-        EXCEPT
-            # Hive container (OCP installer log) is not ready yet
-            Log To Console    .    no_newline=true
-            Sleep    10s
-            CONTINUE
-        END
-        # Print the new lines that were added to the installer log
-        @{new_lines} =    Split To Lines    ${new_log_data}    ${last_line_index}
-        ${lines_count} =    Get length    ${new_lines}
-        IF    ${lines_count} > 0
-            Create File    ${install_log_file}    ${new_log_data}
-            FOR    ${line}    IN    @{new_lines}
-                Log To Console    ${line}
-            END
-        ELSE
-            ${hive_pods_status} =    Run And Return Rc    oc get pod -n ${namespace} --no-headers | awk '{print $3}' | grep -v 'Completed'
-            IF    ${hive_pods_status} != 0
-                Log    All Hive pods in ${namespace} have completed    console=True
-                BREAK
-            END
-            Log To Console    .    no_newline=true
-            Sleep    10s
-        END
+    [Arguments]    ${pool_name}    ${namespace}    ${hive_timeout}=70m
+    ${label_selector} =    Set Variable    hive.openshift.io/cluster-deployment-name=${cluster_name}
+    IF    ${use_cluster_pool}
+        ${label_selector} =    Set Variable    hive.openshift.io/clusterpool-name=${pool_name}
+    END
+    ${label_selector} =    Catenate    SEPARATOR=    ${label_selector}    ,hive.openshift.io/job-type=provision
+    ${logs_cmd} =    Set Variable    oc logs -f -l ${label_selector} -n ${namespace}
+    Wait For Pods To Be Ready    label_selector=${label_selector}    namespace=${namespace}    timeout=5m
+    TRY
+        ${return_code} =    Run And Watch Command    ${logs_cmd}    timeout=${hive_timeout}
+        ...    output_should_contain=install completed successfully
+    EXCEPT
+        Log To Console    ERROR: Check Hive Logs if present or you may have hit timeout ${hive_timeout}.
+    END
+    Should Be Equal As Integers    ${return_code}    ${0}
+    ${hive_pods_status} =    Run And Return Rc
+    ...    oc get pod -n ${namespace} --no-headers | awk '{print $3}' | grep -v 'Completed'
+    IF    ${hive_pods_status} != 0
+        Log    All Hive pods in ${namespace} have completed    console=True
     END
-    Should Contain    ${new_log_data}    install completed successfully
+    Sleep    10s    reason=Let's wait some seconds before proceeding with next checks

 Wait For Cluster To Be Ready
     IF    ${use_cluster_pool}
@@ -201,7 +185,7 @@ Wait For Cluster To Be Ready
     END
     ${install_log_file} =    Set Variable    ${artifacts_dir}/${cluster_name}_install.log
     Create File    ${install_log_file}
-    Run Keyword And Ignore Error    Watch Hive Install Log    ${pool_namespace}    ${install_log_file}
+    Run Keyword And Continue On Failure    Watch Hive Install Log    ${pool_name}    ${pool_namespace}
     Log    Verifying that Cluster '${cluster_name}' has been provisioned and is running according to Hive Pool namespace '${pool_namespace}'    console=True    # robocop: disable:line-too-long
     ${provision_status} =    Run Process
     ...    oc -n ${pool_namespace} wait --for\=condition\=ProvisionFailed\=False cd ${clusterdeployment_name} --timeout\=15m    # robocop: disable:line-too-long
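
Note on the change: the rewritten keyword hands log streaming to Run And Watch Command, whose implementation is not part of this diff. As a rough illustration only, the Python sketch below shows one way a helper of that kind could stream a command's output as it arrives, enforce a timeout, and check for an expected string. The function name run_and_watch, its parameters, and its return value are assumptions made for this example; they are not taken from ods-ci.

```python
import subprocess
import sys
import threading


def run_and_watch(cmd, timeout_s, expected=None):
    """Run cmd (a list of arguments), echo its output line by line, and
    enforce a timeout.

    Hypothetical helper for illustration; not the ods-ci keyword.
    Returns (return_code, saw_expected).
    """
    proc = subprocess.Popen(
        cmd,
        text=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    # Kill the process if it outlives the timeout; the read loop below
    # then reaches end-of-file and exits.
    timer = threading.Timer(timeout_s, proc.kill)
    timer.start()
    saw_expected = expected is None
    try:
        for line in proc.stdout:
            # Output is written only when the process produces a new line,
            # so nothing is printed while it is idle.
            sys.stdout.write(line)
            if expected is not None and expected in line:
                saw_expected = True
        return_code = proc.wait()
    finally:
        timer.cancel()
    return return_code, saw_expected
```

Under these assumptions, a caller would pass something like ["oc", "logs", "-f", "-l", label_selector, "-n", namespace] together with the provisioning timeout in seconds, then treat a non-zero return code or a missing "install completed successfully" line as a failure.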