Observation from my side:
This issue happens in the update scenario. The IRI machine and all of its IRI volumes are available, and then a problem occurs with one or more volumes (volumes without access or secret). The IRI volume list is then recalculated, but the recalculation logic is faulty and returns an empty list, so all disks are detached later in the flow. Returning the correct desired list of IRI volumes should fix this problem.
Describe the bug
We use Ceph disks only.
Occasionally all disks of a VM, including the OS disk, are detached from the VM.
The problem occurs in our region cluster (user API cluster -> region cluster -> compute cluster): machinepoollet deletes all disks on the region cluster, and the deletion is propagated to the compute cluster.
As the logs show, machinepoollet loads the volumes from the user API cluster correctly, but the volume reconcile logic returns an empty slice of desired volumes at this line: https://github.com/ironcore-dev/ironcore/blob/main/poollet/machinepoollet/controllers/machine_controller_volume.go#L315
This empty list of desired volumes is later used in the function getExistingIRIVolumesForMachine, which detaches the attached volumes because the desired list is empty.
The problematic code is in the function prepareIRIVolumes, at https://github.com/ironcore-dev/ironcore/blob/main/poollet/machinepoollet/controllers/machine_controller_volume.go#L233
If the number of iriVolumes differs from the number of volumes in the spec, the function does not return any desired volumes.
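The failure mode described above can be illustrated with a small standalone model. This is only a sketch and not the ironcore code: the `Volume` type and the functions `prepareDesired` and `toDetach` are hypothetical stand-ins, and the `Ready` flag is an assumed simplification of "volume has status and secret".

```go
package main

import "fmt"

// Volume is a hypothetical stand-in for a volume referenced in the machine spec.
type Volume struct {
	Name  string
	Ready bool // false models a volume without status or secret
}

// prepareDesired models the reported behavior: volumes that are not ready are
// skipped, and if anything was skipped the whole result is discarded.
func prepareDesired(spec []Volume) []string {
	var desired []string
	for _, v := range spec {
		if !v.Ready {
			continue
		}
		desired = append(desired, v.Name)
	}
	if len(desired) != len(spec) {
		return nil // count mismatch: no desired volumes are returned at all
	}
	return desired
}

// toDetach returns every attached volume that is missing from desired.
func toDetach(attached, desired []string) []string {
	want := make(map[string]bool, len(desired))
	for _, name := range desired {
		want[name] = true
	}
	var out []string
	for _, name := range attached {
		if !want[name] {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	spec := []Volume{
		{Name: "root", Ready: true},
		{Name: "data-1", Ready: true},
		{Name: "data-2", Ready: false}, // the one problematic volume
	}
	attached := []string{"root", "data-1"}

	desired := prepareDesired(spec)
	fmt.Println(len(desired), toDetach(attached, desired)) // prints: 0 [root data-1]
}
```

In this model a single volume that is not ready empties the desired list, so the diff marks every attached volume, including the healthy ones, for detachment.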
To Reproduce
Ceph volumes without status or secret are skipped in the function at https://github.com/ironcore-dev/ironcore/blob/main/poollet/machinepoollet/controllers/machine_controller_volume.go#L209
Expected behavior
Volumes are not deleted if the volumes at the user API level have no deletion timestamp and were not detached.
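One possible shape of this expectation, as a minimal sketch under assumptions: the detach decision is derived from what the spec still references rather than from the list of volumes that happen to be ready. The `SpecVolume` type, its `Deleting` and `Ready` fields, and the `detachable` function are hypothetical and do not correspond to identifiers in machinepoollet.

```go
package main

import "fmt"

// SpecVolume is a hypothetical view of a volume referenced by the machine spec.
type SpecVolume struct {
	Name     string
	Deleting bool // models a deletion timestamp on the user API volume
	Ready    bool // models presence of status and secret
}

// detachable returns the attached volumes that may be detached: those no
// longer referenced by the spec, or whose volume is being deleted. Readiness
// is deliberately ignored, so a temporarily unready volume stays attached.
func detachable(attached []string, spec []SpecVolume) []string {
	keep := make(map[string]bool, len(spec))
	for _, v := range spec {
		if !v.Deleting {
			keep[v.Name] = true
		}
	}
	var out []string
	for _, name := range attached {
		if !keep[name] {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	spec := []SpecVolume{
		{Name: "root", Ready: true},
		{Name: "data-1", Ready: false},                // unready, but still wanted
		{Name: "data-2", Ready: true, Deleting: true}, // being deleted
	}
	attached := []string{"root", "data-1", "data-2", "old"}

	fmt.Println(detachable(attached, spec)) // prints: [data-2 old]
}
```

Here `data-1` stays attached even though it is not ready, while `data-2` (being deleted) and `old` (no longer in the spec) are the only candidates for detachment.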
Additional context
In our lab the bug was triggered somewhat randomly, and we were not able to reproduce it reliably by attaching and detaching volumes to a VM in the standard way.