
Host local storage support #1076

Merged: 25 commits, Oct 1, 2024

Conversation


@torchiaf commented Jul 19, 2024

Summary

PR Checklist

  • Is this a multi-tenancy feature/bug?
    • Yes, the relevant RBAC changes are at:
  • Do we need to backport changes to the old Rancher UI, such as RKE1?
    • Yes, the relevant PR is at:
  • Are backend engineers aware of UI changes?
    • Yes, the backend owner is:

Related Issue harvester/harvester#6057
Related Issue harvester/harvester#5953

Occurred changes and/or fixed issues

Technical notes summary

Areas or cases that should be tested

Areas which could experience regressions

Screenshot/Video

  • Add an LVM disk

image

  • Add an LVM storage class

image

@torchiaf force-pushed the 6057-local-path-storage branch from a42a608 to 0c9ef1d July 19, 2024 15:27
@a110605 added the "Draft PR" and "New Feature" labels Aug 26, 2024
@torchiaf force-pushed the 6057-local-path-storage branch 3 times, most recently from bf1340d to dedcf22 September 11, 2024 17:15
@torchiaf force-pushed the 6057-local-path-storage branch from 11818f9 to 5e5c898 September 18, 2024 13:13
@torchiaf marked this pull request as ready for review September 18, 2024 13:13
@torchiaf force-pushed the 6057-local-path-storage branch from 5e5c898 to 0592d66 September 18, 2024 13:28
@torchiaf removed the "Draft PR" label Sep 18, 2024

mergify bot commented Sep 20, 2024

This pull request is now in conflict. Could you fix it @torchiaf? 🙏

@torchiaf force-pushed the 6057-local-path-storage branch from c3f64da to c53b2ff September 22, 2024 14:11
@torchiaf requested a review from tserong September 22, 2024 14:14
@torchiaf force-pushed the 6057-local-path-storage branch 2 times, most recently from 9273c79 to 28d3fab September 22, 2024 16:47
@tserong commented Sep 23, 2024

I've done some initial testing of this, and provisioning and unprovisioning LHv1 and LHv2 disks seem to work, but I've found two things so far that need tweaking:

  1. This PR sets a StorageClass parameter named engineVersion with value LonghornV1 or LonghornV2, but it should instead be dataEngine with value v1 or v2 (see https://raw.githubusercontent.com/longhorn/longhorn/v1.7.1/examples/v2/storageclass.yaml for an example of a V2 storage class; a sketch is included after the corrected spec below).

  2. spec.fileSystem.provisioned is deprecated in favour of spec.provision, but it seems we're still setting spec.fileSystem.provisioned here. If I use this PR to provision a Longhorn V2 disk for example, then check the BD CR, I get this:

> kubectl -n longhorn-system get bds/989754e4e66edadfd3390974a1aba3f8 -o yaml 
[...]
spec:
  devPath: /dev/sdb
  fileSystem:
    mountPoint: ""
    provisioned: true
  nodeName: harvester-node-0
  provision: false
  provisioner:
    longhorn:
      engineVersion: LonghornV2
[...]

...but it should be:

spec:
  devPath: /dev/sdb
  fileSystem:
    mountPoint: ""
  nodeName: harvester-node-0
  provision: true
  provisioner:
    longhorn:
      engineVersion: LonghornV2
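
For reference, a minimal V2 StorageClass in the spirit of the linked Longhorn example would look roughly like this (the name, replica count, and timeout values are illustrative, not taken from this PR):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-v2
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  dataEngine: "v2"   # the parameter name is dataEngine, with value "v1" or "v2"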

@torchiaf force-pushed the 6057-local-path-storage branch from 28d3fab to bcef064 September 23, 2024 11:48
@torchiaf commented:

Thanks @tserong, the code is now updated to handle the v1/v2 engineVersion in the Storage Class page and to use the new spec.provision field of the blockDevice CRD in disk assignment.


mergify bot commented Sep 24, 2024

This pull request is now in conflict. Could you fix it @torchiaf? 🙏

@tserong commented Sep 24, 2024

Thanks @torchiaf, the spec.provision field is working correctly now.

The StorageClass is not quite right - the v1/v2 value is correct, but the parameter name engineVersion needs to be changed to dataEngine (sorry if this wasn't clear earlier, and I know it's probably a pain that this is a different name than what's in the block device CR).

I also have a question: previously there was a "force formatted" checkbox for Longhorn V1:

image

...but this isn't present anymore, although there's still a tip about it:

image

Is this correct?


I also have a problem provisioning Longhorn V1 disks. They seem to add successfully, but shortly thereafter, they aren't mounted and are unschedulable. I suspect this is actually a problem in node-disk-manager. I'll report back once I've figured out what's going on with this.

image

@tserong commented Sep 24, 2024

...oh also, one more thing: After a disk is provisioned, do we need to show what provisioner is used when viewing the disks on that node? Because there's no indication of that currently - you just see the disk like this:

image

@Vicente-Cheng, WDYT? Also, I haven't tested what's shown here in the LVM case.

@torchiaf force-pushed the 6057-local-path-storage branch from bcef064 to 0cecf90 September 24, 2024 08:35
@torchiaf commented:

@tserong I've fixed the dataEngine parameter, thanks.

For the other points:

  • We don't show the 'Force Formatted' checkbox if the disk has already been formatted. This mechanism is unchanged from master:
    • We show the Force Formatted warning message if any of the new disks is formatted (blockDevice.spec.fileSystem.forceFormatted; see the CR excerpt after this list). Could you verify this value? I can't replicate it in my environment.

image

  • I'm not sure what the root cause of the provisioning errors was; let me know if any API parameter from the UI is wrong.
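
For reference, the flag in question sits here in the BlockDevice CR (excerpt only; the value shown is illustrative):

spec:
  fileSystem:
    forceFormatted: true   # when true on a new disk, the UI shows the Force Formatted warning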

@Vicente-Cheng commented Sep 24, 2024

> ...oh also, one more thing: After a disk is provisioned, do we need to show what provisioner is used when viewing the disks on that node? Because there's no indication of that currently - you just see the disk like this:
>
> image
>
> @Vicente-Cheng, WDYT? Also, I haven't tested what's shown here in the LVM case.

I checked with the latest changes. It looks like @torchiaf already added the provisioner fields.
Screenshot 2024-09-25 01:22 AM

The current display looks good to me. WDYT? @tserong

> We show the Force Formatted warning message if any of the new disks is formatted (blockDevice.spec.fileSystem.forceFormatted). Could you verify this value? I can't replicate it in my environment.

I think we could keep that, but only for the v1 engine.
For the v2 engine, we should not let users select any formatting options.

> I also have a problem provisioning Longhorn V1 disks. They seem to add successfully, but shortly thereafter, they aren't mounted and are unschedulable. I suspect this is actually a problem in node-disk-manager. I'll report back once I've figured out what's going on with this.

I will check this... It is not a UI issue; the CRD fields look good.
Fixed, PR here: harvester/node-disk-manager#139

BTW, I checked the whole LVM flow, and everything looks fine. Thanks!

@tserong commented Sep 25, 2024

Thanks @Vicente-Cheng, @torchiaf! Will re-test this morning.

@torchiaf force-pushed the 6057-local-path-storage branch from f2b29a9 to ba16fd2 September 29, 2024 17:20
@torchiaf force-pushed the 6057-local-path-storage branch from ba16fd2 to 21cbfa6 September 29, 2024 17:25
@torchiaf commented Sep 29, 2024

@tserong @Vicente-Cheng I've fixed the accessMode property in the Virtual Machine and Volumes create pages, following this rule:

- LonghornV1: RWX
- LonghornV2, LVM: RWO
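
As a sketch of what this rule produces, the generated PVCs would request access modes along these lines (names, storage classes, and sizes are illustrative, not taken from this PR):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vm-disk-lh-v1          # hypothetical volume on a LonghornV1 storage class
spec:
  accessModes:
    - ReadWriteMany            # LonghornV1: RWX
  storageClassName: longhorn
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vm-disk-lh-v2          # hypothetical volume on a LonghornV2 or LVM storage class
spec:
  accessModes:
    - ReadWriteOnce            # LonghornV2, LVM: RWO
  storageClassName: longhorn-v2
  resources:
    requests:
      storage: 10Gi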

@tserong left a comment

The LHv2 changes are all behaving correctly in my testing. Thanks for all your work on this @torchiaf!

@Vicente-Cheng left a comment

Tested the LVM part (addon enable / VG creation / volume creation).
Thanks!

@a110605 commented Sep 30, 2024

Are Volume Group Name and Volume Group Type required when creating a Storage Class? cc @Vicente-Cheng
I can successfully create one even without choosing any volume group name/type.

Screenshot 2024-09-30 at 8 59 01 PM

@a110605 commented Sep 30, 2024

It seems we don't clean the previous disk settings when deleting a disk in the host edit page. I need to press save first and edit again to modify the disk provisioner. Is that a limitation from the backend? @Vicente-Cheng

remove_disk.webm

Deleting an unsaved disk and then adding it back restores the previous settings. Do we want to fix this behavior?

@a110605 commented Sep 30, 2024

When adding a new disk, if I don't choose any volume group and get an error, I'm unable to modify anything.

Screenshot 2024-09-30 at 9 31 16 PM

@a110605 commented Sep 30, 2024

Do Longhorn V2 volumes support resize? cc @tserong @Vicente-Cheng
I got an LHv2 volume stuck in the resizing status, while an LHv1 volume resizes fine.
If not, we should disable it in the frontend or block it with a backend webhook.

Screenshot 2024-09-30 at 9 34 50 PM Screenshot 2024-09-30 at 9 35 02 PM

@a110605 commented Sep 30, 2024

Do LHv2 volumes support Export to Image? @tserong @Vicente-Cheng
Seems not; we should hide the export to image action for LHv2 volumes.

Screenshot 2024-09-30 at 9 41 59 PM

@a110605 commented Sep 30, 2024

Maybe we should disable Take a snapshot for LHv2 volumes. cc @tserong, @Vicente-Cheng
There is no restore action for LHv2 volume snapshots.

Screenshot 2024-09-30 at 9 52 19 PM

@Vicente-Cheng commented Oct 1, 2024

Hi @a110605,
Thanks for these questions; here are some updates:

> Are Volume Group Name and Volume Group Type required when creating a Storage Class? cc @Vicente-Cheng
> I can successfully create one even without choosing any volume group name/type.

We already have a webhook as of v0.1.3; I think you were using v0.1.2 in the lab environment.
The latest behavior blocks an empty vgName/type/nodeName via the webhook, like below:
Screenshot 2024-10-01 10:17 AM
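
For context, a hypothetical LVM StorageClass carrying the parameters the webhook validates might look like the sketch below; the provisioner string and all values are assumptions for illustration, not taken from this PR:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: lvm-striped
provisioner: lvm.driver.harvesterhci.io   # assumed provisioner name
parameters:
  vgName: vg01      # must be non-empty; enforced by the webhook
  type: striped     # must be non-empty; enforced by the webhook
  # nodeName is validated by the webhook as well, per the comment above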

> Deleting an unsaved disk and then adding it back restores the previous settings. Do we want to fix this behavior?

As I mentioned before, this is UI-side behavior.
Deleting the disk without saving does not change any CR, so adding the disk back will show the previous settings.
So, if we want deleting a disk without saving and adding it back to yield an empty provisioner, we need to handle that on the UI side.

> When adding a new disk, if I don't choose any volume group and get an error, I'm unable to modify anything.

We can remove and re-add the disk to handle this user error, but I will also create a webhook to check it.

> Do Longhorn V2 volumes support resize? cc @tserong @Vicente-Cheng
> I got an LHv2 volume stuck in the resizing status, while an LHv1 volume resizes fine.
> If not, we should disable it in the frontend or block it with a backend webhook.

I think the v2 volume does not support resize (REF: longhorn/longhorn#8022).
So, as you mentioned, we should block this operation. cc @tserong

> Do LHv2 volumes support Export to Image? @tserong @Vicente-Cheng
> Seems not; we should hide the export to image action for LHv2 volumes.

+1, cc @tserong

> Maybe we should disable Take a snapshot for LHv2 volumes. cc @tserong, @Vicente-Cheng
> There is no restore action for LHv2 volume snapshots.

Looks reasonable to me if we cannot restore the v2 volume. cc @tserong

@tserong commented Oct 1, 2024

Agree with @Vicente-Cheng to disable expand and export to image. As for snapshots, I have created a LHv2 volume snapshot and (seem to have) successfully restored it to a new volume, so IMO we should leave this enabled.

My one question is: can we do this extra little bit of work separately later, or should we do it as part of this PR, bearing in mind we really want to get this PR merged as soon as possible for RC1? (My preference is to merge as-is if there are no other issues.)

@Vicente-Cheng commented:

> Agree with @Vicente-Cheng to disable expand and export to image. As for snapshots, I have created a LHv2 volume snapshot and (seem to have) successfully restored it to a new volume, so IMO we should leave this enabled.
>
> My one question is: can we do this extra little bit of work separately later, or should we do it as part of this PR, bearing in mind we really want to get this PR merged as soon as possible for RC1? (My preference is to merge as-is if there are no other issues.)

Yeah, I agree!
I think most of the work looks great; we can fix the rest after RC1.
We'll get more feedback if we can ship this UI change with RC1.
WDYT? cc @a110605, @torchiaf, @bk201

@torchiaf commented Oct 1, 2024

Let's open a new Issue/PR to change the remaining functionalities.

@torchiaf merged commit 9cf422b into harvester:master Oct 1, 2024
7 checks passed
@torchiaf commented Oct 1, 2024

@mergify backport release-harvester-v1.4


mergify bot commented Oct 1, 2024

> backport release-harvester-v1.4

✅ Backports have been created
