
[FEATURE] Host local storage support - Experimental #5724

Closed

bk201 opened this issue May 2, 2024 · 14 comments
Assignees

Labels: area/storage, experimental (Experimental feature, alpha), highlight (Highlight issues/features), kind/feature (larger new pieces of functionality, not enhancements to existing functionality), priority/0 (Must be fixed in this release), require/doc (Improvements or additions to documentation), require/HEP (Require Harvester Enhancement Proposal PR), require-ui/small (Estimate 1-2 working days)

Milestone

Comments

@bk201 (Member) commented May 2, 2024

Is your feature request related to a problem? Please describe.

Applications may require large-capacity or high-performance local disks, and live migration is not a concern for them.

Describe the solution you'd like

As an HCI solution, we can provide built-in local storage support, for example by leveraging LVM or plain files.
We have experimented with an lvm-csi-driver in a POC environment.

Describe alternatives you've considered

Additional context

@bk201 bk201 added the kind/feature Issues that represent larger new pieces of functionality, not enhancements to existing functionality label May 2, 2024
@bk201 bk201 added this to the v1.5.0 milestone May 2, 2024
@bk201 bk201 modified the milestones: v1.5.0, v1.4.0 May 14, 2024
@innobead innobead added area/storage highlight Highlight issues/features labels May 14, 2024
@innobead innobead added priority/0 Must be fixed in this release require/HEP Require Harvester Enhancement Proposal PR area/ui Harvester UI require/doc Improvements or additions to documentation labels May 28, 2024
@SlavikCA

I'm curious: how will that feature be different from simply using hostPath?

Is it correct to say that hostPath is file-level access, while this feature will provide block-level access?

Or is it going to be file-level access, but with the disk (volume group?) managed by the LVM CSI driver, so it can provide features such as mirroring, striping, etc.?

@Vicente-Cheng (Contributor)

> I'm curious: how will that feature be different from simply using hostPath?
>
> Is it correct to say that hostPath is file-level access, while this feature will provide block-level access?
>
> Or is it going to be file-level access, but with the disk (volume group?) managed by the LVM CSI driver, so it can provide features such as mirroring, striping, etc.?

Hi @SlavikCA,

This feature will provide a block device.
With LVM, we can easily use various LVM types like mirror and stripe. We would also benefit from LVM features such as expansion and snapshots.
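
For illustration, the driver's StorageClass is where the LVM layout type is selected. A minimal sketch assembled from the examples later in this thread (the class name and vgName are placeholders; the type values seen in this thread are striped and dm-thin):

```yaml
# Hypothetical StorageClass for the LVM CSI driver; names are placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: lvm-striped-sc                     # placeholder name
provisioner: lvm.driver.harvesterhci.io
parameters:
  vgName: lvm-vg                           # volume group created on the host disk
  type: striped                            # LVM layout type (e.g. striped, dm-thin)
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer    # bind only when a consumer pod is scheduled
```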

@abonillabeeche

FYI - I had the LVM CSI driver installed on 1.3.0 and upgraded to 1.3.1 without an issue. Perhaps worth investigating is the ability to have snapshots and backups, as they are not available out of the box.
Screenshot 2024-06-19 at 9 51 06 AM

@bk201 bk201 added require-ui/small Estimate 1-2 working days and removed area/ui Harvester UI require-ui/small Estimate 1-2 working days labels Jun 24, 2024
@harvesterhci-io-github-bot (Collaborator)

GUI issue created #6057.

@abonillabeeche

> Or is it going to be file-level access, but with the disk (volume group?) managed by the LVM CSI driver, so it can provide features such as mirroring, striping, etc.?

Correct. The LVM CSI driver provides striping, which could be used for large volumes.

@albinsun

Hi guys,
As discussed in the HEP PR, Longhorn plans to support LVM-based local volumes in v1.8 (Nov. 2024) or v1.9 (Mar. 2025), and we would switch to the LH solution for local storage at that time.

If so, why do we need to build this feature ourselves now if LH will support it?
Or do they not overlap completely?

Ref. https://github.com/longhorn/longhorn/wiki/Roadmap#longhorn-v19-march-2025

@innobead (Contributor) commented Jul 18, 2024

> Hi guys, as discussed in the HEP PR, Longhorn plans to support LVM-based local volumes in v1.8 (Nov. 2024) or v1.9 (Mar. 2025), and we would switch to the LH solution for local storage at that time.
>
> If so, why do we need to build this feature ourselves now if LH will support it? Or do they not overlap completely?
>
> Ref. Wiki: Roadmap (longhorn v19 march 2025) (longhorn/longhorn)

@albinsun good question; it's all about priority.

For Longhorn, local volume via LVM is a relatively low priority compared to ongoing v2 development. However, for Harvester, local volume support is an emerging requirement for data-intensive workloads in some cases. This is why we decided to develop an addon first to support an LVM storage driver with limited functions, allowing users to experience this and maximize the capacity of local disks before we implement complete functions such as snapshot, backup, trim, expansion, and encryption, which Longhorn supports. (I know @Vicente-Cheng is also experimenting with snapshot and expansion in the current development, which is good and okay.)

Whether we migrate to the version supported by Longhorn in the future will depend on how it will be used and adopted in Harvester first, and what features are exclusively supported by Longhorn.

cc @Vicente-Cheng @derekbit

@harvesterhci-io-github-bot (Collaborator) commented Sep 10, 2024

Pre Ready-For-Testing Checklist

  • If labeled: require/HEP Has the Harvester Enhancement Proposal PR submitted?
    The HEP PR is at: HEP: introduce the LVM CSI driver #5956

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at:

Test Plan:

  • Enable the CSI driver
    1. Create a new addon with this PR: addons: add harvester-csi-driver-lvm experimental-addons#18
    2. Enable the addon
  • Create the storage provisioner
    1. Create Storage for an LVM volume group (entry point: host//storage)
    2. Select the disk -> select provisioner LVM -> input the vgName -> save
  • Create a StorageClass
    1. Select provisioner LVM (we need to convert this to LVM on the UI side)
    2. Select node -> select vgName -> save
    3. Select the volume and use it as a data volume for the VM.
       NOTE:
       - If the VM is already running, it can only attach an LVM volume on the same node.
       - If the VM is stopped, it will be scheduled to the node of the LVM volume when it starts.
  • Is there a workaround for the issue? If so, where is it documented?
    The workaround is at:

None

* [ ] If NOT labeled: not-require/test-plan Has the e2e test plan been merged? Have QAs agreed on the automation test case? If only test case skeleton w/o implementation, have you created an implementation issue?
- The automation skeleton PR is at:
- The automation test case PR is at:

* [ ] If the fix introduces the code for backward compatibility Has a separate issue been filed with the label release/obsolete-compatibility?
The compatibility issue is filed at:

@harvesterhci-io-github-bot (Collaborator)

Automation e2e test issue: harvester/tests#1515

@TachunLin commented Oct 22, 2024

Verifying on v1.4.0-rc3

Result

$\color{green}{\textsf{PASS}}$ Single node: Use lvm-csi-driver storage class volume on VM $~~$
  • After creating the harvester-csi-driver-lvm addon, it can be enabled on the Add-ons page

image

  • Can create Storage for LVM volume group on Host
    image

  • Can create Storage Class for LVM provisioner
    image

  • Check the storage class yaml details contains the allowedTopologies on specific node

    ```yaml
    # Please edit the object below. Lines beginning with a '#' will be ignored,
    # and an empty file will abort the edit. If an error occurs while saving this file will be
    # reopened with the relevant failures.
    #
    allowVolumeExpansion: false
    allowedTopologies:
    - matchLabelExpressions:
      - key: topology.lvm.csi/node
        values:
        - hp-164-seeder
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      creationTimestamp: "2024-10-22T13:38:55Z"
      name: lvm-sc
      resourceVersion: "3969422"
      uid: 12cd92ba-33bb-4e3e-bbb3-44115140a9d7
    parameters:
      type: striped
      vgName: lvm-vg
    provisioner: lvm.driver.harvesterhci.io
    reclaimPolicy: Delete
    volumeBindingMode: WaitForFirstConsumer
    ```
  • Can select lvm storage volume and use it as data volume for the VM.

    image

  • Check the VM is running and the lvm-sc volume is attached on it

    image

    image

    image

$\color{red}{\textsf{Failed}}$ Multiple nodes: Use lvm-csi-driver storage class volume on VM $~~$
  • Given a disk on node1 has been created with the LVM provisioner

image

  • When we try to create the storage class with the LVM provisioner type, node1 cannot be found in the available list

image

vokoscreenNG-2024-10-23_18-38-54.mp4
  • After we create a storage class on node2

    image

  • Then use this storage class to create a data volume in a VM

  • The VM fails to start with Unschedulable

    0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
    

    image

    vokoscreenNG-2024-10-23_18-40-43.mp4

Test Information

  • Test Environment: Single node and three nodes baremetal machines
  • Harvester version: v1.4.0-rc3

Verify Steps

Single node: Use lvm-csi-driver storage class volume on VM
  1. Create the harvester-csi-driver-lvm addon from YAML in Harvester

```yaml
apiVersion: harvesterhci.io/v1beta1
kind: Addon
metadata:
  name: harvester-csi-driver-lvm
  namespace: harvester-system
  labels:
    addon.harvesterhci.io/experimental: "true"
spec:
  enabled: false
  repo: https://charts.harvesterhci.io
  version: 0.1.4
  chart: harvester-csi-driver-lvm
  valuesContent: ""
```
  2. Enable it on the Add-ons page
    image

  3. Create Storage for the LVM volume group
    image

  4. First-time creation of the LVM volume group
    image

image

  5. Create a StorageClass, select provisioner LVM

  6. Select node -> select volume group name -> select volume group type and save
    image

  7. The storage class with the LVM provisioner is created
    image

  8. Check that the storage class YAML details contain allowedTopologies for the specific node

```yaml
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
allowVolumeExpansion: false
allowedTopologies:
- matchLabelExpressions:
  - key: topology.lvm.csi/node
    values:
    - hp-164-seeder
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2024-10-22T13:38:55Z"
  name: lvm-sc
  resourceVersion: "3969422"
  uid: 12cd92ba-33bb-4e3e-bbb3-44115140a9d7
parameters:
  type: striped
  vgName: lvm-vg
provisioner: lvm.driver.harvesterhci.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```
  9. Select the volume and use it as a data volume for the VM.

image

  10. Check the VM is running and the lvm-sc volume is attached to it
    image

image

image
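
Behind the UI, the data volume added above corresponds to a PVC that references the LVM storage class. A minimal sketch, assuming the lvm-sc class from the steps above (the name, namespace, and size are placeholders):

```yaml
# Hypothetical PVC; the VM data volume is a block device backed by an LVM LV.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vm-data-lvm                 # placeholder name
  namespace: default                # placeholder namespace
spec:
  accessModes:
    - ReadWriteOnce                 # local volume, single-node access
  volumeMode: Block                 # the driver provides block-level access
  storageClassName: lvm-sc          # StorageClass created in the steps above
  resources:
    requests:
      storage: 10Gi                 # placeholder size
```

Because the class uses volumeBindingMode: WaitForFirstConsumer, the PVC stays Pending until the VM pod is scheduled, and allowedTopologies pins scheduling to the node that owns the volume group.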

Multiple nodes: Use lvm-csi-driver storage class volume on VM
  1. Create the harvester-csi-driver-lvm addon from YAML in Harvester

```yaml
apiVersion: harvesterhci.io/v1beta1
kind: Addon
metadata:
  name: harvester-csi-driver-lvm
  namespace: harvester-system
  labels:
    addon.harvesterhci.io/experimental: "true"
spec:
  enabled: false
  repo: https://charts.harvesterhci.io
  version: 0.1.4
  chart: harvester-csi-driver-lvm
  valuesContent: ""
```
  2. Enable it on the Add-ons page
    image

  3. Create Storage for the LVM volume group on node0

image

  4. Create a StorageClass, select provisioner LVM

  5. Select node -> select volume group name -> select volume group type and save

image

  6. The storage class with the LVM provisioner is created

image

  7. Check that the storage class YAML details contain allowedTopologies for the specific node
```yaml
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
allowVolumeExpansion: false
allowedTopologies:
- matchLabelExpressions:
  - key: topology.lvm.csi/node
    values:
    - node2
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2024-10-23T08:09:31Z"
  name: lvm-sc2
  resourceVersion: "257300"
  uid: 6d3a3f00-004a-4f1b-be7c-27aaf7c2591a
parameters:
  type: dm-thin
  vgName: lvm-vg2
provisioner: lvm.driver.harvesterhci.io
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```
  8. Create a VM, select the volume, and use it as a data volume for the VM.

  9. Check the VM is running and the lvm-sc volume is attached to it

@Vicente-Cheng (Contributor)

Root cause:

```
WARNING: ext4 signature detected on /dev/sdb at offset 1080. Wipe it?
```

The vgcreate failure is due to the target device not being clean.

It should be fixed by this PR: harvester/node-disk-manager#159
You can test again with v1.4.0-rc4, or wipefs the corresponding device manually.

@TachunLin commented Oct 24, 2024

Thanks for checking the root cause.
After using wipefs to clean up the target disk device signature on the node machine:

```shell
$ sudo wipefs --all --force /dev/sdc

node2:~ # sudo wipefs --all --force /dev/sdc
/dev/sdc: 2 bytes were erased at offset 0x00000438 (ext4): 53 ef
```

And retest in the following steps:

  1. Delete the previously created storage class
  2. Remove /dev/sdc on node2
  3. Add /dev/sdc back on node2, given that the lvm-vg2 group was already created
  4. Create a new storage class on the disk
  5. Create a new VM and add a second volume with the LVM storage class
  6. Check the VM is running and can attach the volume well

The VM starts into the Running state and attaches the LVM volume correctly.
image
image

@innobead innobead changed the title [FEATURE] Host local storage support [FEATURE] Host local storage support - Experimental Oct 24, 2024
@TachunLin commented Oct 30, 2024

Update on retesting with three nodes on v1.4.0-rc4

  • On a newly provisioned Harvester on KVM machines with extra disks.

  • After cleaning up the signature with wipefs:

    ```shell
    rancher@harvester-node-0:~> sudo wipefs /dev/sda
    rancher@harvester-node-0:~> lsblk
    ...
    rancher@harvester-node-0:~> sudo wipefs --all --force /dev/sda
    rancher@harvester-node-0:~>
    ```
    
  • Then, adding the LVM provisioner disk on the host page prompts the error message:
    image

    ```
    failed to activate volume group lvm-vg1, err: execute command 'vgchange' with args '[--activate y lvm-vg1]' failed: failed to execute: nsenter [--mount=/host/proc/4716/ns/mnt --net=/host/proc/4716/ns/net --ipc=/host/proc/4716/ns/ipc vgchange --activate y lvm-vg1], output , stderr File descriptor 7 (socket:[45102]) leaked on vgchange invocation. Parent PID 1: /usr/lib/systemd/systemd
    Volume group "lvm-vg1" not found
    Cannot process volume group lvm-vg1
    : exit status 5
    ```
  • As discussed with Vicente, it is a side effect introduced in rc4 by provisioner: ensure the device is ready before adding into volume group node-disk-manager#159.

  • We also have the fix PR Add webhook to ensure the VGStatus is active node-disk-manager#161.

  • The next step is to verify adding the disk again on v1.4.0-rc5.

@TachunLin commented Nov 7, 2024

Retested and verified fixed on v1.4.0-rc5. Closing this issue.

Test Result on single node

$\color{green}{\textsf{PASS}}$ NVME disk can be added as lvm-csi-driver disk to Host and be created as volume $~~$
  1. After enabling the harvester-csi-driver-lvm addon

  2. Can correctly add the NVMe disk as LVM provisioner to the host

image

  3. Can correctly create the storage class with the selected node and volume group

image

  4. The VM created with the LVM provisioner storage class runs well

image

  5. Volume created with the LVM provisioner

image

  6. Can take a snapshot of the created VM

image

  7. Disk state
```
v140rc5-test-lvm:~ # lsblk -o NAME,TYPE,TRAN
NAME                                                                   TYPE TRAN
loop0                                                                  loop
loop1                                                                  loop
loop2                                                                  loop
...
nvme1n1                                                                disk nvme
├─lvm--vg1-pvc--812c99b3--a774--4877--8bf5--4bda9fe1edca-real          lvm
│ ├─lvm--vg1-pvc--812c99b3--a774--4877--8bf5--4bda9fe1edca             lvm
│ └─lvm--vg1-lvm--snapshot--d8215eca--79dd--4650--a4be--7426d643e7d5   lvm
└─lvm--vg1-lvm--snapshot--d8215eca--79dd--4650--a4be--7426d643e7d5-cow lvm
  └─lvm--vg1-lvm--snapshot--d8215eca--79dd--4650--a4be--7426d643e7d5   lvm
```

$\color{green}{\textsf{PASS}}$ SATA disk can be added as lvm-csi-driver disk to Host and be created as volume $~~$
  1. After enabling the harvester-csi-driver-lvm addon

  2. Can correctly add the SATA disk as LVM provisioner to the host

    ```
    v140rc5-test-lvm:~ # lsblk -o NAME,TYPE,TRAN
    NAME                                                                   TYPE TRAN
    ...
    sdc                                                                    disk sata
    ```

image

  3. Can correctly create the storage class with the selected node and volume group

image

  4. The VM created with the LVM provisioner storage class runs well

image

  5. Volume created with the LVM provisioner

image

  6. Can take a snapshot of the created VM

image

  7. Disk state
```
v140rc5-test-lvm:~ # lsblk -o NAME,TYPE,TRAN
NAME                                                                   TYPE TRAN
...
sdc                                                                    disk sata
├─lvm--vg2-pvc--3ea4506c--260c--48e1--8528--af4235b765a1-real          lvm
│ ├─lvm--vg2-pvc--3ea4506c--260c--48e1--8528--af4235b765a1             lvm
│ └─lvm--vg2-lvm--snapshot--2a8acc9d--c8fb--4d55--83bc--bc54552fd97b   lvm
└─lvm--vg2-lvm--snapshot--2a8acc9d--c8fb--4d55--83bc--bc54552fd97b-cow lvm
  └─lvm--vg2-lvm--snapshot--2a8acc9d--c8fb--4d55--83bc--bc54552fd97b   lvm
```

Test Result on multiple nodes

$\color{green}{\textsf{PASS}}$ LVM volume can be created on specific node according to the LVM storage class $~~$
  1. All nodes containing an LVM volume group are displayed while creating the LVM storage class

image

  2. Only the volume group on n2-140rc5 is displayed
    image

  3. The VM can be created and runs on the node specified in the LVM storage class

image

image

Test Information

  • Test Environment:
    • Single node harvester on Equinix bare-metal machine (s3.xlarge.x86)
    • Two nodes kvm machines
  • Harvester version: v1.4.0-rc5

Verify Steps

Test lvm-csi-driver on single node Harvester

Refer to document

  1. Create the harvester-csi-driver-lvm addon from YAML in Harvester

```yaml
apiVersion: harvesterhci.io/v1beta1
kind: Addon
metadata:
  name: harvester-csi-driver-lvm
  namespace: harvester-system
  labels:
    addon.harvesterhci.io/experimental: "true"
spec:
  enabled: false
  repo: https://charts.harvesterhci.io
  version: 0.1.4
  chart: harvester-csi-driver-lvm
  valuesContent: ""
```
  2. Enable harvester-csi-driver-lvm on the Add-ons page

image

  3. Edit the Host page to add a disk in Storage

  4. Select an NVMe/SATA/SCSI type of disk

  5. Select the LVM provisioner and create a volume group lvm-vg1

image

  6. Check the disk can be correctly added to the host

image

  7. Create a StorageClass, select provisioner LVM

  8. Select node -> select volume group name -> select volume group type and save

image

  9. Create a new storage class lvm-sc-nvme, select the LVM provisioner

  10. Select the Node, Volume Group Name, and Volume Group Type

image

  11. Check the storage class can be correctly created

image

  12. Create a VM and use the storage class lvm-sc-nvme

image

  13. Check the VM runs well

image

  14. Check the volume is created with the LVM provisioner

image

  15. In the csi-driver-config section, select ⋮ > Edit Setting

  16. Add an entry with the following settings and save

  • Provisioner: Select lvm.driver.harvesterhci.io.
  • Volume Snapshot Class Name: Select lvm-snapshot.
  • Backup Volume Snapshot Class Name: Select lvm-snapshot.

image

  17. Take a snapshot of the created VM

  18. Verify that the snapshot is ready to use

image
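
For reference, the csi-driver-config entry edited above is stored as a JSON map keyed by provisioner name. A sketch of what the saved entry might look like, assuming the field names follow the UI labels above:

```json
{
  "lvm.driver.harvesterhci.io": {
    "volumeSnapshotClassName": "lvm-snapshot",
    "backupVolumeSnapshotClassName": "lvm-snapshot"
  }
}
```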

Test lvm-csi-driver on multiple nodes Harvester
  1. Create the harvester-csi-driver-lvm addon from YAML in Harvester

  2. Enable harvester-csi-driver-lvm on the Add-ons page

  3. Open the node1 host page -> Storage

  4. Add a disk, select the LVM provisioner, and create a new volume group: lvm-vg1

image

image

  5. Open the node2 host page -> Storage

  6. Add a disk, select the LVM provisioner, and create a new volume group: lvm-vg2

image

image

  7. Create a new storage class lvm-sc2 with the LVM provisioner

  8. Select node n2-140rc5

  9. Only the volume group on n2-140rc5 is displayed

image

  10. Create a VM and add a volume, selecting the lvm-sc2 storage class

image

  11. Check the VM can start into the Running state on node n2-140rc5

image
