This repository has been archived by the owner on Oct 22, 2024. It is now read-only.

Direct nvdimm mode #133

Merged
merged 8 commits into from
Feb 1, 2019

Conversation

okartau
Contributor

@okartau okartau commented Jan 15, 2019

combining some changes here, related to getting direct-nvdimm mode working

@okartau
Contributor Author

okartau commented Jan 17, 2019

The code here is now in such shape that tests pass:

  • in Unified mode (both lifecycle and sanity) in VM using qemu-emulated NVDIMM device
  • in Unified mode (both lifecycle and sanity) on host with real NVDIMM memory

Next I tried cluster mode, using the system in test/ with 4 VMs and the qemu-emulated device.
Here I see a failure in NodeStageVolume while trying mkfs.ext4:

mkfs failed: mke2fs 1.44.2 (14-May-2018)
/dev/pmem0.4: Not enough space to build proposed filesystem while setting up superblock

An 8 GB namespace had been created correctly on that VM.
When I try mkfs.ext4 manually on the VM, it works OK.

When I attach to the running pmem-csi container, I can see the same failure from mkfs.ext4.
Looking at the devices shows a difference which is likely behind the issue:
/dev/pmem0.4 appears as a plain file inside the container instead of as a block device.
In VM:

root@host-1 ~ # ls -l /dev/pmem0.4 
brw-rw---- 1 root disk 259, 0 Jan 17 12:34 /dev/pmem0.4

In container:

/go # ls -l /dev/pmem0.4 
-rw-r--r--    1 root     root          4096 Jan 17 12:33 /dev/pmem0.4

What causes /dev/pmem0.4 to become a file instead of a device,
and why does this happen only in devicemode=direct of the plugin?
The same code has worked when we use devicemode=LVM.

@okartau
Contributor Author

okartau commented Jan 24, 2019

tested to work:

  • in Unified mode in single host, util/lifecycle and util/sanity
  • in VM-cluster using make start and deployment examples, my-csi-app with v1.13.2, Clear=27420

@okartau okartau requested review from avalluri and pohly January 24, 2019 11:44
@okartau okartau force-pushed the direct-nvdimm-mode branch from 25cba82 to 947a19c Compare January 25, 2019 09:50
@okartau
Contributor Author

okartau commented Jan 28, 2019

renamed both flushDevice and clearDevice to start with a capital letter, to make them exported (public)

@okartau okartau force-pushed the direct-nvdimm-mode branch 5 times, most recently from 4ec79dd to 846ed2a Compare January 31, 2019 11:41
@okartau
Contributor Author

okartau commented Jan 31, 2019

I had to put back handling size alignment in two places, as LVM2 mode and direct mode need it done differently.
I also re-arranged and combined the scattered-around commits into a smaller number of logical change units.

The resulting commits:

  • Use errors.Wrap to expose the error message from the inner call. Note that
    there may be multiple errors in the loop, but we can capture only the last
    one using Wrap; earlier ones are logged.
  • runtime-deps.csv: add pkg/errors.
  • Device manager: move FlushDevice to a separate file, pmd-util.go.
  • FlushDevice: add an argument for how many blocks to clear/shred, so that
    the function can optionally take over the role that nullify() played
    previously. Both device managers (ndctl, lvm) use deviceFlush the same
    way, via a common wrapper function.
  • Device manager: clear the start of the block device after volume creation.
    Even though we clear the data area after device deletion, this further
    reduces the chance that old data remains there and gets recognized as a
    file system. nullify() has been unreliable, so we implemented a different
    approach to clearing the device on the device-manager side. The old code
    and its call point are kept commented out for reference, just in case.
  • Adjust our creation request to how libndctl creates namespaces:
    libndctl v63 creates a namespace 1 GB smaller than requested if alignment
    is set to 1 GB, and fails to create one if the request is smaller than
    2 GB. We also have to pass sanity testing, where small volumes are
    created. In devicemode:LVM2 we want to allocate as much as we can and
    don't care about exact sizes; in devicemode:direct the size must not be
    less than what we ask for. In both cases we want 1 GB alignment.
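
The alignment rule above can be sketched like this. alignUp is a standard round-up; directModeRequest compensating for the libndctl v63 shortfall by over-requesting one extra GiB is an assumption based on the description, not confirmed pmem-csi code:

```go
package main

import "fmt"

const gib = uint64(1) << 30 // 1 GiB alignment

// alignUp rounds size up to the next 1 GiB boundary.
func alignUp(size uint64) uint64 {
	return (size + gib - 1) / gib * gib
}

// directModeRequest sketches compensating for libndctl v63 creating
// 1 GiB less than requested at 1 GiB alignment (illustrative only).
func directModeRequest(size uint64) uint64 {
	return alignUp(size) + gib
}

func main() {
	fmt.Println(alignUp(1))             // 1073741824
	fmt.Println(alignUp(gib))           // 1073741824
	fmt.Println(directModeRequest(gib)) // 2147483648
}
```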
  • After the recent cleanup, the driver does not create the directory, which
    means paths in the tester script need fixing.
  • lifecycle script: make the size larger (1G), change the default to
    erase=true.
  • Unified mode test: move run_driver into util, create a direct-nvdimm
    variant, and rename the scripts so that the devicemode is part of the
    script name.
  • The source of the diagrams is also included (the program 'dia' was used).
  • A new yaml file specifies use of the driver in direct mode: two pre-stages
    are skipped and a devicemode argument is added. The existing pmem-csi.yaml
    becomes a symlink to the original deployment manifest, which is renamed to
    be LVM-specific, to keep existing use cases pointing at it working and to
    reduce parallel documentation.