gluster-block: adopt to targetclid usage #203

Merged
merged 1 commit into from
Dec 16, 2019

Conversation

pkalever
Contributor

What does this PR achieve? Why do we need it?

Improve overall block creation time at scale and give users a consistent creation time.

Does this PR fix issues?

Fixes:
http://bit.ly/targetcli-create-delay

Notes for the reviewer

TODO: add a targetcli version check and a fall-back mechanism for when the
installed targetcli version does not support the targetclid daemon

Requires: open-iscsi/targetcli-fb#132

Signed-off-by: Prasanna Kumar Kalever [email protected]

@ghost ghost assigned pkalever Apr 10, 2019
@ghost ghost added the in progress label Apr 10, 2019
@pkalever pkalever requested review from lxbsz and amarts April 10, 2019 13:02
@pkalever
Contributor Author

I am sending this PR just to let others test the targetcli-fb#132 PR with gluster-block.
See the description for the TODO list, and please add anything else you think is missing.

@pkalever pkalever changed the title [WIP] gluster-block: adopt to targetclid usage gluster-block: adopt to targetclid usage Sep 17, 2019
@pkalever
Contributor Author

@lxbsz Please take a look. Thanks!

@lxbsz
Collaborator

lxbsz commented Sep 18, 2019

@pkalever
We may also need to update the spec file so that it depends on the first targetcli package version that supports targetclid. Should we also add the version check in the code?

Thanks.
BRs

@pkalever
Contributor Author

@lxbsz IMHO, having targetclid is not mandatory, so let's just check whether it exists. If it is available, start it and use the daemon approach; otherwise fall back to the old approach.

Does that explain why I didn't also write code in the capabilities[.c/.h]/versioning part?
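
The check-then-fall-back idea described above can be sketched in C roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `targetclid_running` and `targetcli_prefix` are hypothetical, the process check reuses the ps/grep idiom from the patch, and using a bare `targetcli` when the daemon is up assumes targetcli is already configured to route commands through targetclid.

```c
#include <stdlib.h>

/* Hypothetical helper: returns 1 if a targetclid process appears to be
 * running, 0 otherwise. The '[t]argetclid' pattern keeps grep from
 * matching its own command line. */
int targetclid_running(void)
{
    /* system() yields 0 only when the pipeline's last command (grep)
     * exits with status 0, i.e. a match was found. */
    return system("ps aux ww | grep -w '[t]argetclid' > /dev/null") == 0;
}

/* Hypothetical helper: picks the command prefix for the current mode.
 * Assumption: with the daemon up, plain targetcli goes through targetclid;
 * without it, --skip-daemon forces the old direct path. */
const char *targetcli_prefix(int daemon_running)
{
    return daemon_running ? "targetcli" : "targetcli --skip-daemon";
}
```

A caller would evaluate `targetcli_prefix(targetclid_running())` and prepend the result to the rest of the command line, so a missing daemon degrades to the old behaviour instead of failing.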

@lxbsz
Collaborator

lxbsz commented Oct 17, 2019

@pkalever

There is one problem: if the targetclid service dies for some reason, the create will fail.

The logs are:

[2019-10-17 09:11:45.911931] INFO: create cli request, volume=repvol blockname=block5 mpath=1 blockhosts=192.168.195.164 authmode=0 size=104857600, rbsize=0, blksize=0 [at block_create.c+940 :<block_create_cli_1_svc_st>]
[2019-10-17 09:11:45.990914] INFO: create request, volume=repvol volserver=localhost blockname=block5 blockhosts=192.168.195.164 filename=ea15ee4e-1a5b-4a13-8d06-afbc67d5b06c authmode=0 passwd= size=104857600 [at block_create.c+457 :<block_create_common>]
[2019-10-17 09:11:46.316702] ERROR: backend creation failed for: repvol/block5 [at block_svc_routines.c+547 :<blockValidateCommandOutput>]
[2019-10-17 09:11:46.316866] ERROR: Error from targetcli:

 [at block_svc_routines.c+547 :<blockValidateCommandOutput>]
[2019-10-17 09:11:46.316931] INFO: command exit code, -1 [at block_create.c+680 :<block_create_common>]
[2019-10-17 09:11:46.338513] ERROR: failed in remote create for block block5 on host 192.168.195.164 volume repvol [at block_create.c+342 :<glusterBlockCreateRemote>]
[2019-10-17 09:11:46.340327] WARNING: glusterBlockCreateRemoteAsync: return -1 failed in remote async create for block block5 on volume repvol with hosts 192.168.195.164 [at block_create.c+1089 :<block_create_cli_1_svc_st>]
[2019-10-17 09:11:46.385193] INFO: delete request, blockname=block5 filename=ea15ee4e-1a5b-4a13-8d06-afbc67d5b06c [at block_delete.c+464 :<block_delete_1_svc_st>]
[2019-10-17 09:11:46.723298] ERROR: Block 'block5' may be not loaded. [at block_svc_routines.c+111 :<blockCheckBlockLoadedStatus>]
[2019-10-17 09:11:46.729136] ERROR: Block 'block5' already deleted. [at block_svc_routines.c+154 :<blockCheckBlockLoadedStatus>]
[2019-10-17 09:11:46.744100] ERROR: failed in remote delete for block block5 on host 192.168.195.164 volume repvol [at block_delete.c+49 :<glusterBlockDeleteRemote>]
[2019-10-17 09:11:46.867015] INFO: create cli return success, volume=repvol blockname=block5 [at block_create.c+1115 :<block_create_cli_1_svc_st>]

Currently we only check targetclid liveness when the gluster-blockd daemon starts. After targetclid dies, the targetcli command gets stuck, even after manually starting the targetclid service.

Thanks.
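
The difference between sampling liveness once at startup and probing it before each request can be pictured with the small C sketch below. It is purely illustrative and not taken from this PR: every name here (`daemon_state`, `use_daemon_startup_only`, `use_daemon_per_request`, `fake_probe`) is hypothetical, and the probe is a simulated stand-in for a real process check.

```c
/* Hypothetical probe type: returns nonzero if targetclid is alive. */
typedef int (*liveness_probe_fn)(void);

/* State captured once, when the service starts. */
struct daemon_state {
    int alive_at_startup;
};

/* Startup-only policy: sample the probe once and remember the answer. */
void daemon_state_init(struct daemon_state *st, liveness_probe_fn probe)
{
    st->alive_at_startup = probe() ? 1 : 0;
}

/* Returns the remembered answer, which may be stale by now. */
int use_daemon_startup_only(const struct daemon_state *st)
{
    return st->alive_at_startup;
}

/* Per-request policy: ask the probe again before every command. */
int use_daemon_per_request(liveness_probe_fn probe)
{
    return probe() ? 1 : 0;
}

/* Simulated probe for demonstration; flip fake_alive to mimic a crash. */
static int fake_alive = 1;
int fake_probe(void)
{
    return fake_alive;
}
```

If `fake_alive` is cleared after `daemon_state_init` has run, the startup-only policy still reports the daemon as usable while the per-request policy notices it is gone, which is the stale-state situation the comment above describes.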

@lxbsz
Collaborator

lxbsz commented Oct 18, 2019

For the problem reported above, I was using an old setup that was not clean:

[root@rhel3 gluster-block]# ls /usr/lib/python2.7/site-packages/rtslib
rtslib/ rtslib_fb/ rtslib_fb-2.1.63-py2.7.egg-info/ rtslib_fb-2.1.70-py2.7.egg

After cleaning out the old packages in the old setup, and also testing again in a new setup, everything works as expected and the hang no longer occurs.

Thanks.

@pkalever
Contributor Author

@lxbsz I updated this PR based on the latest improvements in targetcli PR#153; please help review. Thanks!

ret = gbRunner("ps aux ww | grep -w '[t]argetclid' > /dev/null");
if (ret) {
  LOG("mgmt", GB_LOG_WARNING, "targetclid not running, using targetcli");
  if (GB_ASPRINTF(&global_opts, "targetcli --skip-daemon; %s", tmp) == -1) {
Collaborator

BTW, should we also set auto_use_daemon=false here at the same time, since targetclid is not running?

Contributor Author

@pkalever pkalever Oct 22, 2019

@lxbsz '# targetcli --skip-daemon' will internally set auto_use_daemon=false. Since no daemon is running, we cannot send any commands through it; hence '# targetcli --skip-daemon' bypasses the call to the daemon even though 'auto_use_daemon=true' is in effect, and sets 'auto_use_daemon=false' via the CLI route.

Collaborator

@pkalever
Ah, yes, you are right. I misread that part of code.
Thanks.

Contributor Author

@lxbsz could you please test this and approve? Thanks!

Collaborator

@pkalever
I tested it earlier and it works. I thought we needed to wait for the targetcli_fb patches, which are not completely finished yet, and that the '--skip-daemon' option may still change?

Contributor Author

@lxbsz
That is right, we do need to wait for the targetcli patches; I just wanted to make sure we keep this patch ready until then.

Collaborator

@pkalever Sure, once that PR is done I will test it again and approve it.

Contributor Author

@lxbsz the targetcli improvements PR is merged now; could you please test and review this patch? Thanks!

Collaborator

@lxbsz lxbsz left a comment

Tested it and this looks good to me.
Thanks.

@pkalever
Copy link
Contributor Author

Tested this and it works for me as well. Merging now.

@pkalever pkalever merged commit 62473a1 into gluster:master Dec 16, 2019
@pkalever pkalever deleted the targetclid branch May 13, 2020 15:49