gluster-blockd crashes when we give invalid block-host #15

Closed
pranithk opened this issue Apr 28, 2017 · 1 comment

pranithk commented Apr 28, 2017

These are the steps I followed to reproduce the bug consistently:

[root@localhost block-meta]# gluster-block create v1/block3 ha 1 google.com 1GB --json-pretty
[root@localhost block-meta]# echo $?
255 <<--- Please note that there was no output.
[root@localhost block-meta]# ls # I was searching why this issue is happening for these two block-files
block1  block3  meta.lock
[root@localhost block-meta]# cat block3 # they have CONFIGINPROGRESS for google.com which will eventually fail and then gluster-blockd crashes
VOLUME: v1
GBID: 3798836d-1e3e-4b70-8e05-e8b187baaa1c
SIZE: 1073741824
HA: 1
ENTRYCREATE: INPROGRESS
ENTRYCREATE: SUCCESS
google.com: CONFIGINPROGRESS
[root@localhost block-meta]# cat block1
VOLUME: v1
GBID: ed260a6f-1550-4d36-b0c6-b35983eb76d0
SIZE: 1073741824
HA: 1
ENTRYCREATE: INPROGRESS
ENTRYCREATE: SUCCESS
google.com: CONFIGINPROGRESS
[root@localhost block-meta]# diff block1 block3
2c2
< GBID: ed260a6f-1550-4d36-b0c6-b35983eb76d0
---
> GBID: 3798836d-1e3e-4b70-8e05-e8b187baaa1c
[root@localhost block-meta]# gluster-block delete v1/block3
[root@localhost block-meta]# ls
block1  block3  meta.lock
[root@localhost block-meta]# systemctl status gluster-blockd
● gluster-blockd.service - Gluster block storage utility
   Loaded: loaded (/usr/lib/systemd/system/gluster-blockd.service; disabled; vendor preset: di
   Active: active (running) since Wed 2017-04-26 16:09:21 IST; 2min 13s ago
 Main PID: 7902 (gluster-blockd)
    Tasks: 13 (limit: 4915)
   CGroup: /system.slice/gluster-blockd.service
           └─7902 /usr/sbin/gluster-blockd

Apr 26 16:09:21 localhost.localdomain systemd[1]: Started Gluster block storage utility.
[root@localhost block-meta]# gluster-block delete v1/block3
[root@localhost block-meta]# systemctl status gluster-blockd
● gluster-blockd.service - Gluster block storage utility
   Loaded: loaded (/usr/lib/systemd/system/gluster-blockd.service; disabled; vendor preset: di
   Active: failed (Result: signal) since Wed 2017-04-26 16:11:37 IST; 10s ago
  Process: 7902 ExecStart=/usr/sbin/gluster-blockd (code=killed, signal=SEGV)
 Main PID: 7902 (code=killed, signal=SEGV)

Apr 26 16:09:21 localhost.localdomain systemd[1]: Started Gluster block storage utility.
Apr 26 16:11:37 localhost.localdomain systemd[1]: gluster-blockd.service: Main process exited,
Apr 26 16:11:37 localhost.localdomain systemd[1]: gluster-blockd.service: Unit entered failed 
Apr 26 16:11:37 localhost.localdomain systemd[1]: gluster-blockd.service: Failed with result '

The crash shows the following backtrace:
Thread 1 (Thread 0x7f320d44e700 (LWP 7723)):
#0  0x00007f320e46c046 in strlen () from /lib64/libc.so.6
#1  0x00007f320e46bd7e in strdup () from /lib64/libc.so.6
#2  0x00007f320e9adcab in json_object_new_string () from /lib64/libjson-c.so.2
#3  0x00005654cb1c25b9 in block_create_cli_format_response (blk=blk@entry=0x7f320d44d550, 
    errCode=255, errMsg=<optimized out>, savereply=0x7f32000ca4a0, reply=0x7f3200004f10)
    at block_svc_routines.c:975
#4  0x00005654cb1c2a38 in block_create_cli_1_svc (blk=blk@entry=0x7f320d44d550, 
    rqstp=rqstp@entry=0x7f320d44d7b0) at block_svc_routines.c:1185
#5  0x00005654cb1be1bb in gluster_block_cli_1 (rqstp=0x7f320d44d7b0, transp=0x7f3200002bd0)
    at block_svc.c:70
#6  0x00007f320e5192a1 in svc_getreq_common () from /lib64/libc.so.6
#7  0x00007f320e5193e7 in svc_getreq_poll () from /lib64/libc.so.6
#8  0x00007f320e51cd01 in svc_run () from /lib64/libc.so.6
#9  0x00005654cb1be055 in glusterBlockCliThreadProc (vargp=<optimized out>)
    at gluster-blockd.c:101
#10 0x00007f320f7176ca in start_thread () from /lib64/libpthread.so.0
#11 0x00007f320e4e6f6f in clone () from /lib64/libc.so.6
(gdb) fr 3
#3  0x00005654cb1c25b9 in block_create_cli_format_response (blk=blk@entry=0x7f320d44d550, 
    errCode=255, errMsg=<optimized out>, savereply=0x7f32000ca4a0, reply=0x7f3200004f10)
    at block_svc_routines.c:975
975	    json_object_object_add(json_obj, "IQN",
(gdb) l
970	    return;
971	  }
972	
973	  if (blk->json_resp) {
974	    json_obj = json_object_new_object();
975	    json_object_object_add(json_obj, "IQN",
976	                           json_object_new_string(savereply->iqn));
977	
978	    json_array = json_object_new_array();
979	
(gdb) p savereply->iqn
$1 = 0x0
(gdb) p errCode
$2 = 255
(gdb) p errMsg
$3 = <optimized out>

This test was done with an older patchset, so the line numbers may not match.

@pranithk pranithk self-assigned this Apr 28, 2017
@pranithk pranithk added the bug label Apr 28, 2017
@pranithk pranithk added this to the 0.2 milestone Apr 28, 2017

pkalever commented May 5, 2017

The SIGPIPE is issued due to a timeout.
connect() takes too long to return for invalid IP addresses; the delay is much higher than the CLI RPC timeout, so the CLI returns before the daemon does.

The issue can be fixed by increasing the CLI RPC clnt_call() TIMEOUT.

This should actually be handled like:

struct timeval tv;
CLIENT *cl;

cl = clnt_create("somehost", SOMEPROG, SOMEVERS, "tcp");
if (cl == NULL) {
    exit(1);
}
tv.tv_sec = 60;   /* change timeout to 1 minute */
tv.tv_usec = 0;
clnt_control(cl, CLSET_TIMEOUT, &tv);

But there is a bug in Sun RPC which ignores the TIMEOUT set using clnt_control().
See https://lists.gnu.org/archive/html/bug-glibc/2000-10/msg00095.html

Hence we use a regex to override the default TIMEOUT in the generated RPC code.
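The rpcgen-generated client stub hard-codes a line of the form static struct timeval TIMEOUT = { 25, 0 };, and because of the glibc bug above that value wins over clnt_control(). A build-time substitution along these lines can raise it (the demo file, the 300-second value, and the exact substitution are illustrative; the real build would run sed against the rpcgen output file):

```shell
# Demonstrate the substitution on a sample rpcgen-style line.
# In the real build this would run against the generated *_clnt.c file.
echo 'static struct timeval TIMEOUT = { 25, 0 };' > /tmp/stub_demo.c
sed -i 's/TIMEOUT = { 25, 0 }/TIMEOUT = { 300, 0 }/' /tmp/stub_demo.c
cat /tmp/stub_demo.c
# prints: static struct timeval TIMEOUT = { 300, 0 };
```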
