Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIGSEGV in TopicImpl.cpp partitioner_cb_trampoline #162

Closed
ihersht opened this issue Nov 4, 2014 · 25 comments
Closed

SIGSEGV in TopicImpl.cpp partitioner_cb_trampoline #162

ihersht opened this issue Nov 4, 2014 · 25 comments

Comments

@ihersht
Copy link

ihersht commented Nov 4, 2014

Wrong line (SIGSEGV when keylen == 0 , on random)

std::string key(static_cast<const char *>(keydata), keylen);

I understand that C++ interface is not official but have never seen that bad C++.

@edenhill
Copy link
Contributor

edenhill commented Nov 4, 2014

Hi,

can you provide a backtrace from gdb and 'p keydata'?

Thanks

@edenhill
Copy link
Contributor

edenhill commented Nov 4, 2014

Some more questions:

  • is this using rdkafka_example_cpp or your own application?
  • what platform are you on?
  • what compiler?

@ihersht
Copy link
Author

ihersht commented Nov 4, 2014

is this using rdkafka_example_cpp or your own application YES
-bash-3.2$ uname -r
2.6.18-371.1.2.el5xen
-bash-3.2$ gcc --version
gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54)

At the same time the questions are irrelevant because it is basic C++ bug.

From: Magnus Edenhill [mailto:[email protected]]
Sent: Tuesday, November 04, 2014 4:12 PM
To: edenhill/librdkafka
Cc: Igor Hersht
Subject: Re: [librdkafka] SIGSEGV in TopicImpl.cpp partitioner_cb_trampoline (#162)

Some more questions:

  • is this using rdkafka_example_cpp or your own application?
  • what platform are you on?
  • what compiler?


Reply to this email directly or view it on GitHubhttps://github.com//issues/162#issuecomment-61715251.

@edenhill
Copy link
Contributor

edenhill commented Nov 4, 2014

I can't reproduce this on gcc 4.9.1.
Can you elaborate on what the bug is exactly?

@ihersht
Copy link
Author

ihersht commented Nov 4, 2014

SIGSEGV is not always reproducible ( that is why it is very bad). Ask somebody who knows basic C++.
File TopicImpl.cpp function partitioner_cb_trampoline , line
std::string key(static_cast<const char *>(keydata), keylen);
Creating std::string with length 0 is this way is illegal.

From: Magnus Edenhill [mailto:[email protected]]
Sent: Tuesday, November 04, 2014 4:25 PM
To: edenhill/librdkafka
Cc: Igor Hersht
Subject: Re: [librdkafka] SIGSEGV in TopicImpl.cpp partitioner_cb_trampoline (#162)

I can't reproduce this on gcc 4.9.1.
Can you elaborate on what the bug is exactly?


Reply to this email directly or view it on GitHubhttps://github.com//issues/162#issuecomment-61717105.

@edenhill
Copy link
Contributor

edenhill commented Nov 4, 2014

I think it is okay to create a std::string with length 0,
I can't find any proof for the opposite, can you?

Try this program and see if it crashes too, otherwise I think there is another problem and I would need the stacktrace from GDB.

#include <string>
#include <iostream>

int main (void) {
  std::string s((const char *)"hi", 0);
  std::cout << "str " << s << " len " << s.length() << std::endl;
  return 0;
}

@ihersht
Copy link
Author

ihersht commented Nov 4, 2014

Yes it is okay to create a std::string with length 0.

Take a look at C++ standars (2003)
Table 40
data() points at the first element of an
allocated copy of the array whose
first element is pointed at by s
size() n

You cannot have array of size 0

From: Magnus Edenhill [mailto:[email protected]]
Sent: Tuesday, November 04, 2014 4:42 PM
To: edenhill/librdkafka
Cc: Igor Hersht
Subject: Re: [librdkafka] SIGSEGV in TopicImpl.cpp partitioner_cb_trampoline (#162)

I think it is okay to create a std::string with length 0,
I can't find any proof for the opposite, can you?

Try this program and see if it crashes too, otherwise I think there is another problem and I would need the stacktrace from GDB.

#include

#include

int main (void) {

std::string s((const char *)"hi", 0);

std::cout << "str " << s << " len " << s.length() << std::endl;

return 0;

}


Reply to this email directly or view it on GitHubhttps://github.com//issues/162#issuecomment-61719866.

@ihersht
Copy link
Author

ihersht commented Nov 4, 2014

From: Igor Hersht
Sent: Tuesday, November 04, 2014 4:56 PM
To: 'edenhill/librdkafka'
Subject: RE: [librdkafka] SIGSEGV in TopicImpl.cpp partitioner_cb_trampoline (#162)

Yes it is okay to create a std::string with length 0.

Take a look at C++ standars (2003)
Table 40
data() points at the first element of an
allocated copy of the array whose
first element is pointed at by s
size() n

You cannot have array of size 0

From: Magnus Edenhill [mailto:[email protected]]
Sent: Tuesday, November 04, 2014 4:42 PM
To: edenhill/librdkafka
Cc: Igor Hersht
Subject: Re: [librdkafka] SIGSEGV in TopicImpl.cpp partitioner_cb_trampoline (#162)

I think it is okay to create a std::string with length 0,
I can't find any proof for the opposite, can you?

Try this program and see if it crashes too, otherwise I think there is another problem and I would need the stacktrace from GDB.

#include

#include

int main (void) {

std::string s((const char *)"hi", 0);

std::cout << "str " << s << " len " << s.length() << std::endl;

return 0;

}


Reply to this email directly or view it on GitHubhttps://github.com//issues/162#issuecomment-61719866.

@edenhill
Copy link
Contributor

edenhill commented Nov 4, 2014

Can you please provide the output from gdb command 'bt' from one of these crashes?

@ihersht
Copy link
Author

ihersht commented Nov 4, 2014

I cannot now (I didn’t set ulimit). I fixed the bug in my version of the library.
I am done.

From: Magnus Edenhill [mailto:[email protected]]
Sent: Tuesday, November 04, 2014 5:05 PM
To: edenhill/librdkafka
Cc: Igor Hersht
Subject: Re: [librdkafka] SIGSEGV in TopicImpl.cpp partitioner_cb_trampoline (#162)

Can you please provide the output from gdb command 'bt' from one of these crashes?


Reply to this email directly or view it on GitHubhttps://github.com//issues/162#issuecomment-61723250.

@edenhill
Copy link
Contributor

edenhill commented Nov 4, 2014

Can you share your patch?

@edenhill edenhill added invalid and removed invalid labels Nov 4, 2014
@ihersht
Copy link
Author

ihersht commented Nov 5, 2014

I remove my local fix in the library (sorry but I just don’t have time to work on edenhill library) and rewrite producer::send with key in my application to call partitioner in my application. This is just way around the edenhill bug, but it is OK with me.
I am done

@edenhill
Copy link
Contributor

edenhill commented Nov 5, 2014

Glad to hear that! :)

@edenhill edenhill closed this as completed Nov 5, 2014
@ihersht
Copy link
Author

ihersht commented Nov 5, 2014

I don’t get it. Why you closed the valid serious bug.

@edenhill
Copy link
Contributor

edenhill commented Nov 5, 2014

Because I have nothing to go on:

  • no crash backtrace
  • unable to reproduce
  • vague error description (I cant find any statement that a zero length std::string is illegal)

@ihersht
Copy link
Author

ihersht commented Nov 5, 2014

You have C++ standard. You have your code which is wrong according to C++ standard. Take a look at C++ standard 21.3.1 table 40.
“data() points at the first element of an
allocated copy of the array whose
first element is pointed at by s
size() n
If n == 0 (when one try to use producer::send with key) then you trying to copy array of size 0. This is illegal to have array of size 0.
The fix is very simple just to do special case If (n == 0)

Yes. SIGSEGV is not always reproducible. That is way this is one of the worse bug possible.

From: Magnus Edenhill [mailto:[email protected]]
Sent: Wednesday, November 05, 2014 10:46 AM
To: edenhill/librdkafka
Cc: Igor Hersht
Subject: Re: [librdkafka] SIGSEGV in TopicImpl.cpp partitioner_cb_trampoline (#162)

Because I have nothing to go on:

  • no crash backtrace
  • unable to reproduce
  • vague error description (I cant find any statement that a zero length std::string is illegal)


Reply to this email directly or view it on GitHubhttps://github.com//issues/162#issuecomment-61827314.

@edenhill
Copy link
Contributor

edenhill commented Nov 5, 2014

I'm sorry but I dont think that is the problem, I'm quite sure C++ supports empty std::strings.

So I ask you again, could you please provide a gdb backtrace?

@ihersht
Copy link
Author

ihersht commented Nov 5, 2014

Yes. C++ supports empty std::strings.
C++ does not support arrays of size 0 This is the bug.

I gave up rapiding myself and sending pointers to C++ standard. I am talking about one thing, you are talking about something else.
Useless. Done.

From: Magnus Edenhill [mailto:[email protected]]
Sent: Wednesday, November 05, 2014 12:03 PM
To: edenhill/librdkafka
Cc: Igor Hersht
Subject: Re: [librdkafka] SIGSEGV in TopicImpl.cpp partitioner_cb_trampoline (#162)

I'm sorry but I dont think that is the problem, I'm quite sure C++ supports empty std::strings.

So I ask you again, could you please provide a gdb backtrace?


Reply to this email directly or view it on GitHubhttps://github.com//issues/162#issuecomment-61843061.

@hulaxiaodai
Copy link

is this bug fixed? I can reproduce it every。 below is my backtrace:

#0 0x00000000004aff97 in partitioner_cb_trampoline (rkt=0x7fffb8005760, keydata=0x0, keylen=0, partition_cnt=1, rkt_opaque=0x7fffb81ff050, msg_opaque=0x0) at src/TopicImpl.cpp:59
#1 0x000000000058dbca in rd_kafka_msg_partitioner (rkt=rkt@entry=0x7fffb8005760, rkm=rkm@entry=0x7fffb81fd610, do_lock=do_lock@entry=1) at rdkafka_msg.c:745
#2 0x000000000058dcfe in rd_kafka_msg_new (rkt=0x7fffb8005760, force_partition=, msgflags=2, payload=0x7fffb81e54d8 "2018-05-16 19:57:42.937 mingjiadeMacBook-Pro.local com.sankuai.log.sdk - - - ", 'a' <repeats 123 times>..., len=386, key=0x0, keylen=0,
msg_opaque=0x0) at rdkafka_msg.c:260
#3 0x000000000049216f in RdKafka::ProducerImpl::produce (this=0x7fffb8001730, topic=0x7fffb81ff050, partition=-1, msgflags=2, payload=0x7fffb81e54d8, len=386, key=0x0, msg_opaque=0x0) at src/ProducerImpl.cpp:102
#4 0x00000000004708db in KafkaSink::run (this=0xb63b80) at src/kafka_sink.cpp:208
#5 0x00000000004735d1 in boost::_mfi::mf0<void, KafkaSink>::operator() (this=0xb677e0, p=0xb63b80) at ./include/boost/bind/mem_fn_template.hpp:49
#6 0x00000000004733c6 in boost::_bi::list1<boost::_bi::value<KafkaSink*> >::operator()<boost::_mfi::mf0<void, KafkaSink>, boost::_bi::list0> (this=0xb677f0, f=..., a=...) at ./include/boost/bind/bind.hpp:253
#7 0x00000000004732d7 in boost::_bi::bind_t<void, boost::_mfi::mf0<void, KafkaSink>, boost::_bi::list1<boost::_bi::value<KafkaSink*> > >::operator() (this=0xb677e0) at ./include/boost/bind/bind_template.hpp:20
#8 0x0000000000473229 in boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf0<void, KafkaSink>, boost::_bi::list1<boost::_bi::value<KafkaSink*> > >, void>::invoke (function_obj_ptr=...)
at ./include/boost/function/function_template.hpp:153
#9 0x0000000000466f4a in boost::function0::operator() (this=0xb677d8) at ./include/boost/function/function_template.hpp:767
#10 0x000000000046c22e in boost::detail::thread_data<boost::function<void ()> >::run() (this=0xb67620) at ./include/boost/thread/detail/thread.hpp:117
#11 0x00000000004fb7cb in thread_proxy ()
#12 0x00007ffff6ed0aa1 in start_thread () from /lib64/libpthread.so.0
#13 0x00007ffff6c1d93d in clone () from /lib64/libc.so.6

@edenhill
Copy link
Contributor

In gdb, can you do p key and p keydata and p keylen ?

@hulaxiaodai
Copy link

this is gdb print:
(gdb) p keydata
$1 = (const void *) 0x0
(gdb) p keylen
$2 = 0

and I produce msg use the code in rdkafka_example.cpp,i set key "null",
this is my code:

for (it = mess->begin(); it != mess->end(); it++) {
logentry_ptr_t msg = *it;
string topic_str = "org.";
topic_str += msg->category;

                RdKafka::Topic *topic = RdKafka::Topic::create(producer, topic_str,
                                                               topic_conf, errstr);
                if (!topic) {
                    std::cerr << "Failed to create topic: " << errstr << std::endl;
                    exit(1);
                }

                RdKafka::ErrorCode resp =
                        producer->produce(topic,partition,
                                          RdKafka::Producer::RK_MSG_COPY /* Copy payload */,
                                          const_cast<char *>(msg->message.c_str()), msg->message.size(),
                                          NULL, NULL);
                if (resp != RdKafka::ERR_NO_ERROR)
                    std::cerr << "% Produce failed: " <<
                              RdKafka::err2str(resp) << std::endl;
                else
                    std::cerr << "% Produced message (" << msg->message.size() << " bytes)" <<
                              std::endl;
            }
            producer->poll(0);

@edenhill
Copy link
Contributor

What if you print p topicimpl->partitioner_cb_->partitioner_cb ?
It could be that your partitioner instance has gone out of scope (e.g., stack allocated)

@hulaxiaodai
Copy link

(gdb) p topicimpl->partitioner_cb_->partitioner_cb
Cannot take address of method partitioner_cb.

maybe I should initialize partitioner in the same thread?

@hulaxiaodai
Copy link

it works for me, initialize partitioner in the same thread. but is every produce should own a partitioner instance?

@edenhill
Copy link
Contributor

You can use the same partitioner instance for multiple producers, just make sure it doesn't go out of scope / gets deleted / freed prior to all producer instances being destroyed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants