
Having trouble with the KeepAlive setting: after a network disconnect, the server takes too long to detect the disconnection #73

Closed
bo-er opened this issue Mar 9, 2021 · 1 comment

bo-er commented Mar 9, 2021

What version of Gmqtt are you using?

I am using the latest version, v0.3.0.

What did you do?

Here is the client file I am using to reproduce this error:

package main

import (
	"fmt"
	"log"
	"os"
	"os/signal"
	"syscall"

	mqtt "github.com/eclipse/paho.mqtt.golang"
)

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
	topic1 := "test/topic"

	go testPublishEvery10Seconds("sws1", topic1)

	// go testPublishEvery10Seconds("sws1", topic2)
	<-sigs
}

func testPublishEvery10Seconds(clientID, topic string) {
	clientOptions := mqtt.NewClientOptions()
	clientOptions.AddBroker("tcp://localhost:18888")
	clientOptions.SetClientID(clientID)
	clientOptions.SetUsername("admin")
	clientOptions.SetPassword("ilovesws")
	clientOptions.SetProtocolVersion(4)
	clientOptions.SetKeepAlive(30)
	clientOptions.SetCleanSession(false)
	var f mqtt.MessageHandler = func(client mqtt.Client, msg mqtt.Message) {
		fmt.Printf("TOPIC: %s\n", msg.Topic())
		fmt.Printf("MSG: %s\n", msg.Payload())
	}

	clientOptions.SetDefaultPublishHandler(f)

	c := mqtt.NewClient(clientOptions)

	if token := c.Connect(); token.Wait() && token.Error() != nil {
		log.Fatalf("Error on Client.Connect(): %v", token.Error())
	}

}

I am using the above code to connect to a gmqtt server. As soon as the connection is established, I cut the network (I am on a Mac, so "disconnect" here means turning Wi-Fi off). The server then takes a very long time to detect the disconnection, sometimes two and a half minutes!

I have set KeepAlive to 30 seconds, so this should not happen, right?

By the way, after searching Google I found that KeepAlive is actually the interval at which the client sends packets to the server to make sure the connection is still alive. So right now I am a bit confused, because there is also a max_keepalive setting in config.yml.

And here is my config.yml:

listeners:
    - address: ":18888"
      websocket:
        path: "/"
    # - address: "localhost:18889"
    #   websocket:
    #     path: "/"


mqtt:
  session_expiry: 1m
  message_expiry: 1m
  max_packet_size: 200
  server_receive_maximum: 65535
  max_keepalive: 60 # unlimited
  topic_alias_maximum: 0 # 0 means not Supported
  subscription_identifier_available: true
  wildcard_subscription_available: false
  shared_subscription_available: true
  maximum_qos: 2
  retain_available: false
  max_queued_messages: 1000
  max_inflight: 32
  max_awaiting_rel: 100
  queue_qos0_messages: true
  delivery_mode: overlap # overlap or onlyonce
  allow_zero_length_clientid: true

log:
  level: debug # debug | info | warning | error

And here is my onBasicAuth hook:

var OnBasicAuth server.OnBasicAuth = func(ctx context.Context, client server.Client, req *server.ConnectRequest) error {
	username := string(req.Connect.Username)
	password := string(req.Connect.Password)
	isvalid := validation(username, password)
	if !isvalid {
		// check the client version, return a compatible reason code.
		switch client.Version() {
		case packets.Version5:
			return codes.NewError(codes.BadUserNameOrPassword)
		case packets.Version311:
			return codes.NewError(codes.V3BadUsernameorPassword)
		}
	}
	// return nil if pass authentication.
	return nil
}

What did you expect to see?

I expect the server to detect a disconnection in 45 seconds or 60 seconds, not minutes.

What did you see instead?

I saw an error, but it occurred far too late:

ERROR	server/client.go:240	connection lost	{"client_id": "sws1", "error": "read tcp x.x.x.x:18888->x.x.x.x:16752: read: connection timed out"}

Many thanks

If you go through my issue, I would like to give you a big thank you!🙇🏻‍♂️

bo-er commented Mar 9, 2021

After struggling for days, and with the help of gmqtt's owner DrmagicE, I finally figured out what went wrong!
The cause is a misleading comment in the paho.mqtt.golang repository. The comment reads as follows and hasn't been fixed for years🤷🏻‍♂️

// SetKeepAlive will set the amount of time (in seconds) that the client
// should wait before sending a PING request to the broker. This will
// allow the client to know that a connection has not been lost with the
// server.
func (o *ClientOptions) SetKeepAlive(k time.Duration) *ClientOptions {
	o.KeepAlive = int64(k / time.Second)
	return o
}

It says SetKeepAlive will set the amount of time (in seconds)! That's wrong! The parameter is a time.Duration, whose unit is nanoseconds, not seconds. There is a big difference: a bare 30 means 30 nanoseconds, which k / time.Second truncates to 0.

That is, the paho.mqtt.golang client's keepalive should be set like this:

clientOptions.SetKeepAlive(time.Duration(30)*time.Second)

not

clientOptions.SetKeepAlive(30)

It took me days to figure this out (with the help of gmqtt's main contributor, who is very generous, kind, and smart too!).
There is actually already an issue raised by another victim of this mistake: "Docs for SetKeepAlive on connection options are incorrect".

OMG, this is so poisonous. Without DrmagicE's guidance I would have been mentally tortured for who knows how many more days or weeks!

@bo-er bo-er closed this as completed Mar 9, 2021