[BUG] Not freeing system sockets when calling srt_close() #1822
I encountered this problem early in the development of SRT (circa 2013) and then implemented a workaround in my wrapper lib that was so effective that I forgot about it... until today. Basically, when …
The problem we have is that the user might select other protocols or features we do not control on that previously used SRT port, so we can't fix this issue in any wrapper. We have a user interface where the user can select many different protocols, SRT being one of them. The user usually selects another protocol on the same port. Then our system fails because SRT still has the port bound that we freed. There are ugly workarounds for things like that, but it boils down to the question of how it should work, and my view is that closing a socket should block the call until the system socket is released.
@jeandube I fixed this as one of the first things I did in SRT :) See … That was, however, only the first problem, which was blocking the listener and making it impossible to close one listener and immediately create another one binding to the same port. Now that works, but only because the new listener borrows the multiplexer from the closed-but-not-yet-recycled previous listener socket before it loses all its users and gets deleted. What is likely required now is that the socket is disconnected from its multiplexer immediately when deletion is requested. This is actually a bigger problem, because in order to disconnect the socket from the multiplexer - and the associated sender and receiver queues - all activities concerning this socket must first be stopped. The biggest problem is actually with those weird secondary lists that collect sockets to be operated on from another thread, which are not exactly well synchronized with socket deletion.
@andersc Just note that what is really required here (and is possible to do) is to disconnect the SRT socket from the multiplexer early, not to forcefully destroy the multiplexer. The multiplexer will of course be destroyed when disconnected from the last socket, but you can never be sure that this will happen. The multiplexer is shared between sockets in two cases: sockets accepted from a listener share the listener's multiplexer, and caller sockets explicitly bound to the same local port share one multiplexer.
To make sure that the multiplexer is destroyed, the application must track all sockets and know what to do to make it happen: for a caller socket, do not bind explicitly (or at least close all sharing sockets), and for a listener, if you close the listener, you also need to close all sockets that were accepted off it. Only then will the listener's multiplexer be destroyed.
So my workaround was maybe not so effective after all. "C" workaround vs. C++ fix.
This actually has nothing to do with C++. This is UDT's design, which makes the socket garbage-collected after being closed, while some activities still need to continue in other threads. But I think it should be possible to move the multiplexer-disconnection code so that it happens immediately on close, and actually in every place where the socket is moved from …
@ethouris Don't you extend the definition of the Multiplexer too much here? I thought that all accepted sockets share the local port, but share the Multiplexer only if they are with the same peer.
There's no other way in SRT to share the local port than sharing the Multiplexer. The Multiplexer owns the Channel, the Channel owns the UDP socket, and the UDP socket owns the port when bound with the default SO_REUSEADDR=false. The Multiplexer also owns the sender and receiver queues. When you send a packet over a socket, it lands in the multiplexer's sender queue; it doesn't matter that the same queue contains packets to be sent to different peers, nor that the receiver queue gets packets incoming through this same UDP socket from different peers.
I also encountered the same problem. Our scenario is that the listener lives in a service process. Every time the socket returned by srt_accept() is closed, we observe that the internal resources are not released, so each iteration of the following loop causes the memory to keep increasing (by about 100-200 KB).
This was observed with a tool (https://github.com/bytedance/memory-leak-detector.git): after each srt_accept() -> srt_close() sequence, the tool reports objects that were never freed.
Could you please tell me if there is an effective way to avoid this problem? Our main concern is that memory must not keep growing, because this is a service process and I can't call srt_cleanup() directly. In addition, I noticed the following PR; is it related to this problem, and is it stable enough to be merged into master?
In addition, after merging #1829 (some minor corrections were needed due to version changes), the memory still continues to grow.
Status Update

After PR #2643 (SRT v1.5.2) the closure time has been reduced from 4.5 s to 1.5 s.
Removing the artificial delay of 1 s before closing a socket makes it close almost immediately, though still in the background thread.

The Change

```diff
diff --git a/srtcore/api.cpp b/srtcore/api.cpp
index 7ec8ff57..e3015a97 100644
--- a/srtcore/api.cpp
+++ b/srtcore/api.cpp
@@ -2678,7 +2678,7 @@ void srt::CUDTUnited::checkBrokenSockets()
         // RcvUList
         const steady_clock::time_point now = steady_clock::now();
         const steady_clock::duration closed_ago = now - j->second->m_tsClosureTimeStamp;
-        if (closed_ago > seconds_from(1))
+        //if (closed_ago > seconds_from(1))
         {
             CRNode* rnode = j->second->core().m_pRNode;
             if (!rnode || !rnode->m_bOnList)
```

Unit Test (or use the …)

```cpp
#include <gtest/gtest.h>
#include <thread>
#include <chrono>
#include <cstring>   // memset
#include <iostream>  // std::cerr
#ifdef _WIN32
#define INC_SRT_WIN_WINTIME // exclude gettimeofday from srt headers
#else
#include <arpa/inet.h> // inet_pton
#endif
#include "srt.h"

TEST(Listener, Restart)
{
    srt_startup();
    srt_setloglevel(LOG_NOTICE);

    auto s = srt_create_socket();
    sockaddr_in bind_sa;
    memset(&bind_sa, 0, sizeof bind_sa);
    bind_sa.sin_family = AF_INET;
    ASSERT_EQ(inet_pton(AF_INET, "127.0.0.1", &bind_sa.sin_addr), 1);
    bind_sa.sin_port = htons(5555);

    EXPECT_NE(srt_bind(s, (sockaddr*)&bind_sa, sizeof bind_sa), -1);
    EXPECT_NE(srt_listen(s, 5), -1);
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
    srt_close(s);

    // Recreate the listener and retry binding to the same port until the
    // background GC actually releases it.
    s = srt_create_socket();
    int optval = 100;
    int optlen = sizeof optval;
    EXPECT_NE(srt_setsockflag(s, SRTO_IPTTL, (void*)&optval, optlen), SRT_ERROR) << srt_getlasterror_str();

    const auto time_start = std::chrono::steady_clock::now();
    while (srt_bind(s, (sockaddr*)&bind_sa, sizeof bind_sa) == -1)
    {
        std::this_thread::sleep_for(std::chrono::milliseconds(500));
    }
    std::cerr << "Binding took "
              << std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - time_start).count()
              << " ms\n";
    //EXPECT_NE(srt_bind(s, (sockaddr*)&bind_sa, sizeof bind_sa), -1);
    EXPECT_NE(srt_listen(s, 5), -1);

    srt_close(s);
    srt_cleanup();
}
```
Describe the bug
SRT has a problem freeing system sockets the way it's currently implemented.

If I start one server and then …, the system socket is released immediately after

srt_cleanup();

However, we dynamically start and stop multiple SRT server instances all the time, meaning we can't call

srt_cleanup();

since other servers in the system are still active while we close one.

The problem then is that

srt_close(socket);

seems to just mark the system socket for deletion, and the garbage collector that runs periodically actually closes the system socket only after a while. Programmatically, the user of SRT then thinks the socket is free to be reused, but 'bind' will fail.

srt_close()

should close the system socket.

To Reproduce
(This is verified on Ubuntu 20.04 and macOS 11.2.1.)
There is a repo verifying this claim:
bugtest
bugtestmain.cpp
Run the bugtest executable, and in another terminal check which process is binding the port; on Linux:
netstat -tulpen | grep 8000
Expected behavior
Run the bugtest application, and when the application prints
Stopped and destroyed the server, now the port should be free.
run the netstat command. The socket is still bound to the application for a while (3-5 seconds).