Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Three subtests fail in socket-api-tests #7

Open
musicinmybrain opened this issue Oct 19, 2023 · 5 comments
Open

Three subtests fail in socket-api-tests #7

musicinmybrain opened this issue Oct 19, 2023 · 5 comments

Comments

@musicinmybrain
Copy link

We noticed this when the mmlib package in Fedora Linux started to fail to build from source on Fedora 38, 39 (which is about to be released), and 40 (Rawhide/development). However, I can reproduce it in a fresh git checkout of the current master, f416f7a.

gh repo clone mmlabs-mindmaze/mmlib
cd mmlib
meson setup _build
meson compile -C _build
meson test -C _build --print-errorlogs
[…]
▶ 5/5 - ../tests/socket-api-tests.c:socket:getaddrinfo_valid: Assertion 'rp->ai_socktype == exp_socktype' failed FAIL          
▶ 5/5 - ../tests/socket-api-tests.c:socket:getaddrinfo_error: Assertion 'mm_getaddrinfo("localhost", "ssh", &hints, &res) == -1' failed FAIL          
▶ 5/5 - ../tests/socket-api-tests.c:socket:create_invalid_sockclient: Assertion 'mm_create_sockclient("ssh://localhost:10") == -1' failed FAIL          
5/5 unit api tests        FAIL            11.00s   418/421 subtests passed
>>> LD_LIBRARY_PATH=/home/ben/src/forks/mmlib/_build/src MALLOC_PERTURB_=190 CK_VERBOSITY=silent /home/ben/src/forks/mmlib/_build/tests/testapi
[…]

testlog.txt

I haven’t attempted to dig more deeply into why these might be failing.

@musicinmybrain
Copy link
Author

Some fprintf debugging showed that in the getaddrinfo_valid test for ssh://localhost, the two getaddrinfo entries both have ai_family = AF_INET6 and have ai_socktype = SOCK_STREAM and SOCK_DGRAM, respectively. There could be more, but the test fails when it hits SOCK_DGRAM.

We can reproduce this in Python:

>>> from pprint import pprint
>>> from socket import getaddrinfo, AF_UNSPEC
>>> pprint(getaddrinfo('localhost', 'ssh', family=AF_UNSPEC))
[(<AddressFamily.AF_INET6: 10>,
  <SocketKind.SOCK_STREAM: 1>,
  6,
  '',
  ('::1', 22, 0, 0)),
 (<AddressFamily.AF_INET6: 10>,
  <SocketKind.SOCK_DGRAM: 2>,
  17,
  '',
  ('::1', 22, 0, 0)),
 (<AddressFamily.AF_INET6: 10>,
  <SocketKind.SOCK_STREAM: 1>,
  132,
  '',
  ('::1', 22, 0, 0)),
 (<AddressFamily.AF_INET6: 10>,
  <SocketKind.SOCK_SEQPACKET: 5>,
  132,
  '',
  ('::1', 22, 0, 0)),
 (<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_STREAM: 1>,
  6,
  '',
  ('127.0.0.1', 22)),
 (<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_DGRAM: 2>,
  17,
  '',
  ('127.0.0.1', 22)),
 (<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_STREAM: 1>,
  132,
  '',
  ('127.0.0.1', 22)),
 (<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_SEQPACKET: 5>,
  132,
  '',
  ('127.0.0.1', 22))]

Those definitely do not all have ai_socktype = SOCK_STREAM as the test code expects.

I don’t have a full explanation for why this wasn’t failing in the past, but it seems like the test code should be setting ai_socktype = SOCK_STREAM in the hints struct if it’s going to test that all results are of that type, and indeed, this fixes the failing getaddrinfo_valid test:

diff --git a/tests/socket-api-tests.c b/tests/socket-api-tests.c
index 722e037..8b08672 100644
--- a/tests/socket-api-tests.c
+++ b/tests/socket-api-tests.c
@@ -985,7 +985,10 @@ void assert_addrinfo(struct addrinfo* rp, int exp_socktype, short exp_port)
 START_TEST(getaddrinfo_valid)
 {
        struct addrinfo *rp, *res = NULL;
-       struct addrinfo hints = {.ai_family = AF_UNSPEC};
+       struct addrinfo hints = {
+               .ai_family = AF_UNSPEC,
+               .ai_socktype = SOCK_STREAM
+       };
 
        ck_assert(mm_getaddrinfo("localhost", "ssh", &hints, &res) == 0);
        for (rp = res; rp != NULL; rp = rp->ai_next)

@musicinmybrain
Copy link
Author

Given the above, the failure in getaddrinfo_error is easily explained: we expect ck_assert(mm_getaddrinfo("localhost", "ssh", &hints, &res) == -1); to fail, but in fact getaddrinfo is happy to return an entry with SOCK_DGRAM.

I’m not sure if this is a very good test, but I can get it to work again by asking for SOCK_RAW.

diff --git a/tests/socket-api-tests.c b/tests/socket-api-tests.c
index 8b08672..5e133f6 100644
--- a/tests/socket-api-tests.c
+++ b/tests/socket-api-tests.c
@@ -1016,7 +1016,7 @@ START_TEST(getaddrinfo_error)
        ck_assert(mm_getaddrinfo("localhost", "joke", &hints, &res) == -1);
        ck_assert_int_eq(mm_get_lasterror_number(), MM_ENOTFOUND);
 
-       hints.ai_socktype = SOCK_DGRAM;
+       hints.ai_socktype = SOCK_RAW;
        ck_assert(mm_getaddrinfo("localhost", "ssh", &hints, &res) == -1);
        ck_assert_int_eq(mm_get_lasterror_number(), MM_ENOTFOUND);
        hints.ai_socktype = 0;

@musicinmybrain
Copy link
Author

For the create_invalid_sockclient failure, the expectation is that mm_create_sockclient("ssh://localhost:10") will fail. Consider that mm_create_sockclient (or more specifically, create_connected_socket) loops through the candidate addresses until it finds one that succeeds. I think what’s happening here is that when getaddrinfo offered only SOCK_STREAM addresses, this always failed because nobody was listening on port 10. But now, once it tries a SOCK_DGRAM address, creating and “connecting” the socket succeeds, because all it needs is a valid peer address that is potentially deliverable based on the local network configuration; there is no verification that anything is actually listening there.

I’m not sure what to do about this one. The outcome is clearly not what was intended, but seems to be “correct” in terms of what the API guarantees.

I think I’ll go ahead and open a PR to fix getaddrinfo_valid and getaddrinfo_error, and disable the create_invalid_sockclient test in the Fedora package until an mmlib developer decides what should happen.

@musicinmybrain
Copy link
Author

I think I’ll go ahead and open a PR to fix getaddrinfo_valid and getaddrinfo_error, and disable the create_invalid_sockclient test in the Fedora package until an mmlib developer decides what should happen.

This works, in lieu of entirely disabling the test:

diff --git a/tests/socket-api-tests.c b/tests/socket-api-tests.c
index 5e133f6..cc7916b 100644
--- a/tests/socket-api-tests.c
+++ b/tests/socket-api-tests.c
@@ -1162,7 +1162,7 @@ START_TEST(create_invalid_sockclient)
        ck_assert(mm_create_sockclient("dummy://localhost") == -1);
        ck_assert_int_eq(mm_get_lasterror_number(), MM_ENOTFOUND);
 
-       ck_assert(mm_create_sockclient("ssh://localhost:10") == -1);
+       ck_assert(mm_create_sockclient("tcp://localhost:10") == -1);
        ck_assert_int_eq(mm_get_lasterror_number(), ECONNREFUSED);
 
        ck_assert(mm_create_sockclient("tcp://localhost") == -1);

…so I’ll do that downstream for now, but it does not test exactly the same thing, so I’m not adding it to #8 without maintainer feedback.

@musicinmybrain
Copy link
Author

musicinmybrain commented Oct 19, 2023

This works, in lieu of entirely disabling the test:

[…]

It suffices when building directly in a git checkout as described above, but not when building the Fedora package in a chroot:

▶ 5/5 - ../tests/socket-api-tests.c:socket:getaddrinfo_error: Assertion 'mm_getaddrinfo("notanhost.localdomain", "ssh", &hints, &res) == -1' failed FAIL

In Python session in a normal terminal in Fedora 38:

>>> from pprint import pprint
>>> from socket import getaddrinfo, AF_INET
>>> pprint(getaddrinfo('notanhost.localdomain', 'ssh', family=AF_INET))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.11/socket.py", line 962, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -2] Name or service not known

Same thing in a mock chroot:

>>> from pprint import pprint
>>> from socket import getaddrinfo, AF_INET
>>> pprint(getaddrinfo('notanhost.localdomain', 'ssh', family=AF_INET))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.11/socket.py", line 962, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -3] Temporary failure in name resolution

The error is different than the one the test expects when the chroot has networking disabled, but what happens if we enable netwroking?

>>> pprint(getaddrinfo('notanhost.localdomain', 'ssh', family=AF_INET))
[(<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_STREAM: 1>,
  6,
  '',
  ('AAA.BBB.CCC.DDD', 22)),
 (<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_DGRAM: 2>,
  17,
  '',
  ('AAA.BBB.CCC.DDD', 22)),
 (<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_STREAM: 1>,
  132,
  '',
  ('AAA.BBB.CCC.DDD', 22)),
 (<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_SEQPACKET: 5>,
  132,
  '',
  ('AAA.BBB.CCC.DDD', 22)),
 (<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_STREAM: 1>,
  6,
  '',
  ('AAA.BBB.CCC.EEE', 22)),
 (<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_DGRAM: 2>,
  17,
  '',
  ('AAA.BBB.CCC.EEE', 22)),
 (<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_STREAM: 1>,
  132,
  '',
  ('AAA.BBB.CCC.EEE', 22)),
 (<AddressFamily.AF_INET: 2>,
  <SocketKind.SOCK_SEQPACKET: 5>,
  132,
  '',
  ('AAA.BBB.CCC.EEE', 22))]

Oh, it succeeds. Here, instead of AAA.BBB.CCC.DDD and AAA.BBB.CCC.EEE, I see the addresses that come up in DNS if you take the FQDN of my build host and replace the initial hostname with something bogus. Does that make sense?

Anyway, I am not sure the assumptions baked into getaddrinfo_error tests are really portable across different environments, and I think I have not considered all the ways they could go wrong. For now, I am just going to omit the entire getaddrinfo_error test in the Fedora RPM build.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant