Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DFS server]: Infinite recursion due to lstat for a non-existent file #228

Closed
karan-lb opened this issue Jul 11, 2023 · 3 comments · Fixed by #239
Closed

[DFS server]: Infinite recursion due to lstat for a non-existent file #228

karan-lb opened this issue Jul 11, 2023 · 3 comments · Fixed by #239

Comments

@karan-lb
Copy link

karan-lb commented Jul 11, 2023

Hi, we have a 3 node DFS SMB server that we are trying to traverse. The DFS server exposes 4 namespaces that are fully replicated over all nodes. This means that every node holds all the files that are present in the namespace.
When we try to lstat a file which exists, the code is working well and we don't see any issue. The DFS referrals are not involved there as all files do exist on each node.

Issue:
When we try to lstat a file path that does not exist on the DFS server the smbclient gets into an infinite recursion. The trace for which looks as follows:
Notice the frame repeats till we interrupt the process.
File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 335, in commit self.commit() [Previous line repeated 38 more times]

  File "download_file.py", line 35, in stat_dfs_file
    smbclient.lstat(path)
  File "/usr/local/lib/python3.7/site-packages/smbclient/_os.py", line 259, in lstat
    return stat(path, follow_symlinks=False, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/smbclient/_os.py", line 592, in stat
    query_info(transaction, FileAttributeTagInformation)
  File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 258, in __exit__
    self.commit()
  File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 335, in commit
    self.commit()
  File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 335, in commit
    self.commit()
  File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 335, in commit
    self.commit()
  [Previous line repeated 38 more times]
  File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 294, in commit
    res = func(requests[idx])
  File "/usr/local/lib/python3.7/site-packages/smbprotocol/open.py", line 1166, in _create_response
    response = self.connection.receive(request)
  File "/usr/local/lib/python3.7/site-packages/smbprotocol/connection.py", line 1008, in receive
    if not request.response_event.wait(timeout=iter_timeout):
  File "/usr/local/lib/python3.7/threading.py", line 552, in wait
    signaled = self._cond.wait(timeout)
  File "/usr/local/lib/python3.7/threading.py", line 296, in wait
    waiter.acquire()
KeyboardInterrupt

We notice that the referrals being returned by the server keep rotating between a couple of nodes and its gets into this infinite loop. Server A refers to Server B and then it refers back and so forth. Since the namespace does indeed exist on all three nodes that does make sense, but for deleted files there is no condition to make the recursion stop. (Ideally it should stop when we get to a node that we have already been referred before?)

Possible solution:
We did notice that the ClientConfig does contain an option skip_dfs to avoid processing any referrals, but that option doesn't work for our case. We still see infinite recursion happening.

If the skip_dfs option worked, it would solve the crash for our case as any single node of the DFS server is fully capable to serve all files in the server.

So, we notice in the method, we can return early if the tree connect is not a DFS share. So we plan to toggle the is_dfs_share capability returned by the server to False when ClientConfig has skip_dfs set to True by the client.

def _resolve_dfs(raw_io):
    if not raw_io.fd.tree_connect.is_dfs_share:
        return

We do the above by modifying the following code in _pool.py

def get_smb_tree(
    path, username=None, password=None, port=445, encrypt=None, connection_timeout=60, connection_cache=None
):
    client_config = ClientConfig()
...

    if not tree:
        tree = TreeConnect(session, share_path)
        use_dfs = not client_config.skip_dfs
        try:
            tree.connect(require_secure_negotiate=client_config.require_secure_negotiate, use_dfs=use_dfs)
        except BadNetworkName as err:
            # If the server doesn't mention it supports DFS then don't try to
            # resolve the DFS path.
            if not session.connection.server_capabilities.has_flag(Capabilities.SMB2_GLOBAL_CAP_DFS) or not use_dfs:
                raise

and in tree.py :

class TreeConnect(object):
...
    def connect(self, require_secure_negotiate=True, use_dfs=True):
...
        capabilities = tree_response["capabilities"]
        self.is_dfs_share = capabilities.has_flag(ShareCapabilities.SMB2_SHARE_CAP_DFS) and use_dfs

This does help us in honoring the skip_dfs field, but we wanted to know

  1. Whether this problem is known?
  2. What could a potential solution look like?
  3. Would it help to send a small patch for honoring skip_dfs field?
@karan-lb karan-lb changed the title [DFS server]: Infinite recursion due to lstat for a deleted file [DFS server]: Infinite recursion due to lstat for a non-existent file Jul 11, 2023
@jborean93
Copy link
Owner

Sorry I'm not ignoring this, just haven't had time to look into it. Unfortunately DFS is a complex topic so might take some time to try and replicate and figure out where the problem is.

@karan-lb
Copy link
Author

@jborean93 please let me know if you need any details about the setup. It was easy to repro this in-house using a 2 node setup with a fully replicated DFS namespace across both nodes.

jborean93 added a commit that referenced this issue Sep 5, 2023
The code that handles a missing path does not currently record existing
DFS target servers that have been tried resulting in a recursion error.
By ensuring a DFS target is only retried if it hasn't already it should
avoid this recursion problem.

Fixes: #228
@jborean93
Copy link
Owner

Sorry for the delay getting to this issue, the PR #239 fixes the recursion problem by ensuring it won't retry a target that has already been tried. I was able to replicate the problem in my environment and I can confirm the changes in that PR ensure the correct exception is raised.

jborean93 added a commit that referenced this issue Sep 5, 2023
The code that handles a missing path does not currently record existing
DFS target servers that have been tried resulting in a recursion error.
By ensuring a DFS target is only retried if it hasn't already it should
avoid this recursion problem.

Fixes: #228
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants