Scanning fails with error on directory structures with recursive symlinks #25

Closed
mscotthouston opened this issue Aug 25, 2023 · 13 comments · Fixed by #36 or #49
Labels: bug, dev-complete, qa-passed, subcommand:scan

@mscotthouston

Hi all,

Just started testing the wordfence-cli on a server.

Server OS: debian 11.7
Wordfence version: 1.0.1

When I run wordfence scan /var/www, I see:

Scanning path: /var/www
Error: Directory search failed

If I enable debugging with wordfence scan -d /var/www, I see several thousand lines printed to the screen as wordfence-cli scans through the subdirectories. Eventually it exits with this message:

Traceback (most recent call last):
  File "main.py", line 4, in <module>
  File "wordfence/cli/cli.py", line 17, in main
  File "wordfence/cli/scan/scan.py", line 295, in main
  File "wordfence/cli/scan/scan.py", line 288, in main
  File "wordfence/cli/scan/scan.py", line 229, in execute
  File "wordfence/scanning/scanner.py", line 654, in scan
  File "wordfence/scanning/scanner.py", line 584, in await_results
wordfence.scanning.exceptions.ScanningException: Directory search failed
[3876313] Failed to execute script 'main' due to unhandled exception!

Every time it fails, it fails while processing a subdirectory within /var/www, let's call it /var/www/websites/. If I run wordfence scan -d /var/www/websites, I see the exact same traceback as above.

However, if we create a for loop to run wordfence-cli in each subdirectory under /var/www/websites/, such as with for d in /var/www/webpages/*; do wordfence scan $d; done, every single scan completes without error.

Any thoughts? Anything else I can provide?

@akenion
Contributor

akenion commented Aug 25, 2023

That exception is raised from another exception that should be an OSError and appear earlier in the output. Can you share that part of the message?
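For readers unfamiliar with exception chaining, here is a minimal sketch of what "raised from another exception" means in Python. The `ScanningException` class below is a hypothetical stand-in that only borrows the name from the traceback, and the simulated `OSError` is an assumption for illustration; this is not the Wordfence CLI source.

```python
class ScanningException(Exception):
    """Hypothetical stand-in for the exception named in the traceback."""


def search(path: str) -> None:
    try:
        # Simulate a low-level filesystem failure (assumed for illustration).
        raise OSError(f"simulated filesystem failure at {path}")
    except OSError as error:
        # "raise ... from ..." stores the original error in __cause__.
        raise ScanningException("Directory search failed") from error


def describe(path: str) -> str:
    try:
        search(path)
    except ScanningException as error:
        cause = error.__cause__
        return f"{error} (caused by {type(cause).__name__}: {cause})"
    return "no error"


print(describe("/example/path"))
```

When such an exception goes unhandled, Python normally prints the underlying error first, followed by "The above exception was the direct cause of the following exception", which is why the OSError would be expected earlier in the output.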

@mscotthouston
Author

Thanks for replying.

I'm not seeing any other error in the output before the exception I posted. Sent all debug output to a file, and grepped the file for OSError, OSE, Error, and I'm seeing nothing.

@akenion
Contributor

akenion commented Aug 25, 2023

The error would be written to stderr rather than stdout. Just to confirm, did you use something like 2>&1 to capture the stderr output as well?
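As a rough illustration of the stdout/stderr distinction, the sketch below runs a small child process that writes to both streams and captures its output with and without merging stderr into stdout (the equivalent of `2>&1`). The child command is a hypothetical stand-in, not the Wordfence CLI.

```python
import subprocess
import sys

# Hypothetical child program that writes one line to each stream.
CHILD = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"


def capture(merge_stderr: bool) -> str:
    """Return the captured stdout, optionally with stderr merged into it."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        # STDOUT merges stderr into the captured stream, like 2>&1;
        # DEVNULL discards it, so it never reaches the capture.
        stderr=subprocess.STDOUT if merge_stderr else subprocess.DEVNULL,
        text=True,
        check=True,
    )
    return result.stdout


print("without merge:", capture(False).split())
print("with merge:", capture(True).split())
```

Only the merged capture contains the stderr line, so searching a stdout-only log for an error message would find nothing.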

It initially looks like a permission issue, but that should impact your find usage as well. What are the permissions on /var/www/webpages? I noticed in your find command, you're using /var/www/webpages whereas when running Wordfence CLI you're using /var/www/websites. Are you using the same path in both tests?

@mscotthouston
Author

My apologies--the /var/www/webpages and /var/www/websites are standing in for a specific directory name which I redacted. To confirm, I used the same path in every test. This path contains about 100 subdirectories, each of which is a personal homepage for a unique user.

I did use 2>&1 to redirect all output to a text file.

I agree that it felt like a permissions issue, but the fact that I can iterate through every single subdirectory without issue with a for loop seems to suggest otherwise. Each subdirectory within /var/www/websites has the same owner and group--there are no exceptions.

/var/www
|-- websites
|   |-- user1_homepage
|   |-- user2_homepage
|   |-- user3_homepage
...
|   |-- user99_homepage
|   `-- user100_homepage

@akenion
Contributor

akenion commented Aug 28, 2023

What are the modes on each directory tier? /var, /var/www, /var/www/websites, /var/www/websites/userX_homepage

I suspect there's a directory in that hierarchy that has read without execute. By creating a similar structure and setting the permissions for /var/www/websites to 444, I am able to recreate the behavior you've encountered where the scanner is unable to find files, but shell globbing (as in your for d in /var/www/websites/* example) does still work.

Directories should have the execute permission. Adding that should fix the scanning issue, assuming it's missing.
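One way to answer the question about modes on each tier is a short script that walks from the filesystem root down to the target and prints each directory's mode. This is a sketch only; the function name `directory_modes` is made up for illustration, and the check is coarse (it only reports whether any execute bit is set, not whether the current user can traverse the directory).

```python
import os
import stat
from pathlib import Path


def directory_modes(path: str):
    """Yield (directory, octal mode, any_execute_bit) from the root down to path."""
    target = Path(path).resolve()
    for directory in [*reversed(target.parents), target]:
        mode = stat.S_IMODE(os.stat(directory).st_mode)
        # 0o111 covers the execute (search) bit for user, group, and other.
        yield str(directory), oct(mode), bool(mode & 0o111)


# Demo on the current working directory; substitute the path being scanned.
for directory, mode, any_execute in directory_modes(os.getcwd()):
    note = "" if any_execute else "  <- no execute bit"
    print(f"{mode:>7}  {directory}{note}")
```

A directory with mode 444 would be flagged here, since it can be listed but not traversed by a non-root user.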

@mscotthouston
Author

Thanks for that idea.

So normally everything within /var/www and /var/www/websites is 750, with a few 770 and 755.

I tried setting everything to 777 just to test, and I got the exact same errors as reported above happening in the exact same place.

I also want to reconfirm that every one of the homepage directories (/var/www/websites/userX_homepage) is owned by the same user and group. So it is odd that scanning should fail for some and not for others.

@akenion
Contributor

akenion commented Aug 29, 2023

It sounds like the permissions should be OK, then. There aren't any additional permission controls like ACLs in place, are there? Is the user who is running Wordfence CLI the same as the owner of the directories?

Can you confirm that find /var/www/websites -type f yields the expected results without permission issues?

@mscotthouston
Author

No ACLs in place. I've been running the testing as the root user. find /var/www/websites -type f does return everything without complaint.

We do use the apache module mpm_itk (http://mpm-itk.sesse.net/), and so /var/www/websites has a uid and gid that is not just unique to it and its subdirectories, but is also not in use anywhere else on the system. Its home is /var/www/websites, and when I attempted to run the wordfence scan as this user, I first was prompted to initialize the ~/.config and ~/.cache directories. The debug output is identical to my previous tests, the only difference being that this test was only scanning /var/www/websites, and not /var/www/, as this user has no permissions for /var/www/. I then also ran find /var/www/websites -type f as this user, and as with the root user it returned everything without complaint.

Not being familiar with the wordfence code, it almost feels as if a buffer were getting filled up. Does that seem possible?

@akenion akenion self-assigned this Aug 29, 2023
@akenion
Contributor

akenion commented Aug 29, 2023

If this issue occurs when running as root, it's almost certainly not a permission issue. The queue used for sending located files to the actual scan workers does have a fixed size (currently 1,000), but that should just block until the workers process the queue; it shouldn't trigger an OSError (which is the underlying cause of the error you're seeing).
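To illustrate why a full fixed-size queue blocks rather than fails, here is a minimal sketch using the standard library's `queue.Queue` with a worker thread. It is an assumption-laden toy: the names `QUEUE_SIZE` and `run` are invented here, the bound is deliberately tiny, and the actual tool may use processes and a different queue implementation.

```python
import queue
import threading

QUEUE_SIZE = 3  # Illustrative bound; far smaller than the 1,000 mentioned above.


def run(paths):
    """Push paths through a bounded queue to one worker and return what it processed."""
    pending = queue.Queue(maxsize=QUEUE_SIZE)
    processed = []

    def worker():
        while True:
            item = pending.get()
            if item is None:  # Sentinel: no more work.
                break
            processed.append(item)

    thread = threading.Thread(target=worker)
    thread.start()
    for path in paths:
        # put() waits while the queue is full; it does not raise an error.
        pending.put(path)
    pending.put(None)
    thread.join()
    return processed


print(len(run([f"file{i}" for i in range(50)])))
```

Fifty items pass through a queue of size three without error, because the producer simply waits whenever the worker falls behind.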

Do you have the ability to run Python on this system? I've put together a test script with a simplified version of the file locator so hopefully we can actually capture the underlying error.

import os
import sys


def search_directory(path: str):
    contents = os.scandir(path)
    for item in contents:
        if item.is_dir():
            yield from search_directory(item.path)
        elif item.is_file():
            yield item.path


def locate(path: str):
    real_path = os.path.realpath(path)
    if os.path.isdir(real_path):
        for path in search_directory(real_path):
            print(path)
    else:
        print(real_path)


path = sys.argv[1]
print(f'Base path: {path}')
locate(path=path)

You can save this script to a file and then run the following:
python3 /path/to/script.py /var/www/websites

That should yield an OSError that will help us identify the underlying cause.

@akenion akenion added the subcommand:scan Related to the scan subcommand label Aug 29, 2023
@akenion
Contributor

akenion commented Aug 29, 2023

To prevent the need for debugging such issues outside of the Wordfence CLI tool in the future, I've added #31 to the next milestone.

@mscotthouston
Author

Perfect! That produced the full text of the OSError, and the problem was then very easy to identify. A directory contained a symbolic link to itself:

/var/www/websites/userN_homepage/subdirectory# la
total 4
lrwxrwxrwx 1 websites websites 1 Aug 30 10:24 Bad_Link -> .

This is the full error output:

Traceback (most recent call last):
  File "/root/wordfence_test.py", line 25, in <module>
    locate(path=path)
  File "/root/wordfence_test.py", line 17, in locate
    for path in search_directory(real_path):
  File "/root/wordfence_test.py", line 9, in search_directory
    yield from search_directory(item.path)
  File "/root/wordfence_test.py", line 9, in search_directory
    yield from search_directory(item.path)
  File "/root/wordfence_test.py", line 9, in search_directory
    yield from search_directory(item.path)
  [Previous line repeated 40 more times]
  File "/root/wordfence_test.py", line 8, in search_directory
    if item.is_dir():
OSError: [Errno 40] Too many levels of symbolic links: '/var/www/websites/userN_homepage/subdirectory/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/'

Removing the Bad_Link now allows the scan to complete without issue.

Thank you for that! Your patience and support are very much appreciated :)

@akenion akenion removed the triaged label Aug 30, 2023
@akenion
Contributor

akenion commented Aug 30, 2023

Thanks for working with me to diagnose that! We need to handle this better within Wordfence CLI. I'm going to leave this issue open for now and update it to reflect that we need additional handling for recursive symlinks.
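As a sketch of one possible approach to handling recursive symlinks (not the change that was eventually merged), the simplified locator from earlier in the thread can track the real paths of the directories it is currently inside and refuse to descend into any directory that resolves to one of them. The `ancestors` parameter and the warning text are assumptions made for illustration.

```python
import os
import sys


def search_directory(path: str, ancestors=frozenset()):
    """Yield file paths under path, skipping directories that loop back to an ancestor."""
    real = os.path.realpath(path)
    if real in ancestors:
        # This path resolves to a directory we are already inside: a loop.
        print(f"Skipping recursive symlink at {path}", file=sys.stderr)
        return
    ancestors = ancestors | {real}
    with os.scandir(path) as contents:
        for item in contents:
            if item.is_dir():
                yield from search_directory(item.path, ancestors)
            elif item.is_file():
                yield item.path
```

Calling `list(search_directory("/var/www/websites"))` on a tree containing a link such as `Bad_Link -> .` would then stop at the first level of the loop instead of recursing until the OS reports too many levels of symbolic links. Symlinks that point elsewhere are still followed, so the same directory can be visited twice if two links point to it.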

@akenion akenion added the bug Something isn't working label Aug 30, 2023
@akenion akenion changed the title "Error: Directory search failed" on /var/www Scanning fails with error on directory structures with recursive symlinks Aug 30, 2023
@akenion akenion added this to the exploring-elk milestone Aug 30, 2023
@akenion akenion linked a pull request Aug 31, 2023 that will close this issue
akenion added a commit that referenced this issue Aug 31, 2023
Added detection for recursive symlinks
@akenion akenion added the dev-complete Development work to resolve this issue is complete label Aug 31, 2023
@ewodrich ewodrich self-assigned this Sep 15, 2023
@ewodrich

Created 5 symlinks to test a variety of scenarios, and verified error no longer occurs, and instead a notification appears in the output as "Recursive symlink detected at /path/being/scanned" for each occurrence:

  • Parent directory just outside of root of scan path
  • Different path than scan path
  • Recursive - same as root of scan path
  • Recursive - from subdirectory to parent directory of scan path
  • Recursive - from subdirectory to self of scan path

Symlinks that are not recursive do not present a warning and scan without error.

Additional testing included a variety of file paths with both stdout and progress output, adding in options that include number of workers, file types to include, large and small scale malware findings, debug, verbose, routing stderr to file, defining output-path, as root, as user, with and without allow-io-errors option.
