Error: 'utf-8' codec can't decode byte 0xfc in position 62: invalid start byte #266

Closed
AleksCee opened this issue Jul 17, 2024 · 7 comments · Fixed by #286

@AleksCee

AleksCee commented Jul 17, 2024

Hi,

13 of 15 sites produce this error when running vuln-scan; only 2 sites on the same server run without it. Any ideas?

Sorry, I forgot to mention the version: v4.0.2, installed as a binary on Ubuntu 22.04 with LANG=de_DE.UTF-8. I also tried an English UTF-8 locale.

thanks, Alex

@akenion akenion self-assigned this Jul 17, 2024
@akenion akenion added bug Something isn't working subcommand:vuln-scan Related to the vuln-scan subcommand labels Jul 17, 2024
@akenion
Contributor

akenion commented Jul 17, 2024

@AleksCee Could you try running with --debug and capturing the stack trace where this error occurs? It's likely related to a file name or content on the sites where it's not working, but after reviewing I'm not seeing a clear place where such an error would occur, so the stack trace will help significantly.

@AleksCee
Author

@akenion here is the debug output:

WordPress Core Version: 6.6
Traceback (most recent call last):
  File "main.py", line 4, in <module>
  File "wordfence/cli/cli.py", line 193, in main
  File "wordfence/cli/cli.py", line 187, in invoke_cli
  File "wordfence/cli/cli.py", line 43, in process_exception
  File "wordfence/cli/cli.py", line 185, in invoke_cli
  File "wordfence/cli/cli.py", line 178, in invoke
  File "wordfence/cli/vulnscan/vulnscan.py", line 277, in invoke
  File "wordfence/cli/vulnscan/vulnscan.py", line 209, in _scan_sites
  File "wordfence/cli/vulnscan/vulnscan.py", line 120, in _scan
  File "wordfence/wordpress/site.py", line 420, in get_all_plugins
  File "wordfence/wordpress/site.py", line 390, in get_plugins
  File "wordfence/wordpress/site.py", line 369, in _generate_possible_plugins_paths
  File "wordfence/wordpress/site.py", line 360, in get_configured_plugins_directory
  File "wordfence/wordpress/site.py", line 314, in _extract_string_from_config
  File "wordfence/wordpress/site.py", line 305, in _get_parsed_config_state
  File "wordfence/wordpress/site.py", line 290, in _parse_config_file
  File "wordfence/php/parsing.py", line 1639, in parse_php_file
  File "wordfence/php/parsing.py", line 1623, in parse
  File "wordfence/php/parsing.py", line 1614, in parse_any
  File "wordfence/php/parsing.py", line 1567, in parse_output
  File "wordfence/php/parsing.py", line 936, in accept_base_token
  File "wordfence/php/lexing.py", line 544, in get_next_token
  File "wordfence/php/lexing.py", line 520, in extract_inline_html_or_open_tag
  File "wordfence/php/lexing.py", line 449, in step
  File "wordfence/php/lexing.py", line 438, in _read_chunk
  File "codecs.py", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 311: invalid continuation byte
[3822364] Failed to execute script 'main' due to unhandled exception!

If I read it correctly, it happens while parsing the config file…
OK, there are comments with broken umlauts; I will try to fix them. For example, this comment line:
Wenn du verschiedene Pr�fixe benutzt kannst du innerhalb einer Datenbank

@AleksCee
Author

OK, this did the trick, thanks for the quick help! The wp-config.php has broken encodings in the comments with German umlauts.
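
For anyone hitting the same error, a minimal sketch of how the offending lines could be located before fixing them by hand. The helper name `find_invalid_utf8_lines` and the sample bytes are illustrative assumptions modeled on the comment quoted above; none of this is part of Wordfence CLI.

```python
from typing import List, Tuple


def find_invalid_utf8_lines(data: bytes) -> List[Tuple[int, int, str]]:
    """Return (line_number, byte_offset_in_line, reason) for each line that is not valid UTF-8."""
    problems = []
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as error:
            # error.start is the offset of the first undecodable byte in this line.
            problems.append((line_number, error.start, error.reason))
    return problems


# Hypothetical sample: the umlaut in "Präfixe"-style text stored as a single
# Latin-1 byte (0xFC) instead of a two-byte UTF-8 sequence.
sample = b"<?php\n// Wenn du verschiedene Pr\xfcfixe benutzt\n$table_prefix = 'wp_';\n"

for line_number, offset, reason in find_invalid_utf8_lines(sample):
    print(f"line {line_number}, byte {offset}: {reason}")
```

To check a real file, pass `Path("wp-config.php").read_bytes()` (with `from pathlib import Path`) instead of the sample. Reading in binary mode is the point: opening the file as UTF-8 text would raise the same error being diagnosed.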

@akenion
Contributor

akenion commented Jul 17, 2024

Thanks @AleksCee. You're correct that this is due to those characters not being valid UTF-8, but PHP source isn't actually required to be UTF-8, so the parser in the CLI should be able to handle this gracefully without error. I'm going to re-open this issue and include a fix in the upcoming CLI release, as it should be fairly straightforward to correct.

Thanks for reporting this and your help troubleshooting; I'm glad to hear you were able to find a workaround for the time being.
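
As an illustration of what handling this gracefully could look like, here is a minimal sketch that decodes PHP source leniently using Python's `surrogateescape` error handler. This is one possible approach under stated assumptions, not necessarily what the fix in #286 does; the function names `decode_php_source` and `encode_php_source` are hypothetical.

```python
def decode_php_source(data: bytes) -> str:
    """Decode PHP source without raising on bytes that are not valid UTF-8."""
    # surrogateescape maps each undecodable byte to a lone surrogate code point
    # (for example 0xFC becomes U+DCFC), so decoding cannot fail and the
    # mapping can be reversed exactly.
    return data.decode("utf-8", errors="surrogateescape")


def encode_php_source(text: str) -> bytes:
    """Reverse decode_php_source, restoring the original bytes."""
    return text.encode("utf-8", errors="surrogateescape")


# Hypothetical config fragment with a Latin-1 umlaut inside a comment.
raw = b"<?php\n// Pr\xfcfixe\ndefine('PATH_CURRENT_SITE', '/');\n"
text = decode_php_source(raw)

# The ASCII parts a lexer cares about are untouched, and the round trip is lossless.
print("define(" in text)
print(encode_php_source(text) == raw)
```

The alternative of lexing raw bytes directly avoids decoding altogether; the sketch above only shows that a text-based lexer does not have to reject such files.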

@akenion akenion reopened this Jul 17, 2024
@akenion akenion added this to the v4.0.3 milestone Jul 17, 2024
@AleksCee
Author

@akenion thanks too.
But sorry, one more problem. After fixing the configs I get a new error on only one site (a multisite config), and I can't figure out what is happening:

Traceback (most recent call last):
  File "main.py", line 4, in <module>
  File "wordfence/cli/cli.py", line 193, in main
  File "wordfence/cli/cli.py", line 187, in invoke_cli
  File "wordfence/cli/cli.py", line 43, in process_exception
  File "wordfence/cli/cli.py", line 185, in invoke_cli
  File "wordfence/cli/cli.py", line 178, in invoke
  File "wordfence/cli/vulnscan/vulnscan.py", line 277, in invoke
  File "wordfence/cli/vulnscan/vulnscan.py", line 209, in _scan_sites
  File "wordfence/cli/vulnscan/vulnscan.py", line 120, in _scan
  File "wordfence/wordpress/site.py", line 420, in get_all_plugins
  File "wordfence/wordpress/site.py", line 390, in get_plugins
  File "wordfence/wordpress/site.py", line 369, in _generate_possible_plugins_paths
  File "wordfence/wordpress/site.py", line 360, in get_configured_plugins_directory
  File "wordfence/wordpress/site.py", line 314, in _extract_string_from_config
  File "wordfence/wordpress/site.py", line 305, in _get_parsed_config_state
  File "wordfence/wordpress/site.py", line 290, in _parse_config_file
  File "wordfence/php/parsing.py", line 1639, in parse_php_file
  File "wordfence/php/parsing.py", line 1623, in parse
  File "wordfence/php/parsing.py", line 1612, in parse_any
  File "wordfence/php/parsing.py", line 1599, in parse_statement
  File "wordfence/php/parsing.py", line 1519, in parse_conditional
  File "wordfence/php/parsing.py", line 1507, in parse_condition
  File "wordfence/php/parsing.py", line 1229, in parse_expression
  File "wordfence/php/parsing.py", line 1171, in parse_expression_component
  File "wordfence/php/parsing.py", line 1274, in parse_invocation
  File "wordfence/php/parsing.py", line 1260, in parse_argument_list
AttributeError: 'NoneType' object has no attribute 'is_character'
[3885201] Failed to execute script 'main' due to unhandled exception!

Do you have a tip on where I can find the issue? The config file is the same as all the others. The only differences are the hashes and the DB config, plus this additional block:

/* Multisite */
define('WP_ALLOW_MULTISITE', true);
define('MULTISITE', true);
define('SUBDOMAIN_INSTALL', true);
define('DOMAIN_CURRENT_SITE', '***delete.net');
define('PATH_CURRENT_SITE', '/');
define('SITE_ID_CURRENT_SITE', 1);
define('BLOG_ID_CURRENT_SITE', 1);

@AleksCee
Author

AleksCee commented Jul 18, 2024

OK, after some trial and error I found that the parser runs into trouble with this line:

define('PATH_CURRENT_SITE', '/');

When I temporarily remove this line for testing, it works. But this line is needed for multisite.

OK, that seems to have been a coincidence; it did not work after all. :-(

@davidnuzik

v4.0.3rc4 8/1/24

SUMMARY:
QA validation PASSED. I was successfully able to reproduce the vuln-scan php parsing issue and validate the fix.

REPRODUCTION STEPS:

  1. Write a Python script that appends to a PHP file that is read and parsed during a vuln-scan (wp-config.php, for example). The script should write invalid UTF-8 in binary mode. NOTE: This will obviously break the wp-config.php file and cause WordPress to stop functioning; it was done only in a test environment to reproduce the UnicodeDecodeError and validate the fix.
  2. Using the current official release, v4.0.2, attempt to vuln-scan this WordPress install with the altered wp-config.php on disk. I successfully reproduced the issue. Note that the byte shown in my error is slightly different, but this is not relevant.
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 3649: invalid start byte
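
A script along the lines of step 1 might look like the sketch below. It is not the script used during QA; the helper name `append_invalid_utf8`, the payload, and the temporary file are assumptions made for illustration. Only run something like this against a disposable test install.

```python
import tempfile
from pathlib import Path

# 0x80 is a UTF-8 continuation byte and can never start a character.
INVALID_UTF8 = b"\n\x80\n"


def append_invalid_utf8(path: Path, payload: bytes = INVALID_UTF8) -> None:
    """Append raw bytes that are not valid UTF-8 to the given file."""
    # "ab" opens for appending in binary mode, so no encoding is applied.
    with path.open("ab") as handle:
        handle.write(payload)


# Demonstration on a throwaway file rather than a real wp-config.php.
with tempfile.TemporaryDirectory() as directory:
    target = Path(directory) / "wp-config.php"
    target.write_bytes(b"<?php\ndefine('DB_NAME', 'test');\n")
    append_invalid_utf8(target)
    try:
        target.read_text(encoding="utf-8")
    except UnicodeDecodeError as error:
        print(error)
```

Reading the altered file back as UTF-8 raises the same kind of `UnicodeDecodeError` as the one shown in step 2, which confirms the file is a suitable reproduction input.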

VALIDATION STEPS:
Execute step 2 above again, but this time using v4.0.3rc4. The issue no longer occurs and the output from the vuln-scan is nominal.

NOTES:
Internal test automation was updated to include tests for this issue going forward.
I also executed additional tests (including subcommands other than vuln-scan) to ensure the Wordfence CLI still behaves normally in all areas. No significant or related issues were observed.

Other PRs also validated - for example CLI output for vuln-scans was including a 'b' character when outputting versions - this is fixed as well. All internal test automation passes across the board.
