Use Python code instead of sed for obfuscation #65

Open
Glutexo opened this issue Jul 18, 2018 · 0 comments

Glutexo commented Jul 18, 2018

Regardless of which obfuscation approach is chosen (see #63), it might be worth considering Python instead of sed for password obfuscation. Even if moving away from the “one pattern to rule them all” approach takes longer, the sed call itself can already be replaced with a Python routine.

Python comes with a strong and well-documented re library. As long as locale and encoding are handled carefully, the Python code should be more robust and portable.
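To make the idea concrete, here is a minimal sketch of what such a routine could look like. The pattern and the `********` mask are hypothetical (the project's actual sed expressions are not shown in this issue), and the function names are made up for illustration. The point is only that the encoding is chosen explicitly instead of being inherited from the process locale.

```python
import re

# Hypothetical pattern: stands in for whatever the real sed script matches.
# re.ASCII keeps \s and \S independent of Unicode whitespace rules.
PASSWORD_RE = re.compile(r"(password\s*[=:]\s*)\S+", re.IGNORECASE | re.ASCII)


def obfuscate(text: str) -> str:
    """Replace the value following a password key with a fixed mask."""
    return PASSWORD_RE.sub(r"\g<1>********", text)


def obfuscate_bytes(data: bytes, encoding: str = "utf-8") -> bytes:
    """Obfuscate raw payload bytes using an explicitly chosen encoding."""
    # surrogateescape lets undecodable bytes survive the round trip unchanged,
    # so a payload with a stray invalid byte does not raise.
    text = data.decode(encoding, errors="surrogateescape")
    return obfuscate(text).encode(encoding, errors="surrogateescape")
```

Whether UTF-8 is the right default depends on the payloads, so that is an assumption too.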

sed as an external tool comes with some burdens. It's not just a regex replacement tool, but a text editor and in fact a whole programming language with its own rules, syntax, etc. Its regex dialect is POSIX basic/extended with GNU extensions rather than Perl-style, so not everything works as expected, and the documentation is scattered across man pages. That significantly reduces maintainability. Moreover, binding to a specific implementation (GNU sed) hurts portability.

One more issue with sed is that it is heavily locale dependent. Even simple things like making [A-Za-z0-9_] match only ASCII characters are virtually impossible without overriding the LC_COLLATE environment setting (or LC_ALL). Coercing to the C locale is likely to cause different problems once locale-specific, or at least Unicode-aware, matching is needed.
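For comparison, in Python 3 the choice between ASCII-only and Unicode-aware matching is made per pattern and does not depend on environment variables. A small illustration (the sample string is made up):

```python
import re

ASCII_WORD = re.compile(r"\w+", re.ASCII)   # \w means [a-zA-Z0-9_] only
UNICODE_WORD = re.compile(r"\w+")           # \w also matches letters like "č"
EXPLICIT = re.compile(r"[A-Za-z0-9_]+")     # ranges compare code points, not collation order

sample = "heslo_č1"

ascii_hits = ASCII_WORD.findall(sample)      # ['heslo_', '1']
unicode_hits = UNICODE_WORD.findall(sample)  # ['heslo_č1']
explicit_hits = EXPLICIT.findall(sample)     # ['heslo_', '1']
```

So both kinds of matching can live side by side in the same process, which is hard to get from sed under a single locale setting.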

There may be some questions about performance and memory use. If it's not feasible to always load a whole file into memory to run a Python regex replacement on it, it may be necessary to process the payloads in a stream-like way. It may also make sense to spawn a new thread or process for this, which is effectively what running sed does. Inspiration can be found in existing obfuscation solutions like soscleaner.
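A stream-like variant could be as simple as the following sketch. It assumes the patterns never span more than one line (which is also how sed works by default), and the pattern and function name are again hypothetical. It works on bytes, so no decoding step is involved and `\s`/`\S` are ASCII-only.

```python
import io
import re
from typing import BinaryIO

# Hypothetical bytes pattern; in bytes patterns \s and \S only consider ASCII.
PASSWORD_RE = re.compile(rb"(password\s*[=:]\s*)\S+", re.IGNORECASE)


def obfuscate_stream(src: BinaryIO, dst: BinaryIO) -> None:
    """Copy src to dst line by line, masking password values on the way."""
    # Iterating a binary file object yields one line at a time,
    # so memory use is bounded by the longest line, not the file size.
    for line in src:
        dst.write(PASSWORD_RE.sub(rb"\g<1>********", line))


# Usage with in-memory streams; real code would pass open(path, "rb") / open(path, "wb").
source = io.BytesIO(b"user=bob\npassword=secret\nPASSWORD: x y\n")
target = io.BytesIO()
obfuscate_stream(source, target)
```

If isolation from the main process is wanted, the same function could be run in a worker via the standard `multiprocessing` or `concurrent.futures` modules. Whether that is worth the overhead would need measuring.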

(Recently, maintaining the sed script file has been a hassle in #63.)
