Regardless of which obfuscation approach is chosen (see #63), it may be worth using Python instead of `sed` for password obfuscation. Even if moving away from the “one pattern to rule them all” approach takes longer, the `sed` call itself can already be replaced with a Python routine.
Python comes with a strong and well-documented `re` library. As long as locale and encoding are handled carefully, the Python code should be more robust and portable.
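As a rough illustration of what such a routine could look like, here is a minimal sketch. The pattern and the `obfuscate` function name are hypothetical; the actual rules in the sed script are not reproduced here and would need to be ported one by one.

```python
import re

# Hypothetical pattern for illustration only; the project's real sed
# rules are not shown in this issue and may differ considerably.
PASSWORD_RE = re.compile(r"(?i)\b(password\s*[=:]\s*)\S+")


def obfuscate(text: str) -> str:
    """Replace the value following 'password=' or 'password:' with asterisks."""
    # \g<1> re-inserts the key and separator; only the value is masked.
    return PASSWORD_RE.sub(r"\g<1>********", text)
```

Calling `obfuscate("db password=secret123 host=x")` would keep the key and mask only the value.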
sed as an external tool comes with some burdens. It is not just a regex replacement tool but a text editor and effectively a whole programming language with its own rules and syntax. Its regular expressions are POSIX basic or extended rather than Perl-compatible, so not everything works as expected, and the documentation is scattered across man pages. That significantly reduces maintainability. Moreover, binding to a specific implementation (GNU sed) hurts portability.
One more issue with sed is that it is heavily locale-dependent. Even something as simple as making `[A-Za-z0-9_]` match only ASCII characters is virtually impossible without overriding the `LC_COLLATE` environment setting. Forcing the `C` locale is likely to cause different problems once locale-specific, or at least Unicode-aware, matching is needed.
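For comparison, Python's `re` module lets the pattern itself decide between ASCII-only and Unicode-aware matching, independent of the process locale. A small sketch (the sample string and variable names are made up for illustration):

```python
import re

# With re.ASCII, \w is restricted to [A-Za-z0-9_].
ascii_word = re.compile(r"\w+", re.ASCII)

# Without the flag, \w on str patterns is Unicode-aware.
unicode_word = re.compile(r"\w+")

# An explicit character class on a str pattern compares code points,
# so it matches ASCII only, whatever the locale settings are.
explicit_class = re.compile(r"[A-Za-z0-9_]+")

sample = "user_ñame42"
ascii_matches = ascii_word.findall(sample)      # ['user_', 'ame42']
unicode_matches = unicode_word.findall(sample)  # ['user_ñame42']
explicit_matches = explicit_class.findall(sample)  # ['user_', 'ame42']
```

Both behaviours can coexist in one program, which is hard to achieve when a single locale setting governs the whole sed invocation.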
There may be questions about performance and memory use. If it is not feasible to always load a whole file into memory to run a Python regex replacement on it, the payloads may need to be processed as a stream. It may also make sense to spawn a new thread or process for this, which is effectively what running sed does. Existing obfuscation solutions such as soscleaner can serve as inspiration.
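One possible shape for the stream-like variant is to process the input line by line, so that only one line is held in memory at a time. This is a sketch under assumptions: the pattern and the `obfuscate_stream` name are hypothetical, and it only works for rules that never need to match across line boundaries.

```python
import io
import re
from typing import IO

# Hypothetical pattern for illustration; not the project's actual rules.
PASSWORD_RE = re.compile(r"(?i)\b(password\s*[=:]\s*)\S+")


def obfuscate_stream(src: IO[str], dst: IO[str]) -> None:
    """Copy src to dst line by line, masking password values on the way."""
    for line in src:
        dst.write(PASSWORD_RE.sub(r"\g<1>********", line))


# Usage with in-memory streams; real code would pass open file objects.
src = io.StringIO("user=bob\npassword=hunter2\n")
dst = io.StringIO()
obfuscate_stream(src, dst)
```

With real files, opening both with an explicit `encoding=` argument would keep the result independent of the locale's default encoding.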
(Recently, maintaining the sed script file has been a hassle in #63.)