Fix XML comparison, compare time within 1e-6 seconds #63

CervEdin · 2024-09-25T18:37:28Z

There appears to be a difference in how 'time' is generated in the XML output, possibly different in different systems/configurations.

#62

Possible culprit is how the 'time' attribute is for 'testsuite' elements. https://github.com/kyrus/python-junit-xml/blob/4bd08a272f059998cedf9b7779f944d49eba13a6/junit_xml/__init__.py#L135

Test case attributes appear to be formatted using '%f' % time, which is 6 decimal places by default.

For the testsuite, it looks like it's just a str call and maybe this is causing different systems to generate time of different precision.

Due to this small differences in how time is encoded in the XML, the test suite may fail due to differences in the amount of precision is encoded on the system.

Add a new function to the test suite that compares the XML document recursively and assert that tags, text and attributes match.

For 'time' attributes, we compare using math.isclose() with a tolerance of 1e-6.

There appears to be a difference in how 'time' is generated in the XML output, possibly different in different systems/configurations. Possible culprit is how the 'time' attribute is for 'testsuite' elements. https://github.com/kyrus/python-junit-xml/blob/4bd08a272f059998cedf9b7779f944d49eba13a6/junit_xml/__init__.py#L135 Test case attributes appear to be formatted using `'%f' % time`, which is 6 decimal places by default. For the testsuite, it looks like it's just a `str` call and maybe this is causing different systems to generate time of different precision. Due to this small differences in how time is encoded in the XML, the test suite may fail due to differences in the amount of precision is encoded on the system. Add a new function to the test suite that compares the XML document recursively and assert that tags, text and attributes match. For 'time' attributes, we compare using `math.isclose()` with a tolerance of 1e-6.

nicola-lunghi · 2024-10-16T01:44:01Z

Hi @CervEdin can you fix the ruff issue with the xml module?

Run ruff check --output-format=github .
Error: test/test_outputs.py:63:24: S314 Using `xml` to parse untrusted data is known to be vulnerable to XML attacks; use `defusedxml` equivalents
Error: test/test_outputs.py:64:22: S314 Using `xml` to parse untrusted data is known to be vulnerable to XML attacks; use `defusedxml` equivalents
Error: Process completed with exit code 1.

from defusedxml documentation:

Instead of:

>>> from xml.etree.ElementTree import parse
>>> et = parse(xmlfile)

alter code to:

>>> from defusedxml.ElementTree import parse
>>> et = parse(xmlfile)

nicola-lunghi · 2024-10-16T01:49:53Z

also I don't think this works:

the problem is the testsuites line:

tap2junit/test/output/test.xml

Lines 2 to 3 in 0601e2d

    
           <testsuites disabled="0" errors="1" failures="0" tests="2370" time="11.862548"> 
        
           	<testsuite disabled="0" errors="1" failures="0" name="test/fixtures/test" skipped="45" tests="2370" time="11.862548" hostname="{HOSTNAME}">

 <?xml version="1.0" encoding="utf-8"?>
-<testsuites disabled="0" errors="1" failures="0" tests="2367" time="1.1381459999999999">
-       <testsuite disabled="0" errors="1" failures="0" name="test/fixtures/test3" skipped="25" tests="2367" time="1.1381459999999999" hostname="{HOSTNAME}">
+<testsuites disabled="0" errors="1" failures="0" tests="2367" time="1.138146">
+       <testsuite disabled="0" errors="1" failures="0" name="test3" skipped="25" tests="2367" time="1.138146" hostname="{HOSTNAME}">
                <testcase name="test-async-await" time="0.000601" classname="async-hooks"/>

cclauss · 2024-10-16T13:03:57Z

% ruff rule S314

suspicious-xml-element-tree-usage (S314)

Derived from the flake8-bandit linter.

What it does

Checks for uses of insecure XML parsers.

Why is this bad?

Many XML parsers are vulnerable to XML attacks (such as entity expansion),
which cause excessive memory and CPU usage by exploiting recursion. An
attacker could use such methods to access unauthorized resources.

Consider using the defusedxml package when parsing untrusted XML data,
to protect against XML attacks.

Example

from xml.etree.ElementTree import parse

tree = parse("untrusted.xml")  # Vulnerable to XML attacks.

Use instead:

from defusedxml.ElementTree import parse

tree = parse("untrusted.xml")

References

CervEdin · 2024-10-16T13:21:59Z

Hi @CervEdin can you fix the ruff issue with the xml module?

@nicola-lunghi I guess there are two options here

Use another dependency which is not the xml parser in the stdlib
Add comments to ignore this particular check in ruff

I personally don't see the security issue here since the XML is trusted but option 1 is not bad per se

also I don't think this works:
* the problem is the testsuites line:

Sorry but I didn't follow what you meant here. Is something failing? Could be that epsilon needs tweaking.
The choice I made for this was pretty arbitrary

CervEdin · 2024-10-16T13:26:36Z

% ruff rule S314

suspicious-xml-element-tree-usage (S314)

Derived from the flake8-bandit linter.

What it does

Checks for uses of insecure XML parsers.

Why is this bad?

Many XML parsers are vulnerable to XML attacks (such as entity expansion), which cause excessive memory and CPU usage by exploiting recursion. An attacker could use such methods to access unauthorized resources.

Consider using the defusedxml package when parsing untrusted XML data, to protect against XML attacks.

Example
from xml.etree.ElementTree import parse

tree = parse("untrusted.xml")  # Vulnerable to XML attacks.
Use instead:
from defusedxml.ElementTree import parse

tree = parse("untrusted.xml")
References
* [Python documentation: `xml` — XML processing modules](https://docs.python.org/3/library/xml.html)

* [PyPI: `defusedxml`](https://pypi.org/project/defusedxml/)

* [Common Weakness Enumeration: CWE-400](https://cwe.mitre.org/data/definitions/400.html)

* [Common Weakness Enumeration: CWE-776](https://cwe.mitre.org/data/definitions/776.html)

@cclauss thanks for the references

I wonder if it applies in this case. The XML being parsed is 1) the xml files checked into the repo 2) the xml generated from tap files checked in to the repo and processed by tap2junit.
Spontaneously, neither seem to necessarily qualify as untrusted.

cclauss requested a review from MoLow September 25, 2024 19:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix XML comparison, compare time within 1e-6 seconds #63

Fix XML comparison, compare time within 1e-6 seconds #63

CervEdin commented Sep 25, 2024

nicola-lunghi commented Oct 16, 2024 •

edited

Loading

nicola-lunghi commented Oct 16, 2024

cclauss commented Oct 16, 2024

CervEdin commented Oct 16, 2024

CervEdin commented Oct 16, 2024

suspicious-xml-element-tree-usage (S314)

What it does

Why is this bad?

Example

References

Fix XML comparison, compare time within 1e-6 seconds #63

Are you sure you want to change the base?

Fix XML comparison, compare time within 1e-6 seconds #63

Conversation

CervEdin commented Sep 25, 2024

nicola-lunghi commented Oct 16, 2024 • edited Loading

nicola-lunghi commented Oct 16, 2024

cclauss commented Oct 16, 2024

suspicious-xml-element-tree-usage (S314)

What it does

Why is this bad?

Example

References

CervEdin commented Oct 16, 2024

CervEdin commented Oct 16, 2024

suspicious-xml-element-tree-usage (S314)

What it does

Why is this bad?

Example

References

nicola-lunghi commented Oct 16, 2024 •

edited

Loading