Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/issue765 SPDX is_compound_expression does not strictly check for compound expression #773

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Joerki
Copy link

@Joerki Joerki commented Feb 7, 2025

This pull request solves the issue that the function is_compound_expression is returning true also for other expressions that are not compound expressions.

To be consistent, functions are introduced that form a clear distinction between SPDX license identifiers (SPDX IDs, present in the official SPDX list), simple expressions (either SPDX IDs or custom LicenseRef IDs) and compound expressions.

The implementation now also accesses exclusively the SPDX data coming with the license-expression module that is frequently updated with new SPDX IDs and other licenses - SPDX custom licenses - coming from aboutcode-org.

… expression

Introduces strict checks for SPDX compound expressions, plus SPDX identifiers and SPDX simple expressions as well to be consistent.
Changed SPDX data dependencies to license-expression module to have a single, consistent source.
Changed license factory to use new/updated functions to replace try/except/pass statements.

fixes CycloneDX#765

Signed-off-by: Joerg Arndt <[email protected]>
@Joerki Joerki requested a review from a team as a code owner February 7, 2025 13:54
Copy link
Member

@jkowalleck jkowalleck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a mix of scopes here

  • adding documentation
  • fixing typos
  • modifying existing behavior
  • changing public API

Please split this PR into individual pullrequest, one per scope.
This allowed the autonomous PRs to move on their own pace, be reviewed by the dedicated audience, and simply make reviews easier.
Thank you in advance.

__SPDX_EXPRESSION_LICENSING: 'Licensing' = get_spdx_licensing()
__KNOWN_IDS = ([entry['spdx_license_key'] for entry in get_license_index()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do i see this right? Instead of using CycloneDX own well-known list of allowed values for a SPDX license ID,
the new code uses an arbitrary list maintained by an unauthoritative 3rd party?
If this is what this is about, then a clear "no". the CycloneDX explicitly states a defined list of well-known values here - which is maintained in https://github.com/CycloneDX/specification/blob/master/schema/spdx.{xsd,schema.json)}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are using the license-expression module for license validation in the is_compound _expression. And this module internally takes already the data I use for the SPDX IDs to do this verification for official and specified LicenseRef-scancode-*. So I did not introduce something new.

To use different data from different sources is inconsistent for me an leds to problems if one of the modules, cyclonedx-python-lib and license-expression are not most up to date and correct at the same time. Depending on usage on a single or compound the use of same licenses may result in different IDs. This is what we currently already have.

It could be fine to have a text (for the target application) to check license-expressions SPDX license list IDs data.
During my development (having the dedicated SPDX list as reference) a test was failing when my implementation used license-expression data. Errors may happen, but the latest license-expression maintainers fixed the issue already and converted a formerly existing LicenseRef-scancode- ID into the now official one.

Furthermore, the BSI explicitly recommends the usage of the AboutCode/ScanCode database in their SBOM TR (for the LicenseRef-scancode-* custom IDs that also is incorporated in the license-expression module. And the module also receives updates for the SPDX license list.

If your point to use both at the same time, this is your decision as a core maintainer.

But for me I say that I do not want to rely on two different lists burned manually into each module version.

Copy link
Member

@jkowalleck jkowalleck Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are using the license-expression module for license validation in the is_compound _expression. And this module internally takes already the data I use for the SPDX IDs to do this verification for official and specified LicenseRef-scancode-*. So I did not introduce something new.

This is a complete behavioral change.
The own shipped list of well-known SPDX ID's is a different one than the one any 3rd party library uses.

To use different data from different sources is inconsistent for me an leds to problems [...]

there were no different data. there was just one data: the official list of well-known SPDX license IDs from the official CycloneDX spec. please see below.

If your point to use both at the same time, this is your decision as a core maintainer.

it is not my decision. it is adherence to the spec.
this is the official spec:

Furthermore, the BSI explicitly recommends the usage of the AboutCode/ScanCode database in their SBOM TR (for the LicenseRef-scancode-* custom IDs that also is incorporated in the license-expression module. And the module also receives updates for the SPDX license list.

This is the BSI's very own narrow profile/view/flavor of SBOM. and it is a recommendation that nobody is forced to follow.
This library is about the international standard CycloneDX - which is much more than SBOM - and much more than a recommendation.

@jkowalleck jkowalleck marked this pull request as draft February 10, 2025 10:36
@jkowalleck
Copy link
Member

current scope-mix PR is not intended to be reviewed as is.
set to "draft" until clarified.

@Joerki
Copy link
Author

Joerki commented Feb 10, 2025

I'm sorry to say that at this time I cannot spend more of my (private) time on this PR.
I have neglected other responsibilities that need my focus now.

@jkowalleck
Copy link
Member

I'm sorry to say that at this time I cannot spend more of my (private) time on this PR. I have neglected other responsibilities that need my focus now.

feel free to come back any time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants