Add script to report rules #2685

Merged
merged 2 commits into from
Sep 6, 2021


AyanSinhaMahapatra
Member

@AyanSinhaMahapatra AyanSinhaMahapatra commented Sep 1, 2021

Partially addresses #2566

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in uniquely-named feature branch and has no merge conflicts 📁

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
@AyanSinhaMahapatra
Member Author

@mjherzog this is ready for you to look at.

Here's the help text for this script:

Usage: report_license_rules.py [OPTIONS]

  Write Licenses/Rules from scancode into a CSV file with all details. Output
  can be optionally filtered by category/license-key.

Options:
  -l, --licenses FILE       Write all Licenses data to the csv FILE.
  -r, --rules FILE          Write all Rules data to the csv FILE.
  -c, --category STRING     An optional filter to only output licenses/rules
                            of this category. Example STRING: `permissive`.
  -k, --license-key STRING  An optional filter to only output licenses/rules
                            which has this license key.Example STRING: `mit`.
  -t, --with-text           Also include the license/rules texts.Note that
                            this increases the file size significantly.
  -h, --help                Show this message and exit.

To try it now, check out this branch in a configured virtualenv and run the script.
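The optional `-c` and `-k` filters in the help text above could work roughly like the sketch below. This is a minimal illustration, not the script's actual code: the `filter_rows` helper and the `category` and `license_keys` field names are hypothetical, and rows are assumed to be plain dicts.

```python
def filter_rows(rows, category=None, license_key=None):
    """Keep only the rows that match the optional filters.

    Assumption: each row is a dict with a "category" string and a
    "license_keys" list. These names are hypothetical, not scancode's API.
    """
    selected = []
    for row in rows:
        # Skip rows of another category when a category filter is given.
        if category and row.get("category") != category:
            continue
        # Skip rows that do not mention the requested license key.
        if license_key and license_key not in row.get("license_keys", ()):
            continue
        selected.append(row)
    return selected
```

With no filter given, every row is kept, which matches the "optional" wording in the help text.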

Attaching sample output files:

licence_data.csv
rule_data.csv

@mjherzog
Member

mjherzog commented Sep 3, 2021

@AyanSinhaMahapatra Thank you for running the sample with:
python ./etc/scripts/licenses/report_license_rules.py -l comm_licenses_data.csv -r comm_rule_data.csv -t -c "Commercial"
It looks good overall, but there is strange data in the identifier column. It looks like some kind of "bleed" from the license text. The problem is easy to see if you sort by identifier: there is a very large chunk at the beginning, and other examples are interspersed among the "true" identifiers.
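A quick way to find such rows without sorting by hand is to scan the identifier column for values that do not look like names. The sketch below assumes the report has an `identifier` column and that a "true" identifier is a short name without whitespace; both the pattern and the `find_suspicious_identifiers` helper are assumptions for illustration.

```python
import csv
import io
import re

# Assumption: a "true" identifier is a short file-like name with no whitespace.
IDENTIFIER_PATTERN = re.compile(r"[A-Za-z0-9_.+\-]{1,120}")


def find_suspicious_identifiers(csv_file, column="identifier"):
    """Return (record_number, value) pairs whose identifier is not name-like."""
    suspicious = []
    reader = csv.DictReader(csv_file)
    # Record 1 is the header, so data records are counted from 2.
    for record_number, row in enumerate(reader, start=2):
        value = row.get(column) or ""
        if not IDENTIFIER_PATTERN.fullmatch(value):
            suspicious.append((record_number, value))
    return suspicious


if __name__ == "__main__":
    # Tiny in-memory sample; a real report would be opened with newline="".
    sample = io.StringIO(
        "identifier,license_expression\n"
        "commercial-license_1.RULE,commercial-license\n"
        '"Licensee may not use the Software, except",commercial-license\n'
    )
    for record_number, value in find_suspicious_identifiers(sample):
        print(record_number, repr(value[:60]))
```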

@mjherzog
Member

mjherzog commented Sep 3, 2021

I see clues for Accellera and Hacktivismo in the errant text.
If we have text too large for a single CSV/Excel cell, we could just truncate it. This tool is needed most for licenses such as commercial-license, where most rules have relatively sparse text.

Member

@pombredanne pombredanne left a comment

See a few nits for your consideration!
Feel free to merge.

etc/scripts/licenses/report_license_rules.py (7 resolved review threads, 5 of them outdated)
Add licensedb url, modify csv output to correct bugs, only add first 200 characters
in case of text, modify functions and add docstrings.

Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
@AyanSinhaMahapatra
Member Author

@pombredanne I added the requested changes.

@mjherzog I added some changes that might fix the problems you reported earlier. Texts are now limited to their first 200 characters, and I removed some CSV file handling that was probably causing problems. I'm attaching new CSV reports; please check whether the bugs are resolved.
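For readers following along, the truncation described above can be sketched as follows. This is not the code from the PR: the `truncate_text` and `write_rows` helpers and the `text` field name are hypothetical, and letting Python's `csv` module handle all quoting is only one way to keep long or multi-line text from leaking into other columns.

```python
import csv
import io

MAX_TEXT_LENGTH = 200  # the limit mentioned in the comment above


def truncate_text(text, limit=MAX_TEXT_LENGTH):
    """Return at most `limit` characters of `text`, treating None as empty."""
    return (text or "")[:limit]


def write_rows(rows, fieldnames, output, text_field="text"):
    """Write dict rows as CSV, truncating the text field of each row.

    Assumption: `output` is a text file object; a real file should be
    opened with newline="" as the csv module documentation recommends.
    """
    writer = csv.DictWriter(
        output,
        fieldnames=fieldnames,
        quoting=csv.QUOTE_ALL,    # quote every cell so commas and newlines stay inside it
        extrasaction="ignore",    # ignore keys that are not in fieldnames
    )
    writer.writeheader()
    for row in rows:
        clean = dict(row)  # copy so the caller's row is not modified
        clean[text_field] = truncate_text(clean.get(text_field))
        writer.writerow(clean)


if __name__ == "__main__":
    out = io.StringIO()
    write_rows(
        [{"identifier": "example_1.RULE", "text": "line one,\nline two " * 50}],
        ["identifier", "text"],
        out,
    )
    print(out.getvalue())
```

Quoting every cell is a conservative choice here; the default minimal quoting also escapes commas and newlines correctly, as long as all writing goes through the `csv` writer rather than manual string joins.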

The commands used to generate reports:

python ./etc/scripts/licenses/report_license_rules.py -l licenses_data.csv -r rule_data.csv -t
python ./etc/scripts/licenses/report_license_rules.py -l comm_licenses_data.csv -r comm_rule_data.csv -t -c "Commercial"

Reports:
licenses_data.csv
rule_data.csv
comm_licenses_data.csv
comm_rule_data.csv

@AyanSinhaMahapatra AyanSinhaMahapatra merged commit 9b6186a into develop Sep 6, 2021
@pombredanne pombredanne deleted the 2566-report-license-rules branch September 21, 2021 10:52