-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add argument to make_codebase_resource
to make the created CodebaseResource path relative to the project work path rather than the project codebase path
#378
Conversation
a4d4a8b
to
4faafad
Compare
@JonoYang I'm not confortable with the idea of storing the
What about a special "root" CodebaseResource instance instead? Note that we used to always store the |
@tdruez Thanks for the review!
This would be acceptable too. The thing I am interested in is the ability to use the For example:
I'll modify my pipeline to create a |
@JonoYang my point is that we do not want to store any prefix in the database. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tdruez I've made some changes where we now create the "." CodebaseResource when running the scan_codebase
pipeline. I've done this by modifying Project.walk_codebase_path()
to return the project codebase/ directory and by modifying make_codebase_resource
to handle the cases where it is making a CodebaseResource for the project codebase/ directory.
scanpipe/models.py
Outdated
""" | ||
return self.codebase_path.rglob("*") | ||
return itertools.chain( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tdruez I am trying to get walk_codebase_path()
to return the codebase/
directory in addition to the subdirectories and files of codebase/
. Is there a better way to do this?
8c690ce
to
cf8a8bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not implemented a way to create "." when creating CodebaseResources for the docker pipeline. An immediate solution would be to check to see if a "." exists before creating the CodebaseResources for a layer (https://github.com/nexB/scancode.io/blob/main/scanpipe/pipes/docker.py#L134)
Maybe a better idea would be to create "." when the project is created?
No, let's not create resources at project creation time, resources should be created during pipelines execution.
@@ -668,9 +669,10 @@ def get_latest_output(self, filename): | |||
|
|||
def walk_codebase_path(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a unit test for this method.
scanpipe/models.py
Outdated
@@ -1580,7 +1588,12 @@ def descendants(self): | |||
database query on the current CodebaseResource `path`. | |||
The current CodebaseResource is not included. | |||
""" | |||
return self.project.codebaseresources.filter(path__startswith=f"{self.path}/") | |||
if self.path == ".": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a is_root
model property for this code and replace here with "self.is_root"
This will make the code more readable and we'll be able to change the Root logic in the future by simply updating the property.
Also, the "." should be stored in a global variable, something like ROOT_SYMBOL
, for the same reasons as above.
scanpipe/models.py
Outdated
@@ -1580,7 +1588,12 @@ def descendants(self): | |||
database query on the current CodebaseResource `path`. | |||
The current CodebaseResource is not included. | |||
""" | |||
return self.project.codebaseresources.filter(path__startswith=f"{self.path}/") | |||
if self.path == ".": | |||
return self.project.codebaseresources.exclude(path=".") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition does not seem to be tested. Also, could you document in the dostring why the root need to be treated differently?
scanpipe/models.py
Outdated
""" | ||
exactly_one_sub_directory = "[^/]+$" | ||
children_regex = rf"^{self.path}/{exactly_one_sub_directory}" | ||
if self.path == ".": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, it's not clear nor documented why we have to treat the root differently.
Signed-off-by: Thomas Druez <[email protected]>
Signed-off-by: Thomas Druez <[email protected]>
Signed-off-by: Thomas Druez <[email protected]>
Signed-off-by: Jono Yang <[email protected]>
* Remove create_discovered_packages2 and create_codebase_resources2 Signed-off-by: Jono Yang <[email protected]>
* Normalize package_uids before comparing results in tests * Update expected test results Signed-off-by: Jono Yang <[email protected]>
* Mark ProjectCodebase tests with expectedFailure * We will revisit ProjectCodebase and update it to fit our current models Signed-off-by: Jono Yang <[email protected]>
* We are using a scancode scan results for tests since asgiref-3.3.0_scan.json is not exactly the same format as scancode's json output Signed-off-by: Jono Yang <[email protected]>
* Update regen_test_data.py to generate asgiref-3.3.0_walk_test_fixtures.json Signed-off-by: Jono Yang <[email protected]>
* No need to explicity get license_clarity_score in make_results_summary() * Update expected test results Signed-off-by: Jono Yang <[email protected]>
* Add .vscode to .gitignore Signed-off-by: Jono Yang <[email protected]>
* Add argument to set the codebase path of a CodebaseResource created by make_codebase_resource relative to the project work path Signed-off-by: Jono Yang <[email protected]>
* `codebase` has been added as an unused argument to CodebaseResource.save() to provide compatibility with commoncode.resource.Resource Signed-off-by: Jono Yang <[email protected]>
* Have test to ensure that the codebase prefix is present when relative_to_work_path is True Signed-off-by: Jono Yang <[email protected]>
* Update expected test results * Revert previous changes to make_codebase_resource Signed-off-by: Jono Yang <[email protected]>
Signed-off-by: Jono Yang <[email protected]>
Signed-off-by: Jono Yang <[email protected]>
* Use different regex to determine the children of the "." CodebaseResource * Update test expectations Signed-off-by: Jono Yang <[email protected]>
Signed-off-by: Jono Yang <[email protected]>
* replace_root_path_and_name is a helper function that replaces the root of a Codebase with "." and strips the previous root prefix from the remaining Resources * Update tests Signed-off-by: Jono Yang <[email protected]>
Signed-off-by: Jono Yang <[email protected]>
Signed-off-by: Jono Yang <[email protected]>
* Update docstrings * Update expected test results Signed-off-by: Jono Yang <[email protected]>
48e489c
to
a9410f8
Compare
* Format code Signed-off-by: Jono Yang <[email protected]>
95ad5af
to
24902bf
Compare
@JonoYang now that we merged the toolkit upgrade, would it be a good time to rebase et complete this PR? |
* Update expected test results Signed-off-by: Jono Yang <[email protected]>
I'm merging in the changes from main into this branch. I'm having trouble with I think what I am trying to do in this function is best implemented somewhere in |
Closing this in favor of #624. Instead of creating a single root CodebaseResource, ProjectCodebase has been updated to walk, not from a single root, but from every CodebaseResource in the root of a Project. This effectively will get the same tree traversal done without large changes to the Project/CodebaseResource models. |
I have a pipeline that performs something similar to a Merkle tree calculation on a codebase. In order for this to work, I need a root directory to roll my results up to. scanpipe currently does not have the concept of a root, so my current workaround is to use the project's codebase directory as the root directory.
I create a new pipeline that creates a
CodebaseResource
for the project codebase directory, but when I callscanpipe.pipes.scancode.make_codebase_resource
, the createdCodebaseResources
do not havecodebase
as the prefix.I've modified
scanpipe.pipes.scancode.make_codebase_resource
to have a new argumentrelative_to_work_path
such that the path created for the CodebaseResource is related to the project work path, rather than the project codebase path. This means that the created CodebaseResource paths will be prefixed withcodebase
e.g.codebase/notice.NOTICE
.