-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Search by key word #60
Conversation
WalkthroughThis pull request integrates a new command, Changes
Sequence Diagram(s)sequenceDiagram
participant U as User
participant C as CLI Interface
participant P as CLI Parser
participant S as search_projects_by_keywords-and-filters
participant E as Search Engine
U->>C: Enter search-projects-by-keywords-and-filters command with options
C->>P: Parse command-line arguments (-k, -f, -ps, -p, -sd, -sf)
P->>S: Invoke search_projects_by_keywords-and-filters with parameters
S->>E: Build query & call search method with filters and sorting
E-->>S: Return search results
S-->>C: Return formatted results
C-->>U: Display search results
Possibly related PRs
Poem
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
pridepy/pridepy.py (1)
361-378
: Add validation for sort_direction value.The function should validate that the sort_direction value is either "ASC" or "DESC" before passing it to the search method.
Apply this diff to add validation:
def search_projects_by_keywords_and_filters( keyword, filter, page_size, page, sort_direction, sort_fields ): project = Project() + if sort_direction not in ["ASC", "DESC"]: + raise ValueError("sort_direction must be either 'ASC' or 'DESC'") sf = ', '.join(sort_fields) logging.info( project.search_by_keywords_and_filters( keyword, filter, page_size, page, sort_direction, sf ) )README.md (1)
160-166
: Improve the command example documentation.
- The command example should show its output to comply with markdown best practices.
- Consider adding expected output to help users understand the command's behavior.
Apply this diff to improve the documentation:
## Search projects by keywords and filters Get the Project metadata by keywords and filters ```bash $ python -m pridepy.pridepy search-projects-by-keywords-and-filters -f projectTags==Proteometools,organismsPart==Pancreas -k human -sd DESC -sf accession -sf submissionDate +# Example output: +# { +# "accession": "PXD000001", +# "title": "Human Proteome Project", +# "projectTags": ["Proteometools"], +# "submissionDate": "2023-01-01", +# ... +# }<details> <summary>🧰 Tools</summary> <details> <summary>🪛 markdownlint-cli2 (0.17.2)</summary> 165-165: Dollar signs used before commands without showing output null (MD014, commands-show-output) </details> </details> </blockquote></details> </blockquote></details> <details> <summary>📜 Review details</summary> **Configuration used: CodeRabbit UI** **Review profile: CHILL** **Plan: Pro** <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between e57c89ab9dd2a084ea9f55a9d1f5b820ae74e8a1 and 0c2301178a8f825fe0d6defe8550604d5c5d5dd9. </details> <details> <summary>📒 Files selected for processing (2)</summary> * `README.md` (3 hunks) * `pridepy/pridepy.py` (2 hunks) </details> <details> <summary>🧰 Additional context used</summary> <details> <summary>🪛 markdownlint-cli2 (0.17.2)</summary> <details> <summary>README.md</summary> 165-165: Dollar signs used before commands without showing output null (MD014, commands-show-output) </details> </details> <details> <summary>🪛 GitHub Actions: Python package</summary> <details> <summary>pridepy/pridepy.py</summary> [warning] 13-13: expected 2 blank lines, found 1 --- [warning] 231-231: line too long (143 > 127 characters) --- [warning] 274-274: expected 2 blank lines, found 1 --- [warning] 311-311: expected 2 blank lines, found 1 --- [warning] 383-383: too many blank lines (3) </details> </details> </details> <details> <summary>⏰ Context from checks skipped due to timeout of 90000ms (1)</summary> * GitHub Check: build </details> <details> <summary>🔇 Additional comments (4)</summary><blockquote> <details> <summary>pridepy/pridepy.py (2)</summary> `129-129`: **LGTM!** The formatting of category choices has been improved by removing unnecessary spaces. --- `315-318`: **LGTM!** Making the keyword parameter required is a good choice as it ensures more focused search results. </details> <details> <summary>README.md (2)</summary> `61-61`: **LGTM!** The new command has been correctly added to the list of available commands. --- `139-139`: **LGTM!** The typo in the section header has been correctly fixed. </details> </blockquote></details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
pridepy/pridepy.py
Outdated
@click.option( | ||
"-f", | ||
"--filter", | ||
required=False, | ||
help="Parameters to filter the search results. The structure of the " | ||
"filter is: field1==value1, field2==value2. Example " | ||
"accession==PRD000001", | ||
) | ||
@click.option( | ||
"-ps", | ||
"--page_size", | ||
required=False, | ||
default=100, | ||
help="Number of results to fetch in a page", | ||
) | ||
@click.option( | ||
"-p", | ||
"--page", | ||
required=False, | ||
default=0, | ||
help="Identifies which page of results to fetch", | ||
) | ||
@click.option( | ||
"-sd", | ||
"--sort_direction", | ||
required=False, | ||
default="DESC", | ||
help="Sorting direction: ASC or DESC", | ||
) | ||
@click.option( | ||
"-sf", | ||
"--sort_fields", | ||
required=False, | ||
default="submission_date", | ||
multiple=True, | ||
help="Field(s) for sorting the results on. Default for this " | ||
"request is submission_date. More fields can be separated by " | ||
"comma and passed. Example: submissionDate,accession", | ||
type=click.Choice("accession,submissionDate,diseases,organismsPart,organisms,instruments,softwares," | ||
"avgDownloadsPerFile,downloadCount,publicationDate".split(",")), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add type validation for sort_direction and fix inconsistency in sort_fields.
- The
sort_direction
option should validate input usingtype=click.Choice(["ASC", "DESC"])
. - There's an inconsistency in the sort fields: the help text mentions "submissionDate" but the choice uses "submission_date".
Apply this diff to improve the options:
@click.option(
"-sd",
"--sort_direction",
required=False,
default="DESC",
- help="Sorting direction: ASC or DESC",
+ help="Sorting direction",
+ type=click.Choice(["ASC", "DESC"]),
)
@click.option(
"-sf",
"--sort_fields",
required=False,
- default="submission_date",
+ default="submissionDate",
multiple=True,
help="Field(s) for sorting the results on. Default for this "
"request is submission_date. More fields can be separated by "
"comma and passed. Example: submissionDate,accession",
type=click.Choice("accession,submissionDate,diseases,organismsPart,organisms,instruments,softwares,"
"avgDownloadsPerFile,downloadCount,publicationDate".split(",")),
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
pridepy/pridepy.py (1)
363-372
: Enhance docstring with more detailed parameter descriptions.The docstring should provide more information about:
- Valid filter formats and examples
- Valid range for page_size
- Valid values for sort_direction
- List of available sort_fields
""" Search all projects by keywords and filters Parameters: - keyword (str): keyword to search in entire project. - filter (str): filter the search results. field1==value1 - page_size (int): no of records or projects per page - page (int): Page number - sort_direction (str): sort direction of the results based on sortfield - sort_fields (str): field to sort the results by. + keyword (str): Keyword to search across all project fields. + filter (str): Filter string in format "field1==value1, field2==value2". + Example: "accession==PRD000001, organisms==Human" + page_size (int): Number of results per page (1-1000, default: 100) + page (int): Zero-based page number (default: 0) + sort_direction (str): Sort direction, either "ASC" or "DESC" (default: "DESC") + sort_fields (str): Fields to sort by, comma-separated. + Available fields: accession, submissionDate, diseases, + organismsPart, organisms, instruments, softwares, + avgDownloadsPerFile, downloadCount, publicationDate """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pridepy/pridepy.py
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (2)
pridepy/pridepy.py (2)
129-129
: LGTM!The formatting fix ensures consistent category choices without extra spaces.
341-359
: Add type validation for sort_direction and fix inconsistency in sort_fields.
- The
sort_direction
option should validate input usingtype=click.Choice(["ASC", "DESC"])
.- There's an inconsistency in the sort fields: the help text mentions "submissionDate" but the choice uses "submission_date".
@click.option( | ||
"-f", | ||
"--filter", | ||
required=False, | ||
help="Parameters to filter the search results. The structure of the " | ||
"filter is: field1==value1, field2==value2. Example " | ||
"accession==PRD000001", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Clarify filter parsing logic.
The help text suggests using comma-separated field-value pairs (e.g., "field1==value1, field2==value2"), but the implementation doesn't show how these are parsed. Please verify:
- Is the filter string parsed correctly?
- Are multiple filters supported?
🏁 Script executed:
#!/bin/bash
# Description: Check if filter parsing logic exists in the Project class
# Search for filter parsing logic in the Project class
ast-grep --pattern $'class Project {
$$$
search_by_keywords_and_filters($$$) {
$$$
}
$$$
}'
# Search for any filter-related code
rg -A 5 "filter.*==.*"
Length of output: 1141
Clarify Filter Parsing Implementation
The current implementation only shows a help text suggesting the use of comma‑separated field–value pairs, but no explicit code was found that splits or otherwise processes the filter string. In particular, a search for filter‐related parsing (e.g. splitting on commas or "==") within the Project class (and specifically in the search_by_keywords_and_filters function) didn’t reveal any parsing logic. This raises two concerns:
• The filter string does not appear to be explicitly parsed to convert something like "field1==value1, field2==value2" into structured filters.
• There is no clear evidence that multiple filters are handled, even though the documentation suggests that they should be.
It would be advisable to verify whether filter processing is handled elsewhere or to add explicit parsing (with corresponding inline comments and tests) so that the behavior matches the help text.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
pridepy/pridepy.py (2)
313-318
: Improve keyword parameter help text.The current help text could be clearer about what fields are searched and how the keyword is used.
help="The entered word will be searched among the fields to fetch " - "matching pride." + "matching PRIDE projects. The keyword will be searched across project metadata " + "including titles, descriptions, and other relevant fields."
376-376
: Consider handling empty sort_fields.The current implementation joins sort fields without checking if the tuple is empty. While Click's default value ensures it won't be empty, it's good practice to handle edge cases.
- sf = ', '.join(sort_fields) + sf = ', '.join(sort_fields) if sort_fields else 'submissionDate'
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pridepy/pridepy.py
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: build
🔇 Additional comments (4)
pridepy/pridepy.py (4)
129-129
: LGTM! Improved formatting of category choices.The formatting of category choices has been improved by removing unnecessary spaces.
343-349
: Add type validation for sort_direction.As mentioned in the previous review, the
sort_direction
option should validate input usingtype=click.Choice(["ASC", "DESC"])
.required=False, default="DESC", - help="Sorting direction: ASC or DESC", + help="Sorting direction", + type=click.Choice(["ASC", "DESC"]),
351-361
: Fix inconsistency in sort_fields default value.As mentioned in the previous review, there's an inconsistency between the default value "submission_date" and the choice "submissionDate".
required=False, - default=["submission_date"], + default=["submissionDate"], multiple=True,
319-326
: Clarify filter parsing implementation.As mentioned in the previous review, the filter parsing logic needs clarification. The help text suggests using comma-separated field-value pairs, but the implementation details are missing. Please verify:
- Is the filter string parsed correctly?
- Are multiple filters supported?
#!/bin/bash # Description: Check if filter parsing logic exists in the Project class # Search for filter parsing logic in the Project class ast-grep --pattern $'class Project { $$$ search_by_keywords_and_filters($$$) { $$$ } $$$ }' # Search for any filter-related code rg -A 5 "filter.*==.*"
Summary by CodeRabbit
New Features
Documentation