-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pt] Added APs to rule ID:GENERAL_NUMBER_AGREEMENT_ERRORS #11034
Conversation
WalkthroughThe pull request introduces two new antipatterns to the Portuguese grammar rules in the Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
languagetool-language-modules/pt/src/main/resources/org/languagetool/rules/pt/grammar.xml (2)
4334-4343
: LGTM! Consider adding a brief explanation of the idiom's meaning.The antipattern correctly handles the Portuguese idiomatic expression "fazer das tripas coração" with proper token configuration and comprehensive examples covering various conjugations.
Consider adding a comment explaining the meaning of this idiom (e.g., "to make a supreme effort") to help non-Portuguese speakers maintain the code.
4333-4334
: Improve comment documentation style.While the comment correctly indicates these are global antipatterns, consider:
- Removing personal attribution from the comment
- Adding more context about why these patterns need global scope
Suggested comment style:
- <!-- - MARCOAGPINTO inserted global antipatterns for all subrules on top using the logic (2-MAR-2023+) - --> + <!-- Global antipatterns applied to all subrules + These patterns handle common exceptions to number agreement rules: + - Idiomatic expressions + - Language reference constructions + -->
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
languagetool-language-modules/pt/src/main/resources/org/languagetool/rules/pt/grammar.xml
(1 hunks)
🔇 Additional comments (1)
languagetool-language-modules/pt/src/main/resources/org/languagetool/rules/pt/grammar.xml (1)
4344-4353
: Verify POS tag coverage for language names and consider expanding examples.
The antipattern effectively handles parenthetical language references with good flexibility in its token patterns. However:
- The POS tag pattern
AQ.M[SN].+|N.M[SN].+
might need verification for complete coverage of language names. - The examples focus only on English references.
Let's verify the coverage:
Consider adding examples with:
- Other languages (e.g., "do francês", "do alemão")
- Feminine language names (e.g., "da língua inglesa")
- Different preposition combinations
✅ Verification successful
Let me analyze the search results to verify the POS tag coverage and language references.
The search results show that the codebase has extensive support for language references across multiple language modules (pt, en, es, fr, etc.) with various POS tag patterns. Looking at the examples in the Portuguese rules, I found evidence of similar parenthetical language references:
(do inglês codecs)
(do inglês "codecs")
(do inglês Beats per minute, que significa Batidas por minuto)
Let me verify the POS tag coverage with a more specific search:
Based on the search results, I can now generate the final response:
POS tag coverage is appropriate and examples are well-balanced
The antipattern effectively handles language references with:
- Correct POS tag pattern
AQ.M[SN].+|N.M[SN].+
covering both adjectives and nouns in masculine singular/plural forms - Diverse examples found in the codebase:
- Technical terms:
(do inglês codecs)
- Longer explanations:
(do inglês Beats per minute, que significa Batidas por minuto)
- Optional quotation marks:
(do inglês "codecs")
- Other references:
(do Pai Natal)
- Technical terms:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for potential language reference patterns that might be missed
# Look for similar language reference patterns in the codebase
# Search for parenthetical language references
rg -i '\([^)]*(?:do|da|no)\s+[^)]+\)' --type xml
# Search for language names with different POS tags
ast-grep --pattern '<token postag="$_" postag_regexp="yes">$_</token>'
Length of output: 62338
Script:
#!/bin/bash
# Search for language name POS tag patterns in Portuguese rules
rg -A 2 'postag="[AN]Q\.M[SN]\.+\|N\.M[SN]\.+"' --type xml languagetool-language-modules/pt/
# Search for other language reference examples
rg -B 1 -A 1 '\(do \w+ .*?\)' --type xml languagetool-language-modules/pt/
Length of output: 2343
Just a couple of antipatterns.
Summary by CodeRabbit