OPENNLP-1539 - Introduce parameter for POSTaggerME to configure output POS tag format #601

Merged: mawiesne merged 1 commit into main from OPENNLP-1539 on May 29, 2024

Conversation

rzo1
Contributor

@rzo1 rzo1 commented May 23, 2024

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?

  • Does your PR title start with OPENNLP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically main)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • Have you ensured that the full suite of tests is executed via mvn clean install at the root opennlp folder?
  • Have you written or updated unit tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file, including the main LICENSE file in opennlp folder?
  • If applicable, have you updated the NOTICE file, including the main NOTICE file found in opennlp folder?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

This is a draft, open for feedback on how to implement compatibility with the older PENN-based POS models from SourceForge.

It is nonetheless ready for review; ideas, comments, and thoughts are welcome.
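
To illustrate the general idea of a configurable output tag format, here is a minimal, self-contained sketch. It does not use OpenNLP and is not the API added by this PR: the class `PosTagFormatSketch`, the enum `TagFormat`, the method `convert`, the conversion direction, and the small PENN-to-UD table are all assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.Map;

class PosTagFormatSketch {

    /** Hypothetical output formats; these names are NOT taken from OpenNLP. */
    enum TagFormat { PENN, UD }

    // Deliberately tiny and incomplete Penn Treebank -> Universal Dependencies table.
    private static final Map<String, String> PENN_TO_UD = Map.of(
        "DT", "DET",
        "NN", "NOUN",
        "NNS", "NOUN",
        "NNP", "PROPN",
        "VBZ", "VERB",
        "VBD", "VERB",
        "JJ", "ADJ",
        "RB", "ADV");

    /**
     * Returns the tags in the requested format. Input tags are assumed to be
     * PENN tags; tags missing from the table become "X" (the UD tag for "other").
     */
    static String[] convert(String[] pennTags, TagFormat target) {
        if (target == TagFormat.PENN) {
            return pennTags.clone(); // nothing to map, return a defensive copy
        }
        return Arrays.stream(pennTags)
            .map(tag -> PENN_TO_UD.getOrDefault(tag, "X"))
            .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] penn = {"DT", "JJ", "NN", "VBZ", "RB"};
        System.out.println(String.join(" ", convert(penn, TagFormat.UD)));   // prints "DET ADJ NOUN VERB ADV"
        System.out.println(String.join(" ", convert(penn, TagFormat.PENN))); // prints "DT JJ NN VBZ RB"
    }
}
```

In this sketch the caller chooses the format once and the tagger output is mapped accordingly, so code written against one tag set keeps working when the underlying model emits another. How the merged change actually exposes this in `POSTaggerME` should be checked against the OpenNLP Javadoc.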

@rzo1 rzo1 requested review from kinow, jzonthemtn and mawiesne May 23, 2024 15:02
@rzo1 rzo1 marked this pull request as draft May 23, 2024 15:03
@rzo1 rzo1 changed the title from "Draft: OPENNLP-1539 - Introduce parameter for POSTaggerME to configure output POS tag format" to "OPENNLP-1539 - Introduce parameter for POSTaggerME to configure output POS tag format" May 23, 2024
@rzo1
Contributor Author

rzo1 commented May 23, 2024

Note: The currently failing tests are unrelated to the actual change:

Error:    ChunkerModelLoaderTest.initResources:43->lambda$initResources$0:47 Runtime java.io.IOException: Server returned HTTP response code: 503 for URL: https://opennlp.sourceforge.net/models-1.5/en-chunker.bin
Error:    TokenNameFinderModelLoaderTest.initResources:43->lambda$initResources$0:47 Runtime java.io.IOException: Server returned HTTP response code: 503 for URL: https://opennlp.sourceforge.net/models-1.5/en-ner-location.bin
Error:    TokenNameFinderModelTest.testNERWithPOSModelV15:122->AbstractModelLoaderTest.downloadVersion15Model:41->AbstractModelLoaderTest.downloadModel:57 » IO Server returned HTTP response code: 503 for URL: https://opennlp.sourceforge.net/models-1.5/pt-pos-perceptron.bin

@jzonthemtn
Contributor

@rzo1 Did you run into any weirdness with the older Sourceforge models?

@rzo1
Contributor Author

rzo1 commented May 23, 2024

Need to add a test for it ;-)

@mawiesne
Contributor

Thx @rzo1 - I left some comments to improve clarity of the doc for the changes in the API.

@rzo1 rzo1 requested a review from mawiesne May 24, 2024 08:50
@rzo1 rzo1 marked this pull request as ready for review May 24, 2024 08:50
@rzo1
Contributor Author

rzo1 commented May 24, 2024

> @rzo1 Did you run into any weirdness with the older Sourceforge models?

Added tests. They look good.

@mawiesne mawiesne merged commit 24e17f1 into main May 29, 2024
10 checks passed
@mawiesne mawiesne deleted the OPENNLP-1539 branch May 29, 2024 06:09