-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: class-based design #15
Conversation
modified: yuisub/__main__.py new file: yuisub/sub_translator.py
Reviewer's Guide by SourceryThis PR refactors the codebase to implement a class-based design by introducing the SubtitleTranslator class. The class encapsulates all subtitle translation functionality, including audio transcription and subtitle file handling. The changes simplify the API surface and improve code organization by moving the core functionality into a single cohesive class. Class diagram for WhisperModelclassDiagram
class WhisperModel {
- name: str
- device: Optional[Union[str, torch.device]]
- download_root: Optional[str]
- in_memory: bool
+ WhisperModel(name, device, download_root, in_memory)
+ transcribe(audio)
}
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @NULL204 - I've reviewed your changes and they look great!
Here's what I looked at during the review
- 🟡 General issues: 3 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
modified: yuisub/__main__.py renamed: yuisub/sub_translator.py -> yuisub/translator.py
@sourcery-ai review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @NULL204 - I've reviewed your changes - here's some feedback:
Overall Comments:
- There appears to be a bug in get_subtitles() where it uses self.sub_zh instead of the local sub_zh variable in the bilingual() call. This will cause issues since self.sub_zh is never set.
Here's what I looked at during the review
- 🟡 General issues: 2 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
modified: yuisub/__main__.py modified: yuisub/translator.py
Welcome to Codecov 🎉Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests. Thanks for integrating Codecov - We've got you covered ☂️ |
modified: tests/test_bangumi.py modified: tests/test_llm.py modified: tests/test_sub.py new file: tests/test_translator.py modified: tests/util.py modified: yuisub/translator.py
@sourcery-ai review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @NULL204 - I've reviewed your changes and they look great!
Here's what I looked at during the review
- 🟡 General issues: 2 issues found
- 🟢 Security: all looks good
- 🟡 Testing: 3 issues found
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
sub: Optional[Union[str, Path, pysubs2.SSAFile]] = None, | ||
audio: Optional[Union[str, Any]] = None, | ||
styles: Optional[Dict[str, pysubs2.SSAStyle]] = None, | ||
ad: Optional[pysubs2.SSAEvent] = advertisement(), # noqa: B008 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Avoid mutable default arguments in method parameters
Mutable default arguments can cause unexpected behavior. Consider using None as the default and creating the advertisement in the method body if needed.
@@ -112,8 +112,8 @@ async def translate( | |||
base_url=base_url, | |||
bangumi_info=bangumi_info, | |||
) | |||
print(summarizer.system_prompt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Remove or replace debug print statement
Consider using a proper logging system instead of print statements if this information is important for debugging.
import logging
logging.debug(summarizer.system_prompt)
async def test_translator_sub() -> None: | ||
translator = SubtitleTranslator( | ||
model=util.OPENAI_MODEL, | ||
api_key=util.OPENAI_API_KEY, | ||
base_url=util.OPENAI_BASE_URL, | ||
bangumi_url=util.BANGUMI_URL, | ||
bangumi_access_token=util.BANGUMI_ACCESS_TOKEN, | ||
) | ||
|
||
sub_zh, sub_bilingual = await translator.get_subtitles(sub=str(util.TEST_ENG_SRT)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Test should verify the content of the translated subtitles
The test only checks if the files are saved but doesn't verify the actual content of the translations. Consider adding assertions to check the translated text, timing, and format of both sub_zh and sub_bilingual.
async def test_translator_sub() -> None:
translator = SubtitleTranslator(
model=util.OPENAI_MODEL,
api_key=util.OPENAI_API_KEY,
base_url=util.OPENAI_BASE_URL,
bangumi_url=util.BANGUMI_URL,
bangumi_access_token=util.BANGUMI_ACCESS_TOKEN,
)
sub_zh, sub_bilingual = await translator.get_subtitles(sub=str(util.TEST_ENG_SRT))
assert "你好" in str(sub_zh)
assert "Hello" in str(sub_bilingual) and "你好" in str(sub_bilingual)
@pytest.mark.skipif(os.environ.get("GITHUB_ACTIONS") == "true", reason="Skipping test when running on CI") | ||
async def test_translator_sub() -> None: | ||
translator = SubtitleTranslator( | ||
model=util.OPENAI_MODEL, | ||
api_key=util.OPENAI_API_KEY, | ||
base_url=util.OPENAI_BASE_URL, | ||
bangumi_url=util.BANGUMI_URL, | ||
bangumi_access_token=util.BANGUMI_ACCESS_TOKEN, | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Consider using mocks for CI environment instead of skipping tests
Rather than skipping these tests in CI, consider mocking the external dependencies (Whisper model, OpenAI API) to allow these tests to run in all environments. This would provide better test coverage and catch potential issues earlier.
@pytest.mark.asyncio
@mock.patch('your_module.SubtitleTranslator.get_subtitles')
async def test_translator_sub(mock_get_subtitles) -> None:
mock_get_subtitles.return_value = (Mock(), Mock())
translator = SubtitleTranslator(
model=util.OPENAI_MODEL,
api_key="mock_key",
base_url="mock_url",
bangumi_url="mock_url",
bangumi_access_token="mock_token"
)
await translator.get_subtitles(sub=str(util.TEST_ENG_SRT))
async def test_translator_audio() -> None: | ||
translator = SubtitleTranslator( | ||
torch_device=util.DEVICE, | ||
whisper_model=util.MODEL_NAME, | ||
model=util.OPENAI_MODEL, | ||
api_key=util.OPENAI_API_KEY, | ||
base_url=util.OPENAI_BASE_URL, | ||
bangumi_url=util.BANGUMI_URL, | ||
bangumi_access_token=util.BANGUMI_ACCESS_TOKEN, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Add error case tests for the SubtitleTranslator
The tests only cover the happy path. Consider adding tests for error cases such as invalid audio files, network errors, invalid API keys, and other edge cases that could occur during translation.
async def test_translator_audio() -> None:
translator = SubtitleTranslator(
torch_device=util.DEVICE,
whisper_model=util.MODEL_NAME,
model=util.OPENAI_MODEL,
api_key=util.OPENAI_API_KEY,
base_url=util.OPENAI_BASE_URL,
bangumi_url=util.BANGUMI_URL,
bangumi_access_token=util.BANGUMI_ACCESS_TOKEN,
)
sub_zh, sub_bilingual = await translator.get_subtitles(audio=str(util.TEST_AUDIO))
sub_zh.save(util.projectPATH / "assets" / "test.zh.translator.audio.ass")
sub_bilingual.save(util.projectPATH / "assets" / "test.bilingual.translator.audio.ass")
with pytest.raises(FileNotFoundError):
await translator.get_subtitles(audio="nonexistent_file.mp3")
with pytest.raises(Exception):
invalid_translator = SubtitleTranslator(
torch_device=util.DEVICE,
whisper_model=util.MODEL_NAME,
model=util.OPENAI_MODEL,
api_key="invalid_key",
base_url=util.OPENAI_BASE_URL,
bangumi_url=util.BANGUMI_URL,
bangumi_access_token=util.BANGUMI_ACCESS_TOKEN,
)
await invalid_translator.get_subtitles(audio=str(util.TEST_AUDIO))
Summary by Sourcery
Refactor the subtitle translation functionality into a class-based design using the new SubtitleTranslator class. Update the README to reflect these changes and add tests for the new class. Adjust the CI workflow to reorder OS versions in the testing matrix.
Enhancements:
CI:
Documentation:
Tests: