1 parent 9eec455, commit 8f0a82f
Showing 3 changed files with 113 additions and 1 deletion.
@@ -0,0 +1,111 @@
# `TranslatorOpenSource` Class Documentation

The `TranslatorOpenSource` class detects the language of a text and translates text between languages using open-source large language models (LLMs) deployed on a custom endpoint.

## Initialization

```python
from llmtranslate import TranslatorOpenSource

# Create the translator object
translator = TranslatorOpenSource(api_key="YOUR_API_KEY", llm_endpoint="YOUR_LLM_ENDPOINT", model="mistralai/Mistral-Nemo-Instruct-2407")
```

The constructor stores the API key, the endpoint URL, and the model name. It also sets the maximum text length limits according to the model.
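
Hard-coded credentials are easy to leak through version control. One possible approach, sketched below, reads the constructor arguments from environment variables. The variable names `LLM_API_KEY`, `LLM_ENDPOINT`, and `LLM_MODEL` and the helper `load_translator_kwargs` are assumptions made for illustration; they are not part of `llmtranslate`.

```python
import os

# Default taken from the examples in this document.
DEFAULT_MODEL = "mistralai/Mistral-Nemo-Instruct-2407"


def load_translator_kwargs(env=None):
    """Build keyword arguments for TranslatorOpenSource from environment variables.

    The variable names are hypothetical; pick whatever fits your deployment.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("LLM_API_KEY", "LLM_ENDPOINT") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "api_key": env["LLM_API_KEY"],
        "llm_endpoint": env["LLM_ENDPOINT"],
        "model": env.get("LLM_MODEL", DEFAULT_MODEL),
    }
```

The returned dictionary can then be unpacked into the constructor, for example `TranslatorOpenSource(**load_translator_kwargs())`.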

### Parameters:

- **`api_key`**:
  A string containing the API key used to authenticate requests to the LLM service.

- **`llm_endpoint`**:
  A string containing the URL of the endpoint where the model is deployed. This is usually a REST API endpoint provided by your deployment platform (e.g., Hugging Face or another platform for serving open-source models).

- **`model`**:
  A string containing the name of the model used for translation. The examples in this document use `mistralai/Mistral-Nemo-Instruct-2407`. The model determines the capabilities and performance of the translation system.

The text length limits depend on whether the model is listed in `MINI_MODELS`:

- For models in `MINI_MODELS`:
  - `max_length = 30`: Limits the maximum length of text chunks that can be translated in a single request.
  - `max_length_mini_text_chunk = 20`: Limits the maximum length of mini text chunks for smaller translations.
- For larger models (those not in `MINI_MODELS`):
  - `max_length = 100`: Allows larger text chunks for translation.
  - `max_length_mini_text_chunk = 50`: Allows larger mini text chunks for more efficient translation handling.
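
The rule above amounts to a simple lookup. The sketch below restates it. The contents of `MINI_MODELS` are not listed in this document, so the set here holds a placeholder name, and `select_limits` is a hypothetical helper rather than the library's actual code.

```python
# Placeholder contents: the real MINI_MODELS collection is defined by the
# library and is not reproduced in this document.
MINI_MODELS = {"example/mini-model"}


def select_limits(model, mini_models=MINI_MODELS):
    """Return (max_length, max_length_mini_text_chunk) for a model name."""
    if model in mini_models:
        return 30, 20
    return 100, 50


print(select_limits("example/mini-model"))   # (30, 20)
print(select_limits("example/large-model"))  # (100, 50)
```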

## Example Usage

### Synchronous Language Detection

```python
from llmtranslate import TranslatorOpenSource

# Create the translator object
translator = TranslatorOpenSource(api_key="YOUR_API_KEY", llm_endpoint="YOUR_LLM_ENDPOINT", model="mistralai/Mistral-Nemo-Instruct-2407")

# Detect the language of a given text
detected_language = translator.get_text_language("Bonjour tout le monde")

# The result is None when the language could not be detected
if detected_language is not None:
    print(detected_language.ISO_639_1_code)  # Output: 'fr'
    print(detected_language.ISO_639_2_code)  # Output: 'fra'
    print(detected_language.ISO_639_3_code)  # Output: 'fra'
    print(detected_language.language_name)   # Output: 'French'
```
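
As the `is not None` check above implies, detection can yield `None`, so calling code often wants a fallback value. A minimal sketch follows, using a stand-in object with the attribute names shown above instead of a real detection result. `language_code_or_default` is a hypothetical helper, and the default `"und"` is the ISO 639-2 code for an undetermined language.

```python
from types import SimpleNamespace


def language_code_or_default(detected_language, default="und"):
    """Return the ISO 639-1 code of a detection result, or a default if detection failed."""
    if detected_language is None:
        return default
    return detected_language.ISO_639_1_code


# Stand-in for a detection result; a real one would come from
# translator.get_text_language(...).
fake_result = SimpleNamespace(ISO_639_1_code="fr", language_name="French")

print(language_code_or_default(fake_result))  # fr
print(language_code_or_default(None))         # und
```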

### Asynchronous Language Detection

```python
import asyncio
from llmtranslate import TranslatorOpenSource

# Create the translator object
translator = TranslatorOpenSource(api_key="YOUR_API_KEY", llm_endpoint="YOUR_LLM_ENDPOINT", model="mistralai/Mistral-Nemo-Instruct-2407")

# Async function to detect language
async def detect_language_async():
    detected_language = await translator.async_get_text_language("Hola, ¿cómo estás?")
    if detected_language is not None:
        print(detected_language.ISO_639_1_code)  # Output: 'es'
        print(detected_language.language_name)   # Output: 'Spanish'

# Run the async function
asyncio.run(detect_language_async())
```
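
The main benefit of the asynchronous method is that several detections can be awaited concurrently. The sketch below shows one way to do this with `asyncio.gather`. `StubTranslator` is a stand-in that only mimics the `async_get_text_language` method name used above and makes no network calls; `detect_many` is a hypothetical helper.

```python
import asyncio
from types import SimpleNamespace


class StubTranslator:
    """Stand-in for TranslatorOpenSource; returns canned results without any network call."""

    _KNOWN = {"Hola, ¿cómo estás?": "es", "Bonjour tout le monde": "fr"}

    async def async_get_text_language(self, text):
        await asyncio.sleep(0)  # yield control, as a real request would
        code = self._KNOWN.get(text)
        return SimpleNamespace(ISO_639_1_code=code) if code else None


async def detect_many(translator, texts):
    """Detect the language of several texts concurrently, preserving input order."""
    results = await asyncio.gather(
        *(translator.async_get_text_language(text) for text in texts)
    )
    return [r.ISO_639_1_code if r is not None else None for r in results]


codes = asyncio.run(
    detect_many(StubTranslator(), ["Hola, ¿cómo estás?", "Bonjour tout le monde", "???"])
)
print(codes)  # ['es', 'fr', None]
```

With a real translator object in place of the stub, the requests would overlap instead of running one after another.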

### Synchronous Translation

```python
from llmtranslate import TranslatorOpenSource

# Create the translator object
translator = TranslatorOpenSource(api_key="YOUR_API_KEY", llm_endpoint="YOUR_LLM_ENDPOINT", model="mistralai/Mistral-Nemo-Instruct-2407")

# Translate text from one language to another
translated_text = translator.translate(
    text="Hallo, wie geht's?",
    to_language="en"  # Target language in ISO 639-1 format
)
print(translated_text)  # Output: "Hello, how are you?"
```
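
Each call to `translate` sends a request to the endpoint, so repeated translations of the same text cost time on every call. One possible mitigation, sketched here as an assumption and not as a library feature, is a small in-memory cache wrapped around any object that exposes `translate(text=..., to_language=...)`. Both `CachingTranslator` and `StubTranslator` are hypothetical names.

```python
class CachingTranslator:
    """Memoize translate() results keyed by (text, to_language)."""

    def __init__(self, translator):
        self._translator = translator
        self._cache = {}

    def translate(self, text, to_language):
        key = (text, to_language)
        if key not in self._cache:
            # Only the first request for a given key reaches the wrapped translator.
            self._cache[key] = self._translator.translate(text=text, to_language=to_language)
        return self._cache[key]


class StubTranslator:
    """Stand-in that counts calls instead of contacting an LLM endpoint."""

    def __init__(self):
        self.calls = 0

    def translate(self, text, to_language):
        self.calls += 1
        return f"[{to_language}] {text}"


stub = StubTranslator()
cached = CachingTranslator(stub)
first = cached.translate("Hallo, wie geht's?", "en")
second = cached.translate("Hallo, wie geht's?", "en")
print(stub.calls)  # 1
```

Because LLM output can vary between calls, a cache like this also makes repeated translations of the same input consistent.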

### Asynchronous Translation

```python
import asyncio
from llmtranslate import TranslatorOpenSource

# Create the translator object
translator = TranslatorOpenSource(api_key="YOUR_API_KEY", llm_endpoint="YOUR_LLM_ENDPOINT", model="mistralai/Mistral-Nemo-Instruct-2407")

# Async function to translate text
async def translate_text_async():
    translated_text = await translator.async_translate_text(
        text="Cześć, jak się masz?",
        to_language="en"  # Target language in ISO 639-1 format
    )
    print(translated_text)  # Output: "Hello, how are you?"

# Run the async function
asyncio.run(translate_text_async())
```
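
When many texts are translated at once, an unbounded burst of requests may overload a self-hosted endpoint. The sketch below caps the number of in-flight requests with `asyncio.Semaphore`. `StubTranslator` only mimics the `async_translate_text` method name used above, and `translate_many` and its `max_concurrent` parameter are hypothetical.

```python
import asyncio


class StubTranslator:
    """Stand-in for TranslatorOpenSource; echoes the text without any network call."""

    async def async_translate_text(self, text, to_language):
        await asyncio.sleep(0)  # yield control, as a real request would
        return f"[{to_language}] {text}"


async def translate_many(translator, texts, to_language, max_concurrent=2):
    """Translate texts concurrently, with at most max_concurrent requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def translate_one(text):
        async with semaphore:
            return await translator.async_translate_text(text=text, to_language=to_language)

    # gather returns results in the same order as the input texts
    return await asyncio.gather(*(translate_one(text) for text in texts))


results = asyncio.run(
    translate_many(StubTranslator(), ["Cześć", "Dzień dobry", "Dobranoc"], "en")
)
print(results)  # ['[en] Cześć', '[en] Dzień dobry', '[en] Dobranoc']
```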

@@ -1,6 +1,6 @@
[tool.poetry]
name = "llmtranslate"
-version = "0.4.4"
+version = "0.4.5"
description = "A Python library for language detection and translation using OpenAI's GPT-4o."
authors = ["Adam Pawelek <[email protected]>"]
readme = "README.md"