Update docs for new release #97

Merged: 3 commits, Nov 27, 2023
Changes from 2 commits
13 changes: 13 additions & 0 deletions docs/embeddings_support.md
@@ -0,0 +1,13 @@
# Support for embeddings for RAG (Retrieval Augmented Generation)

Support for getting embeddings for RAG use-cases has been implemented. The OpenAI ada-002 model is currently supported for generating embeddings from input data. For embedding outputs, the return type hint needs to be set to `Embedding[np.ndarray]`. Adding align statements to steer embedding model behaviour is not yet implemented, but is on the roadmap.


## Example
```python
import numpy as np

import tanuki
from tanuki.models.embedding import Embedding  # assumed import path; may vary by version

@tanuki.patch
def score_sentiment(input: str) -> Embedding[np.ndarray]:
    """
    Scores the input between 0-10
    """
```
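
The embedding output can then be consumed like a NumPy vector in downstream retrieval code. Below is a minimal usage sketch, assuming the patched call returns a NumPy-compatible vector at runtime; the sentences and variable names are illustrative only:

```python
import numpy as np

# Hypothetical downstream use: compare two embedded inputs by cosine similarity.
emb_a = np.asarray(score_sentiment("The service was excellent"))
emb_b = np.asarray(score_sentiment("Fantastic support experience"))

# Cosine similarity of the two embedding vectors (higher = more similar).
similarity = float(np.dot(emb_a, emb_b) / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b)))
print(f"cosine similarity: {similarity:.3f}")
```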
41 changes: 41 additions & 0 deletions docs/function_configurability.md
@@ -0,0 +1,41 @@
# Function configurability

The following optional arguments are currently supported for function configurability:
* environment_id (int, default = 0): The environment id. Used for fetching the correct finetuned models.
* ignore_finetune_fetching (boolean, default = False): Whether to ignore fetching finetuned models. If set to True, OpenAI will not be queried for finetuned models during the first call, which reduces initial startup latency.
* ignore_finetuning (boolean, default = False): Whether to ignore finetuning the models altogether. If set to True, the teacher model will always be used. The data is still saved, however, in case it is needed for finetuning in the future.
* ignore_data_storage (boolean, default = False): Whether to ignore storing the data. If set to True, the data will not be stored in the finetune dataset and the align statements will not be saved (align statements are still used for aligning outputs, so model performance is not affected). This improves latency, as communication with data storage is minimised.

**NB** - Configurations can be passed only to the `@tanuki.patch` decorator, using keyword arguments. If you have any additional configurability needs, feel free to open an issue or implement it yourself and open a PR.

## Examples

### Default function
```python
@tanuki.patch
def some_function(input: TypedInput) -> TypedOutput:
    """(Optional) Include the description of how your function will be used."""

@tanuki.align
def test_some_function(example_typed_input: TypedInput,
                       example_typed_output: TypedOutput):

    assert some_function(example_typed_input) == example_typed_output
```
### Function with configurations (fastest inference latency)
```python
@tanuki.patch(environment_id = 1,
              ignore_finetune_fetching = True,
              ignore_finetuning = True,
              ignore_data_storage = True)
def some_function(input: TypedInput) -> TypedOutput:
    """(Optional) Include the description of how your function will be used."""

@tanuki.align
def test_some_function(example_typed_input: TypedInput,
                       example_typed_output: TypedOutput):

    assert some_function(example_typed_input) == example_typed_output
```
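
For a more concrete picture, here is a hypothetical instance of the configured template above; the function name, `Literal` return type, and align asserts are illustrative assumptions rather than library-provided examples:

```python
from typing import Literal

import tanuki

# Latency-optimised configuration: skip finetune fetching, finetuning,
# and data-storage communications, so only the teacher model is queried.
@tanuki.patch(environment_id = 1,
              ignore_finetune_fetching = True,
              ignore_finetuning = True,
              ignore_data_storage = True)
def classify_sentiment(review: str) -> Literal["Positive", "Negative", "Neutral"]:
    """Classifies the sentiment of a product review."""

@tanuki.align
def test_classify_sentiment():
    assert classify_sentiment("I love this product!") == "Positive"
    assert classify_sentiment("It broke after one day.") == "Negative"
```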
10 changes: 10 additions & 0 deletions readme.md
@@ -2,6 +2,11 @@

The easiest way to build scalable, LLM-powered functions and applications that get cheaper and faster the more you use them.

## Release
[27/11] Support for [embeddings](https://github.com/monkeypatch/tanuki.py/blob/update_docs/docs/embeddings_support.md) and [function configurability](https://github.com/monkeypatch/tanuki.py/blob/update_docs/docs/function_configurability.md) has been released!
* Use embeddings to integrate Tanuki with downstream RAG implementations using the OpenAI ada-002 model.
* Function configurability allows you to configure Tanuki function executions to ignore certain implemented aspects (finetuning, data-storage communications) for improved latency and serverless integrations.

## Contents

<!-- TOC start (generated with https://github.com/derlin/bitdowntoc) -->
@@ -43,6 +48,7 @@ def test_some_function(example_typed_input: TypedInput,

- **Easy and seamless integration** - Add LLM augmented functions to any workflow within seconds. Decorate a function stub with `@tanuki.patch` and optionally add type hints and docstrings to guide the execution. That’s it.
- **Type aware** - Ensure that the outputs of the LLM adhere to the type constraints of the function (Python Base types, Pydantic classes, Literals, Generics etc) to guard against bugs or unexpected side-effects of using LLMs.
- **RAG support** - Seamlessly get embedding outputs for downstream RAG (Retrieval Augmented Generation) implementations. Output embeddings can then be easily stored and used for relevant document retrieval to reduce cost & latency and improve performance on long-form content.
- **Aligned outputs** - LLMs are unreliable, which makes them difficult to use in place of classically programmed functions. Using simple assert statements in a function decorated with `@tanuki.align`, you can align the behaviour of your patched function to what you expect.
- **Lower cost and latency** - Achieve up to 90% lower cost and 80% lower latency with increased usage. The package will take care of model training, MLOps and DataOps efforts to improve LLM capabilities through distillation.
- **Batteries included** - No remote dependencies other than OpenAI.
@@ -101,6 +107,9 @@ if __name__ == "__main__":
```

<!-- TOC --><a name="how-it-works"></a>

See [here](https://github.com/monkeypatch/tanuki.py/blob/update_docs/docs/function_configurability.md) for the configuration options available for patched Tanuki functions.

## How It Works

When you call a tanuki-patched function during development, an LLM in an n-shot configuration is invoked to generate the typed response.
@@ -175,6 +184,7 @@ if __name__ == "__main__":

To see more examples using Tanuki for different use cases (including how to integrate with FastAPI), have a look at [examples](https://github.com/monkeypatch/tanuki.py/tree/master/examples).

For embedding outputs for RAG support, see [here](https://github.com/monkeypatch/tanuki.py/blob/update_docs/docs/embeddings_support.md).

<!-- TOC --><a name="test-driven-alignment"></a>
## Test-Driven Alignment
2 changes: 1 addition & 1 deletion src/tanuki/__init__.py
@@ -256,7 +256,7 @@ def patch(patchable_func=None,
patchable_func: The function to be patched, should always be set to None. This is used here to allow for keyword arguments or no arguments to be passed to the decorator
environment_id (int): The environment id. Used for fetching correct finetuned models
ignore_finetune_fetching (bool): Whether to ignore fetching finetuned models.
- If set to False, during the first call openai will not be queried for finetuned models, which reduces initial startup latency
+ If set to True, during the first call openai will not be queried for finetuned models, which reduces initial startup latency
ignore_finetuning (bool): Whether to ignore finetuning the models altogether. If set to True the teacher model will always be used.
The data is still saved, however, in case it is needed for finetuning in the future
ignore_data_storage (bool): Whether to ignore storing the data.