fix: refactor handle_response_model #1032

Merged · 9 commits · Oct 4, 2024
6 changes: 6 additions & 0 deletions .github/workflows/test.yml
@@ -43,6 +43,12 @@ jobs:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}

      - name: Run Gemini Tests
        if: matrix.python-version != '3.9'
        run: poetry run pytest tests/llm/test_gemini
        env:
          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}

      - name: Generate coverage report
        if: matrix.python-version == '3.11'
        run: |
105 changes: 105 additions & 0 deletions docs/blog/posts/introducing-cerebras-support.md
@@ -0,0 +1,105 @@
---
draft: False
date: 2024-10-15
slug: introducing-cerebras-support
categories:
- LLM
- Cerebras
authors:
- ivanleomk
---

# Introducing Cerebras Support

## What's Cerebras?

Cerebras builds custom AI chips purpose-built for large language models, designed to be more efficient and powerful than existing hardware. With Cerebras inference, you can get up to 550 tokens/second over their API.

We're happy to announce that we've added support for Cerebras inference in Instructor using the `from_cerebras` method.

### Basic Usage

To use Cerebras inference, call the `from_cerebras` method to create a new Instructor client, define a Pydantic model to pass into the `response_model` parameter, and get back a validated response exactly as you'd expect.

<!-- more -->

You'll also need to install the Cerebras SDK to use the client. You can install it with the command below.

```bash
pip install "instructor[cerebras_cloud_sdk]"
```

This ensures that you have the necessary dependencies to use the Cerebras SDK with Instructor.

### Getting Started

Before running the following code, make sure your Cerebras API key is set in your shell as the environment variable `CEREBRAS_API_KEY`.
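
For example, in a bash-compatible shell (the key shown below is a placeholder):

```bash
export CEREBRAS_API_KEY="your-api-key-here"
```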

```python
import instructor
from cerebras.cloud.sdk import Cerebras
from pydantic import BaseModel

client = instructor.from_cerebras(Cerebras())


class Person(BaseModel):
    name: str
    age: int


resp = client.chat.completions.create(
    model="llama3.1-70b",
    messages=[
        {
            "role": "user",
            "content": "Extract the name and age of the person in this sentence: John Smith is 29 years old.",
        }
    ],
    response_model=Person,
)

print(resp)
#> Person(name='John Smith', age=29)
```

We support both the `AsyncCerebras` and `Cerebras` clients.
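
Here's a minimal sketch of the async version, assuming `from_cerebras` wraps `AsyncCerebras` the same way it wraps the synchronous client shown above:

```python
import asyncio

import instructor
from cerebras.cloud.sdk import AsyncCerebras
from pydantic import BaseModel

# Assumption: from_cerebras accepts the async client and returns a
# client whose create method is awaitable, mirroring the sync example.
client = instructor.from_cerebras(AsyncCerebras())


class Person(BaseModel):
    name: str
    age: int


async def extract_person() -> Person:
    return await client.chat.completions.create(
        model="llama3.1-70b",
        messages=[
            {
                "role": "user",
                "content": "Extract the name and age of the person in this sentence: John Smith is 29 years old.",
            }
        ],
        response_model=Person,
    )


print(asyncio.run(extract_person()))
#> Person(name='John Smith', age=29)
```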

### Streaming

We also support streaming with the Cerebras client using `CEREBRAS_JSON` mode, so you can take advantage of Cerebras's speed and process the response as it comes in.

```python
import instructor
from cerebras.cloud.sdk import Cerebras
from pydantic import BaseModel
from typing import Iterable

client = instructor.from_cerebras(Cerebras(), mode=instructor.Mode.CEREBRAS_JSON)


class Person(BaseModel):
    name: str
    age: int


resp = client.chat.completions.create(
    model="llama3.1-70b",
    messages=[
        {
            "role": "user",
            "content": "Extract all users from this sentence: Chris is 27 and lives in San Francisco, John is 30 and lives in New York while their college roommate Jessica is 26 and lives in London",
        }
    ],
    response_model=Iterable[Person],
    stream=True,
)

for person in resp:
    print(person)
    #> Person(name='Chris', age=27)
    #> Person(name='John', age=30)
    #> Person(name='Jessica', age=26)
```

We're excited to see what you build with Instructor and Cerebras!
6 changes: 3 additions & 3 deletions docs/concepts/prompt_caching.md
@@ -181,7 +181,7 @@ from instructor import Instructor, Mode, patch
from anthropic import Anthropic
from pydantic import BaseModel

client = Instructor( # (1)!
client = Instructor(  # (1)!
    client=Anthropic(),
    create=patch(
        create=Anthropic().beta.prompt_caching.messages.create,
@@ -196,7 +196,7 @@ class Character(BaseModel):
    description: str


with open("./book.txt", "r") as f:
with open("./book.txt") as f:
    book = f.read()

resp = client.chat.completions.create(
@@ -208,7 +208,7 @@ resp = client.chat.completions.create(
            {
                "type": "text",
                "text": "<book>" + book + "</book>",
                "cache_control": {"type": "ephemeral"}, # (2)!
                "cache_control": {"type": "ephemeral"},  # (2)!
            },
            {
                "type": "text",
44 changes: 27 additions & 17 deletions docs/concepts/templating.md
@@ -27,21 +27,23 @@ from pydantic import BaseModel

client = instructor.from_openai(openai.OpenAI())


class User(BaseModel):
    name: str
    age: int


resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": """Extract the information from the
following text: `{{ data }}`""" # (1)!
        {
            "role": "user",
            "content": """Extract the information from the
following text: `{{ data }}`""",  # (1)!
        },
    ],
    response_model=User,
    context = { # (2)!
        "data": "John Doe is thirty years old"
    }
    context={"data": "John Doe is thirty years old"},  # (2)!
)

print(resp)
@@ -63,6 +65,7 @@ import re

client = instructor.from_openai(openai.OpenAI())


class Response(BaseModel):
    text: str

@@ -76,12 +79,13 @@ class Response(BaseModel):
        v = re.sub(pattern, '****', v)
        return v


response = client.create(
    model="gpt-4o",
    response_model=Response,
    messages=[
        {
            "role": "user",
            "role": "user",
            "content": """
Write about a {{ topic }}

@@ -94,21 +98,21 @@ response = client.create(
{% endfor %}
</banned_words>
{% endif %}
"""
""",
},
],
    context={
        "topic": "jason and now his phone number is 123-456-7890",
        "redact_patterns": [
            r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b", # Phone number pattern
            r"\b\d{3}-\d{2}-\d{4}\b", # SSN pattern
            r"\b\d{3}-\d{2}-\d{4}\b",  # SSN pattern
        ],
    },
    max_retries=3,
)

print(response.text)
# > While I can't say his name anymore, his phone number is ****
#> While I can't say his name anymore, his phone number is ****
```

1. Access the variables passed into the `context` variable inside your Pydantic validator
@@ -128,13 +132,16 @@ from pydantic import BaseModel

client = instructor.from_openai(openai.OpenAI())


class Citation(BaseModel):
    source_ids: list[int]
    text: str


class Response(BaseModel):
    answer: list[Citation]


resp = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
@@ -165,19 +172,19 @@ resp = client.chat.completions.create(
* {{ rule }}
{% endfor %}
{% endif %}
"""
""",
},
],
response_model=Response,
context = {
context={
"role": "professional educator",
"question": "What is the capital of France?",
"context": [
{"id": 1, "text": "Paris is the capital of France."},
{"id": 2, "text": "France is a country in Europe."}
{"id": 2, "text": "France is a country in Europe."},
],
"rules": ["Use markdown."]
}
"rules": ["Use markdown."],
},
)

print(resp)
@@ -193,16 +200,19 @@ from pydantic import BaseModel, SecretStr
import instructor
import openai


class UserContext(BaseModel):
    name: str
    address: SecretStr


class Address(BaseModel):
    street: SecretStr
    city: str
    state: str
    zipcode: str


client = instructor.from_openai(openai.OpenAI())
context = UserContext(name="scolvin", address="secret address")

@@ -211,16 +221,16 @@ address = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "{{ user.name }} is `{{ user.address.get_secret_value() }}`, normalize it to an address object"
            "content": "{{ user.name }} is `{{ user.address.get_secret_value() }}`, normalize it to an address object",
        },
    ],
    context={"user": context},
    response_model=Address,
)
print(context)
# > UserContext(name='scolvin', address='******')
#> UserContext(name='scolvin', address='******')
print(address)
# > Address(street='******', city="Toronto", state="Ontario", zipcode="M5A 0J3")
#> Address(street='******', city="Toronto", state="Ontario", zipcode="M5A 0J3")
```

This allows you to preserve your sensitive information while still using it in your prompts.
6 changes: 3 additions & 3 deletions docs/examples/batch_job_oai.md
@@ -60,9 +60,9 @@ The Reserve Bank of Australia (RBA) came into being on 14 January 1960 as Austra
print(generate_question(text_chunk).model_dump_json(indent=2))
"""
{
  "chain_of_thought": "The text mentions that the Reserve Bank of Australia (RBA) came into being on 14 January 1960 as Australia’s central bank and banknote issuing authority.",
  "question": "When was the Reserve Bank of Australia (RBA) established?",
  "answer": "14 January 1960"
  "chain_of_thought": "The text provides information about the Reserve Bank of Australia's establishment, its core functions, its net worth, and the location of its employee base. The net worth is provided as A$101 billion.",
  "question": "What is the estimated net worth of the Reserve Bank of Australia?",
  "answer": "A$101 billion."
}
"""
```
1 change: 1 addition & 0 deletions docs/examples/bulk_classification.md
@@ -56,6 +56,7 @@ This is very helpful because once we use something like FastAPI to create endpoi
from typing import List
from pydantic import BaseModel, ValidationInfo, model_validator


class Tag(BaseModel):
    id: int
    name: str
3 changes: 2 additions & 1 deletion docs/examples/classification.md
@@ -105,10 +105,11 @@ For multi-label classification, we'll update our approach to use Literals instea
from typing import List
from pydantic import BaseModel, Field


class MultiClassPrediction(BaseModel):
"""
Class for a multi-class label prediction.

Examples:
- "My account is locked": ["TECH_ISSUE"]
- "I can't access my billing info": ["TECH_ISSUE", "BILLING"]
22 changes: 13 additions & 9 deletions docs/examples/document_segmentation.md
@@ -13,7 +13,8 @@ Note that in order to avoid LLM regenerating the content of each section, we can

```python
from pydantic import BaseModel, Field
from typing import List, Dict, Any
from typing import List


class Section(BaseModel):
    title: str = Field(description="main topic of this section of the document")
@@ -23,6 +24,7 @@ class Section(BaseModel):

class StructuredDocument(BaseModel):
"""obtains meaningful sections, each centered around a single concept/topic"""

sections: List[Section] = Field(description="a list of sections of the document")
```

@@ -87,13 +89,15 @@ def get_sections_text(structured_doc, line2text):
    for s in structured_doc.sections:
        contents = []
        for line_id in range(s.start_index, s.end_index):
            contents.append(line2text.get(line_id, ''))
        segments.append({
            "title": s.title,
            "content": "\n".join(contents),
            "start": s.start_index,
            "end": s.end_index
        })
            contents.append(line2text.get(line_id, ''))
        segments.append(
            {
                "title": s.title,
                "content": "\n".join(contents),
                "start": s.start_index,
                "end": s.end_index,
            }
        )
    return segments
```

@@ -106,7 +110,7 @@ Here's an example of using these classes and functions to segment a tutorial on
from trafilatura import fetch_url, extract


url='https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html'
url = 'https://sebastianraschka.com/blog/2023/self-attention-from-scratch.html'
downloaded = fetch_url(url)
document = extract(downloaded)
