community[major]: Added integration with new Gemini API (#3621)
* Added integration with new Gemini API

* added to requiresOptionalDependency

* reverted old models

* fixed linting

* chat

* Cleanup

* Run format

* Update deps

* Move to chat model, add tests

* Add docs markdown skeleton

* Move deprecation notices around

* Docs update

* Fix dependency issue

* Fix docs path

* moved conversion function

* cleanup

* Update lockfile

* minor cleanup

* docs indent

* More updates to docs

* removed enum imports

* fixed enum imports/exports

* Docs

* import order

* Fix lint

---------

Co-authored-by: jacoblee93 <[email protected]>
alx13 and jacoblee93 authored Dec 13, 2023
1 parent f10ce5e commit 12ec4ca
Showing 36 changed files with 2,100 additions and 57 deletions.
44 changes: 44 additions & 0 deletions docs/core_docs/docs/integrations/chat/google_generativeai.mdx
@@ -0,0 +1,44 @@
---
sidebar_label: Google GenerativeAI
---

import CodeBlock from "@theme/CodeBlock";

# ChatGoogleGenerativeAI

You can access Google's `gemini` and `gemini-vision` models, as well as other
generative models, in LangChain through the `ChatGoogleGenerativeAI` class in the
`@langchain/google-genai` integration package.

Get an API key here: https://ai.google.dev/tutorials/setup

You'll first need to install the `@langchain/google-genai` package:

```bash npm2yarn
npm install @langchain/google-genai
```
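
If you'd rather not rely on the `GOOGLE_API_KEY` environment variable, the key can also be passed in directly. A minimal sketch, assuming the constructor accepts an `apiKey` field:

```typescript
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

const model = new ChatGoogleGenerativeAI({
  // Assumption: an explicit `apiKey` can stand in for the
  // GOOGLE_API_KEY environment variable.
  apiKey: "your-api-key",
  modelName: "gemini-pro",
});
```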

## Usage

import GoogleGenerativeAI from "@examples/models/chat/googlegenerativeai.ts";

<CodeBlock language="typescript">{GoogleGenerativeAI}</CodeBlock>

## Multimodal support

To provide an image, pass a human message with a `content` field set to an array of content objects. Each content object
contains either an image value (type `image_url`) or a text value (type `text`). The value of `image_url` must be a base64-encoded
image (e.g., `data:image/png;base64,abcd124`):

import GoogleGenerativeAIMultimodal from "@examples/models/chat/googlegenerativeai_multimodal.ts";

<CodeBlock language="typescript">{GoogleGenerativeAIMultimodal}</CodeBlock>

## Gemini Prompting FAQs

As of the time this doc was written (2023/12/12), Gemini has some restrictions on the types and structure of prompts it accepts. Specifically:

1. When providing multimodal (image) inputs, you are restricted to at most one message of "human" (user) type. You cannot pass multiple messages, though the single human message may have multiple content entries.
2. System messages are not natively supported and will be merged with the first human message if present.
3. For regular chat conversations, messages must follow the human/ai/human/ai alternating pattern. You may not provide two AI or human messages in sequence (see the sketch after this list).
4. Messages may be blocked if they violate the safety checks of the LLM. In this case, the model will return an empty response.
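
As a minimal sketch of a conforming conversation (the model name and prompt are illustrative, not from the source):

```typescript
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

const model = new ChatGoogleGenerativeAI({ modelName: "gemini-pro" });

// The system message is merged into the first human message (restriction 2),
// and the turns alternate human/ai/human (restriction 3).
const res = await model.invoke([
  ["system", "You are a terse assistant."],
  ["human", "Suggest a sock color."],
  ["ai", "Teal."],
  ["human", "Why teal?"],
]);

console.log(res.content);
```
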
5 changes: 5 additions & 0 deletions docs/core_docs/docs/integrations/chat/google_palm.mdx
@@ -1,11 +1,16 @@
---
sidebar_label: Google PaLM
sidebar_class_name: hidden
---

import CodeBlock from "@theme/CodeBlock";

# ChatGooglePaLM

:::note
This integration is largely superseded by the newer [Google GenerativeAI Gemini](/docs/integrations/chat/google_generativeai) chat models.
:::

The [Google PaLM API](https://developers.generativeai.google/products/palm) can be integrated by first
installing the required packages:

8 changes: 8 additions & 0 deletions docs/core_docs/docs/integrations/llms/google_palm.mdx
@@ -1,7 +1,15 @@
---
sidebar_class_name: hidden
---

import CodeBlock from "@theme/CodeBlock";

# Google PaLM

:::note
This integration is largely superseded by the newer [Google GenerativeAI Gemini](/docs/integrations/chat/google_generativeai) models.
:::

The [Google PaLM API](https://developers.generativeai.google/products/palm) can be integrated by first
installing the required packages:

77 changes: 67 additions & 10 deletions docs/core_docs/docs/integrations/platforms/google.mdx
@@ -1,26 +1,65 @@
# Google

Functionality related to [Google Cloud Platform](https://cloud.google.com/)

## Chat models

### ChatGoogleGenerativeAI

Access Gemini models such as `gemini-pro` and `gemini-pro-vision` through the `ChatGoogleGenerativeAI` class.

```bash npm2yarn
npm install @langchain/google-genai
```

Configure your API key.

```bash
export GOOGLE_API_KEY=your-api-key
```

```typescript
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";

const model = new ChatGoogleGenerativeAI({
  modelName: "gemini-pro",
  maxOutputTokens: 2048,
});

// Batch and stream are also supported
const res = await model.invoke([
  [
    "human",
    "What would be a good company name for a company that makes colorful socks?",
  ],
]);
```
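
Since batch and stream are noted as supported, here is a minimal streaming sketch, reusing the `model` instance from the block above (the prompt is illustrative):

```typescript
const stream = await model.stream([
  ["human", "Write a haiku about colorful socks."],
]);

for await (const chunk of stream) {
  console.log(chunk.content);
}
```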

Gemini vision models support image inputs when providing a single human message. For example:

```typescript
import fs from "fs";
import { HumanMessage } from "@langchain/core/messages";

const visionModel = new ChatGoogleGenerativeAI({
  modelName: "gemini-pro-vision",
  maxOutputTokens: 2048,
});

const image = fs.readFileSync("./hotdog.jpg").toString("base64");
const input2 = [
  new HumanMessage({
    content: [
      {
        type: "text",
        text: "Describe the following image.",
      },
      {
        type: "image_url",
        image_url: `data:image/png;base64,${image}`,
      },
    ],
  }),
];

const res = await visionModel.invoke(input2);
```

The value of `image_url` must be a base64-encoded image (e.g., `data:image/png;base64,abcd124`).

### Vertex AI

@@ -30,6 +69,24 @@

Access PaLM chat models like `chat-bison` and `codechat-bison` via Google Cloud.

```typescript
import { ChatGoogleVertexAI } from "langchain/chat_models/googlevertexai";
```

## LLMs

### Vertex AI

Access PaLM LLMs like `text-bison` and `code-bison` via Google Cloud.

```typescript
import { GoogleVertexAI } from "langchain/llms/googlevertexai";
```

### Model Garden

Access PaLM and hundreds of OSS models via Vertex AI Model Garden.

```typescript
import { GoogleVertexAI } from "langchain/llms/googlevertexai";
```

## Vector Store

### Vertex AI Vector Search
20 changes: 20 additions & 0 deletions docs/core_docs/docs/integrations/text_embedding/google_generativeai.mdx
@@ -0,0 +1,20 @@
import CodeBlock from "@theme/CodeBlock";

# Google Generative AI

You can access Google's generative AI embedding models through the
`@langchain/google-genai` integration package.

Get an API key here: https://ai.google.dev/tutorials/setup

You'll need to install the `@langchain/google-genai` package:

```bash npm2yarn
npm install @langchain/google-genai
```

## Usage

import GoogleGenerativeAIExample from "@examples/models/embeddings/googlegenerativeai.ts";

<CodeBlock language="typescript">{GoogleGenerativeAIExample}</CodeBlock>
8 changes: 8 additions & 0 deletions docs/core_docs/docs/integrations/text_embedding/google_palm.mdx
@@ -1,7 +1,15 @@
---
sidebar_class_name: hidden
---

import CodeBlock from "@theme/CodeBlock";

# Google PaLM

:::note
This integration is largely superseded by the newer [Google GenerativeAI Gemini](/docs/integrations/text_embedding/google_generativeai) embeddings.
:::

The [Google PaLM API](https://developers.generativeai.google/products/palm) can be integrated by first
installing the required packages:

2 changes: 2 additions & 0 deletions examples/package.json
@@ -27,7 +27,9 @@
"@getmetal/metal-sdk": "^4.0.0",
"@getzep/zep-js": "^0.9.0",
"@gomomento/sdk": "^1.51.1",
"@google/generative-ai": "^0.1.0",
"@langchain/community": "workspace:*",
"@langchain/google-genai": "workspace:*",
"@opensearch-project/opensearch": "^2.2.0",
"@pinecone-database/pinecone": "^1.1.0",
"@planetscale/database": "^1.8.0",
60 changes: 60 additions & 0 deletions examples/src/models/chat/googlegenerativeai.ts
@@ -0,0 +1,60 @@
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HarmBlockThreshold, HarmCategory } from "@google/generative-ai";

/*
* Before running this, you should make sure you have created a
* Google Cloud Project that has `generativelanguage` API enabled.
*
* You will also need to generate an API key and set the
* GOOGLE_API_KEY environment variable.
*
*/

// Text
const model = new ChatGoogleGenerativeAI({
  modelName: "gemini-pro",
  maxOutputTokens: 2048,
  safetySettings: [
    {
      category: HarmCategory.HARM_CATEGORY_HARASSMENT,
      threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
    },
  ],
});

// Batch and stream are also supported
const res = await model.invoke([
  [
    "human",
    "What would be a good company name for a company that makes colorful socks?",
  ],
]);

console.log(res);

/*
AIMessage {
content: '1. Rainbow Soles\n' +
'2. Toe-tally Colorful\n' +
'3. Bright Sock Creations\n' +
'4. Hue Knew Socks\n' +
'5. The Happy Sock Factory\n' +
'6. Color Pop Hosiery\n' +
'7. Sock It to Me!\n' +
'8. Mismatched Masterpieces\n' +
'9. Threads of Joy\n' +
'10. Funky Feet Emporium\n' +
'11. Colorful Threads\n' +
'12. Sole Mates\n' +
'13. Colorful Soles\n' +
'14. Sock Appeal\n' +
'15. Happy Feet Unlimited\n' +
'16. The Sock Stop\n' +
'17. The Sock Drawer\n' +
'18. Sole-diers\n' +
'19. Footloose Footwear\n' +
'20. Step into Color',
name: 'model',
additional_kwargs: {}
}
*/
56 changes: 56 additions & 0 deletions examples/src/models/chat/googlegenerativeai_multimodal.ts
@@ -0,0 +1,56 @@
import fs from "fs";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { HumanMessage } from "@langchain/core/messages";

// Multi-modal
const vision = new ChatGoogleGenerativeAI({
  modelName: "gemini-pro-vision",
  maxOutputTokens: 2048,
});
const image = fs.readFileSync("./hotdog.jpg").toString("base64");
const input2 = [
  new HumanMessage({
    content: [
      {
        type: "text",
        text: "Describe the following image.",
      },
      {
        type: "image_url",
        image_url: `data:image/png;base64,${image}`,
      },
    ],
  }),
];

const res2 = await vision.invoke(input2);

console.log(res2);

/*
AIMessage {
content: ' The image shows a hot dog in a bun. The hot dog is grilled and has a dark brown color. The bun is toasted and has a light brown color. The hot dog is in the center of the bun.',
name: 'model',
additional_kwargs: {}
}
*/

// Multi-modal streaming
const res3 = await vision.stream(input2);

for await (const chunk of res3) {
  console.log(chunk);
}

/*
AIMessageChunk {
content: ' The image shows a hot dog in a bun. The hot dog is grilled and has grill marks on it. The bun is toasted and has a light golden',
name: 'model',
additional_kwargs: {}
}
AIMessageChunk {
content: ' brown color. The hot dog is in the center of the bun.',
name: 'model',
additional_kwargs: {}
}
*/
47 changes: 47 additions & 0 deletions examples/src/models/embeddings/googlegenerativeai.ts
@@ -0,0 +1,47 @@
import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";
import { TaskType } from "@google/generative-ai";

/*
* Before running this, you should make sure you have created a
* Google Cloud Project that has `generativelanguage` API enabled.
*
* You will also need to generate an API key and set the
* GOOGLE_API_KEY environment variable.
*
*/

const embeddings = new GoogleGenerativeAIEmbeddings({
  modelName: "embedding-001", // 768 dimensions
  taskType: TaskType.RETRIEVAL_DOCUMENT,
  title: "Document title",
});

const res = await embeddings.embedQuery("OK Google");

console.log(res, res.length);

/*
[
0.010467986, -0.052334797, -0.05164676, -0.0092885755, 0.037551474,
0.007278041, -0.0014511136, -0.0002727135, -0.01205141, -0.028824795,
0.022447161, 0.032513272, -0.0075029004, 0.013371749, 0.03725578,
-0.0179886, -0.032127254, -0.019804858, -0.035530213, -0.057539217,
0.030938378, 0.022367297, -0.024294581, 0.011045744, 0.0026335048,
-0.018090524, 0.0066266404, -0.05072178, -0.025432976, 0.04673682,
-0.044976745, 0.009511519, -0.030653704, 0.0066106077, -0.03870159,
-0.04239313, 0.016969211, -0.015911, 0.020452755, 0.033449557,
-0.002724189, -0.049285132, -0.016055783, -0.0016474632, 0.013622627,
-0.012853559, -0.00383113, 0.0047683385, 0.029007262, -0.082496256,
0.055966448, 0.011457588, 0.04426033, -0.043971397, 0.029413547,
0.012740723, 0.03243298, -0.005483601, -0.01973574, -0.027495336,
0.0031939305, 0.02392931, -0.011409592, 0.053490978, -0.03130516,
-0.037364446, -0.028803863, 0.019082755, -0.00075289875, 0.015987953,
0.005136402, -0.045040093, 0.051010687, -0.06252348, -0.09334517,
-0.11461444, -0.007226655, 0.034570504, 0.017628446, 0.02613834,
-0.0043784343, -0.022333296, -0.053109482, -0.018441308, -0.10350664,
0.048912525, -0.042917475, -0.0014399975, 0.023028672, 0.00041137074,
0.019345555, -0.023254089, 0.060004912, -0.07684076, -0.04034909,
0.05221485, -0.015773885, -0.029030964, 0.02586164, -0.0401004,
... 668 more items
]
*/
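
Beyond `embedQuery`, the standard LangChain `Embeddings` interface also provides `embedDocuments` for embedding a batch of texts. A minimal sketch reusing the `embeddings` instance above (input strings are illustrative):

```typescript
// Assumption: embedDocuments returns one vector per input string.
const vectors = await embeddings.embedDocuments([
  "LangChain supports Gemini models.",
  "Embeddings map text to vectors.",
]);

console.log(vectors.length); // 2
console.log(vectors[0].length); // 768 dimensions for embedding-001
```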

2 comments on commit 12ec4ca

@vercel vercel bot commented on 12ec4ca Dec 14, 2023


@vercel vercel bot commented on 12ec4ca Dec 14, 2023


Successfully deployed to the following URLs:

langchainjs-docs – ./docs/core_docs/

langchainjs-docs-git-main-langchain.vercel.app
langchainjs-docs-langchain.vercel.app
langchainjs-docs-ruddy.vercel.app
js.langchain.com
