Merge pull request #8 from Labelbox/gu_prompt_creation

Created prompt and response projects guide
Labelbox · Jul 29, 2024 · 8800f8f · 8800f8f
2 parents 3066818 + 3ee43e2
commit 8800f8f
Showing 1 changed file with 338 additions and 0 deletions.
diff --git a/project_configuration/prompt_response_projects.ipynb b/project_configuration/prompt_response_projects.ipynb
@@ -0,0 +1,338 @@
+{
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "<td>\n",
+        "   <a target=\"_blank\" href=\"https://labelbox.com\" ><img src=\"https://labelbox.com/blog/content/images/2021/02/logo-v4.svg\" width=256/></a>\n",
+        "</td>\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "<td>\n",
+        "<a href=\"https://colab.research.google.com/github/Labelbox/labelbox-notebooks/blob/main/project_configuration/prompt_response_projects.ipynb\" target=\"_blank\"><img\n",
+        "src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"></a>\n",
+        "</td>\n",
+        "\n",
+        "<td>\n",
+        "<a href=\"https://github.com/Labelbox/labelbox-notebooks/tree/main/project_configuration/prompt_response_projects.ipynb\" target=\"_blank\"><img\n",
+        "src=\"https://img.shields.io/badge/GitHub-100000?logo=github&logoColor=white\" alt=\"GitHub\"></a>\n",
+        "</td>"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "# Prompt and response creation projects\n",
+        "\n",
+        "This notebook will provide an example workflow of setting up a prompt and response type project with the Labelbox-Python SDK."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Set up"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "%pip install -q --upgrade \"labelbox[data]\""
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "import labelbox as lb"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## API key and client\n",
+        "Replace the value of `API_KEY` with a valid [API key]([ref:create-api-key](https://docs.labelbox.com/reference/create-api-key)) to connect to the Labelbox client."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "API_KEY = None\n",
+        "client = lb.Client(api_key=API_KEY)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Example: Create prompt and response projects and ontologies\n",
+        "\n",
+        "The steps of creating prompt and response projects and corresponding ontologies using the Labelbox-Python SDK are similar to creating a regular project, and we will describe the slight differences for each scenario."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Create a prompt and response ontology\n",
+        "\n",
+        "You can create ontologies for prompt and response projects in the same way as other project ontologies using two methods: `client.create_ontology` and `client.create_ontology_from_feature_schemas`. You also need to include the respective `media_type`: `lb.MediaType.LLMPromptCreation` and `lb.MediaType.LLMPromptResponseCreation`. Additionally, you need to provide an `ontology_kind` parameter set to `lb.OntologyKind.ResponseCreation` that is only applicable for prompt and prompt response creation projects."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "#### Option A: `client.create_ontology`\n",
+        "\n",
+        "Typically, you create ontologies and generate the associated features simultaneously. Below is an example of creating an ontology for your prompt and response projects using supported classifications; for information on supported annotation types, see [prompt and response generation](https://docs.labelbox.com/docs/prompt-and-response-generation-editor).\n",
+        "\n",
+        "Depending if you were creating a prompt, response, or prompt and response creation projects, you don't need certain classifications inside your ontologies. For information on supported annotation types, see [prompt and response generation](doc:prompt-and-response-generation-editor#supported-prompt-formats). In this notebook, we will create a prompt and response creation ontology. "
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "##### Prompt and response creation ontology"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "ontology_builder = lb.OntologyBuilder(\n",
+        "    tools=[],\n",
+        "    classifications=[\n",
+        "        lb.PromptResponseClassification(\n",
+        "            class_type=lb.PromptResponseClassification.Type.PROMPT,\n",
+        "            name=\"prompt text\",\n",
+        "            character_min = 1, # Minimum character count of prompt field (optional)\n",
+        "            character_max = 20, # Maximum character count of prompt field (optional)\n",
+        "        ),\n",
+        "        lb.PromptResponseClassification(\n",
+        "            class_type=lb.PromptResponseClassification.Type.RESPONSE_CHECKLIST,\n",
+        "            name=\"response checklist feature\",\n",
+        "            options=[\n",
+        "                lb.Option(value=\"option 1\", label=\"option 1\"),\n",
+        "                lb.Option(value=\"option 2\", label=\"option 2\"),\n",
+        "            ],\n",
+        "        ),\n",
+        "        lb.PromptResponseClassification(\n",
+        "            class_type=lb.PromptResponseClassification.Type.RESPONSE_RADIO,\n",
+        "            name=\"response radio feature\",\n",
+        "            options=[\n",
+        "                lb.Option(value=\"first_radio_answer\"),\n",
+        "                lb.Option(value=\"second_radio_answer\"),\n",
+        "            ],\n",
+        "        ),\n",
+        "        lb.PromptResponseClassification(\n",
+        "            class_type=lb.PromptResponseClassification.Type.RESPONSE_TEXT,\n",
+        "            name=\"response text\",\n",
+        "            character_min = 1, # Minimum character count of response text field (optional)\n",
+        "            character_max = 20, # Maximum character count of response text field (optional)\n",
+        "        )\n",
+        "    ],\n",
+        ")\n",
+        "\n",
+        "# Create ontology\n",
+        "ontology = client.create_ontology(\n",
+        "    \"Prompt and response ontology\",\n",
+        "    ontology_builder.asdict(),\n",
+        "    media_type=lb.MediaType.LLMPromptResponseCreation,\n",
+        ")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Option B: `client.create_ontology_from_feature_schemas`\n",
+        "You can also create ontologies using feature schema IDs, which make your ontologies with existing features instead of generating new features. You can get these features by going to the _Schema_ tab inside Labelbox."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "# Uncomment the follwing code block for this option\n",
+        "# ontology = client.create_ontology_from_feature_schemas(\n",
+        "#     \"LMC ontology\",\n",
+        "#     feature_schema_ids=[\"<list of feature schema ids\"],\n",
+        "#     media_type=lb.MediaType.Conversational,\n",
+        "#     ontology_kind=lb.OntologyKind.ModelEvaluation,\n",
+        "# )"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Create response creation projects\n",
+        "\n",
+        "You can create response creation projects using `client.create_response_creation_project`, which uses the same parameters as `client.create_project` but provides better validation to ensure the project is set up correctly. Additionally, you need to import text data rows to be used as prompts."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "response_creation_project = client.create_response_creation_project(\n",
+        "    name=\"Demo response creation\",\n",
+        "    description=\"<project_description>\",  # optional\n",
+        ")"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Create prompt response and prompt creation projects\n",
+        "\n",
+        "When creating a prompt response or prompt creation project using `client.create_prompt_response_generation_project()`, you do not need to create data rows because they are generated automatically. This method takes the same parameters as the traditional client.create_project but with a few specific additional parameters.\n",
+        "\n",
+        "Parameters\n",
+        "The `client.create_prompt_response_generation_project` method requires the following parameters:\n",
+        "\n",
+        "- `create_prompt_response_generation_project` parameters:\n",
+        "\n",
+        "    - `name` (required): The name of your new project.\n",
+        "\n",
+        "    - `description`: An optional description of your project.\n",
+        "\n",
+        "    - `media_type` (required): The type of assets this project accepts. Can be either `lb.MediaType.LLMPromptCreation` or `MediaType.LLMPromptResponseCreation`, depending on the project type you are setting up.\n",
+        "\n",
+        "    - `dataset_name`: The name of the dataset where the generated data rows will be located. Include this parameter only if you want to create a new dataset.\n",
+        "\n",
+        "    - `dataset_id`: An optional dataset ID of an existing Labelbox dataset. Include this parameter if you want to append it to an existing dataset.\n",
+        "\n",
+        "    - `data_row_count`: The number of data row assets that will be generated and used with your project."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "project = client.create_prompt_response_generation_project(\n",
+        "    name=\"Demo prompt response project\",\n",
+        "    media_type=lb.MediaType.LLMPromptResponseCreation,\n",
+        "    dataset_name=\"Demo prompt response dataset\",\n",
+        "    data_row_count=100,\n",
+        ")\n",
+        "\n",
+        "# Setup project with ontology created above\n",
+        "project.connect_ontology(ontology)"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Exporting prompt response, prompt or response create project\n",
+        "Exporting from a prompt and response type project works the same as exporting from other projects. In this example, your export will be empty unless you create labels within the Labelbox platform. See prompt and response export for a sample export."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "# The return type of this method is an `ExportTask`, which is a wrapper of a`Task`\n",
+        "# Most of `Task` features are also present in `ExportTask`.\n",
+        "\n",
+        "export_params = {\n",
+        "    \"attachments\": True,\n",
+        "    \"metadata_fields\": True,\n",
+        "    \"data_row_details\": True,\n",
+        "    \"project_details\": True,\n",
+        "    \"label_details\": True,\n",
+        "    \"performance_details\": True,\n",
+        "    \"interpolated_frames\": True,\n",
+        "}\n",
+        "\n",
+        "# Note: Filters follow AND logic, so typically using one filter is sufficient.\n",
+        "filters = {\n",
+        "    \"last_activity_at\": [\"2000-01-01 00:00:00\", \"2050-01-01 00:00:00\"],\n",
+        "    \"label_created_at\": [\"2000-01-01 00:00:00\", \"2050-01-01 00:00:00\"],\n",
+        "    \"workflow_status\": \"InReview\",\n",
+        "    \"batch_ids\": [\"batch_id_1\", \"batch_id_2\"],\n",
+        "    \"data_row_ids\": [\"data_row_id_1\", \"data_row_id_2\"],\n",
+        "    \"global_keys\": [\"global_key_1\", \"global_key_2\"],\n",
+        "}\n",
+        "\n",
+        "export_task = project.export(params=export_params, filters=filters)\n",
+        "export_task.wait_till_done()\n",
+        "\n",
+        "# Return a JSON output string from the export task results/errors one by one:\n",
+        "def json_stream_handler(output: lb.BufferedJsonConverterOutput):\n",
+        "    print(output.json)\n",
+        "\n",
+        "\n",
+        "if export_task.has_errors():\n",
+        "    export_task.get_buffered_stream(stream_type=lb.StreamType.ERRORS).start(\n",
+        "        stream_handler=lambda error: print(error)\n",
+        "    )\n",
+        "\n",
+        "if export_task.has_result():\n",
+        "    export_json = export_task.get_buffered_stream(\n",
+        "        stream_type=lb.StreamType.RESULT\n",
+        "    ).start(stream_handler=json_stream_handler)\n",
+        "\n",
+        "print(\"file size: \", export_task.get_total_file_size(stream_type=lb.StreamType.RESULT))\n",
+        "print(\"line count: \", export_task.get_total_lines(stream_type=lb.StreamType.RESULT))"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Clean up\n",
+        "\n",
+        "This section serves as an optional clean-up step to delete the Labelbox assets created within this guide. You will need to uncomment the delete methods shown."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {},
+      "outputs": [],
+      "source": [
+        "# project.delete()\n",
+        "# response_creation_project.delete()\n",
+        "# client.delete_unused_ontology(ontology.uid)\n",
+        "# dataset.delete()"
+      ]
+    }
+  ],
+  "metadata": {
+    "language_info": {
+      "name": "python"
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 2
+}