Commit 2e6495d

Merge pull request #24 from determined-ai/mldm_284_mlde_0281
added LLM RAG, fixed deployment images
2 parents 56a75e6 + f8671f7 commit 2e6495d

File tree

181 files changed, +48429 -75 lines changed
  • bring-your-own-model
  • deploy
  • examples
    • brain-mri
    • dog-cat
    • llm-rag
      • pipelines
      • sample-data
        • HPE_press_releases
          • 2022
            • 01
              • hewlett-packard-enterprise-signs-agreement-with-uae-cyber-security-council-to-accelerate-youth-education
              • hewlett-packard-enterprise-unveils-supercomputing-research-that-raises-the-bar-for-achieving-quantum-advantage
              • hpe-greenlake-selected-to-expand-core-cloud-offering-and-enhance-desktop-as-a-service-for-cdw-serviceworks
              • hpe-statement-on-uk-high-court-decision-in-autonomy-proceedings
              • j-j-keller-chooses-hpe-greenlake-to-optimize-operations-and-accelerate-innovation-for-safety-and-compliance
              • leading-japanese-telecommunications-provider-optage-trialling-local-5g-services-powered-by-hpe-5g-core-stack
              • steel-authority-of-indias-central-marketing-organization-selects-hpe-greenlake-to-modernize-critical-sap-environment-and-data-management
              • werder-bremen-partners-with-hewlett-packard-enterprise-to-advance-fan-experience-and-data-intelligence
            • 02
              • hewlett-packard-enterprise-extends-leadership-in-enterprise-connectivity-with-private-5g-offering
              • hewlett-packard-enterprise-to-present-live-audio-webcast-of-first-quarter-2022-earnings-conference-call
            • 03
              • auckland-transport-adopts-hpe-greenlake-for-advanced-analytics-to-promote-public-safety
              • cheops-technology-selects-hpe-greenlake-to-expand-and-enhance-cloud-services-portfolio-for-its-customers
              • hewlett-packard-enterprise-expands-hpe-greenlake-cloud-services-portfolio-for-hybrid-cloud-and-data-with-availability-of-microsoft-azure-stack-hci-integrated-system
              • hewlett-packard-enterprise-reports-fiscal-2022-first-quarter-results
              • hpe-greenlake-edge-to-cloud-platform-delivers-greater-choice-and-simplicity-with-unified-experience-new-cloud-services-and-expanded-partner-ecosystem
              • hpe-greenlake-selected-by-bmw-group-to-unify-data-management-across-global-locations-and-the-cloud
              • hpe-greenlake-selected-by-worldline-to-modernize-mission-critical-payments
              • jcb-chooses-hpe-greenlake-to-enhance-customer-experience-and-modernize-its-next-generation-myjcb-platform
              • kddi-advances-o-ran-compliant-5g-standalone-virtualized-base-stations-in-japan-with-hpe-telco-infrastructure
              • napa-valleys-trinchero-family-estates-supports-online-business-growth-with-hpe-greenlake
            • 04
              • city-university-of-hong-kong-advances-scientific-research-with-up-to-10x-faster-high-performance-computing-cluster-from-hewlett-packard-enterprise
              • hewlett-packard-enterprise-accelerates-ai-journey-from-poc-to-production-with-new-solution-for-ai-development-and-training-at-scale
              • hewlett-packard-enterprise-accelerates-ran-deployments-with-automation-and-simplified-management-for-both-open-and-traditional-networks
              • hewlett-packard-enterprise-drives-innovation-at-the-extreme-edge-on-the-international-space-station-with-24-completed-experiments
              • hewlett-packard-enterprise-ushers-in-next-era-in-ai-innovation-with-swarm-learning-solution-built-for-the-edge-and-distributed-sites
              • hpe-opens-global-center-of-excellence-in-artificial-intelligence-and-data-in-spain-to-help-customers-harness-the-power-of-their-data
              • korea-customs-service-selects-hpe-ezmeral-to-advance-smuggler-crackdown-and-inform-policy-decisions
              • leading-telematics-company-ituran-selects-hpe-cloud-native-storage-to-improve-real-time-data-access-for-clients
            • 05
              • hewlett-packard-enterprise-and-sipearl-partner-to-develop-hpc-solutions-with-european-processors-and-accelerate-europes-adoption-of-exascale-supercomputers
              • hewlett-packard-enterprise-opens-new-head-office-in-canada-to-realize-a-new-workplace-experience-for-team-members
              • hewlett-packard-enterprise-strengthens-europes-supercomputer-supply-chain-with-new-factory-in-czech-republic
              • hewlett-packard-enterprise-to-present-live-audio-webcast-of-fiscal-2022-second-quarter-earnings-conference-call
              • hewlett-packard-enterprise-ushers-in-new-era-with-worlds-first-and-fastest-exascale-supercomputer-frontier-for-the-us-department-of-energys-oak-ridge-national-laboratory
              • hpe-statement-on-uk-high-court-decision-in-autonomy-proceedings-may-2022
              • leibniz-supercomputing-centre-accelerates-ai-innovation-in-bavaria-with-next-generation-ai-system-from-cerebras-systems-and-hewlett-packard-enterprise
              • lupin-goes-live-with-sap-s4-hana-on-hpe-greenlake-to-drive-digital-transformation
            • 06
              • channel-news-2022
              • eclit-chooses-hpe-greenlake-to-launch-a-new-cloud-offering-and-expand-its-managed-services-portfolio
              • hewlett-packard-enterprise-adds-even-more-genius-to-evil-geniuses-industry-leading-data-and-analytics-program
              • hewlett-packard-enterprise-expands-compute-portfolio-with-new-servers-based-on-cloud-native-silicon
              • hewlett-packard-enterprise-releases-2021-living-progress-report-accelerates-net-zero-climate-target-by-10-years
              • hewlett-packard-enterprise-reports-fiscal-2022-second-quarter-results
              • hewlett-packard-enterprise-to-present-live-webcast-of-investor-relations-summit-at-discover
              • hpe-greenlake-adds-open-source-leader-red-hat-to-expanding-ecosystem
              • hpe-greenlake-advances-hybrid-cloud-experience-with-modern-private-cloud-and-new-cloud-services
              • iliane-selects-hpe-greenlake-for-expansion-of-high-performing-cloud-offerings
              • swedish-cloud-service-provider-datarewind-boosts-hybrid-office-productivity-with-hpe-cloud-native-storage-and-enhanced-compute-capacities
              • taeknizon-selects-hpe-greenlake-to-expand-their-fully-managed-cloud-offering-in-the-uae
              • turk-nippon-insurance-embarks-on-technology-modernization-with-hpe-greenlake-to-boost-performance
            • 07
              • catharina-hospital-selects-hpe-ezmeral-to-power-data-first-modernization-and-improve-accuracy-and-speed-of-diagnosis
              • french-cloud-service-provider-antemeta-selects-hpe-greenlake-to-introduce-new-automated-disaster-recovery-service
              • mcmaster-university-cracks-genome-sequencing-to-fight-covid-19-and-other-infectious-diseases-with-hewlett-packard-enterprise
            • 08
              • hewlett-packard-enterprise-reports-fiscal-2022-third-quarter-results
              • hewlett-packard-enterprise-to-present-live-audio-webcast-of-third-quarter-earnings-conference-call
              • polish-manufacturer-stelmet-selects-hpe-cloud-native-storage-to-underpin-its-dynamic-growth
              • steel-authority-of-india-expands-adoption-of-hpe-greenlake-to-increase-productivity-enhance-agility-and-reduce-energy-consumption
            • 09
              • du-selects-hewlett-packard-enterprise-for-digital-transformation-journey-to-5g
              • french-healthcare-software-provider-maincare-selects-hpe-greenlake-to-accelerate-deployment-of-secure-health-cloud-services
              • hewlett-packard-enterprise-names-regina-e-dugan-technology-leader-and-former-darpa-director-to-board-of-directors
              • kaust-selects-hpe-to-build-the-middle-easts-most-powerful-supercomputer
            • 10
              • hewlett-packard-enterprise-demonstrates-strong-momentum-of-edge-to-cloud-strategy-and-presents-fy23-outlook-at-hpe-securities-analyst-meeting-2022
              • hewlett-packard-enterprise-to-webcast-hpe-securities-analyst-meeting-2022
              • hpe-accelerates-ag-digital-transformation-and-business-outcomes-with-cloud-ready-and-data-driven-solution
              • meteorological-service-singapore-advances-weather-forecasting-with-new-supercomputer-from-hewlett-packard-enterprise
              • mohamed-bin-zayed-university-of-artificial-intelligence-advances-the-uaes-national-strategy-for-ai-with-new-supercomputer-built-by-hewlett-packard-enterprise
            • 11
              • hewlett-packard-enterprise-extends-supercomputing-to-the-enterprise-with-new-hpe-cray-portfolio
              • hewlett-packard-enterprise-introduces-next-generation-compute-engineered-for-a-hybrid-world
              • hewlett-packard-enterprise-reports-fiscal-2022-results-with-record-q4-performance
              • hewlett-packard-enterprise-to-present-live-audio-webcast-of-fiscal-2022-fourth-quarter-earnings-conference-call
              • hpe-and-vmware-advance-partnership-to-drive-digital-transformation-with-integrated-hybrid-cloud-experience
              • hpe-asset-upcycling-services-chosen-by-yahoo-japan-for-sustainable-reuse-of-it-assets
            • 12
              • hewlett-packard-enterprise-partners-with-the-2023-ryder-cup-to-deliver-an-intelligent-and-immersive-experience-to-fans-onsite-and-around-the-world
              • hpe-greenlake-adds-application-analytics-and-developer-services-to-modernize-workloads-across-the-hybrid-cloud
              • lack-of-data-capabilities-impedes-organizations-success-global-survey-finds
          • 2023
            • 01
              • brazils-largest-medical-cooperative-from-santa-catarina-selects-hpe-greenlake-to-drive-innovation-transform-patient-outcomes-and-extend-reach-of-healthcare-services
              • city-electrical-factors-selects-hpe-greenlake-to-modernize-its-customer-and-data-experience
              • hewlett-packard-enterprise-acquires-pachyderm-to-expand-ai-at-scale-capabilities-with-reproducible-ai
              • hewlett-packard-enterprise-names-frank-damelio-former-cfo-of-pfizer-inc-to-board-of-directors
              • toppan-forms-chooses-hpe-greenlake-to-respond-quickly-to-demand-as-their-customers-transition-to-a-digital-first-world
            • 02
              • hewlett-packard-enterprise-and-alfanar-announce-intent-to-invest-in-high-tech-production-in-saudi-arabia
              • hewlett-packard-enterprise-and-nokia-to-collaborate-on-cloud-ran-solution-for-csps-and-enterprises
              • hewlett-packard-enterprise-doubles-down-on-private-5g-extends-leadership-in-wireless-connectivity-with-acquisition-of-athonet
              • hewlett-packard-enterprise-to-present-live-audio-webcast-of-fiscal-2023-first-quarter-earnings-conference-call
              • ol-groupe-selects-hpe-greenlake-to-support-its-energy-efficiency-plan-and-improve-the-visitor-and-fan-experience
            • 03
              • hewlett-packard-enterprise-fortifies-network-security-with-acquisition-of-security-service-edge-provider-axis-security
              • hewlett-packard-enterprise-reports-fiscal-2023-first-quarter-results
              • hewlett-packard-enterprise-to-acquire-opsramp-advancing-hybrid-cloud-leadership-and-expanding-hpe-greenlake-into-it-operations-management
              • irelands-largest-managed-cloud-services-provider-eir-evo-selects-hpe-greenlake-to-modernize-its-private-cloud
            • 04
              • hewlett-packard-enterprise-deploys-easy-access-and-affordable-healthcare-services-across-pilgrimage-sites-in-india
              • hpe-aruba-networking-simplifies-it-operations-with-aiops-driven-cloud-management-and-new-network-as-a-service-capabilities-available-on-hpe-greenlake
              • hpe-transforms-data-lifecycle-management-with-expanded-hpe-alletra-portfolio-with-new-file-block-and-data-protection-services
            • 05
              • as-digitalization-demands-increase-it-leaders-miss-vital-connection-between-the-enterprise-network-and-employee-experiences
              • crown-commercial-service-signs-new-memorandum-of-understanding-with-hewlett-packard-enterprise-to-accelerate-the-adoption-of-sustainable-it-in-the-public-sector
              • hewlett-packard-enterprise-to-present-live-audio-webcast-of-fiscal-2023-second-quarter-earnings-conference-call
              • hpe-and-tokyo-tech-collaborate-to-build-the-next-generation-tsubame40-supercomputer-for-artificial-intelligence-scientific-research-and-innovation
              • hpe-ezmeral-software-accelerates-and-simplifies-analytics-and-aiml-initiatives-with-significant-advances-to-the-hybrid-multi-cloud-data-and-analytics-platform
              • hpe-opens-global-center-of-excellence-for-hpe-ezmeral-software-in-greece
              • swedish-service-provider-dsolution-chooses-hpe-greenlake-to-accelerate-cloud-services-delivery-and-meet-increasing-customer-demands
            • 06
              • hewlett-packard-enterprise-helps-organizations-reduce-it-carbon-footprint-with-new-sustainability-dashboard-and-comprehensive-portfolio-of-services
            • 07
            • 08
            • 09
            • 10
            • 11
            • 12
        • src/py
    • object-detection
    • sentiment-analysis


README.md

+1-1
@@ -1,6 +1,6 @@
 # PDK - Pachyderm | Determined | KServe
 ## Deployment and Setup Guide
-**Date/Revision:** January 02, 2024
+**Date/Revision:** February 23, 2024


 ![alt text][big_picture]

bring-your-own-model/PDK_implementation/container/deploy/common.py

+1
@@ -243,6 +243,7 @@ def create_inference_service(
 tolerations=tol,
 pytorch=(
 V1beta1TorchServeSpec(
+args=["--model-store=/mnt/models"],
 protocol_version=version,
 storage_uri=f"s3://{bucket_name}/{model_name}",
 resources=(
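
The added `args` entry passes `--model-store=/mnt/models` to the TorchServe container. As a rough illustration of what the resulting predictor section might look like once serialized, here is a minimal sketch that builds an equivalent structure as a plain dictionary. The helper name `build_pytorch_predictor` and the sample values are hypothetical, and the camelCase field names (`args`, `protocolVersion`, `storageUri`) assume the KServe v1beta1 `InferenceService` manifest layout rather than the SDK objects used in `common.py`.

```python
import json


def build_pytorch_predictor(bucket_name, model_name, protocol_version):
    """Return a dict mirroring the pytorch predictor section (illustrative only)."""
    return {
        "pytorch": {
            # Added in this commit: point TorchServe at the mounted model directory.
            "args": ["--model-store=/mnt/models"],
            "protocolVersion": protocol_version,
            "storageUri": f"s3://{bucket_name}/{model_name}",
        }
    }


if __name__ == "__main__":
    # Sample values are assumptions chosen for the demonstration.
    spec = build_pytorch_predictor("pdk-repo-models", "customer-churn", "v2")
    print(json.dumps(spec, indent=2))
```

Keeping the model store path in `args` rather than only in `config.properties` means the container start-up flag and the generated properties file have to agree on the same directory.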

bring-your-own-model/PDK_implementation/container/deploy/deploy.py

+9-3
@@ -68,7 +68,12 @@ def create_mar_file(model_name, model_version):
 # =====================================================================================


-def create_properties_file(model_name, model_version):
+def create_properties_file(model_name, model_version, cloud_model_host):
+    print(f"--> Cloud Model Host: {cloud_model_host}")
+    model_store = "/mnt/models/model-store"
+    if cloud_model_host == "aws":
+        print("--> Changing Model Store to match AWS")
+        model_store = "/mnt/models"
     config_properties = """inference_address=http://0.0.0.0:8085
 management_address=http://0.0.0.0:8083
 metrics_address=http://0.0.0.0:8082
@@ -81,8 +86,9 @@ def create_properties_file(model_name, model_version):
 NUM_WORKERS=1
 number_of_netty_threads=4
 job_queue_size=10
-model_store=/mnt/models/model-store
+model_store=%s
 model_snapshot={"name":"startup.cfg","modelCount":1,"models":{"%s":{"%s":{"defaultVersion":true,"marName":"%s.mar","minWorkers":1,"maxWorkers":5,"batchSize":1,"maxBatchDelay":5000,"responseTimeout":120}}}}""" % (
+        model_store,
         model_name,
         model_version,
         model_name,
@@ -124,7 +130,7 @@ def main():
     create_mar_file(model.name, model.version)

     # Create config.properties for .mar file, return files to upload to GCS bucket
-    model_files = create_properties_file(model.name, model.version)
+    model_files = create_properties_file(model.name, model.version, args.cloud_model_host)

     # Upload model artifacts to Cloud bucket in the format for TorchServe
     upload_model(
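
Taken together, the three hunks above make the TorchServe `model_store` path depend on the `--cloud-model-host` value. A minimal, self-contained sketch of that selection logic follows. `select_model_store`, `render_model_snapshot`, and `render_config_lines` are hypothetical helper names (the commit keeps everything inline in `create_properties_file`), and building the snapshot with `json.dumps` instead of `%` formatting is a substitution made here for readability, not what the commit does.

```python
import json

DEFAULT_MODEL_STORE = "/mnt/models/model-store"
AWS_MODEL_STORE = "/mnt/models"


def select_model_store(cloud_model_host):
    """Return the TorchServe model store path for the given cloud host."""
    if cloud_model_host == "aws":
        return AWS_MODEL_STORE
    return DEFAULT_MODEL_STORE


def render_model_snapshot(model_name, model_version):
    """Serialize the startup snapshot with the same keys as the diff above."""
    snapshot = {
        "name": "startup.cfg",
        "modelCount": 1,
        "models": {
            model_name: {
                str(model_version): {
                    "defaultVersion": True,
                    "marName": f"{model_name}.mar",
                    "minWorkers": 1,
                    "maxWorkers": 5,
                    "batchSize": 1,
                    "maxBatchDelay": 5000,
                    "responseTimeout": 120,
                }
            }
        },
    }
    # Compact separators keep the value on one line, as a properties file expects.
    return json.dumps(snapshot, separators=(",", ":"))


def render_config_lines(model_name, model_version, cloud_model_host):
    """Return only the two properties lines affected by this change."""
    return [
        f"model_store={select_model_store(cloud_model_host)}",
        f"model_snapshot={render_model_snapshot(model_name, model_version)}",
    ]


if __name__ == "__main__":
    for line in render_config_lines("customer-churn", "1", "aws"):
        print(line)
```

Any host other than `aws` falls through to the default path, which matches the behavior of the `if` statement in the diff.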

bring-your-own-model/PDK_implementation/pipelines/_on_prem_deployment-pipeline.json

+1-1
@@ -18,7 +18,7 @@
 "stdin": [
 "python deploy.py --deployment-name customer-churn --service-account-name pach-deploy --resource-requests cpu=2,memory=4Gi --resource-limits cpu=10,memory=8Gi"
 ],
-"image": "pachyderm/pdk:byom-deploy-v0.0.4",
+"image": "pachyderm/pdk:byom-deploy-v0.0.6",
 "secrets": [
 {
 "name": "pipeline-secret",

bring-your-own-model/PDK_implementation/pipelines/_on_prem_training-pipeline.json

+4
@@ -19,7 +19,11 @@
 "stdin": [
 "python train.py --git-url https://[email protected]:/determined-ai/pdk.git --git-ref main --sub-dir bring-your-own-model/PDK_implementation/experiment --config const.yaml --repo customer-churn-data --model customer-churn --project pdk-customer-churn"
 ],
+<<<<<<< Updated upstream
 "image": "pachyderm/pdk:train-v0.0.5",
+=======
+"image": "pachyderm/pdk:train-v0.0.6",
+>>>>>>> Stashed changes
 "secrets": [
 {
 "name": "pipeline-secret",

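The four added lines in this hunk are unresolved merge-conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`), which leave the pipeline specification with two `image` entries and text that is not valid JSON. A minimal sketch of a pre-commit style check that would flag this is shown below. The function names and the `SAMPLE` text are illustrative assumptions, not part of the repository.

```python
import json

CONFLICT_PREFIXES = ("<<<<<<<", "=======", ">>>>>>>")


def find_conflict_markers(text):
    """Return (line_number, line) pairs for lines that start with a conflict marker."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith(CONFLICT_PREFIXES):
            hits.append((number, line.strip()))
    return hits


def is_valid_json(text):
    """Report whether the text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Hypothetical, shortened pipeline spec that reproduces the problem.
SAMPLE = '''{
  "transform": {
<<<<<<< Updated upstream
    "image": "pachyderm/pdk:train-v0.0.5"
=======
    "image": "pachyderm/pdk:train-v0.0.6"
>>>>>>> Stashed changes
  }
}'''

if __name__ == "__main__":
    for number, line in find_conflict_markers(SAMPLE):
        print(f"line {number}: {line}")
    print("valid JSON:", is_valid_json(SAMPLE))
```

Running a check like this over the `pipelines/` directory before committing would catch both the markers and any other syntax error in the spec files.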
bring-your-own-model/PDK_implementation/pipelines/deployment-pipeline.json

+1-1
@@ -18,7 +18,7 @@
 "stdin": [
 "python deploy.py --deployment-name customer-churn --cloud-model-host gcp --cloud-model-bucket pdk-repo-models --resource-requests cpu=2,memory=4Gi --resource-limits cpu=10,memory=8Gi"
 ],
-"image": "pachyderm/pdk:byom-deploy-v0.0.4",
+"image": "pachyderm/pdk:byom-deploy-v0.0.6",
 "secrets": [
 {
 "name": "pipeline-secret",

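The `stdin` command above passes the cloud host and the resource settings to `deploy.py` as command-line flags, with resources written as comma-separated `key=value` pairs. How `deploy.py` actually parses them is not shown in this diff. The following is a minimal sketch of one way to handle the flags that appear in the command; the `parse_resources` helper, the `build_parser` function, and all defaults are assumptions.

```python
import argparse


def parse_resources(value):
    """Turn 'cpu=2,memory=4Gi' into {'cpu': '2', 'memory': '4Gi'}."""
    resources = {}
    for pair in value.split(","):
        key, sep, val = pair.partition("=")
        if not sep or not key.strip() or not val.strip():
            raise argparse.ArgumentTypeError(f"expected key=value, got {pair!r}")
        resources[key.strip()] = val.strip()
    return resources


def build_parser():
    """Build a parser for the flags seen in the pipeline's stdin command."""
    parser = argparse.ArgumentParser(description="Illustrative deploy.py argument parsing")
    parser.add_argument("--deployment-name", required=True)
    parser.add_argument("--cloud-model-host", default=None)
    parser.add_argument("--cloud-model-bucket", default=None)
    parser.add_argument("--resource-requests", type=parse_resources, default={})
    parser.add_argument("--resource-limits", type=parse_resources, default={})
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.cloud_model_host, args.resource_requests, args.resource_limits)
```

With this shape, the value of `--cloud-model-host` is available as `args.cloud_model_host`, which is the attribute the updated `main()` passes to `create_properties_file`.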
bring-your-own-model/PDK_implementation/pipelines/training-pipeline.json

+4
@@ -19,7 +19,11 @@
 "stdin": [
 "python train.py --git-url https://[email protected]:/determined-ai/pdk.git --git-ref main --sub-dir bring-your-own-model/PDK_implementation/experiment --config const.yaml --repo customer-churn-data --model customer-churn --project pdk-customer-churn"
 ],
+<<<<<<< Updated upstream
 "image": "pachyderm/pdk:train-v0.0.5",
+=======
+"image": "pachyderm/pdk:train-v0.0.6",
+>>>>>>> Stashed changes
 "secrets": [
 {
 "name": "pipeline-secret",

bring-your-own-model/readme.md

+1-1
@@ -4,7 +4,7 @@

 # PDK - Pachyderm | Determined | KServe
 ## Bringing Your Model to PDK
-**Date/Revision:** January 02, 2024
+**Date/Revision:** February 23, 2024

 In this section, we will train and deploy a simple customer churn model on PDK.

deploy/README.md

+1-1
@@ -4,7 +4,7 @@

 # PDK - Pachyderm | Determined | KServe
 ## Deployment and Setup Guide
-**Date/Revision:** January 02, 2024
+**Date/Revision:** February 23, 2024

 This page contains step-by-step guides for installing the infrastructure and all necessary components for the PDK environment, covering different Kubernetes plaforms.

deploy/deploy_aws.md

+17-10
@@ -5,6 +5,7 @@

 # PDK - Pachyderm | Determined | KServe
 ## Deployment Guide for AWS
+<b>Date/Revision:</b> February 23, 2024


 This guide will walk you through the steps of deploying the PDK components to AWS.
@@ -23,8 +24,8 @@ The following software versions will be used for this installation:
 - Python: 3.8 and 3.9
 - Kubernetes (K8s): latest supported *(currently 1.27)*
 - Postgres: 13
-- MLDE (Determined.AI): latest *(currently 0.26.7)*
-- MLDM (Pachyderm): latest *(currently 2.8.2)*
+- MLDE (Determined.AI): latest *(currently 0.28.1)*
+- MLDM (Pachyderm): latest *(currently 2.8.4)*
 - KServe: 0.12.0-rc0 (Quickstart Environment)

 PS: some of the commands used here are sensitive to the version of the product(s) listed above.
@@ -702,7 +703,7 @@ kubectl apply -f - <<EOF
 apiVersion: v1
 kind: PersistentVolume
 metadata:
-  name: efs-pv
+  name: pdk-pv
 spec:
   capacity:
     storage: 200Gi
@@ -718,7 +719,7 @@ spec:
 apiVersion: v1
 kind: PersistentVolumeClaim
 metadata:
-  name: efs-pvc
+  name: pdk-pvc
   namespace: default
 spec:
   accessModes:
@@ -737,7 +738,7 @@ kubectl apply -f - <<EOF
 apiVersion: v1
 kind: PersistentVolume
 metadata:
-  name: efs-pv-gpu
+  name: pdk-pv-gpu
 spec:
   capacity:
     storage: 200Gi
@@ -753,7 +754,7 @@ spec:
 apiVersion: v1
 kind: PersistentVolumeClaim
 metadata:
-  name: efs-pvc
+  name: pdk-pvc
   namespace: gpu-pool
 spec:
   accessModes:
@@ -956,14 +957,20 @@ The next step is to setup the 3 databases that will be used by PDK. Since the AW
 - Use the postgres `psql` command line utility (`psql -h ${RDS_CONNECTION_URL} postgres postgres`)
 - Create a pod with psql and connect to the instance

+You will also need the password, which can be obtained by running this command:
+
+```bash
+echo $RDS_ADMIN_PASSWORD
+```
+
 To create the databases using the psql pod, use these commands:


 ```bash
 kubectl run psql -it --rm=true --image=postgres:13 --command -- psql -h ${RDS_CONNECTION_URL} -U postgres postgres

 # The prompt will freeze as it loads the pod. Wait for the message "If you don't see a command prompt, try pressing enter".
-# Then, type the password and press enter.
+# Then, type (or paste) the password and press enter.

 postgres=> CREATE DATABASE pachyderm;

@@ -1116,7 +1123,7 @@ proxy:

 determined:
   enabled: true
-  detVersion: "0.26.7"
+  detVersion: "0.28.1"
   imageRegistry: determinedai
   enterpriseEdition: false
   imagePullSecretName:
@@ -1175,7 +1182,7 @@ determined:
 volumes:
 - name: shared-fs
 persistentVolumeClaim:
-claimName: efs-pvc
+claimName: pdk-pvc
 - pool_name: gpu-pool
 max_aux_containers_per_agent: 1
 kubernetes_namespace: gpu-pool
@@ -1193,7 +1200,7 @@ determined:
 volumes:
 - name: shared-fs
 persistentVolumeClaim:
-claimName: efs-pvc
+claimName: pdk-pvc
 tolerations:
 - key: "nvidia.com/gpu"
 operator: "Equal"

deploy/deploy_gcp.md

+16-14
@@ -4,15 +4,15 @@

 # PDK - Pachyderm | Determined | KServe
 ## Deployment Guide for Google Cloud
-<b>Date/Revision:</b> January 02, 2024
+<b>Date/Revision:</b> February 23, 2024

 This guide will walk you through the steps of deploying the PDK components to Google Cloud.

 ## Reference Architecture
 The installation will be performed on the following hardware:

 - 3x e2-standard-16 CPU-based nodes (16 vCPUs, 64GB RAM, 1000GB SSD)
-- 2x n1-standard-8 GPU-based nodes (4 NVIDIA-T4, 8 vCPUs, 30GB RAM, 1000GB SSD)
+- 2x n1-standard-8 GPU-based nodes (4 NVIDIA-T4, 16 vCPUs, 64GB RAM, 1000GB SSD)

 The 3 CPU-based nodes will be used to run the services for all 3 products, and the MLDM pipelines. The GPU-based nodes will be used to run MLDE experiments.

@@ -21,8 +21,8 @@ The following software versions will be used for this installation:
 - Python: 3.8 and 3.9
 - Kubernetes (K8s): latest supported *(currently 1.27)*
 - Postgres: 13
-- MLDE (Determined.AI): latest *(currently 0.26.7)*
-- MLDM (Pachyderm): latest *(currently 2.8.2)*
+- MLDE (Determined.AI): latest *(currently 0.28.1)*
+- MLDM (Pachyderm): latest *(currently 2.8.4)*
 - KServe: 0.12.0-rc0 (Quickstart Environment)

 PS: some of the commands used here are sensitive to the version of the product(s) listed above.
@@ -160,7 +160,7 @@ export GCP_ZONE="us-central1-c"
 export K8S_VERSION="1.27.3-gke.100"
 export KSERVE_MODELS_NAMESPACE="models"
 export CLUSTER_MACHINE_TYPE="e2-standard-16"
-export GPU_MACHINE_TYPE="n1-standard-8"
+export GPU_MACHINE_TYPE="n1-standard-16"
 export SQL_CPU="2"
 export SQL_MEM="7680MB"
166166

@@ -320,7 +320,8 @@ gcloud container clusters create ${CLUSTER_NAME} \
 --enable-dataplane-v2 \
 --workload-pool=${PROJECT_ID}.svc.id.goog \
 --workload-metadata="GKE_METADATA" \
---node-locations ${GCP_ZONE}
+--node-locations ${GCP_ZONE} \
+--tags pdk
 ```

 This process will take several minutes. The output message will show the cluster configuration. You can also check the status of the provisioning in the Google Cloud Console.
@@ -357,7 +358,8 @@ gcloud container node-pools create "gpu-pool" \
 --max-surge-upgrade 1 \
 --max-unavailable-upgrade 0 \
 --scopes=storage-full,cloud-platform \
---node-locations ${GCP_ZONE}
+--node-locations ${GCP_ZONE} \
+--tags pdk
 ```

 This can take several minutes to complete. If it takes more than 1 hour, it will timeout the client. If that happens, track the progress of the provisioning process through the Google Cloud web console.
@@ -715,7 +717,7 @@ spec:
 kind: PersistentVolumeClaim
 apiVersion: v1
 metadata:
-  name: nfs
+  name: pdk-pvc
 spec:
   accessModes:
     - ReadWriteMany
@@ -747,7 +749,7 @@ spec:
 kind: PersistentVolumeClaim
 apiVersion: v1
 metadata:
-  name: nfs
+  name: pdk-pvc
 spec:
   accessModes:
     - ReadWriteMany
@@ -856,7 +858,7 @@ proxy:

 determined:
   enabled: true
-  detVersion: "0.26.7"
+  detVersion: "0.28.1"
   imageRegistry: determinedai
   enterpriseEdition: false
   imagePullSecretName:
@@ -894,7 +896,7 @@ determined:
 volumes:
 - name: pdk-pvc-nfs
 persistentVolumeClaim:
-claimName: nfs
+claimName: pdk-pvc
 gpuPodSpec:
 apiVersion: v1
 kind: Pod
@@ -907,7 +909,7 @@ determined:
 volumes:
 - name: pdk-pvc-nfs
 persistentVolumeClaim:
-claimName: nfs
+claimName: pdk-pvc
 metadata:
 labels:
 nodegroup-role: gpu-worker
@@ -930,7 +932,7 @@ determined:
 volumes:
 - name: pdk-pvc-nfs
 persistentVolumeClaim:
-claimName: nfs
+claimName: pdk-pvc
 - pool_name: gpu-pool
 max_aux_containers_per_agent: 1
 kubernetes_namespace: gpu-pool
@@ -947,7 +949,7 @@ determined:
 volumes:
 - name: pdk-pvc-nfs
 persistentVolumeClaim:
-claimName: nfs
+claimName: pdk-pvc
 tolerations:
 - key: "nvidia.com/gpu"
 operator: "Equal"

deploy/deploy_k8s.md

+8-8
@@ -5,7 +5,7 @@

 # PDK - Pachyderm | Determined | KServe
 ## Deployment Guide for Kubernetes
-<b>Date/Revision:</b> January 02, 2024
+<b>Date/Revision:</b> February 23, 2024


 This guide will walk you through the steps of deploying the PDK components to a vanilla Kubernetes environment.
@@ -23,8 +23,8 @@ The following software versions will be used for this installation:
 - Python: 3.8 and 3.9
 - Kubernetes (K8s): latest supported *(currently 1.27)*
 - Postgres: 13
-- MLDE (Determined.AI): latest *(currently 0.26.7)*
-- MLDM (Pachyderm): latest *(currently 2.8.2)*
+- MLDE (Determined.AI): latest *(currently 0.28.1)*
+- MLDM (Pachyderm): latest *(currently 2.8.4)*
 - KServe: 0.12.0-rc0 (Quickstart Environment)

 PS: some of the commands used here are sensitive to the version of the product(s) listed above.
@@ -610,7 +610,7 @@ spec:
 kind: PersistentVolumeClaim
 apiVersion: v1
 metadata:
-  name: mlde-pvc
+  name: pdk-pvc
 spec:
   accessModes:
     - ReadWriteMany
@@ -639,7 +639,7 @@ spec:
 kind: PersistentVolumeClaim
 apiVersion: v1
 metadata:
-  name: mlde-pvc
+  name: pdk-pvc
   namespace: gpu-pool
 spec:
   accessModes:
@@ -722,7 +722,7 @@ proxy:

 determined:
   enabled: true
-  detVersion: "0.26.7"
+  detVersion: "0.28.1"
   imageRegistry: determinedai
   enterpriseEdition: false
   imagePullSecretName:
@@ -779,7 +779,7 @@ determined:
 volumes:
 - name: shared-fs
 persistentVolumeClaim:
-claimName: mlde-pvc
+claimName: pdk-pvc
 - pool_name: gpu-pool
 max_aux_containers_per_agent: 1
 kubernetes_namespace: gpu-pool
@@ -796,7 +796,7 @@ determined:
 volumes:
 - name: shared-fs
 persistentVolumeClaim:
-claimName: mlde-pvc
+claimName: pdk-pvc
 tolerations:
 - key: "nvidia.com/gpu"
 operator: "Equal"

deploy/images/example_llm_chatui.png

71.3 KB
66.8 KB

deploy/images/example_llm_model.png

138 KB
