Don't set pe_count for the C++ impl of the TritonInferenceStage (#1640)
* Ensure that both `pe_count` and `engines_per_pe` are set to `1` for the C++ impl of the `TritonInferenceStage`
* Remove hard-coded `--num_threads=1` from validation scripts
* Disable hammah validation script until #1641 can be resolved
* Back-port of #1636

Closes #1639
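
For context, here is a sketch of the kind of CLI invocation the validation scripts wrap, with the thread pinning removed. This is a hypothetical example: the model name, vocab hash file, and file paths are illustrative placeholders, not values taken from this PR:

```bash
# Hypothetical invocation; model, vocab file, and file paths are placeholders.
morpheus --log_level=DEBUG run --num_threads=$(nproc) --pipeline_batch_size=1024 \
    --model_max_batch_size=32 --use_cpp=1 \
  pipeline-nlp --model_seq_length=256 \
  from-file --filename=input.jsonlines \
  deserialize \
  preprocess --vocab_hash_file=${MORPHEUS_ROOT}/morpheus/data/bert-base-uncased-hash.txt \
  inf-triton --model_name=sid-minibert-onnx --server_url=localhost:8001 --force_convert_inputs=True \
  serialize \
  to-file --filename=output.jsonlines --overwrite
```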

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md).
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
  - David Gardner (https://github.com/dagardner-nv)
  - Eli Fajardo (https://github.com/efajardo-nv)

Approvers:
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #1640
dagardner-nv authored Apr 19, 2024
1 parent ab8d0a7 commit 883b804
Showing 4 changed files with 25 additions and 9 deletions.
2 changes: 1 addition & 1 deletion ci/check_style.sh
@@ -16,7 +16,7 @@ rapids-dependency-file-generator \
   --file_key checks \
   --matrix "cuda=${RAPIDS_CUDA_VERSION%.*};arch=$(arch);py=${RAPIDS_PY_VERSION}" | tee env.yaml
 
-rapids-mamba-retry env create --force -f env.yaml -n checks
+rapids-mamba-retry env create --yes -f env.yaml -n checks
 conda activate checks
 
 # Run pre-commit checks
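(This flag change is presumably tracking newer conda/mamba releases, in which `env create --force` was dropped in favor of an explicit `--yes`; it rides along with the back-port rather than relating to the `pe_count` fix.)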
10 changes: 10 additions & 0 deletions morpheus/stages/inference/triton_inference_stage.py
@@ -781,3 +781,13 @@ def _get_cpp_inference_node(self, builder: mrc.Builder) -> mrc.SegmentObject:
                                              self._needs_logits,
                                              self._input_mapping,
                                              self._output_mapping)
+
+    def _build_single(self, builder: mrc.Builder, input_node: mrc.SegmentObject) -> mrc.SegmentObject:
+        node = super()._build_single(builder, input_node)
+
+        # ensure that the C++ impl only uses a single progress engine
+        if (self._build_cpp_node()):
+            node.launch_options.pe_count = 1
+            node.launch_options.engines_per_pe = 1
+
+        return node
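
A note on the knobs pinned above: in MRC, a node's effective concurrency is roughly `pe_count` × `engines_per_pe` (progress engines times engines per progress engine), so forcing both to `1` serializes the C++ inference node regardless of the pipeline-wide thread count, while the Python impl is left untouched.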
10 changes: 8 additions & 2 deletions scripts/validation/val-run-all.sh
@@ -31,14 +31,20 @@ ensure_triton_running
 export USE_CPP=0
 
 ${SCRIPT_DIR}/abp/val-abp-all.sh
-${SCRIPT_DIR}/hammah/val-hammah-all.sh
+
+# Disabled per #1641
+# ${SCRIPT_DIR}/hammah/val-hammah-all.sh
+
 ${SCRIPT_DIR}/phishing/val-phishing-all.sh
 ${SCRIPT_DIR}/sid/val-sid-all.sh
 
 # Run everything once USE_CPP=True
 export USE_CPP=1
 
 ${SCRIPT_DIR}/abp/val-abp-all.sh
-${SCRIPT_DIR}/hammah/val-hammah-all.sh
+
+# Disabled per #1641
+# ${SCRIPT_DIR}/hammah/val-hammah-all.sh
+
 ${SCRIPT_DIR}/phishing/val-phishing-all.sh
 ${SCRIPT_DIR}/sid/val-sid-all.sh
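
As a usage note, here is a sketch of how the suite might be invoked after this change, assuming a Triton server with the validation models is already running and commands are issued from the Morpheus repo root:

```bash
# Run the full validation suite (hammah is skipped per #1641).
./scripts/validation/val-run-all.sh

# A single sub-suite can presumably still be run on its own, e.g.:
USE_CPP=1 ./scripts/validation/sid/val-sid-all.sh
```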
12 changes: 6 additions & 6 deletions scripts/validation/val-run-pipeline.sh
@@ -37,7 +37,7 @@ function run_pipeline_sid_minibert(){
     VAL_FILE=$4
     VAL_OUTPUT=$5
 
-    morpheus --log_level=DEBUG run --num_threads=1 --pipeline_batch_size=1024 --model_max_batch_size=32 --use_cpp=${USE_CPP} \
+    morpheus --log_level=DEBUG run --num_threads=$(nproc) --pipeline_batch_size=1024 --model_max_batch_size=32 --use_cpp=${USE_CPP} \
         pipeline-nlp --model_seq_length=256 \
         from-file --filename=${INPUT_FILE} \
         deserialize \
@@ -58,7 +58,7 @@ function run_pipeline_sid_bert(){
     VAL_FILE=$4
     VAL_OUTPUT=$5
 
-    morpheus --log_level=DEBUG run --num_threads=1 --pipeline_batch_size=1024 --model_max_batch_size=32 --use_cpp=${USE_CPP} \
+    morpheus --log_level=DEBUG run --num_threads=$(nproc) --pipeline_batch_size=1024 --model_max_batch_size=32 --use_cpp=${USE_CPP} \
         pipeline-nlp --model_seq_length=256 \
         from-file --filename=${INPUT_FILE} \
         deserialize \
@@ -79,7 +79,7 @@ function run_pipeline_abp_nvsmi(){
     VAL_FILE=$4
     VAL_OUTPUT=$5
 
-    morpheus --log_level=DEBUG run --num_threads=1 --pipeline_batch_size=1024 --model_max_batch_size=1024 --use_cpp=${USE_CPP} \
+    morpheus --log_level=DEBUG run --num_threads=$(nproc) --pipeline_batch_size=1024 --model_max_batch_size=1024 --use_cpp=${USE_CPP} \
         pipeline-fil --columns_file=${MORPHEUS_ROOT}/morpheus/data/columns_fil.txt \
         from-file --filename=${INPUT_FILE} \
         deserialize \
@@ -100,7 +100,7 @@ function run_pipeline_phishing_email(){
     VAL_FILE=$4
     VAL_OUTPUT=$5
 
-    morpheus --log_level=DEBUG run --num_threads=1 --pipeline_batch_size=1024 --model_max_batch_size=32 --use_cpp=${USE_CPP} \
+    morpheus --log_level=DEBUG run --num_threads=$(nproc) --pipeline_batch_size=1024 --model_max_batch_size=32 --use_cpp=${USE_CPP} \
         pipeline-nlp --model_seq_length=128 --labels_file=${MORPHEUS_ROOT}/morpheus/data/labels_phishing.txt \
         from-file --filename=${INPUT_FILE} \
         deserialize \
@@ -121,7 +121,7 @@ function run_pipeline_hammah_user123(){
     VAL_FILE=$4
     VAL_OUTPUT=$5
 
-    morpheus --log_level=DEBUG run --num_threads=1 --pipeline_batch_size=1024 --model_max_batch_size=1024 --use_cpp=${USE_CPP} \
+    morpheus --log_level=DEBUG run --num_threads=$(nproc) --pipeline_batch_size=1024 --model_max_batch_size=1024 --use_cpp=${USE_CPP} \
         pipeline-ae --columns_file="${MORPHEUS_ROOT}/morpheus/data/columns_ae_cloudtrail.txt" --userid_filter="user123" --userid_column_name="userIdentitysessionContextsessionIssueruserName" --timestamp_column_name="event_dt" \
         from-cloudtrail --input_glob="${MORPHEUS_ROOT}/models/datasets/validation-data/dfp-cloudtrail-*-input.csv" \
         train-ae --train_data_glob="${MORPHEUS_ROOT}/models/datasets/training-data/dfp-cloudtrail-*.csv" --source_stage_class=morpheus.stages.input.cloud_trail_source_stage.CloudTrailSourceStage --seed 42 \
@@ -143,7 +143,7 @@ function run_pipeline_hammah_role-g(){
     VAL_FILE=$4
     VAL_OUTPUT=$5
 
-    morpheus --log_level=DEBUG run --num_threads=1 --pipeline_batch_size=1024 --model_max_batch_size=1024 --use_cpp=${USE_CPP} \
+    morpheus --log_level=DEBUG run --num_threads=$(nproc) --pipeline_batch_size=1024 --model_max_batch_size=1024 --use_cpp=${USE_CPP} \
         pipeline-ae --columns_file="${MORPHEUS_ROOT}/morpheus/data/columns_ae_cloudtrail.txt" --userid_filter="role-g" --userid_column_name="userIdentitysessionContextsessionIssueruserName" --timestamp_column_name="event_dt" \
         from-cloudtrail --input_glob="${MORPHEUS_ROOT}/models/datasets/validation-data/dfp-cloudtrail-*-input.csv" \
         train-ae --train_data_glob="${MORPHEUS_ROOT}/models/datasets/training-data/dfp-cloudtrail-*.csv" --source_stage_class=morpheus.stages.input.cloud_trail_source_stage.CloudTrailSourceStage --seed 42 \
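
The recurring edit in this file swaps the hard-coded `--num_threads=1` for `$(nproc)`, which expands at invocation time to the number of online CPUs; now that the C++ `TritonInferenceStage` pins its own progress engine, the validation pipelines can safely scale with the host. A trivial illustration of the substitution:

```bash
# $(nproc) reports the number of online CPUs, so on a 32-core host this
# is equivalent to passing --num_threads=32.
echo "Validation pipelines will run with $(nproc) threads"
```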
