
Create token metrics only when they are available #1092

Merged (3 commits) on Feb 5, 2025

Conversation

@eero-t (Contributor) commented Dec 30, 2024

Description

This avoids generating useless token / request histogram metrics for services that use the Orchestrator class but never call its token-processing functionality. Such dummy metrics can confuse telemetry users.

(It also helps in differentiating frontend megaservice metrics from backend megaservice ones, especially when multiple OPEA applications with wrapper microservices run in the same cluster.)
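For illustration, a minimal sketch (assuming prometheus_client; the metric name is hypothetical, not the actual one in orchestrator.py) of why such dummy metrics appear at all: the client registers a metric at construction time, so it is exposed with zero counts even if it is never updated.

```python
from prometheus_client import Histogram, generate_latest

# Constructing the Histogram is enough to register it in the default
# registry (the metric name here is hypothetical)...
first_token = Histogram(
    "megaservice_first_token_latency", "First token latency (s)"
)

# ...so it already shows up in the scrape output with zero-valued
# samples, even though observe() was never called.
print(generate_latest().decode())
```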

Issues

n/a.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Dependencies

n/a.

Tests

Manual testing with the latest versions, to verify that:

  • services processing tokens generate token histogram metrics
  • services not processing them produce only the pending-requests gauge

(One way to check this is sketched below.)
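A hedged sketch of such a check; the port is an assumption, and the megaservice_ prefix is taken from the discussion below:

```python
import requests

# The URL/port is hypothetical; point it at the service under test.
text = requests.get("http://localhost:8888/metrics", timeout=5).text

# prometheus_client emits one "# TYPE <name> <kind>" line per metric
# family, so this lists which megaservice metric families are exposed.
for line in text.splitlines():
    if line.startswith("# TYPE megaservice_"):
        print(line)
```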

codecov bot commented Dec 30, 2024

Codecov Report

Attention: Patch coverage is 96.42857% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| comps/cores/mega/orchestrator.py | 96.42% | 1 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| comps/cores/mega/orchestrator.py | 91.17% <96.42%> (+0.39%) ⬆️ |

@eero-t (Contributor, Author) commented Jan 3, 2025

@Spycsh Could you review this?

(And maybe also #1107.)

@eero-t (Contributor, Author) commented Jan 3, 2025

Rebased to main.

@Spycsh (Member) commented Jan 6, 2025

> This avoids generating useless token / request histogram metrics for services that use the Orchestrator class but never call its token-processing functionality. Such dummy metrics can confuse telemetry users.

Why will this happen? Metrics are only updated when calling self.metrics.pending_update-like methods in schedule, right? These are all controllable code. So do you mean that there is something else that can update the metrics?

@eero-t (Contributor, Author) commented Jan 7, 2025

> Why will this happen?

@Spycsh Because the Prometheus client will start providing metrics once they've been created.

In the current code, all metrics are created when Orchestrator/OrchestratorMetrics is instantiated: https://github.com/opea-project/GenAIComps/blob/main/comps/cores/mega/orchestrator.py#L33

> Metrics are only updated when calling self.metrics.pending_update-like methods in schedule, right?

Those methods only update the value of the metric; they do not create it. This PR changes Histogram metric creation to be delayed until the first call of the update methods.
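A minimal sketch of the delayed-creation pattern described above (illustrative names, not the actual orchestrator.py code): the public update hook starts out as a wrapper that creates the Histogram on first call, then swaps itself out so later calls pay no creation check.

```python
from prometheus_client import Histogram

class OrchestratorMetrics:
    """Sketch: create the Histogram only on the first update call."""

    def __init__(self) -> None:
        self.request_latency = None
        # Until first use, the public hook points at the creation wrapper.
        self.request_update = self._request_update_create

    def _request_update_create(self, duration: float) -> None:
        # First call: create the Histogram, then swap the hook so later
        # calls go straight to the plain update method below.
        self.request_latency = Histogram(
            "megaservice_request_latency", "Whole request latency (s)"
        )
        self.request_update = self._request_update
        self._request_update(duration)

    def _request_update(self, duration: float) -> None:
        self.request_latency.observe(duration)
```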

@eero-t (Contributor, Author) commented Jan 7, 2025

I dropped the pending-metric doc update & rebased to main. I'll do it in a separate PR where I fix additional issues I noticed, which require a pending-requests metric type / name change.

@eero-t (Contributor, Author) commented Jan 7, 2025

opea/dataprep-redis:latest does not seem to generate megaservice_* metrics anymore.

Has ServiceOrchestrator use been dropped from backend services?

@eero-t (Contributor, Author) commented Jan 7, 2025

> I dropped the pending-metric doc update & rebased to main. I'll do it in a separate PR where I fix additional issues I noticed, which require a pending-requests metric type / name change.

I could not find any good fix for it, so I just filed a ticket instead: #1121

@Spycsh (Member) commented Jan 8, 2025

> This avoids generating useless token / request histogram metrics for services that use the Orchestrator class but never call its token-processing functionality. Such dummy metrics can confuse telemetry users.
>
> (It also helps in differentiating frontend megaservice metrics from backend megaservice ones, especially when multiple OPEA applications with wrapper microservices run in the same cluster.)

OK, so what you mean is that the dummy metrics will show zeros after initialization and before the first request, and users should not see misleading request counts... But you think k8s will scrape the metrics even when there are no requests, which is resource-consuming, so you decided to delay the initialization until there are requests. I agree with this approach.

@Spycsh (Member) commented Jan 8, 2025

> opea/dataprep-redis:latest does not seem to generate megaservice_* metrics anymore.
>
> Has ServiceOrchestrator use been dropped from backend services?

The dataprep microservice itself should not generate megaservice_* metrics. Only megaservices like opea/chatqna do.

@eero-t (Contributor, Author) commented Jan 8, 2025

> OK, so what you mean is that the dummy metrics will show zeros after initialization and before the first request, and users should not see misleading request counts...

Technically the zero counts are not wrong, but the presence of token / LLM metrics is misleading for services that will never generate tokens (or use an LLM). That's the main reason for this PR.

> But you think k8s will scrape the metrics even when there are no requests, which is resource-consuming, so you decided to delay the initialization until there are requests. I agree with this approach.

Visibility

All OPEA-originated services use HttpService, i.e. they provide HTTP access metrics [1]. To make those visible, ServiceMonitors are installed for them when the monitoring option is enabled in the Helm charts. This means that any megaservice_* metrics they generate will also be visible to the user, e.g. in Grafana.

Perf

I doubt that skipping generation of the extra metrics has any noticeable perf impact on the service providing them (currently ServiceMonitors are configured to poll them at a 5s interval), but every little bit can help.

Each Prometheus Histogram type provides about a dozen different metrics, and in larger clusters the number of metrics needs to be reduced to keep telemetry stack resource usage & perf reasonable. Telemetry stack resource usage should be a significant concern only when there's a larger number of such pods, though.

[1] There's a large number of HTTP metrics, and some Python ones too. It would be good to have controls for limiting those in larger clusters, but I did not see any options for that in the prometheus_fastapi_instrumentator API.
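For context, a generic usage sketch of how prometheus_fastapi_instrumentator typically gets wired into a FastAPI app (not the actual OPEA HttpService code):

```python
from fastapi import FastAPI
from prometheus_fastapi_instrumentator import Instrumentator

app = FastAPI()

# Instrument all routes and expose the collected HTTP metrics on
# /metrics, the endpoint that a ServiceMonitor then scrapes.
Instrumentator().instrument(app).expose(app)
```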

@eero-t (Contributor, Author) commented Jan 10, 2025

@Spycsh From your comment in the bug #1121 (comment)

I realized that changing the method on first metric access is racy. It's possible for multiple threads to end up in the create method before that method is changed to the update one, meaning that multiple identical metrics would be created, and Prometheus would barf on that.

=> I'll add a lock & check to handle that.
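Sketched on top of the earlier example (again with illustrative names), the fix serializes the first call and re-checks before creating, so concurrent request threads cannot register the same metric twice:

```python
import threading

from prometheus_client import Histogram

class OrchestratorMetrics:
    _lock = threading.Lock()

    def __init__(self) -> None:
        self.request_latency = None
        self.request_update = self._request_update_create

    def _request_update_create(self, duration: float) -> None:
        with self._lock:
            # Re-check under the lock: several request-handling threads
            # may get here before the hook is swapped; only one may
            # create the Histogram, or prometheus_client raises a
            # duplicate timeseries error.
            if self.request_latency is None:
                self.request_latency = Histogram(
                    "megaservice_request_latency", "Whole request latency (s)"
                )
                self.request_update = self._request_update
        self._request_update(duration)

    def _request_update(self, duration: float) -> None:
        self.request_latency.observe(duration)
```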

@eero-t force-pushed the metrics-update branch 2 times, most recently from 23cd2c5 to 0a4e313 on January 13, 2025 10:17
Create token metrics only when they are available

This avoids generation of useless token/request histogram metrics
for services that use Orchestrator class, but never call its token
processing functionality.

(Helps in differentiating frontend megaservice metrics from backend
megaservice ones, especially when multiple OPEA applications run in
the same cluster.)

Also change Orchestrator CI test workaround to use unique prefix for
each metric instance, instead of metrics being (singleton) class
variables.

Signed-off-by: Eero Tamminen <[email protected]>

Add locking for latency metric creation / method change

As that could be called from multiple request handling threads.

Signed-off-by: Eero Tamminen <[email protected]>
@eero-t (Contributor, Author) commented Feb 4, 2025

Rebased on main, no other changes.

@Spycsh Does this look fine / OK to merge, now that v1.2 got branched & tagged?

@mkbhanda (Collaborator) commented Feb 4, 2025

@eero-t does it make sense to have a Gauge for every service, or just for the end-to-end application, as in the chatQnA megaservice?

I was just curious why we did not have some default absolute values for the histogram buckets for first-token latency, given it is like an SLA.

I did see some settings in GenAIEval (https://github.com/opea-project/GenAIEval/blob/55174246234fa458afeca138e759d91648d28d03/evals/benchmark/grafana/vllm_grafana.json and https://github.com/opea-project/GenAIInfra/blob/5b2cca97206e6d27e7ea31e6a38e38dc21eec404/kubernetes-addons/Observability/chatqna/dashboard/tgi_grafana.json) and I get that it's really data collection first, with only the display determined in Grafana.

@Spycsh (Member) commented Feb 5, 2025

LGTM.

@eero-t (Contributor, Author) commented Feb 5, 2025

> @eero-t does it make sense to have a Gauge for every service, or just for the end-to-end application, as in the chatQnA megaservice?

Short answer:

There's no way to differentiate whether the service instantiating the orchestrator class is a frontend or a middle service, so orchestrator metrics appear for whichever service instantiates the class. This is the problem this PR tries to deal with.

Long answer:

Every OPEA service does provide metrics, at least HTTP query stats, but those also include health and metric queries (regularly polled from the services by other k8s components), and do not include all the info relevant for service SLAs.

The metrics this PR is concerned with are end-user request and token latencies, relevant for tracking service SLAs, as measured within the service orchestrator class (the only place where they actually can be measured).

The problem that this PR deals with is that the service orchestrator is/was/can be used also by OPEA components that do not process tokens, so providing (zero-valued) metrics about them would be misleading (especially if those services then show up unwanted in dashboards). The change also makes megaservice metric behaviour similar to the (TEI/TGI/vLLM) inferencing services, which do not provide metrics until they've processed their first request.

> I was just curious why we did not have some default absolute values for the histogram buckets for first-token latency, given it is like an SLA.

How well given buckets fit a given service depends completely on what kind of LLM model/params are used, whether inferencing is accelerated, how many backends there are, and also to some extent on how stressed the service is.

Prometheus already has defaults for histogram buckets, which are exponential. They are good enough that you'll see metrics spread across multiple buckets, regardless of the used model/acceleration.

If one wanted more detail, buckets would need to be specified separately for each service and its underlying HW configuration. I.e. they would need to be externally configurable, and specified in some kind of service/HW profile.
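As a sketch of what such external configuration could look like (the environment variable and fallback bucket values are hypothetical, not an existing OPEA option):

```python
import os

from prometheus_client import Histogram

# prometheus_client's default buckets:
# .005, .01, .025, .05, .075, .1, .25, .5, .75, 1, 2.5, 5, 7.5, 10, +Inf
default = Histogram("first_token_latency_default", "First token latency (s)")

# Hypothetical per-deployment override, e.g. from a service/HW profile.
raw = os.environ.get("LATENCY_BUCKETS", "0.1,0.5,1,2,5")
buckets = [float(b) for b in raw.split(",")]
tuned = Histogram(
    "first_token_latency_tuned", "First token latency (s)", buckets=buckets
)
```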

> I did see some settings in GenAIEval (...) and I get that it's really data collection first, with only the display determined in Grafana.

Values given to Prometheus quantile queries are percentage thresholds, not the histogram ("le" = less than) bucket thresholds. I.e. those dashboards should work regardless of which metric value bucket thresholds are used.

How accurate the quantile information is depends both on how well the values are spread across the buckets, and on how well that matches the quantile percentage thresholds, though.

@mkbhanda (Collaborator) left a review comment

LGTM, and thank you @eero-t for the clarifications on Gauge and the default histogram buckets.

@mkbhanda merged commit 4ede405 into opea-project:main Feb 5, 2025
15 checks passed
Spycsh added a commit to Spycsh/GenAIComps that referenced this pull request Feb 12, 2025
- Fix the wrong _instance_id handling in opea-project#1092
- essential for opea-project/GenAIExamples#1528 UT pass

Signed-off-by: Spycsh <[email protected]>
aMahanna added a commit to arangoml/GenAIComps1.2 that referenced this pull request Feb 26, 2025
commit ad8f517
Author: Dina Suehiro Jones <[email protected]>
Date:   Wed Feb 26 11:35:04 2025 -0800

    Dataprep Multimodal Redis README fixes (opea-project#1330)

    Signed-off-by: Dina Suehiro Jones <[email protected]>

commit c70f868
Author: Ervin Castelino <[email protected]>
Date:   Tue Feb 25 07:42:42 2025 +0000

    Update README.md (opea-project#1253)

    Signed-off-by: Ervin <[email protected]>

commit 589587a
Author: ZePan110 <[email protected]>
Date:   Mon Feb 24 17:54:51 2025 +0800

    Fix docker image security issues (opea-project#1321)

    Signed-off-by: ZePan110 <[email protected]>

commit f5699e4
Author: Brijesh Thummar <[email protected]>
Date:   Mon Feb 24 12:00:19 2025 +0530

    [Doc] vLLM - typo in README.md (opea-project#1302)

    Fix Typo in README

    Signed-off-by: [email protected] <[email protected]>

commit 364ccad
Author: Jonathan Minkin <[email protected]>
Date:   Sun Feb 23 19:27:31 2025 -0800

    Add support for string message in Bedrock textgen (opea-project#1291)

    * Add support for string message in bedrock, update README
    * Add test for string message in test script

    Signed-off-by: Jonathan Minkin <[email protected]>

commit 625aec9
Author: Daniel De León <111013930+daniel-de-leon-user293@users.noreply.github.com>
Date:   Fri Feb 21 13:20:58 2025 -0800

    Add native support for toxicity detection guardrail microservice (opea-project#1258)

    * add opea native support for toxic-prompt-roberta

    * add test script back

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * add comp name env variable

    * set default port to 9090

    Signed-off-by: Daniel Deleon <[email protected]>

    * add service to compose

    Signed-off-by: Daniel Deleon <[email protected]>

    * removed debug print

    Signed-off-by: Daniel Deleon <[email protected]>

    * remove triton version because habana updated

    Signed-off-by: Daniel Deleon <[email protected]>

    * add locust results

    Signed-off-by: Daniel Deleon <[email protected]>

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * skip warmup for halluc test

    Signed-off-by: Daniel Deleon <[email protected]>

    ---------

    Signed-off-by: Daniel Deleon <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    Co-authored-by: Liang Lv <[email protected]>
    Co-authored-by: Abolfazl Shahbazi <[email protected]>

commit 4352636
Author: ZePan110 <[email protected]>
Date:   Fri Feb 21 17:14:08 2025 +0800

    Fix trivy issue in Dockerfile (opea-project#1304)

    Signed-off-by: ZePan110 <[email protected]>

commit 135ef91
Author: rbrugaro <[email protected]>
Date:   Thu Feb 20 15:29:39 2025 -0800

    Change neo4j Bolt default PORT from 7687 to $NEO4J_PORT2 (opea-project#1292)

    * Change neo4j Bolt default PORT from 7687 to

    -configured the port in neo4j compose.yaml to use variable value
    -made all corresponding changes in neo4j  dataprep and retriever components and test scripts to use env variable instead of default port value.

    Signed-off-by: rbrugaro <[email protected]>

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * missing positional arg in milvus dataprep

    Signed-off-by: rbrugaro <[email protected]>

    * remove redundance in stop_docker

    Signed-off-by: rbrugaro <[email protected]>

    * resolve retriever to neo4j connectivity issue bad URL

    Signed-off-by: rbrugaro <[email protected]>

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * set neo4j ports to neo4j defaults and fix environment variables in READMEs

    Signed-off-by: rbrugaro <[email protected]>

    ---------

    Signed-off-by: rbrugaro <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    Co-authored-by: Liang Lv <[email protected]>

commit a4f6af1
Author: Letong Han <[email protected]>
Date:   Thu Feb 20 13:38:01 2025 +0800

    Refine dataprep test scripts (opea-project#1305)

    * Refine dataprep Milvus CI
    Signed-off-by: letonghan <[email protected]>

commit 2102a8e
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Thu Feb 20 10:46:19 2025 +0800

    Bump transformers (opea-project#1278)

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: Liang Lv <[email protected]>

commit a033c05
Author: Liang Lv <[email protected]>
Date:   Wed Feb 19 14:19:02 2025 +0800

    Fix milvus dataprep ingest files failure (opea-project#1299)

    Signed-off-by: lvliang-intel <[email protected]>
    Co-authored-by: Letong Han <[email protected]>

commit 022d052
Author: lkk <[email protected]>
Date:   Wed Feb 19 09:50:59 2025 +0800

    fix agent message format. (opea-project#1297)

    1. set default session_id for react_langchain strategy, because of the langchain version upgrade.
    2. fix request message format

commit 7727235
Author: Liang Lv <[email protected]>
Date:   Tue Feb 18 20:55:20 2025 +0800

    Refine CLIP embedding microservice by leveraging the third-party CLIP (opea-project#1298)

    * Refine CLI embedding microservice using dependency
    Signed-off-by: lvliang-intel <[email protected]>

commit a353f99
Author: Spycsh <[email protected]>
Date:   Mon Feb 17 11:35:38 2025 +0800

    Fix telemetry connection issue when disabling telemetry (opea-project#1290)

    * Fix telemetry connection issue when disabling telemetry

    - use ENABLE_OPEA_TELEMETRY to control whether to enable open telemetry, default false
    - fix the issue that logs always show telemetry connection error with each request when telemetry is disabled
    - ban the above error propagation to microservices when telemetry is disabled

    Signed-off-by: Spycsh <[email protected]>

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * Fix ut failure where required the flag to be on

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    ---------

    Signed-off-by: Spycsh <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit 7c2e7f6
Author: xiguiw <[email protected]>
Date:   Sat Feb 15 11:25:15 2025 +0800

    update vLLM CPU to latest tag (opea-project#1285)

    Get the latest vLLM stable version.
    Signed-off-by: Wang, Xigui <[email protected]>

commit c3c8497
Author: Letong Han <[email protected]>
Date:   Fri Feb 14 22:29:38 2025 +0800

    Fix Qdrant retriever RAG issue. (opea-project#1289)

    * Fix Qdrant retriever no retrieved result issue.
    Signed-off-by: letonghan <[email protected]>

commit 47f68a4
Author: Letong Han <[email protected]>
Date:   Fri Feb 14 20:29:27 2025 +0800

    Fix the retriever issue of Milvus (opea-project#1286)

    * Fix the retriever issue of Milvus DB that data can not be retrieved
    after ingested using dataprep.

    Signed-off-by: letonghan <[email protected]>

    ---------

    Signed-off-by: letonghan <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit 0e3f8ab
Author: minmin-intel <[email protected]>
Date:   Thu Feb 13 20:24:02 2025 -0800

    Improve multi-turn capability for agent (opea-project#1248)

    * first code for multi-turn

    Signed-off-by: minmin-intel <[email protected]>

    * test redispersistence

    Signed-off-by: minmin-intel <[email protected]>

    * integrate persistent store in react llama

    Signed-off-by: minmin-intel <[email protected]>

    * test multi-turn

    Signed-off-by: minmin-intel <[email protected]>

    * multiturn for assistants api and chatcompletion api

    Signed-off-by: minmin-intel <[email protected]>

    * update readme and ut script

    Signed-off-by: minmin-intel <[email protected]>

    * update readme and ut scripts

    Signed-off-by: minmin-intel <[email protected]>

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * fix bug

    Signed-off-by: minmin-intel <[email protected]>

    * change memory type naming

    Signed-off-by: minmin-intel <[email protected]>

    * fix with_memory as str

    Signed-off-by: minmin-intel <[email protected]>

    ---------

    Signed-off-by: minmin-intel <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit 4a90692
Author: rbrugaro <[email protected]>
Date:   Thu Feb 13 18:12:25 2025 -0800

    Bug Fix neo4j dataprep ingest error handling and skip_ingestion argument passing (opea-project#1288)

    * Fix dataprep ingest error handling and skip_ingestion argument passing in dataprep neo4j integration
    Signed-off-by: rbrugaro <[email protected]>

commit d1dfd0e
Author: Spycsh <[email protected]>
Date:   Thu Feb 13 22:39:47 2025 +0800

    Align mongo related chathistory/feedbackmanagement/promptregistry image names with examples (opea-project#1284)

    Align mongo related chathistory/feedbackmanagement/promptregistry image names with examples

    Signed-off-by: Spycsh <[email protected]>
    Co-authored-by: Liang Lv <[email protected]>

commit bef501c
Author: Liang Lv <[email protected]>
Date:   Thu Feb 13 21:18:58 2025 +0800

    Fix VDMS retrieval issue (opea-project#1252)

    * Fix VDMS retrieval issue
    Signed-off-by: lvliang-intel <[email protected]>

commit 23b2be2
Author: ZePan110 <[email protected]>
Date:   Thu Feb 13 16:07:14 2025 +0800

    Fix Build latest images on push event workflow (opea-project#1282)

    Signed-off-by: ZePan110 <[email protected]>

commit f8e6216
Author: Spycsh <[email protected]>
Date:   Wed Feb 12 15:45:14 2025 +0800

    fix metric id issue when init multiple Orchestrator instance (opea-project#1280)

    Signed-off-by: Spycsh <[email protected]>

commit d3906ce
Author: chen, suyue <[email protected]>
Date:   Wed Feb 12 14:56:55 2025 +0800

    update default service list (opea-project#1276)

    Signed-off-by: chensuyue <[email protected]>

commit 17b9672
Author: XinyaoWa <[email protected]>
Date:   Wed Feb 12 13:53:31 2025 +0800

    Fix langchain and huggingface version to avoid bug in FaqGen and DocSum, remove vllm hpu triton version fix (opea-project#1275)

    * Fix langchain and huggingface version to avoid bug

    Signed-off-by: Xinyao Wang <[email protected]>

commit b777db7
Author: Letong Han <[email protected]>
Date:   Mon Feb 10 16:00:55 2025 +0800

    Fix Dataprep Ingest Data Issue. (opea-project#1271)

    * Fix Dataprep Ingest Data Issue.

    Trace:
    1. The update of `langchain_huggingface.HuggingFaceEndpointEmbeddings` caused the wrong size of embedding vectors.
    2. Wrong size vectors are wrongly saved into Redis database, and the indices are not created correctly.
    3. The retriever can not retrieve data from Redis using index due to the
       reasons above.
    4. Then the RAG seems `not work`, for the file uploaded can not be
       retrieved from database.

    Solution:
    Replace all of the `langchain_huggingface.HuggingFaceEndpointEmbeddings`
    to `langchain_community.embeddings.HuggingFaceInferenceAPIEmbeddings`,
    and modify related READMEs and scripts.

    Related issue:
    - opea-project/GenAIExamples#1473
    - opea-project/GenAIExamples#1482

    ---------

    Signed-off-by: letonghan <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

commit 0df374b
Author: Daniel De León <111013930+daniel-de-leon-user293@users.noreply.github.com>
Date:   Sun Feb 9 22:01:58 2025 -0800

    Update docs for LLamaGuard & WildGuard Microservice (opea-project#1259)

    * working README for CLI and compose

    Signed-off-by: Daniel Deleon <[email protected]>

    * update for direct python execution

    Signed-off-by: Daniel Deleon <[email protected]>

    * fix formatting

    Signed-off-by: Daniel Deleon <[email protected]>

    * [pre-commit.ci] auto fixes from pre-commit.com hooks

    for more information, see https://pre-commit.ci

    * bring back depends_on condition

    Signed-off-by: Daniel Deleon <[email protected]>

    ---------

    Signed-off-by: Daniel Deleon <[email protected]>
    Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
    Co-authored-by: Abolfazl Shahbazi <[email protected]>

commit fb86b5e
Author: Louie Tsai <[email protected]>
Date:   Sat Feb 8 00:58:33 2025 -0800

    Add Deepseek model into validated model table and add required Gaudi cards for LLM microservice  (opea-project#1267)

    * Update README.md for Deepseek support and numbers of required gaudi cards

    Signed-off-by: Tsai, Louie <[email protected]>

    * Update README.md

    Signed-off-by: Tsai, Louie <[email protected]>

    ---------

    Signed-off-by: Tsai, Louie <[email protected]>

commit ecb7f7b
Author: Spycsh <[email protected]>
Date:   Fri Feb 7 16:58:22 2025 +0800

    Fix web-retrievers hub client and tei endpoint issue (opea-project#1270)

    * fix web-retrievers hub client and tei endpoint issue

    Signed-off-by: Spycsh <[email protected]>

commit 5baada8
Author: ZePan110 <[email protected]>
Date:   Thu Feb 6 15:03:00 2025 +0800

    Fix CD test issue. (opea-project#1263)

    1.Fix template name in README
    2.Fix invalid release name

    Signed-off-by: ZePan110 <[email protected]>

commit fa01f46
Author: minmin-intel <[email protected]>
Date:   Wed Feb 5 13:57:57 2025 -0800

    fix tei embedding and tei reranking bug (opea-project#1256)

    Signed-off-by: minmin-intel <[email protected]>
    Co-authored-by: Abolfazl Shahbazi <[email protected]>

commit 4ede405
Author: Eero Tamminen <[email protected]>
Date:   Wed Feb 5 22:04:50 2025 +0200

    Create token metrics only when they are available (opea-project#1092)

    * Create token metrics only when they are available

    This avoids generation of useless token/request histogram metrics
    for services that use Orchestrator class, but never call its token
    processing functionality.

    (Helps in differentiating frontend megaservice metrics from backend
    megaservice ones, especially when multiple OPEA applications run in
    the same cluster.)

    Also change Orchestrator CI test workaround to use unique prefix for
    each metric instance, instead of metrics being (singleton) class
    variables.

    Signed-off-by: Eero Tamminen <[email protected]>

    * Add locking for latency metric creation / method change

    As that could be called from multiple request handling threads.

    Signed-off-by: Eero Tamminen <[email protected]>

    ---------

    Signed-off-by: Eero Tamminen <[email protected]>
    Co-authored-by: Malini Bhandaru <[email protected]>