Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature/add-etcd-vm-node-scrape #614

Merged
merged 12 commits into from
Feb 6, 2025

Conversation

klinch0
Copy link
Contributor

@klinch0 klinch0 commented Feb 4, 2025

Summary by CodeRabbit

  • New Features

    • Enhanced system monitoring with a new configuration option to collect etcd metrics. Users can now enable the scraping of etcd metrics via updated settings, which improves observability.
    • Introduced a secure proxy mechanism that conditionally routes metrics data from etcd, offering administrators greater control over monitoring capabilities.
    • New configuration sections added to various bundles to support etcd metrics scraping.
  • Bug Fixes

    • Removed outdated configuration for VMNodeScrape resource, ensuring clarity and accuracy in monitoring configurations.
  • Chores

    • Added new service accounts, roles, and bindings to facilitate secure access for monitoring etcd metrics.

@klinch0 klinch0 requested a review from kvaps as a code owner February 4, 2025 21:00
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Feb 4, 2025
Copy link
Contributor

coderabbitai bot commented Feb 4, 2025

Walkthrough

This pull request adds a new scrapeRules configuration for the etcd service in multiple monitoring agent release files. The configuration sets enabled: true for etcd scraping in the core and PaaS bundles. Additionally, new Kubernetes resource configurations—including a kube-rbac-proxy DaemonSet, ServiceAccounts, roles, role bindings, a Secret, and a VMPodScrape—are defined to support secure scraping of etcd metrics. The templates now conditionally render resources based on the value of .Values.scrapeRules.etcd.enabled.

Changes

File(s) Change Summary
packages/core/platform/bundles/distro-full.yaml,
packages/core/platform/bundles/distro-hosted.yaml,
packages/core/platform/bundles/paas-full.yaml,
packages/core/platform/bundles/paas-hosted.yaml
Added a new values section under the monitoring-agents release to include scrapeRules: etcd: enabled: true.
packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml Introduced a DaemonSet (kube-rbac-proxy), associated ServiceAccounts, ClusterRoles, ClusterRoleBindings, a Secret, and a VMPodScrape resource, conditionally deployed when etcd scraping is enabled.
packages/system/monitoring-agents/templates/etcd-scrape.yaml Removed the entire configuration for the VMNodeScrape resource related to etcd monitoring.
packages/system/monitoring-agents/values.yaml Added a new scrapeRules section with an etcd configuration, defaulting enabled to false.

Sequence Diagram(s)

sequenceDiagram
    participant Loader as Config Loader
    participant Values as Values.yaml
    participant Proxy as kube-rbac-proxy DaemonSet
    participant Scrape as VMNodeScrape/etcd-scrape
    participant Etcd as etcd Service

    Loader->>Values: Read scrapeRules.etcd.enabled flag
    alt Flag is true
      Values-->>Loader: Return true
      Loader->>Proxy: Deploy etcd-proxy resources
      Proxy-->>Loader: Resources initialized
      Loader->>Scrape: Render etcd-scrape config
      Scrape->>Etcd: Start scraping metrics
    else Flag is false
      Values-->>Loader: Return false
      Loader->>Loader: Skip etcd scraping components
    end
Loading

Suggested labels

size:M, lgtm

Suggested reviewers

  • kvaps

Poem

I'm a rabbit on the run, hopping with delight,
New YAML fields now glow in code so bright.
Etcd metrics are sprouting, fresh and keen,
In clusters and pods where magic is seen.
With configs in place, I dance all day—
A code rabbit's joy in every way!
🥕🐇

Tip

🌐 Web search-backed reviews and chat
  • We have enabled web search-based reviews and chat for all users. This feature allows CodeRabbit to access the latest documentation and information on the web.
  • You can disable this feature by setting web_search: false in the knowledge_base settings.
  • Please share any feedback in the Discord discussion.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@dosubot dosubot bot added the enhancement New feature or request label Feb 4, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml (1)

120-120: Fix typo in resource name.

The VMPodScrape name contains a typo: etcd-managment-scrape should be etcd-management-scrape.

-  name: etcd-managment-scrape
+  name: etcd-management-scrape
hack/download-dashboards.sh (1)

71-71: Updated Grafana Dashboard Filename

The filename has been updated from kube-etcd3.json to kube-etcd.json, which aligns with the renaming mentioned in the PR objectives. Please verify that all downstream references (such as those in dashboards lists or documentation) reflect this change consistently.

Note: There is a trailing #TODO comment on this line. Ensure this TODO is either resolved or tracked appropriately in your issue tracker.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 835ee11 and 653e737.

📒 Files selected for processing (15)
  • dashboards/control-plane/kube-etcd3.json (0 hunks)
  • hack/download-dashboards.sh (2 hunks)
  • packages/core/platform/bundles/distro-full.yaml (1 hunks)
  • packages/core/platform/bundles/distro-hosted.yaml (1 hunks)
  • packages/core/platform/bundles/paas-full.yaml (1 hunks)
  • packages/core/platform/bundles/paas-hosted.yaml (1 hunks)
  • packages/extra/etcd/Chart.yaml (1 hunks)
  • packages/extra/etcd/templates/etcd-cluster.yaml (1 hunks)
  • packages/extra/etcd/templates/podscrape.yaml (1 hunks)
  • packages/extra/etcd/templates/prometheus-rules.yaml (1 hunks)
  • packages/extra/monitoring/dashboards.list (1 hunks)
  • packages/extra/versions_map (1 hunks)
  • packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml (1 hunks)
  • packages/system/monitoring-agents/templates/etcd-scrape.yaml (1 hunks)
  • packages/system/monitoring-agents/values.yaml (1 hunks)
💤 Files with no reviewable changes (1)
  • dashboards/control-plane/kube-etcd3.json
✅ Files skipped from review due to trivial changes (2)
  • packages/extra/etcd/Chart.yaml
  • packages/extra/versions_map
🧰 Additional context used
🪛 YAMLlint (1.35.1)
packages/system/monitoring-agents/templates/etcd-scrape.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)


[warning] 68-68: wrong indentation: expected 0 but found 2

(indentation)


[warning] 123-123: wrong indentation: expected 2 but found 4

(indentation)

🔇 Additional comments (12)
packages/extra/monitoring/dashboards.list (1)

33-33: LGTM! Dashboard path updated.

The update from kube-etcd3 to kube-etcd aligns with the broader changes to enhance etcd monitoring.

packages/extra/etcd/templates/podscrape.yaml (1)

1-11: LGTM! VMPodScrape configuration looks good.

The configuration correctly targets etcd pods using a specific label selector and appropriate HTTP scheme for internal pod scraping.

packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml (3)

19-19: Verify if host networking is required.

The DaemonSet uses hostNetwork: true. Please verify if this is necessary, as it could potentially expose the proxy to the host network.


125-126: Review TLS configuration for security concerns.

The configuration uses insecureSkipVerify: true, which bypasses certificate validation. Please verify if this is necessary and document the security implications.


24-24: Verify kube-rbac-proxy version.

The image uses a fixed version v0.11.0. Please verify if this is the latest stable version and consider using a version range or latest tag if appropriate.

packages/core/platform/bundles/distro-hosted.yaml (1)

61-64: Enabling etcd Scrape Rules

The added values section with the nested scrapeRules block (setting etcd: enabled: true) is well placed under the monitoring-agents release. This modification directly supports enhanced etcd monitoring in this bundle and is consistent with the overall PR objectives.

packages/extra/etcd/templates/etcd-cluster.yaml (1)

43-48: Exposing Metrics via a Dedicated etcd Container

The addition of the containers section within the pod template—including a container named etcd that exposes the metrics port on 2381/TCP—effectively enables metrics scraping for the etcd cluster. Ensure that the application process running in the etcd container is configured to actually emit the expected metrics on this port.

packages/core/platform/bundles/distro-full.yaml (1)

78-82: Consistent Enabling of etcd Metrics Scraping

The introduction of the scrapeRules configuration (with etcd: enabled: true) under the monitoring-agents release is consistent with the changes in other bundle files (for example, in distro-hosted.yaml). This consistency helps ensure that etcd metrics are scraped regardless of the deployment bundle selected.

packages/extra/etcd/templates/prometheus-rules.yaml (1)

1-9: New PrometheusRule Resource for etcd Monitoring

The new PrometheusRule resource (etcd-rules) is structured correctly and groups several alerts to monitor etcd cluster health and performance. The overall YAML syntax and use of templated labels are in line with Prometheus best practices.

packages/core/platform/bundles/paas-hosted.yaml (1)

73-77: ScrapeRules Configuration for etcd Enabled
The new values block for the monitoring-agents release now includes scrapeRules with etcd.enabled: true, which aligns with similar changes in other bundle files. This addition explicitly enables etcd metrics scraping for deployments relying on this bundle.

packages/core/platform/bundles/paas-full.yaml (1)

100-104: Consistent Enabling of etcd ScrapeRules
Adding the scrapeRules configuration with etcd.enabled: true in the monitoring-agents release maintains consistency across bundle configurations. This ensures that etcd metrics are scraped in environments using the paas-full.yaml bundle as well.

packages/system/monitoring-agents/values.yaml (1)

363-366: ScrapeRules for etcd Default Set to False
In this system values file, the new scrapeRules section sets etcd.enabled to false by default. This contrasts with the bundles where it is explicitly enabled. Verify that this discrepancy is intentional—perhaps to allow overriding via external configuration—and consider adding an inline comment or documentation note to clarify the intended behavior.

Comment on lines 30 to 34
scheme: http
scrapeTimeout: 5s
tlsConfig:
caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecureSkipVerify: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Review TLS configuration for security concerns.

The configuration uses HTTP scheme while also specifying TLS config with insecureSkipVerify: true. This combination raises security concerns:

  1. HTTP scheme with TLS config is inconsistent.
  2. insecureSkipVerify: true bypasses certificate validation.

Consider using HTTPS scheme if TLS is required, or verify if TLS can be safely disabled for this internal communication.


🏁 Script executed:

#!/bin/bash
# Check if etcd is configured to use TLS for metrics endpoint
rg -A 5 "metrics.*cert" packages/extra/etcd/

Length of output: 46


🏁 Script executed:

#!/bin/bash
# Display the TLS settings and surrounding context from the etcd-scrape.yaml file for manual verification.
rg -C 5 "tlsConfig:" packages/system/monitoring-agents/templates/etcd-scrape.yaml

Length of output: 307

Security Misconfiguration in TLS Settings

The file packages/system/monitoring-agents/templates/etcd-scrape.yaml is configured with an HTTP scheme yet also provides TLS settings with insecureSkipVerify: true. This combination is problematic because:

  • HTTP scheme vs. TLS config: Using HTTP alongside a TLS configuration is inconsistent. If encryption is desired, the scheme should be HTTPS.
  • Bypassing Certificate Validation: The use of insecureSkipVerify: true disables proper certificate validation, exposing the communication to potential security risks.

It is recommended to either:

  • Update the scheme to https and remove insecureSkipVerify: true so that certificate validation is enforced, or
  • Eliminate the TLS configuration entirely if secure communication is not needed for this internal endpoint.

Comment on lines 36 to 44
- alert: etcdHighNumberOfFailedGRPCRequests
annotations:
summary: "etcd cluster '{{`{{ $labels.job }}`}}': '{{`{{ $value }}`}}' of requests for '{{`{{ $labels.grpc_method }}`}}' failed on etcd instance '{{`{{ $labels.instance }}`}}'."
expr: |
100 * sum(rate(grpc_server_handled_total{job=~".*etcd.*", grpc_code!="OK"}[5m])) BY (job, instance, grpc_service, grpc_method)
/
sum(rate(grpc_server_handled_total{job=~".*etcd.*"}[5m])) BY (job, instance, grpc_service, grpc_method)
> 1
for: 10m
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Duplicate Alert Definition: etcdHighNumberOfFailedGRPCRequests (First Instance)

This alert is defined here with a threshold of > 1 over a 10-minute period and a severity of warning. However, an identical alert name is used later with different parameters. Duplicate alert names can lead to conflicts in alert processing.

Comment on lines 48 to 57
- alert: etcdHighNumberOfFailedGRPCRequests
annotations:
summary: "etcd cluster '{{`{{ $labels.job }}`}}': '{{`{{ $value }}`}}' of requests for '{{`{{ $labels.grpc_method }}`}}' failed on etcd instance '{{`{{ $labels.instance }}`}}'."
expr: |
100 * sum(rate(grpc_server_handled_total{job=~".*etcd.*", grpc_code!="OK"}[5m])) BY (job, instance, grpc_service, grpc_method)
/
sum(rate(grpc_server_handled_total{job=~".*etcd.*"}[5m])) BY (job, instance, grpc_service, grpc_method)
> 5
for: 5m
labels:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Duplicate Alert Definition: etcdHighNumberOfFailedGRPCRequests (Second Instance)

This alert is defined again—this time with a threshold of > 5, a shorter duration of 5 minutes, and a severity of critical. Having two alerts with the same name but differing thresholds and durations is likely to cause ambiguity. Please consider consolidating these into a single alert or renaming one to clearly differentiate their intent.

Comment on lines 89 to 97
- alert: etcdHighNumberOfFailedHTTPRequests
annotations:
summary: "'{{`{{ $value }}`}}' of requests for '{{`{{ $labels.method }}`}}' failed on etcd instance '{{`{{ $labels.instance }}`}}'."
expr: |
sum(rate(etcd_http_failed_total{job=~".*etcd.*", code!="404"}[5m])) BY (method) / sum(rate(etcd_http_received_total{job=~".*etcd.*"}[5m])) BY (method) > 0.01
for: 10m
labels:
severity: warning

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Duplicate Alert Definition: etcdHighNumberOfFailedHTTPRequests (First Instance)

The first definition of this alert uses a threshold of > 0.01 for a 10-minute period with a severity of warning.

Comment on lines 98 to 106
- alert: etcdHighNumberOfFailedHTTPRequests
annotations:
summary: "'{{`{{ $value }}`}}' of requests for '{{`{{ $labels.method }}`}}' failed on etcd instance '{{`{{ $labels.instance }}`}}'."
expr: |
sum(rate(etcd_http_failed_total{job=~".*etcd.*", code!="404"}[5m])) BY (method) / sum(rate(etcd_http_received_total{job=~".*etcd.*"}[5m])) BY (method) > 0.05
for: 10m
labels:
severity: critical

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Duplicate Alert Definition: etcdHighNumberOfFailedHTTPRequests (Second Instance)

A second alert with the same name is defined immediately after, this time using a threshold of > 0.05 for 10 minutes and a severity of critical. Duplicate alert definitions with differing parameters can introduce confusion in alert handling. Please merge or differentiate these alerts accordingly.

@klinch0 klinch0 force-pushed the feature/add-etcd-VMNodeScrape branch from 653e737 to 4655657 Compare February 5, 2025 12:39
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Feb 5, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml (2)

86-88: Review ClusterRole rule configuration.
The rule for the etcd-metrics-reader ClusterRole includes nonResourceURLs: ["/metrics"]. Confirm that this approach is intentional rather than employing resource-based permissions, as this choice might impact the access control for metric endpoints.


119-121: Spelling inconsistency in resource name.
The VMPodScrape resource is named etcd-managment-scrape. It appears that “management” is misspelled. Correcting this to etcd-management-scrape could improve clarity and consistency.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 653e737 and 4655657.

📒 Files selected for processing (7)
  • packages/core/platform/bundles/distro-full.yaml (1 hunks)
  • packages/core/platform/bundles/distro-hosted.yaml (1 hunks)
  • packages/core/platform/bundles/paas-full.yaml (1 hunks)
  • packages/core/platform/bundles/paas-hosted.yaml (1 hunks)
  • packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml (1 hunks)
  • packages/system/monitoring-agents/templates/etcd-scrape.yaml (1 hunks)
  • packages/system/monitoring-agents/values.yaml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • packages/system/monitoring-agents/values.yaml
  • packages/core/platform/bundles/paas-full.yaml
  • packages/core/platform/bundles/paas-hosted.yaml
  • packages/core/platform/bundles/distro-full.yaml
  • packages/core/platform/bundles/distro-hosted.yaml
🧰 Additional context used
🪛 YAMLlint (1.35.1)
packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)


[warning] 68-68: wrong indentation: expected 0 but found 2

(indentation)


[warning] 123-123: wrong indentation: expected 2 but found 4

(indentation)

packages/system/monitoring-agents/templates/etcd-scrape.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

🔇 Additional comments (5)
packages/system/monitoring-agents/templates/etcd-scrape.yaml (1)

1-1: Helm Templating Conditional Block and YAML Lint Note
The conditional block ({{- if .Values.scrapeRules.etcd.enabled }}) is correctly used to conditionally render the resource. Note that YAML lint tools may report a syntax error (e.g., "expected the node content, but found '-'") due to the Helm templating syntax; this is a known false positive and can be safely ignored.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml (4)

25-27: Confirm upstream port configuration.
The container arguments include --upstream=http://127.0.0.1:2381/. Since etcd’s default client port is typically 2379, please verify that port 2381 is intentional and correctly configured for your environment.


67-70: Validate subjects list indentation.
YAMLlint has flagged an indentation warning for the list item under subjects: (notably at line 68). Please review and adjust the indentation to ensure it complies with YAML best practices (e.g., ensure the dash is properly aligned relative to the parent key).

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 68-68: wrong indentation: expected 0 but found 2

(indentation)


122-123: Verify YAML indentation for podMetricsEndpoints.
YAMLlint reports a warning on line 123 about unexpected indentation (expected 2 spaces but found 4). Please double-check the indentation levels under podMetricsEndpoints: to ensure they match the required YAML structure.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 123-123: wrong indentation: expected 2 but found 4

(indentation)


1-1: ⚠️ Potential issue

Conditional rendering and Helm template syntax issue.
The file begins with {{- if not .Values.scrapeRules.etcd.enabled }}, which inverts the typical expectation that resources be rendered when etcd scraping is enabled. Verify whether the intended behavior is to deploy these resources when .Values.scrapeRules.etcd.enabled is true—in which case the condition should be updated to remove the not modifier. Additionally, YAMLlint reports a syntax error here; this may be a false positive due to Helm templating. Ensure your linting configuration supports Helm template directives.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

targetLabel: metrics_path
- replacement: etcd
targetLabel: job
scheme: http
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Protocol Scheme vs. TLS Configuration Inconsistency
The scheme is explicitly set to http, while a TLS configuration block (lines 32–34) is still provided. TLS settings are generally applicable with HTTPS endpoints. Please verify your intent:

  • If secure communication is desired, update the scheme to https and adjust TLS settings accordingly, or
  • If HTTP is intended, consider removing the TLS block to reduce confusion.

Comment on lines 32 to 34
tlsConfig:
caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
insecureSkipVerify: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Redundant TLS Configuration for HTTP Endpoint
The TLS configuration (including insecureSkipVerify: true) appears redundant when using http as the protocol. Removing these settings—if TLS is not required—can simplify the configuration and avoid potential misinterpretation.

Copy link
Member

@kvaps kvaps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to bind it only on NODE_IP

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml (2)

1-1: Helm Template Directive and Linter Compatibility
The Helm conditional directive ({{- if .Values.scrapeRules.etcd.enabled }}) is correctly used to conditionally render this block. Note that YAML linters may report a syntax error (as seen in the static analysis) because they do not recognize Helm templating. Consider configuring your linter to ignore Helm directives or use a Helm-aware linter.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)


120-139: VMPodScrape Resource: Indentation and Naming Refinement
The VMPodScrape resource is well defined; however, please address the following points:

  1. Indentation Warning: The static analysis indicates an indentation issue on line 128 (expected 2 spaces but found 4). Verify that the list item under podMetricsEndpoints is indented to meet your YAML style guidelines.
  2. Naming Spelling: The metadata name etcd-managment-scrape appears to have a typographical error. Consider renaming it to etcd-management-scrape for clarity and consistency.
    A suggested diff for the renaming is provided below:
-  name: etcd-managment-scrape
+  name: etcd-management-scrape

Please review the indentation within the podMetricsEndpoints block to confirm it meets your expected YAML structure.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 128-128: wrong indentation: expected 2 but found 4

(indentation)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4655657 and 86b6861.

📒 Files selected for processing (2)
  • packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml (1 hunks)
  • packages/system/monitoring-agents/templates/etcd-scrape.yaml (0 hunks)
💤 Files with no reviewable changes (1)
  • packages/system/monitoring-agents/templates/etcd-scrape.yaml
🧰 Additional context used
🪛 YAMLlint (1.35.1)
packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)


[warning] 73-73: wrong indentation: expected 0 but found 2

(indentation)


[warning] 128-128: wrong indentation: expected 2 but found 4

(indentation)

🔇 Additional comments (8)
packages/system/monitoring-agents/templates/etcd-proxy-scrape.yaml (8)

2-39: DaemonSet Configuration Confirmation
The DaemonSet for kube-rbac-proxy is defined appropriately. The configuration correctly sets the namespace, uses host networking, and specifies the node selector for control-plane nodes. The container’s arguments (e.g. --secure-listen-address=$(NODE_IP):9443 and --upstream=http://127.0.0.1:2381/) follow previous recommendations. Ensure that the image version (v0.11.0) is intended and compatible with your environments.


42-47: ServiceAccount for kube-rbac-proxy
The ServiceAccount definition for kube-rbac-proxy in the cozy-monitoring namespace is straightforward and correct.


50-61: ClusterRole for kube-rbac-proxy-auth
The ClusterRole is properly configured with the necessary permissions for token reviews and subject access reviews.


64-76: ClusterRoleBinding Indentation Check
The ClusterRoleBinding for kube-rbac-proxy-auth-binding is correctly defined; however, note the static analysis warning regarding the indentation of the subjects list (e.g. line 73). YAML best practices typically require list items to be indented consistently (commonly 2 spaces under the key). Please verify that this indentation aligns with your team's style and adjust your linter configuration if necessary.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 73-73: wrong indentation: expected 0 but found 2

(indentation)


77-84: ServiceAccount for vm-scrape
The ServiceAccount for vm-scrape is cleanly defined and placed in the correct namespace.


85-94: ClusterRole for etcd-metrics-reader – Permission Scope
The removal of the nonResourceURLs: ["/metrics"] line suggests an intentional change in the permission scope. Please confirm that this removal is deliberate and that the new or remaining permissions accurately allow the intended access. If further permissions are required for scraping metrics, consider verifying the access rules.


95-109: ClusterRoleBinding for etcd-metrics-reader: Indentation Consistency
Similar to the earlier binding, ensure that the list items under the subjects key (lines 106–108) are indented consistently according to your YAML formatting standards. This may require alignment adjustments if your YAML linter flags indentation issues.


110-119: Secret Definition for vm-token
The Secret is correctly defined as a service-account token for vm-scrape with the appropriate annotation.

Copy link
Member

@kvaps kvaps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Feb 6, 2025
@kvaps kvaps merged commit 5a47754 into aenix-io:main Feb 6, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants