feat: Add edge observability #713

Merged
merged 20 commits into from
Feb 20, 2025

Conversation

Member

@chriswk chriswk commented Feb 5, 2025

No description provided.

@chriswk chriswk requested a review from sighphyre February 5, 2025 15:03
@chriswk chriswk self-assigned this Feb 5, 2025

github-actions bot commented Feb 5, 2025

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

OpenSSF Scorecard

| Package | Version | Score |
| --- | --- | --- |
| cargo/opentelemetry | 0.28.0 | 🟢 6.6 |
| cargo/opentelemetry-prometheus | 0.28.0 | 🟢 6.6 |
| cargo/opentelemetry_sdk | 0.28.0 | 🟢 6.6 |
| cargo/opentelemetry | >= 0.28.0, < 0.29.0 | 🟢 6.6 |
| cargo/opentelemetry-prometheus | >= 0.28.0, < 0.29.0 | 🟢 6.6 |
| cargo/opentelemetry_sdk | >= 0.28.0, < 0.29.0 | 🟢 6.6 |

Details (identical for all six entries above):

| Check | Score | Reason |
| --- | --- | --- |
| Code-Review | 🟢 10 | all changesets reviewed |
| Dangerous-Workflow | 🟢 10 | no dangerous workflow patterns detected |
| Packaging | ⚠️ -1 | packaging workflow not detected |
| Maintained | 🟢 10 | 30 commit(s) and 26 issue activity found in the last 90 days -- score normalized to 10 |
| CII-Best-Practices | ⚠️ 0 | no effort to earn an OpenSSF best practices badge detected |
| Token-Permissions | ⚠️ 0 | detected GitHub workflow tokens with excessive permissions |
| License | 🟢 10 | license file detected |
| Vulnerabilities | 🟢 10 | 0 existing vulnerabilities detected |
| Branch-Protection | ⚠️ -1 | internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration |
| Signed-Releases | ⚠️ -1 | no releases found |
| Security-Policy | 🟢 10 | security policy file detected |
| Binary-Artifacts | 🟢 10 | no binaries found in the repo |
| Pinned-Dependencies | ⚠️ 0 | dependency not pinned by hash detected -- score normalized to 0 |
| Fuzzing | ⚠️ 0 | project is not fuzzed |
| SAST | ⚠️ 0 | SAST tool is not run on all commits -- score normalized to 0 |

Scanned Files

  • Cargo.lock
  • server/Cargo.toml

@chriswk chriswk force-pushed the feat/edgeObservability branch from 6e19fe1 to 466eba7 Compare February 5, 2025 15:04
@chriswk chriswk changed the title from "More work for dealing with prometheus data" to "feat: Add edge observability" Feb 5, 2025
@chriswk chriswk marked this pull request as ready for review February 10, 2025 08:32
Comment on lines +124 to +127
.with_boundaries(vec![
1.0, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0, 100.0, 200.0, 300.0, 400.0, 500.0, 750.0,
1000.0, 1500.0, 2000.0,
])
Member Author

This one used 0.0 as its first bucket boundary. Since we're measuring time, we never see negative values, so that bucket isn't useful.
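
To illustrate the point about the first bucket: in an explicit-bucket histogram each boundary acts as an inclusive upper bound, so a leading 0.0 boundary creates a bucket that only catches values at or below zero. The following is a minimal standalone sketch, not code from this PR. The `bucket_index` and `latency_boundaries` helpers are hypothetical and are not part of the OpenTelemetry API; they only mimic upper-inclusive bucketing, and the millisecond unit is an assumption.

```rust
/// Hypothetical helper: index of the bucket a value falls into, given sorted
/// upper-inclusive boundaries. Index `boundaries.len()` is the overflow bucket.
fn bucket_index(boundaries: &[f64], value: f64) -> usize {
    // Counts the boundaries strictly below `value`, which is the index of
    // the first boundary that is >= value.
    boundaries.partition_point(|&b| b < value)
}

/// Boundaries copied from the diff above (unit assumed to be milliseconds).
fn latency_boundaries() -> Vec<f64> {
    vec![
        1.0, 5.0, 10.0, 20.0, 30.0, 40.0, 50.0, 100.0, 200.0, 300.0, 400.0, 500.0, 750.0,
        1000.0, 1500.0, 2000.0,
    ]
}
```

With these boundaries a 0.5 measurement lands in the first bucket. If 0.0 were prepended, the first bucket would only ever receive measurements of exactly zero, and every real latency would start from the second bucket.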

@chriswk chriswk force-pushed the feat/edgeObservability branch from 337e957 to e064c77 Compare February 10, 2025 12:11
.unleash_client
.send_instance_data(observed_data, &instance_data_sender.token)
.await;
match status {
Member

Personal taste, but I think if let Err is a nicer construct than a match with an empty Ok arm. You could replace this with:

                if let Err(e) = status {
                    match e {
                        EdgeError::EdgeMetricsRequestError(status, message) => {
                            warn!("Failed to post instance data with status {status} and {message:?}");
                            if status == StatusCode::NOT_FOUND {
                                debug!("Upstream edge metrics not found, clearing our data about downstream instances to avoid growing to infinity (and beyond!).");
                                empty = true;
                                do_the_work = false;
                            } else if status == StatusCode::FORBIDDEN {
                                warn!("Upstream edge metrics rejected our data, clearing our data about downstream instances to avoid growing to infinity (and beyond!)");
                                empty = true;
                                do_the_work = false;
                            }
                        }
                        _ => {
                            warn!("Failed to post instance data due to unknown error {e:?}");
                            empty = false;
                        }
                    }
                }

loop {
let mut empty = true;
tokio::time::sleep(std::time::Duration::from_secs(60)).await;
if let Some(instance_data_sender) = instance_data_sender.clone() {
Member

So this function is pretty nested. Again, personal opinion, but if we never stored the data when we couldn't send it, then we could use


        let Some(instance_data_sender) = instance_data_sender.clone() else {
            continue;
        };

to flatten this out a little
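
As a self-contained illustration of the `let ... else` suggestion above, here is a minimal sketch. `InstanceDataSender` is reduced to a hypothetical stand-in with only a `token` field, and `run_ticks` is an invented name that replaces the real sleeping loop with a bounded one so it can be run in isolation.

```rust
/// Hypothetical stand-in for the sender handle in the diff above.
#[derive(Clone)]
struct InstanceDataSender {
    token: String,
}

/// Runs a fixed number of ticks and returns the tokens that were used.
/// A tick without a sender is skipped with `continue` via `let ... else`.
fn run_ticks(sender: Option<InstanceDataSender>, ticks: usize) -> Vec<String> {
    let mut used = Vec::new();
    for _ in 0..ticks {
        let Some(sender) = sender.clone() else {
            continue; // nothing to send to, skip this tick
        };
        // The main body now sits at loop level instead of inside `if let`.
        used.push(sender.token);
    }
    used
}
```

The early `continue` removes one level of indentation from everything that follows, which is the flattening the comment is asking for.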

}
}

pub async fn send_instance_data(
Member

Can we give this a different name? We now have two functions called the same thing doing different things

Member Author

yes

Ok(())
} else {
match result.status() {
StatusCode::BAD_REQUEST => Err(EdgeMetricsRequestError(
Member

Why this specific check?

Member Author

Because it's copy-pasta from an earlier request. I still like it though, because it exposes the full message if we receive a 400 from upstream, so we know what we need to fix.
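
The reasoning for special-casing 400 can be sketched as follows. This is an assumption-laden illustration, not the PR's code: `PostError` and `map_response` are hypothetical names, status codes are plain `u16` values rather than `StatusCode`, and the real `EdgeError` likely has a different shape.

```rust
/// Hypothetical error type, loosely modelled on the `EdgeMetricsRequestError`
/// variant seen in the diff.
#[derive(Debug, PartialEq)]
enum PostError {
    /// HTTP status plus the response body, when we chose to keep it.
    MetricsRequest(u16, Option<String>),
}

/// Maps an upstream response to a result. Only a 400 keeps the body, since
/// that is the case where the message says what the client has to fix.
fn map_response(status: u16, body: &str) -> Result<(), PostError> {
    match status {
        200..=299 => Ok(()),
        400 => Err(PostError::MetricsRequest(status, Some(body.to_string()))),
        _ => Err(PostError::MetricsRequest(status, None)),
    }
}
```

Keeping the body only for 400 means logs carry the upstream validation message when the payload is wrong, while other failures are reported by status alone.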

pub p99: f64,
}

impl LatencyMetrics {
Member

Default is already implemented for this struct, so this shouldn't be needed.

Member Author

Agreed.
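
For readers following along, a minimal sketch of why a hand-written constructor is redundant once `Default` is available. Only the `p99` field appears in the diff above; the `avg` and `count` fields and the `empty_latency` helper are assumptions made for illustration.

```rust
/// Hypothetical reduction of `LatencyMetrics`; only `p99` is taken from the
/// diff, the other fields are assumed.
#[derive(Debug, Default, Clone, PartialEq)]
struct LatencyMetrics {
    avg: f64,
    count: f64,
    p99: f64,
}

/// Builds a zeroed value without any hand-written `new()` constructor.
fn empty_latency() -> LatencyMetrics {
    // The derived `Default` sets every `f64` field to 0.0.
    LatencyMetrics::default()
}
```

Callers can use `LatencyMetrics::default()` directly, so an `impl` block that only provides a zeroing `new()` adds nothing.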

started: Utc::now(),
traffic: InstanceTraffic::default(),
latency_upstream: UpstreamLatency::default(),
connected_edges: Vec::new(),
Member

Not married to it but I'm so used to seeing the macro invocation over explicit new here

Suggested change
- connected_edges: Vec::new(),
+ connected_edges: vec![],

Member Author

yeah, fixed

@chriswk chriswk force-pushed the feat/edgeObservability branch from f3060e5 to c802591 Compare February 14, 2025 09:11
Member

@nunogois nunogois left a comment

LGTM!

@chriswk chriswk force-pushed the feat/edgeObservability branch from 9b8636c to 1f2778d Compare February 20, 2025 13:00
@chriswk chriswk enabled auto-merge (squash) February 20, 2025 13:10
@chriswk chriswk merged commit 130fba6 into main Feb 20, 2025
15 checks passed
@chriswk chriswk deleted the feat/edgeObservability branch February 20, 2025 13:15
Labels
None yet
Projects
Status: Done

3 participants