Return a Stream<Item=Result<T>> from request_events #92
Conversation
Ahh, seems like I had ...
initial comments. looks great!
I have a little more tolerance for longer lines in signatures, but otherwise the changes are appropriate. Don't mind the changes being kept in. At least you're only changing the files you touch.
nice work. some questions.
Looking good! I will have a quick sanity check on your branch then merge :-)
Hah, I think the stuff that breaks is my original unfold code. It looks like all informers break inside the client's ...
Been stress testing the pod_informer on a busy namespace and upping the default Informer timeout to 5 minutes (more standard it looks like by inspecting kubernetes code):

diff --git a/src/api/informer.rs b/src/api/informer.rs
index fe952786..7f0ead47 100644
--- a/src/api/informer.rs
+++ b/src/api/informer.rs
@@ -44,7 +44,7 @@ where
         Informer {
             client: r.client,
             resource: r.api,
-            params: ListParams::default(),
+            params: ListParams { timeout: Some(60*5), ..ListParams::default() },
             events: Arc::new(RwLock::new(VecDeque::new())),
             version: Arc::new(RwLock::new(0.to_string())),
         }

It seems to be running super smoothly!
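For reference, roughly the same effect should be achievable from the caller side without patching the crate default. This is only a sketch: the timeout builder method is the one used later in this thread, and the client/config setup (load_kube_config, APIClient) is assumed from the examples of that era rather than taken from this PR.

```rust
use kube::{api::{Api, Informer}, client::APIClient, config};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Assumed era API: load the local kubeconfig and build a client
    let cfg = config::load_kube_config().await?;
    let client = APIClient::new(cfg);

    let resource = Api::v1Pod(client).within("default");
    // Caller-side 5-minute watch timeout instead of changing ListParams::default()
    let _inf = Informer::new(resource).timeout(60 * 5).init().await?;
    // ...then drive _inf.poll() in a loop as in the pod_informer example
    Ok(())
}
```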
This MIGHT be fixed by a trim, because it looks like we have 3 leading whitespace characters for some reason. It might also be because ...
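If it really is just stray whitespace on the wire, a trim before deserializing each watch line is all it takes. A tiny stand-alone illustration of the idea (the struct and JSON here are made up, not kube's actual types):

```rust
use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct RawEvent {
    kind: String,
}

fn parse_line(raw: &str) -> serde_json::Result<RawEvent> {
    // Trim stray leading/trailing whitespace before deserializing the watch line
    serde_json::from_str(raw.trim())
}

fn main() {
    // Three leading spaces, as observed in the broken case
    let noisy = "   {\"kind\": \"Pod\"}";
    println!("{:?}", parse_line(noisy).unwrap());
}
```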
"Been stress testing the pod_informer on a busy namespace and upping the default Informer timeout to 5 minutes (more standard it looks like by inspecting kubernetes code)"
Do you want me to include this change in the PR?
Don't worry about that, I need to find some sensible numbers across the crate anyway.
comments
Have been testing the ... We need something like this:

diff --git a/src/api/informer.rs b/src/api/informer.rs
index fe952786..10cea2da 100644
--- a/src/api/informer.rs
+++ b/src/api/informer.rs
@@ -149,7 +149,10 @@ where
             Ok(WatchEvent::Added(o))
             | Ok(WatchEvent::Modified(o))
             | Ok(WatchEvent::Deleted(o)) => o.meta().resourceVersion.clone(),
-            _ => None,
+            Ok(WatchEvent::Error(e)) => {
+                // TODO: trigger self.reset() here somehow
+                // annoying because it requires await at this point
+                None
+            },
         };
         // Update our version need be
Some more suggestions, but it doesn't seem to crash anymore.
The only real thing remaining here is seeing if we can do anything about automatic rewatching: #92 (comment). Do you want to have a look at that or shall I investigate?
Nice work. I'm impressed you've managed to thread this through this way, although I have left some thoughts on exposing the error event. I think we might be able to simplify the interface a bit.
src/api/informer.rs (outdated)

// Sleep for 10s before allowing a retry
let dur = Duration::from_secs(10);
Delay::new(dur).await;
}
Maybe we can avoid this convoluted structure and do this at the beginning of Informer::poll if needs_resync is true?
That's a lot simpler, thanks! I've updated the code as such. Now when we run into an error that needs resetting, we'll mark the needs_resync flag, and then on the next call to poll we will wait a bit and reset our resource version. I appreciate all the code review; I know it's a lot of work, but I've definitely learned a lot working on this 🔥 🦀
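To make that flow concrete, here is a self-contained toy (not the actual kube code; ToyInformer and its fields are made up for illustration) showing the pattern: an error marks a shared needs_resync flag, and the next poll backs off and resets the stored resource version before watching again.

```rust
use std::sync::{Arc, RwLock};
use std::time::Duration;
use futures_timer::Delay;

struct ToyInformer {
    needs_resync: Arc<RwLock<bool>>,
    version: Arc<RwLock<String>>,
}

impl ToyInformer {
    async fn poll(&self) {
        // Check the flag set by the previous poll's error handling
        let resync = *self.needs_resync.read().unwrap();
        if resync {
            // Back off briefly so we don't hammer the apiserver (the real code waits longer)
            Delay::new(Duration::from_millis(100)).await;
            // Reset to a fresh resourceVersion (here just cleared)
            *self.version.write().unwrap() = "0".to_string();
            *self.needs_resync.write().unwrap() = false;
        }
        // ...start the watch from `version`; on a 410 Gone error, set
        // `needs_resync` to true so the *next* call performs the reset.
    }
}

fn main() {
    let inf = ToyInformer {
        needs_resync: Arc::new(RwLock::new(true)),
        version: Arc::new(RwLock::new("12345".to_string())),
    };
    futures::executor::block_on(inf.poll());
    assert_eq!(*inf.version.read().unwrap(), "0");
}
```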
Here's a small change to at least handle the 410 Gone correctly:

diff --git a/examples/pod_informer.rs b/examples/pod_informer.rs
index 7da0fcdb..a660b63b 100644
--- a/examples/pod_informer.rs
+++ b/examples/pod_informer.rs
@@ -19,7 +19,7 @@ async fn main() -> anyhow::Result<()> {
     let namespace = env::var("NAMESPACE").unwrap_or("default".into());
     let resource = Api::v1Pod(client.clone()).within(&namespace);
-    let inf = Informer::new(resource.clone()).init().await?;
+    let inf = Informer::new(resource.clone()).timeout(300).init_from("212710302".into());
     // Here we both poll and reconcile based on events from the main thread
     // If you run this next to actix-web (say), spawn a thread and pass `inf` as app state
diff --git a/src/api/informer.rs b/src/api/informer.rs
index de4dfd09..e0417b4d 100644
--- a/src/api/informer.rs
+++ b/src/api/informer.rs
@@ -164,18 +164,24 @@ where
             Ok(WatchEvent::Added(o))
             | Ok(WatchEvent::Modified(o))
             | Ok(WatchEvent::Deleted(o)) => o.meta().resourceVersion.clone(),
-            _ => None,
+            Ok(WatchEvent::Error(e)) => {
+                if e.code == 410 {
+                    warn!("Stream desynced: {:?}", e);
+                    *needs_resync.write().unwrap() = true; // resync
+                }
+                None
+            },
+            Err(e) => {
+                warn!("Unexpected watch error: {:?}", e);
+                None
+            },
         };
-        // If we hit an error, mark that we need to resync on the next call
-        if let Err(e) = &event {
-            warn!("Poll error: {:?}", e);
-            *needs_resync.write().unwrap() = true;
-        }
         // Update our version need be
         // Follow docs conventions and store the last resourceVersion
         // https://kubernetes.io/docs/reference/using-api/api-concepts/#efficient-detection-of-changes
-        else if let Some(nv) = new_version {
+        if let Some(nv) = new_version {
             *version.write().unwrap() = nv;
         }

(the pod_informer example is only there for me to test it)
I think we are good now? Everything appears to be working, no big shortcomings with the interface AFAICT. It's possible we can make reflector examples a little better by spawning ... You happy to merge?
Sounds good to me 🎉
Ok, I will prepare a release! But first: thank you so much for all your help with this! This is probably the most significant contribution I have received in the open source world, so I really appreciate it.
Published in [email protected] 🎉
Altered the client's request_events method to return a Stream and updated dependent code. #83
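As a closing illustration of what the new return shape means for consumers, here is a small stand-alone sketch of draining a Stream<Item = Result<_, _>>; the event and error types are stand-ins, not the crate's WatchEvent.

```rust
use futures::{executor::block_on, stream::{self, StreamExt}};

// Stand-in for kube's WatchEvent<T>
#[derive(Debug)]
enum Event {
    Added(String),
    Error(u16),
}

fn main() {
    block_on(async {
        // Stand-in for the stream now returned by request_events
        let mut events = stream::iter(vec![
            Ok(Event::Added("pod-a".into())),
            Ok(Event::Error(410)),
            Err("connection reset".to_string()),
        ]);
        while let Some(item) = events.next().await {
            match item {
                // A 410 Gone means the watch is desynced; the informer marks needs_resync
                Ok(Event::Error(410)) => eprintln!("desynced, time to resync"),
                Ok(ev) => println!("event: {:?}", ev),
                Err(e) => eprintln!("stream error: {}", e),
            }
        }
    });
}
```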