refactor(cri): restructure CRI API (improve robustness, clarity and maintainability) #1600
8975a71
to
e76e2e8
Compare
@@ -209,7 +209,8 @@ std::string sinsp_container_manager::container_to_json(const sinsp_container_inf
inet_ntop(AF_INET, &iph, addrbuff, sizeof(addrbuff));
container["ip"] = addrbuff;

- container["cni_json"] = container_info.m_pod_cniresult;
+ container["cni_json"] = container_info.m_pod_sandbox_cniresult;
Clearer variable and function naming throughout.
@@ -327,7 +327,9 @@ class sinsp_container_info
int64_t m_cpu_period;
int32_t m_cpuset_cpu_count;
std::list<container_health_probe> m_health_probes;
- std::string m_pod_cniresult;
+ std::string m_pod_sandbox_id;
+ std::map<std::string, std::string> m_pod_sandbox_labels;
Having m_pod_sandbox_labels natively in the container will optimize the k8s filterchecks and get rid of the overloaded container labels, which confused even us maintainers.
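To illustrate the point about keeping pod labels on the container itself, here is a minimal sketch. The struct is a trimmed-down, hypothetical stand-in for sinsp_container_info, and pod_label is an illustrative helper that does not exist in libsinsp; only the member names m_labels, m_pod_sandbox_id and m_pod_sandbox_labels come from this PR.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical, trimmed-down stand-in for sinsp_container_info.
struct container_info_sketch
{
	std::map<std::string, std::string> m_labels;             // container-level labels
	std::string m_pod_sandbox_id;                            // empty when not part of a pod
	std::map<std::string, std::string> m_pod_sandbox_labels; // pod-level labels
};

// Illustrative helper (not a libsinsp function): resolve a pod label directly
// on the container, without a second lookup of the pod sandbox container.
inline std::optional<std::string> pod_label(const container_info_sketch& c, const std::string& key)
{
	const auto it = c.m_pod_sandbox_labels.find(key);
	if(it == c.m_pod_sandbox_labels.end())
	{
		return std::nullopt;
	}
	return it->second;
}
```

Because pod labels live in their own map, a container label with the same key can no longer shadow or be mistaken for a pod label.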
userspace/libsinsp/cri.h
Outdated
* @param info status.info() Map
* @return Json::Value, can be null
*/
Json::Value get_info_jvalue(const google::protobuf::Map<std::string, std::string> &info);
A common helper reduces code duplication and redundant lookups. We now consistently do this lookup only once per API call, instead of 3-5 times as before. It also helps make the CRI API code base more transparent and clearer. In general, I made the variable and parameter naming more consistent and expressive throughout to improve maintainability going forward.
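The "look up once, reuse everywhere" idea can be sketched as follows. This is not the actual get_info_jvalue implementation: std::map stands in for google::protobuf::Map, parsed_info_sketch stands in for Json::Value, the "info" key is an assumption, and all function names are hypothetical. The counter exists only to make the single lookup observable.

```cpp
#include <map>
#include <string>

// Stand-in for the parsed JSON document; the real helper returns Json::Value.
struct parsed_info_sketch
{
	bool is_null = true;
	std::string raw;
};

// Counts lookups so that the "once per API call" property can be checked.
static int g_info_lookups = 0;

// Hypothetical analogue of get_info_jvalue: find the (assumed) "info" entry once.
inline parsed_info_sketch get_info_once(const std::map<std::string, std::string>& info)
{
	++g_info_lookups;
	parsed_info_sketch out;
	const auto it = info.find("info");
	if(it != info.end())
	{
		out.is_null = false;
		out.raw = it->second;
	}
	return out;
}

// Consumers take the already retrieved value instead of searching the map again.
inline bool has_mounts(const parsed_info_sketch& v)
{
	return !v.is_null && v.raw.find("\"mounts\"") != std::string::npos;
}

inline bool has_env(const parsed_info_sketch& v)
{
	return !v.is_null && v.raw.find("\"env\"") != std::string::npos;
}
```

Each parser that previously searched the map itself would instead receive the value returned by the single call.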
{
template<typename api> bool pod_uses_host_netns(const typename api::PodSandboxStatusResponse &resp)
{
const auto netns = resp.status().linux().namespaces().options().network();
A helper for that was unnecessary API surface. Removed the helper.
@@ -104,7 +95,7 @@ inline sinsp_container_type cri_interface<api>::get_cri_runtime_type() const
}

template<typename api>
- inline grpc::Status cri_interface<api>::get_container_status(const std::string &container_id,
+ inline grpc::Status cri_interface<api>::get_container_status_resp(const std::string &container_id,
Once more, this type of renaming makes the API code base clearer and improves maintainability going forward.
- if(container_info == nullptr || container_info->m_labels.empty())
+ // No m_pod_sandbox_id means no k8s.
+ // m_pod_sandbox_id retrieved from the ContainerStatusResponse CRI API call.
+ if(container_info == nullptr || container_info->m_pod_sandbox_id.empty())
@leogr @Andreagit97 this should satisfy the desired behavior, please confirm.
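For reviewers, the new guard can be read in isolation as the sketch below. The struct and the function extract_pod_cni_json are hypothetical simplifications of the filtercheck code; the real code uses RETURN_EXTRACT_STRING and returns NULL rather than a std::string pointer.

```cpp
#include <string>

// Hypothetical, trimmed-down container record for this sketch.
struct pod_aware_container_sketch
{
	std::string m_pod_sandbox_id;        // empty means the container is not in a k8s pod
	std::string m_pod_sandbox_cniresult; // CNI result JSON, may be empty
};

// Illustrative extraction: a missing container or an empty sandbox id means
// "no k8s", so nothing is extracted for k8s fields.
inline const std::string* extract_pod_cni_json(const pod_aware_container_sketch* container_info)
{
	if(container_info == nullptr || container_info->m_pod_sandbox_id.empty())
	{
		return nullptr;
	}
	return &container_info->m_pod_sandbox_cniresult;
}
```

Compared with the previous check on m_labels.empty(), a container that has labels but no pod sandbox id no longer produces k8s fields.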
return NULL;
}
RETURN_EXTRACT_STRING(container_info->m_pod_cniresult);
// Requires s_cri_extra_queries enabled, which is the default for Falco.
Clear docs.
"log_path": "busybox.0.log", | ||
"linux": { | ||
"resources": { | ||
"metadata": { |
Only adjusted indentation.
@@ -741,4 +690,88 @@ TEST_F(sinsp_with_test_input, container_parser_cri_containerd)
ASSERT_EQ(get_field_as_string(evt, "k8s.pod.ip"), "10.244.0.2");
ASSERT_EQ(get_field_as_string(evt, "k8s.pod.cni.json"), "{\"bridge\":{\"IPConfigs\":null},\"eth0\":{\"IPConfigs\":[{\"Gateway\":\"10.244.0.1\",\"IP\":\"10.244.0.2\"}]}}");
}

+ TEST_F(sinsp_with_test_input, container_parser_cri_containerd_sandbox_container)
Now extra tests for pod sandbox containers.
m_inspector.m_container_manager.add_container(std::move(container_info), init_thread_info);

auto sandbox_container_info = std::make_shared<sinsp_container_info>();
This test now also clearly shows that we have all we need in the container, no need for extra pod sandbox container lookup from the container cache anymore.
@@ -808,6 +785,27 @@ inline bool cri_interface<api>::parse(const libsinsp::cgroup_limits::cgroup_limi
container.m_imagetag.c_str(), container.m_image.c_str(),
container.m_imagedigest.c_str());
}

+ /*
+ * The recent refactor makes full use of PodSandboxStatusResponse, removing the need to access pod sandbox containers
@gnosek highlighting again that I am not introducing extra CRI API calls. They were already there before.
In summary, the design is that each container that is not a pod sandbox container gets one ContainerStatusResponse and one PodSandboxStatusResponse, and we make full use of the PodSandboxStatusResponse once we have it.
Whether to complicate things by caching PodSandboxStatusResponse info in pod sandbox containers, so that the call is made only once per pod, is a different discussion. I would not recommend complicating things for now, but we could think about it in the future.
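The call pattern described above can be sketched with a fake client. Everything here is hypothetical scaffolding (fake_cri_client_sketch, parse_container_sketch and the returned strings); it only shows that parsing one regular container issues exactly one container status call and one pod sandbox status call, with the second result stored on the container.

```cpp
#include <string>

// Hypothetical fake CRI client that records how often each call is made.
struct fake_cri_client_sketch
{
	int container_status_calls = 0;
	int pod_sandbox_status_calls = 0;

	// Returns the pod sandbox id the container belongs to (made-up value).
	std::string container_status(const std::string& container_id)
	{
		++container_status_calls;
		return "sandbox-of-" + container_id;
	}

	// Returns pod-level data for the sandbox (made-up value).
	std::string pod_sandbox_status(const std::string& pod_sandbox_id)
	{
		++pod_sandbox_status_calls;
		return "cni-of-" + pod_sandbox_id;
	}
};

struct parsed_container_sketch
{
	std::string pod_sandbox_id;
	std::string pod_sandbox_cniresult;
};

// One container status call, then one pod sandbox status call; every pod-level
// field is copied onto the container from that single response.
inline parsed_container_sketch parse_container_sketch(fake_cri_client_sketch& cri, const std::string& container_id)
{
	parsed_container_sketch out;
	out.pod_sandbox_id = cri.container_status(container_id);
	out.pod_sandbox_cniresult = cri.pod_sandbox_status(out.pod_sandbox_id);
	return out;
}
```

A per-pod cache would reduce the second call to once per pod, at the cost of the extra bookkeeping this PR deliberately avoids.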
187f557
to
87ab2f3
Compare
{
sandbox_container_id.resize(12);
// Fallback: Retrieve PodSandboxStatusResponse fields stored in pod sandbox container
For maximum robustness, the fields we get from the PodSandboxStatusResponse are fetched from the container itself or, as a fallback, from the pod sandbox container.
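A minimal sketch of that fallback, assuming the container cache is keyed by the 12-character truncated container id (suggested by the resize(12) in the diff). The types and the function pod_cniresult_with_fallback are hypothetical and not the actual filtercheck code.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical cached container record.
struct cached_container_sketch
{
	std::string m_pod_sandbox_id;
	std::string m_pod_sandbox_cniresult;
};

// Assumption: the cache is keyed by the truncated (12 character) container id.
using container_cache_sketch = std::map<std::string, std::shared_ptr<cached_container_sketch>>;

inline std::string pod_cniresult_with_fallback(const cached_container_sketch& container,
                                               const container_cache_sketch& cache)
{
	// Preferred path: the field is already stored on the container itself.
	if(!container.m_pod_sandbox_cniresult.empty())
	{
		return container.m_pod_sandbox_cniresult;
	}
	// Fallback: look up the pod sandbox container by its truncated id.
	std::string sandbox_container_id = container.m_pod_sandbox_id;
	if(sandbox_container_id.size() > 12)
	{
		sandbox_container_id.resize(12);
	}
	const auto it = cache.find(sandbox_container_id);
	if(it != cache.end() && it->second)
	{
		return it->second->m_pod_sandbox_cniresult;
	}
	return "";
}
```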
06aa5b1
to
3e72119
Compare
3e72119
to
decf0e4
Compare
@gnosek may I ask for a first-pass review from you? Thanks a bunch in advance!
If ok with everyone /milestone 0.15.0
decf0e4
to
0184fda
Compare
0184fda
to
f7ba20d
Compare
f7ba20d
to
a17eaba
Compare
@falcosecurity/libs-maintainers kindly checking in on the status of this PR. Thank you.
Please note that this PR has been open for 2 months now without receiving any review whatsoever. During this time, it appears that our focus has been primarily directed towards prioritizing and integrating similarly complex code alterations. Why this PR matters:
PR LGTM, but I am in no way an expert on this part of the code :)
On my side, I will prioritize this once #1595 gets merged (hopefully soon). I still want to confirm that this is targeted for libs 0.15.
Signed-off-by: Melissa Kilby <[email protected]>
Includes minor additional cleanups wrt to comments. Signed-off-by: Melissa Kilby <[email protected]>
a17eaba
to
a121523
Compare
Signed-off-by: Melissa Kilby <[email protected]>
@leogr this PR should be the next one. Rebased and added another commit on top in an attempt to improve comments, naming and code organization even more. If it helps, I am happy to schedule a call to review this PR more interactively, just let me know. Thanks a bunch!
I'm working on it. @therealbobo already did some tests, and everything worked properly (@therealbobo, can you confirm?). cc @falcosecurity/libs-maintainers for visibility
Thank you! While it's a terrible diff, I believe this PR is less risky, plus at least it's backed up by unit tests (acknowledging some limitations with mock CRI response unit tests, but still better than before when we had nothing). Perhaps double-check and pay close attention to all conditional checks etc. Plus, please let me know if the filterchecks fallback mechanisms are ok with you:
@leogr I did! I also see new commits. Let me check! 👀
Only the last commit is new, else conflict-free rebase. The last commit is merely one more renaming and adding more comments pass etc, no changes really.
… time Signed-off-by: Melissa Kilby <[email protected]>
Since we are doing this, let's do it right: I pushed one more commit to be fully consistent in the order of calls in the CRI code base as well (plus a few more comments). Ok to call me crazy on those details, but as said, hopefully we don't need to touch this again for a bit.
👍 Testing this again (in progress).
LGTM label has been added. Git tree hash: 0087a2cc73d9fc24d2e741c5bb350dc12e4c7e33
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: FedeDP, incertum, leogr The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind cleanup
Any specific area of the project related to this PR?
/area libsinsp
Does this PR require a change in the driver versions?
What this PR does / why we need it:
I have added very detailed comments to the code diff re the implemented changes.
At a high level the refactor achieves the following:
Which issue(s) this PR fixes:
#1589
Why this PR matters:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?: