Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Program nexthop table as a ternary match. #59

Merged
merged 3 commits into from
Oct 11, 2023
Merged

Conversation

n-sandeep
Copy link
Contributor

Earlier, nexthop table was exact match, due to LAG addition, nexthop table is moved to WCM block, ternary match. Changes in krnlmon is made to program nexthop table as ternary match

Earlier, nexthop table was exact match, due to LAG addition,
nexthop table is moved to WCM block, ternary match.
Changes in krnlmon is made to program nexthop table as ternary match

Signed-off-by: Sandeep N <[email protected]>
Copy link

@vsureshkumarp vsureshkumarp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes looks good to me.

Copy link
Contributor

@nupuruttarwar nupuruttarwar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@aashishkuma aashishkuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is already part of the LAG PR.
#54

Copy link
Contributor

@ffoulkes ffoulkes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

tdi_key_field_set_value_and_mask() is an existing function, so this change should be backward compatible with older versions of the SDE.

Copy link
Contributor

@ffoulkes ffoulkes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afterthought: the commit comment says, "nexthop table is moved to WCM block, ternary match." Does this mean that this change is dependent on a change in the P4 program?

If so, and assuming that tdi_key_field_set_value_and_mask() doesn't take care of this internally, we have a design issue: tight coupling between implementation details of a specific P4 program and the Kernel Monitor source code.

What will happen if there's a mismatch? Will the user receive a comprehensible error message? If so, how will they receive it?

@ffoulkes ffoulkes added the question Further information is requested label Oct 9, 2023
@n-sandeep
Copy link
Contributor Author

tdi_key_field_set_value_and_mask

yes, this change is added in LAG PR as well. But we have an HSDES for LNW feature, because of overlay ping failure.
Need to fix the issue seperately, as LAG PR is taking some more time to get merged.

@n-sandeep
Copy link
Contributor Author

Afterthought: the commit comment says, "nexthop table is moved to WCM block, ternary match." Does this mean that this change is dependent on a change in the P4 program?

If so, and assuming that tdi_key_field_set_value_and_mask() doesn't take care of this internally, we have a design issue: tight coupling between implementation details of a specific P4 program and the Kernel Monitor source code.

What will happen if there's a mismatch? Will the user receive a comprehensible error message? If so, how will they receive it?

Yes Derek, since there is a change in p4 program, where nexthop table has been moved to WCM block from SEM block (exact match).
TDI provides seperate API's for each block. Since Ternary/WCM needs a MASK to be populated we need to use a newer API which can take mask as a parameter.
I believe TDI API cannot be same for both ternay and exact match, as exact match doesnt require any mask, but ternary needs a mask that is defined by user. In this case mask can be anything in the 16 bits defined.

If user tries to program a ternary match type with TDI API other than the one used in the PR, we will see an TDI error which comes back to krnlmon and entry doesnt get programmed.

@ffoulkes
Copy link
Contributor

ffoulkes commented Oct 9, 2023

@n-sandeep
I don't have a problem with there being separate API calls. My concern is that:

  1. Once this update is committed, P4CP will no longer work with older versions of the P4 Program.
  2. In the event of a mismatch, the user won't get meaningful feedback that helps them understand what the problem is.

@ffoulkes ffoulkes closed this Oct 9, 2023
@ffoulkes ffoulkes reopened this Oct 9, 2023
@n-sandeep
Copy link
Contributor Author

@n-sandeep I don't have a problem with there being separate API calls. My concern is that:

  1. Once this update is committed, P4CP will no longer work with older versions of the P4 Program.
  2. In the event of a mismatch, the user won't get meaningful feedback that helps them understand what the problem is.

You are right Derek, with the current TDI API mechanism, we cannot have a backward compatability as the match type in p4 program and TDI API are tightly coupled. When match type changes, we need to modify/use new TDI API to program the entry.

Atleast P4CP doesn't have control over this, how do we achieve backward compatability ?

@ffoulkes
Copy link
Contributor

ffoulkes commented Oct 10, 2023

At least P4CP doesn't have control over this, how do we achieve backward compatibility ?

  1. We add support for the new action, while retaining support for the old action. (Open Closed Principle)
  2. We introduce a conditional that Kernel Monitor can use to choose between the two actions.
  3. We provide a mechanism that can be used to specify which of the two actions to take.
  4. We add a new conditional whenever a new decision needs to be made.

In a nutshell, Kernel Monitor should be configurable, just like the pipeline.

The quick-and-dirty approach is to introduce a new C header file that #defines symbols specifying compile-time conditionals. You'd have to edit the file and rebuild krnlmon to apply the change, but at least it wouldn't be baked in.

Next step up would be introduce a configuration file that is read at startup time (or injected via gRPC) and used to set run-time conditionals. We could include an appropriate configuration file in the same directory as the P4 program.

Better still would be to use annotations in the P4 program to specify the appropriate metadata.

@n-sandeep
Copy link
Contributor Author

At least P4CP doesn't have control over this, how do we achieve backward compatibility ?

  1. We add support for the new action, while retaining support for the old action. (Open Closed Principle)
  2. We introduce a conditional that Kernel Monitor can use to choose between the two actions.
  3. We provide a mechanism that can be used to specify which of the two actions to take.
  4. We add a new conditional whenever a new decision needs to be made.

In a nutshell, Kernel Monitor should be configurable, just like the pipeline.

The quick-and-dirty approach is to introduce a new C header file that #defines symbols specifying compile-time conditionals. You'd have to edit the file and rebuild krnlmon to apply the change, but at least it wouldn't be baked in.

Next step up would be introduce a configuration file that is read at startup time (or injected via gRPC) and used to set run-time conditionals. We could include an appropriate configuration file in the same directory as the P4 program.

Better still would be to use annotations in the P4 program to specify the appropriate metadata.

Yes Derek, annotations will come as part of next releases. we can see if such backward compatability achieveed with annotations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the short term:

  1. Add #define NEXTHOP_TABLE_TERNARY_MATCH 1 at the top of this file.
  2. Instead of replacing the code outright, change it to something like this:
#if NEXTHOP_TABLE_TERNARY_MATCH
    // new p4 program
    status = tdi_key_field_set_value_and_mask(key_hdl, field_id,
                                     (api_nexthop_pd_info->nexthop_handle &
                                     ~(SWITCH_HANDLE_TYPE_NHOP <<
                                     SWITCH_HANDLE_TYPE_SHIFT)), 0xFFFF);
#else
    // legacy p4 program
    status = tdi_key_field_set_value(key_hdl, field_id,
                                     (api_nexthop_pd_info->nexthop_handle &
                                     ~(SWITCH_HANDLE_TYPE_NHOP <<
                                     SWITCH_HANDLE_TYPE_SHIFT)));
#endif
  1. Provide a Git commit comment saying flat-out that this commit is a breaking change. Recap the problem (a change in the P4 program [because...]), say what you changed, and state briefly what the side effects of the change are.
  2. Advertise the fact that there's a breaking change. For starters, I'd add a *Breaking Changes to the top-level krnlmon README.md file. Say what was done and what the user should do about it. Make the change part of this commit.
  3. We'll want to add a Breaking Changes notice to networking-recipe as well, when we update the submodule reference.

@ffoulkes ffoulkes removed the question Further information is requested label Oct 11, 2023
Since there is a change in p4 program, where nexthop table has been
moved to WCM block from SEM block (exact match).
TDI provides seperate API's for each block. Since Ternary/WCM needs
a MASK to be populated we need to use a newer API which can take mask as
a parameter.

This change introduces a MACRO, which defines ternary match MACRO
enablement.
If user wants to pick older p4 program, then user need to modify this
MACRO value to 0 and re-build infrap4d to be compatible with older Exact
match type for nexthop table.

Signed-off-by: Sandeep N <[email protected]>
@n-sandeep n-sandeep force-pushed the nextop_ternary_match branch from 3d7aff0 to 4751c74 Compare October 11, 2023 17:41
Copy link
Contributor

@ffoulkes ffoulkes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

See comment regarding the text to go in the commit comment and the text to go into the README file.

@@ -1,2 +1,15 @@
# ipdk-io/krnlmon
The Kernel Monitor receives RFC 3549 messages from the Linux Kernel over a Netlink socket when changes are made to the kernel networking data structures.

## Breaking changes

Copy link
Contributor

@ffoulkes ffoulkes Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworded for use in this context.

Do we need to say which P4 program?

Reword comment
@ffoulkes ffoulkes merged commit 47c61cc into main Oct 11, 2023
@ffoulkes ffoulkes deleted the nextop_ternary_match branch October 11, 2023 19:21
5abeel pushed a commit that referenced this pull request Oct 12, 2023
The P4 program has been modified to move the NextHop table from
the SEM block (exact match) to the WCM block (ternary match). This
requires a corresponding change in the Kernel Monitor.

This is a breaking change. Newer versions of krnlmon are incompatible
with older versions of the P4 program.

Signed-off-by: Sandeep N <[email protected]>
Co-authored-by: Derek G Foster <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants