Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

network limit tbf #350

Merged
merged 47 commits into from
Mar 27, 2020
Merged

network limit tbf #350

merged 47 commits into from
Mar 27, 2020

Conversation

vincent178
Copy link
Contributor

@vincent178 vincent178 commented Mar 18, 2020

What problem does this PR solve?

This PR is adding bandwidth limit feature for network chaos which specify in #303, which use tbf qdisc under the hood.

What is changed and how does it work?

  • proto add message SetTbf and DeleteTbf
  • NetworkChaosSpec add BandwidthSpec
  • Add tbf controller which reconciler when k8s config
  • chaosdaemon implement SetTbf and DeleteTbf grpc function
  • Add related document
  • Add related example

Some notable things I would like to work next is

  1. refactor networkchaos netem and tbf with a more general implementation, which is related to 2
  2. make netem and tbf work together

Check List

Tests

  • Unit test
  • Manual test (add detailed scripts or steps below)

manually run kubectl apply -f network-chaos.yaml and kubectl delete -f network-chaos.yaml below.

apiVersion: pingcap.com/v1alpha1
kind: NetworkChaos
metadata:
  name: tbf-test
  namespace: default
spec:
  action: bandwidth
  mode: one
  selector:
    labelSelectors:
      "app": "tbf-test"
  bandwidth:
    rate: 10000kbps
    limit: 1000
    buffer: 10000

screenshot below
Screen Shot 2020-03-24 at 4 19 21 PM

Code changes

  • Has Go code change

Side effects

n/a

Related changes

  • Need to update the documentation
    done

Does this PR introduce a user-facing change?:

Add new action bandwidth in network chaos to support bandwidth limit.

@CLAassistant
Copy link

CLAassistant commented Mar 18, 2020

CLA assistant check
All committers have signed the CLA.

@codecov-io
Copy link

codecov-io commented Mar 19, 2020

Codecov Report

Merging #350 into master will not change coverage by %.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #350   +/-   ##
=======================================
  Coverage   58.63%   58.63%           
=======================================
  Files          60       60           
  Lines        3626     3626           
=======================================
  Hits         2126     2126           
  Misses       1332     1332           
  Partials      168      168           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8ed2d27...8ed2d27. Read the comment docs.

@vincent178 vincent178 force-pushed the network-limit-tbf branch 4 times, most recently from b523ffc to 2004080 Compare March 23, 2020 23:42
@vincent178 vincent178 changed the title [WIP] network limit tbf network limit tbf Mar 24, 2020
Copy link
Member

@YangKeao YangKeao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM.

@cwen0 PTAL

@hound hound bot deleted a comment from YangKeao Mar 24, 2020
@YangKeao
Copy link
Member

Please also document that minikube doesn't support this feature, as CONFIG_NET_SCH_TBF was disabled in minikube's image.

I will try to add this kernel option for minikube later.

YangKeao
YangKeao previously approved these changes Mar 26, 2020
Copy link
Member

@YangKeao YangKeao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cwen0 PTAL

Copy link
Member

@cwen0 cwen0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

"app.kubernetes.io/component": "tikv"
bandwidth:
rate: 100kbps
limit: 100
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will happen if we set this value to "0"?

Copy link
Contributor Author

@vincent178 vincent178 Mar 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, string "0" will be fail for type checking error, and int 0 will fail to apply qdisc as limit is required for tbf (in the background). One thing could improve this is add a validation before actually apply this.

}

// TODO(vincent178): this is ok if we only apply either netem or tbf, revisit this if we want both working
func qdiscExists(qdisc netlink.Qdisc, handler *netlink.Handle, link netlink.Link) (bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some unit tests for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function is simple and straightforward, but I do find the comment is not valid anymore, so I remove it. Thank you!

@AngleNet
Copy link
Contributor

@vincent178 Is this PR ready for another round of review now ?

@vincent178
Copy link
Contributor Author

@AngleNet I'm adding validation for api object, should be quick though. Will let you guys know after that.

@vincent178
Copy link
Contributor Author

$ kubectl apply -f network-chaos.yaml
The NetworkChaos "tbf-test" is invalid:
* spec.bandwidth.buffer: Invalid value: 1: spec.bandwidth.buffer in body should be greater than or equal to 1
* spec.bandwidth.limit: Invalid value: 1: spec.bandwidth.limit in body should be greater than or equal to 1

@vincent178
Copy link
Contributor Author

Hi @AngleNet @YangKeao @cwen0 the code is ready for review now. Thanks

Copy link
Member

@cwen0 cwen0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cwen0
Copy link
Member

cwen0 commented Mar 27, 2020

@YangKeao @fewdan PTAL

Copy link
Member

@YangKeao YangKeao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@YangKeao
Copy link
Member

/merge

@sre-bot
Copy link
Contributor

sre-bot commented Mar 27, 2020

/run-all-tests

return nil, mockError("SetTbf")
}

func (c *MockChaosDaemonClient) DeleteTbf(ctx context.Context, in *chaosdaemon.TbfRequest, opts ...grpc.CallOption) (*empty.Empty, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported method MockChaosDaemonClient.DeleteTbf should have comment or be unexported

@@ -43,6 +43,14 @@ func (c *MockChaosDaemonClient) DeleteNetem(ctx context.Context, in *chaosdaemon
return nil, mockError("DeleteNetem")
}

func (c *MockChaosDaemonClient) SetTbf(ctx context.Context, in *chaosdaemon.TbfRequest, opts ...grpc.CallOption) (*empty.Empty, error) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported method MockChaosDaemonClient.SetTbf should have comment or be unexported

networkTbfActionMsg = "network tbf action duration %s"
)

type Reconciler struct {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported type Reconciler should have comment or be unexported

@sre-bot
Copy link
Contributor

sre-bot commented Mar 27, 2020

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Mar 27, 2020

@vincent178 merge failed.

@YangKeao YangKeao merged commit e0bfd44 into chaos-mesh:master Mar 27, 2020
sjwsl pushed a commit to sjwsl/chaos-mesh that referenced this pull request May 6, 2021
* proto and tbf server apply logic

* refactor common logic to netlink_linux and apply it in tbf_server

* [temp] remove limit in networkchaos

* make fmt and make yaml

* fix var

* add mock implementation

* fix missing grpc option

* rough implementation of API spec and controller

* implement InnerReconciler

* make fmt

* rename to bandwidth

* make

* fix api zero value parse issue

* fix remove tbf issue

* adjust type for bandwidth

* commit yaml and test

* fix tbf server test

* add unit support for rate

* add network bandwidth example

* add documentation

* make yaml

* make fmt

* update pb

* make yaml

* update log

* refactor netem with toQdiscFunc

* update based on comments

* update based on comments

* optional

* fix nil pointer dereference

* fix typo

* update document

* update document

* update based on comments

* add more doc

* add unit test and add license comment

* add comments for tbf

* fix comment

* add license comment

* fix ut

* add validation

* update licensed

* update code

* Revert "update licensed"

This reverts commit f7877b7.

* update license file to 2020

Co-authored-by: pingcap-github-bot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants