DAOS-17137 doc: Complete self_test doc #15951
base: master
Ticket title is 'Complete self_test documentation'
701c79f to 7b19c41
Add section explaining how to use self_test without a daos_agent.

Doc-only: true
Signed-off-by: Cedric Koch-Hofer <[email protected]>
7b19c41 to 87e2448
minor comments inline
docs/admin/performance_tuning.md
Outdated
By default, `self_test` will use the network interface selected by the agent.
This can be forced by setting the `OFI_INTERFACE` and `OFI_DOMAIN` environment
variables manually. e.g. export `OFI_INTERFACE=eth0; export OFI_DOMAIN=eth0`
or `export OFI_INTERFACE=ib0; export OFI_DOMAIN=mlx5_0`
These mentions should be replaced with `D_INTERFACE`/`D_DOMAIN`. We deprecated the `OFI_*` prefixes for our environment variables long ago, but it looks like the mentions here were never updated.
- Replace `OFI_INTERFACE` and `OFI_DOMAIN` respectively with `D_INTERFACE` and `D_DOMAIN`
Fixed with commit 5b9f964.
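With the renamed variables, the two examples from the reviewed paragraph would read roughly as follows. This is only a sketch: the interface and domain values (`eth0`, `ib0`, `mlx5_0`) are copied from the reviewed text and stand in for whatever the host actually exposes. Use one pair or the other, not both.

```shell
# Ethernet example, using the D_* names instead of the deprecated OFI_* ones.
export D_INTERFACE=eth0
export D_DOMAIN=eth0

# InfiniBand example (values copied from the reviewed text; adjust to the host).
# Running both pairs in one shell leaves only this second pair in effect.
export D_INTERFACE=ib0
export D_DOMAIN=mlx5_0
```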
docs/admin/performance_tuning.md
Outdated
- line 1: the DAOS System Name
- line 2: The number of ranks
- line 3: "all" or "self"
I believe `all` vs `self` makes no difference when we parse the file; it is there mainly for information, in case anyone reads the file by hand.
- Makes line 3 optional
Fixed with commit 5b9f964.
docs/admin/performance_tuning.md
Outdated
- line 3: "all" or "self"
  - "all" means dump all ranks' CaRT uri
  - "self" means only dump this rank's CaRT uri
- line 4 to #ranks: the list of ranks id and their CaRT uri (e.g. `ucx+dc_mlx5://10.6.4.104:32416`)
nitpick: technically it's line 4 to line (4 + #ranks - 1) :)
- Fix the line interval
Fixed with commit 5b9f964.
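The corrected interval (line 4 to line 4 + #ranks - 1) can be checked mechanically. The function below is a minimal sketch, not part of DAOS: `check_attach_info` is a hypothetical helper name, and it assumes the layout described in this thread with line 3 present (system name, number of ranks, "all" or "self", then one line per rank). If line 3 is omitted, as the fix allows, the offsets would shift by one.

```shell
# Hypothetical helper: verify that an attach-info file ends exactly at the
# last expected rank line. Assumes line 3 ("all"/"self") is present.
check_attach_info() {
    file=$1
    nranks=$(sed -n '2p' "$file")          # line 2: number of ranks

    # Reject an empty or non-numeric rank count.
    case $nranks in
        '' | *[!0-9]*) echo "invalid rank count" >&2; return 2 ;;
    esac

    first=4                                # first rank line
    last=$((first + nranks - 1))           # last rank line
    total=$(awk 'END { print NR }' "$file")

    echo "rank lines: ${first}-${last}"
    [ "$total" -eq "$last" ]               # success only if the counts agree
}
```

Called as `check_attach_info <file>`, it prints the expected interval and returns success only when the file has exactly that many lines.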
docs/admin/performance_tuning.md
Outdated
3 ucx+dc_mlx5://10.6.4.5:32416
```

This configuration file can be generated thanks to the `daos_agent` sub-command `dum-attachinfo`
typo, should be dump-attachinfo
- Fix typo: dum-attachinfo -> dump-attachinfo
Fixed with commit 5b9f964.
docs/admin/performance_tuning.md
Outdated
log_file: /root/ckochhof/daos_agent-self_test.log
```

Pinging the SWIM service of the rank 0 without a DAOS agent could be done in the following way:
Rank 0, tag=1 to be precise; or you can change the endpoint below to just `--endpoint 0` instead of `--endpoint 0:1`.
- Add tag ID of the SWIM service.
Not sure I have fully understood your comment. It should be fixed with commit 5b9f964.
docs/admin/performance_tuning.md
Outdated
- `D_PROVIDER` defines mercury NA plugin and transport to be used (e.g. `ofi+verbs;ofi_rxm`)
- `D_INTERFACE` defines network device name to be used (e.g. `ib0`)
- `D_DOMAIN` defines the network name to be used (e.g. `mlx5_0:1`)
I believe that nowadays `D_DOMAIN` is optional in almost all cases: providing the provider and interface is sufficient for mercury to figure out the proper domain on its own.
- Remove useless `D_DOMAIN` environment variable
Fixed with commit 5b9f964.
`D_INTERFACE` and `D_DOMAIN` are working as expected.
docs/admin/performance_tuning.md
Outdated
$ self_test -u --group-name daos_server --endpoint 0-<MAX_SERVER-1>:0 \
--master-endpoint 0-<MAX_RANK>:0-<MAX_TAG> \
$ self_test --use-daos-agent-env --group-name daos_server \
--endpoint 0-<MAX_SERVER-1>:0 --master-endpoint 0-<MAX_RANK>:0-<MAX_TAG> \
You already use `MAX_RANK` when describing `--master-endpoint`. Should `--endpoint` also be described using `MAX_RANK` instead of `MAX_SERVER-1`?
- Replace `MAX_SERVER` with `MAX_RANK`
Fixed with commit 5b9f964.
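To make the `rank:tag` range notation concrete, here is a small sketch of how the two endpoint arguments could be assembled once `MAX_SERVER-1` is replaced by `MAX_RANK`. The values of `MAX_RANK` and `MAX_TAG` are made up for illustration, and the snippet only prints the options rather than invoking `self_test`, since the full command line is not shown in this thread.

```shell
# Hypothetical values standing in for <MAX_RANK> and <MAX_TAG>;
# set them to match the system under test.
MAX_RANK=3
MAX_TAG=1

# Both ranges use the same upper rank bound after the fix.
endpoint="0-${MAX_RANK}:0"
master_endpoint="0-${MAX_RANK}:0-${MAX_TAG}"

# Print the options only; self_test itself is not run here.
echo "--endpoint ${endpoint} --master-endpoint ${master_endpoint}"
```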
Fix reviewers comments:
- Replace OFI_INTERFACE and OFI_DOMAIN respectively with D_INTERFACE and D_DOMAIN
- Makes line 3 optional
- Fix the line interval
- Fix typo: dum-attachinfo -> dump-attachinfo
- Add tag ID of the SWIM service
- Remove useless D_DOMAIN environment variable

Doc-only: true
Signed-off-by: Cedric Koch-Hofer <[email protected]>
Description
Add section explaining how to use `self_test` without a `daos_agent`.
Doc-only: true
Before requesting gatekeeper:
- `Features:` (or `Test-tag*`) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: