Bug or Wrong documentation for --output-filename option #7095

Closed
ericch1 opened this issue Oct 17, 2019 · 8 comments · Fixed by #7162


@ericch1

ericch1 commented Oct 17, 2019

Hi,

In the documentation of mpiexec, the option "-output-filename, --output-filename" is described as:

Redirect the stdout, stderr, and stddiag of all processes to a process-unique version of the specified filename. Any directories in the filename will automatically be created. Each output file will consist of filename.id, where the id will be the processes’ rank in MPI_COMM_WORLD, left-filled with zero’s for correct ordering in listings. A relative path value will be converted to an absolute path based on the cwd where mpirun is executed. Note that this will not work on environments where the file system on compute nodes differs from that where mpirun is executed.

Refs:
https://www.open-mpi.org/doc/v4.0/man1/mpiexec.1.php
https://www.open-mpi.org/doc/v3.1/man1/mpiexec.1.php
https://www.open-mpi.org/doc/v3.0/man1/mpiexec.1.php
https://www.open-mpi.org/doc/v2.1/man1/mpiexec.1.php
https://www.open-mpi.org/doc/v2.0/man1/mpiexec.1.php
https://www.open-mpi.org/doc/v1.10/man1/mpiexec.1.php

However, as of versions 3.x and 4.x, no such file is generated; a directory with the given name is created instead.
Is this a bug, or is the documentation wrong? Is there a way to get the same behavior as in 2.x and earlier?

To reproduce:

[eric@lorien] bug (master $ u=)> mpiexec -n 1 --output-filename out.txt echo "Hi"
Hi
[eric@lorien] bug (master $ u=)> ls -latr
total 8
drwxr-xr-x 18 eric giref 4096 Oct 17 09:59 ..
drwx------  3 eric giref   15 Oct 17 10:00 out.txt
drwx------  3 eric giref   21 Oct 17 10:00 .
[eric@lorien] bug (master $ u=)> ls -la out.txt/
total 0
drwx------ 3 eric giref 15 Oct 17 10:00 .
drwx------ 3 eric giref 21 Oct 17 10:00 ..
drwx------ 3 eric giref 20 Oct 17 10:00 1
[eric@lorien] bug (master $ u=)> ls -la out.txt/1/
total 0
drwx------ 3 eric giref 20 Oct 17 10:00 .
drwx------ 3 eric giref 15 Oct 17 10:00 ..
drwxr-x--- 2 eric giref 34 Oct 17 10:00 rank.0
[eric@lorien] bug (master $ u=)> ls -la out.txt/1/rank.0/
total 4
drwxr-x--- 2 eric giref 34 Oct 17 10:00 .
drwx------ 3 eric giref 20 Oct 17 10:00 ..
-rw------- 1 eric giref  0 Oct 17 10:00 stderr
-rw------- 1 eric giref  3 Oct 17 10:00 stdout
[eric@lorien] bug (master $ u=)> mpiexec --version
mpiexec (OpenRTE) 4.0.1

Report bugs to http://www.open-mpi.org/community/help/
@jsquyres
Member

@ericch1 You are absolutely correct. I'm sure the behavior was exactly as described in mpiexec.1 at some point, but later we changed the behavior and then neglected to update mpiexec.1. ☹️

Thanks for bringing this to our attention!

I may have someone for whom this would be a nice simple way to introduce themselves to the Open MPI code base / git / github pull requests / etc...

@ericch1
Author

ericch1 commented Oct 18, 2019

Is there a way for a user to configure it to get the same behavior as before?

Thanks for the information Jeff.

@jsquyres
Member

I'm guessing we totally removed the old functionality because of the dichotomy of --output-filename foo vs. the fact that there are 2 outputs: stdout and stderr.

Is the current functionality ok? Or is there a different way you'd like to see it?

@ericch1
Author

ericch1 commented Oct 18, 2019

I really like the old way it was done: It gave a simple list of files to explore/grep from.

I won't ask you to change the current behavior because of my preferences: I expect there are good reasons why it changed...

But I was totally confused about the non-documented change, so I first thought it was a bug with slurm, then a bug with OpenMPI...

Now I know what to do to have all the files renamed and moved to replicate the old behavior...

I would suggest renaming the option to --output-directory, since the current name of the option is really confusing.
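
For anyone else who wants the old flat list of files, the renaming and moving mentioned above could be sketched roughly as follows. This is only an illustration: it assumes the layout shown in the listing earlier in this issue (out.txt/1/rank.0/stdout and stderr) and a single job directory. The function name flatten_output and the prefix argument are made up for this example and are not part of Open MPI. It writes stdout followed by stderr into each merged file, so the original interleaving of the two streams is lost, and stddiag is ignored.

```python
from pathlib import Path


def flatten_output(out_dir, prefix):
    """Merge <out_dir>/<job>/rank.<N>/{stdout,stderr} into <prefix>.<N> files.

    Hypothetical helper: assumes the directory layout seen with
    mpiexec 4.0.1 in this issue and a single job subdirectory.
    """
    out_dir = Path(out_dir)
    # Collect (rank, directory) pairs from every job subdirectory.
    rank_dirs = []
    for rank_dir in out_dir.glob("*/rank.*"):
        suffix = rank_dir.name.split(".", 1)[1]
        if rank_dir.is_dir() and suffix.isdigit():
            rank_dirs.append((int(suffix), rank_dir))
    if not rank_dirs:
        return []
    rank_dirs.sort()
    # Left-fill ranks with zeros so the files sort correctly in listings.
    width = len(str(rank_dirs[-1][0]))
    written = []
    for rank, rank_dir in rank_dirs:
        target = Path(f"{prefix}.{rank:0{width}d}")
        with target.open("wb") as merged:
            # stdout first, then stderr; the original interleaving is lost.
            for stream in ("stdout", "stderr"):
                source = rank_dir / stream
                if source.is_file():
                    merged.write(source.read_bytes())
        written.append(target)
    return written
```

Calling flatten_output("out.txt", "flat") after the run shown above would be expected to leave a single file flat.0 next to the out.txt directory; the original directory tree is left untouched, so it can be removed separately once the merged files look right.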

@gnreeke

gnreeke commented Oct 30, 2019

I also would very much like to have the old behavior available as an option. I went to a lot of trouble to code my application so all output is compiled to a named file written from rank 0 and I want to direct it somewhere with no added characters in the file name, no new directories that I would have to delete, etc. When a run fails, it is fine when stderr comes to my login screen, but any other treatment separate from or merged into stdout is OK with me.

@MaxSagebaum

I ran into the same problem and did some digging. I am using Fedora (5.3.8-200.fc30.x86_6).

The man page gives me the description of the old behaviour.
The mpirun --help output gives me the description of the new behaviour.

A quick grep on the master branch gives two locations where 'output-filename' is declared as a parameter.
./orte/orted/orted_main.c:

    { "orte_output_filename", '\0', "output-filename", "output-filename", 1,
      NULL, OPAL_CMD_LINE_TYPE_STRING,
      "Redirect output from application processes into filename.rank" },

./orte/mca/schizo/ompi/schizo_ompi.c:

    { "orte_output_filename", '\0', "output-filename", "output-filename", 1,
      &orte_cmd_options.output_filename, OPAL_CMD_LINE_TYPE_STRING,
      "Redirect output from application processes into filename/job/rank/std[out,err,diag]. A relative path value will be converted to an absolute path",
      OPAL_CMD_LINE_OTYPE_OUTPUT },

Can these two behaviours be selected with some MCA parameters, or is the version in ./orte/orted/orted_main.c dead code or just wrong documentation?

@rhc54
Contributor

rhc54 commented Nov 13, 2019

I'm afraid not - as indicated, the old behavior is currently not available. It isn't dead code anywhere - it is simply a case of the documentation not being updated to indicate that the behavior has changed.

@ericch1
Author

ericch1 commented Nov 13, 2019

Cool! Both features!!!!
Thanks a lot! :)

e-kwsm added a commit to e-kwsm/ompi that referenced this issue Nov 4, 2022
e-kwsm added a commit to e-kwsm/ompi that referenced this issue Nov 14, 2022