
Persistent collective communication request #2758

Closed
kawashima-fj wants to merge 4 commits

Conversation

kawashima-fj
Member

@kawashima-fj kawashima-fj commented Jan 19, 2017

This PR adds the persistent collective communication request feature proposed in the MPI Forum.

The proposal in the MPI Forum and this work are both in progress. The purposes of this PR are:

  • Declare that Fujitsu is working on this feature and avoid possible duplicated work by others
  • Receive comments about my implementation
  • Enable evaluation of this feature by other people

Once the standardization in the MPI Forum progresses and my implementation completes, I'll update this PR and remove the WIP-DNM label.

How this feature is implemented is described in a comment in ompi/mca/coll/nbpreq/coll_nbpreq.h. This implementation does not focus on performance; performance improvements can be achieved in another COLL component.

Because this feature is not standardized yet, all MPI_*_INIT routines have the prefix MPIX_ instead of MPI_, and you need to include mpi-ext.h in your program to use them. Requests created by MPIX_*_INIT can be passed to the normal MPI_START and MPI_STARTALL.
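
For illustration, here is a minimal usage sketch. It assumes the proposed signature mirrors MPI_Ibcast plus a request argument (a later revision of this PR adds an MPI_Info argument; see below), so treat the exact prototype as provisional:

#include <mpi.h>
#include <mpi-ext.h>  /* MPIX_* prototypes */

void bcast_rounds(int *buf, int count, int niter)
{
    MPI_Request req;
    /* Create the persistent collective request once. */
    MPIX_Bcast_init(buf, count, MPI_INT, 0 /* root */, MPI_COMM_WORLD, &req);
    for (int i = 0; i < niter; i++) {
        MPI_Start(&req);                    /* start one round of the broadcast */
        MPI_Wait(&req, MPI_STATUS_IGNORE);  /* complete it; req remains reusable */
    }
    MPI_Request_free(&req);                 /* release the persistent request */
}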

Any comments are welcome!

Signed-off-by: KAWASHIMA Takahiro [email protected]

@hjelmn
Member

hjelmn commented Jan 23, 2017

Are we ok on the size of ompi_predefined_communicator_t? We overran it when we added non-blocking.

@kawashima-fj
Member Author

@hjelmn Thanks. I've confirmed the size.

sizeof(struct ompi_predefined_communicator_t) is defined to be 1536 bytes on 64-bit architectures and 768 bytes on 32-bit architectures.

#define PREDEFINED_COMMUNICATOR_PAD (sizeof(void*) * 192)

struct ompi_predefined_communicator_t {
    struct ompi_communicator_t comm;
    char padding[PREDEFINED_COMMUNICATOR_PAD - sizeof(ompi_communicator_t)];
};

With the default configure options, I confirmed sizeof(ompi_communicator_t) for this PR's source:

Platform       Configure options   sizeof(ompi_communicator_t)
Linux/x86_64   --disable-debug     1312
Linux/x86_64   --enable-debug      1376
Linux/i386     --disable-debug      700
Linux/i386     --enable-debug       744

All of these fit in ompi_predefined_communicator_t, but the values may differ depending on the platform or configure options:

  • If --with-max-object-name is used and MPI_MAX_OBJECT_NAME (default: 64, maximum: 256) is increased, sizeof(ompi_communicator_t) will increase.
  • If --enable-peruse is used, sizeof(ompi_communicator_t) will increase by sizeof(void *).
  • If sizeof(pthread_mutex_t) is larger than it is on Linux/x86_64 (40) or Linux/i386 (24), sizeof(ompi_communicator_t) will increase.

In particular, increasing MPI_MAX_OBJECT_NAME will overrun the struct. Which approach do you prefer?

  • Increase PREDEFINED_COMMUNICATOR_PAD at some point (this breaks binary compatibility)
  • Change the member c_coll of ompi_communicator_t to a pointer (mca_coll_base_comm_coll_t c_coll -> mca_coll_base_comm_coll_t *c_coll)
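
Whichever option is chosen, a compile-time guard could catch a future overrun early. Here is a minimal sketch (not part of this PR, and assuming a C11-capable compiler for _Static_assert):

/* Hypothetical guard next to the struct definition: fail the build if
 * ompi_communicator_t ever outgrows the reserved padding. */
_Static_assert(sizeof(struct ompi_communicator_t) <= PREDEFINED_COMMUNICATOR_PAD,
               "ompi_communicator_t overruns ompi_predefined_communicator_t");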

@kawashima-fj
Member Author

The current implementation has the same problem as #2151. Once it is fixed, I'll update this PR in a similar way.

@ibm-ompi

The IBM CI (PGI Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/3aed646317a83647afe31b3248975017

@ibm-ompi

ibm-ompi commented Jun 2, 2017

The IBM CI (GNU Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/8fc08ecc32dad6655013038464bbdc10

@ibm-ompi

ibm-ompi commented Jun 2, 2017

The IBM CI (XL Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/94991564daf6b803703ca776d93cfe68

@ibm-ompi

ibm-ompi commented Jun 2, 2017

The IBM CI (PGI Compiler) build failed! Please review the log, linked below.

Gist: https://gist.github.com/8b934227786cd0bd140a132c264c81d1

@ggouaillardet
Contributor

@kawashima-fj I noticed there is a Ruby script. Currently, Open MPI requires only Perl (no Python, PHP, nor Ruby). If we decide to move forward with this PR, would you consider rewriting the script in Perl?

@jsquyres any thoughts ?

@ggouaillardet
Contributor

:bot:retest

@kawashima-fj
Member Author

@ggouaillardet My Ruby script generates coll_nbpreq_request.h, and the generated coll_nbpreq_request.h is committed. So it is not needed when running autogen/configure/make; it is needed only by a maintainer of this component.

The nbpreq COLL component implements the persistent collective
communication request feature proposed at the MPI Forum.
"nbpreq" is an abbreviation of "nonblocking persistent request".

Signed-off-by: KAWASHIMA Takahiro <[email protected]>
Now that the nbpreq COLL component has been added, all `*_init`
functions of the COLL interface are available, so enable the
check of those functions' availability on communicator creation.

Signed-off-by: KAWASHIMA Takahiro <[email protected]>
Until the MPI Forum decides to add the persistent collective
communication request feature to the MPI Standard, these functions
are supported through MPI extensions with the `MPIX_` prefix.

Only C bindings are supported currently.

Signed-off-by: KAWASHIMA Takahiro <[email protected]>
@kawashima-fj
Member Author

kawashima-fj commented Nov 24, 2017

I've updated the PR to reflect the latest MPI Standard draft (mpi-forum/mpi-issues#25), especially the addition of the info argument to the MPI_*_INIT routines.
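
For example, each init routine gains an MPI_Info parameter; an illustrative (non-authoritative) prototype for the broadcast case:

int MPIX_Bcast_init(void *buf, int count, MPI_Datatype datatype, int root,
                    MPI_Comm comm, MPI_Info info, MPI_Request *request);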

Missing parts are:

@jsquyres
Member

Portable Python or Perl would still be greatly preferred over Ruby.

By introducing Ruby, you're adding another requirement on our build servers, which we try hard not to do.

@bwbarrett
Member

Well, it sounds like Ruby would only be needed when the header file is generated. But that raises more questions than it answers, like why we're committing generated code rather than generating it during "make dist" or (preferably) "make". Otherwise, we're just asking for that file to get modified along the way and then for those modifications to be undone by rerunning the script for some other change.

@ggouaillardet
Contributor

@jsquyres does Open MPI currently require Python?

We do not commit automatically generated files.

That being said, these files could be included in the dist tarball, so Ruby would only be required when building from git, not from a tarball.
This is far from ideal, but at least it would not add a burden to end users.

@bwbarrett
Member

@ggouaillardet Currently, we do not. But if you gave me the choice between Python and Ruby, the answer would be Python. All of the build systems already have Python installed (because we use it for a number of admin tasks outside of actually building a tarball from git), but none of them use Ruby anywhere.

@jsquyres
Member

'zactly what Brian said. FWIW:

$ find . -name \*.pl | wc -l
      49
$ find . -name \*.py | wc -l
       8
$ find . -name \*.rb | wc -l
       0

I don't think any of the Python scripts are used during building, but (in general) Perl is kinda on the way out and Python is on the way in...

@rhc54
Contributor

rhc54 commented Nov 27, 2017

yeah, Python scripts are for running tests and stuff - not for building (at least, so far)

@kawashima-fj
Member Author

To be clear: this Ruby script is not required when running autogen/configure/make. It is used by a component maintainer when the structure of this component changes or a new collective communication routine is added. And manually updating the generated code is not so painful.

The discussion points are:

  1. Requiring Ruby when running autogen/configure/make. I strongly agree that Ruby should not be required here. As described above, this PR does not have this problem.
  2. Requiring Ruby for maintenance of this component. Manually updating the header file is not so painful once it has been generated, so I'll remove the Ruby script if the community prefers that.
  3. Putting a Ruby script in the Open MPI source tree. Same as 2.
  4. Putting generated code in the Open MPI source tree. I'll convert the Ruby script to Python or Perl and call it from autogen.pl if the community prefers that. But give me some time to work on it.

@kawashima-fj
Member Author

FYI: The component which includes the Ruby script will become unnecessary and will be removed when another component like #4515 becomes ready.

@kawashima-fj
Member Author

@ggouaillardet I am afraid I cannot fully understand your intent. I understand that MPI_Start handles only persistent requests and that passing a nonblocking collective request to MPI_Start is erroneous. But the same is true for point-to-point communication: OMPI_REQUEST_PML is used for both nonblocking and persistent requests. Why should we distinguish them only for collectives?

@ggouaillardet
Contributor

@kawashima-fj that is a good point.
@bosilca, should we check that the request is persistent and fail otherwise?

@kawashima-fj
Member Author

We already have a req_persistent flag in ompi_request_t. Checking this flag in MPI_Start and MPI_Startall is sufficient.

current ompi/request/request.h:

struct ompi_request_t {
    opal_free_list_item_t super;                /**< Base type */
    ompi_request_type_t req_type;               /**< Enum indicating the type of the request */
    ompi_status_public_t req_status;            /**< Completion status */
    volatile void *req_complete;                /**< Flag indicating whether the request has completed */
    volatile ompi_request_state_t req_state;    /**< Enum indicating the state of the request */
    bool req_persistent;                        /**< Flag indicating whether this is a persistent request */
    int req_f_to_c_index;                       /**< Index in Fortran <-> C translation array */
    ompi_request_start_fn_t req_start;          /**< Called by MPI_START and MPI_STARTALL */
    ompi_request_free_fn_t req_free;            /**< Called by free */
    ompi_request_cancel_fn_t req_cancel;        /**< Optional function to cancel the request */
    ompi_request_complete_fn_t req_complete_cb; /**< Called when the request is MPI completed */
    void *req_complete_cb_data;
    ompi_mpi_object_t req_mpi_object;           /**< Pointer to MPI object that created this request */
};
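
For illustration, such a check could look like the following sketch. It is not the actual patch (that is in the commit referenced in the next comment), error handling is simplified, and the req_start hook's exact signature is assumed:

/* Sketch: reject non-persistent requests in MPI_START.
 * In Open MPI, MPI_Request is ompi_request_t *. */
int MPI_Start(MPI_Request *request)
{
    ompi_request_t *req = *request;
    if (!req->req_persistent) {
        return MPI_ERR_REQUEST;      /* real code would invoke the error handler */
    }
    return req->req_start(1, &req);  /* dispatch to the type-specific start */
}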

@ggouaillardet
Contributor

@kawashima-fj Got it, thanks!
I made ggouaillardet/ompi@a125778 for that purpose.

@kawashima-fj
Member Author

@ggouaillardet Thanks. I'll integrate your commits. But what's the intent of #4618? #4618 is the work I wanted during the conversation in #4515: it integrates this PR and the concept of #4515. If #4618 is intended to be merged, I'll throw away this PR and collaborate on #4618.

@bosilca
Member

bosilca commented Dec 13, 2017

@ggouaillardet I'm not sure about the validity of such a check right now. Looking through the code base, we are setting the persistent flag for different types of requests, not only for those that are persistent. But this looks like a bug.

@kawashima-fj
Member Author

#4618 obsoletes this PR. Closing.

@kawashima-fj kawashima-fj deleted the pcollreq branch July 9, 2018 01:47