Distributed Parallel Startup Methods
Startup Method
Currently, the GPU, Ascend, and CPU platforms each support multiple startup methods. Three of them are dynamic cluster, mpirun, and rank table:
Dynamic cluster: this method does not rely on third-party libraries, provides disaster recovery, offers good security, and supports all three hardware platforms. Users are advised to choose this startup method first.
mpirun: this method relies on the open source library OpenMPI and has a simple startup command. For multi-machine runs, password-free login must be set up between every pair of machines. It is recommended for users who already have experience with OpenMPI.
rank table: this method requires the Ascend hardware platform and does not rely on third-party libraries. After manually configuring the rank_table file, you can start the parallel program via a script. The script is identical on every machine, which makes batch deployment easy.
The hardware support for the three startup methods is shown in the table below:
| Startup method | GPU | Ascend | CPU |
|---|---|---|---|
| Dynamic cluster | Supported | Supported | Supported |
| mpirun | Supported | Supported | Not supported |
| rank table | Not supported | Supported | Not supported |
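
The hardware support described above can also be written down as a small lookup table, which may be handy in a launch script that needs to check a startup method against the target platform. The following is only a minimal sketch: the names `SUPPORT` and `supported_methods` are hypothetical and not part of any framework API, and the entries simply mirror the support information given in this section.

```python
# Hypothetical lookup table mirroring the hardware support described above.
# Keys are startup methods; values map each platform to whether it is supported.
SUPPORT = {
    "dynamic cluster": {"GPU": True, "Ascend": True, "CPU": True},
    "mpirun": {"GPU": True, "Ascend": True, "CPU": False},
    "rank table": {"GPU": False, "Ascend": True, "CPU": False},
}

PLATFORMS = ("GPU", "Ascend", "CPU")


def supported_methods(platform: str) -> list[str]:
    """Return the startup methods available on the given platform."""
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform!r}")
    return [method for method, row in SUPPORT.items() if row[platform]]


if __name__ == "__main__":
    # Print the available startup methods for each platform.
    for platform in PLATFORMS:
        print(platform, "->", ", ".join(supported_methods(platform)))
```

For example, `supported_methods("CPU")` returns only the dynamic cluster method, which matches the recommendation to prefer dynamic cluster when a script has to run on all three platforms.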