nvidia-cuda-mps-control - NVIDIA CUDA Multi Process Service management program
MPS is a runtime service designed to let multiple MPI processes using CUDA to
run concurrently in a way that's transparent to the MPI program. A CUDA
program runs in MPS mode if the MPS control daemon is running on the system.
When CUDA is first initialized in a program, the CUDA driver attempts to connect
to the MPS control daemon. If the connection attempt fails, the program
continues to run as it normally would without MPS. If however, the connection
attempt to the control daemon succeeds, the CUDA driver then requests the
daemon to start an MPS server on its behalf. If there's an MPS server already
running, and the user id of that server process matches that of the requesting
client process, the control daemon simply notifies the client process of it,
which then proceeds to connect to the server. If there's no MPS server already
running on the system, the control daemon launches an MPS server with the same
user id (UID) as that of the requesting client process. If there's an MPS
server already running, but with a different user id than that of the client
process, the control daemon requests the existing server to shutdown as soon
as all its clients are done. Once the existing server has terminated, the
control daemon launches a new server with the user id same as that of the
queued client process.
The MPS server creates the shared GPU context, and manages its clients. An MPS
server can support a finite amount of CUDA contexts determined by the hardware
architecture it is running on. For compute capability SM 3.5 through SM 6.0
the limit is 16 clients per GPU at a time. Compute capability SM 7.0 has a
limit of 48. MPS is transparent to CUDA programs, with all the complexity of
communication between the client process, the server and the control daemon
hidden within the driver binaries.
Currently, CUDA MPS is available on 64-bit Linux only, requires a device that
supports Unified Virtual Address (UVA) and has compute capability SM 3.5 or
higher. Applications requiring pre-CUDA 4.0 APIs are not supported under CUDA
MPS. Certain capabilities are only available starting with compute capability
Start the MPS control daemon, assuming the user has enough privilege (e.g.
Print a help message.
Start the front-end management user interface to the MPS control daemon, which
needs to be started first. The front-end UI keeps reading commands from stdin
until EOF. Commands are separated by the newline character. If an invalid
command is issued and rejected, an error message will be printed to stdout.
The exit status of the front-end UI is zero if communication with the daemon
is successful. A non-zero value is returned if the daemon is not found or
connection to the daemon is broken unexpectedly. See the "quit"
command below for more information about the exit status.
Commands supported by the MPS control daemon:
- Print out a list of PIDs of all MPS servers.
- start_server -uid UID
- Start a new MPS server for the specified user (UID).
- shutdown_server PID [-f]
- Shutdown the MPS server with given PID. The MPS server will not
accept any new client connections and it exits when all current clients
disconnect. -f is forced immediate shutdown. If a client launches a
faulty kernel that runs forever, a forced shutdown of the MPS server may
be required, since the MPS server creates and issues GPU work on behalf of
- get_client_list PID
- Print out a list of PIDs of all clients connected to the MPS server with
- quit [-t TIMEOUT]
- Shutdown the MPS control daemon process and all MPS servers. The MPS
control daemon stops accepting new clients while waiting for current MPS
servers and MPS clients to finish. If TIMEOUT is specified (in
seconds), the daemon will force MPS servers to shutdown if they are still
running after TIMEOUT seconds.
This command is synchronous. The front-end UI waits for the daemon to
shutdown, then returns the daemon's exit status. The exit status is zero
iff all MPS servers have exited gracefully.
- Commands available to Volta MPS control daemon:
- get_device_client_list PID
- List the devices and PIDs of client applications that enumerated this
device. It optionally takes the server instance PID.
- set_default_active_thread_percentage percentage
- Set the default active thread percentage for MPS servers. If there is
already a server spawned, this command will only affect the next server.
The set value is lost if a quit command is executed. The default is
- Query the current default available thread percentage.
- set_active_thread_percentage PID percentage
- Set the active thread percentage for the MPS server instance of the given
PID. All clients created with that server afterwards will observe the new
limit. Existing clients are not affected.
- get_active_thread_percentage PID
- Query the current available thread percentage of the MPS server instance
of the given PID.
- Specify the directory that contains the named pipes and UNIX domain
sockets used for communication among the MPS control, MPS server, and MPS
clients. The value of this environment variable should be consistent in
the MPS control daemon and all MPS client processes. Default directory is
- Specify the directory that contains the MPS log files. This variable is
used by the MPS control daemon only. Default directory is
Log files created by the MPS control daemon in the specified directory
- Record startup and shutdown of MPS control daemon, user commands issued
with their results, and status of MPS servers.
- Record startup and shutdown of MPS servers, and status of MPS