|tm_me||The task id of this calling task.|
|tm_parent||The task id of the task which spawned this task or TM_NULL_TASK if the calling task is the initial task started by PBS.|
|tm_nnodes||The number of nodes allocated to the job.|
|tm_ntasks||This will always be 0 for PBS.|
|tm_taskpoolid||PBS does not support task pools so this will always be -1.|
|tm_tasklist||This will be NULL for PBS.|
tm_nodeinfo() places a pointer to a malloc'ed array of tm_node_ids in the pointer pointed at by list. The order of the tm_node_ids in list is the same as that specified to MOM in the "exec_host" attribute. The int pointed to by nnodes contains the number of nodes allocated to the job. This information is returned during initialization, so the call does not require communication with MOM. If tm_init has not been called, TM_ESYSTEM is returned; otherwise TM_SUCCESS is returned.
tm_poll() retrieves the results of earlier requests and stores them in the locations the caller specified when making those requests. The bookkeeping for this is done by generating an event for each action. When the task manager (MOM) sends a message that an action is complete, tm_poll reports the event and places the information where the caller requested it. The argument poll_event is meant to be used to request a specific event. This implementation does not use it, and it must be set to TM_NULL_EVENT or an error is returned. Upon return, the argument result_event will contain a valid event number, or TM_ERROR_EVENT on error. If wait is zero and there are no events to report, result_event is set to TM_NULL_EVENT. If wait is non-zero and there are no events to report, the function blocks until an event arrives. If no local error takes place, TM_SUCCESS is returned. If MOM reports an error for an event, the argument tm_errno is set to an error code.
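Because each request yields an event number that is only resolved later by tm_poll, a caller typically needs to remember what each outstanding event stands for. The following is a minimal sketch of such caller-side bookkeeping. It is a standalone simulation that makes no TM calls: event_id is a stand-in for tm_event_t, and pending_table, pending_add and pending_complete are hypothetical names that are not part of the TM library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for tm_event_t; the real type comes from tm.h. */
typedef int event_id;

#define MAX_PENDING 16

/* One outstanding request: its event number and a caller-chosen label. */
struct pending {
    event_id    event;
    const char *what;    /* for example "spawn" or "obit" */
    int         in_use;
};

/* Must be zero-initialized before first use, e.g. with "= {0}". */
struct pending_table {
    struct pending slot[MAX_PENDING];
};

/* Record the event number handed back by a request.
 * Returns 0 on success, -1 if the table is full. */
int pending_add(struct pending_table *t, event_id ev, const char *what)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (!t->slot[i].in_use) {
            t->slot[i].event = ev;
            t->slot[i].what = what;
            t->slot[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Retire the event reported by a poll.
 * Returns the label recorded for it, or NULL if the event is unknown. */
const char *pending_complete(struct pending_table *t, event_id ev)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (t->slot[i].in_use && t->slot[i].event == ev) {
            t->slot[i].in_use = 0;
            return t->slot[i].what;
        }
    }
    return NULL;
}
```

In a real program, the value passed to pending_complete would be the result_event filled in by tm_poll, and a NULL return would indicate an event the caller never recorded.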
tm_notify() is described in the PSCHED documentation, but is not implemented for PBS yet. It will return TM_ENOTIMPLEMENTED.
tm_spawn() sends a message to MOM to start a new task. The node id of the host on which to run the task is given by where. The parameters argc, argv and envp specify the program to run and its arguments and environment, very much like exec(). The full path of the program executable must be given by argv, and the number of elements in the argv array is given by argc. The array envp is NULL terminated. The argument event points to a tm_event_t variable which is filled in with an event number. When this event is returned by tm_poll, the tm_task_id pointed to by tid will contain the task id of the newly created task. In addition, the tid is available to the process in the PBS_TASKNUM environment variable. Similarly, the node number is in the PBS_NODENUM variable and the cpu number is in the PBS_VNODENUM variable.
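A spawned process can read its own identifiers from the three environment variables named above. The sketch below assumes that each variable holds a non-negative decimal integer, which the text does not state explicitly; parse_id and env_id are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a non-negative decimal id (assumed format).
 * Returns fallback if text is NULL, empty, malformed, or out of range. */
long parse_id(const char *text, long fallback)
{
    char *end = NULL;
    long value;

    if (text == NULL || *text == '\0')
        return fallback;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 0)
        return fallback;
    return value;
}

/* Read one of the PBS_* variables; -1 means unset or not a number. */
long env_id(const char *name)
{
    assert(name != NULL);
    return parse_id(getenv(name), -1);
}
```

A task would then call env_id("PBS_TASKNUM"), env_id("PBS_NODENUM") and env_id("PBS_VNODENUM"), treating -1 as a sign that it was not started through tm_spawn.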
tm_kill() sends a signal specified by sig to the task tid and puts an event number in the tm_event_t pointed to by event.
tm_obit() creates an event which will be reported when the task tid exits. The int pointed to by obitval will contain the exit value of the task when the event is reported.
tm_taskinfo() returns the list of tasks running on the node specified by node. The PSCHED documentation mentions a special ability to retrieve all tasks running in the job. This is not supported by PBS. The argument tid_list points to an array of tm_task_ids which contains list_size elements. Upon return, event will contain an event number. When this event is polled, the int pointed to by ntasks will contain the number of tasks running on the node and the array will be filled in with tm_task_ids. If ntasks is greater than list_size, only list_size tasks will be returned.
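Since ntasks may exceed list_size, code that walks the returned array should bound its loop by the smaller of the two values. A minimal sketch of that check follows; valid_task_count and task_list_truncated are hypothetical helper names, and plain int is used in place of any TM types.

```c
#include <assert.h>

/* Number of entries in the caller's array that were actually filled in. */
int valid_task_count(int ntasks, int list_size)
{
    assert(list_size >= 0);
    if (ntasks < 0)
        return 0;
    return ntasks < list_size ? ntasks : list_size;
}

/* Nonzero when the node had more tasks than the array could hold. */
int task_list_truncated(int ntasks, int list_size)
{
    return ntasks > list_size;
}
```

When task_list_truncated reports a nonzero value, one option is to repeat the request with an array of at least ntasks elements.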
tm_atnode() places the id of the node on which the task tid resides in the tm_node_id pointed to by node.
tm_rescinfo() makes a request for a string specifying the resources available on a node given by the argument node. The string is returned in the buffer pointed to by resource and is terminated by a NUL character unless the number of characters of information is greater than specified by len. The resource string PBS returns is formatted as follows:
A space separated set of strings from the uname system call followed by a colon (:). The order of the strings is sysname, nodename, release, version, machine.
A comma separated set of strings giving the components of the "Resource_List" attribute of the job. Each component has the resource name, an equal sign, and the limit value.
For example, a return for a task running on an SGI workstation might look like:
IRIX golum 6.2 03131015 IP22:cput=20:00,mem=400kb
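The format above can be split with a small parser. The sketch below assumes that no uname field contains a space or a colon, that no limit value contains a comma, and that the input is NUL terminated; limit values may contain colons, as in cput=20:00, so only the first colon is treated as the separator. The names resc_info, resc_limit and parse_resc are hypothetical and not part of the TM library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define RESC_FIELD      64   /* buffer size per field, including the NUL */
#define RESC_MAX_LIMITS 16

struct resc_limit {
    char name[RESC_FIELD];
    char value[RESC_FIELD];
};

struct resc_info {
    char sysname[RESC_FIELD];
    char nodename[RESC_FIELD];
    char release[RESC_FIELD];
    char version[RESC_FIELD];
    char machine[RESC_FIELD];
    struct resc_limit limit[RESC_MAX_LIMITS];
    int  nlimits;
};

/* Parse "sysname nodename release version machine:name=value,...".
 * Returns 0 on success, -1 if the string does not fit the assumed format. */
int parse_resc(const char *s, struct resc_info *out)
{
    char head[5 * RESC_FIELD + 8];
    const char *colon, *p;
    size_t head_len;

    assert(s != NULL && out != NULL);

    /* The first colon separates the uname part from the limits. */
    colon = strchr(s, ':');
    if (colon == NULL)
        return -1;
    head_len = (size_t)(colon - s);
    if (head_len >= sizeof head)
        return -1;
    memcpy(head, s, head_len);
    head[head_len] = '\0';

    /* 63 = RESC_FIELD - 1, leaving room for the terminating NUL. */
    if (sscanf(head, "%63s %63s %63s %63s %63s", out->sysname, out->nodename,
               out->release, out->version, out->machine) != 5)
        return -1;

    /* Walk the comma separated name=value items after the colon. */
    out->nlimits = 0;
    p = colon + 1;
    while (*p != '\0') {
        size_t item_len = strcspn(p, ",");
        const char *eq = memchr(p, '=', item_len);
        size_t name_len, value_len;
        struct resc_limit *l;

        if (eq == NULL || out->nlimits == RESC_MAX_LIMITS)
            return -1;
        name_len = (size_t)(eq - p);
        value_len = item_len - name_len - 1;
        if (name_len == 0 || name_len >= RESC_FIELD || value_len >= RESC_FIELD)
            return -1;

        l = &out->limit[out->nlimits++];
        memcpy(l->name, p, name_len);
        l->name[name_len] = '\0';
        memcpy(l->value, eq + 1, value_len);
        l->value[value_len] = '\0';

        p += item_len;
        if (*p == ',')
            p++;
    }
    return 0;
}
```

Because the string is not NUL terminated when it exceeds len, a caller could zero-fill a buffer one byte larger than the len it passes to tm_rescinfo, which guarantees termination before the buffer is handed to a parser like this one.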
tm_publish() causes len bytes of information pointed at by info to be sent to the local MOM to be saved under the name given by name.
tm_subscribe() returns a copy of the information named by name for the task given by tid. The argument info points to a buffer of size len where the information will be returned. The argument info_len is set to the size of the published data. If this is larger than the supplied buffer, the data has been truncated.
tm_finalize() may be called to free any memory in use by the library and close the connection to MOM.
pbs_mom, PSCHED: An API for Parallel Job/Resource Management, http://parallel.nas.nasa.gov/Psched/psched-api-report.ps
|-->||TM (3)||21 May 1997|