Changes between Version 42 and Version 43 of rpc_implementation

Timestamp: Oct 3, 2019, 2:21:28 PM (5 years ago)
Author: alain
Comment: --

rpc_implementation

-                      v42
+                      v43
 Any client thread T running in any cluster K can send an RPC request to any cluster K'. Each core in server cluster K' has a private RPC request queue, in which the client thread must register its request. To spread the RPC handling workload, a client thread T running on client core [i] selects the queue of core [i] in server cluster K'. If this is not possible (cluster K' has fewer cores than the client cluster, so core [i] does not exist), ALMOS-MKH selects core [0] in the server cluster.
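The core selection policy above can be sketched in a few lines of C. This is only an illustration: the function name `sketch_select_server_core()` and its parameters are assumptions made for this example, not identifiers taken from the ALMOS-MKH sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (name and signature are assumptions).
 * Returns the index of the server core whose RPC queue the client should use:
 * the same index as the client core when that core exists in the server
 * cluster, core 0 otherwise. */
static uint32_t sketch_select_server_core(uint32_t client_core,
                                          uint32_t server_cores_nr)
{
    assert(server_cores_nr > 0);   /* a cluster has at least one core */
    return (client_core < server_cores_nr) ? client_core : 0u;
}
```

For instance, a client running on core [3] that targets a 2-core cluster falls back to core [0], while the same client targeting a 4-core cluster keeps core [3].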
 For each core [i] in a cluster K, ALMOS-MKH implements the RPC request queue as a software RPC_FIFO[i,k], implemented as a global variable in the KDATA segment. More precisely, each RPC_FIFO[i] has the type ''remote_fifo_t'', and is a member of the "cluster_manager" structure of cluster [k].
+For each core [i] in a cluster K, ALMOS-MKH implements the RPC request queue as a software RPC_FIFO[i,k], implemented as a global variable in the KDATA segment. More precisely, each RPC_FIFO[i,k] has the type ''remote_fifo_t'', and is a member of the "cluster_manager" structure of cluster [k].
 This RPC_FIFO has been designed to support a large number (N) of concurrent writers, and a small number (M) of readers:
  * N is the number of client threads (practically unbounded). A client thread can execute in any cluster, and can send an RPC request to any target cluster K. To synchronize these multiple client threads, each RPC_FIFO[i,k] implements a ticket-based policy, which defines a first-come, first-served order for registering a new request into a given RPC_FIFO[i,k].
  * M is the number of server threads in charge of handling RPC requests stored in a given RPC_FIFO[i,k]. M is bounded by the CONFIG_RPC_THREAD_MAX parameter. Several server threads can exist for each RPC_FIFO[i,k], because we must avoid the ''head-of-line blocking'' phenomenon, as explained below in section 6. To synchronize these multiple server threads, the RPC FIFO implements a ''light lock'', that is, a non-blocking lock: only one RPC thread at a given time can take the lock and become the FIFO owner. Another RPC thread T' that fails to take the lock simply returns to the IDLE state.
+ * M is the number of server threads in charge of handling RPC requests stored in a given RPC_FIFO[i,k]. M is bounded by the CONFIG_RPC_THREAD_MAX parameter. Several server threads can exist for each RPC_FIFO[i,k], to avoid the ''head-of-line blocking'' phenomenon, as explained below in section 6. To synchronize these multiple server threads, the RPC FIFO implements a ''light lock'', that is, a non-blocking lock: only one RPC thread at a given time can take the lock and become the FIFO owner. Another RPC thread T' that fails to take the lock simply returns to the IDLE state.
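The two synchronization mechanisms described above (tickets for the writers, a non-blocking ''light lock'' for the readers) can be illustrated with C11 atomics. This is a minimal single-address-space sketch, not the actual ''remote_fifo_t'' implementation: the type `sketch_fifo_t`, every function name, the slot count, and the use of local atomics in place of remote accesses are all assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SLOTS 8u                  /* assumed FIFO depth */

typedef struct {
    atomic_uint wr_ticket;               /* next ticket handed to a writer     */
    atomic_uint rd_count;                /* number of items consumed so far    */
    atomic_bool valid[SKETCH_SLOTS];     /* slot holds a registered request    */
    uint64_t    data[SKETCH_SLOTS];      /* e.g. pointer to an RPC descriptor  */
    atomic_bool owner;                   /* light lock: true when FIFO is owned */
} sketch_fifo_t;

static void sketch_fifo_init(sketch_fifo_t *f)
{
    atomic_init(&f->wr_ticket, 0u);
    atomic_init(&f->rd_count, 0u);
    atomic_init(&f->owner, false);
    for (unsigned i = 0; i < SKETCH_SLOTS; i++)
        atomic_init(&f->valid[i], false);
}

/* Writer side: the ticket fixes the arrival order (first come, first served). */
static void sketch_fifo_put(sketch_fifo_t *f, uint64_t item)
{
    unsigned ticket = atomic_fetch_add(&f->wr_ticket, 1u);
    /* spin until the slot assigned to this ticket has been released */
    while (ticket - atomic_load(&f->rd_count) >= SKETCH_SLOTS) { }
    unsigned slot = ticket % SKETCH_SLOTS;
    f->data[slot] = item;
    atomic_store(&f->valid[slot], true);
}

/* Reader side, light lock: never blocks, returns false if already owned. */
static bool sketch_fifo_try_own(sketch_fifo_t *f)
{
    return !atomic_exchange(&f->owner, true);
}

static void sketch_fifo_release(sketch_fifo_t *f)
{
    atomic_store(&f->owner, false);
}

/* Must be called by the current owner only; returns false if FIFO is empty. */
static bool sketch_fifo_get(sketch_fifo_t *f, uint64_t *item)
{
    unsigned count = atomic_load(&f->rd_count);
    unsigned slot  = count % SKETCH_SLOTS;
    if (!atomic_load(&f->valid[slot]))
        return false;
    *item = f->data[slot];
    atomic_store(&f->valid[slot], false);
    atomic_store(&f->rd_count, count + 1u);
    return true;
}
```

In this sketch, a server thread that gets `false` from `sketch_fifo_try_own()` does not wait: it simply gives up, which mirrors the "returns to the IDLE state" behavior described above. A writer that finds the FIFO full spins here; how the real kernel handles a full FIFO is not stated in this section.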
 == 3) RPC descriptor format ==
 …
  * '''parallel RPC''' : the client thread sends several RPC requests in parallel to several servers, and expects several responses.
 Both RPC types use the same RPC descriptor format. One entry in the RPC_FIFO (located on the server side) contains a remote pointer (xptr_t) to the RPC descriptor (''rpc_desc_t''), which is stored on the client side. This RPC descriptor contains the following information:
  * The '''index''' field defines the required service type (ALMOS-MKH defines about 30 service types).
+Each slot in the RPC_FIFO (located on the server side) contains a remote pointer (xptr_t) to the RPC descriptor (''rpc_desc_t''), which is stored on the client side. This RPC descriptor contains the following information:
+ * The '''index''' field defines the required service type (ALMOS-MKH defines roughly 30 service types).
  * The '''blocked''' field defines the RPC mode: true for a simple RPC, false for a parallel RPC.
  * The '''args''' field is an array of 10 uint64_t, containing the service arguments (both input & output).
 …
 . blocks and deschedules, waiting to be reactivated by the server thread once the server has completed the requested service.
 For each RPC service type XYZ, ALMOS-MKH define a specific ''rpc_xyz_client()'' function that performs the 3 first tasks, and call the generic ''rpc_send()'' function to perform the three last tasks.
+For each RPC service type XYZ, ALMOS-MKH defines a specific ''rpc_xyz_client()'' function that performs the first three tasks, and calls the generic ''rpc_send()'' function to perform the last three tasks.
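To make the split between a service-specific stub and the generic send function more concrete, here is a hedged C sketch. Only the '''index''', '''blocked''' and '''args''' fields come from the descriptor description above; the type `sketch_rpc_desc_t`, the service index value, the argument layout, and the stand-in `sketch_rpc_send()` (which runs a toy service inline instead of registering the descriptor in a server RPC_FIFO and blocking) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_RPC_XYZ 7u            /* hypothetical service index */

/* Reduced descriptor: only the fields named in this section. */
typedef struct {
    uint32_t index;                  /* required service type               */
    bool     blocked;                /* true: simple RPC, false: parallel   */
    uint64_t args[10];               /* input and output arguments          */
} sketch_rpc_desc_t;

/* Stand-in for the generic send function (assumption): a real kernel would
 * register a pointer to the descriptor in the server FIFO, notify the server
 * core, then block the client. Here the toy service runs inline and writes
 * its result (args[0] + args[1]) into args[2]. */
static void sketch_rpc_send(uint32_t server_cluster, sketch_rpc_desc_t *desc)
{
    (void)server_cluster;
    assert(desc->index == SKETCH_RPC_XYZ);
    desc->args[2] = desc->args[0] + desc->args[1];
}

/* Service-specific client stub: fills the descriptor, delegates to the
 * generic send function, then extracts the output argument. */
static uint64_t sketch_rpc_xyz_client(uint32_t server_cluster,
                                      uint64_t a, uint64_t b)
{
    sketch_rpc_desc_t desc = { .index = SKETCH_RPC_XYZ, .blocked = true };
    desc.args[0] = a;                /* input argument 0 */
    desc.args[1] = b;                /* input argument 1 */
    sketch_rpc_send(server_cluster, &desc);
    return desc.args[2];             /* output argument  */
}
```

The point of this layout is that everything service-specific (which index to use, how arguments are packed and unpacked) stays in the stub, while queue registration and client blocking live in one generic function shared by all service types.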
 On the server side, a kernel RPC thread is activated at the next scheduling point on the selected server core, as soon as the RPC_FIFO is non-empty. This server thread executes the following tasks:
 …
 . if this response is the last expected response, unblocks the client thread, and sends an IPI to the client core.
+In order to reduce latency, ALMOS-MKH uses IPIs (Inter-Processor Interrupts). The client thread selects a core in the server cluster, and sends an IPI to the selected server core. An IPI forces the target core to reach a scheduling point. This reduces the RPC latency when no RPC thread is active for the server core, because the RPC threads are kernel threads that have the highest scheduling priority. When no RPC thread is active for this core, the selected core will activate (or create) a new RPC thread and execute it. When an RPC thread is already active, the IPI forces a scheduling point on the target core, but no new RPC thread is activated (or created).
+In order to reduce latency, ALMOS-MKH uses IPIs (Inter-Processor Interrupts):
+ * The client thread selects a core in the server cluster, and sends an IPI to the selected server core. An IPI forces the target core to reach a scheduling point. This reduces the RPC latency when the server core is running a user thread, because kernel threads have the highest scheduling priority: the scheduler activates (or creates) a new RPC thread and executes it immediately. When an RPC thread is already running on the server core, the IPI only forces a useless scheduling point on that core.
+ * The server (RPC) thread unblocks the client thread, and also sends an IPI to the client core to force a scheduling point. As the client thread is generally a user thread, this reduces the latency when the client core is running the IDLE thread.
 == 5) Parallel RPC scenario ==