Changes between Version 32 and Version 33 of rpc_implementation

Timestamp: Jul 29, 2018, 12:42:01 PM
Author: alain

To enforce locality for complex operations requiring a large number of remote memory accesses, the various kernel instances can communicate using RPCs (Remote Procedure Calls), respecting the client/server model. This section describes the RPC mechanism implemented by ALMOS-MKH.
The corresponding code is defined in the ''rpc.c'' and ''rpc.h'' files. The software FIFO implementing the client/server communication channel is defined in the ''remote_fifo.c'' and ''remote_fifo.h'' files.

== 1) Hardware platform assumptions ==

The target architecture is clusterised: the physical address space is shared by all cores, but it is physically distributed, with one physical memory bank per cluster, and the following assumptions:
 * The physical addresses - also called extended addresses - are 64 bits encoded.
 * The max size of the physical address space in a single cluster is defined by the CONFIG_CLUSTER_SPAN configuration parameter, which must be a power of 2. It is 4 Gbytes for the TSAR architecture, but it can be larger for Intel based architectures.
 * For a given architecture, the physical address is therefore split into two fixed-size fields: the '''LPA''' field (Local Physical Address) contains the LSB bits, and defines the physical address inside a given cluster. The '''CXY''' field (Cluster Identifier Index) contains the MSB bits, and directly identifies the cluster.
 * Each cluster can contain several cores (including 0), several peripherals, and a physical memory bank of any size (including 0 bytes). This is defined in the ''arch_info'' file.
 * There is one kernel instance in each cluster containing at least one core, one local interrupt controller, and one physical memory bank.
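
The CXY / LPA split can be sketched with a few helper functions. The names below (''addr_build'', ''addr_cxy'', ''addr_lpa'') are illustrative, not the actual ALMOS-MKH API, and assume the 4 Gbytes CONFIG_CLUSTER_SPAN of the TSAR architecture (32 LSB bits for the LPA):

```c
#include <stdint.h>

// Illustrative sketch only (hypothetical names, not the ALMOS-MKH hal API).
// Assumes CONFIG_CLUSTER_SPAN = 4 Gbytes (TSAR): 32 LSB bits for the LPA.
#define CLUSTER_SPAN_BITS 32

typedef uint64_t xptr_t;  // 64-bit extended physical address
typedef uint32_t cxy_t;   // CXY field: cluster identifier index (MSB bits)
typedef uint32_t lpa_t;   // LPA field: local physical address (LSB bits)

static inline cxy_t addr_cxy(xptr_t xp) { return (cxy_t)(xp >> CLUSTER_SPAN_BITS); }
static inline lpa_t addr_lpa(xptr_t xp) { return (lpa_t)(xp & 0xFFFFFFFFULL); }

static inline xptr_t addr_build(cxy_t cxy, lpa_t lpa)
{
    return ((xptr_t)cxy << CLUSTER_SPAN_BITS) | (xptr_t)lpa;
}
```

With a larger CONFIG_CLUSTER_SPAN, only the shift amount changes; the two-field structure is the same.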

== 2) Server Core Selection ==

Any client thread T running in any cluster K can send an RPC request to any cluster K'. In order to share the working load associated with RPC handling, each core in server cluster K' has a private RPC requests queue, where the client thread must register its RPC request. In principle, the client thread T running on client core [i] selects the waiting queue [i] in server cluster K'. If this is not possible (when the number of cores in cluster K' is smaller than the number of cores in the client cluster), ALMOS-MKH selects core [0] in the server cluster.
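
This selection rule fits in one hypothetical helper (an illustrative sketch, not the actual rpc.c code):

```c
// Sketch of the server-core selection rule: client core [i] targets
// queue [i] in the server cluster, falling back to core [0] when the
// server cluster has fewer cores. Hypothetical names, for illustration.
static inline unsigned rpc_server_core(unsigned client_lid,    // local index i of the client core
                                       unsigned server_cores)  // number of cores in cluster K'
{
    return (client_lid < server_cores) ? client_lid : 0;
}
```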

ALMOS-MKH replicates the KDATA segment (containing the kernel global variables) in all clusters, and uses the same LPA (Local Physical Address) for the KDATA base in all clusters.
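
The consequence of this replication can be shown with a toy model (not the ALMOS-MKH memory layout): since every cluster holds its copy of KDATA at the same local offset, a (CXY, LPA) pair is enough to reach any cluster's copy of a given kernel global.

```c
#include <stdint.h>
#include <string.h>

// Toy model of KDATA replication (illustrative constants and names):
// one KDATA copy per cluster, all at the same local base.
#define NB_CLUSTERS 4
#define KDATA_SIZE  64

static uint8_t kdata[NB_CLUSTERS][KDATA_SIZE];

// read cluster cxy's copy of the variable located at offset lpa
static uint32_t remote_read32(uint32_t cxy, uint32_t lpa)
{
    uint32_t v;
    memcpy(&v, &kdata[cxy][lpa], sizeof v);
    return v;
}

// write cluster cxy's copy of the variable located at offset lpa
static void remote_write32(uint32_t cxy, uint32_t lpa, uint32_t v)
{
    memcpy(&kdata[cxy][lpa], &v, sizeof v);
}
```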
     
This RPC_FIFO has been designed to support a large number (N) of concurrent writers, and a small number (M) of readers:
 * N is the number of client threads (practically unbounded). A client thread can execute in any cluster, and can send an RPC request to any target cluster K. To synchronize these multiple client threads, each RPC_FIFO[i,k] implements a ticket based policy, defining a first arrived / first served priority to register a new request into a given RPC_FIFO[i,k].
 * M is the number of server threads in charge of handling RPC requests stored in a given RPC_FIFO[i,k]. M is bounded by the CONFIG_RPC_THREAD_MAX parameter. For each RPC_FIFO[i,k], there can exist several server threads, because we must avoid the ''head-of-line blocking'' phenomenon, as explained below in section 4. To synchronize these multiple server threads, the RPC FIFO implements a ''light lock'', that is a non-blocking lock: only one RPC thread at a given time can take the lock and become the FIFO owner. Another RPC thread T' failing to take the lock simply returns to IDLE state.
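
These two synchronization devices can be sketched with C11 atomics. This is a simplified model with hypothetical names, not the actual remote_fifo.c code: writers order themselves with a ticket taken by an atomic fetch-and-add, and readers race on a non-blocking light lock, giving up instead of spinning when they lose.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FIFO_SLOTS 8   // illustrative depth

typedef struct {
    _Atomic unsigned wr_ticket;  // next ticket handed to a writer (fetch-and-add)
    _Atomic unsigned wr_done;    // number of fully published requests
    unsigned         rd_index;   // consumer index (touched by the owner only)
    _Atomic bool     owned;      // light lock: true while a reader owns the FIFO
    void            *slot[FIFO_SLOTS];
} rpc_fifo_t;

// writer side: the ticket defines a first arrived / first served order
static bool rpc_fifo_send(rpc_fifo_t *f, void *req)
{
    unsigned t = atomic_fetch_add(&f->wr_ticket, 1);
    if (t - f->rd_index >= FIFO_SLOTS)       // full (simplified check)
        return false;
    f->slot[t % FIFO_SLOTS] = req;
    while (atomic_load(&f->wr_done) != t)    // wait for earlier tickets
        ;
    atomic_store(&f->wr_done, t + 1);        // publish our request
    return true;
}

// reader side: try to become the FIFO owner; a loser simply returns to IDLE
static bool rpc_fifo_try_own(rpc_fifo_t *f)
{
    bool expected = false;
    return atomic_compare_exchange_strong(&f->owned, &expected, true);
}
```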

== 3) Client / Server Synchronization ==

Simple RPC requests are blocking for the client thread: the client thread registers its request in the target RPC_FIFO, blocks on the THREAD_BLOCKED_RPC condition, and deschedules. The client thread is unblocked by the RPC server thread when the RPC is completed.

In order to reduce latency, ALMOS-MKH uses IPIs (Inter-Processor Interrupts). The client thread selects a core in the server cluster, and sends an IPI to the selected server core. An IPI forces the target core to make a scheduling point. This reduces the RPC latency when no RPC thread is active for the server core, because the RPC threads are kernel threads that have the highest scheduling priority. When no RPC thread is active for this core, the selected core will activate (or create) a new RPC thread and execute it. When an RPC thread is already active, the IPI forces a scheduling point on the target core, but no new RPC thread is activated (or created).

Similarly, when the RPC server thread completes, it uses a remote access to unblock the client thread, and sends an IPI to the client core to force a scheduling point on the client core.
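
This blocking handshake can be modelled at user level, with a condition variable standing in for the THREAD_BLOCKED_RPC block/unblock and the IPIs (a sketch with hypothetical names, not the kernel code):

```c
#include <pthread.h>
#include <stdbool.h>

// Host-level analogy of the handshake: pthread_cond_wait plays the role of
// blocking on THREAD_BLOCKED_RPC, and pthread_cond_signal the role of the
// unblock + IPI sent by the server. Names are illustrative.
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  done;
    bool            completed;   // set by the server when the RPC finishes
    int             result;
} rpc_desc_t;

// client side: after registering the request, block until completion
static int rpc_wait(rpc_desc_t *rpc)
{
    pthread_mutex_lock(&rpc->lock);
    while (!rpc->completed)                        // THREAD_BLOCKED_RPC analogue
        pthread_cond_wait(&rpc->done, &rpc->lock);
    pthread_mutex_unlock(&rpc->lock);
    return rpc->result;
}

// server side: store the result, then unblock the client (IPI analogue)
static void rpc_complete(rpc_desc_t *rpc, int result)
{
    pthread_mutex_lock(&rpc->lock);
    rpc->result    = result;
    rpc->completed = true;
    pthread_cond_signal(&rpc->done);
    pthread_mutex_unlock(&rpc->lock);
}
```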
     
== 4) Parallel RPC requests handling ==

As explained above, a private pool of RPC threads is associated to each RPC_FIFO[i,k] (i.e. to each core). These RPC server threads are dynamically created when required.

At any time, only one RPC thread has the FIFO ownership and can consume RPC requests from the FIFO. Nevertheless, ALMOS-MKH supports several RPC server threads per RPC_FIFO, because a given RPC thread T handling a given request can block, waiting for a shared resource, such as a peripheral. In that case, the blocked RPC thread T releases the FIFO ownership before blocking and descheduling. This RPC thread T will complete the current RPC request when the blocking condition is solved and the thread T is rescheduled. If the RPC FIFO is not empty, another RPC thread T' will be scheduled to handle the pending RPC requests. If all existing RPC threads are blocked, a new RPC thread is dynamically created.

Therefore, there can exist for each RPC_FIFO[i,k] a variable number M of active RPC threads: the running one is the FIFO owner, and the (M-1) others are blocked on a wait condition. This number M can temporarily exceed the CONFIG_RPC_THREAD_MAX value, but the exceeding server threads are destroyed when the temporary overload is solved.
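
The pool-sizing rule above can be condensed into two predicates (a toy sketch with hypothetical helper names, not the actual rpc.c logic): a new RPC thread is created when the FIFO still holds requests but every existing server thread is blocked, and threads in excess of CONFIG_RPC_THREAD_MAX are destroyed once the overload is resolved.

```c
#include <stdbool.h>

#define CONFIG_RPC_THREAD_MAX 4   // illustrative value

// create a new server thread only when no runnable server thread is left
static bool must_create_rpc_thread(bool fifo_empty, unsigned total, unsigned blocked)
{
    return !fifo_empty && (blocked == total);
}

// destroy an exceeding server thread once the temporary overload is solved
static bool must_destroy_rpc_thread(unsigned total)
{
    return total > CONFIG_RPC_THREAD_MAX;
}
```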

== 5) RPC request format ==