Version 32 (modified by alain, 4 years ago)

--

Remote Procedure Call

To enforce locality for complex operations requiring a large number of remote memory accesses, the kernel instances can communicate using RPCs (Remote Procedure Calls), following the client/server model. This section describes the RPC mechanism implemented by ALMOS-MKH. The corresponding code is defined in the rpc.c and rpc.h files. The software FIFO implementing the client/server communication channel is defined in the remote_fifo.c and remote_fifo.h files.

1) Hardware platform assumptions

The target architecture is clusterised: the physical address space is shared by all cores, but it is physically distributed, with one physical memory bank per cluster. The following assumptions are made:

  • The physical addresses, also called extended addresses, are encoded on 64 bits. Therefore the global physical address space cannot be larger than 2^64 bytes, that is (4 G * 4 G) bytes.
  • The maximum size of the physical address space in a single cluster is defined by the CONFIG_CLUSTER_SPAN configuration parameter, which must be a power of 2. It is 4 Gbytes for the TSAR architecture, but it can be larger for Intel-based architectures.
  • For a given architecture, the physical address is therefore split into two fixed-size fields: the LPA field (Local Physical Address) contains the LSB bits, and defines the physical address inside a given cluster. The CXY field (Cluster Identifier Index) contains the MSB bits, and directly identifies the cluster.
  • Each cluster can contain any number of cores (including 0), any number of peripherals, and a physical memory bank of any size (including 0 bytes). This is defined in the arch_info file.
  • There is one kernel instance in each cluster containing at least one core, one interrupt controller, and one physical memory bank.
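The CXY / LPA split described above can be illustrated with a minimal sketch. It assumes a 4 Gbyte cluster span (the TSAR value quoted above), hence a 32-bit LPA field. The names CLUSTER_SPAN_BITS, make_xaddr(), xaddr_cxy() and xaddr_lpa() are hypothetical and are not taken from the ALMOS-MKH sources.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: CONFIG_CLUSTER_SPAN is 4 Gbytes, so the LPA field is 32 bits wide. */
#define CLUSTER_SPAN_BITS 32
#define LPA_MASK ((UINT64_C(1) << CLUSTER_SPAN_BITS) - 1)

/* Hypothetical helper: build an extended address from a cluster
 * identifier (MSB bits) and a local physical address (LSB bits). */
static inline uint64_t make_xaddr(uint32_t cxy, uint64_t lpa)
{
    assert(lpa <= LPA_MASK);   /* the LPA must fit in the cluster span */
    return ((uint64_t)cxy << CLUSTER_SPAN_BITS) | lpa;
}

/* Hypothetical helper: extract the cluster identifier (CXY field). */
static inline uint32_t xaddr_cxy(uint64_t xaddr)
{
    return (uint32_t)(xaddr >> CLUSTER_SPAN_BITS);
}

/* Hypothetical helper: extract the local physical address (LPA field). */
static inline uint64_t xaddr_lpa(uint64_t xaddr)
{
    return xaddr & LPA_MASK;
}
```

With these assumptions, make_xaddr(0x12, 0x1000) yields 0x0000001200001000, and the two extractors return 0x12 and 0x1000 respectively.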

2) Server Core Selection

Any client thread T running in any cluster K can send an RPC request to any cluster K'. In order to share the workload associated with RPC handling, each core in server cluster K' has a private RPC requests queue, where a client thread must register its RPC request. In principle, the client thread T running on the client core [i] selects the waiting queue [i] in server cluster K'. If this is not possible (when the number of cores in cluster K' is smaller than the number of cores in the client cluster), ALMOS-MKH selects core [0] in the server cluster.
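The selection rule can be sketched as follows. The function name rpc_select_server_core() and its parameters are hypothetical, and the test "core [i] exists in the server cluster" is one possible reading of the fallback condition stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the local index of the server core whose
 * RPC queue receives the request.
 *   client_lid      : local index [i] of the core running the client thread
 *   server_cores_nr : number of cores in the server cluster K'
 * Assumption: queue [i] is usable only if core [i] exists in K'. */
static inline uint32_t rpc_select_server_core(uint32_t client_lid,
                                              uint32_t server_cores_nr)
{
    assert(server_cores_nr > 0);   /* a kernel instance needs at least one core */
    return (client_lid < server_cores_nr) ? client_lid : 0;
}
```

For example, a client running on core [2] targets queue [2] of a 4-core server cluster, while a client running on core [3] falls back to queue [0] of a 2-core server cluster.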

ALMOS-MKH replicates the KDATA segment (containing the kernel global variables) in all clusters, and uses the same LPA (Local Physical Address) for the KDATA base in all clusters. Therefore, in two different clusters, a given global variable, identified by its LPA, can have different values. This feature is used by ALMOS-MKH to allow a client thread in cluster K to access a global variable in a server cluster K', building a physical address by concatenation of the CXY cluster identifier of the server cluster K' with the LPA.
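The effect of this replication can be simulated in user space. In the sketch below, each simulated cluster owns a private copy of a small KDATA array, and the pair (cxy, offset) plays the role of the extended address. The constants and the remote_store() / remote_load() helpers are hypothetical stand-ins, not the kernel's remote access primitives.

```c
#include <assert.h>
#include <stdint.h>

#define NB_CLUSTERS 4    /* assumption: number of simulated clusters       */
#define KDATA_WORDS 16   /* assumption: size of the simulated KDATA segment */

/* One replica of the KDATA segment per cluster: the same offset
 * designates a different copy of the variable in each cluster. */
static uint32_t kdata[NB_CLUSTERS][KDATA_WORDS];

/* Hypothetical remote write: the cluster identifier selects the replica. */
static void remote_store(uint32_t cxy, uint32_t offset, uint32_t value)
{
    assert(cxy < NB_CLUSTERS && offset < KDATA_WORDS);
    kdata[cxy][offset] = value;
}

/* Hypothetical remote read of the same variable in a given cluster. */
static uint32_t remote_load(uint32_t cxy, uint32_t offset)
{
    assert(cxy < NB_CLUSTERS && offset < KDATA_WORDS);
    return kdata[cxy][offset];
}
```

Writing 42 at offset 3 in cluster 1 and 7 at the same offset in cluster 2 leaves two independent values, which is the property the RPC_FIFO placement relies on.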

For each core [i] in a cluster K, ALMOS-MKH implements the RPC requests queue as a software RPC_FIFO[i,k], stored as a global variable in the KDATA segment. More precisely, each RPC_FIFO[i,k] has the type remote_fifo_t, and is a member of the "cluster_manager" structure of cluster K.

This RPC_FIFO has been designed to support a large number (N) of concurrent writers, and a small number (M) of readers:

  • N is the number of client threads (practically unbounded). A client thread can execute in any cluster, and can send an RPC request to any target cluster K. To synchronize these multiple client threads, each RPC_FIFO[i,k] implements a ticket-based policy, defining a first-come, first-served order to register a new request into a given RPC_FIFO[i,k].
  • M is the number of server threads in charge of handling RPC requests stored in a given RPC_FIFO[i,k]. M is bounded by the CONFIG_RPC_THREAD_MAX parameter. For each RPC_FIFO[i,k], there can be several server threads, because we must avoid head-of-line blocking, which occurs when a server thread handling a given RPC is blocked on a given resource. To synchronize these multiple server threads, the RPC FIFO implements a light lock, that is a non-blocking lock: only one RPC thread at a given time can take the lock and become the FIFO owner, and another RPC thread T' failing to take the lock simply returns to the IDLE state.
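A ticket-based multi-writer FIFO can be sketched with C11 atomics as shown below. This is only an illustration of the policy, not the remote_fifo.c code: the type ticket_fifo_t, the FIFO_SLOTS depth and the function names are hypothetical, and the writer spins where a kernel would more likely deschedule.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define FIFO_SLOTS 8   /* assumption: illustrative depth, a power of 2 */

typedef struct {
    atomic_uint wr_ticket;           /* next ticket handed to a writer   */
    atomic_uint rd_index;            /* next ticket to be consumed       */
    atomic_bool valid[FIFO_SLOTS];   /* slot contains a registered item  */
    uint64_t    data[FIFO_SLOTS];
} ticket_fifo_t;

static void fifo_init(ticket_fifo_t *f)
{
    atomic_init(&f->wr_ticket, 0);
    atomic_init(&f->rd_index, 0);
    for (int i = 0; i < FIFO_SLOTS; i++) atomic_init(&f->valid[i], false);
}

/* Writer side (any number of concurrent clients): the atomic increment
 * gives each writer a unique ticket, hence a first-come, first-served order. */
static uint32_t fifo_put(ticket_fifo_t *f, uint64_t item)
{
    uint32_t ticket = atomic_fetch_add(&f->wr_ticket, 1);
    /* Wait until the slot assigned to this ticket has been consumed. */
    while (ticket - atomic_load(&f->rd_index) >= FIFO_SLOTS) { /* spin */ }
    uint32_t slot = ticket % FIFO_SLOTS;
    f->data[slot] = item;
    atomic_store(&f->valid[slot], true);   /* publish after the data write */
    return ticket;
}

/* Reader side: assumed to be called only by the current FIFO owner. */
static bool fifo_get(ticket_fifo_t *f, uint64_t *item)
{
    uint32_t rd   = atomic_load(&f->rd_index);
    uint32_t slot = rd % FIFO_SLOTS;
    if (!atomic_load(&f->valid[slot])) return false;   /* nothing registered */
    *item = f->data[slot];
    atomic_store(&f->valid[slot], false);
    atomic_store(&f->rd_index, rd + 1);                /* frees the slot */
    return true;
}
```

The single-reader assumption in fifo_get() is what the light lock provides: only the RPC thread that owns the FIFO consumes requests.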

3) Client / Server Synchronization

All RPC requests are blocking for the client thread: the client thread registers its request in the target RPC_FIFO, blocks on the THREAD_BLOCKED_RPC condition, and deschedules. The client thread is unblocked by the RPC server thread when the RPC is completed.

In order to reduce the RPC latency, ALMOS-MKH uses IPIs (Inter-Processor Interrupts). The client thread selects a core in the server cluster, and sends an IPI to the selected server core. An IPI forces the target core to run its scheduler. This reduces the RPC latency when no RPC thread is active for the server core, because the RPC threads have the highest scheduling priority. When no RPC thread is active for this core, the selected core will activate (or create) a new RPC thread and execute it. When an RPC thread is already active, the IPI forces a scheduling point on the target core, but no new RPC thread is activated (or created).

Similarly, when the RPC server thread completes, it uses a remote access to unblock the client thread, and sends an IPI to the client core to force a scheduling point on the client core.

4) Parallel RPC requests handling

As explained above, a private pool of RPC threads is associated with each RPC_FIFO[i,k]. These RPC server threads are dynamically created when required.

At any time, only one RPC thread has the FIFO ownership and can consume RPC requests from the FIFO. Nevertheless, ALMOS-MKH supports several RPC server threads per RPC_FIFO because a given RPC thread T handling a given request can block, waiting for a shared resource, such as a peripheral. In that case, the blocked RPC thread T releases the FIFO ownership before blocking and descheduling. This RPC thread T will complete the current RPC request when the blocking condition is resolved and the thread T is rescheduled. If the RPC FIFO is not empty, another RPC thread T' will be scheduled to handle the pending RPC requests. If all existing RPC threads are blocked, a new RPC thread is dynamically created.
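The FIFO ownership rule amounts to a try-lock. A minimal sketch with a C11 atomic_flag is given below; the names light_lock_t, light_lock_try() and light_lock_release() are hypothetical and do not come from the ALMOS-MKH sources.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical light lock: a single flag, never waited on. */
typedef atomic_flag light_lock_t;

/* Try to become the FIFO owner. Returns true on success; on failure the
 * caller does not spin or block, it simply gives up (returns to IDLE). */
static bool light_lock_try(light_lock_t *lock)
{
    return !atomic_flag_test_and_set(lock);
}

/* Release the ownership, for instance just before the owner blocks on a
 * shared resource, so that another RPC thread can consume the FIFO. */
static void light_lock_release(light_lock_t *lock)
{
    atomic_flag_clear(lock);
}
```

A lock declared as light_lock_t lock = ATOMIC_FLAG_INIT; can be taken once; a second light_lock_try() fails until light_lock_release() is called.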

Therefore, for each RPC_FIFO[i,k], there can be a variable number X of active RPC threads: the running one is the FIFO owner, and the (X-1) others are blocked on a wait condition. This number X can temporarily exceed the CONFIG_RPC_THREAD_MAX value, but the excess threads are destroyed when the temporary overload is resolved.

5) RPC request format

ALMOS-MKH implements two types of RPCs:

  • simple RPC: the client thread sends the RPC request to one single server, and waits for one single response.
  • multicast RPC: the client thread sends the same RPC request to several servers, and expects several responses.

Each entry in the RPC_FIFO contains a remote pointer (xptr_t) on an RPC descriptor (rpc_desc_t), that is stored on the client side (in the client thread stack). This RPC descriptor has a fixed format, and contains the following information:

  • The index field defines the required service type.
  • The args field is an array of 10 uint64_t, containing the service arguments (both input & output).
  • The thread field defines the client thread (used by the server thread to unblock the client thread).
  • The lid field defines the client core local index (used by the server thread to send the completion IPI).
  • The response field defines the number of expected responses.

The semantics of the args array depend on the RPC index field.

This format supports both simple and multicast RPCs: the client thread initializes the response field with the number of expected responses, and each server thread atomically decrements this counter when the RPC request has been satisfied.
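The descriptor layout and the response counter can be sketched as follows. The structure rpc_desc_sketch_t only mirrors the fields listed above; its field types, the RPC_ARGS_NR constant and the rpc_signal_completion() helper are assumptions, not the actual rpc_desc_t definition. In particular, letting the last responder be the one that unblocks the client is an assumption of this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RPC_ARGS_NR 10   /* size of the args array, as stated above */

/* Hypothetical mirror of the RPC descriptor fields. */
typedef struct {
    uint32_t    index;               /* required service type              */
    uint64_t    args[RPC_ARGS_NR];   /* input and output arguments         */
    void       *thread;              /* client thread to unblock           */
    uint32_t    lid;                 /* client core local index (for IPI)  */
    atomic_uint responses;           /* number of responses still expected */
} rpc_desc_sketch_t;

/* Called once by each server thread when its part of the request is done.
 * Returns true only for the last expected response (assumption: this is
 * the point where the client thread would be unblocked). */
static bool rpc_signal_completion(rpc_desc_sketch_t *rpc)
{
    unsigned previous = atomic_fetch_sub(&rpc->responses, 1);
    assert(previous > 0);   /* more completions than expected responses */
    return previous == 1;
}
```

A simple RPC initializes the counter to 1, so the single server is also the last responder. A multicast RPC to three servers initializes it to 3, and only the third completion returns true.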

6) How to define a new RPC

To introduce a new RPC service rpc_new_service in ALMOS-MKH, you need to modify the rpc.c and rpc.h files:

  • You must identify (or implement, if it does not exist) the kernel function kernel_service() that you want to execute remotely. This function should not use more than 10 arguments.
  • You must register the new RPC in the enum rpc_index_t (in the rpc.h file), and in the rpc_server[] array (in the rpc.c file).
  • You must implement the marshaling function rpc_kernel_service_client(), that is executed on the client side by the client thread, in the rpc.c and rpc.h files. This blocking function (1) registers the input arguments in the RPC descriptor, (2) registers the RPC request in the target cluster RPC_FIFO, (3) extracts the output arguments from the RPC descriptor.
  • You must implement the marshaling function rpc_kernel_service_server(), that is executed on the server side by the RPC thread, in the rpc.c and rpc.h files. This function (1) extracts the input arguments from the RPC descriptor, (2) calls the kernel_service() function, (3) registers the output arguments in the RPC descriptor.
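The four steps above can be sketched in a single-process mock-up. Everything below is hypothetical except the names kernel_service(), rpc_kernel_service_client() and rpc_kernel_service_server() used in the text: the descriptor type, the enum, the dispatch table and rpc_send_sketch() are simplified stand-ins, and the FIFO registration plus client blocking are replaced by a direct call to the server function.

```c
#include <assert.h>
#include <stdint.h>

#define RPC_ARGS_NR 10

/* Simplified descriptor: only the fields needed for marshaling. */
typedef struct {
    uint32_t index;
    uint64_t args[RPC_ARGS_NR];
} rpc_desc_mini_t;

/* Registration in the index enum (stand-in for rpc_index_t). */
typedef enum { RPC_KERNEL_SERVICE = 0, RPC_MAX_INDEX } rpc_index_mini_t;

/* The kernel function to execute remotely (illustrative body). */
static uint32_t kernel_service(uint32_t a, uint32_t b) { return a + b; }

/* Server-side marshaling: extract inputs, call, register outputs. */
static void rpc_kernel_service_server(rpc_desc_mini_t *rpc)
{
    uint32_t a = (uint32_t)rpc->args[0];
    uint32_t b = (uint32_t)rpc->args[1];
    rpc->args[2] = kernel_service(a, b);
}

/* Registration in the dispatch table (stand-in for rpc_server[]). */
typedef void (*rpc_server_fn_t)(rpc_desc_mini_t *);
static rpc_server_fn_t rpc_server_mini[RPC_MAX_INDEX] = {
    [RPC_KERNEL_SERVICE] = rpc_kernel_service_server,
};

/* Stand-in for "register in the RPC_FIFO of cluster cxy and block":
 * the request is served synchronously in the same process. */
static void rpc_send_sketch(uint32_t cxy, rpc_desc_mini_t *rpc)
{
    (void)cxy;   /* no real remote cluster in this mock-up */
    assert(rpc->index < RPC_MAX_INDEX);
    rpc_server_mini[rpc->index](rpc);
}

/* Client-side marshaling: register inputs, send, extract outputs. */
static uint32_t rpc_kernel_service_client(uint32_t cxy, uint32_t a, uint32_t b)
{
    rpc_desc_mini_t rpc = { .index = RPC_KERNEL_SERVICE };
    rpc.args[0] = a;
    rpc.args[1] = b;
    rpc_send_sketch(cxy, &rpc);
    return (uint32_t)rpc.args[2];
}
```

In this mock-up, rpc_kernel_service_client(1, 2, 3) returns 5: the caller sees an ordinary blocking function call, while the arguments and the result travel through the descriptor.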