Changes between Version 9 and Version 10 of AtomicOperations


Timestamp:
Dec 14, 2011, 5:46:07 PM (12 years ago)
Author:
alain
Comment:

--

  • AtomicOperations

    v9 v10  
    1515On the direct network, the VCI CMD field can take four values : READ, WRITE, LINKED_LOAD (LL), and STORE_CONDITIONAL (SC). From a conceptual point of view, the atomicity is handled on the memory controller side (actually the memory cache controller), as the memory controllers must maintain a list of all pending atomic operations in a ''reservation table'' :
    1616
     17=== 2.1 General principle ===
     18
    1719 * When a processor, identified by its SRCID, executes the LL(X) instruction to an address X, the memory controller registers an entry (SRCID, X) in the reservation table, and returns the memory value stored at address X in the VCI RDATA field. If there was another reservation for the same processor SRCID, but for another address X’, the previous reservation for X’ is lost (it means that the previous reservation is cancelled).
    18  * When a processor, identified by its SRCID, executes the SC(X) instruction, there are two possibilities. If there is a valid reservation entry (SRCID, X) indicating that no other access to the X address has been received, the atomic operation is a success : the write is done, the memory cache controller returns a “true” value in the RDATA VCI field, and all entries in the reservation table for the X address are cancelled. If there is no valid reservation entry (SRCID, X) in the reservation table, the atomic operation is a failure : The write is not done, and the memory cache returns a “false” value in the RDATA field.
     20 * When a processor, identified by its SRCID, executes the SC(X) instruction, there are two possibilities. If there is a valid reservation entry (SRCID, X) indicating that no other access to the X address has been received, the atomic operation is a success : the write is done, the memory cache controller returns a ''success'' value in the RDATA VCI field and all entries in the reservation table for the X address are cancelled. If there is no valid reservation entry (SRCID, X) in the reservation table, the write is not done, and the memory cache returns a ''fail'' value in the RDATA field.
    1921
    2022Clearly, in case of concurrent access, the winner is defined by the first SC instruction received by the memory controller.
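
The reservation table rules above can be sketched as a small software model. This is only an illustration, not the TSAR hardware: the names `resa_table`, `mem`, `NB_PROCS` and `MEM_WORDS` are hypothetical, addresses are word indexes, the SC result is returned as a plain boolean (the actual RDATA encoding is not modelled), and plain WRITE accesses that would also cancel a reservation are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NB_PROCS  4    /* hypothetical number of initiators (SRCID values) */
#define MEM_WORDS 16   /* hypothetical memory size, in 32-bit words */

/* At most one reservation per SRCID: a new LL from the same SRCID
 * overwrites (cancels) the previous one. */
typedef struct {
    bool     valid;
    uint32_t addr;
} reservation_t;

static reservation_t resa_table[NB_PROCS];
static uint32_t      mem[MEM_WORDS];

/* LL(X): register the entry (srcid, X) and return the value stored at X. */
uint32_t ll(unsigned srcid, uint32_t addr)
{
    assert(srcid < NB_PROCS && addr < MEM_WORDS);
    resa_table[srcid].valid = true;
    resa_table[srcid].addr  = addr;
    return mem[addr];
}

/* SC(X): returns true on success, false on failure. */
bool sc(unsigned srcid, uint32_t addr, uint32_t wdata)
{
    assert(srcid < NB_PROCS && addr < MEM_WORDS);

    /* No valid reservation (srcid, X): the write is not done. */
    if (!resa_table[srcid].valid || resa_table[srcid].addr != addr)
        return false;

    /* Success: do the write, then cancel every reservation on X. */
    mem[addr] = wdata;
    for (unsigned i = 0; i < NB_PROCS; i++) {
        if (resa_table[i].valid && resa_table[i].addr == addr)
            resa_table[i].valid = false;
    }
    return true;
}
```

With two processors both holding a reservation on the same address, the first SC received succeeds and cancels the other reservation, so the second SC fails.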
    2123
    22 As described below (using MIPS32 instruction set), this mechanism can be used to implement a spin-lock, using any memory address :
     24=== 2.2 Failure / Success encoding ===
     25
     26The actual encoding of the (success/failure) return value for a SC access depends on the processor core: For the MIPS2
     27and ARM processors, a success is encoded as a non-zero value. For the PPC processor, a success is encoded as a zero value.
     28In the TSAR architecture, the memory cache controller returns the value 0 for a success, and the value 1 for a failure.
     29If the architecture uses a MIPS or ARM processor, the SC value must be transcoded by the L1 cache controller before
     30being transmitted to the processor core.
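
As an illustration, the transcoding for a MIPS core could look like the sketch below. The function name `sc_rsp_to_mips` and the two constants are hypothetical. It is assumed here that 0 and 1 are the only legal response values, and that the MIPS core expects 1 for a success and 0 for a failure.

```c
#include <assert.h>
#include <stdint.h>

#define TSAR_SC_SUCCESS 0u   /* value returned by the memory cache on success */
#define TSAR_SC_FAILURE 1u   /* value returned by the memory cache on failure */

/* Convert the SC response received in the VCI RDATA field into the value
 * written to the MIPS destination register: 1 on success, 0 on failure. */
static inline uint32_t sc_rsp_to_mips(uint32_t rdata)
{
    /* Assumption: any other value would be a protocol error. */
    assert(rdata == TSAR_SC_SUCCESS || rdata == TSAR_SC_FAILURE);
    return (rdata == TSAR_SC_SUCCESS) ? 1u : 0u;
}
```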
     31 
     32=== 2.3 Software implementation on MIPS32 processor ===
     33
     34As described below, the LL/SC mechanism can be used to implement a spin-lock, using any memory address :
    2335 * The lock acquisition is done by an atomic LL/SC operation.
    2436 * The lock release is done by a simple WRITE instruction.
    25 
     37Remember that a SC failure is encoded by a zero value for the MIPS processor.
     38 
    2639{{{
    2740                        _itmask                 # enter critical section
     
    4154== 3.  Cachable atomic operations ==
    4255
    43 In order to support cachable spin-locks and better scalability, the memory cache controller and the L1 cache controller must cooperate to implement the LL/SC mechanism. But the standard semantic of the LL/SC mechanism has to be modified.
    44 
     56In order to support cachable spin-locks and better scalability, the TSAR memory cache controller and the L1 cache controller cooperate to implement the LL/SC mechanism. But the standard semantic of the LL/SC mechanism has to be modified:
     57 * The LL operation is implemented by the L1 cache controller as a standard Read operation.
     58 * The SC operation is implemented as a Compare and Swap operation.
    4559Furthermore, the LL/SC mechanism is extended to support both 32-bit and 64-bit atomic accesses.
    4660
    4761=== 3.1 new semantic ===
    4862
    49 The previous semantic is :
     63The formal semantic of a LL/SC access is :
    5064(1) The Store Conditional succeeds if there was no other Store Conditional at the same address since the last Linked Load.
    5165
    52 The new semantic is :
     66The implemented semantic is :
    5367(2) The Store Conditional succeeds if the content of the memory has not changed since the last Linked Load.
    5468
     
    5973In the new protocol, there is no more LL on the network :
    6074
    61  * Linked Loads become simple Reads, where the data sent to the processor is recorded in a register of the L1 cache. When the processor issues a Store Conditional, the L1 cache sends a "SC" packet, where the first flits contain the data previously read and the last flits contain the data to write in the memory. So this new SC packet is 2 flits (32-bit accesses) or 4 flits (64-bit accesses) long.
     75 * Linked Loads become simple Reads, where the data sent to the processor is recorded in a register of the L1 cache. When the processor issues a Store Conditional, the L1 cache sends a "SC" packet, where the first flits contain the data previously read and the next flits contain the data to write in the memory. So this new SC packet is 2 flits (32-bit accesses) or 4 flits (64-bit accesses) long.
    6276
    63  * The memory cache controller then compares the data read by the L1 cache to the data in the memory cache. If these two values are equal, the Store Conditional is issued to the memory and the response to the SC is TRUE, else the Store Conditional is not issued and the response is FALSE.
     77 * The memory cache controller then compares the data read by the L1 cache to the data in the memory cache. If these two values are equal, the Store Conditional is done, and the response to the SC is success (0 value), else the Store Conditional is not done, and the response is failure (1 value).
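
The two steps above can be sketched as a small software model, restricted to 32-bit accesses (2-flit SC packet). This is only an illustration under stated assumptions: the types `l1_cache_t` and `sc_packet_t`, the array `memc_data` and all function names are hypothetical, addresses are word indexes, and the `addr` field stands for the address carried by the VCI command.

```c
#include <assert.h>
#include <stdint.h>

#define MEMC_WORDS 16   /* hypothetical memory cache size, in 32-bit words */
#define SC_SUCCESS 0u   /* response value for a successful SC */
#define SC_FAILURE 1u   /* response value for a failed SC */

static uint32_t memc_data[MEMC_WORDS];

/* L1 cache side: register holding the data returned by the last LL. */
typedef struct {
    uint32_t ll_data;
} l1_cache_t;

/* SC packet for a 32-bit access: flit[0] is the data previously read,
 * flit[1] is the data to write. */
typedef struct {
    uint32_t addr;
    uint32_t flit[2];
} sc_packet_t;

/* LL: a simple read, with the returned data recorded in the L1 cache. */
uint32_t l1_ll(l1_cache_t *l1, uint32_t addr)
{
    assert(addr < MEMC_WORDS);
    l1->ll_data = memc_data[addr];
    return l1->ll_data;
}

/* SC, L1 side: build the packet sent to the memory cache. */
sc_packet_t l1_sc(const l1_cache_t *l1, uint32_t addr, uint32_t wdata)
{
    sc_packet_t p = { addr, { l1->ll_data, wdata } };
    return p;
}

/* SC, memory cache side: compare, then write only if the values match. */
uint32_t memc_sc(const sc_packet_t *p)
{
    assert(p->addr < MEMC_WORDS);
    if (memc_data[p->addr] != p->flit[0])
        return SC_FAILURE;          /* content changed: write not done */
    memc_data[p->addr] = p->flit[1];
    return SC_SUCCESS;
}
```

If two L1 caches read the same value and both send a SC, the first one changes the memory content, so the comparison fails for the second one.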
    6478
    6579=== 3.3 memory cache controller ===