Changes between Version 1 and Version 2 of AtomicOperations


Ignore:
Timestamp:
Jun 30, 2009, 7:59:30 PM (15 years ago)
Author:
alain
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • AtomicOperations

    v1 v2  
    33= Atomic Operations =
    44
    5 == 1) Goals ==
     5== 1. Goals ==
    66
    77The TSAR architecture implements atomic read-then write operations to support various software synchronization mechanisms. The constraints are the following :
     
    1010 * As the atomic access can be used to implement spin-locks, the lock address must be cachable in order to benefit from the general coherence protocol, and avoid unnecessary transactions on the interconnection network.
    1111
    12 == 2) LL/SC mechanism ==
     12== 2. LL/SC mechanism ==
    1313
    14 The TSAR memory sub-system supports the LL/SC mechanism, as the LL & SC commands are defined in the VCI/OCP protocol. The LL and SC instructions must be defined by the processor Instructon Set Architecture. This is natively the case for the MIPS32 & PowerPC processors.
    15 On the direct network, the VCI CMD field can take four values : READ, WRITE, LINKED_LOAD (LL), and STORE_CONDITIONAL (SC). From a conceptual point of view, the atomicity his handled on the memory controller side (actually the memory cache controllers), as the memory controllers must maintain a list of all pending atomic operations in a reservation table :
     14The TSAR memory sub-system supports the LL/SC mechanism. The LL & SC commands are defined in the VCI/OCP protocol, and the LL and SC instructions must be defined in the processor Instructon Set Architecture. This is natively the case for the MIPS32 & PowerPC processors.
     15On the direct network, the VCI CMD field can take four values : READ, WRITE, LINKED_LOAD (LL), and STORE_CONDITIONAL (SC). From a conceptual point of view, the atomicity his handled on the memory controller side (actually the memory cache controllers), as the memory controllers must maintain a list of all pending atomic operations in a ''reservation table'' :
    1616
    17 -       When a processor, identified by its SRCID, executes the LL(X) instruction, the memory controller registers an entry (SRCID, X) in the reservation table, and returns the memory value stored at address X in the VCI RDATA field. If there was another reservation for the same processor SRCID, but for another address X’, the previous reservation for X’ is canceled.
    18 -       When a processor, identified by its SRCID, executes the SC(X) instruction, there is two possibilities. If there is a pending entry (SRCID, X) indicating that there no other access to the X address has been received, the atomic operation is a success : the write is done, the memory cache controller returns a “true” value in he RDATA VCI field, and all entries in the reservation table for the X address are cancelled. If there is no pending entry (SRCID, X) in the reservation table, the atomic operation is a failure : The write is not done, and the memory cache returns a “false” value in the RDATA field.
     17 * When a processor, identified by its SRCID, executes the LL(X) instruction to an address X, the memory controller registers an entry (SRCID, X) in the reservation table, and returns the memory value stored at address X in the VCI RDATA field. If there was another reservation for the same processor SRCID, but for another address X’, the previous reservation for X’ is lost (it means that the previous reservation can be cancelled).
     18 * When a processor, identified by its SRCID, executes the SC(X) instruction, there is two possibilities. If there is a valid reservation entry (SRCID, X) indicating that no other access to the X address has been received, the atomic operation is a success : the write is done, the memory cache controller returns a “true” value in he RDATA VCI field, and all entries in the reservation table for the X address are cancelled. If there is no valid reservation entry (SRCID, X) in the reservation table, the atomic operation is a failure : The write is not done, and the memory cache returns a “false” value in the RDATA field.
    1919
    20 This mechanism can be used to implement a spin-lock, using any memory address :
    21 -       The lock acquisition is done by an atomic LL/SC operation.
    22 -       The lock release is done by a simple WRITE instruction.
     20Clearly, in case of concurrent access, the winner is defined by the first SC instruction received by the memory controller.
    2321
     22As described below, this mechanism can be used to implement a spin-lock, using any memory address :
     23 * The lock acquisition is done by an atomic LL/SC operation.
     24 * The lock release is done by a simple WRITE instruction.
     25
     26{{{
    2427_itmask         # enter critical section
    2528# lock acquisition
    2629loop            LL Reg1 @               # Reg1 <= M[@]
    2730                        BNE Reg1 loop   # continue if lock not taken (Reg1 == 0)
    28                         SC  1 @         # M[@] <= 1 / Reg2 <= KO
     31                        SC  1 @                 # M[@] <= 1 / Reg2 <= KO
    2932                        BNE Reg2 loop   # retry if not atomic (Reg2 != 0)
    3033                        ...
    3134                ...
    3235# lock release
    33                         SW 0 @          # M[@] <= 0
     36                        SW 0 @                  # M[@] <= 0
    3437                        _itunmask               # exit critical section
     38}}}
    3539
     40== 3.  Cachable atomic operations ==
    3641
     42In order to support cachable spin-locks, the memory cache controller, and the L1 cache controller must cooperate to implement the LL/SC mechanism.
    3743
     44=== 3.1 memory cache controller ===
    3845
    39 
    40 5.3     Cachable atomic operations
    41 In order to support cachable spin-locks, the memory cache controller, and the L1 cache controller must cooperate to implement the LL/SC mechanism.
    42 5.3.1   memory cache controller
    43 The memory cache controller contains a dedicated storage that is used to register, for each cache line the set of  L1 caches that have copies. These sets of copies are implemented as linked lists of SRCIDs. To implement the Reservation Table, we just introduce, for each registered copy of a cache line, (i.e. each entry in this Reservation tTable) one extra bit  to register a pending LL/SC atomic operation.
     46The memory cache controller contains a dedicated storage that is used to register, for each cache line the set of  L1 caches that have copies. These sets of copies are implemented as linked lists of SRCIDs. To implement the Reservation Table, we just introduce, for each registered copy of a cache line, (i.e. each entry in this Reservation Table) one extra bit  to register a pending LL/SC atomic operation.
    4447This approach is scalable, but creates the possibility of “false conflicts”, when several atomic access are done to the same cache line.
    4548
    46   Request type    Actions in the memory cache controller
    47   
    48   LL(SRCID, X)
    49 
    50           Scan all copies associated to the cache line containing the X address
     49The actions done by the memory cache controller for the various commands are described below :
     50 
     51'''LL(SRCID, X)'''
     52{{{
     53   Scan all copies associated to the cache line containing the X address
    5154  If ( a copy corresponding to SRCID.exists) {
    5255      RESERVED = true
     
    5558      and marked RESERVED in the linked list
    5659  }
    57  
    58   SC(SRCID, X)
     60 }}}
    5961
    60          Scan all copies associated to the cache line containing the X address
     62'''SC(SRCID, X)
     63{{{
     64  Scan all copies associated to the cache line containing the X address
    6165  If ( a copy corresponding to SRCID.exists and RESERVED == true ) {
    6266      - scan again the linked list of copies to send an UPDATE request
     
    6872      return false to the L1 cache
    6973  }
    70  
    71   WRITE(SRCID, X)
    72  
    73           - Scan the linked list of copies to send an UPDATE request
     74}}}
     75
     76'''WRITE(SRCID, X)'''
     77{{{
     78  - Scan the linked list of copies to send an UPDATE request
    7479    to the L1 caches (other than SRCID), and invalidate all RESERVED bits
    7580  - Write data in the memory cache
    7681  - after all responses to UPDATE have been received, acknowledge the
    7782    write request.
     83}}}
    7884
    79   READ(SRCID, X)          If ( cachable request ) {
    80 -       register the SRCID in the list of copies associated to the X address.
    81 -       return the complete cache line
     85'''READ(SRCID, X)'''
     86{{{
     87  If ( cachable request ) {
     88    - register the SRCID in the list of copies associated to the X address.
     89    - return the complete cache line
    8290  } else {
    83      - return a single word.
     91    - return a single word.
    8492  }
     93}}}
    8594
     95=== 3.2  L1 cache controller ===
    8696
    87 
    88 
    89 5.3.2   L1 cache controller
    9097The L1 cache controller receiving a  new LL(X) request from the processor must locally register this reservation on the X address to validate the use of the locally cached copy, and to check the address when it receives a SC(X) request from the processor. This requires an extra register to store the address, and a RESERVED flip-flop in the L1 cache controller.
    9198
    92   Request type    Actions in the L1 cache controller
    93  
    94   LL(X)
     99The actions done by the L1 cache controller for the various commands are described below :
    95100
    96   (from processor)        If (RESERVED = true & ADDRESS = X) {    // local spin-lock         
     101'''LL(X) from processor'''
     102{{{
     103  If (RESERVED = true & ADDRESS = X) {    // local spin-lock         
    97104    return the read data to the processor
    98105  } else {                                                           // first LL access                       
     
    102109    and return the read value to the processor
    103110  }
    104  
    105   SC(X)
     111}}}
    106112
    107   (from processor)        If (RESERVED = true & ADDRESS = X)  {    // possible succès
     113'''SC(X) from processor'''
     114{{{
     115  If (RESERVED = true & ADDRESS = X)  {    // possible succès
    108116    send a SC(X) request to the memory cache,
    109117    and return the Boolean response to the processor
     
    113121   RESERVED <= false
    114122  }
     123}}}
    115124
    116   INVAL(L) or
    117   UPDATE(L)
    118 
    119    (from memory)
    120           If (ADDRESS = L)  {                                        // invalidate reservation
     125'''INVAL(L) or UPDATE(L) from memory controller'''
     126{{{
     127  If (ADDRESS = L)  {                                        // invalidate reservation
    121128    RESERVED <= false
    122129  }
     130  and the L1 cache is updated or invalidated.
     131}}}
    123132
    124   and the L1 cache must be updated or invalidated.
    125  
    126