Changes between Version 44 and Version 45 of InterconnexionNetworks


Timestamp:
Mar 19, 2013, 10:51:42 AM (11 years ago)
Author:
alain
Comment:

--

  • InterconnexionNetworks

    v44 v45  
    77The TSAR architecture defines three logically independent VCI compliant networks, that are fully separated for dead-lock prevention :
    88 
    9  * The '''Direct Network''' implements the 40 bits TSAR physical address space that is visible by the software. It  transports the direct READ, WRITE, LL, & SC transactions from any VCI initiator (typically a L1 cache controller or another hardware coprocessor with a DMA capability) to any VCI target (typically a memory cache controller, or a memory mapped peripheral).
    10 
    11  * The '''Coherence Network''' implements a separated 40 bits physical address space, used to transport the coherence transactions : MULTI_UPDATE, MULTI_INVAL, BROADCAST_INVAL (from memory cache controllers to L1 cache controllers) and  CLEANUP (from the L1 cache controllers to the memory cache controllers). This address space is not visible by the software.
     9 * The '''Direct Network''' implements the 40 bits TSAR physical address space that is visible by the software. It  transports the direct READ, WRITE, LL, SC and CAS transactions from any VCI initiator (typically a L1 cache controller or another hardware coprocessor with a DMA capability) to any VCI target (typically a memory cache controller, or a memory mapped peripheral).
     10
     11 * The '''Coherence Network''' implements a separated address space, used to transport the coherence transactions between memory cache controllers and L1 cache controllers. This address space is not visible by the software.
    1212
    1313 * The '''External Network''' implements a 34 bits physical address space. This network transports the PUT and GET transactions from the memory cache controller to the external RAM controller, in case of MISS or cache line replacement in the memory cache. This address space is not visible by the software.
     
    5151Therefore, the total SRCID width cannot be larger than 14 bits. It can use less than 14 bits when the number of clusters is smaller than 1024.
    5252
    53 == 3.  Direct Network & Coherence Network ==
    54 
    55 These two networks are implemented by the DSPIN network on chip general infrastructure :
    56 
    57  * The '''direct network'''
    58 
    59  * The '''coherence network'''
    60 
    61 === 3.1  VCI encoding of the various transaction types on the direct network ===
     53== 3.  VCI encoding of the various transaction types on the direct network ==
    6254
    6355All Hardware components connected to the direct network respect the VCI/OCP communication interface.
     
    7567The TSAR architecture uses one single bit for the VCI RERROR field, even if the DSPIN infrastructure supports 2 bits for the error field.
    7668
    77 There are 8 transaction types on the direct network: '''READ_DATA_UNC''', '''READ_DATA_MISS''', '''READ_INS_UNC''', '''READ_INS_MISS''', '''WRITE''', '''CAS''', '''LL''', '''SC'''. These types are encoded through the VCI fields '''CMD''' and '''PKTID'''. The '''PKTID''' field in TSAR is 4 bits long, but the MSB is ignored (reserved for future use).
     69There are 8 transaction types on the direct network, that are encoded through the VCI fields '''CMD''' and '''PKTID'''. The '''PKTID''' field in TSAR is 4 bits long, but the MSB is ignored (reserved for future use).
    7870
    7971||TYPE          ||CMD (2 bits)||PKTID (4 bits)|| '''PKTID''' mnemo   || '''CMD''' mnemo ||
     
    9587When a given initiator can send several simultaneous transactions of a given type (such as several simultaneous '''WRITE''' transactions), the VCI '''TRDID''' field is used to discriminate them. The '''TRDID''' field is 4 bits, supporting up to 16 simultaneous transactions for a given initiator.
    9688
    97 ==== 3.1.1 VCI READ transaction ====
     89=== 3.1 VCI READ transaction ===
    9890
    9991A VCI '''READ''' command packet contains one flit. In case of burst, all addresses must be within the same cache line.
     
    10698 * Exactly 16 flits containing one word per flit in the '''RDATA''' field (for a '''PKTID''' = TYPE_READ_*_MISS).
    10799
    108 ==== 3.1.2 VCI WRITE transaction ====
     100=== 3.2 VCI WRITE transaction ===
    109101
    110102A VCI '''WRITE''' command packet contains from 1 to 16 flits. In case of burst, all addresses must be within the same cache line.
     
    115107A VCI '''WRITE''' response packet always returns a single flit with a 0 value in the '''RDATA''' field.
    116108
    117 ==== 3.1.3 VCI LL (Linked Load) transaction ====
     109=== 3.3 VCI LL (Linked Load) transaction ===
    118110
    119111A VCI '''LL (Linked Load)''' command packet contains one single flit.
     
    127119 * The second flit contains in the '''RDATA''' field the data that has been read in the memory cache.
    128120
    129 ==== 3.1.4 VCI SC (Store Conditional) transaction ====
     121=== 3.4 VCI SC (Store Conditional) transaction ===
    130122
    131123A VCI '''SC (Store Conditional)''' command packet contains 2 flits.
     
    140132 * The '''RDATA''' field contains 0 (resp. 1) to indicate an SC success (resp. failure).
    141133
    142 ==== 3.1.5 VCI CAS (Compare & Swap) transaction ====
     134=== 3.5 VCI CAS (Compare & Swap) transaction ===
    143135
    144136A VCI '''CAS (Compare & Swap)''' command packet contains 2 flits.
     
    153145 * The '''RDATA''' field contains 0 (resp. 1) to indicate a CAS success (resp. failure).
    154146
    155 === 3.2  VCI encoding of the various transaction types on the coherence network ===
    156 
    157 On the coherence network the VCI encoding is defined by the hardware with the following policy:
    158 
    159 For all command packets (multi-update, multi-invalidate, broadcast-invalidate, and cleanup), the VCI CMD field is a WRITE. The line index (up to 34 bits if we use 40 bits addresses) is transported in the WDATA and BE fields of the first VCI flit. The WDATA field contains the 32 LSB bits of the line index, and the BE field contain the 2 MSB bits of the line index. The multicast invalidate, broadcast invalidate, and cleanup packets contain one single VCI flit. The multi-cast update packets contain (2+N) flits : the WDATA field of the second flit contains the index of the first word to be updated in the cache line. The following flits (at most 16 flits) contains the values to be written.
    160 
    161  * In a '''multicast''' command packet from a memory cache controller to a L1 cache controller, the address is obtained by copying the target L1 cache SRCID in the MSB bits of the VCI ADDRESS (left aligned) : The L1 cache L_ID is actually used as the LADR address field. UPDATE/INVAL requests are distinguished by the bit ADDRESS[3] (0 for INVAL, 1 for UPDATE). DATA/INSTRUCTION caches are distinguished by the bit ADDRESS[2] (0 for DATA, 1 for INSTRUCTION).
    162 
    163  * In a '''cleanup''' command packet from a L1 cache controller to a memory cache controller, the address is obtained by copying the (NX + NY) MSB bits of the line index in the VCI ADDRESS field (left aligned). The NPROCS value for the LADR address field is used to select the memory cache.
    164 
    165  * In a '''broadcast_invalidate''' command packet, from a memory cache controller to a L1 cache controller, the  ADDRESS[1:0] bits must be equal to 0x3. The 20 bits ADDRESS[39:20]  contain the XMIN,XMAX,YMIN,YMAX values defining the bounding box of the broadcast:
    166 
    167 || XMIN || XMAX || YMIN || YMAX ||  reserved ||11 ||
    168 || (5)  || (5)  || (5)  || (5)  ||   (18)    ||(2)||
    169 
    170 
    171 === 3.2  DSPIN encoding of the various transaction types on the direct network ===
     147== 4.  DSPIN encoding of the various transaction types on the direct network ==
    172148
    173149The VCI command & response packets are translated (actually serialized) to a more convenient DSPIN network format by the VCI/RING wrappers (in platforms using the RING local interconnect) or by the VCI/DSPIN wrappers (in platforms using an XBAR local interconnect). These wrappers are located between the VCI initiator and target components and the DSPIN network. The DSPIN command packet width is 40 bits, and the DSPIN response packet width is 33 bits. The DSPIN interconnexion network uses only the following information to route the DSPIN packets to the proper destination:
    174  * The MSB bit is the EOP flag, defining the last flit of a DSPIN packet.
     150 * The EOP flag, defining the last flit of a DSPIN packet.
    175151 * The LSB bit of the first flit is the BC flag,  defining a DSPIN broadcast packet.
    176  * For a response packet, BC=0 and the RSRCID field is used to route the packet to the proper destination.
    177  * For a non broadcast command packet, BC = 0), and the (NX+NY+NL) MSB bits of the ADDRESS field are used to route  the packet to the proper destination.
    178  * For a broadcast packet, BC = 1, and the XMIN, XMAX, YMIN, YMAX fields (5 bits each), are used by the network to limit the broadcast.
    179 
    180 The DSPIN format has been designed to transport 40 bits VCI ADDRESS, and 14 bits VCI SRCID.
    181 If the VCI ADDRESS use less than 40 bits (for example 32 bits), the VCI ADDRESS field is left aligned, and the LSB bits of the DSPIN field are completed with "0".
     152 * For a non broadcast command packet (BC = 0), the (NX+NY+NL) MSB bits of the first field are used to route  the packet to the proper destination.
     153 * For a broadcast packet (BC = 1), the XMIN, XMAX, YMIN, YMAX fields (5 bits each) are used by the network to limit the broadcast.
     154
     155The DSPIN format can transport 40 bits VCI ADDRESS, and 14 bits VCI SRCID.
     156If the VCI ADDRESS uses less than 40 bits (for example 32 bits), the DSPIN ADDRESS field is left aligned, and the LSB bits of the DSPIN field are completed with "0".
    182157If the SRCID field uses less than 14 bits (NX < 5 or NY < 5), the SRCID field is left aligned, and the LSB bits of the DSPIN field are completed with "0".
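
The left alignment rule above can be illustrated with a minimal sketch. The helper name `left_align` and the example values are hypothetical; only the 40 bits and 14 bits field widths come from the text.

```python
ADDRESS_FIELD_BITS = 40  # DSPIN ADDRESS field width given in the text
SRCID_FIELD_BITS = 14    # DSPIN SRCID field width given in the text


def left_align(value, used_bits, field_bits):
    """Left-align a used_bits-wide value in a field_bits-wide field.

    The unused LSB bits of the field are filled with 0.
    (Hypothetical helper, not part of any TSAR code base.)
    """
    if not 0 < used_bits <= field_bits:
        raise ValueError("used_bits must be in 1..field_bits")
    if value < 0 or value >> used_bits:
        raise ValueError("value does not fit in used_bits")
    return value << (field_bits - used_bits)


# A 32 bits VCI address placed in the 40 bits field: 8 zero LSB bits are appended.
print(hex(left_align(0x80001000, 32, ADDRESS_FIELD_BITS)))  # 0x8000100000
# A 10 bits SRCID placed in the 14 bits field: 4 zero LSB bits are appended.
print(hex(left_align(0x2A5, 10, SRCID_FIELD_BITS)))  # 0x2a50
```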
    183158
    184 The five types of DSPIN packets are defined below:
    185 
    186 ==== 3.3.1      DSPIN Read Command packet format (40 bits) ====
     159The DSPIN packet formats are defined below:
     160
     161=== 4.1 DSPIN Read Command packet format (40 bits) ===
    187162
    188163A single flit VCI Read Command packet (this includes LL packets) is translated to a 2 flits DSPIN Read Command packet :
     
    195170|| 1 || (14)||(2)||(2)|| (8)|| (4) || (4) ||(4)||(1)||
    196171
    197 ==== 3.3.2      DSPIN write Command packet format (40 bits) ====
    198 
     172=== 4.2 DSPIN write Command packet format (40 bits) ===
    199173A N flits VCI Write Command packet (this includes SC packets) is translated to a N+2 flits DSPIN Write Command packet :
    200174
     
    209183|| 1 || (3) ||(4)||       (32)                       ||
    210184
    211 ==== 3.3.3      DSPIN Broadcast Command packet format (40 bits) ====
    212 
    213 The single flit VCI Write Broadcast is translated to a 2 flits DSPIN Broadcast Command packet.
    214 The CID field contains the 10 MSB bits of the VCI SRCID (actually the source cluster coordinates). The XMIN,XMAX, YMIN, YMAX fields are the 20 MSB bits of the
    215 VCI ADDRESS, used by the network to limit the broadcast.
    216 
    217 Flit 0 :
    218 ||EOP||XMIN||XMAX||YMIN||YMAX||SRCID||TRDID  ||BC||
    219 || 0 || (5)|| (5)|| (5)|| (5)|| (14)|| (4)   || 1||
    220 Flit 1 :
    221 ||EOP||-res-||----------------NLINE-----------------||
    222 || 1 || (5) ||                (34)                  ||
    223 
    224 ==== 3.3.4      DSPIN Response packet format (33 bits) ====
     185
     186=== 4.3 DSPIN single flit Response packet format (33 bits) ===
    225187
    226188A single flit DSPIN Response packet is built for the following VCI response packets:
    227189 * a single flit VCI response packet to a WRITE command (no data transmitted),
    228  * a single flit VCI response packet to a READ command, where the RDATA field has value 0,
     190 * a single flit VCI response packet to a READ or LL command, where the RDATA field has value 0,
    229191 * a single flit VCI response packet to a SC or CAS command, where the RDATA field has value 0,
     192
     193Flit 0 :
     194||EOP||RSRCID||RERROR||RTRDID||RPKTID||res||BC||
     195|| 1 || (14) || (2)  || (4)  || (4)  ||(7)|| 0||
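
As an illustration of the table above, the following sketch packs the fields of a single flit response into a 33 bits integer. It assumes that the fields are laid out from MSB to LSB in the order of the table columns (EOP first, BC last); the function names are hypothetical.

```python
def pack_fields(fields):
    """Concatenate (value, width) pairs, first pair in the MSB bits."""
    word = 0
    for value, width in fields:
        if value < 0 or value >> width:
            raise ValueError("value %#x does not fit in %d bits" % (value, width))
        word = (word << width) | value
    return word


def dspin_single_flit_response(rsrcid, rerror, rtrdid, rpktid):
    """Build the 33 bits single flit DSPIN response (assumed MSB-first layout)."""
    return pack_fields([
        (1, 1),        # EOP: the only flit is also the last one
        (rsrcid, 14),  # RSRCID
        (rerror, 2),   # RERROR
        (rtrdid, 4),   # RTRDID
        (rpktid, 4),   # RPKTID
        (0, 7),        # reserved
        (0, 1),        # BC: a response is never a broadcast
    ])


flit = dspin_single_flit_response(rsrcid=0x0123, rerror=0, rtrdid=5, rpktid=2)
print((flit >> 18) & 0x3FFF == 0x0123)  # RSRCID can be read back from bits 31..18
```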
     196
     197=== 4.4 DSPIN multi-flit Response packet format (33 bits) ===
    230198
    231199For all other VCI response packets (multi-flits VCI response packet, or non-zero RDATA value)
     
    239207|| 1 ||                (32)                        ||
    240208
    241 ==== 3.3.5      DSPIN Write response packet format (33 bits) ====
    242 
    243 A single flit VCI Write Response packet is translated to a single flit DSPIN Write Response packet.
    244 
    245 Flit 0 :
    246 ||EOP||RSRCID||RERROR||RTRDID||RPKTID||res||BC||
    247 || 1 || (14) || (2)  || (4)  || (4)  ||(7)|| 0||
    248 
    249 Note : This format is also used for the response packets to a broadcast command, as each VCI response packet to a broadcast command is actually a VCI response packet to a single flit write command.
    250 
    251 == 4.  External Network ==
    252 
    253 This network has a specific topology, as the communication scheme is very peculiar: All PUT/GET transactions are from N initiators (one initiator per cluster) to one single target (the external RAM controller).
     209== 5. DSPIN encoding of the coherence transactions ==
     210
     211The coherence transactions are directly transmitted to the coherence network by the L1 caches and L2 caches in DSPIN format. The L2-to-L1 network uses 40 bits flits. The L1-to-L2 network uses 33 bits flits.
     212There are 4 packet types from L2 to L1, and 2 packet types from L1 to L2.
     213
     214=== 5.1 DSPIN MULTI-UPDATE packet format (L2-to-L1 : 40 bits) ===
     215
     216This DSPIN packet contains 2+N flits.
     217 * The DEST field contains the target L1 cache identifier (SRCID).
     218 * The SOURCE field contains the source L2 cache identifier (SRCID).
     219 * The UPTID field contains the UPDATE Table index.
     220 * The WORD field contains the first modified word index.
     221 * The NLINE field contains the cache line identifier (34 bits).
     222
     223Flit 0 :
     224||EOP||---DEST---||-res-||--SOURCE--||UPTID||TYPE||BC||
     225||0  || (14)     || (3) || (14)     || (4) || (3)||0 ||
     226Flit 1 :
     227||EOP||res||WORD||---------------NLINE-----------------||
     228||0  ||(1)|| (4)||               (34)                  ||
     229Flit 2 :
     230||EOP||-res-||-BE-||-------------WDATA-----------------||
     231||0  ||(3)  ||(4) ||             (32)                  ||
     232
     233Flit N+1 :
     234||EOP||-res-||-BE-||-------------WDATA-----------------||
     235||1  ||(3)  ||(4) ||             (32)                  ||
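
The MULTI-UPDATE format can be sketched as follows. This is only an illustration: it assumes that the fields are laid out from MSB to LSB in the order of the table columns, and the numerical TYPE code is passed as a parameter because its values are not given here. All function names are hypothetical.

```python
def pack_fields(fields):
    """Concatenate (value, width) pairs, first pair in the MSB bits."""
    word = 0
    for value, width in fields:
        if value < 0 or value >> width:
            raise ValueError("value %#x does not fit in %d bits" % (value, width))
        word = (word << width) | value
    return word


def multi_update_packet(dest, source, uptid, type_code, word, nline, data):
    """Return the list of 40 bits flits of a MULTI-UPDATE packet.

    data is a non-empty list of (be, wdata) pairs, one per updated word.
    type_code is a placeholder: the TYPE encoding is not specified here.
    """
    if not data:
        raise ValueError("at least one data flit is required")
    flits = [
        # Flit 0: EOP, DEST, reserved, SOURCE, UPTID, TYPE, BC
        pack_fields([(0, 1), (dest, 14), (0, 3), (source, 14),
                     (uptid, 4), (type_code, 3), (0, 1)]),
        # Flit 1: EOP, reserved, WORD, NLINE
        pack_fields([(0, 1), (0, 1), (word, 4), (nline, 34)]),
    ]
    for i, (be, wdata) in enumerate(data):
        eop = 1 if i == len(data) - 1 else 0  # EOP is set on the last flit only
        flits.append(pack_fields([(eop, 1), (0, 3), (be, 4), (wdata, 32)]))
    return flits


packet = multi_update_packet(dest=0x0041, source=0x0010, uptid=3, type_code=0,
                             word=2, nline=0x123456789,
                             data=[(0xF, 0xDEADBEEF), (0xF, 0xCAFEBABE)])
print(len(packet))  # 4, that is 2 + N flits with N = 2
```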
     236
     237=== 5.2 DSPIN MULTI-INVAL packet format (L2-to-L1 : 40 bits) ===
     238
     239This DSPIN packet contains 2 flits.
     240 * The DEST field contains the target L1 cache identifier (SRCID).
     241 * The SOURCE field contains the source L2 cache identifier (SRCID).
     242 * The UPTID field contains the UPDATE Table index.
     243 * The WORD field contains the first modified word index.
     244 * The NLINE field contains the cache line identifier (34 bits).
     245
     246Flit 0 :
     247||EOP||---DEST---||-res-||--SOURCE--||UPTID||TYPE||BC||
     248||0  || (14)     || (3) || (14)     || (4) || (3)||0 ||
     249Flit 1 :
     250||EOP||res||WORD||---------------NLINE-----------------||
     251||1  ||(1)|| (4)||               (34)                  ||
     252 
     253=== 5.3 DSPIN BROADCAST packet format (L2-to-L1 : 40 bits) ===
     254
     255This DSPIN packet contains 2 flits.
     256 * The SOURCE field contains the source L2 cache identifier (SRCID).
     257 * The XMIN,XMAX, YMIN, YMAX fields define the limits of the broadcast.
     258 * The UPTID field contains the UPDATE Table index.
     259 * The NLINE field contains the cache line identifier (34 bits).
     260
     261Flit 0 :
     262||EOP||XMIN||XMAX||YMIN||YMAX||--SOURCE--||-res-||BC||
     263|| 0 || (5)|| (5)|| (5)|| (5)||   (14)   || (4) || 1||
     264Flit 1 :
     265||EOP||res||UPTID||------------NLINE-------------------||
     266|| 1 ||(1)|| (4) ||            (34)                    ||
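
A minimal sketch of the BROADCAST format is given below, assuming that the fields are laid out from MSB to LSB in the order of the table columns. The function names are hypothetical, and the sketch only builds and decodes the flits: it does not model how the network uses the bounding box.

```python
def pack_fields(fields):
    """Concatenate (value, width) pairs, first pair in the MSB bits."""
    word = 0
    for value, width in fields:
        if value < 0 or value >> width:
            raise ValueError("value %#x does not fit in %d bits" % (value, width))
        word = (word << width) | value
    return word


def broadcast_packet(xmin, xmax, ymin, ymax, source, uptid, nline):
    """Return the two 40 bits flits of a BROADCAST packet (assumed layout)."""
    flit0 = pack_fields([(0, 1), (xmin, 5), (xmax, 5), (ymin, 5), (ymax, 5),
                         (source, 14), (0, 4), (1, 1)])  # BC = 1
    flit1 = pack_fields([(1, 1), (0, 1), (uptid, 4), (nline, 34)])  # EOP = 1
    return [flit0, flit1]


def bounding_box(flit0):
    """Extract (XMIN, XMAX, YMIN, YMAX) from the first flit."""
    return tuple((flit0 >> shift) & 0x1F for shift in (34, 29, 24, 19))


flits = broadcast_packet(xmin=0, xmax=3, ymin=1, ymax=2,
                         source=0x0010, uptid=7, nline=0x2ABCDEF01)
print(bounding_box(flits[0]))  # (0, 3, 1, 2)
```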
     267
     268=== 5.4 DSPIN CLEANUP-ACK packet format (L2-to-L1 : 40 bits) ===
     269
     270This DSPIN packet contains one flit.
     271 * The DEST field contains the target L1 cache identifier (SRCID).
     272 * The SET field contains the cleared set index.
     273 * The WAY field contains the cleared way index.
     274
     275Flit 0 :
     276||EOP||---DEST---||-res-||--SET-----||-WAY-||TYPE||BC||
     277||1  || (14)     || (3) || (16)     || (2) || (3)||0 ||
     278
     279
     280=== 5.5 DSPIN CLEANUP packet format (L1-to-L2 : 33 bits) ===
     281
     282This DSPIN packet contains 2 flits.
     283 * The DEST field contains the target (X,Y) cluster coordinates.
     284 * The SOURCE field contains the source L1 cache identifier (SRCID).
     285 * The NL32 field contains the 32 LSB bits of the cache line index.
     286 * The NL2 field contains the 2 MSB bits of the cache line index.
     287 * The WAY field contains the cleared way index.
     288
     289Flit 0 :
     290||EOP||--DEST--||--SOURCE--||NL2||res||WAY||TYPE||BC||
     291|| 0 ||   (10) ||   (14)   ||(2)||(1)||(2)||(2) ||0 ||
     292
     293Flit 1 :
     294||EOP||---------------NL32--------------------------||
     295||1  ||               (32)                          ||
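
The split of the 34 bits cache line index between the two CLEANUP flits can be sketched as follows. The sketch assumes that the fields are laid out from MSB to LSB in the order of the table columns; the TYPE code is a parameter because its values are not given here, and the function names are hypothetical.

```python
def pack_fields(fields):
    """Concatenate (value, width) pairs, first pair in the MSB bits."""
    word = 0
    for value, width in fields:
        if value < 0 or value >> width:
            raise ValueError("value %#x does not fit in %d bits" % (value, width))
        word = (word << width) | value
    return word


def cleanup_packet(dest, source, way, type_code, nline):
    """Return the two 33 bits flits of a CLEANUP packet (assumed layout)."""
    if nline < 0 or nline >> 34:
        raise ValueError("the cache line index must fit in 34 bits")
    nl2 = nline >> 32           # 2 MSB bits of the line index
    nl32 = nline & 0xFFFFFFFF   # 32 LSB bits of the line index
    flit0 = pack_fields([(0, 1), (dest, 10), (source, 14), (nl2, 2),
                         (0, 1), (way, 2), (type_code, 2), (0, 1)])
    flit1 = pack_fields([(1, 1), (nl32, 32)])
    return [flit0, flit1]


def cleanup_nline(flits):
    """Rebuild the 34 bits line index from the two flits."""
    nl2 = (flits[0] >> 6) & 0x3
    nl32 = flits[1] & 0xFFFFFFFF
    return (nl2 << 32) | nl32


flits = cleanup_packet(dest=0x021, source=0x0214, way=1, type_code=0,
                       nline=0x289ABCDEF)
print(hex(cleanup_nline(flits)))  # 0x289abcdef
```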
     296
     297=== 5.6 DSPIN MULTI-ACK packet format ===
     298
     299This DSPIN packet contains one flit.
     300 * The DEST field contains the target L1 cache identifier (SRCID).
     301 * The UPTID field contains the UPDATE Table index.
     302 * The WAY field contains the cleared way index.
     303
     304Flit 0 :
     305||EOP||--DEST--||------res---------||UPTID||TYPE||BC||
     306|| 1 ||   (10) ||      (15)        || (4) ||(2) ||0 ||
     307
     308== 6.  External Network ==
     309
     310This network has a 3D mesh topology: All PUT/GET transactions are from N initiators to M targets (the M tiles of the L3 cache).
    254311
    255312=== 4.1 VCI parameters ===