Changes between Version 15 and Version 16 of DsxDocumentation


Timestamp:
Jan 30, 2008, 10:32:42 AM
Author:
Nicolas Pouillon
Comment:

Cosmetics, typos and generators

  • DsxDocumentation

    v15 v16  
    55== A) Goals and general principles ==
    66
    7 DSX stands for ''Design Space eXplorer''. It helps the system designer to map a multi-threaded software application
     7DSX stands for ''Design Space Explorer''. It helps the system designer to map a multi-threaded software application
    88on a multi-processor hardware architecture (MP-SoC) modeled with the SoCLib components.
    99
     
    1414
    1515A specific goal of DSX is to allow the system designer to control not only the placement of the
    16 tasks on the processors, but the placement of the software objects (execution stacks,
     16tasks on the processors, but also the placement of the software objects (execution stacks,
    1717communication buffers, synchronization locks, etc.) on the memory banks. In shared memory multi-processors
    1818architectures with several physically distributed memory banks, such control is mandatory to optimize
    1919both the performances and the power consumption.
    2020
    21 The two targeted application domains are the telecommunication applications (where the tasks are handling packets or packet descriptors), and multi-media applications (where the tasks are handling audio or video streams).
    22 
    23 The general principles of the DSX tool are the following:
     21The two targeted application domains are the telecommunication applications (where the tasks are handling packets or packet descriptors), and multimedia applications (where the tasks are handling audio or video streams).
     22
     23The general principles of DSX are the following:
    2424 * The coarse grain parallelism of the software application must be statically defined as  a '''Task & Communications Graph (TCG)'''. The number of tasks, and the communication channels between tasks should not change during execution.
    2525 * The software tasks are supposed to be written in C or C++, but - for portability reasons - the tasks must use an abstract '''System Resource Layer (SRL)''' API to access the communication and synchronizations resources.
    26  * Each task in the TCG can be implemented as a '''software task''' (software running on an embedded processor), or can be implemented as an '''hardware task''', (running as a dedicated hardware coprocessor).
    27  * DSX allows the programmer to use unprotected shared memory spaces, but the prefered inter-tasks communication mechanism use the '''MWMR middleware'''. The MWMR (Multi-Writer, Multi-Reader)communication  channels, are implemented as software FIFOs and can be shared by ''software tasks'', and by ''hardware tasks''.
     26 * Each task in the TCG can be implemented as a '''software task''' (software running on an embedded processor), or can be implemented as an '''hardware task''' (running as a dedicated hardware coprocessor).
     27 * DSX allows the programmer to use unprotected shared memory regions, but the preferred inter-task communication mechanism uses the '''MWMR middleware'''. The MWMR (Multi-Writer, Multi-Reader) communication channels are implemented as software FIFOs and can be shared by ''software tasks'' and by ''hardware tasks''.
    2828 * DSX provides classical synchronization mechanisms such as '''barriers''' and '''locks''', but inter-task synchronisation is mainly done through the data availability in the MWMR channels.
    29  * The target hardware architecture is a '''shared memory multi-processor system on chip''' (MP-SoC) using the SoCLib library of IP cores. But - in order to validate the multi-threaded software application - DSX is able to generate an executable binary code for a standard POSIX workstation.
     29 * The target hardware architecture is a '''shared memory multi-processor system on chip''' (MP-SoC) using the SoCLib library of IP cores. In order to validate the multi-threaded software application, DSX is able to generate an executable binary code for a standard POSIX workstation.
    3030 * DSX supports the '''POSIX''' compliant [https://www-asim.lip6.fr/trac/mutekh Mutek]  OS kernel for embedded MPSoCs
    31  * Finally, DSX defines the '''DSX/L''' language, based on PYTHON, that allows the system designer to describe in a single file the Task & Communication Graph (TCG), the MP-SoC hardware architecture, and various mapping of the TCG on the MP-Soc architecture.
     31 * DSX defines the '''DSX/L''' language, based on Python, that allows the system designer to describe in a single file the Task & Communication Graph (TCG), the MP-SoC hardware architecture, and various mappings of the TCG on the MP-SoC architecture.
    3232
    3333The DSX/L script execution generates the binary code executable on the workstation, the
    34 SystemC model of the ''top cell'' correspondint to the MP-SoC architecture, and the binary
     34simulator corresponding to the MP-SoC architecture, and the binary
    3535code that will be uploaded in the MP-SoC embedded memory.
    3636
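To illustrate the MWMR principle listed above (a software FIFO shared by several writers and several readers, where synchronization follows from data availability), here is a minimal Python sketch. It is not the SRL API and not the DSX middleware: the class `MwmrChannel`, its `read`/`write` methods and the `run_demo` helper are hypothetical names, and the POSIX workstation target is only emulated with Python's `threading` module.

```python
import threading
from collections import deque


class MwmrChannel:
    """Hypothetical bounded FIFO shared by several writers and readers."""

    def __init__(self, depth):
        self._depth = depth                  # maximum number of stored items
        self._fifo = deque()
        self._cond = threading.Condition()   # one lock protects the whole FIFO

    def write(self, item):
        with self._cond:
            # A writer blocks while the FIFO is full.
            while len(self._fifo) >= self._depth:
                self._cond.wait()
            self._fifo.append(item)
            self._cond.notify_all()

    def read(self):
        with self._cond:
            # A reader blocks until data is available.
            while not self._fifo:
                self._cond.wait()
            item = self._fifo.popleft()
            self._cond.notify_all()
            return item


def run_demo(n_writers=2, n_readers=2, items_per_writer=50):
    """Run writers and readers on one channel and return every item received."""
    channel = MwmrChannel(depth=4)
    received = []
    received_lock = threading.Lock()
    per_reader = (n_writers * items_per_writer) // n_readers

    def writer(wid):
        for k in range(items_per_writer):
            channel.write((wid, k))

    def reader():
        for _ in range(per_reader):
            item = channel.read()
            with received_lock:
                received.append(item)

    threads = [threading.Thread(target=writer, args=(w,)) for w in range(n_writers)]
    threads += [threading.Thread(target=reader) for _ in range(n_readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

No barrier or explicit handshake is needed between the tasks in this sketch: a reader simply waits until a writer has produced data, which is the behaviour the MWMR bullet describes.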
     
    5555 * flush a MWMR channel
    5656    * '''srl_mwmr_flush( )'''
     57
    5758 * Synchronization barrier
    5859    * '''srl_barrier_wait( )'''
     60
    5961 * taking and releasing a lock
    60     * '''srl_loock_lock( )'''
     62    * '''srl_lock_lock( )'''
    6163    * '''srl_lock_unlock( )'''
     64
    6265 * accessing a shared memory space (address and size)
    6366    * '''srl_memspace_addr( )'''
    6467    * '''srl_memspace_size( )'''
    6568
    66 Three  platforms are presently supported :
     69Three  platforms are currently supported :
    6770 * Any Linux (or Unix)  workstation  supporting the POSIX threads,
    68  * MP-SoC architecture using the MUTEK/D operation system,
    69  * MP-SoC architecture using the MUTEK/S operating system,
    70 
    71 MUTEK/D is an embedded, POSIX compliant, distributed,  operating system for MP-SoCs,
    72 while MUTEK/S is an optimized version: the performances are improved, and the memory
     71 * MP-SoC architecture using the Mutek/D operating system,
     72 * MP-SoC architecture using the Mutek/S operating system,
     73
     74Mutek/D is an embedded, POSIX compliant, distributed,  operating system for MP-SoCs,
     75while Mutek/S is an optimized version: the performances are improved, and the memory
    7376footprint is reduced, at the cost of losing the POSIX compatibility.
    7477
     
    8285[[Image(MjpegCourse:mjpeg.png)]]
    8386
    84 The two TG & RAMDAC tasks will be implemented as hardware coprocessors : the TG component implements a wire-less receiver for the MJPEG stream, and the RAMDAC component is a graphic display controller.
     87The two TG & RAMDAC tasks will be implemented as hardware coprocessors : the TG component implements a wireless receiver for the MJPEG stream, and the RAMDAC component is a graphic display controller.
    8588The 5 other tasks can be implemented as ''software tasks'' or  as ''hardware tasks''. In this particular example,
    8689all MWMR communication channels have one single producer, and one
    87 single consumer, which is frequent for stream oriented multi-media applications.
     90single consumer, which is frequent for stream oriented multimedia applications.
    8891
    8992=== C1) Task Model definition ===
    9093
    91 As a software application can instanciate several instances of the same task, we must distinguish the task, and the task model. A task model defines the code associated to the task, and the task interface (corresponding to the system resources used by the task : MWMR communications channels, synchronization barriers, locks, and memspaces).
     94As a software application can have several instances of the same task, we must distinguish the task, and the task model. A task model defines the code associated to the task, and the task interface corresponding to the system resources used by the task (MWMR communications channels, synchronization barriers, locks, memspaces, ...).
    9295{{{
    9396task_model = TaskModel( 'model_name',
     
    97100                    barriers = [ 'barrier_name', ... ] ,
    98101                    memspaces = [ 'memspace_name', ... ] ,
    99                     signals = [ 'signal_name', ... ] ,
    100102                    impls = [ SwTask( 'func', stack_size = 1024 , sources = [ 'func.c' ] ) ] )
    101103}}}   
     
    138140 1. ''lock'' : lock protecting exclusive access
    139141
    140 === C4) Memspace definition
     142=== C4) Memspace definition ===
    141143
    142144Direct communication through shared memory buffers is supported by DSX, but there is no protection mechanism, and synchronization is the programmer's responsibility.
     
    158160my_lock = Lock( 'lock_name' )
    159161}}}
    160 In the mapping section of the DSX/L program, the lock can be explicitely placed in the memory space.
     162In the mapping section of the DSX/L program, 1 software object must be placed :
     163 1. ''lock'' : Where to place the lock
    161164
    162165=== C6) Task instanciation ===
     
    175178 1. ''run'' : processor running the task
    176179
    177 === C8) TCG definition ===
     180=== C7) TCG definition ===
    178181
    179182The Task and Communication Graph must be defined : 
     
    196199=== D1) SoCLib components  ===
    197200
    198 In the present version of DSX, each hardware component must be described by a PYTHON
     201In the current version of DSX, each hardware component must be described by a Python
    199202class that defines the component interface, and the component parameters.
    200203The list of available components can be found in SoclibComponents.
     
    221224Depending on the component type, the port designation can vary:
    222225 * When the number of ports is fixed, the ports are attributes : My_Proc0.cache defines the cache port of the MIPS processor.
    223  * When the number of port is not fixed (typivally for interconnect component, the ports are accessed through a dedicated method : the getTarget() method of the !LocalCrossbar component returns a VCI target port.
     226 * When the number of ports is not fixed (typically for interconnect components), the ports are allocated through a dedicated method : the getTarget() method of the !LocalCrossbar component allocates a VCI target port, the getInit() method allocates a VCI initiator port.
    224227The following example describes a simple system with two processors and one embedded memory:
    225228{{{
     
    250253
    251254In any shared memory architecture, the address space is a shared resource.  This resource is structured in several segments. A segment has a name, a base address, a size
    252 (number of bytes), and a cacheability attribut (Boolean). A segment is a physical entity associated to a
     255(number of bytes), and a cacheability attribute (boolean). A segment is a physical entity associated to a
    253256given VCI target. Several segments can be associated to the same VCI target, but a given segment cannot be distributed over several VCI targets.
    254257
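The segment rules stated above (name, base address, size, cacheability; several segments per VCI target, but each segment on exactly one target) can be sketched in plain Python. This is an illustration under assumptions, not the DSX `Segment` or `MappingTable` implementation: `Seg`, `build_table` and `decode` are hypothetical names, and every address and size except the `0xbfc00000` reset base is made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Seg:
    """Hypothetical stand-in for a segment: name, base, size, cacheability."""
    name: str
    base: int
    size: int      # number of bytes
    cached: bool

    @property
    def end(self):
        return self.base + self.size


def build_table(assignment):
    """Build a sorted (segment, target) table from {target name: [Seg, ...]}.

    Rejects a segment assigned to more than one target and overlapping ranges.
    """
    table = []
    seen = set()
    for target, segs in assignment.items():
        for seg in segs:
            if seg.name in seen:
                raise ValueError("segment %r assigned twice" % seg.name)
            seen.add(seg.name)
            table.append((seg, target))
    table.sort(key=lambda entry: entry[0].base)
    for (a, _), (b, _) in zip(table, table[1:]):
        if a.end > b.base:
            raise ValueError("segments %r and %r overlap" % (a.name, b.name))
    return table


def decode(table, address):
    """Return the (target, segment) pair that owns the given address."""
    for seg, target in table:
        if seg.base <= address < seg.end:
            return target, seg
    raise LookupError("no segment at 0x%08x" % address)


# Three segments on one memory target, one on a (hypothetical) peripheral.
seg_reset = Seg("reset", 0xBFC00000, 0x1000, cached=True)
seg_data1 = Seg("data1", 0x10000000, 0x10000, cached=False)
seg_data2 = Seg("data2", 0x20000000, 0x10000, cached=False)
seg_tty = Seg("tty", 0x90000000, 0x10, cached=False)

table = build_table({"ram": [seg_data1, seg_data2, seg_reset], "tty": [seg_tty]})
```

With this table, `decode(table, 0xBFC00004)` resolves to the `ram` target through the `reset` segment, while an address outside every segment raises `LookupError`.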
     
    263266
    264267# Instanciating a VCI target hardware component
    265 # and Linking  the segments to this component
     268# and assigning  the segments to this component
    266269my_ram = MultiRam ( 'ram', seg_data1, seg_data2, seg_reset )
    267270}}}
     
    278281=== D4) Generic platforms ===
    279282
    280 As DSX/L is based on PYTHON, it is possible to define generic, parametrized architectutes, that can
     283As DSX/L is based on Python, it is possible to define generic, parametrized architectures, that can
    281284be reused for various applications. Those reusable architectures are derived classes
    282285from the basic '''Architecture''' class. The implementation is defined in the architecture() method.
    283286
    284 As an example we define a parameterized multi-processors architecture, called MultiProc, and containing
    285  a variable number of processors. The parameter(s) must be named, and the actual parameter value is defined when the architecture is instanciated. The parameter is referenced with the ''getParam()'' method, and it is possible to define a default value.
     287As an example we define a parameterized multi-processor architecture, called !MultiProc, and containing
     288a variable number of processors. The parameter(s) must be named, and the actual parameter value is defined
     289when the architecture is instantiated. The parameter is referenced with the ''getParam()'' method, and it
     290is possible to define a default value.
    286291{{{
    287292#################################
     
    292297    def architecture(self):
    293298
    294     # segments definition
    295     self.reset = Segment( ’reset’, address = 0xbfc00000, type = Cached )
    296     self.code = Segment( ’code’, type = Cached )
    297     self.data = Segment( ’data’, type = Uncached )
    298 
    299     # components instanciation and connexion
    300     self.vgmn = Vgmn( ’vgmn’ )
    301     self.ram = MultiRam( ’ram’, self.reset, self.code, self.data )
    302     # processors and caches
    303     self.cpus = []
    304     for i in self.getParam( ’nbcpu’ ):
    305         m = Mips( ’mips%d’%i )
    306         self.cpus.append( m )
    307         c = Xcache( ’cache%d’%i )
    308         g:c.cache // m.cache )
    309         c.vci // self.vgmn.getTarget() )
    310     self.vgmn.getTarget() // self.c1
    311     self.vgmn.getTarget() // self.c2
    312     self.vgmn.getInit() // self.ram
    313 
    314     # base definition
    315     self.setBase( self.vgmn )
    316 
    317     # segment table initialization
    318     self.setConfig(’mapping_table’, MappingTable() )
     299        # segments definition
     300        self.reset = Segment( ’reset’, address = 0xbfc00000, type = Cached )
     301        self.code = Segment( ’code’, type = Cached )
     302        self.data = Segment( ’data’, type = Uncached )
     303
     304        # components instanciation and connexion
     305        self.vgmn = Vgmn( ’vgmn’ )
     306        self.ram = MultiRam( ’ram’, self.reset, self.code, self.data )
     307        # processors and caches
     308        self.cpus = []
     309        for i in self.getParam( ’nbcpu’ ):
     310            m = Mips( ’mips%d’%i )
     311            self.cpus.append( m )
     312            c = Xcache( ’cache%d’%i )
     313            g:c.cache // m.cache )
     314            c.vci // self.vgmn.getTarget() )
     315        self.vgmn.getTarget() // self.c1
     316        self.vgmn.getTarget() // self.c2
     317        self.vgmn.getInit() // self.ram
     318
     319        # base definition
     320        self.setBase( self.vgmn )
     321
     322        # segment table initialization
     323        self.setConfig(’mapping_table’, MappingTable() )
    319324
    320325####################################
     
    331336=== E1) Mapper declaration ===
    332337
    333 AS it is possible to define various mapping for a given TCG, and a given architecture, we must define a third object : this ''mapper'' will contain all the mapping directives defined by the system designer.
     338As it is possible to define various mappings for a given TCG and a given architecture, we must define a third object : this ''mapper'' will contain all the mapping directives defined by the system designer.
    334339{{{
    335340my_mapper = Mapper( my_tcg, my_architecture )
     
    339344
    340345The mapper has a method ''map()'' that is used to assign a software object to an hardware component.
    341 An hardware component can b a processor, or a segment associated to an embedded memory bank,
     346A hardware component can be a processor, a segment associated to an embedded memory bank,
    342347or a segment associated to an addressable peripheral.
    343348{{{
     
    367372to various outputs : binary code for the software application, hardware architecture simulation model, etc.
    368373
    369 
     374This involves a code generator. Several code generators exist; they may apply to different parts of your design:
     375 * Software only (Tcg object)
     376  * Posix() for generating native workstation code
     377 * Software and hardware (Mapper object)
     378  * MutekS() to use Mutek/S as supporting embedded OS
     379  * MutekD() to use Mutek/D as supporting embedded OS
     380  * any hardware generator (those on next lines), this will create a platform automatically loading the embedded software
     381 * Hardware only (Hardware object)
     382  * Caba() to create a CABA netlist (with SoCLib)
     383  * Tlmt() to create a TLM-T netlist (with SoCLib)
     384
     385The user may want a convenience Makefile in the platform root that builds all the code;
     386it can be created by passing all the generators used to generate code to TopMakefile().
     387
     388Example: Let's create
     389 * An application mapped on an hardware platform with CABA and TLM-T simulators
     390 * a corresponding application for the workstation
     391 * a top Makefile:
     392
     393{{{
     394
     395soft = Tcg( ... )
     396hard = Hardware( ... )
     397
     398mapping = Mapper( soft, hard )
     399
     400mapping.map( ... )
     401
     402# Generators now:
     403
     404muteks_generator = MutekS()
     405caba_generator = Caba()
     406tlmt_generator = Tlmt()
     407
     408posix_generator = Posix()
     409
     410# MutekS and simulators (Caba / Tlmt) generate platform and embedded software for a mapping:
     411
     412mapping.generate( muteks_generator, caba_generator, tlmt_generator )
     413
     414# Posix generates code for a Tcg
     415
     416soft.generate( posix_generator )
     417
     418# TopMakefile takes the used generators:
     419
     420TopMakefile( muteks_generator, caba_generator, tlmt_generator, posix_generator )
     421}}}
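The generator list in this section pairs each generator with the kind of object it applies to (Tcg, Mapper or Hardware). A small Python sketch of that pairing follows. It is only an illustration: `APPLICABLE` and `check_generators` are hypothetical names that are not part of DSX, and the real generator classes are referred to by name only.

```python
# Hypothetical table transcribed from the generator list in this section:
# which generators may be passed to generate() on which kind of object.
APPLICABLE = {
    "Tcg": {"Posix"},
    "Mapper": {"MutekS", "MutekD", "Caba", "Tlmt"},
    "Hardware": {"Caba", "Tlmt"},
}


def check_generators(object_kind, generator_names):
    """Return the generator names if all apply to object_kind, else raise."""
    allowed = APPLICABLE[object_kind]
    rejected = [name for name in generator_names if name not in allowed]
    if rejected:
        raise ValueError(
            "%s cannot be used on a %s object" % (", ".join(rejected), object_kind)
        )
    return list(generator_names)
```

For instance, `check_generators("Mapper", ["MutekS", "Caba", "Tlmt"])` accepts the combination used in the example above, while `check_generators("Tcg", ["Caba"])` raises `ValueError` because a hardware generator needs a hardware description.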
     422