



# An introduction to high-level synthesis

# *(or how to generate hardware architectures from the C language)*

### Université de Bretagne-Sud Lab-STICC

#### Philippe COUSSY philippe.coussy@univ-ubs.fr

## **Productivity gap**



## Context




## **Design methodologies**

Automation of synthesis and verification has always been a key factor in the evolution of the design process:

- It allows the design space to be explored efficiently and rapidly
- It enables correct-by-construction design

## **Design methodologies**

#### Software domain

- Machine code (binary sequence)
- 1950s: concept of assembly language (and assembler)
  - based on mnemonics
  - Maurice V. Wilkes, University of Cambridge
- Later: high-level languages and compilers
  - 1951: first compiler
    - (A-0 system) by Grace Hopper
  - Fortran, 1954-1957: first high-level language
    - FORmula TRANslator
  - Cobol 1959, Basic 1964, C 1972, C++ 1983...

#### High-level language

- Platform independent
- Follows rules similar to those of human language
  - with a grammar, a syntax and semantics
- Provides flexibility and portability
  - by hiding the details of the computer architecture

## **Design methodologies**

#### Hardware domain

- 1960s: ICs were done by hand
  - designed, optimized and laid out
- 1970s: gate-level simulation
- End of the 1970s: cycle-based simulation
- 1980s: wide automation
  - place & route, schematic circuit capture, formal verification and static timing analysis
- Mid 1980s: hardware description languages
  - 1986 Verilog, 1987 VHDL
- 1990s: logic synthesis
  - VHDL and Verilog synthesizable subsets
- Mid 1990s:
  - High-level synthesis (first generation)
  - Co-design, IP-core reuse...
- 2000s: Electronic System Level (ESL)
  - System-level languages
    - SystemC, SystemVerilog...
    - Virtual prototyping, Transaction Level Modeling (TLM)...

## Design gap

#### **SOC Design Cost Model**



## **Electronic System Level Design (ESLD)**



## **ESL Market**



## **Outline**

#### Lab-STICC

#### General context

#### High-Level Synthesis

- Brief introduction
- "In details"

#### GAUT

- Overview
- Results

#### Conclusion

#### References

## **Typical HW design flow**

#### Starting from a Register Transfer Level description, generate an IC layout



## **Typical HW design flow**

#### Starting from a functional description, automatically generate an RTL architecture



## **High-level synthesis**

#### Starting from a functional description, automatically generate an RTL architecture

#### **Constraints**

- Timing constraints: latency and/or throughput
- Resource constraints: #Operators and/or #Registers and/or #Memory, #Slices...

#### Objectives

- Minimization: area i.e. resources, latency, power consumption...
- Maximization: throughput

## **Synthesis steps**

#### **Compilation**

Generates a formal model of the specification

#### Selection

Chooses the architecture of the operators

#### Allocation

Defines the number of operators for each selected type

#### Scheduling

Defines the execution time (control step) of each operation

#### Binding (or Assignment)

- Defines which operator will execute a given operation
- Defines which memory element will store each data item

#### Architecture generation

Writes out the RTL source code in the target language, e.g. VHDL

## **HLS steps: inputs**



## **HLS steps: Compilation**



## **Operator architecture**

#### Full 1-bit adder: X + Y + Z

- X, Y are the operands
- Z is the input carry



#### **Ripple Carry Adder**

- Add two integers A and B
- Cascade of 1-bit adders => Ripple-Carry Adder
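
The cascade described above can be sketched in C. This is only a bit-level software model, not synthesizable RTL; the function names `full_adder` and `ripple_carry_add` and the `width` parameter (assumed to be between 1 and 32) are illustrative choices that do not come from the slides.

```c
#include <stdint.h>

/* Full 1-bit adder: x and y are the operands, z is the input carry. */
static void full_adder(unsigned x, unsigned y, unsigned z,
                       unsigned *sum, unsigned *carry_out)
{
    *sum = x ^ y ^ z;
    *carry_out = (x & y) | (z & (x ^ y));
}

/* Ripple-carry adder: `width` full adders chained through their carries.
 * Assumes 1 <= width <= 32. */
uint32_t ripple_carry_add(uint32_t a, uint32_t b, unsigned width,
                          unsigned *carry_out)
{
    uint32_t result = 0;
    unsigned carry = 0;

    for (unsigned i = 0; i < width; i++) {
        unsigned s;
        /* The carry produced by bit i is the input carry of bit i+1. */
        full_adder((a >> i) & 1u, (b >> i) & 1u, carry, &s, &carry);
        result |= (uint32_t)s << i;
    }
    *carry_out = carry;
    return result;
}
```

The loop makes the drawback of this operator visible: bit i cannot be computed before the carry of bit i-1 is known, so the delay grows with the word width.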



## **Operator architecture**

#### **Carry Look-ahead adder CLA**

Uses a carry generator to compute all the carries concurrently
*faster but also larger than the RCA*
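
As a rough illustration, a 4-bit carry look-ahead adder can be modelled in C with the usual generate/propagate equations. The 4-bit width and the name `cla4_add` are assumptions made for this sketch; each carry is written as a flat expression of the inputs, which is what lets hardware compute them concurrently (and what makes the logic larger than in the RCA).

```c
/* 4-bit carry look-ahead adder (software model).
 * g_i = a_i & b_i (generate), p_i = a_i ^ b_i (propagate). */
unsigned cla4_add(unsigned a, unsigned b, unsigned c0, unsigned *carry_out)
{
    unsigned g = a & b & 0xFu;
    unsigned p = (a ^ b) & 0xFu;
    unsigned g0 = g & 1u, g1 = (g >> 1) & 1u, g2 = (g >> 2) & 1u, g3 = (g >> 3) & 1u;
    unsigned p0 = p & 1u, p1 = (p >> 1) & 1u, p2 = (p >> 2) & 1u, p3 = (p >> 3) & 1u;

    c0 &= 1u;
    /* Every carry depends only on the inputs, not on the previous carry. */
    unsigned c1 = g0 | (p0 & c0);
    unsigned c2 = g1 | (p1 & g0) | (p1 & p0 & c0);
    unsigned c3 = g2 | (p2 & g1) | (p2 & p1 & g0) | (p2 & p1 & p0 & c0);
    unsigned c4 = g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & g0)
                | (p3 & p2 & p1 & p0 & c0);

    unsigned carries = c0 | (c1 << 1) | (c2 << 2) | (c3 << 3);
    *carry_out = c4;
    return p ^ carries;   /* sum_i = p_i ^ c_i */
}
```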



## Library characterization

The RTL architecture produced by HLS depends on the capabilities and characteristics of the available operators.

Library processing reads the available libraries and determines the functional, timing, and area characteristics of the available parts.



## **HLS steps: Selection**



## **HLS steps: allocation**



## **HLS steps: scheduling**



## **HLS steps: binding**



## **HLS steps: output**



## **RTL Architecture**

#### **Controller**

- FSM controller
- Programmable controller

#### Datapath components

- Storage components
- Functional units
- Connection components



Source: Embedded System Design, © 2009, Gajski, Abdi, Gerstlauer, Schirner

## **Example**

#### This architecture performs the following operations:

- store two variables coming from port P1 in R1 and R2
- store one variable coming from port P2 in R3
- add the variables stored in R1 and R3 and put the result in R4
- add the variables stored in R2 and R3 and put the result in R4
- connect either R1 or R2 to A1
  - the control unit manages this connection through M1
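
The behaviour of this small datapath can be mimicked in C. The sketch below is a software model only: the `Datapath` struct, the function names and the encoding of the M1 select signal (0 for R1, 1 for R2) are assumptions introduced for illustration.

```c
/* Registers of the example datapath. */
typedef struct {
    int r1, r2, r3, r4;
} Datapath;

/* Load step: two values arrive on port P1, one on port P2. */
void load_inputs(Datapath *dp, int p1_first, int p1_second, int p2)
{
    dp->r1 = p1_first;
    dp->r2 = p1_second;
    dp->r3 = p2;
}

/* Add step: the controller drives M1 to route R1 or R2 to the adder A1,
 * whose second operand is R3; the result is stored in R4.
 * Assumed encoding: m1_select == 0 selects R1, otherwise R2. */
void add_step(Datapath *dp, int m1_select)
{
    int left = m1_select ? dp->r2 : dp->r1;  /* multiplexer M1 */
    dp->r4 = left + dp->r3;                  /* adder A1 */
}
```

Calling `add_step` twice with the two select values reproduces the two additions of the list with a single shared adder.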



## **RTL architecture**



Source : Embedded System Design, © 2009, Gajski, Abdi, Gerstlauer, Schirner

## **Problem examples and design flow**

## **Resource constrained HLS**

#### Limited number of resources

- e.g.: 2 multipliers, 3 adders
- Pseudo architecture

#### Schedule operations according to the operators available in the current control step

#### Objectives

Minimize the latency or maximize the throughput

- based on operation mobility, i.e. operation urgency

#### **Allocation and then Scheduling**

# **Time constrained HLS**

#### Latency constraint

e.g. 5 clock cycles to process all the data

### Throughput constraint

- Cadence, initiation interval...
- e.g. process a new set of input data every 5 cycles

### Schedule operations using as many operators as needed

### **Objective**

Minimize the circuit area

### Scheduling and then Allocation

# **Design flows**



# And a lot of other problems...

### Variable merging

Storage sharing

### Operation merging

Operator sharing

### **Connection merging**

Bus sharing

### Register merging

Register file...

### **Chaining**

Several sequential operations in a cycle

### Multi-cycling

One operation takes more than one clock cycle to execute

### Pipelining

Pipelined Datapath, pipelined operator, pipelined controller
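
The timing conditions behind chaining and multi-cycling can be written as two small checks. This is a simplified sketch: the function names are hypothetical, delays are expressed in integer nanoseconds, and register setup time and multiplexer delays are ignored.

```c
/* Multi-cycling: number of clock cycles an operator needs when its
 * propagation delay exceeds the clock period (ceiling division). */
int cycles_needed(int delay_ns, int clock_ns)
{
    return (delay_ns + clock_ns - 1) / clock_ns;
}

/* Chaining: a sequence of dependent operations fits in one cycle
 * if the sum of their delays does not exceed the clock period. */
int can_chain(const int *delays_ns, int count, int clock_ns)
{
    int total = 0;
    for (int i = 0; i < count; i++)
        total += delays_ns[i];
    return total <= clock_ns;
}
```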

# Chaining, multi-cycling



# **High-level synthesis goal**

### Starting from a functional description, automatically generate an RTL architecture

- Mathematical formulas
- Matlab/Simulink
- C/C++/SystemC
- ...

# Synthesizable models

#### **C** for synthesis:

- No pointers
  - when they cannot be resolved statically
  - Arrays are allowed!
- No standard library function calls
  - *printf, scanf, fopen, malloc...*
- User function calls are allowed
  - They can be in-lined or not
- Finite precision
  - Bit-accurate integers, fixed point, signed, unsigned...
  - Based on SystemC or Mentor Graphics data types
    - sc\_int, sc\_fixed
    - ac\_int, ac\_fixed

## Synthesizable models

### **C** for the synthesis:

Finite precision

- bit-accurate integers, fixed point, signed, unsigned...

S/W C: overflow checks everywhere, e.g. `unsigned int x, y, z, cy; z = x + y; if (0xFF..FF - x >= y) cy = 0; else cy = 1;` where `cy` stands for bit 32 of the sum.

H/W C: the check is unnecessary.
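
The contrast can be made concrete with two C functions. The first follows the software-style check of the slide; the second models what a bit-accurate 33-bit result would give directly. Using `uint64_t` as a stand-in for a 33-bit type, as well as the function names, are assumptions of this sketch.

```c
#include <stdint.h>

/* Software style: the carry has to be detected with an explicit test,
 * because the 32-bit sum silently wraps around. */
uint32_t add_with_carry_sw(uint32_t x, uint32_t y, unsigned *cy)
{
    *cy = (UINT32_MAX - x >= y) ? 0u : 1u;
    return x + y;
}

/* Hardware style: the result is one bit wider than the operands,
 * so the carry is simply bit 32 of the sum. */
uint32_t add_with_carry_wide(uint32_t x, uint32_t y, unsigned *cy)
{
    uint64_t wide = (uint64_t)x + y;   /* stand-in for a 33-bit value */
    *cy = (unsigned)((wide >> 32) & 1u);
    return (uint32_t)wide;
}
```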

## Purely functional Example #1: a simple C code

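```
#define N 16

/* FIR filter: `main` is the top-level function of the description. */
int main(int data_in, int *data_out)
{
  static const int Coeffs[N] = {98,-39,-327,439,950,-2097,-1674,9883,9883,-1674,-2097,950,439,-327,-39,98};
  static int Values[N];   /* delay line; static so that samples persist between calls */
  int temp;
  int sample, i, j;

  sample = data_in;
  temp = sample * Coeffs[N-1];
  /* multiply-accumulate over the stored samples */
  for (i = 1; i <= (N-1); i++) {
    temp += Values[i] * Coeffs[N-i-1];
  }
  /* shift the delay line by one position */
  for (j = (N-1); j >= 2; j -= 1) {
    Values[j] = Values[j-1];
  }
  Values[1] = sample;
  *data_out = temp;
  return 0;
}
```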

### Purely functional example #2: bit accurate C++ code

```
temp = sample * Coeffs[N-1];
for (int i = 1; i <= (N-1); i++) {
    temp = Values[i] * Coeffs[N-i-1] + temp;
}

for (int j = (N-1); j >= 2; j -= 1) {
    Values[j] = Values[j-1];
}
Values[1] = sample;

data_out = temp;
return 0;
}
```

# **Fixed-point**



# **Fixed point: rounding mode**



SC\_RND, SC\_TRN

# **Fixed point: overflow mode**
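
The effect of the rounding and overflow modes can be emulated with plain integers. The sketch below is an approximation written for illustration, not the SystemC implementation: `quantize_trn` and `quantize_rnd` mimic SC\_TRN (truncation toward minus infinity) and SC\_RND (round to nearest, ties toward plus infinity) when `shift` fractional bits are dropped; `overflow_sat` and `overflow_wrap` mimic the SystemC overflow modes SC\_SAT and SC\_WRAP, which are not named in the extracted text. All function names are hypothetical, `shift` is assumed to be below 31, `width` between 2 and 31, and the values small enough not to overflow `int32_t`.

```c
#include <stdint.h>

/* floor(value / 2^shift), written without shifting negative numbers. */
static int32_t floor_div_pow2(int32_t value, unsigned shift)
{
    int32_t d = (int32_t)1 << shift;
    int32_t q = value / d;          /* C division truncates toward zero */
    if (value % d < 0)
        q -= 1;                     /* correct toward minus infinity */
    return q;
}

/* SC_TRN-like: drop `shift` low bits by truncation. */
int32_t quantize_trn(int32_t value, unsigned shift)
{
    return floor_div_pow2(value, shift);
}

/* SC_RND-like: add half an LSB of the result, then truncate. */
int32_t quantize_rnd(int32_t value, unsigned shift)
{
    if (shift == 0)
        return value;
    return floor_div_pow2(value + ((int32_t)1 << (shift - 1)), shift);
}

/* SC_SAT-like: clamp to the range of a signed `width`-bit integer. */
int32_t overflow_sat(int32_t value, unsigned width)
{
    int32_t max = ((int32_t)1 << (width - 1)) - 1;
    int32_t min = -max - 1;
    if (value > max) return max;
    if (value < min) return min;
    return value;
}

/* SC_WRAP-like: keep the low `width` bits, read as two's complement. */
int32_t overflow_wrap(int32_t value, unsigned width)
{
    uint32_t mask = ((uint32_t)1 << width) - 1u;
    uint32_t sign = (uint32_t)1 << (width - 1);
    uint32_t low = (uint32_t)value & mask;
    return (int32_t)(low ^ sign) - (int32_t)sign;
}
```

For instance, with two fractional bits the value 14 (3.5) becomes 3 with truncation and 4 with rounding, and 200 stored on 8 signed bits becomes 127 with saturation and -56 with wrap-around.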



## **Bit accurate operation**

### □ Sign extension before the computation
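
A minimal sketch of sign extension in C, assuming two's complement operands stored in the low bits of an unsigned word; the helper name `sign_extend` is hypothetical and `width` is assumed to be between 1 and 31.

```c
#include <stdint.h>

/* Interpret the low `width` bits of `raw` as a two's complement number
 * and extend its sign to a full 32-bit integer. */
int32_t sign_extend(uint32_t raw, unsigned width)
{
    uint32_t sign = (uint32_t)1 << (width - 1);
    raw &= (sign << 1) - 1u;                  /* keep only `width` bits */
    return (int32_t)(raw ^ sign) - (int32_t)sign;
}
```

Adding the 4-bit pattern `0x9` (which encodes -7) to the 8-bit pattern `0x05` gives -2 once both operands are extended, whereas adding the raw patterns would give 14.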



# **Fixed point operation**



# **High-level synthesis goal**

### Starting from a functional description, automatically generate an RTL architecture

- Algorithmic description
  - *no notion of timing in the source code*
- Behavioral description
  - notion of steps / local timing constraints in the source code
  - e.g. by using the wait statements of SystemC
- The description can be
  - *"RTL oriented"*
  - *"Function oriented"*

# **High-level synthesis**

### Starting from a functional description, automatically generate an RTL architecture

#### Algorithmic description

- No notion of timing in the source code
- Mainly oriented toward data-dominated applications
  - Processing-intensive algorithms such as filters...
- The initial description can be
  - "RTL oriented"
  - "Function oriented"

#### Behavioral description

- Notion of steps / local timing constraints in the source code
  - e.g. by using the wait statements of SystemC
- **Can be used for both data- and control-dominated applications**
  - Interface controller, DMA...
  - Filters...

# **Behavioral description**

### Behavioral description

- Notion of steps / local timing constraints in the source code
  - e.g. by using the wait statements of SystemC



# Function vs. RTL description

Function-based C code:

```
int OnesCounter(int Data) {
    int Ocount = 0;
    int Temp, Mask = 1;
    while (Data > 0) {
        Temp = Data & Mask;       /* lowest bit of Data */
        Ocount = Ocount + Temp;   /* accumulate the number of ones */
        Data >>= 1;
    }
    return Ocount;
}
```

RTL-based C code:

```
while (1) {
    while (Start == 0);   /* wait for the Start signal */
    Done = 0;
    Data = Input;
    Ocount = 0;
    Mask = 1;
    while (Data > 0) {
        Temp = Data & Mask;
        Ocount = Ocount + Temp;
        Data >>= 1;
    }
    Output = Ocount;
    Done = 1;
}
```

Source: Embedded System Design, © 2009, Gajski, Abdi, Gerstlauer, Schirner

# **High-level transformations**







**Loop unrolling**

- No unrolling: 1 adder shared for 4 additions, latency = 4 cycles
- Unrolling = 4 (full): `r[0] = a[0] + b[0]; r[1] = a[1] + b[1]; r[2] = a[2] + b[2]; r[3] = a[3] + b[3];` 4 adders in parallel, latency = 1 cycle

**Loop merging**

- No merging: `for (i = 0; i < 32; i++) { a[i] = b[i] * c[i]; }` followed by `for (i = 0; i < 16; i++) { z[i] = a[i] + x[i]; }`. The loops execute sequentially, latency = 48 cycles
- Merging enabled: `for (i = 0; i < 32; i++) { atmp = b[i] * c[i]; if (i < 16) z[i] = atmp + x[i]; }`. The loops execute in parallel, latency = 32 cycles
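
The functional equivalence of the two loop structures can be checked in plain C. The sketch below is an illustration only: the function names and the macros `N_MUL` and `N_ADD` are assumptions, and the merged version also stores `a[i]` (which the merged loop of the slide omits) so that both functions produce the same outputs.

```c
#define N_MUL 32
#define N_ADD 16

/* Two separate loops: 32 iterations followed by 16 iterations. */
void loops_separate(const int b[N_MUL], const int c[N_MUL],
                    const int x[N_ADD], int a[N_MUL], int z[N_ADD])
{
    for (int i = 0; i < N_MUL; i++)
        a[i] = b[i] * c[i];
    for (int i = 0; i < N_ADD; i++)
        z[i] = a[i] + x[i];
}

/* Merged loop: a single 32-iteration loop, the addition being guarded. */
void loops_merged(const int b[N_MUL], const int c[N_MUL],
                  const int x[N_ADD], int a[N_MUL], int z[N_ADD])
{
    for (int i = 0; i < N_MUL; i++) {
        int atmp = b[i] * c[i];
        a[i] = atmp;
        if (i < N_ADD)
            z[i] = atmp + x[i];
    }
}
```

In software both versions cost about the same; the latency figures quoted on the slide (48 versus 32 cycles) come from the hardware schedule, where the merged body lets the multiplication and the addition of one iteration share the same loop control.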

# **High-level transformations**

#### **Loops**

- Loop pipelining
- Loop unrolling
  - None, partial, complete
- Loop merging
- Loop tiling
- ...

#### Arrays

- Arrays can be mapped on memory banks
- Arrays can be synthesized as registers
- Constant arrays can be synthesized as logic
- ...

#### Functions

- Function calls can be in-lined
- A function can be synthesized as an operator
  - Sequential, pipelined, functional unit...
- Single function instantiation
- ...

# Compilation

#### Optimization

- Constant folding
- Dead code elimination
- Common sub-expression elimination
  - *Eliminates redundant operations*
- ...
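
These three optimizations can be illustrated on a toy function. Both functions below are written by hand for this sketch (the names `before_optimization` and `after_optimization` are hypothetical); the second shows what a compiler front end could plausibly produce from the first.

```c
/* Original description. */
int before_optimization(int a, int b)
{
    int scale = 4 * 8;        /* constant expression */
    int y = a - b;            /* dead code: overwritten before being read */
    y = (a + b) * scale + (a + b);
    return y;
}

/* After the three optimizations. */
int after_optimization(int a, int b)
{
    int s = a + b;            /* common sub-expression computed once */
    return s * 32 + s;        /* 4 * 8 folded into 32, dead code removed */
}
```

For hardware generation the benefit is direct: one adder operation and one subtraction disappear from the graph that will be scheduled and bound.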

#### Formal model

- Inputs, outputs, and operations of the algorithm are identified
- Data and/or control dependencies are determined
- Intermediate representation is generated

# **Control-Flow Graph CFG**

### **Exhibits operation sequences**

Through control dependencies

### The sequence of operations comes directly from the source code

- The sequence is kept unchanged
  - This limits the parallelism, so this representation should be reserved for modelling control-oriented applications

## **Example**

1: $t = a+b$;  
2: $u = a'-b'$;  
3: *if* $(a<b)$  
4: $v = t+c$;  
*else*  
{  
5: $w = u+c'$;  
6: $v = w-d$;  
}  
7: $x = v+e$;  
8: $y = v-e$;

Source code

Graphical representation

# Example (2)



# **Data Flow Graph DFG**

### Exhibits the parallelism between operations

Through data dependencies
 *Variable node, operation node*

Intermediate representation



 $O = ((n_{01}+n_{02}) \times n_{12}) - (n_{21}+n_{22})$ 

# CDFG => DFG

### Exhibits the parallelism between operations

- Through data dependencies
  - Variable node, operation node

### Loops are completely unrolled



### **Conditional assignments are transformed**

- i.e. *if/switch* constructs are resolved by creating multiplexed values
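
The transformation of a conditional assignment into a multiplexed value can be sketched in C. The two functions below are illustrative (their names and the spelling `c_prime` for c' are assumptions); the second computes both branches unconditionally and selects the result, which is what a multiplexer does in the data flow graph.

```c
/* Control-flow form: only one branch is executed. */
int conditional_assignment(int a, int b, int c, int d,
                           int t, int u, int c_prime)
{
    int v;
    if (a < b) {
        v = t + c;
    } else {
        int w = u + c_prime;
        v = w + d;
    }
    return v;
}

/* Data-flow form: both branches are computed, a multiplexer selects. */
int multiplexed_value(int a, int b, int c, int d,
                      int t, int u, int c_prime)
{
    int tmp1 = t + c;            /* "then" branch */
    int w = u + c_prime;         /* "else" branch, first operation */
    int tmp2 = w + d;            /* "else" branch, second operation */
    return (a < b) ? tmp1 : tmp2;   /* multiplexer driven by the comparison */
}
```

The data-flow form exposes more parallelism, at the price of computing operations whose result may be discarded.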

# Example



## **Example**

Source code (in the graph, the two branch results are labelled tmp1 and tmp2 and feed a multiplexer):

1: $t = a+b$;  
2: $u = a'-b'$;  
3: *if* $(a<b)$  
4: $v = t+c$;  
*else*  
{  
5: $w = u+c'$;  
6: $v = w+d$;  
}  
7: $x = v+e$;  
8: $y = v-e$;

# **Data Flow Graph DFG**

### **Scheduling**

- Resource constrained
  - Latency minimization
    - List scheduling...
  - Throughput maximization
    - Modulo scheduling (IMS, SMS...)
- Time constrained
  - Resource minimization
    - Force-directed scheduling, ILP...

### Linear FSM controller

Worst execution time for the conditional assignments


# **List-scheduling**



#### **Constraints**

- 1 adder (1 cycle)
- 1 subtractor (1 cycle)
- 1 comparing component (1 cycle)
- No chaining

ASAP







### Priority = 1/Mobility
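
A compact sketch of resource-constrained list scheduling with this priority rule is given below. It is an illustration under several assumptions: the small DFG in `dfg` is hypothetical (it is not the graph of the slides), every operator takes one cycle, there is no chaining, operations are stored in topological order, and mobility is computed once as ALAP minus ASAP. A smaller mobility means a higher priority.

```c
#define NUM_OPS 7
#define MAX_PREDS 2

enum { ADD = 0, SUB = 1, NUM_TYPES = 2 };

typedef struct {
    int type;               /* ADD or SUB */
    int num_preds;
    int preds[MAX_PREDS];   /* indices of predecessor operations */
} Op;

/* Hypothetical DFG, listed in topological order. */
static const Op dfg[NUM_OPS] = {
    {ADD, 0, {0, 0}},   /* op0: independent addition (large mobility) */
    {ADD, 0, {0, 0}},   /* op1 */
    {ADD, 0, {0, 0}},   /* op2 */
    {SUB, 0, {0, 0}},   /* op3 */
    {ADD, 2, {1, 2}},   /* op4 = op1 + op2 */
    {SUB, 2, {3, 4}},   /* op5 = op3 - op4 */
    {ADD, 1, {5, 0}},   /* op6 uses op5 */
};

/* ASAP step of each operation; returns the critical path length. */
static int compute_asap(int asap[NUM_OPS])
{
    int latency = 0;
    for (int i = 0; i < NUM_OPS; i++) {
        asap[i] = 1;
        for (int p = 0; p < dfg[i].num_preds; p++)
            if (asap[dfg[i].preds[p]] + 1 > asap[i])
                asap[i] = asap[dfg[i].preds[p]] + 1;
        if (asap[i] > latency)
            latency = asap[i];
    }
    return latency;
}

/* ALAP step of each operation for the given latency. */
static void compute_alap(int latency, int alap[NUM_OPS])
{
    for (int i = 0; i < NUM_OPS; i++)
        alap[i] = latency;
    for (int i = NUM_OPS - 1; i >= 0; i--)
        for (int p = 0; p < dfg[i].num_preds; p++)
            if (alap[i] - 1 < alap[dfg[i].preds[p]])
                alap[dfg[i].preds[p]] = alap[i] - 1;
}

/* List scheduling: capacity[t] operators of type t (each >= 1).
 * Fills step_of[] (1-based control steps) and returns the latency. */
int list_schedule(const int capacity[NUM_TYPES], int step_of[NUM_OPS])
{
    int asap[NUM_OPS], alap[NUM_OPS];
    int remaining = NUM_OPS, step = 0;

    compute_alap(compute_asap(asap), alap);
    for (int i = 0; i < NUM_OPS; i++)
        step_of[i] = 0;                       /* 0 means unscheduled */

    while (remaining > 0) {
        step++;
        for (int t = 0; t < NUM_TYPES; t++) {
            for (int unit = 0; unit < capacity[t]; unit++) {
                int best = -1;
                for (int i = 0; i < NUM_OPS; i++) {
                    int ready = (step_of[i] == 0 && dfg[i].type == t);
                    for (int p = 0; ready && p < dfg[i].num_preds; p++) {
                        int s = step_of[dfg[i].preds[p]];
                        if (s == 0 || s >= step)
                            ready = 0;        /* operand not yet available */
                    }
                    /* Smallest mobility wins. */
                    if (ready && (best < 0 ||
                        alap[i] - asap[i] < alap[best] - asap[best]))
                        best = i;
                }
                if (best < 0)
                    break;                    /* nothing ready for this type */
                step_of[best] = step;
                remaining--;
            }
        }
    }
    return step;
}
```

With one adder and one subtractor, the independent addition `op0` is postponed because its mobility is the largest, which leaves the single adder free for the operations on the critical path.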


## **List-based scheduling**

#### Scheduling under throughput constraint (cadence)

- A first operator allocation is computed that a priori supports the required parallelism
  - In many HLS approaches, an initial resource allocation is performed and subsequently modified during scheduling and/or binding => it is a lower bound
- The average parallelism is calculated separately for each type of operation of the DFG

$$avr\_opr(type) = \left\lceil\frac{nb\_ops(type)}{\left\lfloor\frac{II}{T(opr)}\right\rfloor}\right\rceil$$

With

- *II* the initiation interval (cadence)
- *nb\_ops(type)* the number of operations of type *type*
- *T(opr)* the propagation time of the operator (in cycles)
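
The average-parallelism bound is easy to evaluate in C. The sketch below assumes the outer bracket of the formula is a ceiling and the inner one a floor; the function name and the convention of returning -1 when the operator is slower than the initiation interval are assumptions made for illustration.

```c
/* Lower bound on the number of operators of one type.
 * nb_ops: number of operations of that type in the DFG
 * ii:     initiation interval, in cycles
 * t_opr:  propagation time of the operator, in cycles (> 0)
 * Returns -1 when one operator cannot complete a single operation
 * within the initiation interval (case not covered by the formula). */
int avr_opr(int nb_ops, int ii, int t_opr)
{
    int uses_per_operator = ii / t_opr;   /* floor(II / T(opr)) */
    if (uses_per_operator == 0)
        return -1;
    /* ceil(nb_ops / uses_per_operator) */
    return (nb_ops + uses_per_operator - 1) / uses_per_operator;
}
```

For example, 4 additions with an initiation interval of 3 cycles and a 1-cycle adder need at least 2 adders, and 4 adders if the adder takes 2 cycles.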

#### **List-based scheduling**



#### **Constraint**

Throughput: one iteration every 3 cycles

### Impact on the memory



Input: X[N]; constant: H[N] (in memory)

Without memory constraints:

- Iteration\_period = 60 ns
- Nb\_opr(\*) = 2
- Nb\_opr(+) = 1

The memory mapping has to be done by the user

#### **Memory constraints**

With memory constraints => 1 memory bank:

- Iteration\_period = 100 ns
- Nb\_opr(\*) = 1
- Nb\_opr(+) = 1

The memory access is the bottleneck!

Latency(arch) = 90 ns

### Impact on the I/O interface



Input: X[N]; constant: H[N] (in memory)

Without I/O constraints:

- Iteration\_period = 60 ns
- Nb\_opr(\*) = 2
- Nb\_opr(+) = 1



### I/O timing constraints



(Schedule chart, time axis from 0 to 50 ns: the inputs X(0) to X(3) and H(0) to H(3) are read, the multiplications mul5, mul9, mul15 and mul21 are followed by the additions add11, add17 and add23, and the output sum appears in the 50 ns column.)

With I/O constraints (4 input data in parallel):

- Latency = 50 ns
- Iteration\_period = 60 ns
- Nb\_opr(\*) = 3 (and not 4)
- Nb\_opr(+) = 1



## **I/O timing constraints**



With I/O timing constraints (1 data item every 4 cycles):

- Latency(arch) = 150 ns
- Iteration\_period = 170 ns
- Nb\_opr(\*) = 1
- Nb\_opr(+) = 1

#### => 1 input port / 1 output port



## **Specification**





### **Compilation => DFG**



## Scheduling





## **Timing information**



#### Formal model for variable binding



(a) Data lifetimes

#### (b) Compatibility graph

#### Timing information and formal model







(c) Compatibility graph

## **Operation binding**



# **Compatibility and conflict graphs**

Clique partitioning: binding based on a compatibility graph. An edge exists between two data items whose lifetimes do not overlap: they can share the same register.

Compatibility graph



clique (sub-graph)

Graph coloring: binding based on a conflict graph. An edge exists between two data items whose lifetimes overlap: they cannot share the same register.

Incompatibility graph



Graph coloring

## (weighted) Bipartite Graph

A bipartite graph is a graph whose vertices can be divided into two disjoint sets A and B such that every edge connects a vertex in A to one in B





## Example

Goal: maximize the reuse of existing connections between operators (multiplexer optimization) while minimizing their size.

Weight = combination of the size and the number of connections



For each cycle (control step):

- Create a bipartite graph: free operators, operations to bind
- Compute the weights



## **Bipartite Weighted Matching**



Maximum weighted bipartite matching: Hungarian method (Munkres algorithm)

#### Clique partitioning algorithm: Tseng's Algorithm



1. Group the nodes that have the greatest number of common neighbors
2. Repeat until all the edges are removed
3. Each clique corresponds to a storage unit

### Data binding : the Left Edge algorithm

Data are ordered by increasing birth date

- Leftmost data are bound to distinct registers
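
A minimal C sketch of left-edge register binding follows. It assumes half-open lifetimes `[birth, death)`, so a value born exactly when another one dies may reuse its register; the `Lifetime` struct and the function names are hypothetical, and ties between identical birth dates are broken by death date.

```c
#include <stdlib.h>

typedef struct {
    int id;      /* index of the data item, in 0..n-1 */
    int birth;   /* first control step in which the value exists */
    int death;   /* control step from which the register is free again */
} Lifetime;

/* Order by increasing birth date (the "left edge"). */
static int by_birth(const void *a, const void *b)
{
    const Lifetime *x = a, *y = b;
    if (x->birth != y->birth)
        return x->birth - y->birth;
    return x->death - y->death;
}

/* Binds each data item to a register; reg_of[id] receives the register
 * index. Sorts `v` in place and returns the number of registers used. */
int left_edge(Lifetime *v, int n, int *reg_of)
{
    int assigned = 0, reg = 0;

    qsort(v, (size_t)n, sizeof *v, by_birth);
    for (int i = 0; i < n; i++)
        reg_of[v[i].id] = -1;

    while (assigned < n) {
        int first = 1, free_from = 0;
        /* Fill the current register with non-overlapping lifetimes,
         * scanning from the leftmost unbound data item. */
        for (int i = 0; i < n; i++) {
            if (reg_of[v[i].id] != -1)
                continue;
            if (first || v[i].birth >= free_from) {
                reg_of[v[i].id] = reg;
                free_from = v[i].death;
                first = 0;
                assigned++;
            }
        }
        reg++;
    }
    return reg;
}
```

On five lifetimes [1,3), [1,2), [2,4), [3,5) and [4,6), this sketch uses two registers.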







#### Data binding : the Left Edge algorithm



#### The left-edge algorithm does not take multiplexer cost into account

# **Resource Binding**

#### Multiplexer and interconnect costs are significant.

#### A cyclic inter-dependency exists between FU binding and register binding

To minimize interconnection, each task needs the other's result to make an accurate decision



**Resource constraints: 2 FUs, 2 REGs** 



The inter-dependency is far more complicated in real designs

- Use "manual allocation" to change the FU binding around the best point
- Use a metaheuristic: Variable Neighborhood Search, simulated annealing or Tabu search

# **Resource Binding**

Register files may be used to hide the multiplexers, which are replaced by dedicated decoders

Merge registers with non-overlapping access dates











### GAUT

#### An academic, free and open source HLS tool

#### Dedicated to DSP applications

- Data-dominated algorithms
  - 1D, 2D filters
  - Transforms (Fourier, Hadamard, DCT...)
  - Channel coding, source coding algorithms

#### Input: bit-accurate C/C++ algorithm

- bit-accurate integer and fixed-point types from Mentor Graphics

#### **Output: RTL architecture**

- VHDL
- SystemC
  - CABA: cycle accurate and bit accurate
  - TLM: transaction level model
  - Compatible with both SocLib and MPARM virtual prototyping platforms
- Automated test-bench generation
- Automated operator characterization

## **GAUT: Constraints**



## **GAUT: Design flow**



## **GAUT: Compilation**

(Screenshot: GAUT 2.4.3, build 17/02/2010, C/C++ compiler view with the file `idct.c` open. The source is a 2D IDCT whose top-level function is `int main(const int32_t in[BLOCK_SIZE], int32_t Idct[BLOCK_SIZE])`; the console reports a warning about `idct_1d@rot@COS(0,1)` being used but not defined, the generation of `idct.cdfg`, and the bitwidth and sign optimization.)

## **GAUT: DFG viewer**



### **GAUT: Operators characterization**



## **GAUT: Synthesis steps**



### **GAUT: I/O and memory constraints**

Screenshot: GAUT 2.4.3 build 17/02/2010 - Lab-STICC, UBS University, Lorient (France). Option: bitwidth aware. Opened file: null. Tabs: Input/Output Constraints, Memory Constraints, Synthesis Multi Mode. The constraints are listed from the PU point of view.

| Name   | Mode  | Port | Time |
|--------|-------|------|------|
| in(0)  | Input | 1    |      |
| in(1)  | Input | 1    | 5    |
| in(2)  | Input | 1    | 10   |
| in(3)  | Input | 1    | 15   |
| in(4)  | Input | 1    | 20   |
| in(5)  | Input | 1    | 25   |
| in(6)  | Input | 1    | 30   |
| in(7)  | Input | 1    | 35   |
| in(8)  | Input | 1    | 40   |
| in(9)  | Input | 1    | 45   |
| in(10) | Input | 1    | 50   |
| in(11) | Input | 1    | 55   |
| in(12) | Input | 1    | 60   |
| in(13) | Input | 1    | 65   |
| in(14) | Input | 1    | 70   |
| in(15) | Input | 1    | 75   |
| in(16) | Input | 1    | 80   |
| in(17) | Input | 1    | 85   |
| in(18) | Input | 1    | 90   |
| in(19) | Input | 1    | 95   |
| in(20) | Input | 1    | 100  |
| in(21) | Input | 1    | 105  |
| in(22) | Input | 1    | 110  |
| in(23) | Input | 1    | 115  |
| in(24) | Input | 1    | 120  |
| in(25) | Input | 1    | 125  |
| in(26) | Input | 1    | 130  |
| in(27) | Input | 1    | 135  |
| in(28) | Input | 1    | 140  |
| in(29) | Input | 1    | 145  |
| in(30) | Input | 1    | 150  |

Time used for creating io constraints table: 16 ms

### **GAUT: Gantt viewer**

Screenshot: GAUT 2.4.3 build 17/02/2010 - Lab-STICC, UBS University, Lorient (France). Option: bitwidth aware. Library: notech_16b. Opened file: `C:\GAUT_2_4_3_testnew\test\idct_soclib\idct_UT.gantt`. Cell labels are truncated as displayed in the viewer. An overlay label in the screenshot reads "Y(0,0) [5:30]".

| Resource | 0 | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 | 65 | 70 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sll\$s\$s\$s\$1cycle.9 | sll\$s\$s\$s\$1c | | | sll\$s\$s\$s\$1c | sll\$s\$s\$s\$1c | sll\$s\$s\$s\$1c | | sll\$s\$s\$s\$1c | | sll\$s\$s\$s\$1c | sll\$s\$s\$s\$1c | sll\$s\$s\$s\$1c | sll\$s\$s\$s\$1c | sll\$s\$s\$s\$1c | sll\$s\$s\$s\$1c | |
| register.15 | in(0) | in(1) | in(2) | in(3) | in(4) | in(5) | in(6) | in(7) | in(8) | in(9) | in(10) | in(11) | in(12) | in(13) | in(14) | |
| constant.5 | const_3 | | | | | | | | | | | | | | | |
| register.16 | | Y(0,0) | | | | | idct_1d_SUB_r | | | | | | | | idct_1d_SUB | |
| register.18 | | | Y(0,1) | | | | | | | Y(1,0) | | | | | Y(1,5) | |
| mul\$s\$s\$s\$2cycle.6 | | | | mul\$s\$s\$s\$2 | | | | | | | mul\$s\$s\$s\$2 | | mul\$s\$s\$s\$2 | | mul\$s\$s\$s\$2 | |
| constant.13 | | | | idct_1d_rot | | | | | | | | | | | | |
| constant.14 | | | | idct_1d_rot_n | | | | | | | | | | | | |
| mul\$s\$s\$s\$2cycle.7 | | | | mul\$s\$s\$s\$2 | | | | mul\$s\$s\$s\$2 | | | mul\$s\$s\$s\$2 | | mul\$s\$s\$s\$2 | | mul\$s\$s\$s\$2 | |
| register.20 | | | | Y(0,2) | | Y(0,4) | Y(0,5) | | Y(0,7) | | Y(1,1) | | | | | |
| register.22 | | | | | Y(0,3) | | | Y(0,6) | | | idct_1d_SUB | | Y(1,3) | | | |
| constant.3 | | | | | const_23170 | | | | | | | | | | | |
| mul\$s\$s\$s\$2cycle.8 | | | | | mul\$s\$s\$s\$2 | | | mul\$s\$s\$s\$2 | | | mul\$s\$s\$s\$2 | | mul\$s\$s\$s\$2 | | mul\$s\$s\$s\$2 | |
| add\$s\$s\$s\$1cycle.2 | | | | | | add\$s\$s\$s\$1 | | | | add\$s\$s\$s\$1 | | | add\$s\$s\$s\$1 | add\$s\$s\$s\$1 | add\$s\$s\$s\$1 | |
| sub\$s\$s\$s\$1cycle.12 | | | | | | sub\$s\$s\$s\$1 | | | sub\$s\$s\$s\$1 | sub\$s\$s\$s\$1 | sub\$s\$s\$s\$1 | sub\$s\$s\$s\$1 | sub\$s\$s\$s\$1 | sub\$s\$s\$s\$1 | | |
| register.24 | | | | | | idct_1d_rot | | | | idct_1d_rot | | | idct_1d_rot | | idct_1d_rot | |
| constant.8 | | | | | | const_8192 | | | | | | | | | | |
| register.25 | | | | | | idct_1d_rot | idct_1d_CMU | | idct_1d_CMU | idct_1d_rot | | | idct_1d_rot | | idct_1d_CMU | |
| add\$s\$s\$s\$1cycle.0 | | | | | | add\$s\$s\$s\$1 | add\$s\$s\$s\$1 | | add\$s\$s\$s\$1 | add\$s\$s\$s\$1 | add\$s\$s\$s\$1 | add\$s\$s\$s\$1 | add\$s\$s\$s\$1 | add\$s\$s\$s\$1 | add\$s\$s\$s\$1 | |
| add\$s\$s\$s\$1cycle.1 | | | | | | add\$s\$s\$s\$1 | | | add\$s\$s\$s\$1 | add\$s\$s\$s\$1 | | | add\$s\$s\$s\$1 | add\$s\$s\$s\$1 | | |
| constant.2 | | | | | | | const_14 | | | | | | | | | |
| register.28 | | | | | | | idct_2276 | | | idct_1d_ADD | idct_2277 | | | idct_2312 | idct_2300 | |
| register.30                                                              | 1                                        |             |              |                 |                 |                 | idct 1d ADD     | r in the second s |                 |                  |                   |                       |                 | dct 1d ADD.           |                   |          |
| sra\$s\$s\$s\$1cycle.11                                                  | 1                                        |             |              |                 |                 |                 | sra\$s\$s\$s\$1 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                 |                  | sra\$s\$s\$s\$1   |                       |                 | sra\$s\$s\$s\$1       | . sra\$s\$s\$s\$1 | SI       |
| sra\$s\$s\$s\$1cycle.10                                                  | 1                                        |             |              |                 |                 |                 | sra\$s\$s\$s\$1 | sra\$s\$s\$s\$1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                 | sra\$s\$s\$s\$1  | sra\$s\$s\$s\$1   |                       |                 | sra\$s\$s\$s\$1       | . sra\$s\$s\$s\$1 | SI       |
| register.29                                                              | 1                                        |             |              |                 |                 |                 | idct_2278       | idct_2282                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |                 | idct_2284        | dct_1d_ADD        |                       | idct_1d rot     | idct 2303             | dct_1d_ADD        |          |
| register.36                                                              | 1                                        |             |              |                 |                 |                 | _               | idct 1d rot                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |                 |                  |                   |                       |                 |                       | idct 1d rot       |          |
| register.35                                                              | 1                                        |             |              |                 |                 |                 |                 | idct 1d rot                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |                 |                  |                   |                       |                 |                       | idct 1d rot       |          |
| register.40                                                              | 1                                        |             |              |                 |                 |                 |                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | dct 1d CMU      |                  | dct 1d CMU        | idct 1d rot .         | idct 1d rot     |                       |                   | ic       |
| register.44                                                              | 1                                        |             |              |                 |                 |                 |                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                 | idct_1d_SUB      |                   | idct_1d_SUB           |                 |                       |                   |          |
| constant.9                                                               | 1                                        |             |              |                 |                 |                 |                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                 |                  | idct_1d_rot       |                       |                 |                       |                   |          |
| constant.10                                                              | 1                                        |             |              |                 |                 |                 |                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                 |                  | idct_1d_rot       |                       |                 |                       |                   |          |
| constant.12                                                              | 1                                        |             |              |                 |                 |                 |                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                 |                  | idct_1d_rot       |                       |                 |                       |                   |          |
| mul\$s\$s\$s\$2cycle.15                                                  | 1                                        |             |              |                 |                 |                 |                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                 |                  | mul\$s\$s\$s\$2   |                       | mul\$s\$s\$s\$2 |                       | mul\$s\$s\$s\$2   | <b>-</b> |
| register.51                                                              | 1                                        |             |              |                 |                 |                 |                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                 |                  | idct_2274         |                       |                 |                       | dct_1d_ADD        | ~        |
|                                                                          | <                                        |             |              |                 |                 |                 |                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                 |                  |                   |                       |                 |                       | >                 | 2        |
|                                                                          | (-)                                      |             | _            |                 |                 |                 |                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                 |                  |                   |                       |                 |                       |                   |          |

Time used for creating gantt diagram : 281 ms

## **GAUT: Interface synthesis**

*(Screenshot: GAUT 2.4.3, build 17/02/2010, Lab-STICC, UBS University, Lorient (France). ComGen tab, bitwidth aware, library `notech_16b`, opened file `C:\GAUT_2_4_3_testnew\test\idct_soclib\idct.cfg`.)*

Command line:

`ucomgen -e idct -i "C:\GAUT_2_4_3_testnew\test\idct_soclib\idct.mem" -cfg "C:\GAUT_2_4_3_testnew\test\idct_soclib\idct.cfg" -b`

Console output:

- generate UCOM without IO Constraints
- parse `C:\GAUT_2_4_3_testnew\test\idct_soclib\idct.cfg`: ok
- parse `C:\GAUT_2_4_3_testnew\test\idct_soclib\idct.mem`: ok
- check ports and IOs: ok
- generate VHDL `idct_ucom.vhd`
- Port/Bus 1 has Input IO(s): process it.

| Bus | Direction | Interface | Identification/Length |
|-----|-----------|-----------|-----------------------|
| 1   | in        | FSL       | auto                  |
| 1   | out       | FSL       | auto                  |

Performance of the interfaces depends on data locality (data fetch penalty, cache miss).

Interface can be:

- Ping-pong buffer (scratch-pad on Local Memory Bus)
- FIFO (i.e. FSL, Fast Simplex Link, from Xilinx)
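
The ping-pong buffer option can be illustrated with a small software model in C. This is only a sketch under stated assumptions, not the VHDL that GAUT's ComGen produces: the names `pingpong_t`, `pp_write`, `pp_acquire`, `pp_release` and the bank size `BANK_WORDS` are hypothetical and chosen for illustration. The producer fills one bank while the consumer (the synthesized component) reads the other, and the banks swap roles once a full block is available.

```c
#include <assert.h>
#include <stddef.h>

#define BANK_WORDS 64 /* hypothetical bank size: one 8x8 block of samples */

typedef struct {
    int    bank[2][BANK_WORDS];
    int    fill;  /* index of the bank the producer is currently writing */
    size_t count; /* words already written into the fill bank */
    int    ready; /* 1 while the other bank holds a block not yet released */
} pingpong_t;

void pp_init(pingpong_t *pp)
{
    pp->fill = 0;
    pp->count = 0;
    pp->ready = 0;
}

/* Swap the banks when the fill bank is full and the other one is free. */
static void pp_try_swap(pingpong_t *pp)
{
    if (pp->count == BANK_WORDS && !pp->ready) {
        pp->fill ^= 1;
        pp->count = 0;
        pp->ready = 1;
    }
}

/* Producer side: returns 1 if the word was stored, 0 if it must stall
 * because both banks are occupied. */
int pp_write(pingpong_t *pp, int word)
{
    pp_try_swap(pp);
    if (pp->count == BANK_WORDS)
        return 0;
    pp->bank[pp->fill][pp->count++] = word;
    pp_try_swap(pp);
    return 1;
}

/* Consumer side: returns the full bank (readable in any order),
 * or NULL if no complete block is available yet. */
const int *pp_acquire(pingpong_t *pp)
{
    pp_try_swap(pp);
    return pp->ready ? pp->bank[pp->fill ^ 1] : NULL;
}

/* Consumer side: hands the bank back so the producer can reuse it. */
void pp_release(pingpong_t *pp)
{
    assert(pp->ready);
    pp->ready = 0;
    pp_try_swap(pp);
}
```

In this model the consumer can read an acquired bank in any order, whereas a FIFO link delivers words strictly in arrival order. That difference is one reason the choice of interface interacts with data locality.
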

## **GAUT: Test-bench generation**

| GAUT 2.4.3 build 17/02/2010 - Lab-STICC, UBS University, Lorient (France) | | | | | |
*[Figure: GAUT screenshot (Lab-STICC) showing test-bench generation and ModelSim script generation for the IDCT example, using the bit-width-aware library `notech_16b` and memory file `idct.mem`. The console log shows the generated VHDL units (`idct_pack`, `idct_top`, `idct_stimuli`, `idct_probe`, `idct_test`) being compiled and simulated with ModelSim 6.4f through `vsim -c -do script_compil.do`; the waveform pane lists the `/idct_test/clk`, `/idct_test/rstn` and `/idct_test/enable` signals with no data loaded yet.]*
                                                                                                       | <u>-</u> /idct_test/idct_top <u>-</u> /idct_test/idct_top <u>-</u> /idct_test/idct_top <del>-</del> /idct_test/idct_top | No Data-<br>No Data-<br>No Data-      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                          |        |
| E:/modeltech_6.4f/win32/                             | # ** Note: (vsim-3812) Design is being optimized<br># // ModelSim SE 6.4f Oct 22 2009<br># //<br># // Copyright 1991-2009 Mentor Graphics Corporation                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                       | <ul> <li> /idct_test/idct_top</li> <li> /idct_test/idct_top</li> <li> /idct_test/idct_top</li> </ul>                    | No Data-<br>No Data-<br>No Data-      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                          |        |
|                                                      | # // All Rights Reserved.<br># //<br># // THIS WORK CONTAINS TRADE SECRET AND<br># // PROPRIETARY INFORMATION WHICH IS THE PROPER                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                       | <pre>//dct_test/idct_top<br/>//dct_test/idct_top<br/>//dct_test/idct_top</pre>                                          | -<br>No Data-<br>No Data-<br>No Data- |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                          |        |
| Control Result                                       | #// OF METTOR GRAPHICS CORPORTION OR ITS LICEN<br>#// AND IS SUBJECT TO LICENSE TERMS.<br>#//                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                       | <pre>//det_test/idet_top<br/>//det_test/idet_top<br/>//det_test/idet_top</pre>                                          | No Data-<br>No Data-                  |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                          | NAT    |
|                                                      | # do sonpt_modelsim.val<br># 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
                                                                                                       | <pre>//dct_test/idct_top<br/>//dct_test/idct_top<br/>//dct_test/idct_top</pre>                                          | No Data-<br>No Data-<br>No Data-      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                          | ~      |
| Status: You can see t                                | the results here or in the file                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                       |                                                                                                                         | <u>-1% Data-</u><br>w 990 ns          | C G BB PRG URBERGERGE Guilden auf and an and a second se                        | 1000   |
|                                                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                       | Cursor                                                                                                                  | 1 <u>1021 ns</u>                      |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                          | 1021 1 |

#### GAUT: more than 100 downloads each year



### **Outline**

#### □ Lab-STICC

#### □ General context

#### □ High-Level Synthesis

- Brief introduction
- "In detail"

#### □ GAUT

- Overview
- Results

#### □ Conclusion

#### □ References

#### **Experimental results: MJPEG decoding**



#### Block diagram of the MJPEG baseline decoder

| Function                        | Time ratio |
|---------------------------------|------------|
| IDCT                            | 43.41%     |
| yuv2rgb                         | 15.07%     |
| Entropy decoding                | 8.41%      |
| DeQuantization                  | 5.10%      |
| Others (each function is < 5%)  | 28.01%     |

**Execution time ratio for software MJPEG decoding** *(measured with gprof)*
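The profile shows why IDCT and yuv2rgb are the natural candidates for hardware acceleration. As a rough illustration that is not part of the original slides, Amdahl's law bounds the overall gain when only the profiled fraction is accelerated. The function name `amdahl_speedup` and the "infinitely fast accelerator" assumption are choices made for this sketch; only the fractions come from the table.

```python
import math

# Time fractions taken from the gprof profile above.
PROFILE = {
    "IDCT": 0.4341,
    "yuv2rgb": 0.1507,
    "Entropy decoding": 0.0841,
    "DeQuantization": 0.0510,
    "others": 0.2801,
}


def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the run time is accelerated by `factor`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Upper bounds: the offloaded functions are assumed to take no time at all.
idct_only = amdahl_speedup(PROFILE["IDCT"], math.inf)
idct_and_yuv = amdahl_speedup(PROFILE["IDCT"] + PROFILE["yuv2rgb"], math.inf)

print(f"IDCT only:      at most {idct_only:.2f}x")
print(f"IDCT + yuv2rgb: at most {idct_and_yuv:.2f}x")
```

Under these assumptions the whole decoder cannot gain more than about 1.77x from the IDCT alone, and about 2.41x when yuv2rgb is offloaded as well. Communication costs are ignored here.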

### **Resource estimation for IDCT**



### **Resource estimation for IDCT**





#### SpeedUp = Logic Synthesis Time/HLS Synthesis Time
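The ratio above can be written as a small helper. This is only a sketch: the function name `synthesis_speedup` and the two timings are hypothetical, chosen to show the arithmetic, and are not measurements from the slides.

```python
def synthesis_speedup(logic_synthesis_time_s: float, hls_time_s: float) -> float:
    """SpeedUp = logic synthesis time / HLS synthesis time (same unit for both)."""
    if logic_synthesis_time_s < 0.0:
        raise ValueError("logic synthesis time must be non-negative")
    if hls_time_s <= 0.0:
        raise ValueError("HLS synthesis time must be positive")
    return logic_synthesis_time_s / hls_time_s


# Hypothetical timings, for illustration only (not measured values).
example = synthesis_speedup(logic_synthesis_time_s=120.0, hls_time_s=4.0)
print(f"speedup: {example:.1f}x")
```

A speedup above 1 means the HLS-level estimate is obtained faster than a full logic synthesis run.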





# **Synthesis results**

**Parallelism 1 (read/write 32 bits)** (the first four columns give the number of operators of each type)

| add | mult | sra | sub | Reg (1-bit flip-flops) | Mux 2:1 | Area (slices) | Latency (cycles) | Freq (MHz) |
|-----|------|-----|-----|------------------------|---------|---------------|------------------|------------|
| 2   | 3    | 1   | 1   | 2653                   | 3236    | 7033          | 129              | 123.5      |
| 2   | 2    | 1   | 1   | 2818                   | 3525    | 6948          | 188              | 128.1      |
| 1   | 2    | 1   | 1   | 3304                   | 3905    | 6988          | 228              | 124        |
| 1   | 1    | 1   | 1   | 2876                   | 3858    | 6192          | 348              | 123.7      |
| 1   | 1    | 1   | 1   | 2421                   | 3938    | 6422          | 448              | 125.7      |

**Parallelism 2 (read/write 64 bits)** (the first four columns give the number of operators of each type)

| add | mult | sra | sub | Reg (1-bit flip-flops) | Mux 2:1 | Area (slices) | Latency (cycles) | Freq (MHz) |
|-----|------|-----|-----|------------------------|---------|---------------|------------------|------------|
| 4   | 7    | 3   | 2   | 2904                   | 2965    | 9409          | 97               | 126        |
| 2   | 3    | 1   | 1   | 2942                   | 3268    | 7863          | 156              | 120.2      |
| 2   | 3    | 1   | 1   | 3112                   | 3300    | 8101          | 196              | 128.9      |
| 1   | 2    | 1   | 1   | 3429                   | 3969    | 7529          | 316              | 128.4      |
| 1   | 1    | 1   | 1   | 2880                   | 3106    | 6498          | 416              | 121.9      |

| Parallelism 4 (read/write 128 bits) | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| add | mult | sra | sub | Reg (1-bit flip-flop) | Mux 2:1 | Area (slices) | Latency (cycles) | Freq (MHz) |
| 9 | 15 | 6 | 5 | 3459 | 2694 | 12070 | 33 | 138.9 |
| 3 | 4 | 2 | 2 | 3021 | 2947 | 9282 | 92 | 132.1 |
| 2 | 3 | 1 | 1 | 2917 | 3091 | 7812 | 132 | 128.7 |
| 1 | 2 | 1 | 1 | 3462 | 4257 | 7846 | 252 | 122.5 |
| 1 | 1 | 1 | 1 | 2850 | 3314 | 6719 | 352 | 120.8 |

**IDCT** 

#### YUV2RGB

| Parallelism 1 (read/write 32 bits) | | | |
|---|---|---|---|
| Reg (1-bit flip-flop) | Area (slices) | Latency (cycles) | Freq (MHz) |
| 388 | 525 | 12 | 249.18 |
| 362 | 524 | 13 | 282.11 |
| 272 | 462 | 14 | 188.96 |
| 238 | 460 | 15 | 188.96 |

# **Synthesis results**

| Parallelism 1 (read/write 32 bits) | | | | | | | | | Parallelism 2 (read/write 64 bits) | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| add | mult | sra | sub | Reg (1-bit flip-flop) | Mux 2:1 | Area (slices) | Latency (cycles) | Freq (MHz) | add | mult | sra | sub | Reg (1-bit flip-flop) | Mux 2:1 | Area (slices) | Latency (cycles) | Freq (MHz) |
| 2 | 3 | 1 | 1 | 2653 | 3236 | 7033 | 129 | 123.5 | 4 | 7 | 3 | 2 | 2904 | 2965 | 9409 | 97 | 126 |
| 2 | 2 | 1 | 1 | 2818 | 3525 | 6948 | 188 | 128.1 | 2 | 3 | 1 | 1 | 2942 | 3268 | 7863 | 156 | 120.2 |
| 1 | 2 | 1 | 1 | 3304 | 3905 | 6988 | 228 | 124 | 2 | 3 | 1 | 1 | 3112 | 3300 | 8101 | 196 | 128.9 |
| 1 | 1 | 1 | 1 | 2876 | 3858 | 6192 | 348 | 123.7 | 1 | 2 | 1 | 1 | 3429 | 3969 | 7529 | 316 | 128.4 |
| 1 | 1 | 1 | 1 | 2421 | 3938 | 6422 | 448 | 125.7 | 1 | 1 | 1 | 1 | 2880 | 3106 | 6498 | 416 | 121.9 |

Virtual prototyping

IDCT

| Parallelism 4 (read/write 128 bits) | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| add | mult | sra | sub | Reg (1-bit flip-flop) | Mux 2:1 | Area (slices) | Latency (cycles) | Freq (MHz) |
| 9 | 15 | 6 | 5 | 3459 | 2694 | 12070 | 33 | 138.9 |
| 3 | 4 | 2 | 2 | 3021 | 2947 | 9282 | 92 | 132.1 |
| 2 | 3 | 1 | 1 | 2917 | 3091 | 7812 | 132 | 128.7 |
| 1 | 2 | 1 | 1 | 3462 | 4257 | 7846 | 252 | 122.5 |
| 1 | 1 | 1 | 1 | 2850 | 3314 | 6719 | 352 | 120.8 |

Hardware prototyping

YUV2RGB

| Parallelism 1 (read/write 32 bits) | | | |
|---|---|---|---|
| Reg (1-bit flip-flop) | Area (slices) | Latency (cycles) | Freq (MHz) |
| 388 | 525 | 12 | 249.18 |
| 362 | 524 | 13 | 282.11 |
| 272 | 462 | 14 | 188.96 |
| 238 | 460 | 15 | 188.96 |

#### SoCLib: a virtual prototyping platform

#### French National Research Project (ANR)

#### Free and open-source virtual prototyping environment

- Library of SystemC simulation models

#### Hardware components

- CPUs, HW accelerators (HW-ACC), memories, buses
- VCI/OCP interface protocol

#### Two types of models are available for each HW component

- CABA (Cycle Accurate / Bit Accurate)
- TLM-DT (Transaction Level Modeling with Distributed Time)

#### Software components

- OS, API, ...

#### Associated tools

- Simulation, configuration, debug
- Automatic generation of simulation models



**GAUT is used to generate the CABA and TLM-DT simulation models of the HW accelerators (HW-ACC)**

### **SoCLib: Design flow**





#### Pure software implementation on a mono-processor architecture





#### Parallelized software implementation on a multiprocessor architecture





# **MJPEG** Results



Execution time of the application (in cycles) to process 50 images of 48\*48 pixels

IDCT generated by GAUT reduces the application latency by 14%

Parallelization of the application on 4 CPUs reduces the latency by 21%

# **MJPEG Results**



The four HW IDCTs in the multiprocessor architecture further reduce the latency by 10%

Execution time of the application (in cycles) to process 50 images of 48\*48 pixels


# **MJPEG: Hardware prototyping**

#### **Real time decoding: 24 QCIF images/sec**

- IDCT: maximum I/O bandwidth (4 parallel input ports) and lowest latency (33 cycles at 138.9 MHz)
- YUV2RGB: minimum latency (12 cycles at 249.18 MHz)
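The cycle counts and clock frequencies quoted above can be turned into execution times by dividing one by the other. The helper below is a minimal sketch of that arithmetic; the name `latency_ns` is an assumption introduced here for illustration and does not come from GAUT.

```c
#include <assert.h>

/* Execution time, in nanoseconds, of `cycles` clock cycles at `freq_mhz` MHz.
 * Hypothetical helper: 1 cycle at 1 MHz lasts 1000 ns. */
double latency_ns(int cycles, double freq_mhz)
{
    assert(cycles >= 0 && freq_mhz > 0.0);
    return cycles * 1000.0 / freq_mhz;
}
```

With the figures from the slide, `latency_ns(33, 138.9)` is about 237.6 ns for the IDCT and `latency_ns(12, 249.18)` is about 48.2 ns for YUV2RGB.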

#### **Compared to a pure SW implementation**

- 10x speed-up for the IDCT function
- 5x speed-up for the yuv2rgb function



#### SoC design on a Xilinx Virtex-5 LX110 FPGA (XUPV5 board)

# Viola-Jones: Hardware prototyping

Block diagram of a Viola-Jones face detector (processing blocks):

- RGB2gray
- Contrast enhancement
- Noise reduction
- Canny edge detector
- Face detection

7x speed-up compared to a pure SW implementation

# **HLS for Hardware prototyping**

- Slope detection: acos (CORDIC) HWPU
- Texture detection: Gaussian filter and square root HWPU
- Speed-up $\geq 140$, error $\leq 0.00006$
- SoC: Leon3 interface (AHB, GRLIB)

[Figure: SpeedUp and Error plots for the acos HWPU]

ieee754 acos(x):

- For $|x| \leq 0.5$: $\operatorname{acos}(x) = \pi/2 - (x + x \cdot x^2 R(x^2))$, where $R(x^2)$ is a rational approximation of $(\operatorname{asin}(x) - x)/x^3$
- For $x > 0.5$: $\operatorname{acos}(x) = \pi/2 - (\pi/2 - 2\operatorname{asin}(\sqrt{(1-x)/2}))$

CORDIC error with $n = 16$ rotations: $2^{-n} \approx 0.00001 \leq \text{error} \leq 2^{-(n-2)} \approx 0.00006$
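As a short worked check of the formulas quoted on this slide (added here for illustration, not taken from the original deck), the $x > 0.5$ branch simplifies and can be verified at the two ends of its range, and the error bounds for $n = 16$ can be evaluated numerically:

$$
\begin{aligned}
\operatorname{acos}(x) &= \frac{\pi}{2} - \left(\frac{\pi}{2} - 2\operatorname{asin}\sqrt{\frac{1-x}{2}}\right) = 2\operatorname{asin}\sqrt{\frac{1-x}{2}} \\
x = 0.5 &: \quad 2\operatorname{asin}(0.5) = 2 \cdot \frac{\pi}{6} = \frac{\pi}{3} = \operatorname{acos}(0.5) \\
x = 1 &: \quad 2\operatorname{asin}(0) = 0 = \operatorname{acos}(1) \\
2^{-16} &\approx 1.53 \times 10^{-5}, \qquad 2^{-14} \approx 6.10 \times 10^{-5}
\end{aligned}
$$

The two powers of two match the rounded bounds 0.00001 and 0.00006 given for the CORDIC error.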

# **Prototyping platform**

#### Sundance platform

- Motherboard
- Daughter boards: DSP C62 and C67 (Texas Instruments), FPGA Virtex 1000E (Xilinx)
- Interconnection matrix with point-to-point links: Com Port (CP, up to 20 Mbytes/s) and Sundance Digital Bus (SDB, up to 200 Mbytes/s)



# **DVB-DSNG receiver architecture mapping**



### **DVB-DSNG** receiver

- Synchronization and interleaving: SW on the C62 DSP
- Viterbi and Reed-Solomon decoders: HW on the Virtex-1000E FPGA
- 4 SDB links
- 26 Mbps throughput (limited by the synchronization block; a C64 DSP would be needed for higher throughput)



# Viterbi decoding

- Functional/application parameters: number of states, throughput

| Number of states         | 8   | 16  | 32   | 64   | 128  |
|--------------------------|-----|-----|------|------|------|
| Throughput (Mbps)        | 44  | 39  | 35   | 26   | 22   |
| Synthesis time (s)       | 1   | 1   | 3    | 9    | 27   |
| Number of logic elements | 223 | 434 | 1130 | 2712 | 7051 |

- DVB-DSNG standard: throughput from 1.5 to 72 Mbps, 64-state Viterbi decoder
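To make the "number of states" parameter concrete, here is a minimal C sketch of the add-compare-select (ACS) step at the core of a hard-decision Viterbi decoder. It is not the code that was given to GAUT, which the slides do not show. The constraint length `K = 7` (hence 64 states), the generator polynomials 171 and 133 (octal), the bit ordering of the shift register, and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>

enum { K = 7, NUM_STATES = 1 << (K - 1) };  /* assumed constraint length: 64 states */
enum { G1 = 0171, G2 = 0133 };              /* assumed generator polynomials (octal) */

/* Parity (XOR of all bits) of v. */
static int parity(unsigned v)
{
    int p = 0;
    while (v) { p ^= (int)(v & 1u); v >>= 1; }
    return p;
}

/* 2-bit encoder output when leaving `state` with input `bit`. */
int branch_symbol(int state, int bit)
{
    unsigned reg = ((unsigned)state << 1) | (unsigned)bit;  /* K-bit shift register */
    return (parity(reg & G1) << 1) | parity(reg & G2);
}

/* State reached from `state` with input `bit`. */
int next_state(int state, int bit)
{
    return ((state << 1) | bit) & (NUM_STATES - 1);
}

/* One ACS step: for every state, keep the cheaper of its two incoming paths.
 * `received` is the 2-bit hard-decision symbol; the branch cost is the Hamming
 * distance. Metrics must stay well below INT_MAX to avoid overflow. */
void acs_step(const int old_metric[NUM_STATES], int new_metric[NUM_STATES], int received)
{
    assert(received >= 0 && received <= 3);
    for (int ns = 0; ns < NUM_STATES; ns++) {
        int bit = ns & 1;                         /* input bit that leads to ns */
        int best = INT_MAX;
        for (int h = 0; h < 2; h++) {
            int s = (ns >> 1) | (h << (K - 2));   /* the two predecessors of ns */
            int diff = branch_symbol(s, bit) ^ received;
            int cost = old_metric[s] + (diff & 1) + (diff >> 1);
            if (cost < best) best = cost;
        }
        new_metric[ns] = best;
    }
}
```

Changing `K` changes `NUM_STATES` and therefore the amount of work per decoded bit, which is consistent with the trend in the table: more states cost more logic elements and lower the throughput.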



### **Reed-Solomon decoding**

- Functional/application parameters: number of input symbols, number of data symbols, throughput

- DVB-DSNG standard: 1.5 to 72 Mbps, RS(204,188) decoder
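The two functional parameters, number of input symbols n and number of data symbols k, determine the redundancy of the code and how many symbol errors it can correct. The sketch below shows that arithmetic; the function names are hypothetical and are not part of GAUT.

```c
#include <assert.h>

/* Number of redundancy (parity) symbols of an RS(n, k) code. */
int rs_parity_symbols(int n, int k)
{
    assert(k > 0 && n > k);
    return n - k;
}

/* Maximum number of symbol errors an RS(n, k) code can correct: (n - k) / 2. */
int rs_correctable_symbols(int n, int k)
{
    return rs_parity_symbols(n, k) / 2;
}
```

For the RS(204,188) code named on the slide, this gives 16 parity symbols and up to 8 correctable symbol errors per codeword.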











## **MPARM** Architecture



Luca Benini, Andrea Marongiu, Paolo Burgio, University of Bologna


### **Target architecture**




# **HWPU Integration**

#### Communication interface

- Master / Slave
- Configuration registers
  - Number of inputs
  - Number of outputs
  - Input locations
  - Output locations
  - HWPU status
  - Start
- The configuration registers can be duplicated
  - Overlaps configuration and computation

#### Programming interface

| Function name                          | Brief description                                                 |
|----------------------------------------|-------------------------------------------------------------------|
| `bool acc_busy()`                      | Returns TRUE if no programming channel is available               |
| `void acc_reset()`                     | Resets the programming registers once a channel has been granted  |
| `void acc_set_input_count(int count)`  | Sets the number of inputs                                         |
| `void acc_set_output_count(int count)` | Sets the number of outputs                                        |
| `void acc_set_in_addrs(int addr)`      | Sets the current input parameter's address                        |
| `void acc_set_out_addrs(int addr)`     | Sets the current output parameter's address                       |
| `void acc_trigger()`                   | Initiates execution                                               |
| `void acc_wait()`                      | Waits for the HWPU to complete execution                          |



#### Example

```c
void foo()
{
    int A, B, C;
    /* Offload the next statement to the HWPU: A and B are inputs, C is the output */
    #pragma omp accelerate input(A, B) output(C)
    C = A + B;
}
```
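The following sketch shows what the annotated statement might expand to in terms of the programming interface listed in the table above. Only the function names and signatures come from that table. The bodies are a hypothetical software mock that stands in for the real HWPU, and `shared_mem`, the word addresses, and `offloaded_add` are assumptions introduced here so that the example is self-contained.

```c
#include <assert.h>
#include <stdbool.h>

/* ---- Hypothetical software mock of the HWPU (not the real driver) ---- */
enum { MEM_WORDS = 16, MAX_ARGS = 4 };
static int shared_mem[MEM_WORDS];              /* stands in for the shared memory */
static int in_addr[MAX_ARGS], out_addr[MAX_ARGS];
static int n_in, n_out, in_set, out_set;
static bool done;

static bool valid_addr(int addr) { return addr >= 0 && addr < MEM_WORDS; }

bool acc_busy(void) { return false; }          /* mock: a channel is always free */
void acc_reset(void) { n_in = n_out = in_set = out_set = 0; done = false; }
void acc_set_input_count(int count)  { assert(count >= 0 && count <= MAX_ARGS); n_in = count; }
void acc_set_output_count(int count) { assert(count >= 0 && count <= MAX_ARGS); n_out = count; }
void acc_set_in_addrs(int addr)  { assert(in_set < n_in && valid_addr(addr));   in_addr[in_set++] = addr; }
void acc_set_out_addrs(int addr) { assert(out_set < n_out && valid_addr(addr)); out_addr[out_set++] = addr; }

void acc_trigger(void)
{
    /* Mock accelerator: adds all inputs and writes the sum to the first output. */
    assert(in_set == n_in && out_set == n_out && n_out >= 1);
    int sum = 0;
    for (int i = 0; i < n_in; i++) sum += shared_mem[in_addr[i]];
    shared_mem[out_addr[0]] = sum;
    done = true;
}

void acc_wait(void) { assert(done); }          /* mock: execution completes at once */

/* ---- A possible expansion of "C = A + B" under the accelerate pragma ---- */
int offloaded_add(int a, int b)
{
    enum { ADDR_A = 0, ADDR_B = 1, ADDR_C = 2 };   /* assumed word addresses */
    shared_mem[ADDR_A] = a;
    shared_mem[ADDR_B] = b;

    while (acc_busy()) { }       /* wait for a free programming channel */
    acc_reset();
    acc_set_input_count(2);
    acc_set_output_count(1);
    acc_set_in_addrs(ADDR_A);
    acc_set_in_addrs(ADDR_B);
    acc_set_out_addrs(ADDR_C);
    acc_trigger();               /* start the HWPU */
    acc_wait();                  /* block until it has finished */
    return shared_mem[ADDR_C];
}
```

On the real platform the configuration calls would write memory-mapped registers, and with duplicated configuration registers the next job could be programmed while the current one is still running.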

#### **Results**



[Figure: results for the 1-, 2-, 4- and 8-core configurations]

# GAUT 4 (not yet available, but soon...)

#### An open-source HLS tool

For both data-dominated and control-dominated algorithms (CDFG)

#### Input

- C/C++ with bit-accurate integer and fixed-point types from Mentor Graphics
- SystemC, since C and C++ lack the constructs and semantics to represent design hierarchy, timing, and synchronization/concurrency
- Floating point

#### **Output: RTL architecture**

- VHDL, Verilog
- SystemC (CABA + TLM)
- Resource and timing estimation
- Automated Test-bench generation
- Automated operators characterization
- Automated interface generation
  - AXI, AHB, FSL, ...

# GAUT 4 (not yet available, but soon...)

#### Constraints

Clock, I/O protocols, loop transformations (unrolling, merging, loop pipelining with Initiation Interval), memory mapping, function inlining, resource constraints
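To illustrate what the initiation interval (II) constraint means for loop pipelining, the sketch below uses the usual idealized cycle-count model: a new iteration starts every II cycles, and the last one still needs the full iteration depth to finish. The function names and the model itself are assumptions for illustration; GAUT's actual estimates are not described on the slide.

```c
#include <assert.h>

/* Cycles for a loop whose iterations run one after another,
 * each taking `depth` cycles. */
long sequential_loop_cycles(long n_iter, long depth)
{
    assert(n_iter >= 0 && depth >= 1);
    return n_iter * depth;
}

/* Cycles for the same loop when pipelined with initiation interval `ii`:
 * the first iteration completes after `depth` cycles, and each further
 * iteration completes `ii` cycles after the previous one. */
long pipelined_loop_cycles(long n_iter, long depth, long ii)
{
    assert(n_iter >= 0 && depth >= 1 && ii >= 1);
    if (n_iter == 0) return 0;
    return depth + ii * (n_iter - 1);
}
```

For 64 iterations of depth 4, the sequential loop takes 256 cycles, while the pipelined loop takes 67 cycles with II = 1 and 130 cycles with II = 2.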

#### Objectives

- Minimization: area i.e. resources, latency, power consumption...
- Maximization: throughput

#### Key features

- Uses robust, state-of-the-art compilation technology to extract instruction-level parallelism (vectorization) and loop-level parallelism (polyhedral model: Graphite for GCC, Polly for LLVM)
- Many scheduling strategies: modulo scheduling (SMS, IMS), Force-Directed List Scheduling (FDLS), System of Difference Constraints (SDC)...
- Memory analysis and optimizations: automatic partitioning of array elements to reduce conflicts and increase throughput
- Pattern mining for efficient resource sharing
- Hierarchy synthesis and function-level parallelism/pipelining
- Design space exploration with directives (loop transformations, memory partitioning) and constraints (script): one body of code, many hardware outcomes
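As an illustration of how partitioning array elements across memory banks can reduce access conflicts, here is a minimal sketch of cyclic partitioning, one common scheme. The slides do not say which scheme GAUT uses, so the mapping and the names `cyclic_map` and `conflict_free` are assumptions.

```c
#include <assert.h>

typedef struct { int bank; int offset; } bank_slot;

/* Cyclic partitioning: element `index` goes to bank index % n_banks,
 * at position index / n_banks inside that bank. */
bank_slot cyclic_map(int index, int n_banks)
{
    assert(index >= 0 && n_banks >= 1);
    bank_slot s = { index % n_banks, index / n_banks };
    return s;
}

/* Returns 1 if the `width` consecutive elements starting at `first`
 * all fall in different banks, so they can be read in the same cycle
 * from single-port memories; returns 0 otherwise. */
int conflict_free(int first, int width, int n_banks)
{
    assert(first >= 0 && width >= 1 && n_banks >= 1 && n_banks <= 32);
    unsigned used = 0;                      /* one bit per bank already hit */
    for (int k = 0; k < width; k++) {
        unsigned bit = 1u << cyclic_map(first + k, n_banks).bank;
        if (used & bit) return 0;
        used |= bit;
    }
    return 1;
}
```

With four banks, any four consecutive elements can be fetched in parallel; with only two banks, the same access pattern causes conflicts.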

# Conclusion

#### HLS automatically generates several RTL architectures

From an algorithmic/behavioral description and a set of constraints

#### HLS generates

- VHDL models for synthesis
- SystemC simulation models for virtual prototyping

#### HLS enables design space exploration of

- Hardware accelerators
- MPSoC architectures including HW accelerators

#### GAUT can be downloaded for free at

http://lab-sticc.fr/www-gaut

# References

- *High-Level Synthesis: Introduction to Chip and System Design*, Daniel D. Gajski
- *High Level Synthesis of ASICs Under Timing and Synchronization Constraints*, David C. Ku and Giovanni De Micheli, Kluwer Academic Publishers


# **Academic tools**

- Streamroller (Univ. Mich.)
- SPARK (UCSD)
- xPilot (UCLA)
- UGH (TIMA + LIP6)
- MMALPHA (IRISA + CITI + ...)
- ROCCC (UC Riverside)
- GAUT (UBS / Lab-STICC)

# **Commercial tools**

- CatapultC (Mentor Graphics => Calypto)
- PICO (HP spin-off => Synfora => Synopsys)
- Cynthesizer (Forte Design)
- Cyber (NEC)
- AutoPilot (AutoESL => Xilinx)
- C-to-Silicon (Cadence)
- Synphony (Synopsys)






