Double Precision Floating Point
Arithmetic Coprocessor IP Core
General
Description:
The FPAU-DP is a double precision floating point coprocessor designed
to assist the CPU in performing floating point arithmetic computations.
The FPAU-DP directly replaces C software functions with equivalent,
much faster hardware operations, which significantly accelerates system
performance. The core does not require any programming, so no
modifications to the main software are needed. Everything is done
automatically during software compilation by the FPAU-DP C driver.
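As an illustration of what "no programming required" means in practice, the fragment below is ordinary C that uses only `double` operators. Going by the description above, a build with the FPAU-DP C driver would route these operators to the coprocessor instead of to library routines, with the source left unchanged. The function name `scale_and_offset` is a hypothetical example and is not part of the FPAU-DP package; the instruction names in the comments are taken from the feature list, and their pairing with each operator is an assumption.

```c
/* Hypothetical application code: plain C double arithmetic.
 * Nothing here refers to the coprocessor. According to the description,
 * the FPAU-DP C driver substitutes hardware operations at compile time. */
double scale_and_offset(double x, double gain, double offset)
{
    double y = x * gain + offset;   /* presumably FMUL, then FADD */

    if (y > 100.0) {                /* presumably FUCOM           */
        y = 100.0;
    }
    return y / 2.0;                 /* presumably FDIV            */
}
```

Calling `scale_and_offset(3.0, 4.0, 2.0)` evaluates (3 * 4 + 2) / 2 in the usual way; the point of the sketch is that the same source compiles with or without the coprocessor.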
The FPAU-DP is designed to operate with the DP8051 microcontroller as
well as any other 8-, 16- or 32-bit processor. Drivers for all
popular 8051 C compilers are delivered together with the FPAU-DP
package.
The FPAU-DP uses specialized algorithms to compute math functions. It
supports addition, subtraction, multiplication, division, square root
and comparison. It has built-in instructions for conversion from
integer types to floating point types and vice versa. The input number
format conforms to the IEEE-754 standard.
The FPAU-DP supports double and single precision real numbers and
8-bit, 16-bit and 32-bit integers. The FPAU-DP is prepared for use with
8-, 16- and 32-bit processors.
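For readers unfamiliar with the IEEE-754 double precision layout mentioned above, the following sketch splits a C `double` into its sign, exponent and fraction fields. It assumes the host compiler stores `double` as IEEE-754 binary64, which is common but not guaranteed by the C standard. The names `double_fields` and `unpack_double` are hypothetical and are not part of the FPAU-DP driver.

```c
#include <stdint.h>
#include <string.h>

/* Fields of an IEEE-754 double: 1 sign bit, 11 exponent bits
 * (bias 1023) and 52 stored fraction bits. */
typedef struct {
    unsigned sign;       /* 0 = positive, 1 = negative */
    unsigned exponent;   /* biased exponent, 0..2047   */
    uint64_t fraction;   /* 52 stored mantissa bits    */
} double_fields;

/* Assumes the host double is IEEE-754 binary64. */
double_fields unpack_double(double value)
{
    uint64_t bits;
    double_fields f;

    /* memcpy avoids undefined behaviour from pointer type punning. */
    memcpy(&bits, &value, sizeof bits);

    f.sign     = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FFu);
    f.fraction = bits & 0xFFFFFFFFFFFFFULL;   /* low 52 bits */
    return f;
}
```

For example, `unpack_double(1.0)` yields sign 0, biased exponent 1023 and a zero fraction.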
The FPAU-DP is a technology independent design that can be implemented
in a variety of process technologies.
Key Features:
► Direct replacement for C double and float software functions such as:
+, -, *, /, ==, !=, >=, <=, <, >
► Configurability of all available functions
► C interface supplied for all popular compilers: GNU C/C++, 8051 compilers
► No programming required
► IEEE-754 double precision real format support (double type)
► IEEE-754 single precision real format support (float type)
► 8-bit, 16-bit, 32-bit and 52-bit integer formats supported (integer types)
► Flexible arguments and result registers location
► Performs the following functions:
o FADD, FSUB: addition, subtraction
o FMUL, FDIV: multiplication, division
o FSQRT: square root
o FXAM: examine input data
o FUCOM: comparison
o FCLD, FILD: 8-bit, 16-bit integer to double
o FLLD, FELD: 32-bit, 52-bit integer to double
o FCST, FIST: double to 8-bit, 16-bit integer
o FLST, FEST: double to 32-bit, 52-bit integer
o FFLD: float to double
o FFST: double to float
► Built-in exception routines
► Masks each exception indicator:
o Precision lack (PE)
o Underflow result (UE)
o Overflow result (OE)
o Invalid operand (IE)
o Division by zero (ZE)
o Denormal operand (DE)
► Fully configurable
► Fully synthesizable
► Static synchronous design
► Positive edge clocking and no internal tri-states
► Scan test ready
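The exception indicators and their masking can be pictured with a small software model. The sketch below uses the indicator names from the list above, but the bit positions, the convention that a set mask bit hides an indicator, and both function names are assumptions made for illustration; none of them come from the FPAU-DP documentation. The conditions checked for a division follow the usual IEEE-754 rules (0/0 and infinity/infinity are invalid, a finite non-zero value divided by zero is a division by zero), and how the core itself decides is not documented here.

```c
#include <math.h>

/* Indicator names come from the feature list; the bit positions are
 * ASSUMED for illustration only. */
enum {
    EXC_PE = 1 << 0,   /* precision lack   */
    EXC_UE = 1 << 1,   /* underflow result */
    EXC_OE = 1 << 2,   /* overflow result  */
    EXC_IE = 1 << 3,   /* invalid operand  */
    EXC_ZE = 1 << 4,   /* division by zero */
    EXC_DE = 1 << 5    /* denormal operand */
};

/* Software model: indicators that a division a / b would raise, judged
 * from the operand classes alone. NaN operands are not modelled. */
unsigned div_operand_exceptions(double a, double b)
{
    unsigned flags = 0u;
    int ca = fpclassify(a);
    int cb = fpclassify(b);

    if (ca == FP_NAN || cb == FP_NAN) {
        return flags;   /* NaN handling is left out of this sketch */
    }
    if (ca == FP_SUBNORMAL || cb == FP_SUBNORMAL) {
        flags |= EXC_DE;
    }
    if ((ca == FP_ZERO && cb == FP_ZERO) ||
        (ca == FP_INFINITE && cb == FP_INFINITE)) {
        flags |= EXC_IE;                      /* 0/0 or inf/inf */
    } else if (cb == FP_ZERO && ca != FP_INFINITE) {
        flags |= EXC_ZE;                      /* finite non-zero / 0 */
    }
    return flags;
}

/* ASSUMED convention: a set mask bit hides the matching indicator. */
unsigned unmasked_exceptions(unsigned flags, unsigned mask)
{
    return flags & ~mask;
}
```

With these assumptions, `unmasked_exceptions(div_operand_exceptions(1.0, 0.0), EXC_ZE)` returns 0, because the only raised indicator is masked.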
The table and figures below illustrate the performance improvement of a
system with the FPAU-DP for a typical 32-bit RISC CPU.
The performance of the FPAU-DP floating point instructions has been
compared to the standard C library functions delivered with every
commercial C compiler. Each program was executed in the same system
environment. The number of clock periods was measured between loading
the input data into the work registers and storing the output result
after the operation. The results are placed in the tables below.
Improvement has been computed as the number of clock cycles required by
the CPU alone to compute an FP operation, divided by the number of clock
cycles required to compute the same operation by the system of CPU with
FPAU-DP:
Improvement = (CPU clock cycles) / (CPU with FPAU-DP clock cycles)

32-Bit RISC Based System:
The table below shows the performance improvement of the sample 32-bit
RISC CPU with FPAU-DP, compared to the same system without the FPAU-DP
coprocessor.
| Function                  | CPU CLK | DFPMU_DP CLK | Improvement |
|---------------------------|---------|--------------|-------------|
| Arithmetic operations     | -       | -            | -           |
| Addition                  | 1376    | 114          | 12.0        |
| Subtraction               | 1338    | 114          | 11.7        |
| Multiplication            | 1628    | 153          | 10.6        |
| Division                  | 2964    | 197          | 15.0        |
| Square Root               | 3030    | 141          | 21.5        |
| Total                     | -       | -            | 14.1        |
| Trigonometric operations  | -       | -            | -           |
| Sine                      | 18730   | 360          | 52.0        |
| Cosine                    | 21798   | 360          | 60.8        |
| Tangent                   | 37500   | 383          | 97.9        |
| Arc Tangent               | 36790   | 467          | 78.7        |
| Total                     | -       | -            | 72.4        |
| Average speed improvement | -       | -            | 55.0        |
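The improvement factor described before the table is a plain ratio of clock counts, which the sketch below reproduces for the arithmetic rows. The cycle counts are copied from the table; the type `bench_row`, the array `arithmetic_rows` and the function `improvement` are hypothetical names introduced only for this illustration.

```c
/* One benchmark row: clock cycles without and with the coprocessor. */
typedef struct {
    const char *name;
    unsigned cpu_clk;      /* CPU alone                 */
    unsigned coproc_clk;   /* CPU with the coprocessor  */
} bench_row;

/* Arithmetic rows copied from the table above. */
const bench_row arithmetic_rows[5] = {
    { "Addition",       1376u, 114u },
    { "Subtraction",    1338u, 114u },
    { "Multiplication", 1628u, 153u },
    { "Division",       2964u, 197u },
    { "Square Root",    3030u, 141u }
};

/* Improvement = cycles on the CPU alone / cycles with the coprocessor. */
double improvement(const bench_row *row)
{
    return (double)row->cpu_clk / (double)row->coproc_clk;
}
```

For these five rows the computed ratios agree with the Improvement column to within 0.1; the small differences come from rounding in the published figures.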
Units
Interface
Provides the interface between an external device and the core's
internal 32-bit modules. It contains data, control and status registers.
It can be configured to work with 8-, 16- and 32-bit processors.
1 - data bus can be configured as 8-, 16- or 32-bit, depending on the
processor's bus size
2 - address bus is aligned to work with 8- (3:0), 16- (3:1) or
32- (4:2) bit processors
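The address alignment in note 2 can be read as selecting a register word from a byte address by dropping the low address bits that a wider bus does not need. The sketch below follows that reading. It is an interpretation of the note, not a documented register map: the function names are hypothetical, and the idea that a 64-bit double operand is moved in bus-width pieces is an assumption.

```c
#include <assert.h>

/* Word index selected by the address bits listed in note 2:
 * 8-bit bus uses bits 3:0, 16-bit uses 3:1, 32-bit uses 4:2.
 * ASSUMED interpretation; the real register map is not given here. */
unsigned word_index(unsigned byte_addr, unsigned bus_bits)
{
    assert(bus_bits == 8u || bus_bits == 16u || bus_bits == 32u);

    switch (bus_bits) {
    case 8u:  return byte_addr & 0xFu;          /* address bits 3:0 */
    case 16u: return (byte_addr >> 1) & 0x7u;   /* address bits 3:1 */
    default:  return (byte_addr >> 2) & 0x7u;   /* address bits 4:2 */
    }
}

/* Bus accesses needed to move one 64-bit double, ASSUMING the operand
 * is transferred in bus-width pieces. */
unsigned accesses_per_double(unsigned bus_bits)
{
    assert(bus_bits == 8u || bus_bits == 16u || bus_bits == 32u);
    return 64u / bus_bits;
}
```

Under these assumptions an 8-bit processor needs eight accesses per double operand, a 16-bit processor four and a 32-bit processor two.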
Control Unit
It manages the execution of all instructions and the internal
operations required to execute a particular function.
Align
It analyzes the numbers for compliance with the IEEE-754 standard.
Information about the data classes is passed as a result to the
appropriate internal module.
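A software counterpart of this kind of operand analysis is the standard C `fpclassify` macro, which sorts a value into the IEEE-754 classes. The sketch below wraps it; the enumeration `data_class` and the function `examine` are hypothetical names, and the set of classes the core actually reports is not stated in this document.

```c
#include <math.h>

/* Hypothetical class codes for illustration. */
typedef enum {
    CLASS_ZERO,
    CLASS_DENORMAL,
    CLASS_NORMAL,
    CLASS_INFINITY,
    CLASS_NAN
} data_class;

/* Classify a double using the standard fpclassify macro. */
data_class examine(double x)
{
    switch (fpclassify(x)) {
    case FP_ZERO:      return CLASS_ZERO;
    case FP_SUBNORMAL: return CLASS_DENORMAL;
    case FP_INFINITE:  return CLASS_INFINITY;
    case FP_NAN:       return CLASS_NAN;
    default:           return CLASS_NORMAL;
    }
}
```

For instance, `examine(1.5)` returns `CLASS_NORMAL` and `examine(1e-310)` returns `CLASS_DENORMAL` on a host with IEEE-754 doubles.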
Exponent
It performs operations on the exponent part of a number. The addition,
subtraction, shifting, comparison and conversion operations are
executed in this module. It contains exponent and work registers.
Mantissa
It performs operations on the mantissa part of a number. The addition,
subtraction, multiplication, division, square root, comparison and
conversion operations are executed in this module. It contains mantissa
and work registers.
Shifter
It performs mantissa shifting during normalization and denormalization
operations. Information about shifted-out bits is stored for the
rounding process.
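One common way to keep information about shifted-out bits is the guard/sticky scheme: the guard bit holds the last bit shifted out, and the sticky bit records whether any earlier shifted-out bit was set. The sketch below shows that scheme in C. It is a conventional technique used here as an illustration; the document does not say which scheme the Shifter uses, and the names `shift_result` and `shift_right_tracking` are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint64_t mantissa;   /* mantissa after the right shift           */
    unsigned guard;      /* last bit shifted out                     */
    unsigned sticky;     /* 1 if any earlier shifted-out bit was set */
} shift_result;

/* Shift right one bit at a time, remembering what falls off the end. */
shift_result shift_right_tracking(uint64_t mantissa, unsigned count)
{
    shift_result r = { mantissa, 0u, 0u };

    /* Beyond 65 positions the result no longer changes. */
    if (count > 65u) {
        count = 65u;
    }
    while (count-- > 0u) {
        r.sticky |= r.guard;                     /* old guard becomes sticky */
        r.guard = (unsigned)(r.mantissa & 1u);   /* bit leaving the mantissa */
        r.mantissa >>= 1;
    }
    return r;
}
```

Shifting binary 1011 right by two positions leaves mantissa 10 with guard 1 and sticky 1, which is enough for a later rounding step to tell that the discarded part was more than half a unit in the last place.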
Performance:
FPAU-DP implementation results for ALTERA devices.
All features have been included.
| Implementation | Speed grade | Logic Cells | Frequency [MHz] |
|----------------|-------------|-------------|-----------------|
| CYCLONE        | -6          | 3660        | 79              |
| CYCLONE-II     | -6          | 3660        | 71              |
| STRATIX        | -5          | 3660        | 84              |
| STRATIX II     | -3          | 2800        | 110             |
FPAU-DP implementation results for XILINX devices.
All features have been included.
| Implementation | Speed grade | Slices | Frequency [MHz] |
|----------------|-------------|--------|-----------------|
| VIRTEX-II      | -6          | 2015   | 80              |
| VIRTEX-II pro  | -7          | 2015   | 97              |
| VIRTEX-4       | -11         | 1975   | 93              |