Single Precision Floating Point Multiplier
©2017
Textbook
67 Pages
Summary
The floating point multiplier is widely used to improve accuracy, speed and performance while reducing delay, area and power consumption. Floating point arithmetic is used in algorithms for Digital Signal Processing and graphics. Many floating point multiplier designs aim to reduce area while supporting both single precision and double precision in multiplication, addition and subtraction.
The design uses the three fields of binary scientific notation: sign bit, mantissa and exponent. Real numbers can be represented in fixed point, which has a fixed number of significant digits but lacks dynamic range, or in floating point, whose exponent component gives the largest dynamic range. The authors convert decimal values to floating point, normalize the exponent part and apply a rounding operation to reduce latency. The mantissas of the two operands are multiplied and their exponents are added. The sign of the result is obtained with an exclusive-or of the operand signs. Finally, the result of the shift-and-add floating point multiplier is compared with Booth multiplication.
Excerpt
Table Of Contents
2.2 Floating point operation in fast Fourier transform
2.3 Multiplication using carry save multiplier
2.4 Parallel implementation of floating point
2.5 Normalization of floating point
2.6 Configurable Booth multiplier
2.7 Implementation on FPGA
2.8 Different multipliers
2.9 Double precision floating point
3 FLOATING POINT MULTIPLICATION OPERATION
3.1 Fraction
3.2 Representation of floating point multiplication
3.3 Floating point algorithm
3.3.1 Convert 2.625 to floating point format
3.3.2 Adding an exponent part to binary number
3.3.3 Normalization
3.3.4 Mantissa
3.4 Multiplication operation
3.5 Multiplication of mantissa
3.6 Adding the exponents
3.7 Calculation
4 SIMULATION IN ISE
4.1 VHDL of adding the exponents
4.2 Multiplication
4.3 Multiplication of mantissa
5 SCHEMATIC GENERATED IN ISE
5.1 Actual Schematic
5.2 Schematic for Mantissa
5.3 Magnified image of Mantissa
5.4 Schematic for Exponent
5.5 Schematic for Sign
6 SIMULATION AND CALCULATION OF POWER IN ISE DESIGN
6.1 Synthesis power
6.1.1 Device Static Power
6.1.2 Design Power
6.1.3 Power-On Current
6.1.4 Total On-Chip Power
6.1.5 Off-Chip Power
6.2 Simulation for top module
7 BOOTH MULTIPLICATION OPERATION
7.1 Algorithm for Booth multiplication
7.2 Multiplication operation
8 SIMULATION
9 SCHEMATIC GENERATED
9.1 Actual Schematic
9.2 Schematic for Booth multiplier
9.3 Schematic for Exponent
9.4 Schematic for Sign
10 SIMULATION AND POWER
10.1 Comparison of floating point multiplier and Booth multiplier
11 IMPLEMENTATION IN FPGA KIT
11.1 RTL Schematic
11.2 Process
11.3 The input to control FPGA kit
12 CONCLUSION
REFERENCES
APPENDIX
LIST OF TABLES
TABLE NO   TITLE
1.1    Data formats for precision
3.1    Basic formats for IEEE standards
3.2    Example result in 8 bit and 32 bit
3.3    Multiplication operation
3.4    Multiplication of Mantissa
3.5    Result in 32 bit
3.6    Final results of sign, exponent and mantissa
6.1    Power analysis
10.1   Comparison of floating point array and Booth multiplier
LIST OF FIGURES
FIG NO   TITLE
1.1    Floating point representation
3.1    Block diagram of floating point multiplier
4.1    Simulation of adding exponents
4.2    Multiplication
4.3    Simulation of Multiplying Mantissa
5.1    Actual Schematic
5.2    Schematic for Mantissa
5.3    Magnified image of Mantissa
5.4    Schematic for Exponent
5.5    Schematic for Sign
6.1    Synthesis power
6.2    Simulation for top module
7.1    Block diagram of Booth multiplier
8.1    Simulation of Booth multiplier
9.1    Actual Schematic
9.2    Schematic for Booth multiplier
9.3    Schematic for Exponent
9.4    Schematic for Sign
10.1   X power analyzer for Booth multiplication
11.1   RTL Schematic
11.2   The input to control FPGA kit
11.3   The output to control FPGA kit
LIST OF ABBREVIATIONS
LZA - Leading Zero Anticipation
ModelSim - Model Simulation
FPGA - Field Programmable Gate Array
IEEE - Institute of Electrical and Electronics Engineers
ASIC - Application-Specific Integrated Circuit
RTL - Register Transfer Level
VHDL - Very High Speed Integrated Circuit Hardware Description Language
VHSIC - Very High Speed Integrated Circuit
ISE - Integrated Software Environment
CHAPTER 1
INTRODUCTION
Floating point multipliers are in growing demand for three-dimensional (3D) arrays, graphics and image processing. The Fast Fourier Transform (FFT), the Discrete Cosine Transform (DCT) and butterfly operations all need floating point numbers [1]. Because the output of a multiplier is twice as wide as its inputs, multipliers are costly in complexity, area and time. Achieving high-speed operation on a Field Programmable Gate Array (FPGA) is the main design challenge. A floating point format specifies the base, the location of the radix point, the precision and whether the value is normalized. There are many models for floating point multiplication, and precision plays the main role in all of them.
We deal with both single and double precision floating point. A floating point number has three components: (-1)^Sign * Mantissa * Base^Exponent. The single precision format has 32 bits, numbered 0 to 31 from left to right, and the double precision format has 64 bits, numbered 0 to 63 from left to right [2]. The two precisions differ in data size: double precision needs twice the RAM, cache and bandwidth, which reduces performance. The sign bit of the result is obtained by an XOR of the operand signs, and a carry save adder is used to add the two exponent components.
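The data path described above (XOR of the signs, addition of the exponents, multiplication of the mantissas) can be sketched in Python on raw 32-bit patterns. This is a minimal illustration, not the book's VHDL design: the function names are hypothetical, the result is truncated rather than rounded, a plain integer addition stands in for the carry save adder, and zeros, denormalized values, infinities, NaNs, overflow and underflow are not handled.

```python
import struct

BIAS = 127  # exponent bias of the single precision format


def float_to_bits(x):
    """Return the 32-bit single precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_to_float(bits):
    """Interpret a 32-bit integer as a single precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def fp32_multiply(a_bits, b_bits):
    """Multiply two normalized single precision operands (sketch only)."""
    # Sign: exclusive-or of the two sign bits.
    sign = ((a_bits >> 31) ^ (b_bits >> 31)) & 1
    # Exponent: add the biased exponents and subtract one bias.
    exponent = ((a_bits >> 23) & 0xFF) + ((b_bits >> 23) & 0xFF) - BIAS
    # Mantissa: restore the hidden 1 and multiply the 24-bit significands.
    sig_a = (a_bits & 0x7FFFFF) | 0x800000
    sig_b = (b_bits & 0x7FFFFF) | 0x800000
    product = sig_a * sig_b  # up to 48 bits
    # Normalize: a product in [2, 4) is shifted right once.
    if product & (1 << 47):
        product >>= 1
        exponent += 1
    # Keep the 23 fraction bits below the hidden bit (truncation, no rounding).
    fraction = (product >> 23) & 0x7FFFFF
    return (sign << 31) | (exponent << 23) | fraction


result = fp32_multiply(float_to_bits(2.625), float_to_bits(-1.5))
print(bits_to_float(result))
```

The operands 2.625 and -1.5 are chosen so that the exact product fits in 24 significand bits, which means truncation does not change the result.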
1.1 Format Parameters
Hardware and software implementations share the basic IEEE format. In the standard IEEE format, floating point values are binary numbers, in either single precision or double precision. Single precision uses 32 bits: the precision is 23 fraction bits plus 1 hidden bit (23+1), and the exponent uses 8 bits. The maximum and minimum exponent values are +127 and -126. The logic utilization of double precision is 49% higher than that of the single precision format. However, a double precision fused unit provides a higher level of precision than the single precision representation.
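As an illustration of the single precision layout (1 sign bit, 8 exponent bits, 23 fraction bits plus the hidden bit), the following Python sketch splits a value into its fields and rebuilds it. The helper names are hypothetical, and the reconstruction assumes a normalized operand.

```python
import struct


def decode_single(x):
    """Split x into (sign, biased exponent, fraction) of its 32-bit pattern."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, biased_exponent, fraction


def single_value(sign, biased_exponent, fraction):
    """Rebuild a normalized value: (-1)^sign * 1.fraction * 2^(exponent - 127)."""
    significand = 1 + fraction / 2**23  # hidden bit plus 23 fraction bits
    return (-1) ** sign * significand * 2.0 ** (biased_exponent - 127)


fields = decode_single(2.625)
print(fields)                 # sign, biased exponent, fraction
print(single_value(*fields))  # rebuilt value
```

For 2.625, which is 1.0101 in binary times 2^1, the biased exponent is 1 + 127 = 128 and the fraction field starts with the bits 0101.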
1.2 Data formats for single and double precision
Table 1.1: Data formats for Precision
PRECISION   SIGN BIT   BIASED EXPONENT   UNSIGNED FRACTION
Single      S          8 bit E           23 bits P
Double      S          11 bit E          52 bits P
Double precision uses 64 bits: the precision is 52 fraction bits plus 1 hidden bit (52+1), and the exponent uses 11 bits. The maximum and minimum exponent values are +1023 and -1022. For quadruple precision, the precision is 112+1 bits and the minimum and maximum exponent values are -16382 and +16383 [3].
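The parameters quoted for the three formats follow from the total width and the exponent width alone. The short Python sketch below derives them; the function name and the dictionary keys are chosen here for illustration and are not taken from the book.

```python
def format_parameters(total_bits, exponent_bits):
    """Derive IEEE binary format parameters from the field widths."""
    fraction_bits = total_bits - 1 - exponent_bits  # one bit is the sign
    bias = (1 << (exponent_bits - 1)) - 1
    return {
        "fraction_bits": fraction_bits,
        "precision": fraction_bits + 1,  # fraction plus hidden bit
        "bias": bias,
        "max_exponent": bias,
        "min_exponent": 1 - bias,
    }


for name, total, exp in [("single", 32, 8), ("double", 64, 11), ("quadruple", 128, 15)]:
    print(name, format_parameters(total, exp))
```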
1.3 Representation of the floating point
Figure 1.1: Floating point representation
Source: https://www.cise.ufl.edu/~mssz/CompOrg/CDA-arith.html
1.3.1 Denormalized
A value is denormalized when the exponent field is all zeros and the fraction (significand) field is non-zero. Denormalized numbers lie between zero and the lower end of the normalized range [3]. Zero is a special value in which both the exponent field and the fraction field are all zeros.
1.3.2 Overflow
Overflow occurs when the magnitude of a result exceeds the largest value the format can represent, whether the result is positive or negative. It indicates that the result has reached the extreme of the range. Overflow is not signaled when one operand is infinity, because arithmetic on infinity is exact [4]. When a result overflows, its exponent bias is adjusted before delivery; if it still lies outside the range, a NaN is delivered instead.
1.3.3 Underflow
Underflow takes place when the magnitude of a non-zero floating point result is smaller than the smallest normalized value. For single precision the normalized exponent ranges from -126 to +127, so underflow occurs when the result exponent falls below -126. The result may be zero or a denormalized number [4]. Accuracy is lost once results fall into the denormalized range. Underflow handling adjusts the delivered result in the same way as overflow handling.
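For a multiplier, overflow and underflow can be flagged from the exponent sum. The following Python sketch assumes the single precision limits of -126 and +127 for normalized exponents and ignores the extra increment that normalization of the mantissa product may add; the names are hypothetical.

```python
BIAS = 127
MIN_EXPONENT = -126  # smallest normalized single precision exponent
MAX_EXPONENT = 127   # largest normalized single precision exponent


def exponent_status(unbiased_exponent):
    """Classify a result exponent as overflow, underflow or normal."""
    if unbiased_exponent > MAX_EXPONENT:
        return "overflow"
    if unbiased_exponent < MIN_EXPONENT:
        return "underflow"
    return "normal"


def product_exponent_status(biased_a, biased_b):
    """Check the exponent of a product from the two biased operand exponents."""
    # Remove the bias from each operand before adding.
    return exponent_status((biased_a - BIAS) + (biased_b - BIAS))


print(product_exponent_status(254, 254))  # two very large operands
print(product_exponent_status(1, 1))      # two very small operands
print(product_exponent_status(127, 128))  # moderate operands
```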
1.3.4 Infinity
The values +infinity and -infinity are encoded with an exponent field of all ones and a fraction field of all zeros. The sign bit is 0 for positive infinity and 1 for negative infinity. Infinity is a special value that lets operations continue past overflow situations. It is also used to represent the result of some otherwise undefined operations [5].
1.3.5 Not a Number
A NaN is an invalid value that does not represent a real number. In a NaN, the exponent field is all ones and the fraction field is non-zero [6].
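The special values of sections 1.3.1 to 1.3.5 can be told apart from the exponent and fraction fields alone. A minimal Python sketch for single precision bit patterns, with a hypothetical function name:

```python
def classify_single(bits):
    """Classify a 32-bit single precision pattern by its exponent and fraction."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        # All-zero exponent: zero or a denormalized number.
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 0xFF:
        # All-ones exponent: infinity or NaN.
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"


for pattern in (0x00000000, 0x00000001, 0x3F800000, 0x7F800000, 0x7FC00000):
    print(hex(pattern), classify_single(pattern))
```

The sign bit plays no part in the classification, so 0x80000000 is also a zero and 0xFF800000 is also an infinity.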
CHAPTER 2
LITERATURE SURVEY
2.1 FLOATING POINT MULTIPLIER USING VEDIC MATHEMATICS
Radhika Jumde, IJE [1], describes research on the floating point multiplier in "Design of 32 bit single precision floating point multiplier using vedic mathematics". To improve delay, an algorithm called Urdhva-Tiryakbhyam is used for the multiplier design. This approach decreases the number of components and the complexity of the hardware circuit. In this project, the Vedic multiplication technique is used to design an IEEE 754 floating point multiplier. The sign bit of the result is calculated using one XOR gate, and a Carry Save Adder is used for adding the two biased exponents. The underflow and overflow cases are handled. The multiplier is designed, synthesized and simulated in VHDL using the Xilinx ISE tool. The result shows high-speed multiplication with the carry save adder.
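The Urdhva-Tiryakbhyam ("vertically and crosswise") rule forms each result column from the cross products of the operand bits whose positions add up to that column, then passes the carry on to the next column. The Python sketch below models only this column arithmetic as a sequential loop. It is not the cited paper's circuit, where the cross products would be generated in parallel, and the function name is hypothetical.

```python
def urdhva_multiply(a, b, width):
    """Multiply two unsigned width-bit integers column by column."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    result = 0
    carry = 0
    for column in range(2 * width - 1):
        total = carry
        # Crosswise step: add every bit product a[i] * b[j] with i + j == column.
        for i in range(width):
            j = column - i
            if 0 <= j < width:
                total += a_bits[i] & b_bits[j]
        result |= (total & 1) << column  # column sum bit
        carry = total >> 1               # carry into the next column
    # The carry left after the last column is the top bit of the product.
    result |= carry << (2 * width - 1)
    return result


print(urdhva_multiply(13, 11, 4))
```

With `width=24` the same function multiplies two single precision significands into a 48-bit product.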
2.2 FLOATING POINT OPERATION IN FAST FOURIER TRANSFORM
E. E. Swartzlander, Jr. and H. H. Saleh, 2012 [2], "FFT implementation with fused floating point operation", IEEE Trans. This paper presents a floating-point fused dot-product unit that performs single-precision floating-point multiplication and addition operations on two pairs of data in a time that is only 150% of the time required for a conventional floating-point multiplication. When placed and routed in a 45nm process, the fused dot-product unit occupied about 70% of the area needed to implement a parallel dot-product unit using conventional floating-point adders and multipliers. The fused dot-product is 27% faster than the conventional parallel approach. The fused two-term dot-product multiplies two sets of operands and adds the products as a single operation. The two products do not need to be rounded (only the sum is normalized and rounded), which reduces the delay.
2.3 MULTIPLICATION USING CARRY SAVE MULTIPLIER
U. V. Chaudhari, Prof. A. P. Dhande, IJSER [3], "Design and simulation of binary floating point multiplier using VHDL". This paper shows the possible ways to represent real numbers in binary format. Floating point numbers are represented in two floating point formats, the binary interchange format and the decimal interchange format. To improve speed, the multiplication of the mantissa is done using a specific multiplier replacing the Carry Save Multiplier. To give more precision, rounding is not implemented for the mantissa multiplication. The binary floating point multiplier is planned to be implemented using VHDL, and it is simulated and synthesized using ModelSim and Xilinx ISE software. This multiplier does not implement rounding and presents the significand multiplication result as is (48 bits), which gives better precision if the whole 48 bits are utilized in another unit.
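The 48-bit figure follows from multiplying two 24-bit significands (23 fraction bits plus the hidden bit). The Python sketch below, with hypothetical helper names, forms the full product and shows which bits would be lost if only the upper 24 bits were passed on.

```python
def significand_product(fraction_a, fraction_b):
    """Multiply two 24-bit significands built from 23-bit fraction fields."""
    sig_a = (1 << 23) | fraction_a  # restore the hidden bit
    sig_b = (1 << 23) | fraction_b
    return sig_a * sig_b            # needs 47 or 48 bits


def split_product(product):
    """Return (upper 24 bits, lower 24 bits) of a 48-bit product."""
    return product >> 24, product & 0xFFFFFF


full = significand_product(0x7FFFFF, 0x7FFFFF)
kept, discarded = split_product(full)
print(full.bit_length(), hex(kept), hex(discarded))
```

A following unit that consumes all 48 bits, such as an accumulator, sees the discarded half as well, which is the precision gain the paper refers to.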
2.4 PARALLEL IMPLEMENTATION OF FLOATING POINT
S.Kishore, S.P.Prakash, IJIRSET [4]. This paper presents floating point fused add-subtract and fused dot-product units that perform simultaneous floating-point add and multiplication operations. It performs the major part of a single addition, subtraction and dot-product using a parallel implementation. The unit uses the single-precision format and supports all rounding modes. The fused add-subtract unit is only about 56% larger than a conventional floating-point multiplier, and consumes 50% more power than the conventional floating-point adder. The fused dot-product is about 27% faster than the conventional parallel approach. The combined units are mainly intended for FFT algorithms. The simulation results are obtained using the Xilinx 14.3 EDA tool. The proposed system reduces the shift amount, normalization is applied to reduce the size of the significand addition, and LZA reduces the reduction tree.
2.5 NORMALIZATION OF FLOATING POINT
V.Narasimha Rao, V.Swathi [5], "Normalization on floating point multiplication using Verilog HDL". This paper describes an efficient implementation of an IEEE 754 single precision floating point multiplier targeted for FPGA. VHDL is used to implement a technology-independent pipelined design. The multiplier implementation handles the overflow and underflow cases. Rounding is not implemented, to give more precision when using the multiplier in a Multiply and Accumulate (MAC) unit. A latency-optimized floating point unit using the primitives of the Xilinx Virtex II FPGA was implemented with a latency of 4 clock cycles. The multiplier reached a maximum clock frequency of 100 MHz.
2.6 CONFIGURABLE BOOTH MULTIPLIER
K. Sreenath, K. Shashidhar [6], "The design of low power and high speed configurable booth multiplier". This paper presents the design of a low power and high speed configurable Booth multiplier (CBM) that supports single 16*16 multiplication, single 8*8 multiplication, and twin parallel 8*8 multiplication operations. To efficiently reduce power consumption, a dynamic-range detector is developed to dynamically detect the effective dynamic ranges of the two input operands. The detection result is used not only to pick the operand with the smaller dynamic range for Booth encoding but also to deactivate the redundant switching activities in ineffective ranges as much as possible. Moreover, the output product of the proposed multiplier can be truncated further, which decreases power consumption by sacrificing a bit of output precision. To obtain accurate results, some additional components, including a sign-bit generator, a modified error compensation circuit, a Dadda compressor and a carry look ahead adder, are also developed. The results show that the proposed multiplier is more complex than non-CBMs, but significant power savings can be achieved. Furthermore, the paper shows that the proposed multiplier maintains acceptable output accuracy where truncation is performed.
2.7 IMPLEMENTATION ON FPGA
K.Kishore Shinde, A.K Kureshi [7], "Hardware implementation of configurable booth multiplier on FPGA". This paper presents an efficient implementation of a high performance configurable Radix-4 Booth multiplier with 3:2 compressors for both signed and unsigned 32 bit number multiplication and floating point arithmetic. Multiplication is one of the most used operations in many scientific and signal processing applications. The design provides flexible arithmetic capacity, better output precision, high speed and minimum area consumption. The design also dynamically disables the switching operation of the non-effective input ranges. Thus the ineffective circuits can be efficiently deactivated, thereby reducing power consumption and increasing the speed of operation. The proposed multiplier design outperforms the conventional multiplier in terms of area and speed efficiencies. The multiplier designs have been implemented on the FPGA Spartan6 XC6SLX9 platform. VHDL code is written to generate the required hardware and to produce the partial products for the proposed Booth multiplier. After successful compilation, the RTL view is generated.
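Radix-4 Booth recoding examines the multiplier two bits at a time, together with the top bit of the previous pair, and replaces each pair with a digit from -2 to +2, so the number of partial products is halved. The Python sketch below shows the recoding and the summation of partial products for a two's-complement multiplier of even width. It is an illustration only: the function names are hypothetical, and the compressors and configurable ranges of the cited design are not modelled.

```python
def booth_radix4_digits(multiplier, width):
    """Recode a two's-complement multiplier into radix-4 Booth digits, LSB first."""
    if width % 2:
        raise ValueError("width must be even")
    bits = multiplier & ((1 << width) - 1)  # two's-complement bit pattern
    digits = []
    previous = 0  # implicit bit to the right of the least significant bit
    for i in range(0, width, 2):
        low = (bits >> i) & 1
        high = (bits >> (i + 1)) & 1
        digits.append(low + previous - 2 * high)  # digit in {-2, -1, 0, 1, 2}
        previous = high
    return digits


def booth_multiply(multiplicand, multiplier, width):
    """Sum the partial products selected by the Booth digits."""
    product = 0
    for i, digit in enumerate(booth_radix4_digits(multiplier, width)):
        # Each digit selects 0, +/-1 or +/-2 times the multiplicand, weighted by 4^i.
        product += (digit * multiplicand) << (2 * i)
    return product


print(booth_radix4_digits(6, 4))
print(booth_multiply(7, -3, 4))
```

For unsigned operands such as 24-bit mantissas, one way to reuse the signed recoding is to widen the operand with leading zero bits (for example to 26 bits) so that the top bit is always 0.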
2.8 DIFFERENT MULTIPLIERS
Soniya, Suresh kumar [8], "A Review of Different Type of Multipliers and Multiplier-Accumulator Unit". A high speed, low power MAC unit is an essential requirement of today's VLSI systems and digital signal processing applications such as FFT, finite impulse response filters and convolution. In this paper, they discuss different types of multipliers, such as the Booth multiplier, combinational
multiplier, Wallace tree multiplier, array multiplier and sequential multiplier. Each
Details
- Pages: 67
- Type of Edition: First edition
- Publication Year: 2017
- ISBN (PDF): 9783960676553
- ISBN (Softcover): 9783960671558
- File size: 10.6 MB
- Language: English
- Institution / College: Mahalingam College of Engineering and Technology
- Publication date: 2017 (June)
- Keywords: Floating point representation, Floating point multiplication, Double precision, Vedic mathematics, Booth multiplication, Booth multiplier, Floating-point arithmetic
- Product Safety: Anchor Academic Publishing