Architectures for floating-point division

Nikmehr, Hooman

Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/37996

Type:	Thesis
Title:	Architectures for floating-point division
Author:	Nikmehr, Hooman
Issue Date:	2005
School/Discipline:	School of Electrical and Electronic Engineering
Abstract:	Almost all recent microprocessors and DSP chips perform addition, subtraction, multiplication and division in hardware. However, studying their performance reveals that division is not carried out as fast as the other three operations. One investigation shows that although floating-point division, at about 3% of the dynamic floating-point instruction count, seems to be a relatively unimportant instruction, it may degrade overall system performance by about 40%. Several mathematical algorithms have been developed over the past 50 years to perform division quickly and with high precision. However, only a few are suitable for implementation in VLSI. Among them, digit recurrence algorithms are the most widely accepted methods of performing floating-point division in the latest processors. A survey shows that out of 13 recent processors, 11 use SRT division¹ for floating-point division. Investigations show that SRT division gives the best tradeoff between delay and area. Selecting SRT division for implementing floating-point division is a reasonable choice because, unlike the other class of division algorithms, i.e. functional algorithms, it produces a correctly rounded quotient conforming to the IEEE 754 standard. There are several techniques for improving the performance of SRT division. Of these, the most important are increasing the speed of quotient digit selection (QDS), striking the best balance between the radix and the redundancy factor, representing the partial remainder in a redundant form, converting the quotient from redundant to conventional form on the fly, and overlapping the division recurrence components. In this thesis a different method of implementing the QDS function is proposed. This approach, which is described mathematically and architecturally, is based on the new idea of comparison multiples.
Unlike the traditional implementation of the QDS function, which searches for the quotient digit in a lookup table, the proposed method calculates the quotient digit directly in sign-and-magnitude format. This approach almost halves the fan-out of some critical path components, which therefore operate faster. Having received the truncated partial remainder, the QDS function compares it with truncated multiples of the divisor to find the range to which the partial remainder belongs. The results of the comparisons are converted to the magnitude of the quotient digit using simple logic called the coder. Concurrently, another circuit checks the truncated partial remainder to determine whether the quotient digit is negative. This circuit operates off the critical path, since the comparison-multiples-based QDS function calculates the sign and magnitude of the quotient digit separately. With these changes, a faster QDS function and, consequently, a shorter critical path delay for the floating-point divider are obtained. Implementations of radix-4 and radix-16 floating-point dividers are investigated and optimised to further decrease the cycle time. The idea of comparison multiples is extended to radix 10 to implement a decimal floating-point divider complying with the IEEE 754R standard. To achieve this goal, decimal signed-digit arithmetic, along with implementations of carry-free addition and subtraction, is proposed. The original comparison-multiples-based implementation of high-radix SRT division is modified to suit radix 10. The binary and decimal implementations of comparison-multiples-based division are evaluated for delay. Using the method of logical effort, the radix-4, radix-16 and decimal floating-point dividers are found to be faster than corresponding circuits reported in the public literature. Note: ¹ SRT division is a type of non-restoring digit recurrence division.
Advisor:	Lim, Cheng-Chew; Philips, Braden
Dissertation Note:	Thesis (Ph.D.)--University of Adelaide, School of Electrical and Electronic Engineering, 2005.
Subject:	Microprocessors; Computer architecture; Algorithms
Keywords:	computer architecture, computer algorithms
Provenance:	This electronic version is made publicly available by the University of Adelaide in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to the Fair Dealing exception. If you are the author of this thesis and do not wish it to be made publicly available, or if you are the owner of any included third party copyright material you wish to be removed from this electronic version, please complete the take down form located at: http://www.adelaide.edu.au/legals
Appears in Collections:	Research Theses
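
Illustration (not part of the repository record): the abstract describes selecting each quotient digit by comparing the partial remainder with multiples of the divisor, with the sign and the magnitude determined separately. The following is a minimal software sketch of that general idea for radix 4 with the digit set {-2, ..., 2}. It is not the thesis's circuit: the function names, the thresholds d/2 and 3d/2, and the use of exact rational arithmetic in place of truncated hardware values are all assumptions made for this example.

```python
from fractions import Fraction


def select_digit(y, d):
    """Pick a radix-4 quotient digit in {-2..2} for the shifted remainder y = 4*w.

    Hypothetical helper: thresholds d/2 and 3d/2 are assumed for illustration,
    and the comparisons are exact rather than on truncated values.
    """
    negative = y < 0                      # sign found independently of magnitude
    mag_y = -y if negative else y
    # Compare against multiples of the divisor (a thermometer code).
    comparisons = [mag_y >= d / 2, mag_y >= 3 * d / 2]
    magnitude = sum(comparisons)          # "coder": count of true comparisons
    return -magnitude if negative else magnitude


def srt_divide_radix4(x, d, steps=12):
    """Approximate x/d by digit recurrence; returns (quotient, digits)."""
    x, d = Fraction(x), Fraction(d)
    if d <= 0 or not (0 <= x < 2 * d):
        raise ValueError("this sketch assumes d > 0 and 0 <= x < 2*d")
    w = x / 4                             # initial remainder, |w| <= 2d/3
    q = Fraction(0)
    digits = []
    for j in range(1, steps + 1):
        y = 4 * w                         # shift the partial remainder
        digit = select_digit(y, d)
        w = y - digit * d                 # recurrence: w[j+1] = 4*w[j] - q*d
        assert abs(w) <= Fraction(2, 3) * d   # remainder stays bounded
        q += Fraction(digit, 4 ** j)
        digits.append(digit)
    return 4 * q, digits                  # undo the initial scaling by 1/4


quotient, digits = srt_divide_radix4(Fraction(3, 2), Fraction(5, 4))
print(float(quotient), digits)
```

Because the digit set is redundant, several digit sequences can represent the same quotient; that slack is what allows hardware to decide from truncated operands, a step this exact-arithmetic sketch does not model.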

Files in This Item:

File	Description	Size	Format
01front.pdf		108.31 kB	Adobe PDF
02whole.pdf		1.17 MB	Adobe PDF


Adelaide Research & Scholarship