
Number Representation

krraju.in

What you’ll learn

  • How to represent data in a computer?
  • What is fixed-point and floating point number representation?
  • How to represent signed numbers?
  • What is IEEE 754 Standard?

Unit-1

Digital Computers and Arithmetic: Historical perspective and von Neumann computers, Fixed and floating-point representation of numbers, Addition and Subtraction, Multiplication and Division algorithms, Floating-point arithmetic operations.

Digital Computers and Numbers

Numbers can be represented in multiple number base systems.

  • Decimal (base 10) system uses digits 0 to 9.
  • Computers use the binary (base 2) system because their digital components operate in two states, on and off. The basic unit of information is the bit (binary digit), which is either 0 or 1.
    • Binary Coded Decimal (BCD): each decimal digit is represented by four binary bits, 0000 to 1001.
  • In computing, hexadecimal (base 16) or octal (base 8) number systems are used to represent binary numbers in a compact form.
    • Octal: 0-7 (Binary Coded Octal 000 to 111)
    • Hexadecimal: 0-9,A-F (Binary Coded Hexadecimal 0000 to 1111)
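
The number systems listed above can be compared on a single value. The following is a minimal sketch; the helper name `to_bcd` is chosen here for illustration and does not come from the slides.

```python
def to_bcd(n: int) -> str:
    """Encode a non-negative decimal integer digit by digit, four bits per digit."""
    return " ".join(format(int(digit), "04b") for digit in str(n))

value = 14
print(format(value, "b"))   # binary:      1110
print(format(value, "o"))   # octal:       16
print(format(value, "X"))   # hexadecimal: E
print(to_bcd(value))        # BCD:         0001 0100
```

Note that the BCD form of 14 (two 4-bit groups, one per decimal digit) differs from its pure binary form.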


Representation of Numbers

Number

  • Fixed-point
    • Unsigned: (14) 0000 1110
    • Signed
      • Signed-magnitude: (+14) 0000 1110, (-14) 1000 1110
      • Signed-1’s complement: (+14) 0000 1110, (-14) 1111 0001
      • Signed-2’s complement: (+14) 0000 1110, (-14) 1111 0010
  • Floating point: $\pm m \times 2^{\pm e}$, with mantissa $m$ and exponent $e$
    • Example: $+(.1001110)_2 \times 2^{+4}$ is stored as fraction 01001110 and exponent 000100

Fixed-point Representation

In integer numbers, radix point is fixed and assumed to be to the right of the rightmost digit.

  • As the radix point is fixed, the number system is referred to as fixed point number system.

Advantage

  • Consumes fewer computing resources, and arithmetic operations are easy to perform

Disadvantage

  • Relatively limited range of values

Fixed-point representation is convenient for representing numbers with bounded orders of magnitude.
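
As an illustration of a fixed radix point, the sketch below stores real values as scaled integers. The choice of 4 fraction bits and the names `FRACTION_BITS`, `to_fixed` and `from_fixed` are assumptions made for this example only.

```python
FRACTION_BITS = 4  # assumed position of the implied binary point

def to_fixed(x: float) -> int:
    """Scale a real value to an integer with FRACTION_BITS bits after the point."""
    return round(x * (1 << FRACTION_BITS))

def from_fixed(raw: int) -> float:
    """Undo the scaling applied by to_fixed."""
    return raw / (1 << FRACTION_BITS)

a = to_fixed(2.5)    # 40, i.e. 0010.1000
b = to_fixed(1.25)   # 20, i.e. 0001.0100
total = a + b        # plain integer addition, no exponent handling needed
print(from_fixed(total))  # 3.75
```

With 8 unsigned bits and this split, the largest representable value is 255/16 = 15.9375, which shows the limited range mentioned above.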

Unsigned Numbers

Fixed-point (integer) numbers are represented in signed and unsigned forms.

Unsigned representation

  • Used to represent non-negative numbers (zero and positive values).
  • Has an implied binary point between the integer and fraction bits, analogous to the decimal point.

[Figure: number line from 0 to +∞ marking 0, 1, 2, 3, 4]

Signed Numbers

Signed representation

  • Sign bit (0 for +ve and 1 for -ve) placed in the left most position of a number

Positive number: Sign bit is 0 and the magnitude is a positive number

Negative number: Sign bit is 1 and the rest of the number is represented in one of the following

    • Signed-magnitude: Consists of the magnitude and a -ve sign
    • Signed-1’s complement: 1’s complement of its positive value
    • Signed-2’s complement: 2’s complement of its positive value

There is only one way to represent a positive number, but there are three different ways to represent a negative number.

[Figure: number line from -∞ to +∞ marking -4, -3, -2, -1, 0, 1, 2, 3, 4]

Signed Magnitude Representation

  • For an n-bit number the leftmost bit is the sign bit (0 for a +ve and 1 for a -ve) and the remaining n-1 bits represent the magnitude.
    • e.g., (8-bit words): +14 = 0 000 1110 and -14 = 1 000 1110
    • Range: $-(2^{n-1}-1)$ to $+(2^{n-1}-1)$
  • Easy to obtain the magnitude and the negative of a number.
  • Problems:
    • The sign must be considered during arithmetic operations
      • It makes arithmetic difficult because the binary sum of a number and its sign-magnitude negative is not zero.
    • The dual representation of zero (-0 and +0)
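
The points above can be checked with a small sketch for 8-bit words. The function names are hypothetical; the sketch only illustrates the encoding and why plain binary addition does not work on it.

```python
BITS = 8

def encode_sign_magnitude(x: int) -> int:
    """Return the 8-bit sign-magnitude pattern of x as an unsigned integer."""
    limit = (1 << (BITS - 1)) - 1          # 127 for 8 bits
    if abs(x) > limit:
        raise ValueError("value out of range")
    sign = 1 if x < 0 else 0
    return (sign << (BITS - 1)) | abs(x)

def decode_sign_magnitude(pattern: int) -> int:
    """Recover the signed value: the low 7 bits are the magnitude."""
    magnitude = pattern & ((1 << (BITS - 1)) - 1)
    return -magnitude if pattern >> (BITS - 1) else magnitude

plus = encode_sign_magnitude(14)     # 0000 1110
minus = encode_sign_magnitude(-14)   # 1000 1110
raw_sum = (plus + minus) & 0xFF      # 1001 1100, which is not zero
print(format(raw_sum, "08b"))
```

Both 0000 0000 and 1000 0000 decode to zero, which is the dual representation of zero.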

One’s Complement Representation

  • Binary case of the diminished radix complement (like the 9’s complement for base 10 numbers).
  • Negative numbers are represented by bit-by-bit complementation of the (positive) magnitude (the process of negation).
    • e.g., (8-bit words): +14 = 0 000 1110 and -14 = 1 111 0001
  • Magnitude can be obtained by
    • reading the bits unchanged (for positive numbers, MSB = 0)
    • complementing all the bits (for negative numbers, MSB = 1)
  • Still has a dual representation for zero (all zeros and all ones)
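
A minimal sketch of one's complement negation and decoding for 8-bit words follows. The function names are illustrative assumptions.

```python
BITS = 8
MASK = (1 << BITS) - 1

def negate_ones_complement(pattern: int) -> int:
    """Negate by complementing every bit of the word."""
    return ~pattern & MASK

def decode_ones_complement(pattern: int) -> int:
    """Recover the signed value: complement first when the MSB is 1."""
    if pattern >> (BITS - 1):
        return -(~pattern & MASK)
    return pattern

minus_14 = negate_ones_complement(0b00001110)
print(format(minus_14, "08b"))           # 11110001
print(decode_ones_complement(minus_14))  # -14
```

Negating 0000 0000 gives 1111 1111, and both patterns decode to zero.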

Two’s Complement Representation

Given the representation for +X, the representation for –X is found by taking the 1s complement of +X and adding 1

  • e.g., (8-bit words): +14 = 0 000 1110 and -14 = 1 111 0010
  • Negation and magnitude extraction need an extra step (complement, then add 1)
  • Converting between two word lengths
    • e.g., converting an 8-bit into a 16-bit format requires sign extension: every bit added to the left of the old sign bit takes on the value of the old sign bit.

+14 = 0 000 1110 (8 bit) and 0 000 0000 0000 1110 (16 bit)

-14 = 1 111 0010 (8 bit) and 1 111 1111 1111 0010 (16 bit)
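
Negation and sign extension can be sketched as follows. The function names and parameters are assumptions made for this illustration.

```python
def negate_twos_complement(pattern: int, bits: int) -> int:
    """Take the 1's complement of the word and add 1, keeping only `bits` bits."""
    mask = (1 << bits) - 1
    return ((~pattern & mask) + 1) & mask

def sign_extend(pattern: int, old_bits: int, new_bits: int) -> int:
    """Widen a word by copying the old sign bit into every new bit position."""
    sign = (pattern >> (old_bits - 1)) & 1
    if sign:
        extension = ((1 << (new_bits - old_bits)) - 1) << old_bits
        return pattern | extension
    return pattern

minus_14 = negate_twos_complement(0b00001110, 8)
print(format(minus_14, "08b"))                       # 11110010
print(format(sign_extend(minus_14, 8, 16), "016b"))  # 1111111111110010
print(format(sign_extend(0b00001110, 8, 16), "016b"))  # 0000000000001110
```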

An 8-Bit Number Representation

2’s Complement

| Number | Representation | Example |
|---|---|---|
| x = 0 | 0 | 0 (0000 0000) |
| 0 < x ≤ 127 | x | 77 (0100 1101) |
| -128 ≤ x < 0 | 256 - \|x\| | -56 (1100 1000) |

Sign-Magnitude

| Number | Representation | Example |
|---|---|---|
| x = 0 | 0 or 128 | 0 (0000 0000), 0 (1000 0000) |
| 0 < x ≤ 127 | x | 77 (0100 1101) |
| -127 ≤ x < 0 | 128 + \|x\| | -56 (1011 1000) |

1’s Complement

| Number | Representation | Example |
|---|---|---|
| x = 0 | 0 or 255 | 0 (0000 0000), 0 (1111 1111) |
| 0 < x ≤ 127 | x | 77 (0100 1101) |
| -127 ≤ x < 0 | 255 - \|x\| | -56 (1100 0111) |
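
The formulas for 8-bit words (x, 256 - |x|, 128 + |x| and 255 - |x|) translate directly into code. This is a sketch; the function name and the system labels are hypothetical, and range checking is left out for brevity.

```python
def encode_8bit(x: int, system: str) -> int:
    """Return the unsigned value of the 8-bit pattern that represents x."""
    if x >= 0:
        return x                  # positive numbers are the same in all systems
    if system == "twos":
        return 256 - abs(x)
    if system == "sign_magnitude":
        return 128 + abs(x)
    if system == "ones":
        return 255 - abs(x)
    raise ValueError(f"unknown system: {system}")

for system in ("twos", "sign_magnitude", "ones"):
    print(system, format(encode_8bit(-56, system), "08b"))
# twos 11001000
# sign_magnitude 10111000
# ones 11000111
```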

Comparison of Complement Systems

| Operation | Carry from the MSB | 1’s Complement: then perform | 2’s Complement: then perform | Sign bit of the result |
|---|---|---|---|---|
| Add | 0 | Result is in complement form; complement to convert to sign-magnitude form | Result is in complement form; complement to convert to sign-magnitude form | 1 |
| Add | 1 | Result is in sign-magnitude form; add 1 to the LSB of the result | Result is in sign-magnitude form; neglect the carry | 0 |
| Left shift | - | Copy sign bit into the LSB | Insert 0 into the LSB | Sign bit = MSB of magnitude |
| Right shift | - | Copy sign bit into the MSB of the magnitude | Copy sign bit into the MSB of the magnitude | Sign bit unchanged |
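
The difference in how the two systems treat the carry out of the MSB during addition can be sketched for 8-bit words. The function names are assumptions, and overflow detection is deliberately omitted.

```python
BITS = 8
MASK = (1 << BITS) - 1

def add_ones_complement(a: int, b: int) -> int:
    """Add two 1's complement words; a carry out of the MSB is added to the LSB."""
    total = a + b
    carry = total >> BITS
    return (total + carry) & MASK

def add_twos_complement(a: int, b: int) -> int:
    """Add two 2's complement words; a carry out of the MSB is neglected."""
    return (a + b) & MASK

# +77 + (-56) = +21 in both systems; the carry out of the MSB is 1
print(format(add_ones_complement(0b01001101, 0b11000111), "08b"))  # 00010101
print(format(add_twos_complement(0b01001101, 0b11001000), "08b"))  # 00010101
```

When the carry is 0 and the sign bit of the result is 1, the result is negative and stays in complement form, for example -77 + 56 = -21.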

Which Representation is Better?

Design considerations

  • Speed of arithmetic (addition, multiplication)
  • Speed of conversion
  • Range of values that can be represented.

Two’s complement is the most common method of representing signed integers on computers, and more generally, fixed point binary values.

Floating-point Representation

Real number representation format that does not fix the number of digits before and after the binary (decimal) point.

  • Represents a wide range of numbers, from very small to very large values, which is useful for scientific and engineering applications
    • During computation, the position of the binary point "floats": it is adjusted automatically, hence the name floating-point representation.
  • This representation is not exact and can result in rounding errors.
    • The spacing between adjacent representable values grows with the magnitude of the number, so large values are stored with a larger absolute error.
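
Rounding error and the growing spacing between representable values can be observed directly. This sketch assumes that Python's `float` is an IEEE 754 double, which is the case on common platforms, and uses `math.ulp` (available from Python 3.9).

```python
import math

# 0.1, 0.2 and 0.3 have no exact binary representation, so the sum is rounded
print(0.1 + 0.2 == 0.3)      # False

# Spacing between adjacent representable values near 1.0 and near 1e16
print(math.ulp(1.0))         # about 2.2e-16
print(math.ulp(1e16))        # 2.0

# Adding 1 to 1e16 is lost because the spacing there is 2
print(1e16 + 1.0 == 1e16)    # True
```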

Floating-point Numbers

Floating point numbers have a sign, mantissa (m) or significand (S), radix (r) or base (b) and exponent (e)

  • Analogous to scientific notation.
  • Not limited to a constant number of integer and fractional bits.

The floating point representation of a number has two parts

  • First part: Signed fixed point number called mantissa.
  • Second part: Designates the position of the binary (decimal) point and is called the exponent.

$\pm m \times r^{\pm e}$ or, in the alternative notation, $\pm S \times b^{\pm e}$
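
The split into sign, mantissa and exponent can be tried out with the standard `math.frexp` function, which uses base 2 and a mantissa in the range 0.5 ≤ m < 1. The wrapper name `decompose` is an assumption for this sketch.

```python
import math

def decompose(x: float) -> tuple[int, float, int]:
    """Return (sign, m, e) such that x == sign * m * 2**e."""
    m, e = math.frexp(abs(x))
    sign = -1 if math.copysign(1.0, x) < 0 else 1
    return sign, m, e

sign, m, e = decompose(9.75)
print(sign, m, e)            # 1 0.609375 4
print(sign * m * 2 ** e)     # 9.75
```

Here 0.609375 is $(.100111)_2$, so 9.75 is $+(.100111)_2 \times 2^{+4}$.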

Expressible Numbers

Overflow and underflow happen when a value is outside the representable range. If the magnitude of the value is too large, it is overflow; if the magnitude is too small (too close to zero), it is underflow.
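
Both conditions can be provoked in a short sketch, assuming that Python's `float` is an IEEE 754 double (true on common platforms).

```python
import sys

largest = sys.float_info.max   # largest finite double
overflowed = largest * 10      # magnitude too large: the result becomes infinity
print(overflowed)              # inf

smallest = 5e-324              # smallest positive (subnormal) double
underflowed = smallest / 2     # magnitude too small: the result becomes zero
print(underflowed)             # 0.0
```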

Biased Representation

  • An integer representation that shifts (biases) the values so that the bit patterns look like unsigned numbers but can also represent negative numbers.
    • Exponents are signed values in order to represent both tiny and huge values.
    • Two's complement, the usual representation for signed values, would make comparison harder.
  • A biased exponent simplifies bitwise comparison of two floating-point numbers: for non-negative values, the larger number has the larger bit pattern.
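
The comparison property can be checked by reading single-precision bit patterns as unsigned integers. This sketch uses the standard `struct` module; the helper name `float32_bits` is an assumption.

```python
import struct

def float32_bits(x: float) -> int:
    """Return the 32-bit single-precision pattern of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

values = [0.5, 1.0, 1.5, 2.0, 1024.0]      # non-negative, in increasing order
patterns = [float32_bits(v) for v in values]
print([hex(p) for p in patterns])
print(patterns == sorted(patterns))         # True: same order as the values
```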

Biased Exponent

A constant value (called bias) is added to the true exponent. This allows the exponent to be represented as an unsigned integer.

  • For IEEE single-precision floats, this value is 127.
    • A true exponent of zero means that 127 is stored in the exponent field.
  • Biased exponents are used in computer hardware and software to represent floating-point numbers efficiently and accurately.
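
The bias arithmetic for single precision can be sketched as follows. The function names are assumptions, and the exponent range of -126 to 127 for normalized numbers is taken as given.

```python
BIAS = 127  # bias for IEEE single precision

def to_stored(true_exponent: int) -> int:
    """Add the bias so the exponent can be stored as an unsigned integer."""
    if not -126 <= true_exponent <= 127:
        raise ValueError("outside the normalized single-precision range")
    return true_exponent + BIAS

def to_true(stored: int) -> int:
    """Subtract the bias to recover the true exponent."""
    return stored - BIAS

print(to_stored(0))      # 127
print(to_stored(-126))   # 1
print(to_stored(127))    # 254
print(to_true(130))      # 3
```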

Implied Base

The base is not represented in the format.

  • The base is assumed to be 2
  • In some systems, to increase the range of numbers that can be represented, the base is assumed as 4 or 16.

By increasing the implied base or the number of bits in the exponent field we can increase the range of numbers that can be represented.

Density of Floating-point numbers

Normalization

A number in scientific notation with no leading 0s is called a Normalised Number (e.g., $1.0 \times 10^{-8}$ or $1.0 \times 2^{-3}$; non-normalised form: $10.0 \times 10^{-9}$)

  • Simplifies the exchange of data
  • Simplifies the arithmetic algorithms to know that the numbers will always be in this form
  • Increases the accuracy of numbers that can be stored in a word
    • Maximum possible number of significant digits.
      • Unnecessary leading 0 is replaced by another significant digit to the right of the decimal (binary) point

Normalized Number

Floating point numbers are usually normalized: the exponent is adjusted so that the leading bit (MSB) of the mantissa is 1.

$\pm 0.1bbb\ldots b \times 2^{\pm e}$

  • Where b is either 0 or 1
  • The leftmost bit of the mantissa is always 1
    • Since this bit is always 1, it need not be stored; it is implicit.
  • Zero cannot be normalized; it is represented by all 0’s in the mantissa and exponent

Increasing the number of bits in the mantissa field increases the precision of a number.

  • Double-precision numbers use roughly twice as many bits, most of them in the mantissa
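
Normalization to the form 0.1bbb…b can be sketched as a loop that shifts the mantissa and adjusts the exponent. The function name and the use of a Python float for the mantissa are assumptions made for illustration; real hardware shifts the mantissa bits instead.

```python
def normalize(mantissa: float, exponent: int) -> tuple[float, int]:
    """Adjust (mantissa, exponent) so that 0.5 <= |mantissa| < 1 for nonzero input."""
    if mantissa == 0:
        return 0.0, 0                 # zero cannot be normalized
    while abs(mantissa) >= 1:         # shift right, increase the exponent
        mantissa /= 2
        exponent += 1
    while abs(mantissa) < 0.5:        # shift left, decrease the exponent
        mantissa *= 2
        exponent -= 1
    return mantissa, exponent

print(normalize(9.75, 0))     # (0.609375, 4)
print(normalize(0.0625, 0))   # (0.5, -3)
```

A mantissa of at least 0.5 corresponds to a leading binary digit of 1 right after the point.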

Normalization Example: Exponentiation to the base 2

Structured Computer Organization by A.S.Tanenbaum, 6th ed. P.677

Normalization Example: Exponentiation to the base 16

Structured Computer Organization by A.S.Tanenbaum, 6th ed. P.677

Precision and Range

In floating point numbers, the mantissa holds the significant digits of the number. The mantissa, together with the exponent, determines the precision and range.

  • Increasing the mantissa bits will increase the precision of the number.
  • Increasing the exponent bits will increase the range of the number.

Extending the precision (the floating-point counterpart of sign extension) means converting a floating point number from one precision to another while keeping its value unchanged.

  • From a smaller number of bits to a larger number of bits.
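
Conversion from a smaller to a larger format can be sketched with the standard `struct` module, using IEEE single and double precision as the two formats. The function name is an assumption for this example.

```python
import struct

def widen_float32_to_float64(bits32: int) -> int:
    """Reinterpret a single-precision pattern and return the equivalent double pattern."""
    value = struct.unpack(">f", struct.pack(">I", bits32))[0]
    return struct.unpack(">Q", struct.pack(">d", value))[0]

single = 0x41600000                        # 14.0 in single precision
double = widen_float32_to_float64(single)
print(hex(double))                         # 0x402c000000000000
print(struct.unpack(">d", struct.pack(">Q", double))[0])  # 14.0
```

The value is unchanged, but the exponent is re-biased and the fraction is padded with zeros on the right.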

Floating Number Systems

| | Proposed standard | IBM System 360, 370 | Burroughs B-5500 | Control Data 6000, 7000 series |
|---|---|---|---|---|
| Bits used | 0-31 | 0-31 | 1-47 | 0-59 |
| Radix | 2 | 16 | 8 | 8 |
| Radix point | Before the first bit (with assumed 1 to left) | Before the first digit | After the last digit | After the last digit |
| Mantissa sign position | 0 | 0 | 1 | 0 |
| Mantissa value position | 9-31 | 8-31 | 9-47 | 12-59 |
| Mantissa representation | Sign magnitude, fractional, normalized with most significant bit assumed | Sign magnitude | Sign magnitude | One’s complement of the entire word |
| Exponent sign position | - | 1 | 2 | 1 |
| Exponent value position | 1-8 | 1-7 | 3-8 | 1-11 |
| Exponent representation | Value + 127 (a nonzero number must have a nonzero representation) | Value + 64 | Sign magnitude | Value + 1024 if >= 0; value + 1023 if < 0 |
| Range of value | -126 to 127 | -64 to +63 | -63 to +63 | -1023 to +1023 |

(Introduction to Computer Architecture by H.S.Stone, 2nd ed. 1988, p.76)

IEEE 754 Standard

First published in 1985.

The leftmost bit is the sign bit for the fraction.

The exponent is stored in biased (excess) form:

  • Excess 127 (Single precision).
  • Excess 1023 (Double precision).
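
The fields of a stored number can be inspected with the standard `struct` module. The field widths used below (8 exponent bits and 23 fraction bits for single precision, 11 and 52 for double precision) are the IEEE 754 widths and are not stated on the slide; the function names are assumptions.

```python
import struct

def decode_single(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of x in single precision."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def decode_double(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of x in double precision."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

sign, exponent, fraction = decode_single(-14.0)
print(sign, exponent - 127, hex(fraction))    # 1 3 0x600000

sign, exponent, fraction = decode_double(-14.0)
print(sign, exponent - 1023, hex(fraction))   # 1 3 0xc000000000000
```

In both formats -14.0 is $-1.11_2 \times 2^{3}$: the true exponent 3 is stored as 130 (excess 127) or 1026 (excess 1023).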

Recap

Video Links