CS/COE0447 Computer Organization & Assembly Language Chapter 3 1 Topics Negative binary integers Sign magnitude, 1s complement, 2s complement Sign extension, ranges, arithmetic Signed versus unsigned operations Overflow (signed and unsigned) Branch instructions: branching backwards Implementations of addition, multiplication, division

Floating point numbers Binary fractions IEEE 754 floating point standard Operations underflow Implementations of addition and multiplication (less detail than for integers) Floating-point instructions in MIPS Guard and Round bits 2 Arithmetic So far we have studied Instruction set basics Assembly & machine language We will now cover binary arith metic algorithms and their impl ementations Binary arithmetic will provide t

he basis for the CPUs datapa th implementation 3 Binary Number Representation We looked at unsigned numbers before B31B30B2B1B0 B31231+BB30230+B+BB222+BB121+BB020 Now we want to deal with more complicated cases Negative integers Real numbers (a.k.a. floating-point numbers) Well start with negative integers Bit patterns and what they represent Well see 3 schemes; the 3rd (2s complement) is used in most computers 4 Case 1: Sign Magnitude {sign bit, absolute value (magnitude)} Sign bit 0 positive number 1 negative number

EX. (assume 4-bit representation) 0000: 0 0011: 3 1001: -1 1111: -7 1000: -0 Properties two representations of zero equal number of positive and negative numbers 5 Case 2: Ones Complement ((2N-1) number): To multiply a 1s Complement number by -1, subtract the number from (2N-1)_unsigned. Or, equivalently (and easily!), simply flip the bits

1CRepOf(A) +B 1CRepOf(-A) = 2N-1_unsigned (interesting tidbit) Lets assume a 4-bit representation (to make it easy to work with) Examples: 0011: 3 0110: 6 1001: -6 1111: -0 1000: -7 1111 0110 1111 0000 1111 0111 or just flip the bits of 0110 or just flip the bits of 0000 or just flip the bits of 0111

Properties Two representations of zero Equal number of positive and negative numbers 6 Case 3: Twos Complement (2N number): To multiply a 2s Complement number by -1, subtract the number from 2N_unsigned. Or, equivalently (and easily!), simply flip the bits and add 1. 2CRepOf(A) +B 2CRepOf(-A) = 2N_unsigned (interesting tidbit) Lets assume a 4-bit representation (to make it easy to work with) Examples: 0011: 3

0110: 6 1010: -6 1111: -1 1001: -7 1000: -8 10000 0110 10000 0001 10000 0111 10000 1000 or just flip the bits of 0110 and add 1 or just flip the bits of 0001 and add 1 or just flip the bits of 0111 and add 1 or just flip the bits of 1000 and add 1 Properties One representation of zero: 0000 An extra negative number: 1000 (this is -8, not -0) 7 Ranges of numbers Range (min to max) in N bits: SM and 1C: -2^(N-1) -1 to +B2^(N-1) -1 2C: -2^(N-1) to +B2^(N-1) -1

8 Sign Extension #s are often cast into vars with more capacity Sign extension (in all 3 representations): extend the sign bit to the left, and everything works out la $t0,0x00400033 addi $t1,$t0, 7 addi $t2,$t0, -7 R[rt] = R[rs] + SignExtImm SignExtImm = {16{immediate[15]},immediate} 9 Summary Code Sign-Magnitude 1s Complement 2s Complement

000 +B0 +B0 +B0 001 +B1 +B1 +B1 010 +B2 +B2 +B2

011 +B3 +B3 +B3 100 -0 -3 -4 101 -1 -2 -3 110

-2 -1 -2 111 -3 -0 -1 Issues # of zeros Balance (and thus range) Operations implementation 10 2s Complement Examples 32-bit signed numbers

0000 0000 0000 0000 0000 0000 0000 0000 = 0 0000 0000 0000 0000 0000 0000 0000 0001 = +B1 0000 0000 0000 0000 0000 0000 0000 0010 = +B2 0111 1111 1111 1111 1111 1111 1111 1110 = +B2,147,483,646 0111 1111 1111 1111 1111 1111 1111 1111 = +B2,147,483,647 1000 0000 0000 0000 0000 0000 0000 0000 = - 2,147,483,648 -2^31 1000 0000 0000 0000 0000 0000 0000 0001 = - 2,147,483,647 1000 0000 0000 0000 0000 0000 0000 0010 = - 2,147,483,646 1111 1111 1111 1111 1111 1111 1111 1101 = -3

1111 1111 1111 1111 1111 1111 1111 1110 = -2 1111 1111 1111 1111 1111 1111 1111 1111 = -1 11 Addition We can do binary addition just as we do decimal arithmetic Examples in lecture Can be simpler with 2s complement (1C as well) We dont need to worry about the signs of the operands! Examples in lecture 12 Subtraction Notice that subtraction can be done using addition Add 1

A B = A +B (-B) We know how to negate a number The hardware used for addition can be used for subtraction with a negating unit at one Invert (flip) the bits input 13 Signed versus Unsigned Operations unsigned operations view the operands a s positive numbers, even if the most signifi cant bit is 1 Example: 1100 is 12_unsigned but -4_2C Example: slt versus sltu li $t0,-4 li $t1,10 slt $t3,$t0,$t1 sltu $t4,$t0,$t1 $t3 = 1 $t4 = 0 !!

14 Signed Overflow Because we use a limited number of bits to represent a number, the result of an operation may not fit overflow No overflow when We add two numbers with different signs We subtract a number with the same sign Overflow when Adding two positive numbers yields a negative number Adding two negative numbers yields a positive number How about subtraction? 15 Overflow On an overflow, the CPU can Generate an exception Set a flag in a status register Do nothing In MIPS on green card:

add, addi, sub: footnote (1) May cause overflow exce ption 16 Overflow with Unsigned Operations addu, addiu, subu Footnote (1) is not listed for these instructions on the green card This tells us that, In MIPS, nothing is done on unsigned overflow How could it be detected for, e.g., add? Carry out of the most significant position (in some architectures, a condition code is set on unsigned overflow, which IS the carry out from the top positi on) 17 Branch Instructions: Branching Backwards

# $t3 = 1 +B 2 +B 2 +B 2 +B 2; $t4 = 1 +B 3 +B 3 +B 3 +B 3 li $t0,0 li $t3,1 li $t4, 1 loop: addi $t3,$t3,2 addi $t4,$t4,3 addi $t0,$t0,1 slti $t5,$t0,4 bne $t5,$zero,loop machine code: 0x15a0fffc BranchAddr = {14{imm[15]}, imm, 2b0} 18 1-bit Adder With a fully functional single-bit ad der We can build a wider adder by linking many one-bit adders 3 inputs A: input A B: input B Cin: input C (carry in)

2 outputs S: sum Cout: carry out 19 Implementing an Adder Solve how S can be represented by way of A, B, and C in Also solve for Cout Input Output A B Cin S Cout 0

0 0 0 0 0 0 1 1 0 0 1 0 1

0 0 1 1 0 1 1 0 0 1 0 1

0 1 0 1 1 1 0 0 1 1 1 1 1

1 20 Boolean Logic formulas Input Output A B Cin S Cout 0 0 0

0 0 0 0 1 1 0 0 1 0 1 0 0

1 1 0 1 1 0 0 1 0 1 0 1

0 1 1 1 0 0 1 1 1 1 1 1 S = ABCin+BABCin+BABCin+BABCin

Cout = AB+BBCin+BACin Can implement the adder using logic gates 21 Logic Gates 2-input AND 2-input OR 2-input NAND 2-input NOR A B Y Y=A&B Y Y=A|B

Y Y=~(A&B) Y Y=~(A|B) A B A B A B 22 Implementation in logic gates OR GATES Cout = AB+BBCin+BACin AND GATES

Well see more boolean logic and circuit implementation when we get to Appen dix B 23 N-bit Adder (0) An N-bit adder can be const ructed with N one-bit adders A carry generated in one stag e is propagated to the next (ri pple carry adder) 3 inputs A: N-bit input A B: N-bit input B Cin: input C (carry in) 2 outputs S: N-bit sum Cout: carry out 24

N-bit Ripple-Carry Adder (0) (0) (0) (1) (1) (0) (0) 0 0 1 1

1 0 0 1 1 0 0 (0) 1 (1) 1 (1) 0

(0) 1 25 Multiplication More complicated operation, so more complicated circuits Outline Human longhand, to remind ourselves of the steps involved Multiplication hardware Text has 3 versions, showing evolution to help you better understand how the circuits work 26 Multiplication Here see what book says More complicated than addition A straightforward implementation will involve shifts and adds More complex operation can lead to

More area (on silicon) and/or More time (multiple cycles or longer clock cycle time) Lets begin from a simple, straightforward method 27 Straightforward Algorithm 01010010 (multiplicand) x 01101101 (multiplier) 28 Implementation 1 29 Implementation 2 30 Implementation 3 31

Example Lets do 0010 x 0110 (2 x 6), unsigned Iteration Multiplicand 0 0010 1 2 3 4 0010 0010 0010

0010 Implementation 3 Step Product initial values 0000 0110 1: 0 -> no op 0000 0110 2: shift right 0000 0011 1: 1 -> product = produ ct +B multiplicand 0010 0011 2: shift right

0001 0001 1: 1 -> product = produ ct +B multiplicand 0011 0001 2: shift right 0001 1000 1: 0 -> no op 0001 1000 2: shift right 0000 1100 32 Binary Division Dividend = Divider Quotient +B Remainder Even more complicated Still, it can be implemented by way of shifts and

addition/subtraction We will study a method based on the paper-andpencil method We confine our discussions to unsigned numbers only 33 Implementation Figure 3.10 34 Algorithm (figure 3.11) Size of dividend is 2 * size of divisor Initialization: quotient register = 0 remainder register = dividend divisor register = divisor in left half 35 Algorithm continued Repeat for 33 iterations (size divisor +B 1): Subtract the divisor register from the remainder register and place the result in the remainder register If Remainder >= 0:

Shift quotient register left, placing 1 in bit 0 Else: Undo the subtraction; shift quotient register left, placing 0 in bit 0 Shift divisor register right 1 bit Example in lecture and figure 3.12 36 Floating-Point (FP) Numbers Computers need to deal with real numbers Fraction (e.g., 3.1416) Very small number (e.g., 0.000001) Very large number (e.g., 2.75961011) Components: sign, exponent, mantissa (-1)signmantissa2exponent More bits for mantissa gives more accuracy More bits for exponent gives wider range A case for FP representation standard Portability issues Improved implementations IEEE754 standard

37 Binary Fractions for Humans Lecture: binary fractions and their decimal equivalents Lecture: translating decimal fractions into binary Lecture: idea of normalized representation Then well go on with IEEE standard floating point representation 38 IEEE 754 A standard for FP representation in computers Single precision (32 bits): 8-bit exponent, 23-bit mantissa Double precision (64 bits): 11-bit exponent, 52-bit mantissa N-1 N-2 sign M M-1 exponent

0 Fraction (or mantissa) Leading 1 in mantissa is implicit (since the mantissa is normalized, the first digit is always a 1why waste a bit storing it?) Exponent is biased for easier sorting of FP numbers 39 Biased Representation Weve looked at different binary number representations so far Sign-magnitude 1s complement 2s complement

Now one more representation: biased representation 000000 is the smallest number 111111 is the largest number To get the real value, subtract the bias from the bit pattern, interpreting bit pattern as an unsigned number Representation = Value +B Bias Bias for exponent field in IEEE 754 127 (single precision) 1023 (double precision) 40 IEEE 754 A standard for FP representation in computers Single precision (32 bits): 8-bit exponent, 23-bit mantissa Double precision (64 bits): 11-bit exponent, 52-bit mantissa N-1 N-2 sign M M-1 exponent

0 significand (or mantissa) Leading 1 in mantissa is implicit Exponent is biased for easier sorting of FP numbers All 0s is the smallest, all 1s is the largest Bias of 127 for single precision and 1023 for double precision Getting the actual value: (-1)sign(1+Bsignificand)2(exponent-bias) 41 IEEE 754 Example -0.75ten Same as -3/4 In binary -11/100 = -0.11 In normalized binary -1.1twox2-1 In IEEE 754 format sign bit is 1 (number is negative!)

mantissa is 0.1 (1 is implicit!) exponent is -1 (or 126 in biased representation) 31 30 23 22 1 01111110 sign 8-bit exponent 0 1000 000 23-bit significand (or mantissa)

42 IEEE 754 Encoding Revisited Single Precision Double Precision Represented Object Exponent Fraction Exponent Fraction 0 0 0 0

0 0 non-zero 0 non-zero +B/- denormalized number 1~254 anything 1~2046 anything +B/- floating-point numbers 255

0 2047 0 +B/- infinity 255 non-zero 2047 non-zero NaN (Not a Number) 43 FP Operations Notes Operations are more complex We should correctly handle sign, exponent, significand We have underflow

Accuracy can be a big problem IEEE 754 defines two extra bits to keep temporary results accurately: guard bit and round bit Four rounding modes Positive divided by zero yields infinity Zero divided by zero yields Not a Number (NaN) Implementing the standard can be tricky Not using the standard can become even worse See text for 80x86 and Pentium bug! 44 Floating-Point Addition . Shift smaller umber to make xponents match 0.5ten 0.4375ten =1.000two2-1 1.110two2-2 . Add the significands . Normalize sum

verflow or underflow? Yes: exception no: ound the significand not still normalized, o back to step 3 45 Floating-Point Multiplication (1.000two2-1)(-1.110two2-2) 1. Add exponents and subtract bias 2. Multiply the significands 3. Normalize the product 4: overflow? If yes, raise exception 5. Round the significant to appropriate # of bits 6. If not still normalized, go back to step 3 7. Set the sign of the result

46 Floating Point Instructions in MIPS .data nums: .float 0.75,15.25,7.625 .text la $t0,nums lwc1 $f0,0($t0) lwc1 $f1,4($t0) add.s $f2,$f0,$f1 #0.75 + 15.25 = 16.0 = 10000 binary = 1.0 * 2^4 #f2: 0 10000011 000000... = 0x41800000 swc1 $f2,12($t0) #1001000c now contains that number # Click on coproc1 in Mars to see the $f registers 47 Another Example

.data nums: .float 0.75,15.25,7.625 .text loop: la $t0,nums lwc1 $f0,0($t0) lwc1 $f1,4($t0) c.eq.s $f0,$f1 # cond = 0 bc1t label # no branch c.lt.s $f0,$f1 # cond = 1 bc1t label # does branch add.s $f3,$f0,$f1 label: add.s $f2,$f0,$f1 c.eq.s $f2,$f0 bc1f loop # branch (infinite loop) #bottom of the coproc1 display shows condition bits 48 nums: .double

0.75,15.25,7.625,0.75 #0.75 = .11-bin. exponent is -1 (1022 biased). significand is 1000... #0 01111111110 1000... = 0x3fe8000000000000 la $t0,nums lwc1 $f0,0($t0) lwc1 $f1,4($t0) lwc1 $f2,8($t0) lwc1 $f3,12($t0) add.d $f4,$f0,$f2 #{$f5,$f4} = {$f1,$f0} + {$f2,$f1}; 0.75 + 15.25 = 16 = 1.0-bin * 2^4 #0 10000000011 0000... = 0x4030000000000000 # value+0 value+4 value+8 value+c # 0x00000000 0x3fe80000 0x00000000 0x402e8000 # float double # $f0 0x00000000 0x3fe8000000000000 # $f1 0x3fe80000 # $f2 0x00000000 0x402e800000000000

# $f3 0x402e8000 # $f4 0x00000000 0x4030000000000000 # $f5 0x40300000 49 Guard and Round bits To round accurately, hardware needs extra bits IEEE 274 keeps extra bits on the right during intermediate additions guard and round bits 50 Example (in decimal) With Guard and Round bits 2.56 * 10^0 +B 2.34 * 10^2

Assume 3 significant digits 0.0256 * 10^2 +B 2.34 * 10^2 2.3656 [guard=5; round=6] Round step 1: 2.366 Round step 2: 2.37 51 Example (in decimal) Without Guard and Round bits 2.56 * 10^0 +B 2.34 * 10^2 0.0256 * 10^2 +B 2.34 * 10^2 But with 3 sig digits and no extra bits: 0.02 +B 2.34 = 2.36 So, we are off by 1 in the last digit 52