Decimal and binary floating point rounding

    Publication number: GB2530989A

    Publication date: 2016-04-13

    Application number: GB201417580

    Filing date: 2014-10-06

    Applicant: IBM

    Abstract: Floating point arithmetic logic (10) rounds the final result of an arithmetic operation on a first number (104) and a second number (106), both in either binary floating point format or decimal floating point format. Binary floating point format numbers are mapped (S10) to a decimal floating point format by padding bits (18) to form digits, so that they share the same fields as a decimal floating point format. A guard digit (28) of zero (58) is generated (S12) for at least one of the first and second numbers by transforming the first and second numbers (108; 110) using a compressing function (30) such as a carry-save adder. A result (130) and a result plus one (132) are calculated, depending on the arithmetic operation, from a sum (66), a first difference (67) or a second difference (68) of the transformed numbers (112, 114). Injection values (24, 26) for rounding a final result (20) are generated depending on whether the first and second numbers are in decimal or binary floating point format, on the rounding mode and on the arithmetic operation. Injection carry values (16, 17) are generated from the transformed first and second floating point numbers and the injection values. The final result is selected from the result, the result plus one and a least significant digit (60), based on the injection carry values and end-around carry signals.
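The rounding-by-injection idea in the abstract above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function name `round_by_injection`, the mode names and the integer representation of the significand are all hypothetical. It computes a truncated result and that result plus one, derives a carry from the discarded digits plus a mode-dependent injection value, and uses the carry to select between the two candidates.

```python
def round_by_injection(significand: int, drop: int,
                       mode: str = "half_up", base: int = 10) -> int:
    """Drop the lowest `drop` digits of a non-negative significand.

    Hypothetical model: rounding is done by adding an injection value
    to the discarded digits and checking whether that addition carries
    into the kept digits.
    """
    scale = base ** drop
    # Injection value per rounding mode (mode names are assumptions).
    injections = {
        "truncate": 0,            # never carries
        "half_up": scale // 2,    # carries when tail >= half
        "half_even": scale // 2,  # as half_up, with a tie correction below
        "away": scale - 1,        # carries when tail is non-zero
    }
    injection = injections[mode]

    result = significand // scale      # candidate 1: truncated result
    result_plus_one = result + 1       # candidate 2: result plus one
    tail = significand % scale         # digits that are discarded

    # Injection carry: does tail + injection overflow into the kept digits?
    carry = tail + injection >= scale

    # Tie correction for round-half-even: an exact tie keeps an even result.
    if mode == "half_even" and tail * 2 == scale and result % 2 == 0:
        carry = False

    return result_plus_one if carry else result


if __name__ == "__main__":
    print(round_by_injection(12350, 2, "half_up"))         # decimal example
    print(round_by_injection(0b1011, 1, "half_even", 2))   # binary example
```

Passing `base=2` models the binary case with the same selection logic, which loosely mirrors how the abstract shares one rounding path between both formats.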

Decimal and binary floating point arithmetic calculations

    Publication number: GB2530990A

    Publication date: 2016-04-13

    Application number: GB201417582

    Filing date: 2014-10-06

    Applicant: IBM

    Abstract: A decimal floating point unit performs add or subtract calculations on a first operand (100) and a second operand (101). The operands are unpacked (S200), for example by formatting a 128-bit-wide mantissa to be 136 bits wide; conditionally swapped (S202) if an exponent (104) of the first operand is less than an exponent (105) of the second operand; and aligned (S204, S206) based on the exponent difference and the number of leading zeroes in the operand with the larger exponent. The addition or subtraction (S208) is performed on the aligned operands, with normalizing and rounding of the result, which is then packed (S210). Binary floating point arithmetic can also be performed on the decimal floating point unit, which may be pipelined.
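The swap, align, add and round sequence in the abstract above can be sketched in software as follows. This is an illustrative model under several assumptions, not the unit described in the patent: operands are non-negative `(coefficient, exponent)` pairs, only addition is shown, the name `dfp_add` and the default `precision` of 7 digits are hypothetical, and rounding uses round-half-even on an exact intermediate sum rather than hardware guard digits.

```python
def dfp_add(a, b, precision=7):
    """Add two non-negative decimal floats given as (coefficient, exponent).

    Hypothetical model of: conditional swap, alignment using the exponent
    difference and leading zeros, addition, then normalize and round.
    """
    (ca, ea), (cb, eb) = a, b

    # Conditional swap: make `a` the operand with the larger exponent.
    if ea < eb:
        (ca, ea), (cb, eb) = (cb, eb), (ca, ea)
    diff = ea - eb

    # Leading zeros of the larger-exponent operand within the precision.
    digits = len(str(ca)) if ca else 0
    leading_zeros = max(precision - digits, 0)

    # Shift the larger-exponent operand left into its leading zeros first.
    left = min(diff, leading_zeros)
    ca *= 10 ** left
    ea -= left

    # The remaining difference would shift the other operand right in
    # hardware; here the sum is kept exact at exponent eb instead.
    right = diff - left
    total = ca * 10 ** right + cb
    exponent = eb

    # Normalize and round (half-even) back to `precision` digits.
    excess = max(len(str(total)) - precision, 0)
    if excess:
        scale = 10 ** excess
        quotient, tail = divmod(total, scale)
        if tail * 2 > scale or (tail * 2 == scale and quotient % 2):
            quotient += 1
        if len(str(quotient)) > precision:  # rounding overflowed a digit
            quotient //= 10
            excess += 1
        total, exponent = quotient, exponent + excess
    return total, exponent


if __name__ == "__main__":
    print(dfp_add((5, -1), (12, 2)))  # 0.5 + 1200
```

Using the leading zeros of the larger-exponent operand first means fewer digits of the other operand have to be shifted out, so less information reaches the rounding step.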
