84664 # How can I multiply and divide 64-bit ints accurately?

I have a C function:

```int64_t fn(int64_t a, int32_t b, int32_t c, int32_t d) { /* should return (a * b * c)/d */ } ```

It is possible for a to be near INT64_MAX, but for the final result not to overflow, for instance if b = 1, c = d = 40. However, I am having trouble figuring out how to compute this so that I never lose data to rounding (by doing the division first) or have an intermediate result overflow.

If I had access to a large enough datatype to fit the whole product of a, b, and c, I would just do the math in that type and then truncate, but is there some way I can do this without big integers?

Write `a = q*d + r` with `|r| < |d|` (I'm assuming `d != 0`, otherwise the computation is meaningless anyway). Then `(a*b*c)/d = q*b*c + (r*b*c)/d`. If `q*b*c` overflows, the entire computation would overflow anyway, so either you don't care, or you have to check for overflow. `r*b*c` might still overflow, so we again use the same method to avoid overflow,

```int64_t q = a/d, r = a%d; int64_t part1 = q*b*c; int64_t q1 = (r*b)/d, r1 = (r*b)%d; return part1 + q1*c + (r1*c)/d; ```

I would suggest finding the greatest common divisor of d with each of a, b, and c, dividing out the common factors as you go:

```common = gcd(a,d) // You can implement GCD using Euclid's algorithm a=a/common d=d/common common = gcd(b,d) b=b/common d=d/common common = gcd(c,d) c=c/common d=d/common ```

Then calculate `a*b*c/d` with all the common factors removed. Euclid's GCD algorithm runs in logarithmic time, so this should be fairly efficient.

It's easy to see that some inputs would produce outputs that cannot be represented by the return type `int64_t`. For example, `fn(INT64_MAX, 2, 1, 1)`. However, the following approach <strike>will</strike> should allow you to return a correct answer for any combination of inputs that does in fact fit within the range of `int64_t`.

```int64_t fn(int64_t a, int32_t b, int32_t c, int32_t d) { /* find the integer and remainder portions of a/d */ int64_t leftI = a / d; int64_t leftR = a % d; /* multiply the integer portion of the result by b and c */ int64_t resultI = leftI * b * c; /* multiply the remainder portion by b */ int64_t resultR = leftR * b; resultI = resultI + (resultR / d) * c; /* multiply the remainder portion by c */ resultR = (resultR % d) * c; return resultI + (resultR / d); } ```

```int64_t fn(uint64_t a, uint64_t b, uint64_t c, uint64_t d) { asm ( "mulq %1\n" // a *= b "movq %%rbx, %%rdx\n"// rbx = upper 64 bit of the multiplication "mulq %2\n" // multiply the lower 64 bits by c "push %%rax\n" // temporarily save the lowest 64 bits on the stack "mov %%rcx, %%rdx\n" // rcx = upper 64 bits of the multiplication "movq %%rax, %%rbx\n"// "mulq %2\n" // multiply the upper 64 bits by c "addq %%rax, %%rcx\n"// combine the middle 64 bits "addcq %%rdx, \$0\n" // transfer carry tp the higest 64 bits if present "divq %3\n" // divide the upper 128 (of 192) bits by d "mov %%rbx, %%rax\n" // rbx = result "pop %%rax\n" "divq %3\n" // divide remainder:lower 64 bits by d : "+a" (a) // assigns a to rax register as in/out , "+b" (b) // assigns b to rbx register : "g" (c) // assigns c to random register , "g" (d) // assigns d to random register : "edx", "rdx" // tells the compiler that edx/rdx will be used internally, but does not need any input ); // b now holds the upper 64 bit if (a * b * c / d) > UINT64_MAX return a; } ```
The native `div` and `mul` instructions on x86 work on double-length exactly to allow for overflows. Sadly I am unaware of a compiler intrinsic to make use of them.