I'm a software engineer working with big data.

I decided to develop my own logarithm function in C without using any standard libraries. I wanted my solution to be portable, accurate, and fast.

The most significant bit
(MSB) position of a binary number `x`

is the same as `floor(log_2(x))`

. The MSB
can be computed efficiently on most machines in a single instruction. This will
only compute the integer part of the log (known as the characteristic), not the
fractional part (known as the mantissa).

Calculation of the mantissa is much more difficult. One of the common methods is to perform a Taylor series expansion, but this is computationally expensive. However, it is possible to extract just the mantissa of a logarithm when given the characteristic as follows:

```
Where c(x) = Log_2 characteristic of x.
Mantissa of log_2(x) = log_2((1/2^c(x)) * x)
Also, (1/2^c(x)) * x will be within the range of [1,2]
```

I decided to approximate the function `log_2(x)`

where `x`

is in in range of
`[1,2]`

with a quintic
function. I generated 1000
data points from `log_2(x)`

across the range `[1,2]`

and generated the 6
coefficients. I tried fitting several polynomial functions and found that
quintic functions provided the best regression with the minimal number of
multiplications.

With the MSB calculation of the characteristic and the quintic function to generate the mantissa, the output of the function is simply the sum of the two.

In terms of performance, the function performs ~6.5x faster than the
implementation `log2`

in `math.h`

on my machine (compiled with gcc and `-O2`

,
but not `-ffast-math`

as it breaks IEEE standards and is dangerous).

I computed the mean squared error (MSE) for this function on the values [1, 1
billion]: `0.00000000006418591926`

. The root mean squared error was
`0.00000801161152696295`

.

Using this new log base 2 function, we can modify it to handle any base as follows:

```
Have function for log base b of x -> log_b(x)
Need log base a of x -> log_a(x)
Then:
log_a(x) = log_b(x) / log_b(a)
```

The denominator can be precomputed as it only consists of constants. Logarithm base changes can then be performed with a single division.

The C code to estimate the logarithm using the method here can be found in my libscu GitHub project. The specific function to implement the logarithm function can be found here.

I released this code in the libscu project under the MIT license.