Complex Step Differentiation

Street Fighting Numerical Analysis - Part 2

Categories: computing
Author

Francis J. DiTraglia

Published

April 26, 2026

Sometimes we need a good approximation to the derivative \(f'(x)\) of a real-valued function \(f\) at some real value \(x\). So here’s a fun fact that you may not know. If \(\Delta\) is a small positive number and \(f\) can be evaluated at a complex argument, then \[ f'(x) \approx \frac{\text{Im}[f(x + \Delta i)]}{\Delta} \] where \(i\) is the imaginary unit and \(\text{Im}(z)\) denotes the imaginary part of a complex number. This unexpected but highly accurate approximation is called complex step differentiation. The method dates back to Lyness and Moler (1967); Squire and Trapp (1998) give a concise modern exposition.

To see why it works, Taylor-expand \(f(x + \Delta i)\) around \(x\) yielding \[ \begin{aligned} f(x + \Delta i) &= f(x) + f'(x)\; i\Delta + f''(x) \frac{(i\Delta)^2}{2!} + f'''(x) \frac{(i\Delta)^3}{3!} + \cdots\\ &= \left[f(x) - f''(x) \frac{\Delta^2}{2}\right]+ \left[f'(x)\; \Delta - f'''(x) \frac{\Delta^3}{3!}\right]i + \cdots \end{aligned} \] since \(i^2 = -1\) and \(i^3 = -i\). Taking the imaginary part and dividing by \(\Delta\), \[ \frac{\text{Im}[f(x + \Delta i)]}{\Delta} = f'(x) - f'''(x) \frac{\Delta^2}{3!} + \cdots \] so complex step differentiation approximates \(f'(x)\) to order \(O(\Delta^2)\).
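We can confirm this \(O(\Delta^2)\) behavior numerically. Here is a quick sanity check (my own example, not part of the worked example below), using \(f(x) = e^x\) at \(x = 1\), so that \(f'(1) = e\):

```r
f <- function(x) exp(x)                  # f'(x) = exp(x), so f'(1) = exp(1)
delta <- c(1e-2, 1e-3, 1e-4)
approx <- Im(f(1 + delta * 1i)) / delta  # complex step approximation
err <- abs(approx - exp(1))              # absolute error for each delta
err / delta^2                            # roughly constant: the error scales as Delta^2
```

The ratio err / delta^2 is roughly constant, approaching \(e / 6\), the coefficient of the leading \(f'''(x)\,\Delta^2/3!\) term in the expansion above.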

Sure, it’s a cute trick. But why go to the trouble of introducing complex numbers? Complex step differentiation turns out to be much better behaved from a numerical perspective than the simple difference approximation \[ f'(x) \approx \frac{f(x + \Delta) - f(x)}{\Delta} \] or the symmetric difference \[ f'(x) \approx \frac{f(x + \Delta) - f(x - \Delta)}{2\Delta}. \] In the remainder of this post, I’ll explain why. First, though, a brief overview of complex numbers in R. If this material is already familiar to you, skip ahead to the following section.

Complex Numbers in R

R provides basic functionality for working with complex numbers, documented in the help page ?complex. The imaginary unit in R is denoted by 1i. Don’t try i by itself, because that won’t work:

i # this doesn't work
Error:
! object 'i' not found
1i # this works
[1] 0+1i

To write a complex number, simply use + to separate the real and imaginary parts, e.g.

3 + 4i
[1] 3+4i

A common error when learning R is to write code with “implied multiplication” e.g. 4x (wrong) rather than 4 * x (right). But don’t be tempted to try 4 * i since, again, i is not the imaginary unit in R:

4 * i # this doesn't work 
Error:
! object 'i' not found
4 * 1i # this works
[1] 0+4i

To demonstrate R’s basic functionality for working with complex numbers, let’s specify a complex number \(z\) in Cartesian coordinates: \(z = 3 + 4i\). The real part is \(\text{Re}(z) = 3\), the imaginary part is \(\text{Im}(z) = 4\), the modulus is \(|z| = \sqrt{3^2 + 4^2} = 5\), and the complex conjugate is \(\bar{z} = 3 - 4i\). All of these operations are available in R:

z <- 3 + 4i
Re(z)
[1] 3
Im(z)
[1] 4
Mod(z)
[1] 5
abs(z) # same as Mod(z) for a complex number z
[1] 5
Conj(z)
[1] 3-4i

All of the basic operations +, -, *, /, and ^ work on complex numbers and are vectorized:

w <- (-3) - 1i
z + w
[1] 0+3i
z - w
[1] 6+5i
z * w
[1] -5-15i
z / w
[1] -1.3-0.9i
z^3
[1] -117+44i

as are functions such as log(), exp(), sin(), cos(), etc. If you supply them a complex input, they will return a complex output; if you supply them a real input, they will return a real output. This behavior explains a common gotcha: sqrt(-1) doesn’t work because there’s no real number whose square is -1, but sqrt(-1 + 0i) does work:

sqrt(-1) # this doesn't work
Warning in sqrt(-1): NaNs produced
[1] NaN
sqrt(-1 + 0i)  # this works
[1] 0+1i

Another common gotcha is trying to apply functions like max() and min() or operators like < and > to complex numbers. Since complex numbers aren’t ordered, this doesn’t work:

min(c(z, w))
Error in `min()`:
! invalid 'type' (complex) of argument
z < w
Error in `z < w`:
! invalid comparison with complex values

but testing for equality / non-equality does work:

z == w
[1] FALSE
z != w
[1] TRUE

Complex Step Differentiation in Action

Let’s test out complex step differentiation with a simple example. Let \(f(x) = x^{9/2}\) and suppose we want to compute \(f'(1.5)\). This one is easy to compute analytically: \[ f'(x) = \frac{9}{2} x^{7/2} \implies f'(1.5) = 4.5 \times(1.5)^{3.5} \] so we can check how accurate different numerical approximations turn out to be. Computing the derivative directly gives

direct <- 4.5 * (1.5)^(3.5)
direct
[1] 18.60081

Now we’ll compare this value against three numerical derivatives: the “simple difference” approach, the “symmetric difference” approach, and the complex step approach.

f <- function(x) x^(4.5) # the function to differentiate

x <- 1.5 # the point where we'll evaluate f'(x)

Since R’s arithmetic operations are vectorized, we can evaluate the quality of each approximation over many values of \(\Delta\) at once by setting up a vector of progressively smaller positive values:

delta <- 1 / 10^(1:15)

simple <- (f(x + delta) - f(x)) / delta

symmetric <- (f(x + delta) - f(x - delta)) / (2 * delta)

complex_step <- Im(f(x + delta * 1i)) / delta

Now we’ll make two tables: the first containing raw results—the approximate value of \(f'(x)\)—and the second containing the relative error of the approximation in percentage points. When reading the first table, recall from above that direct calculation gives \(f'(1.5) \approx 18.60081\).

cbind(delta, simple, symmetric, complex_step) |> 
  knitr::kable(digits = 15)
delta simple symmetric complex_step
1e-01 20.89450 18.72139 18.48027
1e-02 18.81903 18.60202 18.59961
1e-03 18.62253 18.60082 18.60080
1e-04 18.60298 18.60081 18.60081
1e-05 18.60103 18.60081 18.60081
1e-06 18.60083 18.60081 18.60081
1e-07 18.60081 18.60081 18.60081
1e-08 18.60081 18.60081 18.60081
1e-09 18.60081 18.60081 18.60081
1e-10 18.60081 18.60081 18.60081
1e-11 18.60077 18.60081 18.60081
1e-12 18.60201 18.60245 18.60081
1e-13 18.58069 18.58513 18.60081
1e-14 18.56293 18.56293 18.60081
1e-15 20.42810 20.42810 18.60081
get_rel_error <- function(x, truth) 100 * abs(x - truth) / abs(truth)

cbind(
  delta,
  simple = get_rel_error(simple, direct),
  symmetric = get_rel_error(symmetric, direct),
  complex_step = get_rel_error(complex_step, direct)
) |> 
  knitr::kable(digits = c(15, rep(4, 3)))
delta simple symmetric complex_step
1e-01 12.3311 0.6483 0.6480
1e-02 1.1732 0.0065 0.0065
1e-03 0.1167 0.0001 0.0001
1e-04 0.0117 0.0000 0.0000
1e-05 0.0012 0.0000 0.0000
1e-06 0.0001 0.0000 0.0000
1e-07 0.0000 0.0000 0.0000
1e-08 0.0000 0.0000 0.0000
1e-09 0.0000 0.0000 0.0000
1e-10 0.0000 0.0000 0.0000
1e-11 0.0003 0.0000 0.0000
1e-12 0.0064 0.0088 0.0000
1e-13 0.1082 0.0843 0.0000
1e-14 0.2037 0.2037 0.0000
1e-15 9.8237 9.8237 0.0000

The simple difference approach is clearly less accurate than the symmetric difference and complex step approaches, particularly at larger values of \(\Delta\). And while it may seem that there’s not much to choose between these latter two, look carefully at what happens as \(\Delta\) gets smaller. Eventually the relative error of the simple and symmetric difference approaches starts to increase! For example, the symmetric difference approximation is better at \(\Delta = 0.1\) (relative error \(<1\%\)) than at \(\Delta = 10^{-15}\) (relative error \(\approx 10\%\)).

How can this be? The derivative is defined as the limit of the simple difference approximation as \(\Delta \rightarrow 0\). So how can the approximation get worse if \(\Delta\) becomes smaller?

Catastrophic Cancellation

The culprit is catastrophic cancellation. As explained in an earlier post, computers cannot represent all real numbers exactly. Instead they use floating point numbers as an approximation. So when we compute a difference like \(f(x + \Delta) - f(x)\) what we’re really computing is the difference of two approximations. The problem is that even when \(A\) is a very good approximation to \(f(x + \Delta)\) and \(B\) is a very good approximation to \(f(x)\), the difference \(A - B\) may be a poor approximation to \(f(x + \Delta) - f(x)\). This is an unfortunate property of subtraction, nicely illustrated in this post by John D. Cook.
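Here is a minimal, self-contained illustration (my own example, not from the posts linked above): two doubles that agree in their leading digits lose exactly those digits when subtracted.

```r
(1 + 1e-16) - 1  # exact answer is 1e-16, but 1 + 1e-16 rounds to exactly 1, so we get 0
(1 + 1e-13) - 1  # close to 1e-13, but the trailing digits are pure rounding noise
```

In the first line the information is destroyed before the subtraction even happens; in the second, the subtraction preserves only the few leading digits in which 1 + 1e-13 and 1 actually differ.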

A double in R is accurate to around 16 significant decimal digits. For small values of \(\Delta\), \(f(x + \Delta)\) and \(f(x)\) agree in nearly all of those digits. Subtracting them eliminates the digits that match, leaving behind what is effectively rounding noise; dividing by \(\Delta\) then magnifies this noise. For example, since \(f'(x)\) in our example is around \(18.60081\), when we compute \(f(x + 10^{-15}) - f(x)\) we should obtain a result of approximately \(1.860081\times 10^{-14}\), but instead we get

sprintf("%.20f", f(x + 1e-15) - f(x))
[1] "0.00000000000002042810"

The first couple of digits are in the ballpark, but the remaining ones are just noise.
The symmetric difference formula has the same problem because it too relies on a subtraction of two approximate values in the numerator:

sprintf("%.20f", f(x + 1e-15) - f(x - 1e-15))
[1] "0.00000000000004085621"

Here we’d expect a result of around \(3.720162 \times 10^{-14}\) and, again, every digit after the first two is just noise.

In contrast, complex step differentiation has no subtraction in the numerator. This means that it is not subject to catastrophic cancellation. As \(\Delta\) becomes smaller, the approximation just keeps improving—down to roughly machine precision. The cost is that \(f\) must be analytic at \(x\) and evaluable at complex inputs. Most smooth functions you’ll encounter qualify, but those involving \(|x|\), \(\max\), \(\min\), or indicator functions generally don’t.
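As a concrete illustration of the analyticity caveat (my own example), consider \(f(x) = |x|\). On complex input, R’s abs() computes the modulus Mod(), which is real-valued and not analytic, so the complex step formula silently returns the wrong answer:

```r
f <- function(x) abs(x)        # on complex input, abs() is Mod(), which is not analytic
delta <- 1e-8
Im(f(2 + delta * 1i)) / delta  # returns 0, but the true derivative f'(2) = 1
```

The failure is silent: no warning, no error, just a wrong answer. This is why it pays to check that your function is analytic before reaching for the complex step.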

Don’t Roll Your Own

In this post I computed the complex step derivative by hand. That’s useful for understanding the method, but it’s a bad idea in practice. Whenever you can, you should rely on high-quality existing libraries to implement numerical methods. Numerical analysis is a deeply complicated subject and we are but lowly econometricians! Fortunately for us, the grad() function from the numDeriv R package implements complex step differentiation as one of its three methods:

library(numDeriv)
grad(f, 1.5, method = 'complex')
[1] 18.60081

So the next time you need to differentiate something numerically, remember the value in making things more complex than they need to be!