Street Fighting Numerical Analysis - Part 2
Francis J. DiTraglia
April 26, 2026
Sometimes we need a good approximation to the derivative \(f'(x)\) of a real-valued function \(f\) at some real value \(x\). So here’s a fun fact that you may not know. If \(\Delta\) is a small positive number and \(f\) can be evaluated at a complex argument, then \[ f'(x) \approx \frac{\text{Im}[f(x + \Delta i)]}{\Delta} \] where \(i\) is the imaginary unit and \(\text{Im}(z)\) denotes the imaginary part of a complex number. This unexpected but highly accurate approximation is called complex step differentiation. The method dates back to Lyness and Moler (1967); Squire and Trapp (1998) give a concise modern exposition.
To see why it works, Taylor-expand \(f(x + \Delta i)\) around \(x\) yielding \[ \begin{aligned} f(x + \Delta i) &= f(x) + f'(x)\; i\Delta + f''(x) \frac{(i\Delta)^2}{2!} + f'''(x) \frac{(i\Delta)^3}{3!} + \cdots\\ &= \left[f(x) - f''(x) \frac{\Delta^2}{2}\right]+ \left[f'(x)\; \Delta - f'''(x) \frac{\Delta^3}{3!}\right]i + \cdots \end{aligned} \] since \(i^2 = -1\) and \(i^3 = -i\). Taking the imaginary part and dividing by \(\Delta\), \[ \frac{\text{Im}[f(x + \Delta i)]}{\Delta} = f'(x) - f'''(x) \frac{\Delta^2}{3!} + \cdots \] so complex step differentiation approximates \(f'(x)\) to order \(O(\Delta^2)\).
Sure it’s a cute trick. But why go to the trouble of introducing complex numbers? Complex step differentiation turns out to be much better behaved from a numerical perspective than the simple difference approximation \[ f'(x) \approx \frac{f(x + \Delta) - f(x)}{\Delta} \] or the symmetric difference \[ f'(x) \approx \frac{f(x + \Delta) - f(x - \Delta)}{2\Delta}. \] In the remainder of this post, I’ll explain why. First, though, a brief overview of complex numbers in R; if this material is already familiar to you, skip ahead to the following section.
R provides basic functionality for working with complex numbers, documented in the helpful ?complex page. The imaginary unit in R is written 1i. Don’t try i by itself, because that won’t work:
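At the console:

```r
i   # Error: object 'i' not found
1i  # [1] 0+1i
```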
To write a complex number, simply use + to separate the real and imaginary parts, e.g.
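Like this:

```r
z <- 3 + 4i
z  # [1] 3+4i
```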
A common error when learning R is to write code with “implied multiplication” e.g. 4x (wrong) rather than 4 * x (right). But don’t be tempted to try 4 * i since, again, i is not the imaginary unit in R:
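Only explicit multiplication by 1i works:

```r
# 4 * i   # Error: object 'i' not found
4 * 1i    # [1] 0+4i
```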
To demonstrate R’s basic functionality for working with complex numbers, let’s specify a complex number \(z\) in Cartesian coordinates: \(z = 3 + 4i\). The real part is \(\text{Re}(z) = 3\), the imaginary part is \(\text{Im}(z) = 4\), the modulus is \(|z| = \sqrt{3^2 + 4^2} = 5\), and the complex conjugate is \(\bar{z} = 3 - 4i\). All of these operations are available in R:
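The relevant functions are Re(), Im(), Mod(), abs() (which coincides with Mod() for complex input), and Conj():

```r
z <- 3 + 4i
Re(z)    # real part
Im(z)    # imaginary part
Mod(z)   # modulus
abs(z)   # same as Mod() for complex input
Conj(z)  # complex conjugate
```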
[1] 3
[1] 4
[1] 5
[1] 5
[1] 3-4i
All of the basic operations +, -, *, /, and ^ work on complex numbers and are vectorized, as are functions such as log(), exp(), sin(), cos(), etc. If you supply them a complex input, they will return a complex output; if you supply them a real input, they will return a real output. This behavior explains a common gotcha: sqrt(-1) doesn’t work because there’s no real number whose square is -1, but sqrt(-1 + 0i) does work:
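For example:

```r
sqrt(-1)       # real input
sqrt(-1 + 0i)  # complex input
```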
Warning in sqrt(-1): NaNs produced
[1] NaN
[1] 0+1i
Another common gotcha is trying to apply functions like max() and min() or operators like < and > to complex numbers. Since complex numbers aren’t ordered, this doesn’t work:
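For example, with two complex values z and w (the particular values here are illustrative):

```r
z <- 3 + 4i
w <- 1 - 2i
min(z, w)
z < w
```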
Error in `min()`:
! invalid 'type' (complex) of argument
Error in `z < w`:
! invalid comparison with complex values
but testing for equality / non-equality does work:
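Continuing with the same illustrative values of z and w:

```r
z <- 3 + 4i
w <- 1 - 2i
z == z  # [1] TRUE
z != w  # [1] TRUE
```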
Let’s test out complex step differentiation with a simple example. Let \(f(x) = x^{9/2}\) and suppose we want to compute \(f'(1.5)\). This one is easy to compute analytically: \[ f'(x) = \frac{9}{2} x^{7/2} \implies f'(1.5) = 4.5 \times(1.5)^{3.5} \] so we can check how accurate different numerical approximations turn out to be. Computing the derivative directly gives
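In R:

```r
4.5 * 1.5^3.5  # [1] 18.60081
```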
Now we’ll compare this value against three numerical derivatives: the “simple difference” approach, the “symmetric difference” approach, and the complex step approach.
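Here is one way to set them up; the function names are mine, chosen to match the column headers in the tables below:

```r
f <- function(x) x^(9/2)
x <- 1.5

simple       <- function(delta) (f(x + delta) - f(x)) / delta
symmetric    <- function(delta) (f(x + delta) - f(x - delta)) / (2 * delta)
complex_step <- function(delta) Im(f(x + delta * 1i)) / delta
```

Note that complex_step never subtracts anything in the numerator: it only extracts the imaginary part of a single function evaluation.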
Since R functions are vectorized, we can evaluate the quality of each approximation over many values of \(\Delta\) at once by setting up a vector of progressively smaller positive values:
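For example:

```r
delta <- 10^(-(1:15))  # 1e-01, 1e-02, ..., 1e-15
```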
Now we’ll make two tables: the first containing raw results—the approximate value of \(f'(x)\)—and the second containing the relative error of the approximation in percentage points. When reading the first table, recall from above that direct calculation gives \(f'(1.5) \approx 18.60081\).
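A self-contained sketch of the computation behind both tables (the object names are mine):

```r
f <- function(x) x^(9/2)
x <- 1.5
delta <- 10^(-(1:15))
exact <- 4.5 * x^3.5

# raw approximations, one row per value of delta
results <- data.frame(
  delta        = delta,
  simple       = (f(x + delta) - f(x)) / delta,
  symmetric    = (f(x + delta) - f(x - delta)) / (2 * delta),
  complex_step = Im(f(x + delta * 1i)) / delta
)

# relative error in percentage points
errors <- results
errors[-1] <- round(100 * abs(results[-1] - exact) / exact, 4)
```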
| delta | simple | symmetric | complex_step |
|---|---|---|---|
| 1e-01 | 20.89450 | 18.72139 | 18.48027 |
| 1e-02 | 18.81903 | 18.60202 | 18.59961 |
| 1e-03 | 18.62253 | 18.60082 | 18.60080 |
| 1e-04 | 18.60298 | 18.60081 | 18.60081 |
| 1e-05 | 18.60103 | 18.60081 | 18.60081 |
| 1e-06 | 18.60083 | 18.60081 | 18.60081 |
| 1e-07 | 18.60081 | 18.60081 | 18.60081 |
| 1e-08 | 18.60081 | 18.60081 | 18.60081 |
| 1e-09 | 18.60081 | 18.60081 | 18.60081 |
| 1e-10 | 18.60081 | 18.60081 | 18.60081 |
| 1e-11 | 18.60077 | 18.60081 | 18.60081 |
| 1e-12 | 18.60201 | 18.60245 | 18.60081 |
| 1e-13 | 18.58069 | 18.58513 | 18.60081 |
| 1e-14 | 18.56293 | 18.56293 | 18.60081 |
| 1e-15 | 20.42810 | 20.42810 | 18.60081 |
| delta | simple | symmetric | complex_step |
|---|---|---|---|
| 1e-01 | 12.3311 | 0.6483 | 0.6480 |
| 1e-02 | 1.1732 | 0.0065 | 0.0065 |
| 1e-03 | 0.1167 | 0.0001 | 0.0001 |
| 1e-04 | 0.0117 | 0.0000 | 0.0000 |
| 1e-05 | 0.0012 | 0.0000 | 0.0000 |
| 1e-06 | 0.0001 | 0.0000 | 0.0000 |
| 1e-07 | 0.0000 | 0.0000 | 0.0000 |
| 1e-08 | 0.0000 | 0.0000 | 0.0000 |
| 1e-09 | 0.0000 | 0.0000 | 0.0000 |
| 1e-10 | 0.0000 | 0.0000 | 0.0000 |
| 1e-11 | 0.0003 | 0.0000 | 0.0000 |
| 1e-12 | 0.0064 | 0.0088 | 0.0000 |
| 1e-13 | 0.1082 | 0.0843 | 0.0000 |
| 1e-14 | 0.2037 | 0.2037 | 0.0000 |
| 1e-15 | 9.8237 | 9.8237 | 0.0000 |
The simple difference approach is clearly less accurate than both the symmetric difference and complex step approaches, particularly at larger values of \(\Delta\). And while it may seem that there’s not much to choose between these latter two approximations, look carefully at what happens as \(\Delta\) gets smaller. Eventually the relative error of the simple and symmetric difference approaches starts to increase! For example, the symmetric difference approximation is better at \(\Delta = 0.1\) (relative error \(<1\%\)) than at \(\Delta = 10^{-15}\) (relative error \(\approx 10\%\)).
How can this be? The derivative is defined as the limit of the simple difference approximation as \(\Delta \rightarrow 0\). So how can the approximation get worse if \(\Delta\) becomes smaller?
The culprit is catastrophic cancellation. As explained in an earlier post, computers cannot represent all real numbers exactly. Instead they use floating point numbers as an approximation. So when we compute a difference like \(f(x + \Delta) - f(x)\) what we’re really computing is the difference of two approximations. The problem is that even when \(A\) is a very good approximation to \(f(x + \Delta)\) and \(B\) is a very good approximation to \(f(x)\), the difference \(A - B\) may be a poor approximation to \(f(x + \Delta) - f(x)\). This is an unfortunate property of subtraction, nicely illustrated in this post by John D. Cook.
A double in R is accurate to roughly 16 significant decimal digits. For small values of \(\Delta\), \(f(x + \Delta)\) and \(f(x)\) agree in nearly all of those digits. Subtracting them eliminates the digits that match, leaving behind what is effectively rounding noise, and dividing by \(\Delta\) then magnifies this noise. For example, since \(f'(x)\) in our example is around \(18.60081\), when we compute \(f(x + 10^{-15}) - f(x)\) we should obtain a result of approximately \(1.860081\times 10^{-14}\), but instead we get
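In code (the trailing digits vary from machine to machine; judging from the last row of the first table above, the result comes out near \(2.04 \times 10^{-14}\)):

```r
f <- function(x) x^(9/2)
f(1.5 + 1e-15) - f(1.5)
```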
The first couple of digits are in the ballpark, but the remaining ones are just noise.
The symmetric difference formula has the same problem because it too relies on a subtraction of two approximate values in the numerator:
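Again with \(\Delta = 10^{-15}\):

```r
f <- function(x) x^(9/2)
f(1.5 + 1e-15) - f(1.5 - 1e-15)
```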
Here we’d expect a result of around \(3.720162 \times 10^{-14}\) and, again, every digit after the first two is just noise.
In contrast, complex step differentiation has no subtraction in the numerator. This means that it is not subject to catastrophic cancellation. As \(\Delta\) becomes smaller, the approximation just keeps improving—down to roughly machine precision. The cost is that \(f\) must be analytic at \(x\) and evaluable at complex inputs. Most smooth functions you’ll encounter qualify, but those involving \(|x|\), \(\max\), \(\min\), or indicator functions generally don’t.
In this post I computed the complex step derivative by hand. That’s useful for understanding the method, but it’s a bad idea in practice. Whenever you can, you should rely on high-quality existing libraries to implement numerical methods. Numerical analysis is a deeply complicated subject and we are but lowly econometricians! Fortunately for us, the grad() function from the numDeriv R package implements complex step differentiation as one of its three methods:
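For example (the printed digits may differ in the last place):

```r
library(numDeriv)  # install.packages("numDeriv") if needed
f <- function(x) x^(9/2)
grad(f, 1.5, method = "complex")  # approximately 18.60081
```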
So the next time you need to differentiate something numerically, remember the value in making things more complex than they need to be!