A nice approximation of the norm of a 2D vector.

While wandering on the internet, I stumbled upon Paul Hsieh's blog-post, where he demonstrates a way to approximate the norm of a vector without any call to the sqrt function. Let's see if I can reproduce the steps to derive this.

Table of contents

Setting-up the scene.

Calculating the norm of a vector (x,y)(x,y), or a complex number x+iyx+iy means calculating x2+y2\sqrt{x^2+y^2}. Without loss of generality, we can set x2+y2=1\sqrt{x^2+y^2}=1. If we draw this, we get the following.

The (x, y) pairs with a euclidean norm of 1.
The (x, y) pairs with a euclidean norm of 1.

Finding a lower bound to the norm.

Now, the issue with the norm is that the \sqrt{} operation is expensive to compute. That's why we would like another way to approximate the norm. A first idea is to look at other norms available, indeed, what we have called "norm" so far is actually the 2-norm, also named euclidean norm. Let's have a look at two other norms : the infinity norm and the Manhattan norm.

Infinity norm is :

(x,y)=max(x,y) \lVert(x,y)\rVert_\infty = \max(x,y)

Manhattan norm is :

(x,y)1=x+y \lVert(x,y)\rVert_1 = |x|+|y|
The (x, y) pairs with a euclidean norm of 1, an infinity norm of 1 or a Manhattan norm of 1.
The (x, y) pairs with a euclidean norm of 1, an infinity norm of 1 or a Manhattan norm of 1.

Now we see the Manhattan norm is indeed a lower bound for the 2-norm, even if it's rough. The Infinity norm, however, is too high. But that is not an issue, we could simply scale it up so that it is always higher than the 2-norm. The scaling factor is chosen, such as the yellow curve tangent to the circle. For that, we need it to be equal to cosπ4=12\cos\frac{\pi}{4}=\frac{1}{\sqrt{2}}.

We now have a nice lower bound of the euclidean norm!
We now have a nice lower bound of the euclidean norm!

We have a lower bound! By choosing the closest to the circle between the yellow and green curves, we get an octagon that is very close to the circle. We can define the upper bound of the circle with a function ff such as:

f(x,y)=max(max(x,y),12(x+y)) f(x,y) = \max\left(\max(x,y), \frac{1}{\sqrt{2}}(|x|+|y|)\right)

Note that this is different from Paul's article. You do need to take the maximum value of the two norms to select the points that are closest to the center. Generally speaking, for two norms, if one's value is higher than the other, then the former will be drawn closer to the origin when plotting the norm(x,y)=1\text{norm}(x,y)=1 curve.

To trace this function, note that Manhattan and infinity norms isolines cross when y=1|y|=1 and x=21|x| = \sqrt{2}-1 or x=1|x|=1 and y=21|y| = \sqrt{2}-1.

The lower bound of the norm outlined.
The lower bound of the norm outlined.

Finding an upper bound to the norm.

The first idea you can get from the lower bound we found is to scale it up so that the octagon corners touch the circle.

To do so, we need to find the 2-norm of one of the corners and divide ff by it.

Let's take the one at x=1x=1, y=21y=\sqrt{2}-1. We have:

x2+y2=1+(21)2=1+222+1=422 \begin{align} \sqrt{x^2+y^2} &=& \sqrt{1 + \left(\sqrt{2}-1\right)^2}\\ &=& \sqrt{1 + 2 - 2\sqrt{2} + 1}\\ &=& \sqrt{4 - 2\sqrt{2}} \end{align}

Thus, the upper-bound for the 2-norm with the octagon method is 422f(x,y)\sqrt{4 - 2\sqrt{2}}f(x,y):

f(x,y)x2+y2422f(x,y) f(x,y) \leq \sqrt{x^2+y^2} \leq \sqrt{4 - 2\sqrt{2}}f(x,y)
The upper and lower bounds of the norm outlined.
The upper and lower bounds of the norm outlined.

Choosing the best approximation for the norm.

Now, we could stick to Paul Hsieh's choice of taking the middle between the lower and the upper bounds, and it will probably be fine. But come on, let's see if it is the best choice. 😉

Formally, the problem is to find a number a[0,1]a\in[0,1] such as gg defined as follows is the closest possible to the norm-2.

g(x,y,a)=(1a)f(x,y)+a422f(x,y)=((1a)+a422)f(x,y) \begin{align} g(x,y,a) &=& (1-a)f(x,y)+\frac{a}{\sqrt{4 - 2\sqrt{2}}}f(x,y)\\ &=& \left((1-a) + a\sqrt{4 - 2\sqrt{2}}\right)f(x,y) \end{align}

Let's plot this function for various values of aa. To make things easier, I will "unroll" the circle, and plot the norms against θ\theta, the angle between our vector and the xx axis.

Various possible approximations for the norm.
Various possible approximations for the norm.

As expected, we can continuously vary our approximation between the upper and lower bounds. Notice that these functions are periodic and even. We can thus focus on the first half period to minimize the error. The first half period is when the vector is at the first octagon vertices, starting from the xx axis and circling anti-clockwise.

Zooming in the part of the unit circle that is interesting for calculating the error.
Zooming in the part of the unit circle that is interesting for calculating the error.

To minimize the error with our approximation, we want to minimize the square error. That is:

e(a)=0arctan(21)(g(x,y,a)1)2dθ \begin{align} e(a) &=& \int_0^{\arctan\left(\sqrt{2}-1\right)}(g(x,y,a)-1)^2\text{d}\theta \end{align}

Thankfully, the expression of f(x,y)f(x,y) and thus of g(x,y,a)g(x,y,a) should simplify a lot on the given interval. You can see on schematic above that on this interval we have, f(x,y)=max(x,y)=x=x=cosθf(x,y)=max(|x|,|y|)=|x|=x=\cos\theta. We can thus rewrite e(a)e(a) as follows.

e(a)=0arctan(21)(g(x,y,a)1)2dθ=0arctan(21)((1a+a422)cosθ1)2dθ=0arctan(21)(h(a)cosθ1)2dθ \begin{align} e(a) &=& \int_0^{\arctan\left(\sqrt{2}-1\right)}(g(x,y,a)-1)^2\text{d}\theta\\ &=& \int_0^{\arctan\left(\sqrt{2}-1\right)}\left(\left(1-a + a\sqrt{4-2\sqrt{2}}\right)\cos\theta-1\right)^2\text{d}\theta\\ &=& \int_0^{\arctan\left(\sqrt{2}-1\right)}\left(h(a)\cos\theta-1\right)^2\text{d}\theta \end{align}

Where h(a)=(1a+a422)h(a)=\left(1-a + a\sqrt{4-2\sqrt{2}}\right) and arctan(21)=π8\arctan\left(\sqrt{2}-1\right)=\frac{\pi}{8}.

Square error against θ.
Square error against θ.
Sum square error against a.
Sum square error against a.

As we can see from these plots, there is a minimal error, and though 0.5 is a reasonable choice for aa, we can do slightly better around 0.3.

We can explicitly calculate e(a)e(a). Let h(a)=(1+a(A1))h(a)=(1+a(A-1)). We have

e(a)=0π/8(h(a)cosθ1)2dθ=h2(a)0π/8cos2θdθ2h(a)0π/8cosθdθ+π8=h2(a)B2h(a)sinπ8+π8 \begin{align} e(a) &=& \int_0^{\pi/8}(h(a)\cos\theta-1)^2\text{d}\theta\\ &=&h^2(a)\int_0^{\pi/8}\cos^2\theta\text{d}\theta-2h(a)\int_0^{\pi/8}\cos\theta\text{d}\theta + \frac{\pi}{8}\\ &=& h^2(a)B-2h(a)\sin\frac{\pi}{8} + \frac{\pi}{8} \end{align}

Where B=π16+142B=\frac{\pi}{16}+\frac{1}{4\sqrt2}. Thus, we look for the position of the minimum, that is where e(a)=0e'(a)=0.

0=2B(A1)(1+a(A1))sinπ80=2B(A1)(1+a(A1))A22a=(A2B21)×1A1a0.311 \begin{align} 0 &=& 2B(A-1)(1+a(A-1))-\sin\frac{\pi}{8}\\ 0 &=& 2B(A-1)(1+a(A-1)) - \frac{A}{2\sqrt2}\\ a &=& \left(\frac{A}{2B\sqrt2}-1\right)\times\frac{1}{A-1}\\ a &\approx& 0.311 \end{align}

Not that far from 0.3!

The maximum deviation from the result is then maxθh(a)cosθ1\max_\theta{|h(a)\cos\theta-1|}. Looking for that maximum is like looking for the maximum of (h(a)cosθ1)2\left(h(a)\cos\theta-1\right)^2. Long story short, the maxima can only occur on the boundaries of the allowed domain for θ\theta, that is θ=0\theta=0 or θ=π/8\theta=\pi/8, meaning

maxθh(a)cosθ1=max(h(a)1,h(a)2221) \max_\theta{|h(a)\cos\theta-1|} = \max\left(h(a)-1, \left|h(a)\frac{\sqrt{2-\sqrt{2}}}{2}-1\right|\right)

With our choice for aa, we get h(a)1.026h(a)\approx 1.026, so the maximum deviation is 0.052. That is, we have at most a 5.3% deviation from the norm-2!

Our best approximation for the euclidean norm, with the calculated maximum errors.
Our best approximation for the euclidean norm, with the calculated maximum errors.


That was a fun Sunday project! Originally this was intended to be included in a longer blog-post that is yet to be finished, but I figured it was interesting enough to have its own post. The take-home message being, you can approximate the Euclidean norm of a vector with:

norm(x,y)=22π8+122max(max(x,y),12(x+y))1.026max(max(x,y),12(x+y)) \begin{align} \text{norm}(x,y) &=& \frac{\sqrt{2-\sqrt{2}}}{\frac{\pi}{8}+\frac{1}{2\sqrt{2}}}\max\left(\max(|x|,|y|), \frac{1}{\sqrt{2}}(|x|+|y|)\right)\\ &\approx& 1.026\max\left(\max(|x|,|y|), \frac{1}{\sqrt{2}}(|x|+|y|)\right) \end{align}

You'll get at most a 5.3% error. This is a bit different from what's proposed on Paul Hsieh's blog-post. Unless I made a mistake, there might be a typo on his blog!

If you are interested in playing with the code used to generate the figures in this article, have a look at the companion notebook!

As always, if you have any question, or want to add something to this post, you can leave me comment or ping me on Twitter or Mastodon.