Now, the issue with the norm is that the operation is expensive to compute. That's why we would like another way to approximate the norm. A first idea is to look at other norms available, indeed, what we have called "norm" so far is actually the 2-norm, also named euclidean norm. Let's have a look at two other norms : the infinity norm and the Manhattan norm.
Infinity norm is :
Manhattan norm is :
Now we see the Manhattan norm is indeed a lower bound for the 2-norm, even if it's rough. The Infinity norm, however, is too high. But that is not an issue, we could simply scale it up so that it is always higher than the 2-norm. The scaling factor is chosen, such as the yellow curve tangent to the circle. For that, we need it to be equal to cos4π=21.
We have a lower bound! By choosing the closest to the circle between the yellow and green curves, we get an octagon that is very close to the circle. We can define the upper bound of the circle with a function f such as:
Note that this is different from Paul's article. You do need to take the maximum value of the two norms to select the points that are closest to the center. Generally speaking, for two norms, if one's value is higher than the other, then the former will be drawn closer to the origin when plotting the norm(x,y)=1 curve.
To trace this function, note that Manhattan and infinity norms isolines cross when ∣y∣=1 and ∣x∣=2−1 or ∣x∣=1 and ∣y∣=2−1.
Let's plot this function for various values of a. To make things easier, I will "unroll" the circle, and plot the norms against θ, the angle between our vector and the x axis.
As expected, we can continuously vary our approximation between the upper and lower bounds. Notice that these functions are periodic and even. We can thus focus on the first half period to minimize the error. The first half period is when the vector is at the first octagon vertices, starting from the x axis and circling anti-clockwise.
To minimize the error with our approximation, we want to minimize the square error. That is:
Thankfully, the expression of f(x,y) and thus of g(x,y,a) should simplify a lot on the given interval. You can see on schematic above that on this interval we have, f(x,y)=max(∣x∣,∣y∣)=∣x∣=x=cosθ. We can thus rewrite e(a) as follows.
The maximum deviation from the result is then maxθ∣h(a)cosθ−1∣. Looking for that maximum is like looking for the maximum of (h(a)cosθ−1)2. Long story short, the maxima can only occur on the boundaries of the allowed domain for θ, that is θ=0 or θ=π/8, meaning
With our choice for a, we get h(a)≈1.026, so the maximum deviation is 0.052. That is, we have at most a 5.3% deviation from the norm-2!
That was a fun Sunday project! Originally this was intended to be included in a longer blog-post that is yet to be finished, but I figured it was interesting enough to have its own post. The take-home message being, you can approximate the Euclidean norm of a vector with: