
How to Measure Euler’s Number by Shuffling Cards.

Previously on this blog, I talked about Buffon’s Needle, which is a way to measure \pi using random numbers. In this entry, I will talk about a way to measure another fundamental constant, e, also using random numbers.

To measure e, we will shuffle a deck of cards many times and count the number of shuffles that result in the deck being deranged. This means that no card remains in its original position in the deck. This type of arrangement is called a “derangement”.

We will start by computing the theoretical probability (p) of getting a derangement, which turns out to be 1/e, or about 36.8%, for a sufficiently large number of cards. We will then experimentally measure e by shuffling cards.

To compute the theoretical probability p, we need to understand two numbers:
The first is the total number of arrangements that the cards can be in (R).
The second is how many of those arrangements are derangements (D).
The theoretical probability is the ratio of those two numbers:
p = D/R
and it approaches 1/e.
(Or you could say R/D approaches e.)

The first number is easy: it’s just the factorial of the number of cards in the deck (n).
R_n = n!
The second number D_n is not as straightforward. We have to count the number of derangements a deck can have.

To show a simple example of counting derangements, let’s look at a deck of only 3 cards:
[Figure: a deck of three cards, A, 2, and 3]
There are 6 ways to arrange these 3 cards (3! = 6)
A,2,3
A,3,2
2,A,3
3,A,2  (this one is a derangement)
2,3,A  (this one is a derangement)
3,2,A

Only 2 of these arrangements have every card moved from its original position, so the number of derangements for 3 cards is 2. In this case, the probability of getting a derangement is about 33%.
D_3 = 2
p_3 = D/R = 2/3! = 1/3
That’s close to 1/e, but not quite, because it turns out 3 cards are just not enough to produce that probability. We need more.

In general, with a deck of n cards, we can select and rearrange any k of those cards in P_{n,k} ways, where:
P_{n,k} = \frac{n!}{(n-k)!}
We can use this formula to count the number of ways to rearrange a subset of k cards while the other n-k cards remain in fixed positions (we are counting permutations).

Note that if n-k \neq 0, then by definition the corresponding arrangements are not derangements, since at least one card stays in place. If we can count all of these, then we can subtract that count from the total number n! to get the number of derangements.

Let’s see this in action by going back to the deck of 3 cards. If we select all the cards to rearrange, we can get all 6 possible permutations:
P_{3,3} = \frac{3!}{(3-3)!} = \frac{3!}{0!} = 3! = 6
A,2,3
A,3,2
2,A,3
3,A,2
2,3,A
3,2,A

If we select only 2 cards out of the 3 to rearrange, we only get a subset of the arrangements. However, if we count how many there are with the formula, we get 6 permutations:
P_{3,2} = \frac{3!}{(3-2)!} = \frac{3!}{1!} = 6
So there are 6 ways to permute 2 cards while holding 1 card fixed. But why do we get 6 permutations here? Well, let’s look at all the ways to do this.
If we select 2 and 3 to rearrange, we get 2 permutations:
A,2,3
A,3,2
If we select A and 3 to rearrange, we get 2 permutations:
A,2,3
3,2,A
And if we select A and 2 to rearrange, we get 2 permutations:
A,2,3
2,A,3
Notice however that A,2,3 occurs 3 times, so there are actually 4 unique arrangements here, and none of them are derangements (each leaves at least one card fixed). So we have 6 - 4 = 2 derangements for 3 cards.
But is there a way to count unique arrangements without doing it by hand like we did here? Let’s keep going with this pattern.

If we select only 1 card to rearrange, then that’s trivial, because you can’t rearrange one card.
P_{3,1} = \frac{3!}{(3-1)!} = \frac{3!}{2!} = 3
If selecting A:
A,2,3
If selecting 2:
A,2,3
If selecting 3:
A,2,3
All 3 permutations are the same arrangement!

Then if we select no cards to rearrange, this is also trivial.
P_{3,0} = \frac{3!}{(3-0)!} = 1
If selecting nothing:
A,2,3
That’s just 1 permutation.

All duplicate permutations are accounted for here. The number of permutations for P_{3,0} accounts for one of the duplicates in P_{3,1}:
P_{3,1} - P_{3,0} = 3 - 1 = 2
The permutations left in this difference account for 2 of the duplicates in P_{3,2}:
P_{3,2} - (P_{3,1} - P_{3,0}) = 6 - 2 = 4
And you are left with the 4 unique arrangements that are not derangements.
So to get the number of derangements, we just have to subtract this from the total number of arrangements P_{3,3}:
P_{3,3} - (P_{3,2} - (P_{3,1} - P_{3,0})) = 6 - 4 = 2
And so we see that D_3 = 2

We have a pattern here. We can see that by distributing the negative signs in the above equation we get:
D_3 = P_{3,3} - P_{3,2} + P_{3,1} - P_{3,0}

This can be expressed as a sum:
D_3 = \sum\limits_{k = 0}^3 P_{3,k} (-1)^{3-k}
Or more explicitly:
D_3 = \sum\limits_{k = 0}^3 \frac{3!}{(3-k)!} (-1)^{3-k}

It turns out that this pattern holds for decks of any number of cards n (just replace the 3 here):
D_n = \sum\limits_{k = 0}^n \frac{n!}{(n-k)!} (-1)^{n-k}

For a deck of 3 cards, the number of derangements is: D_3 = 2
For a deck of 4 cards: D_4 = 9
For a deck of 5 cards: D_5 = 44
which then give us the probabilities of derangement:
p_3 = 2/3! = 33.3\%
p_4 = 9/4! = 37.5\%
p_5 = 44/5! = 36.7\%
So by 4 cards, we start seeing a probability closer to 1/e.
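
As a quick check of these numbers (and of the approach to 1/e), here is a short Python sketch of my own, not from the original post, that evaluates the derangement sum above directly:

import math

def derangements(n):
    # D_n = sum over k of n!/(n-k)! * (-1)^(n-k)
    return sum(math.factorial(n) // math.factorial(n - k) * (-1)**(n - k)
               for k in range(n + 1))

for n in (3, 4, 5, 10):
    D = derangements(n)
    print(n, D, D / math.factorial(n))
# prints D_3 = 2, D_4 = 9, D_5 = 44, with p_n approaching 1/e = 0.367879...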

Now, to get the probability p of getting a derangement for n cards:
p_n = \frac{D_n}{R_n} = \frac{1}{n!}\,\sum\limits_{k = 0}^n \frac{n!}{(n-k)!} (-1)^{n-k}
The n! factors cancel out, and we are left with:
p_n = \sum\limits_{k = 0}^n \frac{1}{(n-k)!} (-1)^{n-k}

Since the order of addition doesn’t matter, we can relabel the sum to make the relationship to e more obvious. Let m = n-k; as k runs from 0 to n, m runs from n down to 0, so we can write:
p_n = \sum\limits_{m=0}^n \frac{1}{m!} (-1)^{m}
This looks like a truncated Taylor expansion of the exponential function e^x.
A Taylor expansion is a way of expressing a function as an infinitely long polynomial.

The Taylor expansion of the exponential function is:
e^x = \sum\limits_{m=0}^\infty \frac{1}{m!} x^m
If we plug in x = -1 we get:
e^{-1} = \sum\limits_{m=0}^\infty \frac{1}{m!} (-1)^m
which is the limiting case of a deck with infinitely many cards:
p_\infty = \frac{1}{e}

This equivalence to the Taylor expansion is the reason why as you increase the number of cards, the probability of getting a derangement approaches 1/e.
To put it another way:
\frac{1}{p} = \frac{R}{D} \approx e.

Now that we have seen the theoretical probability, let’s do the experiment.
Take a standard 52-card deck, note the original positions of the cards, then shuffle and see if you have a derangement. Keep shuffling, noting each time whether the result is a derangement, for about 40,000 shuffles. Count the total number of shuffles R and the number of derangements D you get.
If you then take the ratio \frac{R}{D} you will get something approaching e.

Just kidding. That would be extremely tedious. Instead, we can have a computer do it for us.
[Figure: the ratio of shuffles to derangements vs. number of shuffles for a 52-card deck, converging to e]

For a deck of 52 cards, it seems to take about 40k shuffles for the ratio of shuffles to derangements to approach e. See the figure above.

The code used to compute e from shuffles is shown below.
Here I used MATLAB to take advantage of vectorization and logical array operations. The inputs are cards (the number of cards in the deck) and games (the number of shuffles). Wins (a win is a shuffle that produces a derangement) are then counted.

function eulersnum = eulerdeck(cards,games)
    wins = 0;
    series = 1:cards;
    for ii = 1:games
        shuffled = randperm(cards);
        wins = wins + ~any(series == shuffled);
    end
    eulersnum = games/wins;
end
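
For readers without MATLAB, here is a rough Python equivalent, a minimal sketch of my own using numpy.random.permutation in place of randperm:

import numpy as np

def eulerdeck(cards, games):
    # a win is a shuffle that leaves no card in its original position
    wins = 0
    series = np.arange(cards)
    for _ in range(games):
        shuffled = np.random.permutation(cards)
        wins += not np.any(shuffled == series)
    return games / wins  # shuffles per derangement, approaching e

print(eulerdeck(52, 40000))  # typically prints something near 2.72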

The known value of e is 2.7183…
To get an accurate measurement we should use a deck with a lot of cards, since that will include more terms of the Taylor expansion. We should also shuffle many times to be sure that we capture the probability.
So with a deck of 1000 cards, and 10 trials, each of 1000 shuffles, we measure e to be:
2.7265 \pm 0.1060


The Structure Behind Trigonometric Functions

This entry is dedicated to my friend Bud in Milwaukee.

Sine, cosine, tangent, etc. are actually abbreviations for combinations of complex exponential functions:

\text{sin}(\theta) = \frac{1}{2i} (e^{i\theta} - e^{-i\theta})

\text{cos}(\theta) = \frac{1}{2} (e^{i\theta} + e^{-i\theta})

\text{tan}(\theta) = \frac{ e^{i\theta} - e^{-i\theta} }{ i (e^{i\theta} + e^{-i\theta}) }

It’s just easier to write out something like “cos”, instead of the complex exponential form. However, it is handy to know the complex exponential forms to derive trig identities on the fly. So how exactly are trig functions made of complex exponential functions?

In a previous entry, I briefly discussed how the exponential function relates angle to Cartesian coordinates. When you put an imaginary number in the exponential function, it results in a complex number (a number with a real and an imaginary component). If you treat the real component like an x coordinate, and the imaginary component like a y coordinate, then any complex number can represent a point on a 2-D plane.

To visualize what’s actually going on in terms of geometry, see the figure below. Imagine a point (pictured in red) that can travel along a unit circle (a circle with radius 1). You can track the point’s angular position \varphi along the circle, and you can also track its position in terms of x and y. There is a relationship between the two ways to track position, and this relationship can be expressed as:
e^{i\varphi} = x + iy
[Figure: a point (red) on the unit circle at angle \varphi, with Cartesian coordinates x and y]
At this point, you may see that x = \text{cos}(\varphi) and y = \text{sin}(\varphi).
By definition, cosine is adjacent over the hypotenuse h:
\text{cos}(\varphi) = x/h
and sine is opposite over the hypotenuse:
\text{sin}(\varphi) = y/h
and since the radius is h = 1, you get:
x = \text{cos}(\varphi) and y = \text{sin}(\varphi)

So e^{i\varphi} = x + iy is really:
e^{i\varphi} = \text{cos}(\varphi) + i \text{sin}(\varphi)

From this equation, you can get the equations listed at the beginning by adding or subtracting complex conjugates. For example:
Take the complex conjugate (which just means to change all the i to -i).
e^{-i\varphi} = \text{cos}(\varphi) - i \text{sin}(\varphi)
Now add it to the original to get:
e^{i\varphi} + e^{-i\varphi} = \text{cos}(\varphi) + i \text{sin}(\varphi) + \text{cos}(\varphi) - i \text{sin}(\varphi)
e^{i\varphi} + e^{-i\varphi} = 2 \text{cos}(\varphi)
Now divide by 2 and you have the identity of cosine shown at the beginning!
\text{cos}(\varphi) = \frac{1}{2} (e^{i\varphi} + e^{-i\varphi})

So how do we know that something like e^{i\varphi} breaks up into a complex number like x + iy? One way is through Taylor expansions. I won’t get into how Taylor expansions are derived, since that requires calculus. The only thing you need to know here is that a function can be expressed as an infinitely long polynomial. In the case of the exponential function:
e^{z} = 1 + z + \frac{1}{2}z^2 + \frac{1}{6}z^3 + \frac{1}{24}z^4 + ...
or to be complete, we can write it as a summation:
e^{z} = \sum\limits_{n=0}^{\infty} \frac{1}{n!} z^n

So if we replace the variable z with i \varphi we get:
e^{i\varphi} = 1 + i\varphi + \frac{1}{2}(i\varphi)^2 + \frac{1}{6}(i\varphi)^3 + \frac{1}{24}(i\varphi)^4 + ...
The terms in the polynomial with odd exponents will be imaginary, and the terms with even exponents will be real. Recall that:
i^0 = 1
i^1 = i
i^2 = -1
i^3 = -i
i^4 = 1
So then we get:
e^{i\varphi} = 1 + i\varphi - \frac{1}{2}\varphi^2 - \frac{1}{6}i\varphi^3 + \frac{1}{24}\varphi^4 + ...
We can then group all the real terms together and all the imaginary terms together:
e^{i\varphi} = \left(1 - \frac{1}{2}\varphi^2 + \frac{1}{24}\varphi^4 -... \right) + i \left( \varphi - \frac{1}{6} \varphi^3 + ... \right)
or to be complete:
e^{i\varphi} = \sum\limits_{n=0}^{\infty} \frac{(-1)^n}{(2n)!} \varphi^{2n} + i \sum\limits_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!} \varphi^{2n+1}
So you can see that e^{i\varphi} indeed breaks up into a complex number. By definition, this is equivalent to:
e^{i\varphi} = \text{cos}(\varphi) + i \text{sin}(\varphi)

It is also worth noting that this gives us the Taylor series for sine and cosine:
\text{sin}(\varphi) = \sum\limits_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!} \varphi^{2n+1}

\text{cos}(\varphi) = \sum\limits_{n=0}^{\infty} \frac{(-1)^n}{(2n)!} \varphi^{2n}
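
As a quick sanity check, here is a small Python sketch of my own comparing these truncated series to the library sine and cosine (the number of terms and the test angle are arbitrary choices):

import math

def sin_series(phi, terms=10):
    # truncated Taylor series for sine
    return sum((-1)**n * phi**(2*n + 1) / math.factorial(2*n + 1)
               for n in range(terms))

def cos_series(phi, terms=10):
    # truncated Taylor series for cosine
    return sum((-1)**n * phi**(2*n) / math.factorial(2*n)
               for n in range(terms))

phi = 0.75
print(sin_series(phi), math.sin(phi))  # both about 0.6816
print(cos_series(phi), math.cos(phi))  # both about 0.7317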

For an example, let’s derive a trig identity using the complex exponential forms.
Let’s show that:
\text{sin}(\theta) \text{cos}(\theta) = \frac{1}{2} \text{sin}(2 \theta)

First write the complex exponential forms:
\text{sin}(\theta) = \frac{1}{2i} (e^{i\theta} - e^{-i\theta})

\text{cos}(\theta) = \frac{1}{2} (e^{i\theta} + e^{-i\theta})
Then multiply them:
\frac{1}{2i} (e^{i\theta} - e^{-i\theta}) \cdot \frac{1}{2} (e^{i\theta} + e^{-i\theta})
= \frac{1}{2} \cdot \frac{1}{2i} (e^{2 i \theta} + 1 - 1 - e^{-2 i \theta})
= \frac{1}{2} \cdot \frac{1}{2i} (e^{2 i \theta} - e^{-2 i \theta})
This is the same exponential form as the sine function with a couple of adjustments (a factor of 1/2 out front and a factor of 2 inside the function):
= \frac{1}{2} \text{sin} (2 \theta)
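
You can also verify this identity numerically with Python’s cmath module; a quick sketch (\theta = 0.6 is an arbitrary test value):

import cmath

theta = 0.6
sin_t = (cmath.exp(1j*theta) - cmath.exp(-1j*theta)) / (2j)
cos_t = (cmath.exp(1j*theta) + cmath.exp(-1j*theta)) / 2
lhs = sin_t * cos_t
rhs = (cmath.exp(2j*theta) - cmath.exp(-2j*theta)) / (4j)  # (1/2) sin(2*theta)
print(lhs.real, rhs.real)  # both about 0.46602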

Look up some other trig identities and try to derive them using complex exponentials.

More Buffon’s Needle

In my previous blog entry, I discussed Buffon’s needle (a method of measuring \pi using random numbers). I tried to recreate what would occur in physical reality without using trig functions in the measurement of \pi. The tricky part was trying to make a uniform probability distribution along the angle \theta, which is what you expect with physical needles being dropped. In this entry, I will do something more exact, but less physical.

In reality, the probability density in \theta is uniform, since there is no preferential direction for a needle to point. Simulating this with uniform random numbers, without using trigonometric functions (to avoid just measuring Python’s \pi), is tricky. The way I did this was to assign uniform random numbers (uniform probability density) to a projection of a needle, either perpendicular or parallel to the lines (with a probability of 0.5 assigned to each case). This results in a probability distribution in \theta that isn’t quite uniform, but it’s close enough to approximate \pi after a large number of needles are dropped:
2 \frac{N_T}{N_C} = \pi
where N_T and N_C are the total number of needles dropped and number of needles crossing a line, respectively.

But what if, instead of assigning the uniform probability to either projection fairly, we assign it to just one? This was shown in the previous post. As the number of needles dropped increases, the ratio 2 \frac{N_T}{N_C} approaches some number around 2.5 to 2.6, shown in the figure below (\pi shown in red). Why is this the case?
[Figure: 2 N_T/N_C vs. needles dropped, levelling off near 2.5 instead of \pi (red)]

The reason is that a uniform probability density along a projection (q) does not translate into a uniform probability density along \theta; see the figure below. This means the needles have a preferred direction. It turns out this can still be used to extract a measurement of \pi!
[Figure: needle geometry showing the projection q and the angle \theta]
The uniform probability density is assigned to the projection q:
\rho_q = 1/L, for 0 \leq q \leq L, and \rho_q=0 elsewhere.
To get this in terms of \theta, some substitutions are made in the probability density integral:
\int_0^L \rho_q dq = 1
The variables q and \theta are related by:
\text{cos}(\theta) = q/L
so by taking the derivative we get the substitution:
dq = -L\text{sin}(\theta) d \theta
(The minus sign is absorbed by reversing the limits of integration: as q runs from 0 to L, \theta runs from \pi/2 down to 0.)

So now the probability density integral can be written as:
\int_0^{\pi/2} \rho_q L \text{sin}(\theta) d \theta
which is also:
= \int_0^{\pi/2} \rho_{\theta} d \theta
So it gives the form of the probability density for \theta in terms of the one for q:
\rho_{\theta} = \rho_q L\text{sin}(\theta)

Plug in \rho_q = 1/L to get:
\rho_{\theta} = \text{sin}(\theta)
and it is obviously NOT uniform; it’s just the sine function!
Plotting a histogram of the number of needles per angle \theta will give the shape of the sine function:
[Figure: histogram of needles per angle \theta, tracing out the sine function]
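
A minimal sketch of this effect (separate from the full simulation below): assign a uniform random number to the projection q, recover \theta from \text{cos}(\theta) = q/L, and histogram the angles. The bin counts roughly trace out \text{sin}(\theta):

import numpy as np

L = 1.0
q = np.random.uniform(0.0, L, 100000)  # uniform density along the projection q
theta = np.arccos(q / L)               # corresponding angle
counts, edges = np.histogram(theta, bins=10, range=(0.0, np.pi/2))
centers = 0.5 * (edges[:-1] + edges[1:])
# normalized counts roughly match the normalized sine curve
print(np.round(counts / counts.max(), 2))
print(np.round(np.sin(centers) / np.sin(centers).max(), 2))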

Following the same steps as in the previous entry, we can find out why, when a uniform probability is assigned to one projection, 2 N_T/N_C approaches roughly 2.5 instead of \pi.
For simplicity, set W = L. Again, for a needle to be crossing a line, x has to be below the upper bound:
x \leq \frac{L}{2}\text{sin}(\theta)
and the probability of a needle crossing a line is:
P = \int_0^{\pi/2} \int_0^{\frac{L}{2}\text{sin}(\theta)} \rho_x \rho_{\theta} \, dx \, d \theta
Plugging in the probability densities: \rho_x = 2/L and \rho_\theta = \text{sin}(\theta):
P = \int_0^{\pi/2} \text{sin}^2(\theta) d \theta = \pi/4

P = \frac{N_C}{N_T} when the number of needles dropped is large, so by combining these equations:
2 \frac{N_T}{N_C} = \frac{8}{\pi} \approx 2.55
This is the number that is being approached!
But wait, \pi is there, so we can extract it by:
4 \frac{N_C}{N_T} = \pi

By plotting this ratio as a function of number of needles dropped, we see it approaching \pi. See the figure below.
[Figure: 4 N_C/N_T vs. needles dropped, approaching \pi]

Even though this isn’t how needles behave in reality, it’s a simpler way to measure \pi by using random numbers in a simulation. The code used to make these plots is shown below.

By the last drop:
2 \frac{N_T}{N_C} = 2.54
4 \frac{N_C}{N_T} = 3.14

import math
import numpy
import pylab
import random

CL = 0
NN = 0

PIa = []
PIb = []
NNa = []
PIc = []
ANGLES = []

for ii in range(100000):

    PPy = random.uniform(0.0,1.0) #A-end y position (lines separated by unit distance)

    RRx = random.uniform(0.0,1.0) #x projection
    RRy = numpy.sqrt(1.0 - RRx**2) #y projection

    QQy = PPy + RRy #B-end y position
    QQyf = math.floor(QQy)
    PPyf = math.floor(PPy)

    NN = NN + 1
    if (QQyf-PPyf > 0.0): #if crossing line
        CL = CL + 1

    NNa.append(NN)
    PIc.append(math.pi)
    RAT = (1.0*CL)/(1.0*NN)
    if (CL==0):
        PIa.append(0)
    else:
        PIa.append(2.0/RAT)

    PIb.append(4*RAT)

    ANGLES.append(abs(math.atan(RRy/(RRx+0.0001)))/math.pi)

print(PIa[-1])
print(PIb[-1])

pylab.figure(1)
pylab.xlabel("NT")
pylab.ylabel("2 NT/NC")
pylab.plot(NNa,PIc,'r')
pylab.plot(NNa,PIa,'k')
pylab.savefig('DPI1.png')

pylab.figure(2)
pylab.hist(ANGLES, bins=10)
pylab.xlabel("Acute angle in factors of pi")
pylab.ylabel("Needles dropped")
pylab.savefig('DPI2.png')

pylab.figure(3)
pylab.xlabel("NT")
pylab.ylabel("4 NC/NT")
pylab.plot(NNa,PIc,'r')
pylab.plot(NNa,PIb,'k')
pylab.savefig('DPI3.png')

pylab.show()

Happy Pi Day!

Since it is \pi day, I thought I would show a way to measure \pi. The way I have in mind is known as Buffon’s needle. If you have a flat surface with parallel lines of uniform separation and drop needles of uniform length on it, the ratio between the total number of needles dropped and the number of needles crossing a line involves a factor of \pi.

The figure below shows the needles of length L in blue, and the lines in black with separation W. For each needle there are two important values: x and \theta. The x value is the distance between the center of the needle and the nearest line. The \theta value is the acute angle between the needle and the nearest line. The span of available values of x is then 0 \leq x \leq W/2, and the span of \theta is 0 \leq \theta \leq \pi/2.

[Figure: needles of length L (blue) dropped on parallel lines (black) with separation W, showing x and \theta]
Considering all possible values of x and \theta, the probability is 1 by definition.
P = \int_{\Delta \theta} \int_{\Delta x} \rho dx d\theta = 1
Where in general, if x and \theta are independent: \rho = \rho(x)\rho(\theta).

Uniform probability densities are assumed for both x and \theta, since there is no preferred direction or position for the needles. So the probability densities \rho are defined by:
\rho_{\theta} = 2/\pi, for 0 \leq \theta \leq \pi/2, and \rho_{\theta} = 0 elsewhere.
\rho_{x} = 2/W, for 0 \leq x \leq W/2, and \rho_{x} = 0 elsewhere.
A uniform probability density is shown in the figure below for \theta; note that the area is 1.

[Figure: the uniform probability density \rho_{\theta} = 2/\pi over 0 \leq \theta \leq \pi/2, with area 1]
For simplicity, we will only consider the case where the needles are shorter or the same length as the separation distance between the lines: L \leq W.
A needle crosses a line when the projected half-length of the needle perpendicular to the lines is at least the distance from the center of the needle to the nearest line:
x \leq \frac{L}{2}\text{sin}(\theta)
This is the upper bound that is then plugged into the probability integral, to get the probability of crossing a line:
P = \int_{0}^{\pi/2} \int_{0}^{\frac{L}{2}\text{sin}(\theta)} \rho_{x} \rho_{\theta} dx d\theta = \frac{2 L}{\pi W}
For the simple case where L = W:
P = \frac{2}{\pi}
After enough drops the ratio between the number of needles crossing a line N_C and the total number of needles N_T will resemble the probability:
P = \frac{N_C}{N_T}
Combining these two equations, we get a ratio that can measure \pi:
2 \frac{N_T}{N_C} = \pi
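
Before looking at the simulation, we can sanity-check the probability itself with a quick Monte Carlo sketch of my own. Note that this version samples \theta uniformly using numpy’s value of \pi, so it only verifies the integral; it is not a \pi-free measurement:

import numpy as np

L = W = 1.0
N = 1000000
theta = np.random.uniform(0.0, np.pi/2, N)  # uniform angle
x = np.random.uniform(0.0, W/2, N)          # uniform centre-to-line distance
P = np.mean(x <= (L/2) * np.sin(theta))     # fraction of needles crossing a line
print(P, 2*L/(np.pi*W))                     # both about 0.6366 = 2/pi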

Doing this measurement in real life by dropping needles would make a mess, so I wrote a Python script to do it instead. Below is a graph showing this ratio as a function of needle drops (shown in black); \pi is shown in red. You can see the black line approaches \pi as the number of needles increases.

[Figure: 2 N_T/N_C vs. needles dropped (black), approaching \pi (red)]

The probability density as a function of \theta is shown below. It is not perfectly flat as in the ideal case. This is because in my code the uniform probability density is along the Cartesian axes (X,Y) rather than along \theta (the parallel lines are separated along Y). I did this to avoid measuring Python’s \pi in its trigonometric functions.

[Figure: histogram of needle angles, approximately uniform]

A uniform random number is assigned to the projected needle length along X or Y with a range of 0 to 1. The length of a needle is forced to be 1 by using the Pythagorean theorem: X = \sqrt{1-Y^2} or Y = \sqrt{1-X^2}. There is a 0.5 probability that either X or Y is assigned to the uniform random number. This approximates a uniform probability density in \theta. It’s not perfect though.

If, however, the uniform random number is assigned to only X, the ratio doesn’t make it to \pi; it stops around 2.6, and the probability density is far from uniform. See the figures below:

[Figure: 2 N_T/N_C vs. needles dropped, stopping near 2.6]

[Figure: histogram of needle angles, far from uniform]

The code for these plots is shown below:

import math
import numpy
import pylab
import random

CL = 0
NN = 0

PIa = []
NNa = []
PIc = []
ANGLES = []

for ii in range(10000):

    PPx = random.uniform(1.0,10.0) #1-10 A-end x position
    PPy = random.uniform(0.0,10.0) #0-10 A-end y position (lines)

    RRx = random.uniform(-1.0,1.0) #B-end x displacement
    RRy = numpy.sqrt(1.0 - RRx**2) #B-end y displacement

    if (random.uniform(0.0,1.0)>0.5): #balance probability density to make it more uniform in terms of acute angle
        SIGN = 1
        if (random.uniform(0.0,1.0)>0.5):
            SIGN = -1
        RRy = random.uniform(0.0,1.0) #B-end y displacement
        RRx = numpy.sqrt(1.0 - RRy**2)*SIGN #B-end x displacement


    QQx = PPx + RRx #B-end x position
    QQy = PPy + RRy #B-end y position
    QQyf = math.floor(QQy)
    PPyf = math.floor(PPy)

    NN = NN + 1
    if (QQyf-PPyf > 0.0): #if crossing line
        CL = CL + 1

    NNa.append(NN)
    PIc.append(math.pi)
    RAT = (1.0*CL)/(1.0*NN)
    if (CL==0):
        PIa.append(0)
    else:
        PIa.append(2.0/RAT)

    ANGLES.append(abs(math.atan(RRy/(RRx+0.0001)))/math.pi)

print(PIa[-1])

pylab.figure(1)
pylab.xlabel("NT")
pylab.ylabel("2 NT/NC")
pylab.plot(NNa,PIc,'r')
pylab.plot(NNa,PIa,'k')
pylab.savefig('BPI1q.png')

pylab.figure(2)
pylab.hist(ANGLES, bins=10)
pylab.xlabel("Acute angle in factors of pi")
pylab.ylabel("Needles dropped")
pylab.savefig('BPI2q.png')

pylab.show()

Lego Loop Limits

[Figure: a circular loop built from Lego bricks]
How can we make a circular loop out of Lego bricks? For simplicity, we’ll use 2×1 bricks. The dimensions of a 2×1 brick are shown below in millimetres.
[Figure: dimensions of a 2×1 Lego brick in millimetres]
We’ll define some variables using these dimensions:
Clearance per brick end: \epsilon = 0.1 mm
Distance between pegs: m = 8.0 mm
Width: w = m - 2 \epsilon = 7.8 mm
Length: l = 2m - 2 \epsilon = 15.8 mm

If bricks are assembled in a “running bond” style, two adjacent bricks can rotate maximally by using all the available clearance space. This means that if we build a wall long enough, it can be wrapped around into a loop. The question now is, what is the minimum number of bricks we need to do this? (The figure below exaggerates this rotation.)
[Figure: running-bond wall with the rotation between bricks exaggerated]
For any regular polygon with n sides, angles A and B are already known (see next figure below).
Center angle A is just a circle (2\pi) divided by the number of sides.
A=2\pi/n
Any triangle’s angles add up to \pi.
\pi = A + 2B
Combining these equations, inner angle B is then defined in terms of the number of sides:
B = \pi/2 - \pi/n
[Figure: a regular polygon with centre angle A and inner angle B]
Angle B is related to angle \theta (shown in the figure below) by:
\pi = B + \pi/4 + \theta
(The space between B and \theta is \pi/4 since the peg is centered such that it’s equidistant from the lengthwise edges and the nearest width-wise edge of the brick.)
Angle \theta can be expressed in terms of brick width w and clearance \epsilon.
In the figure below, the distance between the red point and blue point is w/\sqrt{2}, and the shortest distance from the blue point to the dashed line of symmetry is w/2 + \epsilon. So the angle \theta can be expressed as:
\text{sin}\theta = \frac{w/2+\epsilon}{w/\sqrt{2}}
or
\theta = \text{arcsin}(1/\sqrt{2} + \sqrt{2} \epsilon/w)
[Figure: triangle relating \theta to the brick width w and clearance \epsilon]
Angle B is then:
B = 3\pi/4 - \text{arcsin}(1/\sqrt{2} + \sqrt{2} \epsilon/w)

Plugging the earlier polygon formula into this equation gives a formula for n in terms of brick width and clearance:
n = \frac{\pi}{\text{arcsin}(1/\sqrt{2} + \sqrt{2} \epsilon/w)-\pi/4}

Plugging in the dimensions, we get a minimum number of sides to make a loop:
n \approx 121
and since at least 3 layers are needed to create a secure running bond, the minimum number of bricks needed is 363.
This has a corresponding angle:
\theta = 0.81136
and acute angle between bricks:
\delta = 2(\theta - \pi/4) = 0.051928
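
These numbers are easy to reproduce with a few lines of Python (a quick check of the formulas above):

import math

eps = 0.1              # clearance per brick end (mm)
m = 8.0                # distance between pegs (mm)
w = m - 2*eps          # brick width (mm)

theta = math.asin(1/math.sqrt(2) + math.sqrt(2)*eps/w)
n = math.pi / (theta - math.pi/4)   # minimum number of sides
delta = 2*(theta - math.pi/4)       # acute angle between bricks

print(theta, n, delta)  # about 0.81136, 121.0, 0.051928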

Trying this in real life, I was able to get a loop with 110 sides (330 bricks).
[Figure: the assembled 110-sided Lego loop]
n_R = 110
\theta_R = 0.81396
\delta_R = 0.057120
The theoretical calculation above assumes a perfectly rigid Lego brick.
So the difference in n may be accounted for by tolerances on the brick (stretching and squashing) and asymmetric angles, allowing for tighter inner angles.

Imaginary Powers

Imaginary numbers (i = \sqrt{-1}) are used to relate Cartesian coordinates to polar coordinates, as shown in the figure below. The argument \varphi is the angle from the real axis along the unit circle, and the exponential can be expressed as two terms, one real and one imaginary, giving the Cartesian (x,y) position.
e^{i\varphi} = \text{cos}(\varphi) + i \text{sin}(\varphi) = x + i y
(If you want to expand beyond the unit circle just multiply by the desired radius: e^{i\varphi} \rightarrow re^{i\varphi})

[Figure: the unit circle in the complex plane, with angle \varphi measured from the real axis]
This is known as the Euler relation, and it gives us some neat results, like:
e^{i\pi} = -1
e^{i \pi /2} = i
But what happens when you exponentiate i with itself? i^i.

From the Euler relation we know that i=e^{i \pi /2}, so it’s really a matter of substitution:
i^i = \left( e^{i \pi / 2} \right)^i = e^{i^2 \pi / 2} = e^{-\pi / 2} = 0.20788...
So i^i is a real number!

But wait a second, this isn’t the whole answer. We can add or subtract 2\pi from any angle and wind up in the same place on the circle. So i = e^{i \pi /2 + i 2 \pi n}, where n is any integer. So we end up with a set of values:
i^i = e^{- \pi /2 - 2 \pi n}
which are all real!

[Figure: the curve e^{-\pi/2 - 2\pi n} passing through the values of i^i]
The curve above goes through the set of values for i^i, where n is an integer.
i^i = \{...,111.32,0.20788,0.00038820,...\}
But we’ll treat 0.20788… as the principal value for now, since it is the result when n=0, and having one value keeps things simple.
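
Python agrees: its ** operator returns the principal value, and the other members of the set follow from the formula above:

import math

print((1j)**1j)  # principal value: (0.20787957...+0j)
for n in (-1, 0, 1):
    print(math.exp(-math.pi/2 - 2*math.pi*n))
# 111.3177..., 0.20787..., 0.00038820...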

So what happens when we repeat this exponentiation? i^{i^i}
First, this can be written in a simpler way:
^3i = i^{i^i}
This operation is called “tetration” or “power tower”.
You start from the top and work your way down. So ^3i = i^{i^i} = i^{0.20788}
Now use the Euler relation again (ignoring 2 \pi n terms in the exponential):
i^{0.20788} = e^{i \pi/2 \times 0.20788} = e^{i 0.32653}
So the result is complex:
^3i = 0.94715+0.32076i

We can keep going using results from the last iteration:
^4i = i^{0.94715+0.32076 i} = 0.05009+0.60211 i

Eventually you find that, as m increases, ^m i starts to converge on one number. This implies there’s a value for ^\infty i. After plugging and chugging we can eventually conclude that:
^\infty i = 0.43828 + 0.36059 i
Convergence can be seen in both the real and imaginary components, around m \approx 70 if we require the difference between iterations to be less than 0.1%. (See figures below).

[Figures: real and imaginary components of ^m i as a function of m]

Each result has a different radius associated with it, since the imaginary component of the previous result puts a real term in the exponential of the next. This real term can be expressed as a factor r:
e^{i\varphi} = a + bi \rightarrow i^{a+bi} = e^{i \pi/2 (a + bi)} = e^{-b\pi/2}e^{i a \pi/2} = r e^{i a \pi/2}
The radius can be found by multiplying the result by its complex conjugate and then taking the square root (i.e. the Pythagorean theorem).

We can then plot the radius of ^m i as a function of m (see the plot below).

[Figure: radius of ^m i as a function of m]

The python code for making these plots is:

import numpy
import pylab

REAL = []
IMAG = []
RADII = []
ITERA = []

pp = 1j
for n in range(100):
    ppL = pp #save previous value for cut off
    pp = (1j)**pp #iterate the power tower
    RA = 0.001 #relative difference for cut off (0.1%)
    if (abs(pp.real - ppL.real)/(abs(ppL.real) + 0.00001) < RA) and (abs(pp.imag - ppL.imag)/(abs(ppL.imag) + 0.00001) < RA): #establish a cut off
        print(n)
        print(pp)
        break
    else:
        print(abs(pp)) #radius (modulus)

        REAL.append(pp.real) #save real component
        IMAG.append(pp.imag) #save imaginary component
        RADII.append(abs(pp)) #save radius
        ITERA.append(n+2) #save iteration "m"

RADII = [1]+RADII #add in the initial value for the plot
ITERA = [1]+ITERA
REAL = [0]+REAL
IMAG = [1]+IMAG

pylab.figure(1)
pylab.xlabel("m")
pylab.ylabel("radius")
pylab.plot(ITERA,RADII,'k')

pylab.savefig('Radfig.png')


pylab.figure(2)
pylab.xlabel("m")
pylab.ylabel("imaginary component")
pylab.plot(ITERA,IMAG,'k')

pylab.savefig('Imfig.png')

pylab.figure(3)
pylab.xlabel("m")
pylab.ylabel("real component")
pylab.plot(ITERA,REAL,'k')

pylab.savefig('Refig.png')
        

Translating a Wave Function

In algebra, or pre-calc, you learn that you can change the position of a function by modifying its argument. In quantum physics this idea is used to displace wave functions. If a function starts off at one position and moves to another position, all that is needed is a change in argument. However, quantum physics likes to use linear operators to alter functions. What would an operator look like if it could change the argument of a function? In this post, I will construct a 1-D example of such an operator.

A general wave function can be written as: \Psi (x), where the shape of \Psi is dependent on the spatial variable x.
To translate a function by a distance a, modify the argument of \Psi:
To move the function right by a, \Psi(x) \rightarrow \Psi (x-a)
To move the function left by a, \Psi(x) \rightarrow \Psi(x+a)
Let’s just take the \Psi(x+a) example, and without loss of generality say that a can be positive or negative.
Next we can take advantage of Taylor expansions.
A function f(x) can be expanded around a point a: f(x) = f(a) + (x-a)\frac{d}{dx}f(a) + \frac{(x-a)^2}{2!} \frac{d^2}{dx^2}f(a) + ...
Here, in our example, we want to expand \Psi(x+a) around x, to express the translated function \Psi(x+a) in terms of the original function \Psi(x):
\Psi(x+a) = \Psi(x) + a\frac{d}{dx}\Psi(x) + \frac{a^2}{2!} \frac{d^2}{dx^2} \Psi(x) + ...
Note: in terms of the momentum operator p = -i\hbar \frac{d}{dx}, we have \frac{d}{dx} = \frac{i}{\hbar} p.
A more complete version of this expression would be:
\Psi(x+a) = \sum\limits_{n=0}^{\infty} \frac{a^n}{n!} \frac{d^n}{dx^n} \Psi(x) = \Psi(x) + \sum\limits_{n=1}^{\infty} \frac{a^n}{n!} \frac{d^n}{dx^n} \Psi(x)
This sum’s structure is similar to the Taylor expansion of the exponential function.
e^x = 1 + x + x^2/2! + ... = \sum\limits_{n=0}^{\infty} \frac{x^n}{n!}
Every operator in the \Psi(x+a) expansion can be collected into a single exponential operator: \Psi(x+a) = e^{a\frac{d}{dx}} \Psi(x)
= e^{\frac{ai}{\hbar} p} \Psi(x).
This new operator e^{a\frac{d}{dx}} can be expanded to return to what we had before: e^{a\frac{d}{dx}} = \sum\limits_{n=0}^{\infty} \frac{a^n}{n!} \frac{d^n}{dx^n}.
So the translated function is a version of the original function with a specific type of interference, with the structure: Translated = Original + Interference.
\Psi(x+a) = \Psi(x) + \sum\limits_{n=1}^{\infty} \frac{a^n}{n!} \frac{d^n}{dx^n} \Psi(x) = \Psi(x) + \Delta(a)\Psi(x)
We can reduce the operators in the interference terms into an exponential in the usual way:
\Delta(a) = \sum\limits_{n=1}^{\infty} \frac{a^n}{n!} \frac{d^n}{dx^n}
= e^{a \frac{d}{dx} } - 1
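
Here is a numerical sketch of the translation operator in action (my own illustration, not from the original derivation). Instead of summing high-order derivatives on a grid, which is numerically unstable, we can apply e^{a\frac{d}{dx}} spectrally: in Fourier space \frac{d}{dx} becomes ik, so the operator is just multiplication by e^{ika}. The grid and the Gaussian are arbitrary choices:

import numpy as np

N, xmax = 1024, 20.0
x = np.linspace(-xmax, xmax, N, endpoint=False)
dx = x[1] - x[0]
k = 2*np.pi*np.fft.fftfreq(N, d=dx)
psi = np.exp(-x**2)   # a localized (Gaussian) wave function
a = 3.0

# apply exp(a d/dx): multiply the spectrum by exp(i k a)
translated = np.fft.ifft(np.exp(1j*k*a) * np.fft.fft(psi)).real

print(np.max(np.abs(translated - np.exp(-(x + a)**2))))  # ~1e-13: equals psi(x+a)
print(np.sum(psi * translated) / np.sum(psi * psi))      # overlap, -> 0 as a grows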

The expectation value \left< e^{a\frac{d}{dx}} \right> characterizes the average overlap between \Psi(x) and \Psi(x+a).
\left< e^{a\frac{d}{dx}} \right> = \left< \Psi(x) \right| e^{a\frac{d}{dx}} \left| \Psi(x) \right>
= \left< \Psi(x) | \Psi(x+a) \right>
= \left< \Psi(x) \right| 1 + \Delta(a) \left| \Psi(x) \right>
= 1 + \left< \Delta(a) \right>
The expectation value of the translation operator is then unity plus the expectation value of the interference operator.
\left< \Delta (a) \right> = \sum\limits_{n=1}^{\infty} \frac{a^n}{n!} \left< \frac{d^n}{dx^n} \right>
= \sum\limits_{n=1}^{\infty} \frac{(ia)^n}{(\hbar)^n n!} \left< p^n \right>
The interference expectation value is shown to be an expectation value of a function of the momentum operator.
In the case of localized waves, as a gets much greater than the width of \Psi(x), the interference term approaches -1, since the overlap between the original and translated wave function decreases. This is equivalent to the original and displaced wave function becoming more and more orthogonal.
\lim\limits_{a\rightarrow \infty} \left< \Delta (a) \right> = -1
\lim\limits_{a\rightarrow \infty} \left< \Psi(x) | \Psi(x+a) \right> = 0