# Happy Pi Day!

Since it is $\pi$ day, I thought I would show a way to measure $\pi$. The way I have in mind is known as Buffon’s needle. If you have a flat surface with parallel lines of uniform separation and drop needles of uniform length on it, the ratio between the total number of needles dropped and the number of needles crossing a line will be a factor of $\pi$.

The figure below shows the needles of length $L$ in blue, and the lines in black with separation $W$. For each needle there are two important values: $x$ and $\theta$. The $x$ value is the distance between the center of the needle and the nearest line. The $\theta$ value is the acute angle between the needle and the nearest line. The span of available values of $x$ is then $0 \leq x \leq W/2$, and the span of $\theta$ is $0 \leq \theta \leq \pi/2$.

Considering all possible values of $x$ and $\theta$, the probability is 1 by definition.
$P = \int_{\Delta \theta} \int_{\Delta x} \rho dx d\theta = 1$
where, in general, if $x$ and $\theta$ are independent: $\rho = \rho_{x}\rho_{\theta}$.

Uniform probability densities are assumed for both $x$ and $\theta$, since there is no preferred direction or position for the needles. So the probability densities $\rho$ are defined by:
$\rho_{\theta} = 2/\pi$, for $0 \leq \theta \leq \pi/2$, and $\rho_{\theta} = 0$ elsewhere.
$\rho_{x} = 2/W$, for $0 \leq x \leq W/2$, and $\rho_{x} = 0$ elsewhere.
The uniform probability density for $\theta$ is shown in the figure below; note that the area under it is 1.

For simplicity, we will only consider the case where the needles are shorter or the same length as the separation distance between the lines: $L \leq W$.
So for needles crossing a line, the projected half-length of the needle perpendicular to the lines is greater than the distance between the center of the needle to the nearest line:
$x \leq \frac{L}{2}\sin(\theta)$
This is the upper bound that is then plugged into the probability integral, to get the probability of crossing a line:
$P = \int_{0}^{\pi/2} \int_{0}^{\frac{L}{2}\sin(\theta)} \rho_{x} \rho_{\theta} dx d\theta = \frac{2 L}{\pi W}$
For the simple case where $L = W$:
$P = \frac{2}{\pi}$
After enough drops the ratio between the number of needles crossing a line $N_C$ and the total number of needles $N_T$ will resemble the probability:
$P = \frac{N_C}{N_T}$
Combining these two equations, we get a ratio that measures $\pi$:
$2 \frac{N_T}{N_C} = \pi$
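This ratio can be sanity-checked with a minimal Monte Carlo sketch that samples $x$ and $\theta$ uniformly (unlike the full script later, this one leans directly on Python’s trigonometric functions and built-in $\pi$):

```python
import math
import random

random.seed(0)  # reproducible drops

L = W = 1.0  # needle length equals line separation
NT = 100000  # total needles dropped
NC = 0       # needles crossing a line

for _ in range(NT):
    x = random.uniform(0.0, W / 2)            # center-to-line distance
    theta = random.uniform(0.0, math.pi / 2)  # acute angle to the lines
    if x <= (L / 2) * math.sin(theta):        # crossing condition
        NC += 1

print(2 * NT / NC)  # should be close to pi
```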

Doing this measurement in real life by dropping needles would make a mess, so I wrote a Python script to do it instead. Below is a graph showing this ratio as a function of needle drops (shown in black), with $\pi$ shown in red. You can see the black line approaches $\pi$ as the number of needles increases.

The probability density as a function of $\theta$ is shown below. It is not perfectly flat as in the ideal case. This is because in my code the uniform probability density is along the Cartesian axes ($X,Y$) rather than along $\theta$. (The parallel lines are separated along $Y$.) I did this to avoid implicitly measuring Python’s built-in $\pi$ through its trigonometric functions.

A uniform random number is assigned to the projected needle length along $X$ or $Y$ with a range of 0 to 1. The length of a needle is forced to be 1 by using the Pythagorean theorem: $X = \sqrt{1-Y^2}$ or $Y = \sqrt{1-X^2}$. There is a 0.5 probability that either $X$ or $Y$ is assigned to the uniform random number. This approximates a uniform probability density in $\theta$. It’s not perfect though.

If however, the uniform random number is assigned to only $X$, the ratio doesn’t make it to $\pi$, it stops around 2.6, and the probability density is far from uniform. See the figures below:

The code for these plots is shown below:

```python
import math
import numpy
import pylab
import random

CL = 0  # needles crossing a line
NN = 0  # total needles dropped

PIa = []     # running estimate of pi
NNa = []     # drop count
PIc = []     # constant pi, for the reference line
ANGLES = []  # acute angle of each needle

for ii in range(10000):

    PPx = random.uniform(1.0, 10.0)  # A-end x position
    PPy = random.uniform(0.0, 10.0)  # A-end y position (lines at integer y)

    RRx = random.uniform(-1.0, 1.0)  # B-end x displacement
    RRy = numpy.sqrt(1.0 - RRx**2)   # B-end y displacement (unit length)

    # balance the probability density to make it more uniform in acute angle
    if random.uniform(0.0, 1.0) > 0.5:
        SIGN = 1
        if random.uniform(0.0, 1.0) > 0.5:
            SIGN = -1
        RRy = random.uniform(0.0, 1.0)         # B-end y displacement
        RRx = numpy.sqrt(1.0 - RRy**2) * SIGN  # B-end x displacement

    QQx = PPx + RRx  # B-end x position
    QQy = PPy + RRy  # B-end y position
    QQyf = math.floor(QQy)
    PPyf = math.floor(PPy)

    NN = NN + 1
    if QQyf - PPyf > 0.0:  # the needle crosses a line
        CL = CL + 1

    NNa.append(NN)
    PIc.append(math.pi)
    RAT = (1.0 * CL) / (1.0 * NN)
    if CL == 0:
        PIa.append(0)
    else:
        PIa.append(2.0 / RAT)

    ANGLES.append(abs(math.atan(RRy / (RRx + 0.0001))) / math.pi)

print(PIa[-1])

pylab.figure(1)
pylab.xlabel("NT")
pylab.ylabel("2 NT/NC")
pylab.plot(NNa, PIc, 'r')
pylab.plot(NNa, PIa, 'k')
pylab.savefig('BPI1q.png')

pylab.figure(2)
pylab.hist(ANGLES, bins=10)
pylab.xlabel("Acute angle in factors of pi")
pylab.ylabel("Needles dropped")
pylab.savefig('BPI2q.png')

pylab.show()
```


# Lego Loop Limits

How can we make a circular loop out of lego bricks? For simplicity, we’ll use 2×1 bricks. The dimensions of a 2×1 brick are shown below in millimetres.

We’ll define some variables using these dimensions:
Clearance per brick end: $\epsilon = 0.1$ mm
Distance between pegs: $m = 8.0$ mm
Width: $w = m - 2 \epsilon = 7.8$ mm
Length: $l = 2m - 2 \epsilon = 15.8$ mm

If bricks are assembled in a “running bond” style, two adjacent bricks can rotate maximally by using all the available clearance space. This means that if we build a wall long enough, it can be wrapped around into a loop. The question now is, what is the minimum number of bricks we need to do this? (The figure below exaggerates this rotation.)

For any regular polygon with $n$ sides, angles $A$ and $B$ are already known (see next figure below).
Center angle $A$ is just a circle ($2\pi$) divided by the number of sides.
$A=2\pi/n$
Any triangle’s angles add up to $\pi$.
$\pi = A + 2B$
Combining these equations, inner angle $B$ is then defined in terms of the number of sides:
$B = \pi/2 - \pi/n$

Angle $B$ is related to angle $\theta$ (shown in the figure below) by:
$\pi = B + \pi/4 + \theta$
(The space between $B$ and $\theta$ is $\pi/4$ since the peg is centered such that it’s equidistant from the lengthwise edges and the nearest width-wise edge of the brick.)
Angle $\theta$ can be expressed in terms of brick width $w$ and clearance $\epsilon$.
In the figure below, the distance between the red point and blue point is $w/\sqrt{2}$, and the shortest distance from the blue point to the dashed line of symmetry is $w/2 + \epsilon$. So the angle $\theta$ can be expressed as:
$\sin\theta = \frac{w/2+\epsilon}{w/\sqrt{2}}$
or
$\theta = \arcsin(1/\sqrt{2} + \sqrt{2} \epsilon/w)$

Angle $B$ is then:
$B = 3\pi/4 - \arcsin(1/\sqrt{2} + \sqrt{2} \epsilon/w)$

Plugging the earlier polygon formula into this equation gives a formula for $n$ in terms of brick width and clearance:
$n = \frac{\pi}{\arcsin(1/\sqrt{2} + \sqrt{2} \epsilon/w)-\pi/4}$

Plugging in the dimensions, we get a minimum number of sides to make a loop:
$n \approx 121$
and since at least 3 layers are needed to create a secure running bond, the minimum number of bricks needed is 363.
This has a corresponding angle:
$\theta = 0.81136$
and acute angle between bricks:
$\delta = 2(\theta - \pi/4) = 0.051928$
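These numbers follow directly from the formulas above; a quick check in Python:

```python
import math

eps = 0.1  # clearance per brick end, mm
w = 7.8    # brick width, mm

# angle between the brick edge and the line of symmetry
theta = math.asin(1 / math.sqrt(2) + math.sqrt(2) * eps / w)
delta = 2 * (theta - math.pi / 4)    # acute angle between adjacent bricks
n = math.pi / (theta - math.pi / 4)  # minimum number of sides

print(math.ceil(n), 3 * math.ceil(n))  # 121 363 (sides, bricks for 3 layers)
```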

Trying this in real life, I was able to get a loop with 110 sides (330 bricks).

$n_R = 110$
$\theta_R = 0.81396$
$\delta_R = 0.057120$
The theoretical calculation above assumes a perfectly rigid lego brick.
So the difference in $n$ may be accounted for by tolerances on the lego brick (stretching and squashing) and asymmetric angles, allowing for tighter inner angles.

# Imaginary Powers

Imaginary numbers ($i = \sqrt{-1}$) are used to relate Cartesian coordinates to polar coordinates, as shown in the figure below. The argument $\varphi$ is the angle from the real axis along the unit circle, and the exponential can be expressed as the sum of a real term and an imaginary term, giving the Cartesian $(x,y)$ position.
$e^{i\varphi} = \cos(\varphi) + i \sin(\varphi) = x + i y$
(If you want to expand beyond the unit circle just multiply by the desired radius: $e^{i\varphi} \rightarrow re^{i\varphi}$)

This is known as the Euler relation, and it gives us some neat results, like:
$e^{i\pi} = -1$
$e^{i \pi /2} = i$
But what happens when you exponentiate $i$ with itself? $i^i$.

From the Euler relation we know that $i=e^{i \pi /2}$, so it’s really a matter of substitution:
$i^i = \left(e^{i \pi / 2}\right)^i = e^{i^2 \pi / 2} = e^{-\pi / 2} = 0.20788...$
So $i^i$ is a real number!
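This principal value is easy to check, since Python’s complex power also uses the principal branch:

```python
import math

z = 1j ** 1j                   # i to the power i
print(z.real, z.imag)          # ~0.20788, with zero imaginary part
print(math.exp(-math.pi / 2))  # the same number
```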

But wait a second, this isn’t the whole answer. We can add or subtract $2\pi$ from any angle and wind up in the same place on the circle. So $i = e^{i \pi /2 + i 2 \pi n}$, where $n$ is any integer. So we end up with a set of values:
$i^i = e^{- \pi /2 - 2 \pi n}$
which are all real!

The curve above goes through the set of values for $i^i$, where $n$ is an integer.
$i^i = \{...,111.32,0.20788,0.00038820,...\}$
But 0.20788… will be looked at as the primary value for now since it is the result when $n=0$, and having one value keeps things simple.

So what happens when we repeat this exponentiation? $i^{i^i}$
First, this can be written in a simpler way:
$^3i = i^{i^i}$
This operation is called “tetration” or “power tower”.
You start from the top and work your way down. So $^3i = i^{i^i} = i^{0.20788}$
Now use the Euler relation again (ignoring $2 \pi n$ terms in the exponential):
$i^{0.20788} = e^{i \pi/2 \times 0.20788} = e^{i 0.32653}$
So the result is complex:
$^3i = 0.94715+0.32076i$

We can keep going using results from the last iteration:
$^4i = i^{0.94715+0.32076 i} = 0.05009+0.60211 i$

Eventually you find that with $^mi$ as $m$ increases, the results start to converge on one number. This implies there’s a value for $^\infty i$. After plugging and chugging we can eventually conclude that:
$^\infty i = 0.43828 + 0.36059 i$
Convergence can be seen in both the real and imaginary components, around $m \approx 70$ if we require the difference between iterations to be less than 0.1%. (See figures below).

Each result has a different radius associated with it, since the imaginary component of the previous result puts a real term in the exponential of the next. This real term can be expressed as a factor $r$:
$e^{i\varphi} = a + bi \rightarrow i^{a+bi} = e^{i \pi/2 (a + bi)} = e^{-b\pi/2}e^{i a \pi/2} = r e^{i a \pi/2}$
The radius can be found by multiplying the result by its complex conjugate and then taking the square root (i.e., the Pythagorean theorem).

We can then plot radius of $^mi$ as a function of $m$ (see plot below).

The Python code for making these plots is:

```python
import pylab

REAL = []
IMAG = []
ITERA = []

pp = 1j
for n in range(100):
    ppL = pp       # save previous value for the cut-off test
    pp = (1j)**pp  # iterate the power tower
    RA = 0.001     # fractional difference for the cut-off
    if (abs(pp.real - ppL.real) / (ppL.real + 0.00001) < RA) and \
       (abs(pp.imag - ppL.imag) / (ppL.imag + 0.00001) < RA):  # converged
        print(n)
        print(pp)
        break
    else:
        REAL.append(pp.real)  # save real component
        IMAG.append(pp.imag)  # save imaginary component
        ITERA.append(n + 2)   # save iteration "m"

# prepend the starting point m = 1, where the value is i itself
ITERA = [1] + ITERA
REAL = [0] + REAL
IMAG = [1] + IMAG

pylab.figure(1)
pylab.xlabel("m")
pylab.ylabel("imaginary component")
pylab.plot(ITERA, IMAG, 'k')
pylab.savefig('Imfig.png')

pylab.figure(2)
pylab.xlabel("m")
pylab.ylabel("real component")
pylab.plot(ITERA, REAL, 'k')
pylab.savefig('Refig.png')
```



# Fun with Repeating Decimals

So you may have noticed that when you divide an integer by 11, if it is not a multiple of 11, you get an alternating repeating decimal that looks like “.ABABABABABABAB…”. You may have also noticed that A and B always add up to 9. Let’s discuss why that is.

So let’s divide $n$ by 11, where $n$ is an integer less than 11. You only need to consider integers less than 11, because any improper fraction can be expressed as an integer plus a fraction where the numerator is less than the denominator.
For example:  $\frac{48}{11} = 4 \frac{4}{11}$

Now if you convert that fraction into a decimal you get an alternating repeating decimal, $\frac{4}{11} = 0.363636363...$, where 3 and 6 add to 9.
This decimal can also be expressed in another form by turning the decimal places into a sum:
$\frac{4}{11} = 3 \cdot \frac{1}{10} + 6 \cdot \frac{1}{100} + 3 \cdot \frac{1}{1000} + 6 \cdot \frac{1}{10000} + ...$

We can then generalize this for any integer $n$ less than 11.
$\frac{n}{11} = A \cdot \frac{1}{10} + B \cdot \frac{1}{100} + A \cdot \frac{1}{1000} + B \cdot \frac{1}{10000} +....$
Where A and B are integers.
So now we want to prove that $A+B = 9$.

First, let’s put the above equation into summation form. The sum is infinitely long, but it can be written compactly because $A$ multiplies the odd negative powers of 10 and $B$ multiplies the even negative powers of 10. This reduces the expression to one instance of $A$ and one instance of $B$.
$\frac{n}{11} = A \cdot \sum_{j=0}^{\infty} 10^{-(2j+1)} + B \cdot \sum_{j=0}^{\infty} 10^{-(2j+2)}$

Now we want to simplify this a bit more to get rid of the infinite sum.
First let’s pull out some factors of $\frac{1}{10}$ to make the sums in each term equivalent.
$\frac{n}{11} = A \frac{1}{10} \sum_{j=0}^{\infty} 10^{-2j} + B \frac{1}{100} \sum_{j=0}^{\infty} 10^{-2j}$
$= \left( A \frac{1}{10} + B \frac{1}{100} \right) \sum_{j=0}^{\infty} 10^{-2j}$
Now the sum can be reduced to a number: $\sum_{j=0}^{\infty} 10^{-2j} = 1.01010101... = \frac{100}{99}$

So now we have:
$\frac{n}{11} = \left( A \cdot \frac{1}{10} + B \cdot \frac{1}{100} \right) \frac{100}{99}$
which can be reduced to a nice expression:
$9n = 10A + B$
From this we want to prove that $A+B=9$, where $n, A, B$ are integers.

To prove this, let’s substitute the more general relation $A+B = \mu$ and solve for $\mu$. Writing $9n = 10A + B = 9A + (A+B) = 9A + \mu$, the result is:
$n = A + \frac{\mu}{9}$
Now since $n$ and $A$ are integers, $\frac{\mu}{9}$ must also be an integer, meaning $\mu$ must be a multiple of 9. Since $A$ and $B$ can only hold one digit, the only options are 9 and 18. But $\mu =18$ only occurs when $n=11$, which corresponds to $\frac{11}{11}=1$.
So for the case $n<11$, it must be that $A+B=9$.
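The argument can be verified digit by digit with integer arithmetic; a quick sketch:

```python
# For each n from 1 to 10, pull out the first two decimal digits of n/11
# and confirm the relations derived above.
for n in range(1, 11):
    A = (10 * n) // 11        # first decimal digit of n/11
    B = (100 * n) // 11 % 10  # second decimal digit of n/11
    assert 10 * A + B == 9 * n
    assert A + B == 9
    print(n, A, B)
```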

# An Intro to Time Evolution: The Heisenberg and Schrödinger Pictures

A quantum state is just a function that describes the probabilistic nature of a particle (or particles) in terms of measurable quantities. Measurable quantities are represented by hermitian operators that act on the state to give possible values. But what happens to a quantum state over time? How does it change? How do the measurable quantities change? Here I will elaborate on what is called “time evolution”, a method of evolving states and operators.
Evolving a state to a later time, and including time dependence are done in the same way.
$\left| \Psi (x,0) \right> \rightarrow \left| \Psi (x,t) \right>$
This is the Schrödinger picture, which evolves states. In the Heisenberg picture, operators evolve. The pictures are equivalent, but are suited for different purposes. One can’t talk about one without talking about the other.
To evolve a state, we want to construct a linear operator that changes the argument of the function.
$\mathscr{U} \left| \Psi (x,0) \right> = \left| \Psi(x,t) \right>$
The operator should be unitary, to conserve probability.
$P = \left< \Psi(x,0) | \Psi(x,0) \right> = \left< \Psi(x,t) | \Psi(x,t) \right> = \left< \Psi(x,0) \left| \mathscr{U}^{\dagger} \mathscr{U} \right| \Psi(x,0) \right>$
$\therefore \mathscr{U}^{\dagger} \mathscr{U} = 1$

One way to construct this operator is to solve the time-dependent Schrödinger equation, with initial conditions imposed on it. For simplicity, let’s assume that the Hamiltonian operator $H$ has no time dependence itself.
$\frac{\partial}{\partial t} \Psi = \frac{-i}{\hbar} H \Psi$
$\ln(\Psi) = \frac{-i}{\hbar} \int H dt$
$\Psi (x,t) = e^{\frac{-i}{\hbar} H t} A$
$\Psi(x,0) = A$
$\Psi (x,t) = e^{\frac{-i}{\hbar} H t} \Psi (x,0)$
$\therefore \mathscr{U} = e^{\frac{-i}{\hbar} H t}$
The resulting operator $\mathscr{U}$ is a unitary time evolution operator.
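A small numerical sketch of this unitarity (my assumptions: NumPy and SciPy available, $\hbar = 1$, and a random Hermitian matrix standing in for $H$):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# random 4x4 Hermitian matrix standing in for the Hamiltonian (hbar = 1)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = (M + M.conj().T) / 2

t = 1.7
U = expm(-1j * H * t)  # time evolution operator

# unitarity: U† U = 1, so probability is conserved
print(np.allclose(U.conj().T @ U, np.eye(4)))  # True
```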

In this simple case, one can also use the same approach used to construct the translation operator, except it will translate through time instead of space.
$\Psi (x,t) \rightarrow \Psi (x,t + \Delta t)$
$\Psi (x, t + \Delta t) = \Psi(x,t) + \Delta t \frac{d}{dt} \Psi(x,t) + \Delta t^2 \frac{1}{2!} \frac{d^2}{dt^2} \Psi(x,t) + ... = e^{\Delta t \frac{d}{dt}} \Psi(x,t)$
$\therefore \mathscr{U} = e^{\Delta t \frac{d}{dt}} = e^{\frac{-i}{\hbar} H \Delta t}$

The Schrödinger-style time evolution can be transformed into Heisenberg-style by pulling the time evolution operators from the state $\Psi$ and attaching them to the operator $\mathscr{O}$.
$\left< \Psi (x,0) \left| \mathscr{O} \right| \Psi (x,0) \right> \rightarrow \left< \Psi (x,t) | \mathscr{O} | \Psi (x,t) \right>$
$\left< \Psi (x,t) | \mathscr{O} | \Psi (x,t) \right> = \left< \Psi (x,0) \left| e^{\frac{i}{\hbar} H t} \mathscr{O} e^{\frac{-i}{\hbar} H t} \right| \Psi (x,0) \right> = \left< \Psi (x,0) \left| \mathscr{O}(t) \right| \Psi (x,0) \right>$
The resulting time-evolved operator is then:
$\mathscr{O}(t) = e^{\frac{i}{\hbar} H t} \mathscr{O} e^{\frac{-i}{\hbar} H t}$

Taking the time derivative of this general operator will give the Heisenberg equation of motion. Plugging in an operator for $\mathscr{O}$ will give an equation describing how that operator evolves with time.
$\frac{d}{dt} \mathscr{O}(t) = \frac{i}{\hbar} e^{\frac{i}{\hbar} H t} H \mathscr{O} e^{\frac{-i}{\hbar} H t} + e^{\frac{i}{\hbar} H t} \frac{\partial \mathscr{O}}{\partial t} e^{\frac{-i}{\hbar} H t} + \frac{-i}{\hbar} e^{\frac{i}{\hbar} H t} \mathscr{O} H e^{\frac{-i}{\hbar} H t}$
The exponential operator $e^{\frac{\pm i}{\hbar} H t}$ commutes with $H$, since it is made of only $H$ operators. In the first term and last term, the exponential operator can act on the $\mathscr{O}$ to evolve it into $\mathscr{O}(t)$, as shown previously. So the expression can be reduced to:
$\frac{d}{dt} \mathscr{O}(t) = \frac{1}{i \hbar} [\mathscr{O}(t), H] + e^{\frac{i}{\hbar} H t} \frac{\partial \mathscr{O}}{\partial t} e^{\frac{-i}{\hbar} H t}$
This is the Heisenberg equation of motion.

So, let’s try out a specific operator, to see how it will evolve with time.
First, we need to define the Hamiltonian operator:
$H = \frac{p^2}{2m} + V(x)$
The first term is kinetic energy in terms of momentum, and the second term is potential energy in terms of position. A Hamiltonian can be arbitrarily more complicated, but this form is fairly general, and relatively simple.
Let’s see how the position operator evolves over time by plugging in $x$:
$\frac{d}{dt} x = \frac{1}{i \hbar} [x, H] + e^{\frac{i}{\hbar} H t} \frac{\partial x}{\partial t} e^{\frac{-i}{\hbar} H t}$
The operator $x$ has no explicit time dependence, so $\frac{\partial x}{\partial t} = 0$. So we are left with:
$\frac{d}{dt} x = \frac{1}{i \hbar} [x, H]$
$[x,H] = [x, \frac{p^2}{2m} + V(x)] = [x, p^2]/2m$
Since $V(x)$ depends only on $x$, it commutes with $x$.
$[x,V(x)] = 0$
However, $x$ does not commute with $p$. This is what gives the uncertainty principle between position and momentum.
$[x,p] = i \hbar$
$\therefore [x,p^2] = 2 i \hbar p$
Using this result we can show:
$[x,H] = i \hbar \frac{p}{m}$
$\therefore \frac{d}{dt} x = \frac{p}{m}$
The equation of motion that describes the evolution of $x$ is then:
$x(t) = \int \frac{p}{m} dt$
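The commutator algebra above can be checked symbolically. A sketch with SymPy, representing $p$ as the differential operator $-i\hbar \frac{d}{dx}$ acting on a test function $\psi(x)$:

```python
import sympy as sp

x, hbar = sp.symbols('x hbar', positive=True)
psi = sp.Function('psi')(x)

# momentum as a differential operator: p = -i*hbar*d/dx
p = lambda f: -sp.I * hbar * sp.diff(f, x)

# [x, p^2] acting on psi(x), compared against 2*i*hbar*p acting on psi(x)
commutator = x * p(p(psi)) - p(p(x * psi))
print(sp.simplify(commutator - 2 * sp.I * hbar * p(psi)))  # 0
```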

The same can be done for the momentum operator.
$\frac{d}{dt} p = \frac{1}{i \hbar} [p, H]$
$[p,H] = [p, \frac{p^2}{2m} + V(x)] = [p,V(x)] = -i \hbar \frac{\partial}{\partial x} V(x)$
$\frac{d}{dt} p = -\frac{\partial}{\partial x} V(x)$
The equation of motion that describes the evolution of $p$ is then:
$p(t) = -\int \frac{\partial}{\partial x} V(x) dt$
This corresponds to Newton’s law, and shows how momentum is linked to the potential energy $V(x)$. The derivative of potential energy with respect to distance gives force.

To make this more familiar, let’s set $V(x) = 0$, to get the free-particle scenario (no forces acting on the particle).
So now $\frac{d}{dt} p = 0$ and the equations of motion become:
$p(t) = p(0)$
$x(t) = x(0) + \frac{p(0)}{m} t$
which describes a particle moving at constant momentum (constant velocity).

# Billiard Balls and the 90-Degree Rule

If you play pool, you may have noticed that when the cue ball strikes a target ball off-center, the two balls separate at a 90-degree angle after the collision. This is because the collision is very nearly elastic and the balls have the same mass. All kinetic energy along the line connecting the two balls is transferred to the target ball, leaving the cue ball with no energy along that direction. Any remaining energy in the cue ball is along the line perpendicular to the line connecting the two balls. This is why they separate at a 90-degree angle. (Also note: if the cue ball strikes the target ball dead-center, there is no energy along the perpendicular line, and all energy is transferred to the target ball, unless there is substantial inelasticity.)

So let’s explore the simple case, where we are assuming that there is negligible spinning and friction, and that the collision is perfectly elastic. Initially, the cue ball is put into motion toward the target ball, and the target ball is stationary.
First, we use conservation of momentum to get our first equation, where $m_1$ is the mass of the cue ball, $m_2$ is the mass of the target ball, $\overrightarrow{v}$ represents their velocity vectors, with the subscripts $i$ meaning “initial”, and $f$ meaning “final”.
$m_1 \overrightarrow{v}_{1,i} + m_2 \overrightarrow{v}_{2,i} = m_1 \overrightarrow{v}_{1,f} + m_2 \overrightarrow{v}_{2,f}$   (1)
Now, since the masses of the balls are equal and the initial velocity of the second ball is 0, we can simplify the first equation to:
$\overrightarrow{v}_{1,i} = \overrightarrow{v}_{1,f} + \overrightarrow{v}_{2,f}$   (2)
Second, we use conservation of energy to get our second equation:
$\alpha \left( \frac{1}{2} m_1 v_{1,i}^2 + \frac{1}{2} m_2 v_{2,i}^2 \right) = \frac{1}{2} m_1 v_{1,f}^2 + \frac{1}{2} m_2 v_{2,f}^2$   (3)
where $\alpha$ is an elasticity factor, discussed further in the general case below. Again, we can reduce this, since the initial velocity of the second ball is 0, the masses are equal, and the collision is perfectly elastic ($\alpha = 1$).
$v_{1,i}^2 = v_{1,f}^2 + v_{2,f}^2$   (4)
Now we can take equation 2 and square it.
$v_{1,i}^2 = v_{1,f}^2 + v_{2,f}^2 + 2 \overrightarrow{v}_{1,f} \cdot \overrightarrow{v}_{2,f}$
$= v_{1,f}^2 + v_{2,f}^2 + 2 v_{1,f} v_{2,f} \cos(\theta)$   (5)
Where $\theta$ is the angle between the two balls after the collision.
Now, if you compare equations 4 and 5, you’ll see that their left sides are equal. However, the cross term $2 v_{1,f} v_{2,f} \cos(\theta)$ shows up in equation 5, but not in 4. This means it must equal 0.
So then,
$\cos(\theta) = 0$
and this can only be true if $\theta = \frac{\pi}{2}$, which is equivalent to 90 degrees!
(If the target ball is struck dead-center, then the cross term goes to 0 from $v_{1,f} = 0$.)
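The same geometry can be seen numerically. A minimal sketch (my assumptions: equal masses, a perfectly elastic collision, and $\hat{n}$ the unit vector between the ball centers at contact):

```python
import numpy as np

v1i = np.array([1.0, 0.0])                # cue ball moving along +x
n = np.array([np.cos(0.4), np.sin(0.4)])  # arbitrary off-center contact direction

v2f = np.dot(v1i, n) * n  # target ball takes the component along n
v1f = v1i - v2f           # cue ball keeps the perpendicular remainder

# the final velocities are perpendicular: their dot product vanishes
print(np.dot(v1f, v2f))   # ~0
angle = np.arccos(np.dot(v1f, v2f) /
                  (np.linalg.norm(v1f) * np.linalg.norm(v2f)))
print(np.degrees(angle))  # ~90
```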

In a more general case, where we include inelasticity and the possibility of the balls having different masses, we arrive at a more general formula for $\theta$. We’ll still assume negligible friction and spinning, and that the target ball is initially stationary.
So we still have conservation of momentum:
$m_1 \overrightarrow{v}_{1,i} = m_1 \overrightarrow{v}_{1,f} + m_2 \overrightarrow{v}_{2,f}$   (6)
And conservation of energy, but with some of the energy going to sound and heat. This is represented by an elasticity factor $\alpha$, that ranges from 1 (perfectly elastic) to 0 (perfectly inelastic).
$\alpha \frac{1}{2} m_1 v_{1,i}^2 = \frac{1}{2} m_1 v_{1,f}^2 + \frac{1}{2} m_2 v_{2,f}^2$   (7)
So we can take equation 6, and just like last time, we square it.
$m_1^2 v_{1,i}^2 = m_1^2 v_{1,f}^2 + m_2^2 v_{2,f}^2 + 2 m_1 m_2 v_{1,f} v_{2,f} \cos(\theta)$   (8)
Now divide equation 8 by $m_1$ on both sides, and you will have an expression for $m_1 v_{1,i}^2$:
$m_1 v_{1,i}^2 = m_1 v_{1,f}^2 + \frac{m_2^2}{m_1} v_{2,f}^2 + 2 m_2 v_{1,f} v_{2,f} \cos(\theta)$,
which can be plugged into the left side of equation 7.
So now, you can solve for $\theta$ in terms of the final velocities, masses, and elasticity factor.
The general result is then:
$\cos(\theta) = \frac{1-\alpha}{2 \alpha} \frac{m_1}{m_2} \frac{v_{1,f}}{v_{2,f}} + \frac{1}{2 \alpha} \left(1 - \alpha \frac{m_2}{m_1}\right) \frac{v_{2,f}}{v_{1,f}}$
We can get back the previous result, $\cos(\theta) = 0$, by setting $m_1 = m_2$ and $\alpha = 1$.
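The general formula is easy to evaluate; a sketch with illustrative (made-up) final speeds:

```python
import math

def cos_theta(m1, m2, v1f, v2f, alpha):
    """Cosine of the separation angle from the general formula."""
    term1 = (1 - alpha) / (2 * alpha) * (m1 / m2) * (v1f / v2f)
    term2 = 1 / (2 * alpha) * (1 - alpha * m2 / m1) * (v2f / v1f)
    return term1 + term2

# equal masses, perfectly elastic: recovers cos(theta) = 0, i.e. 90 degrees
print(cos_theta(1.0, 1.0, 0.5, 0.8, 1.0))  # 0.0

# a slightly inelastic collision of equal masses closes the angle below 90
print(math.degrees(math.acos(cos_theta(1.0, 1.0, 0.5, 0.8, 0.9))))
```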

# Translating a Wave Function

In algebra, or pre-calc, you learn that you can change the position of a function by modifying the argument. In quantum physics this idea is used to displace wave functions. If a function starts off at one position and moves to another position, all that is needed is a change in argument. However, quantum physics likes to use linear operators to alter functions. What would an operator that changes the argument of a function look like? In this post, I will construct a 1-D example of such an operator.

A general wave function can be written as: $\Psi (x)$, where the shape of $\Psi$ is dependent on the spatial variable $x$.
To translate a function by a distance $a$, modify the argument of $\Psi$:
To move the function right by $a$, $\Psi(x) \rightarrow \Psi (x-a)$
To move the function left by $a$, $\Psi(x) \rightarrow \Psi(x+a)$
Let’s just take the $\Psi(x+a)$ example, and without loss of generality say that $a$ can be positive or negative.
Next we can take advantage of Taylor expansions.
A function $f(x)$ can be expanded around a point $a$: $f(x) = f(a) + (x-a)\frac{d}{dx}f(a) + \frac{(x-a)^2}{2!} \frac{d^2}{dx^2}f(a) + ...$
Here, in our example, we want to expand $\Psi(x+a)$ around $x$, to express the translated function $\Psi(x+a)$ in terms of the original function $\Psi(x)$:
$\Psi(x+a) = \Psi(x) + a\frac{d}{dx}\Psi(x) + \frac{a^2}{2!} \frac{d^2}{dx^2} \Psi(x) + ...$
Note: the momentum operator is $p = -i\hbar\frac{d}{dx}$, so $\frac{d}{dx} = \frac{i}{\hbar} p$.
A more complete version of this expression would be:
$\Psi(x+a) = \sum\limits_{n=0}^{\infty} \frac{a^n}{n!} \frac{d^n}{dx^n} \Psi(x) = \Psi(x) + \sum\limits_{n=1}^{\infty} \frac{a^n}{n!} \frac{d^n}{dx^n} \Psi(x)$
This sum’s structure is similar to the Taylor expansion of the exponential function.
$e^x = 1 + x + x^2/2! + ... = \sum\limits_{n=0}^{\infty} \frac{x^n}{n!}$
Every operator in the $\Psi(x+a)$ expansion can be collected into a single exponential operator: $\Psi(x+a) = e^{a\frac{d}{dx}} \Psi(x)$
$= e^{\frac{ai}{\hbar} p} \Psi(x).$
This new operator $e^{a\frac{d}{dx}}$ can be expanded to return to what we had before: $e^{a\frac{d}{dx}} = \sum\limits_{n=0}^{\infty} \frac{a^n}{n!} \frac{d^n}{dx^n}$.
So the translated function is the original function plus a specific type of interference, with the structure $\text{Translated} = \text{Original} + \text{Interference}$.
$\Psi(x+a) = \Psi(x) + \sum\limits_{n=1}^{\infty} \frac{a^n}{n!} \frac{d^n}{dx^n} \Psi(x) = \Psi(x) + \Delta(a)\Psi(x)$
We can reduce the operators in the interference terms into an exponential in the usual way:
$\Delta(a) = \sum\limits_{n=1}^{\infty} \frac{a^n}{n!} \frac{d^n}{dx^n}$
$= e^{a \frac{d}{dx} } - 1$

The expectation value $\left< e^{a\frac{d}{dx}} \right>$ characterizes the average overlap between $\Psi(x)$ and $\Psi(x+a)$.
$\left< e^{a\frac{d}{dx}} \right> = \left< \Psi(x) \right| e^{a\frac{d}{dx}} \left| \Psi(x) \right>$
$= \left< \Psi(x) | \Psi(x+a) \right>$
$= \left< \Psi(x) \right| 1 + \Delta(a) \left| \Psi(x) \right>$
$= 1 + \left< \Delta(a) \right>$
The expectation value of the translation operator is then unity plus the expectation value of the interference operator.
$\left< \Delta (a) \right> = \sum\limits_{n=1}^{\infty} \frac{a^n}{n!} \left< \frac{d^n}{dx^n} \right>$
$= \sum\limits_{n=1}^{\infty} \frac{(ia)^n}{(\hbar)^n n!} \left< p^n \right>$
The interference expectation value is shown to be an expectation value of a function of the momentum operator.
In the case of localized waves, as $a$ gets much greater than the width of $\Psi(x)$, the interference term approaches -1, since the overlap between the original and translated wave function decreases. This is equivalent to the original and displaced wave function becoming more and more orthogonal.
$\lim\limits_{a\rightarrow \infty} \left< \Delta (a) \right> = -1$
$\lim\limits_{a\rightarrow \infty} \left< \Psi(x) | \Psi(x+a) \right> = 0$
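The vanishing overlap can be demonstrated numerically. A sketch assuming a normalized Gaussian for $\Psi$, sampled on a grid:

```python
import numpy as np

# a normalized Gaussian wave function sampled on a grid
x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]
psi = np.exp(-x**2 / 2)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)

def overlap(a):
    """<Psi(x) | Psi(x + a)> approximated on the grid."""
    shifted = np.exp(-(x + a)**2 / 2)
    shifted /= np.sqrt(np.sum(np.abs(shifted)**2) * dx)
    return np.sum(np.conj(psi) * shifted) * dx

for a in [0.0, 1.0, 5.0, 10.0]:
    print(a, overlap(a))  # the overlap falls toward 0 as a grows
```

Since $\left<\Delta(a)\right> = \left<\Psi(x)|\Psi(x+a)\right> - 1$, the same numbers show the interference expectation value drifting to $-1$.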