# 13. Random Processes

[status: partially-written]

## 13.1. Motivation, prerequisites, plan

Motivation

There is a surprising depth in understanding various aspects of random
processes, also called *stochastic processes*. The wikipedia article
mentions that:

“They have applications in many disciplines including sciences such as biology, chemistry, ecology, neuroscience, and physics as well as technology and engineering fields such as image processing, signal processing, information theory, computer science, cryptography and telecommunications. Furthermore, seemingly random changes in financial markets have motivated the extensive use of stochastic processes in finance.”

This is a staggering list of fields: random processes seem to be as pervasive in all fields of physical, biological and social science as basic math, statistics and calculus.

Prerequisites

The 10-hour “serious programming” course.

The “Data files and first plots” mini-course in Section 2.

Random number basics from Section 11.

The sections on random walks from Section 12.

Plan

So how do we write programs that study and visualize these ideas? We will:

Review random number generation.

Introduce simple poisson processes: random amplitude and random time between samples.

Examine the distribution of \(\Delta t\) in a poisson process.

Look at random placements on a 2D lattice.

Various types of time series.

Revisit random walks.

Diffusion: distribution of distance covered in a random walk after time \(t\).

Brownian motion and kinetic theory.

Progression of record peaks, and progression of sports world records.

## 13.2. Reviewing random number generation

You should quickly review the brief section on what web pages look like in Section 11 before continuing in this section.

Now let us write a few lines of code which generate random numbers and then puts them into a file which we can analyze and plot.

```
$ python3
>>> import random
>>> random.random()
>>> random.random()
>>> [...]
>>> ## now put the numbers in a file
>>> f = open('time_series.out', 'w')
>>> f.write('## time time_interval what_happened\n')
>>> t = 0
>>> for i in range(300):
>>> dt = random.random()
>>> t = t + dt
>>> f.write('%g %f EVENT!!\n' % (t, dt))
>>> f.close()
```

[not sure about the remaining paragraphs in this section; they are commented out for now]

## 13.3. Poisson processes

A poisson process is a random process in which each event’s probability is unrelated to any previous events.

Examples include throwing dice, flipping coins, radioactive decay.

We will write programs that simulate various types of poisson process.

### 13.3.1. A pure poisson process

Our first program will

### 13.3.2. An angry lightning goddess

Scenario: you live in a very unfortunately placed house. Every day Astrape, the goddess of lightning, rolls her special dice and has a probability of 0.03 (3%) to hitting that house. This should correspond to an average of one hit per month.

Let us first write a program which generates this series of events using a random number generator. Our first stab at this program will simply see that the probability was indeed close to 0.03.

```
#! /usr/bin/env python3
"""Simpole example of simulating lightning strikes with a fixed
probability every day. This first stab at a simulation simply
calculates the probability of the hit to make sure that the random
number generator performs as we expect.
"""
import random
def main():
n_days = 1000
n_hits = 0
n_misses = 0
for day in range(n_days):
r = random.random()
if r <= 0.03: ## 3% chance
n_hits += 1
else:
n_misses += 1
hit_fraction = n_hits / n_days
print('average daily hits: %g (%d days)' % (hit_fraction, n_days))
if __name__ == '__main__':
main()
```

Run the program a few times to see that you indeed got

The key mechanism used in this program was in these lines of code:

```
# ...
r = random.random()
if r <= 0.03: ## 3% chance
n_hits += 1
else:
n_misses += 1
# ...
hit_fraction = n_hits / n_days
```

Make sure you understand that the `random.random()`

call returning
less than 0.03 happens 3% of the time.

Now, following an example from Steven Pinker [FIXME: put citation to The Better Angles of Our Nature], we ask the question:

If Astrape struck your house today, which day is the most likely day for the next bolt of lightning to strike your house?

To estimate this we write the following program:

```
#! /usr/bin/env python3
import random
import math
def main():
## run this program with n_days = 50 when you want
## to eyeball the output; run it with n_days = 1000,
## then 10*1000, then 100*1000 when you want to make
## plots
n_days = 100000
delta_t_list = simulate_strikes(n_days)
## now that we have the list we print it to a file
with open('time_diffs.dat', 'w') as f:
for delta_t in delta_t_list:
f.write("%d\n" % delta_t)
print('wrote time_diffs.dat with %d delta_t values'
% len(delta_t_list))
def simulate_strikes(n_days):
"""simulates lightning strikes for a given number of
days, collecting information on the times between
strikes. returns the list of delta_t values.
"""
last_delta_t = -1
delta_t_list = []
prev_day_with_strike = -1
for day in range(n_days):
r = random.random() # a random float between 0 and 1
if r <= 0.03: # 3% chance
#print('%d: hit' % day)
if prev_day_with_strike >= 0:
## we record the delta_t of this event
last_delta_t = day - prev_day_with_strike
delta_t_list.append(last_delta_t)
prev_day_with_strike = day
return delta_t_list
if __name__ == '__main__':
main()
```

You can then plot the results of this program in a couple of different ways. We will first make a scatter plot showing the lightning events as points in a plot with lightning number and \(\Delta t\). You can see this in Figure 13.3.1.

```
##REQUIRED_FILE: time_diffs.dat
##PRE_EXEC: ./lightning-time-distribution.py
set xlabel 'lightning number'
set ylabel '{/Symbol D} t (days)'
plot 'time_diffs.dat' using 1 with points pt 4 \
ps 0.3 title 'time between lightning strikes'
```

A bit more rigor and insight can be gotten from a histogram plotting the number of lighting pairs versus the time between the two strikes. This histogram is shown in Figure 13.3.2.

```
##REQUIRED_FILE: time_diffs.dat
##REQUIRED_FILE: time_diffs.dat.hist
##PRE_EXEC: ./lightning-time-distribution.py
##PRE_EXEC: ../plotting-intermediate/int-histogram-maker.py time_diffs.dat
set grid
set xlabel '{/Symbol Delta} t (days)'
set ylabel 'frequency of that interval'
set style data histogram
set style fill solid 0.8 border -1
plot [0:] [0:] 'time_diffs.dat.hist' using 1:2 with boxes
```

First conclusion: the answer to Steven Pinker’s question is that the most likely day in which the next lightning will strike your house is tomorrow. The goddess Astrape likes to play with humans.

Another conclusion: the distribution of time between consecutive lightning strikes looks like a decaying exponential.

[…many other conclusions…]

#### 13.3.2.1. Questions

Now muse about the time distance between droughts and other natural calamities.

Examples: “Climate Change and Cultural Response” by Benson and Berry, page 105, shows distribution of mega-droughts in Figure 7.

Look in to the “palmer drought severity index” (PDSI) to find datasets. https://www.ncdc.noaa.gov/temp-and-precip/drought/historical-palmers/

PDSI is often shown from 1850 or 1870 until today. Some data sets go older, using tree ring data, and show the interesting periods from 800 to 1400, which would allow us to study the collapse of the Chaco Canyon culture.

For information about file formats and to see how pervasive obsolete data formats are, look at ftp://ftp.cdc.noaa.gov/Datasets/dai_pdsi/README

Download the .nc files at ftp://ftp.cdc.noaa.gov/Datasets/dai_pdsi/
and examine them with `ncview`

.

Or maybe better: get the ascii files from here: http://www.cgd.ucar.edu/cas/catalog/climind/pdsi.html for example http://www.cgd.ucar.edu/cas/catalog/climind/pdsisc.monthly.maps.1850-2010.fawc=1.r2.5x2.5.ipe=2.txt.gz

For Kansas look at http://www.kgs.ku.edu/Hydro/Publications/2012/OFR12_18/KGS_OF_2012-18.pdf page 10, Figure 5a: PDSI for 800 to 2000. And Figure 5b for megadroughts. Figure 5c shows the dustbowl drought.

Aha: the paleoclimatology (2000 years) might be here: https://www.ncdc.noaa.gov/paleo-search/study/19119 and a text file here: https://www1.ncdc.noaa.gov/pub/data/paleo/drought/LBDA2010/LBDA_PMDI_JJA_Recons.txt and the grid metadata is explained here: https://www1.ncdc.noaa.gov/pub/data/paleo/drought/LBDA2010/LBDA_PMDI_Instrumental_metadata.txt Note that Chaco Canyon Lat/Lon are: 36.0530 deg N, 107.9559 deg W. From the meatadat it looks like it might be grid points 1883 or 1884 in this data set.

Q: Are drought cycles random? Are they the combination of so many types of physics that they appear to be random enough? How about anthropogenic climate change? Does it add a baseline?

Topic: Atlantic multidecadal oscillation.

### 13.3.3. Vicious glow-worms

We have talked about processes which give events distributed *randomly
in time:* events happen at random times. Let us now look at processes
that generate points distributed *randomly in space:* \((x, y)\)
coordinates are spewed out by our process. An example might be where
the grains of sand land when you drop a handful onto the ground.

We can write a program to generate random \((x, y)\) points between
0 and 100. The program `random-spatial.py`

generates a series of such
points, each completely independent of the previous one.

```
#! /usr/bin/env python3
"""
Print a bunch of (x, y) points.
"""
import random
def main():
for i in range(3000):
x = random.random() * 100.0
y = random.random() * 100.0
print('%f %f' % (x, y))
if __name__ == '__main__':
main()
```

You can plot this with:

```
##REQUIRED_FILE: random-spatial.dat
##PRE_EXEC: ./random-spatial.py > random-spatial.dat
##CAPTION: Random (x, y) points. You should be able to see
##CAPTION: some structure: occasional filaments, clustering, and empty
##CAPTION: spaces.
set size ratio -1
set xlabel 'x'
set ylabel 'y'
plot 'random-spatial.dat' using 1:2 with points pt 7 \
ps 0.3 title 'random (x, y) points'
```

The results are shown in Figure 13.3.3. You can see
*features* in the data, even though it was randomly generated:
filaments, clustering, voids… 1

A possible comment: people who spend a lot of time looking at randomly generated data probably don’t easily believe in conspiracy theories.

We can then do something analogous to what we did for events with
random time intervals: plot the distribution of distances between
\((x, y)\) points. The programs `xy-to-distances.py`

and
`float-histogram-maker`

allow us to do so, and the results are in
. Note that you will not get as much insight out of these spatial
histograms as you did in , since a big factor in the distribution of
spacial distances is the small size of the *x-y* plane we used.

Before running these programs you will need to install the
`python3-scipy`

package:

```
$ sudo apt install python3-scipy
```

```
#! /usr/bin/env python3
"""Load a collection of (x, y) points and print out the distance
between each pair of points"""
import sys
import scipy
import scipy.spatial
import math
def main():
fname = sys.argv[1]
coords = []
with open(fname, 'r') as f:
lines = f.readlines()
for line in lines:
xs, ys = line.split()
(x, y) = (float(xs), float(ys))
coords.append((x, y))
print('read in %d coordinates' % len(coords))
dist_out_fname = sys.argv[1] + '.distances'
nearest_fname = sys.argv[1] + '.nearest'
with open(dist_out_fname, 'w') as fpairs:
with open(nearest_fname, 'w') as fnearest:
for i in range(len(coords)-1):
nearest_distance = sys.float_info.max
for j in range(i+1, len(coords)):
r = math.hypot(coords[j][0] - coords[i][0],
coords[j][1] - coords[i][1])
## write all pairwise distances to the fpairs file
fpairs.write('%g\n' % r)
## now see if we have a new nearest distance
if r < nearest_distance:
nearest_distance = r
## write nearest distances to a separate file
fnearest.write('%g\n' % nearest_distance)
print('wrote distances to %s' % dist_out_fname)
print('wrote nearest distances to %s' % nearest_fname)
if __name__ == '__main__':
main()
```

You can run this with:

```
$ ./random-spatial.py > random-spatial.dat
$ ./xy-to-distances.py random-spatial.dat
```

and plot the result with:

```
##REQUIRED_FILE: random-spatial.dat
##REQUIRED_FILE: random-spatial.dat.distances
##REQUIRED_FILE: random-spatial.dat.nearest
##REQUIRED_FILE: random-spatial.dat.distances.hist
##REQUIRED_FILE: random-spatial.dat.nearest.hist
##PRE_EXEC: ./random-spatial.py > random-spatial.dat
##PRE_EXEC: ./xy-to-distances.py random-spatial.dat
##PRE_EXEC: ./float-histogram-maker.py random-spatial.dat.distances
##PRE_EXEC: ./float-histogram-maker.py random-spatial.dat.nearest
set grid
set multi layout 2,1
set style data histogram
set style fill solid 0.8 border -1
set xlabel 'distance between random (x, y) points'
set ylabel 'frequency of that distance'
plot [0:] 'random-spatial.dat.distances.hist' using 1:2 with boxes
set xlabel 'nearest distance to a point'
set ylabel 'frequency of that nearest distance'
plot [0:] 'random-spatial.dat.nearest.hist' using 1:2 with boxes
```

The results are shown in Figure 13.3.4.

So we have seen that in two spatial dimensions we have a situation analogous to that for time differences: nearest points are distributed with a decaying exponential distribution.

#### 13.3.3.1. Questions

So what would the two dimensional distribution look like if the \((x, y)\) values were not purely random?

Explore this in two ways:

Change the program

`random-spatial.py`

to put all the \((x, y)\) points at integer positions.Change the program

`random-spatial.py`

to only put new points outside a “radius of avoidance” from existing points.

The “radius of avoidance” idea is what happens with the famous Waitomo Glowworm Caves in New Zeland: the worm larvae take positions on the ceiling of the cave and glow to attract prey. They avoid settling near other larvae to avoid cannibalism. The result is a distribution that does not have the same filaments and voids as a purely random distribution.

The lack of structure (filaments, voids…) in the cave was observed by biologist Stephen Jay Gould. He conjectured that the reason was this radius of avoidance. Physicist Ed Purcell carried out a calculation and visualization similar to ours to confirm Gould’s conjecture.

[cite Steven Pinker for the glowworm example]

## 13.4. Brownian motion

In 1827 Scottish botanist Robert Brown looked at pollen grains in water and noticed that the pollen was moved around in the water, jiggling randomly. This was one of the significant discoveries that led to the acceptance of the molecular theory.

In 1905 Einstein explained this motion by showing that the pollen was being buffeted around by the motion of individual water molecules.

Today we know that fluids (liquids and gasses) are made of molecules which are in constant motion. They bump in to each other, into the walls of their container, and they also bump in to larger particles floating around (such as the grain of pollen that Brown observed).

Simulating this behavior is an interesting challenge: it would involve keeping track of a large number of gas molecules and seeing when they bump in to a large sphere, which would represent the grain of pollen.

Right now I leave that more detailed programming task as an exercise, and will point that a huge number of bumps (\(10^{14}\) collisions/second) by molecules which are going in all directions means that our grain of pollen will be moving in a sort of random walk, the same kind we discussed and visualized in Section 12.

What we will do here is write some programs that simulate such random walks and then look at answering the question “where will I end up?” More specifically we ask “how far did I go?”

Let us start with one dimension. The program `multiple-walks.py`

will take several walks in one dimension and write out the final
arrival point and how far we went.

```
#! /usr/bin/env python3
"""Take many one-simensional random walks so that we can study the
distribution of how far you get in a random walk
"""
import random
import math
import sys
def main():
x = 0
n_steps = 10000
n_walks = 10
if len(sys.argv) == 3:
n_steps = int(sys.argv[1])
n_walks = int(sys.argv[2])
elif len(sys.argv) != 1:
sys.stderr.write('error - usage: %s [n_steps n_walks]\n' % sys.argv[0])
exit(1)
distances = [] # we'll store all the distances we found
## now run several random walks
for i in range(n_walks):
final_x = take_walk(x, n_steps)
distance = math.fabs(final_x) # how far did we get?
distances.append(distance)
print('%d: %g %g' % (i, final_x, distance))
def take_walk(x, n_steps):
"""take a simple 1D random walk"""
for i in range(n_steps):
## generate a step of -1 or +1 in the x direction
step_x = random.randint(0, 1) * 2 - 1
x = x + step_x
# print(i, x, math.sqrt(i), math.fabs(math.fabs(x) - math.sqrt(i)))
## return the final location
return x
main()
```

Nice, but can we have a plot?

```
#! /usr/bin/env python3
"""Take many one-simensional random walks so that we can study the
distribution of how far you get in a random walk. Keep track of the
path each walk took and plot it.
"""
import random
import math
import sys
import matplotlib.pyplot as plt
def main():
x = 0
n_steps = 10000
n_walks = 10
if len(sys.argv) == 3:
n_steps = int(sys.argv[1])
n_walks = int(sys.argv[2])
elif len(sys.argv) != 1:
sys.stderr.write('error - usage: %s [n_steps n_walks]\n' % sys.argv[0])
exit(1)
distances = [] # we'll store all the distances we found
## prepare some plot parameters
plt.xlabel('step')
plt.ylabel('distance from origin')
plt.grid(True)
## now run several random walks
for i in range(n_walks):
path = take_walk(x, n_steps)
final_x = path[-1]
distance = math.fabs(final_x) # how far did we get?
distances.append(distance)
print('%d: %g %g' % (i, final_x, distance))
plt.plot(path) # add to the plot
## now show the plot we have accumulated
plt.show()
def take_walk(x, n_steps):
"""take a simple 1D random walk"""
path = []
for i in range(n_steps):
## generate a step of -1 or +1 in the x direction
step_x = random.randint(0, 1) * 2 - 1
x = x + step_x
# print(i, x, math.sqrt(i), math.fabs(math.fabs(x) - math.sqrt(i)))
path.append(x)
## return the final location
return path
main()
```

Sure, but what if we are impatient and we want to see the plots before they are all generated. We can animate the plot using the techniques from Section 6.8.2.

```
#! /usr/bin/env python3
"""Take many one-simensional random walks so that we can study the
distribution of how far you get in a random walk. Keep track of the
path each walk took and plot it.
"""
import random
import math
import sys
import matplotlib.pyplot as plt
def main():
x = 0
n_steps = 10000
n_walks = 10
if len(sys.argv) == 3:
n_steps = int(sys.argv[1])
n_walks = int(sys.argv[2])
elif len(sys.argv) != 1:
sys.stderr.write('error - usage: %s [n_steps n_walks]\n' % sys.argv[0])
exit(1)
distances = [] # we'll store all the distances we found
## prepare to have animation
fig, ax = plt.subplots()
plt.xlabel('step')
plt.ylabel('distance from origin')
plt.grid(True)
plt.ion()
## now run several random walks
for i in range(n_walks):
path = take_walk(x, n_steps)
final_x = path[-1]
distance = math.fabs(final_x) # how far did we get?
distances.append(distance)
print('%d: %g %g' % (i, final_x, distance))
plt.plot(path) # add to the plot
fig.canvas.draw_idle()
plt.pause(0.4)
fig.canvas.draw_idle()
## now show the plot we have accumulated
plt.show()
plt.waitforbuttonpress()
def take_walk(x, n_steps):
"""take a simple 1D random walk"""
path = []
for i in range(n_steps):
## generate a step of -1 or +1 in the x direction
step_x = random.randint(0, 1) * 2 - 1
x = x + step_x
# print(i, x, math.sqrt(i), math.fabs(math.fabs(x) - math.sqrt(i)))
path.append(x)
## return the final location
return path
main()
```

Now let’s look at a crucial property of random walks: the overall
behavior of the final position and the distance from the origin. At
firs tit seems that these are all very different, with no pattern.
But if we plot a *histogram* of the final positions in the one
dimensional random walk we see something interesting:

```
#! /usr/bin/env python3
"""Take many one-simensional random walks so that we can study the
distribution of how far you get in a random walk. This version starts
from multiple-walks.py and adds the ability to plot a histogram of how
far we got.
"""
import random
import math
import sys
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
def main():
x = 0
n_steps = 1001
n_walks = 1001
if len(sys.argv) == 3:
n_steps = int(sys.argv[1])
n_walks = int(sys.argv[2])
elif len(sys.argv) != 1:
sys.stderr.write('error - usage: %s [n_steps n_walks]\n' % sys.argv[0])
exit(1)
positions = [] # we'll store all the positions we found
## prepare to plot
plt.grid(True)
plt.xlabel('final value')
plt.ylabel('number of walks')
plt.title('1D random walk arrival values (%d, %d)' % (n_steps, n_walks))
plt.ion() # enable interactive
## now run several random walks
for i in range(n_walks):
final_x = take_walk(x, n_steps)
positions.append(final_x)
# print('%d: %g %g' % (i, final_x, position))
plot_position_histogram(positions, n_steps, n_walks)
plt.show()
plt.waitforbuttonpress()
def take_walk(x, n_steps):
"""take a simple 1D random walk"""
for i in range(n_steps):
## generate a step of -1 or +1 in the x direction
step_x = random.randint(0, 1) * 2 - 1
x = x + step_x
# print(i, x, math.sqrt(i), math.fabs(math.fabs(x) - math.sqrt(i)))
## return the final location
return x
def plot_position_histogram(positions, n_steps, n_walks):
if len(positions) < 10:
return
n_steps_so_far = len(positions)
n_bins = min(51, 5+int(n_steps_so_far/3))
# print(np.array(positions))
plt.cla()
plt.grid(True)
plt.xlabel('final value')
plt.ylabel('number of walks')
plt.title('1D random walk arrival values (%d, %d)'
% (n_steps_so_far, n_walks))
n, bins, patches = plt.hist(positions, n_bins)
fig.canvas.draw_idle()
plt.pause(0.01)
main()
```

NOTE: try to superimpose a gaussian with sigma proportional to sqrt(n_steps).

Let’s visualize things together with an animated multiplot:

```
#! /usr/bin/env python3
"""Take many one-simensional random walks so that we can study the
distribution of how far you get in a random walk. This version starts
from multiple-walks.py and adds the ability to plot a histogram of how
far we got.
"""
import random
import math
import sys
import numpy as np
import matplotlib.pyplot as plt
skip_interval = 1
# skip_interval = 5
def main():
x = 0
n_steps = 1001
n_walks = 1001
if len(sys.argv) == 3:
n_steps = int(sys.argv[1])
n_walks = int(sys.argv[2])
elif len(sys.argv) != 1:
sys.stderr.write('error - usage: %s [n_steps n_walks]\n' % sys.argv[0])
exit(1)
positions = [] # we'll store all the positions we found
# (ax1, ax2, ax3, ax4) = prepare_plotting(n_steps, n_walks)
prepare_plotting(n_steps, n_walks)
## now run several random walks
last_10_walks = []
for i in range(n_walks):
take_walk(x, n_steps, last_10_walks)
if i % skip_interval == 0 or i < 20 or i > (n_walks-20):
plot_walks(last_10_walks)
final_x = last_10_walks[-1][-1] # last position of last walk
position = math.fabs(final_x) # how far did we get?
# positions.append(position)
positions.append(final_x)
# print('%d: %g %g' % (i, final_x, position))
if i % skip_interval == 0 or i < 20 or i > (n_walks-20):
plot_position_histogram(positions, n_steps, n_walks)
plt.subplot(224).figure.suptitle('1D random walks arrival values (%d, %d)'
% (i, n_walks))
plt.pause(0.000001)
plt.show()
plt.waitforbuttonpress()
def prepare_plotting(n_steps, n_walks):
fig = plt.figure(1)
fig.suptitle('1D random walks arrival values (%d, %d)'
% (n_steps, n_walks))
plt.subplots_adjust(top=0.92, bottom=0.10, left=0.10, right=0.95,
hspace=0.25, wspace=0.35)
plt.ion()
# return (ax1, ax2, ax3, ax4)
def take_walk(x, n_steps, last_10_walks):
"""take a simple 1D random walk"""
last_10_walks.append([])
for i in range(n_steps):
## generate a step of -1 or +1 in the x direction
step_x = random.randint(0, 1) * 2 - 1
x = x + step_x
last_10_walks[-1].append(x)
# print(i, x, math.sqrt(i), math.fabs(math.fabs(x) - math.sqrt(i)))
## now make sure we only keep the last 10 entries
while len(last_10_walks) > 10:
del last_10_walks[0]
return last_10_walks
def plot_walks(last_10_walks):
## plot it in the cumulative plot
ax = plt.subplot(2, 2, 1)
plt.xlabel('step')
plt.ylabel('position')
ax.plot(last_10_walks[-1])
## plot it in the "last 10" plots
if len(last_10_walks) >= 10:
ax = plt.subplot(2, 2, 2)
ax.cla()
plt.xlabel('step')
plt.ylabel('position')
for i in range(10):
ax.plot(last_10_walks[i])
ax.figure.canvas.draw_idle()
def plot_position_histogram(positions, n_steps, n_walks):
if len(positions) < 10:
return
n_steps_so_far = len(positions)
n_bins = min(51, 5+int(n_steps_so_far/3))
## plot the positions
ax = plt.subplot(2, 2, 3)
ax.cla()
ax.grid(True)
plt.xlabel('final position')
plt.ylabel('number of walks')
# plt.title('1D random walk (%d, %d)'
# % (n_steps_so_far, n_walks))
n, bins, patches = ax.hist(positions, n_bins)
ax.figure.canvas.draw_idle()
## plot the positions
distances = [math.fabs(pos) for pos in positions]
ax = plt.subplot(2, 2, 4)
ax.cla()
ax.grid(True)
plt.xlabel('distance from start')
plt.ylabel('number of walks')
# plt.title('1D random walk (%d, %d)'
# % (n_steps_so_far, n_walks))
n, bins, patches = ax.hist(distances, n_bins)
ax.figure.suptitle('1D random walks arrival values (%d, %d)'
% (n_steps_so_far, n_walks))
ax.figure.canvas.draw_idle()
main()
```

Let’s look at the 2D situation:

```
#! /usr/bin/env python3
"""Take many one-simensional random walks so that we can study the
distribution of how far you get in a random walk. This version starts
from multiple-walks-histogram.py, changes it to take two dimensional
walks, and then plots a histogram of how far we got.
"""
import random
import math
import sys
import numpy as np
import matplotlib.pyplot as plt
def main():
pt = (0, 0)
n_steps = 1000
n_walks = 1000
if len(sys.argv) == 3:
n_steps = int(sys.argv[1])
n_walks = int(sys.argv[2])
elif len(sys.argv) == 4:
n_steps = int(sys.argv[1])
n_walks = int(sys.argv[2])
output_file = sys.argv[3]
elif len(sys.argv) == 2:
output_file = sys.argv[1]
elif len(sys.argv) != 1:
sys.stderr.write('error - usage: %s [n_steps n_walks]\n' % sys.argv[0])
exit(1)
distances = [] # we'll store all the distances we found
## now run several random walks
for i in range(n_walks):
final_pt = take_walk(pt, n_steps)
distance = math.hypot(*final_pt) # how far did we get?
distances.append(distance)
# print('%d: %g %g %g' % (i, *final_pt, distance))
plot_distance_histogram(distances, n_steps, n_walks, output_file)
def take_walk(pt, n_steps):
"""take a simple 1D random walk"""
for i in range(n_steps):
## generate a step of -1 or +1 in the x direction
step_x = random.randint(0, 1) * 2 - 1
step_y = random.randint(0, 1) * 2 - 1
pt = (pt[0] + step_x, pt[1] + step_y)
# print(i, x, math.sqrt(i), math.fabs(math.fabs(x) - math.sqrt(i)))
## return the final location
return pt
def plot_distance_histogram(distances, n_steps, n_walks, output_file):
n_bins = min(50, n_walks/3.0)
# print(np.array(distances))
n, bins, patches = plt.hist(distances, n_bins)
print(bins)
plt.xlabel('distance traveled')
plt.ylabel('number of walks')
plt.title('2D random walk distance distribution (%d, %d)'
% (n_steps, n_walks))
plt.grid(True)
plt.show()
main()
```

## 13.5. Further reading

The wikipedia article https://en.wikipedia.org/wiki/Brownian_motion has nice diagrams.

Start by watching this video: https://www.youtube.com/watch?v=hy-clLi8gHg

Possibly do this experiment: https://www.youtube.com/watch?v=wf2tBAvMNbg

Other video on brownian motion: https://www.youtube.com/watch?v=NHo6LTXdFns

This one has a cute demonstration of how you get a random walk when you focus on a single particle, and is https://www.youtube.com/watch?v=pdz7wFHSLD0

Scipy has some routines that help simulate Brownian motion. The cookbook has this example: http://scipy-cookbook.readthedocs.io/items/BrownianMotion.html

## 13.6. Progression of record peaks

https://en.wikipedia.org/wiki/Athletics_record_progressions

[footnotes]

- 1
Note that the clustering is an artifact of the random generation of points; it is not due to a physical effect that clusters the points together.

## 13.7. The gambler’s fallacy

## 13.8. The gambler’s ruin

## 13.9. Link to other chapters

Simulated annealing covered in Section 25

Genetic algorithms (planned).