Scaling from [Habib2013]

This document takes some of the scaling data from Habib et al. (2013) and presents it in a way that shows the scaling performance clearly. The plots demonstrate the metrics outlined in "Formal Metrics for Large-Scale Parallel Performance" by Moreland and Oldfield.

Loading the Data

First the preliminaries. Here are the Python modules we depend on.

In [1]:
import numpy
import pandas
import toyplot.pdf

Read in the raw data file Habib2013FullTitanRaw.csv. This is timing data extracted from the [Habib2013] paper by measuring the positions of the data points in its Figure 3. The measurements were made with the ruler tool in Adobe Acrobat, so they should be as accurate as the points are placed in the figure.

In [2]:
data = pandas.read_csv('Habib2013FullTitanRaw.csv')
data
Out[2]:
     Study  Number of Nodes  Plot measure (pixels)
0   Strong               32                 50.632
1   Strong               64                 43.277
2   Strong              128                 36.018
3   Strong              256                 28.098
4   Strong              512                 19.848
5   Strong             1024                 13.577
6   Strong             2048                  9.617
7   Strong             4096                  7.826
8   Strong             8192                  2.734
9     Weak              128                 30.879
10    Weak              256                 24.892
11    Weak              512                 16.877
12    Weak             1024                  6.601
13    Weak             2048                  1.367
14    Weak             4096                 -6.647
15    Weak             8192                -15.888
16    Weak            16384                -21.969

Our metrics need to know the data size for each measurement. According to the paper, the strong scaling measurements were all done with a 1024^3 grid of particles. The weak scaling had 32 million particles per node.

In [3]:
particles = numpy.empty(len(data.index))

strong_indices = numpy.array(data['Study'] == 'Strong')
particles[strong_indices] = 1024 ** 3

weak_indices = numpy.array(data['Study'] == 'Weak')
particles[weak_indices] = 32000000*data['Number of Nodes'][weak_indices]

data['particles'] = particles
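
As an optional sanity check, we can confirm that the particle counts match the paper's description: every strong-scaling row should hold 1024^3 particles, and every weak-scaling row 32 million particles times its node count. (This cell is a supplementary check; the asserts simply restate the assignments above.)

In [ ]:
# Optional sanity check: particle counts should match the paper's description.
assert (data['particles'][strong_indices] == 1024 ** 3).all()
assert (data['particles'][weak_indices] ==
        32000000*data['Number of Nodes'][weak_indices]).all()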

The raw measurements are in screen pixels as captured from the figure. While taking them, I also measured the unit height (the distance between tick marks) as 28.003 pixels per decade of nanoseconds per particle. Use this to convert the plot measure to the actual value in nanoseconds per particle.

In [4]:
data['nanoseconds per particle'] = 10 ** (data['Plot measure (pixels)']/28.003)
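
As a quick spot check, the 32-node strong-scaling point was measured at 50.632 pixels, so it should convert to about 10^(50.632/28.003), roughly 64.3 nanoseconds per particle.

In [ ]:
# Spot check: 50.632 pixels converts to roughly 64.3 ns per particle.
10 ** (50.632/28.003)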

With the nanoseconds per particle and the number of particles, we can derive the actual run time.

In [5]:
data['seconds'] = (data['nanoseconds per particle']*1e-9)*data['particles']
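
Continuing the spot check, roughly 64.3 ns per particle over 1024^3 particles works out to about 69 seconds for that 32-node strong-scaling run.

In [ ]:
# Spot check: ~64.3 ns/particle over 1024^3 particles is about 69 seconds.
(64.3*1e-9)*1024 ** 3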

Derived Metrics

We now have enough data to compute the rate, and we can do so in two ways. The rate is simply the inverse of the 'nanoseconds per particle' column (scaled to particles per second), or it can be computed as the 'particles' column divided by the 'seconds' column. We compute it both ways: one is saved, and the other is used to check the error, which should be very small. We compute the error as an L2 norm.

In [6]:
data['rate'] = 1e9/data['nanoseconds per particle']

rate_check = data['particles']/data['seconds']

rate_error = numpy.linalg.norm(data['rate']-rate_check, ord=2)
rate_error
Out[6]:
2.7677678497145637e-07
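
For context, this is an absolute error between rates that are on the order of billions of particles per second, so the relative error is down near machine precision.

In [ ]:
# Relative error between the two rate computations (should be near machine epsilon).
rate_error/numpy.linalg.norm(rate_check, ord=2)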

The cost is defined as the time taken multiplied by the number of processing elements used. The cost per unit is the cost divided by the problem size (in this case, the number of particles). Some of our other calculations also require knowing the best cost per unit.

In [7]:
data['cost'] = data['seconds']*data['Number of Nodes']
data['cost per unit'] = data['cost']/data['particles']
best_cost_per_unit = numpy.min(data['cost per unit'])
best_cost_per_unit
Out[7]:
1.6214861797231084e-06
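
If we want to know which run sets this bound, pandas can report the row with the smallest cost per unit; it turns out to be the 128-node weak-scaling run (the run with an efficiency of 1 in the table computed later).

In [ ]:
# Which configuration achieves the best cost per unit?
data.loc[data['cost per unit'].idxmin()]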

The efficiency can be expressed as the best cost per unit divided by the observed cost per unit.

In [8]:
data['efficiency'] = best_cost_per_unit/data['cost per unit']
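
As a worked example, the 8192-node strong-scaling run takes about 1.34 seconds on 1024^3 particles, for a cost per unit of roughly 1.03e-5 node-seconds per particle; dividing the best cost per unit by that gives an efficiency of about 0.16, matching the efficiency table shown later. (The variable below is introduced only for this illustration.)

In [ ]:
# Worked example: efficiency of the 8192-node strong-scaling run.
cost_per_unit_8192 = (1.344412*8192)/1024 ** 3   # ~1.03e-5 node-seconds per particle
best_cost_per_unit/cost_per_unit_8192            # ~0.16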

We can also use the ideal rate for comparison purposes, along with the ideal value of 'nanoseconds per particle', the quantity plotted in the original paper.

In [9]:
data['ideal rate'] = data['Number of Nodes']/best_cost_per_unit
data['ideal nanoseconds per particle'] = 1e9/data['ideal rate']
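
As a quick check, the ideal rate at 128 nodes is 128 divided by the best cost per unit, about 7.9e7 particles per second, which is exactly the measured rate of the most efficient run (the 128-node weak-scaling run).

In [ ]:
# Check: the ideal rate at 128 nodes, about 7.9e7 particles per second.
128/best_cost_per_unit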

Standard Time Plot

Make a typical plot of time with linear scaling on both axes.

In [10]:
time_series = data.pivot_table(index='Number of Nodes',
                               columns='Study',
                               values='seconds')
time_series
Out[10]:
Study Strong Weak
Number of Nodes
32 69.022687 NaN
64 37.700007 NaN
128 20.754831 51.887558
256 10.821622 63.429999
512 5.491375 65.630448
1024 3.278995 56.386440
2048 2.367704 73.332439
4096 2.043478 75.882652
8192 1.344412 70.985908
16384 NaN 86.108660
In [11]:
canvas = toyplot.Canvas(400, 320)
axes = canvas.axes(xscale='linear',
                   yscale='linear',
                   xlabel='Number of Nodes',
                   ylabel='Time (seconds)')
axes.x.ticks.locator = toyplot.locator.Explicit(numpy.arange(0,2**14+2**12,2**12))
axes.x.ticks.show = True
axes.y.ticks.show = True

x = time_series.index.values
y = numpy.column_stack((time_series['Weak'], time_series['Strong']))
axes.plot(x, y, marker='o', size=40)
axes.text(16384, time_series['Weak'][16384], 'Weak Scaling',
          style={'text-anchor':'end', 'baseline-shift':'-80%'},
          angle=13.5)
axes.text(8192+300, time_series['Strong'][8192], 'Strong Scaling',
          style={'text-anchor':'start'})
Out[11]:
<toyplot.mark.Text at 0x10960ffd0>
[Plot: Time (seconds) versus Number of Nodes, with the weak scaling and strong scaling curves labeled.]

Save the plot as HabibTime.pdf.

In [12]:
toyplot.pdf.render(canvas, 'HabibTime.pdf')

Original Plot

Here we reproduce the original plot from the paper. The one difference is how the ideal nanoseconds per particle is determined: we derive it from the ideal rate as measured from the algorithm itself. I believe the paper estimated the ideal from the machine's peak FLOP rate versus the FLOP rate achieved while running the algorithm.

In [13]:
original_series = data.pivot_table(index='Number of Nodes',
                                   columns='Study',
                                   values='nanoseconds per particle',
                                   aggfunc=numpy.mean)
original_series
Out[13]:
Study Strong Weak
Number of Nodes
32 64.282386 NaN
64 35.110867 NaN
128 19.329443 12.667861
256 10.078421 7.742920
512 5.114241 4.005765
1024 3.053802 1.720778
2048 2.205096 1.118964
4096 1.903137 0.578939
8192 1.252081 0.270790
16384 NaN 0.164239
In [14]:
canvas = toyplot.Canvas(400, 320)
axes = canvas.axes(xscale='log2',
                   yscale='log10',
                   xlabel='Number of Nodes',
                   ylabel='Nanoseconds per Particle')
axes.x.ticks.locator = toyplot.locator.Explicit(2 ** numpy.arange(5,15))
axes.y.ticks.locator = toyplot.locator.Explicit(10 ** numpy.arange(-1,2, dtype=numpy.float64))
axes.x.ticks.show = True
axes.y.ticks.show = True

x = original_series.index.values
y = numpy.column_stack((original_series['Weak'], original_series['Strong']))
axes.plot(x, y, marker='o', size=40)
axes.text(16384, original_series['Weak'][16384], 'Weak Scaling',
          style={'text-anchor':'end', 'baseline-shift':'80%'},
          angle=-32)
axes.text(8192, original_series['Strong'][8192], 'Strong Scaling',
          style={'baseline-shift':'80%'},
          angle=-22.5)

# The easiest way to plot the ideal curve is to sort the original data by the number of
# nodes (the x axis) and plot that column directly.
ideal_order = numpy.argsort(data['Number of Nodes'])
axes.plot(data['Number of Nodes'][ideal_order],
          data['ideal nanoseconds per particle'][ideal_order],
          style={'stroke':'gray', 'stroke-width':0.5, 'stroke-dasharray':'5,5'})
axes.text(8192, 0.198, 'Ideal',
          style={'baseline-shift':'-75%', 'color':'gray'},
          angle=-34.5)
Out[14]:
<toyplot.mark.Text at 0x109651790>
[Plot: Nanoseconds per Particle versus Number of Nodes on logarithmic axes, with the weak scaling, strong scaling, and ideal curves labeled.]

Save the plot as HabibTimePerParticle.pdf.

In [15]:
toyplot.pdf.render(canvas, 'HabibTimePerParticle.pdf')

Rate Plot

In [16]:
rate_series = data.pivot_table(index='Number of Nodes',
                               columns='Study',
                               values='rate')
rate_series
Out[16]:
Study Strong Weak
Number of Nodes
32 1.555636e+07 NaN
64 2.848121e+07 NaN
128 5.173455e+07 7.893993e+07
256 9.922189e+07 1.291502e+08
512 1.955324e+08 2.496402e+08
1024 3.274606e+08 5.811326e+08
2048 4.534950e+08 8.936836e+08
4096 5.254482e+08 1.727299e+09
8192 7.986704e+08 3.692902e+09
16384 NaN 6.088679e+09
In [17]:
data['ideal rate']
Out[17]:
0     1.973498e+07
1     3.946996e+07
2     7.893993e+07
3     1.578799e+08
4     3.157597e+08
5     6.315194e+08
6     1.263039e+09
7     2.526078e+09
8     5.052155e+09
9     7.893993e+07
10    1.578799e+08
11    3.157597e+08
12    6.315194e+08
13    1.263039e+09
14    2.526078e+09
15    5.052155e+09
16    1.010431e+10
Name: ideal rate, dtype: float64
In [18]:
canvas = toyplot.Canvas(400, 320)
axes = canvas.axes(xscale='linear',
                   yscale='linear',
                   xlabel='Number of Nodes',
                   ylabel='Rate (billion particles/second)')
axes.x.ticks.locator = toyplot.locator.Explicit(numpy.arange(0,2**14+2**12,2**12))
axes.x.ticks.show = True
axes.y.ticks.show = True

x = rate_series.index.values
y = numpy.column_stack((rate_series['Weak']/1e9, rate_series['Strong']/1e9))
axes.plot(x, y, marker='o', size=40)
axes.text(16384, rate_series['Weak'][16384]/1e9, 'Weak Scaling',
          style={'text-anchor':'end', 'baseline-shift':'80%'},
          angle=21)
axes.text(8192+300, rate_series['Strong'][8192]/1e9, 'Strong Scaling',
          style={'text-anchor':'start'})

# The easiest way to plot the ideal curve is to sort the original data by the number of
# nodes (the x axis) and plot that column directly.
ideal_order = numpy.argsort(data['Number of Nodes'])
axes.plot(data['Number of Nodes'][ideal_order],
          data['ideal rate'][ideal_order]/1e9,
          style={'stroke':'gray', 'stroke-width':0.5, 'stroke-dasharray':'5,5'})
axes.text(16384, 10.1, 'Ideal',
          style={'text-anchor':'end', 'baseline-shift':'-75%', 'color':'gray'},
          angle=35.5)
Out[18]:
<toyplot.mark.Text at 0x10967ee50>
[Plot: Rate (billion particles/second) versus Number of Nodes, with the weak scaling, strong scaling, and ideal curves labeled.]

Save the plot as HabibRate.pdf.

In [19]:
toyplot.pdf.render(canvas, 'HabibRate.pdf')

Efficiency Plot

In [20]:
efficiency_series = data.pivot_table(index='Number of Nodes',
                                     columns='Study',
                                     values='efficiency')
efficiency_series
Out[20]:
Study Strong Weak
Number of Nodes
32 0.788263 NaN
64 0.721592 NaN
128 0.655366 1.000000
256 0.628465 0.818029
512 0.619244 0.790602
1024 0.518528 0.920213
2048 0.359051 0.707566
4096 0.208010 0.683787
8192 0.158085 0.730956
16384 NaN 0.602582
In [21]:
canvas = toyplot.Canvas(400, 320)
axes = canvas.axes(xscale='linear',
                   yscale='linear',
                   xlabel='Number of Nodes',
                   ylabel='Efficiency')
axes.x.ticks.locator = toyplot.locator.Explicit(numpy.arange(0,2**14+2**12,2**12))
axes.x.ticks.show = True
axes.y.ticks.show = True
axes.x.domain.min = 0

x = efficiency_series.index.values
y = numpy.column_stack((efficiency_series['Weak'], efficiency_series['Strong']))
axes.plot(x, y, marker='o', size=40)
axes.text(16384, efficiency_series['Weak'][16384], 'Weak Scaling',
          style={'text-anchor':'end', 'baseline-shift':'80%'},
          angle=-12.5)
axes.text(8192+300, efficiency_series['Strong'][8192], 'Strong Scaling',
          style={'text-anchor':'start'})
Out[21]:
<toyplot.mark.Text at 0x10967eb10>
[Plot: Efficiency versus Number of Nodes, with the weak scaling and strong scaling curves labeled.]

Save the plot as HabibEfficiency.pdf.

In [22]:
toyplot.pdf.render(canvas, 'HabibEfficiency.pdf')