This document demonstrates taking some of the scaling data from Habib et al. (2013) and presenting it in ways that show the scaling performance well. These plots demonstrate the metrics outlined in "Formal Metrics for Large-Scale Parallel Performance" by Moreland and Oldfield.
First, the preliminaries. Here are the Python modules we depend on.
import numpy
import pandas
import toyplot
import toyplot.pdf
Read in the raw data file Habib2013FullTitanRaw.csv. This is timing data ripped from the [Habib2013] paper by measuring the locations of the data points in Figure 3. The measurements were made using rulers in Adobe Acrobat, so they should be about as accurate as the points are placed in the figure.
data = pandas.read_csv('Habib2013FullTitanRaw.csv')
data
Our metrics need to know the data size for each measurement. According to the paper, the strong scaling measurements were all done with a 1024^3 grid of particles. The weak scaling had 32 million particles per node.
particles = numpy.empty(len(data.index))
strong_indices = numpy.array(data['Study'] == 'Strong')
particles[strong_indices] = 1024 ** 3
weak_indices = numpy.array(data['Study'] == 'Weak')
particles[weak_indices] = 32000000*data['Number of Nodes'][weak_indices]
data['particles'] = particles
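As a quick check that is not part of the original analysis, we can confirm the particle counts were assigned as intended: every strong scaling row should hold 1024^3 (about 1.07 billion) particles, and the weak scaling rows should grow with the node count.
# Summarize the assigned particle counts per study (smallest and largest).
data.groupby('Study')['particles'].agg(['min', 'max'])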
The raw measurements are in screen pixels. When taking these measurements, I also measured the unit height (the distance between tick marks) as 28.003 pixels per decade of nanoseconds per particle. Use this to convert the plot measurement to the actual value in nanoseconds per particle.
data['nanoseconds per particle'] = 10 ** (data['Plot measure (pixels)']/28.003)
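As a worked example of the conversion (assuming, as the formula above does, that the pixel measurements are taken relative to the 1 nanosecond per particle gridline): a full tick spacing of 28.003 pixels maps to 10 nanoseconds per particle, and half a tick spacing maps to about 3.16.
# Convert one full and one half tick spacing to nanoseconds per particle.
10 ** (numpy.array([28.003, 28.003/2])/28.003)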
With the nanoseconds per particle and the number of particles, we can derive the actual time.
data['seconds'] = (data['nanoseconds per particle']*1e-9)*data['particles']
We now have enough data to compute the rate, and we can actually do it two ways. The rate is simply the inverse of the 'nanoseconds per particle' column (which we scale to particles per second). It can also be computed as the 'particles' column divided by the 'seconds' column. We compute it both ways: one is saved and the other is used to check the error, which should be very low. We compute the error as an L2 norm.
data['rate'] = 1e9/data['nanoseconds per particle']
rate_check = data['particles']/data['seconds']
rate_error = numpy.linalg.norm(data['rate']-rate_check, ord=2)
rate_error
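As an extra sanity check not in the original analysis, we can compare this error to the overall magnitude of the rates; the relative error should be no larger than floating-point round-off.
# Relative L2 error between the two rate computations.
relative_rate_error = rate_error/numpy.linalg.norm(data['rate'], ord=2)
assert relative_rate_error < 1e-10
relative_rate_error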
The cost is defined as the time taken times the number of processing elements used. The cost per unit is the cost divided by the problem size (in this case, the number of particles). Some of our other calculations also require knowing the best cost per unit.
data['cost'] = data['seconds']*data['Number of Nodes']
data['cost per unit'] = data['cost']/data['particles']
best_cost_per_unit = numpy.min(data['cost per unit'])
best_cost_per_unit
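For reference (a small addition not in the original analysis), we can also look up which measurement achieved this best cost per unit using pandas' idxmin.
# Show the row whose cost per unit is smallest.
data.loc[data['cost per unit'].idxmin()]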
The efficiency can be expressed as the best cost per unit divided by the observed cost per unit.
data['efficiency'] = best_cost_per_unit/data['cost per unit']
We can also use the ideal rate for comparison purposes. We also want the corresponding ideal value for 'nanoseconds per particle' to match the plot given in the original paper.
data['ideal rate'] = data['Number of Nodes']/best_cost_per_unit
data['ideal nanoseconds per particle'] = 1e9/data['ideal rate']
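One consequence of these definitions, not noted above but following directly from the formulas: the efficiency equals the observed rate divided by the ideal rate, so the two should agree to within round-off.
# Check that rate / ideal rate reproduces the efficiency column (L2 norm of the difference).
efficiency_check = data['rate']/data['ideal rate']
numpy.linalg.norm(data['efficiency'] - efficiency_check, ord=2)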
Make a typical plot of time with linear scaling on both axes.
time_series = data.pivot_table(index='Number of Nodes',
columns='Study',
values='seconds')
time_series
canvas = toyplot.Canvas(400, 320)
axes = canvas.axes(xscale='linear',
yscale='linear',
xlabel='Number of Nodes',
ylabel='Time (seconds)')
axes.x.ticks.locator = toyplot.locator.Explicit(numpy.arange(0,2**14+2**12,2**12))
axes.x.ticks.show = True
axes.y.ticks.show = True
x = time_series.index.values
y = numpy.column_stack((time_series['Weak'], time_series['Strong']))
axes.plot(x, y, marker='o', size=40)
axes.text(16384, time_series['Weak'][16384], 'Weak Scaling',
style={'text-anchor':'end', 'baseline-shift':'-80%'},
angle=13.5)
axes.text(8192+300, time_series['Strong'][8192], 'Strong Scaling',
style={'text-anchor':'start'})
Save the plot as HabibTime.pdf.
toyplot.pdf.render(canvas, 'HabibTime.pdf')
Here we reproduce the original plot in the paper. The one difference is what we use as the ideal nanoseconds per particle. We derive it from the ideal rate as measured from the algorithm (the best observed cost per unit). I believe the paper estimated the ideal based on the maximum FLOP rate of the computer vs. the FLOP rate achieved while running the algorithm.
original_series = data.pivot_table(index='Number of Nodes',
columns='Study',
values='nanoseconds per particle',
aggfunc=numpy.mean)
original_series
canvas = toyplot.Canvas(400, 320)
axes = canvas.axes(xscale='log2',
yscale='log10',
xlabel='Number of Nodes',
ylabel='Nanoseconds per Particle')
axes.x.ticks.locator = toyplot.locator.Explicit(2 ** numpy.arange(5,15))
axes.y.ticks.locator = toyplot.locator.Explicit(10 ** numpy.arange(-1,2, dtype=numpy.float64))
axes.x.ticks.show = True
axes.y.ticks.show = True
x = original_series.index.values
y = numpy.column_stack((original_series['Weak'], original_series['Strong']))
axes.plot(x, y, marker='o', size=40)
axes.text(16384, original_series['Weak'][16384], 'Weak Scaling',
style={'text-anchor':'end', 'baseline-shift':'80%'},
angle=-32)
axes.text(8192, original_series['Strong'][8192], 'Strong Scaling',
style={'baseline-shift':'80%'},
angle=-22.5)
# The easiest way to plot the ideal curve is to sort the original data by the number of
# nodes (the x axis) and plot that column directly.
ideal_order = numpy.argsort(data['Number of Nodes'])
axes.plot(data['Number of Nodes'][ideal_order],
data['ideal nanoseconds per particle'][ideal_order],
style={'stroke':'gray', 'stroke-width':0.5, 'stroke-dasharray':'5,5'})
axes.text(8192, 0.198, 'Ideal',
style={'baseline-shift':'-75%', 'color':'gray'},
angle=-34.5)
Save the plot as HabibTimePerParticle.pdf.
toyplot.pdf.render(canvas, 'HabibTimePerParticle.pdf')
Now make the same kind of plot using the rate.
rate_series = data.pivot_table(index='Number of Nodes',
columns='Study',
values='rate')
rate_series
data['ideal rate']
canvas = toyplot.Canvas(400, 320)
axes = canvas.axes(xscale='linear',
yscale='linear',
xlabel='Number of Nodes',
ylabel='Rate (billion particles/second)')
axes.x.ticks.locator = toyplot.locator.Explicit(numpy.arange(0,2**14+2**12,2**12))
axes.x.ticks.show = True
axes.y.ticks.show = True
x = rate_series.index.values
y = numpy.column_stack((rate_series['Weak']/1e9, rate_series['Strong']/1e9))
axes.plot(x, y, marker='o', size=40)
axes.text(16384, rate_series['Weak'][16384]/1e9, 'Weak Scaling',
style={'text-anchor':'end', 'baseline-shift':'80%'},
angle=21)
axes.text(8192+300, rate_series['Strong'][8192]/1e9, 'Strong Scaling',
style={'text-anchor':'start'})
# The easiest way to plot the ideal curve is to sort the original data by the number of
# nodes (the x axis) and plot that column directly.
ideal_order = numpy.argsort(data['Number of Nodes'])
axes.plot(data['Number of Nodes'][ideal_order],
data['ideal rate'][ideal_order]/1e9,
style={'stroke':'gray', 'stroke-width':0.5, 'stroke-dasharray':'5,5'})
axes.text(16384, 10.1, 'Ideal',
style={'text-anchor':'end', 'baseline-shift':'-75%', 'color':'gray'},
angle=35.5)
Save the plot as HabibRate.pdf.
toyplot.pdf.render(canvas, 'HabibRate.pdf')
Finally, make the same kind of plot for the efficiency.
efficiency_series = data.pivot_table(index='Number of Nodes',
columns='Study',
values='efficiency')
efficiency_series
canvas = toyplot.Canvas(400, 320)
axes = canvas.axes(xscale='linear',
yscale='linear',
xlabel='Number of Nodes',
ylabel='Efficiency')
axes.x.ticks.locator = toyplot.locator.Explicit(numpy.arange(0,2**14+2**12,2**12))
axes.x.ticks.show = True
axes.y.ticks.show = True
axes.x.domain.min = 0
x = efficiency_series.index.values
y = numpy.column_stack((efficiency_series['Weak'], efficiency_series['Strong']))
axes.plot(x, y, marker='o', size=40)
axes.text(16384, efficiency_series['Weak'][16384], 'Weak Scaling',
style={'text-anchor':'end', 'baseline-shift':'80%'},
angle=-12.5)
axes.text(8192+300, efficiency_series['Strong'][8192], 'Strong Scaling',
style={'text-anchor':'start'})
Save the plot as HabibEfficiency.pdf.
toyplot.pdf.render(canvas, 'HabibEfficiency.pdf')