The Pith of Performance

Tuesday, April 20, 2021

PDQ Online Workshop, May 17-21, 2021

PDQ (Pretty Damn Quick) is a free, open source, performance analyzer available from the Performance Dynamics web site.

All modern computer systems, no matter how complex, can be thought of as a directed graph of individual buffers that hold requests until to be serviced at a shared computational resource, e.g., a CPU or disk. Since a buffer is just a queue, any computer infrastructure, from your laptop up to Facebook.com, can be represented as a directed graph of queues.

The directed arcs or arrows in such a graph correspond to workflows between different queues. In the parlance of queueing theory, a directed graph of queues is called a queueing network model. PDQ is a tool for predicting performance metrics such as, waiting time, throughput, optimal user-load.

Two major benefits of using PDQ are:

confirmation that monitored performance metrics have their expected values
predict performance for circumstances that lie beyond current measurements

Find out more about the workshop and register today.

Thursday, November 26, 2020

PDQ 7.0 is Not a Turkey

Giving Thanks for the release of PDQ 7.0, after a 5-year drought, and just in time for the PDQW workshop next week.

New Featues

The introduction of the STREAMING solution method for OPEN queueing networks. (cf. CANON, which can still be used).
The CreateMultiNode() function is now defined for CLOSED queueing networks and distinguished via the MSC device type (cf. MSO for OPEN networks).
The format of Report() has been modified to make the various types of queueing network parameters clearer.
See the R Help pages in RStudio for details.
Run the demo(package="pdq") command in the R console to review a variety of PDQ 7 models.

Maintenance Changes

The migration of Python from 2 to 3 has introduced maintenance complications for PDQ. Python 3 may eventually be accommodated in future PDQ releases. Perl maintenance has ended with PDQ release 6.2, which to remain compatible with the Perl::PDQ book (2011).

Sunday, November 8, 2020

PDQ Online Workshop, Nov 30-Dec 5, 2020

PDQ (Pretty Damn Quick) is a queueing graph performance analyzer that comes as:

free open source software package
with an online user manual

As shown in the above diagram, any modern computer system can be thought of as a directed graph of individual buffers where requests wait to be serviced at some kind of shared computational resource, e.g., a CPU or disk. Since a buffer is just a queue, any computer infrastructure, from a laptop to Facebook, can be represented as a directed graph of queues. The directed arcs or arrows correspond to workflows between the different queues. In the parlance of queueing theory, a directed graph of queues is called a queueing network model. PDQ is a tool for calculating the performance metrics, e.g., waiting time, throughput, optimal load, of such network models.

Some example PQD applications include models of:

Cloud applications
Packet networks
HTTP + VM + DB apps
PAXOS-type distributed apps

Two major benefits of using PDQ are:

confirmation that monitored performance metrics have their expected values
predict performance for circumstances that lie beyond current load-testing

This particular PDQ workshop has been requested for the Central European Time zone.

All sessions are INTERACTIVE using Skype (not canned videos)
Online sessions are usually 4 hours in duration
A typical timespan is 2pm to 6pm CET each business day
A nominal 5-10 minute bio break at 4pm CET
Attendees are encouraged to bring there own PDQ projects or data to start one

Find out more about the workshop.
Here is the REGISTRATION page.

Hope to see you at PDQW!

Tuesday, February 11, 2020

Converting Between Human Time and Absolute Time in the Shell

This is really more of a note-to-self but it may also be useful for other readers.

Converting between various time zones, including UTC time, is both a pain and error-prone. A better solution is to use absolute time. Thankfully, Unix provides such a time: the so-called Epoch time, which is the integer count of the number of seconds since January 1, 1970.

Timestamps are both very important and often overlooked: the moreso in the context of performance analysis, where time is the zeroth metric. In fact, my preferred title would have been, Death to Time Zones but that choice would probably have made it harder to find the main point here, later on, viz., how to use the Unix epoch time.

Although there are web pages that facilitate such time conversions, there are also shell commands that are often more convenient to use, but not so well known. With that in mind, here are some examples (for future reference) of how to convert between human-time and Unix time.

Examples

The following are the bash shell commands for both MacOS and Linux.

MacOS and BSD

Optionally see the human-readable date-time:


[njg]~% date
Tue Feb 11 10:04:32 PST 2020

Get the Unix integer time for the current date:


[njg]~% date +%s
1581444272

Resolve a Unix epoch integer to date-time:


[njg]~% date -r 1581444272
Tue Feb 11 10:04:32 PST 2020

Linux

Optionally see the human-readable date-time:


[njg]~% date
Tue Feb 11 10:04:32 PST 2020

Get the Unix integer time for the current date:


[njg]~% date +%s
1581444272

Resolve a Unix epoch integer to date-time:


[njg]~% date -d @1581444272
Tue Feb 11 18:04:32 UTC 2020

Monday, June 3, 2019

Book Review: How to Build a Performance Testing Stack From Scratch

Writing a technical book is a difficult undertaking. As a technical author myself, I know that writing well is both arduous and tedious. There are no shortcuts. Over the last 40 years or so, computer-based tools have been developed to help authors write printed textbooks, monographs and technical articles. LaTeX (pronounced lar-tek) reigns supreme in that realm because it's not just word processing software but, a full-blown digital typesetting application that enables authors to produce a single camera-ready PDF file (bits) that is fit for direct printing. Not only does LaTeX correctly typeset characters and mathematical symbols but it can generate the Table of Contents and the corresponding Index of terms, together with correctly cross-referenced callouts for numbered chapters, sections, figures, tables and equations.

In the meantime, over the past 20 years, the nature of the book itself has become progressively more digital. The ability to render the book-block on digital devices as an "e-book" has made printed hardcopies optional. Although purely digital e-books make reading more ubiquitous, e-book file formats and display quality varies across devices, i.e., laptops, phones tablets, and various e-readers. Any reduction in display quality is offset by virtue of e-books being able to include user interaction and even animation: features entirely beyond the printed book. In that sense, there really has been a revolution in the publishing industry: books are no longer about books, they're now about media.

Recently, I became aware of an undergraduate calculus textbook that makes powerful use of animation and audio. The author is a academic mathematician and he published it on YouTube! Is that a book or a movie? Somehow, it's a hybrid of both and, indeed, I wish I'd been able to learn from technical "books" like that. Good visuals and animations can make difficult technical concepts much easier to comprehend. Progressive as all that is, when it comes to technical e-books, I'm not aware of any single authoring tool that can match the quality of LaTeX, let alone incorporate user-interaction and animation. If such a thing did exist, I would be all over it. And Markdown doesn't cut it. But, digital authoring tools are continually evolving.

Matt Fleming (@fleming_matt on Twitter), the author of How to Build a Performance Testing Stack From Scratch, opted to use a static e-book format—not because it produces the most readable result, but because it is the best way to reach a wider audience at lower cost than a more expensive print publisher. The e-publisher in this case is (the relatively unknown) Ministry of Testing Ltd in Brighton, UK and is available on Amazon for the Kindle reader. I was also able to read it using iBooks on Mac OS X.

The range of topics covered is very extensive. I've included the Table of Contents here because it is not viewable on Amazon:

Part 1

Step 1: Identify Stakeholders
Step 2: Identify What to Measure
Step 3: Test Design
Step 4: Measuring Test Success and Failure
Step 5: Sharing Results
Part 2

Understanding Statistics
Latency
Throughput
Statistical Significance
Part 3 The Benchmark Hierarchy Picking Tests Validating Tests Part 4

Use a Performance Framework
Ensure the Test Duration is Consistent
Order Tests by Duration
Keep Reproduction Cases Small
Setup The Environment Before Each Test
Make Updating Tests Easy
Errors Should Be Fatal
Part 5

Format
Use Individual Sample Data
Detecting Results in the Noise
Outliers
Result Precision
If All Else Fails Use Test Duration
Delivering Results

The overall e-book presentation of "Performance Testing Stack" seems underdeveloped. Most topics could have been greatly expanded. But, as Matt informed me, this is probably due to the content amounting to a concatenation of previously written blog posts. As I said earlier, there are no shortcuts to writing well. The paucity of detail, however, is offset by the shear enthusiasm the Matt brings to his writing: an aspect that deserves separate acknowledgement because performance testing is a very complex subject which can otherwise appear dry and mind-boggling to the uninitiated. And it's the uninitiated that Matt wants to reach. He has written this book in order to encourage the uninitiated reader to seriously consider entering the field.

Some points that could have been developed further, include:

Plots and tables can be expanded for legibility by double clicking on them
The term "benchmark" needs better explaination
Section 2.1 discusses the Harmonic mean but there's no discussion of the Geometric mean.
Section 3.3 on Distributions does not clearly distinguish between analytic (parametric) distributions and sample distributions (which usually have no analytic form).
Section 5 (p.85) on Result Precision needs to discuss the difference between accuracy, precision, and error.

Conversely, Matt's enthusiasm may have gone a bit overboard in his choice of title. The book promises:

This book will walk you through designing and building a performance testing stack from scratch; step by step from planning and running performance tests through to understanding and analysing the results. If you’re new to performance testing or looking to expand your understanding of this topic then this book is for you!

Unfortunately, this book doesn't provide enough details to actually build a test stack—which would've been very cool. Rather, it presents a comprehensive overview of all the major concepts that one needs to absorb in order to develop a running performance testing stack. But even with the more limited scope, this book is still important because, off hand, I don't know of any other source where one can be introduced to performance testing without drowning in a sea of terminology, procedures and architectures.

Ultimately, this e-book is a great starting point for newbies, as well was being a good reminder for seasoned testers about what should be done in good performance tests.

Wednesday, January 2, 2019

DSConf 2019 Featured Talk

"Applying The Universal Scalability Law to Distributed Systems"

DSConf'19 - Distributed Systems Conference (scroll down)
Pune, India
11am IST
February 16

I'm very much looking forward to this event and I thank @ShrivedAgashe for the invitation.

Monday, June 25, 2018

Guerrilla 2018 Classes Now Open

All Guerrilla training classes are now open for registration.

GCAP: Guerrilla Capacity and Performance — From Counters to Containers and Clouds
GDAT: Guerrilla Data Analytics — Everything from Linear Regression to Machine Learning
PDQW: Pretty Damn Quick Workshop — Personal tuition for performance and capacity mgmt

The following highlights indicate the kind of thing you'll learn. Most especially, how to make better use of all that monitoring and load-testing data you keep collecting.

How to save millions of dollars with a one-line performance model (video)
How to minimize chargeback after you lift and shift to the cloud (video)
How to correctly emulate web traffic on a load-testing rig (PDF)

See what Guerrilla grads are saying about these classes. And how many instructors do you know that are available for you from 9am to 9pm (or later) each day of your class?