The Craft of Scientific Illustration
The Good, the Bad, and the Ugly
Scientific Illustration is a rich and wide field for
creative activity, ranging from artistic qualitative mixed-media drawings
visualizing concepts of the directly accessible or non-accessible natural world
to the quantitative, highly technical plotting of (numerical) data. This site
is restricted to the latter aspect, to the presentation and plotting of
scientific data and it is, almost necessarily, strongly personally biased.
The body of literature on scientific graphing is considerable; for me, it was
mostly of limited usefulness at best. Most books treat the graphical
presentation of scientific data only in a side chapter within the framework of
scientific publishing. Time spent on these chapters is mostly wasted.
One source I found pleasantly out of the ordinary was E.R. Tufte's
The Visual Display of Quantitative Information (Graphics Press,
Connecticut 2001).
The times of hand-drawing figures containing scientific data are gone for good.
The quality of the illustrations has, unfortunately, not improved,
despite the very powerful and versatile computer plotting programs
available, frequently even at no cost. Surprisingly enough, there are still
some books on the market containing high-quality hand-drawn figures. It is
questionable, though, if they really will serve as models of what to strive
for. I mention some of them for those who have
no idea what I am getting at here. One of them is Building Scientific Apparatus
(2nd Ed.) by Moore, Davis, and Coplan, published by
Addison-Wesley; it contains a large number of
engineering-type ink drawings. The classic Gravitation by Misner, Thorne, and Wheeler, published by Freeman, contains
a large number of tasteful drawings, from sketches setting the stage to
detailed quantitative science plots. More of the culinary type, admittedly, are the
pen-and-ink drawings in the 1997 edition of The Joy of Cooking by
Rombauer et al., published by Scribner; the figures in this book can easily compete
with top-notch scientific illustrations in biology. Eric Sloane's Book of
Storms and other meteorologically oriented books by the same author contain
tasteful pen-and-ink illustrations that make his books (reissued by Dover)
worthwhile investments. I think there is no chance of capturing the flair and the
beauty of the illustrations and plots in the above-mentioned example books with
any computerized illustration package. Hence, as with any craft: merely
having the tools of the trade at hand does not inevitably lead to high-quality
work. Unfortunately, few people seem to be bothered by the deplorable state of
scientific illustration in the literature and, even worse, neither do most
publishers.
Illustrating scientific ideas and producing figures with scientific data was a
respected profession of its own in the past. The recent streamlining
of work flows has driven these handcrafters almost to extinction. Even if
scientific illustration (the kind we discuss on this page) is not really an art
in itself, it is (or at least was) a highly developed handcraft. As it is,
handcrafts will never be replaced by technical tools and gadgets; it takes
human skills (this is where the art lies) that need to be constantly developed
and exercised; the technical tools can make life easier and speed up the
preparation process by allowing for extensive experimentation.
We as casual illustrators (most often not even by choice) cannot be expected to
reach the level of craftsmanship of the professionals; following a few simple
guidelines though can easily and efficiently reduce the number of bad and ugly
figures that make it into scientific papers.
Examples are the obvious way to quickly
illustrate virtues and aesthetic/conceptual crimes in scientific illustration.
As it is much easier to point one's finger at bad examples of others than to do
better oneself, this is the avenue we choose:
The Good: The illustrations in Icko Iben Jr.'s astrophysical articles
were of high class over many years. Seemingly, he had a very good scientific
illustrator at hand. The semi-qualitative figure shown here has a well-balanced
placement of labels and lines. The height of the numbers and the
weight of the fonts are just right. Another example of semi-qualitative
illustration of Iben flavor is the Hertzsprung-Russell (HR) diagram with
an extreme range in effective temperature and luminosity. Another HR diagram that appeared in various Iben publications proves that even extensive labeling and
actually folding two plots into one can be done in a comprehensible and even
attractive way.
To criticize Iben's style just a little bit, one
might point to the predominantly used thick pencil style of the lines
in plots containing direct simulation data as well as in the (semi-)qualitative
illustrative figures. This means that it is not always clear whether the thick
lines are already fits to somewhat noisy numerical data or whether the
computed results were really of the quality presented. As with
observational data, in numerical data plots I also prefer to be informed about
the uncertainties, even if they do not contribute to the aesthetic value.
After all, scientific illustration should inform in an objective way.
This is an example of what I call OldSpringer
style of scientific illustration. I confess that I do not know if it really
can be attributed to Springer. However, mainly in the first part of the 20th
century, most of the important Springer Verlag
publications had the same style (such as the physics journals, the Zeitschrift für Astrophysik, the Handbuch der Physik with its dozens of
volumes...). Even today many of the published engineering drawings by BOSCH are
quite close to this OldSpringer style. The
illustration style was not confined, however, to Springer publications; it can
also be found in the old technical Soviet literature. So, if anybody knows more
than I do on this particular scientific drawing style, I would appreciate a
note to complete or change my fragmentary knowledge. The color-magnitude
diagram of the Hyades on the left is clearly hand drawn. The
weight of the lines relative to the symbols is well balanced; the same is true
for the filled and open symbols. The coordinate grid is slightly less heavy
than the coordinate frame; neither dominates nor disturbs the general picture
of the data points, which are the important information, and both support the
quantitative evaluation of the data by a reader. The size of the labeling as
well as the size of the axis titles lead to a balanced overall impression. The
slanted font adds some personality to the figure, which an upright font would
not really provide. I was told by a scientific illustrator that choosing a
slanted font when hand-labeling plots was the way to go, as it hides
small deviations from a common slanting angle between neighboring letters
and/or digits much better. With upright fonts, such small deviations from the vertical
catch the eye immediately.
Another nice example of a multi-line plot in OldSpringer
style is the wavelength vs. a Balmer-jump parameter diagram. Like the last plot, this
one is from one of the astronomical volumes of the Handbuch
der Physik, the 1958
edition. As pointed out for the last plot, the relative sizes of labels and
texts, as well as the relative weights of the lines, are just right to result in
a harmonious appearance of the admittedly dense information; o.k., the
underlying grid could be drawn with a finer line weight or, nowadays, in a
light grey.
A decently planned and executed Hertzsprung-Russell diagram - mainly with references to
pulsating variable stars and with selected stars' evolution tracks underlaid. Despite the many lines and many data points
(representing observed pulsating variables), the figure does not look overly
crowded. The choice of only a few sans-serif fonts
helps to discreetly label important features without drowning the plot.
The Bad: The Metallicity - Age
relationship for stars plotted in the figure to the left (enlarge to view!)
shows an example of an awkward choice of plot symbols for the data. One of the
main distractions in this plot is the size of the symbols; does the size mean
anything? Is it a measure of the accuracy of the data? The text did not mention
anything. Hence, why on earth are the symbols that large? Due to their size and
the large number of data points there is considerable overlap between the
symbols so that the impression of an overcrowded plot is intensified. Smaller,
filled symbols - not really pentagons - would have served much better.
The labeling of the plot is about as oversized as the symbols themselves.
The labels are so large that the angular outlines of the Hershey fonts are
visible. Nowadays, Hershey fonts compare to Postscript fonts the way Hershey
chocolate compares to Swiss chocolate; ...well, it's all a matter of taste, in
the end; and to be fair, the Hershey fonts served scientific plotting well in the 80s.
The pale green background of the plot is not the fault of the authors; it is a
bad habit enforced by editors on the writers of mostly semipopular
articles. Seemingly, the prejudice that color-underlaid
figures make articles more interesting gained a foothold; most of the time,
however, they are a nuisance and boring articles remain boring.
A SPIE reference book on optomechanical analyses
(published in 2002) featured numerous very thick pencil illustrations of the
kind shown on the left. The plan of the plot is a good one: show thin grid lines
for the reader to get numbers from the curves. But why on earth are the curves that
heavy? Furthermore, the serifed font used for the
labeling does not make the plot any lighter. A slim and taller font would have
improved the overall appearance of the figure. Unfortunately, the whole book is
full of figures of that kind. Too sad; after all, we are living in a desktop
publishing age when drawings can be quickly modified with a few mouse clicks
rather than hours of laborious and tricky hand (re)drawing.
This is a typical example of a thick pencil plot (as published in a
conference proceedings volume on reactor safety). Despite the lines connecting
the data being thicker than those of the frame - as advocated - the plot does
not stand for what I mean. The lines of the simulation results are so
thick that they obscure the situation in regions where they overlap. Arrows with
associated numbers referring to the computation parameters are added to the
figure. It remains mostly unclear which curves these arrows point to. Hence,
the whole exercise is useless. The major tick-marks are just fat minor ones,
giving the whole presentation a rather clumsy aftertaste. The encircled numbers
in the plot are clearly sans-serif; the rest is set
in a Times font. Independent of the (to me) inappropriate "Times"
labeling, this change of typeface within the plot does not help to beautify it.
Is it conceivable that the author of this figure intended to demonstrate good
agreement between two sets of measurements, or the smoothness of the data, or
what? In any case, the chosen ranges of the ordinate in particular and
of the abscissa bear no relation to the range of the measured data. If
the key had to be placed inside the figure, then this could have been done in a
more space-conserving fashion, in particular if the ordinate range had
been chosen more prudently. Last but not least, the choice of symbols does not
seem very clever in this case: the white squares hide the black ones over most
of the measured abscissa range. By the way...this example was not made up; it
was really published in peer-reviewed conference proceedings on experimental
fluid dynamics.
The Ugly: I
think there is not much to be said about this plot; just DO NOT do it this way - your
mother will not be proud of you. O.K. the scanning degraded the plot slightly,
but only slightly. The letters in the "all bubbles detached" comment
were bleeding into each other already in the publication itself (an AIP
Conference Proceedings volume on hydrodynamics, by the way). Why on earth is
the frame of the coordinate box so heavy? Is it that important, or has anybody
died? The arrows are much too heavy. The symbols are also too heavy and too
fat. The coordinate grid is good per se and in weight, if the author was
really interested in providing quantitative information.
The scan on the left shows two contour-plots with
labels on the contours. The figure is from the same proceedings volume as
mentioned above. The grey scale is pretty useless; most probably the figure was in
color on the computer screen. The white boxes around the contour labels
disturb the picture. The choice of values and the density of labels
together make the whole thing incomprehensible. There is no sign of either thought
or care having gone into this figure. The final verdict on this one: `just
don't spoil your reputation with anything like that'!
This is no fake to pretend new
dimensions of ugliness - no, this
plot was really published! It appeared in `Laser Techniques for Fluid
Mechanics', Springer Verlag (2002). Ugliness going that deep
does not require detailed analysis of the weak points anymore. The authors
must either have been on drugs or have had at least 2 per mille of
alcohol in their blood when doing the figure. Even Springer is apparently no guarantee of
high-quality publications anymore. Hence, this proves that the universe is
inflationary.
Some DOs and DO NOTs for appealing science plots.
All points are personally biased and far from
complete. The following statements, taken from E.R. Tufte's
The Visual Display of Quantitative Information, capture the spirit of
efficient and elegant technical drawing:
Above all, show the data
Emphasize the data and not the design of the figure.
Maximize the data-ink ratio
Avoid graphical features that distract the readers
from the data you want them to learn.
No chartjunk
Refrain from shadows under boxes and texts, from
crosshatched, hatched, and weird-patterned fillings of areas, colored
backgrounds, and other senseless wastes of ink.
Revise and edit!
Only looking at the result proves you right or wrong.
Computer-assisted drawing makes it easy to iterate through the preparation
process.
· The coordinate
frame should never constitute the
heaviest lines in a plot. The most important lines come from the science
data. A full coordinate frame (two x-axes and two y-axes forming a
closed box) looks - most of the time - better than only a single x- and a
y-axis each. A coordinate frame produces a desirable closed presentation and
helps in the quantitative evaluation of a plot if intended.
· The tick
marks are usually of the same line weight as the coordinate axes. The
length of the ticks should not interfere with the data and the length should
make it easy for the unaided eye to count the units. Minor tick marks should be
about 1/2 to 3/4 of the length of the major ones. The size of the major tick
marks should be of the order of 2 - 4 % of the plot size.
· Grid lines
are appropriate if the author is interested in the possibility for the reader
to extract quantitative information from the plot easily. The gridlines should
be the finest lines in the plot, just barely visible to guide the eye or the
ruler.
· Select symbols that are also
legible when the plot is shrunk in press or diminished in size when
photocopied. Clustered data destroy any distinction between symbols or even make
single symbols unrecognizable when the plot program does not use hiding of
partly overlapping symbols. The symbols should not be so large that they
give a wrong impression of the accuracy of the data, but they should be big enough
for good legibility. Different symbols to distinguish between different sets
should be chosen carefully: Circles, squares, triangles, crosses can be easily
distinguished over a broad range of sizes. However, stars (four-armed ones) and
asterisks (with 5 arms) are already difficult to distinguish, in particular if
the data points are densely sprinkled, not to mention
polygons with more than 5 corners.
· Even nowadays,
colors should be used only if all other means to present the data fail.
First, not all journals (and especially not proceedings publishers) support
color printing. Second, most readers still use black/white printers and copiers
to get their personal copies. Hence, most of the time, the benefit of color is
lost at the level of the end-user. Most information can be presented quite well
in grey-scale figures. It is claimed that a trained eye can distinguish as many
as 128 shades of grey (which is more than in any other color). Admittedly,
colored figures usually look richer in information and appear definitely more
seductive than grey ones, even if both impressions are unjustified. In any case,
grey scales allow for a more objective judgment of the information than
color wedges, which include colors of different brilliance that easily mislead
our brains. The textbook Physics of the Solar Corona by M. Aschwanden (published by Springer in 2004) is a magnificent
example of how mostly black/white graphs and cleverly planned grey-scale
figures communicate very elegantly all the pertinent scientific information.
Color is mainly useful for three-dimensional data projected onto
two-dimensional sheets. If grey scale is, for whatever reason, out of
the question from the outset, make sure the chosen color scheme translates into
something useful when mapped onto grey scale (for a bad example see the graph
under `The Ugly' further up).
Colored line graphs are most of the time superfluous. Even if the colors for
the different lines are chosen carefully, different line types and/or different
line weights are usually at least as useful to distinguish between different
curves.
· Keys of
symbols used in a plot should never be described in boxes with shadows. Shadows
in general might be (if at all) appropriate in advertisements at your local
grocery store but they are definitely out of place in any scientific
illustration.
· The most
prominent lines of a graph should be the lines
with scientific information. Different data families, i.e. different lines, can
be discriminated either with different line types (not too many of them, to
maintain lucidity) or different line weights.
· Use as few
different fonts as possible to label and annotate a plot. Express the
importance of the various textual parts with the font size not with different
font families.
· Plots look
lighter when using sans-serif fonts
to label and annotate a figure. Serif fonts are fine for long texts where the
eye needs to be guided by the serifs. I find that serifs make most plots
clumsy.
· Never ever use
gothic majuscules
only. First, such lettering is essentially illegible and second, it proves that the
author at best knew how to select the font on the computer but otherwise has
not even a basic appreciation of lettering, not to mention calligraphy or taste.
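The advice on color further up (make sure a color scheme translates into something useful on grey scale) can be checked numerically before a plot goes to press. The following is a minimal sketch in Python, not a definitive recipe: it assumes RGB components between 0 and 1 and uses the Rec. 601 luma weights as one common choice for the grey mapping; printers and copiers may well map colors differently. The function names and the example color scheme are made up for illustration.

```python
# Rec. 601 luma weights: one common convention (an assumption here)
# for mapping an RGB color onto a single grey value.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def grey_level(rgb):
    """Return the grey value in [0, 1] of an (r, g, b) color with components in [0, 1]."""
    return sum(weight * component for weight, component in zip(LUMA_WEIGHTS, rgb))


def min_grey_separation(colors):
    """Return the smallest grey-value difference between any two colors of a scheme."""
    levels = sorted(grey_level(color) for color in colors)
    return min(upper - lower for lower, upper in zip(levels, levels[1:]))


# Hypothetical line colors for a three-curve plot.
scheme = {
    "red": (1.0, 0.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "blue": (0.0, 0.0, 1.0),
}

if __name__ == "__main__":
    for name, rgb in scheme.items():
        print(f"{name:5s} -> grey {grey_level(rgb):.3f}")
    print(f"smallest separation: {min_grey_separation(scheme.values()):.3f}")
```

If the smallest separation comes out small, curves that differ clearly in color will be nearly indistinguishable in a black/white copy, and different line types or line weights are the safer choice.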
Software packages
Software for
technical drawing at the computer is abundant, but many packages have deficiencies in one
aspect or another that is important to scientific illustration. The software packages
mentioned below are open-source tools that also provide capabilities
to label plots with decent-looking mathematical expressions (i.e. with
something that tastes TeX-like). They all run under
Linux and many of them also on other platforms.
The stand-alone plotting
package called GLE
probably appears prehistoric by present-day GUI standards. Since version 4.1.0,
GLE offers a GUI called QGLE which can be used either as an interactive graph
generator or as a previewer of results from scripts. The GLE scripting language
is powerful; it even allows for direct PostScript coding within GLE scripts.
The output from GLE can be written in various formats, such as eps, png, jpg, and pdf.
Looking at my recent working habits, it seems
as if Yorick is slipping in with increasing frequency as a device
for high-end plots and not only for pre- and post-processing of data. Yorick is a powerful scripting environment for scientific
computing, coming with a wonderful interactive plotting facility. Labeling is
not highly developed, but I must defend Yorick
immediately: it was never intended for that glamour fiddling stuff. The plots
can be stored, among other formats, in PostScript/eps format.
Therefore, the final publication-ready brush-up can be done with any illustration
software that can deal with PostScript input files (e.g. GLE, Inkscape, Skencil, Xfig, gimp, or whatever your heart beats for). For example, Scribus
turns out to be a very interesting alternative to expensive page layout programs.
With Scribus, eps or pdf figures can be imported to be supplemented with text
for labeling or annotations, or composite figures can be created. At the end,
the result can be exported again as a pdf or eps file. But Scribus can do
more; it is designed to do DTP on a professional level.
matplotlib is a plotting library for the Python
scripting language. The syntax resembles that of Matlab.
The entry page of the matplotlib homepage alone
convinces us of the high-quality plots that can be produced with this package.
First, since matplotlib is rooted inside a scripting
language, it is easy for the user to go as complex as necessary. Second, the scripting nature ensures that
plots can be easily reproduced and/or placed in a batch processing environment.
For all those who prefer a printed, tutorial-like guide over plain
online help, Tosi's book
Matplotlib for Python Developers
might prove helpful.
1.XI.11