\documentclass[10pt,english]{article}
\usepackage{babel}
\usepackage{shortvrb}
\usepackage[latin1]{inputenc}
\usepackage{tabularx}
\usepackage{longtable}
\setlength{\extrarowheight}{2pt}
\usepackage{amsmath}
\usepackage{graphicx}
\usepackage{color}
\usepackage{multirow}
\usepackage[colorlinks=true,linkcolor=blue,urlcolor=blue]{hyperref}
\usepackage[a4paper]{geometry}
%% generator Docutils: http://docutils.sourceforge.net/
\newlength{\admonitionwidth}
\setlength{\admonitionwidth}{0.9\textwidth}
\newlength{\docinfowidth}
\setlength{\docinfowidth}{0.9\textwidth}
\newlength{\locallinewidth}
\newcommand{\optionlistlabel}[1]{\bf #1 \hfill}
\newenvironment{optionlist}[1]
{\begin{list}{}
  {\setlength{\labelwidth}{#1}
   \setlength{\rightmargin}{1cm}
   \setlength{\leftmargin}{\rightmargin}
   \addtolength{\leftmargin}{\labelwidth}
   \addtolength{\leftmargin}{\labelsep}
   \renewcommand{\makelabel}{\optionlistlabel}}
}{\end{list}}
% begin: floats for footnotes tweaking.
\setlength{\floatsep}{0.5em}
\setlength{\textfloatsep}{\fill}
\addtolength{\textfloatsep}{3em}
\renewcommand{\textfraction}{0.5}
\renewcommand{\topfraction}{0.5}
\renewcommand{\bottomfraction}{0.5}
\setcounter{totalnumber}{50}
\setcounter{topnumber}{50}
\setcounter{bottomnumber}{50}
% end floats for footnotes
% some commands, that could be overwritten in the style file.
\newcommand{\rubric}[1]{\subsection*{~\hfill {\it #1} \hfill ~}}
\newcommand{\titlereference}[1]{\textsl{#1}}
% end of "some commands"
\input{style.tex}
\title{Drawing graphs the easy way: an introduction to dot}
\author{}
\date{}
\hypersetup{
pdftitle={Drawing graphs the easy way: an introduction to dot}
}
\raggedbottom
\begin{document}
\maketitle


\setlength{\locallinewidth}{\linewidth}


%___________________________________________________________________________

\hypertarget{got-a-graphing-problem}{}
\pdfbookmark[0]{Got a graphing problem?}{got-a-graphing-problem}
\section*{Got a graphing problem?}

You must give a presentation tomorrow and you haven't prepared any
figures yet; you must document your latest project and you need to plot
your hairiest class hierarchies; you are asked to provide ten slightly
different variations of the same picture; you are pathologically unable to put
your finger on a mouse and draw anything more complex than a square ...
In all these cases, don't worry! \texttt{dot} comes to the rescue and
can save your day!


%___________________________________________________________________________

\hypertarget{what-is-dot}{}
\pdfbookmark[0]{What is dot?}{what-is-dot}
\section*{What is \texttt{dot}?}

\texttt{dot} is a tool to generate nice-looking diagrams with a minimum of
effort. \texttt{dot} is distributed as a part of \texttt{GraphViz}, an
Open Source project originally developed at AT{\&}T and released under the Eclipse Public License.
It is a high quality and mature product, with very good 
documentation and support, available on all major platforms, 
including Unix/Linux, Windows and Mac. There is an official home-page and 
a supporting mailing list.


%___________________________________________________________________________

\hypertarget{what-can-i-do-with-dot}{}
\pdfbookmark[0]{What can I do with dot?}{what-can-i-do-with-dot}
\section*{What can I do with \texttt{dot}?}

First of all, let me make clear that \texttt{dot} is not just another paint program,
nor a vector graphics program. \texttt{dot} is a scriptable, batch-oriented graphing
tool; it is to vector drawing programs as \LaTeX{} is to word processors.
If you want to have control over every single pixel in your diagram,
or if you are an artistic person who likes to draw freehand, then \texttt{dot}
is not for you. \texttt{dot} is a tool for the lazy developer, the one who wants
the job done with minimum effort and without caring too much about the details.

Since \texttt{dot} is not a WYSIWYG tool (even though it ships together with a WYSIWYG tool,
\texttt{dotty}), it is not intended to be used interactively:
its strength is the ability to \emph{programmatically} generate diagrams. To fulfill
this aim, \texttt{dot} uses a simple but powerful graph description language. You
just give (very high level) instructions to \texttt{dot} and it will draw the diagrams
for you, taking care of all the low level details. Although the user
has a fairly large choice of customization
options and can control the final output in many ways, it is not at all easy
to force \texttt{dot} to do \emph{exactly} what one wants.

Expecting that would mean fighting with the tool.
You should think of \texttt{dot} as a kind of smart boy,
who likes to do things his own way and who is very good at it, but becomes
nervous if the master tries to put too much pressure on him.
The right attitude with \texttt{dot} (just as with \LaTeX{}) is to trust it and
let it do the job.
In the end, when \texttt{dot} has finished its part, the user can always
refine the graph by hand using \texttt{dotty}, the interactive editor
of \texttt{dot} diagrams which comes with GraphViz and has the ability to read
and generate \texttt{dot} code.
But in most cases the user is not expected to do anything manually,
since \texttt{dot} works pretty well. The right way to go is to customize
the \texttt{dot} options once; the user can then programmatically generate one or
one hundred diagrams with the least effort.

\texttt{dot} is especially useful in repetitive and automatic tasks, since
it is not difficult to generate \texttt{dot} code.
For instance, \texttt{dot} comes in very handy in the area of automatic documentation
of code. This kind of job can be done with UML tools, but \texttt{dot} has an
advantage over them in terms of ease of use, gentle learning curve and
flexibility. On top of that, \texttt{dot} is very fast, since it is written in C,
and can generate very complicated diagrams in fractions of a second.


%___________________________________________________________________________

\hypertarget{hello-world-from-dot}{}
\pdfbookmark[0]{Hello World from dot}{hello-world-from-dot}
\section*{Hello World from \texttt{dot}}

\texttt{dot} code has a C-ish syntax and it is quite readable even by somebody
who has not read the manual. For instance, this \texttt{dot} script:
\begin{quote}
\begin{ttfamily}\begin{flushleft}
\mbox{graph~hello{\{}}\\
\mbox{}\\
\mbox{//~Comment:~Hello~World~from~dot}\\
\mbox{//~a~graph~with~a~single~node~Node1}\\
\mbox{}\\
\mbox{Node1~[label="Hello,~World!"]}\\
\mbox{}\\
\mbox{{\}}}
\end{flushleft}\end{ttfamily}
\end{quote}

generates the following picture:
\begin{figure}

\includegraphics{fig1.ps}
\end{figure}

Having saved this code in a file called \texttt{hello.dot}, the graph can be 
generated and shown on the screen with a simple one-liner:
\begin{quote}
\begin{ttfamily}\begin{flushleft}
\mbox{{\$}~dot~hello.dot~-Tps~|~gv~-}
\end{flushleft}\end{ttfamily}
\end{quote}

The \texttt{-Tps} option generates PostScript
code, which is then piped to the \texttt{gv} PostScript viewer. Notice that
I am running my examples on a Linux machine with \texttt{gv} installed,
but \texttt{dot} works equally well under Windows, so you may easily
adapt the examples.

If the user is satisfied with the output, they can save it to a file:
\begin{quote}
\begin{ttfamily}\begin{flushleft}
\mbox{{\$}~dot~hello.dot~-Tps~-o~hello.ps}
\end{flushleft}\end{ttfamily}
\end{quote}

Most probably the user will want to tweak the options,
for instance by adding colors and changing the font size.
This is not difficult:
\begin{quote}
\begin{ttfamily}\begin{flushleft}
\mbox{graph~hello2{\{}}\\
\mbox{}\\
\mbox{//~Hello~World~with~nice~colors~and~big~fonts}\\
\mbox{}\\
\mbox{Node1~[label="Hello,~World!",~color=Blue,~fontcolor=Red,}\\
\mbox{~~~~fontsize=24,~shape=box]}\\
\mbox{~}\\
\mbox{{\}}}
\end{flushleft}\end{ttfamily}
\end{quote}

This draws a blue box with a red label:
\begin{figure}

\includegraphics{fig2.ps}
\end{figure}

All X-Window colors and fonts are available.

\texttt{dot} is quite tolerant: keywords such as \texttt{graph} and \texttt{node}
and color names are case insensitive (node names, however, are case sensitive), and
quoting the option values (color=``Blue'', shape=``box'') will work too.
Moreover, to keep C fans happy, semicolons can be used
to terminate statements; they are entirely optional.
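
This tolerance makes \texttt{dot} code easy to produce from a program: a generator can simply quote every option value. The following Python fragment is a minimal sketch of that idea and not part of \texttt{dot} itself; the helper names \texttt{format{\_}attrs} and \texttt{hello{\_}variant} are invented for this illustration, and the list of colors is an arbitrary choice.
\begin{quote}
\begin{verbatim}
```python
def format_attrs(pairs):
    """Render (name, value) pairs as a dot attribute list, quoting every value."""
    parts = []
    for name, value in pairs:
        # Escape embedded double quotes so the value stays a single dot string.
        text = str(value).replace('"', '\\"')
        parts.append('%s="%s"' % (name, text))
    return "[" + ", ".join(parts) + "]"


def hello_variant(name, color, fontsize):
    """Return the source of a one-node graph in the style of hello2."""
    attrs = [("label", "Hello, World!"), ("color", color),
             ("fontcolor", "Red"), ("fontsize", fontsize), ("shape", "box")]
    return "graph %s {\n  Node1 %s;\n}\n" % (name, format_attrs(attrs))


if __name__ == "__main__":
    # Three slightly different versions of the same picture.
    for i, color in enumerate(["Blue", "Green", "Orange"]):
        print(hello_variant("hello%d" % i, color, 24))
```
\end{verbatim}
\end{quote}

Each generated graph can be saved to a file and passed to \texttt{dot} exactly as \texttt{hello.dot} was.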


%___________________________________________________________________________

\hypertarget{basic-concepts-of-dot}{}
\pdfbookmark[0]{Basic concepts of dot}{basic-concepts-of-dot}
\section*{Basic concepts of \texttt{dot}}

A generic \texttt{dot} graph is composed of nodes and edges.
Our \texttt{hello.dot} example contains a single node and no edges.
Edges come into play when there are relationships between nodes,
for instance hierarchical relationships as in this example:
\begin{quote}
\begin{ttfamily}\begin{flushleft}
\mbox{digraph~simple{\_}hierarchy{\{}}\\
\mbox{}\\
\mbox{B~[label="The~boss"]~~~~~~//~node~B}\\
\mbox{E~[label="The~employee"]~~//~node~E}\\
\mbox{}\\
\mbox{B->E~[label="commands",~fontcolor=darkgreen]~//~edge~B->E}\\
\mbox{}\\
\mbox{{\}}}
\end{flushleft}\end{ttfamily}
\end{quote}
\begin{figure}

\includegraphics{fig3.ps}
\end{figure}

\texttt{dot} is especially good at drawing directed graphs such as this one, where
there is a natural direction (notice that GraphViz also includes the \texttt{neato}
tool, which is quite similar to \texttt{dot} and is especially targeted at
undirected graphs).
In this example the direction is from the boss, who commands,
to the employee, who obeys. Of course in \texttt{dot} one has the freedom
to reverse social hierarchies ;):
\begin{quote}
\begin{ttfamily}\begin{flushleft}
\mbox{digraph~revolution{\{}}\\
\mbox{}\\
\mbox{B~[label="The~boss"]~~~~~~//~node~B}\\
\mbox{E~[label="The~employee"]~~//~node~E}\\
\mbox{}\\
\mbox{B->E~[label="commands",~dir=back,~fontcolor=red]~~}\\
\mbox{//~reverse~arrow~direction~}\\
\mbox{}\\
\mbox{{\}}}
\end{flushleft}\end{ttfamily}
\end{quote}
\begin{figure}

\includegraphics{fig4.ps}
\end{figure}

Sometimes one wants to put things of the same importance on the
same level; this can be done with the rank option, as
in the following example, which describes a hierarchy with a boss,
two employees of the same rank, John and Jack, and a lower
rank employee, Al, who reports to John:
\begin{quote}
\begin{ttfamily}\begin{flushleft}
\mbox{digraph~hierarchy{\{}}\\
\mbox{}\\
\mbox{nodesep=1.0~//~increases~the~separation~between~nodes}\\
\mbox{}\\
\mbox{node~[color=Red,fontname=Courier]}\\
\mbox{edge~[color=Blue,~style=dashed]~//setup~options}\\
\mbox{}\\
\mbox{Boss->{\{}~John~Jack{\}}~//~the~boss~has~two~employees}\\
\mbox{}\\
\mbox{{\{}rank=same;~John~Jack{\}}~//they~have~the~same~rank}\\
\mbox{}\\
\mbox{John~->~Al~//~John~has~a~subordinate~}\\
\mbox{}\\
\mbox{John->Jack~[dir=both]~//~but~still~is~on~the~same~level~as~Jack}\\
\mbox{{\}}}
\end{flushleft}\end{ttfamily}
\end{quote}
\begin{figure}

\includegraphics{fig5.ps}
\end{figure}

This example shows a nifty feature of \texttt{dot}: if the user does not
give the nodes explicit labels, it will use the names of the nodes as
default labels. Default colors and styles can be set for nodes and
edges with the \texttt{node} and \texttt{edge} statements respectively. It is also possible to control the separation
between (all) nodes by tuning the \texttt{nodesep} option.
We leave it to our readers to see what happens without the rank option
(hint: you get a very ugly graph).
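
A graph of this kind can also be written out by a short program. The sketch below shows one possible way to do it in Python, assuming that all node names are plain alphanumeric identifiers which need no quoting; the function name \texttt{hierarchy} and its parameters are hypothetical, while the default options are the ones used in the example above.
\begin{quote}
\begin{verbatim}
```python
def hierarchy(name, edges, same_rank=()):
    """Build a digraph from (parent, child) pairs and groups of equal rank."""
    lines = ["digraph %s {" % name,
             "  nodesep=1.0;",
             "  node [color=Red, fontname=Courier];",
             "  edge [color=Blue, style=dashed];"]
    for parent, child in edges:
        lines.append("  %s -> %s;" % (parent, child))
    for group in same_rank:
        # One anonymous subgraph per group keeps its members on one level.
        lines.append("  {rank=same; %s}" % " ".join(group))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(hierarchy("hierarchy",
                    [("Boss", "John"), ("Boss", "Jack"), ("John", "Al")],
                    same_rank=[("John", "Jack")]))
```
\end{verbatim}
\end{quote}

Calling the function with an empty \texttt{same{\_}rank} argument is a quick way to see what the layout looks like without the rank option.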

\texttt{dot} is quite sophisticated and
there are dozens of options, which are discussed in depth in the excellent
documentation. In particular, the man page (\texttt{man dot}) is especially
useful and well done. The documentation also explains how to draw
graphs containing subgraphs. However, those are advanced features which
are outside the scope of a brief presentation.

Here we will discuss another feature instead: the ability to generate output
in different formats.
Depending on the requirements, different formats can be more or
less suitable. For the purpose of generating printed documentation,
the PostScript format is quite handy. On the other hand, if the documentation
has to be converted to HTML and put on a Web page, the PNG
format is more convenient. It is quite trivial to get it:
\begin{quote}
\begin{ttfamily}\begin{flushleft}
\mbox{{\$}~dot~hello.dot~-Tpng~-o~hello.png}
\end{flushleft}\end{ttfamily}
\end{quote}

There are \emph{many} other formats available, including common ones
such as gif, jpg, wbmp and fig, as well as more exotic ones.
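
When several formats are needed, it helps to derive the output file name from the requested format, so that the extension always matches the content. The following Python fragment is a minimal sketch of this; \texttt{dot{\_}command} is a hypothetical helper name, and the sketch only builds the command line from the \texttt{-T} and \texttt{-o} options shown above without running it.
\begin{quote}
\begin{verbatim}
```python
import os


def dot_command(source, fmt):
    """Return the argument list for rendering `source` in format `fmt`."""
    base, _ext = os.path.splitext(source)
    # Derive the output name from the format so that the file
    # extension always matches the requested output type.
    target = "%s.%s" % (base, fmt)
    return ["dot", "-T" + fmt, "-o", target, source]


if __name__ == "__main__":
    for fmt in ("ps", "png"):
        print(" ".join(dot_command("hello.dot", fmt)))
```
\end{verbatim}
\end{quote}

On a machine where \texttt{dot} is installed, the returned list can be handed to \texttt{subprocess.call} to actually produce the file.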


%___________________________________________________________________________

\hypertarget{generating-dot-code}{}
\pdfbookmark[0]{Generating dot code}{generating-dot-code}
\section*{Generating \texttt{dot} code}

\texttt{dot} is not a real programming language; nevertheless it is pretty easy
to interface \texttt{dot} with a real programming language.  Bindings for
many programming languages, including Java, Perl and Python, are already
available. A more lightweight alternative is just to generate the \texttt{dot} code
from your preferred language.
Doing so allows the user to completely automate the graph generation.
Here I will give a simple Python example using this technique.

This example script shows how to draw Python class hierarchies 
with the least effort; it may help you in documenting your code.

Here is the script:
\begin{quote}
\begin{ttfamily}\begin{flushleft}
\mbox{{\#}~dot.py~}\\
\mbox{}\\
\mbox{"Requires~Python~2.3~(or~2.2~with~from~{\_}{\_}future{\_}{\_}~import~generators)"}\\
\mbox{}\\
\mbox{def~dotcode(cls):}\\
\mbox{~~~~setup='node~[color=Green,fontcolor=Blue,fontname=Courier]{\textbackslash}n'}\\
\mbox{~~~~name='hierarchy{\_}of{\_}{\%}s'~{\%}~cls.{\_}{\_}name{\_}{\_}}\\
\mbox{~~~~code='{\textbackslash}n'.join(codegenerator(cls))}\\
\mbox{~~~~return~"digraph~{\%}s{\{}{\textbackslash}n{\textbackslash}n{\%}s{\textbackslash}n{\%}s{\textbackslash}n{\}}"~{\%}~(name,~setup,~code)}\\
\mbox{}\\
\mbox{def~codegenerator(cls):}\\
\mbox{~~~~"Returns~a~line~of~dot~code~at~each~iteration."}\\
\mbox{~~~~{\#}~works~for~new~style~classes;~see~my~Cookbook}\\
\mbox{~~~~{\#}~recipe~for~a~more~general~solution}\\
\mbox{~~~~for~c~in~cls.{\_}{\_}mro{\_}{\_}:}\\
\mbox{~~~~~~~~bases=c.{\_}{\_}bases{\_}{\_}}\\
\mbox{~~~~~~~~if~bases:~{\#}~generate~edges~parent~->~child}\\
\mbox{~~~~~~~~~~~~yield~''.join(['~{\%}s~->~{\%}s{\textbackslash}n'~{\%}~(~b.{\_}{\_}name{\_}{\_},c.{\_}{\_}name{\_}{\_})}\\
\mbox{~~~~~~~~~~~~~~~~~~~~~~~~~~~for~b~in~bases])}\\
\mbox{~~~~~~~~if~len(bases)~>~1:~{\#}~put~all~parents~on~the~same~level}\\
\mbox{~~~~~~~~~~~~yield~"~{\{}rank=same;~{\%}s{\}}{\textbackslash}n"~{\%}~''.join(}\\
\mbox{~~~~~~~~~~~~~~~~['{\%}s~'~{\%}~b.{\_}{\_}name{\_}{\_}~for~b~in~bases])}\\
\mbox{}\\
\mbox{if~{\_}{\_}name{\_}{\_}=="{\_}{\_}main{\_}{\_}":~}\\
\mbox{~~~~{\#}~prints~the~dot~code~generating~a~simple~diamond~hierarchy}\\
\mbox{~~~~class~A(object):~pass}\\
\mbox{~~~~class~B(A):~pass}\\
\mbox{~~~~class~C(A):~pass}\\
\mbox{~~~~class~D(B,C):~pass}\\
\mbox{~~~~print(dotcode(D))}
\end{flushleft}\end{ttfamily}
\end{quote}

The function \texttt{dotcode} takes a class and returns the \texttt{dot} source
code needed to plot the genealogical tree of that class.
The source code is generated by \texttt{codegenerator}, which traverses the list
of the ancestors of the class (a.k.a. the Method Resolution Order of
the class) and determines the edges and the nodes of the hierarchy.
\texttt{codegenerator} is a generator function: calling it returns an iterator that yields
a chunk of \texttt{dot} code at each iteration. Generators are a cool
addition to Python (available since version 2.2); they come in particularly handy for the purpose
of generating text or source code.

The output of the script is the following self-explanatory \texttt{dot} code:
\begin{quote}
\begin{ttfamily}\begin{flushleft}
\mbox{digraph~hierarchy{\_}of{\_}D{\{}}\\
\mbox{}\\
\mbox{node~[color=Green,fontcolor=Blue,fontname=Courier]}\\
\mbox{}\\
\mbox{~B~->~D}\\
\mbox{~C~->~D}\\
\mbox{}\\
\mbox{~{\{}rank=same;~B~C~{\}}}\\
\mbox{}\\
\mbox{~A~->~B}\\
\mbox{}\\
\mbox{~A~->~C}\\
\mbox{}\\
\mbox{~object~->~A}\\
\mbox{}\\
\mbox{{\}}}
\end{flushleft}\end{ttfamily}
\end{quote}

Now the simple one-liner:
\begin{quote}
\begin{ttfamily}\begin{flushleft}
\mbox{{\$}~python~dot.py~|~dot~-Tpng~-o~x.png}
\end{flushleft}\end{ttfamily}
\end{quote}

generates the following picture:
\begin{figure}

\includegraphics{fig6.ps}
\end{figure}
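
The traversal of the Method Resolution Order can also be kept separate from the text generation, which makes the edges easy to inspect or test on their own. The variant below is a hedged sketch rather than a replacement for \texttt{dot.py}: it leaves out the \texttt{rank=same} grouping and the node options, and the names \texttt{inheritance{\_}edges} and \texttt{inheritance{\_}dot} are invented for this illustration.
\begin{quote}
\begin{verbatim}
```python
def inheritance_edges(cls):
    """Yield one (base, derived) pair of class names per inheritance link."""
    for c in cls.__mro__:
        for base in c.__bases__:
            yield base.__name__, c.__name__


def inheritance_dot(cls):
    """Return dot source for the ancestors of cls."""
    lines = ["digraph hierarchy_of_%s {" % cls.__name__]
    for base, derived in inheritance_edges(cls):
        lines.append("  %s -> %s;" % (base, derived))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    class A(object): pass
    class B(A): pass
    class C(A): pass
    class D(B, C): pass
    print(inheritance_dot(D))
```
\end{verbatim}
\end{quote}

Because the generator yields plain tuples, the same edges could be fed to a different output routine without touching the traversal.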


%___________________________________________________________________________

\hypertarget{references}{}
\pdfbookmark[0]{References}{references}
\section*{References}

You may download \texttt{dot} and the other tools that come with GraphViz from the
official home-page of the project:

\href{http://www.graphviz.org}{http://www.graphviz.org}

You will also find plenty of documentation and links to the mailing list.

Perl and Python bindings are available here

\href{http://theoryx5.uwinnipeg.ca/CPAN/data/GraphViz/GraphViz.html}{http://theoryx5.uwinnipeg.ca/CPAN/data/GraphViz/GraphViz.html}

(Perl bindings, thanks to Leon Brocard)

and here

\href{http://www.cs.brown.edu/~er/software/}{http://www.cs.brown.edu/{\textasciitilde}er/software/}

(Python bindings, thanks to Manos Renieris).

The script \texttt{dot.py} I presented in this article is rather minimalistic. 
This is on purpose. A much more sophisticated version with additional 
examples is discussed in my Python Cookbook recipe

\href{http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/213898}{http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/213898}

\end{document}