%============================================================================= % ---------------------------------------------------------------------------- % Noxref, the cross referencing program for Noweb \documentstyle[noweb]{article} %\RCSdef $Id: noxref.nw,v 1.2 1993/05/06 18:15:40 jgk Exp $ % ---------------------------------------------------------------------------- %\title{{\tt Noxref\thanks{\RCSId},} \title{{\tt Noxref\thanks{Id: noxref.nw,v 1.2 1993/05/06 18:15:40 jgk Exp},} the cross referencing program for {\tt Noweb}} \author{\setcounter{footnote}{6}% JG Krom\thanks{JET Joint Undertaking, e-mail address: jgk@jet.uk}} \date{Printed: \today} % ---------------------------------------------------------------------------- \makeindex \begin{document} \maketitle % ---------------------------------------------------------------------------- % A bit of jargon: \newcommand{\noweb}{{\tt noweb}} \newcommand{\noweave}{{\tt noweave}} \newcommand{\notangle}{{\tt notangle}} \newcommand{\noxref}{{\tt noxref}} \newcommand{\Noxref}{{\tt Noxref}} \newcommand{\markup}{{\tt markup}} \newcommand{\awk}{{\rm awk}} \newcommand{\unix}{{\sc unix}} \newcommand{\dos}{{\sc msdos}} \newcommand{\chklist}{{$\setminus$\tt nowebchunks}} %---------------------------------------------------------------------------- \section{Introduction} N Ramsey presents in [2] a clean, language independent system for literate programming, called \noweb. One of the components of \noweave, the ``weave'' program for this system, is the \noxref\ program. This system has been been ported to \dos\ by L Wittenberg. The author of this paper customised this \noxref\ program. The purpose of this paper is to describe these customisations. In order to implement these cutomisations in a ``literate Programming'' style, the codes written by the above mentioned authors are included in this document. % ---------------------------------------------------------------------------- \section{Problem Definition} The \noxref\ program consists of an \awk\ [1] program, driven by an \unix\ shell script or, as appropriate, by an \dos\ batch file. This \noxref\ program adds page number references to the usage and definitions of the code chunks in a ``woven'' printing of a literate program. A feature that is available in other implementations of the \noxref\ program, the alphabetical chunk list, was missing from the \awk\ implementation of this program. As this feature seemed useful, it is implemented as an addition to the existing \noxref\ \awk\ program. This noxref program is the proper place to create this chunk list, since all information required for this list is already collected by this program. The chunk list should take a form, similar to a table of contents: chunk names, in the ``\LA{chunk name}\RA{}'' format, on the left hand side of the page, a list of page numbers on the right hand side of the page and leaders between the two. The list of page numbers should first contain the pages on which the chunk is defined and then the pages on which the chunk is used. Root chunks are to be indicated with the word ``Root''. Chunks that are used, but not defined, are marked with the word ``Undefined''. This whole chunk list, formatted using \LaTeX\ commands, replaces any line, in the original source, containing only the word ``\chklist''. \Noxref\ was, and still is, intended as a stage in the \noweave\ pipeline. This means that it will receive input in the ``marked-up'' format generated by the \markup\ program. The output of \noxref\ should also be in this format. % ---------------------------------------------------------------------------- \pagebreak \section{Web Structure} This document describes three different files for two different environments: \begin{enumerate} \item [[noxref]] The executable shell script for use under \unix. This file includes the awk script. \item [[noxref.bat]] The executable shell script for use under \dos. This file calls upon the following file. \item [[noxref.awk]] The awk source code used or included by the files above. \end{enumerate} Each of these can be generated by specifying the required filename as the root chunk name when executing the \notangle\ program on this web. To obtain the \dos\ batch file the following command should be executed: \begin{center} [[notangle -Rnoxref.bat noxref.nw > noxref.bat]] \end{center} Users of these shell scripts might have to adapt the [[awk]] program name in these scripts to match their local system configuration. <<noxref>>= #!/bin/sh # $Id: noxref.nw,v 1.2 1993/05/06 18:15:40 jgk Exp $ nawk '<<noxref.awk>>' <<noxref.bat>>= @echo off REM # $Id: noxref.nw,v 1.2 1993/05/06 18:15:40 jgk Exp $ REM The NOWEB environment variable must be set to the directory REM where NOXREF.AWK is. It must end in '/' or '\' as required REM by the AWK interpreter in use. awk -f %NOWEB%noxref.awk <<noxref.awk>>= # $Id: noxref.nw,v 1.2 1993/05/06 18:15:40 jgk Exp $ <<Noxref awk source>> <<Noxref awk chunk list additions>> @ % ---------------------------------------------------------------------------- \section{The AWK Source Code} This is mostly Ramsey's original code. The fragment that has been changed is included as the chunk: \LA{Find and process the \chklist\ request}\RA. Module label generation has been upgraded to the algorithm used in N~Ramsey's last release of the \noweb\ system. <<Noxref awk source>>= BEGIN { defns[0] = 0 ; uses[0] = 0 ; dcounts[0] = 0 ; firstdef[0] = 0; ucounts[0] = 0 ; idtable[0] = 0 ; keycounts[0] = 0 ; firstdefnout[0] = 0; filetable[0] = 0 } { lines[nextline++] = $0 } /^@defn / { logname("DEFN", defns, dcounts, substr($0, 7)) } /^@use / { logname("USE", uses, ucounts, substr($0, 6)) } /^@file / { curfile = modid(substr($0, 7) substr($0, 10, 3)) } <<Noxref awk source>>= function logname(which, tbl, counts, name, id) { counts[name] = counts[name] + 1 id = which curfile "-" modid(name) "-" counts[name] tbl[name] = tbl[name] id " " lines[nextline++] = "@literal \\label{" id "}" if (which == "DEFN" && firstdef[name] == "") firstdef[name] = id } <<Noxref awk source>>= function modid(name, key) { if (idtable[name] == "") { key = name gsub(/[\[\]\\{} -]/, "*", key) if (length(key) > 6) key = substr(key,1,3) substr(key, length(key)-2, 3) keycounts[key] = keycounts[key] + 1 idtable[name] = key "-" keycounts[key] } return idtable[name] } <<Noxref awk source>>= END { for (i=0; i < nextline; i++) { name = substr(lines[i], 2) name = substr(name, 1, index(name, " ")-1) arg = substr(lines[i], length(name)+3) if (name == "defn") { thischunk = arg printf "@defn %s~{\\footnotesize\\rm\\pageref{%s}}\n", arg, firstdef[arg] } else if (name == "use") { if (firstdef[arg] != "") printf "@use %s~{\\footnotesize\\rm\\pageref{%s}}\n", arg, firstdef[arg] else printf "@use %s~{\\footnotesize\\rm (never defined)}\n", arg } else if (name == "end") { if (substr(arg, 1, 4) == "code" && firstdefnout[thischunk] == 0) { firstdefnout[thischunk] = 1 n = split(defns[thischunk], a) if (n > 1) { printf "@literal \\nwalsodefined{" for (j = 2; j <= n; j++) printf "\\\\{%s}", a[j] printf "}\n@nl\n" } if (uses[thischunk] != "") { printf "@literal \\nwused{" n = split(uses[thischunk], a) for (j = 1; j <= n; j++) printf "\\\\{%s}", a[j] printf "}\n@nl\n" } else printf "@literal \\nwnotused\n@nl\n" } print lines[i] } <<Find and process the \chklist\ request>> else print lines[i] delete lines[i] } } @ Finding the \chklist\ command is straight forward, it must be on a \verb+@text+ line. The unclean way of using a chunk to insert an [[else]]~[[if]] clause in a larger [[if -- else if -- else]] structure should be noted. <<Find and process the \chklist\ request>>= else if (name == "text") { if (arg == "\\nowebchunks") printChunkList() else print lines[i] } @ If the keyword has been found the function [[printChunkList()]] is called to do the actual printing. % ---------------------------------------------------------------------------- \section{The Chunk List Additions} These additions consist, except for the chunk \LA{Find and process the \chklist\ request}\RA\ mentioned above, of two functions: one to sort the list of chunks, and one to print this list. <<Noxref awk chunk list additions>>= <<Sort chunk list>> <<Print chunk list>> @ % ---------------------------------------------------------------------------- \subsection{The Sorting Routine} This function implements essentially a simple insertion sort. If performance becomes a problem, some effort could be invested to use a better algorithm, but that seems unnecessary at the moment. \subsubsection{The Sorting Function Framework} Two global variables, [[nextFreeIdx]] and [[sortedNames]], carry the results of this function. The [[sortedNames]] array is empty to start with, except for the first element, which contains the null string as a sentinel; no string compares to less than the null string. The invariant on this array is that it will always contain chunk names in sorted order, with the lowest (according to the awk comparison rules) coming first. The invariant on [[nextFreeIdx]] is that it always contains the index number of the next free element in the array. <<Sort chunk list>>= function sortChunkNames( <<Sort --- Local Variables>>) { sortedNames[0] = "" nextFreeIdx=1; # The next index to use (range 1--N) <<Run through the [[chunkname]]s as stored in [[defns]] array>> <<Run through the [[chunkname]]s as stored in [[uses]] array>> } @ \subsubsection{Scan the Arrays} All chunk names have been stored in the [[defns]] array when they were defined. Using the ``{\tt for \em xyz \tt in \em arrayname}'' feature of awk, it is possible to step through all elements of the array. The zero element in the arrays would confuse the sorting algorithm, so these elements have to be discarded. <<Run through the [[chunkname]]s as stored in [[defns]] array>>= for (chunkname in defns) { if (chunkname != 0) { <<Insert in proper place in sorted array>> } } <<Sort --- Local Variables>>= chunkname, @ All names that have been used are stored in the array [[uses]]. This array has to be scanned for chunk names that might have been used, but that were not defined. Such chunks should also be included in the chunk list, so they are inserted in the [[sortedNames]] array. <<Run through the [[chunkname]]s as stored in [[uses]] array>>= for (chunkname in uses) { if ((chunkname != 0) && !(chunkname in defns)) { <<Insert in proper place in sorted array>> } } @ \subsubsection{Insert into the Sorted Array} The proper place for the insertion is found by scanning the sorted array from the end to the beginning. The local variable [[idx]] is used for this scan, it will always point at a possible insertion location. [[nextFreeIdx]] is incremented, since it is now known that there is another element which will be inserted. If the element before the current scan location is greater than the chunkname to be inserted, then the chunkname will be inserted before that element. The scanned element should be moved one position up (to the current insertion location) at this point. Otherwise, the chunk name should come after the element before the scan location, ie. it should be inserted at the current position. [[idx]] is pushed to the end condition, to stop the scan over the sorted array and to get a new chunk name. <<Insert in proper place in sorted array>>= for ( idx = nextFreeIdx++ ; idx>0 ; idx-- ) { if ( sortedNames[idx-1] > chunkname ) sortedNames[idx] = sortedNames[idx-1] ; else { sortedNames[idx] = chunkname ; idx = -1 ; } } <<Sort --- Local Variables>>= idx @ % ---------------------------------------------------------------------------- \subsection{Print the Chunk List} The function to print the chunk list, first calls upon the sorting function to get the names in order. It then inserts, if required, some heading material and lastly prints the names. <<Print chunk list>>= function printChunkList( <<Print --- Local Variables>> ) { sortChunkNames() ; <<Optional Header material>> <<Header material>> <<Print Loop>> <<Closing Material>> } @ % ---------------------------------------------------------------------------- \subsubsection{The Printing Loop} This loops steps through the indices of the sorted names array, up to the next free index number. It prints each name, using the \markup\ \verb+@use+ directive, followed by a row of dots. The printing of the page numbers, root markers etc. is delegated to the chunk \LA{print page numbers etc.}\RA. <<Print Loop>>= for (idx=1; idx<nextFreeIdx; idx++) { print "@use " sortedNames[idx] ; print "@literal \\dotfill" ; <<Print page numbers etc.>> print "@nl" ; } <<Print --- Local Variables>>= idx, @ When printing the page references, the following cases should be considered. \begin{itemize} \item If a name does not appear in the [[uses]] array, it must have been in the [[defns]] arrray. It is therefore a root chunk. \item If a name does not appear in the [[defns]] array, it is undefined in the source file currently processed. \item If a name is defined, print the page references of the definitions. \item If a name is used, print the usage page references. \end{itemize} <<Print page numbers etc.>>= if (uses[sortedNames[idx]] == "") { print "@literal { \\rm Root}," ; } if (defns[sortedNames[idx]] == "") { print "@literal { \\rm Undefined}" ; } else { <<Print definition page numbers>> } if (uses[sortedNames[idx]] != "") { <<Print usage page numbers>> } @ \paragraph{Definition References.} Definition page references are derived from the [[defns]] arrays, by splitting them into fields with the [[split]] function. The first one should not be preceded by a ``,'' (comma character), but all subsequent numbers (if any) should have a comma in front of them. Page references for the definitions are printed underlined. <<Print definition page numbers>>= n = split(defns[sortedNames[idx]], a) print "@literal { \\underline{\\pageref{" a[1] "}}}"; if (2 <= n) for (j = 2; j <= n; j++) print "@literal {, \\underline{\\pageref{" a[j] "}}}"; <<Print --- Local Variables>>= n, a, j @ \paragraph{Usage References.} The page references for the places where the chunks are used are obtained from the [[uses]] array. These always have preceding text (definition page references or the word ``Undefined''), so these should always have a ``,'' in front of them. <<Print usage page numbers>>= n = split(uses[sortedNames[idx]], a) for (j = 1; j <= n; j++) print "@literal {\\rm, \\pageref{" a[j] "}}"; @ \paragraph{A Small Test.} Both chunk names should appear in the chunk list: one marked as ``Root'', the other as ``Undefined''. <<An unused (therefore root) chunk>>= <<An undefined chunk>> @ \pagebreak \subsubsection{List Opening and Closing Definitions} If required, some commands could be included to generate a chapter or section heading above the chunk list. However, the author of this code prefers to have such sectioning commands under the control of the final document source file. Users who prefer to have these section commands automatically generated (like the Icon implementation of the \noxref\ program does) can redefine \LA{Optional Header material}\RA\ to be equal to the current definition of \LA{Not used header material}\RA. <<Optional Header material>>= <<Not used header material>>= print "@literal \\ifx\\chapter\\undefined\\section*" print "@literal {Alphabetical List of Chunk Names}" ; print "@literal \\else\\chapter" print "@literal {Alphabetical List of Chunk Names}" ; print "@literal \\fi" print "@nl" ; @ The following header material is required, it sets up the environment for the list. <<Header material>>= print "@literal {\\obeylines" ; print "@literal \\setlength{\\parindent}{0mm}" ; print "@literal \\setlength{\\parskip}{1.4ex}" ; print "@nl" ; <<Closing Material>>= print "@literal }" ; @ % ---------------------------------------------------------------------------- \newpage \section{References} % This is faked (ie, not a real LaTeX bibliography), since this file % is likely to get included in other files, with other bibliographies. { \begin{description} \sfcode`\.=1000\relax \item[{\rm [1]~~~}] Aho AV., Kernighan BW., Weinberger PJ. 1988, {\sl The AWK Programming Language,} Addison-Wesley. \item[{\rm [2]~~~}] Ramsey N. 1992--1993, ``Literate Programming Tools Need Not Be Complex,'' To be published in {\sl IEEE Software.} 1993. \end{description} } % ---------------------------------------------------------------------------- % This should go in RCS.sty! \newenvironment{RCSlog}% {\begin{trivlist} \item[]% \setlength{\parindent}{0mm}% \setlength{\parskip}{3ex}% \catcode`\$=12% \hbadness=10000\ignorespaces\obeycr}% {\end{trivlist}} \section{RCS Maintained Log} \begin{RCSlog} $Log: noxref.nw,v $ Revision 1.2 1993/05/06 18:15:40 jgk Moved from using bold to underlining for the page references of definitions. (On advice of Lee Wittenberg.) A few linguistic improvements. RCS ID strings included in progs. Revision 1.1 1993/05/01 21:08:21 JG~Krom Initial revision A version with the same code, but some errors and typos in the documentation text was known as: Revision 1.1 1993/04/28 17:03:23 jgk Initial revision This file was derived from: ``NOXREF.BAT'' by L~Wittenberg and ``noxref'' By N~Ramsey. (No change log was available for these files.) And from: Log: noxref.awk Revision 1.5 1993/04/23 12:52:16 JG~Krom On advice of Lee Wittenberg, used the new way of label generation. Revision 1.4 1993/04/20 22:41:44 JG~Krom Improved layout of chunklist. Revision 1.3 1993/04/11 17:47:38 JG~Krom Indicate root chunks in the chunklist. Revision 1.2 1993/04/11 15:52:53 JG~Krom First stab at the chunklist command. Revision 1.1 1992/10/21 17:00:00 LEEW checked in with -k by JG~Krom at 1993/04/10 16:53:28 Which in turn was also derived from: ``noxref'' By N~Ramsey. \end{RCSlog} % ---------------------------------------------------------------------------- \newpage \section{Alphabetical List of Chunk Names} \nowebchunks % ---------------------------------------------------------------------------- \input{noxref.ind} % ---------------------------------------------------------------------------- \end{document} % End of noweb code %=============================================================================