nyweb

v1.00

nyweb is yet another incarnation of Knuth's WEB, this time language-independent, with a notation based on XML, and heavily inspired by nuweb.

Essentially, it is a preprocessor that separates comments from content (the program itself) and outputs each into a different file (or, more precisely, the content into as many files as specified). Other major features are the definition and replacement of parametrizable macros, and conditional "translation". Unlike previous implementations of WEB, nyweb does not pretend that the comments are usable as true documentation, so no prettyprinting is supported.

Rather, the comments are considered valuable only as the programmer's tool during the process of creation. Documentation can still be created together with the program, simply as another type of content, written in some documentation format (e.g. TeX and derived formats, html, or formats derived from xml) and output to a separate file or files.

Instead of "autodocumentation"/prettyprinting, nyweb brings to the front a different but major idea of WEB: it supports the natural way in which programs are created. First, there is a rough description in human language (its equivalent here is the free-form, unpublished comments), and a sketch of the program's skeleton can be written at the top level, replacing the key elements simply by their descriptions, to be written (here usually as macros) and included later. Then these lower-level elements are written, possibly in the same manner: outlining their function and again replacing the details by inclusion marks, the bodies of which are to be written later. After the first version of the program has been written in this top-down manner, it is not uncommon that new functionality and features are added, often affecting several parts of the original program, and not rarely requiring a different structure than the one the original program was written with. In (ny)web, the elements of such new functionality are written as a separate "chapter", a set of macros to be included; this groups the functionality together while still allowing the elements to be placed into separate parts of the program as needed. This partially replaces and augments the natural structures provided by many programming languages, but does so without sacrificing efficiency (as unnecessary "artificial" function calls would).

nyweb also aims to build on another aspect of modern programming. Recent years have seen a growth in "intelligent" programmer's editors. They can now highlight syntax, fold functions or other structures, create lists of functions and other structures, autocomplete names of already written or used functions and variables, etc. These editors have doubtlessly increased the efficiency of programmers, especially those working on big projects. However, they rely on recognition of the programming language's syntax (basically, they perform the first steps of compilation "online"); they are therefore language-dependent, can be confused by new features of a language, and have trouble recognising "mixed" languages/projects - in short, they are rigid and fragile.

Therefore, nyweb is seen only as an intermediate step towards - or the basis of - a new programmer's editor, one that gives the programmer a high degree of flexibility in defining structure while maintaining the efficiency of the resulting compiled code. It would also provide the markup needed for fast and efficient movement within sources and instant lookup of key program elements, fully under the control of the programmer, without exhausting them with the unnecessary details of similar markups (e.g. function lists) created automatically. The "interwoven documentation" could also be easily accomplished in such an editor, in a manner very similar to what hypertext editors/viewers provide. This is why xml was used as the underlying markup language: it allows an arbitrary number of tags and marks to be added, even in a forward-compatible manner (by simply ignoring tags unknown to a given version). However, it is clear that such an editor would be a major undertaking, far beyond the capabilities of the one-man show nyweb currently is.

Usage of nyweb

nyweb is a command-line tool, run as

nyweb inputfile.w [outputfile.txt]
Comments (i.e. all text outside emit tags - see below) are written to the output file. If no output file is given as the second parameter, they go to standard output (the console). It is often wise to use nul (the null device on Windows; /dev/null on Unix-like systems) as the output file, to discard the comments, which are unnecessary for most uses. The content (real program) output files are named explicitly in the source .w file - see the emit tag.

Errors and warnings are printed to stderr as they occur during processing. Upon the first error, further input processing stops and the program exits with exit code 1. On success, the exit code is 0.

Basic technicalities of nyweb

nyweb currently reproduces much of nuweb's functionality, except prettyprinting/TeX support. However, before describing the tags themselves, a few words about the xml-related stuff:

Basic tags of nyweb

emit tag

This tag encloses the "real" program (or documentation, or any other content) to be saved into a file. It has one attribute, file, which gives the name of the file to write to:

<emit file="myprogram.c">
  for (i = 0; i &lt; 99; i++) {
    printf("Blahblah %d\n", i);
  }
</emit>
(Note the use of &lt; instead of < in the example above.) There may be any number of emit tags in a single input file, even emitting to the same file: their outputs are concatenated in the order they appear in the input file. However, emit tags cannot be nested.

One more feature is taken over from nuweb: during processing, each emitted file is saved to a temporary file, and after the whole input file has been processed, these temporary files are compared against the older versions of the same files. Only if the content has changed (or the file did not exist previously) does the new file replace the old one; otherwise the temporary file is silently discarded. This makes it possible to use tools such as make, which rely on the timestamp of a source file to determine whether it has changed.
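
The compare-then-replace step described above can be sketched in Python. This is only an illustration of the technique, not nyweb's actual code: the function name write_if_changed and the choice to create the temporary file in the output file's own directory are assumptions made for the example.

```python
import filecmp
import os
import tempfile


def write_if_changed(path, content):
    """Write content to path only if it differs from what is already there.

    Hypothetical helper (not part of nyweb). Returns True if path was
    created or replaced, False if it was left untouched.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Emit into a temporary file first, next to the final output file.
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    # shallow=False forces a byte-by-byte comparison of the two files.
    if os.path.exists(path) and filecmp.cmp(tmp, path, shallow=False):
        os.remove(tmp)  # identical: keep the old file and its timestamp
        return False
    os.replace(tmp, path)  # new or changed: the temporary file becomes the output
    return True
```

Because an unchanged file keeps its old modification time, a make rule that depends on it does not fire again.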

macro tag

This tag encloses any content that won't appear in any output file directly, but can be included in either a comment or an emit, any number of times - see the use tag below.

The macro tag has one required attribute, name, which identifies the macro:

<macro name="print blahblah">
  printf("Blahblah\n");
</macro>
Macros can be defined only outside any other functional tag (i.e. macro, emit or use). However, use tags can be nested inside macros. Macros can be defined in any order; they don't need to be defined before the use tag that invokes them.

Multiple macros with the same name can be defined; they are then concatenated in the order they appear in the source file. This ordering can be overridden explicitly by adding an order attribute to the macro, with an integer as its value. Such macros are concatenated in ascending order (macros with the same order again in their order of appearance), and macros without an order attribute are added to the end of the chain. For example:

<macro name="fruits">  Apple </macro>
<macro name="fruits">  Banana </macro>
<macro name="fruits">  Orange </macro>
when invoked as <use name="fruits"/> would appear as " Apple Banana Orange ", but if we define the same as
<macro name="fruits" order="20">  Apple </macro>
<macro name="fruits">  Banana </macro>
<macro name="fruits" order="10">  Orange </macro>
when invoked as <use name="fruits"/> would appear as " Orange Apple Banana ".
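
The ordering rule can be sketched in a few lines of Python. This is a sketch of the rule as described here, not nyweb's implementation; the function name chain_order and the (order, body) tuple representation are assumptions made for the example.

```python
def chain_order(definitions):
    """Return macro bodies in the order they would be concatenated.

    definitions is a list of (order, body) pairs in source order, where
    order is an int, or None when the macro has no order attribute.
    (Hypothetical representation, chosen for this sketch only.)
    """
    ordered = [d for d in definitions if d[0] is not None]
    unordered = [d for d in definitions if d[0] is None]
    # sorted() is stable, so macros sharing an order keep their source order.
    ordered = sorted(ordered, key=lambda d: d[0])
    # Macros without an order attribute go to the end of the chain.
    return [body for _, body in ordered + unordered]


fruits = [(20, "Apple"), (None, "Banana"), (10, "Orange")]
print(" ".join(chain_order(fruits)))  # prints "Orange Apple Banana"
```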

use tag

This tag is replaced by the content of the macro whose name matches its name attribute. It can occur in the comment area as well as inside emit tags.

Other tags

Parametrized macros: param tag

Parametrized macros are accomplished using the param tag. This tag has two uses:

  1. inside a use element, it contains the definition of the parameter's body
  2. within a macro definition, it is replaced by the respective definition when the macro is invoked from a use
In both cases, param has a required name attribute, which serves as the identifier.

Both the usage and the processing of param elements are very similar to those of macros, with one exception: not every parameter used in a macro has to be defined when the macro is invoked from a use. For each undefined param a warning is issued, but processing continues.

Example:

<macro name="pies"> <param name="filling"/> pie</macro>
<emit file="menu.txt">
  <use name="pies"> <param name="filling">Cherry</param> </use>,
  <use name="pies"> <param name="filling">Apple</param> </use>,
  <use name="pies"> <param name="filling">Chocolate</param> </use>.
</emit> 
will create the following menu: Cherry pie, Apple pie, Chocolate pie.
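
To illustrate how use and param interact, here is a small Python sketch built on the standard xml.etree.ElementTree module. It is not nyweb's implementation. The root wrapper element exists only so that the fragment parses, the function name expand is hypothetical, whitespace handling is not modelled (the example input has none around the param elements), and an undefined parameter is simply replaced by empty text, which is an assumption.

```python
import xml.etree.ElementTree as ET


def expand(node, macros, params):
    """Return the text inside node with use and param elements expanded."""
    out = [node.text or ""]
    for child in node:
        if child.tag == "use":
            # Parameter bodies defined inside this use element.
            args = {p.get("name"): expand(p, macros, params)
                    for p in child.findall("param")}
            out.append(expand(macros[child.get("name")], macros, args))
        elif child.tag == "param":
            # Inside a macro body: substitute the definition; an undefined
            # parameter becomes empty text here (assumption of this sketch).
            out.append(params.get(child.get("name"), ""))
        out.append(child.tail or "")
    return "".join(out)


# The root element is a wrapper added for this sketch only.
source = """<root>
<macro name="pies"><param name="filling"/> pie</macro>
<emit file="menu.txt"><use name="pies"><param name="filling">Cherry</param></use>, <use name="pies"><param name="filling">Apple</param></use>.</emit>
</root>"""

root = ET.fromstring(source)
macros = {m.get("name"): m for m in root.findall("macro")}
menu = expand(root.find("emit"), macros, {})
print(menu)  # prints "Cherry pie, Apple pie."
```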

Conditional processing: define, if/else and comment tags

This is similar to conditional compilation in many programming languages:

<define name="tropical"/>

<if defined="tropical">
  <macro name="fruit">Banana</macro>
  <macro name="fruit">Tangerine</macro>
  <macro name="fruit">Orange</macro>
<else/>
  <macro name="fruit">Cherry</macro>
  <macro name="fruit">Apple</macro>
</if>
As macros are invoked in a different order than they are placed in the input file, placing define and if tags inside macros may produce results that are hard to interpret, and is strongly discouraged (although not disabled, for the benefit of those who desperately want to shoot themselves in the foot).
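
The branch selection for define and if/else at the top level of a file can be sketched in Python as follows. This is an illustration only, under stated assumptions: the root wrapper element, the function names active and fruits, and the single-pass evaluation in document order are not taken from nyweb itself.

```python
import xml.etree.ElementTree as ET


def active(elements, defined):
    """Yield the elements that survive define / if / else processing."""
    for el in elements:
        if el.tag == "define":
            defined.add(el.get("name"))
        elif el.tag == "if":
            cond = el.get("defined") in defined
            in_else = False
            kept = []
            for sub in el:
                if sub.tag == "else":
                    in_else = True  # everything after the else marker
                elif cond != in_else:
                    # Keep the first branch when cond holds, the second otherwise.
                    kept.append(sub)
            yield from active(kept, defined)
        else:
            yield el


# The root element is a wrapper added for this sketch only.
TEMPLATE = """<root>
DEFINE
<if defined="tropical">
  <macro name="fruit">Banana</macro>
  <macro name="fruit">Tangerine</macro>
  <macro name="fruit">Orange</macro>
<else/>
  <macro name="fruit">Cherry</macro>
  <macro name="fruit">Apple</macro>
</if>
</root>"""


def fruits(define_line):
    root = ET.fromstring(TEMPLATE.replace("DEFINE", define_line))
    return [m.text for m in active(root, set()) if m.tag == "macro"]


print(fruits('<define name="tropical"/>'))  # ['Banana', 'Tangerine', 'Orange']
print(fruits(""))                           # ['Cherry', 'Apple']
```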

comment is just another tag, whose content is simply ignored:

<comment>
  This text is completely ignored.
  <macro name="fruit">This macro is completely ignored, too.</macro>
</comment>

Input files nesting: include tag

As might be expected, the include tag simply includes, literally, the file specified by its file attribute during processing.

<include file="common subroutines.w"/>

Miscellaneous: nyweb tag

The nyweb tag is used for various ad-hoc functions, determined by the attribute.

Unused macros list

Macros that have not been used so far can be listed using the nyweb tag with the list="unused macros" attribute. It is wise to place this tag at the end of the source .w file:
<nyweb list="unused macros"/>

Timing

The duration of processing of a part of the source file can be printed using a pair of nyweb tags with the time="begin" and time="end" attributes. In the latter tag, a file attribute gives the name of the output file the timing is written to. Two times are written, one for each processing pass. It is wise to place this pair of tags at the beginning and end of the source .w file:
<nyweb time="begin"/>
<nyweb time="end" file="timing.txt"/>
Note that even in this case the sum of the two times will be slightly less than the total duration of the nyweb run: the time needed to check the temporary output file(s) against their older versions (see the emit tag) is not included.

Planned features