VHIST Users' Guide

Stefan Vollmar, Andreas Hüsgen, Michael Sué,
Joachim Nock, Roman Krais
Email: vhist@nf.mpg.de

VHIST 1.60.0, Jan 26 2010, Rev 419:1758

Max-Planck-Institut für neurologische Forschung
mit Klaus-Joachim-Zülch-Laboratorien der
Max-Planck-Gesellschaft und der
Medizinischen Fakultät der Universität zu Köln
Cologne, Germany   http://www.nf.mpg.de

1 Introduction

1.1 About VHIST

The VHIST [1], [2] project defines a file format specification that allows arbitrary binary data to be embedded for the documentation of workflows, together with structured meta-information and multiple facilities for validation. The format conforms to PDF and other open standards, is self-describing and is particularly suited as an image or meta-image format in the context of multi-modality and functional imaging.

It includes a platform-independent reference implementation which contains the essential features. VHIST can be used on top of existing workflows without the need to change major applications. Please refer to the VHIST white paper [1] for more information on the general concept and the specification; this Users' Guide focuses on how to use the reference implementation.

Please see Disclaimer and Licensing for legal issues.

1.2 VHIST Reference Implementation

The reference implementation currently consists of a set of commandline tools and a user-friendly suite of GUI tools. It can be downloaded at the VHIST homepage [1].

The commandline tools (VHIST core):

  • vhistadd - commandline tool for creating VHIST files.
  • vhistxs - commandline tool for extracting embedded data from VHIST files. Minimum implementation.
  • vhistxl - commandline tool for extracting embedded data from VHIST files, including facilities for validating individual sections and for extracting individual embedded files with validation.

The GUI tools:

  • vhistzard - an application that helps with configuring commandline options for vhistadd and with creating VHIST files without using the commandline.

1.3 Hard- and Software Requirements

Platform independence was an important goal when conceiving VHIST and its Reference Implementation. We believe that VHIST Core should work well on a wide range of systems for which a Python distribution [3] (version 2.4 or later) exists, especially on anything manufactured in this millennium.

We use Trolltech's Qt libraries [4] for the user-friendly VHISTzard. Some distributions of VHIST ship with the appropriate libraries and should run out of the box.

MS Windows
A standard installer is available that can deploy a ready-to-run version of VHISTzard including all commandline tools. This setup does not require a Python distribution, as all commandline tools have been compiled to ready-to-use stand-alone ".exe" programs. Alternatively, you could install Python (using a proper "single-click" installer [3]) and use the "source" distribution of VHIST.
Mac OS X
A ready-to-run version of VHISTzard is available for this platform; it can be installed as a standard Mac application. Installing the source distribution is also possible: a suitable Python distribution has been part of the operating system since Mac OS X 10.3 (for older systems it can easily be installed without compromising other applications that might rely on the older versions). VHIST should work just fine with the newer "Intel" Macs (tested with Tiger and Leopard) and with the older PPC platform.
Linux
Python should be ubiquitous on this platform. Use python -V to check whether your installed version is recent enough (it very likely is); version 2.4 or later should work. If the installed version is not recent enough, you should consider upgrading it.

If for some reason Python is not installed on your system, please refer to the documentation of your Linux distribution on how to install Python.

Solaris
Python is shipped with Solaris 10 and available for previous versions of this operating system. VHIST should work on SPARC systems, as well as Solaris machines with x86 architecture.

1.4 VHIST terminology

The general idea behind VHIST is to provide a robust and simple means for documenting steps of a workflow by logging and optionally embedding all relevant information: which files were used, which files were written, what software package was used in which version and with what parameters.

An example from medical imaging is the process (workflow) of creating an image volume suitable for scientific or diagnostic purposes. In this case, a typical workflow step could be the application of an image filter (tool; this example works best with a commandline tool) to an image, present as an input file (infile). The commandline used in this workflow step, in addition to some filter option (say, Gauss filter scope), could be added to the VHIST file using Attributes. Assuming the filter application writes a new file containing the filtered image volume, we can add this as an output file (outfile). It is recommended practice to have vhistadd embed (see Files: Embedded Vs. Reported) any logfile the filter application might write.

1.5 Attributes

An attribute is a key-value pair. vhistadd provides commandline options for specifying either pre-defined or user-defined keys. We make this distinction to ensure that a basic set of attributes is universally available (e.g. description or title). This is a prerequisite for comparing VHIST files from different sources. Attributes can be specified for the workflow step and even for individual in- and outfiles. The type and meaning of the value depend on the key and are described in the documentation of the individual options.

1.6 Appending to a VHIST file

VHIST files are stacks of sections which can be validated independently and which usually refer to one workflow step each. Sections are appended at the end of an existing file, so no previous data is changed. This is related to incremental writing of PDF [5] files. If you want to add to a VHIST file containing information about previous workflow steps (this is recommended), you need to specify a VHIST root file (this can be any VHIST file).

1.7 Files: Embedded Vs. Reported

The vhistadd implementation uses the terms reported and embedded, meaning similar but slightly different things.

A reported file is a file whose file properties are reported in the automatically generated XML summary [1] (which contains structured information on one workflow step and is suitable for automated processing) and in the corresponding "human-readable" PDF part of the VHIST document.

An embedded file's properties are also listed in the XML summary and the PDF listing. However, in this case the contents of the file are also completely contained in the VHIST document, either in compressed or uncompressed form.

Embedded files can be extracted by various means, e.g. with a PDF browser, VHISTzard, vhistxs, vhistxl. You can embed binary data in VHIST files.

1.8 Disclaimer

THIS SOFTWARE IS PROVIDED AS-IS, WITHOUT ANY EXPRESSED OR IMPLIED WARRANTY. SPECIFICALLY, NEITHER THE MAX-PLANCK-INSTITUT FÜR NEUROLOGISCHE FORSCHUNG MIT KLAUS-JOACHIM-ZÜLCH LABORATORIEN DER MAX-PLANCK-GESELLSCHAFT UND DER MEDIZINISCHEN FAKULTÄT ZU KÖLN NOR THE AUTHORS WARRANT THAT THE FUNCTIONS CONTAINED IN THE SOFTWARE WILL MEET YOUR REQUIREMENTS, OR THAT THE OPERATION OF THE SOFTWARE WILL BE UNINTERRUPTED OR ERROR-FREE, OR THAT DEFECTS IN THE SOFTWARE WILL BE CORRECTED. TO THE EXTENT PERMITTED BY LAW, NEITHER THE MPI FÜR NEUROLOGISCHE FORSCHUNG NOR THE AUTHORS SHALL BE LIABLE FOR ANY DAMAGES ARISING OUT OF OR RELATING TO THE USE OF THE SOFTWARE, INCLUDING BUT NOT LIMITED TO INCIDENTAL, SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY LOST PROFITS, BUSINESS INTERRUPTION, LOSS OF PROGRAMS OR OTHER DATA ON YOUR INFORMATION HANDLING SYSTEM.

1.9 Licensing

FOR NON-COMMERCIAL SCIENTIFIC RESEARCH USAGE ONLY, the VHIST specification [1] and the VHIST Reference Implementation are available under the GNU General Public License (GPL Version 3) [6]. Any other usage requires written permission by the Max Planck Institute for Neurological Research Cologne.

2 vhistadd

vhistadd is the command-line tool used for creating or appending to VHIST files.

We think that vhistadd is comparatively easy to use, although the number of commandline options it features can be a bit intimidating. Therefore we developed VHISTzard, a user-friendly tool specifically designed to assist you with assembling and running vhistadd commandlines, which can be used without detailed knowledge of vhistadd's options.

In order to show a glimpse of what vhistadd can do for you, we have added some examples.

2.1 Arg-Files

You can write the arguments of a vhistadd call into a file and then specify this file instead of typing the whole commandline into the shell. These files are called Arg-Files.

Originally, this feature was implemented due to limitations of the MS Windows platform, where only commandline arguments of fairly moderate size can be passed to programs (2048 characters on Windows 2000, 8191 on Windows XP [7]; typical UNIX commandlines can be >100KB long). The corresponding error message does not make this obvious: "Windows cannot access the specified device, path, or file. You may not have the appropriate permissions to access the item."

Since 2048 characters are enough only for the simplest use cases, a more powerful mechanism was needed. Arg-Files have several additional advantages over normal commandline calls: they can easily be archived, copied around and passed to other users; they can even be embedded into the VHIST file they generate; their lines can be of arbitrary length; and their syntax is independent of platform, operating system or shell. The syntax is very similar to that of the bash [8] shell: single arguments are separated by one or several whitespace characters (space, tab or newline), strings which contain whitespace must be enclosed in quotes ("), and environment variables start with a $ sign, e.g. $PATH. Comments start with a hash symbol # and continue until the end of the line.
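The syntax rules above are close to what Python's standard shlex module implements, so the way an Arg-File is split into arguments can be approximated outside of vhistadd. The following is a minimal sketch under that assumption, not vhistadd's own parser: the function name parse_arg_file_text and the variable OUTDIR are hypothetical, and shlex additionally honours single quotes and backslash escapes, which the description above does not mention.

```python
import shlex
from string import Template


def parse_arg_file_text(text, env):
    """Split Arg-File text into arguments (approximation, not vhistadd's parser).

    - arguments are separated by spaces, tabs or newlines
    - double quotes keep whitespace inside one argument
    - '#' starts a comment that runs to the end of the line
    - $NAME is replaced by env["NAME"]; unknown names are left untouched
    """
    tokens = shlex.split(text, comments=True)
    # Expand variables per argument so that values containing spaces
    # are not split into several arguments.
    return [Template(token).safe_substitute(env) for token in tokens]


example = '''
# title of the workflow step
-s title "Hello VHIST"
-O $OUTDIR/hello.vhist   # VHIST file to generate
'''
args = parse_arg_file_text(example, {"OUTDIR": "/tmp"})
print(args)
```

Because backslashes are treated as escape characters by shlex, this sketch is not suitable for Arg-Files that contain MS Windows paths.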

2.2 Synopsis

This table contains a brief summary of the commandline options for vhistadd.py.

Function                                 Commandline Option
Built-in help                            -h or --help
Get Version                              -V or --version
Use an Arg-File                          -c or --cmdfile <cmdfile>
Sync current directory                   --sync-cwd
Output VHIST file                        -O <filename1> [-O <filename2> ...] [other options]
Append to VHIST file                     -A <filename>
VHIST root file                          -I <filename> or -J <filename>
Pre-defined attributes (workflow step)   -s <key> <value> [-s <key> <value>] ...
User-defined attributes (workflow step)  -U <key> <value> [-U <key> <value>] ...
PDF-related properties                   -d <key> <value> [-d <key> <value>] ...
Input file(s) with qualifiers            -i <filename> ... <qualifiers> [-i <filename> ... <qualifiers>]
Output file(s) with qualifiers           -o <filename> ... <qualifiers> [-o <filename> ... <qualifiers>]
Customization: PDF first page            -1 or --firstpage
Customization: embedded readme           -r or --readme
Verbosity                                -v or -q
Pretend Mode                             -p

This table contains the qualifiers that can be attached to input or output files.

Type of Qualifier                  Commandline Option
Pre-defined flags (files)          [-f <flag>] [-f <flag>] ...
Pre-defined attributes (files)     -a <key> <value> [-a <key> <value>] ...
User-defined attributes (files)    -u <key> <value> [-u <key> <value>] ...

2.3 Built-in help

-h or --help
gives a brief description of all options.

2.4 Get Version Information

-V or --version
will print out vhistadd's version information and exit.

2.5 Use an Arg-File

-c <cmdfile> or
--cmdfile <cmdfile>
will read a file <cmdfile>.

An Arg-File is a file containing the commandline of one vhistadd call as described in Arg-Files.

2.6 Sync Current Working Directory

--sync-cwd
synchronizes the current working directory used by vhistadd with the path of the selected Arg-File.

If synchronizing is enabled, all paths (with the exception of readme and firstpage files) specified in the commandline or in any Arg-File are interpreted relative to the path of the first Arg-File.
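What synchronizing means for a relative path can be illustrated with a short sketch. The helper resolve_against_arg_file and the file names are hypothetical; the code only mimics the rule stated above using POSIX-style paths, is not taken from vhistadd, and does not model the exception for readme and firstpage files.

```python
from pathlib import PurePosixPath


def resolve_against_arg_file(arg_file, path):
    """Interpret `path` relative to the directory that contains `arg_file`.

    Absolute paths are returned unchanged.
    """
    base = PurePosixPath(arg_file).parent
    # Joining with an absolute path discards the base directory.
    return str(base / path)


print(resolve_against_arg_file("/data/study1/step2.args", "images/pet.v"))
print(resolve_against_arg_file("/data/study1/step2.args", "/tmp/filter.log"))
```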

2.7 Output VHIST file

-O <filename> [-O <filename>] [-O <filename>] ...

The name of the VHIST file which will be generated during this workflow step. If more than one file is specified, several VHIST files with identical content will be generated. At least one -O (capital letter "O") option must be specified.

By design, vhistadd prevents you from overwriting existing VHIST files except (a) when an output VHIST file is also specified as a VHIST root file or (b) when you choose the Append to VHIST file option.

2.8 Append to VHIST file

-A <filename>

If the file exists, the current workflow step will be appended to the file, preserving all previous content. If not, a new VHIST file of that name will be created.

2.9 VHIST root file

-I <filename> (file must exist)
-J <filename> (ignore if file cannot be read)

The name of the VHIST file which is used as the basis for the new VHIST file. The generated workflow step is appended to a copy of this VHIST document. The VHIST root file is not modified unless it is also specified as an output VHIST file. If this option is not specified, vhistadd starts with an empty document. The -I option will cause vhistadd to stop with an error message if the VHIST root file cannot be read; use -J if vhistadd should ignore a missing root file. (We have introduced -A, see previous subsection, to achieve the same effect without having to specify an output VHIST file with -O.)
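The difference between -I and -J can be summarized as a small decision function. This is only a sketch of the behaviour described above; pick_root_file is a hypothetical name and the code is not part of vhistadd.

```python
import os


def pick_root_file(path, must_exist):
    """Return the root file to start from, or None to start with an empty document.

    must_exist=True  mimics -I: an unreadable root file is an error.
    must_exist=False mimics -J: an unreadable root file is ignored.
    """
    if os.path.isfile(path) and os.access(path, os.R_OK):
        return path
    if must_exist:
        raise FileNotFoundError("VHIST root file cannot be read: " + path)
    return None
```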

2.10 Pre-defined attributes (workflow step)

-s <key> <value>

We distinguish between User-defined attributes (workflow step) and pre-defined properties or attributes of a workflow step. Use this option to set one or more of the following pre-defined attributes (it is an error to use other keywords with this option):

Attribute     Brief Description
title         The title of the workflow step.
description   A short description of the workflow step. Longer descriptions can be embedded as input files.
comment       Additional comments referring to this workflow step.
tool          A version string of the tool used in the workflow step.
toolversion   An alias for tool.
toolpath      The path to the executable of the tool.
host          The name of the host on which the workflow step was performed.
user          The name of the user who performed the workflow step.
command       The commandline used to execute the workflow step.
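When vhistadd is called from a script, the distinction between pre-defined and user-defined keys can be checked before the commandline is assembled. The sketch below is an illustration only: step_attribute_args is a hypothetical helper, and the key set is copied from the table above.

```python
PREDEFINED_STEP_KEYS = {
    "title", "description", "comment", "tool", "toolversion",
    "toolpath", "host", "user", "command",
}


def step_attribute_args(predefined, user_defined=None):
    """Return vhistadd arguments for workflow-step attributes.

    Keys in `predefined` must be pre-defined attributes (option -s);
    anything else belongs in `user_defined` (option -U).
    """
    args = []
    for key, value in predefined.items():
        if key not in PREDEFINED_STEP_KEYS:
            raise ValueError("not a pre-defined attribute: " + key)
        args += ["-s", key, value]
    for key, value in (user_defined or {}).items():
        args += ["-U", key, value]
    return args


print(step_attribute_args({"title": "Gauss 3D Filter"}, {"gauss-fwhm-mm": "2.3"}))
```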

2.11 User-defined attributes (workflow step)

-U <key> <value>
to set a user-defined attribute.

We distinguish between user-defined and Pre-defined attributes (workflow step). -U "my key" "my value" would create the attribute my key and set it to my value.

2.12 PDF-related properties

-d <key> <value> or
-doc <key> <value>

Valid keys are: producer, creator, author, keywords, title, subject

These properties are important when viewing VHIST-files using a PDF browser. We refer to [5] for a detailed description.

These properties can currently only be set for newly created VHIST files. When appending a workflow step to an existing VHIST file, an attempt to set PDF-related properties is ignored.

2.13 Input file(s) with qualifiers

-i <filename> or
--infile <filename>
specifies one or more input files.

The VHIST format distinguishes between the files present before the workflow step (infiles) and files which result from executing the specified tool (outfiles).

See Pre-defined flags (files), Pre-defined attributes (files) and User-defined attributes (files) for further options.

2.14 Output file(s) with qualifiers

-o <filename> or
--outfile <filename>
to specify one or more output files.

The VHIST format distinguishes between files present before the workflow step (infiles) and files which result from executing the specified tool (outfiles).

See Pre-defined flags (files), Pre-defined attributes (files) and User-defined attributes (files) for further options.

2.15 Pre-defined flags (files)

-f <flag>
sets one or more pre-defined flags.
It is an error to use keywords other than the pre-defined flags with this option.

A flag enables or disables a property of the previously specified in- or outfile. A flag can be negated by prepending no- to its name, e.g. no-embed.

Flag            Brief description
automd5         The MD5 sum [9] of the file's content is automatically calculated by vhistadd. If the file is embedded, this option is forced. By default, this option is enabled.
embed           The file's content is embedded into the generated VHIST file. By default, this option is enabled.
compress        The file is compressed using the flate compression method [10]. If the file's content is not embedded, this flag is ignored. By default, this option is enabled.
optional        The file is optional and only included in the VHIST file if it is readable, otherwise it is silently ignored. By default, this flag is not set, i.e. if vhistadd needs to access the file to compute the MD5 sum or read the last modification date but the file is not available, vhistadd will abort with an error message. If this flag is not set, but either of the flags embed or automd5 is set and the file is not readable, vhistadd will also abort with an error message.
previewws       Hint that a VHIST browser with a suitable template can use this file as a preview image for the whole workflow step. Currently must be a JPEG or PNG image.
preview         Hint that a VHIST browser with a suitable template can use this file as a preview image for the next embedded file. Currently must be a JPEG or PNG image. Tip: define a "description" attribute to ensure correct identification of the associated data file.
thumbnail       Hint that a VHIST browser with a suitable template can use this file as a small preview image (usually called "thumbnail", e.g. a typical file icon for image files). Currently must be a JPEG or PNG image.
thumbnailonly   Similar to "thumbnail" but can be used to hint that this thumbnail image should be used in the context of the immediately following file and that no detailed information about this (illustrative) image is required (or, indeed, desired) as all relevant data will be supplied by the next image's attributes. CAVEAT: If the associated data file is of the type outfile, the thumbnail should also be of this type; the same applies to the type infile.
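The interplay of defaults, negation with no- and the dependencies between embed, automd5 and compress can be sketched as follows. This illustrates the rules in the table and is not code from vhistadd; effective_flags is a hypothetical name, and treating the preview and thumbnail flags as "not set by default" is an assumption, since the table does not state their defaults.

```python
DEFAULT_FLAGS = {
    "automd5": True, "embed": True, "compress": True, "optional": False,
    # Assumption: the hint flags are off unless requested.
    "previewws": False, "preview": False,
    "thumbnail": False, "thumbnailonly": False,
}


def effective_flags(flag_args):
    """Apply -f flags (optionally prefixed with "no-") on top of the defaults."""
    flags = dict(DEFAULT_FLAGS)
    for name in flag_args:
        negated = name.startswith("no-")
        base = name[3:] if negated else name
        if base not in DEFAULT_FLAGS:
            raise ValueError("not a pre-defined flag: " + name)
        flags[base] = not negated
    if flags["embed"]:
        flags["automd5"] = True    # forced for embedded files
    else:
        flags["compress"] = False  # ignored when nothing is embedded
    return flags


print(effective_flags(["no-embed"]))
```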

2.16 Pre-defined attributes (files)

-a <key> <value>
sets one or more pre-defined attributes. It is an error to
use keywords other than the pre-defined attributes with this option.

An attribute describing a property of the previously specified in- or outfile.

Attribute     Brief Description
filetype      A user-defined text describing the file type. We suggest a convention reminiscent of the MIME [11] standard, e.g. binary/Analyze-Header.
description   A short description of the in- or outfile. There is no formal size limit; however, we suggest not using this option with text significantly longer than a few lines. If required, put the description in a file which can then be embedded.
comment       Additional comments referring to the in- or outfile. We suggest using this entry for observations made during this particular workflow step, e.g. unusually poor quality of data due to some hardware malfunction.
md5file       The name of a file which contains an MD5 checksum [9] in the first line. The sum is used as the user-specified MD5 sum for the in- or outfile. This attribute is useful in situations in which a file is not embedded and the checksum was already generated by another application, e.g. for very large files where calculating the MD5 sum is only feasible with low-priority background processes.
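A checksum file for the md5file attribute could be produced ahead of time by a separate script. The sketch below reads the data file in chunks so that very large files do not have to fit into memory. It is an assumption that the first line should hold only the hexadecimal digest; the guide states merely that the checksum is in the first line. The helper name write_md5_file is hypothetical.

```python
import hashlib


def write_md5_file(data_path, md5_path, chunk_size=1 << 20):
    """Compute the MD5 sum of data_path in chunks and store it in md5_path."""
    digest = hashlib.md5()
    with open(data_path, "rb") as handle:
        # read() returns b"" at end of file, which stops the iteration
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    with open(md5_path, "w") as handle:
        handle.write(digest.hexdigest() + "\n")
    return digest.hexdigest()
```

The resulting file would then be passed to vhistadd with -a md5file <filename> after the corresponding -i or -o option.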

2.17 User-defined attributes (files)

-u <key> <value>
sets a key value pair of your choice.

We distinguish between user-defined attributes and Pre-defined attributes (files).

-u "my key" "my value"

sets the attribute my key to the value my value.

2.18 Verbosity

-v or
--verbose

Use this option to have vhistadd generate more verbose output to stdout.

-q or
--quiet

If this option is set, the output to stdout is reduced to errors only.

2.19 Pretend Mode

-p or
--pretend

If this option is set, vhistadd only parses the commandline and generates the workflow-step but does not generate a VHIST file. This option is useful for verifying a commandline.

2.20 Customization: PDF first page

-1 or
--firstpage

A file containing the content of the first page of a newly created VHIST document. The content of the file must be encoded in UTF-8 [12]. The text can contain Wiki-like markup which supports bold, italic and one coloured (blue) font attributes. By default, vhistadd will use res/title.txt.

This option is ignored if the workflow step is appended to an existing VHIST file.

2.21 Customization: embedded readme

-r or
--readme

A file containing the readme, which is embedded at the beginning of a newly created VHIST document. The content of the file must be encoded in an ASCII compatible encoding, UTF-8 [12] is preferred. By default, vhistadd will use res/embedded_readme.txt. This option is ignored if the workflow step is appended to an existing VHIST file.

2.22 vhistadd examples

Please note that the following examples assume that you are using the "plain vanilla" Python version of vhistadd. The important parts of the examples are the commandline parameters (everything after the initial vhistadd.py). We suggest you use bin/setup.py for setting up paths and links on unixoid platforms (Linux, Mac OS X).

The actual calling syntax of the vhistadd tool depends on your platform and configuration, e.g.

UNIX/Linux/Mac OS X, assuming vhistadd was installed in /usr/local/vhist: /usr/local/vhist/bin/vhistadd.py ...

UNIX/Linux/Mac OS X, with a suitable PATH variable or alias: vhistadd.py ...

UNIX/Linux/Mac OS X, if the Python sources are (for some reason) not directly executable: python vhistadd.py ...

MS Windows without Python, assuming the VHIST executables were installed in Programs (no Python distribution is then necessary): C:\Program Files\VHIST\bin\vhistadd.exe

MS Windows with Python, assuming the VHIST executables were installed in Programs (you need to install a Python distribution first): cmd vhistadd.py

2.22.1 Example 1

The following example will generate a new VHIST file hello.vhist which "documents" a simple application of the echo command. CAVEAT: vhistadd will fail if hello.vhist already exists (this is by design to prevent you from unintentionally overwriting important data). You also need to take into account the general problems of passing commandline arguments with special characters (spaces, inverted commas) by escaping them (e.g. use "\!" if your text contains an exclamation mark) and/or using a suitable type of inverted commas.

vhistadd.py -s title "Hello VHIST" -s command "echo 'Hello VHIST'" -O hello.vhist
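If vhistadd is started from a Python script rather than typed into a shell, the quoting problems mentioned above can be avoided by passing the arguments as a list. This is a sketch only: the program name vhistadd.py has to be adjusted to your installation, and the actual call is left as a comment because it requires vhistadd to be installed.

```python
import shlex

# Each list element is passed to vhistadd as exactly one argument,
# so no shell escaping is required.
args = [
    "vhistadd.py",  # adjust to your installation
    "-s", "title", "Hello VHIST",
    "-s", "command", "echo 'Hello VHIST'",
    "-O", "hello.vhist",
]

# For reference: an equivalent commandline, quoted for POSIX shells.
print(shlex.join(args))

# To actually run the tool (requires vhistadd on the PATH):
#     import subprocess
#     subprocess.run(args, check=True)
```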

2.22.2 Example 2

This example demonstrates how to have vhistadd add information about a workflow step involving the co-registration of two image volumes (which is an important task in multi-modality imaging of human brains). We think the general idea about documenting this non-trivial task involving multiple files can be easily transferred to other experimental setups.

Assume we have a program my-coreg-tool that is capable of performing fully automatic co-registration of image volumes, and two image volumes of the same patient from two different modalities (here: MRI and PET scans of the patient's brain). The tool will write a coreg.log protocol, which we want to embed in the VHIST file (logs are usually so small that this is the recommended practice), in addition to information about the co-registration result (another image volume: regimage-pet.v).

We have used the ECAT7 file type in these examples, which encodes one image volume with meta information in a single file; binary indicates that this file type does not store information in any "human-readable" form.

vhistadd.py
-s title "Co-Registration Task"
   the workflow step's title
-s tool "my-coreg-tool v1.23"
   the program's name and version
-s toolpath "/usr/bin/my-coreg-tool"
   the program's full path
-i coreg.log -f embed
   input file: co-registration log, will be embedded
-a filetype "text/log"
   setting optional file type info on previous file
-i "pat-pet.jpg" -f no-embed
   input file: PET image volume of patient, do not embed
-a filetype "binary/ECAT7"
   setting optional file type info on previous file
-i "pat-mri.gif" -f no-embed
   input file: MRI image volume of patient, do not embed
-a filetype "binary/ECAT7"
   setting optional file type info on previous file
-o "regimage-pet.v" -f no-embed
   output file: co-reg. PET image volume of patient
-a filetype "binary/ECAT7"
   setting optional file type info on previous file
-O "regimage-pet.v.vhist"
   VHIST output file (mandatory)

Please note that we have tried to keep this example brief. In terms of "Good Scientific Practice" you can easily improve on this template by adding further attributes to the workflow step or to individual files: Pre-defined attributes (workflow step), User-defined attributes (workflow step), User-defined attributes (files).

2.22.3 Example 3

Based on the output of the previous example we now want to apply a 3D image filter to the registered image volume. Documentation of this next workflow step should take advantage of the existing processing history, i.e. use the VHIST file generated in the previous step.

This is achieved by specifying the previous VHIST as the VHIST root file: the VHIST file generated in this workflow step will then contain a copy of the full previous history, the new meta information on the current process (here: 3D filtering) is appended as a new section at the end.

# the workflow step's title
-s title "Gauss 3D Filter"
# an optional description
-s description "apply filter to PET image"
# the program's name and version
-s tool "my-filter-tool v2.34"
# the program's full path
-s toolpath "C:\Programs and Files\my-filter-tool.exe"
# optional attribute (key-value-pair)
-U "gauss-fwhm-mm" "2.3"
# input file: image volume, do not embed
-i "regimage-pet.v" -f no-embed
# output file: filter.log, will be embedded
-o "filter.log" -f embed
# the VHIST root file
-I "regimage-pet.v.vhist"
# VHIST file to generate
-O regimage-filtered.v.vhist
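An Arg-File in this commented style can also be generated by a script. The sketch below is hypothetical (the helpers quote_arg and arg_file_text are not part of VHIST) and only covers the syntax rules given in the Arg-Files section: arguments containing whitespace are enclosed in double quotes, and comments start with #. Since this guide does not describe how a double quote inside an argument would be escaped, such arguments are rejected.

```python
def quote_arg(arg):
    """Enclose an argument in double quotes if it is empty or contains whitespace."""
    if '"' in arg:
        raise ValueError("double quotes inside arguments are not supported here")
    if arg == "" or any(ch.isspace() for ch in arg):
        return '"' + arg + '"'
    return arg


def arg_file_text(entries):
    """Build Arg-File text from (comment, argument list) pairs."""
    lines = []
    for comment, args in entries:
        lines.append("# " + comment)
        lines.append(" ".join(quote_arg(arg) for arg in args))
    return "\n".join(lines) + "\n"


print(arg_file_text([
    ("the workflow step's title", ["-s", "title", "Gauss 3D Filter"]),
    ("VHIST file to generate", ["-O", "regimage-filtered.v.vhist"]),
]))
```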

3 vhistxs

The vhistxs commandline tool is part of the VHIST Core reference implementation and demonstrates the simplest approach to extracting data from a VHIST file. This short program is embedded in each VHIST file: it is part of the default "readme" which is located at the very beginning of each VHIST file.

vhistxs.py <vhistfile>

Figure 6: Synopsis of commandline tool vhistxs.

4 vhistxl

A significantly more powerful tool, vhistxl also allows for validation of extracted data and of sections of a VHIST file. In particular, the List Embedded Files option can be useful: consider it similar to the corresponding functionality of the UNIX tar command.

Function                       Commandline Option
Built-in Help (vhistxl)        vhistxl.py -h or --help
Get Version Information        vhistxl.py -V or --version
List Embedded Files            vhistxl.py -t or --list <vhistfile>
Validate Sections              -v or --validate
Extract all files              -x or --extract
Extract one file               -r <fileid> or --extract-file-<fileid>
Validate files                 -p or --pretend
Specify extraction directory   -d <dir> or --dir <dir>

4.1 Built-in Help (vhistxl)

-h or --help
gives a brief description of all options.

4.2 Get version information

-V or --version
will print out vhistxl's version information and exit.

4.3 List embedded files

-t or --list <vhistfile>
will only list all embedded files and not extract any data.

4.4 Validate sections

-v or --validate
will validate sections.

4.5 Extract all files

-x or --extract
will extract all files to disk (implies List Embedded Files).

4.6 Extract one file

-r <fileid> or --extract-file-<fileid>
will extract the file with the id <fileid>.

Use List Embedded Files to find the correct <fileid> of a particular file.

4.7 Validate files

-p or --pretend
will only decompress and test MD5 checksums [9] of embedded files, but not
write anything to disk.
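The kind of check performed in this mode can be illustrated in a few lines of Python. This is a sketch, not vhistxl's implementation: it assumes that the embedded payload has already been located in the VHIST file and that flate-compressed data is a zlib stream; the function name payload_is_valid is hypothetical.

```python
import hashlib
import zlib


def payload_is_valid(stored, expected_md5, compressed=True):
    """Check an embedded payload in memory without writing anything to disk."""
    try:
        content = zlib.decompress(stored) if compressed else stored
    except zlib.error:
        # Data that cannot be decompressed is treated as invalid.
        return False
    return hashlib.md5(content).hexdigest() == expected_md5.lower()
```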

4.8 Specify extraction directory

-d <dir> or --dir <dir>
will extract files to directory <dir>.

5 vhistzard

vhistzard is a collection of graphical user interface tools which are designed to make casual work with VHIST easier. vhistzard has been written in C++ using Version 4 of the Qt libraries from Trolltech [4] and runs natively on Windows, Mac OS X and Linux.

It provides facilities to browse through existing VHIST files, modify existing ones and create new ones.

Figure 8: vhistzard in action, here shown in the MS Windows incarnation.

5.1 Creating VHIST Arg-Files

5.1.1 Introduction

One component of vhistzard has been designed for easy and comfortable creation of commandline calls for use with vhistadd. It is a complementary effort to improve acceptance at sites that have little scripting experience, but we feel it might be useful to experienced users as well.

The window of this module is divided into 4 tabs: Main, In/Out Files, VHIST files and Export. Each individual tab is presented in the following subsections.

5.1.2 Main

First, we'll enter general information including a title and description of the workflow step that we either want to append to an existing VHIST file or use to start a new VHIST file.

Although the description is not technically limited in length, we recommend keeping it short and embedding any longer texts as simple text files.

Figure 9: Start with entering general information on a particular workflow step. The screenshot is from the Mac OS X version.

5.1.3 In/Out Files

Use this tab to specify which files have been used in a workflow step. We distinguish between files that were used as input in this step and files that were created, i.e. files of type output. All files you have specified will be listed in the box at the bottom.

If you have a number of similar files we suggest you define one in detail, use the Duplicate File button to copy the settings and then focus on the differences.

Figure 10: Specify which files have been used in a particular workflow step and what their functions were (input files vs. output files).

5.1.4 VHIST files

There are two types of files you can specify using this tab.

  1. A VHIST root file is optional (upper part of the dialog) and refers to an existing VHIST file (usually from a previous workflow step). If specified, its contents will be copied to the new VHIST file, any new information from the current workflow step is appended so that neither the copied data nor the root file are modified.
  2. You need to specify at least one VHIST file that will be created during this workflow step (bottom part of the dialog). For your convenience, we have added some settings that will derive the filename(s) from files you might have specified in the In/Out Files tab.

Figure 11: Choose how many VHIST files should be generated. You can select one of the presets for frequently used configurations.

5.1.5 Export

All your settings from the previous tabs will be converted to vhistadd commandline options. You can copy them to the clipboard and paste them into a commandline or other tool of your choice. Please also consider using the Arg-File option and having vhistzard write a file with automatic comments, see Use an Arg-File.

In this tab you can also configure whether the paths are meant to be interpreted as relative paths. The other option is to leave them as absolute paths, which is the default.

Figure 12: You can export all configured settings by clipboard to a commandline or other tool of your choice.

5.2 Running VHIST Arg-Files

vhistzard also contains a tool which is designed to assist in the creation of VHIST files from existing VHIST Arg-Files. It allows you to execute vhistadd without using a commandline.

Figure 13: A newly opened "Run VHIST Arg-File" window (here shown for a Linux flavour of vhistzard). The message in the output box points users to the website, the documentation and the examples.

5.2.1 Selecting a VHIST Arg-File

To select an Arg-File, enter the name of the file in the field labelled "Arg-File". You can also select a file by clicking on the browse button next to the text field or by dragging a file from your file manager (e.g. Explorer on Windows, the Finder on Mac OS X) onto the dialog. You can set a custom current working directory by unchecking the checkbox below the text fields and entering a path into the text field labelled "current working directory". This, however, is seldom needed. See also What is a Current Working Directory and Why You Should (Not) Bother About It.

5.2.2 Viewing the File's Content

To view the contents of the currently selected file, click on the tab labelled "File Content". This view can be used to verify that the correct file was selected or to inspect someone else's file. To prevent an accidentally selected file from filling up the computer's memory, only files smaller than a certain size are displayed. This is not a problem in practice since Arg-Files are usually no larger than a few kilobytes.
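
The idea of such a size guard can be sketched in a few lines of Python. This is an illustration only, not the vhistzard code: the limit of 64 KB, the constant MAX_PREVIEW_BYTES and the function read_for_preview are assumptions, since this guide does not state the actual threshold.

```python
import os

# Hypothetical limit; the real threshold used by vhistzard is not documented here.
MAX_PREVIEW_BYTES = 64 * 1024

def read_for_preview(path, limit=MAX_PREVIEW_BYTES):
    """Return the file's text if it is small enough, otherwise None."""
    if os.path.getsize(path) > limit:
        return None  # too large: do not load it into memory
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

Checking the size before opening the file means that a large binary file selected by mistake is never read at all.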

images/vhistit-content.png

Figure 14: The contents of the selected Arg-File are displayed in the textbrowser at the bottom of the window.

5.2.3 Executing a VHIST Arg-File

"Executing an Arg-File" here refers to running vhistadd using the options defined in the Arg-File (similar in concept to running a powerful commandline tool with a number of commandline options/arguments). A click on the "Run" button will start vhistadd using the selected Arg-File; the output is displayed in the "Commandline Output" tab below the "Run" button.

images/vhistit-run-ok.png

Figure 15: The correct run of vhistadd is indicated by an error code of 0.

If an error or warning occurs during the execution, it is highlighted in red or orange colour.

images/vhistit-run-error.png

Figure 16: Errors that occurred during the execution of vhistadd are highlighted in red.
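
If you prefer to script this step, the same pattern (start a commandline tool, collect its output, check for an exit code of 0) can be sketched in Python. The helper run_and_report is hypothetical, and the example deliberately runs the Python interpreter as a stand-in command: the exact vhistadd arguments are not repeated here and should be taken from the vhistadd documentation.

```python
import subprocess
import sys

def run_and_report(argv, cwd=None):
    """Run a command and return (exit_code, combined stdout/stderr text)."""
    result = subprocess.run(
        argv,
        cwd=cwd,                      # optional current working directory
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,     # merge error output into normal output
        text=True,
    )
    return result.returncode, result.stdout

# Stand-in command; replace with your actual vhistadd call.
code, output = run_and_report([sys.executable, "-c", "print('done')"])
print("success" if code == 0 else "failed with code %d" % code)
```

The cwd parameter plays the role of the "current working directory" field in the dialog.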

5.2.4 What is a Current Working Directory and Why You Should (Not) Bother About It

A call of vhistadd usually contains references to several files using file paths. These paths can be either "absolute" or "relative". An absolute path specifies the exact position of the file for a particular file system, e.g. C:\SomeDirectory for file C:\SomeDirectory\MyFile (MS Windows syntax). On UNIX/Linux systems a similar example is the absolute path /usr/local/data/mydir for file /usr/local/data/mydir/MyFile. This way of referencing is unambiguous but can be quite inflexible if you reorganize data.

Using relative paths is an alternative: a relative path is specified in relation to a reference directory, also known as the current working directory (CWD). Therefore, it is important to set the CWD correctly; otherwise files referenced with relative paths are not found (or, worse, wrong files might be used). Indeed, it is good practice to specify all relative paths inside an Arg file relative to the Arg file itself. This means that you usually do not have to care about the CWD and can just set it to the directory in which your Arg file is located. This is also the default option in the run dialog of vhistzard.
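
How a relative path is resolved against the directory of the Arg file can be illustrated with a short Python sketch. The function resolve and the file names are hypothetical; the sketch only mirrors the convention described above and is not taken from the VHIST sources.

```python
import os.path

def resolve(path, arg_file):
    """Resolve path against the directory that contains arg_file.

    Absolute paths are returned unchanged; relative paths are
    interpreted as if the CWD were the Arg file's directory.
    """
    if os.path.isabs(path):
        return path
    base = os.path.dirname(os.path.abspath(arg_file))
    return os.path.normpath(os.path.join(base, path))

# Hypothetical Arg file and a relative reference inside it.
arg_file = os.path.join("study", "step1.arg")
print(resolve(os.path.join("in", "scan.img"), arg_file))
```

With this convention the whole study directory can be moved or copied without editing the Arg file.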

5.2.5 Testing the Examples with vhistzard

The examples described in vhistadd - Examples are also part of all VHIST distributions and are located in the examples directory. Testing them on your system is quite easy.

images/vhistit-install-examples.png

Figure 17: To install the examples, select the "Install Examples" option from the "Extras" menu.

  • Select "Install Examples" from the "Extras" menu.
  • In the following dialog, select a directory into which the examples should be installed. A directory "Examples" will be created inside the selected directory.
  • If the directory already exists, vhistzard will ask if it should be overwritten.
  • In the next dialog, you can select one of the example files.
  • To run the example, just press the "Run" button.
  • The generated VHIST file is stored in the directory in which the Arg file is located.

5.3 Viewing VHIST files

One part of vhistzard is a tool to view VHIST files and to extract embedded files from them in a user-friendly manner. This tool provides a template-based preview ("themes") of the information contained inside the VHIST file (in HTML format) and allows for easy extraction of single files or complete workflow steps.

images/unvhist-prelim_new.png

Figure 18: The outline on the left-hand side is used for navigation inside the VHIST file as well as extraction of embedded files. The textbrowser next to it shows detailed information about the individual sections and files inside the VHIST file.

5.3.1 Opening a VHIST File

It is quite simple to open a VHIST file. Select "Open" from the "File" menu and navigate to the file to open it, or just drag the file from your file manager (e.g. Explorer on Windows, the Finder on Mac OS X) onto the dialog. vhistzard will read the file and present you with an outline of its content as well as augmented information about the files and sections.

5.3.2 Navigating Inside a VHIST File

To navigate inside a VHIST file, you can either use the outline on the left-hand side of the dialog or scroll through the textual information on the right-hand side. To jump to a specific section or file, double-click on the entry in the outline.

5.3.3 Extracting Embedded Files

To extract one or several files from a VHIST file, select the files in the outline and click on the "extract" button. For each selected file, a "Save File" dialog will appear. Embedded files are marked by a paper clip icon next to the filename in the outline.

5.3.4 Selecting a Template/Theme

You can use vhistzard to view the summary of your VHIST file by means of various templates ("themes"). Different templates show information at different levels of granularity and present it in different ways. To select a template, click on the drop-down list above the text viewer.

5.3.5 Technical information on Templates/Themes

A template is an XHTML2 [13] file which can be viewed by any HTML browser. It usually contains keywords (e.g. $COMMENT) which will then be replaced by the corresponding content from the current VHIST file, see below for a list of keywords which will be available when the template is being parsed. User-defined attributes (workflow step) are prepended with $USR:: and converted to upper case, so for a key pair ("MyKey", "MyValue") the placeholder $USR::MYKEY will yield the value "MyValue". We strongly suggest using the dump-all.htm template to generate a list of all available keywords for a given VHIST file before modifying a template. Keywords may contain alphanumerical characters, "_" and ":" (but not as the last character).
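
The keyword replacement described above can be approximated by a small Python sketch. It is not the parser used by vhistzard: the function names, the example values and the choice to leave unknown keywords untouched are assumptions made for illustration. The regular expression encodes the stated rule that keywords consist of alphanumerical characters, "_" and ":" and do not end with ":".

```python
import re

# "$" followed by allowed characters; the last character must not be ":".
KEYWORD = re.compile(r"\$([A-Za-z0-9_:]*[A-Za-z0-9_])")

def user_keyword(key):
    """Keyword name for a user-defined attribute, e.g. "MyKey" -> "USR::MYKEY"."""
    return "USR::" + key.upper()

def substitute(template, values):
    """Replace every known $KEYWORD; unknown ones are left as they are (assumption)."""
    def replace(match):
        return values.get(match.group(1), match.group(0))
    return KEYWORD.sub(replace, template)

values = {
    "COMMENT": "first reconstruction",      # hypothetical comment text
    user_keyword("MyKey"): "MyValue",       # user-defined attribute
}
print(substitute("Comment: $COMMENT, key: $USR::MYKEY", values))
# prints: Comment: first reconstruction, key: MyValue
```

A real template engine would also have to escape characters such as "<" and "&" in the inserted values; this is omitted here for brevity.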

The template can provide additional information on how information from the VHIST file should be rendered in HTML (or PDF) which can depend on the existence or values of certain entries. In addition, it is possible to define blocks of arbitrary HTML commands and use them as building blocks if certain conditions are met. In order to remain HTML compatible, we use HTML/XML comments and evaluate them in our own parser.

For each section/workflow step of the VHIST file, the template will be processed with a set of keywords for that particular workflow step, so e.g. $SECTION will contain the current section's name. There are named blocks with reserved names: __top__ and __bottom__ are only evaluated once and __files__ is evaluated for each file (of each section/workflow step).

In order to provide maximum flexibility, you can access all properties of all files of a section from the template. Keywords such as $FILEPATH[3] are available which in this case will refer to the path of the third file of that section. If you write $FILEPATH[*], the "*" will match the current file's index.
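
The following Python sketch illustrates how indexed keywords could be resolved. It is a simplified model, not the VHIST implementation: the data layout (one dictionary per file), the function name expand_indexed and the example paths are assumptions. Indices are counted from 1, in line with $FILEPATH[3] referring to the third file.

```python
import re

# $NAME[3] or $NAME[*]
INDEXED = re.compile(r"\$([A-Za-z0-9_:]*[A-Za-z0-9_])\[(\*|\d+)\]")

def expand_indexed(text, files, current):
    """Expand indexed keywords.

    files   -- list of dicts, one per file of the section (hypothetical layout)
    current -- 1-based index of the file currently being processed
    """
    def replace(match):
        name, index = match.group(1), match.group(2)
        number = current if index == "*" else int(index)
        if 1 <= number <= len(files) and name in files[number - 1]:
            return files[number - 1][name]
        return match.group(0)  # leave unresolved keywords untouched (assumption)
    return INDEXED.sub(replace, text)

files = [
    {"FILEPATH": "in/a.img"},
    {"FILEPATH": "in/b.img"},
    {"FILEPATH": "out/c.img"},
]
print(expand_indexed("third: $FILEPATH[3], current: $FILEPATH[*]", files, current=1))
# prints: third: out/c.img, current: in/a.img
```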

Similarities to the C/C++ preprocessor syntax are intended in the following list of commands for HTML/PDF generation:

  • #ifdef $KEYWORD - If $KEYWORD has a value (has been set in the VHIST file) the next HTML block will be used, otherwise it will be skipped.
  • #ifndef $KEYWORD - If $KEYWORD has a value (has been set in the VHIST file) the next HTML block will be skipped, otherwise it will be used.
  • #ifequal $KEYWORD value - If $KEYWORD is equal to value, the next HTML block will be used, otherwise it will be skipped.
  • #else - May only be used in conjunction with one of the #if statements.
  • #endif - Ends the HTML block started by one of the #if statements.
  • #begin(<name>) - Defines the start position of a named block.
  • #end(<name>) - Defines the end position of a named block.
  • #showblock(<name>) - Inserts a named block at this position.
  • #evenodd begin - Starts processing for $EVENODD keywords.
  • #evenodd end - Ends processing for $EVENODD keywords.
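
The named-block commands from the list above can be modelled with a short Python sketch. It assumes that the commands are written inside HTML comments and that <name> stands for a plain identifier; the block name "row" and both helper functions are hypothetical and do not reflect the actual parser.

```python
import re

BEGIN = re.compile(r"<!--\s*#begin\((\w+)\)\s*-->")
END = re.compile(r"<!--\s*#end\((\w+)\)\s*-->")
SHOW = re.compile(r"<!--\s*#showblock\((\w+)\)\s*-->")

def collect_blocks(lines):
    """Split template lines into (named blocks, remaining lines)."""
    blocks, rest, current = {}, [], None
    for line in lines:
        begin, end = BEGIN.search(line), END.search(line)
        if begin:
            current = begin.group(1)
            blocks[current] = []
        elif end:
            current = None
        elif current is not None:
            blocks[current].append(line)   # line belongs to the open block
        else:
            rest.append(line)
    return blocks, rest

def show_blocks(lines, blocks):
    """Replace every #showblock(<name>) line by the lines of that block."""
    out = []
    for line in lines:
        show = SHOW.search(line)
        if show:
            out.extend(blocks.get(show.group(1), []))
        else:
            out.append(line)
    return out

template = [
    "<!-- #begin(row) -->",
    "<tr><td>$FILEPATH[*]</td></tr>",
    "<!-- #end(row) -->",
    "<table>",
    "<!-- #showblock(row) -->",
    "</table>",
]
blocks, rest = collect_blocks(template)
print("\n".join(show_blocks(rest, blocks)))
```

Separating the definition of a block from its use makes it possible to reuse the same HTML fragment in several places of a template.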

In the following example, the first and last lines are HTML/XML comments which will be ignored by any HTML browser. When used in a template, the second line will only be used for rendering HTML/PDF output if the keyword $COMMENT has been set (which is the case if the current section/workflow step contains a comment).

<!-- #ifdef $COMMENT -->
Comment: $COMMENT<br/>
<!-- #endif -->

The next example demonstrates how to set content differently for HTML and for PDF generation:

<!-- #ifndef $ISPDFGEN -->
   this will only appear in the HTML version
<!-- #else -->
   this will only appear in the PDF file
<!-- #endif -->
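
To make the evaluation of such conditional blocks more concrete, here is a minimal line-based Python sketch. It is a simplified model under stated assumptions (one command per line, commands wrapped in HTML comments, well-formed nesting) and is not the parser used by vhistzard; the function name evaluate is hypothetical.

```python
import re

DIRECTIVE = re.compile(
    r"^\s*<!--\s*#(ifdef|ifndef|ifequal|else|endif)\b\s*(.*?)\s*-->\s*$"
)

def evaluate(lines, values):
    """Return only those lines whose surrounding conditions are met."""
    out = []
    stack = []  # one flag per open #if: is its current branch active?
    for line in lines:
        match = DIRECTIVE.match(line)
        if not match:
            if all(stack):          # every enclosing condition holds
                out.append(line)
            continue
        command, args = match.group(1), match.group(2).split()
        if command == "else":
            stack[-1] = not stack[-1]
        elif command == "endif":
            stack.pop()
        else:
            key = args[0].lstrip("$")
            if command == "ifdef":
                stack.append(key in values)
            elif command == "ifndef":
                stack.append(key not in values)
            else:  # ifequal: compare with the rest of the line
                stack.append(values.get(key) == " ".join(args[1:]))
    return out

template = [
    "<!-- #ifndef $ISPDFGEN -->",
    "this will only appear in the HTML version",
    "<!-- #else -->",
    "this will only appear in the PDF file",
    "<!-- #endif -->",
]
print(evaluate(template, {}))                    # HTML generation
print(evaluate(template, {"ISPDFGEN": "TRUE"}))  # PDF generation
```

Because the commands are ordinary HTML comments, a template remains a valid HTML file even when it is opened directly in a browser.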

The following special keywords are available (in alphabetical order):

  • $EVENODD - Processing of $EVENODD is activated between occurrences of #evenodd start and #evenodd end. The keyword will be replaced by the strings "odd" and "even", resp. These terms will be used in an alternating fashion, starting with "odd". This feature is generally useful when defining tables with rows that should have an alternating background color and it is required when the table's rows depend on some conditions which are not fully known when designing the table's layout. Template black-and-white.htm makes use of this feature.
  • $FILEHASUSRATTRIBUTES[*] - Will be set to "TRUE" if any User-defined attributes (files) have been defined for the workflow step, otherwise this keyword is undefined. User-defined attributes can be accessed with keywords of the form $USR::MYKEY[*].
  • $ISPDFGEN - Will be set to "TRUE" if content is generated for a PDF file, otherwise this keyword is undefined.
  • $ISFIRSTSECTION - Will be set to "TRUE" if the current section is the first section of the output (either HTML or PDF), otherwise this keyword is undefined.
  • $PREVIEWID[*] - Contains the ID of the current file's preview image if one was defined, see Pre-defined flags (files), otherwise this keyword is undefined.
  • $THUMBNAILID[*] - Contains the ID of the current file's thumbnail image if one was defined, see Pre-defined flags (files), otherwise this keyword is undefined.
  • $VHISTFILE - The file name (without the full path) of the selected VHIST file.
  • $VHISTFILESIZEINBYTES - The size of the selected VHIST file in Bytes.
  • $VHISTFILESIZEINMB - The size of the selected VHIST file in MB.
  • $VHISTFILELASTMODIFIED - The last-modified-date of the VHIST file.
  • $WSHASUSRATTRIBUTES - Will be set to "TRUE" if any User-defined attributes (workflow step) have been defined for the workflow step, otherwise this keyword is undefined. User-defined attributes can be accessed with keywords of the form $USR::MYKEY.
  • $WSPREVIEWID - Contains the ID of the section's preview image if one was defined, see Pre-defined flags (files), otherwise this keyword is undefined.
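
The alternating replacement of $EVENODD from the list above can be illustrated with a few lines of Python. The sketch assumes that the terms alternate with every occurrence of the keyword; the function name fill_evenodd and the table rows are hypothetical.

```python
import itertools
import re

def fill_evenodd(text):
    """Replace each $EVENODD by "odd", "even", "odd", ... in order of appearance."""
    toggle = itertools.cycle(["odd", "even"])
    return re.sub(r"\$EVENODD\b", lambda match: next(toggle), text)

rows = '<tr class="$EVENODD"/><tr class="$EVENODD"/><tr class="$EVENODD"/>'
print(fill_evenodd(rows))
# prints: <tr class="odd"/><tr class="even"/><tr class="odd"/>
```

Since the replacement happens after conditional blocks have been evaluated, rows that are skipped do not disturb the alternating pattern.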

Footnotes:

[1] S. Vollmar, A. Hüsgen, M. Sué, The VHIST Homepage (2009) URL http://www.nf.mpg.de/vhist

[2] S. Vollmar, A. Hüsgen, M. Sué, M. May, R. Krais, Workflow Histories and Image Data with Validation, Abstracts of the XI Turku PET Symposium (2008) p. 108. URL http://www.pet.fi/files/PET2008_book_of_abstracts.pdf

[3] Python Programming Language Official Website. URL http://www.python.org

[4] Qt-Trolltech URL http://trolltech.com/products/qt

[5] Adobe Corp., Adobe Systems Incorporated, PDF Reference, fourth edition, Adobe Portable Document Format Version 1.5. URL http://www.adobe.com/devnet/pdf/pdf_reference.html

[6] GNU General Public License (GPL). URL http://www.gnu.org/licenses

[7] Microsoft Inc., Command prompt (Cmd.exe) command-line string limitation. URL http://support.microsoft.com/kb/830473/EN-US

[8] GNU Project, BASH - GNU Project - Free Software Foundation (FSF). URL http://www.gnu.org/software/bash

[9] R. Rivest, The MD5 Message-Digest Algorithm, RFC 1321.

[10] L.P. Deutsch, DEFLATE Compressed Data Format Specification version 1.3, RFC 1951. URL http://www.ietf.org/rfc/rfc1951.txt

[11] N. Freed, N. Borenstein, Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, RFC 2046. URL http://www.ietf.org/rfc/rfc2046.txt

[12] W3C, Unicode (UTF-8, UTF-16) URL http://www.ietf.org/rfc/rfc2781.txt

[13] World Wide Web Consortium, XHTML2 Working Group Home Page. URL http://www.w3.org/MarkUp

Author: Stefan Vollmar <vollmar@nf.mpg.de>

Date: 2010-01-26
