Unix Lesson - Displaying more with "less"

Wolfgang Friebel , IT / DS

To browse files under UNIX you can use the excellent viewer less, the better alternative to "more". By making use of the environment variable LESSOPEN, less can be enhanced by external filters to become even more powerful. I will describe a filter that can be used at CERN by simply setting LESSOPEN:

LESSOPEN='|/usr/local/bin/ %s'; export LESSOPEN (ksh, bash, zsh)
setenv LESSOPEN '|/usr/local/bin/ %s' (csh, tcsh)

Most Linux distributions already come preconfigured with a filter that covers the most common situations. The input filter for less that I am going to describe understands many of the more common file formats and is easily extended to new ones. It is written in a ksh-compatible language (ksh, bash, zsh), as one of these shells is nearly always installed on UNIX systems and uses relatively few resources.

The design of the input filter is based on two main ideas. First, recognition of the file format is not based on the file suffix. That method from the DOS world is error-prone, and keeping the suffix list up to date is a tedious job. UNIX comes with the file command, which recognizes a large number of formats. Up-to-date file descriptions are included in the tarball, so maintaining the list of file formats is only a matter of obtaining a current version of the file package.
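
The idea of recognizing a format from the output of file, rather than from the suffix, can be sketched as follows. This is only an illustration, not the code of the filter itself: the function name classify and the format tags are hypothetical, and the matched phrases are assumptions about what file prints, which varies between versions.

```shell
# Map a description printed by file(1) to a short format tag.
# The phrases matched here are assumptions; check what your
# version of file actually prints before relying on them.
classify() {
  case "$1" in
    *"gzip compressed"*)  echo gzip ;;
    *"bzip2 compressed"*) echo bzip2 ;;
    *"Zip archive"*)      echo zip ;;
    *"tar archive"*)      echo tar ;;
    *troff*)              echo nroff ;;
    *)                    echo unknown ;;
  esac
}

# Usage sketch (needs file(1); -L follows symlinks, -b omits the file name):
#   classify "$(file -L -b "$path")"
```

Because the decision depends only on the description string, supporting a new format amounts to adding one more pattern to the case statement.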

The second idea is to be able to call the filter with a hierarchy of file names and finally to pull out the file at the bottom of the hierarchy. This allows you to look at individual files contained in an archive which itself could be part of a still bigger archive.

As the filter accepts only a single argument, a hierarchical list of file names has to be separated by a nonblank character. As the colon is rarely found in file names, it has been chosen as the separator character. At each stage in extracting files from such a hierarchy the file type is determined. This guarantees correct processing and display at each stage of the filtering.
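
Splitting such a colon-separated argument into its components needs nothing beyond shell parameter expansion. The following is a minimal sketch under that assumption; the function name split_hierarchy is hypothetical and is not taken from the filter.

```shell
# Print each component of a colon-separated hierarchy on its own line.
# A trailing colon yields a final empty component, so it is not lost.
# The body runs in a subshell, so "rest" does not leak to the caller.
split_hierarchy() (
  rest=$1
  while :; do
    case "$rest" in
      *:*) printf '%s\n' "${rest%%:*}"   # text before the first colon
           rest=${rest#*:} ;;            # continue after that colon
      *)   printf '%s\n' "$rest"         # last (possibly empty) component
           break ;;
    esac
  done
)
```

Called with the three-level argument file-3.27-43.i386.spm:file-3.27.tar.gz:file-3.27/file.c it prints the RPM name, the tarball name and the member name in that order, which is the order in which a filter would have to open them.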

To give an example, I show how one could display the man page found in the RPM source archive file-xxx.spm. The less command enhanced with the filter

less file-3.27-43.i386.spm

yields the following output

SuSE series: a 
-rw-r--r--   1 root     root        12953 Feb  3 11:45 file-3.27.dif 
-rw-r--r--   1 root     root       123541 Jul  6  1999 file-3.27.tar.gz 
-rw-r--r--   1 root     root         3398 Mar 25 07:31 file.spec 

Then the command

less file-3.27-43.i386.spm:file-3.27.tar.gz

produces the output

-rw-rw-r-- christos/christos  8740 1999-02-14 18:16 file-3.27/file.c 
-rw-rw-r-- christos/christos  4886 1999-02-14 18:16 file-3.27/file.h 
-rw-rw-r-- christos/christos 13428 1999-02-14 18:16 file-3.27/ 

The desired man page can finally be viewed with

less file-3.27-43.i386.spm:file-3.27.tar.gz:file-3.27/

The subcomponents of the argument to less were easily obtained by cut and paste using information contained in the previous lines of output. If you wanted to display the nroff sources instead, appending another colon at the end of the argument would have done the job:

less file-3.27-43.i386.spm:file-3.27.tar.gz:file-3.27/:

Even if the man page were compressed, this would still work, as it would have been uncompressed anyway. To also suppress the decompression of the source, a second colon would have to be appended to the argument.

Even extracting single files from an archive is possible, for example with

less file-3.27-43.i386.spm:file-3.27.tar.gz:file-3.27/file.c > file.c

The script can extract files up to a depth of 6, where applying a decompression algorithm counts as a separate level. In a few rare cases the file command does not recognize the correct format (especially with nroff). In such cases the filtering can be suppressed by a trailing colon on the file name.
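
Detecting the trailing colon that suppresses filtering is a simple pattern match. The sketch below shows one way to do it; the helper names wants_raw and strip_marker are hypothetical, and the sketch does not attempt to model the depth limit.

```shell
# Succeed when the argument ends in a colon, i.e. when the
# user asked for the last component to be shown unfiltered.
wants_raw() {
  case "$1" in
    *:) return 0 ;;
    *)  return 1 ;;
  esac
}

# Print the argument with a single trailing colon removed.
strip_marker() {
  printf '%s\n' "${1%:}"
}
```

A filter could test wants_raw first and, if it succeeds, pass the result of strip_marker on without running any format-specific command on the last component.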

The most recent additions allow you to browse MS Word files (using the very fast antiword command) and to look at the contents of DOS-formatted disks by accessing the proper device file.

To activate the filter, you only have to define the environment variable LESSOPEN as described above.

The current lesspipe package is available from the original distribution site or from SourceForge. It supports the following formats:

Compressed files

File format               Decoding method
gzip, pack and compress   uncompressed with gzip -c -d
bzip2                     uncompressed with bzip2 -c -d
zip                       uncompressed with unzip -lv (extracting with unzip -avp)

Other file formats

File format       Displaying method
tar               using GNU tar tvf (extracting files with tar Oxf)
nroff             using groff -s -p -t -e -Tascii -mandoc
ar library        using ar vt (extracting with ar p)
shared lib        using nm
executable        using strings
directory         using ls -lAL
rpm               using rpm -qiv -p and rpm2cpio | cpio -i -tv (extracting with rpm2cpio and GNU cpio)
Debian            using dpkg -I, dpkg -c (extracting with dpkg --fsys-tarfile)
html              using lynx -dump
Word              using antiword
pdf               using pdftotext
unmounted media   using tar or mdir (extracting with mtype)
rtf               using unrtf
dvi               using dvi2tty
ps                using ps2ascii and gs
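
A few rows of the tables above can be encoded as a lookup from a format tag to its display command. This is a sketch only: the function name viewer_for and the tags are hypothetical, the commands are copied from the tables, and the function merely prints the command rather than running it.

```shell
# Print the display command listed in the tables for a format tag.
# Returns a non-zero status for tags it does not know.
viewer_for() {
  case "$1" in
    gzip)  printf '%s\n' "gzip -c -d" ;;
    bzip2) printf '%s\n' "bzip2 -c -d" ;;
    zip)   printf '%s\n' "unzip -lv" ;;
    tar)   printf '%s\n' "tar tvf" ;;
    nroff) printf '%s\n' "groff -s -p -t -e -Tascii -mandoc" ;;
    html)  printf '%s\n' "lynx -dump" ;;
    word)  printf '%s\n' "antiword" ;;
    *)     return 1 ;;
  esac
}
```

Keeping the mapping in one case statement means that a new format needs a single extra line, provided a command line tool exists that can render it as text.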

For matters related to this article please contact the author.

Vol. XXXVII, issue no 1

Last Updated on Thu Mar 28 16:36:24 CET 2002.
Copyright © CERN 2002 -- European Organization for Nuclear Research