



The Design and Implementation of Large Packages: Hbook, GEANT3, PAW, ROOT

Rene Brun, IT/DI


Abstract

The evolution of software development as seen by a CERN physicist who has been involved in the development of physics software since the very beginning of his professional life.


When the CNL editor kindly suggested that I write an article for this special edition of the Newsletter, I hesitated for a long time, for two reasons:

  • I still feel pretty young, but have to admit that I have been around for three decades by now. At a time when many Internet startup/down managers are teenagers, I thought that CNL readers might not be interested in stories of the kind "How good were the days when my grandmother made her tasty strawberry jam".
  • The second and main reason was that when you have been involved in the development of large packages, used on a large scale in our field of research and beyond, you know very well that the true story involves not only technical matters but also, and mainly, sociological events, with people moving between projects, sections, groups and divisions.
In the end, I agreed to write this note because I have always had a fantastic opportunity to work in a very challenging area, in contact with thousands of people around the world. During this time, I have witnessed a tremendous evolution of the computer industry, and also of the way large software systems are developed.

The seventies

In 1971, I was lucky to be accepted as a summer student. At the time, I was preparing my thesis on a small experiment at the SC, evaluating the differential cross-sections of interactions between protons and thin targets ranging from Beryllium to Gold. The events consisted of only two quantities: the energy of a particle absorbed by a NaI block, and the dE/dx information collected by a germanium diode. The two-byte events were recorded on rolls of paper tape, about 1 cm per event and 10,000 events per roll. To analyze the events, histograms and scatter plots of dE/dx versus energy had to be produced. That is how I started a long career with histogramming packages. The paper tape rolls were "manually" converted to temporary disk files with a lifetime of a few days and analyzed on the CDC6600 in the room where the printing shop is today. The analysis program consisted of about one "box" of 2,000 punched cards, holding about 50 Fortran subroutines. The name of each routine was written with a pen on the top edge of its cards. The size of a subroutine was carefully chosen so that there were enough cards to make the name visible, and the order of the routines in the box corresponded in general to the logical sequence of execution. The size of the executable module was typically a few tens of kilobytes. Running a job was a long process consisting of the following steps:
  • Put your box in one of the input trays X, S, M, L, T (express, short, medium, long jobs, and jobs with tapes). The jobs were queued in each tray.
  • An operator read the box into the card reader and controlled the execution.
  • The result of the job was written to the line printer output.
  • Once an hour, the line printer outputs were sorted into special racks.
Starting in 1972, a new and faster facility (FOCUS) was made available. You could enter your program yourself into the card reader and get the output on a line printer in the users' area. While your job was running, you could check its status by typing a command of the style "Job inquiry brun 1287". While waiting for the job to complete, you had plenty of opportunities to chat with your colleagues, and the cafeteria was just nearby.

Immediately after completing my thesis, I was hired by the DD division to work on special hardware processors to be used in a p-p elastic scattering experiment at the ISR. Due to problems with the hardware, I migrated gradually to software activities. In the fall of 1973, the first version of HBOOK was available, and this was the beginning of a long saga.

With my colleagues Michel Hansroul and Jean-Claude Lasalle, I was working on the so-called Electronic Experiments (as opposed to Bubble Chamber Experiments). Together we started the development of several packages; I will name only a few (FFread, Lintra, Mudifi, Vertex). In 1974, we launched a framework for simulation, initially called Geneve (Generation of Events). When we discovered that this name was already used by at least one other program, we changed it to GEANT (Generation of Events and Tracking). After a first rudimentary version 1, we came up with a more reasonable package, GEANT2, in 1977.

During this period, most users were still working with punched cards. I had already switched to a more interactive system using a teletype. The editor was very primitive, but I could do everything from the teletype. In 1975, I was happy to have exclusive access to a huge but noisy Tektronix 4002 display, connected to the FOCUS CDC3100 system (a front-end to the CDC6600). Running a session was a complex operation:

  • To log in, I had to type a long message with my name and a tape number. Following this message, a lamp flashed in the computer center to attract the attention of the operators, who mounted the tape and copied the contents of my previous session to a local disk with a total capacity of about 10 Mbytes.
  • I could then use an advanced editor and submit my jobs interactively to the CDC6600. I could also inspect the graphics metafile produced by the job, a huge step forward.
  • To log out, the same lamp flashed again to instruct the operator to copy my files back to tape; the disk files were then deleted and the space reused by another client. Only a few people (fewer than five) could use the system concurrently.
A major step came with the Intercom system. Intercom was a replacement for FOCUS, using a CDC6400 and a CDC6500 as front-ends to the CDC7600. At the same time, I switched from my good old Tektronix 4002 to a more lightweight display, a Tektronix 4006. I developed the HBOOK and GEANT2 systems in this environment. We were always struggling with the memory limitations of the CDC7600 and became experts at building large programs using overlays and the LCM (Large Core Memory) of the machine.

Our products were well documented. The first HBOOK manual was produced in 1974 with a text processor called BARB, whose input and output characters were exclusively uppercase. A better version, BARBascii, available in 1975, allowed upper- and lowercase text in the output.

The eighties

The memory size limitations and the growing complexity of our programs pushed us to implement memory managers. I implemented a very simple system called ZBOOK (8,000 lines) that we used both as a memory manager and as a data structure manager. At the same time, a more powerful but also more complex system, HYDRA, had been designed for the bubble chamber software.

The GEANT2 system had been used quite successfully in several experiments, but it lacked a very important component: a geometry package. Drawing on my experience with the simulation and reconstruction programs of the deep inelastic muon scattering experiment NA4, I developed several prototypes of geometry packages before finding an acceptable solution. I was lucky to meet Andy McPherson from OPAL, who helped me implement the package. GEANT3 was born in 1981. At the time, I was far from imagining that the system would survive for more than two decades. The move from the painful Intercom/Scope/CDC7600 environment to the user-friendly VAX/VMS environment made the rapid development of GEANT3 possible. The OPAL VAX/780 from RAL, with 1 Mbyte of memory and a 66 Mbyte disk, plus the nice compiler and debugger, was an essential step in developing complex software. By the end of 1982, we had made a lot of progress in developing the simulation program for OPAL. L3 joined the effort, participating in the development of the tracking system with Francis Bruyant and later of the hadronic physics processes with Federico Carminati.

GEANT3 was based on the ZBOOK system. An official project called GEM had been launched in 1981 with the goal of delivering an enhanced data structure management system in view of LEP, and GEANT3 was waiting for GEM. However, because of multiple problems of a kind I have seen repeated many times since, this ambitious project was stopped. This generated hot discussions between the experiments. In 1983, following the GEM problems, and strongly pushed by L3, an agreement was reached with Julius Zoll, the main developer of HYDRA, to create a new system that we named ZEBRA (ZBOOK + HYDRA). We quickly replaced the HBOOK and GEANT3 data structure managers by ZEBRA.

In June 1982, I was pleased to accept an invitation to visit the Apollo company in Boston. I immediately realized the potential and the advantages of multiple windows on the same screen and of a user-friendly operating system. In September 1982, I had the first Apollo workstation in Europe in my office (half a Megabyte of memory and a 33 Mbyte disk). This again strongly contributed to speeding up the development of GEANT3.

At the same time, I had been involved in an interactive version of HBOOK called HTV. HTV used a small command interpreter called ZCEDEX and could be used to produce plots by calling the HBOOK and HPLOT systems interactively. A special version working only on Apollos (HTVGUI) was developed; it offered more interactivity by exploiting the bitmap functionality of the workstation. HTV had a growing number of users, and in 1985 a group of people started a series of meetings to prepare the ground for designing a successor, which Rudy Bock named PAW, the Physics Analysis Workstation. The birth of PAW was a difficult exercise. There was a conflict between those pushing to adopt standards (GKS in particular) and the associated commercial systems, and those pushing for a more integrated, home-grown system. If we had followed the recommendations of the design committee, PAW would have had a primitive command line interface, a menu system using the SMG (Screen Management Graphics) designed for a VT100, a GKS-only graphics interface and no ntuples. An attempt was made to set up a collaboration with Eric Bashler from DESY, who had designed a popular package called GEP. Unfortunately, GEP was written in PL/1, a language popular at the time but with restricted portability. It was clear that we had to develop a more powerful package, portable to as many platforms as possible.

I enjoyed coordinating the design and implementation of PAW until 1993. The development of such systems takes time and requires intensive cooperation between developers and users. One of the characteristics of interactive systems is that there is nearly an infinity of possible combinations that can be used; the testing phase cannot be done without a large user base. Vice versa, having many users imposes many constraints, such as backward compatibility. It is also imperative to provide efficient user support.

Some exoticism in the middle of the 80s

In 1985, we saw the first serious campaign to move away from Fortran. The ADA language had been adopted by the DoD in the USA, and many people thought that HEP could not stay "behind". It was an interesting experience to participate in the ADA club discussions. A technical student, Jean-Luc Dekeyser, implemented an important subset of GEANT3 in ADA (the materials, cross-sections and tracking) as the subject of his diploma. The language indeed had very interesting features, but was totally unusable in practice; in particular, the compile/link cycle was far too long.

Some effort was spent porting our software to IBM emulator machines, the 168E and then the 370E, which proved useful in several online applications. Because the mainframe world was dominated by IBM at that time, we had to adapt our mini interactive systems to the semi-interactive system Wylbur. We also made some attempts to use the native IBM/TSO interactive system for applications such as HTV or GEANT3. In the mid-80s, supercomputing was synonymous with vector computing. We had a CDC7600 and IBM mainframes with some vector capabilities. We spent a tremendous amount of time trying to understand how one could vectorize large programs such as GEANT3; most of these attempts failed. In 1988, RISC machines appeared and were generally considered the alternative to vector computers. The very positive experience with my Apollo DN10000 pushed OPAL to start the HOPE project, the predecessor of SHIFT.

Discussions on software methodologies were a hot topic. The SASD methodology, in particular, was fashionable. You were considered a hacker and a dumb programmer if you could not show some bubble diagrams for your code. Many managers liked these systems, seen as a solid guarantee of good design. I remember a nice talk at the CHEP conference in Oxford in 1990 with the title "Bubbles and no code". Was this talk fatal to this methodology in HEP? I do not know, but its main advocates stopped preaching its virtues soon after.

The nineties

At the end of the 80s and in the early 90s, we spent considerable effort on the continuous development of the PAW and GEANT3 systems. With PAW, we had moved from the old GKS times to a continuous development of the HIGZ system, with more and more interfaces (obviously X11) and more operating systems (the explosion of Unix flavours). The success of the row-wise ntuples called for new developments: the column-wise ntuples. A Motif-based version called PAW++ became available in 1992.

GEANT3 was also in continuous development. The electromagnetic processes were continuously improved by Michel Maire and Laszlo Urban. We had several interfaces to hadronic packages: the small package Tatina, developed by Tony Baroncelli, was gradually replaced by GHEISHA, FLUKA and CALOR. At the same time, the facilities developed for PAW were gradually introduced into GEANT3.

In the early 90s, we had an interesting parenthesis with the so-called MPPs (Massively Parallel Processors). It was a painful exercise to show that parallelism other than event-level parallelism was a non-trivial task, and this generated many lively discussions. A parallel version of PAW called PIAF was developed by Fons Rademakers in collaboration with Hewlett-Packard. The successful experience with PIAF for the analysis of large ntuples was essential to the design and implementation of PROOF (the Parallel ROOT Facility) a few years later.

On the software side, we had a pleasant workshop in Erice at the end of 1989. The first solid attacks against Fortran were made, and the first advocates of Object-Oriented Programming appeared in our field. In 1990, Toby Burnett from SLAC had an interesting prototype called GISMO, a simulation program written in Objective-C and developed on a very advanced workstation (the NeXT). This was also the time when Tim Berners-Lee had started the development of the Web. Tim was in my group at that time, and I remember all the discussions between him and the majority of us who did not understand what he was doing (described in another article in this Newsletter).

Since the Erice meeting, we had been under strong pressure to investigate Object-Oriented Programming. Following CHEP in Annecy in September 1992, I became convinced that we had to do something concrete. However, Fortran90 had strong advocates, and in many places it was usual to hear "I do not know what the next programming language in HEP will be, but I know that it will be called Fortran". I had been working a bit with Mike Metcalf, who had written several nice introductory examples in Fortran90. Thanks to his examples, I realized how difficult it would be to implement an I/O subsystem without run-time type information support in the language. Discussions on languages quickly degenerated between the strong believers in Fortran90 and the strong believers in OO languages. Which OO language was best was not clear: a project called MOOSE had concluded that Eiffel was the best OO language, but more and more people were looking at C++ as a serious candidate.

The years 1993 and 1994 were a turning point in my professional career. After long and hot debates, I started a new project, ROOT, with my colleagues Fons Rademakers and Masaharu Goto from Japan, and it generated even more pleasant and hot debates. These two lines just to mention that the transition from the procedural, Fortran-based programming environment to an Object-Oriented environment has taken far more time than was initially anticipated by everybody, including the true believers.

The new Century

There are now many developers with several years of experience in OO technology. Different ideas are being tried in different collaborations. Competition in several software areas has been a good thing; at least we can compare different solutions. This is true for database technology, but also for graphics, user interfaces and methodologies. Frameworks like the ROOT system consist of more than 500 classes and more than 500,000 lines of C++ code. Thanks to the fantastic evolution of computers, memories, disk capacities and tools, a small team of two or three people can still manage these large systems, which will probably reach one or two million lines of code when the LHC starts in a few years. I do all my development on my laptop with 256 Mbytes of memory and 20 Gbytes of disk: yes, more than a factor of 1,000 more powerful than the machine on which I developed HBOOK in 1973. Compared to a few years ago, when a major release of similar-sized systems took at least one week, we can now generate in just a few hours the tarballs for three times more systems, distributed on many machines on three continents.

In all the projects in which I have been involved, users have obviously played a key role. Involving users early in a project, and gaining new users at a rate corresponding to the project's state of maturity, is the key to success. The naive views of the early eighties, with the waterfall development model, have always failed; more appropriate models with macro and micro cycles have been more successful. Large systems take years to reach sufficient stability (at least 5 years for PAW, and more than 6 years for GEANT3 or ROOT, for example). Stability is a major requirement for a system to be used on a large scale. This explains why the transition to the OO world has taken so much time. PAW was a success once it reached a stability point, and also because it was the "lingua franca" for data analysis. However, stability does not mean that a system is frozen: it means that the main concepts are there and already proven with a large user base.

Getting a large user base is a chicken-and-egg problem. For sure, with today's Internet technology and Open Source concepts, things are easier. However, going from a few users to a few tens, then a few hundreds, then a few thousands is a non-trivial task. Past experience is a great help. User support with instantaneous responses to a problem is a must; it is the most difficult thing to do, but also the most rewarding in the long term. Just to give some numbers to illustrate the scale of the problem: the ROOT tar files have been downloaded more than 100,000 times via the web, the Web site has about 500,000 clicks per month, and the Users Guide has been downloaded 15,000 times in two months. To these numbers one must add about 4,000 messages per year to the public list and nearly 10,000 other messages related to the project. Yes, in some sense this can be seen as a nightmare, but it is also a fantastic opportunity to meet people in all countries of the world. When I start answering mail in the morning, I take advantage of the time differences between countries, starting with mail coming from Japan and Asia, then Europe; at noon, it is time to process the mail from the US East coast, and so on. Our world is really a small village.

I want to thank all my colleagues in the small teams of the PAW, GEANT3 and ROOT projects for their essential contributions. They all know the effort required to support thousands of users, but also the immense satisfaction of working in this very challenging area.



For matters related to this article please contact the author.
Cnl.Editor@cern.ch


CERN-CNL-2001-001
Vol. XXXVI, issue no 1


Last Updated on Thu Apr 05 15:28:11 CEST 2001.
Copyright © CERN 2001 -- European Organization for Nuclear Research