<TITLE>Introduction to Remote Procedure Call
<DATE>June 1988
<ADDRESS>
<ALINE>CERN, Data Handling Division, Online Group
</ADDRESS>
<AUTHOR>T.J. Berners-Lee
</TITLEP>
<ABSTRACT>
<HP1> Writing the software for a system of many processors is not always as simple as writing it for a single machine. Remote Procedure Call (RPC) is a technique which makes it relatively straightforward, allowing programs to span many machines without being aware of the boundaries between them. Programming with RPC requires no knowledge of communication software, and the same distributed application can even run over different communications media without modification. This introduction describes the principles of RPC, its advantages and limitations, and a few of the many applications for which it has been used at CERN over the last few years.</HP1>
</ABSTRACT>
</FRONTM>
<BODY>
<HP2 FONT=TITLE>Introduction</HP2>
At CERN (the European Particle Physics Laboratory) large numbers of minicomputers, workstations, and microprocessors embedded in complex VME and FASTBUS based systems are used for data acquisition and the control of particle physics experiments. The need to design distributed software that runs on many different types of processor and operating system has prompted the development of an infrastructure based on Remote Procedure Call, and of many related tools and applications. In this paper, we describe briefly the principles of RPC, and then discuss some typical uses to which it has been put at CERN.
<HP2 FONT=TITLE>What is Remote Procedure Call?</HP2>
Remote Procedure Call is a technique for building distributed systems. In essence, it allows a program on one machine to call a subroutine on another machine without knowing that it is remote. RPC is not a transport protocol: rather, it is a method of using existing communications features in a transparent way. This transparency is one of the great strengths of RPC as a tool.
Because the application software does not contain any communication code, it is independent of
<UL>
<LI>The particular communications hardware and protocols used
<LI>The operating system used
<LI>The calling sequence needed to use the underlying communications software
</UL>
<P> This means that application software can be designed and written before these choices have even been made. Because it takes care of any data reformatting needed, RPC also provides transparency to byte ordering and differences in data representation (real number formats, etc). RPC is not a new technique. It was first investigated thoroughly by Nelson [1] in 1976 and has been in use in academic and commercial areas for many years [2,3,4,5,6].
<HP2 FONT=TITLE>How does it work?</HP2>
Here we describe the components involved in a remote call, and introduce some conventional terms. From the point of view of a remote call, the calling program is known as the client, and the subroutine it calls is known as the server. When the client calls the server, the RPC system must take care of:
<OL>
<LI>Taking all the parameters which are passed to the subroutine and transferring them to the remote node;
<LI>Having the subroutine executed on the remote node; and
<LI>Transferring back all the parameters which are returned to the calling routine.
</OL>
<FIG ID=NFIG1 PLACE=INLINE CAP=LONG FRAME=NONE>
..if "&SYSTERMT" ne "APA6670" and "&SYSTERMT" ne "I3820" ..th ..go bypas1
<PICTURE NAME=NFIG1$$S>
...bypas1
<FIGCAP><HP1>In this example, a FORTRAN program on a minicomputer calls a subroutine (written in C) on a microprocessor to set the state of a valve. </HP1>
</FIG>
The most common method of doing this is by the use of stub modules. The client program is linked to a client stub module. This is a subroutine which looks (from the outside) in every respect like the remote subroutine.
On the inside, it is almost empty: all it does is take the values of the parameters which are passed to it and put them in a message. (This is known as marshalling.) The client stub then uses a routine in the RPC Run-Time System (RTS) to send the message off and wait for a reply message. When the reply arrives, the stub unmarshals the parameters that were returned in the reply message, putting their values into the variables of the calling program. The client stub then returns to the calling program just like a normal subroutine.
The server stub is located on the remote machine. It is called by the RPC run-time system when the message arrives from the client. The server stub performs the operations complementary to those of the client stub: unmarshalling the parameters passed to the subroutine, calling the subroutine, and marshalling the return parameters.
<FIG ID=nFIG2 PLACE=INLINE FRAME=NONE>
..if "&SYSTERMT" ne "APA6670" and "&SYSTERMT" ne "I3820" ..th ..go bypas2
<PICTURE NAME=NFIG2$$S>
...bypas2
<FIGCAP><HP1> The software components of a remote call. </HP1>
</FIG>
All the communication details are handled by the RPC run-time system, so the stubs contain only the code which is specific to the application involved. Each stub handles a specific set of procedures known as a package. In order to produce stub modules, one needs to know
<UL>
<LI>The names of the procedures in the package
<LI>The number of parameters which they take
<LI>The data type of each parameter
<LI>The direction in which each parameter is transferred.
</UL>
This information is normally declared in modern programming languages such as Ada. Such a declaration is simple to write, even if the rest of the code is written in FORTRAN. Given such a definition of a package, it is a mechanical task to write the stub modules, and so this is done by a program called an RPC compiler.
<FIG ID=NFIG3 PLACE=INLINE FRAME=NONE>
..if "&SYSTERMT" ne "APA6670" and "&SYSTERMT" ne "I3820" ..th ..go bypas3
<PICTURE NAME=NFIG3$$S>
...bypas3
<FIGCAP><HP1>The RPC compiler takes the package definition as input, and produces both stub modules.</HP1>
</FIG>
<HP2 FONT=TITLE>How to Use RPC</HP2>
The steps necessary to create an application system using RPC are outlined below. The more sophisticated techniques available are not discussed here, but are described in [7,8].
<HP2 FONT=HEAD>Designing the system</HP2>
The constraints imposed by the use of RPC are mostly those imposed by the rules of good software design.
<OL>
<LI>The parameters must be the only means of communication between modules. FORTRAN common blocks will of course not work between two different machines. This is not likely to arise with a new application, but must be checked if an existing application is to be split into a distributed one.
<LI>Pointers may not be passed between modules. In general, a pointer will not have any significance except on the machine on which it originated. If a package works by returning pointers to large data structures, then it must be modified before it can be used remotely.
.br
In some cases, a pointer is passed when in fact the significant thing is the data to which it points, not the actual value of the pointer. In this case, the RPC system can copy the data, and make a new pointer on the remote side. In the C language, this is the case with all parameters which are returned from the routine.
<LI>One should not needlessly pass large arrays of data. If a procedure is only going to reference a single element, that element should be passed explicitly, for efficiency.
</OL>
In addition, there will be extra constraints specific to the implementation of RPC in use. Our particular implementation imposes restrictions on the data types which the RPC compiler can handle, and, under certain circumstances, on the total number of bytes transferred in any call.
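The second and third constraints can be made concrete with a small Python sketch. It assumes a hypothetical wire format (a 4-byte count followed by 8-byte reals in network byte order), which is not the format of our implementation, and the helper names are invented for the illustration. It shows that the remote side receives a copy of the data rather than the caller's pointer, and how much larger the message becomes when a whole array is passed instead of the one element needed.

```python
import struct


def marshal_reals(values):
    """Marshal a sequence of reals: a 4-byte count, then 8-byte doubles."""
    return struct.pack("!I%dd" % len(values), len(values), *values)


def unmarshal_reals(message):
    """Rebuild the sequence on the receiving side as a fresh copy."""
    (count,) = struct.unpack_from("!I", message, 0)
    return list(struct.unpack_from("!%dd" % count, message, 4))


readings = [float(i) for i in range(1000)]

# Passing the whole array: every element travels in the message.
whole_array = marshal_reals(readings)

# Passing only the element the remote procedure actually needs.
one_element = marshal_reals([readings[42]])

# The receiver gets equal data in a new object, not the caller's pointer,
# so changes made remotely are invisible unless they are marshalled back.
remote_copy = unmarshal_reals(whole_array)
remote_copy[0] = -1.0

sizes = (len(whole_array), len(one_element))
```

With this assumed layout the two messages are 8004 and 12 bytes long, which is why a procedure that references a single element should be given that element explicitly.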
<HP2 FONT=HEAD>Generating the software</HP2>
The basic steps involved, for most RPC implementations, are as follows.
<OL>
<LI>Write the package definition file to define the software interface.
<LI>Run the RPC compiler to produce the stub code.
<LI>Link together the client modules (program, stub, RTS) to make the client program.
<LI>Link together the server modules (a standard main program, the server stub, the server routines themselves, and the RTS) to make the server program.
</OL>
In some cases, the application code has to be modified to add a few lines to initialise the run-time system and the stubs. This depends on whether the local operating system is sufficiently sophisticated to do this automatically. Apart from that initialisation code, the package definition is the only software which needs to be written to make an application run remotely.
<HP2 FONT=HEAD>Running the system: Naming and Addressing</HP2>
When the client program is run, the client stub must be connected, via the RTS, to the server stub. Normally, the client stub will not be aware of the physical address of the server (this would be too constraining in a real system). Therefore, the RTS has to take the logical name of the service required, and use it to find a suitable server. In our current implementation, local tables on each machine are normally used to give the addresses of the remote servers. Some other systems [9, 10] use central name servers to store this information; others rely on the address being compiled into the stubs or provided by the user program. Experience shows that many methods of address resolution must be available. A full discussion of this is, however, outside the scope of this introductory paper.
.pa
<HP2 FONT=TITLE>Example Applications</HP2>
In this section we discuss a few examples of applications in which Remote Procedure Call has been used effectively at CERN.
<HP2 FONT=HEAD>Remote File access</HP2>
Remote file and database access was one of the earliest uses of RPC.
The Sun Network File System, for instance, is implemented using the Sun XDR RPC system [4]. At CERN, the Valet-Plus VME-based test system [11] uses RPC running over Ethernet to access files on minicomputers and mainframes.
<HP2 FONT=HEAD>Remote Graphics</HP2>
In another RPC application, an active monitoring program may call GKS standard graphics primitives, which are executed on a remote workstation. In this case, a more complex server had to be written to handle a pool of server processes and windows in order to provide a rapid response to the creation and deletion of many independent client programs.
<HP2 FONT=HEAD>Remote software task management Load/Start/Control</HP2>
In this case, RPC allows a co-ordinating computer to perform the management functions needed to set up and start the software in a large data acquisition system. Each machine has a small server which allows the co-ordinating machine to start and stop programs and to interact with them. The level of complexity of the server depends on the machine: on a VAX, the server can start processes and load images for them to run [12], while on a simple monotask embedded M68020 processor, the server simply allows the co-ordinator access to memory locations (for program loading), the task's registers (for program starting) and the local tables (for system configuration). Attached processor farms are managed in a similar way. The same control program that runs on a local processor on VMEbus will run equally well on a VAX attached by Ethernet, as the functions it uses to control the processors and interrogate their state are available by RPC.
<HP2 FONT=HEAD>Other Examples</HP2>
Other examples of the use of RPC in experiments at CERN include: remote monitoring program control, remote FASTBUS access, remote error logging, remote terminal interaction with processors in VMEbus, the submission of operating system commands from embedded microprocessors, and many less general functions.
It is important to realise that the client-server relationship only applies to one call. In fact a program which is a server from one point of view may also be a client of another facility. A server may call back routines within the client it is serving.
<FIG ID=NFIG4 PLACE=INLINE CAP=LONG FRAME=NONE>
..if "&SYSTERMT" ne "APA6670" and "&SYSTERMT" ne "I3820" ..th ..go bypas4
<PICTURE NAME=NFIG4$$S>
...bypas4
<FIGCAP><HP1>RPC allows software modules responsible for different areas to call each other irrespective of processor boundaries.</HP1>
</FIG>
<HP2 FONT=TITLE>Conclusions</HP2>
Remote Procedure Call is a technique which is becoming increasingly important in physics experiments, with the profusion of distributed processing power. Experience at CERN over the last three years has demonstrated that the same design technique can be applied in environments as diverse as embedded eight-bit processors without operating systems, M68020 processors running with or without real-time executives, VAX systems and workstations running VMS, and personal computers. The communication media have also been varied, ranging from raw Ethernet and RS232 links to the use of standard Class 4 transport services, DECnet, and fast dedicated hardware links. The fact that the application code is largely independent of this variety, and of future variations, has been an aid both to simplified design and to program portability.
<HP2 FONT=HEAD>Acknowledgements</HP2>
Many people in this group and in collaborating institutes have been involved in the development of the RPC system, including T. Adye (RAL), R. Bagnara (CERN), D. Gosman (NIKHEF), R. Jones (CERN), I. Martinez (Univ. of Santander), A. Pastore (CERN), J. Raab (CERN), N. Schraudolph (Univ. of Essex), L. Tremblet (CERN). I am also indebted to my group leader, D.M. Sendall, and to B. Segal for their encouragement and advice.
Ada is a trademark of the US Government, Ada Joint Program Office. Unix is a trademark of Bell Laboratories.
VAX, VMS and DECnet are trademarks of Digital Equipment Corporation.
.pa
<HP2 FONT=HEAD>References</HP2>
<DL>
<DT>[1] <DD>B.J. Nelson, "Remote Procedure Call", XEROX PARC CSL-81-9, May 1981
<DT>[2] <DD>B.E. Carpenter, R. Cailliau, "Experience with Remote Procedure Calls in a Real-time Control System", <HP1>Software Practice and Experience,</HP1> Vol 14(9), 901-907 (Sep 84)
<DT>[3] <DD>K. Kostro, CERN/DD, "Portable Communication with Remote Procedure Call Protocols", private communication
<DT>[4] <DD>Sun Microsystems, <HP1>External Data Representation Reference Manual.</HP1> Sun Microsystems, Jan. 1985.
<DT>[5] <DD>Xerox Corporation, <HP1> Courier: The Remote Procedure Call Protocol, </HP1>Xerox Corp, XSIS 038112, Dec 81
<DT>[6] <DD>CCITT "Red Book" Volume VIII, Recommendation X.409: "Message Handling Systems: Presentation Transfer Syntax and Notation"
<DT>[7] <DD>T.J. Berners-Lee, CERN/DD, <HP1>RPC User Manual</HP1>, available from the author
<DT>[8] <DD>T.J. Berners-Lee, CERN/DD, "Experience with Remote Procedure Call in Data Acquisition and Control", <HP1>Proceedings of the 5th Conference on Real-Time Computer Applications in Nuclear, Particle and Plasma Physics</HP1>, San Francisco, May 1987
<DT>[9] <DD>Apollo Computer, <HP1>Network Computing System: A Technical Overview, </HP1>private communication
<DT>[10] <DD>D. Notkin, et al. "Interconnecting Heterogeneous Computer Systems", <HP1>Commun. ACM 31</HP1>, 3, pp 258-273.
<DT>[11] <DD>Y. Perrin, et al. "The Valet-Plus System Embedded in Large Physics Experiments", <HP1>these proceedings</HP1>.
<DT>[12] <DD>S. Vascotto, "MODEL" (CERN VAX data acquisition tools), <HP1>CERN Mini and MicroComputer Newsletter No 17</HP1>, Dec 1987.
</DL>