Distributed Computing Technologies Explained:
RMI vs. CORBA vs. DCOM
by
Dare Obasanjo
RMI, CORBA, DCOM, RPC, COM+, etc. to many are
just meaningless buzzwords that smack more of hype than results.
This article is aimed at demystifying these technologies that have
revolutionized distributed computing and are transforming
businesses all over the world. The article is not strictly
targetted at the layman but deep programming knowledge is not
needed.
My primary reason for writing this article is that I received an
email a while ago from someone who found two papers I wrote on the
subject, entitled
Introduction To Distributed Computing and
Distributed Object Technologies Compared respectively,
and expressed interest in an article.
Why distributed computing?
Once networking computers began to take hold it soon became
necessary to share resources and data amongst them in a cohesive
manner. Today numerous computer applications use distributed
computing in one form or the other: From large scale ERP
applications that allow monitoring of all the diverse aspects of
the average corporation, to
file servers,
web application servers,
groupware servers, database servers and even print servers that
enable several machines to share a single printer.
A more recent application of distributed computing that is
becoming prevalent now that the processing power of computers is
sufficiently advanced and their use became widespread it harnessing
the power of many personal computers in tandem and using them to
process data in much the same way that mainframes were used in days
of yore.
In the beginning there was RPC
The first distributed computing technology to gain widespread use
was the Remote Procedure Call (RFC 1831)
commonly known as RPC. RPC is
designed to be as similar to making local procedure calls as
possible. The idea behind RPC is to make a function call to a
procedure in another process and address space either on the same
processor or across the network on another processor without having
to deal with the concrete details of how this should be done
besides making a procedure call.
Before an RPC call can be made, both the client and the server
both have to have
stubs for the remote function that are usually generated by an
interface definition language (IDL). When an RPC call is made
by a client the arguments to the remote function are
marshalled and sent across the network and the client waits
until a response is sent by the server. There are some difficulties
with marshalling certain arguments such as pointers since a memory
address on a client is completely useless to the server so various
strategies for passing pointers are usually implemented the two
most popular being a.) dissallowing pointer arguments and b.)
copying what the pointer points at and sending that to the remote
function.
An RPC function locates the server in one of two ways:
- Hard coding the address of the remote server which is
extremely infexible and may require a recompile if that server
goes down.
- Using dynamic
binding where various servers
export whatever interfaces/services they support and clients
pick which server that they want to use out of those that
supports whatever service is needed.
RPC programs, as well as other distributed systems, face a number
of problems which are unique to their situation such as
- Network packets containing client requests being lost.
- Network packets containing server responses being lost.
- Client being unable to locate its server.
There are a variety of solutions to solving these problems which
can begleaned from the various links provided in this article.
The need for distributed object and component systems
As distributed computing became more widespread, more flexibility
and functionality was required than RPC could provide. RPC proved
suitable for
Two-Tier Client/Server Architectures where the application
logic is either in the user application or within the actual
database or file server. Unfortunately this was not enough, more
and more people wanted a
Three-Tier Client/Server Architectures where the application is
split into client application (usually a GUI or browser),
application logic and data store (usually a database server). Soon
people wanted to move to N-tier aapplications where there are
several seperate layers of application logic in between the client
application and the database server.
The advantage of N-tier applications is that the application logic
can be divided into reusable, modular components instead of one
monolithic codebase. Distributed object systems solved many of the
problems in RPC that made large scale system building difficult, in
much the same way Object Oriented paradigms swept Procedural
programing and design paradigms. Distributed object systems make it
possible to design and implement a distributed system as a group of
reusable, modular and easily deployable components where complexity
can be easily managed and hidden behind layers of abstraction.
CORBA
A
CORBA application usually consists of an Object
Request Broker (ORB), a client and a server. An ORB is
responsible for matching a requesting client to the server that
will perform the request, using an object reference to locate the
target object. When the ORB examines the object reference and
discovers that the target object is remote, it marshals the
arguments and routes the invocation out over the network to the
remote object's ORB. The remote ORB then invokes the method
locally and sends the results back to the client via the network.
There are many optional features that ORBs can implement besides
merely sending and receiving remote method invocations including
looking up objects by name, maintaining persistent objects, and
supporting transaction processing. A primary feature of CORBA is
its interoperability between various platforms and programming
languages.
The first step in creating a CORBA application is to define the
interface for the remote object using the OMG's interface
definition language (IDL). Compiling the IDL file will yield two
forms of stub files; one that implements the client side of the
application and another that implements the server. Stubs and
skeletons serve as proxies for clients and servers, respectively.
Because IDL defines interfaces so strictly, the stub on the client
side has no interacting with the skeleton on the server side, even
if the two are compiled into different programming languages, use
different ORBs and run on different operating systems.
Then in order to invoke the remote object instance, the client
first obtains its object reference via the Orb. To make the remote
invocation, the client uses the same code that it would use in a
local invocation but use an object reference to the remote object
instead of an instance of a local object. When the ORB examines the
object reference and discovers that the target object is remote, it
marshals the arguments and routes the invocation out over the
network to the remote object's ORB instead of to another
process within the on the same computer.
CORBA also supports dynamically discovering information about
remote objects at runtime. The IDL compiler generates type
information for each method in an interface and stores it in the
Interface Repository (IR). A client can thus query the IR to
get run-time information about a particular interface and then use
that information to create and invoke a method on the remote CORBA
server object dynamically through the
Dynamic Invocation Interface (DII). Similarly, on the server
side, the
Dynamic Skeleton Interface (DSI) allows a client to invoke an
operation of a remote CORBA Server object that has no compile time
knowledge of the type of object it is implementing.
CORBA is often considered a superficial specification because it
concerns itself more with syntax than with semantics. CORBA
specifies a large number of services that can be provided but only
to the extent of describing what interfaces should be used by
application developers. Unfortunately, the bare minimum that CORBA
requires from service providers lacks mention of security, high
availability, failure recovery, or guaranteed behavior of objects
outside the basic functionality provided and instead CORBA deems
these features as optional. The end result of the lowest common
denominator approach is that ORBs vary so wildly from vendor to
vendor that it is extremely difficult to write portable CORBA code
due to the fact that important features such as transactional
support and error recovery are inconsistent across ORBs.
Fortunately a lot of this has changed with the development of the
CORBA Component Model, which is a superset of Enterprise Java Beans.
DCOM/COM+
Distributed
Component Object Model (DCOM)is the distributed version of
Microsoft's COM technology
which allows the creation and use of binary objects/components from
languages other than the one they were originally written in, it
currently supports Java(J++),C++, Visual Basic, JScript, and
VBScript. DCOM works over the network by using proxy's and
stubs. When the client instantiates a component whose registry
entry suggests that it resides outside the process space, DCOM
creates a wrapper for the component and hands the client a pointer
to the wrapper. This wrapper, called a proxy, simply marshals
methods calls and routes them across the network. On the other end,
DCOM creates another wrapper, called a stub, which unmarshals
methods calls and routes them to an instance of the component.
DCOM servers object can support multiple interfaces each
representing a different behavior of the object. A DCOM client
calls into the exposed methods of a DCOM server by acquiring a
pointer to one of the server object's interfaces. The client
object can the invoke the server object's exposed methods
through the acquired interface pointer as if the server object
resided in the client's address space. All DCOM components and
interfaces must inherit from IUnknown, the base DCOM interface.
IUnknown consists of the methods AddRef(), Release() and
QueryInterface(). AddRef() and Release() are used to for reference
counting and memory management. Essentially, when an object's
reference count becomes zero, it must self-destruct.
Java RMI
Remote Method
Invokation (RMI) is a technology that allows the sharing of
Java objects between
Java Virtual Machines (JVM) across a network. An RMI
application consists of a server that creates remote objects that
conform to a specified interface, which are available for method
invocation to client applications that obtain a remote reference to
the object. RMI treats a remote object differently from a local
object when the object is passed from one virtual machine to
another. Rather than making a copy of the implementation object in
the receiving virtual machine, RMI passes a remote stub for a
remote object. The stub acts as the local representative, or proxy,
for the remote object and basically is, to the caller, the remote
reference. The caller invokes a method on the local stub, which is
responsible for carrying out the method call on the remote object.
A stub for a remote object implements the same set of remote
interfaces that the remote object implements. This allows a stub to
be cast to any of the interfaces that the remote object implements.
However, this also means that only those methods defined in a
remote interface are available to be called in the receiving
virtual machine.
RMI provides the unique ability to dynamically load classes via
their byte codes from one JVM to the other even if the class is not
defined on the receiver's JVM. This means that new object types
can be added to an application simply by upgrading the classes on
the server with no other work being done on the part of the
receiver. This transparent loading of new classes via their byte
codes is a unique feature of RMI that greatly simplifies modifying
and updating a program.
The first step in creating an RMI application is creating a remote
interface. A remote interface is a subclass of
java.rmi.Remote, which indicates that it is a remote object
whose methods can be invoked across virtual machines. Any object
that implements this interface becomes a remote object.
To show dynamic class loading at work, an interface describing an
object that can be serialized and passed from JVM to JVM shall also
be created. The interface is a subclass of the
java.io.Serializable interface. RMI uses the object
serialization mechanism to transport objects by value between Java
virtual machines. Implementing Serializable marks the class as
being capable of conversion into a self-describing byte stream that
can be used to reconstruct an exact copy of the serialized object
when the object is read back from the stream. Any entity of any
type can be passed to or from a remote method as long as the entity
is an instance of a type that is a primitive data type, a remote
object, or an object that implements the interface
java.io.Serializable. Remote objects are essentially passed by
reference. A remote object reference is a stub, which is a
client-side proxy that implements the complete set of remote
interfaces that the remote object implements. Local objects are
passed by copy, using object serialization. By default all fields
are copied, except those that are marked static or transient.
Default serialization behavior can be overridden on a
class-by-class basis.
Thus clients of the distributed application can dynamically load
objects that implement the remote interface even if they are not
defined in the local virtual machine. The next step is to implement
the remote interface, the implementation must define a constructor
for the remote object as well as define all the methods declared in
the interface Once the class is created, the server must be able to
create and install remote objects. The process for initializing the
server includes; creating and installing a security manager,
creating one or more instances of a remote object, and registering
at least one of the remote objects with the RMI remote object
registry (or another naming service such as one that uses JNDI), for
bootstrapping purposes. An RMI client behaves similarly to a
server; after installing a security manager, the client constructs
a name used to look up a remote object. The client uses the
Naming.lookup method to look up the remote object by name in the
remote host's registry. When doing the name lookup, the code
creates a URL that specifies the host where the server is running.
Pick your poison
I'm a big fan of Java so I'm partial to both RMI and
CORBA. Anyone who has had experience with any of the three
aforementioned technologies is welcome to post below.
References and further reading
Component Engineering
Corncupia
Java RMI Tutorial
Introduction To CORBA (uses Java)
Dr.
GUI's Gentle Guide To COM
Using CORBA and Java IDL
© 2001 Dare Obasanjo