The Globus Toolkit 3 Programmer's Tutorial

Borja Sotomayor

This tutorial is available for use and redistribution under the terms of the Globus Toolkit Public License


Introduction

Welcome to the Globus Toolkit 3 Programmer's Tutorial! This document is intended as a starting point for anyone who is going to program grid-based applications using the Globus Toolkit 3 (GT3). We also hope experienced GT3 programmers will find it useful to learn about the more advanced aspects of GT3 and Grid Services.

The tutorial is divided into 3 main areas:

  • Getting Started: An introduction to key concepts related to Grid Services and GT3.

  • GT3 Core: A guide to programming basic Grid Services which only use the core services in GT3.

  • GT3 Security Services: A guide to programming secure Grid Services which use the toolkit's Security Services.

Future versions of the tutorial will include sections on GT3 Higher-Level Services (programming Grid Services which use GT3 services such as Index Service, Job Management, File Transfer, etc.).

GT3 Prerequisite Documents

This tutorial has no GT3 prerequisite documents, since it is intended as a starting point for GT3 programmers. However, you should already be familiar with Grid Computing. The following book can help you get up to speed: The Grid: Blueprint for a New Computing Infrastructure (Edited by Ian Foster and Carl Kesselman). Most of the book is easy to read and not too technical. It is also known as "The Grid Bible". With a name like that, you can assume it's worth a look :-) You might be even more interested in the second edition, released in 2003, which includes tons of new material: The Grid 2: Blueprint for a New Computing Infrastructure (Edited by Ian Foster and Carl Kesselman).

You might also be interested in taking a look at the 'Publications' section on the Globus website, especially the documents listed below. However, these documents are rather technical and might be too hard for a beginner. You might want to just skim through them at first, and then reread them once you're familiar with GT3.

Audience

This document is intended for programmers who wish to program grid-based applications with GT3. Readers who have absolutely no experience with Web Services or the Globus Toolkit should read the whole document. Readers who have some experience with GT3 can safely skip most of the introductory material.

Assumptions

The following knowledge is assumed:

  • Programming in Java. If you don't know Java, you can find some useful links here. Also, prior experience of distributed systems programming with Java (with CORBA, RMI, etc.) will certainly come in handy, but is not strictly required.

  • Basic knowledge of XML. If you have no idea of XML, you can find some useful links here.

  • You should know your way around a UNIX system. This tutorial is mainly UNIX-oriented, although in the future we hope to include sections for Windows users.

  • Basic knowledge of what The Grid and grid-based applications are. This tutorial is not intended as an introduction to Grid Computing, but rather as an introduction to a toolkit which can enable you to program grid-based applications.

The following knowledge is not required:

  • Web Services. The tutorial includes an introduction to fundamental Web Services concepts needed to program Grid Services.

  • Globus Toolkit 2

Related Documents

The Globus Toolkit includes some very useful documents. The ones most related to this document are:

  • Java User's Guide: $GLOBUS_LOCATION/docs/users_guide.html

  • Java Programmer's Guide: $GLOBUS_LOCATION/docs/java_programmers_guide.html

  • Programmer's API: $GLOBUS_LOCATION/docs/api/index.html

Replace $GLOBUS_LOCATION with the root of your GT3 installation. A team at IBM led by Luis Ferreira has written a thorough Redpaper titled GT3 Quick Start which explains the GT3 installation process in detail.

GT3 users have also contributed installation and programming guides:

Once you've become a Grid Services expert, you might have to occasionally take a look at the OGSI specification, available at the OGSI Working Group page.

Document Conventions

The following conventions will be observed in this document.

Code

public class HelloWorld
{
  public static final void main( String args[] )
  {
     // Code in bold is important
     System.out.println("Hello World");
  }
} 

Shell commands

javac HelloWorld.java

If a command is too long to fit on a single line, it will be wrapped onto several lines using the backslash ("\") character. On most UNIX shells (including BASH) you should be able to copy and paste all the lines at once into your console.

javac \
-classpath /usr/lib/java/Hello.jar \
HelloWorld.java \
HelloUniverse.java \
HelloEveryone.java 

Notes

You can find two types of notes in the text: General notes, and warnings.

[Note]

This is a general note.

Notes of this kind are usually placed after a block of code to point out where you can find the file that contains that particular code. They are also used to remind you of important concepts, and to suggest which sections of the tutorial you should read again if you have a hard time understanding a particular section.

[Caution]

This is a warning.

Warnings are used to emphatically point out something. They generally refer to common pitfalls or to things that you should take into account when writing your own code.

About the author & acknowledgments

The Globus Toolkit 3 Programmer's Tutorial is written and maintained by Borja Sotomayor, a Ph.D. student at the Department of Computer Science at the University of Chicago. You can find out more about me on my UofC personal page.

Acknowledgments

This tutorial can hardly be considered a one-person effort. The following people have, in one way or another, helped to make the GT3 Tutorial a reality:

  • Lisa Childers

  • Rebeca Cortazar

  • Ian Foster

  • Leon Kuntz (in memoriam)

  • Jesus Marco

  • All the Globus gurus who have reviewed the tutorial on countless occasions

Last, but certainly not least, a lot of readers have helped to improve the tutorial by reporting bugs and typos, as well as making very constructive comments and suggestions:

Balamurali Ananthan, Sebastien Barre, Thomas Becker, Luther Blake, Robert M. Bram, Javier Cano, Paulo Cortes, Jun Ebihara, Qin Feng, Luis Ferreira, Fernando Fraticelli, Anders Keldsen, Britt Johnston, Steve Mock, Elizabeth Post, Philippe Prados, Michael Schneider, Shiva Shankar Chetan, Nelson Sproul, Ian Stokes-Rees, Jason Young, Matthew Vranicar, James Werner

If you've reported a bug, typo, or helped out in any way, and you are not listed here, please do let me know!

Getting Started

Chapter 1. Key Concepts

There are certain key concepts that must be well understood before being able to program with GT3. This chapter gives a brief overview of all those fundamental concepts.

  • OGSA, OGSI, and GT3 : We'll take a look at what these oft-mentioned acronyms mean, and how they are related.

  • Web Services : OGSA, OGSI, and GT3 are based on standard Web Services technologies such as SOAP and WSDL. You don't need to be a Web Services expert to program with GT3, but you should be familiar with the Web Services architecture and languages. We provide a basic introduction and give you pointers to interesting sites about Web Services.

  • Grid Services : Grid Services are the core of GT3. We take a look at what a Grid Service is, and how it is related to Web Services.

  • The GT3 Architecture : After seeing both Grid Services and Web Services, we take a look at the whole GT3 architecture, and how Grid Services fit in it.

  • Java & XML : Finally, if you want to use GT3, you need to be able to program in Java, and to understand basic XML. If you're new to Java and XML, we provide a couple of links that can help you get started.

OGSA, OGSI, and GT3

The third and latest version of the Globus Toolkit is based on something called Grid Services. Before defining Grid Services, we're going to see how Grid Services are related to a lot of acronyms you've probably heard (OGSA, OGSI, ...), but aren't quite sure what they mean exactly. The following diagram summarizes the major players in the Grid Services world:

OGSA

The Open Grid Services Architecture (OGSA), developed by The Global Grid Forum, aims to define a common, standard, and open architecture for grid-based applications. The goal of OGSA is to standardize practically all the services one finds in a grid application (job management services, resource management services, security services, etc.) by specifying a set of standard interfaces for these services.

However, when the powers-that-be undertook the task of creating this new architecture, they realized they needed to choose some sort of distributed middleware on which to base the architecture. In other words, if OGSA (for example) defines that the JobSubmissionInterface has a submitJob method, there has to be a common and standard way to invoke that method if we want the architecture to be adopted as an industry-wide standard. This base for the architecture could, in theory, be any distributed middleware (CORBA, RMI, or even traditional RPC). For reasons that will be explained further on, Web Services were chosen as the underlying technology.

However, although the Web Services Architecture was certainly the best option, it still had several shortcomings which made it inadequate for OGSA's needs. OGSA overcame this obstacle by defining an extended type of Web Service called Grid Service (as shown in the diagram: Grid Services are defined by OGSA). A Grid Service is simply a Web Service with a lot of extensions that make it adequate for a grid-based application (and, in particular, for OGSA). In the diagram: Grid Services are an extension of Web Services. Finally, since Grid Services are going to be the distributed technology underlying OGSA, it is also correct to say that OGSA is based on Grid Services.

OGSI

OGSA alone doesn't go into much detail when describing Grid Services. It basically outlines what a Grid Service should have (that Web Services don't) but little else. That is why OGSA spawned another standard called the Open Grid Services Infrastructure (OGSI, also developed by The Global Grid Forum) which gives a formal and technical specification of what a Grid Service is. In other words, for a high-level architectural view of what Grid Services are, and how they fit into the next generation of grid applications, OGSA is the place to go. For an excruciatingly detailed specification of how Grid Services work, OGSI is the place to go. In the diagram: Grid Services are specified by OGSI (as opposed to simply 'being defined' by OGSA). The Related Documents section has a link to the Grid Service Specification.

The Globus Toolkit 3

The Globus Toolkit is a software toolkit, developed by The Globus Alliance, which we can use to program grid-based applications. The third version of the toolkit (GT3) includes a complete implementation of OGSI (in the diagram: GT3 implements OGSI). However, it's very important to understand that GT3 isn't only an OGSI implementation. It includes a whole lot of other services, programs, utilities, etc. Some of them are built on top of OGSI and are called the WS (Web Services) components, while others are not built on top of OGSI and are called the pre-WS components. We'll take a closer look at the GT3 architecture shortly.

I still don't get it: What is the difference between OGSA, OGSI, and GT3?

Consider the following simple example. Suppose you want to build a new house. The first thing you need to do is to hire an architect to draw up the blueprints, so you can get an idea of what your house will look like. Once you're happy with the architect's job, it's time to hire an engineer who will plan all the construction details (where to put the master beams, the power cables, the plumbing, etc.). The engineer then passes all these plans to qualified professional workers (construction workers, electricians, plumbers, etc.) who will actually build the house.

We could say that OGSA (the definition) is the blueprints the architect creates to show what the building looks like, OGSI (the specification) is the structural design that the engineer creates to support the architect's vision for the building, and GT3 is the bricks, cement, and beams used to build the building to the engineer's specifications.

A short introduction to Web Services

Since Web Services are the basis for Grid Services, understanding the Web Services architecture is fundamental to using GT3 and programming Grid Services.

Lately, there has been a lot of buzz about "Web Services", and many companies have begun to rely on them for their enterprise applications. So, what exactly are Web Services? To put it quite simply, they are yet another distributed computing technology (like CORBA, RMI, EJB, etc.). They allow us to create client/server applications.

For example, let's suppose I have to develop an application for a chain of stores. These stores are all around the country, but my master catalog of products is only available in a database at my central offices, yet the software at the stores must be able to access that catalog. I could publish the catalog through a Web Service called ShopService.

IMPORTANT: Don't mistake this with publishing something on a website. Information on a website (like the one you're reading right now) is intended for humans. Information which is available through a Web Service will always be accessed by software, never directly by a human (despite the fact that there might be a human using that software). Even though Web Services rely heavily on existing Web technologies (such as HTTP, as we will see in a moment), they have no relation to web browsers and HTML. Repeat after me: websites for humans, Web Services for software :-)

The clients (the PCs at the store) would then contact the Web Service (in the server), and send a service request asking for the catalog. The server would return the catalog through a service response. Of course, this is a very sketchy example of how a Web Service works. In a moment we'll see all the details.

Some of you might be thinking: "Hey! Wait a moment! I can do that with RMI, CORBA, EJBs, and countless other technologies!" So, what makes Web Services special? Well, Web Services have certain advantages over other technologies:

  • Web Services are platform-independent and language-independent, since they use standard XML languages. This means that my client program can be programmed in C++ and running under Windows, while the Web Service is programmed in Java and running under Linux.

  • Most Web Services use HTTP for transmitting messages (such as the service request and response). This is a major advantage if you want to build an Internet-scale application, since most of the Internet's proxies and firewalls won't mess with HTTP traffic (unlike CORBA, which usually has trouble with firewalls).

Of course, Web Services also have some disadvantages:

  • Overhead. Transmitting all your data in XML is obviously not as efficient as using a proprietary binary format. What you gain in portability, you lose in efficiency. Even so, this overhead is usually acceptable for most applications, but you will probably never find a critical real-time application that uses Web Services.

  • Lack of versatility. Currently, Web Services are not very versatile, since they only allow for some very basic forms of service invocation. CORBA, for example, offers programmers a lot of supporting services (such as persistence, notifications, lifecycle management, transactions, etc.). In fact, on the next page we'll see that Grid Services actually make up for this lack of versatility.

However, there is one important characteristic that distinguishes Web Services. While technologies such as CORBA and EJB are geared towards highly coupled distributed systems, where the client and the server are very dependent on each other, Web Services are more adequate for loosely coupled systems, where the client might have no prior knowledge of the Web Service until it actually invokes it. Highly coupled systems are ideal for intranet applications, but perform poorly on an Internet scale. Web Services, however, are better suited to meet the demands of an Internet-wide application, such as grid-oriented applications.

A Typical Web Service Invocation

So how does this all actually work? Let's take a look at all the steps involved in a complete Web Service invocation. For now, don't worry about all the acronyms (SOAP, WSDL, ...) We'll explain them in detail in just a moment.

  1. As we said before, a client may have no knowledge of what Web Service it is going to invoke. So, our first step will be to find a Web Service that meets our requirements. For example, we might be interested in locating a public Web Service which can give us the temperature in US cities. We'll do this by contacting a UDDI registry.

  2. The UDDI registry will reply, telling us what servers can provide us the service we require (e.g. the temperature in US cities)

  3. We now know the location of a Web Service, but we have no idea of how to actually invoke it. Sure, we know it can give us the temperature of a US city, but what is the actual service invocation? The method we have to invoke might be called Temperature getCityTemperature(int CityPostalCode), but it could also be called int getUSCityTemp(string cityName, bool isFahrenheit). We have to ask the Web Service to describe itself (i.e. tell us how exactly we should invoke it).

  4. The Web Service replies in a language called WSDL.

  5. We finally know where the Web Service is located and how to invoke it. The invocation itself is done in a language called SOAP. Therefore, we will first send a SOAP request asking for the temperature of a certain city.

  6. The Web Service will kindly reply with a SOAP response which includes the temperature we asked for, or maybe an error message if our SOAP request was incorrect.

Web Services Addressing

We have just seen a simple Web Service invocation. At one point, the UDDI registry 'told' the client where the Web Service is located. But...how exactly are Web Services addressed? The answer is very simple: just like web pages. We use plain and simple URIs (Uniform Resource Identifiers). If you're more familiar with the term URL (Uniform Resource Locator), don't worry: URI and URL are practically the same thing.

For example, the UDDI registry might have replied with the following URI:

http://webservices.mysite.com/weather/us/WeatherService

This could easily be the address of a web page. However, remember that Web Services are always used by software (never directly by humans). If you typed a Web Service URI into your web browser, you would probably get an error message or some unintelligible code (some web servers will show you a nice graphical interface to the Web Service, but that isn't very common). When you have a Web Service URI, you will usually need to give that URI to a program. In fact, most of the client programs we will write will receive the Grid Service URI as a command-line argument.
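As a rough sketch of that last point, here is a small Java program that takes a service URI as a command-line argument and splits it into its parts using the standard java.net.URI class. The class name ServiceUriDemo and the http/https check are our own assumptions for illustration, and the fallback URI is simply the example above; nothing here is specific to GT3.

```java
import java.net.URI;

public class ServiceUriDemo {

    // Parses the text and rejects anything that is not an absolute http(s) URI.
    // URI.create throws IllegalArgumentException if the text is malformed.
    static URI parseServiceUri(String text) {
        URI uri = URI.create(text);
        String scheme = uri.getScheme();
        if (scheme == null || !(scheme.equals("http") || scheme.equals("https"))) {
            throw new IllegalArgumentException("expected an http or https URI: " + text);
        }
        return uri;
    }

    public static void main(String[] args) {
        // Fall back to the example URI from the text when no argument is given.
        String text = args.length > 0
                ? args[0]
                : "http://webservices.mysite.com/weather/us/WeatherService";
        URI uri = parseServiceUri(text);
        System.out.println("Host: " + uri.getHost());
        System.out.println("Path: " + uri.getPath());
    }
}
```

Validating the URI up front means a mistyped address fails with a clear message before any network traffic is attempted.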

Web Services Architecture

Now that we've seen the different players in a Web Service invocation, let's take a closer look at the Web Services Architecture:

  • Service Discovery: This part of the architecture allows us to find Web Services which meet certain requirements. This part is usually handled by UDDI (Universal Description, Discovery, and Integration). GT3 currently doesn't include support for UDDI.

  • Service Description : One of the most interesting features of Web Services is that they are self-describing. This means that, once you've located a Web Service, you can ask it to 'describe itself' and tell you what operations it supports and how to invoke it. This is handled by the Web Services Description Language (WSDL).

  • Service Invocation : Invoking a Web Service (and, in general, any kind of distributed service such as a CORBA object or an Enterprise Java Bean) involves passing messages between the client and the server. SOAP (Simple Object Access Protocol) specifies how we should format requests to the server, and how the server should format its responses. In theory, we could use other service invocation languages (such as XML-RPC, or even some ad hoc XML language). However, SOAP is by far the most popular choice for Web Services.

  • Transport : Finally, all these messages must be transmitted somehow between the server and the client. The protocol of choice for this part of the architecture is HTTP (HyperText Transfer Protocol), the same protocol used to access conventional web pages on the Internet. Again, in theory we could use other protocols, but HTTP is currently the most widely used one.
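To make the Service Invocation layer a little more concrete, the following sketch builds a SOAP 1.1 request envelope as a plain string and then parses it back with the standard Java XML parser to read the operation name from the body. The getCityTemperature operation, its cityName parameter, and the http://example.org/weather namespace are hypothetical; only the envelope namespace comes from the SOAP 1.1 specification. In practice you would rarely build envelopes by hand like this.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class SoapEnvelopeDemo {

    // Namespace of the SOAP 1.1 envelope.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Builds a request for a hypothetical getCityTemperature operation.
    // cityName is assumed to contain no XML special characters.
    static String buildRequest(String cityName) {
        return "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_NS + "\">"
             + "<soapenv:Body>"
             + "<getCityTemperature xmlns=\"http://example.org/weather\">"
             + "<cityName>" + cityName + "</cityName>"
             + "</getCityTemperature>"
             + "</soapenv:Body>"
             + "</soapenv:Envelope>";
    }

    // Returns the local name of the first element inside the SOAP body.
    static String operationName(String envelope) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(envelope.getBytes(StandardCharsets.UTF_8)));
            Element body = (Element) doc.getElementsByTagNameNS(SOAP_NS, "Body").item(0);
            Node operation = body.getFirstChild();
            return operation.getLocalName();
        } catch (Exception e) {
            throw new IllegalStateException("not a well-formed SOAP envelope", e);
        }
    }

    public static void main(String[] args) {
        String request = buildRequest("Chicago");
        System.out.println(request);
        System.out.println("Operation: " + operationName(request));
    }
}
```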

What a Web Service Application Looks Like

OK, now that you have an idea of what Web Services are, you are probably anxious to start programming Web Services right away. Before you do that, you might want to know how Web Services-based applications are structured. If you've ever programmed with CORBA or RMI, this structure will look pretty familiar.

First of all, you should know that despite having a lot of protocols and languages floating around, Web Services programmers usually never write a single line of SOAP or WSDL. Once we've reached a point where our client application needs to invoke a Web Service, we delegate that task to a piece of software called a client stub. The good news is that there are plenty of tools available that will generate client stubs automatically for us, usually based on the WSDL description of the Web Service.

Therefore, you shouldn't interpret the "Typical Invocation" diagram literally. A Web Services client doesn't usually do all those steps in a single invocation. A more correct sequence of events would be the following:

  1. We locate a Web Service that meets our requirements through UDDI.

  2. We obtain that Web Service's WSDL description.

  3. We generate the stubs once, and include them in our application.

  4. The application uses the stubs each time it needs to invoke the Web Service.

Programming the server side is just as easy. We don't have to write a complex server program which dynamically interprets SOAP requests and generates SOAP responses. We can simply implement all the functionality of our Web Service, and then generate a server stub (the term skeleton is also common) which will be in charge of interpreting requests and forwarding them to the service implementation. When the service implementation obtains a result, it will give it to the server stub, which will generate the appropriate SOAP response. The server stub can also be generated from a WSDL description, or from other interface definition languages (such as IDL). Furthermore, both the service implementation and the server stubs are managed by a piece of software called the Web Service container, which will make sure that incoming HTTP requests intended for a Web Service are directed to the server stub.

So, the steps involved in invoking a Web Service are described in the following diagrams.

Let's suppose that we've already located the Web Service, and generated the client stubs from the WSDL description. Furthermore, the server-side programmer will have generated the server stubs.

  1. Whenever the client application needs to invoke the Web Service, it will actually call the client stub. The client stub will turn this 'local invocation' into a proper SOAP request. This is often called the marshaling or serializing process.

  2. The SOAP request is sent over a network using the HTTP protocol. The Web Services container receives the SOAP request and hands it to the server stub. The server stub will convert the SOAP request into something the service implementation can understand (this is usually called unmarshaling or deserializing).

  3. The service implementation receives the request from the server stub, and carries out the work it has been asked to do. For example, if we are invoking the int add(int a, int b) method, the service implementation will perform an addition.

  4. The result of the requested operation is handed to the server stub, which will turn it into a SOAP response.

  5. The SOAP response is sent over a network using the HTTP protocol. The client stub receives the SOAP response and turns it into something the client application can understand.

  6. Finally the application receives the result of the Web Service invocation and uses it.
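The six steps above can be mimicked inside a single Java program. In this sketch the "SOAP messages" are just short strings such as "add 2 3", and the network and the container are replaced by a direct method call, so it only shows who talks to whom. All the names (StubDemo, MathPort, ClientStub, ServerStub, MathImpl) are made up for the example; real stubs would be generated by a tool from the WSDL description.

```java
public class StubDemo {

    // The interface the client application programs against.
    interface MathPort {
        int add(int a, int b);
    }

    // Service implementation: carries out the actual work (step 3).
    static class MathImpl implements MathPort {
        public int add(int a, int b) {
            return a + b;
        }
    }

    // Server stub: unmarshals the request, calls the implementation,
    // and marshals the result into a response (steps 2 and 4).
    static class ServerStub {
        private final MathPort impl;

        ServerStub(MathPort impl) {
            this.impl = impl;
        }

        String handle(String request) {
            String[] parts = request.split(" ");
            if (parts.length != 3 || !parts[0].equals("add")) {
                return "fault unknown-request";
            }
            int result = impl.add(Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
            return "result " + result;
        }
    }

    // Client stub: turns a local call into a request message and the
    // response message back into a return value (steps 1 and 5).
    static class ClientStub implements MathPort {
        private final ServerStub server; // stands in for HTTP plus the container

        ClientStub(ServerStub server) {
            this.server = server;
        }

        public int add(int a, int b) {
            String response = server.handle("add " + a + " " + b);
            if (!response.startsWith("result ")) {
                throw new IllegalStateException(response);
            }
            return Integer.parseInt(response.substring("result ".length()));
        }
    }

    public static void main(String[] args) {
        MathPort math = new ClientStub(new ServerStub(new MathImpl()));
        // Step 6: the application just sees an ordinary method call.
        System.out.println(math.add(2, 3)); // prints 5
    }
}
```

Notice that the application code in main never sees a message: it calls add() on an object that happens to be a stub.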

By the way, in case you're wondering, most of the Web Services Architecture is specified and standardized by the World Wide Web Consortium, the same organization responsible for XML, HTML, CSS, etc.

What is a Grid Service?

As mentioned before, Web Services are the technology of choice for Internet-based applications with loosely coupled clients and servers. That makes them the natural choice for building the next generation of grid-based applications. However, remember Web Services do have certain limitations. In fact, plain Web Services (as currently specified by the W3C) wouldn't be very helpful for building a grid application. Enter Grid Services, which are basically Web Services with improved characteristics and services.

We'll take a brief look at the main improvements introduced in OGSI:

  • Stateful and potentially transient services

  • Service Data

  • Notifications

  • Service Groups

  • portType extension

  • Lifecycle management

  • GSH & GSR

Stateful and potentially transient services

This first feature is probably one of the most important improvements with regard to Web Services. Let's see what this feature is all about by using a simple example. Imagine your organization has a really big cluster capable of performing the most mind-boggling calculations. However, this cluster is located in your central headquarters in Chicago, and you need employees from your offices in New York, Los Angeles, and Seattle to conveniently use the cluster's computational power. This looks like a perfect scenario for a Web Service!

We could implement a Math Web Service called MathService which offered operations such as SolveReallyBigSystem(), SolveFermatsLastTheorem(), etc. At first, we would be able to perform typical Web Service invocations:

  1. Invoke MathService, asking it to perform a certain operation.

  2. MathService will instruct the cluster to perform that operation.

  3. MathService will return the result of the operation.

So far, so good. However, let's be a bit more realistic. If you're going to access a remote cluster to perform complex mathematical operations, you probably won't perform a single operation, but rather a chain of operations, which will all be related to each other. However, Web Services are stateless. "Stateless" means that Web Services can't remember what you've done from one invocation to another. If we wanted to perform a chain of operations, we would have to get the result of one operation and send it as a parameter to the next operation.
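Here is a small sketch of what that chaining looks like from the client's point of view. The StatelessMath class and its two operations are invented for the example (they are much simpler than SolveReallyBigSystem()); the point is only that the service keeps nothing between calls, so the client has to carry the intermediate result itself.

```java
public class StatelessChainDemo {

    // A stateless service: every result depends only on the arguments.
    static class StatelessMath {
        int add(int current, int value) {
            return current + value;
        }

        int multiply(int current, int factor) {
            return current * factor;
        }
    }

    // The client threads the intermediate result through every invocation.
    static int runChain(StatelessMath service) {
        int result = 0;
        result = service.add(result, 5);      // 0 + 5 = 5
        result = service.multiply(result, 4); // 5 * 4 = 20
        result = service.add(result, 2);      // 20 + 2 = 22
        return result;
    }

    public static void main(String[] args) {
        System.out.println(runChain(new StatelessMath())); // prints 22
    }
}
```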

Furthermore, even if we solved the stateless problem (some Web Services containers actually work around this problem), Web Services are still non-transient, which means that they outlive all their clients. Web Services are also referred to as persistent (as opposed to transient; this doesn't mean 'persistent' in the sense of 'persisting data to secondary storage, a hard drive, etc.') because their lifetime is bound to the Web Services container (a Web Service is available from the moment the server is started, and doesn't go down until the server is stopped). In any case, this implies that, after one client is done using a Web Service, all the information the Web Service is remembering could be accessed by the next clients. In fact, while one client is using the Web Service, another client could access the Web Service and potentially mess up the first client's operations. This certainly isn't a very elegant solution!

Grid Services solve both problems by allowing programmers to use a factory/instance approach to Web Services. Instead of having one big stateless MathService shared by all users, we could have a central MathService factory in charge of maintaining a bunch of MathService instances. When a client wants to invoke a MathService operation, it will talk to the instance, not to the factory. When a client needs a new instance to be created (or destroyed) it will talk to the factory.

This diagram shows how there doesn't necessarily have to be one instance per client. One instance could be shared by two clients, and one client could have access to two instances. These instances are transient, because they have a limited lifetime which isn't bound to the lifetime of the Grid Services' container. In other words, we can create and destroy instances at will whenever we need them (instead of having one persistent service permanently available). The actual lifecycle of an instance can vary from application to application. Usually, we'll want instances to live only as long as a client has any use for them. This way, every client has its own personal instance to work with. However, there are other scenarios where we might want an instance to be shared by several users, and to self-destruct after no clients have accessed it for a certain time.

Finally, notice how Grid Services are potentially transient. This means that not all Grid Services have to use (by definition) a factory/instance approach. A Grid Service can be persistent, just like a normal Web Service. Choosing between persistent Grid Services or factory/instance Grid Services depends entirely on the requirements of your application.
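The factory/instance approach can be pictured with a few lines of plain Java. This is only an in-memory analogy under our own assumptions: MathFactory, MathInstance, and the "math-1" style identifiers are invented here, and real Grid Service factories and instances live in a container and are reached over the network.

```java
import java.util.HashMap;
import java.util.Map;

public class FactoryDemo {

    // A stateful instance: it remembers its value between invocations.
    static class MathInstance {
        private int value;

        void add(int n) {
            value += n;
        }

        int getValue() {
            return value;
        }
    }

    // The factory creates and destroys instances, identifying each by an id.
    static class MathFactory {
        private final Map<String, MathInstance> instances = new HashMap<>();
        private int nextId = 1;

        String createInstance() {
            String id = "math-" + nextId++;
            instances.put(id, new MathInstance());
            return id;
        }

        MathInstance lookup(String id) {
            MathInstance instance = instances.get(id);
            if (instance == null) {
                throw new IllegalArgumentException("no such instance: " + id);
            }
            return instance;
        }

        void destroyInstance(String id) {
            instances.remove(id);
        }

        int liveInstances() {
            return instances.size();
        }
    }

    public static void main(String[] args) {
        MathFactory factory = new MathFactory();
        String a = factory.createInstance();
        String b = factory.createInstance();

        // Each client works on its own instance, so their state never mixes.
        factory.lookup(a).add(5);
        factory.lookup(a).add(3);
        factory.lookup(b).add(10);
        System.out.println(factory.lookup(a).getValue()); // prints 8
        System.out.println(factory.lookup(b).getValue()); // prints 10

        // Instances are transient: we destroy them when we are done.
        factory.destroyInstance(a);
        System.out.println(factory.liveInstances()); // prints 1
    }
}
```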

Lifecycle management

Since we are now dealing with services that have non-trivial lifecycles (if we use a factory/instance model, instances can be created and destroyed at any time), lifecycle management mechanisms are provided in Grid Services. OGSI itself only supplies some very basic mechanisms, which are complemented by additional mechanisms in GT3, as we'll see later on.
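One simple way to think about lifecycle management is to give every instance a termination time that interested clients can push back. The sketch below illustrates that idea with invented names (LeasedInstance, extendUntil, sweep) and with times passed in explicitly as plain numbers; it is an assumption-laden illustration, not a description of how OGSI or GT3 actually implement their lifecycle mechanisms.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class LifetimeDemo {

    static class LeasedInstance {
        final String name;
        private long terminationTime; // milliseconds on the caller's clock

        LeasedInstance(String name, long terminationTime) {
            this.name = name;
            this.terminationTime = terminationTime;
        }

        // A client that still needs the instance pushes the deadline back.
        void extendUntil(long newTime) {
            if (newTime > terminationTime) {
                terminationTime = newTime;
            }
        }

        boolean isExpired(long now) {
            return now >= terminationTime;
        }
    }

    // Removes expired instances and returns how many were destroyed.
    static int sweep(List<LeasedInstance> instances, long now) {
        int destroyed = 0;
        for (Iterator<LeasedInstance> it = instances.iterator(); it.hasNext();) {
            if (it.next().isExpired(now)) {
                it.remove();
                destroyed++;
            }
        }
        return destroyed;
    }

    public static void main(String[] args) {
        List<LeasedInstance> instances = new ArrayList<>();
        LeasedInstance a = new LeasedInstance("a", 1000);
        LeasedInstance b = new LeasedInstance("b", 1000);
        instances.add(a);
        instances.add(b);

        b.extendUntil(5000); // only b's client is still interested

        System.out.println(sweep(instances, 2000)); // prints 1
        System.out.println(instances.get(0).name);  // prints b
    }
}
```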

Service Data

Service Data, along with statefulness and transience, ranks very high in the list of 'the best things Grid Services add to Web Services'. In fact, Service Data is my personal favorite extension!

Service Data allows us to easily add a set of structured data to any service, which can then be accessed directly through its interface. Since plain Web Services only allow operations to be included in the WSDL interface, you could think of Service Data as an extension that allows us to include not only operations in the WSDL interface, but also attributes. However, Service Data is much more than simple attributes, since we can easily include any type of data (fundamental types, classes, arrays, etc.).

In general, the Service Data we include in a service will fall into one of two categories:

  • State information: Provides information on the current state of the service, such as operation results, intermediate results, runtime information, etc.

  • Service metadata: Information on the service itself, such as system data, supported interfaces, cost of using the service, etc.

If you're not too sure what service data is, don't worry: a much more detailed explanation will be given when we start coding Grid Services with service data.
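In the meantime, the following sketch may help as a mental model: a service that, besides its operations, keeps a set of named values that clients can read. The names used here (MathService, lookupData, costPerOperation, lastOperation, value) are all invented for the example, and the plain Java map is only an analogy for the structured, typed data that real Service Data provides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ServiceDataDemo {

    static class MathService {
        // Named data exposed next to the operations.
        private final Map<String, Object> serviceData = new LinkedHashMap<>();
        private int value;

        MathService() {
            serviceData.put("costPerOperation", 0.25); // service metadata
            serviceData.put("lastOperation", "none");  // state information
            serviceData.put("value", 0);               // state information
        }

        // An ordinary operation that also keeps the exposed data up to date.
        void add(int n) {
            value += n;
            serviceData.put("lastOperation", "add");
            serviceData.put("value", value);
        }

        // Hypothetical accessor: returns the named data, or null if unknown.
        Object lookupData(String name) {
            return serviceData.get(name);
        }
    }

    public static void main(String[] args) {
        MathService service = new MathService();
        service.add(5);
        System.out.println(service.lookupData("value"));         // prints 5
        System.out.println(service.lookupData("lastOperation")); // prints add
    }
}
```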

Notifications

A Grid Service can be configured to be a notification source, and certain clients to be notification sinks (or subscribers). This means that if a change occurs in the Grid Service, all the subscribers are notified of that change (not every change triggers a notification, only the ones the Grid Service programmer chooses). In the MathService example, suppose that all the clients perform certain calculations using a variable called InterestingCoefficient which is stored in the Grid Service. Any of the clients can modify that value to improve the overall calculation. However, all clients must be notified of that change when it occurs. We can achieve this easily with Grid Service notifications.

Later on, we'll see that notifications in OGSI are very closely related to service data.

Service Groups

Any service can be configured to act as a service group which aggregates other services. We can easily perform operations such as 'add new service to group', 'remove this service from group', and (more importantly) 'find a service in the group that meets condition FOOBAR'. Although the service group functionality included in OGSI is pretty simple, it is nonetheless the base of more powerful directory services (such as GT3's IndexService) which allow us to group different services together and access them through a single point of entry (the service group).
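
The three group operations mentioned above can be sketched in a few lines of plain Java. This is not the OGSI service group interface; ServiceEntry, ServiceGroupSketch, and the example handles are hypothetical, and the 'condition FOOBAR' is modelled as an ordinary predicate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical description of a grouped service: where it is and what it is good at.
class ServiceEntry {
    final String handle;
    final String specialty;

    ServiceEntry(String handle, String specialty) {
        this.handle = handle;
        this.specialty = specialty;
    }
}

// Hypothetical group: a single point of entry that aggregates other services.
class ServiceGroupSketch {
    private final List<ServiceEntry> members = new ArrayList<>();

    // 'Add new service to group'.
    void add(ServiceEntry entry) {
        members.add(entry);
    }

    // 'Remove this service from group'; returns true if something was removed.
    boolean remove(String handle) {
        return members.removeIf(e -> e.handle.equals(handle));
    }

    // 'Find a service in the group that meets condition FOOBAR'.
    Optional<ServiceEntry> find(Predicate<ServiceEntry> condition) {
        return members.stream().filter(condition).findFirst();
    }

    public static void main(String[] args) {
        ServiceGroupSketch group = new ServiceGroupSketch();
        group.add(new ServiceEntry("http://example.org/math/1", "statistics"));
        group.add(new ServiceEntry("http://example.org/math/2", "simulation"));
        group.find(e -> e.specialty.equals("simulation"))
             .ifPresent(e -> System.out.println("Found " + e.handle));
    }
}
```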

portType extension

In the previous page we saw that a Web Service exposes its interface (the operations it can perform) through a WSDL document. The interface is usually called portType (due to a WSDL tag of the same name). A normal Web Service can have only one portType. Grid Services, on the other hand, support portType extension, which means we can define a portType as an extension of a previously existing portType.

For example, the OGSI specification mandates that all Grid Services must extend from a standard portType called GridService:

Thanks to portType extension, we can simply define our own portType as an extension of GridService. With plain web services, we would need to include the declaration of all the operations (including the GridService operations) in a single portType.
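
For Java programmers, portType extension is loosely comparable to interface inheritance. The sketch below illustrates that analogy only: GridServiceSketch stands in for the standard GridService portType with a single placeholder operation, and MathSketch and MathSketchImpl are hypothetical names that do not come from GT3.

```java
// Stand-in for the standard base portType; the single method is a placeholder.
interface GridServiceSketch {
    void destroy();
}

// Our own portType extends the base one instead of redeclaring its operations.
interface MathSketch extends GridServiceSketch {
    void add(int a);
    int getValue();
}

class MathSketchImpl implements MathSketch {
    private int value = 0;
    private boolean destroyed = false;

    public void add(int a) { value += a; }
    public int getValue() { return value; }
    public void destroy() { destroyed = true; }

    boolean isDestroyed() { return destroyed; }

    public static void main(String[] args) {
        MathSketch math = new MathSketchImpl();
        math.add(3);
        // The inherited operation is available through the extended interface.
        math.destroy();
        System.out.println("value = " + math.getValue());
    }
}
```

Without extension, MathSketch would have to repeat the declaration of destroy (and every other base operation) itself, which is the situation described above for plain web services.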

Besides the standard GridService portType, OGSI defines a lot of other standard portTypes we can extend from to add functionality to our grid service. For example, there is a NotificationSource portType which we can extend from if we want our service to act as a notification source (we'll take a close look at this portType and other standard portTypes when we start working with code). Notice how NotificationSource itself extends from GridService:

In general, we'll find that grid services can have three types of portTypes:

GSH & GSR

In the previous page we saw that Web Services are addressed with URIs. Since Grid Services are Web Services, they are also addressed with URIs. However, OGSI introduces a more powerful addressing scheme.

A "Grid Service URI" is called the Grid Service Handle, or simply GSH. Each GSH must be unique. There cannot be two Grid Services with the same GSH. The only problem with the GSH is that it tells me where the Grid Service is, but doesn't give me any information on how to communicate with the Grid Service (what methods it has, what kind of messages it accepts/receives, etc.). To do this, we need the Grid Service Reference, or GSR. In theory, the GSR can take many different forms, but since we will usually use SOAP to communicate with a Grid Service, the GSR will be a WSDL file (remember that WSDL describes a Web Service: what methods it has, etc.). In fact, in this tutorial we will only handle WSDL as a GSR format.
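
The relationship between the two can be pictured as a lookup from a unique handle to a description of how to talk to the service. The following plain-Java sketch is an illustration under that assumption only; HandleResolverSketch and its methods are hypothetical and are not part of OGSI or GT3, and the "reference" is reduced to a piece of text standing in for a WSDL document.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical resolver: maps a unique handle (GSH) to a reference (GSR).
class HandleResolverSketch {
    // Here the "reference" is just text standing in for a WSDL document.
    private final Map<URI, String> references = new HashMap<>();

    // Registering the same handle twice is rejected: each GSH must be unique.
    void register(URI handle, String wsdlText) {
        if (references.putIfAbsent(handle, wsdlText) != null) {
            throw new IllegalArgumentException("Handle already in use: " + handle);
        }
    }

    // The handle says where the service is; the reference says how to talk to it.
    String resolve(URI handle) {
        String wsdlText = references.get(handle);
        if (wsdlText == null) {
            throw new IllegalArgumentException("Unknown handle: " + handle);
        }
        return wsdlText;
    }

    public static void main(String[] args) {
        HandleResolverSketch resolver = new HandleResolverSketch();
        URI handle = URI.create("http://example.org/services/MathService");
        resolver.register(handle, "(WSDL text describing MathService)");
        System.out.println(resolver.resolve(handle));
    }
}
```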

The Globus Toolkit 3

Grid Services sound great, don't they? However, if you've already programmed grid-based applications, you're probably thinking that this is all very nice, but hardly enough for The Grid. Grid Services are only a small (but important!) part of the whole GT3 Architecture, which offers developers plenty of services to get serious with Grid programming.

OGSI (i.e. "Grid Services") is the 'GT3 Core' layer. Let's take a look at the rest of the layers from the bottom up:

  • GT3 Security Services: Security is an important factor in grid-based applications. GT3 Security Services can help us restrict access to our Grid Services, so only authorized clients can use them. For example, we said that only our New York, Los Angeles, and Seattle offices could access MathService. We want to make sure only those offices have access to MathService. Besides the usual security measures (putting the web server behind a firewall, etc.) GT3 gives us one more layer of security with technologies such as SSL and X.509 digital certificates.

  • GT3 Base Services: This layer actually includes a whole lot of interesting services:

    • Managed Job Service: Suppose some particular operation in MathService might take hours or even days to be done. Of course, we don't want to simply stand in front of a computer waiting for the result to arrive (especially if, after 8 hours of waiting, all we get might simply be an error message!). We need to be able to check on the progress of the operation periodically, and have some control over it (pause it, stop it, etc.). This is usually called job management (in this case, the term 'job' is used instead of 'operation').

    • Index Service: Remember from A short introduction to Web Services that we usually know what type of Web Service we need, but we have no idea of where it is. This also happens with Grid Services: we might know we need a Grid Service which meets certain requirements, but we have no idea of what its location is. While this was solved in Web Services with UDDI, GT3 has its own Index Service. For example, we could have several dozen MathServices all around the country, each with different characteristics (some might be better suited for statistical analysis, while others might be better for performing simulations). Index Service will allow us to query which MathService meets our particular requirements.

    • Reliable File Transfer (RFT) Service: This service allows us to perform large file transfers between the client and the Grid Service. For example, suppose we have an operation in MathService which has to crunch several gigabytes of raw data (for a statistical analysis, for example). Of course, we're not going to send all that information as parameters. We'll be able to send it as a file. Furthermore, RFT guarantees the transfer will be reliable (hence its name). For example, if a file transfer is interrupted (due to a network failure, for example), RFT allows us to restart the file transfer from the moment it broke down, instead of starting all over again.

  • GT3 Data Services: This layer includes Replica Management, which is very useful in applications that have to deal with very big sets of data. When working with large amounts of data, we're usually not interested in downloading the whole thing; we just want to work with a small part of all that data. Replica Management keeps track of those subsets of data we will be working with.

  • Other Grid Services: Other non-GT3 services can run on top of the GT3 Architecture.

WSRF & GT4

Before moving on to the practical part of the tutorial, I think it's definitely worthwhile to take five minutes to read about the next step in the evolution of grid standards: WSRF and GT4.

After having read all the theory behind grid services, you might be thinking: "Gee, Grid Services are really cool!" You might even be thinking: "Why would anyone want to continue using Web Services when you can use hip and cool Grid Services instead?" Well, it turns out that OGSI was created with the hope that it would eventually converge with Web Services standards and that, in fact, Web Services and Grid Services would become the same thing. However, no matter how 'hip and cool' OGSI Grid Services might seem, they do in fact have several drawbacks which have blocked this convergence:

  • The OGSI specification is long and dense. Yes, practically all specifications are long and dense but, believe me, OGSI is especially long and dense.

  • OGSI does not work well with current Web Services tooling.

  • Too object oriented. Despite the fact that many Web Services systems have object oriented implementations, web services themselves are not supposed to be object oriented. OGSI, on the other hand, takes a lot of concepts from OO (such as statefulness, the factory/instance model, etc.)

To solve OGSI's problems once and for all, and to improve Grid Services' chances of finally converging with Web Services, a new standard was presented in January 2004 to replace OGSI: The Web Services Resource Framework, or WSRF.

As the following diagram shows, WSRF aims to integrate itself into the family of Web Services standards, instead of simply being a 'patch over Web Services' like OGSI was. In this new scenario, OGSA will be based directly on Web Services instead of being based on OGSI Grid Services.

Notice how this new standard doesn't really affect OGSA all that much. All the high level services defined in OGSA (job management, resource management, security services, etc.) will keep having the same interfaces and specifications. The only difference is that the underlying middleware will be pure Web Services instead of OGSI Grid Services. From the standpoint of defining the standard interfaces and behavior of (for example) a job submission service, this isn't really such a big change. It will, of course, affect those who want to implement (or have already begun implementing) OGSA services.

And how exactly does WSRF intend to gain true convergence with Web Services? Well, first of all, it overcomes OGSI's drawbacks:

  • The OGSI specification is long and dense. The WSRF specification is divided into 5 documents, plus one complementary specification which is not strictly a part of WSRF (WS-Notification)

  • OGSI does not work well with current Web Services tooling. WSRF takes into account most of the objections posed by the Web Services community, making it easier for current Web Services tooling to evolve towards including WSRF support.

  • Too object oriented. WSRF cleanly separates the service from the state, since pure Web Services cannot have state. Support for factory/instance services also disappears, although it can still be implemented by using a design pattern.
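
The idea of separating the service from the state can be illustrated loosely in plain Java. This is not WSRF code and uses no WSRF API; ValueResource, ResourceStore, and StatelessMathSketch are hypothetical names, and the string keys are an assumption made purely for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical resource: pure state, no service logic.
class ValueResource {
    int value = 0;
}

// Hypothetical store that owns all resources, keyed by an identifier.
class ResourceStore {
    private final Map<String, ValueResource> resources = new HashMap<>();

    void create(String key) {
        resources.put(key, new ValueResource());
    }

    ValueResource lookup(String key) {
        ValueResource resource = resources.get(key);
        if (resource == null) {
            throw new IllegalArgumentException("No such resource: " + key);
        }
        return resource;
    }
}

// The service itself has no fields: every call names the resource it acts on.
class StatelessMathSketch {
    static int add(ResourceStore store, String key, int a) {
        ValueResource resource = store.lookup(key);
        resource.value += a;
        return resource.value;
    }

    public static void main(String[] args) {
        ResourceStore store = new ResourceStore();
        store.create("alice");
        store.create("bob");
        add(store, "alice", 5);
        add(store, "alice", 2);
        add(store, "bob", 1);
        System.out.println("alice = " + store.lookup("alice").value);
        System.out.println("bob = " + store.lookup("bob").value);
    }
}
```

In this picture, creating one resource per client plays roughly the role that creating one instance per client plays in the factory/instance model.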

The Globus Toolkit 4

Yes, despite the fact that what you're reading is the Globus Toolkit Three Programmer's Tutorial, there are already plans for a fourth version. This new version, in fact, will be the first available WSRF implementation. The first stable release is expected in the third quarter of 2004.

Don't Panic

At this point, alarm bells might be ringing loudly in your head. If the next version of the Globus Toolkit is just around the corner (with a new standard), then what is the point of spending a single nanosecond on GT3 and OGSI? Well, here's the good news: WSRF and OGSI are conceptually the same thing. The main differences between OGSI and WSRF are not 'big' differences (such as a paradigm shift). They are mainly syntactical in nature: WSRF is a refactoring of OGSI, taking into account all the comments from both the Grid community and the Web Services community, resulting in a more polished and stable standard. In fact, there's even a document available on the WSRF website that shows how there's a direct mapping from OGSI features to WSRF features.

So, what's the point in learning OGSI and GT3? The switch from OGSI to WSRF will be very simple if you've already gotten the hang of all the fundamental concepts: service data, notifications, lifecycle management, etc. If you wait for GT4 to come out, you'll have to learn all the theory then. It's like learning to program: once you've mastered one programming language, moving to a new language is simply a question of learning the syntax of the new language (along with any particular idiosyncrasies that language might have).

Ok, so that previous paragraph might sound like a shameless pitch to get you to read the tutorial :-) Believe me, it's not. If you learn GT3 now instead of waiting for GT4, you'll be able to laugh at those who didn't learn GT3. Like this: "Ha, ha, ha! I scoff thine minuscule knowledge of service data and notifications! Tremble upon my encyclopedic knowledge of service-oriented grid applications!"

Where to learn Java & XML

After seeing all the theory behind GT3, we're almost ready to start programming. However, remember you need to know Java to follow this tutorial. If you're new to Java, you will probably find the following sites interesting:

  • The Java Tutorial : The official tutorial from Sun, the makers of Java. Very good if you know absolutely nothing about Java.

  • The Coffee Break : Website with resources for Java programmers, including tutorials and FAQs.

Also, you need to be familiar with XML. You don't have to be an XML wizard, but should at least be able to read and interpret the different elements of an XML document. If you've never worked with XML, you should probably take a look at the following sites:

  • W3Schools XML Tutorial : Tutorial that covers both the basics and the more advanced aspects of XML.

  • ZVON.org : Tons of XML resources. Includes some very good reference guides.

Chapter 2. Installation

This tutorial currently doesn't include an installation guide. I hope to include one in the future but, in the meantime, there are plenty of good GT3 installation guides available on the Web:

GT3 Core

Chapter 3. Writing Your First Grid Service in 5 Simple Steps

MathService

In this chapter we are going to write and deploy a simple Grid Service. Our first Grid Service is an extremely simple Math Grid Service, which we'll refer to as MathService. It will allow users to perform the following operations:

  • Addition

  • Subtraction

High-tech stuff, huh? Don't worry if this seems a bit lackluster. Since this is going to be our first Grid Service, it's better to start with a small didactic service which we'll gradually improve by adding service data, notifications, etc. You should always bear in mind that MathService is, after all, just a means to get acquainted with GT3. Typical grid services are generally much more complex and do more than expose trivial operations (such as addition and subtraction). Although the tutorial is currently based only on MathService, future versions of the tutorial will include examples of grid services which you could find in 'real' applications.

The Five Steps

Writing and deploying a Grid Service is easier than you might think. You just have to follow five simple steps.

  1. Define the service's interface. This is done with GWSDL

  2. Implement the service. This is done with Java

  3. Define the deployment parameters. This is done with WSDD

  4. Compile everything and generate GAR file. This is done with Ant

  5. Deploy service. This is also done with Ant

Don't worry if you don't understand these five steps or are baffled by terms such as GWSDL, WSDD, and Ant. In this first example we're going to go through each step in great detail, explaining what each step accomplishes, and giving detailed instructions on how to perform each step. The rest of the examples in the tutorial will also follow these five steps. However, the rest of the examples will simply instruct you to perform a step, and won't repeat the whole explanation of what that step is. So, if you ever find that you don't understand a particular step, you can always come back to this chapter ("Writing Your First Grid Service in 5 Simple Steps") to review the details of that step.

Before we start...

Ready to start? Ok! Just hold your horses for a second. Don't forget to download the tutorial files before you start. You can find a link to the tutorial files in the tutorial website. The tutorial bundle includes all the tutorial source files, plus a couple of extra files we'll need to successfully build and deploy our service. Just create an empty directory on your filesystem and untar-ungzip the file there. From now on, we'll refer to that directory as $TUTORIAL_DIR.

Once you have the files, take into account that there are two ways of following the first chapters of the tutorial:

  • With the tutorial source files: You'll have all the source code (Java, GWSDL, and WSDD) ready to use in $TUTORIAL_DIR, so there's no need to manually modify these files.

  • Without the tutorial source files: Some people don't like getting all the source code ready to use out-of-the-box, but rather prefer to write the files themselves so they can have a better understanding of what they're doing at each point. In fact, I think this is probably the best way to follow this first part of the tutorial. Since this first part includes complete code listings in the tutorial (which you can copy and paste to a file), you can easily write all the files yourself. However, you do need a set of auxiliary files included in the tutorial bundle which are needed to build and deploy the services. So, if you want to follow the tutorial without the source files, you still need to download the tutorial files. Once you're in $TUTORIAL_DIR, simply delete directory "org" to delete the source files, but don't delete anything else.

Ok, now we're ready to start :-)

Step 1: Defining the interface in GWSDL

The first step in writing a Grid Service (or a Web Service) is to define the service interface. We need to specify what our service is going to provide to the outer world. At this point we're not concerned with the inner workings of that service (what algorithms it uses, what databases it will access, etc.) We just need to know what operations will be available to our users. Remember that, in Web/Grid Services lingo, the service interface is usually called the port type (usually written portType).

As we saw in A short introduction to Web Services, there is a special XML language which can be used to specify what operations a web service offers: the Web Service Description Language (WSDL). So what we need to do in this step is write a description of our MathService using WSDL. However, this is not entirely true. In GT3 we have two options:

  • Writing the WSDL directly. This is the most versatile option. If we write WSDL directly, we have total control over the description of our service's portType. However, it is not the most user-friendly one, since WSDL is a rather verbose language.

  • Generating WSDL from a Java interface. We can generate WSDL automatically from a Java interface. This is the easiest option, but not the most versatile (since very complicated interfaces are not always converted correctly to WSDL).

At first sight, it might seem that starting with an interface language (such as a Java interface or an IDL interface) might be the best option, since it is the most user-friendly. In fact, if we wanted to define our interface in Java, we could simply write the following:

public interface Math
{
  public void add(int a);

  public void subtract(int a);

  public int getValue();
}

...and we'd be finished with step 1! However, we are going to start with a WSDL description of the interface, even if it is a bit harder to understand than using a Java interface. The main reason for this is that, although Java interfaces might be easier to write and understand, in the long run they produce many more problems than WSDL. So, the sooner we start writing WSDL, the better.

Actually, what we're going to write is not WSDL, but Grid WSDL (or GWSDL), which is an extension of WSDL used in the OGSI specification (and, therefore, in the Globus Toolkit). This 'extended' WSDL supports all the features described in the What is a Grid Service? page which are not supported in plain Web Services (and, therefore, in plain WSDL). In a moment we'll see the exact differences between WSDL and GWSDL. Before that, we'll take a good look at the GWSDL code which is equivalent to the Java interface shown above.

However, the goal of this page is not to give a detailed explanation of how to write a GWSDL file, but rather to present the GWSDL file for this particular example and explain the main differences between WSDL and GWSDL. If you have no idea whatsoever of how to write WSDL, now is a good time to take a look at the following section of the How to... appendix: How to write a GWSDL description of your Grid Service.

A general description of the interface

The GWSDL code we'll be using is the equivalent of the Java interface shown above. However, that interface might raise a couple of eyebrows. If we said that our MathService is going to allow addition and subtraction, why do add and subtract receive only one parameter? Did the author of the tutorial flunk lower school math? Addition and subtraction involve two numbers! And what's with that getValue method? An explanation is in order...

Addition and subtraction have only one parameter because we're going to use our service as an accumulator to demonstrate how grid services are stateful (remember: stateful means the service remembers internal values from one call to the next). MathService will have an internal value which will initially be zero. Each call to add and subtract will modify that internal value, which we'll be able to access thanks to the getValue method. This will allow us to observe that the service remembers that internal value from one call to the next (unlike a plain stateless web service, which would be unable to 'remember' that value).
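
The accumulator behaviour can be tried out locally before any Grid Service exists. The class below is a plain-Java stand-in with the same three methods as the Math interface shown earlier; the name AccumulatorSketch is hypothetical and no GT3 classes are involved.

```java
// Plain-Java stand-in for the accumulator behaviour described above.
class AccumulatorSketch {
    // The internal value the service 'remembers' between calls; starts at zero.
    private int value = 0;

    void add(int a) {
        value = value + a;
    }

    void subtract(int a) {
        value = value - a;
    }

    int getValue() {
        return value;
    }

    public static void main(String[] args) {
        AccumulatorSketch math = new AccumulatorSketch();
        math.add(10);
        math.add(5);
        math.subtract(3);
        // Each call built on the result of the previous one: 0 + 10 + 5 - 3.
        System.out.println("Current value: " + math.getValue());
    }
}
```

Running it prints "Current value: 12", because each call operates on the value left behind by the previous one.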

The GWSDL code

Ok, so supposing you either know WSDL or have visited the How to write a GWSDL description of your Grid Service page, take a good thorough look at this GWSDL code:

<?xml version="1.0" encoding="UTF-8"?>
<definitions 
name="MathService"
  targetNamespace="http://www.globus.org/namespaces/2004/02/progtutorial/MathService"
  xmlns:tns="http://www.globus.org/namespaces/2004/02/progtutorial/MathService"
  xmlns:ogsi="http://www.gridforum.org/namespaces/2003/03/OGSI"
  xmlns:gwsdl="http://www.gridforum.org/namespaces/2003/03/gridWSDLExtensions"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns="http://schemas.xmlsoap.org/wsdl/">

<import location="../../ogsi/ogsi.gwsdl"
  namespace="http://www.gridforum.org/namespaces/2003/03/OGSI"/>

<types>
<xsd:schema 
targetNamespace="http://www.globus.org/namespaces/2004/02/progtutorial/MathService"
  attributeFormDefault="qualified"
  elementFormDefault="qualified"
  xmlns="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="add">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="value" type="xsd:int"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="addResponse">
    <xsd:complexType/>
  </xsd:element>
  <xsd:element name="subtract">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="value" type="xsd:int"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="subtractResponse">
    <xsd:complexType/>
  </xsd:element>
  <xsd:element name="getValue">
    <xsd:complexType/>
  </xsd:element>
  <xsd:element name="getValueResponse">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="value" type="xsd:int"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
</types>

<message name="AddInputMessage">
  <part name="parameters" element="tns:add"/>
</message>
<message name="AddOutputMessage">
  <part name="parameters" element="tns:addResponse"/>
</message>
<message name="SubtractInputMessage">
  <part name="parameters" element="tns:subtract"/>
</message>
<message name="SubtractOutputMessage">
  <part name="parameters" element="tns:subtractResponse"/>
</message>
<message name="GetValueInputMessage">
  <part name="parameters" element="tns:getValue"/>
</message>
<message name="GetValueOutputMessage">
  <part name="parameters" element="tns:getValueResponse"/>
</message>

<gwsdl:portType name="MathPortType" 
extends="ogsi:GridService">
  <operation name="add">
    <input message="tns:AddInputMessage"/>
    <output message="tns:AddOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
  <operation name="subtract">
    <input message="tns:SubtractInputMessage"/>
    <output message="tns:SubtractOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
  <operation name="getValue">
    <input message="tns:GetValueInputMessage"/>
    <output message="tns:GetValueOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
</gwsdl:portType>

</definitions>
[Note]

This file is $TUTORIAL_DIR/schema/progtutorial/MathService/Math.gwsdl. If you're wondering why you have to save it in that particular directory, take a look at the Tutorial Directory Structure appendix.

If you know WSDL, you'll recognize this as a pretty straightforward WSDL file which defines three operations: add, subtract, and getValue (along with all the necessary messages and types). Let's take a closer look at some important parts of the code. First of all, notice how the target namespace for the Grid Service is:

http://www.globus.org/namespaces/2004/02/progtutorial/MathService

Furthermore, we have to declare the following OGSI namespaces:

xmlns:ogsi="http://www.gridforum.org/namespaces/2003/03/OGSI"
xmlns:gwsdl="http://www.gridforum.org/namespaces/2003/03/gridWSDLExtensions"

We also have to import a GWSDL file that defines all the OGSI-specific types, messages, and portTypes.

<import location="../../ogsi/ogsi.gwsdl"
namespace="http://www.gridforum.org/namespaces/2003/03/OGSI"/>

Finally, notice how we have no bindings whatsoever in our GWSDL file (bindings are an essential part of a normal WSDL file). We don't have to add them manually, since they are generated automatically with a GT3 tool.

Namespace mappings

One of the nice things about (G)WSDL is that it's language-neutral. In other words, there is no mention of the language in which the service is going to be implemented, or of the language in which the client is going to be implemented.

However, there will of course come a moment when I'll want to refer to this interface from a specific language (in our case, Java). We do this through a set of stub classes (stubs were described in A short introduction to Web Services) which are generated from the GWSDL file using a GT3 tool. For that tool to successfully generate the stub classes, we need to tell it where (i.e. in what Java package) to place the stub classes. We do this with a mappings file, which maps GWSDL namespaces to Java packages:

http\://www.globus.org/namespaces/2004/02/progtutorial/MathService=
        org.globus.progtutorial.stubs.MathService

http\://www.globus.org/namespaces/2004/02/progtutorial/MathService/bindings=
        org.globus.progtutorial.stubs.MathService.bindings
  
http\://www.globus.org/namespaces/2004/02/progtutorial/MathService/service=
        org.globus.progtutorial.stubs.MathService.service
[Note]

Each mapping must go in one line (i.e. the above file should have three lines). This file is $TUTORIAL_DIR/namespace2package.mappings.
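
The escaped colon (http\://) suggests that the mappings file follows Java properties syntax, where an unescaped colon would end the key. Assuming that is the case, the sketch below shows how one such line parses with java.util.Properties; the class name MappingsSketch is hypothetical, and this is not necessarily how the GT3 tool itself reads the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: parses mapping lines as a Java properties file.
class MappingsSketch {
    static Properties parse(String text) {
        Properties mappings = new Properties();
        try {
            mappings.load(new StringReader(text));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail in practice.
            throw new UncheckedIOException(e);
        }
        return mappings;
    }

    public static void main(String[] args) {
        // One mapping on one line; "\\:" in Java source is "\:" in the file.
        String text =
            "http\\://www.globus.org/namespaces/2004/02/progtutorial/MathService="
            + "org.globus.progtutorial.stubs.MathService\n";
        Properties mappings = parse(text);
        // The key is the namespace with the escape removed.
        System.out.println(mappings.getProperty(
            "http://www.globus.org/namespaces/2004/02/progtutorial/MathService"));
    }
}
```

Under that assumption, the namespace is the key and the Java package is the value, which is why the whole mapping has to sit on a single line.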

The first namespace is the target namespace of the GWSDL file. The other two namespaces are automatically generated when a GT3 tool 'completes' the GWSDL file (including the necessary bindings). Throughout the tutorial, we'll be placing all the stub classes in the following Java package:

org.globus.progtutorial.stubs

Since we're defining a service called MathService, we're specifically mapping the GWSDL file to the following package:

org.globus.progtutorial.stubs.MathService

However, take into account that the stub classes are generated from the GWSDL file, so they won't exist until we compile the service (which is when the stub classes are generated). In other words, don't look for the org.globus.progtutorial.stubs package in $TUTORIAL_DIR, because you won't find it there. If you are curious, don't worry: as soon as we generate the stub classes, we'll take a (very brief) look at the directory where they are generated.

Differences between WSDL and GWSDL

Remember that this is GWSDL, an extension of WSDL. Actually, this is not entirely true. GWSDL has certain features that WSDL 1.1 doesn't have, but which will be available in WSDL 1.2/2.0. Since WSDL 1.2/2.0 is still a W3C Working Draft (in other words, not a stable standard, and bound to change in the near future), the Global Grid Forum was unable to use WSDL 1.2/2.0 (with all its great Grid-friendly improvements) in OGSI. So, they created GWSDL as a temporary solution. In fact, the Global Grid Forum has said that, as soon as WSDL 1.2/2.0 is a W3C Recommendation (a stable standard), it will replace GWSDL with WSDL 1.2/2.0. However, since the announcement of WSRF, this has changed (as OGSI will be discontinued when WSRF appears). WSRF will not use GWSDL, but it won't wait for WSDL 1.2/2.0 either, so it will use pure WSDL 1.1 for interface description.

So, what exactly are the improvements in GWSDL? Well, they're basically related to the improvements that Grid Services introduce with respect to Web Services (as seen in What is a Grid Service?). If you are WSDL-literate, you've probably already spotted one improvement in the above code:

<gwsdl:portType name="MathPortType" extends="ogsi:GridService">


</gwsdl:portType>

First of all, notice how we're not using the WSDL <portType> tag, but a tag from the GWSDL namespace: <gwsdl:portType>. This new tag has an extends attribute. This is the first major improvement in GWSDL: portType extension. You can define a portType as an extension of one or more portTypes. In this case, we're extending from an OGSI portType called GridService (all Grid Services must implement this interface).

The second major improvement is related to Service Data, which we will see in one of the following chapters.

Step 2: Implementing the service in Java

After defining the service interface ("what the service does"), the next step is implementing that interface. The implementation is "how the service does what it says it does". The implementation of a Grid Service is simply a Java class which, besides implementing the operations described in the GWSDL file, has to meet certain requirements. This class can furthermore include additional private methods which won't be available through the Grid Service interface, but which our service can use internally.

We'll implement the service in a Java class called MathImpl which we'll place in $TUTORIAL_DIR/org/globus/progtutorial/services/core/first/impl/MathImpl.java. If you're wondering why we're saving it in that particular spot, take a look at the Tutorial Directory Structure appendix. However, if you don't completely grasp the directory structure right now, don't worry. You can safely take a leap of faith right now, save the files where the tutorial tells you to and, after you've done a couple of examples, take a look at the Tutorial Directory Structure and see how everything fits together.

So, let's start coding! The first thing we'll do in the Java file is include the package declaration, and import a couple of classes we'll need (and which will be described shortly).

package org.globus.progtutorial.services.core.first.impl;

import org.globus.ogsa.impl.ogsi.GridServiceImpl;
import org.globus.progtutorial.stubs.MathService.MathPortType;
import java.rmi.RemoteException;

Now we'll declare the MathImpl class, which will be the implementation of our Grid Service.

public class MathImpl extends GridServiceImpl implements MathPortType

Notice the following:

  • MathImpl is a child class of GridServiceImpl. All Grid Services must extend from the base class GridServiceImpl. This is what is usually called the skeleton class, because it contains the 'bare bones' -the basic functionality- common to all Grid Services. In fact, GridServiceImpl implements all the operations declared in the GridService portType which all Grid Services must extend from (as described in What is a Grid Service?)

  • Our Grid Service implements an interface named MathPortType. Remember that portType is another name for "service interface". We're telling Java that this Grid Service implements a particular portType: the Math portType. But where did MathPortType come from? This is one of the stub files which will be generated from the GWSDL file once we compile the service.

Next, we have to write a constructor for our Grid Service. Our constructor will simply call the GridServiceImpl constructor, which expects a String with a description of the Grid Service.

public MathImpl()
{
        super("Simple MathService");
}

Next, remember we said in the previous page that we want MathService to 'remember' an internal value which we'll add to and subtract from using the add and subtract methods. We simply have to declare a private integer attribute:

private int value = 0;

Finally, we implement the methods specified in the service interface: add, subtract, and getValue. Notice how, despite being very simple methods, they can all throw a RemoteException. Since they are remote methods (methods which can be accessed remotely, through the Grid Service), a RemoteException can be thrown if there is a problem between the server and the client (for example, if there is a network error).

public void add(int a) throws RemoteException
{
  value = value + a;
}

public void subtract(int a) throws RemoteException
{
  value = value - a;
}

public int getValue() throws RemoteException
{
  return value;
}
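
To see the stateful behaviour in isolation, the three methods can be exercised in a plain Java class that runs without GT3. This is only a minimal sketch: LocalMath and LocalMathDemo are hypothetical names, and the class deliberately leaves out GridServiceImpl and the generated MathPortType interface, which the real MathImpl needs.

```java
import java.rmi.RemoteException;

// Hypothetical stand-in for MathImpl: same three methods, but without the
// GT3 base class or the generated MathPortType interface.
class LocalMath
{
  private int value = 0;

  public void add(int a) throws RemoteException
  {
    value = value + a;
  }

  public void subtract(int a) throws RemoteException
  {
    value = value - a;
  }

  public int getValue() throws RemoteException
  {
    return value;
  }
}

class LocalMathDemo
{
  public static void main(String[] args) throws RemoteException
  {
    LocalMath math = new LocalMath();
    math.add(5);
    math.subtract(2);
    // The internal value is kept between calls on the same object
    System.out.println("Current value: " + math.getValue());
  }
}
```

Because every method declares RemoteException, the caller has to handle or declare it even in this local version, which is exactly what a real client of the service will have to do.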

With that, we have finished implementing our Grid Service. Now we only have a couple of simple steps before we can finally see it work!

The complete code for the implementation is shown here:

package org.globus.progtutorial.services.core.first.impl;

import org.globus.ogsa.impl.ogsi.GridServiceImpl;
import org.globus.progtutorial.stubs.MathService.MathPortType;
import java.rmi.RemoteException;

public class MathImpl extends GridServiceImpl implements MathPortType
{
  private int value = 0;

  public MathImpl()
  {
    super("Simple MathService");
  }

  public void add(int a) throws RemoteException
  {
    value = value + a;
  }

  public void subtract(int a) throws RemoteException
  {
    value = value - a;
  }

  public int getValue() throws RemoteException
  {
    return value;
  }
}
[Note]

This file is $TUTORIAL_DIR/org/globus/progtutorial/services/core/first/impl/MathImpl.java

Step 3: Configuring the deployment in WSDD

Up to this point, we have written the two most important parts of a Grid Service: the service interface (GWSDL) and the service implementation (Java). Now, we somehow have to put all these pieces together, and make them available through a Grid Services-enabled web server! This step is called the deployment of the Grid Service.

One of the key components of the deployment phase is a file called the deployment descriptor. It's the file that tells the web server how it should publish our Grid Service (for example, telling it what our service's GSH will be). The deployment descriptor is written in WSDD format (Web Service Deployment Descriptor). The deployment descriptor for our Grid Service could be something like this:

<?xml version="1.0"?>
<deployment name="defaultServerConfig" xmlns="http://xml.apache.org/axis/wsdd/"
  xmlns:java="http://xml.apache.org/axis/wsdd/providers/java">

  <service name="progtutorial/core/first/MathService" provider="Handler" style="wrapped">
    <parameter name="name" value="MathService"/>
    <parameter name="className" value="org.globus.progtutorial.stubs.MathService.MathPortType"/>
    
    <parameter name="baseClassName" value="org.globus.progtutorial.services.core.first.impl.MathImpl"/>
    <parameter name="schemaPath" value="schema/progtutorial/MathService/Math_service.wsdl"/>

    <!-- Start common parameters -->
    <parameter name="allowedMethods" value="*"/>
    <parameter name="persistent" value="true"/>
    <parameter name="handlerClass" value="org.globus.ogsa.handlers.RPCURIProvider"/>
  </service>

</deployment>
[Note]

This file is $TUTORIAL_DIR/org/globus/progtutorial/services/core/first/server-deploy.wsdd. If you're using the provided example files, you'll notice there is an extra <service> tag. You can safely ignore it for now. It will be explained later on.

Let's take a close look at what all this means...

The 'service name'

<service name="progtutorial/core/first/MathService" provider="Handler" style="wrapped">

This specifies the location where our Grid Service will be found. If we combine this with the base address of our Grid Service container, we will get the full GSH of our Grid Service. For example, if we are using the GT3 standalone container, the base URL will probably be http://localhost:8080/ogsa/services. Therefore, our service's GSH would be:

http://localhost:8080/ogsa/services/progtutorial/core/first/MathService
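
The way the GSH is put together can be sketched in a few lines of Java. GshBuilder and buildGsh are hypothetical names used only for illustration (they are not part of GT3), and the base URL is the one assumed above for the standalone container.

```java
// Hypothetical helper (not part of GT3): joins the container's base URL
// with the name attribute of the <service> element.
class GshBuilder
{
  static String buildGsh(String containerBase, String serviceName)
  {
    // Avoid a doubled slash if the base URL already ends with one
    String base = containerBase.endsWith("/")
      ? containerBase.substring(0, containerBase.length() - 1)
      : containerBase;
    return base + "/" + serviceName;
  }

  public static void main(String[] args)
  {
    String gsh = buildGsh("http://localhost:8080/ogsa/services",
                          "progtutorial/core/first/MathService");
    System.out.println(gsh);
  }
}
```

If your container listens on a different host or port, only the first argument changes; the service name always comes from the deployment descriptor.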

The 'service name' (again)

<parameter name="name" value="MathService"/>

Although this might seem confusing, the <service> element has both a name attribute (the one seen above) and a name parameter. The parameter simply contains a brief description of the service (e.g. "MathService").

className and baseClassName

<parameter name="className" value="org.globus.progtutorial.stubs.MathService.MathPortType"/>
<parameter name="baseClassName" value="org.globus.progtutorial.services.core.first.impl.MathImpl"/>

In strict WSDD, the className parameter would refer to the class which implements the service interface (in our case, MathImpl from the previous page). However, notice how the value of this parameter is the MathPortType stub interface mentioned in the previous page.

Since Grid Services are a tad more complex than plain web services, it's necessary to distinguish between the className and the baseClassName. The className refers to the interface that exposes all the functionality of the grid service (MathPortType), and baseClassName is the class that provides the implementation for our grid service. In our case, baseClassName is the MathImpl class we implemented in the previous page. When we see operation providers (in the next section) we'll see that the baseClassName doesn't need to include an implementation for all the methods exposed in the className: it's possible to delegate some or all of our methods to special classes called operation providers.

The WSDL file

<parameter name="schemaPath" value="schema/progtutorial/MathService/Math_service.wsdl"/>

The schemaPath tells the grid services container where the WSDL file for this service can be found. No, that's not a typo, I said WSDL, not GWSDL. Remember that GWSDL is a (non-standard) extension of WSDL, so it must first be converted to WSDL so it can be truly interoperable with existing web services technologies. This WSDL file (Math_service.wsdl) will be generated automatically by a GT3 tool when we compile the service.

The common parameters

<!-- Start common parameters -->
<parameter name="allowedMethods" value="*"/>
<parameter name="persistent" value="true"/>
<parameter name="handlerClass" value="org.globus.ogsa.handlers.RPCURIProvider"/>

These are three parameters which we'll see in every grid service we program.
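
Since a typo in the deployment descriptor is easy to miss, it can help to list the parameters it declares before building. The following is a minimal sketch that uses only the XML parser bundled with the JDK; WsddParameters is a hypothetical helper, not a GT3 tool, and the descriptor is embedded as a string (trimmed to a few parameters) to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of GT3): collects the <parameter> elements
// of a WSDD document into a name -> value map, in document order.
class WsddParameters
{
  static Map<String, String> read(String wsdd) throws Exception
  {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse(
      new ByteArrayInputStream(wsdd.getBytes(StandardCharsets.UTF_8)));

    Map<String, String> params = new LinkedHashMap<String, String>();
    NodeList nodes = doc.getElementsByTagName("parameter");
    for (int i = 0; i < nodes.getLength(); i++)
    {
      Element p = (Element) nodes.item(i);
      params.put(p.getAttribute("name"), p.getAttribute("value"));
    }
    return params;
  }

  public static void main(String[] args) throws Exception
  {
    String wsdd =
      "<deployment xmlns=\"http://xml.apache.org/axis/wsdd/\">"
      + "<service name=\"progtutorial/core/first/MathService\""
      + " provider=\"Handler\" style=\"wrapped\">"
      + "<parameter name=\"name\" value=\"MathService\"/>"
      + "<parameter name=\"persistent\" value=\"true\"/>"
      + "<parameter name=\"allowedMethods\" value=\"*\"/>"
      + "</service></deployment>";

    for (Map.Entry<String, String> e : read(wsdd).entrySet())
    {
      System.out.println(e.getKey() + " = " + e.getValue());
    }
  }
}
```

To check a real descriptor, the string could be replaced with the contents of server-deploy.wsdd.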

Step 4: Create a GAR file with Ant

At this point we have (1) a service interface in GWSDL, (2) a service implementation in Java, and (3) a deployment descriptor telling the grid services container how to present (1) and (2) to the outer world. However, all this is a bunch of loose files. How are we supposed to place this in a grid services container? Do we have to copy these files to strategically located directories? And what about the Java file? We haven't compiled it yet!

Fear not, for this is the step when everything comes together in perfect harmony. Using the three files we wrote in the previous three pages, we'll generate a Grid Archive, or GAR file. This GAR file is a single file which contains all the files and information the grid services container needs to deploy our service and make it available to the whole world. In fact, in the next page we'll instruct a simple grid services container to take the GAR and deploy it.

However, creating a GAR file is a pretty complex task which involves the following:

  • Converting the GWSDL into WSDL

  • Creating the stub classes from the WSDL

  • Compiling the stub classes

  • Compiling the service implementation

  • Organizing all the files into a very specific directory structure

Don't be scared by all this. Thanks to the hard work of the Globus guys and gals, we can do all this in a single step using a very useful tool called Ant.

Ant

Ant, an Apache Software Foundation project, is a Java build tool. In concept, it is very similar to the classic UNIX make command. It allows programmers to forget about the individual steps involved in obtaining an executable from the source files, which will be taken care of by Ant. Each project is different, so the individual steps are described in a build file ('Makefile' in the make jargon). This build file directs Ant on what it should compile, how it should compile it, and in what order. This simplifies the whole process considerably. In fact, it reduces the number of steps to one! With Ant, all we have to worry about is writing the service interface, the service implementation, and the deployment descriptor. Ant takes care of the rest.

Ant generates the GAR directly from the three source files. Internally, it carries out all the steps listed earlier, sparing us the cumbersome task of doing them ourselves. In a GT3 project, Ant uses two sets of build files: a couple of build files which are a part of GT3, and a build file we'll have to write on our own. The GT3 build files cover all the important steps (generating the WSDL code, generating the stubs, ...). Our build file essentially has all the unique parameters of our Grid Service, and a bunch of calls to the GT3 build files. At first, it is safe to know practically nothing about Ant and build files; you can usually write a 'generic' build file which will work with more than one Grid Service, and then you won't have to see build files ever again. In fact, this tutorial includes a handy build file that works with all the examples we'll see. However, as you move on to more complex projects, you'll probably need to write custom build files to fine-tune the whole build process.

If you want to learn more about Ant, take a look at the Ant Website. It includes plenty of documentation, tutorials, etc.

Our handy multipurpose buildfile and script

For the rest of the tutorial, we are going to use a very handy Ant build file which will work with all the examples we'll see. That way, we won't have to rewrite the build file each time we get to a new example. The build file is included in the downloadable tutorial files (available on the tutorial website). Since this tutorial isn't meant as an Ant tutorial, we won't go through the contents of the build file, but feel free to take a look inside.

Since using the Ant file directly implies passing a lot of parameters to Ant, we'll also use a handy shell script which makes things even simpler (also included with the tutorial files).

However, before doing anything, make sure you create a file called build.properties in $TUTORIAL_DIR with the following line:

ogsa.root=path to GT3 installation

Replace path to GT3 installation with the path where you installed GT3. For example:

ogsa.root=/usr/local/gt3
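
Ant property files follow the java.util.Properties format, so a quick sanity check of build.properties can be sketched with the standard library alone. BuildPropertiesCheck is a hypothetical helper (it is not shipped with the tutorial files), and the file contents are passed in as a string here to keep the example self-contained.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper: verifies that a properties source defines ogsa.root.
class BuildPropertiesCheck
{
  static String ogsaRoot(Reader source) throws IOException
  {
    Properties props = new Properties();
    props.load(source);
    String root = props.getProperty("ogsa.root");
    if (root == null || root.trim().isEmpty())
    {
      throw new IllegalStateException("ogsa.root is not set");
    }
    return root.trim();
  }

  public static void main(String[] args) throws IOException
  {
    // In practice this would be a FileReader on $TUTORIAL_DIR/build.properties
    Reader source = new StringReader("ogsa.root=/usr/local/gt3\n");
    System.out.println("GT3 installation: " + ogsaRoot(source));
  }
}
```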

Creating the MathService GAR

Using the provided Ant buildfile and the handy script, building a Grid Service is as simple as doing the following:

./tutorial_build.sh <service base directory> <service's GWSDL file>

The "service base directory" is the directory where we placed the server-deploy.wsdd file, and where the MathImpl.java file can be found (inside an impl directory). More details on this in the Tutorial Directory Structure appendix.

For example, to build our first example and generate its GAR file, we simply need to do the following:

./tutorial_build.sh \
org/globus/progtutorial/services/core/first \
schema/progtutorial/MathService/Math.gwsdl
[Note]

Make sure you run this from $TUTORIAL_DIR.

[Note]

Windows users can use a Python build script originally contributed by Michael Schneider. This script is called tutorial_build.py and is included with the downloadable tutorial files. If you prefer to use the Python script, simply replace tutorial_build.sh with tutorial_build.py in all the following examples.

If everything works fine, the GAR file will be placed in $TUTORIAL_DIR/build/lib. To be exact, the GAR file generated for this example will be the following:

$TUTORIAL_DIR/build/lib/org_globus_progtutorial_services_core_first.gar
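
Judging from this single example, the GAR name appears to be the service base directory with each slash replaced by an underscore, plus the .gar extension. The sketch below reproduces that pattern; treat the rule as an inference from the example rather than a documented guarantee of the build file, and GarName as a hypothetical helper.

```java
// Hypothetical helper: reproduces the naming pattern seen in the example
// above (slashes in the service base directory become underscores).
class GarName
{
  static String forServiceDir(String tutorialDir, String serviceBaseDir)
  {
    String name = serviceBaseDir.replace('/', '_') + ".gar";
    return tutorialDir + "/build/lib/" + name;
  }

  public static void main(String[] args)
  {
    System.out.println(
      forServiceDir("$TUTORIAL_DIR", "org/globus/progtutorial/services/core/first"));
  }
}
```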

Step 5: Deploy the service into a grid services container

The GAR file, as mentioned in the previous page, contains all the files and information the web server needs to deploy the Grid Service. Deployment is also done with the Ant tool, which unpacks the GAR file and copies the files within (WSDL, compiled stubs, compiled implementation, WSDD) into key locations in the GT3 directory tree. It also reads our deployment descriptor and configures the web server to take our new Grid Service into account.

This deployment command must be run from the root of your GT3 installation. Furthermore, you need to run it as a user that has write permission in that directory.

ant deploy -Dgar.name=<full path of GAR file>

For the GAR file we've just created, this would be:

ant deploy \
-Dgar.name=$TUTORIAL_DIR/build/lib/org_globus_progtutorial_services_core_first.gar

Deployment is really as simple as that! That also concludes the five steps necessary to write and deploy a Grid service. However, although you're probably beaming with pride because you've deployed your first Grid Service, you'll certainly want to make sure that it works. We'll try out our recently deployed service using a very simple client application.

A simple client

We're going to test our Grid Service with a command-line client which will invoke the add method and the getValue method. This client will receive two arguments from the command line:

  1. The Grid Service Handle (GSH)

  2. Number to add

The client class will be called Client and we'll place it in the $TUTORIAL_DIR/org/globus/progtutorial/clients/MathService/Client.java file. Again, you can find more information about the directory structure followed in the tutorial in the Tutorial Directory Structure appendix.

The full code for the client is the following:

package org.globus.progtutorial.clients.MathService;

import org.globus.progtutorial.stubs.MathService.service.MathServiceGridLocator;
import org.globus.progtutorial.stubs.MathService.MathPortType;

import java.net.URL;

public class Client
{
  public static void main(String[] args)
  {
    try
    {
      // Get command-line arguments
      URL GSH = new java.net.URL(args[0]);
      int a = Integer.parseInt(args[1]);

      
      // Get a reference to the MathService instance
      MathServiceGridLocator mathServiceLocator = new MathServiceGridLocator();
      MathPortType math = mathServiceLocator.getMathServicePort(GSH);

      // Call remote method 'add'
      math.add(a);
      System.out.println("Added " + a);

      // Get current value through remote method 'getValue'
      int value = math.getValue();
      System.out.println("Current value: " + value);
    }catch(Exception e)
    {
      System.out.println("ERROR!");
      e.printStackTrace();
    }
  }
}
[Note]

This file is $TUTORIAL_DIR/org/globus/progtutorial/clients/MathService/Client.java

As you can see, writing a Grid Service client is very easy. With only two lines we obtain a reference to the Math portType. In following sections, as we introduce things like service data, notifications, and factories, the code to obtain that reference will be slightly longer. However, the important point is that, once we have that reference, we can work with the Grid Service as if it were a local object. Notice, however, that we have to put the whole code inside a try/catch block, because the Grid Service methods (in this example, the add and getValue methods) can throw RemoteExceptions.
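
The client above assumes that both command-line arguments are present and well-formed. A slightly more defensive variant could validate them before contacting the service. The following sketch shows only that validation step, with no GT3 classes involved; ClientArgs is a hypothetical helper that is not part of the tutorial files.

```java
import java.net.URI;
import java.net.URL;

// Hypothetical helper: parses the two client arguments (GSH and number).
class ClientArgs
{
  final URL gsh;
  final int number;

  private ClientArgs(URL gsh, int number)
  {
    this.gsh = gsh;
    this.number = number;
  }

  // Returns null when the arguments are missing or malformed
  static ClientArgs parse(String[] args)
  {
    if (args.length != 2)
    {
      return null;
    }
    try
    {
      URL gsh = new URI(args[0]).toURL();
      int number = Integer.parseInt(args[1]);
      return new ClientArgs(gsh, number);
    }
    catch (Exception e)
    {
      // Covers a malformed or relative URL as well as a non-numeric second argument
      return null;
    }
  }

  public static void main(String[] args)
  {
    ClientArgs parsed = parse(args);
    if (parsed == null)
    {
      System.out.println("Usage: Client <GSH> <number to add>");
      return;
    }
    System.out.println("GSH: " + parsed.gsh + ", number: " + parsed.number);
  }
}
```

With a check like this in place, a missing argument produces a usage message instead of a stack trace.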

We are now going to compile the client. Before running the compiler, make sure you run the following:

source $GLOBUS_LOCATION/etc/globus-devel-env.sh

The globus-devel-env script takes care of putting all the Globus libraries into your CLASSPATH. When compiling the service, Ant took care of this but, since we're not using Ant to compile the client, we need to run the script.

To compile the client, do the following:

javac \
-classpath ./build/classes/:$CLASSPATH \
org/globus/progtutorial/clients/MathService/Client.java

./build/classes is a directory generated by Ant where all the compiled stub classes are placed. We need to include this directory in the CLASSPATH so our client can access generated stub classes such as MathServiceGridLocator. Before running the client, we need to start up the standalone container. Otherwise, our Grid Service won't be available, and the client will crash. The following command must be run from the root of your GT3 installation:

globus-start-container