What is Globus? -- It is funny, but the answer to that question isn't exactly straightforward in concrete terms. In fairly abstract terms, it is a set of open protocols for doing things like
At a somewhat less abstract layer, Globus is a set of software tools that implement the protocols that are defined by Globus. To be Globus compatible, then, means implementing the protocols so that other things that implement the protocols can interoperate. The Globus group has implemented many such tools; some have been decommissioned from Globus while others are being added. The following image gives the best picture I could find of the global Globus architecture.
The core Globus services in the figure are really what most people mean when they say the word "Globus."
the true core -- At its most concrete, Globus is really three core services
GridFTP is an increasingly popular file transfer utility. In addition to its use of the GSI (making it far more secure, in unspecified units of security, than regular FTP), it can do things like third-party controlled transfers, which are really useful.
GASS, as mentioned, stands for "Globus Access to Secondary Storage." It is a file replication facility that can cache a file local to where it is to be accessed. Again, it does this using the very general naming mechanisms implemented by the MDS and securely using the GSI.
Replica Catalog is an LDAP registry for managing replicated files. Applications can query the catalog for file attributes and determine if, when, and which replica to use.
Tools that use the core Globus quarks include MPICH-G, a Globus-enabled version of MPI, and Condor-G, a Globus-enabled version of Condor. Essentially, these are versions of native distributed tools that have been converted to use the Globus quarks (and sometimes the Globus electrons, protons, and neutrons) to provide the same functionality. A legitimate question to ask is: what benefit do these tools get from using Globus? There are primarily two answers to this question. The first is security. One of the things the Globus project did right was to recognize security as a critical issue for enabling Grid computing. By using the Globus quarks, a tool transparently picks up a secure infrastructure. The second benefit is interoperability, in the sense that Globus-enabled tools are more likely to work well with each other than non-Globus-enabled tools.
The first I will paraphrase as "a universal user ID." If we are to do anything in a completely distributed environment, we will need a way to assign an unforgeable user ID to people who use the system. Think of what happens when you log on to a Unix system. When you type your password, the kernel looks in a protected file (the password file) for a number to associate with the password you typed in. It takes that number and makes sure that every process you run (with a few perverse exceptions) is associated with that user ID. In fact, the ID is embedded in the process state that the kernel keeps for each process. When the kernel checks file permissions, for example, it looks at the user ID of the accessing process when determining if the access is legal.
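To make that last step concrete, here is a rough Python sketch of the kind of check involved. It is a simplified model of the classic owner/group/other rule, not the actual kernel code, and the function name and arguments are made up for illustration.

```python
import stat

def may_read(proc_uid, proc_gids, file_uid, file_gid, file_mode):
    """Simplified model of a Unix read-permission check.

    The user ID stored with the process decides which permission bits
    apply: owner bits if the IDs match, otherwise group bits, otherwise
    the bits for everyone else. Root (uid 0) is always allowed here.
    """
    if proc_uid == 0:
        return True
    if proc_uid == file_uid:
        return bool(file_mode & stat.S_IRUSR)
    if file_gid in proc_gids:
        return bool(file_mode & stat.S_IRGRP)
    return bool(file_mode & stat.S_IROTH)

# A file owned by uid 1000, group 100, readable only by its owner.
owner_ok = may_read(1000, {100}, 1000, 100, 0o600)
other_ok = may_read(1001, {100}, 1000, 100, 0o600)
```

The point is only that the decision hangs entirely on the user ID attached to the process, which is why that ID has to be unforgeable.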
authentication -- The process of unambiguously associating an identifier with a person is called "authentication." Usually, a password is the chief mechanism. If a person requests access, and knows the password, a user ID can be assigned to that person on the strength of that knowledge. Notice, though, that you could use other methods. A retinal scan, for example, could also be used. Instead of typing a password to log into your Linux system, you could use a retinal scan to determine which user ID should be assigned to the login shell. It would just make your machines a lot more expensive.
authentication is hard -- In a single O.S. environment, authentication requires careful programming, but because the kernel runs in supervisor mode it can protect a user ID once it is established. It is not possible, for example, for you to write a program that overwrites the user ID of your process with another user ID -- the kernel will not permit it because of the page table permissions and there isn't a system call. As such, when a user ID is stolen or forged, it is because of negligent programming.
distributed authentication -- If we are going to pull this trick off in a distributed environment, though, there is no "supervisor mode" that we can ensure across machine domains. Instead, we must rely on encryption. Without going into a huge diatribe on the various forms of encryption, it is safe to generalize the encryption-based authentication problem as consisting of two subproblems
First, you need to believe that it is possible to generate a pair of different numbers such that a message encrypted with one can be decrypted only with the other. RSA explains how to do that. Next, you keep one of the numbers protected. The entity that has control over the protected number (called the private key) is the "authentic" entity. We'll get to the obvious problem with this distinction in a minute. The other number is called a "public key" and it can be passed around as needed.
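If you want to convince yourself that such a pair of numbers exists, the following toy Python sketch builds one. The primes are two well-known Mersenne primes picked only so the example is reproducible; real keys use large random primes and padding, so treat this as an illustration of the arithmetic and nothing more. All names are hypothetical.

```python
# Toy RSA key pair (illustration only, far too weak for real use).
P = 2**61 - 1                  # a known Mersenne prime
Q = 2**89 - 1                  # another known Mersenne prime
N = P * Q                      # modulus shared by both keys
PHI = (P - 1) * (Q - 1)
E = 65537                      # public exponent
D = pow(E, -1, PHI)            # private exponent (needs Python 3.8+)

def apply_key(exponent, value):
    """Encrypt or decrypt: the same operation with a different exponent."""
    return pow(value, exponent, N)

message = 42
ciphertext = apply_key(E, message)      # encrypted with the public key
recovered = apply_key(D, ciphertext)    # only the private key undoes it
```

The roles are symmetric: a value transformed with D can be undone only with E, which is what makes signatures possible.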
certificates and a certificate authority -- With public key-private key pairs available, there are several ways to handle authentication. Globus uses the X.509 certificate-based method and a certificate authority (CA). When you first decide to use Globus -- the very first time -- you need to apply for a certificate. You will do this for your next project. You verify that you are who you say you are and the CA digitally signs a message using its private key. The CA advertises its public key to anyone and everyone. When you request a certificate, you must present your own public key which the CA includes in the signed version it sends back to you.
you and the authenticator -- When you go to use a resource (e.g. log in to a computer) you send your certificate to whatever entity is doing the authentication. If it trusts the CA that issued the certificate (a relationship that has been previously set up) it has the public key for that CA on hand. It can decrypt the digital signature with that key and so verify that it was encrypted by the CA that signed it. The certificate cannot be tampered with since the digital signature contains a kind of checksum (called a hash). As such, the authenticator can then trust that the public key contained in the certificate is your public key.
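Here is a small Python sketch of the sign-then-verify idea. It uses a toy RSA key for the CA and a SHA-256 hash reduced below the modulus; real X.509 signatures use proper padding and encodings, so this is only a model under those assumptions. The key values, function names, and the stand-in user public key are all hypothetical.

```python
import hashlib

# Toy CA key pair (hypothetical values, far too small for real use).
CA_N = (2**61 - 1) * (2**89 - 1)
CA_E = 65537                                        # CA public exponent
CA_D = pow(CA_E, -1, (2**61 - 2) * (2**89 - 2))     # CA private exponent

def digest(subject, user_public_key):
    """Hash the certificate contents and reduce the hash below the modulus."""
    data = f"{subject}|{user_public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % CA_N

def ca_sign(subject, user_public_key):
    """CA side: encrypt the hash with the CA's private key."""
    return pow(digest(subject, user_public_key), CA_D, CA_N)

def verify(subject, user_public_key, signature):
    """Authenticator side: decrypt with the CA's public key and compare."""
    return pow(signature, CA_E, CA_N) == digest(subject, user_public_key)

subject = "/O=Grid/O=ucsb/CN=Richard Wolski"
user_key = 123456789                   # stand-in for the user's public key
signature = ca_sign(subject, user_key)

genuine = verify(subject, user_key, signature)
tampered = verify(subject, 987654321, signature)    # swapped public key
```

Changing either the subject or the public key changes the hash, so a signature made over the original contents no longer checks out.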
public key challenge -- Now, the authenticator can check to see that you hold the corresponding private key. To do so, it sends you a message that you must encrypt with your private key and send back. If the authenticator can successfully decrypt the reply using the public key from your certificate, you are the correct owner.
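The challenge can be sketched in a few lines of Python. As before, this uses a toy RSA key and invented function names; it shows the shape of the exchange, not the protocol the GSI actually runs on the wire.

```python
import secrets

# Toy user key pair (hypothetical values, illustration only).
N = (2**61 - 1) * (2**89 - 1)
E = 65537                                       # public, from the certificate
D = pow(E, -1, (2**61 - 2) * (2**89 - 2))       # private, never leaves the user

def make_challenge():
    """Authenticator: pick a fresh random number so replies cannot be replayed."""
    return secrets.randbelow(N - 2) + 2

def respond(challenge, private_exponent):
    """User: encrypt the challenge with the private key."""
    return pow(challenge, private_exponent, N)

def check(challenge, response):
    """Authenticator: decrypt with the public key and compare."""
    return pow(response, E, N) == challenge

challenge = make_challenge()
accepted = check(challenge, respond(challenge, D))

# Someone who holds the certificate but guesses at the private key fails.
fixed = 123456789
rejected = check(fixed, respond(fixed, D + 2))
```

Note that the private key itself is never sent anywhere; only the proof that you can use it crosses the network.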
In summary, here are the steps. First you need to get a certificate. You
trust assumptions -- Whenever you look at a security mechanism, you should ask "What assumptions about trust must I make to believe that this mechanism will work?" For Globus, you are essentially trusting two entities. First, you must trust the administrator who installed Globus. For example, if the authenticator is compromised, it may correctly authenticate you, but allow other "inauthentic" access to the resources. If those "bad guys" are allowed to run on the resources you are using, but you are assuming that they are not, your trust has been violated. Secondly, you must trust the certificate authority. If the CA is compromised, it can control who can access the resources.
scalable key management -- If you think about the whole CA mechanism a bit, you will probably see that it is really a way for the authenticator (called a gatekeeper in the Globus parlance) to get a public key for you that it trusts belongs to you. Consider ssh. Ssh uses public key/private key encryption as well. Typically, when you are first granted a login, you get an initial password (over the phone hopefully) that you can use to get in and install your own public key. Because someone spoke to you and verified that you are who you say you are (at least with a phone call), the public key you installed could only have gotten there because you had the password.
The problem is that this mechanism for installing the keys is not scalable. It works fine if the number of logins you need is small, but you probably don't want to field 10,000 phone calls to get access to 10,000 machines. With the CA, you make one phone call (to the CA administrator) and you have a secure way of passing your public key to the resource site.
After setting up your environment so it can find the necessary globus tools, you type
    bash$ mkdir .globus
    bash$ grid-cert-request

Globus wants to put your certificates and the like in a .globus directory under your home directory. You will be prompted for a pass phrase. This pass phrase cannot be recovered so you had better not forget it. If you do, you will have to start the entire process over and, as we will see, it can take time.
After you have entered your pass phrase (twice), you will see a set of instructions:
    A private key and a certificate request has been generated with the subject:
    /O=Grid/O=ucsb/CN=Richard Wolski
    If the CN=Richard Wolski is not appropriate, rerun this script with the -force -cn "Common Name" options.
    Your private key is stored in /cs/faculty/rich/.globus/userkey.pem
    Your request is stored in /cs/faculty/rich/.globus/usercert_request.pem
    Please e-mail the request to the Globus CA graziano@cs.ucsb.edu
    You may use a command similar to the following:
    cat /cs/faculty/rich/.globus/usercert_request.pem | mail graziano@cs.ucsb.edu
    Only use the above if this machine can send AND receive e-mail. if not, please mail using some other method.
    Your certificate will be mailed to you within two working days.
    If you receive no response, contact Globus CA graziano@cs.ucsb.edu

The certificate has been created for you, but it hasn't been signed by a CA. In our case, Graziano Obertelli will be the CA for our work. Globus is configured, by default, to use a CA that ISI/USC maintains. In that case, the instructions would tell you where to mail your certificate request for signature and you'd probably get a piece of email back asking for a phone call. Some institutions actually require a fax of some ID. You could even request a face-to-face meeting. Notice, though, that this authentication step is the linchpin of the trust relationship. If you forget your pass phrase, you must be re-authenticated by the CA.
Once the CA is satisfied that you are who you say you are, it will send you a user certificate that you must install in your .globus directory as the file usercert.pem. Here is mine
Certificate:
Data:
Version: 1 (0x0)
Serial Number: 8 (0x8)
Signature Algorithm: md5WithRSAEncryption
Issuer: O=Grid, O=ucsb, CN=Mayhem Certification Authority
Validity
Not Before: Jan 15 02:36:22 2002 GMT
Not After : Jan 15 02:36:22 2003 GMT
Subject: O=Grid, O=ucsb, CN=Richard Wolski
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
RSA Public Key: (1024 bit)
Modulus (1024 bit):
00:b4:cd:fa:74:66:cd:8b:33:f3:26:fb:34:22:09:
4f:25:f7:05:bf:79:0e:2e:bd:d8:0b:59:d5:c2:20:
43:bd:e3:26:70:cf:e4:36:05:8f:1d:bf:ea:08:b4:
de:38:87:43:f3:08:a4:e9:ff:91:98:63:53:a6:e3:
f5:dc:bc:82:32:43:56:75:29:cf:59:2f:f2:e4:f8:
16:28:d7:b8:b9:64:12:bf:03:b0:45:94:c6:65:23:
64:e6:7e:d2:78:29:1b:c3:a4:f2:24:e9:8f:bf:99:
df:9c:d4:ac:31:be:ef:5d:0a:aa:d8:16:fb:e3:ad:
91:18:44:24:a0:d0:c6:ed:83
Exponent: 65537 (0x10001)
Signature Algorithm: md5WithRSAEncryption
93:8a:58:b7:68:2f:1f:48:00:bc:ae:d5:36:44:d9:bc:22:bb:
e0:b5:d3:76:43:b0:8d:b6:82:a7:3b:19:47:0a:83:1b:d3:22:
01:69:bc:27:9f:a6:53:73:10:ee:c7:88:4a:84:c2:bf:e3:15:
1b:93:af:57:d7:b7:6a:10:84:bb:59:0b:c3:db:04:18:54:b6:
53:08:4c:3f:10:a8:12:e2:78:3b:a5:5c:13:47:d5:59:81:94:
4f:96:7b:f5:31:9b:78:6a:f5:47:dd:f3:f4:92:57:79:9f:62:
e6:8e:4b:cb:ba:04:af:c9:df:45:2c:d6:bf:35:00:63:22:1f:
79:97
-----BEGIN CERTIFICATE-----
MIIB7TCCAVYCAQgwDQYJKoZIhvcNAQEEBQAwRzENMAsGA1UEChMER3JpZDENMAsG
A1UEChMEdWNzYjEnMCUGA1UEAxMeTWF5aGVtIENlcnRpZmljYXRpb24gQXV0aG9y
aXR5MB4XDTAyMDExNTAyMzYyMloXDTAzMDExNTAyMzYyMlowNzENMAsGA1UEChME
R3JpZDENMAsGA1UEChMEdWNzYjEXMBUGA1UEAxMOUmljaGFyZCBXb2xza2kwgZ8w
DQYJKoZIhvcNAQEBBQADgY0AMIGJAoGBALTN+nRmzYsz8yb7NCIJTyX3Bb95Di69
2AtZ1cIgQ73jJnDP5DYFjx2/6gi03jiHQ/MIpOn/kZhjU6bj9dy8gjJDVnUpz1kv
8uT4FijXuLlkEr8DsEWUxmUjZOZ+0ngpG8Ok8iTpj7+Z35zUrDG+710KqtgW++Ot
kRhEJKDQxu2DAgMBAAEwDQYJKoZIhvcNAQEEBQADgYEAk4pYt2gvH0gAvK7VNkTZ
vCK74LXTdkOwjbaCpzsZRwqDG9MiAWm8J5+mU3MQ7seISoTCv+MVG5OvV9e3ahCE
u1kLw9sEGFS2UwhMPxCoEuJ4O6VcE0fVWYGUT5Z79TGbeGr1R93z9JJXeZ9i5o5L
y7oEr8nfRSzWvzUAYyIfeZc=
-----END CERTIFICATE-----
Also in the .globus directory is a file named userkey.pem. This file contains your private key and must be
protected. If this file is stolen, your certificate (which you do not have to
protect) can be presented by anyone who holds the private key and will
authenticate them as you.
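One cheap precaution is to make sure nobody but you can read the key file. The Python sketch below checks that on a Unix-like system. The helper name is made up, the "owner-only" rule is my assumption about what "protected" should mean here, and the demonstration runs on a throwaway file rather than a real userkey.pem.

```python
import os
import stat
import tempfile

def key_is_private(path):
    """True if group and others have no access bits at all on the file."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0

# Demonstration on a temporary stand-in for ~/.globus/userkey.pem.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "userkey.pem")
    open(demo, "w").close()
    os.chmod(demo, 0o600)              # owner read/write only
    owner_only = key_is_private(demo)
    os.chmod(demo, 0o644)              # readable by everyone
    world_readable = key_is_private(demo)
```

File permissions only help against other users on the same machine, of course; the pass phrase is what protects the key if the file itself is copied.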
delegation, message integrity, message confidentiality -- We won't delve into these realms too much other than to recognize that scalable authentication is not the only function of the GSI. Delegation refers to the idea that once you have been authenticated to a resource, the Globus software can delegate processes to work on your behalf and ensure that they, too, have been authenticated. Doing so securely is trickier than it sounds, but basically it means that when your process forks or sends a message to another process via SSL, Globus can trace the authentication chain back securely to your first authentication event. Message integrity means that when you receive a message, you are sure it is the message that the authenticated sender sent you. That is, the messages are tamper-proof. Message confidentiality means that an unauthenticated listener cannot determine the contents of a message.
The main function of the Globus MDS is to provide information about the Grid in a uniform format, using a standard set of protocols. It sounds simple, right? Well it's not.
To act as a general information service, the MDS must
naming -- The naming scheme that the MDS uses is essentially the one used by LDAP (as described in the LDAP lecture). Names are hierarchical, and contain keyword-attribute pairs. Even this decision is viewed as controversial in some quarters of the Grid community although it does not seem that it can be otherwise, to me.
queries -- Queries are attribute based and use the LDAP query syntax. As you can probably tell, much of the inspiration for the MDS (and much of its functionality) comes from LDAP.
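To give a feel for what "attribute based" means, here is a small Python sketch that matches a single attribute against a wildcard pattern. It only approximates LDAP filter semantics (real filters support and/or/not combinations and per-attribute matching rules), and the second directory entry is invented for the example.

```python
from fnmatch import fnmatchcase

def matches(entry, attribute, pattern):
    """True if any value of `attribute` matches the wildcard pattern.

    Attribute names are compared case-insensitively; values are compared
    exactly, which is a simplification of LDAP's matching rules.
    """
    wanted = attribute.lower()
    for name, values in entry.items():
        if name.lower() == wanted:
            return any(fnmatchcase(value, pattern) for value in values)
    return False

def search(entries, attribute, pattern):
    """Return every entry that satisfies the one-attribute filter."""
    return [entry for entry in entries if matches(entry, attribute, pattern)]

# Each entry maps attribute names to lists of values (attributes can repeat).
directory = [
    {"Mds-Host-hn": ["dagwood"], "Mds-Os-name": ["SunOS"]},
    {"Mds-Host-hn": ["example-host"], "Mds-Os-name": ["Linux"]},  # made up
]

hits = search(directory, "mds-host-hn", "dag*")
```

A filter such as Mds-Host-hn=* would match both entries, since every host has some value for that attribute.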
registrations -- Here, things deviate from "standard" LDAP a bit. The open versions of LDAP are essentially query-optimized look-up services. They were designed to support high query rates, but relatively low update rates. For example, it is possible to manage user accounts using LDAP. The number of account creations is dwarfed by the number of password queries on a daily basis. Registering "updates" in LDAP, then, must be carefully managed.
The current MDS is actually organized as a hierarchy of servers. At the bottom level are the GRIS servers, one per resource typically. GRIS stands for Grid Resource Information Server and it is responsible for gathering and maintaining the "raw" data.
Above the GRIS "leaf" nodes is a tree of GIIS (Grid Information Index Servers) nodes. They are responsible for "indexing" the raw data in the database sense of the word. Recall from your database knowledge that often, for performance reasons, you can build an index of pointers "over" your baseline data as a way of improving access speed. A GIIS is intended to serve the same purpose. Each one "optimizes" the query characteristics for a different view of the data provided by the base GRIS level.
GRRP -- Obviously, if all GRIS nodes broadcast all of their data to all GIIS nodes, there would be a scalability problem. Instead, a GRIS registers information about the information it maintains with one or more GIIS nodes. To do so, it uses the Grid Resource Registration Protocol in which a small message is sent to the GIISes telling them what data a GRIS can provide. As with the NWS (which used this technique long before the MDS did but which is never acknowledged by the MDS designers, I might add), the registrations are soft-state, meaning that they must be refreshed periodically or they will time out.
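Soft state is easy to sketch. The Python class below is a minimal model of a registry whose entries disappear unless they are refreshed; it is not GRRP itself, the names are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class SoftStateRegistry:
    """Registrations that expire unless refreshed within `ttl` seconds."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._expires = {}              # provider name -> expiry time

    def register(self, name, now):
        """Create or refresh a registration; the same message does both."""
        self._expires[name] = now + self.ttl

    def live(self, now):
        """Drop timed-out entries and return the names still registered."""
        self._expires = {n: t for n, t in self._expires.items() if t > now}
        return sorted(self._expires)

registry = SoftStateRegistry(ttl=30)
registry.register("gris-a", now=0)
registry.register("gris-b", now=0)
registry.register("gris-a", now=20)     # gris-a refreshes, gris-b goes quiet

at_25 = registry.live(now=25)
at_40 = registry.live(now=40)
```

The attraction of this design is that a crashed provider needs no explicit clean-up: its registration simply times out.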
GRIP -- When a particular GIIS wants to get the actual data maintained by a GRIS, it uses a well-defined protocol called the Grid Resource Interrogation Protocol or GRIP. A GIIS is free to GRIP its GRIS any time it wants to, with whatever frequency will meet its needs.
managing dynamically generated data -- While the GIIS/GRIS architecture for the MDS is scalable, there is still a problem with data that is updated very frequently. In particular, the NWS sensors cannot send data directly to a GRIS because the update frequency would overload it. Instead, the GRIS implements an information provider protocol that the NWS can understand. The GRIS tells its GIISes that it has the NWS data, but when a GIIS asks for it (using GRIP), the GRIS makes a query to the NWS on demand. The following figure depicts this unholy alliance.
Notice how flexible the architecture is, and the power of open protocols. Because GRIP and GRRP are published, the NWS can speak them directly without using the intermediate GRIS node.
The box in the middle of the figure should be familiar to you. It is, in fact, the same NWSlapd that you have been using all along. The disadvantage of this organization is that the GRIS provides functionality (like certificate-based security) that the NWS will have to track if we do it this way. On the other hand, the producer protocol should also be secure meaning that the NWS will need to be X.509 compliant either way.
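The on-demand arrangement described above, where a GRIS advertises the NWS data but only asks the NWS for it when a query arrives, can be sketched in Python as follows. The class, the method names, and the fake NWS function are all invented for illustration and do not correspond to the real MDS or NWS interfaces.

```python
class Gris:
    """Advertises which data it can supply but fetches it only when asked."""

    def __init__(self):
        self._providers = {}            # data name -> zero-argument callable

    def add_provider(self, name, fetch):
        self._providers[name] = fetch

    def advertised(self):
        """What a registration would list: names only, no values."""
        return sorted(self._providers)

    def query(self, name):
        """Answer an inquiry by calling the provider at query time."""
        return self._providers[name]()

calls = []

def fake_nws_cpu():
    """Stand-in for a real query to the NWS."""
    calls.append("asked")
    return 0.9901

gris = Gris()
gris.add_provider("availableCpu", fake_nws_cpu)

names = gris.advertised()               # registration: the NWS is not contacted
calls_before = len(calls)
value = gris.query("availableCpu")      # inquiry: the NWS is asked now
```

The update rate of the underlying sensor never reaches the GRIS; only the query rate does.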
    grid-info-search -x -b 'mds-vo-name=mayhem, o=Grid' -h pompone.cs.ucsb.edu -p 2135 'Mds-Host-hn=dagwood'

In our environment, because Globus tries to install a common environment and because our DNS does not return a fully qualified host name, you need to type this string EXACTLY as shown. The '-b' parameter tells the search command to go to the virtual organization "mayhem" under the organization "Grid" when beginning its search. Otherwise, it will start at the Grid root and we haven't yet registered our VO with the world wide family of Globus. If you do use this string, however, you are rewarded with the following information about "dagwood."
    #
    # filter: Mds-Host-hn=dagwood
    # requesting: ALL
    #
    # dagwood, mayhem, Grid
    dn: Mds-Host-hn=dagwood,Mds-Vo-name=mayhem,o=Grid
    objectClass: MdsComputer
    objectClass: MdsComputerTotal
    objectClass: MdsFsTotal
    objectClass: MdsHost
    objectClass: MdsNet
    objectClass: MdsNetTotal
    objectClass: MdsOs
    Mds-Computer-isa: i386
    Mds-Computer-platform: i86pc
    Mds-Computer-Total-nodeCount: 1
    Mds-Fs-freeMB: 0
    Mds-Fs-freeMB: 1194
    Mds-Fs-freeMB: 1353
    Mds-Fs-freeMB: 3254
    Mds-Fs-freeMB: 430
    Mds-Fs-freeMB: 872
    Mds-Fs-sizeMB: 0
    Mds-Fs-sizeMB: 1194
    Mds-Fs-sizeMB: 2013
    Mds-Fs-sizeMB: 3290
    Mds-Fs-sizeMB: 486
    Mds-Fs-sizeMB: 995
    Mds-Fs-Total-count: 7
    Mds-Fs-Total-freeMB: 7103
    Mds-Fs-Total-sizeMB: 7978
    Mds-Host-hn: dagwood
    Mds-keepto: 20020202234344Z
    Mds-Net-Total-count: 0
    Mds-Os-name: SunOS
    Mds-Os-release: 5.7
    Mds-Os-version: Generic_106542-18
    Mds-validfrom: 20020202234344Z
    Mds-validto: 20020202234344Z
    # search result
    search: 2
    result: 0 Success
    # numResponses: 2
    # numEntries: 1

Some of these fields are a little obscure, but you get the processor type and the O.S. information. Curiously, if you generalize your host name query to
    grid-info-search -x -b 'mds-vo-name=mayhem, o=Grid' -h pompone.cs.ucsb.edu -p 2135 'Mds-Host-hn=*'

the search tells you about the information registered by all of the hosts configured into our Globus installation. Notice that by specifying a different location in "the tree" you can get to the CPU information generated by the NWS for dagwood.
    grid-info-search -x -b "o=meas,service=nws,o=grid" -h nws.cs.ucsb.edu -p 3389 "event=dagwood*availableCpu*"

yields
    version: 2
    #
    # filter: event=dagwood*availableCpu*
    # requesting: ALL
    #
    # dagwood:8060.availableCpu.0, NWS, Grid
    dn: event=dagwood:8060.availableCpu.0,service=NWS,o=Grid
    objectClass: top
    objectClass: service
    objectClass: GridEvent
    event: dagwood:8060.availableCpu.0
    timestamp: 1012695743
    value: 0.990100
    # search result
    search: 2
    result: 0 Success
    # numResponses: 2
    # numEntries: 1

It is possible to link all of this together using LDAP referrals, but for the moment, we have them separated. The syntax is uniform, however, and the same search tools work for both.
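Because the output syntax is uniform, one small parser handles both searches. The Python sketch below is a rough reader for this "name: value" format, assuming one item per line; it ignores details of real LDIF such as line continuations and base64 values, and the function name is made up.

```python
def parse_entries(text):
    """Collect `name: value` lines into a list of multi-valued dicts.

    A comment ('#') or blank line ends the current group of lines, and
    only groups that carry a 'dn' are kept, which drops the 'version'
    header and the trailing search summary.
    """
    entries, current = [], {}
    for raw in text.splitlines() + [""]:    # trailing "" flushes the last group
        line = raw.strip()
        if not line or line.startswith("#"):
            if "dn" in current:
                entries.append(current)
            current = {}
            continue
        name, sep, value = line.partition(": ")
        if sep:
            current.setdefault(name, []).append(value)
    return entries

sample = """\
version: 2
#
# filter: event=dagwood*availableCpu*
# requesting: ALL
#
# dagwood:8060.availableCpu.0, NWS, Grid
dn: event=dagwood:8060.availableCpu.0,service=NWS,o=Grid
objectClass: top
objectClass: service
objectClass: GridEvent
event: dagwood:8060.availableCpu.0
timestamp: 1012695743
value: 0.990100
# search result
search: 2
result: 0 Success
# numResponses: 2
# numEntries: 1
"""

entries = parse_entries(sample)
cpu = float(entries[0]["value"][0])
```

Values are kept as lists because an attribute such as objectClass or Mds-Fs-freeMB can appear several times in one entry.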
It is a little bit obscure since there really isn't a box labeled "GRAM" and there are boxes labeled with MDS components (GRIS, GIIS, etc.). The Globus project has been struggling with notation and terminology of late. I believe this diagram reflects that struggle. The two important features are the boxes labeled RSL library and the local resource manager. From here on out, we will refer to the pair of these as The GRAM but you should realize that in some parts of the Globus documentation, this definition will not be precise.
To initiate a process on a resource using Globus, you must specify the "resource attributes" you want using the Globus Resource Specification Language (RSL). It is a keyword-attribute pair type language that uses the same conjunctive and disjunctive syntax that LDAP does. The legal set of attributes recognized by the GRAM is documented here. Strangely, the resource you wish to specify is not one of the recognized attributes in the Resource Specification Language. So it goes. Again -- nomenclature seems to be a problem. In fact, the RSL is really a resource attribute language that tells a specific resource manager (not specified in the language) what attributes of the resource you wish to exercise.
Rather than describing the language in detail, we'll illustrate its usage with a few examples. To do so, we will use the globusrun command. This utility contacts the gatekeeper on the specified host and authenticates you. Then it talks to the GRAM running on that host and passes the RSL you specify.
The canonical example is to run /bin/date remotely. To do so on the host joplin.cs.ucsb.edu, you would execute
    globusrun -o -r joplin:2119:/O=Grid/O=ucsb/CN=joplin.cs.ucsb.edu '&(executable=/bin/date)'

The "-o" argument says that you want Globus to buffer the output on your local machine using GASS. The "-r" argument specifies the contact string for the GRAM you wish to have parse your RSL string. Under "normal" Globus circumstances you could just specify the host name here for short. In our environment, though, because of the DNS issues, you must completely specify the contact string. Notice that this name looks suspiciously like an LDAP name. That is because it is. Globus uses these names for everything, which is a really good idea. The part at the end is the RSL specifier indicating the path name to the binary you wish to execute.
In the above example, the globusrun command blocks until the answer comes back. If you had a long running job that you wished to launch and whose output you were archiving to a file, you could write
    globusrun -b -r jasmine.cs.ucsb.edu:2119:/O=Grid/O=ucsb/CN=jasmine.cs.ucsb.edu '&(executable=/cs/faculty/rich/junk/ppid)(stdout=filename)'

This time on jasmine.cs.ucsb.edu we launch a binary in batch mode because of the "-b" option. The globusrun command prints out a "handle" that you can use to interrogate the job while it is running. In this case, the handle I got was
    https://jasmine:65486/24258/1012269176/

More on this handle in a minute. The RSL string in this example says what binary to execute and that the stdout should be directed to the file "filename" in the home directory of the user associated with the certificate used to launch the job.
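If you end up generating RSL from a script, the two strings used so far are simple enough to build mechanically. The Python helper below is a hypothetical convenience, not part of Globus; it handles only the plain conjunctive form shown in these examples and does no quoting of special characters.

```python
def rsl(**attributes):
    """Join attribute=value relations into one conjunctive ('&') RSL string."""
    relations = "".join(f"({name}={value})" for name, value in attributes.items())
    return "&" + relations

date_job = rsl(executable="/bin/date")
batch_job = rsl(executable="/cs/faculty/rich/junk/ppid", stdout="filename")
```

Keyword arguments keep their order, so the relations appear in the string in the order you wrote them.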
To check the status of a batch job, you can issue another globusrun command using the handle as an identifier.
    globusrun -status https://jasmine:65486/24258/1012269176/

And if you wish to terminate the job, again, you can use the globusrun command:
    globusrun -kill https://jasmine:65486/24258/1012269176/

The advantage of the RSL and the GRAM is that the syntax and recognized set of keywords is universal across resources.
RSL weakness -- One weakness of the RSL, as alluded to before, is that it doesn't really allow you to reference the resource. For example, it is not possible to specify a single RSL string that launches one set of binaries on a SPARC architecture and another set on the i86. The MDS has the information, but there isn't a way to issue a globusrun and to specify the conditional in RSL. You have three options here: