Adding new access protocols

A collector is the JICOS component that accepts data from a non-Java client, converts that data into JICOS Tasks, submits them to a JICOS system, waits for the result, converts the answer back into something the client can understand, and returns it to the client. A Collector has three parts, one for each function it provides: communicating with the (non-Java) client, creating a JICOS Task, and communicating with the Host Service Provider (HSP).
All new collectors must extend the edu.ucsb.cs.jicos.services.external.services.Collector class, which itself extends java.lang.Thread. Every collector is assumed to listen on a network port to communicate with the client, and the machine is assumed either to have a host name or to accept "localhost" as one. The Collector class also manages the HSP to which all Tasks are submitted. Future releases will allow for multiple HSPs, providing load balancing and fault tolerance. Besides providing the run() method, a collector must implement the following three constructors:
In the run() method, the new collector implements the handling of its protocol: HTTP, SMTP, IRC, etc. The collector will usually create a java.net.ServerSocket, bind it to the appropriate port, and then enter an infinite loop that accepts connections and processes the incoming data. For example, the CollectorHttp (without the error handling) looks something like:
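A minimal sketch of such an accept loop follows. It is not the CollectorHttp source: the class name SketchCollector, its getPort() and shutdown() helpers, and the one-line reply are hypothetical, and the class extends Thread directly where a real collector would extend Collector.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a collector. A real one would extend
// edu.ucsb.cs.jicos.services.external.services.Collector, not Thread.
class SketchCollector extends Thread {
    private final ServerSocket serverSocket;

    SketchCollector(int port) throws IOException {
        // Port 0 asks the operating system for any free port.
        this.serverSocket = new ServerSocket(port);
    }

    int getPort() {
        return serverSocket.getLocalPort();
    }

    @Override
    public void run() {
        // Accept connections until the server socket is closed.
        while (!serverSocket.isClosed()) {
            try (Socket client = serverSocket.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(
                         client.getInputStream(), StandardCharsets.US_ASCII));
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                // Protocol handling would go here; this sketch only
                // acknowledges the first line it receives.
                String firstLine = in.readLine();
                out.println("received: " + firstLine);
            } catch (IOException e) {
                // accept() fails once shutdown() closes the socket.
                if (serverSocket.isClosed()) {
                    break;
                }
            }
        }
    }

    void shutdown() throws IOException {
        serverSocket.close();
    }
}
```

Handling each connection inside the loop keeps the sketch short; a collector that must not block while a computation runs would hand the work to another thread.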
Currently, all of the collectors run inside the same JVM as the Host Services Provider. When the HSP starts, it calls the CollectorManager to start up all the required collectors. By default, only the CollectorHttp is started, but this can be changed by modifying the DEFAULT_StartList (line 50) or by specifying different collectors in the jicos.services.CollectorManager.startList property. The CollectorHttp, CollectorSoap, and CollectorDebug can be specified by the shorthand "http", "soap", and "debug" respectively. Any other collector must be specified using its fully qualified name (e.g. "my.package.name.CollectorIRC"). Once each collector is wrapped in a new Thread and started (see CollectorManager, lines 293-305), the Manager exits.
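To illustrate how shorthand names and fully qualified names can coexist in one start list, here is a small sketch. It is not the CollectorManager code: the class StartListSketch, the comma-separated format, and the package assumed for the three built-in collectors are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class StartListSketch {
    // Assumption: the built-in collectors live in the same package as Collector.
    private static final String ASSUMED_PACKAGE =
            "edu.ucsb.cs.jicos.services.external.services.";

    // Shorthand names taken from the text, mapped to assumed class names.
    private static final Map<String, String> SHORTHAND = Map.of(
            "http", ASSUMED_PACKAGE + "CollectorHttp",
            "soap", ASSUMED_PACKAGE + "CollectorSoap",
            "debug", ASSUMED_PACKAGE + "CollectorDebug");

    // Assumption: entries in the start list are separated by commas.
    static List<String> resolve(String startList) {
        List<String> classNames = new ArrayList<>();
        for (String entry : startList.split(",")) {
            String name = entry.trim();
            if (name.isEmpty()) {
                continue;
            }
            // Anything that is not a known shorthand is treated as a
            // fully qualified class name and passed through unchanged.
            classNames.add(SHORTHAND.getOrDefault(name.toLowerCase(Locale.ROOT), name));
        }
        return classNames;
    }
}
```

Under these assumptions, resolve("http, my.package.name.CollectorIRC") yields the assumed CollectorHttp class name followed by "my.package.name.CollectorIRC".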
The communication protocol between the client and the collector is defined by the port. For example, port 80 is expected to be HTTP, port 25 is expected to be SMTP, 21 for FTP, 143 for IMAP, etc. Many data transfer protocols are implemented in a "request/response" manner: the client sends all the necessary data in a request and waits for the answer in a response. The CollectorHttp only understands three types of HTTP messages:
The URL specified in the <FORM ACTION="http://host:port/ExternalRequest/fully.qualified.taskname"> element points to the CollectorHttp, a collector that accepts the HTTP input and takes some specific action on the data. A common technique is to write a script or program and have the web server pass the data to it via the Common Gateway Interface (CGI). However, it is also possible to write a program that interprets the HTTP data directly, and this is what the CollectorHttp does. The HTTP header contains a variety of information; most important here are the URL, the action, and the data. The CollectorHttp only understands two URLs: "/ExternalRequest" to submit a task, and "/ExternalResponse" to get the answer from a previously submitted request.
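The distinction between the two URLs can be sketched as a check on the HTTP request line, which has the standard form "METHOD path HTTP/version". The class RequestLineSketch and its Kind enum are hypothetical names for this example, and the real CollectorHttp may parse the header differently.

```java
class RequestLineSketch {
    enum Kind { EXTERNAL_REQUEST, EXTERNAL_RESPONSE, UNKNOWN }

    // Classifies a request line such as "POST /ExternalRequest/x.y.Task HTTP/1.1".
    static Kind classify(String requestLine) {
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length < 2) {
            return Kind.UNKNOWN;
        }
        // Drop any query string before comparing the path.
        String path = parts[1];
        int query = path.indexOf('?');
        if (query >= 0) {
            path = path.substring(0, query);
        }
        if (path.equals("/ExternalRequest") || path.startsWith("/ExternalRequest/")) {
            return Kind.EXTERNAL_REQUEST;
        }
        if (path.equals("/ExternalResponse") || path.startsWith("/ExternalResponse/")) {
            return Kind.EXTERNAL_RESPONSE;
        }
        return Kind.UNKNOWN;
    }
}
```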
In order for the CollectorHttp to create a JICOS Task, the data in the "/ExternalRequest" must include the fully qualified name of the Task. Since many of the tasks are part of the JICOS system, any task beginning with the package The collector creates an instance of the task using the no-argument constructor, and then populates the class by passing the XML data to the
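Creating an instance from a fully qualified class name through its no-argument constructor can be done with standard Java reflection. The sketch below shows only that step. TaskFactorySketch is a hypothetical name, and populating the new object from XML is left out because it depends on the JICOS interfaces.

```java
class TaskFactorySketch {
    // Loads the named class and calls its no-argument constructor.
    // Throws a ReflectiveOperationException subclass if the class is
    // missing, has no such constructor, or cannot be instantiated.
    static Object instantiate(String fullyQualifiedName)
            throws ReflectiveOperationException {
        Class<?> taskClass = Class.forName(fullyQualifiedName);
        return taskClass.getDeclaredConstructor().newInstance();
    }
}
```

For example, instantiate("java.util.ArrayList") returns a new, empty ArrayList, while a misspelled class name raises ClassNotFoundException, which a collector could report back to the client as an invalid task name.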
Once the CollectorHttp has a valid Task, Shared, and input Object, the Task needs to be given to JICOS. To prevent the collector from blocking while waiting for an answer, a new thread is started up, called the
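The idea of handing the work to a separate thread so that the collector itself does not block can be sketched as follows. This is a generic illustration rather than the JICOS request-processing thread: BackgroundSubmitSketch and the compute and deliver parameters are hypothetical stand-ins for submitting a Task to the HSP and returning the answer to the client.

```java
import java.util.function.Consumer;
import java.util.function.Function;

class BackgroundSubmitSketch {
    // Runs compute on a new thread and passes its result to deliver,
    // so the caller can go straight back to accepting connections.
    static <T, R> Thread submit(T task, Function<T, R> compute, Consumer<R> deliver) {
        Thread worker = new Thread(
                () -> deliver.accept(compute.apply(task)),
                "request-processor");
        worker.start();
        return worker;
    }
}
```

The returned Thread lets the caller join() the worker when it does need to wait, for instance before shutting down.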
Eventually, JICOS will return the answer to the computation. This answer needs to be converted back to a format that the external client can understand, referred to as "external data". Since the answer is Task-specific, there can be no generic conversion routine and the Task must supply a The ideal way to convert the resulting XML to any other format would be to use Extensible Stylesheet Language Transformations (XSLT). The CollectorHttp would look for a stylesheet for HTML, CollectorSoap would look for a stylesheet for SOAP, and so on. However, forcing a programmer to learn XSLT seems an unnecessary burden. Therefore, to make using the CollectorHttp simpler, another function is defined in the XmlConverter interface:
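For comparison, the XSLT route described above can be sketched with the javax.xml.transform API that ships with the JDK. This is not part of JICOS: the class XsltSketch, the stylesheet, and the <result><value> layout of the answer XML are assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
    // Hypothetical stylesheet: renders <result><value>...</value></result>
    // as a single HTML paragraph.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='html'/>"
            + "<xsl:template match='/result'>"
            + "<p>Answer: <xsl:value-of select='value'/></p>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Applies the stylesheet to the answer XML and returns the HTML text.
    static String toHtml(String resultXml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter html = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(resultXml)),
                new StreamResult(html));
        return html.toString();
    }
}
```

A collector following this route would select a different stylesheet per output format, which is exactly the XSLT knowledge the simpler function is meant to spare the programmer.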
Here is the complete CollectorHttp.java. Note that there are two inner classes: HttpRequestThread (starting at line 976), and HttpResponseMethod (starting at line 1110). HttpRequestThread extends the ExternalRequestProcessor thread, which communicates with the HSP. The HTTP content is processed in processRequest() and getHttpInput() (starting at lines 236 and 439, respectively). Once the HTTP data is converted to a task, it is submitted to the HSP via the HttpRequestThread. Depending on the response type selected, the HttpRequestThread will process the result accordingly. There are some utility methods, notably jicosHtmlHeader() and jicosHtmlFooter(), that create the standard HTML header and footer (with the response selection pull-down).