Automatic Multi Language Program Library Generation for REST APIs

By: Thomas Steiner (tomac[AT]google.com)
Last Updated: June 2007. This is work in progress.
Disclaimer: The opinions expressed in this document are my own, and not necessarily those of my employer.

What Web Services and Cell Phones Have in Common

Modern web services and cell phones have a lot in common: they have become a part of our daily lives, you can do awesome things with them, and of course they are supposed to make our communication easier. And they have both kind of failed. How many times have you seen people on the train talking on the phone, telling someone that they are actually sitting on the train, and that the train will be 5 minutes late? Oh my God, five minutes late! Many of today's web services are pretty much like that. Just as the people who pick you up at the station do not assume that your car got stolen, then there was an earthquake, then a terrible flood, locusts,...[1], web services should likewise just make sense of the situation themselves, without needing to be told about every minute the train is off schedule. However, a lot of web services are exactly this kind of "five minutes late" guy. They are verbose, talkative, and contain a lot of boilerplate. Most developers out there just do not care whether the ID of something is an Int32 or an Int64 value[2]. This kind of strong typing in a web service intended for a broad audience makes it fragile and inflexible, where instead it should be forgiving and easy-going. The common SOAP/WSDL approach to web services enforces exactly what should be avoided. "But SOAP/WSDL has the advantage of automatic code generation", you might object. Again, this is like calendar synchronization between your phone and your favorite PIM software. In principle it works, or at least it should; there are some flaws you know how to work around, and others you have just gotten used to living with. There are many SOAP toolkits[3] in the wild, some with full SOAP 1.x support, others with partial support, and each of them - like every non-trivial software product - with its own bugs. A WSDL file that is correct in theory might work perfectly with one toolkit and need a lot of manual fine-tuning with another. Like calendar syncing...
I think you got the point.

A Short Definition of SOAP/WSDL and REST

SOAP
SOAP is a protocol for exchanging XML-based messages over computer networks. Usually this happens via HTTP(S), although SMTP is a valid application layer protocol as well. Originally SOAP was an abbreviation for Simple Object Access Protocol; however, with the introduction of version 1.2 of the protocol the abbreviation was dropped as it was considered misleading. No one should be talking about SOAP without having read Pete Lacey's "The S stands for Simple" [4] blog entry. This is probably one of the most entertaining and fun ways to learn a lot about SOAP. The protocol was originally designed by Don Box, Mohsen Al-Ghosein, Dave Winer, and Bob Atkinson back in 1998. At the time Atkinson and Al-Ghosein worked at Microsoft; Redmond, and later IBM, supported the team. Today the SOAP specification is maintained by the independent XML Protocol Working Group of the W3C (World Wide Web Consortium). SOAP works well with firewalls, which is a major advantage of SOAP over other technologies like DCOM. The protocol development team chose XML because of its acceptance among companies and because of the wide spectrum of tools available for it. The main weakness of SOAP is its lengthy XML format, which makes it quite slow in comparison to binary formats, both in terms of bandwidth usage and in terms of message parsing. However, XML allows for direct wire-level inspection of the data, as XML is human-readable. Especially for smaller messages, the protocol boilerplate adds considerable overhead relative to the actual data. SOAP messages consist of an Envelope container which holds a mandatory Body element and an optional Header element. SOAP has its own namespace, usually bound to the prefix soapenv.
Sample SOAP request for an imaginary credit card number validating service (adapted from the German Wikipedia entry on SOAP):
<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>
<soapenv:Body>
<ns1:validate
soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:ns1="urn:CardValidator"
>
<number xsi:type="xsd:string">1234 5678 9876 5432</number>
<valid xsi:type="xsd:string">12/08</valid>
</ns1:validate>
</soapenv:Body>
</soapenv:Envelope>
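
As a minimal sketch of the Envelope/Body/Header structure described above, the following Python snippet parses the sample request with the standard library and pulls the operation and its arguments out of the Body. The function and variable names are illustrative only and do not belong to any SOAP toolkit.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope, as used in the sample above.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope
    xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soapenv:Body>
    <ns1:validate
        soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
        xmlns:ns1="urn:CardValidator">
      <number xsi:type="xsd:string">1234 5678 9876 5432</number>
      <valid xsi:type="xsd:string">12/08</valid>
    </ns1:validate>
  </soapenv:Body>
</soapenv:Envelope>"""


def read_soap_request(xml_text):
    """Return (operation tag, arguments, header present?) for a SOAP message."""
    envelope = ET.fromstring(xml_text.encode("utf-8"))
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    if body is None:
        raise ValueError("a SOAP Envelope must contain a Body element")
    header = envelope.find(f"{{{SOAP_ENV}}}Header")
    operation = body[0]  # the first child of Body names the operation
    arguments = {child.tag: child.text for child in operation}
    return operation.tag, arguments, header is not None


operation, arguments, has_header = read_soap_request(SAMPLE)
print(operation, arguments, has_header)
```

Even in this small case, only two of the lines in the message carry actual data; the rest is envelope and namespace boilerplate.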
WSDL

You cannot talk about SOAP without talking about WSDL, the Web Services Description Language. It serves to specify the data types, parameter lists, operation names, transport bindings, and the endpoint URI involved in a web service. Usually WSDL files are auto-generated from existing code. This order is the main source of criticism, as it normally does not allow for contract-first development, but pretty much forces the developer to do the exact opposite. WSDL is an XML-based service description language. Its version 2.0 is expected to become a W3C recommendation; so far, however, only drafts have been released. If a client wants to use a web service, it can access the service's WSDL in order to discover the available methods and the data types used. WSDL uses the XML schema types and allows user-defined types to be embedded directly into the WSDL file. Each method is accessible by means of a port. Ports define individual endpoints by specifying a single address for a binding. Bindings define the concrete protocol (usually HTTP) to be used for the messages involved in a SOAP communication. Messages and ports are defined separately. This approach allows existing abstract definitions to be reused with concrete instances. The WSDL specification is considered very hard to read and understand. In addition, the common practice of using the language and the way the specification defines it do not fully agree. This is partly due to the different encoding styles: RPC/Encoded, Document/Literal, and the Wrapped-Document/Literal style introduced by Microsoft. These encoding styles are all about the positioning of the operation name in the SOAP message. All in all this makes SOAP/WSDL less interoperable and less intuitive to use than it should be, for end-users and web service providers as well as for SOAP toolkit developers.


REST

The term REST stands for Representational State Transfer. It was coined in 2000 by Roy Thomas Fielding in his doctoral dissertation about the principal design of the modern web architecture. The most famous quote[5] is "Representational State Transfer is intended to evoke an image of how a well-designed web application behaves: a network of web pages (a virtual state-machine), where the user progresses through an application by selecting links (state transitions), resulting in the next page (representing the next state of the application) being transferred to the user and rendered for their use". In addition to this quite abstract definition, the term is also used in a loose sense to describe any simple interface that transmits domain-specific data over HTTP without an additional messaging layer such as SOAP or session tracking via HTTP cookies. The main idea of REST is to apply verbs like GET, POST, PUT, and DELETE to nouns (i.e. subjects) which represent resources. These verbs happen to be HTTP methods, but REST is not limited to those operations. The original concept assumes communication to be stateless: each HTTP message contains all the information necessary to understand it. However, in practice state is often added by means of cookies or session variables. Because every resource has its very own URI, gathering resource usage statistics often reduces to analyzing server log files. In addition, efficient caching of requests is possible. Representations in REST systems are usually built on top of XML or (X)HTML. These languages can contain resource descriptions and hyperlinks, so it is possible to navigate from one resource to another without an additional infrastructure layer. Each resource represents a state. Upon resource change, the client transfers to a new resource representation, and is thus in a new state. Hence the name Representational State Transfer.
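
The idea of applying verbs to nouns can be sketched in a few lines of Python. The resource URI below is a made-up placeholder (example.com is not a real service), and the requests are only constructed, not sent; the point is merely that the same noun is combined with different verbs.

```python
from urllib.request import Request

# Hypothetical resource URI, used for illustration only.
RESOURCE = "http://example.com/articles/42"


def build_request(verb, uri, body=None):
    """Build (but do not send) an HTTP request that applies a verb to a resource."""
    return Request(uri, data=body, method=verb)


requests = [
    build_request("GET", RESOURCE),                             # read the resource
    build_request("PUT", RESOURCE, b"<article>...</article>"),  # replace it
    build_request("DELETE", RESOURCE),                          # remove it
]

for request in requests:
    print(request.get_method(), request.full_url)
```

Each request is self-contained, which matches the stateless communication the original concept assumes.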

Introducing the REST Challenge

Besides all these negative points, there are very strong positive points as well. OK, SOAP/WSDL might not be the best choice for every application, but for many it is at least not a bad one. And even if code generation does not always work perfectly, it usually saves a lot of work. There is just this feeling that there should be something simpler, more straightforward, and more intuitive. And then REST enters the stage. It is not that REST by definition is easier than SOAP/WSDL. In fact, for machines it is not easier at all. To understand this, let's have a look at Yahoo's REST API for its News Search[6]. This API allows the user to search the internet for news stories. You place requests by means of URL parameters to the API's endpoint:

http://search.yahooapis.com/NewsSearchService/V1/newsSearch?
  appid=YahooDemo&
  query=madonna&
  results=2&
  language=en

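In this example, just by looking at the URL, we can easily see that there are four parameters: appid, query, results, and language.
Further, the service endpoint is NewsSearchService/V1/newsSearch, consisting of the service name (NewsSearchService), the version (V1), and the operation (newsSearch).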
We as human beings have the ability to actually understand the API just by having a closer look at it. I very much believe in the principle of "hackable" URLs[7]. A hackable URL is a URL whose very structure tells you how to retrieve the desired content. Back to the news search example. If we, say, want 20 results instead of just 2, and we would prefer results in any language rather than in English only, we can simply try "hacking" the URL. Assuming that setting the results variable to 20 takes care of the first wish, and that omitting the language variable altogether takes care of the second, we end up with a new request URL:

http://search.yahooapis.com/NewsSearchService/V1/newsSearch?
  appid=YahooDemo&
  query=madonna&
  results=20

Why can a machine not do this? Well, because machines are pretty poor hackers. There is a need for a description language for this kind of RESTful API. If we think of a SOAP/WSDL approach for the news search web service, we would end up with a long WSDL file including the description of the types, the requests, the responses, the bindings, the port types, and the service endpoint. But do we really need all this? What are we actually interested in?
At present, such services are described using a mixture of XML schema and textual description[8]. We want a simpler description language than WSDL, but we want to keep all of its benefits, especially the ability to automatically generate code.
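
To make the URL "hacking" from the news search example concrete, here is a minimal Python sketch. The helper name hack_url is invented for this illustration. Note that the program only carries out edits a human has already decided on; knowing that a results parameter exists and that language may be omitted is exactly the information a description language would have to supply.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

ORIGINAL = ("http://search.yahooapis.com/NewsSearchService/V1/newsSearch?"
            "appid=YahooDemo&query=madonna&results=2&language=en")


def hack_url(url, set_params=None, drop_params=()):
    """Return a copy of url with some query parameters changed or removed."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    params.update(set_params or {})
    for name in drop_params:
        params.pop(name, None)
    return urlunsplit(parts._replace(query=urlencode(params)))


# Ask for 20 results and drop the language restriction.
hacked = hack_url(ORIGINAL, set_params={"results": "20"}, drop_params=["language"])
print(hacked)
```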

Proposals on Description Languages for REST APIs

The question of describing (REST) web services in a machine-readable way other than WSDL has been raised before[9]. However, often the motivation was more to get rid of WSDL than to actually solve the REST description issues. Many suggestions are more or less ad hoc inventions designed to solve particular problems. It is to be noted that with WSDL 2.0 it is possible to describe REST services[10], but here we want to focus on some examples of non-WSDL approaches. As Sun Microsystems' Norman Walsh writes[11]: "We know the hard things are possible, we just have to make the easy things easy." In the following listing of such languages, we have included samples which are either provided by the authors themselves, or which we have generated to the best of our knowledge according to the provided specs. In order to simplify the sample structure, those parts of the sample describing a request have been highlighted in green, and those parts describing a response have been highlighted in red. In some cases the samples have been shortened or reformatted for better readability. However, these changes do not reduce the samples' expressiveness. Due to the ravages of time, some of the described services do not exist anymore, or have been changed.


WRDL
Description:
Paul Prescod, co-author of the debatable "XML Handbook", offers an approach that treats the available REST API functions as a collection of resources. Each resource is represented by its very own URI and represents a state in the process of handling the requests. Each method has one or more valid representations. When XML is sent or received, the WRDL engine should check - just like an XML schema validator - whether the data conforms to the valid representations. While XML schema does not describe the interaction between different resources, the author claims that WRDL is capable of doing so by describing the service's runtime behavior. To this end, the WRDL engine enriches each hyperlink with a resource type. Hyperlinks may come back in response to requests, based on the specification found in the schema. Thus each action has a set of defined follow-up actions. POST and other, freely definable HTTP methods can have an optional @creates attribute, which allows the result to be checked against a schema for validation. The author provides a rudimentary implementation written in Python (partly in pseudo-code), but states that the code is in a very rough, premature state (which is true).

Author-provided sample: Babelfish translation service
<?xml version="1.0"?>
<!DOCTYPE types SYSTEM "wrdl.dtd">
<types>
<resourceType representations="html" name="babelfish">
<POST>
<input representations="babelFormData">
<query name="doit" apiName="translate" default="done"/>
<query name="tt" default="urltext" apiName="actiontype"/>
<query name="urltext" use="required" apiName="urltext"/>
<query name="lp" use="required" apiName="languages"/>
</input>
<output representations="html"/>
</POST>
</resourceType>
<representationType mediaType="text/html" name="html">
</representationType>
<urlEncodedRepresentation mediaType="application/x-www-form-urlencoded"
name="babelFormData"
>
</urlEncodedRepresentation>
</types>

Spec last updated: November, 2002



NSDL
Description:
During development of NSDL, Sun Microsystems employee Norman Walsh was driven by two main goals: first, to make web services just as transparently usable as normal local code libraries, and second, to allow service providers to describe their services in a totally interoperable way. Walsh limits his approach to describing services accessible by means of HTTP POST or GET. His proposal is pretty straightforward: each method is abstracted in a service element containing attributes that describe the HTTP method and the service URI. Each service contains one request and one response element. Requests contain parameter elements with @type, @optional, and @default attributes. For POST requests, there is an additional body element which, in contrast to GET parameters that are simply URL-encoded, allows complete XML requests to be sent. Response elements contain result elements with a @select attribute which, assuming an XML response, uses XPath to select the desired subset of the whole response XML. In addition, NSDL supports fault elements that, also via XPath, make it possible to react to erroneous requests. The author provides a complete implementation of the language in the form of three Perl modules, which indeed allow web services with an appropriate NSDL description to be used in a high-level way from within any Perl script.

Author-provided sample: Amazon item search service

<descriptions xmlns="http://nwalsh.com/xmlns/nsdl#">

  <service xmlns:a="http:// [...] amazon.com/AWSECommerceService/ [...]"
           name="booksbykeyword"
           action="get"
           uri="http:// [...] amazon.com/ [...] SearchIndex=Books&amp;"
  >
 
    <request>
      <parameter name="SubscriptionId" type="xsd:string"/>
      <parameter name="Keywords" type="xsd:string"/>
    </request>
    <response>
      <result name="count"
              select="/a: [...] /a:Items/a:TotalResults"
      />

      <result name="time"
              select="/a: [...] /a:OperationRequest/a:RequestProcessingTime"
      />

      <result name="titles"
              select="/a: [...] /a:Items/a:Item/a:ItemAttributes/a:Title"
      />

    </response>
  </service>
</descriptions>

Spec last updated: September, 2005
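
The role of the @select attribute can be illustrated with a small Python sketch. This is not Walsh's Perl implementation: the response document and its namespace are invented for the example, and because Python's ElementTree supports only a subset of XPath, the paths are written relative to the root element instead of as absolute paths.

```python
import xml.etree.ElementTree as ET

# Invented namespace and response document, for illustration only.
NAMESPACES = {"a": "urn:example:items"}
RESPONSE = """<SearchResponse xmlns="urn:example:items">
  <OperationRequest>
    <RequestProcessingTime>0.12</RequestProcessingTime>
  </OperationRequest>
  <Items>
    <TotalResults>2</TotalResults>
    <Item><ItemAttributes><Title>First</Title></ItemAttributes></Item>
    <Item><ItemAttributes><Title>Second</Title></ItemAttributes></Item>
  </Items>
</SearchResponse>"""

# Result name -> path, mirroring <result name="..." select="..."/>.
RESULTS = {
    "count": "a:Items/a:TotalResults",
    "time": "a:OperationRequest/a:RequestProcessingTime",
    "titles": "a:Items/a:Item/a:ItemAttributes/a:Title",
}


def select_results(xml_text, results, namespaces):
    """Apply each path to the response and collect the text of all matches."""
    root = ET.fromstring(xml_text)
    return {
        name: [node.text for node in root.findall(path, namespaces)]
        for name, path in results.items()
    }


selected = select_results(RESPONSE, RESULTS, NAMESPACES)
print(selected)
```

A client generated from such a description can hand back named values instead of a raw XML document, which is what makes the service feel like a local library.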


SMEX-D
Description:
Tim Bray, director of web technologies at Sun and co-inventor of XML, has proposed SMEX-D, a description language developed with the goal of providing an implementation in the simplest way that could possibly work. SMEX-D covers both SOAP- and REST-style message exchanges. The basic idea is that there is necessarily a request, followed by an optional response. Requests can take on three forms: name-value pairs, SOAP, and XML other than SOAP. Responses can be either SOAP, XML other than SOAP (Bray calls this NSX for non-SOAP XML), or nothing at all. Name-value pairs describe URI parameter requests, which can be typed by means of the types defined in XML schema section 3.2, where the default type is simply string. SOAP messages have a header and a body element, each of which must contain at least one language element. The language element describes the elements that may appear in the SOAP header and body. This is realized by putting the SOAP header and body independently in a certain namespace, and/or by assigning an XML schema to the header and body. This also applies to non-SOAP XML (NSX) messages, with the difference that there are no header and body elements.

Author-provided sample: Amazon item search service

<smex-d xmlns="http://smex-d.net/ns/"
        href="http://webservices.amazon.com/onca/xml"
>
  <request form="pairs">
    <pair name="Service" />
    <pair name="Subscription" />
    <pair name="Operation" >
      <enum>
        <v>ItemSearch</v>
      </enum>
    </pair>
    <pair name="AssociateTag" />
    <pair name="ResponseGroup">
      <enum>
        <v>Groups</v>
        <v>Accessories</v>
        [...]
        <v>VariationSummary</v>
      </enum>
    </pair>
    <pair name="Style" type="anyURI" />
    <pair name="ContentType" />
    [...]
    <pair name="MaximumPrice" type="decimal" />
    <pair name="MerchantId" />
    [...]
  </request>
  <response form="nsx">
    <language namespace="http:// [...] amazon.com/ [...]">
      <schema flavor="xsd"
              href="http:// [...] amazon.com/ [...]"
      />
    </language>
  </response>
</smex-d>

Spec last updated: May, 2005



Resedel
Description:
John Cowan has a blog called Recycled Knowledge, and this was exactly the motivation that drove him when he created Resedel. It is a mixture of NSDL and SMEX-D. Cowan states that he stole most of Norman Walsh's RPC-style encoding and the rest from Tim Bray. As of today there is only a Relax NG schema available, without any documentation besides the comments. The approach maps the four basic functions of persistent storage, CRUD (Create, Read, Update, Delete), directly to the HTTP methods POST, GET, PUT, and DELETE. This adds another abstraction layer; however, given the different assumptions of the CRUD model and REST concerning updates, this mapping is not uncontroversial.

Sample (not provided by the author): Yahoo news search service

<?xml version="1.0" encoding="UTF-8"?>
<resedel version="0.2" xmlns="http://www.ccil.org/~cowan/resedel/ns">
  <type id="Yahoo Search"
        flavor="xsd"
        href="http://search.yahooapis.com/ [...] /V1/NewsSearchResponse.xsd"
  />
  <service id="News Search"
           uri="http://search.yahooapis.com/NewsSearchService/V1/newsSearch"
           operation="read"
  >
    <request soap="false">
      <parameter name="appid" />
      <parameter name="query" />
      <parameter name="type" default="all" />
      <parameter name="results" typeref="xsd:integer" default="10" />
      <parameter name="start" typeref="xsd:integer" default="1" />
      <parameter name="sort" default="rank" />
      <parameter name="language" />
      <parameter name="site" default="" />
      <parameter name="output" default="xml" />
      <parameter name="callback" />
    </request>
    <response>
      <language uri="urn:yahoo:yn">
        <schema flavor="xsd" root="ResultSet"
                href="NewsSearchResponse.xsd"
        />

      </language>
      <fault name="Bad Req" status="400" />
      <fault name="Forbidden" status="403" />
      <fault name="Service Unavailable" status="503" />
    </response>
  </service>
</resedel>


Spec last updated: May, 2005
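
The CRUD mapping can be written down as a simple lookup table. The sketch below is in Python and takes the delete operation to map to HTTP DELETE. The operation names "create", "update", and "delete" are assumptions chosen by analogy with the operation="read" attribute in the sample above; they are not taken from the Resedel schema.

```python
# CRUD operation name -> HTTP method. Only "read" appears in the sample above;
# the other three names are assumed for illustration.
CRUD_TO_HTTP = {
    "create": "POST",
    "read": "GET",
    "update": "PUT",
    "delete": "DELETE",
}


def http_method_for(operation):
    """Return the HTTP method for a CRUD operation name (case-insensitive)."""
    try:
        return CRUD_TO_HTTP[operation.lower()]
    except KeyError:
        raise ValueError(f"unknown CRUD operation: {operation!r}") from None


print(http_method_for("read"))
```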


RSWS
Description: Richard Salz, named contributor to the HTTP/1.0 and HTTP/1.1 specs, proposes a three-part approach consisting of

  1. a schema definition which defines what messages look like

  2. an interface definition which describes what methods are provided

  3. a location definition which tells the processor where to find the service

These three elements are put in a container called description, which has a @name attribute that is actually a URI. According to the author, RSWS should support all available schema description languages. The interface describes the available methods in operation containers which hold input and output elements. Basically these two elements simply contain a reference to the appropriate section of the schema in order to define the valid parameters. For this initial version Salz abstains from defining error elements. According to Salz, this is because of the need to keep RSWS independent of the different SOAP versions.


Author-provided sample: Generic imaginary description

<rsws:description name="http://example.com/rsws/">
  <rsws:schema id="mytypes">
<![CDATA[
default namespace = "http://example.com"
element foo { attribute bar { string } }
]]>

</rsws:schema>
<rsws:interface id="sample">
<rsws:operation>
<rsws:input ref="tns:fooIn"/>
<rsws:output ref="tns:fooOut"/>
</rsws:operation>
<rsws:fault soap11Faultcode="env:server"
soap12Code="env:server" soap12Subcode="tns:ErrorValues"
>
...
</rsws:fault>

</rsws:interface>
<rsws:location>

<rsws:provides href="#mgmt"/>
  </rsws:location>
</rsws:description>

Spec last updated: October, 2003



WDL
Description: Dave Orchard is a member of several W3C committees. In his approach the author suggests basically just listing all the resources of a web service, where each resource is assigned a @location. Each resource can have operation elements. In addition, in order to simplify the syntax for the most common operation, GET, WDL supports a getoperation element. These operation (and of course getoperation) elements in turn serve as containers for input, output, and fault elements (however, faults are not defined in the provided schema). He deliberately does not list all the parameters a service can potentially have, claiming this would be a step backwards to the weakness of the WSDL 1.1 definition. Orchard wants the approach to also include HTTP headers, configuration, and status codes in order to stick to the existing means of HTTP communication.


Author-provided sample: Yahoo search service

<?xml version="1.0" encoding="UTF-8"?>
<!-- www.ps.com stands for www.pacificspirit.com -->

<resources xmlns="http://www.ps.com/ns/2005/05/WDL/"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.ps.com/ [...] ./WDL.xsd"
           xmlns:ysearchtypes="http://www.ps.com/ [...] srch/types/"
           xmlns:yahoosrch="urn:yahoo:srch"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
>
   
  <xs:import namespace="http://www.ps.com/ns/2005/yahoo/srch/types/"
             location="./YahooV1Search.xsd"
  />

  <xs:import namespace="urn:yahoo:srch" location="YahooV1SearchFault.xsd"/>
  <xs:import namespace="urn:yahoo:srch"
             location="http:// [...] yahoo.com/ [...] WebSearchResponse.xsd"
  />

  <resource location="http://api.search.yahoo.com/WebSearchService/V1/">
    <getoperation location="{searchString}">
      <output ref="yahoosrch:Result"/>
      <output code="400 403 503" ref="yahoosrch:Error"/>
    </getoperation>
  </resource>    
</resources>

Spec last updated: May, 2005


WADL
Description: Dr. Marc J. Hadley is a senior staff engineer in the Office of the CTO at Sun Microsystems, and represents Sun on the W3C XML Protocol and W3C WS-Addressing working groups. There he is co-editor of the SOAP 1.2 and WS-Addressing 1.0 specifications. Hadley's approach lists the resources in hierarchical form, where the resources are grouped together by a resources element which has a @base attribute indicating the web service's base address. Consequently each resource element has a @path attribute which defines the resource's path relative to the base address. Each resource has method elements that in turn are defined by request and response elements. WADL allows schemas to be included in the description; the author calls these grammars. Both XML schema and Relax NG are supported. Requests and responses can be defined by means of a representation element, allowing for schema-conformant representation. Via an @element attribute a particular section of the schema can be selected for refinement. Responses can also have fault elements. These can have a @status attribute and, in addition, be represented by a schema. WADL has some interesting extensions not included in other approaches, the most interesting of which are probably the so-called template parameters. This feature allows resources to have a dynamic path variable which is substituted by the actual value only at runtime.


Author-provided sample: Yahoo news search service

<?xml version="1.0"?>
<application xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http:// [...] sun.com/wadl/ [...] wadl.xsd"
             xmlns:tns="urn:yahoo:yn"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:yn="urn:yahoo:yn"
             xmlns:ya="urn:yahoo:api"
             xmlns="http://research.sun.com/wadl/2006/10"
>

  <grammars>
    <include href="NewsSearchResponse.xsd"/>
    <include href="Error.xsd"/>
  </grammars>
  <resources base="http://api.search.yahoo.com/NewsSearchService/V1/">
    <resource path="newsSearch">
      <method name="GET" id="search">
        <request>
          <param name="appid" type="xsd:string" style="query"
                 required="true"
          />

          <param name="query" type="xsd:string" style="query"
                 required="true"
          />

          <param name="type" style="query" default="all">
            <option value="all"/>

            <option value="any"/>
            <option value="phrase"/>
          </param>
          <param name="results" style="query" type="xsd:int" default="10"/>
          <param name="start" style="query" type="xsd:int" default="1"/>
          <param name="sort" style="query" default="rank">
            <option value="rank"/>
            <option value="date"/>
          </param>
          <param name="language" style="query" type="xsd:string"/>
        </request>
        <response>
          <representation mediaType="application/xml"
                          element="yn:ResultSet"
          />

          <fault status="400" mediaType="application/xml"
                 element="ya:Error"
          />

        </response>
      </method>
    </resource>
  </resources>
</application>

Spec last updated: November, 2006
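
To illustrate how a machine could make use of such a description, the following Python sketch reads a shortened version of the WADL sample above and builds a request URL from it. It fills in declared defaults and rejects calls that lack a required parameter. The function name build_url is invented for this example, and the sketch covers only query parameters of a single method; it is not a WADL processor.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

WADL_NS = {"w": "http://research.sun.com/wadl/2006/10"}

# Shortened version of the sample above.
WADL = """<application xmlns="http://research.sun.com/wadl/2006/10"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <resources base="http://api.search.yahoo.com/NewsSearchService/V1/">
    <resource path="newsSearch">
      <method name="GET" id="search">
        <request>
          <param name="appid" type="xsd:string" style="query" required="true"/>
          <param name="query" type="xsd:string" style="query" required="true"/>
          <param name="results" style="query" type="xsd:int" default="10"/>
          <param name="language" style="query" type="xsd:string"/>
        </request>
      </method>
    </resource>
  </resources>
</application>"""


def build_url(wadl_text, method_id, **args):
    """Build the request URL for the method with the given @id."""
    root = ET.fromstring(wadl_text)
    resources = root.find("w:resources", WADL_NS)
    for resource in resources.findall("w:resource", WADL_NS):
        for method in resource.findall("w:method", WADL_NS):
            if method.get("id") != method_id:
                continue
            query = {}
            for param in method.findall("w:request/w:param", WADL_NS):
                name = param.get("name")
                if name in args:
                    query[name] = args[name]
                elif param.get("required") == "true":
                    raise ValueError(f"missing required parameter: {name}")
                elif param.get("default") is not None:
                    query[name] = param.get("default")
            base = resources.get("base") + resource.get("path")
            return base + "?" + urlencode(query)
    raise KeyError(method_id)


url = build_url(WADL, "search", appid="YahooDemo", query="madonna")
print(url)
```

Sending the defaults explicitly is a design choice of this sketch; a client could equally omit them and rely on the server's defaults.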


WADL Pseudo XML/RegularExpression/Schema Document


<application>
  <doc xml:lang="" title="xsd:string" />*
  <grammars>?
    <doc xml:lang="" title="xsd:string" />*
    <include href="xsd:anyURI" />*
  </grammars>
  <resources base="xsd:anyURI">?
    <doc xml:lang="" title="xsd:string" />*
    <resource id="xsd:ID"? path="xsd:string"? type="[xsd:anyURI]"?

              queryType="application/x-www-form-urlencoded"?

    >*
      <doc xml:lang="" title="xsd:string" />*
      <param style="{template|matrix|query|header}"1>*
        <doc xml:lang="" title="xsd:string" />*
        <option value=""1>?
          <doc xml:lang="" title="xsd:string" />*
        </option>
        <link resource_type=""? rel=""? rev=""?>
          <doc xml:lang="" title="xsd:string" />
        </link>
      </param>
      <method type="xsd:string"? name="{GET|POST|PUT|DELETE|HEAD}"1>*
        <doc xml:lang="" title="xsd:string" />*
        <request>
          <doc xml:lang="" title="xsd:string" />*
          <representation href="xsd:anyURI"1>*

             // the referenced representation must have an attribute @id equal to the @href value
          <representation mediaType=""? element=""? profile=""? status=""?>*
            <doc xml:lang="" title="xsd:string" />*
            <param path=""? />*
          </representation>
          <param id=""? name=""1 style="{query|header}"1 type="xsd:string"? default=""?

                 required=""? repeating=""? fixed="xsd:string"?

          >*
            <doc xml:lang="" title="xsd:string" />*
            <option value=""1>?
              <doc xml:lang="" title="xsd:string" />*
            </option>
            <link resource_type=""? rel=""? rev=""?>
              <doc xml:lang="" title="xsd:string" />*
            </link>
          </param>
        </request>
        <response>
          <doc xml:lang="" title="xsd:string" />*
          <representation href="xsd:anyURI"1>*

            // the referenced representation must have an attribute @id equal to the @href value
          <representation mediaType=""? element=""? profile=""? status=""?>*
            <doc xml:lang="" title="xsd:string" />*
            <param path=""? />*
          </representation>
          <fault mediaType=""? element=""? profile=""? status=""?>*
            <doc xml:lang="" title="xsd:string" />*
            <param path=""? />*
          </fault>
          <param style="{header}"1 />*
        </response>
      </method>
      <method href="xsd:anyURI"? />*

        // the referenced method must have an attribute @id equal to the @href value
      <resource id="xsd:ID"? path="xsd:string"? type="[xsd:anyURI]"?

                queryType="application/x-www-form-urlencoded"?

      />*
    </resource>
  </resources>
  <resource_type id="xsd:ID">*
    <doc xml:lang="" title="xsd:string" />*
    <param style="{query|header}"1 />*
    <method type="xsd:string"? name="{GET|POST|PUT|DELETE|HEAD}"1>*
  </resource_type>
</application>

Observations During the Examination of the Proposals

When we have a closer look at the above proposals, a peculiar phenomenon stands out: since 2005 there has been a certain inactivity in the field of description languages. public-web-http-desc@w3.org is the World Wide Web Consortium's official mailing list dedicated to discussion of web description languages based on URI/IRI and HTTP, and aligned with the Web and REST Architecture. With 72 postings, the activity in 2006 was less than half of the 147 postings in 2005. Except for WADL, all proposals date back to the year 2005 or earlier. The term "REST" was brought up by Roy T. Fielding in 2000; however, only since 2005 has the concept of RESTful web services really started to explode. What made developers and web technology architects suddenly turn away from the idea of describing REST web services in a general way? Why did most of the initial drafts never get updated, picked up by others, or, for that matter, implemented?
At present there is no satisfactory answer. Maybe it is just because REST is so simple in many cases. Most of the APIs in the wild that call themselves RESTful are not really RESTful in Fielding's sense of the term (especially regarding application state and functionality that need to be divided into uniquely addressable resources), but rather "RESTful" in the sense of not being SOAP. Some web services use HTTP to tunnel function calls. This technique is called XML over HTTP. Web services that follow this architecture typically have just one endpoint and are used by POSTing raw XML data. A prominent example is Google's Checkout API[19]. If an API is limited to HTTP GET operations, not even a single line of code is necessary in order to use the API. These pure read-only APIs can be explored directly from within any web browser; common examples are eBay's REST API[20], Yahoo's Search REST APIs introduced earlier[6], and Amazon's REST web services[8].

Design Doc

Author: Thomas Steiner <tomac[AT]google[DOT]com>, project proposed by Patrick Chanezon <chanezon[AT]google[DOT]com>

Status: Working beta 0.3 (as of June 2007), see http://tomayac.de/rest-describe/latest/RestDescribe.html for details.

Objective: The project's main goal is to create a compiler which allows for automatic client code generation for REST APIs in various programming languages. The project's second goal is to implement a rich web application which allows for more or less interactive web service description creation.

Background: To understand what this project is about, a good understanding of RESTful design, and especially of WADL[18], is necessary. To our knowledge, no other project has yet been started with the ambitious goal of providing auto-generated code for REST APIs in various programming languages.

Overview: REST APIs are in broad use throughout the web services development world. However with RESTful web services, in contrast to SOAP/WSDL, capabilities for automatic code generation are still very limited, or not available at all. The project consists of two sub-projects:


Implementation: At the time of writing the application had reached version 0.3 Beta. The following description covers only this release.

Development environment: Google Web Toolkit 1.3.3[21] and Eclipse 3.2.0[22] (Mac OS X releases)

Eclipse: Eclipse is one of the most powerful IDEs for Java (among other languages). Google Web Toolkit has a command line tool for creating an Eclipse project that can then be imported into Eclipse.

Google Web Toolkit: The Google Web Toolkit is open source software which allows developers to create Ajax applications in the Java language. It integrates easily into IDEs like Eclipse. The toolkit ships with a so-called hosted browser that can be used to run the created applications smoothly from within the IDE. The main feature, however, is a Java to JavaScript compiler that transforms programs written in Java into web applications written in JavaScript. Thus the idea is to profit from the tools available in the Java world (JUnit, IDEs, ...) for web application development as well.
Included in the toolkit are common widgets like trees, radio buttons, forms, and the like. In addition, the toolkit implements event listeners like WindowResizeListener that are an abstraction layer on top of the browser's event handling. A common issue in web development is browser differences and incompatibilities. They make life a lot harder than it should be. The toolkit abstracts these worries away, and the generated applications in general behave the same on different browser platforms.
The 1.3.3 release of Google Web Toolkit is based on Java 1.4. Basically everything that can be done in JavaScript can be done in Java, and vice versa, within the limits the browser imposes. To make this idea clear, let's have a look at an example. Via XMLHttpRequest, JavaScript applications can access resources only as long as they do not violate the Same Origin Policy[23]. So even though an arbitrary Java program may access external resources from third parties, this will not work once the program has been compiled to JavaScript: if you are on my.company.com/dir1/a.html, another valid resource would be my.company.com/dir2/b.html, whereas other.company.com/dir1/a.html would be an invalid external resource.
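The restriction can be illustrated with a minimal Python sketch. It assumes that an origin is the triple of scheme, host, and port, and it adds an http:// scheme to the example addresses, which the text above leaves out. Real browsers apply further rules, so this is an approximation and not a description of any particular browser.

```python
from urllib.parse import urlsplit

# Assumption: default ports for the two schemes used in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url):
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(page_url, resource_url):
    """True if a script on page_url would be allowed to request resource_url."""
    return origin(page_url) == origin(resource_url)

page = "http://my.company.com/dir1/a.html"
print(same_origin(page, "http://my.company.com/dir2/b.html"))     # True
print(same_origin(page, "http://other.company.com/dir1/a.html"))  # False
```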
Object orientation (OO) in JavaScript is prototype-based, whereas Java has a classical class-based OO model. To many programmers the class approach is more intuitive, so with the Google Web Toolkit you can program class-based JavaScript from within Java.

Justification of the selected tools: I have some experience with pure JavaScript application development; however, for this project the host company's requirement was to use the in-house Google Web Toolkit. The widgets mentioned above were of great value for the project, especially the Tree widget for the tree view of the generated WADL file (explained later). The fact that the code generated by the Google Web Toolkit is compatible with most of today's browsers is a big plus for every JavaScript developer. While I probably would have been able to deal with some of the incompatibilities, having a toolkit do this for you is a great convenience, and allows you to focus on the implementation details of your application rather than on browser quirks. So even though the choice of Google Web Toolkit was not mine, I never felt bad about it. Google provides pretty good introductions and a hello world application that help you understand the general concepts of the toolkit.

External architecture: From a user's point of view, the application is structured in two modules: REST Describe, and REST Compile. The following is an overview of each module's responsibilities:

WADL creation of a web service based on sample requests. The idea is to profit from the information intrinsic to REST-style web services, i.e.
  • the service structure corresponds to the URI structure
  • the endpoint names correspond to the available web service operations
  • sample requests contain a lot of data that can be analyzed with a heuristic approach
  • the HTTP method which is of particular interest for REST is part of the sample requests
  • the response of a web service can be analyzed in order to gain more knowledge about the service, especially for fault handling and XML schemas in use
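How much a single sample request already reveals can be sketched in a few lines of Python. The function name decompose and the example URI on api.example.com are hypothetical and serve only as illustration; REST Describe itself is written in Java with Google Web Toolkit.

```python
from urllib.parse import urlsplit, parse_qsl

def decompose(method, uri):
    """Split a sample request into the pieces a service description needs."""
    parts = urlsplit(uri)
    return {
        "method": method.upper(),                      # the HTTP method of the sample
        "base": f"{parts.scheme}://{parts.netloc}/",   # the service authority
        "path": parts.path.lstrip("/"),                # resource hierarchy / operation
        "params": parse_qsl(parts.query, keep_blank_values=True),  # name/value samples
    }

sample = decompose("get", "http://api.example.com/search/news?query=rest&results=10")
print(sample["base"], sample["path"], sample["params"])
```

The parameter values are kept as plain strings at this stage; deciding that results=10 is an integer is a separate step.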

WADL upload and modification of existing WADLs. The more people use WADL, the greater the need to parse and modify existing WADL files. REST Describe has a WADL parsing feature which checks the given WADL for correctness, and upon success creates a tree representation for interactive editing.

WADL download of the created WADL files.


Code generation in various programming languages based on a WADL file. At the time of writing these were
  • PHP 5 (since version 0.2)
  • Ruby (since version 0.2.1)
  • Python (since version 0.2.2)
  • Java (since version 0.3)
The first three languages are dynamically typed interpreted languages, whereas Java is statically typed. In some respects the strict typing that Java enforces is only a feature towards the outside world, as in the end everything travels as a string over HTTP.

Internal architecture: From a developer's point of view the application consists of the following package structure:

com.google.code.apis.rest.client
RestDescribe.java
The entry point to the application. This is a singleton whose only job is to initialize the application GUI.
GUI
GuiFactory.java
ParameterPanel.java
WadlPanel.java
MainMenuPanel.java
RequestUriPanel.java
RestCompilePanel.java


AboutDialog.java
BatchUriDialog.java
CustomTypesDialog.java
FullscreenDialog.java
SettingsDialog.java
TestRequestDialog.java
WadlPreviewDialog.java
WadlSaveDialog.java
WadlUploadDialog.java


ParameterTree.java
RequestUriTree.java

Indicator.java

Notifications.java
Notifications.properties
Notifications_de.properties

Strings.java
Strings.properties
Strings_ca.properties
Strings_de.properties
This package contains all the GUI elements. The main screen consists of several horizontal and vertical panels that are docked to a dock panel inside of the GUI factory.

This package also contains all the dialog windows.

Very important are the two tree files which manage the parameter tree on the left side of the application (ParameterTree.java) and the request URI tree in the upper part of the application window (RequestUriTree.java).

Indicator is a small notification area as seen on Gmail or other Ajaxified sites. It signals script activity.

In addition to that there are some internationalization (I18N) files for (currently) English, German, and Catalan language support. This is realized thanks to Google Web Toolkit's I18N module which is based on Java properties files.
Tree
XmlTree.java

ApplicationItem.java
FaultItem.java
GenericClosingItem.java
GrammarsItem.java
IncludeItem.java
MethodItem.java
ParamItem.java
RepresentationItem.java
RequestItem.java
ResourceItem.java
ResourceTypeItem.java
ResourcesItem.java
ResponseItem.java
The Tree package contains all the classes that serve as WADL tree item representations in the center of the application window. The tree root is in the XmlTree.java file; all other tree nodes can be added according to the WADL specification. The idea is to have a tree that behaves like an XML file, including code folding and interactive editing. The ParamItem.java and MethodItem.java files are linked to the parameter tree and the request URI tree, i.e. if you edit either of them, the corresponding counterparts change accordingly.
Util
InvalidUriException.java
PrettyPrinter.java
Tools.java
TypeEstimator.java
Uri.java
This is a container package for utilities like URI management, syntax highlighting, and type estimation.
WADL
Analyzer.java

WadlParser.java

WadlXml.java

ApplicationNode.java
DocNode.java
FaultNode.java
FaultRepSuperNode.java
GenericNode.java
GrammarsNode.java
MethodNode.java
NamespaceAttribute.java
ParamNode.java
RepresentationNode.java
RequestNode.java
ResourceNode.java
ResourceTypeNode.java
ResourcesNode.java
ResponseNode.java

NamespaceDiscoverer.java
GrammarDiscoverer.java
The WADL package contains everything that is related to WADL.

The analyzer checks sample request URIs, the parser parses uploaded WADLs.

Finally the WadlXml converts a WADL tree into an XML file.

A WADL tree consists of many different nodes, each of them represented by its own class. Some nodes share sub nodes across several analysis runs, e.g. Application nodes remember their sub nodes, as each request URI in a batch analysis a priori belongs to the same application.

Finally there are the namespace and grammar discoverers. They are currently mocked up, but shall one day analyze live requests to web services and extract the XML namespaces and grammars the service uses. This requires a server-side component, though, as a pure client-side script is bound to the Same Origin Policy[23].
CodeGeneration
CodeGenerator.java

PythonGenerator.java
RubyGenerator.java
JavaGenerator.java
PHP5Generator.java

HTTPError.java

Parameter.java

RequestMessager.java

Templates_Java.java
Templates_Java.properties
Templates_PHP.java
Templates_PHP.properties
Templates_Python.java
Templates_Python.properties
Templates_Ruby.java
Templates_Ruby.properties

TypeMapper.java
The CodeGeneration package contains everything related to code generation. The architecture follows a classical factory pattern: a code generator factory initializes the particular generator class.

All generators share a common HTTP error class that contains standard HTTP error descriptions and allows for fault handling in cases where an API uses standard HTTP errors.

The parameter class is an abstraction of a web service parameter and provides visibility information and coding conventions for the class variables of the classes to be generated.

The generators communicate with the factory by means of a request messager which basically is a representation of a request node enriched with particular web service data.

Code generation happens for the static parts via Java properties files. The dynamic parts are inserted by calling the corresponding methods defined in the particular language template interfaces.

Finally the XSD schema types that are used in WADL are mapped to Java, and later C# types by the type mapper. This does not yet include custom types.
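What such a type mapping could look like is sketched below in Python. The table entries are illustrative assumptions and not the actual contents of TypeMapper.java. The fallback to the string type for unknown types is likewise an assumption, chosen because custom types are not yet covered.

```python
# Illustrative subset only: the real TypeMapper.java may map more or other types.
TYPE_MAP = {
    "java": {
        "xsd:string": "String",
        "xsd:int": "int",
        "xsd:long": "long",
        "xsd:boolean": "boolean",
        "xsd:double": "double",
    },
    "csharp": {
        "xsd:string": "string",
        "xsd:int": "int",
        "xsd:long": "long",
        "xsd:boolean": "bool",
        "xsd:double": "double",
    },
}

def map_type(xsd_type, language):
    """Map an XML schema type to a native type, falling back to the string type."""
    table = TYPE_MAP[language]
    return table.get(xsd_type, table["xsd:string"])

print(map_type("xsd:boolean", "java"))    # boolean
print(map_type("xsd:boolean", "csharp"))  # bool
print(map_type("my:sortOrder", "java"))   # String (custom type, not mapped)
```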

Class interaction: In the following we want to give an overview on how the classes interact. There are several steps during execution that can be combined to complete program cycles.

GUI setup: this is the first and basic step during program execution. The sole responsibility of this unit is to initialize the working environment. Strings and Notifications are internationalization classes that contain all the program messages in various languages. At run-time the correct class is initialized depending on the selected language. See the sequence diagram below for details:
Request URI analysis: one of the most common tasks is request URI analysis. The idea is to create a WADL XML tree based on a single request URI. Each request consists of an HTTP method and the actual URI string. This information is managed by the RequestUriTree class. This class first tries to analyze the response to the request in order to discover the namespaces and grammars used by the web service (currently mocked up), and then initializes the Analyzer. The following sub-steps all depend on the request structure. Consequently, the following diagram describes a standard case for a request URI with one parameter and one level of resource hierarchy. First, the ResourcesNode gets created. It contains the URI authority as its base attribute. Second, a ResourceNode is created whose path attribute corresponds to the URI path. Afterwards, a MethodNode is created which reflects the HTTP method used for the request in its name attribute. The following RequestNode is just a container for the ParamNodes. From an analysis point of view these nodes are the most interesting ones, as this is where type analysis happens: a multi-step structure of regular expressions tries to estimate the correct type. Type estimation is explained in a later section. The final DocNode preserves the original sample value of the parameter, and contains information about the quality of the estimation (sure, supposed, unsure).
Batch URI Analysis: this is basically just a modification of the previous case, with the difference that there is more than one RequestUriTree instance at the beginning. These instances are created from within the GUI package's BatchUriDialog class.
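The node sequence described above can be sketched in Python with the standard library's ElementTree. This is not the Java implementation: the function names, the example URI, and the naive_estimate stand-in are hypothetical, the WADL namespace declaration is omitted for brevity, and the style="query" attribute is an assumption about the WADL revision in use.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, parse_qsl

def build_wadl(method, uri, estimate_type):
    """Create application > resources > resource > method > request > param > doc."""
    parts = urlsplit(uri)
    application = ET.Element("application")
    # The URI authority becomes the base attribute of the resources node.
    resources = ET.SubElement(application, "resources",
                              base=f"{parts.scheme}://{parts.netloc}/")
    # The URI path becomes the path attribute of the resource node.
    resource = ET.SubElement(resources, "resource", path=parts.path.lstrip("/"))
    # The HTTP method ends up in the name attribute of the method node.
    method_el = ET.SubElement(resource, "method", name=method.upper())
    request = ET.SubElement(method_el, "request")  # container for the params
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        xsd_type, quality = estimate_type(name, value)
        param = ET.SubElement(request, "param", name=name, type=xsd_type, style="query")
        doc = ET.SubElement(param, "doc")  # keeps the sample value and the quality
        doc.text = f"sample value: {value}; estimation quality: {quality}"
    return application

def naive_estimate(name, value):
    """Stand-in for the real type estimation: digits are integers, the rest strings."""
    return ("xsd:integer", "sure") if value.isdigit() else ("xsd:string", "supposed")

tree = build_wadl("GET", "http://api.example.com/search?results=10", naive_estimate)
print(ET.tostring(tree, encoding="unicode"))
```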

XML Tree Creation: once a request has been parsed or an already existing WADL file has been uploaded, this step is responsible for creating a graphical representation of the tree. The XmlTree class instantiates an ApplicationItem instance, and from that point on the tree gets created item by item, where each item can have sub-items according to the WADL schema. The diagram below shows a very incomplete view of what can happen. Tree creation is not a strictly sequential process, which makes it hard to represent the correct workflows in the style of a sequence diagram. References may be circular, and elements may have the same kind of child nodes. There is no generic way to represent this in UML form, so the diagram is something in between a concrete sequence and an attempt to show the general case, and its expressiveness is rather limited. The WADL XML schema or RelaxNG schema is far better suited for gaining an understanding of the tree, as the implementation strictly follows the structure of the schema.
WADL Upload: via the WadlUploadDialog class a WADL file gets forwarded to the WadlParser class. This class then instantiates the ApplicationNode. Depending on whether a particular node exists in the WADL, further nodes get instantiated and are added to the abstract syntax tree representation with the application node as the root element. The allowed structure (child nodes and attributes) is checked at runtime. As soon as the first error is discovered, analysis is interrupted with an appropriate error message. In the end a reference to an ApplicationNode is passed to the XmlTree class, which then creates a graphical tree-like representation of the web service.
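The fail-on-first-error behaviour of such a parser can be sketched in Python as follows. The table of allowed child elements is a simplified, incomplete assumption and not a copy of the WADL schema, attribute checks are left out, and all names (WadlError, check, parse_wadl) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified assumption: which child elements each element may contain.
ALLOWED_CHILDREN = {
    "application": {"doc", "grammars", "resources", "resource_type"},
    "grammars": {"doc", "include"},
    "resources": {"doc", "resource"},
    "resource": {"doc", "param", "method", "resource"},
    "method": {"doc", "request", "response"},
    "request": {"doc", "param", "representation"},
    "response": {"doc", "param", "representation", "fault"},
    "representation": {"doc", "param"},
    "fault": {"doc", "param"},
    "param": {"doc", "option"},
}

class WadlError(ValueError):
    """Raised for the first structural error found in a WADL document."""

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]

def check(element):
    """Recursively verify child elements; stop at the first violation."""
    name = local_name(element.tag)
    if name == "doc":
        return  # documentation may contain arbitrary content
    allowed = ALLOWED_CHILDREN.get(name, set())
    for child in element:
        child_name = local_name(child.tag)
        if child_name not in allowed:
            raise WadlError(f"<{child_name}> is not allowed inside <{name}>")
        check(child)

def parse_wadl(text):
    """Parse WADL text and return the root element if the structure is valid."""
    root = ET.fromstring(text)  # raises ET.ParseError for malformed XML
    if local_name(root.tag) != "application":
        raise WadlError("root element must be <application>")
    check(root)
    return root
```

A caller would catch WadlError (and ET.ParseError for XML that is not well-formed) and show the message to the user instead of building the tree.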
Code Generation: probably the most common task in REST Describe & Compile is the compilation of WADL to a programming language. The compilation process, however, is rather simple. The central component is the CodeGenerator factory. Depending on the selected language, the appropriate generator class is loaded. After the syntax tree (the internal representation of the graphical WADL XML tree) has been traversed, the generator factory and the concrete generators communicate by means of a so-called RequestMessager class. The RequestMessager contains the parameters, the request name, the HTTP method, and the absolute address (i.e. the URI without the parameters). Faults are passed to the generators as separate HTTPError objects. In the case of statically typed languages such as Java and C#, a TypeMapper class maps the XML schema types to the language's native types. Each target language has a templates class that contains templates for the static parts of the generated code. The diagram below shows the Java generation sequence.
Summary: the steps outlined above can be combined in any meaningful order, for example GUI Setup followed by WADL Upload followed by XML Tree Creation and finally Code Generation. Another sequence would be GUI Setup followed by Request URI Analysis.
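The interplay of factory, generators, and request messager can be sketched in Python. The class names mirror the ones in the text, but the fields, method names, and generated output are assumptions made for illustration, and the generators here emit only empty function stubs.

```python
from dataclasses import dataclass, field

@dataclass
class RequestMessager:
    """What a generator needs to know about one request (fields are assumptions)."""
    name: str
    http_method: str
    address: str                      # the URI without the parameters
    parameters: list = field(default_factory=list)

class RubyGenerator:
    def generate(self, request):
        args = ", ".join(request.parameters)
        return f"# {request.http_method} {request.address}\ndef {request.name}({args})\nend\n"

class PHP5Generator:
    def generate(self, request):
        args = ", ".join("$" + p for p in request.parameters)
        return f"// {request.http_method} {request.address}\nfunction {request.name}({args}) {{\n}}\n"

# The factory's registry: supporting a new language means adding one entry here.
GENERATORS = {"ruby": RubyGenerator, "php5": PHP5Generator}

def create_generator(language):
    """Factory function: return a generator instance for the selected language."""
    try:
        return GENERATORS[language.lower()]()
    except KeyError:
        raise ValueError(f"no generator registered for {language!r}") from None

request = RequestMessager("news_search", "GET", "http://api.example.com/search/news",
                          ["query", "results"])
print(create_generator("PHP5").generate(request))
```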

Type estimation in REST Describe: type estimation is an important element of request analysis. The approach taken by REST Describe & Compile is based on a fall-through hierarchy of regular expressions. The estimation process takes both the parameter value and the parameter name into account. This allows for pretty good estimation, with the process judging its own quality. Custom types are a problem (for example a type sort with the allowed values {date, relevance}). Such parameters look like strings, and apart from a potential XML schema analysis there is no way to determine whether only certain values are allowed. At present the application runs completely on the client side. Because of the Same Origin Policy, schema retrieval would require a server-side component. Therefore the current default type estimation for this kind of parameter is string, but only with the estimation quality "supposed" (and not "sure"). The diagram below contains the complete fall-through hierarchy, where the green, yellow, and red boxes symbolize the type estimation at the particular point: green means "sure", yellow "supposed", and red "unsure". The left side of a diamond means condition fulfilled, the right side means condition not fulfilled.
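The general shape of such a fall-through estimation can be sketched in Python. The concrete rules, their order, and the name hints below are assumptions made for illustration and do not reproduce the hierarchy of the diagram. Only the quality labels and the default of string with quality "supposed" are taken from the description above.

```python
import re

# Assumption: parameter names that hint at an on/off flag.
FLAG_NAME = re.compile(r"^(is|has)_|_(ok|flag|enabled)$", re.IGNORECASE)

# Assumption: value patterns, tried from top to bottom until one matches.
VALUE_RULES = [
    (re.compile(r"true|false"), "xsd:boolean"),
    (re.compile(r"[+-]?\d+"), "xsd:integer"),
    (re.compile(r"[+-]?\d*\.\d+"), "xsd:decimal"),
    (re.compile(r"\d{4}-\d{2}-\d{2}"), "xsd:date"),
]

def estimate(name, value):
    """Return (type, quality) with quality one of 'sure', 'supposed', 'unsure'."""
    if value == "":
        return ("xsd:string", "unsure")      # nothing to analyze
    if value in ("0", "1") and FLAG_NAME.search(name):
        return ("xsd:boolean", "supposed")   # the name tips the balance
    for pattern, xsd_type in VALUE_RULES:
        if pattern.fullmatch(value):
            return (xsd_type, "sure")
    return ("xsd:string", "supposed")        # might be a custom type such as sort

print(estimate("results", "10"))      # ('xsd:integer', 'sure')
print(estimate("sort", "relevance"))  # ('xsd:string', 'supposed')
print(estimate("adult_ok", "1"))      # ('xsd:boolean', 'supposed')
```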


In practice the following observations can be made:

One last remark concerns types with web services in general. When you take a search web service with an imaginary parameter results=10, type estimation will detect that the 10 is an integer. But what does this mean? In the end, when the request goes on the wire, you are always sending a string. Whatever types your web service client code enforces on the user interface (this means on the constructors of the request classes), there must be a string representation of the data. Imagine a language parameter that allows a search to be restricted to a subset of languages.  In a programming language you would create a list or an array of the allowed languages, but how should an array or a list map to a string? In practice there are two forms to represent this in the REST world:

Code generation with Java properties files: one of the more interesting features of REST Describe & Compile is the way code generation is implemented. The application is created with Google Web Toolkit. The toolkit's I18N Messages feature, which is based on Java properties files, provides us with a certain kind of template functionality. How does this work? Let's start with an explanation of the feature as it was originally intended (i.e. for I18N):

You start with a template in each language:
   runningOutOfDiskSpace = Caution, you only have {0} left.
   runningOutOfDiskSpace = Achtung, Sie haben nur noch {0} übrig.

At runtime the program inserts the actual amount of available disk space, and only the strings that need to be localized are kept in the properties file. Now back to the code generation idea. Functions, methods, or whatever you want to call them, exist in almost every programming language. There are slight differences, but the main idea is the same. Have a look at two equivalent functions, the first in Ruby and the second in PHP:

   def welcome(name)
     puts "howdy #{name}"
   end

   function welcome($name) {
     echo "howdy " . $name;
   }

So "}" becomes "end", "function" becomes "def", and so on. For each language there are characteristic structures, and only the dynamic content needs to be updated. What REST Compile does is this: the CodeGenerator factory class loads, for example, a PHP5Generator class, which then loads its particular Templates_PHP.properties file. Consequently, if you want to add another language, simply register your generator in the CodeGenerator factory, write your template file, add the code that generates the dynamic parts to the generator, and there you go. By the way: this abuse of I18N messages allows code generation to happen completely on the client side. You could download the whole application and run it offline.
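The template idea can be sketched in Python, with str.format standing in for the properties-file mechanism. The template texts are assumptions modelled on the welcome functions above and are not the contents of the actual Templates_*.properties files. Note that literal curly braces have to be doubled in str.format templates.

```python
# One template per target language; {0} is the function name, {1} the parameter.
# Literal braces are doubled ("{{" and "}}") so that str.format keeps them.
TEMPLATES = {
    "ruby": 'def {0}({1})\n  puts "howdy #{{{1}}}"\nend\n',
    "php": 'function {0}(${1}) {{\n  echo "howdy " . ${1};\n}}\n',
}

def render(language, function_name, parameter):
    """Fill the static template of a language with the dynamic parts."""
    return TEMPLATES[language].format(function_name, parameter)

print(render("ruby", "welcome", "name"))
print(render("php", "welcome", "name"))
```

Adding a language in this sketch means adding one template entry. The static structure lives in the template, and only the function name and the parameter come from the analyzed service.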



Prospects or dog food in REST Compile & Describe

At present the application runs completely on the client side. This means that all the program logic is executed independently of any server-side component. This is a big advantage, but it also limits the application's capabilities. For future releases we are planning to extend the application by adding some server-side logic. "Eating your own dog food" is a common term for companies that use their own products as a means of improving the overall quality of these products. We are planning to follow this pattern in REST Describe & Compile, that is to say in the Grammars and Namespace Discoverer modules. We are thinking of a module that uses code created by REST Compile, and then executes this code in order to retrieve the namespaces and particular XML schemas a certain web service uses. So the process flow would be:

  1. Analyze a request URI and generate a first request-based WADL

  2. Use this WADL file to generate request code

  3. Execute the generated code on the server, i.e. place a real API request

  4. Harvest the returned response and discover the contained namespaces and XML schemas

  5. Refine the previously generated WADL, and add the response-based WADL information

  6. Return the final version of the WADL

  7. Return the final version of the code with proper response handling
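Step 4 of this flow could look roughly like the following Python sketch, which collects the namespace declarations of an XML response. The function name and the sample response are hypothetical, and the planned server-side module does not exist yet, so this only illustrates the idea.

```python
import io
import xml.etree.ElementTree as ET

def discover_namespaces(xml_text):
    """Return a dict mapping namespace prefixes to URIs ('' is the default one)."""
    namespaces = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    # The "start-ns" event is reported once for every namespace declaration.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        namespaces.setdefault(prefix, uri)  # keep the first declaration per prefix
    return namespaces

# Hypothetical response of a search web service.
response = (
    '<ResultSet xmlns="urn:example:search" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    "<Result/></ResultSet>"
)
print(discover_namespaces(response))
```

The discovered URIs could then be written into the grammars section of the refined WADL in step 5.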



[1] Freely adapted from the Blues Brothers: http://www.imdb.com/title/tt0080455/quotes
[2] Example from Ex-Googler Nelson Minar's weblog: http://www.somebits.com/weblog/tech/bad/whySoapSucks.html
[3] List of available SOAP toolkits: http://soaplite.com/resources.html#TOOLKITS
[4] The S stands for Simple: http://wanderingbarque.com/nonintersecting/2006/11/15/the-s-stands-for-simple
[5] Roy T. Fielding in his dissertation on REST: http://www.ics.uci.edu/%7Efielding/pubs/dissertation/evaluation.htm#sec_6_1
[6] Yahoo's News Search REST API: http://developer.yahoo.com/search/news/V1/newsSearch.html
[7] Jakob Nielsen's Alertbox on URLs as UI: http://www.useit.com/alertbox/990321.html
[8] Examples of textual and/or XML schema REST API descriptions:
    Amazon (Item Lookup): http://docs.amazonwebservices.com/AWSEcommerceService/2007-02-22
    del.icio.us (Tag Rename): http://del.icio.us/help/api/tags
    Flickr (Add Tags): http://www.flickr.com/services/api/flickr.photos.addTags.html
    Yahoo Photo Service (List Photos): http://developer.yahoo.com/photos/V3.0/listPhotos.html
[9] List of REST web service descriptions maintained by David B. Orchard: http://www.pacificspirit.com/Authoring/REST
[10] W3C overview on how to describe REST web services with WSDL 2.0: http://www.w3.org/2005/Talks/1115-hh-k-ecows/#(1)
[11] Sun's Norman Walsh on the complexity of WSDL: http://norman.walsh.name/2005/02/24/wsdl#p8
[12] WRDL (Web Resource Description Language): http://www.prescod.net/rest/wrdl/wrdl.html
[13] NSDL (Norm's Service Description Language): http://norman.walsh.name/2005/03/12/nsdl
[14] SMEX (Simple Message Exchange Descriptor): http://www.tbray.org/ongoing/When/200x/2005/05/03/SMEX-D
[15] Resedel (REstful SErvices DEscription Language): http://recycledknowledge.blogspot.com/2005/05/resedel.html
[16] RSWS (Really Simple Web Service Descriptions): http://webservices.xml.com/pub/a/ws/2003/10/14/salz.html
[17] WDL (Web Description Language): http://www.pacificspirit.com/Authoring/WDL
[18] WADL (Web Application Description Language): https://wadl.dev.java.net
[19] Google Checkout API: http://code.google.com/apis/checkout/index.html
[20] eBay REST API: http://developer.ebay.com/developercenter/rest
[21] Google Web Toolkit: http://code.google.com/webtoolkit/versions.html
[22] Eclipse Project: http://download.eclipse.org/eclipse/downloads
[23] Same Origin Policy: http://www.mozilla.org/projects/security/components/same-origin.html