XML saves the world <?>

Related issues

Protocols and data formats

I draw a clear distinction between protocols and data formats.

Protocols describe dynamic transaction processes [*], having:

  • some implicit processing model
  • a concept of changing state in each of the communicating parties
  • protocol data unit semantics dealing with the dynamic state of the communicating parties

Data formats describe static information formats:

  • syntax
  • possibly some associated semantics, independent of an underlying communication protocol state
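
To make the contrast concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the "key=value;key=value" record format, the OPEN/DATA/CLOSE message kinds, and the names parse_record and Receiver. The parser is a pure function of its input, while the receiver's reaction to the same message depends on what it has already seen.

```python
from enum import Enum, auto


def parse_record(line: str) -> dict[str, str]:
    """Data format: a pure function of the input text, with no memory."""
    return dict(field.split("=", 1) for field in line.split(";") if field)


class State(Enum):
    IDLE = auto()
    OPEN = auto()
    CLOSED = auto()


class Receiver:
    """Protocol endpoint: what a PDU means depends on the current state."""

    def __init__(self) -> None:
        self.state = State.IDLE
        self.received: list[dict[str, str]] = []

    def on_pdu(self, kind: str, body: str = "") -> str:
        # Each accepted PDU may change the receiver's state.
        if kind == "OPEN" and self.state is State.IDLE:
            self.state = State.OPEN
            return "ok"
        if kind == "DATA" and self.state is State.OPEN:
            self.received.append(parse_record(body))
            return "ok"
        if kind == "CLOSE" and self.state is State.OPEN:
            self.state = State.CLOSED
            return "ok"
        # Same syntax, wrong moment: the format is valid, the protocol is not.
        return "error: unexpected " + kind + " in state " + self.state.name
```

A DATA message carrying a perfectly well-formed record is rejected before OPEN and after CLOSE, and accepted in between. Nothing in parse_record can express that rule; it lives entirely in the receiver's state.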

As with network reference models such as the ISO 7-layer model, there are times when these distinctions become blurred: protocol specifications define formats for their protocol data units (PDUs), and applications use protocols to transfer formatted data and may themselves have a notion of processing and state associated with those transfers.  But I still find it useful, in a divide-and-rule kind of way, to consider the designs for dynamic processes and static information formats separately.

Recent developments in the XML application area (e.g. SOAP) tend to blur the distinction because XML is used as a carrier syntax for protocol data units (e.g. the SOAP envelope), and is also the form of the payload data transferred.  Also, using XML as part of a protocol framework in this way tends to move protocol designs away from the "bits-on-the-wire" focus that is characteristic of many IETF protocol designs.

But ultimately, I maintain, XML is just a data format - a syntax with very little in the way of associated semantics, and no form of dynamic transaction-state semantics.  Using XML will not, of itself, do anything to solve the tricky problems of managing and synchronizing transaction state between communicating parties. There is a whole range of design issues that must be addressed regardless of the choice to use XML or any other syntactic framework for encoding protocol data units and application data.
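
One such design issue, duplicate delivery, can be sketched in a few lines of Python using the standard xml.etree.ElementTree parser. The debit message, its id and amount attributes, and the Account class are all hypothetical, invented for this illustration rather than taken from any real schema. The point is only that a retransmitted message parses just as cleanly as the original, so detecting it is work the protocol designer still has to do.

```python
import xml.etree.ElementTree as ET

# Hypothetical message; not taken from any real schema.
DEBIT = '<debit id="m1" amount="30"/>'


class Account:
    def __init__(self, balance: int) -> None:
        self.balance = balance
        self.seen_ids: set[str] = set()  # transaction state we must keep ourselves

    def apply_naive(self, xml_text: str) -> None:
        # Trusts the syntax alone: every well-formed message is applied.
        msg = ET.fromstring(xml_text)
        self.balance -= int(msg.get("amount"))

    def apply_once(self, xml_text: str) -> bool:
        # Applies a message only if its id has not been seen before.
        msg = ET.fromstring(xml_text)
        msg_id = msg.get("id")
        if msg_id in self.seen_ids:
            return False
        self.seen_ids.add(msg_id)
        self.balance -= int(msg.get("amount"))
        return True


naive = Account(100)
naive.apply_naive(DEBIT)
naive.apply_naive(DEBIT)  # retransmission: parses fine, debits twice

careful = Account(100)
careful.apply_once(DEBIT)
careful.apply_once(DEBIT)  # duplicate rejected using state XML knows nothing about
```

The XML parser behaves identically in both classes. The difference lies in the seen_ids bookkeeping, which would be needed in exactly the same way if the message were encoded in any other syntax.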


[*] This begs a definition.  A transaction is a series of message exchanges between two or more communicating parties that unfolds over a period of time, such that each send or receipt of a message results in a state change in the sending or receiving party, according to the data sent or received.