
The Jabber Client Developer's Cheat Sheet

Jens Alfke

rev. 4 :: 25 May 2001
(Copyright & license at end of document)

Contents

  1. Introduction
  2. Jabber Sessions
  3. Logging In & Out
  4. Declaring & Receiving Presence
  5. Managing Your Roster
  6. Sending & Receiving Messages
  7. Chatting
  8. File Transfer
  9. Registering New Accounts


1. Introduction

This is an informal tutorial guide for the perplexed developer trying to implement a Jabber client from scratch. Two weeks ago that developer was me; now, after many experiments and confused e-mails to jdev, I'm dropping pearls of my great wisdom to help you out. You're welcome!

Your need for this document will be less if you use one of the many available Jabber client libraries like libjabber or JabberCOM, which do a lot of this work for you. However, the libraries typically don't do all of the work of managing a real Jabber session (libjabber in particular is pretty low-level) so you will still need to do part of the work described here.

This document is a commentary on, not a substitute for, the Jabber Protocol Overview [JPO] or Jabber Programmer's Guide [JPG]. Since those documents are (in their present incarnations) arranged more as references than introductions, the tutorial overview here should help you in navigating them. I've included pointers to relevant sections in the official documentation, to which you can refer for more details. (See also the References section at the end of this document.)

In particular, I'm going to assume you've already read enough of the official documentation to know how <iq> tags work, since you'll need to send and receive one to be able to log in. See [JPO 1.5].

There are still (as of revision 2) several blank spots. There's a placeholder section on chat that I hope to fill in soon (once I implement chat in my own client!) There's no mention at all of transports and bridging to other instant messaging protocols; frankly, I'm unlikely to address this myself since I'm not especially interested in that area.


2. Jabber Sessions

A Jabber session runs as a single continuous TCP connection on port 5222, or on 5223 if SSL encryption is used. (Even if your client supports SSL, the user should be able to disable it, since not all Jabber servers support it.) It remains open for as long as the client is logged in.

XML elements as commands.

The data stream sent in each direction over the connection forms a continuous XML document whose outermost element is <stream>, although the document of course isn't finished until each side logs out by sending the final closing </stream> tag. Within the outer <stream> elements, sub-elements are sent in both directions as commands, containing their own attributes and nested elements to indicate parameters.

The two sides of the connection are asynchronous. Unlike older protocols like POP, after you send a command you don't have to wait for a reply from the server before sending another. Moreover, the server may send you notifications at any time (for instance, if one of your buddies logs in or out.) You should always be prepared to handle any incoming command.

Then how do you identify the reply to a particular command? The "id" attribute is used to associate commands with replies. A command you send should include such an attribute with a reasonably unique value; the response from the server will contain a matching attribute with the same value. Internally you can use a hashtable to remember the IDs of outstanding requests and associate them with any necessary data. (A particularly flexible technique is to store a pointer to a callback function or method that you'll invoke when the reply arrives. This really helps simplify your flow of control.)
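A minimal sketch of the hashtable-of-callbacks idea described above. The class and method names (PendingRequests, register, dispatch) and the "req1"-style IDs are made up for illustration; any reasonably unique ID scheme would do.

```python
import itertools


class PendingRequests:
    """Maps the "id" of each outstanding request to a callback (illustrative names)."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._callbacks = {}

    def register(self, callback):
        """Allocate a fresh id, remember the callback, and return the id to send."""
        request_id = "req%d" % next(self._counter)
        self._callbacks[request_id] = callback
        return request_id

    def dispatch(self, reply_id, reply):
        """Invoke and forget the callback for reply_id; return False if unknown."""
        callback = self._callbacks.pop(reply_id, None)
        if callback is None:
            return False  # not a reply to anything we sent, e.g. a notification
        callback(reply)
        return True


pending = PendingRequests()
results = []
login_id = pending.register(results.append)   # put login_id in the query's "id"
pending.dispatch(login_id, "login reply")     # called when the matching reply arrives
```

Removing the entry as soon as the reply is dispatched keeps the table from growing and makes a duplicate or stray reply easy to spot.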

Parsing XML.

The most complex part of a Jabber implementation is parsing XML. Fortunately, there is no need for you to code it! XML parsing is a very standard task and there are many excellent general-purpose libraries you can use as-is. Personally, for C/C++ based clients I recommend expat, which I've used in my own client. Expat was written by one of the designers of XML, it's widely used, fast and compact, open-source, and comes under a very liberal (MIT) license that allows it to be used even in commercial closed-source software. (C++ developers preferring an object-based API might consider Apache.org's Xerces.) Other excellent libraries exist for C/C++ and for other popular languages like Java, Perl and Python. Note that you need a parser that can read from a data stream and parse it incrementally as the data arrives; a parser that has to be fed an entire XML document in one chunk won't work for Jabber.
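To illustrate incremental parsing, here is a rough Python sketch using the standard library's expat binding (xml.parsers.expat). The StreamParser class, its attribute names, and the sample chunks are invented for this example; the chunks are deliberately split in the middle of tags, the way data can arrive from a socket.

```python
import xml.parsers.expat


class StreamParser:
    """Feeds arbitrary chunks to expat and tracks nesting depth (illustrative)."""

    def __init__(self):
        self.depth = 0
        self.session_id = None
        self.commands = []  # names of elements directly inside <stream>
        self._parser = xml.parsers.expat.ParserCreate()
        self._parser.StartElementHandler = self._start
        self._parser.EndElementHandler = self._end
        # Newer expat versions may postpone re-parsing a half-received token;
        # switch that off where the binding exposes the option.
        disable = getattr(self._parser, "SetReparseDeferralEnabled", None)
        if disable is not None:
            disable(False)

    def _start(self, name, attrs):
        self.depth += 1
        if self.depth == 1:
            self.session_id = attrs.get("id")   # the outer <stream> element
        elif self.depth == 2:
            self.commands.append(name)

    def _end(self, name):
        self.depth -= 1

    def feed(self, data):
        # The second argument False means "more data will follow".
        self._parser.Parse(data, False)


parser = StreamParser()
for chunk in (b'<?xml version="1.0"?><stream:stream id="39AB',
              b'A7D2"><presence/><mess',
              b'age to="someone@jabber.org"/>'):
    parser.feed(chunk)
```

After the three chunks the parser has seen the stream's opening tag and two complete commands, and the depth is back to 1 because the outer element is still open.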

Writing XML.

Some libraries, like expat, only handle parsing XML, not generating it. Fortunately of course the latter is a lot easier, although you'll still find that it makes your code a lot cleaner and more robust if you abstract this out into a simple API like writeOpenTag(), writeText(), writeCloseTag() instead of littering your source with zillions of string literals filled with angle brackets! This is particularly valuable since XML syntax, unlike HTML, is extremely strict, and the server's parser may reject your input and cause you to be disconnected if you forget a single quote character.

A couple of XML reminders: Tag and attribute names are case sensitive. All attribute values need to be enclosed in quotes. Attribute values, and text content outside of tags, need to escape the magic XML metacharacters, which are:

< > & ' "

These need to be replaced with XML entity references which begin with "&" and end with ";". You can use the Unicode value in hex, in the format "&#xnnnn;", or the special entities:

&lt; &gt; &amp; &apos; &quot;
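A small sketch of the kind of writer API suggested above, with the escaping built in so that callers never handle angle brackets themselves. The XmlWriter class and its method names are hypothetical; escape and quoteattr come from Python's standard xml.sax.saxutils module.

```python
from xml.sax.saxutils import escape, quoteattr


class XmlWriter:
    """Accumulates well-formed XML text; names mirror the writeOpenTag() idea."""

    def __init__(self):
        self._parts = []
        self._open = []  # stack of element names awaiting their close tag

    def write_open_tag(self, name, attrs=None):
        rendered = "".join(
            " %s=%s" % (key, quoteattr(value))   # quoteattr escapes and adds quotes
            for key, value in (attrs or {}).items()
        )
        self._parts.append("<%s%s>" % (name, rendered))
        self._open.append(name)

    def write_text(self, text):
        self._parts.append(escape(text))         # escapes &, < and >

    def write_close_tag(self):
        self._parts.append("</%s>" % self._open.pop())

    def getvalue(self):
        return "".join(self._parts)


writer = XmlWriter()
writer.write_open_tag("message", {"to": "naomi@jabber.org"})
writer.write_open_tag("body")
writer.write_text("1 < 2 & 3 > 2")
writer.write_close_tag()
writer.write_close_tag()
```

Because the writer keeps its own stack of open elements, a close tag can never be misspelled or emitted out of order.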

Also remember that non-ASCII characters need to use the right encoding. Your stream's <?xml> header should specify your platform's native encoding, so you don't have to do any work to translate non-ASCII characters.

For lots more info on XML syntax, see the XML books in the References section.


3. Logging In & Out

This section assumes the user already has an account on the server. Jabber can also be used to register new accounts; see section 9.

Opening the session.

Once you've opened the socket on port 5222 (or 5223 for SSL) you need to send a standard XML header and open the <stream> element that encloses the entire session:

<?xml version="1.0" encoding="UTF-8" ?>
<stream:stream
to="jabber.org"
xmlns="jabber:client"
xmlns:stream="http://etherx.jabber.org/streams">

That's enough to wake up the server. Get the input stream of the socket hooked up to your XML parser. The server will send something like this:

<?xml version="1.0" encoding="UTF-8" ?>
<stream:stream
from="jabber.org"
id="39ABA7D2"
xmlns="jabber:client"
xmlns:stream="http://etherx.jabber.org/streams">

Your XML parser will call you with great excitement to tell you that it found the opening tag of a <stream> element. Now it's your turn to log in...

Logging in.

The server is now waiting for you to authenticate yourself; it won't let you do anything else until you do. You use an <iq> query [JPO 1.5] to send your credentials, and the server will send a reply to this query to tell you whether you were successful. [JPO 1.5.3.3]

The query uses the jabber:iq:auth namespace and includes <username> and <resource> elements to identify the user's Jabber ID and the resource name of this computer. [JPO 1.6.3] Many clients seem to use the client name (e.g. "WinJab") as the resource, but this isn't very informative for buddies and isn't conducive to unique resource names. I recommend prompting users on their first login to choose a descriptive name for this computer or location ("Home", "PowerBook", "Naomi's Office") and storing it as a local preference.

The query also has to include the authentication payload, of course. You can either send the raw password in a <password> element, which is obviously discouraged, or more securely as an encoded digest in a <digest> element. The digest is computed by concatenating the session ID (sent by the server in the "id" attribute of its <stream> tag) with the password, running this string through the SHA-1 algorithm to generate a 20-byte hash, then converting the hash into a 40-character hex representation. [Most Unix libraries include secure hash functions, and you can find a GPL'd C implementation in the libjabber sources.]
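The digest recipe above can be sketched in a few lines of Python with the standard hashlib module. The function name auth_digest is invented here, and encoding the string as UTF-8 before hashing is an assumption of this sketch.

```python
import hashlib


def auth_digest(session_id, password):
    """SHA-1 of (session id + password), as 40 lowercase hex characters."""
    data = (session_id + password).encode("utf-8")  # UTF-8 is an assumption
    return hashlib.sha1(data).hexdigest()


# "39ABA7D2" is the sample session id from the server's <stream> tag above;
# the password is a placeholder.
digest = auth_digest("39ABA7D2", "secret")
```

Since the session ID changes on every connection, the digest is different each time even though the password is not.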

Having sent the login query, you need to wait for the reply, which will have the same ID as the query you sent. If you get an <iq type="result"> then all is well and you are now logged in. If the reply type is "error", the login failed and the reply will contain an explanatory error code and message. In case of failure you should close the session immediately (see next section.)
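Here is a hedged sketch of building the login query and spotting a failed login, using Python's standard xml.etree.ElementTree. The function names are invented. Sending the query as type "set", and the error arriving as an <error> element with a "code" attribute and a text message, are assumptions taken from the general protocol documentation rather than from this cheat sheet.

```python
import xml.etree.ElementTree as ET


def build_auth_query(request_id, username, resource, digest):
    """Serialize a jabber:iq:auth login query (type "set" is an assumption)."""
    iq = ET.Element("iq", {"type": "set", "id": request_id})
    query = ET.SubElement(iq, "query", {"xmlns": "jabber:iq:auth"})
    for tag, value in (("username", username),
                       ("resource", resource),
                       ("digest", digest)):
        ET.SubElement(query, tag).text = value
    return ET.tostring(iq, encoding="unicode")


def auth_error(reply_xml):
    """Return (code, message) if the reply's type is "error", else None."""
    reply = ET.fromstring(reply_xml)
    if reply.get("type") != "error":
        return None
    for child in reply:
        # Compare the local name so a namespaced <error> is matched as well.
        if child.tag.rsplit("}", 1)[-1] == "error":
            return child.get("code"), child.text or ""
    return None, ""


login_xml = build_auth_query("auth1", "jens", "PowerBook", "0" * 40)
```

In a real client the reply would come from your stream parser rather than from a string, but the check on the "type" attribute is the same.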

Now what?

Once logged in, your next tasks are to establish the user's presence and sync up the roster. You might also want to:

  • Send a jabber:iq:private query to retrieve custom preferences from the server, if you store any there. [JPG pp.55-56]
  • Send a jabber:x:autoupdate query to see whether a newer version of your client is available [JPG pp.35-36]

Then you can relax and wait for user commands or server notifications.

Closing the session, or being disconnected.

To log out, simply close your outer <stream> element by writing "</stream:stream>", then close the socket. If some fatal error on your side (like an XML parse error) forces you to disconnect, but the socket is still OK, it's polite to first write an <error> element containing an error code and/or message.

In some cases the socket may be closed unexpectedly. Obviously this can happen if the Jabber server process crashes. The server may also pre-emptively log you out if a fatal error occurs; the most common fatal error, at least during development, being that it received incorrectly structured XML from you. In this case the server will send a top-level <error> element containing a message (but usually not an error code, for some reason) followed by the </stream>. You should close the socket and tell the user they've been disconnected.

It's important to note that normally a TCP socket cannot distinguish between an intact but idle connection vs. a broken connection where the other computer has crashed or network connectivity has been broken. It'll just never receive any more data. This is a problem for a real-time protocol like Jabber: you may think you're still online and your buddies are still available, but until you try to send some data you won't know that you're disconnected. Worse, some firewalls and routers get aggressive about scavenging idle connections and will disconnect you from the Jabber server after several minutes of inactivity!

There are two ways to deal with this. The simple one is to send a bit of no-op XML data once a minute or so; a single newline will suffice. If your computer never gets an acknowledgement from the server, the OS will detect that the connection is broken and indicate an error. The slightly more complex solution is to use your OS's networking API to set the "keep-alive" option on the socket [Rogers, p.185-186] but change the timeout period from the default 2 hours down to a few minutes. This may or may not be possible depending on your operating system; in the Unix world this feature was added in Posix.1g. [Rogers, p.201]
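Both approaches can be sketched with Python's standard socket module. The function names and the 120-second default are arbitrary choices for this example. TCP_KEEPIDLE is only defined on some platforms (Linux among them), so the sketch checks for it and reports whether the shorter timeout could be applied.

```python
import socket


def enable_keepalive(sock, idle_seconds=120):
    """Turn on TCP keep-alive; shorten the idle timeout where the OS allows it.

    Returns True if the idle timeout was changed, False if only the default
    keep-alive could be enabled.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
        return True
    return False


def send_noop(sock):
    """The simple alternative: write a single newline to the open stream."""
    sock.sendall(b"\n")
```

A client would call enable_keepalive() once after connecting, or call send_noop() from a timer roughly once a minute and treat any socket error as a disconnection.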


4. Declaring & Receiving Presence

A big part of being an IM client is informing the server (and thus the user's buddies) of the user's presence or online status. You need to do this right after login and whenever that status changes: the user might choose a different status (busy, away, unavailable...) from a menu or type in a new status message; or user inactivity might cause the status to change to away or extended-away (xa).

To declare a change in status you simply send a single <presence> element [JPO 1.4] whose type is either available or unavailable. You don't need to add "from" or "to" attributes; the server will add these when it relays the element to everyone whose buddy list you're on. [JPO 1.4.1.1, 1.4.1.5] The unavailable status is a handy way to make the user "invisible": it will appear to buddies as though the user is not logged in at all! The server will take care of relaying the status to all logged-in users who are subscribed to your user's status.
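As an illustration, a presence element might be built like this with Python's xml.etree.ElementTree. The function name is made up, and the <show> and <status> child elements (for states like "away" and for the status message) are an assumption based on the protocol documentation cited above, not something spelled out in this section.

```python
import xml.etree.ElementTree as ET


def build_presence(available=True, show=None, status=None):
    """Serialize a <presence> element with no "from" or "to" attributes."""
    presence = ET.Element(
        "presence", {"type": "available" if available else "unavailable"})
    if show:
        ET.SubElement(presence, "show").text = show      # e.g. "away", "xa"
    if status:
        ET.SubElement(presence, "status").text = status  # free-form message
    return ET.tostring(presence, encoding="unicode")


away = build_presence(True, show="away", status="At lunch")
invisible = build_presence(False)
```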

An <x> element using the jabber:x:signed namespace may be added to sign presence information. [JPO 1.6.25] I cannot find any details on how the signature is generated.

Receiving the presence status of your buddies is precisely the opposite: you receive the <presence> elements they send. When you log in you will receive one such element for each buddy, to get you up to date on their current status; this element may contain an <x> tag using the jabber:x:delay namespace to let you know the time at which the buddy's status last changed. [JPO 1.6.18, JPG p.89] Any future status changes while you're online will be sent to you immediately.


5. Managing Your Roster

Roster management is one of the most confusing areas of the protocol, or at least of its current documentation. Here's an overview.

How the roster works.

The roster is the list of Jabber entities with whom the user has a presence relationship. This includes people to whose status the user is subscribed ("buddies"), people who are subscribed to the user's status ("watchers"), and people to whom the user wants to subscribe but who haven't yet approved the request. The roster is thus the model of the GUI "buddy list".

The Jabber server stores the user's roster and notifies all logged-in clients whenever it changes: when the user adds or removes a buddy, when another user adds or removes the user as a buddy, and when a potential buddy approves or rejects a subscription request. This is known as a roster push. The change notifications are sent as <iq> elements using the jabber:iq:roster namespace. The client can also manually request a complete copy of the roster: it's necessary to do this just after logging in to detect changes that may have been made since the last logout at this computer.

[JPO 1.6.12]

Subscribing & unsubscribing buddies.

Confusingly enough, adding and removing buddies is not done via roster manipulation but via <presence> elements. Here the type is subscribe or unsubscribe and the request is sent to the buddy's address. Confirmation or rejection is received as a reply <presence> element (with the same ID) with a type of subscribed or unsubscribed.

Conversely, you'll also deal with other entities wanting to subscribe to your presence. In this case it's exactly the opposite: you receive the subscribe or unsubscribe and (after consulting with the user) reply with subscribed or unsubscribed.

[JPO 1.4.1.3 -- 1.4.1.7]
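The four subscription-related presence types can be sketched as follows. Both function names are hypothetical; the mapping from an incoming request to its reply simply follows the description above.

```python
import xml.etree.ElementTree as ET

SUBSCRIPTION_TYPES = ("subscribe", "unsubscribe", "subscribed", "unsubscribed")


def build_subscription_presence(to_jid, presence_type):
    """Serialize a <presence> addressed to one buddy with a subscription type."""
    if presence_type not in SUBSCRIPTION_TYPES:
        raise ValueError("not a subscription type: %r" % (presence_type,))
    presence = ET.Element("presence", {"to": to_jid, "type": presence_type})
    return ET.tostring(presence, encoding="unicode")


def reply_type(request_type, approved):
    """Type to answer an incoming request with, after asking the user."""
    if request_type == "subscribe" and approved:
        return "subscribed"
    return "unsubscribed"


request = build_subscription_presence("naomi@jabber.org", "subscribe")
```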

In all these cases, since your roster is changed, the server will also send another roster push to tell you about the roster element to change.

Manually updating the roster.

It's possible for you to update the server-side roster by sending <iq type="set"> elements. You would do this not to add or remove buddies (see the previous section for how to do that) but to add or change metadata associated with a buddy, such as their nickname or group[s]. [JPO 1.6.12]

Actually, it is possible to add people to the roster this way, but it won't subscribe you to their presence. They'll just be there passively, as in an e-mail address book. You may or may not want to support this in your UI; it doesn't seem especially useful to me, but it might be handy if there are people the user wants to send messages to but doesn't need to know their online status.

The documentation says it's also possible to remove people directly from the roster by setting the "subscription" attribute to "remove". [JPG p.65] This is the only way to remove an item added without a subscription (as in the previous paragraph); but I'm not sure what happens if you remove a subscribed buddy this way. Presumably it should unsubscribe you from their presence notifications just as if you had sent a <presence type="unsubscribe"> element.

Implementation tip.

Since the server will notify you of all changes in the roster, you can drive the management of the GUI buddy list entirely from these pushes. For example, when the user wants to add a buddy, you send out a <presence type="subscribe"> element but don't update the buddy list yet; you'll immediately receive a roster push from the server telling you about the change and should update the buddy list when you receive it. By using roster pushes to manage the buddy list, you ensure that the buddy list stays up to date whether it's you or another simultaneously logged-in client making the changes.
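A rough sketch of driving an in-memory buddy list from roster pushes. The function name and the dictionary layout are invented for this example. The <item> element with its "jid" and "name" attributes and <group> children is an assumption based on the roster documentation; the "subscription" attribute and its "remove" value are the ones mentioned above.

```python
import xml.etree.ElementTree as ET

ROSTER_NS = "{jabber:iq:roster}"


def apply_roster_push(roster, iq_xml):
    """Update the roster dict (keyed by Jabber ID) from one roster <iq>."""
    iq = ET.fromstring(iq_xml)
    for item in iq.iter(ROSTER_NS + "item"):
        jid = item.get("jid")
        if item.get("subscription") == "remove":
            roster.pop(jid, None)
        else:
            roster[jid] = {
                "name": item.get("name"),
                "subscription": item.get("subscription"),
                "groups": [g.text for g in item.findall(ROSTER_NS + "group")],
            }
    return roster


roster = {}
apply_roster_push(roster, (
    '<iq type="set"><query xmlns="jabber:iq:roster">'
    '<item jid="naomi@jabber.org" name="Naomi" subscription="both">'
    '<group>Friends</group></item></query></iq>'))
```

The GUI buddy list can then be redrawn from the dictionary after every push, whichever client caused the change.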

More roster info.

You may want to display more buddy information in your UI than simply the Jabber ID, presence state and status message. As described above, a full name/nickname can be associated with a buddy in the roster.

A whole lot more information about a buddy can be obtained by retrieving their vCard via an <iq> query, if they've stored one on the server [JPO 1.6.26]. vCards are the Internet equivalent of business or Rolodex cards and can contain any imaginable sort of personal and contact information such as photos, birthdays, phone numbers...

Still more generally, the Jabber server as of version 1.4 allows users to store arbitrary XML data tagged with any namespace, accessible to anyone who queries their account using the same namespace. [JPO 1.6.10] I'm not aware of any concrete usage of this, but it has great potential.


6. Sending & Receiving Messages

Compared to the roster, one-to-one messaging is straightforward, although there are a lot of optional bells and whistles. You send a <message> element [JPO 1.3] with a "to" attribute identifying the recipient; conversely, you receive <message> elements with a "from" attribute identifying the sender.

Anyone can send anyone else a message; you don't have to have permission, as you do to view their presence. Since instant messages are sometimes used for spamming or harassment, you should offer your user the ability to block incoming messages from specific addresses, or conversely to allow only messages from people on their buddy list. (This is a good setting to store as a server-side preference using the jabber:iq:private namespace.)

Message attributes.

Any message you receive will include a "from" attribute giving the address of the sender. You can be reasonably sure of the authenticity, more so than with e-mail, because this attribute is set by the sender's Jabber server and cannot be spoofed by the sender. Of course, it could be spoofed by their server, but that's much less likely, especially if the server has a well-known/trusted address.

A message may contain a <subject> element containing the subject of the message. Of course, there's no guarantee that the receiver's UI will display the subject. [JPO 1.3.3.4]

A message may contain a timestamp in the form of an <x> element using the jabber:x:delay namespace. (Typically the server will attach this element if the message was stored and forwarded, i.e. if you were not online at the time it was sent.) [JPO 1.6.18, JPG p.89] If the message has no timestamp, you can assume it was sent no more than a few seconds ago, subject only to network and server lag time.

The jabber:x:envelope namespace provides support for more addressing information, much like traditional e-mail. This includes names for the sender and receiver, lists of multiple receivers or "cc" recipients, and indications that a message was forwarded. [JPO 1.6.20]

The message body.

A message always contains a <body> element containing the plain-text body of the message. [JPO 1.3.3.1]

It may also optionally contain an <html> element containing the same text in HTML format. [JPO 1.3.3.3] But be careful with HTML: it needs to conform to the XHTML Basic dialect. Firstly, it needs to be syntactically correct XML, which means among other things that elements need to be strictly nested (no "<b><i>wow!</b></i>") and that tags with no separate close tag need to end with a "/" (for example, <hr/>.) There are many such restrictions that will come as quite a shock to traditional HTML authors. (The O'Reilly book HTML & XHTML has a great overview of the differences.) Be careful: many libraries that generate HTML from styled text don't conform to these rules, and sending bad XML will cause the server to disconnect you! (If you're writing your own code to convert styled text to XHTML, you'll find that making sure all the style tags nest is quite tricky...)
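One cheap safeguard is to run generated markup through an XML parser before sending it. The sketch below uses Python's standard expat binding; the function name is invented, and it only checks XML well-formedness, not conformance to XHTML Basic.

```python
import xml.parsers.expat


def is_well_formed(fragment):
    """True if the markup parses as XML once wrapped in a dummy root element."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse("<root>%s</root>" % fragment, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


bad = is_well_formed("<b><i>wow!</b></i>")                 # improperly nested
good = is_well_formed("<strong><em>wow!</em></strong>")
```

If the check fails, a client could fall back to sending only the plain-text <body> rather than risk being disconnected.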

You'll also find that XHTML Basic, which was developed with typographically limited clients like cellphones in mind, is missing a number of classic HTML tags we take for granted, like <b>, <i> and <font>. Some of these have higher-level equivalents, like <strong> for <b>, but in general, to indicate styles and colors you should use "style" attributes containing CSS (Cascading Style Sheet) commands. However, since it doesn't appear that existing Jabber clients all adhere to these rules, your parsing code should be prepared to recognize old-style formatting commands like <font> as well.

Jabber supports encrypted messages. An encrypted message body is contained in an <x> element using the jabber:x:encrypted namespace. [JPO 1.6.19] The content is the encrypted message data. The documentation is vague, but apparently the message is encrypted using PGP or GPG and encoded using Base64. It's not clear whether it's possible to use any other encryption systems. If the message text is encrypted, the <body> element should contain an explanatory message like "(The text of this message is encrypted)" for the benefit of users whose client doesn't support encryption.

There is a jabber:x:signed namespace, which the documentation [JPO 1.6.25] says is used to sign presence information, but it doesn't say whether it can also be used to sign a message body (or describe exactly how the signature is generated.)

Other types of content.

Unlike MIME, Jabber messages have no standard mechanism for including separate pieces of content like graphics or sounds. This means there's no good way to include an image in an HTML message unless the image can be stored on some other server where the client can follow a URL to reach it.

Messages can be used to send files, but again, the file's data can't be enclosed in the <message>. Instead, the message contains a URL that points to a location from which the file can be downloaded. See section 8 of this document for details.

Roster items (buddy info) can be sent in a message using the jabber:x:roster namespace. [JPG pp.94-96]

A reference to a chat room can be sent in a message using the jabber:x:conference namespace. [JPO 1.6.17] This is how you invite someone to a chat.

Message types and threads.

The sender of a message can use the "type" attribute to send along a hint as to how the message should be displayed. If no "type" attribute is given, the message is standalone and should be displayed in a separate window. Type "chat" indicates that the message should be displayed using a one-to-one chat interface, appended to the same chat display as previous messages in the thread. Type "groupchat" is similar but is used for messages sent from a chat room. (Of course, these are only recommendations; the receiver's client has final control over the message UI.) [JPO 1.3.1.1 -- 1.3.1.4]

Finally, type "error" indicates an error delivering a message you sent (most likely, it was sent to a nonexistent Jabber address.) Such a reply will contain a standard <error> element describing the problem. It appears from the documentation that the error reply, perhaps confusingly, contains the body of the original message. [JPO 1.3.1.3]

To assist clients in displaying messages in a chat user interface, messages should contain a <thread> element whose content uniquely identifies the sequence of messages. The clients can then map the thread to a particular window. The client sending the first message should make up a hopefully-unique thread ID (the JPO suggests hashing together the sender's Jabber ID and the current time), and all further replies in the same thread should contain the same thread ID. [JPO 1.3.3.5]
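A sketch of thread handling under some assumptions: the thread ID is made by hashing the sender's Jabber ID together with the current time (SHA-1 is just one possible hash), and a plain dictionary stands in for the mapping from thread to chat window. All names are hypothetical.

```python
import hashlib
import time
import xml.etree.ElementTree as ET


def new_thread_id(sender_jid, now=None):
    """Hash the sender's Jabber ID with the current time into a thread ID."""
    now = time.time() if now is None else now
    return hashlib.sha1(("%s/%r" % (sender_jid, now)).encode("utf-8")).hexdigest()


def build_chat_message(to_jid, body, thread_id):
    message = ET.Element("message", {"to": to_jid, "type": "chat"})
    ET.SubElement(message, "body").text = body
    ET.SubElement(message, "thread").text = thread_id
    return ET.tostring(message, encoding="unicode")


windows = {}  # thread ID -> lines shown in that chat window


def route_incoming(message_xml):
    """Append the body to the window for the message's thread.

    Note: within a live stream the element names may carry the stream's
    default namespace; this sketch parses a standalone fragment.
    """
    message = ET.fromstring(message_xml)
    thread_id = message.findtext("thread")
    windows.setdefault(thread_id, []).append(message.findtext("body"))
    return thread_id


thread = new_thread_id("jens@jabber.org/PowerBook")
route_incoming(build_chat_message("naomi@jabber.org", "Hello", thread))
route_incoming(build_chat_message("naomi@jabber.org", "Still there?", thread))
```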

Message events (aka return receipts).

The sender of a message can use the jabber:x:event namespace [JPO 1.6.21] to request that notifications be sent by the receiver when the message is received or displayed to the user, or when the recipient starts composing a reply. It is of course optional for the recipient to send these notifications, and I believe this feature is fairly new, so older or simpler clients might not support it.

Notification that a reply is being composed is especially useful for chat-style UIs, especially group chat. It makes it much clearer who's speaking at any moment and can help keep everyone from talking at once.

Message expiration.

The sender of a message can use the jabber:x:expire namespace to request that the message expire after some time has elapsed. [JPO 1.6.22] If the message is stored offline and the expiration date has passed, the server will not deliver the message when the receiver logs in. And if the receiver has received the message but the user hasn't yet viewed it, the message could vanish when it expires.


7. Chatting

Jabber's groupchat or conference mechanism allows many-to-many chats hosted by a server (which need not be the same as the server any member is logged directly into.) Not only is many-to-many chat a complex thing to implement in a client regardless of protocol, but in the Jabber world this topic is made extra confusing because there are two generations of chat protocol in use. Groupchat is the original one, and conference is newer and more flexible. Conference is supported by the Jabber 1.4 server (as an extra plug-in module) but the protocol is still in flux and likely to change somewhat in the future. (I'm writing this in May 2001.)

I've decided to cover only the newer conference protocol here. It's what I'm implementing, it's what's going to remain relevant longer, and there's a need for documentation as existing clients add support for it.

The primary documentation of groupchat, most of which is still relevant, is in [JPG ch.6]. Conferencing is documented partially in [JPO 1.6.6] and more fully in a protocol draft by Jeremie.

Creating a chat room.

Before creating a room you need to have a room name and a conference server in mind. The server can be entered by the user, or can be discovered by browsing your login server using jabber:iq:browse. The room name can be entered by the user, or you can generate it programmatically (for instance, by appending a random number to the user's name.)

To make sure the room name is not in use, send an <iq type="get"> with xmlns="jabber:iq:browse" to the room. If the room doesn't exist, you'll get an error 404 (Not Found) back and can go ahead. Otherwise, if the room exists, you'll need to pick a different name.

There's some disagreement about how to create the room. Jeremie's draft says that you send an <iq type="set"> with xmlns="jabber:iq:browse" to create it, but this doesn't appear to work on its own with the 1.4.1 server. What actually works is to first send presence to the room, just as though it already existed and you were joining it, then send the set query.

Joining a chat room.

To join an existing room (whose ID may have been typed in by the user or received in a chat invitation), first send a <presence> element to it. Do not add a resource name to the room name; that's how it worked in the old groupchat, but not in conference. If you send the resource name the server will go into backward-compatibility mode and talk groupchat protocol to you.

Next, send an <iq type="set"> with xmlns="jabber:iq:browse" whose query contains one or more <nick> elements containing your desired nickname(s) in order of preference. Once you get a successful reply to this, you know you're really in the room.

Finally, it appears that you should send another <presence> element once you've joined the room, otherwise the existing members won't receive your initial presence. (This may just be a server bug. In any case it won't do any harm to send it twice.)

The chat's roster.

Every conference of course has its own roster, the set of people currently in the room. This will change over time as people join and leave, and people may send presence changes as well (e.g. they might go idle.) Keep in mind that the JIDs of conference members are made-up proxies invented by the server, to provide anonymity. They all look like the conference's JID with a resource that's a long hex string.

The client is notified of who's in the chat in no fewer than four different ways. First, <presence> elements are sent for each member, both when you initially join the room and when any member changes their presence (updating status or message, or leaving.)

In addition, a "set" <iq> element using the jabber:iq:conference namespace is sent to you whenever the roster changes; its contents list the proxy Jabber ID and current nickname of each member. The top-level query will be a <conference> element with some attributes that describe the conference itself. It will contain <user> elements, one for each current member, with attributes "jid" (the proxy JID) and "name" (the current nickname). You can tell which member is you by comparing nicknames; that's how you can figure out what your proxy JID is.

Furthermore, when a member joins, leaves or changes nicknames, you'll get a similar query that contains a single <user> element. (You can tell a user that's leaving because there will be an extra type="remove" attribute.)

Lastly, the server will send user-visible groupchat messages of the form "foobar has joined." or "foobar has left." You can recognize these messages because the "from" address will be just the room's address with no resource. You may prefer, as I do, to ignore these messages and instead display your own (which can be localized to match the client) in response to the above notifications.

By the way: if you want to find out who a person in a chat room really is, send their proxy JID a "get" <iq> element using the jabber:iq:browse namespace. If browsing is allowed, you'll get a reply containing a <user> element as the query, whose "jid" attribute gives the member's real address. (If the conference room was created with the "privacy" flag, this operation will not be allowed.)

Chat invitations.

A chat invitation is a regular instant message that contains an <x xmlns="jabber:x:conference"> element with a "jid" attribute whose value is the address of the chat room. If the user accepts the invitation, you join the chat room as described above. Unfortunately, there doesn't seem to be a way to notify the members of the chat room that you've declined an invitation.
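Building and recognizing an invitation might look like this in Python. The function names, the sample addresses, and the invitation text are invented; the <x> element and its "jid" attribute are as described above.

```python
import xml.etree.ElementTree as ET

CONFERENCE_X = "{jabber:x:conference}x"


def build_invitation(to_jid, room_jid, text):
    """Serialize an ordinary message carrying a jabber:x:conference <x> element."""
    message = ET.Element("message", {"to": to_jid})
    ET.SubElement(message, "body").text = text
    ET.SubElement(message, "x", {"xmlns": "jabber:x:conference", "jid": room_jid})
    return ET.tostring(message, encoding="unicode")


def invited_room(message_xml):
    """Return the chat room address if the message is an invitation, else None."""
    x = ET.fromstring(message_xml).find(CONFERENCE_X)
    return None if x is None else x.get("jid")


invitation = build_invitation(
    "naomi@jabber.org", "cheatsheet@conference.example.org", "Join us?")
```

Keeping a human-readable <body> means a client that doesn't understand the extension still shows the user something sensible.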

Sending and receiving messages.

To send a message to the conference, send a <message> element with a type of "groupchat" to the chat room's address. To send a private message to a member, send a regular IM (with a type other than "groupchat") to their proxy JID.

Incoming chat messages can be recognized by the type "groupchat". Remove the resource ID from the "from" address and what's left is the chat room ID, which you use to look up the appropriate chat window in your data structures. The entire ID, of course, identifies the sender; it's the proxy ID generated by the chat room. Look that up in your in-memory roster to find the person's nickname (or even their real JID if you previously browsed for that.)
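The address handling can be sketched like so. The function name, the sample proxy JID, and the nicknames dictionary (standing in for the in-memory chat roster) are all made up.

```python
def split_groupchat_sender(from_jid):
    """Split a "from" address into (chat room ID, resource)."""
    room, _, resource = from_jid.partition("/")
    return room, resource


# In-memory chat roster: full proxy JID -> nickname (sample data).
nicknames = {"cheatsheet@conference.example.org/3f9a0c": "Jens"}

sender = "cheatsheet@conference.example.org/3f9a0c"
room, resource = split_groupchat_sender(sender)
nickname = nicknames.get(sender, resource)   # fall back to the raw resource
```

An address with no resource at all yields an empty resource string, which is how the server's own announcements to the room can be told apart from members' messages.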

Jabber supports the IRC convention of allowing "emote" messages. These allow a person to send a message that looks like an action (such as a stage direction) rather than a quote. Your client can recognize an emote message by the prefix "/me "; this should be replaced with the person's [nick]name before displaying it.

For example, I might send the message "/me waves at everyone." Your client would display this as:

Jens waves at everyone.
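
In Python, the substitution could be sketched like this. The function name is hypothetical, and the "Name: text" layout for ordinary messages is only an assumption about how a client might display them.

```python
EMOTE_PREFIX = "/me "

def format_chat_line(nickname, body):
    """Turn a message body into the line shown in the chat window."""
    if body.startswith(EMOTE_PREFIX):
        # Emote: replace the "/me " prefix with the sender's nickname.
        return nickname + " " + body[len(EMOTE_PREFIX):]
    # Ordinary message (display format is an assumption of this sketch).
    return nickname + ": " + body

print(format_chat_line("Jens", "/me waves at everyone."))
# prints "Jens waves at everyone."
```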

Leaving the conference.

To leave the conference you simply send it a <presence type="unavailable"> element.

The server will then unhelpfully send you one final roster update showing that you're leaving; this might confuse your client since the chat room it's sent from is no longer one that it thinks you're in. Just ignore it.
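
One way to make "just ignore it" fall out naturally is to drop the room from your bookkeeping before sending the presence, and then discard any update from a room you no longer track. A minimal Python sketch, in which the JoinedRooms class and its method names are invented for illustration, and the presence is assumed to be addressed to the room itself:

```python
import xml.etree.ElementTree as ET

class JoinedRooms:
    """Tracks which chat rooms the client currently believes it is in."""

    def __init__(self):
        self._rooms = set()

    def join(self, room_jid):
        self._rooms.add(room_jid)

    def leave(self, room_jid):
        """Forget the room, then return the presence stanza that leaves it."""
        self._rooms.discard(room_jid)
        presence = ET.Element("presence", {"to": room_jid, "type": "unavailable"})
        return ET.tostring(presence, encoding="unicode")

    def wants_presence_from(self, from_jid):
        """False for updates from rooms we have left (or never joined)."""
        room = from_jid.partition("/")[0]
        return room in self._rooms
```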


8. File Transfer

Jabber doesn't directly support file transfer, only an "out-of-band" ("OOB") mechanism for sending a URL, with the assumption that the recipient will download the data at that URL to a file. The usage scheme involves the sender's client either uploading the file to a particular FTP/HTTP/WebDAV server, or opening up another port and running a trivial server on it. In either case the URL is sent to the receiver, who then makes a connection (using the protocol specified in the URL) to download the file. The latter mechanism is more efficient but will fail if the sender is behind a firewall or NAT server but the receiver isn't.

OOB isn't just for file transfer; it can be used to transfer any URL, e.g. a link to a favorite website, although if HTML messages are supported it seems more descriptive to send such a link via an <A> tag in the message body.

There are two similar ways to transmit the URL. The first is in a <message> element, as a nested <x> element using the jabber:x:oob namespace. [JPG p.92, JPO 1.6.23] The second is in an <iq> query using the jabber:iq:oob namespace. [JPG p.53, JPO 1.6.9] The latter form allows acknowledgement via an IQ reply.
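
The two forms can be sketched side by side in Python. The function names and addresses are hypothetical. The <url> and <desc> child elements are my understanding of how both namespaces carry the link, and sending the <iq> form as type "set" is likewise an assumption, so check both against the JPG pages cited above before relying on this.

```python
import xml.etree.ElementTree as ET

def _add_oob_fields(parent, url, description):
    # Assumption: both OOB namespaces use <url> and an optional <desc>.
    ET.SubElement(parent, "url").text = url
    if description:
        ET.SubElement(parent, "desc").text = description

def build_oob_message(to_jid, url, description=""):
    """First form: an <x> element nested inside an ordinary <message>."""
    message = ET.Element("message", {"to": to_jid})
    ET.SubElement(message, "body").text = description or url
    x = ET.SubElement(message, "x", {"xmlns": "jabber:x:oob"})
    _add_oob_fields(x, url, description)
    return ET.tostring(message, encoding="unicode")

def build_oob_iq(to_jid, request_id, url, description=""):
    """Second form: an <iq> query, which the receiver can acknowledge."""
    iq = ET.Element("iq", {"type": "set", "to": to_jid, "id": request_id})
    query = ET.SubElement(iq, "query", {"xmlns": "jabber:iq:oob"})
    _add_oob_fields(query, url, description)
    return ET.tostring(iq, encoding="unicode")
```

With the <iq> form, the sender can match the reply's "id" against request_id to learn whether the receiver accepted the transfer.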


9. Registering New Accounts

The Jabber protocol allows clients to register new accounts on servers without having to go through a Web interface or beseech a sysadmin. (Of course, any particular server can disallow this if it wants to.) Registration uses a variant form of login.

To register, create a connection to the server and open a <stream> element just as in a normal login. But then send an <iq type="get"> element using the jabber:iq:register namespace. [JPG pp.57-62]

The server might reply with an error if it's not allowing registration. But normally it will send a reply containing a number of sub-elements. These help define the GUI of a dialog box to be presented to the user to be filled out:

  • <key>, if present, contains a magic authentication string that needs to be sent back to the server in subsequent registration commands.
  • <instructions> contains text that should be shown to the user in the registration window; it presumably contains instructions on how to fill out any non-obvious parts of the form, or perhaps general information about the server. Hopefully it's in a language the user can read, because you have no way to request any particular language.
  • <username>, <nick>, <password>, <name>, <first>, <last>, <email>, <address>, <city>, <state>, <zip>, <phone>, <url>, <date>, <misc>, <text> all represent fields in a form that needs to be filled out by the user. They'll initially have no content. Not all of them will necessarily be present (although <username> and <password> must be). Most of these are self-explanatory, although I'm not really clear on what the last four are for. You should map these element names to more verbose, localized labels to display in the dialog.
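
Requesting the form and pulling apart the server's reply could be sketched as follows in Python. The function names and the sample reply in the usage note are made up for illustration; the element names are the ones listed above.

```python
import xml.etree.ElementTree as ET

REGISTER_NS = "jabber:iq:register"

def build_registration_request(request_id):
    """The initial <iq type="get"> that asks the server for its form."""
    iq = ET.Element("iq", {"type": "get", "id": request_id})
    ET.SubElement(iq, "query", {"xmlns": REGISTER_NS})
    return ET.tostring(iq, encoding="unicode")

def parse_registration_form(reply_xml):
    """Return (key, instructions, field names), or None if the server
    did not send back a registration form."""
    reply = ET.fromstring(reply_xml)
    query = reply.find("{%s}query" % REGISTER_NS)
    if reply.get("type") != "result" or query is None:
        return None
    key, instructions, fields = None, "", []
    for child in query:
        name = child.tag.rsplit("}", 1)[-1]  # drop the "{namespace}" prefix
        if name == "key":
            key = child.text
        elif name == "instructions":
            instructions = child.text or ""
        else:
            fields.append(name)  # one form field per remaining element
    return key, instructions, fields
```

The returned field names keep the server's order, so the dialog can lay out its text boxes in the same sequence.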

Once the user has filled out all of these fields, you send back an <iq type="set"> element whose query contains all the form field elements, with their contents being the text entered by the user. It also needs to contain a copy of the <key> element, if any, received from the server. Then you wait for a reply.
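
A minimal Python sketch of building that reply; the function name, the "values" dictionary of field name to user-entered text, and the example data are all hypothetical.

```python
import xml.etree.ElementTree as ET

REGISTER_NS = "jabber:iq:register"

def build_registration_submit(request_id, values, key=None):
    """Build the <iq type="set"> carrying the filled-out form.

    values: dict mapping field element names (e.g. "username") to the
    text the user typed. key: the server's <key> string, if it sent one.
    """
    iq = ET.Element("iq", {"type": "set", "id": request_id})
    query = ET.SubElement(iq, "query", {"xmlns": REGISTER_NS})
    for name, text in values.items():
        ET.SubElement(query, name).text = text
    if key is not None:
        ET.SubElement(query, "key").text = key  # echo the key back unchanged
    return ET.tostring(iq, encoding="unicode")
```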

If the registration succeeds, you'll get back a reply element with an empty query. Hooray! You still need to close the connection, though: it can't be re-used to log in using the new account, so open a new one for that.

If the registration fails, you'll get back an error element. This is not necessarily fatal: if the error code is 409 (Conflict), this means that the desired username is not available, so you should put up the dialog box again and ask the user to choose a different username, then send the updated registration query.
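
Deciding what to do with the server's answer might be sketched like this in Python. The function name and the three outcome strings are invented for the example, and it assumes the error arrives as an <error> child of the <iq> with a numeric "code" attribute.

```python
import xml.etree.ElementTree as ET

def registration_outcome(reply_xml):
    """Classify the reply to a registration "set" request.

    Returns "registered", "choose_another_username" (error 409), or
    "failed" for any other error. These labels are hypothetical.
    """
    reply = ET.fromstring(reply_xml)
    if reply.get("type") == "result":
        return "registered"
    for child in reply:
        name = child.tag.rsplit("}", 1)[-1]  # ignore any namespace prefix
        if name == "error" and child.get("code") == "409":
            return "choose_another_username"
    return "failed"
```

On "choose_another_username" the client can reopen the dialog with the other fields still filled in, so the user only has to retype the name.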

Updating registration.

The JPG says that an <iq type="set"> element can be used to update the registration information (such as password or e-mail address) of an existing account. But I'm not sure how this works; I suspect that, to be properly authenticated, it has to be sent after you're already logged in.

Canceling an account.

You can cancel an account by sending an <iq type="get"> element to obtain the server's <key>, then sending an <iq type="set"> containing a <remove> sub-element. Again, I think you have to be logged in already for this to work.
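
The second step could be sketched in Python as below. The function name is hypothetical, and like the paragraph above this is untested guesswork about the exact shape: it assumes the <remove> element sits inside a jabber:iq:register query next to the echoed <key>.

```python
import xml.etree.ElementTree as ET

REGISTER_NS = "jabber:iq:register"

def build_cancel_request(request_id, key=None):
    """Build the <iq type="set"> that asks the server to remove the account.

    key: the <key> string obtained from the preceding "get", if any.
    """
    iq = ET.Element("iq", {"type": "set", "id": request_id})
    query = ET.SubElement(iq, "query", {"xmlns": REGISTER_NS})
    ET.SubElement(query, "remove")
    if key is not None:
        ET.SubElement(query, "key").text = key
    return ET.tostring(iq, encoding="unicode")
```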


Appendix 1. References

Jabber protocol documents.

The Jabber Protocol Overview, by Peter Saint-Andre.

The Jabber Programmer's Guide, by Thomas Muldowney and Eliot Landrum. (Page numbers referred to in this document are from the 1.0.3 release.)

Generic Conferencing protocol draft, by Jeremie Miller.

(For more Jabber documentation, or to get these documents in PDF form, see the Jabber.org documentation website.)

Other useful references.

UNIX Network Programming, Volume 1 by W. Richard Stevens. Everything you wanted to know about sockets, including the keepalive option.

HTML & XHTML: The Definitive Guide by Chuck Musciano and Bill Kennedy (O'Reilly). Chapter 16 describes in great detail how XHTML syntax differs from traditional HTML, which is essential knowledge if you're trying to compose HTML message bodies.

Learning XML by Erik T. Ray (O'Reilly). Chapter 2, available online in its entirety, has a great overview of XML markup concepts.

I'd also like to thank David Waite, Dave Smith, Oliver Jones, Todd Bradley, and others who have helpfully answered questions on the invaluable jdev mailing list.


Appendix 2. Copyright & License

This document is Copyright © 2001 by Jens Alfke. All Rights Reserved.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. You may obtain a copy of the GNU Free Documentation License from the Free Software Foundation by visiting their Web site (http://www.fsf.org) or by writing to: The Free Software Foundation, Inc.; 59 Temple Place - Suite 330; Boston, MA 02111-1307; USA.
