The Jabber Client Developer's Cheat Sheet
Jens Alfke
rev. 4 :: 25 May 2001
(Copyright & license at end of
document)
Contents
- Introduction
- Jabber Sessions
- Logging In & Out
- Declaring & Receiving Presence
- Managing Your Roster
- Sending & Receiving Messages
- Chatting
- File Transfer
- Registering New Accounts
1. Introduction
This is an informal tutorial guide for the perplexed developer
trying to implement a Jabber client from scratch. Two weeks ago that
developer was me; now, after many experiments and confused e-mails to
jdev, I'm dropping pearls of my great wisdom to help you out. You're
welcome!
Your need for this document will be less if you use one of the
many available Jabber client
libraries like libjabber or JabberCOM,
which do a lot of this work for you. However, the libraries typically
don't do all of the work of managing a real Jabber session
(libjabber in particular is pretty low-level) so you will
still need to do part of the work described here.
This document is a commentary on, not a substitute for, the
Jabber
Protocol Overview [JPO] or Jabber
Programmer's Guide [JPG]. Since those documents are
(in their present incarnations) arranged more as references than
introductions, the tutorial overview here should help you in
navigating them. I've included pointers to relevant sections in the
official documentation, to which you can refer for more details. (See
also the References section at the end of
this document.)
In particular, I'm going to assume you've already read enough of
the official documentation to know how <iq> tags work,
since you'll need to send and receive one to be able to log in. See
[JPO 1.5].
There are still (as of revision 2) several blank spots. There's a
placeholder section on chat that I hope to fill in soon (once I
implement chat in my own client!) There's no mention at all of
transports and bridging to other instant messaging protocols;
frankly, I'm unlikely to address this myself since I'm not especially
interested in that area.
2. Jabber Sessions
A Jabber session runs as a single continuous TCP connection on
port 5222, or on 5223 if SSL encryption is used. (Even if your client
supports SSL, the user should be able to disable it, since not all
Jabber servers support it.) It remains open for as long as the client
is logged in.
XML elements as commands.
The data stream sent in each direction over the connection forms a
continuous XML document whose outermost element is
<stream>, although the document of course isn't
finished until each side logs out by sending the final closing
</stream> tag. Within the outer
<stream> elements, sub-elements are sent in both
directions as commands, containing their own attributes and nested
elements to indicate parameters.
The two sides of the connection are asynchronous. Unlike
older protocols like POP, after you send a command you don't have to
wait for a reply from the server before sending another. Moreover,
the server may send you notifications at any time (for instance, if
one of your buddies logs in or out.) You should always be prepared to
handle any incoming command.
Then how do you identify the reply to a particular command? The
"id" attribute is used to associate commands with replies. A
command you send should include such an attribute with a reasonably
unique value; the response from the server will contain a matching
attribute with the same value. Internally you can use a hashtable to
remember the IDs of outstanding requests and associate them with any
necessary data. (A particularly flexible technique is to store a
pointer to a callback function or method that you'll invoke when the
reply arrives. This really helps simplify your flow of control.)
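Here's a minimal sketch, in Python, of the hashtable-of-callbacks idea described above. The class and method names (PendingRequests, register, dispatch) and the "req" ID prefix are made up for illustration; they aren't part of any Jabber library.

```python
import itertools


class PendingRequests:
    """Maps the "id" attribute of each outgoing command to a reply callback."""

    def __init__(self, prefix="req"):
        self._prefix = prefix
        self._counter = itertools.count(1)
        self._callbacks = {}

    def register(self, callback):
        """Reserve a fresh ID and remember the callback to run on its reply."""
        request_id = f"{self._prefix}{next(self._counter)}"
        self._callbacks[request_id] = callback
        return request_id

    def dispatch(self, reply_id, reply):
        """Run and forget the callback for reply_id; return False if unknown."""
        callback = self._callbacks.pop(reply_id, None)
        if callback is None:
            return False
        callback(reply)
        return True
```

You'd call register() when writing a command, put the returned value in its "id" attribute, and call dispatch() from your parser whenever an element with an "id" arrives. A False result means the element is a server notification rather than a reply to something you sent.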
Parsing XML.
The most complex part of a Jabber implementation is parsing XML.
Fortunately, there is no need for you to code it! XML parsing is a
very standard task and there are many excellent general-purpose
libraries you can use as-is. Personally, for C/C++ based clients I
recommend expat, which I've used in my own client.
Expat was written by one of the designers of XML; it's widely
used, fast and compact, open-source, and comes under a very liberal
(MIT) license that allows it to be used even in commercial
closed-source software. (C++ developers preferring an object-based
API might consider Apache.org's Xerces.)
Other excellent
libraries exist for C/C++ and for other popular languages like
Java, Perl and Python. Note that you need a parser that can read from
a data stream and parse it incrementally as the data arrives; a
parser that has to be fed an entire XML document in one chunk won't
work for Jabber.
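To make "incremental" concrete, here's a rough sketch using Python's standard binding to expat (xml.parsers.expat). The StreamParser class and its events list are invented for this example; a real client would dispatch to its own handlers instead of collecting tuples.

```python
import xml.parsers.expat


class StreamParser:
    """Feeds socket data to expat chunk by chunk and tracks nesting depth."""

    def __init__(self):
        self.depth = 0
        self.events = []
        self._parser = xml.parsers.expat.ParserCreate()
        self._parser.StartElementHandler = self._start
        self._parser.EndElementHandler = self._end

    def _start(self, name, attrs):
        self.depth += 1
        self.events.append(("start", self.depth, name, dict(attrs)))

    def _end(self, name):
        self.events.append(("end", self.depth, name))
        self.depth -= 1

    def feed(self, data):
        # The second argument (False) tells expat that more data will follow,
        # so the unfinished <stream> document is not treated as an error.
        self._parser.Parse(data, False)
```

With this arrangement the outer stream element sits at depth 1, and each complete command is whatever opens and closes at depth 2.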
Writing XML.
Some libraries, like expat, only handle parsing XML, not
generating it. Fortunately of course the latter is a lot easier,
although you'll still find that it makes your code a lot cleaner and
more robust if you abstract this out into a simple API like
writeOpenTag(), writeText(),
writeCloseTag() instead of littering your source with
zillions of string literals filled with angle brackets! This is
particularly valuable since XML syntax, unlike HTML, is extremely
strict, and the server's parser may reject your input and cause you
to be disconnected if you forget a single quote character.
A couple of XML reminders: Tag and attribute names are case
sensitive. All attribute values need to be enclosed in quotes.
Attribute values, and text content outside of tags, need to escape
the magic XML metacharacters, which are:
< > & ' "
These need to be replaced with XML entity references which begin
with "&" and end with ";". You can use the
Unicode value in hex, in the format
"&#xnnnn;", or the special entities:
&lt; &gt; &amp; &apos; &quot;
Also remember that non-ASCII characters need to use the right
encoding. Your stream's <?xml> header should specify
your platform's native encoding, so you don't have to do any work to
translate non-ASCII characters.
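The writer API and the escaping rules above might be sketched like this in Python. XmlWriter and its method names are hypothetical, and the address in the usage note is a placeholder; the escaping itself is delegated to the standard library's xml.sax.saxutils helpers rather than hand-rolled.

```python
from xml.sax.saxutils import escape, quoteattr


class XmlWriter:
    """Collects well-formed XML fragments so callers never paste raw brackets."""

    def __init__(self):
        self.parts = []

    def write_open_tag(self, name, attrs=None):
        # quoteattr() adds the surrounding quotes and escapes the value.
        rendered = "".join(
            f" {key}={quoteattr(value)}" for key, value in (attrs or {}).items()
        )
        self.parts.append(f"<{name}{rendered}>")

    def write_text(self, text):
        # escape() replaces &, < and > with entity references.
        self.parts.append(escape(text))

    def write_close_tag(self, name):
        self.parts.append(f"</{name}>")

    def getvalue(self):
        return "".join(self.parts)
```

For example, opening a message tag with {"to": "user@example.com"}, writing a body containing "1 < 2 & 3 > 2", and closing both tags yields a string in which the body text reads "1 &lt; 2 &amp; 3 &gt; 2". The result is a Python string, so it still has to be encoded to bytes in whatever encoding your stream header declares before it goes out on the socket.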
For lots more info on XML syntax, see the XML books in the
References section.
3. Logging In & Out
This section assumes the user already has an account on the
server. Jabber can also be used to register new accounts; see
section 9.
Opening the session.
Once you've opened the socket on port 5222 (or 5223 for SSL) you
need to send a standard XML header and open the
<stream> element that encloses the entire session:
<?xml version="1.0" encoding="UTF-8" ?>
<stream:stream
to="jabber.org"
xmlns="jabber:client"
xmlns:stream="http://etherx.jabber.org/streams">
That's enough to wake up the server. Get the input stream of the
socket hooked up to your XML parser. The server will send something
like this:
<?xml version="1.0" encoding="UTF-8" ?>
<stream:stream
from="jabber.org"
id="39ABA7D2"
xmlns="jabber:client"
xmlns:stream="http://etherx.jabber.org/streams">
Your XML parser will call you with great excitement to tell you
that it found the opening tag of a <stream> element.
Now it's your turn to log in...
Logging in.
The server is now waiting for you to authenticate yourself; it
won't let you do anything else until you do. You use an
<iq> query [JPO 1.5] to send your credentials,
and the server will send a reply to this query to tell you whether
you were successful. [JPO 1.5.3.3]
The query uses the jabber:iq:auth namespace and includes
<username> and <resource> elements to
identify the user's Jabber ID and the resource name of this computer.
[JPO 1.6.3] Many clients seem to use the client name (e.g.
"WinJab") as the resource; but this isn't very informative for
buddies and isn't conducive to unique resource names. I recommend
prompting users on their first login to choose a descriptive name for
this computer or location ("Home", "PowerBook", "Naomi's Office") and
storing it as a local preference.
The query also has to include the authentication payload, of
course. You can either send the raw password in a
<password> element, which is obviously discouraged, or
more securely as an encoded digest in a
<digest> element. The digest is computed by
concatenating the session ID (sent by the server in the "id"
attribute of its <stream> tag) with the password,
running this string through the SHA-1 algorithm to generate a 20-byte
hash, then converting the hash into a 40-character hex
representation. [Most Unix libraries include secure hash
functions, and you can find a GPL'd C implementation in the
libjabber sources.]
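The digest recipe above comes out quite short in Python. This is a sketch under a few assumptions that the text doesn't spell out: that the password is encoded as UTF-8 before hashing, that the hex digits are lowercase, and that the login query is sent as an <iq type="set">. The function names are invented for the example.

```python
import hashlib
from xml.sax.saxutils import escape, quoteattr


def auth_digest(session_id, password):
    """SHA-1 of session ID + password, as a 40-character hex string."""
    # Assumption: the concatenated string is hashed as UTF-8 bytes.
    return hashlib.sha1((session_id + password).encode("utf-8")).hexdigest()


def build_auth_query(request_id, username, resource, session_id, password):
    """Build a jabber:iq:auth login query carrying a digest, not the password."""
    digest = auth_digest(session_id, password)
    return (
        f'<iq type="set" id={quoteattr(request_id)}>'
        '<query xmlns="jabber:iq:auth">'
        f"<username>{escape(username)}</username>"
        f"<resource>{escape(resource)}</resource>"
        f"<digest>{digest}</digest>"
        "</query></iq>"
    )
```

The session_id argument is the "id" attribute from the server's opening stream tag ("39ABA7D2" in the example earlier), so the digest is different for every session.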
Having sent the login query, you need to wait for the reply, which
will have the same ID as the query you sent. If you get an <iq
type="result"> then all is well and you are now logged in. If
the reply type is "error", the login failed and the reply
will contain an explanatory error code and message. In case of
failure you should close the session immediately (see next
section.)
Now what?
Once logged in, your next tasks are to establish
the user's presence and sync up the roster.
You might also want to:
- Send a jabber:iq:private query to retrieve custom
preferences from the server, if you store any there. [JPG
pp.55-56]
- Send a jabber:x:autoupdate query to see whether a
newer version of your client is available [JPG
pp.35-36]
Then you can relax and wait for user commands or server
notifications.
Closing the session, or being disconnected.
To log out, simply close your outer <stream>
element by writing "</stream:stream>", then close the
socket. If some fatal error on your side (like an XML parse error)
forces you to disconnect, but the socket is still OK, it's polite to
first write an <error> element containing an error
code and/or message.
In some cases the socket may be closed unexpectedly. Obviously
this can happen if the Jabber server process crashes. The server may
also pre-emptively log you out if a fatal error occurs; the most
common fatal error, at least during development, being that it
received incorrectly structured XML from you. In this case the server
will send a top-level <error> element containing a
message (but usually not an error code, for some reason) followed by
the </stream>. You should close the socket and tell
the user they've been disconnected.
It's important to note that normally a TCP socket cannot
distinguish between an intact but idle connection vs. a broken
connection where the other computer has crashed or network
connectivity has been broken. It'll just never receive any more data.
This is a problem for a real-time protocol like Jabber: you may
think you're still online and your buddies are still
available, but until you try to send some data you won't know that
you're disconnected. Worse, some firewalls and routers get aggressive
about scavenging idle connections and will disconnect you from the
Jabber server after several minutes of inactivity!
There are two ways to deal with this. The simple one is to
send a bit of no-op XML data once a minute or so; a single newline
will suffice. If your computer never gets an acknowledgement from the
server, the OS will detect that the connection is broken and indicate
an error. The slightly more complex solution is to use your OS's
networking API to set the "keep-alive" option on the socket
[Rogers, p.185-186] but change the timeout period from the
default 2 hours down to a few minutes. This may or may not be
possible depending on your operating system; in the Unix world this
feature was added in Posix.1g. [Rogers, p.201]
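As a rough illustration of the keep-alive approach, here's how it might look with Python's socket module. SO_KEEPALIVE is widely available, but the tuning options (TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) exist only on some platforms, so the sketch checks for each one before using it. The function name and the default timings are arbitrary choices for the example.

```python
import socket


def enable_keepalive(sock, idle_seconds=120, interval_seconds=30, probes=4):
    """Turn on TCP keep-alive and, where supported, shorten its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants are platform-specific; skip any that are missing
    # and fall back to the operating system's default timing.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
```

On a platform without the tuning options you still get keep-alive probes, just at the slow default rate, so the once-a-minute newline remains a sensible fallback.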
4. Declaring & Receiving Presence
A big part of being an IM client is informing the server (and thus
the user's buddies) of the user's presence or online status.
You need to do this right after login and whenever that status
changes: the user might choose a different status (busy, away,
unavailable...) from a menu or type in a new status message; or user
inactivity might cause the status to change to away or extended-away
(xa).
To declare a change in status you simply send a single
<presence> element [JPO 1.4] whose type is
either available or unavailable. You don't need to
add "from" or "to" attributes; the server will add
these when it relays the element to everyone whose buddy list you're
on. [JPO 1.4.1.1, 1.4.1.5] The unavailable status is
a handy way to make the user "invisible": it will appear to buddies
as though the user is not logged in at all! The server will take care
of relaying the status to all logged-in users who are subscribed to
your user's status.
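A presence declaration might be built like this. The sketch assumes, beyond what's stated above, that the finer-grained state (such as "away" or "xa") travels in a <show> child element and the free-text message in a <status> child; check [JPO 1.4] for the exact element names. The function name is made up.

```python
import xml.etree.ElementTree as ET


def build_presence(available=True, show=None, status=None):
    """Serialize a <presence> element with no "from" or "to" attributes."""
    presence = ET.Element("presence")
    presence.set("type", "available" if available else "unavailable")
    if show:
        # Assumed child element for states such as "away" or "xa".
        ET.SubElement(presence, "show").text = show
    if status:
        # Assumed child element for the user's status message.
        ET.SubElement(presence, "status").text = status
    return ET.tostring(presence, encoding="unicode")
```

Calling build_presence(True, "away", "Out to lunch") produces one self-contained element ready to write to the stream; the server fills in the addressing when it relays it.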
An <x> element using the jabber:x:signed
namespace may be added to sign presence information. [JPO
1.6.25] I cannot find any details on how the signature is
generated.
Receiving the presence status of your buddies is precisely the
opposite: you receive the <presence> elements they
send. When you log in you will receive one such element for each
buddy, to get you up to date on their current status; this element
may contain an <x> tag using the
jabber:x:delay namespace to let you know the time at which
the buddy's status last changed. [JPO 1.6.18, JPG p.89] Any
future status changes while you're online will be sent to you
immediately.
5. Managing Your Roster
Roster management is one of the most confusing areas of the
protocol, or at least of its current documentation. Here's an
overview.
How the roster works.
The roster is the list of Jabber entities with whom the
user has a presence relationship. This includes people to whose
status the user is subscribed ("buddies"), people who are subscribed
to the user's status ("watchers"), and people to whom the user wants
to subscribe but who haven't yet approved the request. The roster is
thus the model of the GUI "buddy list".
The Jabber server stores the user's roster and notifies all
logged-in clients whenever it changes: when the user adds or removes
a buddy, when another user adds or removes the user as a buddy, and
when a potential buddy approves or rejects a subscription request.
This is known as a roster push. The change notifications are
sent as <iq> elements using the
jabber:iq:roster namespace. The client can also manually
request a complete copy of the roster: it's necessary to do this just
after logging in to detect changes that may have been made since the
last logout at this computer.
[JPO 1.6.12]
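Requesting the full roster and reading the reply could look roughly like this. The sketch assumes the reply's query contains <item> elements with "jid", "name" and "subscription" attributes and optional <group> children, which is my reading of [JPO 1.6.12]. It also parses a standalone copy of the <iq> element with ElementTree for brevity; in a real client the same data would arrive through your streaming parser. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

ROSTER_NS = "jabber:iq:roster"


def build_roster_request(request_id):
    """Serialize an <iq type="get"> asking for the complete roster."""
    iq = ET.Element("iq", {"type": "get", "id": request_id})
    ET.SubElement(iq, "query", {"xmlns": ROSTER_NS})
    return ET.tostring(iq, encoding="unicode")


def parse_roster_items(iq_xml):
    """Return {jid: details} for every item in a roster reply or push."""
    iq = ET.fromstring(iq_xml)
    items = {}
    for item in iq.iterfind(f"{{{ROSTER_NS}}}query/{{{ROSTER_NS}}}item"):
        items[item.get("jid")] = {
            "name": item.get("name"),
            "subscription": item.get("subscription"),
            "groups": [g.text for g in item.findall(f"{{{ROSTER_NS}}}group")],
        }
    return items
```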
Subscribing & unsubscribing buddies.
Confusingly enough, adding and removing buddies is not done
via roster manipulation but via <presence> elements.
Here the type is subscribe or unsubscribe
and the request is sent to the buddy's address. Confirmation or
rejection is received as a reply <presence> element
(with the same ID) with a type of subscribed or
unsubscribed.
Conversely, you'll also deal with other entities wanting to
subscribe to your presence. In this case it's exactly the
opposite: you receive the subscribe or unsubscribe
and (after consulting with the user) reply with subscribed
or unsubscribed.
[JPO 1.4.1.3 -- 1.4.1.7]
In all these cases, since your roster is changed, the server will
also send another roster push to tell you about the roster element to
change.
Manually updating the roster.
It's possible for you to update the server-side roster by sending
<iq type="set"> elements. You would do this not to add
or remove buddies (see the previous section for how to do that) but
to add or change metadata associated with a buddy, such as their
nickname or group[s]. [JPO 1.6.12]
Actually, it is possible to add people to the roster this
way, but it won't subscribe you to their presence. They'll just be
there passively, as in an e-mail address book. You may or may not
want to support this in your UI; it doesn't seem especially useful to
me, but it might be handy if there are people the user wants to send
messages to but doesn't need to know their online status.
The documentation says it's also possible to remove people
directly from the roster by setting the "subscription"
attribute to "remove". [JPG p.65] This is the only
way to remove an item added without a subscription (as in the
previous paragraph); but I'm not sure what happens if you remove a
subscribed buddy this way. Presumably it should unsubscribe you from
their presence notifications just as if you had sent a
<presence type="unsubscribe"> element.
Implementation tip.
Since the server will notify you of all changes in the roster, you
can drive the management of the GUI buddy list entirely from these
pushes. For example, when the user wants to add a buddy, you send out
a <presence type="subscribe"> element but don't update
the buddy list yet; you'll immediately receive a roster push from the
server telling you about the change and should update the buddy list
when you receive it. By using roster pushes to manage the buddy list,
you ensure that the buddy list stays up to date whether it's you or
another simultaneously logged-in client making the changes.
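Driving the buddy list from roster pushes might be sketched as follows. The buddy list here is just a dictionary keyed by Jabber ID, and apply_roster_push is an invented name. The sketch assumes each pushed <item> carries "jid", "name" and "subscription" attributes, and that a subscription value of "remove" means the entry is gone.

```python
import xml.etree.ElementTree as ET

ROSTER_NS = "jabber:iq:roster"


def apply_roster_push(buddy_list, iq_xml):
    """Update buddy_list in place from one push; return the JIDs touched."""
    iq = ET.fromstring(iq_xml)
    changed = []
    for item in iq.iterfind(f"{{{ROSTER_NS}}}query/{{{ROSTER_NS}}}item"):
        jid = item.get("jid")
        if item.get("subscription") == "remove":
            buddy_list.pop(jid, None)
        else:
            buddy_list[jid] = {
                "name": item.get("name"),
                "subscription": item.get("subscription"),
            }
        changed.append(jid)
    return changed
```

The returned list tells the GUI which rows to redraw, and the same code path runs whether the change came from this client or from another one logged in elsewhere.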
More roster info.
You may want to display more buddy information in your UI than
simply the Jabber ID, presence state and status message. As described
above, a full name/nickname can be associated with a buddy in the
roster.
A whole lot more information about a buddy can be obtained by
retrieving their vCard via an
<iq> query, if they've stored one on the server
[JPO 1.6.26]. vCards are the Internet equivalent of business
or Rolodex cards and can contain any imaginable sort of personal and
contact information such as photos, birthdays, phone numbers...
Still more generally, the Jabber server as of version 1.4 allows
users to store arbitrary XML data tagged with any namespace,
accessible to anyone who queries their account using the same
namespace. [JPO 1.6.10] I'm not aware of any concrete usage
of this, but it has great potential.
6. Sending & Receiving Messages
Compared to the roster, one-to-one messaging is straightforward,
although there are a lot of optional bells and whistles. You send a
<message> element [JPO 1.3] with a
"to" attribute identifying the recipient; conversely, you
receive <message> elements with a "from"
attribute identifying the sender.
Anyone can send anyone else a message; you don't have to have
permission, as you do to view their presence. Since instant messages
are sometimes used for spamming or harassment, you should offer your
user the ability to block incoming messages from specific addresses,
or conversely to allow only messages from people on their buddy list.
(This is a good setting to store as a server-side preference using
the jabber:iq:private namespace.)
Message attributes.
Any message you receive will include a "from"
attribute giving the address of the sender. You can be
reasonably sure of the authenticity, more so than with e-mail,
because this attribute is set by the sender's Jabber server and
cannot be spoofed by the sender. Of course, it could be spoofed by
their server, but that's much less likely, especially if the server
has a well-known/trusted address.
A message may contain a <subject> element
containing the subject of the message. Of course, there's no
guarantee that the receiver's UI will display the subject. [JPO
1.3.3.4]
A message may contain a timestamp in the form of an
<x> element using the jabber:x:delay
namespace. (Typically the server will attach this element if the
message was stored and forwarded, i.e. if you were not online at the
time it was sent.) [JPO 1.6.18, JPG p.89] If the message has
no timestamp, you can assume it was sent no more than a few seconds
ago, subject only to network and server lag time.
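Reading the timestamp might look like the sketch below. It rests on an assumption that isn't stated above and should be checked against [JPO 1.6.18]: that the time is carried in a "stamp" attribute formatted as CCYYMMDDThh:mm:ss in UTC. The function name is invented, and a standalone copy of the message is parsed with ElementTree for brevity.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

DELAY_NS = "jabber:x:delay"


def message_timestamp(message_xml, now=None):
    """Return the delay stamp if present, otherwise the current time."""
    message = ET.fromstring(message_xml)
    delay = message.find(f"{{{DELAY_NS}}}x")
    if delay is None or delay.get("stamp") is None:
        # No timestamp: treat the message as having just been sent.
        return now or datetime.now(timezone.utc)
    # Assumed format: CCYYMMDDThh:mm:ss, interpreted as UTC.
    stamp = datetime.strptime(delay.get("stamp"), "%Y%m%dT%H:%M:%S")
    return stamp.replace(tzinfo=timezone.utc)
```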
The jabber:x:envelope namespace provides support for more
addressing information, much like traditional e-mail. This includes
names for the sender and receiver, lists of multiple receivers or
"cc" recipients, and indications that a message was forwarded.
[JPO 1.6.20]
The message body.
A message always contains a <body> element
containing the plain-text body of the message. [JPO
1.3.3.1]
It may also optionally contain an <html> element
containing the same text in HTML format. [JPO 1.3.3.3] But be
careful with HTML: it needs to conform to the XHTML
Basic dialect. Firstly, it needs to be syntactically correct
XML, which means among other things that elements need to be strictly
nested (no "<b><i>wow!</b></i>") and
that tags with no separate close tag need to end with a "/"
(for example, <hr/>.) There are many such restrictions
that will come as quite a shock to traditional HTML authors. (The
O'Reilly book HTML & XHTML has a great overview of the
differences.) Be careful: many libraries that generate HTML from
styled text don't conform to these rules, and sending bad XML will
cause the server to disconnect you! (If you're writing your own code
to convert styled text to XHTML, you'll find that making sure all the
style tags nest is quite tricky...)
You'll also find that XHTML Basic, which was developed with
typographically limited clients like cellphones in mind, is missing a
number of classic HTML tags we take for granted, like
<b>, <i> and <font>.
Some of these have higher-level equivalents, like
<strong> for <b>, but in general, to
indicate styles and colors you should use "style" attributes
containing CSS (Cascading Style Sheet) commands. However, since it
doesn't appear that existing Jabber clients all adhere to these
rules, your parsing code should be prepared to recognize old-style
formatting commands like <font> as well.
Jabber supports encrypted messages. An encrypted message body is
contained in an <x> element using the
jabber:x:encrypted namespace. [JPO 1.6.19] The
content is the encrypted message data. The documentation is vague,
but apparently the message is encrypted using PGP or GPG and encoded
using Base64. It's not clear whether it's possible to use any other
encryption systems. If the message text is encrypted, the
<body> element should contain an explanatory message
like "(The text of this message is encrypted)" for the benefit of
users whose client doesn't support encryption.
There is a jabber:x:signed namespace, which the
documentation [JPO 1.6.25] says is used to sign presence
information, but it doesn't say whether it can also be used to sign a
message body (or describe exactly how the signature is
generated.)
Other types of content.
Unlike MIME, Jabber messages have no standard mechanism for
including separate pieces of content like graphics or sounds. This
means there's no good way to include an image in an HTML message
unless the image can be stored on some other server where the client
can follow a URL to reach it.
Messages can be used to send files, but again, the file's data
can't be enclosed in the <message>. Instead, the
message contains a URL that points to a location from which the file
can be downloaded. See section 7 of this document for details.
Roster items (buddy info) can be sent in a message using the
jabber:x:roster namespace. [JPG pp.94-96]
A reference to a chat room can be sent in a message using the
jabber:x:conference namespace. [JPO 1.6.17] This is
how you invite someone to a chat.
Message types and threads.
The sender of a message can use the "type" attribute to
send along a hint as to how the message should be displayed. If no
"type" attribute is given, the message is standalone and
should be displayed in a separate window. Type "chat"
indicates that the message should be displayed using a one-to-one
chat interface, appended to the same chat display as previous
messages in the thread. Type "groupchat" is similar but is
used for messages sent from a chat room. (Of course, these are only
recommendations; the receiver's client has final control over the
message UI.) [JPO 1.3.1.1 -- 1.3.1.4]
Finally type "error" indicates an error delivering a
message you sent (most likely, it was sent to a nonexistent Jabber
address.) Such a reply will contain a standard <error>
element describing the problem. It appears from the documentation
that the error reply, perhaps confusingly, contains the body of the
original message. [JPO 1.3.1.3]
To assist clients in displaying messages in a chat user interface,
messages should contain a <thread> element whose
content uniquely identifies the sequence of messages. The clients can
then map the thread to a particular window. The client sending the
first message should make up a hopefully-unique thread ID (the JPO
suggests hashing together the sender's Jabber ID and the current
time), and all further replies in the same thread should contain the
same thread ID. [JPO 1.3.3.5]
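One way to make up a thread ID along the lines the JPO suggests is sketched here. The choice of SHA-1 and of a hex string as the result are mine, not requirements; any value that's unlikely to collide would do. The function name is hypothetical.

```python
import hashlib
import time


def make_thread_id(sender_jid, now=None):
    """Hash the sender's Jabber ID together with the current time."""
    timestamp = time.time() if now is None else now
    seed = f"{sender_jid}{timestamp!r}".encode("utf-8")
    return hashlib.sha1(seed).hexdigest()
```

Only the client starting the conversation calls this; every reply simply copies the thread ID it received.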
Message events (aka return receipts).
The sender of a message can use the jabber:x:events
namespace [JPO 1.6.21] to request that notifications be sent
by the receiver when the message is received or displayed to the
user, or when the recipient starts composing a reply. It is of course
optional for the recipient to send these notifications, and I believe
this feature is fairly new, so older or simpler clients might not
support it.
Notification that a reply is being composed is especially useful
for chat-style UIs, especially group chat. It makes it much clearer
who's speaking at any moment and can help keep everyone from talking
at once.
Message expiration.
The sender of a message can use the jabber:x:expire
namespace to request that the message expire after some time has
elapsed. [JPO 1.6.22] If the message is stored offline and
the expiration date has passed, the server will not deliver the
message when the receiver logs in. And if the receiver has received
the message but the user hasn't yet viewed it, the message could
vanish when it expires.
7. Chatting
Jabber's groupchat or conference mechanism allows
many-to-many chats hosted by a server (which need not be the same as
the server any member is logged directly into.) Not only is
many-to-many chat a complex thing to implement in a client regardless
of protocol, but in the Jabber world this topic is made extra
confusing because there are two generations of chat protocol in use.
Groupchat is the original one, and conference is newer
and more flexible. Conference is supported by the Jabber 1.4 server
(as an extra plug-in module) but the protocol is still in flux and
likely to change somewhat in the future. (I'm writing this in May
2001.)
I've decided to cover only the newer conference protocol here.
It's what I'm implementing, it's what's going to remain relevant
longer, and there's a need for documentation as existing clients add
support for it.
The primary documentation of groupchat, most of which is still
relevant, is in [JPG ch.6]. Conferencing is documented
partially in [JPO 1.6.6] and more fully in a protocol
draft by Jeremie.
Creating a chat room.
Before creating a room you need to have a room name and a
conference server in mind. The server can be entered by the user, or
can be discovered by browsing your login server using
jabber:iq:browse. The room name can be entered by the user,
or you can generate it programmatically (for instance, by appending a
random number to the user's name.)
To make sure the room name is not in use, send an <iq
type="get"> with xmlns="jabber:iq:browse" to the
room. If the room doesn't exist, you'll get an error 404 (Not Found)
back and can go ahead. Otherwise, if the room exists, you'll need to
pick a different name.
There's some disagreement about how to create the room. Jeremie's
draft says that you send an <iq
type="set"> with
xmlns="jabber:iq:browse" to create it, but this doesn't appear
to work on its own with the 1.4.1 server. What actually works is to
first send presence to the room, just as though it already
existed and you were joining it, then send the set
query.
Joining a chat room.
To join an existing room (whose ID may have been typed in by the
user or received in a chat invitation), first send a
<presence> element to it. Do not add a resource
name to the room name; that's how it worked in the old groupchat, but
not in conference. If you send the resource name the server will go
into backward-compatibility mode and talk groupchat protocol to
you.
Next, send an <iq type="set"> with
xmlns="jabber:iq:browse" whose query contains one or more
<nick> elements containing your desired nickname(s) in
order of preference. Once you get a successful reply to this, you
know you're really in the room.
Finally, it appears that you should send another
<presence> element once you've joined the room,
otherwise the existing members won't receive your initial presence.
(This may just be a server bug. In any case it won't do any harm to
send it twice.)
The chat's roster.
Every conference of course has its own roster, the set of people
currently in the room. This will change over time as people join and
leave, and people may send presence changes as well (e.g. they might
go idle.) Keep in mind that the JIDs of conference members are
made-up proxies invented by the server, to provide anonymity.
They all look like the conference's JID with a resource that's a long
hex string.
The client is notified of who's in the chat in no less than four
different ways. First, <presence> elements are sent
for each member, both when you initially join the room and when any
member changes their presence (updating status or message, or
leaving.)
In addition, a "set" <iq> element using
the jabber:iq:conference namespace is sent to you whenever
the roster changes; its contents list the proxy Jabber ID and current
nickname of each member. The top-level query will be a
<conference> element with some attributes that
describe the conference itself. It will contain <user>
elements, one for each current member, with attributes "jid"
(the proxy JID) and "name" (the current nickname). You can
tell which member is you by comparing nicknames; that's how you can
figure out what your proxy JID is.
Furthermore, when a member joins, leaves or changes nicknames,
you'll get a similar query that contains a single
<user> element. (You can tell a user that's leaving
because there will be an extra type="remove" attribute.)
Lastly, the server will send user-visible groupchat messages of
the form "foobar has joined." or "foobar has left." You can recognize
these messages because the "from" address will be just the
room's address with no resource. You may prefer, as I do, to ignore
these messages and instead display your own (which can be localized
to match the client) in response to the above notifications.
By the way: if you want to find out who a person in a chat room
really is, send their proxy JID a "get" <iq> element
using the jabber:iq:browse namespace. If browsing is
allowed, you'll get a reply containing a <user>
element as the query, whose "jid" attribute gives the
member's real address. (If the conference room was created with the
"privacy" flag, this operation will not be allowed.)
Chat invitations.
A chat invitation is a regular instant message that contains an
<x xmlns="jabber:x:conference"> element with a
"jid" attribute whose value is the address of the chat room.
If the user accepts the invitation, you join the chat room as
described above. Unfortunately, there doesn't seem to be a way to
notify the members of the chat room that you've declined an
invitation.
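Spotting an invitation in an incoming message could be as simple as the following sketch, which looks for the jabber:x:conference element and returns its "jid" attribute. The function name is made up, and a standalone copy of the message is parsed with ElementTree for brevity.

```python
import xml.etree.ElementTree as ET

CONFERENCE_NS = "jabber:x:conference"


def invitation_room(message_xml):
    """Return the invited chat room's address, or None for ordinary messages."""
    message = ET.fromstring(message_xml)
    invite = message.find(f"{{{CONFERENCE_NS}}}x")
    return None if invite is None else invite.get("jid")
```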
Sending and receiving messages.
To send a message to the conference, send a
<message> element with a type of "groupchat"
to the chat room's address. To send a private message to a member,
send a regular IM (with a type other than "groupchat") to
their proxy JID.
Incoming chat messages can be recognized by the type "groupchat".
Remove the resource ID from the "from" address and what's
left is the chat room ID, which you use to look up the appropriate
chat window in your data structures. The entire ID, of course,
identifies the sender; it's the proxy ID generated by the chat room.
Look that up in your in-memory roster to find the person's nickname
(or even their real JID if you previously browsed for that.)
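Splitting the "from" address is a one-liner in most languages; here's a Python sketch with an invented function name. The full address remains the key for looking up the sender in the room's roster.

```python
def split_groupchat_sender(from_jid):
    """Split a groupchat "from" address into (room ID, resource)."""
    # Everything before the first "/" is the room; the rest is the resource.
    room, _, resource = from_jid.partition("/")
    return room, resource
```

A message sent by the room itself has no resource, in which case the second value comes back as an empty string.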
Jabber supports the IRC convention of allowing "emote" messages.
These allow a person to send a message that looks like an action
(such as a stage direction) rather than a quote. Your client can
recognize an emote message by the prefix "/me "; this should
be replaced with the person's [nick]name before displaying
it.
For example, I might send the message "/me waves at
everyone." Your client would display this as:
Jens waves at everyone.
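The emote rule above might be implemented like this. The "nickname: text" layout used for ordinary messages is just a placeholder for whatever your UI does, and the function name is made up.

```python
EMOTE_PREFIX = "/me "


def format_chat_line(nickname, body):
    """Render one chat message, turning a "/me " prefix into an action."""
    if body.startswith(EMOTE_PREFIX):
        return f"{nickname} {body[len(EMOTE_PREFIX):]}"
    return f"{nickname}: {body}"
```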
Leaving the conference.
To leave the conference you simply send it a <presence
type="unavailable"> element.
The server will then unhelpfully send you one final roster update
showing that you're leaving; this might confuse your client since the
chat room it's sent from is no longer one that it thinks you're in.
Just ignore it.
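One way to handle this, sketched in Python: keep a set of the rooms you're currently in, remove a room as soon as you send the "unavailable" presence, and drop any later update from a room that isn't in the set. The function names and the `joined_rooms` set are hypothetical, and the caller is assumed to pass whatever address it used when joining.

```python
import xml.etree.ElementTree as ET


def leave_conference(room_address, joined_rooms):
    """Forget the room and return the <presence> element to send."""
    # Track rooms by their ID, i.e. the address without any resource.
    joined_rooms.discard(room_address.partition("/")[0])
    presence = ET.Element(
        "presence", {"to": room_address, "type": "unavailable"}
    )
    return ET.tostring(presence, encoding="unicode")


def is_stale_update(from_jid, joined_rooms):
    """True if an update comes from a room we are no longer in."""
    return from_jid.partition("/")[0] not in joined_rooms
```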
8. File Transfer
Jabber doesn't directly support file transfer, only an
"out-of-band" ("OOB") mechanism for sending a URL, with the
assumption that the recipient will download the data at that URL to a
file. The usage scheme involves the sender's client either uploading
the file to a particular FTP/HTTP/WebDAV server, or opening up
another port and running a trivial server on it. In either case the
URL is sent to the receiver, who then makes a connection (using the
protocol specified in the URL) to download the file. The latter
mechanism is more efficient, but it will fail if the sender is behind a
firewall or NAT server and the receiver isn't.
OOB isn't just for file transfer; it can be used to transfer any
URL, e.g. a link to a favorite website, although if HTML messages are
supported it seems more descriptive to send such a link via an
<A> tag in the message body.
There are two similar ways to transmit the URL. The first is in a
<message> element, as a nested <x>
element using the jabber:x:oob namespace [JPG p.92, JPO
1.6.23]. The second is in an <iq> query using the
jabber:iq:oob namespace [JPG p.53, JPO 1.6.9]. The
latter form allows acknowledgement via an IQ reply.
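Here is a hedged Python sketch of both forms. The <url> and <desc> child elements are what I understand the OOB namespaces to carry, but they aren't spelled out in the text above, so treat them (along with the function names and the choice of "set" for the IQ type) as assumptions to verify against the cited sections.

```python
import xml.etree.ElementTree as ET


def _add_oob_payload(parent, url, desc):
    # Assumed payload: a <url> child plus an optional <desc> child.
    ET.SubElement(parent, "url").text = url
    if desc:
        ET.SubElement(parent, "desc").text = desc


def oob_message(to, url, desc=""):
    """Message form: a nested <x> in the jabber:x:oob namespace."""
    message = ET.Element("message", {"to": to})
    ET.SubElement(message, "body").text = desc or url
    x = ET.SubElement(message, "x", {"xmlns": "jabber:x:oob"})
    _add_oob_payload(x, url, desc)
    return ET.tostring(message, encoding="unicode")


def oob_iq(to, request_id, url, desc=""):
    """IQ form: a query in the jabber:iq:oob namespace (can be acknowledged)."""
    iq = ET.Element("iq", {"type": "set", "to": to, "id": request_id})
    query = ET.SubElement(iq, "query", {"xmlns": "jabber:iq:oob"})
    _add_oob_payload(query, url, desc)
    return ET.tostring(iq, encoding="unicode")
```

Building the elements with ElementTree rather than string concatenation means characters like "&" in the URL are escaped automatically.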
9. Registering New Accounts
The Jabber protocol allows clients to register new accounts on
servers without having to go through a Web interface or beseech a
sysadmin. (Of course, any particular server can disallow this if it
wants to.) Registration uses a variant form of login.
To register, create a connection to the server and open a
<stream> element just as in a normal login. But then
send an <iq type="get"> element using the
jabber:iq:register namespace. [JPG pp.57-62]
The server might reply with an error if it's not allowing
registration. But normally it will send a reply containing a number
of sub-elements. These help define the GUI of a dialog box to be
presented to the user to be filled out:
- <key>, if present, contains a magic
authentication string that needs to be sent back to the server in
subsequent registration commands.
- <instructions> contains text that should be
shown to the user in the registration window; it presumably
contains instructions on how to fill out any non-obvious parts of
the form, or perhaps general information about the server.
Hopefully it's in a language the user can read, because you have
no way to request any particular language.
- <username>, <nick>,
<password>, <name>,
<first>, <last>,
<email>, <address>,
<city>, <state>,
<zip>, <phone>,
<url>, <date>,
<misc>, <text> all represent fields
in a form that needs to be filled out by the user. They'll
initially have no content. Not all of them will necessarily be
present (although <username> and
<password> need to be.) Most of these are
self-explanatory, although I'm not really clear on what the last
four are for. You should map these keys to more verbose localized
names to display in the dialog.
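To make the list above concrete, here is a small Python sketch that turns the server's reply into the pieces a registration dialog needs. The function names, the dictionary layout, and the English labels in `FIELD_LABELS` are all made up for the example; a real client would supply its own localized names.

```python
import xml.etree.ElementTree as ET

REGISTER_NS = "jabber:iq:register"

# Hypothetical display names; fields not listed fall back to the tag name.
FIELD_LABELS = {
    "username": "User name",
    "nick": "Nickname",
    "password": "Password",
    "email": "E-mail address",
    "zip": "ZIP / postal code",
}


def local_name(element):
    """Strip any "{namespace}" prefix from an element's tag."""
    return element.tag.rsplit("}", 1)[-1]


def parse_registration_form(reply_xml):
    """Extract the key, instructions, and form fields from the reply."""
    iq = ET.fromstring(reply_xml)
    query = iq.find("{%s}query" % REGISTER_NS)
    if iq.get("type") != "result" or query is None:
        raise ValueError("server did not return a registration form")
    form = {"key": None, "instructions": "", "fields": []}
    for child in query:
        name = local_name(child)
        if name == "key":
            form["key"] = child.text
        elif name == "instructions":
            form["instructions"] = child.text or ""
        else:
            label = FIELD_LABELS.get(name, name.capitalize())
            form["fields"].append((name, label))
    return form
```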
Once the user has filled out all of these fields, you send back an
<iq type="set"> element whose query contains all the
form field elements, with their contents being the text entered by
the user. It also needs to contain a copy of the <key>
element, if any, received from the server. Then you wait for a
reply.
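Assembling that "set" request might look like this in Python. The function name and the `values` dictionary (field name to the text the user entered) are assumptions for the sketch.

```python
import xml.etree.ElementTree as ET

REGISTER_NS = "jabber:iq:register"


def build_registration_request(request_id, values, key=None):
    """Build the <iq type="set"> that submits the filled-out form."""
    iq = ET.Element("iq", {"type": "set", "id": request_id})
    # xmlns is written as a plain attribute so the output has no prefixes.
    query = ET.SubElement(iq, "query", {"xmlns": REGISTER_NS})
    for name, text in values.items():
        ET.SubElement(query, name).text = text
    if key is not None:
        # Echo back the server's <key>, if it sent one.
        ET.SubElement(query, "key").text = key
    return ET.tostring(iq, encoding="unicode")
```

For example, `build_registration_request("reg2", {"username": "jens", "password": "s3cret"}, key="abc123")` yields a single <iq> element ready to write to the stream.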
If the registration succeeds, you'll get back a reply element with
an empty query, which indicates success. Hooray! You still need to
close the connection. The connection can't be re-used to log in
using the new account; open a new one for that.
If the registration fails, you'll get back an error element. This
is not necessarily fatal: if the error code is 409
(Conflict), this means that the desired username is not available, so
you should put up the dialog box again and ask the user to choose a
different username, then send the updated registration query.
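The decision logic for the reply can be sketched in Python as below. It assumes the error is reported as an <error> child of the <iq> with a numeric "code" attribute; the function name and the three outcome strings are invented for the example.

```python
import xml.etree.ElementTree as ET


def registration_outcome(reply_xml):
    """Classify the server's reply to a registration request."""
    iq = ET.fromstring(reply_xml)
    if iq.get("type") == "result":
        return "registered"
    for child in iq:
        # Compare local names so a default stream namespace doesn't matter.
        if child.tag.rsplit("}", 1)[-1] == "error":
            if child.get("code") == "409":
                return "username-taken"
            break
    return "failed"
```

A client would loop on "username-taken", showing the dialog again and resending, and give up on "failed".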
Updating registration.
The JPG says that an <iq type="set"> element can be
used to update the registration information (such as password or
e-mail address) of an existing account. But I'm not sure how this
works; I suspect that, to be properly authenticated, it has to
be sent after you're already logged in.
Canceling an account.
You can cancel an account by sending an <iq
type="get"> element to obtain the server's
<key>, then sending an <iq type="set">
containing a <remove> sub-element. Again, I think you
have to be logged in already for this to work.
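A sketch of the second step in Python, under the same caveat that this hasn't been verified against a live server. The function name is hypothetical, and `key` is whatever <key> text the preceding "get" returned, if any.

```python
import xml.etree.ElementTree as ET

REGISTER_NS = "jabber:iq:register"


def build_cancel_request(request_id, key=None):
    """Build the <iq type="set"> that asks the server to remove the account."""
    iq = ET.Element("iq", {"type": "set", "id": request_id})
    query = ET.SubElement(iq, "query", {"xmlns": REGISTER_NS})
    ET.SubElement(query, "remove")
    if key is not None:
        ET.SubElement(query, "key").text = key
    return ET.tostring(iq, encoding="unicode")
```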
Appendix 1. References
Jabber protocol documents.
The Jabber Protocol Overview, by Peter Saint-Andre.
The Jabber Programmer's Guide, by Thomas Muldowney and Eliot Landrum.
(Page numbers referred to in this document are from the 1.0.3
release.)
Generic Conferencing protocol draft, by Jeremie Miller.
(For more Jabber documentation, or to get these documents in PDF
form, see the Jabber.org
documentation website.)
Other useful references.
UNIX Network Programming, Volume 1 by W. Richard Stevens.
Everything you wanted to know about sockets, including the keepalive
option.
HTML & XHTML: The Definitive Guide by Chuck Musciano and Bill
Kennedy (O'Reilly). Chapter 16 describes in great detail how XHTML
syntax differs from traditional HTML, which is essential knowledge if
you're trying to compose HTML message bodies.
Learning XML by Erik T. Ray (O'Reilly). Chapter 2, available
online in its entirety, has a great overview of XML markup
concepts.
I'd also like to thank David Waite, Dave Smith, Oliver Jones, Todd
Bradley, and others who have helpfully answered questions on the
invaluable jdev mailing
list.
Appendix 2. Copyright & License
This document is Copyright © 2001 by Jens Alfke. All Rights
Reserved.
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License,
Version 1.1 or any later version published by the Free Software
Foundation with no Invariant Sections, no Front-Cover Texts, and no
Back-Cover Texts. You may obtain a copy of the GNU Free Documentation
License from the Free Software Foundation by visiting their Web site
(http://www.fsf.org) or by writing to: The Free Software Foundation,
Inc.; 59 Temple Place - Suite 330; Boston, MA 02111-1307; USA.