Siren IM

The following specification outlines a new IM protocol, called "Siren", which employs a federated and fully end-to-end encrypted approach to instant messaging.

Note that this document is incomplete, is undergoing regular editing and is likely to change at any time.

Principles

Decentralised

Decentralisation is a key principle of the Siren design. A user interacts with a Siren server of their choosing, and does not have to place trust in an entity which they cannot verify. Reference implementations of the Siren server will be fully open-source and auditable, and anyone can run a server instance or perform full code or protocol audits.

Federated

Siren servers can federate with each other to allow users to chat even though they may not be logged into the same Siren server instance. Federation will rely on DNS service (SRV) records for the initial discovery of other servers, and server-to-server communications will be encrypted using server identity key-pairs. Federation is optional and can be disabled for an environment where a "closed" Siren deployment may be desirable.

End-to-end encrypted

All messages are fully end-to-end encrypted, all of the time. Public-key elliptic curve cryptography is used to ensure that only the intended recipient of a message will be able to decrypt it. Every device that a user logs in with will maintain its own unique encryption keys, and separate signing keys will provide additional authentication of payloads sent by a given user. Servers will not maintain copies of user or device encryption keys, thus making it impossible for the server to impersonate a user or device.

Flexible

The protocol will be based around a message queue and publish-subscribe notifications. The message queue will be used for the storage and delivery of actual payloads, as well as critical notifications (such as key changes or revocations), and will be efficient enough for use on all device types, including mobile phones or devices with unreliable or intermittent connectivity. Publish-subscribe mechanisms will be used to receive any extended presence data, such as online status or availability, on an interest-only basis (i.e. a mobile device on a slow network may not be interested in these out-of-band messages and is under no obligation to subscribe to them).

Encryption

Key generation/exchange is to be performed using the X25519 algorithm. Encryption of messages is to be performed using the XSalsa20 stream cipher with a 256-bit key and a 192-bit nonce. Authentication of messages is to be performed using the Ed25519 signature algorithm with a 256-bit key.
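
The sizes above translate into the following byte lengths. This is a minimal sketch using only the Python standard library; the constant names and the new_nonce helper are illustrative assumptions rather than part of the specification, and no actual encryption is performed here.

```python
import secrets

# Sizes implied by the primitives named above, in bytes.
X25519_KEY_BYTES = 32      # X25519 public and private keys are 32 bytes each
XSALSA20_KEY_BYTES = 32    # 256-bit key
XSALSA20_NONCE_BYTES = 24  # 192-bit nonce
ED25519_KEY_BYTES = 32     # 256-bit signing seed and 256-bit public key


def new_nonce() -> bytes:
    """Return a fresh random nonce of the size XSalsa20 expects."""
    return secrets.token_bytes(XSALSA20_NONCE_BYTES)
```

A 192-bit nonce is generally considered large enough to be chosen at random for each message, which is why the sketch does not keep a counter.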

Identities

Server Identity 

An instance of the Siren server, represented by its server name (e.g. neilalexander.eu, the server portion of the example UID given under User Identity).

A server maintains one key:

  • A Server Encryption Key (SEK) for encrypted communications between a device and the server, or between the server and another server.

User Identity 

A human user of the system, represented by their UID, similar in format to an email address. A typical UID may be neil@neilalexander.eu, where the first section represents the user and the second section represents the address of the Siren server where the user belongs. 

A user maintains one key:

  • A User Signing Key (USK) for signing device identity records.
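
The UID format described above can be split into its two sections as follows. This is a minimal sketch: the UID type and the parse_uid function are hypothetical names, and lower-casing the server portion is an assumption based on DNS names being case-insensitive, not something the specification states.

```python
from typing import NamedTuple


class UID(NamedTuple):
    user: str    # the section before the "@"
    server: str  # the address of the Siren server the user belongs to


def parse_uid(uid: str) -> UID:
    """Split 'user@server' into its two sections, rejecting malformed input."""
    user, sep, server = uid.partition("@")
    if not sep or not user or not server or "@" in server:
        raise ValueError(f"not a valid UID: {uid!r}")
    # Assumption: server names compare case-insensitively, as DNS names do.
    return UID(user, server.lower())
```

For example, parse_uid("neil@neilalexander.eu") yields the user "neil" and the server "neilalexander.eu".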

Device Identity 

A device belonging to a user, represented by its public key. Device identities are published into a directory on a Siren server, correlating one or more device identities to a user identity.

A device maintains one key:

  • A Device Encryption Key (DEK) for encrypting messages, freshly generated with each session.

Server Roles

Authentication Manager

The Authentication Manager component is responsible for handling user authentication when a device establishes a new session with the server. The authentication mechanisms can be negotiated between the server and the client; the server can support multiple authentication methods and agree a suitable method with the client, for example, username and password, certificate, etc. 

Critically, the Authentication Manager component is not responsible for the issuing of DEKs or USKs, as the client remains fully responsible for generating these at all times. 

Directory

The Directory is where DEKs are published and mapped to USKs and user IDs. It is effectively a lightweight tree that allows a client to ask for all DEKs belonging to a given user.

A server maintains a Directory for its own users only. If a server receives a request for DEKs of a local user, then the server will fulfil the request from the local Directory. If the server receives a request for DEKs of a non-local user, then the server will attempt to federate with the correct server and proxy the request on behalf of the client.
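
The local/non-local routing described above might look like the following. This is a sketch under assumptions: the Directory class, its method names and the federate callback are all hypothetical, and an in-memory dictionary stands in for whatever storage a real server would use.

```python
from collections import defaultdict


class Directory:
    """Maps each local UID to the set of public DEKs published under it."""

    def __init__(self, server_name: str) -> None:
        self.server_name = server_name
        self._deks = defaultdict(set)  # UID -> set of public DEKs

    def publish(self, uid: str, dek: bytes) -> None:
        self._deks[uid].add(dek)

    def is_local(self, uid: str) -> bool:
        # The server portion is everything after the last "@".
        return uid.rpartition("@")[2] == self.server_name

    def lookup(self, uid: str, federate=None) -> set:
        """Answer from the local Directory, or proxy via `federate` if non-local."""
        if self.is_local(uid):
            return set(self._deks.get(uid, ()))
        if federate is None:
            raise LookupError("federation is disabled on this server")
        # `federate` is a hypothetical callable that asks the remote server.
        return federate(uid)
```

Passing federate=None models a "closed" deployment in which non-local lookups are refused.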

Message Queue

The Message Queue will be used for the delivery of message payloads, as well as specific important notifications such as new keys or key revocations. Once a client has successfully opened a session and logged in, the client will start to receive messages from the Message Queue that are destined for that device's DEK. 

If there is no active session associated with that DEK, then the messages will be stored in the Message Queue for a retention period configured on the server, e.g. 7 days. The messages will then be delivered when a session is opened using that DEK.
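
A store-and-forward queue with a retention period could be sketched as follows. The MessageQueue class and its methods are hypothetical, the queue is kept in memory for brevity, and the 7-day default simply mirrors the example figure above.

```python
import time
from collections import defaultdict, deque
from typing import List, Optional

RETENTION_SECONDS = 7 * 24 * 60 * 60  # the 7-day example from the text


class MessageQueue:
    def __init__(self, retention: float = RETENTION_SECONDS) -> None:
        self.retention = retention
        # public DEK -> deque of (stored_at, payload), oldest first
        self._pending = defaultdict(deque)

    def enqueue(self, dek: bytes, payload: bytes, now: Optional[float] = None) -> None:
        stored_at = time.time() if now is None else now
        self._pending[dek].append((stored_at, payload))

    def drain(self, dek: bytes, now: Optional[float] = None) -> List[bytes]:
        """Return unexpired payloads for `dek` in arrival order and clear them."""
        now = time.time() if now is None else now
        items = self._pending.pop(dek, ())
        return [p for stored_at, p in items if now - stored_at <= self.retention]
```

A server would call drain when a session is opened with the matching DEK; the optional now argument exists only to make the expiry behaviour easy to exercise.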

Publish-Subscribe

The Publish-Subscribe mechanism will be used for non-essential messages, for example, presence status changes (user logs in, logs out, goes away, becomes available, etc.) or broadcast messages that are only relevant at the current time. The client must register for Publish-Subscribe notifications that it is interested in; no Publish-Subscribe notifications will be sent to a client unsolicited.

Publish-Subscribe messages are not stored for replay back to devices that were not online at the time of notification. Publish-Subscribe messages are only ever sent to an active session.
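
The opt-in, no-replay behaviour described above can be illustrated with a small in-memory sketch. The PubSub class, the topic strings and the callback-per-session design are assumptions made for illustration only.

```python
from collections import defaultdict


class PubSub:
    def __init__(self) -> None:
        # topic -> {session_id: callback}; only active sessions appear here
        self._subscribers = defaultdict(dict)

    def subscribe(self, topic: str, session_id: str, callback) -> None:
        """Register interest; nothing is delivered without this call."""
        self._subscribers[topic][session_id] = callback

    def close_session(self, session_id: str) -> None:
        """Forget every subscription held by a session that has ended."""
        for sessions in self._subscribers.values():
            sessions.pop(session_id, None)

    def publish(self, topic: str, event) -> int:
        """Deliver to current subscribers only; the event is never stored."""
        callbacks = list(self._subscribers.get(topic, {}).values())
        for callback in callbacks:
            callback(event)
        return len(callbacks)
```

Because publish keeps no history, a session that subscribes later never sees earlier events, which matches the "active session only" rule.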

Personal Roster

The server provides storage for a Personal Roster. In effect, this is the user's contact list and the user's block list. The Personal Roster is not encrypted at rest, but it is signed by the user using a USK. This makes it possible for the server to use the Personal Roster in order to implement contact requests and blocking functionality, but ensures that the server cannot manipulate the Personal Roster on behalf of the user, as the USK is never held by the server.
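
As a rough illustration of how a readable but signed roster could be handled, the sketch below produces a deterministic byte string for signing and a server-side blocking check. The Roster class, the JSON encoding and the should_deliver function are all assumptions, since the wire format is not yet defined; the Ed25519 signing and verification steps themselves are omitted.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Roster:
    contacts: set = field(default_factory=set)  # UIDs on the contact list
    blocked: set = field(default_factory=set)   # UIDs on the block list

    def canonical_bytes(self) -> bytes:
        """Deterministic encoding, so a given roster always signs identically."""
        doc = {"blocked": sorted(self.blocked), "contacts": sorted(self.contacts)}
        return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def should_deliver(roster: Roster, sender_uid: str) -> bool:
    """Server-side check: refuse traffic from senders the user has blocked."""
    return sender_uid not in roster.blocked
```

The client would sign the output of canonical_bytes with the USK; the server can read the roster to apply should_deliver, but cannot produce a valid signature over a modified copy.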

High-Level Interactions

Open a session

A client wishes to open a session. The following interactions will take place:

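  1. The client deconstructs the UID and retrieves the server portion. The client then performs a DNS lookup for the SRV record of that domain name. The DNS response provides the hostname and port number of the server, together with the public SEK. (A standard SRV record carries only a target hostname and port, so the SEK is assumed here to be published in an accompanying record; the exact mechanism is not yet defined.)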
  2. The client connects and sends its own public DEK to the server. The server responds to the client, encrypted, with a challenge.
  3. The client computes the response to the challenge and sends it back to the server, encrypted. If the response is accepted by the server then at this point key agreement has completed and all packets from this point forward are encrypted.
  4. The server responds with a session identifier, and pins the public DEK to the session ID in its database. The client should save the session identifier.
  5. The session is now open.
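
Step 4 above, in which the server pins the public DEK to a session identifier, might be sketched like this. The SessionTable class and the 128-bit hexadecimal identifier are assumptions; the challenge exchange in steps 2 and 3 is not shown because its format is not yet specified.

```python
import secrets
from typing import Optional


class SessionTable:
    """Server-side record of which public DEK each open session is pinned to."""

    def __init__(self) -> None:
        self._dek_by_session = {}

    def open(self, public_dek: bytes) -> str:
        """Create an unguessable session identifier and pin the DEK to it."""
        session_id = secrets.token_hex(16)  # assumption: 128 random bits
        self._dek_by_session[session_id] = public_dek
        return session_id

    def dek_for(self, session_id: str) -> Optional[bytes]:
        """Return the pinned DEK, or None if the session is unknown."""
        return self._dek_by_session.get(session_id)

    def close(self, session_id: str) -> None:
        self._dek_by_session.pop(session_id, None)
```

The client would store the value returned by open and present it on later requests, letting the server find the pinned DEK with dek_for.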

Log in as a user

Now that the session is open and you have an agreed method of communicating with the server, you can log in:

  1. The server sends a list of supported authentication mechanisms to the client. The client responds with the necessary authentication credentials, more than likely after prompting the user.
  2. If the server accepts the credentials, the server responds notifying the device that the authentication succeeded. One of the following now takes place:
    • If there are other devices logged in under that user identity, then the device is sent an instruction to wait for approval from another device:
      1. The server responds with a list of existing DEKs published under that user identity, and then sends the DEK of the new device to existing logged in devices.
      2. At this point the existing devices should prompt the user for permission to allow the new device to log in.
      3. If granted, the existing device should encrypt the existing private USK using the new device's DEK and send it back to the server. 
      4. The server relays the encrypted USK to the device and the device attempts to unpack the encrypted USK using the list of published DEKs provided in Step 1. The device then saves the USK.
      • Note that the private user signing key (USK) is only ever exchanged between devices directly, and is not visible to nor stored by the server at any point. The server does not have the capability at any time to cryptographically sign on behalf of the user.
    • If there are no other devices logged in under that user identity, then the device is sent an instruction to generate a new USK. 
  3. The device then sends a registration request to the server, asking to publish its new DEK under the user identity in the directory. Critically, this request must be signed using the USK from above.
  4. If the server is happy that the signature from the USK matches existing directory entries for that user identity (therefore agreeing that Step 2 was successful and it was signed using the same USK) then the server will publish the new DEK under the user identity. 
  5. The user is now logged in.
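
The two server-side decisions in the steps above (which branch to take in Step 2, and whether to publish in Step 4) can be sketched as plain functions. All names here are hypothetical. The sketch assumes the signature on the registration request has already been verified against signing_usk, and that a user with no directory entries may register with any USK; the text does not state the latter.

```python
from enum import Enum
from typing import Optional, Sequence


class NextStep(Enum):
    AWAIT_APPROVAL = "wait for approval from an existing device"
    GENERATE_USK = "generate a new USK"


def after_authentication(logged_in_deks: Sequence[bytes]) -> NextStep:
    """Step 2: branch on whether other devices are logged in for this user."""
    return NextStep.AWAIT_APPROVAL if logged_in_deks else NextStep.GENERATE_USK


def accept_registration(published_usk: Optional[bytes], signing_usk: bytes) -> bool:
    """Step 4: publish the new DEK only if the USK matches the one on record.

    `published_usk` is None when the user has no directory entries yet.
    """
    return published_usk is None or published_usk == signing_usk
```

Both arguments to accept_registration are public keys, so a plain equality comparison is sufficient here.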

Look up a user in the directory

Before you can send anything to or receive anything from another user, you must first interact with the directory in the server. The directory is a list of all user identities and their associated DEKs, signed using the appropriate USK. To look up someone:

  1. The client sends a request to the server asking for information on a given user, e.g. foo@bar.org.
  2. The server deconstructs the UID into two parts: the username portion, and the server name portion.
    • If the server is happy that the server name portion is one served by itself then the lookup will be handled by that server's directory.
    • If the server decides that the server name portion is unknown, and federation is permitted, then the server will initiate a federation request with the named server and forward the lookup request onto it (detailed later).
  3. The server then looks to see if there are any DEKs published under that user identity in the directory. If so, the server responds to the client with a list of all DEKs published under that user identity.
  4. The client at this point should check to see if all of the DEKs are signed by the same USK. This is not strictly necessary, but is recommended, as it allows the client to verify that the information in the directory has not been compromised by a third party.
  5. The client should also check these DEKs against cached records from a previous lookup to see whether there are changes:
    • Any DEKs that exist in the cache but do not appear in the directory should be immediately deleted and never used again; they should be considered revoked from that point.
    • Any new DEKs can be presented to the user in the form of a trust prompt or notification.
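
The cache comparison in the final step reduces to two set differences. The diff_deks function below is a hypothetical helper that assumes DEKs are held as sets of public-key byte strings.

```python
from typing import Set, Tuple


def diff_deks(cached: Set[bytes], fetched: Set[bytes]) -> Tuple[Set[bytes], Set[bytes]]:
    """Compare a previous lookup with a fresh one.

    Returns (revoked, new): DEKs to delete from the cache and never use
    again, and DEKs to show the user in a trust prompt or notification.
    """
    revoked = cached - fetched  # in the cache, missing from the directory
    new = fetched - cached      # in the directory, not seen before
    return revoked, new
```

For instance, with a cache of {A, B} and a fresh lookup returning {B, C}, A is treated as revoked and C is presented to the user as a new device.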

Send a payload to another user

Once the client has queried the directory and found the recipient's DEKs, it is ready to send payloads (or "messages") to that user:

  1. The client queries the directory for DEKs associated to a given user, as outlined above. The client may receive one or more DEKs, one for each device that the recipient is logged in with.
  2. The client should encrypt the payload once for each DEK. For example, if three DEKs are returned, then the payload will be encrypted separately three times, once using each DEK, resulting in three different cipher-texts. 
  3. The client should append a cryptographic signature to each cipher-text, signed using the USK, to prove that the message was sent from that user.
  4. The client then encrypts the payload using the SEK and sends the resulting cipher-texts to the server, with limited headers such as source and destination server name, source and destination public DEK and any relevant cryptographic properties, such as a nonce. 
  5. The server responds with an acknowledgement.
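
Steps 2 and 3 above (one cipher-text per recipient DEK, each signed) can be sketched as a fan-out loop. The Envelope type and build_envelopes are hypothetical, and the encrypt and sign callables are placeholders supplied by the caller; in a real client they would be backed by the X25519/XSalsa20 and Ed25519 primitives named under Encryption. The outer SEK layer from Step 4 is omitted.

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Envelope:
    sender_dek: bytes     # source public DEK (header)
    recipient_dek: bytes  # destination public DEK (header)
    nonce: bytes          # 192-bit nonce used for this cipher-text
    ciphertext: bytes
    signature: bytes      # USK signature over the cipher-text


def build_envelopes(
    payload: bytes,
    sender_dek: bytes,
    recipient_deks: Iterable[bytes],
    encrypt: Callable[[bytes, bytes, bytes], bytes],  # (payload, dek, nonce)
    sign: Callable[[bytes], bytes],                   # signs with the USK
) -> List[Envelope]:
    """Encrypt the payload separately for every recipient device."""
    envelopes = []
    for dek in recipient_deks:
        nonce = secrets.token_bytes(24)  # fresh nonce for each cipher-text
        ciphertext = encrypt(payload, dek, nonce)
        envelopes.append(Envelope(sender_dek, dek, nonce, ciphertext, sign(ciphertext)))
    return envelopes
```

Three recipient DEKs therefore produce three envelopes, each with its own nonce, cipher-text and signature.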

To do

  • Detail about cryptographic primitives and encryption/signature strength
  • Outline more high-level interactions
  • Further information about implementation and interaction between message queues and publish-subscribe
  • Wire protocol detail (currently undefined and any inspiration is welcome)