Cyrus IMAP Backend Redundancy

In a standalone, active/passive high-availability scenario with failover between Cyrus IMAP backends, an active backend node holds a service IP address and replicates to a passive backend node.

digraph { rankdir = LR; splines = true; overlap = prism; edge [color=gray50, fontname=Calibri, fontsize=11]; node [style=filled, shape=record, fontname=Calibri, fontsize=11]; "imapb01a" [color="#AAFFAA"]; "imapb01b" [color="#FFAAAA"]; "imapb01a" -> "imapb01b" [label="replication"]; }

Standalone active/passive High-Availability for IMAP Backends
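
For illustration, replication in this direction could be configured along the following lines. The sync_* options and the syncserver service are documented in imapd.conf(5) and cyrus.conf(5); the credentials and the exact service invocation below are placeholders rather than prescribed values.

# /etc/imapd.conf on imapb01a (active node, replication source), illustrative excerpt
sync_log: 1                  # record changes in a replication log
sync_host: imapb01b          # replica to push changes to
sync_authname: repluser      # placeholder credentials for the replication user
sync_password: secret
# a rolling replication client (sync_client -r) is typically run from cyrus.conf

# /etc/cyrus.conf on imapb01b (passive node, replication target), illustrative excerpt
SERVICES {
    # accept replication traffic from the active backend
    syncserver cmd="sync_server" listen="csync"
}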

In case of a failover, the active backend (imapb01a) goes down, and the passive backend (imapb01b) needs to become the active backend.

Since imapb01b now executes the mutations on the payload it serves, the recovery of imapb01a implies that imapb01b brings imapb01a back up-to-date. Replication therefore needs to occur in the direction opposite to the original, meaning that imapb01b is now the replication source, and imapb01a is now the replication target;

digraph { rankdir = LR; splines = true; overlap = prism; edge [color=gray50, fontname=Calibri, fontsize=11]; node [style=filled, shape=record, fontname=Calibri, fontsize=11]; "imapb01a" [color="#FFAAAA"]; "imapb01b" [color="#AAFFAA"]; "imapb01a" -> "imapb01b" [label="replication", dir=back]; }

Standalone active/passive High-Availability for IMAP Backends after failover

For replication to continue in the opposite direction, the services on both imapb01a and imapb01b need to be reconfigured. This incurs a restart of the cyrus-imapd service.
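
In terms of the sketch above, this amounts to moving the sync_* client options to imapb01b and enabling the syncserver service on imapb01a instead, followed by a restart of cyrus-imapd on both nodes; for example:

# /etc/imapd.conf on imapb01b (now the replication source), illustrative excerpt
sync_log: 1
sync_host: imapb01a
sync_authname: repluser
sync_password: secret

# /etc/cyrus.conf on imapb01a (now the replication target): enable the syncserver service as shown earlier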

Failing over services like this incurs downtime while waiting for the service restarts to complete fully and successfully. For larger backends, the duration of a single restart may exceed the service health check interval.

To avoid the need to restart services with new configuration, one could consider replicating both backends to one another continuously;

digraph { rankdir = LR; splines = true; overlap = prism; edge [color=gray50, fontname=Calibri, fontsize=11]; node [style=filled, shape=record, fontname=Calibri, fontsize=11]; "imapb01a" [color="#AAFFAA"]; "imapb01b" [color="#AAAAFF"]; "imapb01a" -> "imapb01b" [label="replication", dir=both]; }

Standalone active/passive High-Availability for IMAP Backends with two-way replication for instant failover

imapb01b, the formerly passive backend and a replication target, is now also a replication source, just as it would have had to become in a failover scenario. Similarly, imapb01a is both a replication source and a replication target at the same time.
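
Concretely, and assuming the same replication options as in the earlier sketches, continuous two-way replication amounts to each node being configured as both a sync client and a sync server at the same time; the credentials remain placeholders:

# /etc/imapd.conf on imapb01a, illustrative excerpt: replicate to imapb01b
sync_log: 1
sync_host: imapb01b
sync_authname: repluser
sync_password: secret

# /etc/imapd.conf on imapb01b, illustrative excerpt: replicate to imapb01a
sync_log: 1
sync_host: imapb01a
sync_authname: repluser
sync_password: secret

# /etc/cyrus.conf on both nodes: each node also accepts replication traffic
SERVICES {
    syncserver cmd="sync_server" listen="csync"
}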

So long as clients connect to only one of the two backends, though, replication will effectively occur in only one direction at a time.

Ensuring that clients connect to only one of the two backends can be achieved in various ways.

An active backend could hold a service IP address, such that a failover merely requires the other backend to bind that service IP address;

digraph { splines = true; overlap = prism; edge [color=gray50, fontname=Calibri, fontsize=11]; node [style=filled, shape=record, fontname=Calibri, fontsize=11]; "client"; "service ip" [style=invis,shape=point,size=0]; subgraph cluster_imapb01 { rankdir = LR; color = white; "imapb01a" [color="#AAFFAA"]; "imapb01b" [color="#AAAAFF"]; "imapb01a" -> "imapb01b" [label="replication", dir=both, constraint=false]; } "client" -> "service ip" [color="#AAFFAA"]; "service ip" -> "imapb01a" [color="#AAFFAA"]; "service ip" -> "imapb01b" [color="#FFAAAA"]; }
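
One common way to float such a service IP address between the two backends is a VRRP daemon such as keepalived; this is not something Cyrus IMAP prescribes, and the interface name and addresses below are assumptions:

# /etc/keepalived/keepalived.conf on imapb01a
# (imapb01b would use state BACKUP and a lower priority)
vrrp_instance imapb01 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    virtual_ipaddress {
        192.0.2.10/24
    }
}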

Clients can also be made to connect to a frontend that directs traffic from individual clients to the same backend consistently;

digraph { rankdir = TB; splines = true; overlap = prism; edge [color=gray50, fontname=Calibri, fontsize=11]; node [style=filled, shape=record, fontname=Calibri, fontsize=11]; "client"; "frontend"; "client" -> "frontend" [color="#AAFFAA"]; subgraph cluster_imapb01 { rankdir = LR; color = white; "imapb01a" [color="#AAFFAA"]; "imapb01b" [color="#AAAAFF"]; "imapb01a" -> "imapb01b" [label="replication", dir=both, constraint=false]; } "frontend" -> "imapb01a" [color="#AAFFAA",style=solid]; "frontend" -> "imapb01b" [color="#FFAAAA",style=solid]; }

With multiple clients connecting to frontends, connections from one client could be proxied to one backend, while connections from another client could be proxied to the other backend.

Note

The additional frontend is added for clarity.

digraph { rankdir = TB; splines = true; overlap = prism; edge [color=gray50, fontname=Calibri, fontsize=11]; node [style=filled, shape=record, fontname=Calibri, fontsize=11]; "client X"; "client Y"; "frontend A"; "frontend B"; "client X" -> "frontend A" [color="#AAFFAA", label="X->A"]; "client Y" -> "frontend B" [color="#AAFFAA", label="Y->B"]; subgraph cluster_imapb01 { rankdir = LR; color = white; "imapb01a" [color="#AAFFAA"]; "imapb01b" [color="#AAFFAA"]; "imapb01a" -> "imapb01b" [label="replication", dir=both, constraint=false]; } "frontend A" -> "imapb01a" [color="#AAFFAA",style=solid]; "frontend A" -> "imapb01b" [color="#FFAAAA",style=solid]; "frontend B" -> "imapb01a" [color="#FFAAAA",style=solid]; "frontend B" -> "imapb01b" [color="#AAFFAA",style=solid]; }

This puts both replica backend nodes in an active, participatory state, but only for known, pre-established segments of all traffic. Effectively, both backend nodes now also participate in load-balancing.

This is usually achieved with an NGINX Reverse IMAP Proxy or HAProxy IMAP Load-Balancer.
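
As a sketch of that load-balancing variant, an HAProxy configuration could pin each client to one backend by hashing the client source address; the listener port, server addresses and health checks below are assumptions:

# /etc/haproxy/haproxy.cfg, illustrative excerpt
frontend imap_in
    mode tcp
    bind :143
    default_backend imap_backends

backend imap_backends
    mode tcp
    balance source
    server imapb01a 192.0.2.11:143 check
    server imapb01b 192.0.2.12:143 check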

Larger environments cannot carry all mailboxes on a single backend node (or pair of replicated backend nodes), yet require that all mailboxes be available to all clients. This calls for a Cyrus IMAP Discrete Murder Topology, in which frontend IMAP servers perform continuous protocol-level analysis of mailbox operations and proxy each connection to the correct backend.

digraph { rankdir = TB; splines = true; overlap = prism; edge [color=gray50, fontname=Calibri, fontsize=11]; node [style=filled, shape=record, fontname=Calibri, fontsize=11]; "client"; subgraph cluster_murder { rankdir = LR; color = white; "frontend"; } subgraph cluster_imapb01 { rankdir = LR; color = white; "imapb01a" [color="#AAFFAA"]; "imapb01b" [color="#AAAAFF"]; "imapb01a" -> "imapb01b" [label="replication", dir=both, constraint=false]; } subgraph cluster_imapb02 { rankdir = LR; color = white; "imapb02a" [color="#AAFFAA"]; "imapb02b" [color="#AAAAFF"]; "imapb02a" -> "imapb02b" [label="replication", dir=both, constraint=false]; } "client" -> "frontend" [color="#AAFFAA"]; "frontend" -> "imapb01a" [color="#AAFFAA",style=solid]; "frontend" -> "imapb01b" [color="#FFAAAA",style=solid]; "frontend" -> "imapb02a" [color="#AAFFAA",style=solid]; "frontend" -> "imapb02b" [color="#FFAAAA",style=solid]; }

For a frontend node to learn which backend node to proxy connections to for particular users and mailboxes, the backends need to tell the Cyrus IMAP Murder Mupdate Master which mailboxes they hold.

The mupdate master is the canonical authority on mailbox information across the cluster. As such, when a backend node that calls itself imapb01a sends it a record for a mailbox user/john, it records that the mailbox user/john resides on imapb01a.

During service startup, Cyrus IMAP backend nodes dump the list of mailboxes to the mupdate master, allowing the mupdate master to maintain a complete list of all mailboxes from all backend nodes.
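
On backend nodes, this dump is commonly performed from the START section of cyrus.conf(5) using ctl_mboxlist(8), for example:

# /etc/cyrus.conf on a backend node, illustrative excerpt
START {
    # push the local mailbox list to the mupdate master at startup
    mupdatepush cmd="ctl_mboxlist -m"
}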

However, in a scenario with replicated backends, both the imapb01a and imapb01b backend nodes would submit user/john as one of the mailboxes they hold;

digraph { rankdir = TB; splines = true; overlap = prism; edge [color=gray50, fontname=Calibri, fontsize=11]; node [style=filled, shape=record, fontname=Calibri, fontsize=11]; "client"; subgraph cluster_murder { rankdir = LR; color = white; "mupdate"; "frontend"; "frontend" -> "mupdate" [constraint=false, dir=none]; } subgraph cluster_imapb01 { rankdir = LR; color = white; "imapb01a" [color="#AAFFAA"]; "imapb01b" [color="#AAAAFF"]; "imapb01a" -> "imapb01b" [label="replication", dir=both, constraint=false]; } "client" -> "frontend" [color="#AAFFAA"]; "frontend" -> "imapb01a" [color="#AAFFAA",style=solid]; "frontend" -> "imapb01b" [color="#FFAAAA",style=solid]; { "imapb01a"; "imapb01b" } -> "mupdate" [label="user/john"]; }

The mupdate master would not allow the same mailbox to exist in multiple locations, and would refuse to acknowledge the second submission, causing the backend node that made it to delete that mailbox from its spool.

The same conditions are created when a client creates, renames or deletes a mailbox: the replica would re-issue the same reservation and activation commands against the mupdate master for a mailbox that already exists (create), is already locked and reserved (rename), or no longer exists (delete).

Referring back to our earlier example, in which the direction of replication needed to switch between the formerly active (now passive) node and the formerly passive (now active) node through a re-configuration and restart of the cyrus-imapd service, one could apply the same methodology to replicated backends that participate in a Cyrus IMAP Murder, and re-configure passive nodes to become active murder participants, again requiring a service restart.

However, the original problem, and the active/passive switchover solution to it, only apply if imapb01a and imapb01b indeed submitted the user/john mailbox as if it existed in two different locations. This is not at all necessary;

The format of a mailbox entry, as it exists on imapb01a, is as follows:

user.john %(A %(john lrswipkxtecdan) I 5e2d2a20-9852-445f-a9c3-154ddcead31d P default V 1463432076 M 1467579559)

The imapb01a backend would submit (part of) this information to the mupdate master such that the mupdate master records:

user.john %(A %(john lrswipkxtecdan) P default S imapb01a T r M 1467579559)

Please note that the record that the mupdate master receives includes the server name of the backend node on which the mailbox was reported to reside.

The submission by imapb01b (with S imapb01b) would contradict the known location (S imapb01a), and would therefore be refused by the mupdate master.

However, if both imapb01a and imapb01b submit records for the mailboxes they hold in a way that does not put them in contradiction with one another, no such conflict between imapb01a and imapb01b exists, and the mupdate master can accept both submissions (as they result in the same values on record).

This can be achieved by setting the servername configuration option in imapd.conf(5) to imapb01. Both imapb01a and imapb01b would then submit records stating that the user/john mailbox resides on a server known as imapb01, avoiding the conflict altogether;

user.john %(A %(john lrswipkxtecdan) P default S imapb01 T r M 1467579559)

Both systems will now present the mupdate master with a mailbox user/john residing on the default partition of a server known as imapb01, preserving consistency.
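
A minimal imapd.conf(5) excerpt for both nodes might therefore look as follows; the servername setting is the point of this example, while the mupdate server name and credentials are placeholders:

# /etc/imapd.conf on both imapb01a and imapb01b, illustrative excerpt
servername: imapb01
mupdate_server: mupdate.example.org
mupdate_authname: imapb01-mupdate
mupdate_password: secret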

Cyrus IMAP frontend nodes will now proxy requests for user/john to the server address imapb01. This address must resolve (using DNS) to the service IP address that imapb01a and imapb01b share, for IMAP frontends as well as internal mail exchangers.
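
For example, a zone file could map the individual node names to their own addresses and the shared name imapb01 to the service IP address; all names and addresses below are assumptions:

; illustrative zone file excerpt
imapb01a    IN A    192.0.2.11
imapb01b    IN A    192.0.2.12
imapb01     IN A    192.0.2.10    ; service IP address, held by the active node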