RabbitMQ

2015/12/14

RabbitMQ is a message broker.

RabbitMQ, and messaging in general, uses some jargon.

  • Producing means nothing more than sending. A program that sends messages is a producer.
  • A queue is the name for a mailbox. It lives inside RabbitMQ. Although messages flow through RabbitMQ and your applications, they can be stored only inside a queue. A queue has no built-in size limit and is bound only by the host's memory and disk, so it is essentially a large message buffer. Many producers can send messages that go to one queue, and many consumers can try to receive data from one queue.
  • Consuming has a similar meaning to receiving. A consumer is a program that mostly waits to receive messages.

The main idea behind Work Queues (Task Queues) is to avoid doing a resource-intensive task immediately and having to wait for it to complete. We encapsulate a task as a message and send it to a queue. A worker process running in the background will pop the tasks and eventually execute the job. When you run many workers the tasks will be shared between them.

One of the advantages of using a Task Queue is the ability to easily parallelise work. If we are building up a backlog of work, we can just add more workers and scale easily that way.
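The work-queue idea can be sketched without a broker at all. The snippet below is only an in-process analogy, assuming Python's standard queue.Queue stands in for the RabbitMQ queue and threads stand in for worker processes; run_workers and the squaring "job" are made-up names for illustration.

```python
import queue
import threading

def run_workers(tasks, worker_count):
    """Push tasks onto a shared queue and let worker threads drain it."""
    task_queue = queue.Queue()       # stand-in for the RabbitMQ queue
    results = []
    results_lock = threading.Lock()

    def worker(name):
        while True:
            task = task_queue.get()
            if task is None:         # sentinel: no more work for this worker
                return
            outcome = (name, task, task * task)   # stand-in for a slow job
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker, args=(f"worker-{i}",))
               for i in range(worker_count)]
    for t in threads:
        t.start()
    for task in tasks:               # the "producer" side
        task_queue.put(task)
    for _ in threads:                # one sentinel per worker
        task_queue.put(None)
    for t in threads:
        t.join()
    return results

results = run_workers(range(10), worker_count=3)
print(sorted(squared for _, _, squared in results))
```

Raising worker_count is the in-process equivalent of starting more workers to clear a backlog. Which worker handles which task is not deterministic here.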

By default, RabbitMQ will send each message to the next consumer, in sequence. On average every consumer will get the same number of messages. This way of distributing messages is called round-robin.
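A minimal model of round-robin dispatch, assuming the consumers are just a fixed list of names (round_robin is a hypothetical helper, not a RabbitMQ API):

```python
from itertools import cycle

def round_robin(messages, consumers):
    """Assign each message to the next consumer in sequence."""
    assignment = {name: [] for name in consumers}
    turn = cycle(consumers)          # C1, C2, C3, C1, C2, ...
    for message in messages:
        assignment[next(turn)].append(message)
    return assignment

assignment = round_robin(["m1", "m2", "m3", "m4", "m5", "m6"], ["C1", "C2", "C3"])
print(assignment)
# {'C1': ['m1', 'm4'], 'C2': ['m2', 'm5'], 'C3': ['m3', 'm6']}
```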

Message Acknowledgments

We don't want to lose any tasks. If a worker dies, we'd like the task to be delivered to another worker. In order to make sure a message is never lost, RabbitMQ supports message acknowledgments. An ack is sent back from the consumer to tell RabbitMQ that a particular message has been received, processed and that RabbitMQ is free to delete it.

If a consumer dies (its channel is closed, connection is closed, or TCP connection is lost) without sending an ACK, RabbitMQ will understand that a message wasn't processed fully and will re-queue it. If there are other consumers online at the same time, it will then quickly redeliver it to another consumer. That way you can be sure that no message is lost, even if the workers occasionally die.
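The re-queue behaviour can be modelled in a few lines. This is a toy sketch, not the client API: AckQueue and its methods are invented for illustration, and putting the message back at the front of the queue is a modelling assumption.

```python
from collections import deque

class AckQueue:
    """Toy queue that keeps delivered messages until they are acknowledged."""

    def __init__(self, messages):
        self.ready = deque(messages)
        self.unacked = {}            # consumer name -> messages in flight

    def deliver(self, consumer):
        message = self.ready.popleft()
        self.unacked.setdefault(consumer, []).append(message)
        return message

    def ack(self, consumer, message):
        # Only now is the message gone for good.
        self.unacked[consumer].remove(message)

    def consumer_died(self, consumer):
        # Everything the dead consumer never acknowledged goes back on the queue.
        for message in reversed(self.unacked.pop(consumer, [])):
            self.ready.appendleft(message)

q = AckQueue(["task-1", "task-2"])
first = q.deliver("worker-A")
q.ack("worker-A", first)             # task-1 is processed and acknowledged
q.deliver("worker-A")                # task-2 is delivered, then the worker dies
q.consumer_died("worker-A")
redelivered = q.deliver("worker-B")
print(redelivered)                   # task-2
```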

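At the time of writing there were no message timeouts: RabbitMQ redelivers a message only when the worker's connection dies, so it is fine even if processing a message takes a very, very long time. Recent RabbitMQ releases do enforce a delivery acknowledgement timeout (30 minutes by default), so very long-running tasks may need that setting raised.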

Message acknowledgments are turned on by default.

Message durability

When RabbitMQ quits or crashes it will forget the queues and messages unless you tell it not to. Two things are required to make sure that messages aren't lost: we need to mark both the queue and messages as durable.

First, we need to make sure that RabbitMQ will never lose our queue. To do so, we need to declare it as durable.

RabbitMQ doesn't allow you to redefine an existing queue with different parameters and will return an error to any program that tries to do so.

Second, we need to mark our messages as persistent when we publish them. Marking messages as persistent doesn't fully guarantee that a message won't be lost. Although it tells RabbitMQ to save the message to disk, there is still a short time window when RabbitMQ has accepted a message and hasn't saved it yet. If you need a stronger guarantee, you can use publisher confirms.
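The two durability rules, plus the redeclaration error, can be captured in a toy model. Nothing here talks to a real broker: ToyBroker, its methods, and the queue names are all hypothetical, and the short window before a persistent message reaches disk is deliberately not modelled.

```python
class ToyBroker:
    """Models only the durability rules described above, nothing else."""

    def __init__(self):
        self.queues = {}   # name -> {"durable": bool, "messages": [(body, persistent)]}

    def declare_queue(self, name, durable):
        existing = self.queues.get(name)
        if existing is not None and existing["durable"] != durable:
            # Redeclaring with different parameters is an error.
            raise ValueError(f"queue {name!r} exists with durable={existing['durable']}")
        self.queues.setdefault(name, {"durable": durable, "messages": []})

    def publish(self, queue_name, body, persistent):
        self.queues[queue_name]["messages"].append((body, persistent))

    def restart(self):
        # Only durable queues survive, and only their persistent messages.
        survivors = {}
        for name, q in self.queues.items():
            if q["durable"]:
                kept = [m for m in q["messages"] if m[1]]
                survivors[name] = {"durable": True, "messages": kept}
        self.queues = survivors

broker = ToyBroker()
broker.declare_queue("task_queue", durable=True)
broker.declare_queue("scratch", durable=False)
broker.publish("task_queue", "keep me", persistent=True)
broker.publish("task_queue", "lose me", persistent=False)
broker.publish("scratch", "lose me too", persistent=True)
broker.restart()
print(sorted(broker.queues))                     # ['task_queue']
print(broker.queues["task_queue"]["messages"])   # [('keep me', True)]
```

Note that a persistent message in a non-durable queue is still lost, which is why both settings are needed.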

Fair dispatch

In a situation with two workers, when all odd messages are heavy and even messages are light, one worker will be constantly busy and the other one will do hardly any work. RabbitMQ doesn't know anything about that and will still dispatch messages evenly.

This happens because RabbitMQ just dispatches a message when the message enters the queue. It doesn't look at the number of unacknowledged messages for a consumer. It just blindly dispatches every n-th message to the n-th consumer.

To avoid that, we can use the basicQos method with the prefetchCount = 1 setting. This tells RabbitMQ not to give more than one message to a worker at a time. In other words, don't dispatch a new message to a worker until it has processed and acknowledged the previous one. Instead, RabbitMQ will dispatch it to the next worker that is not busy.
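A small simulation makes the difference visible. It is a rough sketch under stated assumptions: every message is already in the queue at time zero, a message's "cost" is the time it takes to process, and both function names are invented for this example.

```python
def round_robin_load(costs, workers=2):
    """Total work per worker when message i always goes to worker i % workers."""
    load = [0] * workers
    for i, cost in enumerate(costs):
        load[i % workers] += cost
    return load

def prefetch_one_load(costs, workers=2):
    """Total work per worker when each message goes to whichever worker frees up first."""
    busy_until = [0] * workers       # time at which each worker acks its current message
    load = [0] * workers
    for cost in costs:
        w = busy_until.index(min(busy_until))   # first worker to become idle
        busy_until[w] += cost
        load[w] += cost
    return load

costs = [9, 1] * 5                   # alternate heavy (9) and light (1) messages
print(round_robin_load(costs))       # [45, 5]
print(prefetch_one_load(costs))      # [29, 21]
```

With blind round-robin one worker does 45 units of work and the other 5; with one unacknowledged message per worker the split is 29 to 21.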

Exchanges

The core idea in the messaging model in RabbitMQ is that the producer never sends any messages directly to a queue. Actually, quite often the producer doesn't even know if a message will be delivered to any queue at all.

Instead, the producer can only send messages to an exchange. On one side it receives messages from producers and on the other side it pushes them to queues. The exchange must know exactly what to do with a message it receives.

We need to tell the exchange to send messages to our queue. That relationship between an exchange and a queue is called a binding.

Direct exchange

The routing algorithm behind a direct exchange is simple - a message goes to the queues whose binding key exactly matches the routing key of the message.

Multiple bindings

It is perfectly legal to bind multiple queues with the same binding key. In that case, the direct exchange will behave like fanout and will broadcast the message to all the matching queues.
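Direct routing, including the case where several queues share a binding key, can be sketched as a dictionary lookup. DirectExchange, the queue names, and the keys are illustrative only.

```python
from collections import defaultdict

class DirectExchange:
    """Toy direct exchange: exact match between routing key and binding key."""

    def __init__(self):
        self.bindings = defaultdict(list)   # binding key -> bound queue names

    def bind(self, queue_name, binding_key):
        self.bindings[binding_key].append(queue_name)

    def route(self, routing_key):
        # Every queue bound with exactly this key receives the message.
        return list(self.bindings.get(routing_key, []))

exchange = DirectExchange()
exchange.bind("Q1", "error")
exchange.bind("Q2", "error")         # same binding key on a second queue
exchange.bind("Q2", "info")

print(exchange.route("error"))       # ['Q1', 'Q2']
print(exchange.route("info"))        # ['Q2']
print(exchange.route("debug"))       # []
```

A message whose routing key matches no binding is routed to no queue at all.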

Topic exchange

Messages sent to a topic exchange can't have an arbitrary routing_key: it must be a list of words, delimited by dots. The words can be anything, but usually they specify some features connected to the message.

The binding key must also be in the same form. The logic behind the topic exchange is similar to a direct one: a message sent with a particular routing key will be delivered to all the queues that are bound with a matching binding key. However, there are two important special cases for binding keys:

  • * can substitute for exactly one word
  • # can substitute for zero or more words
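The two wildcard rules can be written as a short recursive matcher. This is a sketch of the matching rule only, not RabbitMQ's actual implementation; topic_matches and the example keys are made up for illustration.

```python
def topic_matches(binding_key, routing_key):
    """Return True if a dot-delimited routing key matches a binding key pattern."""
    pattern = binding_key.split(".")
    words = routing_key.split(".") if routing_key else []

    def match(p, w):
        if p == len(pattern):
            return w == len(words)   # pattern used up: all words must be used up too
        if pattern[p] == "#":
            # "#" absorbs zero words, or one word and then tries again
            return match(p + 1, w) or (w < len(words) and match(p, w + 1))
        if w < len(words) and pattern[p] in ("*", words[w]):
            return match(p + 1, w + 1)   # "*" or a literal word consumes exactly one word
        return False

    return match(0, 0)

print(topic_matches("*.orange.*", "quick.orange.rabbit"))       # True
print(topic_matches("*.orange.*", "quick.orange.male.rabbit"))  # False
print(topic_matches("lazy.#", "lazy"))                          # True
print(topic_matches("lazy.#", "lazy.pink.rabbit"))              # True
```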

Queues are bound to an exchange using a 'Binding'. A publisher sends a message to an exchange. The exchange accepts the message and routes it to one or more queues (or another exchange) based on the bindings. An exchange completely decouples a publisher from queues and the consumers that consume from those queues.

A queue stores the messages in memory or on disk and delivers them to consumers. A queue binds itself to an exchange using a 'Binding' which describes the criteria for the type of messages it is interested in.

A binding defines the relationship between an exchange and a queue. The simplest case is where the binding equals the queue name. A binding decouples a queue from an exchange. The same queue can be bound to any number of exchanges using the same criteria or different criteria. Different queues can be bound to the same exchange using the same routing criteria as well.

A message can be matched with more than one queue if two or more queues are bound with the same routing criteria.

Direct Exchange

The exchange does a direct match between the routing key provided in the message and the routing criteria used when a queue is bound to this exchange.

The most common use case is to bind the queue to the exchange using the queue name.

Topic Exchange

The exchange does a wildcard match between the routing key and the routing pattern specified in the binding. The routing key is treated as zero or more words delimited by '.', and the routing pattern may contain special wildcard characters: '*' matches exactly one word and '#' matches zero or more words.

Fanout Exchange

Queues are bound to this exchange with no arguments. Hence any message sent to this exchange will be forwarded to all queues bound to this exchange.
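For comparison with the other exchange types, a fanout exchange can be modelled as a plain broadcast. FanoutExchange and the queue names are hypothetical, and the routing_key parameter exists only to show that it plays no part in routing.

```python
class FanoutExchange:
    """Toy fanout exchange: every bound queue gets a copy of every message."""

    def __init__(self):
        self.queues = {}             # queue name -> messages delivered so far

    def bind(self, queue_name):
        # No binding arguments are needed.
        self.queues.setdefault(queue_name, [])

    def publish(self, body, routing_key=""):
        # The routing key is accepted but ignored.
        for messages in self.queues.values():
            messages.append(body)

logs = FanoutExchange()
logs.bind("to_disk")
logs.bind("to_screen")
logs.publish("service started", routing_key="ignored")
print(logs.queues)
# {'to_disk': ['service started'], 'to_screen': ['service started']}
```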