Scaling RabbitMQ on a CoreOS cluster through Docker
- Erlang Solutions Team
- 22nd Mar 2017
- 19 min of reading time
RabbitMQ provides, among other features, clustering capabilities. Using clustering, a group of properly configured hosts will behave the same as a single broker instance.
All the nodes of a RabbitMQ cluster share the definitions of vhosts, users, and exchanges, but not queues. By default a queue physically resides on the node where it was created; however, as of version 3.6.1, queue node ownership can be configured using Queue Master Location policies. Queues are globally defined and reachable by establishing a connection to any node of the cluster.
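As a quick illustration (not part of the walkthrough below), a Queue Master Location policy can be set from any node with rabbitmqctl; the policy name queue-location is just an example, and min-masters places new queue masters on the node currently hosting the fewest masters:
$ rabbitmqctl set_policy queue-location ".*" '{"queue-master-locator":"min-masters"}' --apply-to queues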
Modern architectures often involve container-based ways of scaling, such as Docker. In this post we will see how to create a dynamically scaling RabbitMQ cluster using CoreOS and Docker:
We will take you on a step-by-step journey from zero to the cluster.
We are going to use several technologies, although we will not get into the details of all of them. For instance, deep CoreOS/Docker knowledge is not required to execute this test.
It can be executed on your PC, and what you need is:
What we will do:
First we have to configure the CoreOS cluster:
1. Clone the vagrant repository:
$ git clone https://github.com/coreos/coreos-vagrant
$ cd coreos-vagrant
2. Use the user-data example file:
$ cp user-data.sample user-data
3. Configure the cluster parameters:
$ cp config.rb.sample config.rb
4. Open the file, then uncomment num_instances and change it to 3, or execute:
sed -i.bk 's/$num_instances=1/$num_instances=3/' config.rb
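You can confirm the change with a quick grep (just a sanity check, not in the original steps):
$ grep num_instances config.rb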
5. Start the machines using vagrant up:
$ vagrant up
Bringing machine 'core-01' up with 'virtualbox' provider...
Bringing machine 'core-02' up with 'virtualbox' provider...
Bringing machine 'core-03' up with 'virtualbox' provider...
6. Add the ssh key:
$ ssh-add ~/.vagrant.d/insecure_private_key
7. Use vagrant ssh core-XX -- -A to log in, e.g.:
$ vagrant ssh core-01 -- -A
$ vagrant ssh core-02 -- -A
$ vagrant ssh core-03 -- -A
8. Test your CoreOS cluster. Log in to the machine core-01:
$ vagrant ssh core-01 -- -A
Then run:
core@core-01 ~ $ fleetctl list-machines
MACHINE IP METADATA
5f676932... 172.17.8.103 -
995875fc... 172.17.8.102 -
e4ae7225... 172.17.8.101 -
9. Test the etcd service:
core@core-01 ~ $ etcdctl set /my-message "I love Italy"
I love Italy
10. Log in to core-02:
$ vagrant ssh core-02 -- -A
core@core-02 ~ $ etcdctl get /my-message
I love Italy
11. Log in to core-03:
$ vagrant ssh core-03 -- -A
core@core-03 ~ $ etcdctl get /my-message
I love Italy
As a result, all three machines should return the same value.
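You can also read the same key through etcd's v2 HTTP API from any of the machines, as a quick sanity check (output omitted):
core@core-02 ~ $ curl -L http://127.0.0.1:2379/v2/keys/my-message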
12. Test the Docker installation using docker -v:
core@core-01 ~ $ docker -v
Docker version 1.12.3, build 34a2ead
13. (Optional step) Run the first image with docker run:
core@core-01 ~ $ docker run ubuntu /bin/echo 'Hello world'
…
Hello world
The CoreOS cluster is ready, and we are able to run Docker inside CoreOS. Let's test our first RabbitMQ Docker instance:
14. Execute the official RabbitMQ docker image:
core@core-01 ~ $ docker run -d --hostname my-rabbit --name first_rabbit -p 15672:15672 rabbitmq:3-management
15. Check your eth1 Vagrant IP (used to access the machine):
core@core-01 ~ $ ifconfig | grep -A1 eth1
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.17.8.101 netmask 255.255.255.0 broadcast 172.17.8.255
Go to http://<your_ip>:15672/#/, in this case: http://172.17.8.101:15672/#/.
You should see the RabbitMQ management UI (default login guest / guest).
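If you prefer the command line, the management HTTP API answers on the same port; for example (guest/guest are the default credentials, output omitted):
core@core-01 ~ $ curl -u guest:guest http://172.17.8.101:15672/api/overview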
In order to scale up the node above, we should run another container with the --link parameter and execute rabbitmqctl join_cluster rabbit@<docker_host_name>. In order to scale down we should stop the second container and execute rabbitmqctl forget_cluster_node rabbit@<docker_host_name>.
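A minimal sketch of that manual procedure, assuming the first container was also started with a fixed Erlang cookie (container and host names here are examples, not taken from the steps above):
# scale up: second container, same cookie, linked to the first one
core@core-01 ~ $ docker run -d --hostname my-rabbit-2 --name second_rabbit --link first_rabbit:my-rabbit -e RABBITMQ_ERLANG_COOKIE='ilovebeam' rabbitmq:3-management
core@core-01 ~ $ docker exec second_rabbit rabbitmqctl stop_app
core@core-01 ~ $ docker exec second_rabbit rabbitmqctl join_cluster rabbit@my-rabbit
core@core-01 ~ $ docker exec second_rabbit rabbitmqctl start_app
# scale down: stop the second container and forget it from the first one
core@core-01 ~ $ docker stop second_rabbit
core@core-01 ~ $ docker exec first_rabbit rabbitmqctl forget_cluster_node rabbit@my-rabbit-2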
Doing this by hand for every container quickly becomes tedious; this is one of the areas where further automation is helpful.
We need Docker orchestration to configure and manage the Docker cluster. Among the available orchestration tools, we have chosen Docker Swarm.
Before going ahead we should remove all the running containers:
core@core-01 ~ $ docker rm -f $(docker ps -a -q)
And the images:
core@core-01 ~ $ docker rmi -f $(docker images -q)
Docker Swarm is the native clustering mechanism for Docker. We need to initialize one node and join the other nodes, as follows:
1. Swarm initialization: on core-01 execute docker swarm init --advertise-addr 172.17.8.101.
docker swarm init automatically generates the command (with the token) to join other nodes to the cluster, as:
core@core-01 ~ $ docker swarm init --advertise-addr 172.17.8.101
Swarm initialized: current node (2fyocfwfwy9o3akuf6a7mg19o) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-3xq8o0yc7h74agna72u2dhqv8blaw40zs1oow9io24u229y22z-4bysfgwdijzutfl6ydguqdu1s \
172.17.8.101:2377
A Docker Swarm cluster is composed of a leader node and worker nodes.
2. Join core-02 to the cluster with docker swarm join --token <token> <ip>:<port> (you can copy and paste the command generated in step 1). In this case:
core@core-02 ~ $ docker swarm join \
--token SWMTKN-1-3xq8o0yc7h74agna72u2dhqv8blaw40zs1oow9io24u229y22z-4bysfgwdijzutfl6ydguqdu1s \
172.17.8.101:2377
This node joined a swarm as a worker.
3. Join core-03 to the cluster with docker swarm join --token <token> <ip>:<port>:
core@core-03 ~ $ docker swarm join \
--token SWMTKN-1-3xq8o0yc7h74agna72u2dhqv8blaw40zs1oow9io24u229y22z-4bysfgwdijzutfl6ydguqdu1s \
172.17.8.101:2377
This node joined a swarm as a worker.
4. Check the swarm cluster using docker node ls:
core@core-01 ~ $ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
07m3d8ipj2kgdiv9jptv9k18a core-02 Ready Active
2fyocfwfwy9o3akuf6a7mg19o * core-01 Ready Active Leader
8cicxxpn5f86u3roembijanig core-03 Ready Active
There are different ways to create a RabbitMQ cluster:
- rabbitmqctl
- rabbitmq-autocluster (a plugin)
- rabbitmq-clusterer (a plugin)
To create the cluster we use the rabbitmq-autocluster plugin, since it supports different service discovery backends such as Consul, etcd2, DNS, AWS EC2 tags or AWS Autoscaling Groups.
We decided to use etcd2, which is why we tested it while configuring the CoreOS cluster (see steps 9-11).
1. Create a Docker network:
core@core-01~$ docker network create --driver overlay rabbitmq-network
The swarm makes the overlay network available only to nodes in the swarm that require it for a service.
2. Create a Docker service:
core@core-01 ~ $ docker service create --name rabbitmq-docker-service \
-p 15672:15672 -p 5672:5672 --network rabbitmq-network -e AUTOCLUSTER_TYPE=etcd \
-e ETCD_HOST=${COREOS_PRIVATE_IPV4} -e ETCD_TTL=30 -e RABBITMQ_ERLANG_COOKIE='ilovebeam' \
-e AUTOCLUSTER_CLEANUP=true -e CLEANUP_WARN_ONLY=false gsantomaggio/rabbitmq-autocluster
Note: the first time you may have to wait a few seconds while the image is downloaded.
3. Check the service list using docker service ls.
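For example (output omitted; docker service ps additionally shows on which node each replica has been scheduled):
core@core-01 ~ $ docker service ls
core@core-01 ~ $ docker service ps rabbitmq-docker-service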
4. You can check the RabbitMQ instance running on http://<your_vagrant_ip>:15672/#/, most likely http://172.17.8.101:15672/#/.
5. Scale your cluster using docker service scale, as:
core@core-01 ~ $ docker service scale rabbitmq-docker-service=5
rabbitmq-docker-service scaled to 5
Since the 3 CoreOS machines are in a cluster, you can use any of the 3 machines to access it, as:
6. Check the cluster status on the machine:
core@core-01 ~ $ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b480a09ea6e2 gsantomaggio/rabbitmq-autocluster:latest "docker-entrypoint.sh" 1 seconds ago Up Less than a second 4369/tcp, 5671-5672/tcp,
15671-15672/tcp, 25672/tcp rabbitmq-docker-service.3.1vp3o2w1eelzbpjngxncb9wur
aabb62882b1b gsantomaggio/rabbitmq-autocluster:latest "docker-entrypoint.sh" 6 seconds ago Up 5 seconds 4369/tcp, 5671-5672/tcp,
15671-15672/tcp, 25672/tcp rabbitmq-docker-service.1.f2larueov9lk33rwzael6oore
The same applies to the other nodes: each one runs more or less the same number of containers.
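To verify that the RabbitMQ nodes actually formed a single cluster, you can ask any of the running containers for the cluster status (the container ID is a placeholder; take one from the docker ps output above):
core@core-01 ~ $ docker exec -it <container_id> rabbitmqctl cluster_status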
Let's now look at the docker service parameters in detail:
Command | Description |
---|---|
docker service create | Create a docker service |
--name rabbitmq-docker-service | Set the service name, you can check the services list using docker service ls |
-p 15672:15672 -p 5672:5672 | Map the RabbitMQ standard ports: 5672 is the AMQP port and 15672 is the management UI port |
--network rabbitmq-network | Choose the docker network |
-e RABBITMQ_ERLANG_COOKIE='ilovebeam' | Set the same Erlang cookie value for all the containers; RabbitMQ needs it to create a cluster. With different Erlang cookies it is not possible to create a cluster. |
Next are the auto-cluster parameters:
Command | Description |
---|---|
-e AUTOCLUSTER_TYPE=etcd | set the service discovery backend = etcd |
-e ETCD_HOST=${COREOS_PRIVATE_IPV4} | The containers need to know the etcd2 IP. After executing the service you can query the database using the command line, e.g. etcdctl ls /rabbitmq --recursive, or using the HTTP API, e.g. curl -L http://127.0.0.1:2379/v2/keys/rabbitmq |
-e ETCD_TTL=30 | Used to specify how long a node can be down before it is removed from etcd’s list of RabbitMQ nodes in the cluster |
-e AUTOCLUSTER_CLEANUP=true | Enables a periodic check that removes any nodes that are not alive in the cluster and no longer listed in the service discovery list. Scaling down removes one or more containers, and their nodes are removed from the etcd database; see for example: docker service scale rabbitmq-docker-service=4 |
-e CLEANUP_WARN_ONLY=false | If set, the plugin will only warn about nodes that it would clean up. AUTOCLUSTER_CLEANUP requires CLEANUP_WARN_ONLY=false to work. |
gsantomaggio/rabbitmq-autocluster | The official Docker image does not support the autocluster plugin (in my personal opinion it should). I created a Docker image and registered it on Docker Hub. |
Setting AUTOCLUSTER_CLEANUP to true removes the node automatically; if AUTOCLUSTER_CLEANUP is false you need to remove the node manually.
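A minimal sketch of the manual removal, run from any surviving container (the container ID and node name are placeholders):
core@core-01 ~ $ docker exec -it <running_container_id> rabbitmqctl forget_cluster_node rabbit@<stopped_container_hostname>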
Scaling down with AUTOCLUSTER_CLEANUP can be very dangerous: if there are no HA policies, all the queues and messages stored on the node will be lost. To enable an HA policy you can use the command line or the HTTP API; in this case the easier way is the HTTP API, as:
curl -u guest:guest -H "Content-Type: application/json" -X PUT \
-d '{"pattern":"","definition":{"ha-mode":"exactly","ha-params":3,"ha-sync-mode":"automatic"}}' \
http://172.17.8.101:15672/api/policies/%2f/ha-3-nodes
Note: Enabling mirrored queues across all the nodes could impact performance, especially when the number of nodes is undefined. Using "ha-mode":"exactly","ha-params":3 we enable mirroring on only 3 nodes. Scaling down should therefore be done one node at a time, so that RabbitMQ can move the mirrors to other nodes.
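If you prefer the command line, the same policy can also be set with rabbitmqctl from inside any container of the service (the container ID is a placeholder; the ".*" pattern matches all queues, like the empty pattern above):
core@core-01 ~ $ docker exec -it <container_id> rabbitmqctl set_policy ha-3-nodes ".*" '{"ha-mode":"exactly","ha-params":3,"ha-sync-mode":"automatic"}'
You can then list the configured policies with curl -u guest:guest http://172.17.8.101:15672/api/policies/%2f.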
RabbitMQ can easily scale inside Docker: each RabbitMQ node has its own files and does not need to share anything through the file system. It fits perfectly with containers.
This architecture implements important features such as:
Scaling RabbitMQ on Docker and CoreOS is easy and powerful. We are testing and implementing the same environment using different orchestration and service discovery tools, such as Kubernetes and Consul. Note that we still consider this architecture experimental.
Here you can see the final result:
Enjoy!
Erlang Solutions is the world leader in RabbitMQ consultancy, development, and support.
We can help you design, set up, operate and optimise a system with RabbitMQ. Got a system with more than the typical requirements? We also offer RabbitMQ customisation and bespoke support.
RabbitMQ is the most deployed open source message broker. It provides a highly available solution to be used as a message bus, as a routing layer for microservices, or as a mediation layer for legacy systems. Find out how our world-leading RabbitMQ experts can help you.