Loopabull
Loopabull is an event-driven, Ansible-based automation engine. It is used for various tasks and was originally intended for Release Engineering automation.
Contact Information
- Owner: Adam Miller (maxamillion), Pierre-Yves Chibon (pingou)
- Contact: #fedora-admin, #fedora-releng, #fedora-noc, sysadmin-main, sysadmin-releng
- Location: loopabull01.phx2.fedoraproject.org, loopabull01.stg.phx2.fedoraproject.org
- Purpose: Event-driven automation of tasks within the Fedora Infrastructure and Fedora Release Engineering
Overview
The loopabull system is set up so that when an event takes place within the infrastructure and a fedmsg is sent, loopabull consumes that message, triggers the Ansible playbook that shares a name with the fedmsg topic, and passes the payload of the fedmsg to the playbook as extra variables.
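The topic-to-playbook mapping described above can be sketched in a few lines of Python. This is an illustration only, not loopabull's actual code: the function name, the playbook directory, the `.yml` suffix, and the example payload fields are all assumptions made for the sketch. The only external interface it relies on is the `--extra-vars` option of `ansible-playbook`, which accepts a JSON string.

```python
import json


def build_playbook_command(topic, payload, playbook_dir="/srv/loopabull/playbooks"):
    """Return the ansible-playbook argv for a fedmsg topic and its payload.

    playbook_dir and the ".yml" suffix are hypothetical values for this sketch.
    """
    # The playbook shares its name with the fedmsg topic.
    playbook = f"{playbook_dir.rstrip('/')}/{topic}.yml"
    # The message payload is passed to the playbook as extra variables (JSON form).
    extra_vars = json.dumps(payload, sort_keys=True)
    return ["ansible-playbook", playbook, "--extra-vars", extra_vars]


# Illustrative message: the payload fields here are made up for the example.
cmd = build_playbook_command(
    "org.fedoraproject.prod.buildsys.build.state.change",
    {"name": "example-package", "new": 1},
)
print(" ".join(cmd))
```

Because the playbook name is derived directly from the topic, a topic with no matching playbook file simply has nothing to run, which is why adding a new automation amounts to adding a playbook named after the topic.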
Setup
The setup is relatively simple: the Overview above describes it, and a more detailed version can be found in the releng docs.
+-----------------+             +-------------------------------+
|                 |             |                               |
|     fedmsg      +------------>|            Looper             |
|                 |             |    (fedmsg handler plugin)    |
|                 |             |                               |
+-----------------+             +-------------------------------+
                                                |
                                                |
             +-------------------+              |
             |                   |              |
             |                   |              |
             |     Loopabull     +<-------------+
             |   (Event Loop)    |
             |                   |
             +---------+---------+
                       |
                       |
                       |
                       |
                       V
            +----------+-----------+
            |                      |
            |   ansible-playbook   |
            |                      |
            +----------------------+
Expanding loopabull
Instructions for expanding loopabull’s usage are available at: https://pagure.io/Fedora-Infra/loopabull-tasks
Outage
If loopabull isn’t responding or isn’t running playbooks as it should, work through the following scenarios.
What is going on?
A few commands may help you figure out what is going on:
- Check the status of the different services:
systemctl | grep loopabull
- Follow the logs of the different services:
journalctl -lfu loopabull -u loopabull@1 -u loopabull@2 -u loopabull@3 \
    -u loopabull@4 -u loopabull@5
If a playbook returns a non-zero exit code, the worker running it is stopped. If that happens, review the logs carefully to assess what led to the failure so that it can be prevented in the future.
- Monitoring the queue size
The loopabull service listens to the fedmsg bus and puts incoming messages into a RabbitMQ (AMQP) queue for the workers to process. To see how many messages are waiting to be processed by the workers, check the queue size using:
rabbitmqctl list_queues
The output will be something like:
Listing queues ...
workers 489989
...done.
Here, workers is the name of the queue used by loopabull and 489989 is the number of messages in that queue (yes, that day we were recovering from a several-day-long outage).
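For scripted checks, the queue listing can be parsed rather than read by eye. The following is a minimal sketch, assuming the two-column "name count" output format shown above; the function name `queue_depths` is hypothetical, and queue names containing spaces are not handled.

```python
def queue_depths(output):
    """Parse `rabbitmqctl list_queues` text into {queue_name: message_count}."""
    depths = {}
    for line in output.splitlines():
        parts = line.split()
        # Data rows look like "<name> <count>". Banner lines such as
        # "Listing queues ..." and "...done." do not match and are skipped.
        if len(parts) == 2 and parts[1].isdigit():
            depths[parts[0]] = int(parts[1])
    return depths


# Sample text copied from the output shown above.
sample = """Listing queues ...
workers 489989
...done.
"""
pending = queue_depths(sample)["workers"]
print(pending)
```

On the loopabull host, the text could come from `subprocess.run(["rabbitmqctl", "list_queues"], capture_output=True, text=True).stdout`, and the resulting number could be compared against a threshold of your choosing.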
Network Interruption
Sometimes, if the network is interrupted, the loopabull service will hang because the fedmsg listener holds a dead socket open. The service and its workers simply need to be restarted at that point.
systemctl restart loopabull loopabull@1 loopabull@2 loopabull@3 \
    loopabull@4 loopabull@5
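The unit names in the restart command follow a regular pattern: one listener unit plus numbered worker instances. A small sketch like the one below can generate the list instead of typing it by hand. The helper names are hypothetical, and the default of five workers is an assumption taken from the commands in this document; adjust it if the deployment differs.

```python
def loopabull_units(worker_count=5):
    """Return the listener unit followed by its numbered worker units."""
    return ["loopabull"] + [f"loopabull@{n}" for n in range(1, worker_count + 1)]


def restart_command(worker_count=5):
    """Build the argv that restarts the listener and all workers together."""
    return ["systemctl", "restart"] + loopabull_units(worker_count)


restart_argv = restart_command()
print(" ".join(restart_argv))
```

The resulting list can be passed to `subprocess.run(restart_argv, check=True)` by a user with sufficient privileges.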