Working with Celery
Architecture
TL;DR: We create scalable kubernetes deployments to correspond with particular celery queues.
The long version
A Celery queue is the namespace in redis where a task is held until it is processed. A Celery worker is a daemon that monitors one or more queues for tasks to complete, then does the work of completing the tasks. A kubernetes deployment is a scalable collection of daemons that do something, like serve HTML pages, process documents, or complete async tasks.
Our approach as of freelawproject/infrastructure#410 is to create a handful of kubernetes deployments that run Celery workers configured to process particular celery queues. This means that we have a deployment that does nightly crawl work, another that does elastic indexing, and so forth. Each deployment monitors one or two queues, so by enqueuing your tasks to a particular queue, you ensure that they are completed by a particular deployment.
For example, let's say we want to hash passwords. We might have a task called hash_a_password(pwd). To call it with celery, we'd run:
hash_a_password.delay(pwd)
Or the equivalent:
hash_a_password.apply_async(args=(pwd,))
This would send the task to the default celery queue (named celery), and it would get processed. But say we want to scale up to hash hundreds of passwords per second. We wouldn't want to put thousands of tasks onto the default queue, because they'd interfere with the other work going on there. That queue is currently processed by six workers, so we wouldn't want to tie them all up.
A better approach is to create a particular queue for this called password_hasher and to scale that up to a big kubernetes deployment. If we were hashing lots of passwords long term, that would be a good solution.
Often, though, we do batch work that doesn't need its own long-term deployment. It's usually a one-off job that runs for a week or a month or two, and creating a kubernetes deployment for it would be kind of a pain.
To solve this, we have four queues with associated deployments that we can use for random batch jobs:
| Deployment Name | Queue |
|---|---|
| celery-batch0 | batch0 |
| celery-batch1 | batch1 |
| celery-batch2 | batch2 |
| celery-batch3 | batch3 |
These deployments are ready to go, but they're scaled to zero workers so they don't take up resources in our cluster when they're not in use.
Read on for how to check out a deployment and scale it up.
Check-Out Procedure
When you're ready to do your batch work, begin by checking out the batch you want to use. Fill in this spreadsheet to claim a batch as your own:
https://docs.google.com/spreadsheets/d/1d65isUj7GJzv2PjUwj9WELZ63MeY01MeTId1hc8gEII/
This way, two developers don't accidentally use the same deployment/queue/workers.
Great. Now that batch deployment is yours, and you can start sending things to it with a command like:
./manage.py hash_passwords --queue batch0
If you stop there, you'll crash the website. Keep reading.
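Under the hood, a management command like that usually just forwards its --queue option to apply_async for each item. A minimal, stdlib-only sketch of that dispatch logic (the function and option names are hypothetical, and `enqueue` stands in for the real apply_async call so the sketch runs without a broker):

```python
import argparse


def run_batch(argv, items, enqueue):
    """Parse --queue and hand each item to the enqueue callable.

    In the real command, enqueue would be something like the
    apply_async method of a hash_passwords task (hypothetical name).
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--queue", default="celery",
                        help="Celery queue to send tasks to")
    opts = parser.parse_args(argv)
    for item in items:
        enqueue(args=(item,), queue=opts.queue)
    return opts.queue


# Collect the calls instead of talking to a broker:
calls = []
queue = run_batch(["--queue", "batch0"], ["pwd1", "pwd2"],
                  lambda **kw: calls.append(kw))
print(queue, len(calls))  # batch0 2
```

The key point is that the command only *sends* tasks; nothing consumes them until the matching deployment is scaled up.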
Monitoring Queue Length
At this point, you've done a bad thing because you're sending tasks to the batch0 celery queue, but you're not consuming tasks from that queue. This means that the queue will grow and grow until redis runs out of memory and everything falls over. Why? Because as mentioned above, the batch deployments are scaled to zero by default — meaning they have no workers! To make it actually process your tasks, you'll need to scale up the deployment (see next section).
To monitor the length of your queue, you can go to:
https://www.courtlistener.com/monitoring/celery-queues/
And to monitor the overall memory usage of redis (and the overall system health) you can use the Keep-It-Up dashboard:
In general, a queue length under about 100 is good, and under 500 is fine. If you hit 1,000, or if your tasks have large objects as their arguments, be careful.
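If you'd rather check the backlog programmatically: with the Redis broker (in its default configuration), Celery keeps each queue's pending tasks in a Redis list named after the queue, so the list length is the queue depth. A sketch, with a stub standing in for a real redis.Redis client so it runs anywhere; the threshold is the rule of thumb above:

```python
def queue_depth(redis_client, queue_name):
    # With the Redis broker, pending tasks for a queue live in a
    # Redis list of the same name, so LLEN gives the backlog size.
    return redis_client.llen(queue_name)


def is_safe(depth, limit=1000):
    # Rule of thumb from above: under ~100 is good, under 500 is
    # fine, and 1,000+ deserves caution.
    return depth < limit


# Stub client so the sketch runs without a Redis server; in real
# use you'd pass a redis.Redis(...) instance instead.
class FakeRedis:
    def __init__(self, lists):
        self.lists = lists

    def llen(self, name):
        return len(self.lists.get(name, []))


client = FakeRedis({"batch0": ["task"] * 42})
depth = queue_depth(client, "batch0")
print(depth, is_safe(depth))  # 42 True
```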
Scaling Up a Deployment
You can scale up a deployment with something like:
kubectl -n court-listener scale deployment celery-batch0 --replicas=1
That would give you one worker, which would dutifully process the batch0 queue one item at a time. Nice!
From here, you can slowly scale up the number of replicas you're using while monitoring as above. If your task only needs a bunch of CPU, you can probably scale up pretty high to get the job done. But usually our tasks rely on external resources like the DB, Redis, or Elastic. In that case, scale up slowly while monitoring the services your task relies on to make sure they aren't overly impacted.
Some normal replica counts are:
| Replicas Count | Description |
|---|---|
| 0 | The deployment is not using memory and won't process anything. This is the default state for our deployments and how you should leave them when you're done using them. |
| 1-5 | Smallish deployment. This doesn't usually cause many problems. |
| 6-10 | Medium deployment. Generally this is when things start making trouble. |
| 10-20 | Big stuff. It can work, but be careful and know your workload. |
| > 20 | We rarely do this because sending more than 20 requests to Elastic or our DB at a time is asking for trouble. |
Note: Every worker costs money and creates CO2. If you're not using a worker, scale it back down to zero until you're ready to use it again.
When you're done
When you're done with your batch work or when you're setting it aside for a bit:
1. Go back to the spreadsheet and clear out your reservation:
   https://docs.google.com/spreadsheets/d/1d65isUj7GJzv2PjUwj9WELZ63MeY01MeTId1hc8gEII/
   Otherwise, your colleagues will be annoyed that they have to ask you, "Did you forget to return the celery reservation?"
2. Scale your deployment back to zero. This is important because deployments cost more money the more workers they have (even if the workers are idle).
If we can't remember to do this, we'll have to have somebody checking out and monitoring the batch deployments. Hopefully that doesn't become necessary! I encourage you to set reminders to scale things down so they don't sit there idly wasting money.
Launching the Built-In Dashboard
If you are working with celery tasks, it's helpful to use celery events, a simple curses monitor that displays task and worker history. You can inspect the result and traceback of tasks, and it also supports some management commands like rate limiting and shutting down workers.

1. First, set CELERY_TASK_ALWAYS_EAGER to False under the DEVELOPMENT settings in cl.settings.third_party.celery, so that your tasks run in the async worker:

   CELERY_TASK_ALWAYS_EAGER = False

   Remember to set it back to True once debugging is finished; otherwise, tests aren't going to pass.

2. To use celery events, run a celery worker that sends events by adding the -E option:

   docker exec -it cl-celery celery -A cl worker -c 1 -E

   For concurrent workers, change the -c option in the previous command: -c {NUMBER_OF_CONCURRENT_WORKERS}. You may also need to update the CELERY_WORKER_CONCURRENCY value in cl.settings.third_party.celery.

   Note that if the Celery tasks you are testing are assigned to a special queue, such as "batch1", you will need to pass the queue name to the worker from step 2: docker exec -it cl-celery celery -A cl worker --queues batch1. Otherwise, you would see the tasks being sent, but no worker executing them.

3. Then start the celery events monitor:

   docker exec -it cl-celery celery -A cl events

Now, when executing a task, you'll be able to monitor it with celery events.