1. 30

I recently tried to get Celery up and running but was totally stumped by worker vs. beat modes. Gcloud App Engine Cron seems like my next bet.

What do you use and why?

  1. 25

    I give away Sidekiq and sell Sidekiq Enterprise. If you use Ruby/Rails, it’s the standard. https://sidekiq.org

    1. 3

      it’s the standard

      For a very good reason. Wonderful piece of software. Thanks @mperham!

      1. 2

        Whoa, that’s awesome. I have been using Sidekiq a lot. Great bit of software.

        1. 1

          Hey @mperham!

          Thanks for all of your code. I’ve used and loved several of your projects.

          Out of curiosity what is the current status of Faktory? It sounded like an interesting project, but the rate of development looks like it kind of cratered last Dec.

          1. 2

            It’s under active development but summer has been slow due to family issues. Latest:


        2. 11

          For my own projects I use cron exclusively.

          At work we use cron for system-level tasks (e.g. backups) and Celery for application-level tasks (e.g. periodically poll inventory from warehouses), with RabbitMQ as its backend.

          Also, think about monitoring those tasks, especially backups. A lot of people don’t and it’s a recipe for disaster. I have started using https://cronhub.io/ recently but there are other similar services such as https://cronitor.io/, or you can roll your own like I used to do.
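A minimal sketch of the "roll your own" monitoring idea above, assuming a check-in style monitor that alerts when an expected ping does not arrive. The function name `run_and_report`, its parameters, and the example URL are hypothetical and are not taken from cronhub.io or cronitor.io.

```python
import subprocess
import urllib.request


def run_and_report(command, ping_url, ping=None, timeout=10):
    """Run `command`; ping the monitor only if it exits with status 0.

    Returns the command's exit status. The monitor is assumed to raise an
    alert when the ping it expects does not arrive on schedule.
    """
    if ping is None:
        # Default: a plain HTTP GET against the (hypothetical) check-in URL.
        def ping(url):
            urllib.request.urlopen(url, timeout=timeout).close()

    status = subprocess.run(command).returncode
    if status == 0:
        ping(ping_url)
    return status
```

A cron entry would then call a small script that wraps the real job, for example `run_and_report(["/usr/local/bin/backup.sh"], "https://monitor.example/ping/backup")`, where both the path and the URL are placeholders. A failed or skipped backup produces no ping, which is what the monitor alerts on.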

          1. 3

            I would like to second this post.

            The programming language/framework specific scheduling parts don’t matter all that much, but the message bus/backend parts do. RabbitMQ and other AMQP solutions are pretty good; try to avoid a simple key-value-store-based backend such as Redis.

            1. 1

              Any specific reason for avoiding Redis/key-value stores? I’ve only had one such experience (resque-php) and the main downside seemed to be the need for polling, but honestly I don’t know if that’s because of Redis or because of resque-php’s implementation. I’d like to hear more about that!

              1. 2

                It’s too simplistic. I mean it works for very basic usage, but once you start caring about things like HA or backups or wider usage (so multiple vhosts in rabbitmq terminology) or logging/monitoring it kind of shows how inadequate it is.

                Redis clustering is not that nice. Introspectability - it’s on the wrong level, you don’t generally care about the key/value parts, you care more about the message bus parts and since Redis isn’t aware of that it can’t help you with it.

          2. 10

            Kubernetes has the ability to run jobs on a cron schedule, and you can launch one-off, run-to-completion pods as tasks.

            1. 2

              This is what we do too.

            2. 6

              I’d use Dramatiq with APScheduler, but I wrote the former so I’m biased!

              1. 2

                Was definitely also looking at Dramatiq! Can you explain why one needs both APScheduler and Dramatiq?

                I also nearly missed the Cron-like scheduling in the cookbook when I looked at the Scheduling Messages portion of the guide.

                1. 2

                  APScheduler (or something like it) is necessary if you want to execute tasks at regular intervals (i.e. “cron-like scheduling”) since Dramatiq doesn’t have anything built-in to do that. Of course, you can also just use cron instead of APScheduler. :D

              2. 6

                We use cron at Buffer but are gradually moving our cron jobs to Kubernetes. It’s a lot easier to schedule jobs on Kubernetes.

                1. 6

                  I’m not Celery’s biggest fan, however I can explain the modes to you. beat is for repeating tasks on a schedule, optionally using a crontab-like syntax for scheduling. A single beat (last I checked) must be in the cluster to kick off scheduled jobs and at least 1 worker must be present to execute said jobs. In some cases, where you don’t have periodic or scheduled tasks, you might not run any beat nodes and only worker nodes.

                  Lately, I’ve been using AWS Lambda to run small jobs using their Cron Scheduling syntax. For larger jobs, I put a message on a queue. This has worked pretty well without involving a lot of infrastructure.

                  If you have a box lying around with spare capacity, just using Cron is pretty fine.
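The beat/worker split described in this comment can be modelled with the standard library alone. This is only a conceptual sketch, not Celery's API: in Celery the two roles are separate processes connected by a broker, whereas here they are threads sharing an in-process queue. The names `beat`, `worker`, `demo`, and the `poll_inventory` task are made up for illustration.

```python
import queue
import threading


def beat(task_queue, task_name, interval_s, ticks, stop):
    """Scheduler role: enqueue `task_name` every `interval_s` seconds, `ticks` times."""
    for _ in range(ticks):
        task_queue.put(task_name)
        if stop.wait(interval_s):  # returns True early if stop is set
            break


def worker(task_queue, registry, results):
    """Worker role: pull task names off the queue and execute them."""
    while True:
        name = task_queue.get()
        if name is None:  # sentinel: shut down
            break
        results.append(registry[name]())


def demo():
    tasks = queue.Queue()
    results = []
    registry = {"poll_inventory": lambda: "polled"}  # hypothetical task
    stop = threading.Event()

    w = threading.Thread(target=worker, args=(tasks, registry, results))
    b = threading.Thread(target=beat, args=(tasks, "poll_inventory", 0.01, 3, stop))
    w.start()
    b.start()
    b.join()         # the scheduler has enqueued all of its ticks
    tasks.put(None)  # tell the worker to stop once the queue is drained
    w.join()
    return results
```

The scheduler never runs a task itself; it only decides when a task name goes on the queue. That is why a setup with no periodic tasks can run workers alone, while a scheduler without any worker would enqueue jobs that nothing executes.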

                  1. 5

                    We use Apache Airflow at my company. I cannot say why it was chosen over other alternatives, as that happened before I joined the company. It is open source and actively maintained by a not-so-small community. It is written in Python, quite customizable, and you can write plugins in Python. It is not very complicated to deploy and configure. You can schedule not only on timers but also on other triggers, such as a file appearing on a file system or an HTTP endpoint being called. The tasks are expressed as graphs in clean Python code and can be arbitrarily complex. You can certainly use it to cover complex ETL pipelines.

                    Having said that, Apache Airflow has its quirks. Multi-user is a bit of a problem and while it is definitely powerful it adds to overall complexity, too.
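To illustrate what "tasks expressed as graphs" means in practice, here is a small standard-library sketch that runs callables in dependency order. It is not Airflow's API; `run_pipeline` and the task names are hypothetical, and it assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies):
    """Run each task after all of its upstream tasks.

    tasks: dict mapping task name -> zero-argument callable
    dependencies: dict mapping task name -> set of upstream task names
    Returns the list of task names in the order they ran.
    """
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        order.append(name)
    return order
```

For a three-step ETL chain, the dependencies would be `{"transform": {"extract"}, "load": {"transform"}}`, so `extract` runs first and `load` last. A real workflow engine adds scheduling, retries, and distributed execution on top of this ordering.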

                    1. 3

                      We use cron for task scheduling, and then we have a Kafka queue where all events are added (made by task scheduling and otherwise) and then processed by a number of different “workers” depending on the category of tasks. We used to use Amazon SQS, but it’s fairly slow and not 100% reliable (messages would get dropped for no reason).

                      As for Celery, there are a lot of pitfalls, so make sure you test everything thoroughly if you go that route. beat is the task scheduler and worker is the runner, and each requires its own process.

                      1. 3

                        The last time I deployed a runner I used beanstalkd along with this Python binding. This tool doesn’t use scheduling, outside of its ability to restart / reorder jobs.

                        It is quite basic but it did the bit I needed: I had a threaded work queue that would sometimes need to fork. Rather than dealing with forking from inside a worker in a work pool, I would then send the job to beanstalk and fork from the runner instead.

                        1. 3

                          We use Taskcluster, a home developed CI system. It supports both per-checkin and cron-like tasks. We use it because it was becoming increasingly clear that our previous buildbot based CI was a major productivity bottleneck, and other off the shelf solutions couldn’t handle the scale and complexity we needed.

                          While possibly a case of not-invented-here syndrome, I’d argue the decision to build from scratch was the right one. We are now in a very good spot where developers can self-serve their own tasks simply by adding some in-tree configuration. The tasks can run on a wide variety of platforms including AWS, Azure, physical machines in our data centre and more.

                          The taskcluster team has been working to make it easier for other organizations to run their own instances.

                          1. 2

                            For personal stuff on my server, I use cron. At work, we have a number of tasks driven by the Windows Task Runner, for things like ticket stress levels, checking servers, and so on.

                            Unless you’re talking about a task queue? That’s a different thing, and one that I’ve yet to make much use of personally. I’ve written some synchronizing programs, specifically for synchronizing calendars back and forth between Exchange and a CRM, which are similar in notion to a task queue, but a bit more specialized.

                            I’ve not set up Celery or similar programs myself as of yet.

                            1. 1

                              For the task runner, we use queue_classic. Although “database as message queue” is considered a classic antipattern, it’s suitable for our load and our use cases (also, Postgres has NOTIFY, so polling is not required).

                              For the scheduler we use cron configured with whenever, which is not very convenient. We are looking for a replacement for it, but everything else is too complex.

                                1. 1

                                  If you’re doing Java or C#, I’ve used Quartz(.NET) with success. It’s XML configured but it’s bearable once you learn it, and it has a bunch of different failure handling modes which come in handy when doing real things :-)

                                  1. 1

                                    I’ve always stuck with cron for scheduling. On my own laptop I use Task Spooler to queue up expensive commands so that they run one at a time in the background (e.g. large downloads and network transfers; as well as commands which may interfere if run concurrently)

                                    1. 1

                                      As we are moving to Nomad for cluster orchestration we also use their Batch/Periodic jobs for scheduling tasks. (We also wrote our own based on something from Spring, I’m not touching that…)

                                      1. 1

                                        I’ve also used Celery. Run one beat mode process and then a worker process for each queue/task type, all configured with supervisord. It’s kind of annoying and confusing to set up because you need RabbitMQ. Once or twice a year the EC2 instance RabbitMQ runs on gets hosed and your whole setup goes down, so then you need to look into RabbitMQ clustering.

                                        The other alternative I’ve tried is CloudWatch Events on a cron schedule, which sends messages to a queue or SNS topic. SQS and SNS are simpler in terms of code but not as high throughput. It’s also a ton of work to automate the setup of SNS topics and SQS queue subscriptions, so you’d need something like Terraform to handle that for you.

                                        Overall I’m not stoked on either solution. The next thing I’d want to try is CloudWatch + Kinesis.