mirror of
https://github.com/aljazceru/Tutorial-Codebase-Knowledge.git
synced 2025-12-19 07:24:20 +01:00
259 lines
12 KiB
Markdown
259 lines
12 KiB
Markdown
---
|
|
layout: default
|
|
title: "Configuration"
|
|
parent: "Celery"
|
|
nav_order: 2
|
|
---
|
|
|
|
# Chapter 2: Configuration - Telling Celery How to Work
|
|
|
|
In [Chapter 1: The Celery App](01_celery_app.md), we created our first `Celery` app instance. We gave it a name and told it where our message broker and result backend were located using the `broker` and `backend` arguments:
|
|
|
|
```python
|
|
# From Chapter 1
|
|
from celery import Celery
|
|
|
|
app = Celery('tasks',
|
|
broker='redis://localhost:6379/0',
|
|
backend='redis://localhost:6379/0')
|
|
```
|
|
|
|
This worked, but what if we want to change settings later, or manage many different settings? Passing everything directly when creating the `app` can become messy.
|
|
|
|
## What Problem Does Configuration Solve?
|
|
|
|
Think of Celery as a busy workshop with different stations (workers, schedulers) and tools (message brokers, result storage). **Configuration** is the central instruction manual or settings panel for this entire workshop.
|
|
|
|
It tells Celery things like:
|
|
|
|
* **Where is the message broker?** (The post office for tasks)
|
|
* **Where should results be stored?** (The filing cabinet for completed work)
|
|
* **How should tasks be handled?** (e.g., What format should the messages use? Are there any speed limits for certain tasks?)
|
|
* **How should the workers behave?** (e.g., How many tasks can they work on at once?)
|
|
* **How should scheduled tasks run?** (e.g., What timezone should be used?)
|
|
|
|
Without configuration, Celery wouldn't know how to connect to your broker, where to put results, or how to manage the workflow. Configuration allows you to customize Celery to fit your specific needs.
|
|
|
|
## Key Configuration Concepts
|
|
|
|
While Celery has many settings, here are some fundamental ones you'll encounter often:
|
|
|
|
1. **`broker_url`**: The address of your message broker (like Redis or RabbitMQ). This is essential for sending and receiving task messages. We'll learn more about brokers in [Chapter 4: Broker Connection (AMQP)](04_broker_connection__amqp_.md).
|
|
2. **`result_backend`**: The address of your result store. This is needed if you want to keep track of task status or retrieve return values. We cover this in [Chapter 6: Result Backend](06_result_backend.md).
|
|
3. **`include`**: A list of module names that the Celery worker should import when it starts. This is often where your task definitions live (like the `add` task from Chapter 1).
|
|
4. **`task_serializer`**: Defines the format used to package task messages before sending them to the broker (e.g., 'json', 'pickle'). 'json' is a safe and common default.
|
|
5. **`timezone`**: Sets the timezone Celery uses, which is important for scheduled tasks managed by [Chapter 7: Beat (Scheduler)](07_beat__scheduler_.md).
|
|
|
|
## How to Configure Your Celery App
|
|
|
|
Celery is flexible and offers several ways to set its configuration.
|
|
|
|
**Method 1: Directly on the App Object (After Creation)**
|
|
|
|
You can update the configuration *after* creating the `Celery` app instance using the `app.conf.update()` method. This is handy for simple adjustments or quick tests.
|
|
|
|
```python
|
|
# celery_app.py
|
|
from celery import Celery
|
|
|
|
# Create the app (maybe with initial settings)
|
|
app = Celery('tasks', broker='redis://localhost:6379/0')
|
|
|
|
# Update configuration afterwards
|
|
app.conf.update(
|
|
result_backend='redis://localhost:6379/1', # Use database 1 for results
|
|
task_serializer='json',
|
|
result_serializer='json',
|
|
accept_content=['json'], # Only accept json formatted tasks
|
|
timezone='Europe/Oslo',
|
|
enable_utc=True, # Use UTC timezone internally
|
|
# Add task modules to import when worker starts
|
|
include=['my_tasks'] # Assumes you have a file my_tasks.py with tasks
|
|
)
|
|
|
|
print(f"Broker URL set to: {app.conf.broker_url}")
|
|
print(f"Result backend set to: {app.conf.result_backend}")
|
|
print(f"Timezone set to: {app.conf.timezone}")
|
|
```
|
|
|
|
**Explanation:**
|
|
|
|
* We create the `app` like before, potentially setting some initial config like the `broker`.
|
|
* `app.conf.update(...)`: We pass a Python dictionary to this method. The keys are Celery setting names (like `result_backend`, `timezone`), and the values are what we want to set them to.
|
|
* `app.conf` is the central configuration object attached to your `app` instance.
|
|
|
|
**Method 2: Dedicated Configuration Module (Recommended)**
|
|
|
|
For most projects, especially larger ones, it's cleaner to keep your Celery settings in a separate Python file (e.g., `celeryconfig.py`).
|
|
|
|
1. **Create `celeryconfig.py`:**
|
|
|
|
```python
|
|
# celeryconfig.py
|
|
|
|
# Broker settings
|
|
broker_url = 'redis://localhost:6379/0'
|
|
|
|
# Result backend settings
|
|
result_backend = 'redis://localhost:6379/1'
|
|
|
|
# Task settings
|
|
task_serializer = 'json'
|
|
result_serializer = 'json'
|
|
accept_content = ['json']
|
|
|
|
# Timezone settings
|
|
timezone = 'America/New_York'
|
|
enable_utc = True # Recommended
|
|
|
|
# List of modules to import when the Celery worker starts.
|
|
imports = ('proj.tasks',) # Example: Assuming tasks are in proj/tasks.py
|
|
```
|
|
|
|
**Explanation:**
|
|
* This is just a standard Python file.
|
|
* We define variables whose names match the Celery configuration settings (e.g., `broker_url`, `timezone`). Celery expects these specific names.
|
|
|
|
2. **Load the configuration in your app file (`celery_app.py`):**
|
|
|
|
```python
|
|
# celery_app.py
|
|
from celery import Celery
|
|
|
|
# Create the app instance (no need to pass broker/backend here now)
|
|
app = Celery('tasks')
|
|
|
|
# Load configuration from the 'celeryconfig' module
|
|
# Assumes celeryconfig.py is in the same directory or Python path
|
|
app.config_from_object('celeryconfig')
|
|
|
|
print(f"Loaded Broker URL from config file: {app.conf.broker_url}")
|
|
print(f"Loaded Timezone from config file: {app.conf.timezone}")
|
|
|
|
# You might still define tasks in this file or in the modules listed
|
|
# in celeryconfig.imports
|
|
@app.task
|
|
def multiply(x, y):
|
|
return x * y
|
|
```
|
|
|
|
**Explanation:**
|
|
* `app = Celery('tasks')`: We create the app instance, but we don't need to specify the broker or backend here because they will be loaded from the file.
|
|
* `app.config_from_object('celeryconfig')`: This is the key line. It tells Celery to:
|
|
* Find a module named `celeryconfig`.
|
|
* Look at all the uppercase variables defined in that module.
|
|
* Use those variables to configure the `app`.
|
|
|
|
This approach keeps your settings organized and separate from your application logic.
|
|
|
|
**Method 3: Environment Variables**
|
|
|
|
Celery settings can also be controlled via environment variables. This is very useful for deployments (e.g., using Docker) where you might want to change the broker address without changing code.
|
|
|
|
Environment variable names typically follow the pattern `CELERY_<SETTING_NAME_IN_UPPERCASE>`.
|
|
|
|
For example, you could set the broker URL in your terminal before running your app or worker:
|
|
|
|
```bash
|
|
# In your terminal (Linux/macOS)
|
|
export CELERY_BROKER_URL='amqp://guest:guest@localhost:5672//'
|
|
export CELERY_RESULT_BACKEND='redis://localhost:6379/2'
|
|
|
|
# Now run your Python script or Celery worker
|
|
python your_script.py
|
|
# or
|
|
# celery -A your_app_module worker --loglevel=info
|
|
```
|
|
|
|
Celery automatically picks up these environment variables. They often take precedence over settings defined in a configuration file or directly on the app, making them ideal for overriding settings in different environments (development, staging, production).
|
|
|
|
*Note: The exact precedence order can sometimes depend on how and when configuration is loaded, but environment variables are generally a high-priority source.*
|
|
|
|
## How It Works Internally (Simplified View)
|
|
|
|
1. **Loading:** When you create a `Celery` app or call `app.config_from_object()`, Celery reads the settings from the specified source (arguments, object/module, environment variables).
|
|
2. **Storing:** These settings are stored in a dictionary-like object accessible via `app.conf`. Celery uses a default set of values initially, which are then updated or overridden by your configuration.
|
|
3. **Accessing:** When a Celery component needs a setting (e.g., the worker needs the `broker_url` to connect, or a task needs the `task_serializer`), it simply looks up the required key in the `app.conf` object.
|
|
|
|
```mermaid
|
|
sequenceDiagram
|
|
participant ClientCode as Your App Setup (e.g., celery_app.py)
|
|
participant CeleryApp as app = Celery(...)
|
|
participant ConfigSource as celeryconfig.py / Env Vars
|
|
participant Worker as Celery Worker Process
|
|
participant Broker as Message Broker (e.g., Redis)
|
|
|
|
ClientCode->>CeleryApp: Create instance
|
|
ClientCode->>CeleryApp: app.config_from_object('celeryconfig')
|
|
CeleryApp->>ConfigSource: Read settings (broker_url, etc.)
|
|
ConfigSource-->>CeleryApp: Return settings values
|
|
Note over CeleryApp: Stores settings in app.conf
|
|
|
|
Worker->>CeleryApp: Start worker for 'app'
|
|
Worker->>CeleryApp: Access app.conf.broker_url
|
|
CeleryApp-->>Worker: Return 'redis://localhost:6379/0'
|
|
Worker->>Broker: Connect using 'redis://localhost:6379/0'
|
|
```
|
|
|
|
This diagram shows the app loading configuration first, and then the worker using that stored configuration (`app.conf`) to perform its duties, like connecting to the broker.
|
|
|
|
## Code Dive: Where Configuration Lives
|
|
|
|
* **`app.conf`:** This is the primary interface you interact with. It's an instance of a special dictionary-like class (`celery.app.utils.Settings`) that handles loading defaults, converting keys (Celery has changed setting names over time), and providing convenient access. You saw this in the direct update example: `app.conf.update(...)`.
|
|
* **Loading Logic (`config_from_object`)**: Methods like `app.config_from_object` typically delegate to the app's "loader" (`app.loader`). The loader (e.g., `celery.loaders.base.BaseLoader` or `celery.loaders.app.AppLoader`) handles the actual importing of the configuration module and extracting the settings. See `loaders/base.py` for the `config_from_object` method definition.
|
|
* **Default Settings**: Celery has a built-in set of default values for all its settings. These are defined in `celery.app.defaults`. Your configuration overrides these defaults. See `app/defaults.py`.
|
|
* **Accessing Settings**: Throughout the Celery codebase, different components access the configuration via `app.conf`. For instance, when sending a task (`app/base.py:send_task`), the code looks up `app.conf.broker_url` (or related settings) to know where and how to send the message.
|
|
|
|
```python
|
|
# Simplified concept from loaders/base.py
|
|
class BaseLoader:
|
|
# ...
|
|
def config_from_object(self, obj, silent=False):
|
|
if isinstance(obj, str):
|
|
# Import the module (e.g., 'celeryconfig')
|
|
obj = self._smart_import(obj, imp=self.import_from_cwd)
|
|
# ... error handling ...
|
|
# Store the configuration (simplified - actual process merges)
|
|
self._conf = force_mapping(obj) # Treat obj like a dictionary
|
|
# ...
|
|
return True
|
|
|
|
# Simplified concept from app/base.py (where settings are used)
|
|
class Celery:
|
|
# ...
|
|
def send_task(self, name, args=None, kwargs=None, **options):
|
|
# ... other setup ...
|
|
|
|
# Access configuration to know where the broker is
|
|
broker_connection_url = self.conf.broker_url # Reads from app.conf
|
|
|
|
# Use the broker URL to get a connection/producer
|
|
with self.producer_or_acquire(producer) as P:
|
|
# ... create message ...
|
|
# Send message using the connection derived from broker_url
|
|
self.amqp.send_task_message(P, name, message, **options)
|
|
|
|
# ... return result object ...
|
|
```
|
|
|
|
This illustrates the core idea: load configuration into `app.conf`, then components read from `app.conf` when they need instructions.
|
|
|
|
## Conclusion
|
|
|
|
Configuration is the backbone of Celery's flexibility. You've learned:
|
|
|
|
* **Why it's needed:** To tell Celery *how* to operate (broker, backend, tasks settings).
|
|
* **What can be configured:** Broker/backend URLs, serializers, timezones, task imports, and much more.
|
|
* **How to configure:**
|
|
* Directly via `app.conf.update()`.
|
|
* Using a dedicated module (`celeryconfig.py`) with `app.config_from_object()`. (Recommended)
|
|
* Using environment variables (great for deployment).
|
|
* **How it works:** Settings are loaded into `app.conf` and accessed by Celery components as needed.
|
|
|
|
With your Celery app configured, you're ready to define the actual work you want Celery to do. That's where Tasks come in!
|
|
|
|
**Next:** [Chapter 3: Task](03_task.md)
|
|
|
|
---
|
|
|
|
Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge) |