mirror of
https://github.com/aljazceru/Tutorial-Codebase-Knowledge.git
synced 2026-01-11 18:54:19 +01:00
output/Pydantic Core/01_basemodel.md
# Chapter 1: BaseModel - Your Data Blueprint

Welcome to the Pydantic tutorial! We're excited to guide you through the powerful features of Pydantic, starting with the absolute core concept: `BaseModel`.

## Why Do We Need Structured Data?

Imagine you're building a web application. You receive data from users – maybe their name and age when they sign up. This data might come as JSON, form data, or just plain Python dictionaries.
```json
// Example user data from an API
{
  "username": "cool_cat_123",
  "age": "28", // Oops, age is a string!
  "email": "cat@example.com"
}
```

How do you make sure this data is correct? Is `username` always provided? Is `age` actually a number, or could it be text like `"twenty-eight"`? Handling all these checks manually can be tedious and error-prone.
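To see why, here is a rough sketch of what such manual checking looks like without a library (the function name and error messages are purely illustrative):

```python
# A hand-rolled checker: every new field means more branches like these.
def check_user(data: dict) -> dict:
    if "username" not in data or not isinstance(data["username"], str):
        raise ValueError("username must be provided as a string")
    age = data.get("age")
    if isinstance(age, str):
        if not age.isdigit():
            raise ValueError(f"age is not a number: {age!r}")
        age = int(age)
    if not isinstance(age, int):
        raise ValueError("age must be an integer")
    return {"username": data["username"], "age": age}

print(check_user({"username": "cool_cat_123", "age": "28"}))
# Expected Output: {'username': 'cool_cat_123', 'age': 28}
```

And this only covers two fields, with no nested structures and no helpful error reports.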
This is where Pydantic and `BaseModel` come in!

## Introducing `BaseModel`: The Blueprint

Think of `BaseModel` as a **blueprint** for your data. You define the structure you expect – what fields should exist and what their types should be (like `string`, `integer`, `boolean`, etc.). Pydantic then uses this blueprint to automatically:

1. **Parse:** Read incoming data (like a dictionary).
2. **Validate:** Check if the data matches your blueprint (e.g., is `age` really an integer?). If not, it tells you exactly what's wrong.
3. **Serialize:** Convert your structured data back into simple formats (like a dictionary or JSON) when you need to send it somewhere else.

It's like having an automatic quality checker and translator for your data!
## Defining Your First Model

Let's create a blueprint for a simple `User`. We want each user to have a `name` (which should be text) and an `age` (which should be a whole number).

In Pydantic, you do this by creating a class that inherits from `BaseModel` and using standard Python type hints:

```python
# Import BaseModel from Pydantic
from pydantic import BaseModel

# Define your data blueprint (Model)
class User(BaseModel):
    name: str  # The user's name must be a string
    age: int   # The user's age must be an integer
```

That's it! This simple class `User` is now a Pydantic model. It acts as the blueprint for creating user objects.
## Using Your `BaseModel` Blueprint

Now that we have our `User` blueprint, let's see how to use it.

### Creating Instances (Parsing and Validation)

You create instances of your model just like any regular Python class, passing the data as keyword arguments. Pydantic automatically parses and validates the data against your type hints (`name: str`, `age: int`).

**1. Valid Data:**

```python
# Input data (e.g., from a dictionary)
user_data = {'name': 'Alice', 'age': 30}

# Create a User instance
user_alice = User(**user_data)  # The ** unpacks the dictionary

# Pydantic checked that 'name' is a string and 'age' is an integer.
# It worked! Let's see the created object.
print(user_alice)
# Expected Output: name='Alice' age=30
```

Behind the scenes, Pydantic looked at `user_data`, compared it to the `User` blueprint, saw that `'Alice'` is a valid `str` and `30` is a valid `int`, and created the `user_alice` object.
**2. Invalid Data:**

What happens if the data doesn't match the blueprint?

```python
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    name: str
    age: int

# Input data with age as a string that isn't a number
invalid_data = {'name': 'Bob', 'age': 'twenty-eight'}

try:
    user_bob = User(**invalid_data)
except ValidationError as e:
    print(e)

"""
Expected Output (simplified):
1 validation error for User
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='twenty-eight', input_type=str]
"""
```

Pydantic catches the error! Because `'twenty-eight'` cannot be understood as an `int` for the `age` field, it raises a helpful `ValidationError` telling you exactly which field (`age`) failed and why.
**3. Type Coercion (Smart Conversion):**

Pydantic is often smart enough to convert types when it makes sense. For example, if you provide `age` as a string containing digits:

```python
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

# Input data with age as a numeric string
data_with_string_age = {'name': 'Charlie', 'age': '35'}

# Create a User instance
user_charlie = User(**data_with_string_age)

# Pydantic converted the string '35' into the integer 35!
print(user_charlie)
# Expected Output: name='Charlie' age=35
print(type(user_charlie.age))
# Expected Output: <class 'int'>
```

Pydantic automatically *coerced* the string `'35'` into the integer `35` because the blueprint specified `age: int`. This leniency is often very convenient.
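If you do not want this leniency, Pydantic v2 also offers a strict mode. One way to opt in per call (a sketch reusing the `User` model above) is the `strict` argument of `model_validate`:

```python
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    name: str
    age: int

# Lax (default) mode coerces the numeric string '35' to the int 35.
lax_user = User.model_validate({'name': 'Charlie', 'age': '35'})
print(lax_user.age)
# Expected Output: 35

# Strict mode rejects the same input instead of coercing it.
try:
    User.model_validate({'name': 'Charlie', 'age': '35'}, strict=True)
except ValidationError as e:
    print(e)  # the 'age' field fails validation
```

Strict mode can also be enabled model-wide through configuration, which a later chapter touches on.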
### Accessing Data

Once you have a valid model instance, you access its data using standard attribute access:

```python
# Continuing from the user_alice example:
print(f"User's Name: {user_alice.name}")
# Expected Output: User's Name: Alice

print(f"User's Age: {user_alice.age}")
# Expected Output: User's Age: 30
```
### Serialization (Converting Back)

Often, you'll need to convert your model instance back into a basic Python dictionary (e.g., to send it as JSON over a network). `BaseModel` provides easy ways to do this:

**1. `model_dump()`:** Converts the model to a dictionary.

```python
# Continuing from the user_alice example:
user_dict = user_alice.model_dump()

print(user_dict)
# Expected Output: {'name': 'Alice', 'age': 30}
print(type(user_dict))
# Expected Output: <class 'dict'>
```
**2. `model_dump_json()`:** Converts the model directly to a JSON string.

```python
# Continuing from the user_alice example:
user_json = user_alice.model_dump_json(indent=2)  # indent for pretty printing

print(user_json)
# Expected Output:
# {
#   "name": "Alice",
#   "age": 30
# }
print(type(user_json))
# Expected Output: <class 'str'>
```

These methods allow you to easily share your structured data.
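The reverse direction is available too: `model_validate` and `model_validate_json` parse a dictionary or JSON string back into a model instance, so a full round trip looks like this (a sketch using the `User` model from above):

```python
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

user = User(name='Alice', age=30)

# Model -> JSON string -> model again
json_str = user.model_dump_json()
restored = User.model_validate_json(json_str)

print(restored == user)
# Expected Output: True
```

Because Pydantic models compare by field values, the restored instance is equal to the original.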
## Under the Hood: How Does `BaseModel` Work?

You don't *need* to know the internals to use Pydantic effectively, but a little insight can be helpful!

**High-Level Steps:**

When Python creates your `User` class (which inherits from `BaseModel`), some Pydantic magic happens via its `ModelMetaclass`:

1. **Inspection:** Pydantic looks at your class definition (`User`), finding the fields (`name`, `age`) and their type hints (`str`, `int`).
2. **Schema Generation:** It generates an internal "Core Schema". This is a detailed, language-agnostic description of your data structure and validation rules. Think of it as an even more detailed blueprint used internally by Pydantic's fast validation engine (written in Rust!). We'll explore this more in [Chapter 5](05_core_schema___validation_serialization.md).
3. **Validator/Serializer Creation:** Based on this Core Schema, Pydantic creates highly optimized functions (internally) for validating input data and serializing model instances for *this specific model* (`User`).

Here's a simplified diagram:
```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Py as Python Interpreter
    participant Meta as BaseModel Metaclass
    participant Core as Pydantic Core Engine

    Dev->>Py: Define `class User(BaseModel): name: str, age: int`
    Py->>Meta: Ask to create the `User` class
    Meta->>Meta: Inspect fields (`name: str`, `age: int`)
    Meta->>Core: Request schema based on fields & types
    Core-->>Meta: Provide internal Core Schema for User
    Meta->>Core: Request validator function from schema
    Core-->>Meta: Provide optimized validator
    Meta->>Core: Request serializer function from schema
    Core-->>Meta: Provide optimized serializer
    Meta-->>Py: Return the fully prepared `User` class (with hidden validator/serializer attached)
    Py-->>Dev: `User` class is ready to use
```
**Instantiation and Serialization Flow:**

*   When you call `User(name='Alice', age=30)`, Python calls the `User` class's `__init__` method. Pydantic intercepts this and uses the optimized **validator** created earlier to check the input data against the Core Schema. If valid, it creates the instance; otherwise, it raises `ValidationError`.
*   When you call `user_alice.model_dump()`, Pydantic uses the optimized **serializer** created earlier to convert the instance's data back into a dictionary, again following the rules defined in the Core Schema.

**Code Location:**

Most of this intricate setup logic happens within the `ModelMetaclass` found in `pydantic/_internal/_model_construction.py`. It coordinates with the `pydantic-core` Rust engine to build the schema and the validation/serialization logic.
```python
# Extremely simplified conceptual view of metaclass action
class ModelMetaclass(type):
    def __new__(mcs, name, bases, namespace, **kwargs):
        # 1. Find fields and type hints in 'namespace'
        fields = {}       # Simplified: find 'name: str', 'age: int'
        annotations = {}  # Simplified

        # ... collect fields, config, etc. ...

        # 2. Generate Core Schema (pseudo-code)
        # core_schema = pydantic_core.generate_schema(fields, annotations, config)
        # (This happens internally, see Chapter 5)

        # 3. Create validator & serializer (pseudo-code)
        # validator = pydantic_core.SchemaValidator(core_schema)
        # serializer = pydantic_core.SchemaSerializer(core_schema)

        # Create the actual class object
        cls = super().__new__(mcs, name, bases, namespace, **kwargs)

        # Attach the generated validator/serializer (simplified)
        # cls.__pydantic_validator__ = validator
        # cls.__pydantic_serializer__ = serializer
        # cls.__pydantic_core_schema__ = core_schema  # Store the schema

        return cls

# class BaseModel(metaclass=ModelMetaclass):
#     ... rest of BaseModel implementation ...
```

This setup ensures that validation and serialization are defined *once* when the class is created, making instance creation (`User(...)`) and dumping (`model_dump()`) very fast.
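You can peek at these precomputed artifacts yourself. Note that the dunder attribute names below are internal details of Pydantic v2 and may change between releases, so treat this as exploration rather than a supported API:

```python
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

# The validator and serializer were attached when the class was created,
# not when the first instance is made.
print(hasattr(User, '__pydantic_validator__'))   # True
print(hasattr(User, '__pydantic_serializer__'))  # True
print(hasattr(User, '__pydantic_core_schema__')) # True
```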
## Conclusion

You've learned the fundamentals of `pydantic.BaseModel`:

*   It acts as a **blueprint** for your data structures.
*   You define fields and their types using standard **Python type hints**.
*   Pydantic automatically handles **parsing**, **validation** (with helpful errors), and **serialization** (`model_dump`, `model_dump_json`).
*   It uses a powerful internal **Core Schema** and optimized validators/serializers for great performance.

`BaseModel` is the cornerstone of Pydantic. Now that you understand the basics, you might be wondering how to add more specific validation rules (like "age must be positive") or control how fields are handled during serialization.

In the next chapter, we'll dive into customizing fields using the `Field` function.

Next: [Chapter 2: Fields (FieldInfo / Field function)](02_fields__fieldinfo___field_function_.md)

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
output/Pydantic Core/02_fields__fieldinfo___field_function_.md
# Chapter 2: Customizing Your Blueprint's Rooms - Fields

In [Chapter 1: BaseModel - Your Data Blueprint](01_basemodel.md), we learned how `BaseModel` acts like a blueprint for our data, defining the expected structure and types using simple Python type hints. We saw how Pydantic uses this blueprint to parse, validate, and serialize data.

But what if we need more specific instructions for certain parts of our blueprint? What if a room needs a specific paint color (a default value)? Or what if the blueprint uses one name for a room ("Lounge"), but the construction crew knows it by another name ("Living Room") (an alias)?

This is where Pydantic's **Fields** come in. They allow us to add these extra details and constraints to the attributes within our models.
## Why Customize Fields?

Let's go back to our `User` model:

```python
from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int
```

This is great, but real-world data often has quirks:

1. **Missing Data:** What if `age` isn't always provided? Should it default to something sensible, like `18`?
2. **Naming Conflicts:** What if the incoming data (e.g., JSON from a JavaScript frontend) uses `userName` instead of `name` (camelCase vs. snake_case)?
3. **Basic Rules:** What if we know `age` must always be a positive number?

Simply using type hints (`str`, `int`) doesn't cover these cases. We need a way to add more *metadata* (extra information) to our fields.
## Introducing `Field()`: Adding Notes to the Blueprint

Pydantic provides the `Field()` function precisely for this purpose. You use it as the *default value* when defining an attribute on your model, and pass arguments to it to specify the extra details.

Think of it like adding specific notes or requirements to a room on your building blueprint.
```python
# Import Field along with BaseModel
from pydantic import BaseModel, Field

# Our User model, now with customizations using Field()
class User(BaseModel):
    name: str = Field(
        default='Guest',    # Note 1: Default name is 'Guest'
        alias='userName',   # Note 2: Expect 'userName' in input data
        min_length=3        # Note 3: Name must be at least 3 characters
    )
    age: int = Field(
        default=18,         # Note 1: Default age is 18
        gt=0                # Note 2: Age must be greater than 0
    )
    email: str | None = Field(
        default=None,       # Note 3: Email is optional (defaults to None)
        description='The user email address'  # Note 4: Add a description
    )
```
Let's break down how we use `Field()`:

1. **Import:** You need to import `Field` from `pydantic`.
2. **Assignment:** Instead of just `name: str`, you write `name: str = Field(...)`. The `Field()` call replaces a simple default value (though `Field()` *can* specify a default).
3. **Arguments:** You pass keyword arguments to `Field()` to specify the metadata:
    *   `default`: Sets a default value if the field isn't provided in the input data. If you *only* need a default, you can often just write `name: str = 'Guest'` or `age: int = 18`, but `Field(default=...)` is useful when combined with other options. Use `...` (Ellipsis) or omit `default` entirely to mark a field as required.
    *   `alias`: Tells Pydantic to look for this name (`'userName'`) in the input data (like a dictionary or JSON) when parsing, and use this alias when serializing (e.g., in `model_dump(by_alias=True)`).
    *   `gt` (greater than), `ge` (greater than or equal), `lt` (less than), `le` (less than or equal): Basic numeric constraints.
    *   `min_length`, `max_length`: Constraints for strings, lists, etc.
    *   `description`: A human-readable description, often used for generating documentation or schemas.
    *   ...and many more!
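The required-versus-default distinction is worth a quick demonstration: passing `...` (Ellipsis) as the default, or omitting the default entirely, keeps a field required while still attaching constraints. Here is a sketch with an illustrative `Item` model:

```python
from pydantic import BaseModel, Field, ValidationError

class Item(BaseModel):
    # Required (no real default) but still constrained:
    sku: str = Field(..., min_length=1)
    # Optional, with a default and a constraint:
    quantity: int = Field(default=1, ge=0)

print(Item(sku='A-1'))
# Expected Output: sku='A-1' quantity=1

try:
    Item(quantity=5)  # 'sku' is missing
except ValidationError as e:
    print(e.errors()[0]['loc'])
# Expected Output: ('sku',)
```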
## Using Models with `Field()`

Let's see how our customized `User` model behaves:

**1. Using Defaults:**

```python
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(default='Guest', alias='userName', min_length=3)
    age: int = Field(default=18, gt=0)
    email: str | None = Field(default=None, description='The user email address')

# Input data missing name and age
input_data_1 = {'email': 'new@example.com'}

# Pydantic uses the defaults!
user1 = User(**input_data_1)
print(user1)
# Expected Output: name='Guest' age=18 email='new@example.com'
```

Pydantic automatically filled in `name` and `age` using the `default` values we specified in `Field()`.
**2. Using Aliases:**

```python
# Continuing from above...

# Input data using the alias 'userName'
input_data_2 = {'userName': 'Alice', 'age': 30}

# Pydantic correctly uses the alias to populate 'name'
user2 = User(**input_data_2)
print(user2)
# Expected Output: name='Alice' age=30 email=None

# Dumping the model back, using the alias
print(user2.model_dump(by_alias=True))
# Expected Output: {'userName': 'Alice', 'age': 30, 'email': None}

# Dumping without by_alias uses the actual field names
print(user2.model_dump())
# Expected Output: {'name': 'Alice', 'age': 30, 'email': None}
```

Pydantic successfully read the `userName` key from the input thanks to `alias='userName'`. When dumping *with* `by_alias=True`, it uses the alias again.
**3. Using Validation Constraints:**

```python
# Continuing from above...
from pydantic import ValidationError

# Input data with invalid values
invalid_data_1 = {'userName': 'Bo', 'age': 30}       # Name too short
invalid_data_2 = {'userName': 'Charlie', 'age': -5}  # Age not > 0

try:
    User(**invalid_data_1)
except ValidationError as e:
    print(f"Error 1:\n{e}")

"""
Expected Output (simplified):
Error 1:
1 validation error for User
name
  String should have at least 3 characters [type=string_too_short, context={'min_length': 3}, ...]
"""

try:
    User(**invalid_data_2)
except ValidationError as e:
    print(f"Error 2:\n{e}")

"""
Expected Output (simplified):
Error 2:
1 validation error for User
age
  Input should be greater than 0 [type=greater_than, context={'gt': 0}, ...]
"""
```

Pydantic enforced the `min_length=3` and `gt=0` constraints we added via `Field()`, giving helpful errors when the rules were violated.
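When you need to handle these failures programmatically (for example, to build an API error response), `ValidationError.errors()` exposes the same information as a list of dictionaries instead of formatted text. A sketch, using a slimmed-down version of the `User` model above:

```python
from pydantic import BaseModel, Field, ValidationError

class User(BaseModel):
    name: str = Field(default='Guest', alias='userName', min_length=3)
    age: int = Field(default=18, gt=0)

try:
    User(userName='Charlie', age=-5)
except ValidationError as e:
    for err in e.errors():
        print(err['loc'], err['type'])
# Expected Output: ('age',) greater_than
```

Each entry carries the field location, a machine-readable error type, and a human-readable message, which makes the errors easy to map onto form fields or JSON error payloads.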
## What is `FieldInfo`? The Architect's Specification

So, you use the `Field()` function to add notes to your blueprint. But how does Pydantic *store* and *use* this information internally?

When Pydantic processes your model definition, it takes the information you provided in `Field()` (and the type hint) and bundles it all up into an internal object called `FieldInfo`.

**Analogy:** `Field()` is the sticky note you put on the blueprint ("Living Room - Must have fireplace"). `FieldInfo` is the formal entry in the architect's detailed specification document that captures this requirement along with the room's dimensions (type hint), default paint color (default value), etc.

You don't usually create `FieldInfo` objects directly. You use the convenient `Field()` function, and Pydantic creates the `FieldInfo` for you.

Every Pydantic model has a special attribute called `model_fields` which is a dictionary mapping field names to their corresponding `FieldInfo` objects.
```python
# Continuing from the User model above

# Access the internal FieldInfo objects
print(User.model_fields['name'])
# Expected Output (representation may vary slightly):
# FieldInfo(annotation=str, required=False, default='Guest', alias='userName', alias_priority=2, validation_alias='userName', serialization_alias='userName', metadata=[MinLen(min_length=3)])

print(User.model_fields['age'])
# Expected Output:
# FieldInfo(annotation=int, required=False, default=18, metadata=[Gt(gt=0)])

print(User.model_fields['email'])
# Expected Output:
# FieldInfo(annotation=Union[str, NoneType], required=False, default=None, description='The user email address')
```

You can see how the `FieldInfo` object holds all the details: the `annotation` (type), `default`, `alias`, `description`, and even the constraints like `MinLen(min_length=3)` and `Gt(gt=0)` stored in its `metadata` attribute.
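Because `model_fields` is just a dictionary of `FieldInfo` objects, you can also query it programmatically, for example to check whether a field is required or which constraints it carries (the constraint objects come from the `annotated_types` package that Pydantic uses internally):

```python
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(default='Guest', alias='userName', min_length=3)
    age: int = Field(default=18, gt=0)

age_info = User.model_fields['age']
print(age_info.is_required())
# Expected Output: False
print(age_info.default)
# Expected Output: 18
print(age_info.metadata)  # the Gt constraint lives here
```

This kind of introspection is handy when writing generic tooling (documentation generators, form builders) on top of your models.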
## Under the Hood: From `Field()` to `FieldInfo`

Let's revisit the model creation process from Chapter 1, now including `Field()`.

**High-Level Steps:**

When Python creates your `User` class:

1. **Inspection:** Pydantic's `ModelMetaclass` inspects the class definition. It finds `name: str = Field(alias='userName', ...)`, `age: int = Field(default=18, ...)`, etc.
2. **`FieldInfo` Creation:** For each attribute defined with `Field()`, Pydantic calls internal logic (like `FieldInfo.from_annotated_attribute`) using the type hint (`str`, `int`) and the result of the `Field(...)` call. This creates the `FieldInfo` object containing all the configuration (type, default, alias, constraints, etc.).
3. **Storage:** These `FieldInfo` objects are stored in an internal dictionary, which becomes accessible via `YourModel.model_fields`.
4. **Schema Generation:** Pydantic uses these comprehensive `FieldInfo` objects (along with model-level [Configuration](03_configuration__configdict___configwrapper_.md)) to generate the internal [Core Schema](05_core_schema___validation_serialization.md). This schema is the detailed instruction set for the fast validation and serialization engine.

**Sequence Diagram:**
```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Py as Python
    participant Meta as ModelMetaclass
    participant FInfo as FieldInfo

    Dev->>Py: Define `class User(BaseModel): name: str = Field(alias='userName')`
    Py->>Meta: Ask to create the `User` class
    Meta->>Meta: Inspect `name` attribute: finds `str` and `Field(alias='userName')` assignment
    Meta->>FInfo: Create `FieldInfo` using `str` and the `Field()` arguments
    FInfo-->>Meta: Return `FieldInfo(annotation=str, alias='userName', default=PydanticUndefined, ...)`
    Meta->>Meta: Store this `FieldInfo` instance in `cls.__pydantic_fields__['name']`
    Meta->>Meta: (Repeat for other fields like 'age', 'email')
    Meta-->>Py: Return the fully prepared `User` class (with `model_fields` populated)
    Py-->>Dev: `User` class is ready
```
**Code Location:**

*   The `Field()` function itself is defined in `pydantic/fields.py`. It's a relatively simple function that just captures its arguments and returns a `FieldInfo` instance.
*   The `FieldInfo` class is also defined in `pydantic/fields.py`. It holds attributes like `annotation`, `default`, `alias`, `metadata`, etc.
*   The logic that finds fields in a class definition, handles the `Field()` assignments, and creates the `FieldInfo` objects primarily happens within the `collect_model_fields` function (in `pydantic/_internal/_fields.py`), which is called by the `ModelMetaclass` (in `pydantic/_internal/_model_construction.py`) during class creation.
```python
# Simplified view from pydantic/fields.py

# The user-facing function
def Field(
    default: Any = PydanticUndefined,
    *,
    alias: str | None = _Unset,
    description: str | None = _Unset,
    gt: float | None = _Unset,
    # ... many other arguments
) -> Any:  # Returns Any for type checker convenience
    # It captures all arguments and passes them to create a FieldInfo instance
    field_info = FieldInfo.from_field(
        default,
        alias=alias,
        description=description,
        gt=gt,
        # ... passing all arguments through
    )
    return field_info  # Actually returns a FieldInfo instance at runtime

# The internal storage class
class FieldInfo:
    # Attributes to store all the configuration
    annotation: type[Any] | None
    default: Any
    alias: str | None
    description: str | None
    metadata: list[Any]  # Stores constraints like Gt, MinLen, etc.
    # ... other attributes

    def __init__(self, **kwargs) -> None:
        # Simplified: Assigns kwargs to attributes
        self.annotation = kwargs.get('annotation')
        self.default = kwargs.get('default', PydanticUndefined)
        self.alias = kwargs.get('alias')
        self.description = kwargs.get('description')
        # ... and collects constraints into self.metadata
        self.metadata = self._collect_metadata(kwargs)

    @staticmethod
    def from_field(default: Any = PydanticUndefined, **kwargs) -> 'FieldInfo':
        # Creates an instance, handling the default value logic
        # ... implementation ...
        return FieldInfo(default=default, **kwargs)

    def _collect_metadata(self, kwargs: dict[str, Any]) -> list[Any]:
        # Simplified: Takes kwargs like 'gt=0' and converts them
        # to internal metadata objects like 'annotated_types.Gt(0)'
        metadata = []
        if 'gt' in kwargs:
            # metadata.append(annotated_types.Gt(kwargs.pop('gt')))  # Real code is more complex
            pass  # Simplified
        # ... handles other constraint kwargs ...
        return metadata


# --- Simplified view from pydantic/_internal/_fields.py ---

def collect_model_fields(cls, config_wrapper, ns_resolver, *, typevars_map=None):
    fields: dict[str, FieldInfo] = {}
    type_hints = get_model_type_hints(cls, ns_resolver=ns_resolver)  # Get {'name': str, 'age': int, ...}

    for ann_name, (ann_type, evaluated) in type_hints.items():
        if is_valid_field_name(ann_name):
            assigned_value = getattr(cls, ann_name, PydanticUndefined)  # Check if Field() was used

            if isinstance(assigned_value, FieldInfo):  # If name = Field(...) was used
                # Create FieldInfo using the type hint AND the assigned FieldInfo object
                field_info = FieldInfo.from_annotated_attribute(ann_type, assigned_value)
            elif assigned_value is PydanticUndefined:  # If only name: str was used
                # Create FieldInfo just from the type hint
                field_info = FieldInfo.from_annotation(ann_type)
            else:  # If name: str = 'some_default' was used
                # Create FieldInfo from type hint and simple default
                field_info = FieldInfo.from_annotated_attribute(ann_type, assigned_value)

            fields[ann_name] = field_info
    # ... more logic for inheritance, docstrings, etc. ...

    return fields, set()  # Returns dict of field names to FieldInfo objects
```
This process ensures that all the configuration you provide via `Field()` is captured systematically in `FieldInfo` objects, ready to be used for generating the validation/serialization schema.
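As an aside, the same `FieldInfo` machinery also powers an alternative spelling that Pydantic v2 supports: attaching `Field()` through `typing.Annotated` instead of the default-value position. Both forms below should produce equivalent `model_fields` entries:

```python
from typing import Annotated
from pydantic import BaseModel, Field

class UserA(BaseModel):
    age: int = Field(default=18, gt=0)     # default-value style

class UserB(BaseModel):
    age: Annotated[int, Field(gt=0)] = 18  # Annotated style

# Both spellings end up as the same constraint metadata and default.
print(UserA.model_fields['age'].default, UserB.model_fields['age'].default)
# Expected Output: 18 18
```

The `Annotated` style keeps the real default visible in the assignment position, which some teams prefer for readability.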
## Conclusion

You've now learned how to add detailed configuration to your `BaseModel` fields using the `Field()` function:

*   `Field()` allows you to specify **defaults**, **aliases**, basic **validation constraints** (like `gt`, `max_length`), **descriptions**, and more.
*   It acts like adding specific **notes or requirements** to the rooms in your data blueprint.
*   Internally, Pydantic captures this information in `FieldInfo` objects.
*   `FieldInfo` holds the complete specification for a field (type, default, alias, constraints, etc.) and is stored in the model's `model_fields` attribute.
*   This detailed `FieldInfo` is crucial for Pydantic's powerful validation and serialization capabilities.

You now have more control over individual fields. But what about configuring the overall behavior of the *entire* model? For example, how can we tell Pydantic to *always* use aliases when serializing, or to forbid extra fields not defined in the model? That's where model configuration comes in.
Next: [Chapter 3: Configuration (ConfigDict / ConfigWrapper)](03_configuration__configdict___configwrapper_.md)

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
# Chapter 3: Configuring Your Blueprint - Model Settings

In [Chapter 1](01_basemodel.md), we learned about `BaseModel` as our data blueprint, and in [Chapter 2](02_fields__fieldinfo___field_function_.md), we saw how `Field()` lets us add specific notes (like defaults or aliases) to individual rooms (fields) on that blueprint.

But what about instructions that apply to the *entire* blueprint? Imagine needing rules like:

*   "Absolutely no extra furniture allowed that's not in the plan!" (Forbid extra fields)
*   "Once built, nothing inside can be changed!" (Make the model immutable/frozen)
*   "All room names on the final report should be lowercase." (Apply a naming convention during output)

These are model-wide settings, not specific to just one field. Pydantic provides a way to configure this overall behavior using model configuration.
## Why Configure the Whole Model?

Let's consider a simple `Product` model:

```python
from pydantic import BaseModel

class Product(BaseModel):
    item_id: int
    name: str
    price: float | None = None
```
This works, but we might want to enforce stricter rules or change default behaviors:

1. **Strictness:** What if we receive data like `{'item_id': 123, 'name': 'Thingy', 'color': 'blue'}`? By default, Pydantic ignores the extra `color` field. We might want to *reject* data with unexpected fields.
2. **Immutability:** What if, once a `Product` object is created, we want to prevent accidental changes like `product.price = 99.99`?
3. **Naming Conventions:** What if our API expects JSON keys in `camelCase` (like `itemId`) instead of Python's standard `snake_case` (`item_id`)?

These global behaviors are controlled via Pydantic's configuration system.
|
||||
|
||||
## Introducing `ConfigDict` and `model_config`
|
||||
|
||||
Pydantic allows you to customize model behavior by adding a special class attribute called `model_config`. This attribute should be assigned a dictionary-like object called `ConfigDict`.
|
||||
|
||||
Think of `model_config = ConfigDict(...)` as the **master instruction sheet** or the **global settings panel** attached to your `BaseModel` blueprint. It provides overarching rules for how Pydantic should handle the model.
|
||||
|
||||
**`ConfigDict`:** A special dictionary (specifically, a `TypedDict`) provided by Pydantic where you specify configuration options using key-value pairs.
|
||||
**`model_config`:** The class attribute on your `BaseModel` where you assign your `ConfigDict`.
|
||||
|
||||
Let's add some configuration to our `Product` model:
|
||||
|
||||

```python
# Import ConfigDict
from pydantic import BaseModel, ConfigDict

class Product(BaseModel):
    # Define model-wide settings here
    model_config = ConfigDict(
        frozen=True,              # Setting 1: Make instances immutable
        extra='forbid',           # Setting 2: Forbid extra fields during input validation
        validate_assignment=True  # Setting 3: Re-validate fields when they are assigned a new value
    )

    item_id: int
    name: str
    price: float | None = None

# --- How these settings affect behavior ---

# 1. Forbid Extra Fields ('extra=forbid')
try:
    # Input data has an extra 'color' field
    product_data_extra = {'item_id': 123, 'name': 'Thingy', 'color': 'blue'}
    Product(**product_data_extra)
except Exception as e:
    print(f"Error on extra field:\n{e}")
# Expected Output (simplified):
# Error on extra field:
# 1 validation error for Product
# color
#   Extra inputs are not permitted [type=extra_forbidden, ...]

# 2. Immutability ('frozen=True')
product = Product(item_id=456, name="Gadget")
print(f"Initial product: {product}")
# Expected Output: Initial product: item_id=456 name='Gadget' price=None

try:
    # Attempt to change a field on the frozen instance
    product.name = "New Gadget"
except Exception as e:
    print(f"\nError on assignment to frozen model:\n{e}")
# Expected Output (simplified):
# Error on assignment to frozen model:
# 1 validation error for Product
# name
#   Instance is frozen [type=frozen_instance, ...]

# 3. Validate Assignment ('validate_assignment=True')
# Note: on the frozen Product above, ANY assignment fails with a 'frozen' error first,
# so we demonstrate this setting on a separate, non-frozen model.
class MutableProduct(BaseModel):
    model_config = ConfigDict(validate_assignment=True)

    item_id: int
    name: str

product_mutable = MutableProduct(item_id=789, name="Widget")
try:
    # Attempt to assign an invalid type (int instead of str)
    product_mutable.name = 999
except Exception as e:
    print(f"\nError on invalid assignment:\n{e}")
# Expected Output (simplified):
# Error on invalid assignment:
# 1 validation error for MutableProduct
# name
#   Input should be a valid string [type=string_type, input_value=999, input_type=int]
```

By adding the `model_config` dictionary, we changed the fundamental behavior of our `Product` model without altering the field definitions themselves.

## Common Configuration Options

Here are a few more useful options you can set in `ConfigDict`:

* **`alias_generator`**: Automatically generate aliases for fields. Often used to convert between `snake_case` and `camelCase`.

    ```python
    from pydantic import BaseModel, ConfigDict
    from pydantic.alias_generators import to_camel  # Import a helper

    class User(BaseModel):
        user_id: int
        first_name: str

        model_config = ConfigDict(
            alias_generator=to_camel,  # Use the camelCase generator
            populate_by_name=True      # Allow input by EITHER the alias or the Python name
                                       # (in Pydantic >= 2.11, superseded by validate_by_name/validate_by_alias)
        )

    # Input using camelCase aliases
    user_data_camel = {'userId': 1, 'firstName': 'Arthur'}
    user = User(**user_data_camel)
    print(f"User created from camelCase: {user}")
    # Expected Output: User created from camelCase: user_id=1 first_name='Arthur'

    # Output (dumping) using aliases requires `by_alias=True`
    print(f"Dumped with aliases: {user.model_dump(by_alias=True)}")
    # Expected Output: Dumped with aliases: {'userId': 1, 'firstName': 'Arthur'}

    print(f"Dumped without aliases: {user.model_dump()}")
    # Expected Output: Dumped without aliases: {'user_id': 1, 'first_name': 'Arthur'}
    ```
* **Modern Alias Control (Pydantic >= v2.11):** Instead of `populate_by_name`, use `validate_by_alias`, `validate_by_name`, and `serialize_by_alias` for finer control:

    ```python
    from pydantic import BaseModel, ConfigDict
    from pydantic.alias_generators import to_camel

    class UserV2(BaseModel):
        user_id: int
        first_name: str

        model_config = ConfigDict(
            alias_generator=to_camel,
            validate_by_name=True,   # Allow input using 'user_id', 'first_name'
            validate_by_alias=True,  # Allow input using 'userId', 'firstName' (default is True)
            serialize_by_alias=True  # Use aliases ('userId', 'firstName') when dumping by default
        )

    user_data_camel = {'userId': 1, 'firstName': 'Zaphod'}
    user_camel = UserV2(**user_data_camel)
    print(f"User from camel: {user_camel}")
    # > User from camel: user_id=1 first_name='Zaphod'

    user_data_snake = {'user_id': 2, 'first_name': 'Ford'}
    user_snake = UserV2(**user_data_snake)
    print(f"User from snake: {user_snake}")
    # > User from snake: user_id=2 first_name='Ford'

    # serialize_by_alias=True means model_dump() uses aliases by default
    print(f"Dumped (default alias): {user_camel.model_dump()}")
    # > Dumped (default alias): {'userId': 1, 'firstName': 'Zaphod'}
    print(f"Dumped (force no alias): {user_camel.model_dump(by_alias=False)}")
    # > Dumped (force no alias): {'user_id': 1, 'first_name': 'Zaphod'}
    ```

* **`use_enum_values`**: Populate the model with the *value* of an enum member instead of the member itself, so dumps contain the plain value too.

    ```python
    from enum import Enum
    from pydantic import BaseModel, ConfigDict

    class Status(Enum):
        PENDING = "pending"
        PROCESSING = "processing"
        COMPLETE = "complete"

    class Order(BaseModel):
        order_id: int
        status: Status

        model_config = ConfigDict(
            use_enum_values=True  # Store the string value of Status
        )

    order = Order(order_id=101, status=Status.PROCESSING)
    print(f"Order object status type: {type(order.status)}")
    # Expected Output: Order object status type: <class 'str'>

    print(f"Order dumped: {order.model_dump()}")
    # Expected Output: Order dumped: {'order_id': 101, 'status': 'processing'}
    # Note: 'status' is the string "processing", not Status.PROCESSING
    ```

* **`str_strip_whitespace` / `str_to_lower` / `str_to_upper`**: Automatically clean string inputs.

    ```python
    from pydantic import BaseModel, ConfigDict

    class Comment(BaseModel):
        text: str
        author: str

        model_config = ConfigDict(
            str_strip_whitespace=True,  # Remove leading/trailing whitespace
            str_to_lower=True           # Convert to lowercase
        )

    comment_data = {'text': '  Hello World!  ', 'author': ' ALICE '}
    comment = Comment(**comment_data)
    print(comment)
    # Expected Output: text='hello world!' author='alice'
    ```

You can find the full list of configuration options in the Pydantic documentation for [`ConfigDict`](https://docs.pydantic.dev/latest/api/config/#pydantic.config.ConfigDict).

**Important Note:** Configuration set in `model_config` generally applies *during validation and serialization*. For example, `alias_generator` helps Pydantic understand incoming data with aliases and optionally use aliases when producing output, but the internal attribute name in your Python code remains the Python name (e.g., `user_id`).

## What About `ConfigWrapper`? (Internal Detail)

You might see `ConfigWrapper` mentioned in Pydantic's internal code or documentation.

**Analogy:** If `ConfigDict` is the settings form you fill out (`frozen=True`, `extra='forbid'`), then `ConfigWrapper` is the internal manager object that Pydantic creates *from* your form. This manager holds onto your settings, knows the default values for settings you *didn't* specify, and provides a consistent way for the rest of Pydantic (like the schema builder) to ask "Is this model frozen?" or "What should happen with extra fields?".

**Key Point:** As a user writing Pydantic models, you almost always interact with **`ConfigDict`** via the `model_config` attribute. You generally don't need to create or use `ConfigWrapper` directly. It's an internal helper that makes Pydantic's life easier.

## Under the Hood: How Configuration is Applied

Let's refine our understanding of how a `BaseModel` class gets created, now including configuration.

**High-Level Steps:**

When Python creates your `Product` class:

1. **Inspection:** Pydantic's `ModelMetaclass` inspects the class definition. It finds the fields (`item_id: int`, etc.) and also looks for the `model_config` attribute.
2. **Config Processing:** If `model_config` (a `ConfigDict`) is found, Pydantic uses it (along with config from any parent classes) to create an internal `ConfigWrapper` instance. This wrapper standardizes access to all config settings, applying defaults for any missing options.
3. **FieldInfo Creation:** It processes field definitions, potentially using `Field()` as discussed in [Chapter 2](02_fields__fieldinfo___field_function_.md), creating `FieldInfo` objects.
4. **Schema Generation:** Pydantic now uses *both* the `FieldInfo` objects *and* the settings from the `ConfigWrapper` to generate the detailed internal [Core Schema](05_core_schema___validation_serialization.md). For example, if the `ConfigWrapper` says `frozen=True`, this instruction is baked into the Core Schema.
5. **Validator/Serializer Creation:** Optimized validator and serializer functions are created based on this final Core Schema.

**Sequence Diagram:**

This diagram shows how `model_config` influences the process:

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Py as Python
    participant Meta as ModelMetaclass
    participant CfgWrap as ConfigWrapper
    participant Core as Pydantic Core Engine

    Dev->>Py: Define `class Product(BaseModel): model_config = ConfigDict(frozen=True, extra='forbid') ...`
    Py->>Meta: Ask to create `Product` class
    Meta->>Meta: Find `model_config` dict in namespace
    Meta->>CfgWrap: Create `ConfigWrapper` using `model_config` (and defaults)
    CfgWrap-->>Meta: Return `ConfigWrapper(config_dict={'frozen': True, 'extra': 'forbid', ...other defaults...})`
    Meta->>Meta: Collect fields (`item_id`, `name`, `price`) and their FieldInfo
    Meta->>Core: Request Core Schema using FieldInfo AND ConfigWrapper settings (e.g., frozen, extra)
    Core-->>Meta: Provide Core Schema incorporating model-wide rules
    Meta->>Core: Request validator/serializer from Core Schema
    Core-->>Meta: Provide optimized validator/serializer reflecting config
    Meta-->>Py: Return fully prepared `Product` class
    Py-->>Dev: `Product` class is ready, respecting the config
```

The `ConfigWrapper` acts as a bridge, translating the user-friendly `ConfigDict` into instructions the Core Engine understands when building the schema and validators.

**Code Location:**

* `ConfigDict`: Defined in `pydantic/config.py`. It's essentially a `TypedDict` listing all valid configuration keys.
* `ConfigWrapper`: Defined in `pydantic._internal._config.py`. Its `__init__` takes the config dictionary. The `ConfigWrapper.for_model` class method is used by the metaclass to gather configuration from base classes and the current class definition. Its `core_config` method translates the stored config into the format needed by `pydantic-core`.
* `ModelMetaclass`: In `pydantic._internal._model_construction.py`, the `__new__` method calls `ConfigWrapper.for_model` and passes the resulting wrapper to `build_schema_generator` and ultimately `complete_model_class`, which coordinates schema and validator/serializer creation.

```python
# Simplified view from pydantic/config.py
# ConfigDict is a TypedDict listing allowed keys and their types
class ConfigDict(TypedDict, total=False):
    frozen: bool
    extra: Literal['allow', 'ignore', 'forbid'] | None
    alias_generator: Callable[[str], str] | None
    # ... many more options

# Simplified view from pydantic._internal._config.py
class ConfigWrapper:
    config_dict: ConfigDict  # Stores the actual config values

    def __init__(self, config: ConfigDict | dict[str, Any] | type[Any] | None, *, check: bool = True):
        # Simplification: stores the input config, potentially validating keys
        self.config_dict = prepare_config(config)  # prepare_config handles defaults/deprecation

    # Provides attribute access like wrapper.frozen, falling back to defaults
    def __getattr__(self, name: str) -> Any:
        try:
            return self.config_dict[name]
        except KeyError:
            # Fall back to the default values defined in config_defaults
            # return config_defaults[name]  # Simplified; the real implementation is more complex
            pass

    # Used during model creation to gather config from all sources
    @classmethod
    def for_model(cls, bases: tuple[type[Any], ...], namespace: dict[str, Any], kwargs: dict[str, Any]) -> Self:
        config_new = ConfigDict()
        # 1. Inherit config from base classes
        # 2. Get config from 'model_config' in the current class namespace
        # 3. Get config from kwargs passed during class definition (e.g., class Model(BaseModel, frozen=True): ...)
        # ... logic to merge these sources ...
        return cls(config_new)  # Return a wrapper with the final merged config

    # Creates the config dictionary specifically for pydantic-core
    def core_config(self, title: str | None) -> core_schema.CoreConfig:
        # Extracts relevant keys from self.config_dict and maps them
        # to the names expected by pydantic_core.CoreConfig
        # e.g., {'extra': 'forbid'} becomes {'extra_fields_behavior': 'forbid'}
        core_options = { ... }
        return core_schema.CoreConfig(**core_options)

# Simplified view from pydantic._internal._model_construction.py (ModelMetaclass.__new__)
def __new__(mcs, name, bases, namespace, **kwargs):
    # ... lots of setup ...

    # Step 1: Gather configuration
    config_wrapper = ConfigWrapper.for_model(bases, namespace, kwargs)  # Merges config from bases, class def, kwargs

    # Step 2: Prepare schema generator using the config
    schema_generator = build_schema_generator(
        cls,  # The class being built
        config_wrapper,
        # ... other args ...
    )

    # Step 3: Build core schema, validator, serializer (using schema_generator which uses config_wrapper)
    # core_schema = schema_generator.generate_schema(cls)  # Simplified
    # validator = SchemaValidator(core_schema, config_wrapper.core_config())  # Simplified
    # serializer = SchemaSerializer(core_schema, config_wrapper.core_config())  # Simplified

    # ... attach schema, validator, serializer to the class ...
    cls = super().__new__(mcs, name, bases, namespace, **kwargs)
    # cls.__pydantic_validator__ = validator
    # ...

    return cls
```

This setup ensures that the model-wide rules defined in `model_config` are consistently applied during both validation (creating model instances) and serialization (dumping model instances).
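
Because `ConfigWrapper.for_model` merges configuration from base classes, the class body, *and* class keyword arguments, you can also pass settings directly in the class header and inherit them. A small sketch of that merging behavior (the `StrictBase` and `FrozenProduct` names are just for illustration):

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class StrictBase(BaseModel):
    model_config = ConfigDict(extra='forbid')

# Config can also be given as class keyword arguments; the metaclass merges
# them with config inherited from base classes.
class FrozenProduct(StrictBase, frozen=True):
    item_id: int

print(FrozenProduct.model_config)
# -> includes both {'extra': 'forbid'} (inherited) and {'frozen': True} (class kwarg)

try:
    FrozenProduct(item_id=1, color='red')  # inherited 'extra=forbid' still applies
except ValidationError as e:
    print(f"Rejected: {e.error_count()} error(s)")
```

This is handy when many models share one strict base configuration and only a few need extra settings layered on top.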

## Conclusion

You've learned how to configure the overall behavior of your `BaseModel` blueprints:

* Use the `model_config` class attribute, assigning it a `ConfigDict`.
* `ConfigDict` acts as the **master instruction sheet** or **settings panel** for the model.
* Common settings include `frozen`, `extra`, `alias_generator`, `use_enum_values`, and string cleaning options.
* Pydantic uses this configuration, often via the internal `ConfigWrapper`, to tailor the validation and serialization logic defined in the [Core Schema](05_core_schema___validation_serialization.md).

With `BaseModel`, `Field`, and `ConfigDict`, you have powerful tools to define the structure, field-specific details, and overall behavior of your data models.

But what if you need logic that goes beyond simple configuration? What if you need custom validation rules that depend on multiple fields, or complex transformations before or after validation/serialization? That's where Pydantic's decorators come in.

Next: [Chapter 4: Custom Logic (Decorators & Annotated Helpers)](04_custom_logic__decorators___annotated_helpers_.md)

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)

# Chapter 4: Custom Logic (Decorators & Annotated Helpers)

In [Chapter 3: Configuration (ConfigDict / ConfigWrapper)](03_configuration__configdict___configwrapper_.md), we learned how to set global rules for our data blueprints using `model_config`. But what if we need more specific, custom rules or transformations that go beyond simple settings?

Imagine you need rules like:
* "This username must not contain any spaces."
* "The `end_date` must always be later than the `start_date`."
* "When sending this data as JSON, format this specific date field as `YYYY-MM-DD`."
* "When validating, convert incoming usernames to lowercase automatically."

These require custom code logic. Pydantic provides flexible ways to inject this custom logic directly into the validation and serialization processes.

## Why Custom Logic?

Standard type hints (`str`, `int`), [Fields](02_fields__fieldinfo___field_function_.md) (`Field(gt=0)`), and [Configuration](03_configuration__configdict___configwrapper_.md) (`ConfigDict(extra='forbid')`) cover many common cases. However, sometimes the rules are more complex or specific to your application's needs.

For example, checking if a password meets complexity requirements (length, uppercase, numbers, symbols) or ensuring consistency between multiple fields (`start_date < end_date`) requires writing your own Python functions.

Pydantic offers two main ways to add this custom logic:
1. **Decorators:** Special markers (`@...`) you put above methods in your `BaseModel` class.
2. **`Annotated` Helpers:** Using Python's `typing.Annotated` along with special Pydantic classes to attach logic directly to a type hint.

**Analogy:** Think of these as adding custom steps to the construction (validation) and reporting (serialization) process for your data blueprint.
* **Validators** are like adding extra *inspection checks* at different stages of construction (before basic checks, after basic checks, or wrapping the entire process).
* **Serializers** are like specifying custom *formatting rules* for the final report (converting your data back to simple types like dicts or JSON).

Let's explore these mechanisms.

## Decorators: Adding Logic via Methods

Decorators are a standard Python feature. They are functions that modify or enhance other functions or methods. Pydantic uses decorators to let you designate specific methods in your `BaseModel` as custom validators or serializers.

### `@field_validator`: Checking Individual Fields

The `@field_validator` decorator lets you add custom validation logic for one or more specific fields *after* Pydantic has performed its initial type checks and coercion.

**Use Case:** Let's ensure a `username` field doesn't contain spaces.

```python
from pydantic import BaseModel, field_validator, ValidationError

class UserRegistration(BaseModel):
    username: str
    email: str

    # This method will be called automatically for the 'username' field
    # AFTER Pydantic checks it's a string.
    @field_validator('username')
    @classmethod  # Field validators should usually be class methods
    def check_username_spaces(cls, v: str) -> str:
        print(f"Checking username: '{v}'")
        if ' ' in v:
            # Raise a ValueError if the rule is broken
            raise ValueError('Username cannot contain spaces')
        # Return the valid value (can also modify it here if needed)
        return v

# --- Try it out ---

# Valid username
user_ok = UserRegistration(username='cool_cat123', email='cat@meow.com')
print(f"Valid user created: {user_ok}")
# Expected Output:
# Checking username: 'cool_cat123'
# Valid user created: username='cool_cat123' email='cat@meow.com'

# Invalid username
try:
    UserRegistration(username='cool cat 123', email='cat@meow.com')
except ValidationError as e:
    print(f"\nValidation Error:\n{e}")
# Expected Output (simplified):
# Checking username: 'cool cat 123'
# Validation Error:
# 1 validation error for UserRegistration
# username
#   Value error, Username cannot contain spaces [type=value_error, ...]
```

**Explanation:**
1. We defined a `check_username_spaces` method inside our `UserRegistration` model.
2. We decorated it with `@field_validator('username')`. This tells Pydantic: "After you validate `username` as a `str`, call this method with the value."
3. The `@classmethod` decorator is typically used so the method receives the class (`cls`) as the first argument instead of an instance (`self`).
4. Inside the method, `v` holds the value of the `username` field *after* Pydantic's basic `str` validation.
5. We check our custom rule (`' ' in v`).
6. If the rule is violated, we raise a `ValueError` (Pydantic catches this and wraps it in a `ValidationError`).
7. If the value is okay, we **must return it**. You could also transform the value here (e.g., `return v.lower()`).

`@field_validator` has a `mode` argument (`'before'` or `'after'`; the default is `'after'`). `'after'` (as shown) runs *after* Pydantic's internal validation for the field type. `'before'` runs *before*, giving you the raw input value.
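
A minimal sketch of `'before'` mode (the `Tag` model and its `coerce_to_string` helper are invented for illustration):

```python
from pydantic import BaseModel, field_validator

class Tag(BaseModel):
    name: str

    # mode='before': receives the RAW input, before Pydantic's `str` check runs,
    # so we can normalize non-string input ourselves.
    @field_validator('name', mode='before')
    @classmethod
    def coerce_to_string(cls, v):
        if isinstance(v, int):
            return f"tag-{v}"
        return v

print(Tag(name=42))
# name='tag-42'  (the int was transformed before the string check ran)
```

Whatever a `'before'` validator returns is then fed into Pydantic's normal type validation, so it still must end up as a valid `str`.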

### `@model_validator`: Checking the Whole Model

Sometimes, validation depends on multiple fields interacting. The `@model_validator` decorator lets you run logic that involves the entire model's data.

**Use Case:** Ensure `end_date` is after `start_date`.

```python
from datetime import date
from pydantic import BaseModel, model_validator, ValidationError
from typing import Self  # Used for type hint in Python 3.11+

class Trip(BaseModel):
    start_date: date
    end_date: date
    destination: str

    # This method runs AFTER the model fields are validated individually
    @model_validator(mode='after')
    def check_dates(self) -> Self:  # Use 'Self' or 'Trip' as the return hint
        print(f"Checking dates: start={self.start_date}, end={self.end_date}")
        if self.start_date >= self.end_date:
            raise ValueError('End date must be after start date')
        # Return the validated model instance
        return self

# --- Try it out ---

# Valid dates
trip_ok = Trip(start_date=date(2024, 7, 1), end_date=date(2024, 7, 10), destination='Beach')
print(f"Valid trip: {trip_ok}")
# Expected Output:
# Checking dates: start=2024-07-01, end=2024-07-10
# Valid trip: start_date=datetime.date(2024, 7, 1) end_date=datetime.date(2024, 7, 10) destination='Beach'

# Invalid dates
try:
    Trip(start_date=date(2024, 7, 10), end_date=date(2024, 7, 1), destination='Mountains')
except ValidationError as e:
    print(f"\nValidation Error:\n{e}")
# Expected Output (simplified):
# Checking dates: start=2024-07-10, end=2024-07-01
# Validation Error:
# 1 validation error for Trip
#   Value error, End date must be after start date [type=value_error, ...]
```

**Explanation:**
1. We defined a `check_dates` method.
2. We decorated it with `@model_validator(mode='after')`. This tells Pydantic: "After validating all individual fields and creating the model instance, call this method."
3. In `'after'` mode, the method receives `self` (the model instance). We can access all fields like `self.start_date`.
4. We perform our cross-field check.
5. If invalid, raise `ValueError`.
6. If valid, we **must return `self`** (the model instance).

`@model_validator` also supports `mode='before'`, where the method runs *before* individual field validation. In `'before'` mode, the method receives the class (`cls`) and the raw input data (usually a dictionary) and must return the (potentially modified) data dictionary to be used for further validation.
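
As a rough sketch of `'before'` mode (the `Signup` model and its `combined` shorthand are invented for illustration):

```python
from pydantic import BaseModel, model_validator

class Signup(BaseModel):
    username: str
    email: str

    # mode='before': receives the class and the RAW input data (here a dict),
    # and must return the data that normal field validation will then consume.
    @model_validator(mode='before')
    @classmethod
    def expand_combined(cls, data):
        if isinstance(data, dict) and 'combined' in data:
            user, _, host = data.pop('combined').partition('@')
            data.setdefault('username', user)
            data.setdefault('email', f'{user}@{host}')
        return data

s = Signup(combined='alice@example.com')
print(s)
# username='alice' email='alice@example.com'
```

This pattern is useful for accepting alternate input shapes while keeping the model's fields simple.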

### `@field_serializer`: Customizing Field Output

This decorator lets you control how a specific field is converted (serialized) when you call methods like `model_dump()` or `model_dump_json()`.

**Use Case:** Serialize a `date` object as a simple `"YYYY-MM-DD"` string.

```python
from datetime import date
from pydantic import BaseModel, field_serializer

class Event(BaseModel):
    name: str
    event_date: date

    # Customize serialization for the 'event_date' field
    @field_serializer('event_date')
    def serialize_date(self, dt: date) -> str:
        # Return the custom formatted string
        return dt.strftime('%Y-%m-%d')

# --- Try it out ---
event = Event(name='Party', event_date=date(2024, 12, 25))

# Default dump (dictionary)
print(f"Model object: {event}")
# Expected Output: Model object: name='Party' event_date=datetime.date(2024, 12, 25)

dumped_dict = event.model_dump()
print(f"Dumped dict: {dumped_dict}")
# Expected Output: Dumped dict: {'name': 'Party', 'event_date': '2024-12-25'}

dumped_json = event.model_dump_json(indent=2)
print(f"Dumped JSON:\n{dumped_json}")
# Expected Output:
# Dumped JSON:
# {
#   "name": "Party",
#   "event_date": "2024-12-25"
# }
```

**Explanation:**
1. We defined `serialize_date` and decorated it with `@field_serializer('event_date')`.
2. The method receives `self` (the instance) and `dt` (the value of the `event_date` field). You can also add an optional `info: SerializationInfo` argument for more context.
3. It returns the desired serialized format (a string in this case).
4. When `model_dump()` or `model_dump_json()` is called, Pydantic uses this method for the `event_date` field instead of its default date serialization.
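
If you do accept the optional `info` argument, you can vary the output by serialization mode. A small sketch (the `Meeting` model is hypothetical):

```python
from datetime import date
from pydantic import BaseModel, field_serializer

class Meeting(BaseModel):
    when: date

    @field_serializer('when')
    def serialize_when(self, dt: date, info):
        # info.mode is 'json' for model_dump_json() and 'python' for model_dump()
        if info.mode == 'json':
            return dt.isoformat()
        return dt  # keep the real date object in Python dumps

m = Meeting(when=date(2024, 1, 2))
print(m.model_dump())       # {'when': datetime.date(2024, 1, 2)}
print(m.model_dump_json())  # {"when":"2024-01-02"}
```

This lets one serializer produce JSON-safe strings while Python-level dumps keep rich objects.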

### `@model_serializer`: Customizing Model Output

This allows custom logic for serializing the entire model object.

**Use Case:** Add a calculated `duration_days` field during serialization.

```python
from datetime import date
from pydantic import BaseModel, model_serializer
from typing import Dict, Any

class Trip(BaseModel):
    start_date: date
    end_date: date
    destination: str

    # Customize the entire model's serialization
    @model_serializer
    def serialize_with_duration(self) -> Dict[str, Any]:
        # Start with the default field values
        data = {'start_date': self.start_date, 'end_date': self.end_date, 'destination': self.destination}
        # Calculate and add the custom field
        duration = self.end_date - self.start_date
        data['duration_days'] = duration.days
        return data

# --- Try it out ---
trip = Trip(start_date=date(2024, 8, 1), end_date=date(2024, 8, 5), destination='Lake')

print(f"Model object: {trip}")
# Expected Output: Model object: start_date=datetime.date(2024, 8, 1) end_date=datetime.date(2024, 8, 5) destination='Lake'

dumped_dict = trip.model_dump()
print(f"Dumped dict: {dumped_dict}")
# Expected Output: Dumped dict: {'start_date': datetime.date(2024, 8, 1), 'end_date': datetime.date(2024, 8, 5), 'destination': 'Lake', 'duration_days': 4}

dumped_json = trip.model_dump_json(indent=2)
print(f"Dumped JSON:\n{dumped_json}")
# Expected Output:
# Dumped JSON:
# {
#   "start_date": "2024-08-01",
#   "end_date": "2024-08-05",
#   "destination": "Lake",
#   "duration_days": 4
# }
```

**Explanation:**
1. We decorated `serialize_with_duration` with `@model_serializer`.
2. The default `mode='plain'` means this method *replaces* the standard model serialization. It receives `self`.
3. We manually construct the dictionary we want as output, adding our calculated `duration_days`.
4. This dictionary is used by `model_dump()` and `model_dump_json()`.

There's also a `mode='wrap'` for `@model_serializer` (and `@field_serializer`) which is more advanced. It gives you a `handler` function to call the *next* serialization step (either Pydantic's default or another wrapper), allowing you to modify the result *around* the standard logic.
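
As a minimal sketch, the same `duration_days` idea can be written with `mode='wrap'` so Pydantic's standard field serialization is reused instead of rebuilt by hand (the `TripWrapped` name is ours):

```python
from datetime import date
from pydantic import BaseModel, model_serializer

class TripWrapped(BaseModel):
    start_date: date
    end_date: date
    destination: str

    # mode='wrap': `handler` runs Pydantic's normal serialization,
    # and we adjust its result around that standard logic.
    @model_serializer(mode='wrap')
    def add_duration(self, handler):
        data = handler(self)  # standard output for all fields
        data['duration_days'] = (self.end_date - self.start_date).days
        return data

trip = TripWrapped(start_date=date(2024, 8, 1), end_date=date(2024, 8, 5), destination='Lake')
print(trip.model_dump()['duration_days'])  # 4
```

The advantage over `mode='plain'` is that adding or renaming fields later requires no changes to the serializer.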

## `Annotated` Helpers: Attaching Logic to Type Hints

Python's `typing.Annotated` allows adding extra metadata to type hints. Pydantic leverages this to let you attach validation and serialization logic directly inline with your field definitions.

**Analogy:** Instead of separate instruction sheets (decorators), this is like putting specific instruction tags directly onto an item in the blueprint.

Common helpers include:
* **Validators:** `BeforeValidator`, `AfterValidator`, `PlainValidator`, `WrapValidator`
* **Serializers:** `PlainSerializer`, `WrapSerializer`

Let's see how `AfterValidator` compares to `@field_validator`.

**Use Case:** Ensure `username` has no spaces, using `Annotated`.

```python
from typing import Annotated

from pydantic import BaseModel, ValidationError
# Import the helper
from pydantic.functional_validators import AfterValidator

# Define the validation function (can be outside the class)
def check_no_spaces(v: str) -> str:
    print(f"Checking username via Annotated: '{v}'")
    if ' ' in v:
        raise ValueError('Username cannot contain spaces')
    return v

class UserRegistrationAnnotated(BaseModel):
    # Attach the validator function directly to the type hint
    username: Annotated[str, AfterValidator(check_no_spaces)]
    email: str

# --- Try it out ---

# Valid username
user_ok = UserRegistrationAnnotated(username='another_cat', email='cat@meow.com')
print(f"Valid user: {user_ok}")
# Expected Output:
# Checking username via Annotated: 'another_cat'
# Valid user: username='another_cat' email='cat@meow.com'

# Invalid username
try:
    UserRegistrationAnnotated(username='another cat', email='cat@meow.com')
except ValidationError as e:
    print(f"\nValidation Error:\n{e}")
# Expected Output (simplified):
# Checking username via Annotated: 'another cat'
# Validation Error:
# 1 validation error for UserRegistrationAnnotated
# username
#   Value error, Username cannot contain spaces [type=value_error, ...]
```

**Explanation:**
1. We import `Annotated` from `typing` and `AfterValidator` from Pydantic.
2. We define a standalone function `check_no_spaces` (it doesn't need to be a method).
3. In the model, we define `username` as `Annotated[str, AfterValidator(check_no_spaces)]`. This tells Pydantic: "The type is `str`, and after validating it as a string, apply the `check_no_spaces` function."
4. The behavior is identical to the `@field_validator` example, but the logic is attached differently.

Similarly, you can use `BeforeValidator` (runs before Pydantic's type validation) or `PlainSerializer` / `WrapSerializer` to attach serialization logic.
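As a sketch of `BeforeValidator` (the model and field names here are invented for illustration): a pre-step that normalizes raw input before Pydantic's own `list[str]` validation runs:

```python
from typing import Annotated, Any

from pydantic import BaseModel
from pydantic.functional_validators import BeforeValidator


def split_csv(v: Any) -> Any:
    # Runs *before* Pydantic's own validation: turn "a,b,c" into a list,
    # and pass anything else through unchanged for normal validation.
    if isinstance(v, str):
        return [item.strip() for item in v.split(',')]
    return v


class TagHolder(BaseModel):  # hypothetical model for illustration
    tags: Annotated[list[str], BeforeValidator(split_csv)]


print(TagHolder(tags='red, green, blue').tags)
# ['red', 'green', 'blue']
print(TagHolder(tags=['solo']).tags)
# ['solo']
```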

**Use Case:** Serialize `date` as `"YYYY-MM-DD"` using `Annotated` and `PlainSerializer`.

```python
from datetime import date
from typing import Annotated

from pydantic import BaseModel
# Import the helper
from pydantic.functional_serializers import PlainSerializer

# Define the serializer function
def format_date_yyyymmdd(dt: date) -> str:
    return dt.strftime('%Y-%m-%d')

class EventAnnotated(BaseModel):
    name: str
    # Attach the serializer function directly to the type hint
    event_date: Annotated[date, PlainSerializer(format_date_yyyymmdd)]

# --- Try it out ---
event = EventAnnotated(name='Conference', event_date=date(2024, 10, 15))

print(f"Model object: {event}")
# Expected Output: Model object: name='Conference' event_date=datetime.date(2024, 10, 15)

dumped_dict = event.model_dump()
print(f"Dumped dict: {dumped_dict}")
# Expected Output: Dumped dict: {'name': 'Conference', 'event_date': '2024-10-15'}

dumped_json = event.model_dump_json(indent=2)
print(f"Dumped JSON:\n{dumped_json}")
# Expected Output:
# Dumped JSON:
# {
#   "name": "Conference",
#   "event_date": "2024-10-15"
# }
```

This achieves the same result as the `@field_serializer` example, but by attaching the logic via `Annotated`.

**Which to choose? Decorators vs. Annotated Helpers:**
* **Decorators (`@field_validator`, etc.):** Keep logic tightly coupled with the model class definition. Good if the logic intrinsically belongs to the model or needs access to `cls` or `self`. Can feel more object-oriented.
* **`Annotated` Helpers (`AfterValidator`, etc.):** Allow defining reusable validation/serialization functions outside the model. Good for applying the same logic across different models or fields. Can make type hints more verbose but keeps the model body cleaner.
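One practical consequence of the `Annotated` approach: the annotated type can be bound to a name and reused across models. A small sketch (the model names are hypothetical):

```python
from typing import Annotated

from pydantic import BaseModel
from pydantic.functional_validators import AfterValidator


def check_no_spaces(v: str) -> str:
    if ' ' in v:
        raise ValueError('value cannot contain spaces')
    return v


# A reusable alias: define the rule once, apply it anywhere.
NoSpacesStr = Annotated[str, AfterValidator(check_no_spaces)]


class Account(BaseModel):  # hypothetical models for illustration
    username: NoSpacesStr


class Project(BaseModel):
    slug: NoSpacesStr


print(Account(username='cat_42').username)  # cat_42
print(Project(slug='my-project').slug)      # my-project
```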

## Under the Hood: Wiring Up the Logic

How does Pydantic discover and apply this custom logic?

**Decorators:**
1. **Class Creation:** When Python creates your `BaseModel` class (like `UserRegistration`), Pydantic's `ModelMetaclass` scans the class attributes.
2. **Decorator Detection:** It finds methods decorated with Pydantic decorators (`@field_validator`, `@model_serializer`, etc.). It uses helper classes like `PydanticDescriptorProxy` (from `pydantic._internal._decorators`) to wrap these methods and store metadata about the decorator (which fields it applies to, the mode, etc.) using internal classes like `FieldValidatorDecoratorInfo`.
3. **Info Storage:** Information about all found decorators is collected and stored internally, associated with the class (e.g., in a hidden `__pydantic_decorators__` attribute holding a `DecoratorInfos` object).
4. **Schema Integration:** When generating the [Core Schema](05_core_schema___validation_serialization.md) for the model, Pydantic consults this stored decorator information. It translates the decorator rules (e.g., "run `check_username_spaces` after validating `username`") into corresponding schema components (like `after_validator_function`). The core validation/serialization engine then uses this schema.

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Py as Python Interpreter
    participant Meta as BaseModel Metaclass
    participant DecInfo as DecoratorInfos
    participant Core as Pydantic Core Engine

    Dev->>Py: Define `class User(BaseModel): ... @field_validator('username') def check_spaces(cls, v): ...`
    Py->>Meta: Ask to create the `User` class
    Meta->>Meta: Scan class attributes, find `check_spaces` wrapped by PydanticDescriptorProxy
    Meta->>DecInfo: Store info: func=check_spaces, applies_to='username', mode='after'
    Meta->>Core: Request Core Schema, providing field info AND DecoratorInfos
    Core->>Core: Build schema, incorporating an 'after_validator' step for 'username' linked to `check_spaces`
    Core-->>Meta: Provide internal Core Schema for User
    Meta->>Core: Request validator/serializer functions from schema
    Core-->>Meta: Provide optimized functions incorporating custom logic
    Meta-->>Py: Return the fully prepared `User` class
    Py-->>Dev: `User` class is ready
```

**`Annotated` Helpers:**
1. **Field Processing:** During class creation, when Pydantic processes a field like `username: Annotated[str, AfterValidator(check_no_spaces)]`, it analyzes the `Annotated` metadata.
2. **Helper Recognition:** It recognizes Pydantic helper classes like `AfterValidator`. These helpers implement a special method, `__get_pydantic_core_schema__`.
3. **Schema Generation:** Pydantic's schema generation logic (often involving `GetCoreSchemaHandler` from `pydantic.annotated_handlers`) calls `AfterValidator.__get_pydantic_core_schema__`. This method tells the handler how to integrate the custom logic (`check_no_spaces`) into the [Core Schema](05_core_schema___validation_serialization.md) being built for the `username` field.
4. **Schema Integration:** The handler modifies the schema-in-progress to include the custom logic (e.g., adding an `after_validator_function` component pointing to `check_no_spaces`). The final schema used by the core engine contains this logic directly associated with the field.

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant Py as Python Interpreter
    participant Meta as BaseModel Metaclass
    participant SchemaGen as Core Schema Generator
    participant Helper as AfterValidator Instance
    participant Core as Pydantic Core Engine

    Dev->>Py: Define `class User(BaseModel): username: Annotated[str, AfterValidator(check_no_spaces)]`
    Py->>Meta: Ask to create the `User` class
    Meta->>SchemaGen: Start building schema for `User`
    SchemaGen->>SchemaGen: Process 'username' field, see `Annotated[str, AfterValidator(...)]`
    SchemaGen->>Helper: Call `__get_pydantic_core_schema__` on `AfterValidator` instance
    Helper->>SchemaGen: Generate schema for base type (`str`)
    SchemaGen-->>Helper: Return base `str` schema
    Helper->>Helper: Modify schema, adding 'after_validator' pointing to `check_no_spaces`
    Helper-->>SchemaGen: Return modified schema for 'username'
    SchemaGen->>Core: Finalize schema for `User` model incorporating custom logic
    Core-->>SchemaGen: Provide completed Core Schema
    SchemaGen-->>Meta: Return Core Schema
    Meta->>Core: Request validator/serializer from final schema
    Core-->>Meta: Provide optimized functions
    Meta-->>Py: Return the fully prepared `User` class
    Py-->>Dev: `User` class is ready
```

**Code Location:**
* Decorator logic (detection, storage, proxy): `pydantic._internal._decorators.py`
* `Annotated` helper classes (`AfterValidator`, `PlainSerializer`, etc.): `pydantic.functional_validators.py`, `pydantic.functional_serializers.py`
* Schema generation integrating these: primarily involves internal schema builders calling `__get_pydantic_core_schema__` on annotated types/metadata, often orchestrated via `pydantic._internal._generate_schema.GenerateSchema`. The `GetCoreSchemaHandler` from `pydantic.annotated_handlers.py` is passed around to facilitate this.

```python
# Simplified concept from pydantic.functional_validators.py

@dataclasses.dataclass(frozen=True)
class AfterValidator:
    func: Callable  # The user's validation function

    # This method is called by Pydantic during schema building
    def __get_pydantic_core_schema__(
        self,
        source_type: Any,  # The base type (e.g., str)
        handler: GetCoreSchemaHandler  # Helper to get schema for base type
    ) -> core_schema.CoreSchema:
        # 1. Get the schema for the base type (e.g., str_schema())
        schema = handler(source_type)
        # 2. Wrap it with an 'after_validator' step using self.func
        info_arg = _inspect_validator(self.func, 'after')  # Check signature
        if info_arg:
            # Use core_schema function for validators with info arg
            return core_schema.with_info_after_validator_function(
                self.func, schema=schema
            )
        else:
            # Use core_schema function for validators without info arg
            return core_schema.no_info_after_validator_function(
                self.func, schema=schema
            )

# Simplified concept from pydantic._internal._decorators.py

@dataclass
class FieldValidatorDecoratorInfo:  # Stores info about @field_validator
    fields: tuple[str, ...]
    mode: Literal['before', 'after', 'wrap', 'plain']
    # ... other options

@dataclass
class PydanticDescriptorProxy:  # Wraps the decorated method
    wrapped: Callable
    decorator_info: FieldValidatorDecoratorInfo | ...  # Stores the info object

# Simplified concept from ModelMetaclass during class creation

# ... scan class attributes ...
decorators = DecoratorInfos()  # Object to hold all found decorators
for var_name, var_value in vars(model_cls).items():
    if isinstance(var_value, PydanticDescriptorProxy):
        info = var_value.decorator_info
        # Store the decorator info (function, fields, mode, etc.)
        # in the appropriate category within the 'decorators' object
        if isinstance(info, FieldValidatorDecoratorInfo):
            decorators.field_validators[var_name] = Decorator(
                func=var_value.wrapped, info=info  # Simplified
            )
        # ... handle other decorator types ...

# ... later, when building the core schema ...
# schema_generator uses the 'decorators' object to add validation/serialization
# steps to the core schema based on the stored decorator info.
```

Both decorators and `Annotated` helpers ultimately achieve the same goal: embedding custom Python functions into the Pydantic validation and serialization pipeline by modifying the underlying [Core Schema](05_core_schema___validation_serialization.md).

## Conclusion

You've learned two powerful ways to add custom logic to your Pydantic models:

* **Decorators** (`@field_validator`, `@model_validator`, `@field_serializer`, `@model_serializer`) allow you to designate methods within your model class for custom validation or serialization tasks, applying logic to specific fields or the entire model.
* **`Annotated` Helpers** (`BeforeValidator`, `AfterValidator`, `PlainSerializer`, etc.) let you attach validation or serialization functions directly to a field's type hint using `typing.Annotated`, often promoting reusable logic functions.

These tools give you fine-grained control over how your data is processed, going beyond basic type checks and configuration. They are essential for handling real-world data validation and formatting complexities.

Understanding how these mechanisms work often involves looking at the internal representation Pydantic uses: the Core Schema. In the next chapter, we'll delve into what this schema looks like and how Pydantic uses it.

Next: [Chapter 5: Core Schema & Validation/Serialization](05_core_schema___validation_serialization.md)

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
@@ -0,0 +1,251 @@

# Chapter 5: Core Schema & Validation/Serialization

In the previous chapters, we've seen how to define data structures using [BaseModel](01_basemodel.md), customize fields with [Field()](02_fields__fieldinfo___field_function_.md), set model-wide behavior with [Configuration](03_configuration__configdict___configwrapper_.md), and even add [Custom Logic](04_custom_logic__decorators___annotated_helpers_.md) using decorators. You might be wondering: how does Pydantic take all these Python definitions and use them to perform such fast and reliable validation and serialization?

The secret lies in an internal representation called the **Core Schema** and a high-performance engine called `pydantic-core`. Let's peek under the hood!

## Why Look Under the Hood?

Imagine you've designed a beautiful blueprint for a house (your Pydantic `BaseModel`). You've specified room sizes (type hints), special fixtures (`Field` constraints), and overall building codes (`ConfigDict`). You've even added custom inspection notes (decorators).

Now, how does the construction crew actually *build* the house and check everything rigorously? They don't just glance at the user-friendly blueprint. They work from a highly detailed **technical specification** derived from it. This spec leaves no room for ambiguity.

In Pydantic, the **`CoreSchema`** is that technical specification, and the **`pydantic-core`** engine (written in Rust) is the super-efficient construction crew that uses it. Understanding this helps explain:

* **Speed:** Why Pydantic is so fast.
* **Consistency:** How validation and serialization rules are strictly enforced.
* **Power:** How complex requirements are translated into concrete instructions.

## What is the Core Schema? The Technical Specification

When Pydantic processes your `BaseModel` definition (including type hints, `Field` calls, `ConfigDict`, decorators, etc.), it translates all that information into an internal data structure called the **Core Schema**.

Think of the Core Schema as:

1. **The Bridge:** It connects your user-friendly Python code to the high-performance Rust engine (`pydantic-core`).
2. **The Detailed Plan:** It's a precise, language-agnostic description of your data structure and all associated rules. It's like a very detailed dictionary or JSON object.
3. **The Single Source of Truth:** It captures *everything* needed for validation and serialization:
    * Field types (`str`, `int`, `datetime`, nested models, etc.)
    * Constraints (`min_length`, `gt`, `pattern`, etc. from `Field()`)
    * Aliases (`alias='userName'` from `Field()`)
    * Defaults (from `Field()` or `= default_value`)
    * Model-wide settings (`extra='forbid'`, `frozen=True` from `ConfigDict`)
    * Custom logic (references to your `@field_validator`, `@field_serializer` functions, etc.)

**Analogy:** Your Python `BaseModel` is the architect's blueprint. The `CoreSchema` is the exhaustive technical specification document derived from that blueprint, detailing every material, dimension, and construction step.

### A Glimpse of the Schema (Conceptual)

You don't normally interact with the Core Schema directly, but let's imagine what a simplified piece might look like for a field `name: str = Field(min_length=3)`.

```python
# Conceptual representation - the actual structure is more complex!
name_field_schema = {
    'type': 'str',             # The basic type expected
    'min_length': 3,           # Constraint from Field(min_length=3)
    'strict': False,           # Default strictness mode from config
    'strip_whitespace': None,  # Default string handling from config
    # ... other settings relevant to strings
}

# A schema for a whole model wraps field schemas:
model_schema = {
    'type': 'model',
    'cls': YourModelClass,  # Reference to the Python class
    'schema': {
        'type': 'model-fields',
        'fields': {
            'name': {'type': 'model-field', 'schema': name_field_schema},
            # ... schema for other fields ...
        },
        # ... details about custom model validators ...
    },
    'config': {  # Merged config settings
        'title': 'YourModelClass',
        'extra_behavior': 'ignore',
        'frozen': False,
        # ...
    },
    # ... details about custom serializers ...
}
```

This internal schema precisely defines what `pydantic-core` needs to know to handle the `name` field and the overall model during validation and serialization.

**Inspecting the Real Schema:**

Pydantic actually stores this generated schema on your model class. You can (carefully) inspect it:

```python
from pydantic import BaseModel, Field

class User(BaseModel):
    id: int
    username: str = Field(min_length=5, alias='userName')

# Access the generated core schema
# Warning: Internal structure, subject to change!
print(User.__pydantic_core_schema__)
# Output will be a complex dictionary representing the detailed schema
# (Output is large and complex, not shown here for brevity)
```

While you *can* look at `__pydantic_core_schema__`, treat it as an internal implementation detail. Its exact structure might change between Pydantic versions.
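If you do peek, keep the inspection read-only. A minimal sketch (the exact keys and layout vary between Pydantic versions):

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    id: int
    username: str = Field(min_length=5, alias='userName')


schema = User.__pydantic_core_schema__
print(type(schema))      # normally a plain dict (a CoreSchema typed dict)
print('type' in schema)  # the top-level entries include a 'type' key
```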

## What is `pydantic-core`? The Efficient Construction Crew

`pydantic-core` is the heart of Pydantic's performance. It's a separate library, written in Rust (a language known for speed and safety), that does the heavy lifting of validation and serialization.

**How it Works:**

1. **Input:** When your `BaseModel` class is first defined, Pydantic generates the `CoreSchema` (as described above).
2. **Compilation:** This `CoreSchema` is passed to the `pydantic-core` engine. The engine takes this schema and *compiles* it into highly optimized, specialized validator and serializer functions *specifically for your model*. Think of this as the crew studying the spec and preparing the exact tools needed for *this specific house*.
3. **Storage:** These compiled Rust objects are attached to your Python model class, typically as `__pydantic_validator__` and `__pydantic_serializer__`.

```python
# You can access these too (again, internal details!)
print(User.__pydantic_validator__)
# Output: <SchemaValidator 'User' ...> (a pydantic-core object)

print(User.__pydantic_serializer__)
# Output: <SchemaSerializer 'User' ...> (a pydantic-core object)
```

This "compilation" step happens only *once*, when the class is created. This makes subsequent validation and serialization extremely fast.

## Validation Flow: Checking Incoming Materials

When you create an instance of your model or validate data:

```python
# Example: Validation
try:
    user_data = {'id': 1, 'userName': 'validUser'}
    user = User(**user_data)  # Calls __init__ -> pydantic validation
    # or: user = User.model_validate(user_data)
except ValidationError as e:
    print(e)
```

Here's what happens behind the scenes:

1. **Call:** Your Python code triggers validation (e.g., via `__init__` or `model_validate`).
2. **Delegate:** Pydantic passes the input data (`user_data`) to the pre-compiled `User.__pydantic_validator__` (the Rust object).
3. **Execute:** The `pydantic-core` validator executes its optimized Rust code, guided by the rules baked in from the `CoreSchema`. It checks:
    * Types (is `id` an `int`? is `userName` a `str`?)
    * Coercion (can `'1'` be turned into `1` for `id`?)
    * Constraints (is `len('validUser') >= 5`?)
    * Aliases (use `userName` from input for the `username` field)
    * Required fields (is `id` present?)
    * Extra fields (handle according to `model_config['extra']`)
    * Custom validators (`@field_validator`, etc. are called back into Python if needed, though core logic is Rust)
4. **Result:**
    * If all checks pass, the validator returns the validated data, which Pydantic uses to create/populate the `User` instance.
    * If any check fails, the Rust validator gathers detailed error information and raises a `pydantic_core.ValidationError`, which Pydantic surfaces to your Python code.

**Analogy:** The construction crew takes the delivery of materials (`user_data`) and uses the technical spec (`CoreSchema` baked into the validator) to rigorously check if everything is correct (right type, right size, etc.). If not, they issue a detailed non-compliance report (`ValidationError`).
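That non-compliance report is machine-readable too: `ValidationError.errors()` returns one entry per failure. A small sketch, re-creating the same shape of `User` model as above:

```python
from pydantic import BaseModel, Field, ValidationError


class User(BaseModel):  # same shape as the earlier example
    id: int
    username: str = Field(min_length=5, alias='userName')


try:
    User.model_validate({'id': 'not-a-number', 'userName': 'abc'})
except ValidationError as e:
    for err in e.errors():
        # Each entry records where the failure happened and why.
        print(err['loc'], err['type'])
```

Here you would see one entry for `id` (it cannot be parsed as an integer) and one for `userName` (too short for `min_length=5`).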

## Serialization Flow: Generating Reports

When you dump your model instance:

```python
# Example: Serialization
user = User(id=1, userName='validUser')  # note: the alias is used for input
user_dict = user.model_dump()
# or: user_json = user.model_dump_json()
```

Here's the flow:

1. **Call:** Your Python code calls `model_dump()` or `model_dump_json()`.
2. **Delegate:** Pydantic passes the model instance (`user`) to the pre-compiled `User.__pydantic_serializer__` (the Rust object).
3. **Execute:** The `pydantic-core` serializer executes its optimized Rust code, again guided by the `CoreSchema`. It:
    * Iterates through the fields specified by the schema.
    * Applies serialization rules (e.g., use aliases if `by_alias=True`).
    * Handles `include`, `exclude`, `exclude_unset`, `exclude_defaults`, `exclude_none` logic efficiently.
    * Formats values for the target output (Python objects for `model_dump`, JSON types for `model_dump_json`).
    * Calls custom serializers (`@field_serializer`, etc.) back into Python if needed.
4. **Result:** The serializer returns the final dictionary or JSON string.

**Analogy:** The crew uses the technical spec (`CoreSchema` baked into the serializer) to generate a standardized report (`dict` or JSON) about the constructed house (`model instance`), formatting details (like using aliases) as requested.
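A quick sketch of those serializer options in action, re-creating the `User` model from above plus a hypothetical optional `nickname` field:

```python
from typing import Optional

from pydantic import BaseModel, Field


class User(BaseModel):  # mirrors the earlier example, plus an optional field
    id: int
    username: str = Field(min_length=5, alias='userName')
    nickname: Optional[str] = None


user = User(id=1, userName='validUser')

print(user.model_dump())
# {'id': 1, 'username': 'validUser', 'nickname': None}
print(user.model_dump(by_alias=True))
# {'id': 1, 'userName': 'validUser', 'nickname': None}
print(user.model_dump(exclude_none=True))
# {'id': 1, 'username': 'validUser'}
```

Each call goes through the same compiled serializer; only the output options differ.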

## Under the Hood: The Assembly Line

Let's visualize the entire process from defining a class to using it.

**Step-by-Step:**

1. **Definition:** You define your `class User(BaseModel): ...` in Python.
2. **Metaclass Magic:** When Python creates the `User` class, Pydantic's `ModelMetaclass` intercepts.
3. **Inspection:** The metaclass inspects the class definition: fields, type hints, `Field()` calls, `model_config`, decorators.
4. **Schema Generation (Python):** This information is fed into Pydantic's Python-based schema generation logic (`pydantic._internal._generate_schema`).
5. **CoreSchema Creation:** The generator produces the detailed `CoreSchema` data structure.
6. **Hand-off to Rust:** This `CoreSchema` is passed to the `pydantic-core` Rust library.
7. **Compilation (Rust):** `pydantic-core` creates optimized `SchemaValidator` and `SchemaSerializer` instances based *specifically* on that schema.
8. **Attachment:** These Rust-backed objects are attached to the `User` class as `__pydantic_validator__` and `__pydantic_serializer__`.
9. **Ready:** The `User` class is now fully prepared.
10. **Usage (Validation):** Calling `User(...)` uses `User.__pydantic_validator__` (Rust) to process input.
11. **Usage (Serialization):** Calling `user.model_dump()` uses `User.__pydantic_serializer__` (Rust) to generate output.

**Sequence Diagram:**

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant PyClassDef as Python Class Definition
    participant PydanticPy as Pydantic (Python Layer)
    participant CoreSchemaDS as CoreSchema (Data Structure)
    participant PydanticCore as pydantic-core (Rust Engine)
    participant UserCode as User Code

    Dev->>PyClassDef: Define `class User(BaseModel): ...`
    PyClassDef->>PydanticPy: Python creates class, Pydantic metaclass intercepts
    PydanticPy->>PydanticPy: Inspects fields, config, decorators
    PydanticPy->>CoreSchemaDS: Generates detailed CoreSchema
    PydanticPy->>PydanticCore: Pass CoreSchema to Rust engine
    PydanticCore->>PydanticCore: Compile SchemaValidator from CoreSchema
    PydanticCore->>PydanticCore: Compile SchemaSerializer from CoreSchema
    PydanticCore-->>PydanticPy: Return compiled Validator & Serializer objects
    PydanticPy->>PyClassDef: Attach Validator/Serializer to class object (`User`)

    UserCode->>PyClassDef: Instantiate: `User(...)` or `User.model_validate(...)`
    PyClassDef->>PydanticCore: Use attached SchemaValidator
    PydanticCore->>PydanticCore: Execute fast validation logic
    alt Validation OK
        PydanticCore-->>UserCode: Return validated instance/data
    else Validation Error
        PydanticCore-->>UserCode: Raise ValidationError
    end

    UserCode->>PyClassDef: Serialize: `user.model_dump()`
    PyClassDef->>PydanticCore: Use attached SchemaSerializer
    PydanticCore->>PydanticCore: Execute fast serialization logic
    PydanticCore-->>UserCode: Return dict/JSON string
```

**Code Location:**

* **Metaclass & Orchestration:** `pydantic._internal._model_construction.py` (handles class creation)
* **Schema Generation (Python side):** `pydantic._internal._generate_schema.py` (builds the schema structure)
* **Core Engine:** The `pydantic-core` library (Rust code, compiled). You interact with it via the `SchemaValidator` and `SchemaSerializer` objects attached to your models.
* **Schema Representation:** The `CoreSchema` itself is defined using types from `pydantic_core.core_schema`.

## Conclusion

You've now seen the engine behind Pydantic's power!

* Pydantic translates your Python model definitions (`BaseModel`, `Field`, `ConfigDict`, decorators) into a detailed, internal **`CoreSchema`**.
* This `CoreSchema` acts as the **technical specification** for your data.
* The high-performance **`pydantic-core`** engine (written in Rust) takes this schema and "compiles" it into optimized `SchemaValidator` and `SchemaSerializer` objects.
* These specialized objects perform fast **validation** (checking input) and **serialization** (dumping output) according to the rules defined in the schema.

This combination of a clear Python API and a powerful Rust core allows Pydantic to be both user-friendly and incredibly performant.

What if you want to leverage this powerful validation and serialization engine for types that *aren't* full `BaseModel` classes? Maybe just validate a standalone `list[int]` or serialize a `datetime` object according to specific rules? That's where `TypeAdapter` comes in handy.

Next: [Chapter 6: TypeAdapter](06_typeadapter.md)

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
365
output/Pydantic Core/06_typeadapter.md
Normal file
@@ -0,0 +1,365 @@
|
||||
# Chapter 6: TypeAdapter - Your Universal Data Handler
|
||||
|
||||
Welcome to the final chapter of our Pydantic Core tutorial! In [Chapter 5: Core Schema & Validation/Serialization](05_core_schema___validation_serialization.md), we dove deep into how Pydantic uses the `CoreSchema` and the `pydantic-core` engine to efficiently validate and serialize data for your `BaseModel` classes.
|
||||
|
||||
But what if you have data that *isn't* structured as a `BaseModel`? Imagine you receive a simple list of product IDs from an API, or you need to validate a function argument that's just a dictionary or a date. You still want Pydantic's powerful validation and maybe its smart serialization, but creating a whole `BaseModel` just for `list[int]` seems like overkill.
|
||||
|
||||
This is exactly where `TypeAdapter` comes in!
|
||||
|
||||
## The Problem: Handling Simple Types
|
||||
|
||||
Let's say you're working with a function that expects a list of user IDs, which should all be positive integers:
|
||||
|
||||
```python
|
||||
# Our expected data structure: a list of positive integers
|
||||
# Example: [101, 205, 300]
|
||||
|
||||
# Incoming data might be messy:
|
||||
raw_data_ok = '[101, "205", 300]' # Comes as JSON string, contains string number
|
||||
raw_data_bad = '[101, -5, "abc"]' # Contains negative number and non-number string
|
||||
|
||||
def process_user_ids(user_ids: list[int]):
|
||||
# How do we easily validate 'raw_data' conforms to list[int]
|
||||
# AND ensure all IDs are positive *before* this function runs?
|
||||
# And how do we handle the string "205"?
|
||||
for user_id in user_ids:
|
||||
print(f"Processing user ID: {user_id}")
|
||||
# We assume user_ids is already clean list[int] here
|
||||
```
|
||||
|
||||
Manually parsing the JSON, checking the type of the list and its elements, converting strings like `"205"` to integers, and validating positivity can be tedious and error-prone. We want Pydantic's magic for this simple list!

## Introducing `TypeAdapter`: The Universal Handler

`TypeAdapter` provides Pydantic's validation and serialization capabilities for **arbitrary Python types**, not just `BaseModel` subclasses.

**Analogy:** Think of `TypeAdapter` as a **universal quality checker and packager**. Unlike `BaseModel` (which is like a specific blueprint for a complex object), `TypeAdapter` can handle *any* kind of item – a list, a dictionary, an integer, a date, a union type, etc. – as long as you tell it the **type specification** the item should conform to.

It acts as a lightweight wrapper around Pydantic's core validation and serialization engine for any type hint you give it.

## Creating a `TypeAdapter`

You create a `TypeAdapter` by simply passing the Python type you want to handle to its initializer.

Let's create one for our `list[int]` requirement, but add the positivity constraint using `PositiveInt` from Pydantic's types.

```python
from typing import List

from pydantic import PositiveInt, TypeAdapter

# Define the specific type we want to validate against.
# This can be any Python type hint Pydantic understands.
UserIdListType = List[PositiveInt]

# Create the adapter for this type
user_id_list_adapter = TypeAdapter(UserIdListType)

print(user_id_list_adapter)
# Prints the adapter's repr (exact format varies by Pydantic version)
```

We now have `user_id_list_adapter`, an object specifically configured to validate data against the `List[PositiveInt]` type and serialize Python lists matching this type.

## Validation with `TypeAdapter`

The primary use case is validation. `TypeAdapter` offers methods similar to `BaseModel`'s `model_validate` and `model_validate_json`.

### `validate_python()`

This method takes a Python object (like a list or dict) and validates it against the adapter's type. It performs type checks, coercion (like converting `"205"` to `205`), and runs any defined constraints (like `PositiveInt`).

```python
from typing import List

from pydantic import PositiveInt, TypeAdapter, ValidationError

UserIdListType = List[PositiveInt]
user_id_list_adapter = TypeAdapter(UserIdListType)

# --- Example 1: Valid data (with coercion needed) ---
python_data_ok = [101, "205", 300]  # "205" needs converting to int

try:
    validated_list = user_id_list_adapter.validate_python(python_data_ok)
    print(f"Validation successful: {validated_list}")
    # Expected Output: Validation successful: [101, 205, 300]
    print(f"Types: {[type(x) for x in validated_list]}")
    # Expected Output: Types: [<class 'int'>, <class 'int'>, <class 'int'>]
except ValidationError as e:
    print(f"Validation failed: {e}")

# --- Example 2: Invalid data (negative number) ---
python_data_bad_value = [101, -5, 300]  # -5 is not PositiveInt

try:
    user_id_list_adapter.validate_python(python_data_bad_value)
except ValidationError as e:
    print(f"\nValidation failed as expected:\n{e}")
    # Expected Output (simplified):
    # Validation failed as expected:
    # 1 validation error for list[PositiveInt]
    # 1
    #   Input should be greater than 0 [type=greater_than, context={'gt': 0}, input_value=-5, input_type=int]

# --- Example 3: Invalid data (wrong type) ---
python_data_bad_type = [101, "abc", 300]  # "abc" cannot be coerced to int

try:
    user_id_list_adapter.validate_python(python_data_bad_type)
except ValidationError as e:
    print(f"\nValidation failed as expected:\n{e}")
    # Expected Output (simplified):
    # Validation failed as expected:
    # 1 validation error for list[PositiveInt]
    # 1
    #   Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='abc', input_type=str]
```

Just like with `BaseModel`, `TypeAdapter` gives you clear validation errors pinpointing the exact location and reason for the failure. It also handles useful type coercion automatically.
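
One caveat worth knowing: the coercion shown above is Pydantic's default "lax" mode. If you want `"205"` rejected rather than converted, `validate_python` accepts a `strict` keyword. A brief sketch, assuming Pydantic v2:

```python
from typing import List

from pydantic import PositiveInt, TypeAdapter, ValidationError

adapter = TypeAdapter(List[PositiveInt])

# Default (lax) mode coerces the string "205" to an int...
print(adapter.validate_python([101, "205"]))  # [101, 205]

# ...while strict mode rejects anything that isn't already an int
try:
    adapter.validate_python([101, "205"], strict=True)
except ValidationError as exc:
    print(f"strict mode: {exc.error_count()} error(s)")
```

Strict mode is handy at trust boundaries where silent coercion could mask bad data.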

### `validate_json()`

If your input data is a JSON string (or bytes/bytearray), you can use `validate_json()` to parse and validate in one step.

```python
# Continuing from above...

# Input as JSON strings
raw_data_ok_json = '[101, "205", 300]'
raw_data_bad_json = '[101, -5, "abc"]'

# Validate the good JSON
try:
    validated_list_from_json = user_id_list_adapter.validate_json(raw_data_ok_json)
    print(f"\nValidated from JSON: {validated_list_from_json}")
    # Expected Output: Validated from JSON: [101, 205, 300]
except ValidationError as e:
    print(f"\nJSON validation failed: {e}")

# Validate the bad JSON (both -5 and "abc" are reported)
try:
    user_id_list_adapter.validate_json(raw_data_bad_json)
except ValidationError as e:
    print(f"\nJSON validation failed as expected:\n{e}")
    # Expected Output (simplified):
    # JSON validation failed as expected:
    # 2 validation errors for list[PositiveInt]
    # 1
    #   Input should be greater than 0 [type=greater_than, context={'gt': 0}, input_value=-5, input_type=int]
    # 2
    #   Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='abc', input_type=str]
```

This is extremely handy for validating raw API request bodies or data loaded from JSON files without needing to parse the JSON yourself first.
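
The same one-step parsing works for arbitrarily nested types, not just flat lists. For example, a mapping of team names to member ID lists (a small sketch with made-up data):

```python
from typing import Dict, List

from pydantic import TypeAdapter

# An adapter for a nested structure: team name -> list of user IDs
team_adapter = TypeAdapter(Dict[str, List[int]])

teams = team_adapter.validate_json('{"red": [1, "2"], "blue": [3]}')
print(teams)  # {'red': [1, 2], 'blue': [3]}
```

Coercion and error reporting apply recursively at every level of the structure.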

## Serialization with `TypeAdapter`

`TypeAdapter` can also serialize Python objects according to the rules of its associated type, similar to `BaseModel.model_dump()` and `model_dump_json()`.

### `dump_python()`

Converts a Python object into a "dumped" representation (often simpler Python types). This is most useful when the type involves Pydantic models or types with custom serialization logic (like datetimes, enums, etc.). For simple types like `list[int]`, it might not change much.

Let's use a slightly more complex example: `List[datetime]`.

```python
from datetime import datetime
from typing import List

from pydantic import TypeAdapter

datetime_list_adapter = TypeAdapter(List[datetime])

# A list of datetime objects
dt_list = [datetime(2023, 1, 1, 12, 0, 0), datetime(2024, 7, 15, 9, 30, 0)]

# Dump to Python objects (datetimes remain datetimes by default)
dumped_python = datetime_list_adapter.dump_python(dt_list)
print(f"Dumped Python: {dumped_python}")
# Expected Output: Dumped Python: [datetime.datetime(2023, 1, 1, 12, 0), datetime.datetime(2024, 7, 15, 9, 30)]

# To get JSON-compatible types (strings), use mode='json'
dumped_for_json = datetime_list_adapter.dump_python(dt_list, mode='json')
print(f"Dumped for JSON: {dumped_for_json}")
# Expected Output: Dumped for JSON: ['2023-01-01T12:00:00', '2024-07-15T09:30:00']
```

### `dump_json()`

Directly serializes the Python object into a JSON string, using Pydantic's encoders (e.g., converting `datetime` to ISO 8601 strings).

```python
# Continuing with datetime_list_adapter and dt_list...

# Dump directly to a JSON string (returned as bytes)
dumped_json_str = datetime_list_adapter.dump_json(dt_list, indent=2)
print(f"\nDumped JSON:\n{dumped_json_str.decode()}")  # .decode() converts bytes to str for printing
# Expected Output:
# Dumped JSON:
# [
#   "2023-01-01T12:00:00",
#   "2024-07-15T09:30:00"
# ]
```

This uses the same powerful serialization engine as `BaseModel`, ensuring consistent output formats.
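
Because one adapter owns both the validator and the serializer, JSON round trips are symmetric: what `validate_json` parses, `dump_json` can write back out. A small sketch:

```python
from datetime import datetime
from typing import List

from pydantic import TypeAdapter

adapter = TypeAdapter(List[datetime])

# JSON -> Python objects -> JSON again
parsed = adapter.validate_json('["2023-01-01T12:00:00"]')
print(parsed)                     # [datetime.datetime(2023, 1, 1, 12, 0)]
print(adapter.dump_json(parsed))  # b'["2023-01-01T12:00:00"]'
```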

## Getting JSON Schema

You can also generate a [JSON Schema](https://json-schema.org/) for the type handled by the adapter using the `json_schema()` method.

```python
import json
from typing import List

from pydantic import PositiveInt, TypeAdapter

# Using our adapter from before...
UserIdListType = List[PositiveInt]
user_id_list_adapter = TypeAdapter(UserIdListType)

schema = user_id_list_adapter.json_schema()

print(f"\nJSON Schema:\n{json.dumps(schema, indent=2)}")
# Expected Output (approximately):
# JSON Schema:
# {
#   "items": {
#     "exclusiveMinimum": 0,
#     "type": "integer"
#   },
#   "type": "array"
# }
```

This schema accurately describes the expected data: an array (`"type": "array"`) where each item (`"items"`) must be an integer (`"type": "integer"`) that is greater than 0 (`"exclusiveMinimum": 0`).
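
Since the result is a plain `dict`, it can be inspected programmatically or embedded into larger documents such as OpenAPI specs. A small sketch:

```python
from typing import List

from pydantic import PositiveInt, TypeAdapter

schema = TypeAdapter(List[PositiveInt]).json_schema()

# The schema is an ordinary dict, so it can be checked or embedded directly
assert schema["type"] == "array"
assert schema["items"]["type"] == "integer"
assert schema["items"]["exclusiveMinimum"] == 0
print("schema checks passed")
```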

## Under the Hood: Direct Line to the Core

How does `TypeAdapter` work? It acts as a direct interface to the validation and serialization machinery we discussed in [Chapter 5](05_core_schema___validation_serialization.md).

**Step-by-Step:**

1. **Instantiation:** When you create `adapter = TypeAdapter(MyType)`, Pydantic immediately analyzes `MyType`.
2. **Schema Generation:** It generates the internal `CoreSchema` specifically for `MyType`, just like it would for a field within a `BaseModel`.
3. **Core Engine:** This `CoreSchema` is passed to the `pydantic-core` Rust engine.
4. **Compilation:** `pydantic-core` creates optimized `SchemaValidator` and `SchemaSerializer` objects based *only* on the `CoreSchema` for `MyType`.
5. **Storage:** These compiled validator and serializer objects are stored directly on the `TypeAdapter` instance (as `adapter.validator` and `adapter.serializer`).
6. **Usage:** When you call `adapter.validate_python(data)` or `adapter.dump_json(obj)`, the `TypeAdapter` simply delegates the call directly to its stored `SchemaValidator` or `SchemaSerializer`.

**Sequence Diagram:**
```mermaid
sequenceDiagram
    participant Dev as Developer
    participant TA as TypeAdapter
    participant PydanticPy as Pydantic (Python Layer)
    participant CoreSchemaDS as CoreSchema
    participant PydanticCore as pydantic-core (Rust Engine)

    Dev->>TA: adapter = TypeAdapter(List[PositiveInt])
    TA->>PydanticPy: Request schema generation for List[PositiveInt]
    PydanticPy->>CoreSchemaDS: Generate CoreSchema for List[PositiveInt]
    PydanticPy->>PydanticCore: Pass CoreSchema to Rust engine
    PydanticCore->>PydanticCore: Compile SchemaValidator for List[PositiveInt]
    PydanticCore->>PydanticCore: Compile SchemaSerializer for List[PositiveInt]
    PydanticCore-->>TA: Return compiled Validator & Serializer
    TA->>TA: Store validator on self.validator
    TA->>TA: Store serializer on self.serializer
    TA-->>Dev: Adapter instance is ready

    Dev->>TA: adapter.validate_python(data)
    TA->>PydanticCore: Call self.validator.validate_python(data)
    PydanticCore-->>TA: Return validated data or raise ValidationError
    TA-->>Dev: Return result

    Dev->>TA: adapter.dump_json(obj)
    TA->>PydanticCore: Call self.serializer.to_json(obj)
    PydanticCore-->>TA: Return JSON bytes
    TA-->>Dev: Return result
```

Unlike `BaseModel`, where the validator/serializer are attached to the *class*, with `TypeAdapter` they are attached to the *instance* of the adapter. This makes `TypeAdapter` a neat, self-contained tool for handling specific types.
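
A practical consequence of this instantiation-time compilation: building a `TypeAdapter` does real work, so it pays to create adapters once (for example at module level) and reuse them, rather than constructing one per call. A sketch with hypothetical names:

```python
from typing import List

from pydantic import TypeAdapter

# Compile the validator/serializer once, at import time
USER_IDS_ADAPTER = TypeAdapter(List[int])

def handle_request(body: bytes) -> list[int]:
    # Every call reuses the already-compiled core validator
    return USER_IDS_ADAPTER.validate_json(body)

print(handle_request(b'[1, "2", 3]'))  # [1, 2, 3]
```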

**Code Location:**

* The main logic is in `pydantic/type_adapter.py`.
* The `TypeAdapter.__init__` method orchestrates the process:
  * It determines the correct Python namespaces for resolving type hints.
  * It calls internal schema generation logic (`pydantic._internal._generate_schema.GenerateSchema`) to build the `CoreSchema` for the given type.
  * It uses `pydantic_core.SchemaValidator(core_schema, config)` and `pydantic_core.SchemaSerializer(core_schema, config)` to create the core engine objects.
  * These are stored on the instance as `self.validator` and `self.serializer`.
* Methods like `validate_python`, `dump_json`, etc., are thin wrappers that call the corresponding methods on `self.validator` or `self.serializer`.

```python
# Simplified conceptual view from pydantic/type_adapter.py

from pydantic_core import CoreSchema, SchemaSerializer, SchemaValidator
# ... other imports

class TypeAdapter(Generic[T]):
    core_schema: CoreSchema
    validator: SchemaValidator | PluggableSchemaValidator  # actually uses PluggableSchemaValidator internally
    serializer: SchemaSerializer

    def __init__(self, type: Any, *, config: ConfigDict | None = None, ...):
        self._type = type
        self._config = config
        # ... (fetch parent frame namespaces) ...
        ns_resolver = _namespace_utils.NsResolver(...)

        # ... call internal _init_core_attrs ...
        self._init_core_attrs(ns_resolver=ns_resolver, force=True)

    def _init_core_attrs(self, ns_resolver, force, raise_errors=False):
        # ... simplified schema generation ...
        config_wrapper = _config.ConfigWrapper(self._config)
        schema_generator = _generate_schema.GenerateSchema(config_wrapper, ns_resolver)
        try:
            core_schema = schema_generator.generate_schema(self._type)
            self.core_schema = schema_generator.clean_schema(core_schema)
            core_config = config_wrapper.core_config(None)

            # Create and store the validator and serializer.
            # Note: the actual code uses create_schema_validator for plugin support.
            self.validator = SchemaValidator(self.core_schema, core_config)
            self.serializer = SchemaSerializer(self.core_schema, core_config)
            self.pydantic_complete = True
        except Exception:
            # Handle errors, potentially set mocks if the build fails
            # ...
            pass

    def validate_python(self, object: Any, /, **kwargs) -> T:
        # Delegates directly to the stored validator
        return self.validator.validate_python(object, **kwargs)

    def validate_json(self, data: str | bytes | bytearray, /, **kwargs) -> T:
        # Delegates directly to the stored validator
        return self.validator.validate_json(data, **kwargs)

    def dump_python(self, instance: T, /, **kwargs) -> Any:
        # Delegates directly to the stored serializer
        return self.serializer.to_python(instance, **kwargs)

    def dump_json(self, instance: T, /, **kwargs) -> bytes:
        # Delegates directly to the stored serializer
        return self.serializer.to_json(instance, **kwargs)

    def json_schema(self, **kwargs) -> dict[str, Any]:
        # Generates a JSON schema from self.core_schema
        schema_generator_instance = GenerateJsonSchema(**kwargs)
        return schema_generator_instance.generate(self.core_schema, mode=kwargs.get('mode', 'validation'))
```

## Conclusion

Congratulations! You've learned about `TypeAdapter`, a flexible tool for applying Pydantic's validation and serialization to any Python type, not just `BaseModel`s.

* It's ideal for validating simple types, function arguments, or data structures where a full `BaseModel` isn't necessary.
* You create it by passing the target type: `TypeAdapter(YourType)`.
* It provides `.validate_python()`, `.validate_json()`, `.dump_python()`, `.dump_json()`, and `.json_schema()` methods.
* It works by generating a `CoreSchema` for the target type and using dedicated `SchemaValidator` and `SchemaSerializer` instances from `pydantic-core`.

`TypeAdapter` completes our tour of the essential concepts in Pydantic V2. You've journeyed from the basic `BaseModel` blueprint, through customizing fields and configuration, adding custom logic, understanding the core schema engine, and finally, applying these powers universally with `TypeAdapter`.

We hope this tutorial has given you a solid foundation for using Pydantic effectively to build robust, reliable, and well-defined data interfaces in your Python applications. Happy coding!

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)
38
output/Pydantic Core/index.md
Normal file
# Tutorial: Pydantic Core

Pydantic Core provides the fundamental machinery for **data validation**, **parsing**, and **serialization** in Pydantic. It takes Python *type hints* and uses them to define how data should be structured and processed. Users typically interact with it by defining classes that inherit from `BaseModel`, which automatically gets validation and serialization capabilities based on its annotated fields. Pydantic Core ensures data conforms to the defined types and allows converting between Python objects and formats like JSON efficiently, leveraging Rust for performance.

**Source Repository:** [https://github.com/pydantic/pydantic/tree/6c38dc93f40a47f4d1350adca9ec0d72502e223f/pydantic](https://github.com/pydantic/pydantic/tree/6c38dc93f40a47f4d1350adca9ec0d72502e223f/pydantic)

```mermaid
flowchart TD
    A0["BaseModel"]
    A1["Fields (FieldInfo / Field function)"]
    A2["Core Schema & Validation/Serialization"]
    A3["Configuration (ConfigDict / ConfigWrapper)"]
    A4["Custom Logic (Decorators & Annotated Helpers)"]
    A5["TypeAdapter"]
    A0 -- "Contains and defines" --> A1
    A0 -- "Is configured by" --> A3
    A0 -- "Applies custom logic via" --> A4
    A1 -- "Is converted into" --> A2
    A3 -- "Configures core engine for" --> A2
    A4 -- "Modifies validation/seriali..." --> A2
    A5 -- "Uses core engine for" --> A2
    A5 -- "Can be configured by" --> A3
```

## Chapters

1. [BaseModel](01_basemodel.md)
2. [Fields (FieldInfo / Field function)](02_fields__fieldinfo___field_function_.md)
3. [Configuration (ConfigDict / ConfigWrapper)](03_configuration__configdict___configwrapper_.md)
4. [Custom Logic (Decorators & Annotated Helpers)](04_custom_logic__decorators___annotated_helpers_.md)
5. [Core Schema & Validation/Serialization](05_core_schema___validation_serialization.md)
6. [TypeAdapter](06_typeadapter.md)

---

Generated by [AI Codebase Knowledge Builder](https://github.com/The-Pocket/Tutorial-Codebase-Knowledge)