Fix: Marshmallow Not Working — Schema Errors, Load vs Dump, and Field Validation
Quick Answer
How to fix Marshmallow errors — Schema not validated on dump, ValidationError messages format, unknown field handling, missing vs default, post_load object construction, and Marshmallow 3 to 4 migration.
The Error
You define a Marshmallow schema and validation runs in a direction you did not expect:

from marshmallow import Schema, fields

class UserSchema(Schema):
    name = fields.Str(required=True)
    age = fields.Int(required=True)

schema = UserSchema()
result = schema.dump({"name": "Alice"})  # No age
print(result)  # {"name": "Alice"}: no error, age silently missing

Or invalid input raises, but the error format is not what you expected:

result = schema.load({"name": "Alice", "age": "not a number"})
# Raises ValidationError: {"age": ["Not a valid integer."]}

Or unknown fields are rejected by one schema and silently dropped by another:

result = schema.load({"name": "Alice", "age": 30, "extra": "value"})
# Marshmallow 3 default (unknown = RAISE): ValidationError {"extra": ["Unknown field."]}
# With unknown = EXCLUDE: {"name": "Alice", "age": 30}, "extra" silently dropped

Or post_load doesn't construct objects as expected:

# Inside the schema class; needs "from marshmallow import post_load"
@post_load
def make_user(self, data, **kwargs):
    return User(**data)

result = schema.load({"name": "Alice", "age": 30})
print(type(result))  # User instance, but only if the @post_load hook is defined on this schema

Or you migrate from Marshmallow 3 to 4 and decorators change behavior.
Marshmallow is the Python serialization library that predates Pydantic — declarative schemas, separate load() (deserialize/validate) and dump() (serialize) methods, post-processing hooks. It’s still heavily used in Flask and Pyramid ecosystems via flask-marshmallow and marshmallow-sqlalchemy. The load/dump asymmetry and unknown-field handling produce specific failure modes that don’t exist in Pydantic’s single-direction model. This guide covers each.
Why This Happens
Marshmallow separates two directions explicitly:
- load(): JSON/dict → Python object (validates input)
- dump(): Python object → JSON/dict (serializes output)
Only load() runs validators. dump() formats values but doesn't validate, which is useful for serializing trusted internal data and surprising when you expect validation everywhere.
The default for unknown fields changed between major versions: Marshmallow 2 silently ignored them, while Marshmallow 3 raises a ValidationError (unknown = RAISE) unless the schema's Meta or the load() call sets EXCLUDE or INCLUDE. If a schema silently drops fields, EXCLUDE is set somewhere, often in a shared base class.
Fix 1: Basic Schema Setup
from datetime import datetime

from marshmallow import Schema, fields, validate

class UserSchema(Schema):
    id = fields.Int(dump_only=True)  # Only on dump (e.g., DB-assigned)
    name = fields.Str(required=True, validate=validate.Length(min=1, max=100))
    email = fields.Email(required=True)
    age = fields.Int(required=True, validate=validate.Range(min=0, max=150))
    is_active = fields.Bool(load_default=True)  # Optional on load with default
    created_at = fields.DateTime(dump_only=True)

schema = UserSchema()

# Load (deserialize and validate)
data = schema.load({"name": "Alice", "email": "alice@example.com", "age": 30})
print(data)  # {"name": "Alice", "email": "alice@example.com", "age": 30, "is_active": True}

# Dump (serialize)
user_obj = {
    "id": 1,
    "name": "Alice",
    "email": "alice@example.com",
    "age": 30,
    "is_active": True,
    "created_at": datetime.now(),
}
result = schema.dump(user_obj)
print(result)  # {"id": 1, "name": "Alice", ...}

- dump_only: field appears only in output and is never loaded from input.
- load_only: field appears only in input (passwords, secrets) and is removed from output.
- load_default: default value used during load if the key is missing.
- dump_default: default value used during dump if the attribute is missing.
Common Mistake: Using the older default and missing arguments. Marshmallow 3.13 deprecated them in favor of dump_default and load_default. Specify load_default and dump_default explicitly, since they often differ. For example, id has no load default (auto-generated by DB) but might have a dump default (None for new instances).
For Pydantic comparison and migration patterns, see Pydantic validation error.
Fix 2: Validation Errors
from marshmallow import ValidationError

try:
    schema.load({"name": "Alice", "email": "not-an-email", "age": -5})
except ValidationError as err:
    print(err.messages)
    # {
    #     "email": ["Not a valid email address."],
    #     "age": ["Must be greater than or equal to 0 and less than or equal to 150."]
    # }
    print(err.valid_data)
    # Only the data that passed validation, e.g. "name": "Alice"

err.messages is a dict of field → list of error messages. For nested schemas the dict is nested the same way as the input.
Custom error messages:
class UserSchema(Schema):
    age = fields.Int(
        required=True,
        validate=validate.Range(min=0, max=150),
        error_messages={"required": "Age is required", "invalid": "Age must be an integer"},
    )

Custom validators:
from datetime import date

from marshmallow import Schema, fields, validates, validates_schema, ValidationError

class UserSchema(Schema):
    age = fields.Int(required=True)
    birth_year = fields.Int(required=True)

    @validates("age")
    def validate_age(self, value, **kwargs):
        if value < 0:
            raise ValidationError("Age cannot be negative")
        if value > 150:
            raise ValidationError("Age unreasonably high")

    @validates_schema  # Validate across multiple fields
    def validate_consistency(self, data, **kwargs):
        current_year = date.today().year
        if "birth_year" in data and "age" in data:
            expected_age = current_year - data["birth_year"]
            if abs(data["age"] - expected_age) > 1:
                raise ValidationError(
                    f"Age {data['age']} doesn't match birth_year {data['birth_year']}"
                )

Common Mistake: Raising ValueError from a validator instead of ValidationError. ValueError isn't caught by Marshmallow's error handling, so it propagates as a raw Python error. Always use marshmallow.ValidationError for validation-related issues.
Fix 3: Unknown Field Handling
from marshmallow import Schema, fields, INCLUDE, EXCLUDE, RAISE

class UserSchema(Schema):
    name = fields.Str()
    age = fields.Int()

    class Meta:
        unknown = EXCLUDE  # Drop unknown fields silently
        # Other options:
        # unknown = INCLUDE: pass unknown fields through to the result, unvalidated
        # unknown = RAISE: raise ValidationError on unknown fields (the Marshmallow 3 default)

EXCLUDE is a common (and surprisingly silent) choice, often set once in a shared base schema:

schema = UserSchema()
result = schema.load({"name": "Alice", "age": 30, "ssn": "123-45-6789"})
# {"name": "Alice", "age": 30}: ssn silently dropped

This is often what you want (frontend sends extra fields, you don't care) but it can hide bugs.
Use RAISE for strict validation:

class StrictSchema(Schema):
    name = fields.Str()

    class Meta:
        unknown = RAISE

schema = StrictSchema()
schema.load({"name": "Alice", "extra": "value"})
# ValidationError: {"extra": ["Unknown field."]}

Per-call override:

result = schema.load(data, unknown=RAISE)

Pro Tip: Use RAISE for inbound user input (API request bodies), where it catches client bugs such as typos and deprecated field names immediately. Use EXCLUDE for trusted internal data flow where you want forward-compatibility. The contract for an external API should be strict; the contract for internal services can be permissive.
Fix 4: dump() Doesn’t Validate
class UserSchema(Schema):
    age = fields.Int(required=True, validate=validate.Range(min=0, max=150))

schema = UserSchema()
result = schema.dump({"age": -10})
print(result)  # {"age": -10}: no validation!

dump() serializes existing data without validation. If you want validation on output too:

# Manually re-validate
schema.load(schema.dump(user_obj))

Or use the schema's validate method directly:

errors = schema.validate(data)
if errors:
    print(errors)  # Same format as ValidationError.messages

Common Mistake: Assuming dump() validates and shipping invalid data. The asymmetry is intentional (dumping a database row shouldn't fail just because the DB has dirty data), but new users expect symmetric behavior. If you need validation on dump, do it explicitly.
Fix 5: post_load and Object Construction
from dataclasses import dataclass

from marshmallow import Schema, fields, post_load

@dataclass
class User:
    name: str
    age: int

class UserSchema(Schema):
    name = fields.Str(required=True)
    age = fields.Int(required=True)

    @post_load
    def make_user(self, data, **kwargs):
        return User(**data)

schema = UserSchema()
result = schema.load({"name": "Alice", "age": 30})
print(type(result))  # <class 'User'>
print(result.name)  # Alice

post_load runs after validation and converts the validated dict to a domain object.

dump() already reads attributes from plain objects, so a User instance serializes without extra hooks. Use pre_dump only when the object needs reshaping first:

# Inside the schema class; needs "from marshmallow import pre_dump"
@pre_dump
def to_dict(self, obj, **kwargs):
    if isinstance(obj, User):
        return {"name": obj.name, "age": obj.age}
    return obj

Pure Marshmallow has no Meta.model binding, so fields are always declared by hand:

class UserSchema(Schema):
    name = fields.Str()
    age = fields.Int()

    class Meta:
        # No automatic model binding in pure Marshmallow
        # For SQLAlchemy auto-binding, use marshmallow-sqlalchemy
        ...

For SQLAlchemy integration that auto-builds schemas from models, see SQLAlchemy not working.
Fix 6: Nested Schemas
class AddressSchema(Schema):
    street = fields.Str()
    city = fields.Str()

class UserSchema(Schema):
    name = fields.Str(required=True)
    address = fields.Nested(AddressSchema, required=True)
    secondary_addresses = fields.List(fields.Nested(AddressSchema))

data = {
    "name": "Alice",
    "address": {"street": "123 Main", "city": "NYC"},
    "secondary_addresses": [
        {"street": "456 Side St", "city": "Boston"},
    ],
}
result = UserSchema().load(data)

Forward references for self-referential schemas:

class CategorySchema(Schema):
    name = fields.Str()
    children = fields.List(fields.Nested("CategorySchema"))  # String reference

Partial nested loading:

result = UserSchema().load(data, partial=True)
# All required fields become optional

Common Mistake: Using a class reference in fields.Nested(AddressSchema) when the class hasn't been defined yet (forward reference). Use a string "AddressSchema" for forward references; Marshmallow resolves it at load time. Class references only work for already-defined schemas.
Fix 7: Flask Integration with flask-marshmallow
pip install flask-marshmallow flask-sqlalchemy marshmallow-sqlalchemy

from flask import Flask, request, jsonify
from flask_sqlalchemy import SQLAlchemy
from flask_marshmallow import Marshmallow
from marshmallow import ValidationError

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///app.db"
db = SQLAlchemy(app)
ma = Marshmallow(app)

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(100))
    email = db.Column(db.String(100))

class UserSchema(ma.SQLAlchemyAutoSchema):
    class Meta:
        model = User
        load_instance = True  # Returns User instance from load()

user_schema = UserSchema()
users_schema = UserSchema(many=True)

@app.route("/users", methods=["GET"])
def list_users():
    users = User.query.all()
    return jsonify(users_schema.dump(users))

@app.route("/users", methods=["POST"])
def create_user():
    try:
        user = user_schema.load(request.json, session=db.session)
        db.session.add(user)
        db.session.commit()
        return user_schema.dump(user), 201
    except ValidationError as err:
        return jsonify({"errors": err.messages}), 400

SQLAlchemyAutoSchema introspects the model, so there is no need to redeclare fields. load_instance=True builds a model instance directly on load.
For Flask routing patterns that pair with marshmallow, see Flask 404 not found.
Fix 8: Meta.fields and Meta.exclude
class UserSchema(Schema):
    id = fields.Int()
    name = fields.Str()
    email = fields.Str()
    password_hash = fields.Str()

    class Meta:
        # Only include specific fields
        fields = ("id", "name", "email")
        # Or exclude specific fields
        # exclude = ("password_hash",)

Per-call field selection:

# Dump only specific fields
schema = UserSchema(only=("name", "email"))
schema.dump(user_obj)  # {"name": "...", "email": "..."}

# Exclude fields
schema = UserSchema(exclude=("password_hash",))

Common Mistake: Putting password_hash in the schema but forgetting to exclude it on dump. The hash gets serialized into responses. Always use load_only=True on password fields (they appear in load, but not dump):

class UserSchema(Schema):
    name = fields.Str()
    password = fields.Str(load_only=True)  # Never appears in dump output

dump_only + load_only cleanly separates directions:

class UserSchema(Schema):
    id = fields.Int(dump_only=True)  # DB-generated, no input
    password = fields.Str(load_only=True)  # User input, never output
    name = fields.Str()  # Bidirectional

Production Incident Lens: API Contracts Drift Silently
The reason a marshmallow schema breaks in production is rarely “wrong code today.” It is “wrong code three releases ago that nobody noticed until a downstream consumer changed.” The blast radius is per-endpoint: every consumer of the affected endpoint either gets stale fields, missing fields, or wrong types. With dump() skipping validation, that drift can ship undetected for weeks.
Incident pattern — the renamed field that wasn’t. A backend developer renamed created_at → creation_time in the model. The schema still serializes both — created_at via dump_only=True mapped to the old DB column, creation_time via attribute="creation_time". Each release picks the one that exists. A frontend deploy starts reading creation_time because backend said it would; backend then reverts the change because of a migration bug; frontend now sees null for every timestamp because the schema is still emitting creation_time but the attribute resolves to nothing. No exception, no log line — just blank dates in the UI.
Mitigation: snapshot the schema in CI.
# tests/test_schema_contract.py
import json

def test_user_schema_matches_snapshot(snapshot):
    schema = UserSchema()
    sample = {"id": 1, "name": "Alice", "email": "alice@example.com"}  # ... plus the remaining fields
    # pytest-snapshot compares text, so serialize the dumped dict deterministically
    dumped = json.dumps(schema.dump(sample), indent=2, sort_keys=True)
    snapshot.assert_match(dumped, "user_schema.json")

Use pytest-snapshot (shown here) or syrupy. Any field rename, type change, or field removal fails the snapshot test, forcing an explicit decision instead of a silent drift.
Incident pattern — the unknown field that broke a deploy. A frontend team adds device_id to login requests. The marshmallow schema has unknown = RAISE. Every login starts failing with {"device_id": ["Unknown field."]} because the backend hasn’t been updated. The frontend team rolled forward; the backend rolled back; the gap was hours.
Mitigation: per-direction unknown policy.
class LoginSchema(Schema):
    username = fields.Str(required=True)
    password = fields.Str(required=True, load_only=True)

    class Meta:
        unknown = EXCLUDE  # Tolerate new fields from clients

For incoming requests from clients you don't fully control, EXCLUDE is more resilient. Reserve RAISE for service-to-service contracts where both sides ship together; frontend-to-backend is rarely that case.
Blast radius checklist for schema changes:
- List every endpoint that uses the schema. If you don’t know, your schemas need a registry.
- For each, identify the consumers. Mobile apps shipped a year ago can still be in the wild.
- Add a snapshot test before merging. If the snapshot updates, the PR description must explain why.
- Deprecate before removing. Keep old fields with dump_default=None for at least one release cycle.
- Version the API path or media type when breaking the contract is unavoidable.
Common Mistake: Removing a field from a schema during a “cleanup” PR. The PR looks safe — no callers in the repo. But callers outside the repo (mobile clients, third-party integrations, scheduled jobs in another service) silently break. The schema is part of the contract; treat removals like database drops, not like code refactors.
The audit obligation is real: every schema change is a contract change. Most teams add a BREAKING: prefix to commit messages for schema removals so the release notes surface them automatically.
Still Not Working?
Marshmallow vs Pydantic
- Marshmallow — Mature, separate load/dump, no Pydantic-style class-as-instance pattern. Best for Flask projects, complex serialization workflows.
- Pydantic — Single-direction validation, class instances are the data, faster for typical workloads. Best for FastAPI, modern type-driven code.
- msgspec — Fastest, less flexible. Best for high-throughput message decoding where every microsecond counts.
For new projects without Flask-marshmallow legacy, Pydantic 2 is usually the right choice. Marshmallow’s strength is its load/dump asymmetry and post-processing hooks — sometimes exactly what you need.
Migrating from Marshmallow 2 to 3 to 4
Marshmallow 3 (2019) was a major rewrite; Marshmallow 4 builds on it by removing APIs that 3.x had deprecated, so clearing deprecation warnings on the latest 3.x release is a sensible first step. Key migration points:
- load() and dump() return data directly (no .data attribute in v3+)
- MarshalResult and UnmarshalResult namedtuples removed in v3
- default argument renamed to dump_default (3.13+)
- missing argument renamed to load_default (3.13+)
Common Mistake: Following Marshmallow 2 tutorials with v3+. The .data attribute on results doesn’t exist — result = schema.load(data) directly returns the data dict.
Custom Field Types
from marshmallow import Schema, fields, ValidationError

class HexColor(fields.Field):
    def _serialize(self, value, attr, obj, **kwargs):
        # int -> "#rrggbb"; anything else is passed through unchanged
        return f"#{value:06x}" if isinstance(value, int) else value

    def _deserialize(self, value, attr, data, **kwargs):
        if not isinstance(value, str) or not value.startswith("#"):
            raise ValidationError("Must be a hex color string")
        try:
            return int(value[1:], 16)
        except ValueError:
            raise ValidationError("Invalid hex color")

class ThemeSchema(Schema):
    primary_color = HexColor()

result = ThemeSchema().load({"primary_color": "#FF5733"})
print(result)  # {"primary_color": 16734003}

dump = ThemeSchema().dump({"primary_color": 16734003})
print(dump)  # {"primary_color": "#ff5733"}

_serialize is for dump; _deserialize is for load. Pair them for round-trip conversion.
Polymorphic Schemas
For unions / discriminated types:
pip install marshmallow-oneofschema

from marshmallow import Schema, fields
from marshmallow_oneofschema import OneOfSchema

class CatSchema(Schema):
    name = fields.Str()
    meow_count = fields.Int()

class DogSchema(Schema):
    name = fields.Str()
    bark_volume = fields.Int()

class PetSchema(OneOfSchema):
    type_schemas = {"cat": CatSchema, "dog": DogSchema}

Use when a payload can be one of several different types; the schema picks the right sub-schema based on a discriminator field.
Testing Schemas
import pytest
from marshmallow import ValidationError

def test_valid():
    data = {"name": "Alice", "age": 30, "email": "alice@example.com"}
    result = UserSchema().load(data)
    assert result["name"] == "Alice"

def test_invalid_email():
    with pytest.raises(ValidationError) as exc:
        UserSchema().load({"name": "Alice", "age": 30, "email": "invalid"})
    assert "email" in exc.value.messages

def test_required():
    with pytest.raises(ValidationError) as exc:
        UserSchema().load({"name": "Alice"})
    assert "age" in exc.value.messages

Pair these tests with a baseline of representative payloads: real anonymized API requests from staging logs. Synthetic tests pass; real-world payloads catch the unknown-field handling, encoding quirks, and edge cases your schema accidentally tolerated.
Integration with WebArgs (for Flask request parsing)
from webargs import fields
from webargs.flaskparser import use_args

@app.route("/users", methods=["POST"])
@use_args({
    "name": fields.Str(required=True),
    "age": fields.Int(required=True),
})
def create_user(args):
    # args is a dict with validated input
    return jsonify(args)

webargs uses marshmallow under the hood. It is a concise way to validate Flask request bodies and query strings without defining a full Schema class.
Combining with Pydantic in a Codebase
Some teams use Pydantic for new code, Marshmallow for legacy:
# Convert Pydantic to Marshmallow data
pydantic_model.model_dump()  # → dict
marshmallow_schema.dump(pydantic_model.model_dump())  # → output dict

For Pydantic Settings patterns that overlap, see Pydantic Settings not working.
Datetime Serialization Doesn’t Match Frontend Expectations
The classic source of “the date is wrong” tickets. Marshmallow’s fields.DateTime() defaults to ISO 8601 with no explicit timezone. A naive datetime in your DB serializes as "2026-05-22T10:30:00" (no Z, no offset); the frontend parses it as local time and displays a different hour to every user.
Fix by explicit format and timezone-aware datetimes:
from datetime import datetime, timezone

from marshmallow import Schema, fields

class EventSchema(Schema):
    occurred_at = fields.DateTime(format="iso", required=True)

schema = EventSchema()

# Make sure you store timezone-aware datetimes
schema.dump({"occurred_at": datetime(2026, 5, 22, 10, 30, tzinfo=timezone.utc)})
# {"occurred_at": "2026-05-22T10:30:00+00:00"}

For APIs consumed by browsers, force UTC at the schema level and let the client localize. Mixing zones at the schema is how you ship bugs that only appear during daylight saving transitions.
many=True Reports Errors by Index, Not by Item
When loading a list with many=True, one invalid item makes load() raise a ValidationError for the whole batch. Marshmallow does not skip invalid items on its own: the messages are keyed by list index, and err.valid_data holds whatever did validate.

try:
    result = UserSchema(many=True).load([
        {"name": "Alice", "age": 30, "email": "alice@example.com"},
        {"name": "Bob", "age": "thirty", "email": "bad"},  # Invalid
    ])
except ValidationError as err:
    print(err.messages)
    # {1: {"age": ["Not a valid integer."], "email": ["Not a valid email address."]}}

The error dict is keyed by index, not by item content. If your bulk endpoint silently drops bad items instead of raising, check whether you wrapped the load in a try/except that suppresses ValidationError. That is how data loss happens at scale.
Schema Performance Cliff on Large Lists
Marshmallow is reasonably fast for small payloads but slows on very large lists — thousands of nested items can take seconds to serialize. The cause is per-field method dispatch through Python.
Mitigations:
- Serialize only what the endpoint needs with only=(...). Note that partial=True affects load() only and does not speed up dump().
- For pure speed, switch the hot endpoint to msgspec or orjson with manual schemas.
- Pre-compute derived fields outside the schema. A schema method that calls the DB per item is an O(N) query trap (one query per item): load the related data once, then map by ID inside pre_dump.
If your endpoint serializes hundreds of items per request and CPU profiles show marshmallow at the top, the rewrite is usually faster than tuning. Marshmallow is a correctness tool; it is not optimized for tight loops.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.