Fix: Weaviate Not Working — Client v4 Migration, Schema Setup, and Vectorizer Errors

How to fix Weaviate errors — client v3 to v4 migration breaking imports, schema creation property mismatch, vectorizer module not loaded, connection refused localhost 8080, batch import errors, and hybrid search alpha tuning.

The Error

You install the Python client and tutorials don’t work:

import weaviate
client = weaviate.Client("http://localhost:8080")
# AttributeError: module 'weaviate' has no attribute 'Client'

Or the schema rejects your collection:

weaviate.exceptions.UnexpectedStatusCodeError:
Collection creation failed: data type 'string' is not supported, use 'text'

Or the vectorizer module isn’t available:

no module 'text2vec-openai' configured on cluster

Or batch imports fail silently and you only notice missing data later:

with client.batch as batch:
    for doc in docs:
        batch.add_data_object(...)
# Some succeed, some don't — no exception raised

Or hybrid search alpha tuning produces unexpected results:

client.query.get("Article", ["title", "content"]) \
    .with_hybrid(query="ML", alpha=0.5) \
    .do()
# Returns only keyword matches, no semantic matches

Weaviate is a vector database with hybrid search built in: it combines vector similarity with keyword search and exposes GraphQL queries. The v4 Python client (released late 2023) introduced a completely new API surface with type-safe collections, breaking every v3 tutorial. The module system (text2vec-openai, text2vec-cohere, text2vec-transformers) requires explicit configuration on the cluster side. This guide covers each.

Why This Happens

Weaviate’s v4 client redesigned the API around typed Collections instead of dictionaries — much cleaner but completely incompatible with v3. Most tutorials online predate this and use weaviate.Client(url) which doesn’t exist in v4.

Vectorizer modules run on the Weaviate server — when you write a document, the server calls the module to compute the embedding. If the module isn’t enabled (via env vars at startup), schema creation fails or documents get no vectors.

Fix 1: v3 to v4 Client Migration

# OLD — v3 (broken in v4)
import weaviate

client = weaviate.Client("http://localhost:8080")
client.schema.create_class({...})

# NEW — v4
import weaviate

client = weaviate.connect_to_local()
# Or for cloud (newer client releases name this helper connect_to_weaviate_cloud)
client = weaviate.connect_to_wcs(
    cluster_url="https://...weaviate.network",
    auth_credentials=weaviate.auth.AuthApiKey("..."),
)

# Always close
client.close()

Context manager (recommended):

import weaviate

with weaviate.connect_to_local() as client:
    # Operations here
    print(client.is_ready())

v3 → v4 API changes:

v3	v4
`weaviate.Client(url)`	`weaviate.connect_to_local()` / `connect_to_wcs()`
`client.schema.create_class({...})`	`client.collections.create(...)`
`client.data_object.create({...}, "Article")`	`articles = client.collections.get("Article"); articles.data.insert({...})`
`client.query.get("Article", ["title"]).do()`	`articles.query.fetch_objects()`
Dict-based responses	Typed objects with attributes

Connection helpers:

# Local Weaviate (Docker on localhost:8080)
client = weaviate.connect_to_local()

# Custom local config
client = weaviate.connect_to_local(
    host="localhost",
    port=8080,
    grpc_port=50051,
    headers={"X-OpenAI-Api-Key": "sk-..."},   # For OpenAI vectorizer
)

# Weaviate Cloud Services
client = weaviate.connect_to_wcs(
    cluster_url="https://your-cluster.weaviate.network",
    auth_credentials=weaviate.auth.AuthApiKey("your-api-key"),
    headers={"X-OpenAI-Api-Key": "sk-..."},
)

# Custom URL with auth
client = weaviate.connect_to_custom(
    http_host="weaviate.example.com",
    http_port=443,
    http_secure=True,
    grpc_host="weaviate.example.com",
    grpc_port=443,
    grpc_secure=True,
    auth_credentials=weaviate.auth.AuthApiKey("..."),
)

Common Mistake: Following a tutorial that uses weaviate.Client(url) — that API doesn’t exist in v4. The error (module has no attribute 'Client') is clear, but new users assume v4 is broken. Always check the tutorial date — anything before late 2023 uses v3.
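
Before debugging further, it can help to confirm which major version of the client is actually installed. The sketch below reads the installed version of the weaviate-client package through the standard library; the helper names are illustrative, not part of the Weaviate client.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '4.9.3' or '4.4b1'."""
    head = version.split(".", 1)[0]
    digits = "".join(ch for ch in head if ch.isdigit())
    return int(digits)


def installed_client_major(package: str = "weaviate-client"):
    """Major version of the installed client, or None if the package is missing."""
    try:
        return major_version(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None


major = installed_client_major()
if major is None:
    print("weaviate-client is not installed in this environment")
elif major < 4:
    print(f"v{major} client installed: v3-style tutorials apply, v4 examples will not")
else:
    print(f"v{major} client installed: use connect_to_local() and collections")
```

If this reports v4 and the tutorial you are following calls weaviate.Client(url), the tutorial is the thing that is out of date.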

Fix 2: Creating Collections (Schemas)

import weaviate
import weaviate.classes.config as wvc

with weaviate.connect_to_local() as client:
    client.collections.create(
        name="Article",
        properties=[
            wvc.Property(name="title", data_type=wvc.DataType.TEXT),
            wvc.Property(name="content", data_type=wvc.DataType.TEXT),
            wvc.Property(name="published_at", data_type=wvc.DataType.DATE),
            wvc.Property(name="author_id", data_type=wvc.DataType.INT),
            wvc.Property(name="tags", data_type=wvc.DataType.TEXT_ARRAY),
        ],
        vectorizer_config=wvc.Configure.Vectorizer.text2vec_openai(
            model="text-embedding-3-small",
        ),
        generative_config=wvc.Configure.Generative.openai(
            model="gpt-4o-mini",
        ),
    )

Data types:

v4 DataType	Use for
`TEXT`	Strings (full-text + vector indexed)
`TEXT_ARRAY`	List of strings
`INT`	Integers
`NUMBER`	Floats
`BOOL`	Booleans
`DATE`	RFC 3339 datetime
`UUID`	UUID strings
`GEO_COORDINATES`	Geo lat/lng
`BLOB`	Base64-encoded binary
`OBJECT`	Nested object
`OBJECT_ARRAY`	List of nested objects

Common error — wrong data type:

data type 'string' is not supported, use 'text'

Weaviate deprecated the string data type in favor of text (the change landed around v1.19). The v4 client uses the new names; manual REST API calls or v3 examples may still use string and fail.
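
If you are porting an old v3 class definition, a small rewrite pass can rename the legacy types before you recreate the collection. This is a minimal sketch that assumes the v3 dictionary layout (a "properties" list whose entries carry a "dataType" list); it only renames types and does not carry over tokenization settings.

```python
import copy

# Legacy v3 data type names and their current replacements.
LEGACY_TYPES = {"string": "text", "string[]": "text[]"}


def modernize_properties(class_def: dict) -> dict:
    """Return a copy of a v3-style class definition with legacy types renamed."""
    updated = copy.deepcopy(class_def)
    for prop in updated.get("properties", []):
        prop["dataType"] = [LEGACY_TYPES.get(t, t) for t in prop["dataType"]]
    return updated


legacy = {
    "class": "Article",
    "properties": [
        {"name": "title", "dataType": ["string"]},
        {"name": "tags", "dataType": ["string[]"]},
        {"name": "author_id", "dataType": ["int"]},
    ],
}

for prop in modernize_properties(legacy)["properties"]:
    print(prop["name"], prop["dataType"])
```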

Vectorizer module options:

# OpenAI embeddings (requires X-OpenAI-Api-Key header)
vectorizer_config=wvc.Configure.Vectorizer.text2vec_openai(
    model="text-embedding-3-small",   # or text-embedding-3-large
)

# Cohere
vectorizer_config=wvc.Configure.Vectorizer.text2vec_cohere(
    model="embed-english-v3.0",
)

# HuggingFace Inference API (hosted; not a local model)
vectorizer_config=wvc.Configure.Vectorizer.text2vec_huggingface(
    model="sentence-transformers/all-MiniLM-L6-v2",
)

# Local Weaviate transformers container
vectorizer_config=wvc.Configure.Vectorizer.text2vec_transformers()

# No vectorizer — provide your own vectors at insert time
vectorizer_config=wvc.Configure.Vectorizer.none()

Fix 3: Enabling Modules on the Server

no module 'text2vec-openai' configured on cluster

Modules must be enabled when starting Weaviate.

Docker Compose example:

# docker-compose.yml
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
      - "50051:50051"
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: "/var/lib/weaviate"
      DEFAULT_VECTORIZER_MODULE: "text2vec-openai"
      ENABLE_MODULES: "text2vec-openai,text2vec-cohere,text2vec-huggingface,generative-openai,qna-openai,ref2vec-centroid"
      CLUSTER_HOSTNAME: "node1"
    volumes:
      - ./weaviate_data:/var/lib/weaviate

Pass API keys as client headers, which are sent with every request (recommended over baking them into the cluster):

client = weaviate.connect_to_local(
    headers={
        "X-OpenAI-Api-Key": "sk-...",
        "X-Cohere-Api-Key": "...",
    },
)

The cluster forwards these headers to the vectorizer module. Multiple users on the same cluster can use their own API keys.

Verify modules:

meta = client.get_meta()
print(meta["modules"])
# {'text2vec-openai': {...}, 'generative-openai': {...}, ...}

Pro Tip: Enable more modules than you currently use — they’re inert until referenced. Enabling text2vec-openai, text2vec-cohere, text2vec-huggingface, generative-openai, and generative-cohere covers most use cases. Adding modules later requires a server restart.
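
To turn the module check into a startup guard, compare the modules you need against what the server reports. The sketch below works on the dictionary returned by client.get_meta() as shown above; the function name and the sample data are illustrative.

```python
def missing_modules(meta: dict, required: list) -> list:
    """Return the required module names that the server does not report as enabled."""
    enabled = meta.get("modules") or {}
    return [name for name in required if name not in enabled]


# Stand-in for client.get_meta(); a real response has more fields.
sample_meta = {
    "version": "1.x",
    "modules": {"text2vec-openai": {}, "generative-openai": {}},
}

missing = missing_modules(sample_meta, ["text2vec-openai", "text2vec-cohere"])
if missing:
    print("Add to ENABLE_MODULES and restart:", ", ".join(missing))
```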

Fix 4: Inserting and Batch Imports

articles = client.collections.get("Article")

# Single insert
articles.data.insert({
    "title": "My Article",
    "content": "Full text...",
    "published_at": "2025-04-24T10:00:00Z",
})

# Batch insert (efficient for many items)
with articles.batch.dynamic() as batch:
    for doc in documents:
        batch.add_object(
            properties={
                "title": doc["title"],
                "content": doc["content"],
            },
        )

# Check for batch errors
if len(articles.batch.failed_objects) > 0:
    for failed in articles.batch.failed_objects:
        print(f"Failed: {failed.message}")

Common Mistake: Not checking failed_objects after a batch import. Weaviate’s batch API queues operations and continues on failures — you’d never know rows were dropped without checking. Always inspect failed_objects and failed_references after a batch:

with articles.batch.dynamic() as batch:
    for doc in documents:
        batch.add_object(properties=doc)
    # Batch flushes on context exit

print(f"Inserted: {len(documents) - len(articles.batch.failed_objects)}")
print(f"Failed: {len(articles.batch.failed_objects)}")
for failed in articles.batch.failed_objects[:5]:   # Show first 5 errors
    print(f"  {failed.original_uuid}: {failed.message}")
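
With large imports the same error usually repeats thousands of times, so grouping failures by message is often more useful than printing each one. A minimal sketch, assuming only that each failed entry exposes a message attribute as in the loop above; the SimpleNamespace objects are stand-ins for real failed objects.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_failures(failed_objects, top: int = 3) -> list:
    """Group batch failures by error message and return the most common ones."""
    counts = Counter(failure.message for failure in failed_objects)
    return counts.most_common(top)


# Stand-ins for articles.batch.failed_objects
fake_failures = [
    SimpleNamespace(message="vectorizer timeout"),
    SimpleNamespace(message="invalid date"),
    SimpleNamespace(message="vectorizer timeout"),
]

for message, count in summarize_failures(fake_failures):
    print(f"{count:>5}  {message}")
```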

Batch with custom vectors (skip the vectorizer):

import numpy as np

with articles.batch.dynamic() as batch:
    for doc, vec in zip(documents, embeddings):
        batch.add_object(
            properties=doc,
            vector=vec.tolist(),   # Use your pre-computed embedding
        )

Fixed-size batch:

with articles.batch.fixed_size(batch_size=200, concurrent_requests=2) as batch:
    for doc in documents:
        batch.add_object(properties=doc)

fixed_size flushes when the batch hits batch_size; dynamic adapts based on server response time.

Fix 5: Querying — Vector, Keyword, and Hybrid

articles = client.collections.get("Article")

# Vector search (semantic)
response = articles.query.near_text(
    query="machine learning",
    limit=10,
)
for obj in response.objects:
    print(obj.properties["title"], obj.metadata.distance)

# Keyword search (BM25)
response = articles.query.bm25(
    query="machine learning",
    limit=10,
)

# Hybrid search (combines both)
response = articles.query.hybrid(
    query="machine learning",
    alpha=0.75,   # 0 = pure keyword, 1 = pure vector
    limit=10,
)

alpha parameter for hybrid:

Alpha	Behavior
`0.0`	Pure BM25 keyword search
`0.25`	Mostly keyword, some vector
`0.5`	Balanced
`0.75`	Mostly vector, some keyword
`1.0`	Pure vector search

Common Mistake: Setting alpha=0.5 and being surprised by what dominates. The two scores have different distributions — BM25 scores are unbounded, vector scores are typically [0, 1]. Weaviate normalizes them, but the exact balance varies by query. Try alpha=0.7 or alpha=0.3 and test on real queries; the “right” alpha is workload-specific.
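
To build intuition for why the balance shifts per query, here is a simplified model of score fusion: min-max normalize each result set, then take an alpha-weighted sum. This is an illustration only, not Weaviate's implementation; the server's fusion algorithms (rankedFusion and relativeScoreFusion) differ in detail, and the scores below are made up.

```python
def min_max(scores: dict) -> dict:
    """Scale scores into [0, 1]; a set with identical scores maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def fuse(keyword: dict, vector: dict, alpha: float) -> list:
    """Rank ids by alpha * vector score + (1 - alpha) * keyword score."""
    kw, vec = min_max(keyword), min_max(vector)
    combined = {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in sorted(set(kw) | set(vec))
    }
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


keyword_scores = {"a": 12.0, "b": 3.0, "c": 1.0}    # unbounded BM25-style scores
vector_scores = {"b": 0.91, "c": 0.90, "d": 0.40}   # similarity-style scores

for alpha in (0.0, 0.5, 1.0):
    ranking = [doc_id for doc_id, _ in fuse(keyword_scores, vector_scores, alpha)]
    print(alpha, ranking)
```

Document "a" has a dominant keyword score but no vector hit, so where it lands depends heavily on how the other scores in each list are spread, not just on alpha.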

Filter results:

import weaviate.classes.query as wq

response = articles.query.hybrid(
    query="machine learning",
    filters=wq.Filter.by_property("author_id").equal(123),
    limit=10,
)

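# Compound filter (date comparisons take a timezone-aware datetime)
from datetime import datetime, timezone

filters = wq.Filter.all_of([
    wq.Filter.by_property("published_at").greater_than(
        datetime(2024, 1, 1, tzinfo=timezone.utc)
    ),
    wq.Filter.by_property("tags").contains_any(["ml", "ai"]),
])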

response = articles.query.near_text(
    query="...",
    filters=filters,
    limit=10,
)

Return specific properties:

import weaviate.classes.query as wq

response = articles.query.fetch_objects(
    return_properties=["title", "author_id"],
    return_metadata=wq.MetadataQuery(creation_time=True),   # distance is only populated by vector searches
    limit=10,
)

Fix 6: References Between Collections

# Create related collections
client.collections.create(
    name="Author",
    properties=[wvc.Property(name="name", data_type=wvc.DataType.TEXT)],
)

client.collections.create(
    name="Article",
    properties=[
        wvc.Property(name="title", data_type=wvc.DataType.TEXT),
    ],
    references=[
        wvc.ReferenceProperty(name="author", target_collection="Author"),
    ],
)

Insert with reference:

import uuid

author_uuid = client.collections.get("Author").data.insert(
    {"name": "Alice"},
)

article_uuid = client.collections.get("Article").data.insert(
    properties={"title": "ML Intro"},
    references={"author": author_uuid},
)

Query with references:

import weaviate.classes.query as wq

response = articles.query.fetch_objects(
    return_references=wq.QueryReference(
        link_on="author",
        return_properties=["name"],
    ),
    limit=10,
)

for obj in response.objects:
    print(obj.properties["title"])
    for ref in obj.references["author"].objects:
        print(f"  by {ref.properties['name']}")

For comparing Weaviate’s reference model to other vector databases, see Pinecone not working and Qdrant not working.

Fix 7: Generative Search (RAG Built-In)

Weaviate can run generation directly — pass retrieved docs to an LLM as context, return generated text:

import weaviate.classes.generate as wgen

articles = client.collections.get("Article")

response = articles.generate.near_text(
    query="machine learning basics",
    grouped_task="Summarize these articles in 2 paragraphs.",
    limit=5,
)

print(response.generated)   # Single generated text from all 5 docs

Per-result generation:

response = articles.generate.near_text(
    query="how do transformers work",
    single_prompt="Rewrite this title as a question: {title}",
    limit=5,
)

for obj in response.objects:
    print(obj.properties["title"])
    print(f"  → {obj.generated}")   # Generated per object

Configure generative module per query:

import weaviate.classes.generate as wgen
import weaviate.classes.config as wvc

# At collection level
client.collections.create(
    name="Article",
    properties=[...],
    generative_config=wvc.Configure.Generative.openai(model="gpt-4o-mini"),
)

# Override at query time (requires X-OpenAI-Api-Key header on the client)
response = articles.generate.near_text(
    query="...",
    grouped_task="Summarize",
    generative_provider=wgen.GenerativeConfig.openai(model="gpt-4o"),
)

For LLM API patterns that interact with Weaviate’s generative module, see OpenAI API not working.

Fix 8: Backup and Migration

# Backup to filesystem (enable the backup-filesystem module and set BACKUP_FILESYSTEM_PATH in the server env)
client.backup.create(
    backup_id="backup-2025-04-24",
    backend="filesystem",
    include_collections=["Article", "Author"],
    wait_for_completion=True,
)

# Restore
client.backup.restore(
    backup_id="backup-2025-04-24",
    backend="filesystem",
    include_collections=["Article"],
    wait_for_completion=True,
)

S3 backup backend:

# docker-compose.yml additions
environment:
  ENABLE_MODULES: "backup-s3,..."
  BACKUP_S3_BUCKET: "my-weaviate-backups"
  BACKUP_S3_ENDPOINT: "s3.amazonaws.com"
  AWS_ACCESS_KEY_ID: "..."
  AWS_SECRET_ACCESS_KEY: "..."

client.backup.create(backup_id="weekly-2025-04-24", backend="s3")

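Common Mistake: Expecting a restore to merge with a schema you changed after the backup. A backup captures the schema as it was when the backup ran, and a restore fails if a collection with the same name already exists. If you delete the collection and restore, you get the OLD schema back and any properties added since are lost. Always take a fresh backup AFTER schema changes you want to preserve.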

Platform Differences and Deployment Targets

Weaviate runs in three meaningfully different topologies — Weaviate Cloud, self-hosted (Docker or Kubernetes), and embedded. The Python client is the same, but performance characteristics, module availability, and operational burden are not. The vectorizer module choice is also a fork in the road that’s hard to reverse.

Weaviate Cloud (WCD, formerly WCS). Managed multi-tenant service from Weaviate Inc. Provision a cluster via the web console; auth uses an API key. Cluster URLs look like https://your-cluster-id.weaviate.network. Modules (text2vec-openai, generative-openai, etc.) are pre-enabled. You pass downstream API keys as headers per request. Zero ops, predictable pricing tied to dimensions and object count. The trade-off: you can’t install custom modules, no on-prem deployment, and outbound calls to OpenAI/Cohere go from Weaviate Cloud’s region, which may not match your app’s region.

Self-hosted Docker. The most common dev and small-prod setup. Single container with persistent volume. Modules are enabled via ENABLE_MODULES env var at startup — adding a module later requires container restart. Authentication is opt-in (AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true" for local dev). For production single-node, set AUTHENTICATION_APIKEY_ENABLED: "true" with a fixed key. Don’t run anonymous in any environment reachable from the internet.

Self-hosted Kubernetes via Helm. The official Helm chart (weaviate/weaviate) deploys a StatefulSet with persistent volume claims. Configure modules and resources via values.yaml:

modules:
  text2vec-openai:
    enabled: true
  generative-openai:
    enabled: true
resources:
  requests:
    memory: 4Gi
    cpu: 1
  limits:
    memory: 8Gi

For multi-node, the chart supports replicas: 3 with CLUSTER_GOSSIP_BIND_PORT and CLUSTER_DATA_BIND_PORT automatically wired. Storage class matters — HNSW indexes do random reads, so use SSD-backed storage classes (gp3 on EKS, pd-ssd on GKE), not standard HDD-backed defaults.

Kubernetes operator. Weaviate Inc. publishes a Kubernetes operator (Custom Resource Definition) that manages cluster lifecycle, automated backups to S3, and rolling upgrades. Use the operator when you run multiple Weaviate clusters in the same Kubernetes setup. For a single cluster, the Helm chart is simpler.

Embedded Weaviate. A lightweight mode for prototyping and testing. Start with weaviate.connect_to_embedded(): the client downloads a Weaviate binary and runs it locally as a child process (by default on HTTP port 8079 and gRPC port 50060). No Docker required. Embedded mode has limitations: not all modules work (especially the transformer-based ones), it is single-process only, and data persists in ~/.local/share/weaviate between runs unless overridden. Useful for unit tests; not for production.

Module choices and their cost models. Vectorization modules each have a different tradeoff:

text2vec-openai. Highest quality embeddings (text-embedding-3-large), priced at roughly $0.13 per million tokens for that model. Calls go to OpenAI’s API, which adds 100–300ms latency per batch and depends on OpenAI’s uptime.
text2vec-cohere. Quality close to OpenAI on English; Cohere’s multilingual model is better for non-English. Similar pricing model.
text2vec-huggingface. Calls the HuggingFace Inference API. Cheaper but rate-limited on the free tier. Latency varies.
text2vec-transformers. Runs the model locally inside a separate container next to Weaviate. No per-token cost, no external API dependency, but uses GPU/CPU resources. Pin a specific model version; default sentence-transformers/all-MiniLM-L6-v2 is small but English-only.
No vectorizer (Vectorizer.none()). Compute embeddings in your app and pass vector= on insert. Most flexible — use this when you want a Modal or Replicate-hosted custom embedding model, or when batching embeddings across multiple data stores.

Cohere/OpenAI key rotation. When you rotate the OpenAI key, the cluster’s stored key (if set via OPENAI_APIKEY env var) goes stale. Pass keys as request headers (X-OpenAI-Api-Key) instead — the client overrides the cluster-level key per request, and rotating just means updating your app’s config. Use cluster-level keys only when every tenant shares a single billing account.
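
One way to keep rotation confined to app config is to build the header dictionary from environment variables at connection time. A minimal sketch: the header names come from this guide, while the environment variable names (OPENAI_API_KEY, COHERE_API_KEY) are an assumed app-side convention.

```python
import os

# App-side env var name -> header the Weaviate client should send.
# The env var names are an assumption; use whatever your deployment defines.
HEADER_BY_ENV = {
    "OPENAI_API_KEY": "X-OpenAI-Api-Key",
    "COHERE_API_KEY": "X-Cohere-Api-Key",
}


def provider_headers(env=None) -> dict:
    """Build the headers dict, skipping providers whose key is unset or empty."""
    env = os.environ if env is None else env
    return {
        header: env[var]
        for var, header in HEADER_BY_ENV.items()
        if env.get(var)
    }


headers = provider_headers({"OPENAI_API_KEY": "sk-example"})
print(sorted(headers))
```

The result can be passed as headers= to any of the connection helpers, so rotating a key means changing one environment variable and restarting the app.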

Connection options — REST vs gRPC. The v4 client uses gRPC by default (port 50051) for data operations and REST (port 8080) for metadata. Both must be reachable. Behind a proxy or load balancer that only forwards HTTP/1.1, gRPC fails silently — you’ll see hangs on client.is_ready(). Configure your ingress for HTTP/2 + TLS, or use Weaviate Cloud which handles this for you.
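
A quick first check is whether both ports accept TCP connections at all. The sketch below uses only the standard library; it confirms reachability, not that a proxy speaks HTTP/2, so a passing result does not rule out the proxy problem described above. The default host and ports mirror the local setup used in this guide.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_endpoints(host: str = "localhost", http_port: int = 8080,
                    grpc_port: int = 50051) -> dict:
    """Check the REST and gRPC ports the v4 client needs."""
    return {
        "http": port_open(host, http_port),
        "grpc": port_open(host, grpc_port),
    }


print(check_endpoints())
```

If "http" is reachable and "grpc" is not, look at the port mapping in docker-compose.yml or the ingress rules before touching client code.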

Still Not Working?

Weaviate vs Other Vector DBs

Weaviate — Hybrid search built-in, GraphQL queries, generative search module, references between collections. Best for RAG with structured metadata.
Pinecone — Managed SaaS, simpler, no built-in hybrid. See Pinecone not working.
Qdrant — Strong filtering, self-hostable. See Qdrant not working.
ChromaDB — Simplest for prototypes. See ChromaDB not working.

Weaviate’s hybrid search and generative module make it strong for RAG. Choose Pinecone for zero-ops; Qdrant for self-hosted with rich filters.

Connection Refused Errors

ConnectionRefusedError: [Errno 111] Connection refused

Weaviate not running. Verify:

docker ps                            # Is the container up?
curl http://localhost:8080/v1/.well-known/ready   # Ready endpoint

If Docker says “exited”, check logs:

docker logs weaviate-1
# Look for "module not found", "permission denied", or memory errors

Multi-Tenancy

Weaviate supports tenant isolation — each tenant gets its own vector index, with independent data:

import weaviate.classes.config as wvc

client.collections.create(
    name="Article",
    properties=[wvc.Property(name="title", data_type=wvc.DataType.TEXT)],
    multi_tenancy_config=wvc.Configure.multi_tenancy(enabled=True),
)

articles = client.collections.get("Article")

# Create tenants (Tenant lives in weaviate.classes.tenants, not classes.config)
from weaviate.classes.tenants import Tenant

articles.tenants.create([
    Tenant(name="customer-a"),
    Tenant(name="customer-b"),
])

# Insert into specific tenant
tenant_a = articles.with_tenant("customer-a")
tenant_a.data.insert({"title": "Customer A's article"})

# Query within tenant
response = articles.with_tenant("customer-a").query.fetch_objects()

Multi-tenancy is far more efficient than putting tenant IDs in metadata and filtering — Weaviate stores separate HNSW indexes per tenant, queries are isolated, and inactive tenants can be offloaded to disk to save RAM.

Cross-References vs Embedded Properties

When modeling related data, choose between references (separate collection + link) or embedded objects (nested property):

# Embedded (denormalized — duplicate data, fast queries)
client.collections.create(
    name="Article",
    properties=[
        wvc.Property(name="title", data_type=wvc.DataType.TEXT),
        wvc.Property(
            name="author",
            data_type=wvc.DataType.OBJECT,
            nested_properties=[
                wvc.Property(name="name", data_type=wvc.DataType.TEXT),
                wvc.Property(name="bio", data_type=wvc.DataType.TEXT),
            ],
        ),
    ],
)

# Referenced (normalized — single source of truth, requires join queries)
# See Fix 6 above

Use embedded for immutable or rarely-updated relationships (article author at write time). Use references when the related data changes frequently (current user profile).

Memory and Resource Limits

Weaviate’s vector indexes live in RAM by default. For collections with millions of objects:

# Set HNSW build parameters and cap the in-memory vector cache
client.collections.create(
    name="Article",
    properties=[...],
    vector_index_config=wvc.Configure.VectorIndex.hnsw(
        ef_construction=128,
        max_connections=16,
        vector_cache_max_objects=100_000,
    ),
)

Tune Docker memory limits to match the vector cache budget. When the cache is smaller than the index, Weaviate quietly evicts vectors and reads them back from disk, so query latency climbs well before any error fires.
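
For a rough memory budget, a commonly cited rule of thumb is about twice the raw size of the vectors (4 bytes per float32 dimension). The sketch below applies that rule; the 2x overhead factor is an assumption that varies with HNSW settings and compression, so treat the result as a starting point for load testing.

```python
def estimate_vector_memory_bytes(num_objects: int, dimensions: int,
                                 bytes_per_dim: int = 4,
                                 overhead: float = 2.0) -> float:
    """Raw float32 vector size multiplied by a rule-of-thumb overhead factor."""
    return num_objects * dimensions * bytes_per_dim * overhead


def to_gib(num_bytes: float) -> float:
    """Convert bytes to GiB."""
    return num_bytes / 2**30


estimate = estimate_vector_memory_bytes(num_objects=1_000_000, dimensions=1536)
print(f"{to_gib(estimate):.1f} GiB")
```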

Async Client

import weaviate

async with weaviate.use_async_with_local() as client:
    await client.is_ready()
    articles = client.collections.get("Article")
    response = await articles.query.fetch_objects(limit=10)

Use the async client whenever you call Weaviate from FastAPI or any other ASGI app — the sync client blocks the event loop for the duration of every gRPC round-trip.

Schema Changes Are Forward-Only

Weaviate lets you add properties to an existing collection but not remove them or change their types. If you need to drop a property or change TEXT to INT, the only path is: dump the data by iterating the collection (collection.iterator()), delete the collection, recreate it with the new schema, and re-insert. Plan schemas defensively: store extra metadata as a single nested metadata object of type OBJECT so future shape changes don’t require collection drops.
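
The reshaping step in that dump-and-recreate path is plain dictionary work and can be tested without a server. A minimal sketch, assuming each dumped row is the properties dictionary of one object; the function names are illustrative, and the delete, recreate, and re-insert calls are left out.

```python
def reshape(properties: dict, drop=(), casts=None) -> dict:
    """Drop unwanted properties and cast others to their new types."""
    casts = casts or {}
    reshaped = {}
    for name, value in properties.items():
        if name in drop:
            continue
        if name in casts and value is not None:
            value = casts[name](value)
        reshaped[name] = value
    return reshaped


def migrate(rows, drop=(), casts=None):
    """Lazily reshape every dumped row for the new schema."""
    for properties in rows:
        yield reshape(properties, drop, casts)


old_rows = [
    {"title": "ML Intro", "author_id": "42", "legacy_flag": True},
    {"title": "HNSW Notes", "author_id": None, "legacy_flag": False},
]

for row in migrate(old_rows, drop={"legacy_flag"}, casts={"author_id": int}):
    print(row)
```

Feeding the generator into a batch insert against the recreated collection keeps memory flat even for large dumps.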

HNSW Tuning for Speed vs Recall

Default HNSW settings (ef_construction=128, max_connections=64) target balanced behavior. For high-recall use cases (legal, medical search), raise ef_construction to 256 and ef to 200: slower inserts and queries but better recall. For latency-critical use cases (autocomplete, low-latency suggestions), drop ef to 64. Note that ef is part of the collection’s vector index config, not a per-query argument. It can be changed on an existing collection through a config update, while ef_construction and max_connections are fixed at creation, so endpoints that need different recall profiles need separate collections or a compromise value.

Reindexing After Embedding Model Upgrade

When you upgrade from text-embedding-3-small to text-embedding-3-large, existing vectors stay in the old embedding space and, at default settings, the old dimension (1536 versus 3072). Vectors from different models are not comparable, so mixing them in one index is not an option. Reindex by iterating through the collection, regenerating vectors (or letting the vectorizer module recompute via re-insert into a collection configured for the new model), and writing them back. There’s no in-place upgrade. For collections over 10M objects, plan an hours-to-days reindex window or run a parallel new collection and dual-write during the cutover.
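
The reindex loop itself is mostly batching. The sketch below chunks the dumped rows and pairs each with a fresh vector; embed is a hypothetical callable (a list of texts in, a list of vectors out) standing in for whatever embedding client you use, and the "content" field name follows the Article examples above.

```python
from itertools import islice


def chunked(items, size: int):
    """Yield lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def reembed(rows, embed, batch_size: int = 100):
    """Yield (properties, vector) pairs, calling embed once per chunk."""
    for chunk in chunked(rows, batch_size):
        vectors = embed([row["content"] for row in chunk])
        if len(vectors) != len(chunk):
            raise ValueError("embed returned a different number of vectors than inputs")
        yield from zip(chunk, vectors)


def fake_embed(texts):
    """Placeholder embedding for illustration: one dimension, the text length."""
    return [[float(len(text))] for text in texts]


rows = [{"content": "ab"}, {"content": "abc"}, {"content": "a"}]
for properties, vector in reembed(rows, fake_embed, batch_size=2):
    print(properties["content"], vector)
```

Each pair can then go to batch.add_object(properties=..., vector=...) on the new collection, as in the custom-vector batch example in Fix 4.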