Quick Start
This guide walks through one complete working example in a single script of roughly 50 lines: indexing documents, running semantic search, running BM25 keyword search, and filtering by metadata.
Prerequisites: IRIS running via Docker and the package installed. See Docker Setup and Installation.
1. Set credentials
export IRIS_CONNECTION_STRING="localhost:1972/USER"
export IRIS_USERNAME="_system"
export IRIS_PASSWORD="SYS"
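Before running the script, it can help to confirm that all three variables are visible to Python. The following is a minimal sketch, assuming the store reads its connection settings from these environment variables (the script in the next step passes no connection arguments). `missing_iris_settings` is a hypothetical helper name, not part of the package.

```python
import os

# The three variables exported above.
REQUIRED_VARS = ("IRIS_CONNECTION_STRING", "IRIS_USERNAME", "IRIS_PASSWORD")


def missing_iris_settings(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; pass a dict to check something
    other than the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_iris_settings()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All IRIS connection variables are set.")
```

Run it in the same shell session as the `export` commands; variables exported in a different terminal will not be visible.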
2. Complete example
quickstart.py
from haystack import Document, Pipeline
from haystack.components.embedders import (
    SentenceTransformersDocumentEmbedder,
    SentenceTransformersTextEmbedder,
)
from haystack.components.writers import DocumentWriter
from haystack.document_stores.types import DuplicatePolicy
from intersystems_iris_haystack.document_stores import IRISDocumentStore
from intersystems_iris_haystack.components.retrievers import (
    IRISBm25Retriever,
    IRISEmbeddingRetriever,
)
MODEL = "sentence-transformers/all-MiniLM-L6-v2"
# ── Initialize the store ──────────────────────────────────────────────────
store = IRISDocumentStore(embedding_dim=384)
print(f"Documents before indexing: {store.count_documents()}")
# ── Indexing pipeline ─────────────────────────────────────────────────────
indexing = Pipeline()
indexing.add_component("embedder", SentenceTransformersDocumentEmbedder(model=MODEL))
indexing.add_component("writer", DocumentWriter(store, policy=DuplicatePolicy.OVERWRITE))
indexing.connect("embedder.documents", "writer.documents")
documents = [
    Document(content="IRIS is a high-performance multimodel database.", meta={"category": "db"}),
    Document(content="Haystack is a framework for building LLM applications.", meta={"category": "ai"}),
    Document(content="Vector search finds semantically similar documents.", meta={"category": "ai"}),
    Document(content="IRIS supports SQL, JSON, vectors, and globals.", meta={"category": "db"}),
]
indexing.run({"embedder": {"documents": documents}})
print(f"Documents after indexing: {store.count_documents()}")
# ── Semantic search ───────────────────────────────────────────────────────
query_pipeline = Pipeline()
query_pipeline.add_component("embedder", SentenceTransformersTextEmbedder(model=MODEL))
query_pipeline.add_component("retriever", IRISEmbeddingRetriever(store, top_k=2))
query_pipeline.connect("embedder.embedding", "retriever.query_embedding")
result = query_pipeline.run({"embedder": {"text": "how does similarity search work?"}})
print("\n── Semantic search ──────────")
for doc in result["retriever"]["documents"]:
    print(f" [{doc.score:.4f}] {doc.content[:60]}...")
# ── BM25 keyword search ───────────────────────────────────────────────────
bm25 = IRISBm25Retriever(store, top_k=2)
result = bm25.run(query="database SQL JSON")
print("\n── BM25 keyword search ──────")
for doc in result["documents"]:
    print(f" [{doc.score:.4f}] {doc.content[:60]}...")
# ── Metadata filter ───────────────────────────────────────────────────────
ai_docs = store.filter_documents({"category": "ai"})
print("\n── Filter category=ai ───────")
for doc in ai_docs:
    print(f" {doc.content[:60]}...")
store.close()
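The script calls `store.close()` as its last statement, so the call is skipped if any earlier step raises. One way to make the cleanup unconditional is the standard library's `contextlib.closing`, which works with any object that has a `close()` method. The sketch below uses a hypothetical `FakeStore` stand-in so that it runs without IRIS; in the real script the same `with` block would wrap the actual store.

```python
from contextlib import closing


class FakeStore:
    """Hypothetical stand-in for IRISDocumentStore; it only tracks close()."""

    def __init__(self):
        self.closed = False

    def count_documents(self):
        return 0

    def close(self):
        self.closed = True


def run_with_cleanup(store):
    """Use the store, then close it even if the body raises."""
    with closing(store) as s:
        return s.count_documents()


store = FakeStore()
print(run_with_cleanup(store))
print(store.closed)  # close() has run by the time the with block exits
```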
3. Expected output
Documents before indexing: 0
Documents after indexing: 4
── Semantic search ──────────
[0.6123] Vector search finds semantically similar documents....
[0.4891] IRIS is a high-performance multimodel database....
── BM25 keyword search ──────
[1.4320] IRIS supports SQL, JSON, vectors, and globals....
[0.8741] IRIS is a high-performance multimodel database....
── Filter category=ai ───────
Haystack is a framework for building LLM applications....
Vector search finds semantically similar documents....
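The filter call in the script uses the short `{"category": "ai"}` dictionary. Haystack 2.x also defines an explicit filter syntax built from `field`, `operator`, and `value` keys, optionally combined under `AND`/`OR` with a `conditions` list. The sketch below shows what an equivalent explicit filter looks like and evaluates it against plain dictionaries so that it runs without IRIS. The `matches` helper is hypothetical and handles only a few operators; how IRISDocumentStore interprets each form is covered on the Metadata Filtering page.

```python
# Explicit Haystack 2.x style filter, equivalent in intent to {"category": "ai"}.
AI_FILTER = {"field": "meta.category", "operator": "==", "value": "ai"}

# A compound filter: category is "ai" OR category is "db".
ANY_FILTER = {
    "operator": "OR",
    "conditions": [
        {"field": "meta.category", "operator": "==", "value": "ai"},
        {"field": "meta.category", "operator": "==", "value": "db"},
    ],
}


def matches(filt, meta):
    """Evaluate a filter against a metadata dict (hypothetical helper)."""
    op = filt["operator"]
    if op in ("AND", "OR"):
        results = [matches(cond, meta) for cond in filt["conditions"]]
        return all(results) if op == "AND" else any(results)
    # Comparison: strip the "meta." prefix to find the key in the dict.
    key = filt["field"]
    if key.startswith("meta."):
        key = key[len("meta."):]
    actual = meta.get(key)
    if op == "==":
        return actual == filt["value"]
    if op == "!=":
        return actual != filt["value"]
    if op == "in":
        return actual in filt["value"]
    raise ValueError(f"unsupported operator: {op}")


# Metadata of the four documents indexed in the script, in the same order.
metas = [{"category": "db"}, {"category": "ai"}, {"category": "ai"}, {"category": "db"}]
print([m for m in metas if matches(AI_FILTER, m)])
```

Applied to the four documents from the script, `AI_FILTER` keeps the two with `category` set to `ai`, which matches the filter section of the expected output.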
Next steps
- IRISDocumentStore: All initialization options, table schema, and connection management.
- Metadata Filtering: Legacy format, official Haystack operator/conditions, all operators.
- API Reference: Complete parameter reference for all classes.