# DSPy.rb - Comprehensive Reference > Build LLM apps like you build software. Type-safe, modular, testable. DSPy.rb brings software engineering best practices to LLM development. Instead of tweaking prompts, you define what you want with Ruby types and let DSPy handle the rest. ## Table of Contents 1. [Overview](#overview) 2. [Installation & Setup](#installation--setup) 3. [Core Concepts](#core-concepts) 4. [Signatures](#signatures) 5. [Modules](#modules) 6. [Predictors](#predictors) 7. [Rich Types](#rich-types) 8. [Multimodal Support](#multimodal-support) 9. [Agent Systems](#agent-systems) 10. [Memory Systems](#memory-systems) 11. [Toolsets](#toolsets) 12. [Optimization](#optimization) 13. [Production Features](#production-features) 14. [Testing Strategies](#testing-strategies) 15. [API Reference](#api-reference) 16. [Integration Guides](#integration-guides) 17. [Examples](#examples) ## Overview DSPy.rb is a Ruby framework for building language model applications with programmatic prompts.
It provides: - **Type-safe signatures** - Define inputs/outputs with Sorbet types - **Modular components** - Compose and reuse LLM logic - **Automatic optimization** - Use data to improve prompts, not guesswork - **Production-ready** - Built-in observability, testing, and error handling ### Key Features - **Provider Support**: OpenAI, Anthropic, Google Gemini, Ollama (via OpenAI compatibility) - **Type Safety**: Sorbet integration throughout - **Automatic JSON Extraction**: Provider-optimized strategies - **Composable Modules**: Chain, compose, and reuse - **Multimodal Support**: Text and image inputs with vision models via raw chat (signature wiring planned) - **Agent Systems**: ReAct (core), CodeAct (`dspy-code_act`), and custom agents - **Memory & State**: Persistent memory for stateful applications - **Observability**: Automatic APM integration, token tracking, performance monitoring ### Provider Compatibility | Feature | OpenAI | Anthropic | Gemini | Ollama | |---------|--------|-----------|--------|--------| | Text Generation | ✅ | ✅ | ✅ | ✅ | | Structured Output | ✅ | ✅ | ✅ | ✅ | | Vision (Raw Chat) | ✅ | ✅ | ✅ | ❌ | | Vision (Signatures) | ❌ | ❌ | ❌ | ❌ | | Image URLs | ✅ | ❌ | ❌ | ❌ | | Image Base64 | ✅ | ✅ | ✅ | ❌ | | Tool Calling | ✅ | ✅ | ✅ | Varies | ### Current Limitations - **Streaming**: Supported via block streaming on OpenAI, Anthropic, and Gemini adapters; modules return concatenated content only (no token-by-token callbacks). - **Function/Tool Calling**: Anthropic adapter accepts `tools:`; OpenAI and Gemini adapters do not yet expose tool specs. - **Image URLs**: Only OpenAI supports direct URL references. - **Local Models**: Limited multimodal support through Ollama. - **Batch Processing**: Single request processing only.
## Installation & Setup ### Requirements - Ruby 3.3 or higher - Bundler ### Installation Add to your Gemfile: ```ruby gem 'dspy' ``` Then run: ```bash bundle install ``` ### Provider Adapter Gems Add the adapter gems that match the providers you call so DSPy only pulls the SDKs you actually use: ```ruby # Gemfile gem 'dspy' gem 'dspy-openai' # OpenAI, OpenRouter, Ollama gem 'dspy-anthropic' # Claude gem 'dspy-gemini' # Gemini ``` Each adapter gem already depends on the official SDK (`openai`, `anthropic`, `gemini-ai`), so you don’t need to add those manually. DSPy auto-loads adapters when the gem is present—no extra `require` needed. See the adapter READMEs for details: - [OpenAI / OpenRouter / Ollama adapters](https://github.com/vicentereig/dspy.rb/blob/main/lib/dspy/openai/README.md) - [Anthropic adapters](https://github.com/vicentereig/dspy.rb/blob/main/lib/dspy/anthropic/README.md) - [Gemini adapters](https://github.com/vicentereig/dspy.rb/blob/main/lib/dspy/gemini/README.md) ### Basic Configuration ```ruby require 'dspy' # Configure with OpenAI DSPy.configure do |c| c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY']) end # Or configure with Anthropic DSPy.configure do |c| c.lm = DSPy::LM.new('anthropic/claude-3-sonnet', api_key: ENV['ANTHROPIC_API_KEY']) end # Or configure with Google Gemini DSPy.configure do |c| c.lm = DSPy::LM.new('gemini/gemini-1.5-pro', api_key: ENV['GEMINI_API_KEY']) end # Or use Ollama for local models DSPy.configure do |c| c.lm = DSPy::LM.new('ollama/llama3.2') # No API key needed for local end ``` ### Environment Variables ```bash # LLM API Keys export OPENAI_API_KEY=sk-your-key-here export ANTHROPIC_API_KEY=sk-ant-your-key-here export GEMINI_API_KEY=your-gemini-key # Optional: Observability export OTEL_SERVICE_NAME=my-dspy-app export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 export LANGFUSE_SECRET_KEY=sk_your_key export LANGFUSE_PUBLIC_KEY=pk_your_key export NEW_RELIC_LICENSE_KEY=your_license_key ``` ### 
Advanced Configuration ```ruby DSPy.configure do |c| # Language Model c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'], temperature: 0.7, max_tokens: 2000 ) # Observability c.logger = Dry.Logger(:dspy, formatter: :json) do |logger| logger.add_backend(level: :info, stream: $stdout) end end ``` ## Core Concepts ### 1. Signatures Signatures define the interface between your application and language models: ```ruby class EmailClassifier < DSPy::Signature description "Classify customer support emails by category and priority" class Priority < T::Enum enums do Low = new('low') Medium = new('medium') High = new('high') Urgent = new('urgent') end end input do const :email_content, String const :sender, String end output do const :category, String const :priority, Priority const :confidence, Float end end ``` ### 2. Modules Modules provide reusable LLM components: - **DSPy::Module** - Base class for custom modules - **Per-instance configuration** - Each module can have its own LM - **Composability** - Combine modules for complex workflows ### 3. Predictors Built-in predictors for different reasoning patterns: - **Predict** - Basic LLM calls - **ChainOfThought** - Step-by-step reasoning - **ReAct** - Tool-using agents - **CodeAct** - Dynamic code generation (install the `dspy-code_act` gem) ### 4. 
Optimization Improve accuracy with data: - **MIPROv2** - Advanced multi-prompt optimization with bootstrap sampling, instruction generation, and Bayesian optimization strategies - **Evaluation** - Comprehensive framework with built-in metrics, custom evaluation functions, error handling, batch processing, and detailed result analysis ## Signatures ### Basic Structure ```ruby class TaskSignature < DSPy::Signature description "Clear description of what this signature accomplishes" input do const :field_name, String end output do const :result_field, String end end ``` ### Input Types ```ruby input do const :text, String # Required string const :context, T.nilable(String) # Optional string const :max_length, Integer # Required integer const :include_score, T::Boolean # Boolean const :tags, T::Array[String] # Array of strings const :metadata, T::Hash[String, String] # Hash end ``` ### Output Types with Enums ```ruby class Priority < T::Enum enums do Low = new('low') Medium = new('medium') High = new('high') end end output do const :priority, Priority const :confidence, Float end ``` ### Default Values (v0.7.0+) ```ruby class SmartSearch < DSPy::Signature description "Search with intelligent defaults" input do const :query, String const :max_results, Integer, default: 10 const :language, String, default: "English" end output do const :results, T::Array[String] const :cached, T::Boolean, default: false end end ``` ### Working with Structs ```ruby class ContactInfo < T::Struct const :name, String const :email, String const :phone, T.nilable(String) end class ExtractContact < DSPy::Signature description "Extract contact information" output do const :contact, ContactInfo end end ``` ### Union Types (v0.11.0+) ```ruby # Single-field unions - automatic type detection class TaskAction < DSPy::Signature output do const :action, T.any(CreateTask, UpdateTask, DeleteTask) end end # DSPy automatically adds a _type discriminator field to distinguish # between different struct types 
in unions ``` #### Type Discrimination with `_type` Fields DSPy.rb uses sophisticated type discrimination to handle complex data structures reliably: ```ruby # Example structs for demonstration class SearchAction < T::Struct const :query, String const :max_results, Integer, default: 10 end class AnswerAction < T::Struct const :content, String const :confidence, Float end # Union type signature class ActionSignature < DSPy::Signature output do const :action, T.any(SearchAction, AnswerAction) end end ``` **How `_type` Fields Work:** 1. **Automatic Injection**: DSPy adds `_type` fields to JSON schemas with `const` constraints 2. **Type Resolution**: LLMs include the `_type` field in responses to indicate struct type 3. **Automatic Filtering**: DSPy filters out `_type` fields during deserialization for all structs 4. **Recursive Handling**: Works at any nesting level in complex data structures **JSON Response Example:** ```json { "action": { "_type": "SearchAction", "query": "Ruby programming", "max_results": 5 } } ``` **Important Considerations:** - **Reserved Field**: Never define your own `_type` fields in T::Struct classes - **Automatic Filtering**: `_type` is automatically removed during struct creation - **Union vs Direct**: Both union types and direct struct fields handle `_type` filtering - **Error Prevention**: Prevents type mismatch errors during deserialization ## Modules ### Creating Custom Modules ```ruby class SentimentAnalyzer < DSPy::Module def initialize super @predictor = DSPy::Predict.new(SentimentSignature) end def forward(text:) @predictor.call(text: text) end end ``` ### Module Composition ```ruby class DocumentProcessor < DSPy::Module def initialize super @classifier = DocumentClassifier.new @summarizer = DocumentSummarizer.new @extractor = KeywordExtractor.new end def forward(document:) classification = @classifier.call(content: document) summary = @summarizer.call(content: document) keywords = @extractor.call(content: document) { document_type: 
classification.document_type, summary: summary.summary, keywords: keywords.keywords } end end ``` ### Lifecycle Callbacks Modules expose Rails-style lifecycle hooks so you can instrument cross-cutting concerns without cluttering `forward`. - `before` callbacks run ahead of `forward` for setup (timers, context loading) - `around` callbacks wrap `forward` and must `yield`, letting you bracket execution - `after` callbacks fire once `forward` returns for cleanup, logging, or persistence Callbacks execute in the order: all `before` hooks → `around` (pre-yield) → `forward` → `around` (post-yield) → `after` hooks. Multiple callbacks of the same type run in registration order. ```ruby class InstrumentedModule < DSPy::Module before :start_timer around :with_context after :record_metrics def initialize super @predictor = DSPy::Predict.new(QuestionSignature) end def forward(question:) @predictor.call(question: question) end private def start_timer @started_at = Time.now end def with_context load_context result = yield save_context(result) result end def record_metrics DSPy.logger.info(duration: Time.now - @started_at) end end ``` Callbacks target `forward` automatically, so you typically register hooks with just `before :hydrate_context` or `around :with_context`—no `target: :call` required. This keeps instrumentation, memory management, and persistence glued to the typed `forward` path that already emits observability spans. See [`docs/src/core-concepts/module-runtime-context.md`](/core-concepts/module-runtime-context/) for more patterns, including fiber-local LM overrides. 
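The ordering rules above can be illustrated with a standalone sketch. Note this is plain Ruby with a hand-rolled callback registry standing in for DSPy's callback API, purely to make the execution order visible:

```ruby
# Illustration only: hand-rolled before/around/after macros, not DSPy's implementation.
class MiniModule
  def self.hooks
    @hooks ||= { before: [], around: [], after: [] }
  end

  def self.before(name) = hooks[:before] << name
  def self.around(name) = hooks[:around] << name
  def self.after(name)  = hooks[:after] << name

  def call(**kwargs)
    self.class.hooks[:before].each { |h| send(h) }
    result = run_around(self.class.hooks[:around]) { forward(**kwargs) }
    self.class.hooks[:after].each { |h| send(h) }
    result
  end

  private

  # Nest around hooks so each wraps the next; the innermost wraps forward.
  def run_around(names, &innermost)
    names.reverse.inject(innermost) { |inner, h| -> { send(h, &inner) } }.call
  end
end

class Instrumented < MiniModule
  before :setup
  around :bracket
  after :teardown

  attr_reader :events

  def initialize
    @events = []
  end

  def forward(question:)
    events << :forward
    "answer to #{question}"
  end

  private

  def setup    = events << :before
  def teardown = events << :after

  def bracket
    events << :around_pre
    result = yield
    events << :around_post
    result
  end
end

m = Instrumented.new
m.call(question: "why?") # => "answer to why?"
m.events                 # => [:before, :around_pre, :forward, :around_post, :after]
```

The `around` hook must `yield` and return the result, exactly as described above; dropping the return value would silently discard `forward`'s output.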
### Per-Instance LM Configuration ```ruby cot = DSPy::ChainOfThought.new(SignatureClass) cot.configure do |config| config.lm = DSPy::LM.new('anthropic/claude-3-sonnet', api_key: ENV['ANTHROPIC_API_KEY'] ) end ``` ## Predictors ### Predict Basic LLM calls with signatures: ```ruby predictor = DSPy::Predict.new(EmailClassifier) result = predictor.call( email_content: "My order hasn't arrived", sender: "customer@example.com" ) ``` ### ChainOfThought Adds automatic reasoning to any signature: ```ruby # Automatically adds :reasoning field to output cot = DSPy::ChainOfThought.new(ComplexAnalysis) result = cot.call(data: complex_data) puts result.reasoning # Step-by-step explanation ``` ### ReAct Tool-using agent with reasoning: ```ruby # Define tools (you would implement YourCalculatorTool) calculator = YourCalculatorTool.new memory_tools = DSPy::Tools::MemoryToolset.to_tools # Create agent agent = DSPy::ReAct.new( ResearchSignature, tools: [calculator, *memory_tools], max_iterations: 10 ) result = agent.call(query: "Calculate compound interest...") ``` ### CodeAct CodeAct now ships in the `dspy-code_act` gem. See [`lib/dspy/code_act/README.md`](https://github.com/vicentereig/dspy.rb/blob/main/lib/dspy/code_act/README.md) for examples, safety recommendations, and advanced usage patterns.
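A useful mental model for `ChainOfThought`: it takes the wrapped signature's output schema and prepends a `reasoning` field, so the model must explain itself before producing the answer fields. The snippet below sketches that idea with plain hashes; it is an illustration of the concept, not DSPy's internal schema machinery:

```ruby
# Plain hashes standing in for a signature's JSON output schema (illustration only).
base_output_schema = {
  category:   { type: "string" },
  confidence: { type: "number" }
}

# ChainOfThought-style augmentation: request reasoning ahead of the answer fields.
def with_reasoning(schema)
  { reasoning: { type: "string", description: "Step-by-step reasoning" } }.merge(schema)
end

augmented = with_reasoning(base_output_schema)
augmented.keys # => [:reasoning, :category, :confidence]
```

Because the reasoning key comes first, the LM emits its explanation before committing to an answer, which is the point of the pattern.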
## Rich Types ### Enums ```ruby class Status < T::Enum enums do Active = new('active') Inactive = new('inactive') Pending = new('pending') end end ``` ### Structs ```ruby class Product < T::Struct const :name, String const :price, Float const :tags, T::Array[String], default: [] end ``` ### Arrays of Structs ```ruby output do const :products, T::Array[Product] end # Automatic conversion from JSON result.products.each do |product| puts "#{product.name}: $#{product.price}" end ``` ### Union Types ```ruby # Automatic type detection (v0.11.0+) output do const :result, T.any(SuccessResult, ErrorResult) end # Pattern matching case result.result when SuccessResult puts "Success: #{result.result.message}" when ErrorResult puts "Error: #{result.result.error}" end ``` #### Advanced Type Discrimination Examples ```ruby # Nested structures with automatic _type handling class CompanyAddress < T::Struct const :street, String const :city, String const :postal_code, String end class Person < T::Struct const :name, String const :address, CompanyAddress end class Company < T::Struct const :name, String const :address, CompanyAddress const :employees, T::Array[Person] end # Complex signature with nested discriminated types class BusinessAnalysis < DSPy::Signature description "Analyze business entities with nested address data" output do const :entity, T.any(Person, Company) # Union with complex nested types const :confidence, Float end end ``` **JSON Schema Generated for Union Types:** The `T.any(Person, Company)` generates a `oneOf` schema: ```json { "type": "object", "properties": { "entity": { "oneOf": [ { "type": "object", "properties": { "_type": { "type": "string", "const": "Person" }, "name": { "type": "string" }, "address": { "type": "object", "properties": { "_type": { "type": "string", "const": "CompanyAddress" }, "street": { "type": "string" }, "city": { "type": "string" }, "postal_code": { "type": "string" } }, "required": ["_type", "street", "city", "postal_code"] } }, 
"required": ["_type", "name", "address"] }, { "type": "object", "properties": { "_type": { "type": "string", "const": "Company" }, "name": { "type": "string" }, "address": { /* same CompanyAddress schema */ }, "employees": { "type": "array", "items": { /* Person schema with _type: "Person" */ } } }, "required": ["_type", "name", "address", "employees"] } ] }, "confidence": { "type": "number" } } } ``` **LLM Response Format:** ```json { "entity": { "_type": "Company", "name": "Tech Corp", "address": { "_type": "CompanyAddress", "street": "123 Business Ave", "city": "Tech City", "postal_code": "12345" }, "employees": [ { "_type": "Person", "name": "John Doe", "address": { "_type": "CompanyAddress", "street": "456 Home St", "city": "Residential Area", "postal_code": "67890" } } ] }, "confidence": 0.95 } ``` **Key Points:** - Each struct in the schema requires its exact `_type` const value - Union types use `oneOf` with separate schema variants - Direct struct fields get a single schema with required `_type` - All `_type` fields are automatically filtered during deserialization ### Nested Structures ```ruby class Company < T::Struct class Department < T::Struct const :name, String const :head, String end const :name, String const :departments, T::Array[Department] end ``` ## Multimodal Support DSPy.rb provides comprehensive support for text and image inputs through its unified `DSPy::Image` interface, enabling vision-capable LLM applications across multiple providers. 
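The `base64:` input form shown in the next section expects a Base64-encoded string plus a content type, and the Ruby standard library covers both. A small sketch — the `encode_image` helper and its content-type table are hypothetical conveniences, not part of DSPy:

```ruby
require 'base64'
require 'tempfile'

# Hypothetical helper: map common image extensions to MIME content types.
CONTENT_TYPES = {
  '.jpg' => 'image/jpeg', '.jpeg' => 'image/jpeg',
  '.png' => 'image/png', '.gif' => 'image/gif', '.webp' => 'image/webp'
}.freeze

# Read raw bytes and build the base64/content_type pair the constructor takes.
def encode_image(path)
  {
    base64: Base64.strict_encode64(File.binread(path)),
    content_type: CONTENT_TYPES.fetch(File.extname(path).downcase, 'application/octet-stream')
  }
end

# Round-trip check using throwaway bytes in place of a real image file:
payload = Tempfile.create(['photo', '.png']) do |f|
  f.binmode
  f.write('fake image bytes')
  f.flush
  encode_image(f.path)
end

Base64.strict_decode64(payload[:base64]) # => "fake image bytes"
payload[:content_type]                   # => "image/png"
```

A payload built this way can be splatted into the constructor, e.g. `DSPy::Image.new(**payload)`.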
### Image Input Types ```ruby # URL-based images (OpenAI only) image = DSPy::Image.new(url: "https://example.com/image.jpg") # Base64 encoded images (OpenAI, Anthropic, Gemini) image = DSPy::Image.new( base64: base64_string, content_type: "image/jpeg" ) # Byte array images (OpenAI, Anthropic, Gemini) File.open("image.jpg", "rb") do |file| image = DSPy::Image.new( data: file.read, content_type: "image/jpeg" ) end # With detail level (OpenAI only) image = DSPy::Image.new( url: "https://example.com/image.jpg", detail: "high" ) ``` ### Using Images with Raw Chat Currently, multimodal support works at the raw chat level using the message builder: ```ruby # Configure with vision-capable model lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY']) # Single image analysis image = DSPy::Image.new(url: "https://example.com/photo.jpg") response = lm.raw_chat do |messages| messages.user_with_image('What is in this image?', image) end puts response # String response # Multiple images image1 = DSPy::Image.new(url: "https://example.com/before.jpg") image2 = DSPy::Image.new(url: "https://example.com/after.jpg") response = lm.raw_chat do |messages| messages.user_with_images('Compare these images', [image1, image2]) end # With system prompt response = lm.raw_chat do |messages| messages.system('You are an expert image analyst.') messages.user_with_image('Analyze this image in detail.', image) end ``` ### Provider-Specific Usage **OpenAI (supports URLs and base64):** ```ruby lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY']) # URL images work image = DSPy::Image.new(url: 'https://example.com/image.jpg') response = lm.raw_chat do |messages| messages.user_with_image('What color is this?', image) end ``` **Anthropic (base64 only):** ```ruby lm = DSPy::LM.new('anthropic/claude-3-sonnet', api_key: ENV['ANTHROPIC_API_KEY']) # Must use base64 or raw data File.open('image.jpg', 'rb') do |file| image = DSPy::Image.new( data: file.read, content_type: 'image/jpeg' ) response =
lm.raw_chat do |messages| messages.system('You are a color detection expert.') messages.user_with_image('What color?', image) end end ``` ### Error Handling DSPy.rb validates image compatibility and provides clear error messages: ```ruby # Non-vision model error begin non_vision_lm = DSPy::LM.new('openai/gpt-3.5-turbo', api_key: ENV['OPENAI_API_KEY']) image = DSPy::Image.new(url: 'https://example.com/image.jpg') non_vision_lm.raw_chat do |messages| messages.user_with_image('What is this?', image) end rescue ArgumentError => e puts "Error: #{e.message}" # Model does not support vision end # Provider compatibility error begin anthropic_lm = DSPy::LM.new('anthropic/claude-3-sonnet', api_key: ENV['ANTHROPIC_API_KEY']) image = DSPy::Image.new(url: 'https://example.com/image.jpg') anthropic_lm.raw_chat do |messages| messages.user_with_image('What is this?', image) end rescue DSPy::LM::IncompatibleImageFeatureError => e puts "Error: #{e.message}" # Anthropic doesn't support image URLs end ``` ### Supported Formats and Limits - **Formats**: JPEG, PNG, GIF, WebP - **Size Limit**: 5MB per image - **OpenAI**: URLs and base64, supports `detail` parameter - **Anthropic**: Base64 and raw data only, no `detail` parameter - **Gemini**: Base64 and raw data only - **Signatures**: Image fields in signatures are not yet supported; use raw chat for vision. ### Structured Multimodal Signatures Note: Image fields in signatures are not yet serialized into LM requests; use `raw_chat`/`MessageBuilder` for vision today. The examples below illustrate the intended API once signature-level multimodal wiring lands.
DSPy.rb supports advanced structured signatures for comprehensive image analysis: ```ruby # Image analysis with detailed extraction class ImageAnalysis < DSPy::Signature description "Analyze images comprehensively to extract objects, colors, mood, and style" input do const :image, DSPy::Image, description: 'Image to analyze' const :focus, String, default: 'general', description: 'Analysis focus' const :detail_level, String, default: 'standard', description: 'Detail level' end output do const :description, String, description: 'Overall image description' const :objects, T::Array[String], description: 'Objects detected in image' const :dominant_colors, T::Array[String], description: 'Main colors present' const :mood, String, description: 'Overall mood or atmosphere' const :style, String, description: 'Artistic style or characteristics' const :confidence, Float, description: 'Analysis confidence (0.0-1.0)' end end # Type-safe bounding box detection class BoundingBox < T::Struct const :x, Float, description: 'Normalized x coordinate (0.0-1.0)' const :y, Float, description: 'Normalized y coordinate (0.0-1.0)' const :width, Float, description: 'Normalized width (0.0-1.0)' const :height, Float, description: 'Normalized height (0.0-1.0)' end class DetectedObject < T::Struct const :label, String, description: 'Object type/label' const :bbox, BoundingBox, description: 'Bounding box coordinates' const :confidence, Float, description: 'Detection confidence (0.0-1.0)' end class BoundingBoxDetection < DSPy::Signature description "Detect and locate objects in images with normalized bounding box coordinates" input do const :image, DSPy::Image, description: 'Image to analyze for object detection' const :query, T.any(String, NilClass), description: 'Object to detect' end output do const :objects, T::Array[DetectedObject], description: 'Detected objects with type-safe bounding boxes' const :count, Integer, description: 'Total objects detected' const :confidence, Float, description: 
'Overall detection confidence' end end # Usage examples analyzer = DSPy::Predict.new(ImageAnalysis) image = DSPy::Image.new(url: 'https://example.com/landscape.jpg') result = analyzer.call( image: image, focus: 'colors', detail_level: 'detailed' ) puts result.description puts "Colors: #{result.dominant_colors.join(', ')}" puts "Objects: #{result.objects.join(', ')}" puts "Mood: #{result.mood}" # Object detection with type safety detector = DSPy::Predict.new(BoundingBoxDetection) detection = detector.call( image: image, query: 'vehicles' ) detection.objects.each do |obj| puts "#{obj.label} at (#{obj.bbox.x}, #{obj.bbox.y})" puts "Size: #{obj.bbox.width} x #{obj.bbox.height}" end ``` ## Agent Systems ### ReAct Agent Reasoning + Acting pattern: ```ruby class ResearchAssistant < DSPy::Module def initialize super # Create tools (implement as needed for your use case) calculator = YourCalculatorTool.new memory_tools = DSPy::Tools::MemoryToolset.to_tools @agent = DSPy::ReAct.new( ResearchSignature, tools: [calculator, *memory_tools] ) end def forward(query:) @agent.call(query: query) end end ``` ### CodeAct Agent Install `dspy-code_act` to build Think-Code-Observe agents. The gem ships with a full agent walkthrough in its README. 
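Stripped of types and tooling, the loop a ReAct-style agent runs is short: reason about the goal, pick an action, observe the result, repeat until done. A schematic in plain Ruby — the scripted `decide` function is a stand-in for the LM call, and none of these names come from DSPy's API:

```ruby
# Tools as simple callables keyed by name (illustration only).
TOOLS = {
  'calculator' => ->(expr) { expr.split('+').map(&:to_f).sum }
}.freeze

# Stand-in for the LM: first turn picks a tool, second turn finishes.
def decide(_query, observations)
  if observations.empty?
    { thought: 'I need to add the numbers', action: 'calculator', input: '2+3' }
  else
    { thought: 'I have the sum', action: 'finish', input: "The answer is #{observations.last}" }
  end
end

def react(query, max_iterations: 5)
  observations = []
  max_iterations.times do
    step = decide(query, observations)                            # reason + choose action
    return step[:input] if step[:action] == 'finish'              # terminal answer
    observations << TOOLS.fetch(step[:action]).call(step[:input]) # act + observe
  end
  "stopped after #{max_iterations} iterations"
end

react('What is 2 + 3?') # => "The answer is 5.0"
```

The `max_iterations` guard mirrors the `max_iterations:` option on `DSPy::ReAct`: it bounds how many reason-act-observe turns the agent may take before giving up.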
### Custom Agents Build your own agent patterns: ```ruby class CustomAgent < DSPy::Module def initialize super @planner = DSPy::ChainOfThought.new(PlanningSignature) # Requires the dspy-code_act gem for Think-Code-Observe execution @executor = DSPy::CodeAct.new(ExecutionSignature) @validator = DSPy::Predict.new(ValidationSignature) end def forward(task:) plan = @planner.call(task: task) execution = @executor.call(plan: plan.plan) validation = @validator.call(result: execution.solution) { result: execution.solution, confidence: validation.confidence } end end ``` ## Memory Systems ### Basic Memory Operations ```ruby # Initialize memory DSPy::Memory.configure do |config| config.storage_adapter = :in_memory # or :redis end # Store memory memory_id = DSPy::Memory.manager.store_memory( "User prefers dark mode", user_id: "user123", tags: ["preferences", "ui"] ) # Retrieve memory memory = DSPy::Memory.manager.retrieve_memory(memory_id) # Search memories memories = DSPy::Memory.manager.search_memories( user_id: "user123", tags: ["preferences"] ) ``` ### Memory with Agents ```ruby class PersonalAssistant < DSPy::Module def initialize super memory_tools = DSPy::Tools::MemoryToolset.to_tools @agent = DSPy::ReAct.new( AssistantSignature, tools: memory_tools ) end def forward(user_message:, user_id:) @agent.call( user_message: user_message, user_id: user_id ) end end ``` ### Redis Storage ```ruby require 'redis' DSPy::Memory.configure do |config| config.storage_adapter = :redis config.redis_client = Redis.new(url: ENV['REDIS_URL']) config.redis_namespace = 'dspy:memory' end ``` ## Toolsets ### Creating Tools with Advanced Sorbet Types Tools now support comprehensive Sorbet type system including enums, structs, arrays, and hashes with automatic JSON conversion: ```ruby # Enum-based tool with comprehensive type support class CalculatorTool < DSPy::Tools::Base tool_name 'calculator' tool_description 'Performs arithmetic operations with type-safe enum inputs and comprehensive error 
handling' class Operation < T::Enum enums do Add = new('add') Subtract = new('subtract') Multiply = new('multiply') Divide = new('divide') Power = new('power') Root = new('root') end end class CalculationResult < T::Struct const :result, T.any(Float, String) const :operation_performed, String const :success, T::Boolean const :error_message, T.nilable(String) end sig { params(operation: Operation, num1: Float, num2: Float).returns(CalculationResult) } def call(operation:, num1:, num2:) case operation when Operation::Add CalculationResult.new( result: num1 + num2, operation_performed: "#{num1} + #{num2}", success: true, error_message: nil ) when Operation::Subtract CalculationResult.new( result: num1 - num2, operation_performed: "#{num1} - #{num2}", success: true ) when Operation::Multiply CalculationResult.new( result: num1 * num2, operation_performed: "#{num1} * #{num2}", success: true ) when Operation::Divide if num2 == 0 CalculationResult.new( result: "Error: Division by zero", operation_performed: "#{num1} / #{num2}", success: false, error_message: "Cannot divide by zero" ) else CalculationResult.new( result: num1 / num2, operation_performed: "#{num1} / #{num2}", success: true ) end when Operation::Power CalculationResult.new( result: num1 ** num2, operation_performed: "#{num1} ^ #{num2}", success: true ) when Operation::Root if num1 < 0 && num2.to_i.even?
CalculationResult.new( result: "Error: Even root of negative number", operation_performed: "#{num1} root #{num2}", success: false, error_message: "Cannot take even root of negative number" ) else CalculationResult.new( result: num1 ** (1.0 / num2), operation_performed: "#{num1} root #{num2}", success: true ) end end end end # Complex tool with nested structs and arrays class DataAnalysisTool < DSPy::Tools::Base tool_name 'data_analyzer' tool_description 'Analyzes datasets with comprehensive type-safe configuration' class DataType < T::Enum enums do Numeric = new('numeric') Categorical = new('categorical') Text = new('text') DateTime = new('datetime') end end class ColumnInfo < T::Struct const :name, String const :type, DataType const :nullable, T::Boolean, default: false const :unique_values, T.nilable(Integer) end class AnalysisConfig < T::Struct const :include_correlations, T::Boolean, default: true const :max_unique_values, Integer, default: 100 const :sample_size, T.nilable(Integer), default: nil const :output_format, T::Array[String], default: ['summary', 'stats'] end class AnalysisResult < T::Struct const :total_rows, Integer const :total_columns, Integer const :columns, T::Array[ColumnInfo] const :missing_data_percentage, Float const :recommendations, T::Array[String] end sig { params( data: String, columns: T::Array[ColumnInfo], config: T.nilable(AnalysisConfig) ).returns(AnalysisResult) } def call(data:, columns:, config: nil) config ||= AnalysisConfig.new # Simulate analysis total_rows = data.split("\n").length missing_percentage = 0.05 recommendations = [] recommendations << "Consider handling missing data" if missing_percentage > 0.1 recommendations << "Some columns have high cardinality" if columns.any? 
      { |c| (c.unique_values || 0) > config.max_unique_values }

    AnalysisResult.new(
      total_rows: total_rows,
      total_columns: columns.length,
      columns: columns,
      missing_data_percentage: missing_percentage,
      recommendations: recommendations
    )
  end
end
```

### Creating Toolsets

Toolsets group related tools together with comprehensive type support:

```ruby
class WeatherToolset < DSPy::Tools::Toolset
  toolset_name "weather"

  class WeatherCondition < T::Enum
    enums do
      Sunny = new('sunny')
      Cloudy = new('cloudy')
      Rainy = new('rainy')
      Snowy = new('snowy')
      Stormy = new('stormy')
    end
  end

  class Temperature < T::Struct
    const :celsius, Float
    const :fahrenheit, Float
    const :feels_like, Float
  end

  class WeatherReport < T::Struct
    const :location, String
    const :condition, WeatherCondition
    const :temperature, Temperature
    const :humidity, Integer
    const :wind_speed, Float
    const :timestamp, String
  end

  class ForecastDay < T::Struct
    const :date, String
    const :condition, WeatherCondition
    const :high_temp, Float
    const :low_temp, Float
    const :precipitation_chance, Integer
  end

  tool :get_current, tool_name: "weather_current", description: "Get current weather conditions"
  tool :get_forecast, description: "Get detailed weather forecast"
  tool :get_alerts, description: "Get weather alerts for location"

  sig { params(location: String).returns(WeatherReport) }
  def get_current(location:)
    # An actual implementation would call a weather API
    WeatherReport.new(
      location: location,
      condition: WeatherCondition::Sunny,
      temperature: Temperature.new(celsius: 22.0, fahrenheit: 71.6, feels_like: 24.0),
      humidity: 60,
      wind_speed: 10.5,
      timestamp: Time.now.iso8601
    )
  end

  sig { params(location: String, days: Integer).returns(T::Array[ForecastDay]) }
  def get_forecast(location:, days: 7)
    # An actual implementation would call a weather API
    (1..days).map do |day|
      ForecastDay.new(
        date: (Date.today + day).to_s,
        condition: [WeatherCondition::Sunny, WeatherCondition::Cloudy, WeatherCondition::Rainy].sample,
        high_temp: 20.0 + rand(10),
        low_temp: 10.0 + rand(8),
        precipitation_chance: rand(100)
      )
    end
  end

  sig { params(location: String, severity: T.nilable(String)).returns(T::Array[String]) }
  def get_alerts(location:, severity: nil)
    # An actual implementation would call a weather API
    alerts = ["High wind warning in effect"]
    severity ? alerts.select { |a| a.downcase.include?(severity.downcase) } : alerts
  end
end

# Advanced database toolset with complex operations
class DatabaseToolset < DSPy::Tools::Toolset
  toolset_name "database"

  class QueryType < T::Enum
    enums do
      Select = new('select')
      Insert = new('insert')
      Update = new('update')
      Delete = new('delete')
      Join = new('join')
      Aggregate = new('aggregate')
    end
  end

  class Column < T::Struct
    const :name, String
    const :type, String
    const :nullable, T::Boolean, default: true
    const :default_value, T.nilable(String)
  end

  class QueryConfig < T::Struct
    const :limit, T.nilable(Integer)
    const :offset, T.nilable(Integer), default: 0
    const :order_by, T.nilable(String)
    const :include_count, T::Boolean, default: false
  end

  class QueryResult < T::Struct
    const :success, T::Boolean
    const :data, T::Array[T::Hash[String, T.untyped]]
    const :row_count, Integer
    const :execution_time_ms, Float
    const :query_type, QueryType
  end

  tool :execute_query, description: "Execute database query with type safety"
  tool :describe_table, description: "Get table schema information"
  tool :optimize_query, description: "Analyze and optimize query performance"

  sig do
    params(
      sql: String,
      query_type: QueryType,
      config: T.nilable(QueryConfig)
    ).returns(QueryResult)
  end
  def execute_query(sql:, query_type:, config: nil)
    config ||= QueryConfig.new

    # Simulate query execution
    mock_data = [
      { "id" => 1, "name" => "Alice", "email" => "alice@example.com" },
      { "id" => 2, "name" => "Bob", "email" => "bob@example.com" }
    ]

    QueryResult.new(
      success: true,
      data: mock_data,
      row_count: mock_data.length,
      execution_time_ms: 45.2,
      query_type: query_type
    )
  end

  sig { params(table_name: String).returns(T::Array[Column]) }
  def describe_table(table_name:)
    # Simulate table description
    [
      Column.new(name: "id", type: "INTEGER", nullable: false),
      Column.new(name: "name", type: "VARCHAR(255)", nullable: false),
      Column.new(name: "email", type: "VARCHAR(320)", nullable: true, default_value: "NULL")
    ]
  end

  # Note: the returned hash uses Symbol keys, so the sig declares T::Hash[Symbol, ...]
  sig { params(sql: String).returns(T::Hash[Symbol, T.any(String, Integer, Float, T::Array[String])]) }
  def optimize_query(sql:)
    # Simulate query optimization analysis
    {
      original_query: sql,
      estimated_cost: 125,
      execution_plan: ["Index Scan", "Hash Join", "Sort"],
      optimization_suggestions: [
        "Consider adding index on user_id column",
        "Use LIMIT to reduce result set size",
        "Consider query rewriting for better performance"
      ],
      performance_rating: 7.5
    }
  end
end

# Convert to tool instances for agents
weather_tools = WeatherToolset.to_tools
database_tools = DatabaseToolset.to_tools

# Use with a ReAct agent
agent = DSPy::ReAct.new(
  WeatherSignature,
  tools: weather_tools + database_tools,
  max_iterations: 15
)
```

### Built-in Toolsets

```ruby
# Memory toolset with persistent storage
memory_tools = DSPy::Tools::MemoryToolset.to_tools
# Includes: memory_store, memory_retrieve, memory_search,
#           memory_list, memory_update, memory_delete,
#           memory_clear, memory_count, memory_get_metadata

# Text processing toolset
text_tools = DSPy::Tools::TextProcessingToolset.to_tools
# Includes: summarize, extract_keywords, count_tokens,
#           translate, format_markdown, extract_entities,
#           sentiment_analysis, clean_text, compare_texts
```

### Automatic Type Conversion

DSPy.rb automatically converts JSON parameters to Ruby types in tools:

```ruby
# When agents call tools, JSON strings are automatically converted.
# Agent provides: { "operation": "add", "num1": 10, "num2": 20 }
# DSPy converts:
# - "add" string → CalculatorTool::Operation::Add enum
# - 10, 20 numbers → Float values
# - Result: tool.call(operation: Add, num1: 10.0, num2: 20.0)

class ComplexTool < DSPy::Tools::Base
  class Priority < T::Enum
    enums do
      Low = new('low')
      High = new('high')
    end
  end

  class TaskData < T::Struct
    const :title, String
    const :priority, Priority
    const :tags, T::Array[String]
  end

  # Agent JSON:
  # {
  #   "task": {
  #     "title": "Fix bug",
  #     "priority": "high",
  #     "tags": ["urgent", "backend"]
  #   }
  # }
  #
  # Automatically converts to:
  # - "high" → Priority::High enum
  # - Array strings → T::Array[String]
  # - Nested hash → TaskData struct
  sig { params(task: TaskData).returns(String) }
  def call(task:)
    "Created task: #{task.title} (#{task.priority.serialize}) with tags: #{task.tags.join(', ')}"
  end
end

# Conversion works for all Sorbet types:
# - T::Enum → Automatic string-to-enum conversion
# - T::Struct → Recursive hash-to-struct conversion
# - T::Array[Type] → Array element conversion
# - T::Hash[String, Type] → Hash value conversion
# - T.nilable(Type) → Handles null/nil values
# - T.any(Type1, Type2) → Union type resolution
# - Nested combinations → Deep conversion at any level
```

## Optimization

### MIPROv2 Optimization

Advanced multi-prompt optimization with bootstrap sampling and Bayesian optimization:

```ruby
# Auto-configuration modes for different needs
light_optimizer = DSPy::Teleprompt::MIPROv2::AutoMode.light(metric: your_metric)
medium_optimizer = DSPy::Teleprompt::MIPROv2::AutoMode.medium(metric: your_metric)
heavy_optimizer = DSPy::Teleprompt::MIPROv2::AutoMode.heavy(metric: your_metric)

# Custom configuration using the dry-configurable pattern
custom_optimizer = DSPy::Teleprompt::MIPROv2.new(metric: custom_metric)
custom_optimizer.configure do |config|
  config.num_trials = 15
  config.num_instruction_candidates = 6
  config.max_bootstrapped_examples = 5
  config.max_labeled_examples = 20
  config.bootstrap_sets = 6
  config.optimization_strategy = :bayesian  # or :greedy, :adaptive
  config.early_stopping_patience = 4
end

# Run optimization
program = DSPy::Predict.new(YourSignature)
result = custom_optimizer.compile(program, trainset: training_examples, valset: validation_examples)

puts "Best MIPROv2 score: #{result.best_score_value}"
puts "Optimized program: #{result.optimized_program}"
puts "Optimization history: #{result.history.length} trials"
```

### GEPA Optimization

Genetic-Pareto reflective prompt evolution that replays traces, collects feedback, and asks a reflection LM to rewrite instructions:

> Install the `dspy-gepa` gem (and set `DSPY_WITH_GEPA=1` when developing inside the monorepo) to load `DSPy::Teleprompt::GEPA`.

```ruby
feedback_map = {
  'self' => ->(predictor_output:, predictor_inputs:, module_inputs:, module_outputs:, captured_trace:) do
    DSPy::Prediction.new(
      score: (predictor_output[:answer] == module_outputs[:answer]) ? 1.0 : 0.2,
      feedback: "Tie feedback to the original question: #{module_inputs.input_values[:question]}"
    )
  end
}

optimizer = DSPy::Teleprompt::GEPA.new(
  metric: metric,
  feedback_map: feedback_map,
  experiment_tracker: GEPA::Logging::ExperimentTracker.new
)

program = DSPy::Predict.new(YourSignature)
result = optimizer.compile(program, trainset: train_examples, valset: val_examples)

puts "GEPA trials: #{result.history.count}"
puts "Latest reflection: #{result.history.last&.reflection}"
```

The ADE demo at `examples/ade_optimizer_gepa/main.rb` shows GEPA running with budgets, telemetry, and per-predictor traces.

### Evaluation Framework

DSPy.rb provides a comprehensive evaluation system for testing and measuring LLM application performance.
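The evaluators and optimizers below consume `trainset:` and `valset:` collections of plain Ruby hashes with `:input` and `:expected` keys. As a minimal sketch (the questions here are illustrative), such a dataset can be assembled and split with standard Ruby so the optimizer never scores against its own training data:

```ruby
# Evaluation examples are plain hashes with :input and :expected keys.
examples = [
  { input: { question: "What is 2+2?" },       expected: { answer: "4" } },
  { input: { question: "Capital of France?" }, expected: { answer: "Paris" } },
  { input: { question: "What is 3*3?" },       expected: { answer: "9" } },
  { input: { question: "Largest planet?" },    expected: { answer: "Jupiter" } }
]

# Hold out a validation slice; seed the shuffle so the split is reproducible.
shuffled = examples.shuffle(random: Random.new(42))
split    = (shuffled.length * 0.75).floor

train_examples = shuffled[0...split]  # 75% for optimization
val_examples   = shuffled[split..]    # 25% held out for scoring
```

A 75/25 split is just a starting point; with real datasets you would typically hold out more examples so validation scores are stable.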
#### Basic Evaluation

```ruby
# Create evaluator with a predictor and metric
predictor = DSPy::Predict.new(YourSignature)
metric = DSPy::Metrics.exact_match(field: :answer)
evaluator = DSPy::Evals.new(predictor, metric: metric)

# Evaluate against test examples
result = evaluator.evaluate(test_examples, display_progress: true)

puts "Score: #{result.score}"
puts "Passed: #{result.passed_examples}/#{result.total_examples}"
```

#### Built-in Metrics

```ruby
# Exact match comparison
exact_metric = DSPy::Metrics.exact_match(field: :answer, case_sensitive: false)

# Contains/substring matching
contains_metric = DSPy::Metrics.contains(field: :answer)

# Numeric difference with tolerance
numeric_metric = DSPy::Metrics.numeric_difference(field: :score, tolerance: 0.1)

# Composite AND (all must pass)
composite_metric = DSPy::Metrics.composite_and(exact_metric, contains_metric)
```

#### Custom Metrics

```ruby
# Custom metric as a proc
custom_metric = ->(example, prediction) do
  return false unless prediction && prediction.respond_to?(:answer)

  prediction.answer.downcase.strip == example.expected_answer.downcase.strip
end

# Multi-factor custom metric
quality_metric = ->(example, prediction) do
  return 0.0 unless prediction

  score = 0.0
  score += 0.5 if prediction.answer == example.expected_answer       # Accuracy
  score += 0.3 if prediction.explanation.to_s.length > 50            # Completeness
  score += 0.2 if prediction.confidence.to_f > 0.8                   # Confidence
  score
end

evaluator = DSPy::Evals.new(predictor, metric: quality_metric)
```

#### Single Example Evaluation

```ruby
example = {
  input: { question: "What is 2+2?" },
  expected: { answer: "4" }
}

result = evaluator.call(example)

puts result.passed      # => true/false
puts result.metrics     # => Hash with metric details
puts result.prediction  # => Actual prediction object
```

#### Batch Evaluation Results

```ruby
batch_result = evaluator.evaluate(test_examples)

# Overall metrics
puts batch_result.score            # 0.0 to 1.0
puts batch_result.total_examples   # Total count
puts batch_result.passed_examples  # Passed count
puts batch_result.pass_rate        # Passed/Total ratio

# Individual results
batch_result.results.each do |individual_result|
  puts "Example passed: #{individual_result.passed}"
  puts "Error: #{individual_result.metrics[:error]}" if individual_result.metrics[:error]
end
```

#### Error Handling

```ruby
# Configure error handling
evaluator = DSPy::Evals.new(
  predictor,
  metric: metric,
  max_errors: 3,           # Stop after 3 errors
  provide_traceback: true  # Include stack traces
)

# Errors are captured in results
result = evaluator.evaluate(test_examples)
error_count = result.results.count { |r| r.metrics[:error] }
puts "#{error_count} examples failed with errors"
```

#### Integration with Optimization

```ruby
# Use evaluation with MIPROv2 optimization
custom_metric = proc do |example, prediction|
  # Return true/false for pass/fail evaluation
  prediction.answer.downcase.strip == example.expected[:answer].downcase.strip
end

# Create MIPROv2 optimizer
optimizer = DSPy::Teleprompt::MIPROv2.new(metric: custom_metric)

# Create program to optimize
program = DSPy::ChainOfThought.new(YourSignature)

# Run optimization
result = optimizer.compile(program, trainset: train_examples, valset: validation_examples)

puts "Best optimized score: #{result.best_score_value}"
puts "Optimized program available via result.optimized_program"
puts "Optimization ran #{result.history.length} trials"
```

## Production Features

### Observability

**Configuration Requirements:** Add the optional gems:

```ruby
gem 'dspy'
gem 'dspy-o11y'
gem 'dspy-o11y-langfuse'
```

Langfuse integration also requires these environment variables:

```bash
export LANGFUSE_PUBLIC_KEY=pk_your_public_key
export LANGFUSE_SECRET_KEY=sk_your_secret_key
# Optional: defaults to https://cloud.langfuse.com
export LANGFUSE_HOST=https://your-langfuse-instance.com
```

**Basic Setup:**

```ruby
# Enable structured logging
DSPy.configure do |c|
  c.logger = Dry.Logger(:dspy, formatter: :json)
end

# Observability automatically activates when Langfuse env vars are present.
# Span tracking is emitted for:
# - llm.generate
# - dspy.predict
# - module.forward
# - tool.execute
```

### OpenTelemetry Integration

```ruby
require 'opentelemetry/sdk'

# Configure OTEL
OpenTelemetry::SDK.configure do |c|
  c.service_name = 'my-dspy-app'
  c.use 'OpenTelemetry::Instrumentation::Net::HTTP'
end

# DSPy logs can be forwarded to OTEL
DSPy.configure do |c|
  c.logger = Dry.Logger(:dspy, formatter: :json) do |logger|
    logger.add_backend(stream: "/var/log/dspy/traces.json")
  end
end
```

### Error Handling

```ruby
begin
  result = predictor.call(input: data)
rescue DSPy::Errors::ValidationError => e
  puts "Invalid input: #{e.message}"
rescue DSPy::Errors::LMError => e
  puts "LLM error: #{e.message}"
  # Implement retry logic
end
```

### Token Usage Tracking

```ruby
# Automatic tracking with log processing
File.foreach("log/dspy.log") do |line|
  event = JSON.parse(line)
  if event["event"] == "llm.generate"
    puts "Input tokens: #{event["gen_ai.usage.prompt_tokens"]}"
    puts "Output tokens: #{event["gen_ai.usage.completion_tokens"]}"
    puts "Model: #{event["gen_ai.request.model"]}"
  end
end
```

### Event System

DSPy.rb includes a comprehensive event system for monitoring and extending functionality.

**Dual Event APIs:**

```ruby
# Simple logging (legacy)
DSPy.log('custom.event', data: 'value', context: 'info')

# Structured event system (recommended)
DSPy.event('custom.event', { data: 'value', context: 'info' })
```

**Event Subscription:**

```ruby
# Subscribe to specific events
subscription_id = DSPy.events.subscribe('llm.*') do |event_name, attributes|
  puts "LLM Event: #{event_name}"
  puts "Attributes: #{attributes.inspect}"
end

# Subscribe to all events
all_events_id = DSPy.events.subscribe('*') do |event_name, attributes|
  # Custom processing for all events
  MyCustomLogger.log(event_name, attributes)
end

# Unsubscribe when done
DSPy.events.unsubscribe(subscription_id)
```

**Built-in Events:**

- `llm.generate` - LLM API calls with token usage and timing
- `dspy.predict` - Prediction operations with inputs/outputs
- `module.forward` - Module execution with span tracking
- `tool.execute` - Tool invocations in agents
- `span.start` / `span.end` - OpenTelemetry span lifecycle

**Custom Event Subscribers:**

```ruby
class MetricsCollector
  attr_reader :total_tokens

  def initialize
    @total_tokens = 0  # Initialize before events can arrive
    @subscription = DSPy.events.subscribe('llm.*') do |event, attrs|
      collect_llm_metrics(event, attrs)
    end
  end

  private

  def collect_llm_metrics(event, attributes)
    # Custom metrics collection logic
    if event == 'llm.generate'
      tokens = attributes['gen_ai.usage.completion_tokens'] || 0
      @total_tokens += tokens
    end
  end
end

# Start collecting
metrics = MetricsCollector.new
```

### Registry & Storage

```ruby
# Register optimization results automatically
registry = DSPy::Registry::RegistryManager.new
registry.register_optimization_result(
  optimization_result,
  signature_name: "EmailClassifier",
  metadata: { accuracy: 0.95 }
)

# Deploy with strategy
deployed = registry.deploy_with_strategy("EmailClassifier", strategy: "conservative")

# Get deployed version
current = registry.registry.get_deployed_version("EmailClassifier")
```

## Testing Strategies

### Unit Testing with RSpec

```ruby
RSpec.describe EmailClassifier do
  let(:classifier) { EmailClassifier.new }

  it "classifies spam correctly" do
    result = classifier.call(
      email_content: "Win a prize!",
      sender: "spam@example.com"
    )

    expect(result.category).to eq("spam")
    expect(result.confidence).to be > 0.8
  end
end
```

### Integration Testing with VCR

```ruby
RSpec.describe "LLM Integration", vcr: true do
  let(:predictor) { DSPy::Predict.new(AnalysisSignature) }

  it "analyzes text with real LLM" do
    result = predictor.call(text: "Sample text")
    expect(result).to respond_to(:analysis)
  end
end
```

### Mocking LLM Responses

```ruby
# In tests
allow(predictor).to receive(:call).and_return(
  DSPy::Prediction.new(
    sentiment: "positive",
    confidence: 0.9
  )
)
```

### Testing Agents

```ruby
RSpec.describe ResearchAssistant do
  let(:assistant) { ResearchAssistant.new }

  it "uses tools to answer questions" do
    result = assistant.call(query: "What is 2+2?")

    expect(result.answer).to eq("4")
    expect(result.tool_calls).to include(
      hash_including(tool: "calculator")
    )
  end
end
```

## API Reference

### Core Classes

#### DSPy::Signature

- `description(text)` - Set signature description
- `input { }` - Define input schema
- `output { }` - Define output schema
- `.input_json_schema` - Get input JSON schema
- `.output_json_schema` - Get output JSON schema

#### DSPy::Module

- `initialize` - Constructor
- `forward(**kwargs)` - Main processing method
- `call(**kwargs)` - Alias for forward
- `configure { |config| }` - Configure module

#### DSPy::Prediction

- Automatic type conversion from JSON
- Access fields as methods
- Handles enums, structs, arrays

### Predictors

#### DSPy::Predict

- `new(signature_class)` - Create predictor
- `call(**inputs)` - Execute prediction

#### DSPy::ChainOfThought

- Adds `:reasoning` field automatically
- Same API as Predict

#### DSPy::ReAct

- `new(signature, tools:, max_iterations: 10)`
- Returns result with tool call history

#### DSPy::CodeAct

- `new(signature, max_iterations: 8)` (requires the `dspy-code_act` gem)
- Returns solution and execution history

### Configuration

#### DSPy.configure

```ruby
DSPy.configure do |c|
  c.lm        # Language model
  c.logger    # Structured logger
  c.strategy  # Extraction strategy
end
```

### Strategy Selection (v0.9.0+)

```ruby
# Automatic provider optimization
DSPy.configure do |c|
  c.strategy = DSPy::Strategy::Strict      # Provider-optimized
  # or
  c.strategy = DSPy::Strategy::Compatible  # Works everywhere
end
```

## Integration Guides

### Rails Integration

```ruby
# config/initializers/dspy.rb
Rails.application.config.after_initialize do
  DSPy.configure do |c|
    c.lm = DSPy::LM.new(
      Rails.application.credentials.llm_model,
      api_key: Rails.application.credentials.llm_api_key
    )
    c.logger = Dry.Logger(:dspy, formatter: Rails.env.production? ? :json : :string)
  end
end

# app/services/email_classifier_service.rb
class EmailClassifierService
  def initialize
    @classifier = DSPy::ChainOfThought.new(EmailClassifier)
  end

  def classify(email)
    @classifier.call(
      email_content: email.body,
      sender: email.from
    )
  end
end
```

### Sidekiq Jobs

```ruby
class AnalyzeDocumentJob
  include Sidekiq::Job

  def perform(document_id)
    document = Document.find(document_id)
    analyzer = DSPy::Predict.new(DocumentAnalysis)

    result = analyzer.call(content: document.text)

    document.update!(
      category: result.category,
      summary: result.summary
    )
  end
end
```

### API Endpoints

```ruby
# Sinatra example
post '/api/classify' do
  content_type :json

  data = JSON.parse(request.body.read)
  classifier = DSPy::Predict.new(TextClassifier)
  result = classifier.call(text: data['text'])

  {
    category: result.category.serialize,
    confidence: result.confidence
  }.to_json
end
```

## Examples

Additional runnable examples in the repo:

- Workflow router: `examples/workflow_router.rb`
- Evaluator + optimizer loop: `examples/evaluator_loop.rb`
- GitHub assistant agent: `examples/github-assistant/`

### Email Support System

```ruby
# Signature for email classification
class EmailTriage < DSPy::Signature
  description "Triage customer support emails"

  class Priority < T::Enum
    enums do
      Low = new('low')
      Medium = new('medium')
      High = new('high')
      Urgent = new('urgent')
    end
  end

  input do
    const :subject, String
    const :body, String
    const :customer_tier, String
  end

  output do
    const :department, String
    const :priority, Priority
    const :summary, String
    const :auto_reply_suggested, T::Boolean
  end
end

# Agent with memory
class SupportAgent < DSPy::Module
  def initialize
    super
    memory_tools = DSPy::Tools::MemoryToolset.to_tools
    @triage = DSPy::ChainOfThought.new(EmailTriage)
    @agent = DSPy::ReAct.new(
      SupportResponse,  # Example signature - define as needed
      tools: memory_tools
    )
  end

  def forward(email:, customer_id:)
    # Triage email
    triage_result = @triage.call(
      subject: email.subject,
      body: email.body,
      customer_tier: email.customer.tier
    )

    # Generate response with context
    response = @agent.call(
      email: email.body,
      customer_id: customer_id,
      priority: triage_result.priority.serialize
    )

    {
      department: triage_result.department,
      priority: triage_result.priority,
      response: response.suggested_reply,
      should_escalate: triage_result.priority == EmailTriage::Priority::Urgent
    }
  end
end
```

### Data Analysis Pipeline

```ruby
# Multi-stage analysis
class DataPipeline < DSPy::Module
  def initialize
    super
    @cleaner = DSPy::Predict.new(DataCleaning)
    # Optional: install dspy-code_act to enable dynamic analysis
    @analyzer = DSPy::CodeAct.new(DataAnalysis)
    @visualizer = DSPy::ChainOfThought.new(DataVisualization)
    @reporter = DSPy::ChainOfThought.new(ReportGeneration)
  end

  def forward(raw_data:, analysis_goals:)
    # Clean data
    cleaned = @cleaner.call(data: raw_data)

    # Analyze
    analysis = @analyzer.call(
      data: cleaned.cleaned_data,
      goals: analysis_goals
    )

    # Generate visualizations
    viz = @visualizer.call(
      data: analysis.results,
      chart_types: ["bar", "line", "scatter"]
    )

    # Create report
    report = @reporter.call(
      analysis: analysis.solution,
      visualizations: viz.code,
      goals: analysis_goals
    )

    {
      cleaned_data: cleaned.cleaned_data,
      analysis_results: analysis.solution,
      visualization_code: viz.code,
      final_report: report.report
    }
  end
end
```

### Content Moderation System

```ruby
# Example: Complex content analysis system
# Note: This demonstrates the architecture - implement signature classes as needed
class ContentModerator < DSPy::Module
  class ViolationType < T::Enum
    enums do
      None = new('none')
      Spam = new('spam')
      Toxic = new('toxic')
      Misinformation = new('misinformation')
      OffTopic = new('off_topic')
    end
  end

  class ModerationResult < T::Struct
    const :violation_type, ViolationType
    const :confidence, Float
    const :explanation, String
    const :action, String  # "approve", "flag", "remove"
  end

  def initialize
    super
    # Note: These are example signature classes - implement as needed
    @classifier = DSPy::ChainOfThought.new(YourContentClassificationSignature)
    @fact_checker = DSPy::ReAct.new(
      YourFactCheckingSignature,
      tools: [your_web_search_tool]
    )
  end

  def forward(content:, context:)
    # Initial classification
    classification = @classifier.call(
      content: content,
      context: context
    )

    # Fact check if needed
    if classification.needs_fact_check
      fact_result = @fact_checker.call(
        claim: content,
        context: context
      )

      if fact_result.likely_false
        return ModerationResult.new(
          violation_type: ViolationType::Misinformation,
          confidence: fact_result.confidence,
          explanation: fact_result.explanation,
          action: "flag"
        )
      end
    end

    ModerationResult.new(
      violation_type: classification.violation_type,
      confidence: classification.confidence,
      explanation: classification.reasoning,
      action: determine_action(classification)
    )
  end

  private

  def determine_action(classification)
    case classification.violation_type
    when ViolationType::None
      "approve"
    when ViolationType::Spam, ViolationType::Toxic
      "remove"
    else
      "flag"
    end
  end
end
```

## Advanced Patterns

### Custom Strategy Implementation

```ruby
class CustomJSONStrategy < DSPy::StrategyInterface
  def extract_json(signature, lm_response)
    # Custom extraction logic
    parsed = JSON.parse(lm_response)
    signature.output_struct.new(parsed)
  rescue JSON::ParserError
    # Fallback logic
  end
end
```

### Dynamic Module Configuration

```ruby
class AdaptiveModule < DSPy::Module
  def initialize
    super
    @strategies = {
      simple: DSPy::Predict.new(SimpleSignature),
      complex: DSPy::ChainOfThought.new(ComplexSignature)
    }
  end

  def forward(input:, complexity: :simple)
    strategy = @strategies[complexity]
    strategy.call(input: input)
  end
end
```

### Streaming Responses

Streaming is supported today by passing a block to `raw_chat` (or any adapter-aware call). The block receives provider chunks while DSPy also accumulates the final string.

```ruby
lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])

buffer = ""
lm.raw_chat([{ role: 'user', content: 'Stream the alphabet slowly.' }]) do |chunk|
  text = chunk.dig('choices', 0, 'delta', 'content')
  next unless text

  buffer << text
  print text
end

puts "\nFinal: #{buffer}"
```

## Performance Optimization

### Caching

```ruby
# Schema caching happens automatically.
# First call generates schema
result1 = predictor.call(input: "text")

# Subsequent calls use cached schema
result2 = predictor.call(input: "more text")
```

### Batch Processing

```ruby
# Process multiple items efficiently
items = ["text1", "text2", "text3"]

results = items.map do |item|
  predictor.call(text: item)
end
```

### Connection Pooling

```ruby
# Configure HTTP client
DSPy::LM.configure do |config|
  config.http_timeout = 30
  config.max_retries = 3
  config.retry_delay = 1
end
```

## Troubleshooting

### Common Issues

1. **Type Conversion Failures**
   - Check nesting depth (keep under 3 levels)
   - Verify enum values match exactly
   - Use T.nilable for optional fields
2. **JSON Extraction Errors**
   - Enable debug logging
   - Check provider compatibility
   - Use the Compatible strategy as a fallback
3. **Memory Issues**
   - Configure an appropriate storage backend
   - Implement memory compaction
   - Set retention policies
4. **Performance Problems**
   - Use provider-optimized strategies
   - Implement caching where appropriate
   - Monitor token usage

### Debug Mode

```ruby
DSPy.configure do |c|
  c.logger = Dry.Logger(:dspy) do |logger|
    logger.add_backend(level: :debug, stream: $stdout)
  end
end

# Process all log events
File.foreach("log/dspy.log") do |line|
  event = JSON.parse(line)
  puts "Event: #{event["event"]}"
  puts "Attributes: #{event.select { |k, v| k.start_with?("gen_ai") || k.start_with?("dspy") }}"
end
```

## Best Practices

1. **Signature Design**
   - Clear, specific descriptions
   - Appropriate type constraints
   - Meaningful field names
2. **Module Composition**
   - Single responsibility principle
   - Dependency injection
   - Testable components
3. **Error Handling**
   - Graceful degradation
   - Retry strategies
   - User-friendly messages
4. **Production Deployment**
   - Enable monitoring
   - Set up alerts
   - Version your modules

## Resources

- **Documentation**: https://oss.vicente.services/dspy.rb/
- **GitHub**: https://github.com/vicentereig/dspy.rb
- **Issues**: https://github.com/vicentereig/dspy.rb/issues
- **Examples**: https://github.com/vicentereig/dspy.rb/tree/main/examples

## Version History

- v0.33.0 - Latest release
- See CHANGELOG.md for full history

---

Generated for DSPy.rb v0.33.0