# DSPy.rb

> Build LLM apps like you build software. Type-safe, modular, testable.

DSPy.rb brings software engineering best practices to LLM development. Instead of tweaking prompts, you define what you want with Ruby types and let DSPy handle the rest.

## Overview

DSPy.rb is a Ruby framework for building language model applications with programmatic prompts. It provides:

- **Type-safe signatures** - Define inputs/outputs with Sorbet types
- **Modular components** - Compose and reuse LLM logic
- **Automatic optimization** - Use data to improve prompts, not guesswork
- **Production-ready** - Built-in observability, testing, and error handling

## Core Concepts

### 1. Signatures

Define interfaces between your app and LLMs using Ruby types:

```ruby
class EmailClassifier < DSPy::Signature
  description "Classify customer support emails by category and priority"

  class Priority < T::Enum
    enums do
      Low = new('low')
      Medium = new('medium')
      High = new('high')
      Urgent = new('urgent')
    end
  end

  input do
    const :email_content, String
    const :sender, String
  end

  output do
    const :category, String
    const :priority, Priority  # Type-safe enum with defined values
    const :confidence, Float
  end
end
```

### 2. Modules

Build complex workflows from simple building blocks:

- **Predict** - Basic LLM calls with signatures
- **ChainOfThought** - Step-by-step reasoning
- **ReAct** - Tool-using agents
- **CodeAct** - Dynamic code generation agents (install the `dspy-code_act` gem)

#### Lifecycle callbacks

Rails-style lifecycle hooks ship with every `DSPy::Module`, so you can wrap `forward` without touching instrumentation:

- **`before`** - runs ahead of `forward` for setup (metrics, context loading)
- **`around`** - wraps `forward`, calls `yield`, and lets you pair setup/teardown logic
- **`after`** - fires after `forward` returns for cleanup or persistence

Callbacks target `forward` by default, so `around :manage_turn` works without passing `target:`.
Execution order is deterministic: all `before` hooks → `around` (pre-yield) → `forward` → `around` (post-yield) → all `after` hooks. See the Module Runtime Context guide for full examples.

### 3. Tools & Toolsets

Create type-safe tools for agents with comprehensive Sorbet support:

```ruby
# Enum-based tool with automatic type conversion
class CalculatorTool < DSPy::Tools::Base
  tool_name 'calculator'
  tool_description 'Performs arithmetic operations with type-safe enum inputs'

  class Operation < T::Enum
    enums do
      Add = new('add')
      Subtract = new('subtract')
      Multiply = new('multiply')
      Divide = new('divide')
    end
  end

  sig { params(operation: Operation, num1: Float, num2: Float).returns(T.any(Float, String)) }
  def call(operation:, num1:, num2:)
    case operation
    when Operation::Add then num1 + num2
    when Operation::Subtract then num1 - num2
    when Operation::Multiply then num1 * num2
    when Operation::Divide
      return "Error: Division by zero" if num2 == 0
      num1 / num2
    end
  end
end

# Multi-tool toolset with rich types
class DataToolset < DSPy::Tools::Toolset
  toolset_name "data_processing"

  class Format < T::Enum
    enums do
      JSON = new('json')
      CSV = new('csv')
      XML = new('xml')
    end
  end

  class ProcessingConfig < T::Struct
    const :max_rows, Integer, default: 1000
    const :include_headers, T::Boolean, default: true
    const :encoding, String, default: 'utf-8'
  end

  tool :convert, description: "Convert data between formats"
  tool :validate, description: "Validate data structure"

  sig { params(data: String, from: Format, to: Format, config: T.nilable(ProcessingConfig)).returns(String) }
  def convert(data:, from:, to:, config: nil)
    config ||= ProcessingConfig.new
    "Converted from #{from.serialize} to #{to.serialize} with config: #{config.inspect}"
  end

  # The returned hash uses symbol keys, so the signature declares Symbol keys.
  sig { params(data: String, format: Format).returns(T::Hash[Symbol, T.any(String, Integer, T::Boolean)]) }
  def validate(data:, format:)
    {
      valid: true,
      format: format.serialize,
      row_count: 42,
      message: "Data validation passed"
    }
  end
end
```

### 4. Type System & Discriminators

DSPy.rb uses type discrimination to handle complex data structures:

- **Automatic `_type` field injection** - DSPy adds discriminator fields to structs for type safety
- **Union type support** - `T.any()` types are automatically disambiguated by `_type`
- **Reserved field name** - Avoid defining your own `_type` fields in structs
- **Recursive filtering** - `_type` fields are filtered during deserialization at all nesting levels

### 5. Optimization

Improve accuracy with real data:

- **MIPROv2** - Advanced multi-prompt optimization with bootstrap sampling and Bayesian optimization
- **GEPA (Genetic-Pareto Reflective Prompt Evolution)** - Reflection-driven instruction rewrite loop with feedback maps, experiment tracking, and telemetry
- **Evaluation** - Comprehensive framework with built-in and custom metrics, error handling, and batch processing

> Install the optional `dspy-gepa` gem (and set `DSPY_WITH_GEPA=1` when working from this monorepo) before using the GEPA teleprompter.

```ruby
# Evolve instructions with GEPA
feedback_map = {
  'self' => ->(predictor_output:, module_inputs:, **) do
    DSPy::Prediction.new(score: 1.0, feedback: "Call out mistakes for #{module_inputs.input_values[:question]}")
  end
}

gepa = DSPy::Teleprompt::GEPA.new(metric: metric, feedback_map: feedback_map)
optimized = gepa.compile(program, trainset: train_examples, valset: val_examples)
```

## Quick Start

```ruby
# Install
gem 'dspy'

# Configure
DSPy.configure do |c|
  c.lm = DSPy::LM.new('openai/gpt-4o-mini', api_key: ENV['OPENAI_API_KEY'])
  # or use Ollama for local models
  # c.lm = DSPy::LM.new('ollama/llama3.2')
end

# Define a task
class SentimentAnalysis < DSPy::Signature
  description "Analyze sentiment of text"

  input do
    const :text, String
  end

  output do
    const :sentiment, String  # positive, negative, neutral
    const :score, Float       # 0.0 to 1.0
  end
end

# Use it
analyzer = DSPy::Predict.new(SentimentAnalysis)
result = analyzer.call(text: "This product is amazing!")
puts result.sentiment  # => "positive"
puts result.score      # => 0.92
```

## Provider Adapter Gems

Add the adapter gems that match the providers you call so DSPy can load the right SDKs without bloating your bundle:

```ruby
# Gemfile
gem 'dspy'
gem 'dspy-openai'     # OpenAI, OpenRouter, Ollama
gem 'dspy-anthropic'  # Claude
gem 'dspy-gemini'     # Gemini
```

Each adapter gem already pulls in the official SDK (`openai`, `anthropic`, `gemini-ai`), so you don’t need to add those manually. DSPy auto-loads the adapters when the gem is present, so no extra `require` is needed.

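
To make the provider-to-gem pairing above concrete, here is a minimal plain-Ruby sketch that maps a `provider/model` identifier to the adapter gem named in the Gemfile comments. The `adapter_gem_for` helper and the `ADAPTER_GEMS` table are hypothetical and not part of DSPy.rb. The `openai/` and `ollama/` prefixes come from the Quick Start; the `openrouter/`, `anthropic/`, and `gemini/` prefixes are assumptions made for illustration.

```ruby
# Hypothetical lookup table, NOT part of DSPy.rb. The gem names come from the
# Gemfile comments above; the 'openrouter', 'anthropic', and 'gemini' prefixes
# are assumed for illustration.
ADAPTER_GEMS = {
  'openai'     => 'dspy-openai',
  'openrouter' => 'dspy-openai',
  'ollama'     => 'dspy-openai',
  'anthropic'  => 'dspy-anthropic',
  'gemini'     => 'dspy-gemini'
}.freeze

# Returns the adapter gem name for a "provider/model" identifier.
# Raises ArgumentError when the id has no model part or the provider is unknown.
def adapter_gem_for(model_id)
  provider, model = model_id.split('/', 2)
  if model.nil? || model.empty?
    raise ArgumentError, "expected 'provider/model', got #{model_id.inspect}"
  end

  ADAPTER_GEMS.fetch(provider) do
    raise ArgumentError, "no known adapter gem for provider #{provider.inspect}"
  end
end

puts adapter_gem_for('openai/gpt-4o-mini') # prints "dspy-openai"
puts adapter_gem_for('ollama/llama3.2')    # prints "dspy-openai"
```

A lookup like this can be handy in a setup script that checks the Gemfile before calling `DSPy.configure`. It only mirrors the pairing documented above and says nothing about how DSPy resolves adapters internally.
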
Adapter documentation lives alongside the code:

- [OpenAI / OpenRouter / Ollama adapters](https://github.com/vicentereig/dspy.rb/blob/main/lib/dspy/openai/README.md)
- [Anthropic adapters](https://github.com/vicentereig/dspy.rb/blob/main/lib/dspy/anthropic/README.md)
- [Gemini adapters](https://github.com/vicentereig/dspy.rb/blob/main/lib/dspy/gemini/README.md)

## Evaluation & Metrics

Comprehensive testing and measurement framework:

```ruby
# Basic evaluation with built-in metrics
metric = DSPy::Metrics.exact_match(field: :answer, case_sensitive: false)
evaluator = DSPy::Evals.new(predictor, metric: metric)

# Type-safe examples using DSPy::Example
test_examples = [
  DSPy::Example.new(
    signature_class: YourSignature,
    input: { question: "What is 2+2?" },
    expected: { answer: "4" }
  )
]

result = evaluator.evaluate(test_examples, display_progress: true)
puts "Pass rate: #{result.pass_rate}"     # => 0.95
puts "Total: #{result.total_examples}"    # => 100
puts "Passed: #{result.passed_examples}"  # => 95

# Advanced metrics with detailed results
numeric_metric = DSPy::Metrics.numeric_difference(field: :score, tolerance: 0.1)

# Custom multi-factor metrics
quality_metric = ->(example, prediction) do
  return 0.0 unless prediction

  score = 0.0
  score += 0.5 if prediction.answer == example.expected[:answer]  # Accuracy
  score += 0.3 if prediction.explanation&.length&.> 50            # Completeness
  score += 0.2 if prediction.confidence&.> 0.8                    # Confidence
  score
end

# Error-resilient batch evaluation
evaluator = DSPy::Evals.new(
  predictor,
  metric: quality_metric,
  max_errors: 3,           # Stop after 3 errors
  provide_traceback: true  # Include stack traces
)

batch_result = evaluator.evaluate(large_test_set)
error_count = batch_result.results.count { |r| r.metrics[:error] }

# Built-in metrics: exact_match, contains, numeric_difference, composite_and
```

## MIPROv2 Optimization

Advanced multi-prompt optimization with bootstrap sampling and Bayesian optimization:

```ruby
# Auto-configuration modes for different needs
light_optimizer = DSPy::Teleprompt::MIPROv2::AutoMode.light(metric: your_metric)    # 6 trials, greedy
medium_optimizer = DSPy::Teleprompt::MIPROv2::AutoMode.medium(metric: your_metric)  # 12 trials, adaptive
heavy_optimizer = DSPy::Teleprompt::MIPROv2::AutoMode.heavy(metric: your_metric)    # 18 trials, Bayesian

# Custom configuration with Bayesian optimization using dry-configurable
optimizer = DSPy::Teleprompt::MIPROv2.new(metric: custom_metric)
optimizer.configure do |config|
  config.optimization_strategy = :bayesian  # or :greedy, :adaptive
  config.num_trials = 15
  config.num_instruction_candidates = 6
end

# Run optimization
program = DSPy::ChainOfThought.new(YourSignature)
result = optimizer.compile(program, trainset: training_examples, valset: validation_examples)

puts "Best score: #{result.best_score_value}"
optimized_program = result.optimized_program
```

## Main Features

### Type Safety

- Sorbet integration for compile-time checks
- Automatic JSON schema generation
- Type discrimination with `_type` field handling for union types and structs
- Enum types for controlled outputs
- Struct types for complex data

### Composability

- Chain modules together
- Share signatures across modules
- Swap predictors without changing logic
- Build reusable components

### Observability

- Langfuse integration available when `dspy-o11y` + `dspy-o11y-langfuse` gems are installed and env vars are set
- Structured logging with span tracking
- Token usage tracking
- Performance monitoring

> Install `dspy-o11y` plus `dspy-o11y-langfuse` (and set `DSPY_WITH_O11Y=1 DSPY_WITH_O11Y_LANGFUSE=1` inside this repo) to enable the optional observability stack.

### Testing

- RSpec integration
- VCR for recording LLM interactions
- Mock responses for unit tests
- Evaluation frameworks

## Documentation Structure

- **Getting Started** - Installation, quick start, first program
- **Core Concepts** - Signatures, modules, predictors, multimodal, examples
- **Advanced** - Complex types, memory systems, agents, RAG
- **Optimization** - Prompt tuning, evaluation, benchmarking
- **Production** - Observability, storage, troubleshooting
- **Blog** - Tutorials and deep dives

## Key URLs

- Homepage: https://oss.vicente.services/dspy.rb/
- GitHub: https://github.com/vicentereig/dspy.rb
- Documentation: https://oss.vicente.services/dspy.rb/getting-started/
- API Reference: https://oss.vicente.services/dspy.rb/core-concepts/

## More Examples in This Repo

- Workflow router: `examples/workflow_router.rb`
- Evaluator + optimizer loop: `examples/evaluator_loop.rb`
- GitHub assistant agent: `examples/github-assistant/`

## For LLMs

When helping users with DSPy.rb:

1. **Focus on signatures** - They define the contract with LLMs
2. **Use proper types** - T::Enum for categories, T::Struct for complex data
3. **Leverage automatic type conversion** - Tools and toolsets automatically convert JSON strings to proper Ruby types (enums, structs, arrays, hashes)
4. **Compose modules** - Chain predictors for complex workflows
5. **Create type-safe tools** - Use Sorbet signatures for comprehensive tool parameter validation and conversion
6. **Test thoroughly** - Use RSpec and VCR for reliable tests
7. **Monitor production** - Enable Langfuse by installing the optional o11y gems and setting env vars

## Version

Current: 0.33.0