73. Strategy Scheduler System Design

Overview

The Strategy Scheduler is a core management module of the quantitative trading system. It manages all strategy runners from a single service, which lets the trading platform operate many strategies side by side with centralized scheduling, health monitoring, and lifecycle control.

🎯 Core Capabilities

| Capability | Description |
| --- | --- |
| Unified Strategy Management | Centralized control of all strategy runners |
| Scheduled Operations | Automated start/stop based on trading schedules |
| Health Monitoring | Automatic detection and recovery of failed strategies |
| API Control Interface | Programmatic and frontend control capabilities |
| Container Orchestration | Docker/Kubernetes-based strategy lifecycle management |

System Architecture

Microservice Design

New Microservice: strategy-scheduler

services/strategy-scheduler/
├── src/
│   ├── main.py                 # FastAPI application entry point
│   ├── scheduler/
│   │   ├── strategy_manager.py # Strategy container lifecycle management
│   │   ├── cron_scheduler.py   # Scheduled task management
│   │   └── health_monitor.py   # Strategy health checking
│   ├── models/
│   │   ├── strategy_model.py   # Strategy configuration models
│   │   └── schedule_model.py   # Schedule configuration models
│   ├── routes/
│   │   ├── scheduler_api.py    # REST API endpoints
│   │   └── health_api.py       # Health check endpoints
│   ├── config.py               # Configuration management
│   └── requirements.txt        # Python dependencies
├── Dockerfile                  # Container definition
└── docker-compose.yml          # Local development setup

Technology Stack

| Component | Technology | Purpose |
| --- | --- | --- |
| Scheduling Engine | APScheduler | Python-based cron-like scheduling |
| Container Management | Docker SDK / Kubernetes API | Strategy runner lifecycle control |
| API Framework | FastAPI | RESTful API interface |
| Message Communication | NATS | Strategy control messaging |
| Health Monitoring | Custom health checks | Strategy status monitoring |

Core Modules Design

Strategy Manager Module

Purpose: Manages strategy runner container lifecycle using Docker/Kubernetes APIs

Key Functions:

  • Container Lifecycle: Start, stop, restart strategy containers
  • Status Monitoring: Real-time container status tracking
  • Resource Management: CPU, memory allocation and monitoring
  • Configuration Management: Strategy parameter injection

Design Principles:

  • ✅ Container Isolation: Each strategy runs in its own isolated container
  • ✅ Resource Limits: Configurable CPU/memory limits per strategy
  • ✅ Graceful Shutdown: Proper cleanup and state preservation
  • ✅ Error Recovery: Automatic restart on container failures
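
As a minimal sketch of how the resource limits and restart behaviour above could be expressed, the function below turns a strategy description into keyword arguments for `containers.run` in the Docker SDK for Python. `StrategySpec`, its field names, the `STRATEGY_` environment prefix, and the label key are hypothetical; only the keyword names (`image`, `name`, `detach`, `environment`, `mem_limit`, `nano_cpus`, `restart_policy`, `labels`) come from the Docker SDK.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StrategySpec:
    """Hypothetical description of one strategy runner container."""
    strategy_id: str
    image: str
    cpus: float = 1.0        # CPU cores the container may use
    memory: str = "512m"     # Docker-style memory limit
    max_restarts: int = 3    # restarts allowed after a failure
    params: dict = field(default_factory=dict)  # strategy parameters


def build_run_kwargs(spec: StrategySpec) -> dict:
    """Build keyword arguments for the Docker SDK's containers.run()."""
    # Inject strategy parameters as environment variables (assumed convention).
    env = {f"STRATEGY_{key.upper()}": str(value) for key, value in spec.params.items()}
    env["STRATEGY_ID"] = spec.strategy_id
    return {
        "image": spec.image,
        "name": f"strategy-{spec.strategy_id}",
        "detach": True,
        "environment": env,
        "mem_limit": spec.memory,
        "nano_cpus": int(spec.cpus * 1_000_000_000),  # 1 CPU = 1e9 nano-CPUs
        "restart_policy": {"Name": "on-failure", "MaximumRetryCount": spec.max_restarts},
        "labels": {"strategy_id": spec.strategy_id},
    }
```

With the Docker SDK installed, `docker.from_env().containers.run(**build_run_kwargs(spec))` would start the container, and a graceful shutdown could call `stop(timeout=...)` on the returned container object.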

Cron Scheduler Module

Purpose: Manages scheduled strategy operations using APScheduler

Key Functions:

  • Trading Schedule Management: Market open/close automation
  • Flexible Scheduling: Cron expressions for complex schedules
  • Timezone Support: Multi-timezone trading schedule support
  • Schedule Persistence: Persistent schedule storage

Common Scheduling Patterns (US markets, times in US Eastern Time):

  • Daily Trading: 9:30 AM start, 4:00 PM stop
  • Pre-Market: 4:00 AM start, 9:30 AM stop
  • After-Hours: 4:00 PM start, 8:00 PM stop
  • Weekend Maintenance: Friday 5:00 PM shutdown, Monday 4:00 AM startup
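
The patterns above can be sketched as data plus two small helpers. This is an illustration only: `SessionWindow`, `cron_fields`, and `is_active` are hypothetical names, and the dictionaries returned by `cron_fields` are merely shaped like the keyword arguments accepted by APScheduler's `CronTrigger` (`day_of_week`, `hour`, `minute`, `timezone`). The sketch itself uses only the standard library.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone, tzinfo


@dataclass(frozen=True)
class SessionWindow:
    """A weekday trading session in exchange-local time."""
    name: str
    start: time
    stop: time


REGULAR = SessionWindow("daily-trading", time(9, 30), time(16, 0))
PRE_MARKET = SessionWindow("pre-market", time(4, 0), time(9, 30))
AFTER_HOURS = SessionWindow("after-hours", time(16, 0), time(20, 0))

# Fixed UTC-5 offset for illustration only; zoneinfo.ZoneInfo("America/New_York")
# would track daylight saving time.
EST = timezone(timedelta(hours=-5))


def cron_fields(at: time, days: str = "mon-fri", tz_name: str = "America/New_York") -> dict:
    """Cron fields for one job, shaped like CronTrigger keyword arguments."""
    return {"day_of_week": days, "hour": at.hour, "minute": at.minute, "timezone": tz_name}


def is_active(window: SessionWindow, now: datetime, exchange_tz: tzinfo) -> bool:
    """True if the timezone-aware `now` falls inside the window on a weekday."""
    local = now.astimezone(exchange_tz)
    return local.weekday() < 5 and window.start <= local.time() < window.stop
```

A start job and a stop job for the regular session would use `cron_fields(REGULAR.start)` and `cron_fields(REGULAR.stop)`. The weekend maintenance shutdown would be `cron_fields(time(17, 0), days="fri")`.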

Health Monitor Module

Purpose: Continuous monitoring of strategy health and automatic recovery

Monitoring Dimensions:

  • Container Health: Docker container status monitoring
  • Strategy Performance: CPU, memory, network usage
  • Trading Activity: Order flow and execution monitoring
  • Data Connectivity: Market data feed status

Recovery Mechanisms:

  • Automatic Restart: Failed strategy container restart
  • Escalation Alerts: Notification of persistent failures
  • Graceful Degradation: Partial system operation during failures
  • Health Reporting: Real-time health status dashboard
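
One way the automatic restart and escalation mechanisms could fit together is a per-strategy policy that counts consecutive failed health checks. The sketch below is an assumption, not the documented design: `RecoveryPolicy`, `Action`, the thresholds, and the exponential backoff are all illustrative choices.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    RESTART = "restart"
    ESCALATE = "escalate"


@dataclass
class RecoveryPolicy:
    """Hypothetical policy: restart on failure, escalate after repeated failures."""
    max_restarts: int = 3
    base_delay_s: float = 5.0
    max_delay_s: float = 60.0
    failures: int = 0  # consecutive failed health checks

    def record(self, healthy: bool) -> Action:
        """Update the consecutive-failure count and pick the next action."""
        if healthy:
            self.failures = 0
            return Action.NONE
        self.failures += 1
        if self.failures > self.max_restarts:
            return Action.ESCALATE  # persistent failure: alert instead of restarting
        return Action.RESTART

    def restart_delay(self) -> float:
        """Exponential backoff before the next restart, capped at max_delay_s."""
        if self.failures == 0:
            return 0.0
        return min(self.base_delay_s * 2 ** (self.failures - 1), self.max_delay_s)
```

The health monitor would call `record()` after each check, wait `restart_delay()` seconds before acting on a `RESTART`, and publish an alert on `ESCALATE`.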

API Interface Design

REST API Endpoints

Strategy Lifecycle Management:

POST   /api/v1/strategy/{strategy_id}/start     # Start strategy immediately
POST   /api/v1/strategy/{strategy_id}/stop      # Stop strategy immediately
POST   /api/v1/strategy/{strategy_id}/restart   # Restart strategy
GET    /api/v1/strategy/{strategy_id}/status    # Get strategy status
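
The source does not specify response bodies or status codes for these endpoints. As a hedged sketch, the handler logic behind the start, stop, and restart routes could look like the function below, which a FastAPI route would delegate to. The `State` enum, the in-memory `registry`, and the choice of 404/409/400 responses are assumptions.

```python
from enum import Enum


class State(str, Enum):
    STOPPED = "stopped"
    RUNNING = "running"


def handle_command(registry: dict, strategy_id: str, command: str) -> tuple:
    """Apply start/stop/restart and return (HTTP status code, response body)."""
    if strategy_id not in registry:
        return 404, {"detail": f"unknown strategy {strategy_id}"}
    current = registry[strategy_id]
    if command == "start":
        if current is State.RUNNING:
            return 409, {"detail": "already running"}
        registry[strategy_id] = State.RUNNING
    elif command == "stop":
        if current is State.STOPPED:
            return 409, {"detail": "already stopped"}
        registry[strategy_id] = State.STOPPED
    elif command == "restart":
        # Restart is allowed from either state and always ends in RUNNING.
        registry[strategy_id] = State.RUNNING
    else:
        return 400, {"detail": f"unsupported command {command}"}
    return 200, {"strategy_id": strategy_id, "status": registry[strategy_id].value}
```

Returning a conflict for a repeated start or stop makes double-clicks in the frontend visible; treating them as idempotent successes would be an equally reasonable design.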

Schedule Management:

POST   /api/v1/strategy/{strategy_id}/schedule/start  # Schedule strategy start
POST   /api/v1/strategy/{strategy_id}/schedule/stop   # Schedule strategy stop
GET    /api/v1/strategy/{strategy_id}/schedule        # Get current schedule
DELETE /api/v1/strategy/{strategy_id}/schedule        # Remove schedule

Batch Operations:

POST   /api/v1/strategies/start-all              # Start all strategies
POST   /api/v1/strategies/stop-all               # Stop all strategies
GET    /api/v1/strategies/status                 # Get all strategy statuses
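
For the batch endpoints, a single failing strategy should probably not abort the whole operation. The sketch below shows one way to do that with `asyncio`; `run_batch`, its `operation` callback, and the concurrency limit are hypothetical and not part of the documented API.

```python
import asyncio


async def run_batch(strategy_ids, operation, max_concurrency=10):
    """Run `operation(strategy_id)` for every id and collect per-strategy results."""
    semaphore = asyncio.Semaphore(max_concurrency)  # cap simultaneous container calls

    async def guarded(strategy_id):
        async with semaphore:
            try:
                await operation(strategy_id)
                return strategy_id, "ok"
            except Exception as exc:  # one failure must not abort the batch
                return strategy_id, f"error: {exc}"

    results = await asyncio.gather(*(guarded(sid) for sid in strategy_ids))
    return dict(results)
```

The returned mapping can be sent back as the response body so the caller sees which strategies started and which did not.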

Health Monitoring:

GET    /api/v1/health/strategies                 # Overall system health
GET    /api/v1/health/strategy/{strategy_id}     # Individual strategy health
GET    /api/v1/health/containers                 # Container health status

Message Flow Architecture

Frontend Control Interface
    ↓
Strategy Scheduler API
    ↓
Strategy Manager (Docker/K8s API)
    ↓
Strategy Runner Container
    ↓
NATS Message Bus
    ↓
Trading System Components

Frontend Integration

Strategy Management Dashboard

Enhanced Strategy View Components:

  • Real-time Status Display: Live strategy status indicators
  • Control Buttons: Start, stop, restart, pause operations
  • Schedule Configuration: Visual schedule builder interface
  • Health Monitoring: Real-time health status visualization
  • Resource Usage: CPU, memory, network usage charts

User Interface Features:

  • One-Click Operations: Single-click strategy control
  • Batch Operations: Multi-strategy management
  • Schedule Visualization: Calendar-based schedule display
  • Alert Management: Health alert configuration
  • Audit Trail: Operation history and logging

Operational Benefits

Production Readiness

| Benefit | Description |
| --- | --- |
| Automated Operations | Reduced manual intervention requirements |
| 24/7 Reliability | Continuous operation with automatic recovery |
| Scalable Management | Support for hundreds of concurrent strategies |
| Professional Workflow | Enterprise-grade strategy management |

Risk Management

| Risk | Mitigation Implementation |
| --- | --- |
| Strategy Failures | Automatic detection and restart |
| Resource Exhaustion | Resource limits and monitoring |
| Schedule Conflicts | Conflict detection and resolution |
| Data Loss | State preservation and recovery |

Operational Efficiency

| Efficiency Gain | Expected Impact |
| --- | --- |
| Reduced Manual Work | Target of a 90% reduction in manual strategy management |
| Faster Recovery | Target of sub-minute failure recovery time |
| Better Monitoring | Real-time visibility into all strategies |
| Standardized Operations | Consistent management across all strategies |

Implementation Roadmap

Phase 1: Foundation (Weeks 1-2)

  • Basic Container Management: Docker SDK integration
  • Simple API Interface: Core start/stop functionality
  • Health Monitoring: Basic container status checking
  • Frontend Integration: Basic control interface

Phase 2: Scheduling (Weeks 3-4)

  • APScheduler Integration: Cron-based scheduling
  • Schedule Management API: Schedule CRUD operations
  • Timezone Support: Multi-timezone trading schedules
  • Schedule Persistence: Database storage for schedules

Phase 3: Advanced Features (Weeks 5-6)

  • Advanced Health Monitoring: Performance metrics tracking
  • Automatic Recovery: Failure detection and restart
  • Resource Management: CPU/memory limits and monitoring
  • Batch Operations: Multi-strategy management

Phase 4: Production Ready (Weeks 7-8)

  • Kubernetes Integration: Production container orchestration
  • Advanced Monitoring: Prometheus/Grafana integration
  • Security Hardening: Access controls and audit logging
  • Performance Optimization: High-throughput API design

Integration with Existing System

NATS Message Integration

Strategy Control Messages:

strategy.control.start.{strategy_id}     # Start strategy command
strategy.control.stop.{strategy_id}      # Stop strategy command
strategy.control.restart.{strategy_id}   # Restart strategy command
strategy.status.{strategy_id}            # Strategy status updates

Health Monitoring Messages:

strategy.health.{strategy_id}            # Health status updates
strategy.alert.{strategy_id}             # Health alerts
scheduler.status                         # Scheduler system status
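
The subject layout above lends itself to small helpers for building subjects and checking wildcard subscriptions. The function names are hypothetical. The matching rules follow NATS subject wildcards, where `*` matches exactly one token and `>` matches one or more trailing tokens. This is a local sketch for reasoning about subscriptions, not a replacement for a NATS client.

```python
CONTROL_COMMANDS = {"start", "stop", "restart"}


def control_subject(command: str, strategy_id: str) -> str:
    """Build a strategy.control.<command>.<strategy_id> subject."""
    if command not in CONTROL_COMMANDS:
        raise ValueError(f"unsupported command: {command}")
    if "." in strategy_id:
        # A dot would split the id into several subject tokens.
        raise ValueError("strategy_id must not contain '.'")
    return f"strategy.control.{command}.{strategy_id}"


def subject_matches(pattern: str, subject: str) -> bool:
    """NATS-style match: '*' is one token, '>' is one or more trailing tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, token in enumerate(p_tokens):
        if token == ">":
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if token != "*" and token != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)
```

A strategy runner for `alpha` could subscribe to `strategy.control.*.alpha` to receive all three commands, while a dashboard could subscribe to `strategy.health.>` to follow every strategy.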

Database Integration

Strategy Scheduler Tables:

  • strategy_schedules: Schedule configurations
  • strategy_instances: Running strategy instances
  • health_logs: Health monitoring history
  • operation_logs: Scheduler operation audit trail
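
The source names the four tables but not their columns. The following SQLite sketch shows one plausible shape. Every column name, type, and constraint here is an assumption chosen for illustration, and a production deployment would likely target the platform's existing database instead of SQLite.

```python
import sqlite3

# Hypothetical schema: only the table names come from the design.
SCHEMA = """
CREATE TABLE strategy_schedules (
    id INTEGER PRIMARY KEY,
    strategy_id TEXT NOT NULL,
    action TEXT NOT NULL CHECK (action IN ('start', 'stop')),
    cron_expr TEXT NOT NULL,
    timezone TEXT NOT NULL DEFAULT 'America/New_York'
);
CREATE TABLE strategy_instances (
    strategy_id TEXT PRIMARY KEY,
    container_id TEXT,
    status TEXT NOT NULL,
    started_at TEXT
);
CREATE TABLE health_logs (
    id INTEGER PRIMARY KEY,
    strategy_id TEXT NOT NULL,
    healthy INTEGER NOT NULL,
    detail TEXT,
    checked_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE operation_logs (
    id INTEGER PRIMARY KEY,
    strategy_id TEXT,
    operation TEXT NOT NULL,
    actor TEXT,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the scheduler tables and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```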

Business & Technical Value

Business Value

  • Operational Efficiency: Automated strategy management reduces manual overhead
  • Risk Reduction: Automatic failure detection and recovery
  • Scalability: Support for enterprise-scale strategy deployment
  • Professional Operations: 24/7 automated trading operations

Technical Value

  • Container Orchestration: Modern container-based deployment
  • API-First Design: Programmatic control and integration
  • Monitoring Integration: Comprehensive health and performance monitoring
  • Scalable Architecture: Support for hundreds of concurrent strategies