2.1 From the “Training Era” to the “Execution Era” of AI
The AI industry is shifting its focus from model training to large-scale, continuous model execution (inference). According to Stanford University’s AI Index Report 2025, the share of organizations reporting AI use rose from 55% in 2023 to 78% in 2024. Applications such as Copilot, intelligent customer service, automated workflows, and autonomous agents generate persistent, high-frequency inference workloads.
MarketsandMarkets projects that the AI inference market will grow from $106.15 billion in 2025 to $254.98 billion in 2030, a compound annual growth rate (CAGR) of 19.2%. Unlike the training phase, the execution phase emphasizes low latency, high availability, cost efficiency, and reliability. The Stanford report further notes that the inference cost for GPT-3.5-level systems has fallen more than 280-fold, yet the sheer volume of continuous execution means that further optimization is still needed to serve demand.
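As a quick sanity check on the market projection quoted above, the CAGR can be recomputed from the two endpoint figures. The sketch below is illustrative only: the helper name `cagr` is our own, and it assumes five compounding periods between 2025 and 2030.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Endpoint figures quoted in the text, in USD billions.
# Assumption: 2025 -> 2030 counts as five compounding periods.
rate = cagr(106.15, 254.98, 5)
print(f"Implied CAGR: {rate:.1%}")  # prints "Implied CAGR: 19.2%"
```

The recomputed rate matches the 19.2% figure cited in the projection, so the two endpoint values and the stated growth rate are mutually consistent.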