
Problem
High-performance systems must often schedule jobs whose processing times are not known in advance. Traditional scheduling methods lose efficiency under variable workloads, which leads to poor resource utilization and higher latency. Robust scheduling strategies for stochastic environments are therefore critical to improving system performance.
Approach
- Study of Existing Scheduling Policies
- Simulation Development
- Impact Analysis
- Algorithm Refinement
Conducted an in-depth review of existing scheduling algorithms, focusing on SRPT (Shortest Remaining Processing Time) and SERPT (Shortest Estimated Remaining Processing Time) and their behavior under stochastic workloads.
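As a rough illustration of the SRPT policy mentioned above, the following Python sketch simulates preemptive SRPT on a single server. The function name `srpt_schedule` and the `(arrival_time, processing_time)` job format are assumptions made for this example and are not taken from the project's code.

```python
import heapq

def srpt_schedule(jobs):
    """Simulate preemptive SRPT on one server.

    jobs: list of (arrival_time, processing_time) tuples.
    Returns a dict mapping job index to completion time.
    """
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    ready = []        # min-heap of (remaining_time, job_index)
    completion = {}
    now = 0.0
    nxt = 0           # position in `order` of the next job to arrive
    while nxt < len(order) or ready:
        if not ready:
            # Server is idle: jump ahead to the next arrival.
            now = max(now, jobs[order[nxt]][0])
        # Admit every job that has arrived by `now`.
        while nxt < len(order) and jobs[order[nxt]][0] <= now:
            i = order[nxt]
            heapq.heappush(ready, (jobs[i][1], i))
            nxt += 1
        # Serve the job with the least remaining work.
        remaining, i = heapq.heappop(ready)
        next_arrival = jobs[order[nxt]][0] if nxt < len(order) else float("inf")
        # Run until the job finishes or a new arrival may preempt it.
        run = min(remaining, next_arrival - now)
        now += run
        if remaining - run <= 1e-12:
            completion[i] = now
        else:
            heapq.heappush(ready, (remaining - run, i))
    return completion
```

For example, `srpt_schedule([(0, 5), (1, 2), (2, 1)])` preempts the long first job as soon as shorter jobs arrive, so the two short jobs finish before it does.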
Built a simulation framework to evaluate scheduling strategies, using controlled scaling factors to introduce variability into the processing time estimates.
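One way such controlled scaling factors could be applied is sketched below: each job's estimate is its true processing time multiplied by a factor drawn from a chosen range. The Poisson arrivals, the exponential job sizes, and the names `make_workload`, `scale_low`, and `scale_high` are illustrative assumptions, not details of the actual framework.

```python
import random

def make_workload(n_jobs, arrival_rate, mean_size, scale_low, scale_high, seed=0):
    """Generate (arrival_time, true_size, estimated_size) tuples.

    Assumed model: exponential inter-arrival times and job sizes, with
    estimated_size = true_size * factor, factor ~ Uniform(scale_low, scale_high).
    A range below 1 underestimates; a range above 1 overestimates.
    """
    rng = random.Random(seed)   # seeded so experiments are repeatable
    jobs = []
    t = 0.0
    for _ in range(n_jobs):
        t += rng.expovariate(arrival_rate)       # next arrival time
        size = rng.expovariate(1.0 / mean_size)  # true processing time
        factor = rng.uniform(scale_low, scale_high)
        jobs.append((t, size, size * factor))
    return jobs
```

Keeping the factor range as an explicit parameter makes it possible to vary only the estimate error between runs while the seed holds arrivals and true sizes fixed.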
Analyzed the effects of underestimations and overestimations of processing times on system performance metrics such as latency and throughput.
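A comparison of this kind could look like the sketch below, which runs a single-server SERPT simulation on the same jobs with underestimated, exact, and overestimated sizes and reports the mean response time for each. The workload parameters, the factor ranges, and the names `serpt_mean_response` and `sweep` are hypothetical. The sketch also assumes that a job's estimated remaining time may drop below zero once it has run past its estimate, which is only one possible convention.

```python
import heapq
import random

def serpt_mean_response(jobs):
    """Mean response time under preemptive SERPT on one server.

    jobs: list of (arrival, true_size, estimated_size), sorted by arrival.
    The scheduler ranks jobs by estimated remaining time, but a job only
    completes once its true size has been served.
    """
    ready = []   # min-heap of (estimated_remaining, index, true_remaining)
    now, nxt, total, n = 0.0, 0, 0.0, len(jobs)
    while nxt < n or ready:
        if not ready:
            now = max(now, jobs[nxt][0])   # idle until the next arrival
        while nxt < n and jobs[nxt][0] <= now:
            _, size, est = jobs[nxt]
            heapq.heappush(ready, (est, nxt, size))
            nxt += 1
        est_rem, i, true_rem = heapq.heappop(ready)
        next_arrival = jobs[nxt][0] if nxt < n else float("inf")
        run = min(true_rem, next_arrival - now)
        now += run
        if true_rem - run <= 1e-12:
            total += now - jobs[i][0]      # response time of job i
        else:
            # Assumption: the estimate may go negative for underestimated jobs.
            heapq.heappush(ready, (est_rem - run, i, true_rem - run))
    return total / n

def sweep(n_jobs=2000, seed=0):
    """Compare estimate-error regimes on one shared set of jobs."""
    rng = random.Random(seed)
    t, base = 0.0, []
    for _ in range(n_jobs):
        t += rng.expovariate(0.8)              # assumed arrival rate
        base.append((t, rng.expovariate(1.0))) # assumed mean size of 1
    ranges = {"under": (0.25, 1.0), "exact": (1.0, 1.0), "over": (1.0, 4.0)}
    results = {}
    for label, (low, high) in ranges.items():
        jobs = [(a, s, s * rng.uniform(low, high)) for a, s in base]
        results[label] = serpt_mean_response(jobs)
    return results

if __name__ == "__main__":
    for label, value in sweep().items():
        print(f"{label}: mean response time = {value:.3f}")
```

With exact estimates SERPT reduces to SRPT, which minimizes mean response time on a single preemptive server, so the "exact" result serves as the baseline against which the other two regimes are measured.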
Iteratively refined the machine learning model based on simulation results, optimizing for conditions where input estimates may be noisy or incomplete.
Challenges and Lessons Learned
- Managing Variability
- Computational Complexity
- Interpretation of Results
Estimating the impact of processing time inaccuracies required carefully designed experiments that isolate specific behaviors.
Simulating large-scale stochastic systems was resource-intensive, necessitating optimization of the simulation code to reduce runtime.
Understanding the interplay between scheduling decisions and system performance in stochastic environments presented a steep learning curve.
Outcomes and Next Steps
Still in progress!