Problem

High-performance systems often have to schedule work under uncertainty, where processing times are not deterministic. Traditional methods struggle to stay efficient under variable workloads, which leads to poor resource utilization and higher latency. Robust scheduling strategies for stochastic environments are therefore critical to improving system performance.

Approach

  1. Study of Existing Scheduling Policies
     Conducted an in-depth review of existing algorithms, focusing on SRPT (Shortest Remaining Processing Time) and SERPT (Shortest Estimated Remaining Processing Time) and their behavior under stochastic workloads.

  2. Simulation Development
     Built a simulation framework to evaluate scheduling strategies. The framework introduces variability into processing-time estimates through controlled scaling factors.

  3. Impact Analysis
     Analyzed how underestimating and overestimating processing times affects system performance metrics such as latency and throughput.

  4. Algorithm Refinement
     Iteratively refined the machine learning model based on simulation results, optimizing for conditions where input estimates could be noisy or incomplete.
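
The simulation and impact-analysis steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual framework: it assumes a single preemptive server, Poisson arrivals, exponentially distributed job sizes, and estimates formed by multiplying each true size by a scaling factor drawn uniformly from a chosen range. The names `make_jobs` and `mean_response_time` are hypothetical. With a scaling range of exactly 1.0 the policy reduces to SRPT; any other range gives an estimate-driven policy in the spirit of SERPT.

```python
import random


def make_jobs(n, load, scale_lo, scale_hi, seed):
    """Generate n jobs as (arrival_time, true_size, estimated_size).

    Assumptions for illustration only: Poisson arrivals with rate `load`,
    mean-1 exponential sizes, estimate = true size * uniform scaling factor.
    """
    rng = random.Random(seed)
    t = 0.0
    jobs = []
    for _ in range(n):
        t += rng.expovariate(load)      # inter-arrival time
        size = rng.expovariate(1.0)     # true processing time
        est = size * rng.uniform(scale_lo, scale_hi)
        jobs.append((t, size, est))
    return jobs


def mean_response_time(jobs):
    """Serve the job with the smallest estimated remaining time, preemptively.

    `jobs` must be sorted by arrival time. Returns the mean of
    (completion time - arrival time) over all jobs.
    """
    t = 0.0
    i = 0
    n = len(jobs)
    active = []  # entries: [estimated_remaining, true_remaining, arrival]
    total = 0.0
    while i < n or active:
        if not active:
            t = max(t, jobs[i][0])      # server idle: jump to next arrival
        while i < n and jobs[i][0] <= t:
            arrival, size, est = jobs[i]
            active.append([est, size, arrival])
            i += 1
        job = min(active, key=lambda j: j[0])
        next_arrival = jobs[i][0] if i < n else float("inf")
        finish = t + job[1]
        if finish <= next_arrival:
            # Job completes before anything new arrives.
            t = finish
            active.remove(job)
            total += t - job[2]
        else:
            # Run until the next arrival, then re-evaluate priorities.
            job[0] -= next_arrival - t  # may go negative if underestimated
            job[1] = finish - next_arrival
            t = next_arrival
    return total / n


if __name__ == "__main__":
    # Same seed for every run, so only the estimates differ between runs.
    for lo, hi in [(1.0, 1.0), (0.5, 1.0), (1.0, 2.0), (0.5, 2.0)]:
        jobs = make_jobs(2000, load=0.8, scale_lo=lo, scale_hi=hi, seed=1)
        print(f"scale in [{lo}, {hi}]: mean response {mean_response_time(jobs):.3f}")
```

Reusing one seed across runs keeps arrivals and true sizes identical, so differences in mean response time come only from estimate error. Restricting the scaling range to below or above 1.0 separates the effect of underestimation from that of overestimation.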

Challenges and Lessons Learned

  1. Managing Variability
     Estimating the impact of processing-time inaccuracies required carefully designed experiments that isolate specific behaviors.

  2. Computational Complexity
     Simulating large-scale stochastic systems was resource-intensive, so the simulation code had to be optimized to reduce runtime.

  3. Interpretation of Results
     Understanding the interplay between scheduling decisions and system performance in stochastic environments presented a steep learning curve.

Outcomes and Next Steps

Still in progress!