On the Execution Time of Programs in Stochastic Scheduling
Abstract
Scheduling appears frequently in distributed, cloud and high-performance computing, as well as in embedded systems. Here, the execution time of subtasks is the major factor influencing decision-making, and despite being random variables they are majorly treated in the literature as being deterministic. Our project intends to shed more light on the underlying distribution of execution times, attempting to verify: 1) if the usual assumption of normal distribution is reasonable; 2) if there exist more suitable distribution families; and 3) if anything can be inferred a priori by analyzing general aspects of the program. We have modeled the problem and experimentally assessed distributions, showing that they are often not normal. We suggest alternative distributions, which we released as R packages, and propose estimators that ease parameter inference. With this we hope to promote usage of stochastic scheduling by making it easier for users to define distributions when requested.
References
Bittencourt, L. F., Madeira, E. R., and Da Fonseca, N. L. (2012). Scheduling in hybrid clouds. IEEE Communications Magazine, 50(9):42–47.
Braams, B. (2016). Deriving an execution time distribution by exhaustive evaluation. Bachelor’s thesis, University of Amsterdam.
Cai, Z., Li, Q., and Li, X. (2017). Elasticsim: A toolkit for simulating workflows with cloud resource runtime auto-scaling. J Grid Comput, 15(2):257–272.
David, L. and Puaut, I. (2004). Static determination of probabilistic execution times. In 16th ECRTS, pages 223–230. IEEE.
DeGroot, M. H. and Schervish, M. J. (2012). Probability and statistics. Pearson.
Dogan, A. and Ozguner, F. (2004). Genetic algorithm based scheduling of meta-tasks with stochastic execution times in heterogeneous systems. Cluster Computing, 7(2):177–190.
Dong, F., Luo, J., Song, A., and Jin, J. (2010). Resource load based stochastic dags scheduling mechanism for grid environment. In 12th HPCC, pages 197–204. IEEE.
Drozdetskiy, A., Cole, C., Procter, J., and Barton, G. J. (2015). Jpred4: a protein secondary structure prediction server. Nucleic acids research, 43(W1):W389–W394.
Ferdinand, C. and Heckmann, R. (2004). ait: Worst-case execution time prediction by static program analysis. In Building the Information Society, pages 377–383. Springer.
Hennessy, J. L. and Patterson, D. A. (2011). Computer architecture: a quantitative approach. Elsevier.
Jindal, V. and Bedi, P. (2017). Reducing waiting time with parallel preemptive algorithm in vanets. Vehicular Communications, 7:58–65.
Kadioglu, S., Malitsky, Y., Sabharwal, A., et al. (2011). Algorithm selection and scheduling. In 17th CP, pages 454–469. Springer.
Li, X., Liang, Y., Mitra, T., and Roychoudhury, A. (2007). Chronos: A timing analyzer for embedded software. Science of Computer Programming, 69(1-3):56–67.
Li, Y. A. and Antonio, J. K. (1997). Estimating the execution time distribution for a task graph in a heterogeneous computing system. In HCW’97, pages 172–184. IEEE.
Massart, P. (1990). The tight constant in the dvoretzky-kiefer-wolfowitz inequality. The annals of Probability, pages 1269–1283.
Panda, S. K. and Jana, P. K. (2015). Efficient task scheduling algorithms for heterogeneous multi-cloud environment. J supercomput, 71(4):1505–1533.
Prataviera, F., Cordeiro, G., Suzuki, A., et al. (2017). The odd log-logistic generalized gamma model. Biom Biostat Int J, 6(4):174.
Saldanha, M. and Souza, P. (2019). High performance algorithms for counting collisions and pairwise interactions. In 19th ICCS, pages 182–196. Springer.
Shestak, V., Smith, J., Maciejewski, A. A., and Siegel, H. J. (2008). Stochastic robustness metric and its use for static resource allocations. J Parallel Dist Com, 68(8):1157–1173.
Shishido, H., Estrella, J., Toledo, C., et al. (2018). Genetic algorithms applied to workflow scheduling with security and deadline constraints in clouds. Comput Electr Eng, 69.
Synergy, R. G. (2019). Half-yearly review shows $150 billion spent on cloud services and infrastructure. https://www.srgresearch.com/articles/half-yearly-reviewshows-150-billion-spent-cloud-services-and-infrastructure. Last access: 31-Mar-2020.
Takeuchi, I., Le, Q. V., Sears, T. D., and Smola, A. J. (2006). Nonparametric quantile estimation. Journal of machine learning research, 7:1231–1264.
Tanenbaum, A. S. and Bos, H. (2015). Modern operating systems. Pearson.
Ullman, J. (1975). Np-complete scheduling problems. J Comput Syst Sci, 10(3).
Valk, C. and Cai, J. (2018). A high quantile estimator based on the log-generalized weibull tail limit. Econometrics and statistics, 6:107–128.
Vouk, M. A. (2008). Cloud computing: issues, research and implementations. J comput inf tech, 16(4):235–246.
Wilhelm, R., Altmeyer, S., Burguière, C., et al. (2010). Static timing analysis for hard real-time systems. In Proceedings of VMCAI Workshops, pages 3–22. Springer.
Wilhelm, R., Engblom, J., Ermedahl, A., et al. (2008). The worst-case execution-time problem – overview of methods and survey of tools. ACM TECS, 7(3).
Xavier, V. A. and Annadurai, S. (2019). Chaotic social spider algorithm for load balance aware task scheduling in cloud computing. Cluster Computing, 22(1):287–297.
Yang, J., Yan, R., Roy, A., Xu, D., Poisson, J., and Zhang, Y. (2015). The i-tasser suite: protein structure and function prediction. Nature methods, 12(1):7.
Zheng, W. and Sakellariou, R. (2013). Stochastic dag scheduling using a monte carlo approach. J parallel distr com, 73(12):1673–1689.
Zuo, L., Shu, L., Dong, S., et al. (2015). A multi-objective optimization scheduling method based on the ant colony algorithm in cloud computing. Ieee Access, 3.