This work considers an online packet scheduling problem where packets arrive independently over a discrete time horizon and the goal is to minimize the cumulative weighted packet loss. The central challenge of this problem is that the arrival model is not known in advance and may be subject to dynamic changes. An important practical application of this setting is the scheduling of arriving IP packets in computer networks. The focus lies on designing online anticipatory algorithms that improve on the oblivious greedy approach when scheduling requests in an uncertain, dynamic environment. The concept of anticipation is developed in this context by incorporating information about the environment's history to predict certain aspects of the future. Two distinct approaches are presented within the scope of this work: reinforcement learning and online stochastic combinatorial optimization. The theoretical background of both concepts is discussed in detail and the performance of the developed algorithms is analysed on the online packet scheduling problem. The experimental analysis shows that online stochastic combinatorial optimization yields the smallest cumulative weighted loss in every setting if the input distribution is modelled by Markov chains. However, it also requires by far the longest runtime per decision. To cope with a non-Markovian environment, a conservative variant of the Q-learning algorithm is first proposed, which achieves a significant improvement over the greedy algorithm for the 2-class and 3-class problems. When more packet classes are present, the classical Q-learning algorithm has been found to be the best approach; however, it was not able to outperform greedy for the n-class problem, n >= 4, within the simulated time horizon.
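The greedy baseline referred to above can be sketched as follows. This is a minimal illustration only: the fixed buffer capacity, the one-transmission-per-step model, and the function name are assumptions for the sketch, not the thesis's exact problem formulation.

```python
def greedy_schedule(arrivals, buffer_size):
    """Oblivious greedy scheduler (illustrative sketch, not the thesis's
    exact model): each time step, new packet weights arrive, the buffer
    keeps only the heaviest packets that fit, and the heaviest buffered
    packet is transmitted. Returns the cumulative weighted loss, i.e.
    the total weight of all dropped packets."""
    buffer = []   # weights of currently buffered packets
    loss = 0.0
    for step_arrivals in arrivals:       # one list of packet weights per step
        buffer.extend(step_arrivals)
        buffer.sort(reverse=True)        # heaviest first
        loss += sum(buffer[buffer_size:])  # drop the lightest overflow packets
        buffer = buffer[:buffer_size]
        if buffer:
            buffer.pop(0)                # transmit the heaviest packet
    return loss
```

For example, with `buffer_size=2` and arrivals `[[3, 1, 2], [5]]`, the packet of weight 1 overflows in the first step, giving a cumulative weighted loss of 1. Being oblivious, this policy never uses the arrival history, which is exactly the gap the anticipatory algorithms above aim to close.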
Translated title of the contribution: Vorausschauende Online-Algorithmen für die Reihenfolgeplanung
Publication status: Published - 2021