Scientific publications

Лазутина А.А., Соколов А.В.

Об оптимальном управлении Work-Stealing деками в двухуровневой памяти

// Вестник компьютерных и информационных технологий. Т. 17, № 4. 2020. C. 51-60

Lazutina A.A., Sokolov A.V. About optimal management of work-stealing deques in two-level memory // Vestnik komp'iuternykh i informatsionnykh tekhnologii. V. 17, № 4. 2020. Pp. 51-60

Keywords: Work-stealing balancers; Work-stealing deques; Data structures; Absorbing Markov chains; Random walks

In parallel work-stealing load balancers, each core owns a personal buffer of tasks called a deque. The owner uses one end of the deque to add and retrieve tasks, while other cores use the opposite end to steal tasks. Our experience with software implementations has shown that optimizing the use of cache memory is important in parallel work-stealing load balancers. For example, by modifying deques so that they hold task objects instead of pointers, we increased performance by more than 2.5 times on CPU-bound applications and reduced last-level cache misses by up to 30 % compared to the Intel TBB (Threading Building Blocks) and Intel / MIT (Massachusetts Institute of Technology) Cilk work-stealing schedulers. It is therefore important to investigate special methods of working with deques in two-level memory rather than only trying to optimize the use of universal cache implementations. The paper analyzes the problem of optimal control of a work-stealing deque in two-level memory (for example, registers and random access memory), where the probabilities of parallel operations on the deque are known. The classic sequential cyclic method of representing a deque in memory is considered. If the deque overflows or becomes empty, we transfer elements of its middle part from the fast memory to the slow memory, since data from the ends of the deque may be needed sooner. The problem is to find the optimal number of elements from both ends of the deque to leave in the fast memory when the deque is full or empty. The optimality criterion is the maximum average time until the deque reaches a state in which memory must be reallocated. A simulation model and a mathematical model in the form of an absorbing Markov chain were constructed. The results of numerical experiments are presented.
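
The optimality criterion in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' model: the paper tracks both ends of the deque, while the sketch collapses the deque to a single occupancy counter. All names and parameters here (`mean_steps_to_reallocation`, `best_initial_occupancy`, `m`, `p_push`, `p_remove`) are hypothetical. The assumptions are that the fast memory holds at most `m` elements, that each step adds one element with probability `p_push`, removes one (an owner pop or a steal) with probability `p_remove`, and otherwise changes nothing, and that reallocation is needed when an element is added to a full buffer or requested from an empty one. These two events are the absorbing states of the Markov chain, and the mean time to absorption is found by solving (I - Q) t = 1 over the transient states.

```python
def mean_steps_to_reallocation(m, p_push, p_remove):
    """Mean number of steps until reallocation for each start occupancy 0..m.

    Simplified 1-D model (an assumption, not the paper's chain):
    transient states are occupancies 0..m, absorbing states are
    "underflow" (below 0) and "overflow" (above m).
    """
    n = m + 1
    # Augmented matrix of the linear system (I - Q) t = 1.
    a = [[0.0] * (n + 1) for _ in range(n)]
    for k in range(n):
        a[k][k] = p_push + p_remove      # 1 minus the probability of staying
        if k + 1 < n:
            a[k][k + 1] = -p_push        # push moves k -> k + 1
        if k - 1 >= 0:
            a[k][k - 1] = -p_remove      # pop or steal moves k -> k - 1
        a[k][n] = 1.0                    # right-hand side
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for j in range(col, n + 1):
                a[row][j] -= factor * a[col][j]
    # Back substitution.
    t = [0.0] * n
    for k in range(n - 1, -1, -1):
        rest = sum(a[k][j] * t[j] for j in range(k + 1, n))
        t[k] = (a[k][n] - rest) / a[k][k]
    return t


def best_initial_occupancy(m, p_push, p_remove):
    """Occupancy to leave in fast memory that maximizes the mean time."""
    t = mean_steps_to_reallocation(m, p_push, p_remove)
    best = max(range(len(t)), key=lambda k: t[k])
    return best, t[best]


if __name__ == "__main__":
    print(mean_steps_to_reallocation(3, 0.5, 0.5))
    print(best_initial_occupancy(10, 0.6, 0.4))
```

For a symmetric walk with `m = 3` and `p_push = p_remove = 0.5`, the mean times are approximately 4, 6, 6 and 4 steps for start occupancies 0 to 3, which matches the classical result for a symmetric random walk between two absorbing barriers. When pushes are more likely than removals, the best start occupancy shifts below the midpoint, leaving more room for growth.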

URL: http://www.vkit.ru/index.php/current-issue-rus/914-051-060

DOI: 10.14489/vkit.2020.04.pp.051-060

Indexed in RSCI

Last modified: October 5, 2020