On the dual gradient descent method for the resource allocation problem in multi-agent systems

D. B. Rokhlin

UDC 519.86
DOI: 10.33048/SIBJIM.2024.27.206




This work was supported by the Regional Scientific and Educational Mathematical Center of Southern Federal University, agreement no. 075-02-2024-1427 with the Ministry of Science and Higher Education of the Russian Federation. There were no other sources of funding for conducting or supervising this particular research.


D. B. Rokhlin
  1. Southern Federal University, Institute of Mathematics, Mechanics and Computer Science,
    Regional Scientific and Educational Mathematical Center,
    8a Milchakova St., Rostov-on-Don 344090, Russia

E-mail: dbrohlin@sfedu.ru

Received 13 January 2024
Revised 9 March 2024
Accepted for publication 17 April 2024

Abstract:

We consider a sequence of block-separable convex programming problems describing the resource allocation in multiagent systems. We construct several iterative algorithms for setting the resource prices. Under various assumptions about the utility functions and resource constraints, we obtain estimates for the average deviation (regret) of the objective function from the optimal value and the constraint residuals. Here the average is understood as the expectation for independent identically distributed data and as the time average in the online optimization problem. The analysis of the problem is carried out by online optimization methods and duality theory. The algorithms considered use the information concerning the difference between the total demand and supply that is generated by agents’ reactions to prices and corresponds to the constraint residuals.
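The pricing mechanism the abstract describes — adjusting resource prices by a dual gradient step in the direction of the demand–supply imbalance revealed by the agents' reactions — can be illustrated with a minimal sketch. The quadratic utilities, coefficients, step size, and problem dimensions below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_resources = 5, 3
c = np.ones(n_resources)                             # total supply of each resource
a = rng.uniform(1.0, 2.0, (n_agents, n_resources))   # agents' utility coefficients

def demand(p):
    """Each agent maximizes u_i(x) = a_i.x - ||x||^2/2 - p.x over x >= 0;
    the closed-form best response is x_i(p) = max(a_i - p, 0)."""
    return np.maximum(a - p, 0.0)

p = np.zeros(n_resources)   # initial resource prices
eta = 0.05                  # dual step size (illustrative)
for _ in range(2000):
    residual = demand(p).sum(axis=0) - c         # excess demand = constraint residual
    p = np.maximum(p + eta * residual, 0.0)      # projected dual gradient step

# near the equilibrium prices, excess demand for each resource vanishes
print("prices:", p, "residual:", demand(p).sum(axis=0) - c)
```

Note the information structure: the price setter never observes the utility coefficients `a`; the only feedback entering the update is the residual `demand(p).sum(axis=0) - c`, which is exactly the demand–supply difference generated by the agents' reactions to prices.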

References:
  1. D. P. Bertsekas, Nonlinear Programming (Athena Sci., Belmont, 2016).
     
  2. R. T. Rockafellar, “Problem decomposition in block-separable convex optimization: Ideas old and new,” J. Nonlinear Convex Anal. 19 (9), 1459–1474 (2018).
     
  3. J.-P. Aubin, L’analyse non linéaire et ses motivations économiques (Paris—New York—Barcelone, Masson, 1984; Mir, Moscow, 1988).
     
  4. D. Jalota, K. Gopalakrishnan, N. Azizan, R. Johari, and M. Pavone, “Online learning for traffic routing under unknown preferences,” Proc. 26th Int. Conf. Artif. Intell. Stat. (2023), 206, 3210–3229.
     
  5. A. Beck, A. Nedić, A. Ozdaglar, and M. Teboulle, “An $O(1/k)$ gradient method for network resource allocation problems,” IEEE Trans. Control. Network Syst. 1 (1), 64–73 (2014). https://doi.org/10.1109/TCNS.2014.2309751
     
  6. E. A. Vorontsova, A. V. Gasnikov, A. S. Ivanova, and E. A. Nurminsky, “The Walrasian equilibrium and centralized distributed optimization in terms of modern convex optimization methods by an example of the resource allocation problem,” Numer. Anal. Appl. 12 (4), 338–358 (2019). https://doi.org/10.1134/S1995423919040037 
     
  7. R. Dengl, “Smart grid and demand side management,” Handbook of Real-Time Computing (2022), 681–703. https://doi.org/10.1007/978-981-287-251-7_43
     
  8. M. Mahdavi, R. Jin, and T. Yang, “Trading regret for efficiency: Online convex optimization with long term constraints,” J. Mach. Learn. Res. 13 (1), 2503–2528 (2012).
     
  9. M. Zinkevich, “Online convex programming and generalized infinitesimal gradient ascent,” Proc. Twentieth Int. Conf. Mach. Learn. (2003), 928–936.
     
  10. E. Hazan, Introduction to Online Convex Optimization (MIT Press, Cambridge, MA, 2022).
     
  11. N. Cesa-Bianchi, A. Conconi, and C. Gentile, “On the generalization ability of on-line learning algorithms,” IEEE Trans. Inf. Theory 50 (9), 2050–2057 (2004). https://doi.org/10.1109/TIT.2004.833339
     
  12. S. Shalev-Shwartz, “Online learning and online convex optimization,” Found. Trends Mach. Learn. 4 (2), 107–194 (2012). https://doi.org/10.1561/2200000018
     
  13. A. Cutkosky, “Anytime online-to-batch, optimism and acceleration,” Proc. 36th Int. Conf. Mach. Learn. (2019), 97, 1446–1454.
     
  14. T. Chen, Q. Ling, and G. B. Giannakis, “An online convex optimization approach to proactive network resource allocation,” IEEE Trans. Signal Process. 65 (24), 6350–6364 (2017). https://doi.org/10.1109/TSP.2017.2750109 
     
  15. H. Yu and M. J. Neely, “A low complexity algorithm with $O( \sqrt{T})$ regret and $O(1)$ constraint violations for online convex optimization with long term constraints,” J. Mach. Learn. Res. 21 (1), 1–24 (2020).
     
  16. X. Yi, X. Li, T. Yang, L. Xie, and K. Johansson, “Regret and cumulative constraint violation analysis for online convex optimization with long term constraints,” Proc. 38th Int. Conf. Mach. Learn. (2021), 139, 11998–12008. 
     
  17. D. Jalota, H. Sun, and N. Azizan, “Online learning for equilibrium pricing in markets under incomplete information,” arXiv, 2023. https://doi.org/10.48550/arXiv.2303.11522
     
  18. A. Roth, J. Ullman, and Z. S. Wu, “Watch and learn: Optimizing from revealed preferences feedback,” Proc. Forty-Eighth Annu. ACM Symp. Theory Comput. (2016), 949–962.
     
  19. Z. Ji, R. Mehta, and M. Telgarsky, “Social welfare and profit maximization from revealed preferences,” Lect. Notes Comput. Sci. 11316, 264–281 (2018). https://doi.org/10.1007/978-3-030-04612-5_18
     
  20. A. Roth, A. Slivkins, J. Ullman, and Z. S. Wu, “Multidimensional dynamic pricing for welfare maximization,” ACM Trans. Econ. Comput. 8 (1), 1–35 (2020). https://doi.org/10.1145/3381527
     
  21. R. T. Rockafellar, Convex Analysis (Princeton Univ. Press, Princeton, 1970; Mir, Moscow, 1973).
     
  22. D. P. Bertsekas, A. Nedic, and A. E. Ozdaglar, Convex Analysis and Optimization (Athena Sci., Belmont, 2003). 
     
  23. A. Beck, First-Order Methods in Optimization (SIAM, Philadelphia, 2017). https://doi.org/10.1137/1.9781611974997
     
  24. F. Orabona, “A modern introduction to online learning,” arXiv, 2019. https://doi.org/10.48550/arXiv.1912.13213 
     
  25. P. Auer, N. Cesa-Bianchi, and C. Gentile, “Adaptive and self-confident on-line learning algorithms,” J. Comput. Syst. Sci. 64 (1), 48–75 (2002). https://doi.org/10.1006/jcss.2001.1795
     
  26. C. D. Aliprantis and K. C. Border, Infinite Dimensional Analysis: A Hitchhiker’s Guide (Springer, Berlin, 2006). https://doi.org/10.1007/3-540-29587-9
     
  27. R. T. Rockafellar and R. J.-B. Wets, Variational Analysis (Springer, Berlin, 2009).
     
  28. S. Shalev-Shwartz and S. Ben-David, Understanding Machine Learning: From Theory to Algorithms (Cambridge Univ. Press, Cambridge, 2014; DMK Press, Moscow, 2019).
     
  29. Y. Nesterov and V. Shikhman, “Dual subgradient method with averaging for optimal resource allocation,” Eur. J. Oper. Res. 270 (3), 907–916 (2018). https://doi.org/10.1016/j.ejor.2017.09.043
     
  30. A. Shapiro, D. Dentcheva, and A. Ruszczynski, Lectures on Stochastic Programming: Modeling and Theory (SIAM, Philadelphia, 2009). https://doi.org/10.1137/1.9780898718751
     
  31. D. P. Bertsekas, “Stochastic optimization problems with nondifferentiable cost functionals,” J. Optim. Theory Appl. 12 (2), 218–231 (1973). https://doi.org/10.1007/bf00934819
     
  32. S. E. Shreve, Stochastic Calculus for Finance II, Continuous-Time Models (Springer, New York, 2004).
     
  33. D. B. Rokhlin, “Resource allocation in communication networks with large number of users: The dual stochastic gradient method,” Theory Probab. Appl. 66 (1), 105–120 (2021). https://doi.org/10.1137/S0040585X97T990289
     
  34. P. Joulani, A. Raj, A. György, and C. Szepesvári, “A simpler approach to accelerated optimization: Iterative averaging meets optimism,” Proc. 37th Int. Conf. Mach. Learn. (2020), 119, 4984–4993.
     
  35. P. Joulani, A. György, and C. Szepesvári, “A modular analysis of adaptive (non-)convex optimization: Optimism, composite objectives, variance reduction, and variational bounds,” Theor. Comput. Sci. 808 (2), 108–138 (2020). https://doi.org/10.1016/j.tcs.2019.11.015
     
  36. E. A. Vorontsova, A. V. Gasnikov, P. E. Dvurechensky, A. S. Ivanova, and D. A. Pasechnyuk, “Numerical methods for the resource allocation problem in a computer network,” Comput. Math. Math. Phys. 61 (2), 297–328 (2021). https://doi.org/10.1134/S0965542521020135
     
  37. Yu. E. Nesterov, “A method for solving convex programming problems with complexity $O(1/k^2)$,” Dokl. Akad. Nauk SSSR 269 (3), 543–547 (1982) [in Russian].
     
  38. Yu. E. Nesterov, Introduction to Convex Optimization (MTsNMO, Moscow, 2010) [in Russian].
     
  39. Y. Nesterov, “Universal gradient methods for convex optimization problems,” Math. Program. 152 (1–2), 381–404 (2015).