Authors: Ragel de la Torre, Ricardo; Rey Arcenegui, Rafael; Páez, Álvaro; Ponce Chulani, Javier; Nakamura, Keisuke; Caballero, Fernando; Merino, Luis; Gómez, Randy
Date accessioned: 2025-04-02
Date available: 2025-04-02
Date issued: 2023-02-01
Citation: R. Ragel, R. Rey, Á. Páez, J. Ponce, K. Nakamura, F. Caballero, L. Merino and R. Gómez. "Multi-modal Data Fusion for People Perception in the Social Robot Haru", International Conference on Social Robotics, pp. 174-187, 2022. DOI: 10.1007/978-3-031-24667-8_16
URI: https://hdl.handle.net/10433/23695
Funding: This work is partially supported by Programa Operativo FEDER Andalucía 2014-2020, Consejería de Economía y Conocimiento (DeepBot, PY20_00817) and by the project PLEC2021-007868, funded by MCIN/AEI/10.13039/501100011033 and the European Union NextGenerationEU/PRTR.
Research projects: Proyecto DeepBot (PY20_00817); Proyecto NHoA (PLEC2021-007868)
Abstract: This article presents a people perception software architecture and its implementation, focused on the information of interest from the point of view of a social robot. The key modules used to obtain the different person features, such as body part locations, face and hand information, and speech, from a set of possible devices and configurations are described. The association and combination of these features by means of a temporal and geometric fusion system are explained in detail. A high-level interface for Human-Robot Interaction based on the information of the fused people is proposed. The paper presents experimental results evaluating the relevant aspects of the system.
Format: application/pdf
Language: en
License: Attribution 4.0 International (http://creativecommons.org/licenses/by/4.0/)
Keywords: Social Robotics; Data Fusion; Human-Robot Interaction
Title: Multi-modal data fusion for people perception in the social robot Haru
Type: conference output
Access: open access
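The abstract's "temporal and geometric fusion" suggests associating per-modality detections (body, face, speech) to person hypotheses by proximity in space and recency in time. The Python sketch below illustrates that general idea only; the class names, the 0.5 m gate, the 2 s timeout and the nearest-neighbour assignment are illustrative assumptions, not the architecture described in the paper.

import math
import time
from dataclasses import dataclass, field

@dataclass
class Detection:
    modality: str            # e.g. "body", "face", "speech"
    position: tuple          # (x, y, z) in the robot frame
    stamp: float             # detection timestamp (seconds)
    data: dict = field(default_factory=dict)

@dataclass
class Person:
    position: tuple
    last_seen: float
    features: dict = field(default_factory=dict)  # latest feature per modality

def fuse(people, detections, gate=0.5, timeout=2.0, now=None):
    """Assign each detection to the closest person within `gate` metres,
    otherwise start a new person; drop people not updated for `timeout` s."""
    now = time.time() if now is None else now
    for det in detections:
        best, best_d = None, gate
        for person in people:
            d = math.dist(person.position, det.position)
            if d < best_d:
                best, best_d = person, d
        if best is None:                      # no match inside the gate: new person
            best = Person(det.position, det.stamp)
            people.append(best)
        best.position = det.position          # naive update; a filter could smooth this
        best.last_seen = max(best.last_seen, det.stamp)
        best.features[det.modality] = det.data
    return [p for p in people if now - p.last_seen < timeout]

A real system would replace the naive position update with per-person filtering and would handle modalities that lack a 3D position (for example, a sound source direction) through angular rather than metric gating.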