arXiv:2212.08911 [cs.CL]

AdaTranS: Adapting with Boundary-based Shrinking for End-to-End Speech Translation

Published 2022-12-17, Version 1

To alleviate the data scarcity problem in end-to-end speech translation (ST), pre-training on speech recognition and machine translation data is considered an important technique. However, the modality gap between speech and text prevents the ST model from efficiently inheriting knowledge from the pre-trained models. In this work, we propose AdaTranS for end-to-end ST. It adapts the speech features with a new shrinking mechanism that predicts word boundaries, mitigating the length mismatch between speech and text features. Experiments on the MuST-C dataset demonstrate that AdaTranS achieves better performance than other shrinking-based methods, with higher inference speed and lower memory usage. Further experiments show that AdaTranS can be combined with additional alignment losses to improve performance further.
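The idea of shrinking speech features at predicted word boundaries can be illustrated with a minimal sketch. This is not the AdaTranS implementation: the function name `shrink_by_boundaries`, the fixed probability threshold, and the use of mean pooling within each segment are all assumptions made for illustration, and the boundary probabilities are hand-written rather than produced by a trained predictor.

```python
from typing import List, Sequence


def shrink_by_boundaries(
    frames: Sequence[Sequence[float]],
    boundary_probs: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[List[float]]:
    """Mean-pool consecutive frames into one vector per predicted segment.

    A frame whose boundary probability is >= threshold closes the
    current segment (hypothetical rule, chosen for simplicity).
    """
    if len(frames) != len(boundary_probs):
        raise ValueError("frames and boundary_probs must have the same length")

    shrunk: List[List[float]] = []
    segment: List[Sequence[float]] = []
    for frame, prob in zip(frames, boundary_probs):
        segment.append(frame)
        if prob >= threshold:
            shrunk.append(_mean(segment))
            segment = []
    if segment:  # trailing frames with no closing boundary
        shrunk.append(_mean(segment))
    return shrunk


def _mean(segment: Sequence[Sequence[float]]) -> List[float]:
    """Average a list of equal-length vectors dimension by dimension."""
    n = len(segment)
    return [sum(column) / n for column in zip(*segment)]


# Six 2-dimensional speech frames and made-up boundary probabilities.
frames = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0], [0.0, 6.0], [5.0, 5.0]]
boundary_probs = [0.1, 0.9, 0.2, 0.1, 0.8, 0.3]

shrunk = shrink_by_boundaries(frames, boundary_probs)
print(len(frames), "frames ->", len(shrunk), "segments")
```

In this toy input, six frames collapse into three segment vectors, which shows how a boundary-driven reduction brings the speech sequence length closer to a word-level text length.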

Categories: cs.CL, cs.SD, eess.AS

Keywords: end-to-end speech translation, boundary-based shrinking, AdaTranS, additional alignment losses, MuST-C dataset

Related articles:

arXiv:2412.04266 [cs.CL] (Published 2024-12-05)

Representation Purification for End-to-End Speech Translation

Chengwei Zhang, Yue Zhou, Rui Zhao, Yidong Chen, Xiaodong Shi

arXiv:2306.07650 [cs.CL] (Published 2023-06-13)

Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation

Yuchen Han, Chen Xu, Tong Xiao, Jingbo Zhu

arXiv:2204.05076 [cs.CL] (Published 2022-04-11)

End-to-End Speech Translation for Code Switched Speech

Orion Weller, Matthias Sperber, Telmo Pires, Hendra Setiawan, Christian Gollan, Dominic Telaar, Matthias Paulik