Leveraging Frozen Pretrained Written Language Models for Neural Sign Language Translation
Abstract
1. Introduction
2. Background
2.1. Sign Language Machine Translation
2.2. Frozen Pretrained Transformers
2.3. Related Work
3. Materials and Methods
3.1. Sign Language Representation
3.2. Sign Language Transformers
- Baseline
- BERT2RNDscratch
- BERT2BERTscratch
- BERT2RNDff
- BERT2BERTff
- BERT2RNDln
- BERT2BERTln
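The suffixes in these variant names encode the training scheme: "scratch" variants are trained from scratch, while "ff" and "ln" variants keep the pretrained BERT weights frozen and fine-tune only the feed-forward sublayers or the layer-normalization parameters, respectively. Below is a minimal sketch of the "ln"-style freezing using HuggingFace Transformers; the checkpoint name and the parameter-name matching are illustrative assumptions, not the authors' exact code.

```python
from transformers import BertModel

# Load a pretrained BERT encoder. The checkpoint name is an assumption for
# illustration; the paper translates into German, so a German model is plausible.
encoder = BertModel.from_pretrained("bert-base-german-cased")

# Freeze all pretrained weights.
for param in encoder.parameters():
    param.requires_grad = False

# "ln" variants: fine-tune only the layer-normalization parameters.
for name, param in encoder.named_parameters():
    if "LayerNorm" in name:
        param.requires_grad = True

# "ff" variants would instead unfreeze the feed-forward sublayers, e.g. names
# containing "intermediate.dense", or ".output.dense" outside the attention
# blocks (exact matching depends on the checkpoint's parameter naming).

trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
total = sum(p.numel() for p in encoder.parameters())
print(f"trainable parameters: {trainable:,} / {total:,}")
```

Freezing most of the network in this way leaves only a small fraction of the parameters trainable, which is the central idea behind Frozen Pretrained Transformers.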
3.3. Sign Language Translation
3.4. Implementation Details
3.5. Evaluation
4. Results
4.1. Experimental Results
4.2. Performance Comparison
4.3. Model Architecture
4.4. Glosses
4.5. Learning Curves
4.6. Example Translations
5. Discussion
5.1. Discussion of Results
5.2. Representation Power of Neural Sign Language Translation Models
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Meaning
---|---
MT | Machine Translation
SLR | Sign Language Recognition
CSLR | Continuous Sign Language Recognition
SLT | Sign Language Translation
FPT | Frozen Pretrained Transformer
BERT | Bidirectional Encoder Representations from Transformers
GPT-2 | Generative Pretrained Transformer 2
RND | Random
CNN | Convolutional Neural Network
LSTM | Long Short-Term Memory
HMM | Hidden Markov Model
Appendix A
Task | Model | Seed
---|---|---
Sign2Text | Baseline | 1
Sign2Text | BERT2RNDscratch | 93
Sign2Text | BERT2RNDff | 2021
Sign2Text | BERT2RNDln | 1
Sign2Text | BERT2BERTscratch | 7366756
Sign2Text | BERT2BERTff | 251016
Sign2Text | BERT2BERTln | 2021
Sign2(Gloss+Text) | Baseline | 93
Sign2(Gloss+Text) | BERT2RNDscratch | 93
Sign2(Gloss+Text) | BERT2RNDff | 7366756
Sign2(Gloss+Text) | BERT2RNDln | 93
Sign2(Gloss+Text) | BERT2BERTscratch | 251016
Sign2(Gloss+Text) | BERT2BERTff | 7366756
Sign2(Gloss+Text) | BERT2BERTln | 93
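The seeds above identify the specific runs reported for each task and model. For reproducibility, such a seed is typically fixed across all random number generators at the start of training; the helper below is a generic sketch of this practice (the paper's actual seeding code is not reproduced here).

```python
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    """Fix the Python, NumPy and PyTorch random seeds for reproducibility."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)


# For example, the Sign2Text BERT2RNDff entry from the table above:
set_seed(2021)
```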
References
- Camgoz, N.C.; Hadfield, S.; Koller, O.; Ney, H.; Bowden, R. Neural sign language translation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7784–7793.
- Esplà-Gomis, M.; Forcada, M.; Ramírez-Sánchez, G.; Hoang, H.T. ParaCrawl: Web-scale parallel corpora for the languages of the EU. In Proceedings of the Machine Translation Summit XVII, Dublin, Ireland, 19–23 August 2019.
- Moryossef, A.; Yin, K.; Neubig, G.; Goldberg, Y. Data Augmentation for Sign Language Gloss Translation. In Proceedings of the 1st International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL), Online, 16–20 August 2021; pp. 1–11.
- Zhang, X.; Duh, K. Approaching Sign Language Gloss Translation as a Low-Resource Machine Translation Task. In Proceedings of the 1st International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL), Online, 16–20 August 2021; pp. 60–70.
- Zhou, H.; Zhou, W.; Qi, W.; Pu, J.; Li, H. Improving Sign Language Translation with Monolingual Data by Sign Back-Translation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 1316–1325.
- De Coster, M.; D'Oosterlinck, K.; Pizurica, M.; Rabaey, P.; Verlinden, S.; Van Herreweghe, M.; Dambre, J. Frozen Pretrained Transformers for Neural Sign Language Translation. In Proceedings of the 1st International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL), Online, 20 August 2021; pp. 88–97.
- Zoph, B.; Yuret, D.; May, J.; Knight, K. Transfer Learning for Low-Resource Neural Machine Translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–4 November 2016; pp. 1568–1575.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
- Rothe, S.; Narayan, S.; Severyn, A. Leveraging pre-trained checkpoints for sequence generation tasks. Trans. Assoc. Comput. Linguist. 2020, 8, 264–280.
- Artetxe, M.; Ruder, S.; Yogatama, D. On the Cross-lingual Transferability of Monolingual Representations. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 4623–4637.
- Gogoulou, E.; Ekgren, A.; Isbister, T.; Sahlgren, M. Cross-lingual Transfer of Monolingual Models. arXiv 2021, arXiv:2109.07348.
- Tsimpoukelli, M.; Menick, J.; Cabi, S.; Eslami, S.; Vinyals, O.; Hill, F. Multimodal few-shot learning with frozen language models. Adv. Neural Inf. Process. Syst. 2021, 34, 200–212.
- Lu, K.; Grover, A.; Abbeel, P.; Mordatch, I. Pretrained transformers as universal computation engines. arXiv 2021, arXiv:2103.05247.
- Camgoz, N.C.; Koller, O.; Hadfield, S.; Bowden, R. Sign language transformers: Joint end-to-end sign language recognition and translation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Online, 14–19 June 2020; pp. 10023–10033.
- Yin, K.; Read, J. Better Sign Language Translation with STMC-Transformer. In Proceedings of the 28th International Conference on Computational Linguistics, Barcelona, Spain, 8–13 December 2020; pp. 5975–5989.
- Bungeroth, J.; Ney, H. Statistical sign language translation. In Proceedings of the Workshop on Representation and Processing of Sign Languages, LREC, Lisbon, Portugal, 24–30 May 2004; Volume 4, pp. 105–108.
- Morrissey, S.; Way, A.; Stein, D.; Bungeroth, J.; Ney, H. Combining data-driven MT systems for improved sign language translation. In Proceedings of the Machine Translation Summit XI, Copenhagen, Denmark, 10–14 September 2007.
- Stein, D.; Schmidt, C.; Ney, H. Sign language machine translation overkill. In Proceedings of the International Workshop on Spoken Language Translation (IWSLT), Paris, France, 2–3 December 2010.
- Forster, J.; Schmidt, C.; Koller, O.; Bellgardt, M.; Ney, H. Extensions of the Sign Language Recognition and Translation Corpus RWTH-PHOENIX-Weather. In Proceedings of LREC, Reykjavik, Iceland, 26–31 May 2014; pp. 1911–1916.
- Frishberg, N.; Hoiting, N.; Slobin, D.I. Transcription. In Sign Language; Pfau, R., Steinbach, M., Woll, B., Eds.; De Gruyter Mouton: Berlin, Germany, 2012; pp. 1045–1075.
- Vermeerbergen, M.; Leeson, L.; Crasborn, O.A. Simultaneity in Signed Languages: Form and Function; John Benjamins Publishing: Amsterdam, The Netherlands, 2007; Volume 281.
- Vermeerbergen, M. Past and current trends in sign language research. Lang. Commun. 2006, 26, 168–192.
- Orbay, A.; Akarun, L. Neural sign language translation by learning tokenization. In Proceedings of the 2020 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020), Buenos Aires, Argentina, 16–20 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 222–228.
- Zhou, H.; Zhou, W.; Zhou, Y.; Li, H. Spatial-temporal multi-cue network for sign language recognition and translation. IEEE Trans. Multimed. 2021, 24, 768–779.
- De Coster, M.; Shterionov, D.; Van Herreweghe, M.; Dambre, J. Machine Translation from Signed to Spoken Languages: State of the Art and Challenges. arXiv 2022, arXiv:2202.03086.
- Luong, M.T.; Pham, H.; Manning, C.D. Effective Approaches to Attention-based Neural Machine Translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 1412–1421.
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 2019, 1, 9.
- Imamura, K.; Sumita, E. Recycling a pre-trained BERT encoder for neural machine translation. In Proceedings of the 3rd Workshop on Neural Generation and Translation, Hong Kong, China, 4 November 2019; pp. 23–31.
- Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A robustly optimized BERT pretraining approach. arXiv 2019, arXiv:1907.11692.
- Miyazaki, T.; Morita, Y.; Sano, M. Machine translation from spoken language to sign language using pre-trained language model as encoder. In Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives, Marseille, France, 11–16 May 2020; pp. 139–144.
- Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A.A. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017.
- Koller, O.; Camgoz, N.C.; Ney, H.; Bowden, R. Weakly supervised learning with multi-stream CNN-LSTM-HMMs to discover sequential parallelism in sign language videos. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 42, 2306–2320.
- Graves, A.; Fernández, S.; Gomez, F.; Schmidhuber, J. Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; pp. 369–376.
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015.
- Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32; Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2019; pp. 8024–8035.
- Kreutzer, J.; Bastings, J.; Riezler, S. Joey NMT: A Minimalist NMT Toolkit for Novices. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations, Hong Kong, China, 3–7 November 2019; pp. 109–114.
- Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.J. Bleu: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 7–12 July 2002; pp. 311–318.
- Lin, C.Y.; Och, F.J. Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04), Barcelona, Spain, 21–26 July 2004; pp. 605–612.
- Popović, M. chrF: Character n-gram F-score for automatic MT evaluation. In Proceedings of the Tenth Workshop on Statistical Machine Translation, Lisbon, Portugal, 17–18 September 2015; pp. 392–395.
- Belinkov, Y.; Durrani, N.; Dalvi, F.; Sajjad, H.; Glass, J. On the linguistic representational power of neural machine translation models. Comput. Linguist. 2020, 46, 1–52.
- Qi, J.; Du, J.; Siniscalchi, S.M.; Ma, X.; Lee, C.H. Analyzing upper bounds on mean absolute errors for deep neural network-based vector-to-vector regression. IEEE Trans. Signal Process. 2020, 68, 3411–3422.
[Table: BLEU-4, ROUGE-L and CHRF scores of the Baseline, BERT2RND (scratch/ff/ln) and BERT2BERT (scratch/ff/ln) models on the Sign2Text and Sign2(Gloss+Text) tasks; the numeric scores did not survive extraction.]
[Table: a second BLEU-4/ROUGE-L/CHRF comparison over the same models and tasks; the numeric scores likewise did not survive extraction.]
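BLEU-4, ROUGE-L and CHRF can be computed with standard tooling. The sketch below uses sacrebleu for corpus-level BLEU (4-gram by default) and chrF, and the rouge_score package for an averaged sentence-level ROUGE-L F-score; the paper does not state its exact scoring scripts, so these libraries are stand-ins, demonstrated here on the perfectly translated example from the last qualitative table below.

```python
import sacrebleu
from rouge_score import rouge_scorer

# Toy corpus: one hypothesis/reference pair taken from the example tables.
hypotheses = ["und nun die wettervorhersage für morgen donnerstag den dritten dezember"]
references = ["und nun die wettervorhersage für morgen donnerstag den dritten dezember"]

# Corpus-level BLEU (4-gram by default) and chrF.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
chrf = sacrebleu.corpus_chrf(hypotheses, [references])

# Sentence-level ROUGE-L F1, averaged over the corpus.
scorer = rouge_scorer.RougeScorer(["rougeL"])
rouge_l = sum(
    scorer.score(ref, hyp)["rougeL"].fmeasure
    for ref, hyp in zip(references, hypotheses)
) / len(hypotheses)

print(f"BLEU-4 {bleu.score:.2f} | ROUGE-L {100 * rouge_l:.2f} | chrF {chrf.score:.2f}")
```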
Model | Hypothesis
---|---
Example: 03July_2011_Sunday_tagesschau-1665 |
Reference: in den übrigen landesteilen wird es meist freundlich dank einer hochdruckzone die von der biskaya bis zu den shetlandinseln reicht |
English translation: in the remaining parts of the country it is mostly friendly thanks to a high pressure zone that stretches from the bay of biscay to the shetland islands |
Sign2Text |
Baseline | sonst klart es verbreitet auf und es wird später auf trotzdem wird es recht freundlich und es fällt dort bis in tiefe lagen (English: otherwise it clears up widely and it gets up later, nevertheless it becomes quite friendly and it drops down to low levels there)
BERT2RNDff | im übrigen land wird es verbreitet freundlich die macht sich morgen dann in den kommenden tagen von der ems bis nach brandenburg (English: in the rest of the country it is spreading friendly which will then make itself felt tomorrow in the coming days from the ems to brandenburg)
Sign2(Gloss+Text) |
Baseline | sonst wird es deutlich freundlicher und erreicht das hoch nur einen recht freundliches wetter in den kommenden tagen mit den regenwolken (English: otherwise it will be much friendlier and the high will only be quite friendly weather in the coming days with the rain clouds)
BERT2RNDff | sonst wird es meist freundlich und von westen weht ein schwacher bis mäßiger wind aus unterschiedlichen richtungen (English: otherwise it will mostly be friendly and a weak to moderate wind will blow from the west from different directions)
Model | Hypothesis
---|---
Example: 12July_2010_Monday_tagesschau-374 |
Reference: morgen gibt es im osten und südosten bei einer mischung aus sonne und wolken zum teil kräftige schauer oder gewitter |
English translation: tomorrow there will be some heavy showers or thunderstorms in the east and southeast with a mixture of sun and clouds |
Sign2Text |
Baseline | morgen im osten und südosten noch sommerliche werte am nachmittag einzelne schauer und gewitter (English: tomorrow in the east and southeast still summery values, in the afternoon some showers and thunderstorms)
BERT2RNDff | morgen im osten und südosten zunächst noch freundlich sonst viele wolken und zum teil kräftige gewittrige regenfälle (English: tomorrow in the east and southeast initially still friendly, otherwise lots of clouds and partly heavy thundery rain)
Sign2(Gloss+Text) |
Baseline | morgen im osten und südosten noch zweistellige regenfälle sonst teils wolkig oder zum teil heftige gewitter (English: tomorrow in the east and southeast there will still be double-digit rainfall, otherwise partly cloudy or partly heavy thunderstorms)
BERT2RNDff | morgen im osten und südosten noch teilweise gewittrige schauer (English: partly thundery showers in the east and southeast tomorrow)
Model | Hypothesis
---|---
Example: 02December_2009_Wednesday_tagesschau-4039 |
Reference: und nun die wettervorhersage für morgen donnerstag den dritten dezember |
English translation: and now the weather forecast for tomorrow thursday the third of december |
Sign2Text |
Baseline | und nun die wettervorhersage für morgen donnerstag den dritten dezember (English: and now the weather forecast for tomorrow thursday the third of december)
BERT2RNDff | und nun die wettervorhersage für morgen donnerstag den dritten dezember (English: and now the weather forecast for tomorrow thursday the third of december)
Sign2(Gloss+Text) |
Baseline | und nun die wettervorhersage für morgen donnerstag den dritten dezember (English: and now the weather forecast for tomorrow thursday the third of december)
BERT2RNDff | und nun die wettervorhersage für morgen donnerstag den dritten dezember (English: and now the weather forecast for tomorrow thursday the third of december)
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).