Word segmentation from transcriptions of child-directed speech using lexical and sub-lexical cues, Journal of Child Language


Journal of Child Language (IF 2.701) Pub Date: 2023-09-12, DOI: 10.1017/s0305000923000491
Zébulon GORIELY, Andrew CAINES, Paula BUTTERY

We compare two frameworks for the segmentation of words in child-directed speech, PHOCUS and MULTICUE. PHOCUS is driven by lexical recognition, whereas MULTICUE combines sub-lexical properties to make boundary decisions; the two represent differing views of speech processing. We replicate these frameworks, perform novel benchmarking and confirm that both achieve competitive results. We develop a new framework for segmentation, the DYnamic Programming MULTIple-cue framework (DYMULTI), which combines the strengths of PHOCUS and MULTICUE by considering both sub-lexical and lexical cues when making boundary decisions. DYMULTI achieves state-of-the-art results and outperforms PHOCUS and MULTICUE on 15 of 26 languages in a cross-lingual experiment. Because DYMULTI is built on psycholinguistic principles, these results validate it as a robust model of speech segmentation and as a contribution to the understanding of language acquisition.
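
The abstract does not spell out how DYMULTI scores candidate segmentations, so the following is only a generic illustration of the idea it names: a dynamic-programming search that adds a lexical score for each proposed word to a sub-lexical score for each proposed boundary. It is a minimal sketch under stated assumptions, not the authors' algorithm. The function `segment`, the toy `lexicon`, the `flat_cue` boundary scorer, and the `unknown_penalty` and `max_len` parameters are all hypothetical names introduced here.

```python
import math


def segment(utterance, lexicon_logprob, boundary_logprob,
            unknown_penalty=-10.0, max_len=8):
    """Return the highest-scoring split of `utterance` into words.

    Hypothetical scoring: each candidate word contributes a lexical
    log-probability (or a per-symbol penalty if it is not in the lexicon),
    and each utterance-internal boundary contributes a sub-lexical cue score.
    """
    n = len(utterance)
    best = [-math.inf] * (n + 1)   # best[i]: best score for utterance[:i]
    back = [0] * (n + 1)           # back[i]: start index of the last word
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = utterance[start:end]
            # Lexical cue: known words are cheap, unknown words are penalised.
            lexical = lexicon_logprob.get(word, unknown_penalty * len(word))
            # Sub-lexical cue: only scored for utterance-internal boundaries.
            sublexical = boundary_logprob(utterance, end) if end < n else 0.0
            score = best[start] + lexical + sublexical
            if score > best[end]:
                best[end] = score
                back[end] = start

    # Follow the back-pointers from the end of the utterance.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(utterance[start:end])
        end = start
    return words[::-1]


# Toy lexicon with made-up probabilities (illustrative values only).
lexicon = {"the": math.log(0.5), "dog": math.log(0.3), "a": math.log(0.2)}


def flat_cue(utterance, position):
    """Placeholder sub-lexical cue that treats every boundary as equally likely."""
    return 0.0


result = segment("thedog", lexicon, flat_cue)
```

With this toy lexicon, `result` is `["the", "dog"]`. Replacing `flat_cue` with a scorer based on, say, phoneme transition statistics would let the boundary cues override or reinforce the lexical evidence, which is the kind of interaction between cue types that the abstract describes.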


Updated: 2023-09-12
