Word segmentation from transcriptions of child-directed speech using lexical and sub-lexical cues
Journal of Child Language (IF 2.701), Pub Date: 2023-09-12, DOI: 10.1017/s0305000923000491
Zébulon GORIELY , Andrew CAINES , Paula BUTTERY

We compare two frameworks for the segmentation of words in child-directed speech, PHOCUS and MULTICUE. PHOCUS is driven by lexical recognition, whereas MULTICUE combines sub-lexical properties to make boundary decisions; the two represent differing views of speech processing. We replicate these frameworks, perform novel benchmarking and confirm that both achieve competitive results. We develop a new framework for segmentation, the DYnamic Programming MULTIple-cue framework (DYMULTI), which combines the strengths of PHOCUS and MULTICUE by considering both sub-lexical and lexical cues when making boundary decisions. DYMULTI achieves state-of-the-art results and outperforms PHOCUS and MULTICUE on 15 of 26 languages in a cross-lingual experiment. Because DYMULTI is built on psycholinguistic principles, these results validate it as a robust model for speech segmentation and a contribution to the understanding of language acquisition.
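
The abstract does not spell out how DYMULTI combines its cues, so the following is only a minimal sketch of the general idea of dynamic-programming segmentation, not the authors' algorithm. It searches for the highest-scoring split of an unsegmented string under a pluggable per-word score. The names `segment`, `toy_word_score`, `LEXICON` and `max_len`, and the toy counts, are all hypothetical and chosen for illustration; a real model would replace the toy score with one derived from lexical and sub-lexical cues.

```python
import math

# Hypothetical toy lexicon with made-up counts (an assumption, not data from the paper).
LEXICON = {"the": 4, "dog": 3, "a": 2, "doggy": 1}
TOTAL = sum(LEXICON.values())


def toy_word_score(word):
    """Log-probability for known words; a flat per-symbol penalty otherwise."""
    if word in LEXICON:
        return math.log(LEXICON[word] / TOTAL)
    return -10.0 * len(word)


def segment(utterance, word_score, max_len=8):
    """Return the highest-scoring split of `utterance` into words.

    best[i] is the best total score for utterance[:i];
    back[i] is the start index of the last word in that best split.
    """
    n = len(utterance)
    best = [-math.inf] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        # Only consider candidate words of at most max_len symbols.
        for start in range(max(0, end - max_len), end):
            score = best[start] + word_score(utterance[start:end])
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Follow the back-pointers from the end to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(utterance[start:end])
        end = start
    return words[::-1]


print(segment("thedog", toy_word_score))
```

Because the score is passed in as a function, the same search could in principle be driven by a lexical score, a boundary-cue score, or a weighted mix of the two; how such cues are actually weighted in DYMULTI is described in the paper itself, not here.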

Updated: 2023-09-12