Mining Data Wrangling Workflows for Design Patterns Discovery and Specification
Information Systems Frontiers (IF 5.9), Pub Date: 2024-02-01, DOI: 10.1007/s10796-023-10458-7
Abdullah AlMasaud, Sandra Sampaio, Pedro Sampaio

In this paper, we investigate Data Wrangling (DW) pipelines, expressed as workflows devised by data analysts with varying levels of experience, to find commonalities or patterns. We propose an approach for pattern discovery based on workflow mining techniques, addressing key challenges associated with finding patterns in data preparation solutions. The findings provide insights into the most commonly used DW operations, solution patterns, redundancies, and reuse opportunities in data preparation. We used the findings to create design pattern specifications, curated into a catalog in the form of a DW Design Patterns Handbook. We evaluated the handbook by surveying professionals; the results confirm that the discovered patterns are useful for constructing DW solutions and help data analysts and scientists reuse patterns and best practices in DW.
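As a rough illustration of what mining workflows for recurring patterns can look like, the sketch below counts contiguous operation sequences that appear in several workflows. It is not the authors' method: the abstract does not describe their algorithm, and the function name `frequent_op_sequences`, the operation names, and the sample workflows are all hypothetical.

```python
from collections import Counter


def frequent_op_sequences(workflows, length=2, min_support=2):
    """Return contiguous operation sequences shared by several workflows.

    Hypothetical helper for illustration only. Support is the number of
    workflows that contain the sequence at least once.
    """
    support = Counter()
    for ops in workflows:
        # Use a set so a sequence counts once per workflow, however often it repeats.
        seen = {tuple(ops[i:i + length]) for i in range(len(ops) - length + 1)}
        support.update(seen)
    return {seq: n for seq, n in support.items() if n >= min_support}


# Invented sample workflows, each a list of DW operation names.
workflows = [
    ["load", "drop_nulls", "rename_columns", "join"],
    ["load", "drop_nulls", "filter_rows"],
    ["load", "rename_columns", "join", "aggregate"],
]

patterns = frequent_op_sequences(workflows)
for seq, n in sorted(patterns.items()):
    print(" -> ".join(seq), "appears in", n, "workflows")
```

Counting support per workflow rather than per occurrence keeps a single repetitive workflow from dominating the result, which matters when the goal is to find sequences reused across analysts.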




Updated: 2024-02-01