A Systematic Literature Review of Novelty Detection in Data Streams: Challenges and Opportunities

Abstract
Novelty detection in data streams is the task of detecting previously unknown concepts as they appear in a stream of data. Many machine learning algorithms have been proposed to detect these novelties and to integrate them into the model. This study presents a systematic literature review of novelty detection in data streams, covering recent advances, the main challenges and proposed solutions, an updated taxonomy for classifying the proposed frameworks, and a comparative analysis of key algorithms in the field. We also highlight open challenges and directions for future research.