
Adapting Neural Networks at Runtime: Current Trends in At-Runtime Optimizations for Deep Learning

Published: 14 May 2024

Abstract

Adaptive optimization methods for deep learning adjust the inference task to the current circumstances at runtime, reducing its resource footprint while maintaining the model’s performance. These methods are essential for the widespread adoption of deep learning because they lower the cost of inference while exploiting information about the current environment that only becomes available at runtime. This survey covers state-of-the-art at-runtime optimization methods, provides guidance for choosing the most suitable method for a specific use case, and highlights open research gaps in the field.
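To make the idea of adjusting the inference task at runtime concrete, the sketch below illustrates one common strategy from this family, confidence-based early exiting: a cheap auxiliary classifier attached partway through the network lets "easy" inputs stop early, so the remaining layers are executed only when the input demands them. This is a minimal illustrative example, not code from any surveyed system; the PyTorch framework, the class name EarlyExitNet, the layer sizes, and the 0.9 confidence threshold are all hypothetical choices made for the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EarlyExitNet(nn.Module):
    """Toy classifier with one auxiliary (early) exit and one final exit (illustrative only)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())
        self.exit1 = nn.Linear(128, num_classes)   # cheap auxiliary classifier
        self.stage2 = nn.Sequential(nn.Linear(128, 128), nn.ReLU())
        self.exit2 = nn.Linear(128, num_classes)   # full-cost final classifier

    @torch.no_grad()
    def adaptive_forward(self, x: torch.Tensor, threshold: float = 0.9):
        """Return a prediction from the first exit whose softmax confidence clears `threshold`."""
        h = self.stage1(x)
        conf, pred = F.softmax(self.exit1(h), dim=-1).max(dim=-1)
        if conf.item() >= threshold:                # "easy" input: skip the remaining layers
            return pred, "exit1"
        h = self.stage2(h)
        return self.exit2(h).argmax(dim=-1), "exit2"


model = EarlyExitNet().eval()
sample = torch.randn(1, 1, 28, 28)                  # one MNIST-sized input
prediction, exit_taken = model.adaptive_forward(sample)
print(prediction.item(), exit_taken)
```

The runtime decision here is the confidence check: by raising or lowering the threshold, the same trained model trades accuracy against latency and energy per input, which is the kind of adjustment the surveyed methods automate in more sophisticated ways.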

REFERENCES

  1. [1] Akhlaghi Vahideh, Yazdanbakhsh Amir, Samadi Kambiz, Gupta Rajesh K., and Esmaeilzadeh Hadi. 2018. SnaPEA: Predictive early activation for reducing computation in deep convolutional neural networks. In 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA). 662673. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. [2] Almahairi Amjad, Ballas Nicolas, Cooijmans Tim, Zheng Yin, Larochelle Hugo, and Courville Aaron. 2016. Dynamic capacity networks. In Proceedings of The 33rd International Conference on Machine Learning. PMLR, 25492558.Google ScholarGoogle Scholar
  3. [3] Alwassel Humam, Heilbron Fabian Caba, and Ghanem Bernard. 2018. Action search: Spotting actions in videos and its application to temporal action localization. In Proceedings of the European Conference on Computer Vision (ECCV). 251266.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. [4] Alwis Udari De and Alioto Massimo. 2021. TempDiff: Temporal difference-based feature map-level sparsity induction in CNNs with \(\lt\)4% memory overhead. In 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS). 14. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  5. [5] Amthor Manuel, Rodner Erik, and Denzler Joachim. 2016. Impatient DNNs - Deep Neural Networks with Dynamic Time Budgets. DOI:arxiv:1610.02850 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  6. [6] Apicharttrisorn Kittipat, Ran Xukan, Chen Jiasi, Krishnamurthy Srikanth V., and Roy-Chowdhury Amit K.. 2019. Frugal following: Power thrifty object detection and tracking for mobile augmented reality. In Proceedings of the 17th Conference on Embedded Networked Sensor Systems (SenSys’19). Association for Computing Machinery, New York, NY, USA, 96109. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. [7] Bejnordi Babak Ehteshami, Blankevoort Tijmen, and Welling Max. 2020. Batch-Shaping for Learning Conditional Channel Gated Networks. DOI:arxiv:1907.06627 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  8. [8] Bengio Emmanuel, Bacon Pierre-Luc, Pineau Joelle, and Precup Doina. 2016. Conditional Computation in Neural Networks for Faster Models. DOI:arxiv:1511.06297 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  9. [9] Bolukbasi Tolga, Wang Joseph, Dekel Ofer, and Saligrama Venkatesh. 2017. Adaptive neural networks for efficient inference. In Proceedings of the 34th International Conference on Machine Learning. PMLR, 527536.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. [10] Bolukbasi Tolga, Wang Joseph, Dekel Ofer, and Saligrama Venkatesh. 2017. Adaptive neural networks for fast test-time prediction. (Feb.2017).Google ScholarGoogle Scholar
  11. [11] Buckler Mark, Bedoukian Philip, Jayasuriya Suren, and Sampson Adrian. 2018. EVA\(^2\): Exploiting temporal redundancy in live computer vision. In 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA). 533546. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. [12] Busia Paola, Theodorakopoulos Ilias, Pothos Vasileios, Fragoulis Nikos, and Meloni Paolo. 2022. Dynamic pruning for parsimonious CNN inference on embedded systems. In Design and Architecture for Signal and Image Processing (Lecture Notes in Computer Science), Desnos Karol and Pertuz Sergio (Eds.). Springer International Publishing, Cham, 4556. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. [13] Cai Shaofeng, Shu Yao, and Wang Wei. 2021. Dynamic routing networks. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 35883597.Google ScholarGoogle ScholarCross RefCross Ref
  14. [14] Campos Victor, Jou Brendan, Giro-i-Nieto Xavier, Torres Jordi, and Chang Shih-Fu. 2018. Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks. DOI:arxiv:1708.06834 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  15. [15] Canel Christopher, Kim Thomas, Zhou Giulio, Li Conglong, Lim Hyeontaek, Andersen David G., Kaminsky Michael, and Dulloor Subramanya. 2019. Scaling video analytics on constrained edge nodes. Proceedings of Machine Learning and Systems 1 (April2019), 406417.Google ScholarGoogle Scholar
  16. [16] Cao Shijie, Ma Lingxiao, Xiao Wencong, Zhang Chen, Liu Yunxin, Zhang Lintao, Nie Lanshun, and Yang Zhi. 2019. SeerNet: Predicting convolutional neural network feature-map sparsity through low-bit quantization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1121611225.Google ScholarGoogle ScholarCross RefCross Ref
  17. [17] Cavigelli Lukas and Benini Luca. 2020. CBinfer: Exploiting frame-to-frame locality for faster convolutional network inference on video streams. IEEE Transactions on Circuits and Systems for Video Technology 30, 5 (May2020), 14511465. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  18. [18] Cavigelli Lukas, Degen Philippe, and Benini Luca. 2017. CBinfer: Change-based inference for convolutional neural networks on video data. In Proceedings of the 11th International Conference on Distributed Smart Cameras (ICDSC 2017). Association for Computing Machinery, New York, NY, USA, 18. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. [19] Chen Jinting, Zhu Zhaocheng, Li Cheng, and Zhao Yuming. 2019. Self-adaptive network pruning. In Neural Information Processing (Lecture Notes in Computer Science), Gedeon Tom, Wong Kok Wai, and Lee Minho (Eds.). Springer International Publishing, Cham, 175186. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. [20] Chen Jou-An, Niu Wei, Ren Bin, Wang Yanzhi, and Shen Xipeng. 2023. Survey: Exploiting data redundancy for optimization of deep learning. Comput. Surveys 55, 10 (Feb.2023), 212:1–212:38. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. [21] Chen Tiffany Yu-Han, Ravindranath Lenin, Deng Shuo, Bahl Paramvir, and Balakrishnan Hari. 2015. Glimpse: Continuous, real-time object recognition on mobile devices. In Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems (SenSys’15). Association for Computing Machinery, New York, NY, USA, 155168. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. [22] Chen Yinpeng, Dai Xiyang, Liu Mengchen, Chen Dongdong, Yuan Lu, and Liu Zicheng. 2020. Dynamic ReLU. In Computer Vision – ECCV 2020 (Lecture Notes in Computer Science), Vedaldi Andrea, Bischof Horst, Brox Thomas, and Frahm Jan-Michael (Eds.). Springer International Publishing, Cham, 351367. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. [23] Chen Zhourong, Li Yang, Bengio Samy, and Si Si. 2019. You look twice: GaterNet for dynamic filter selection in CNNs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 91729180.Google ScholarGoogle ScholarCross RefCross Ref
  24. [24] Cheng An-Chieh, Lin Chieh Hubert, Juan Da-Cheng, Wei Wei, and Sun Min. 2020. InstaNAS: Instance-aware neural architecture search. Proceedings of the AAAI Conference on Artificial Intelligence 34, 04 (April2020), 35773584. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  25. [25] Chiang Chang-Han, Liu Pangfeng, Wang Da-Wei, Hong Ding-Yong, and Wu Jan-Jan. 2021. Optimal branch location for cost-effective inference on Branchynet. In 2021 IEEE International Conference on Big Data (Big Data). 50715080. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  26. [26] Chierichetti Flavio, Kumar Ravi, and Vassilvitskii Sergei. 2009. Similarity caching. In Proceedings of the Twenty-Eighth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS’09). Association for Computing Machinery, New York, NY, USA, 127136. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. [27] Cordonnier Jean-Baptiste, Mahendran Aravindh, Dosovitskiy Alexey, Weissenborn Dirk, Uszkoreit Jakob, and Unterthiner Thomas. 2021. Differentiable patch selection for image recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 23512360.Google ScholarGoogle ScholarCross RefCross Ref
  28. [28] Cox Bart, Birke Robert, and Chen Lydia Y.. 2022. Memory-aware and context-aware multi-DNN inference on the edge. Pervasive and Mobile Computing 83 (July2022), 101594. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. [29] Cruz Yarens J., Rivas Marcelino, Quiza Ramón, Haber Rodolfo E., Castaño Fernando, and Villalonga Alberto. 2022. A two-step machine learning approach for dynamic model selection: A case study on a micro milling process. Computers in Industry 143 (Dec.2022), 103764. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. [30] Dehghani Mostafa, Gouws Stephan, Vinyals Oriol, Uszkoreit Jakob, and Kaiser Łukasz. 2019. Universal Transformers. DOI:arxiv:1807.03819 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  31. [31] Dong Xuanyi, Huang Junshi, Yang Yi, and Yan Shuicheng. 2017. More is less: A more complicated network with less inference complexity. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 58405848.Google ScholarGoogle ScholarCross RefCross Ref
  32. [32] Utsav Drolia, Katherine Guo, and Priya Narasimhan. 2017. Precog: Prefetching for image recognition applications at the edge. In Proceedings of the Second ACM/IEEE Symposium on Edge Computing (SEC’17). Association for Computing Machinery, New York, NY, USA, 1–13. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. [33] Drolia Utsav, Guo Katherine, Tan Jiaqi, Gandhi Rajeev, and Narasimhan Priya. 2017. Cachier: Edge-caching for recognition applications. In 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). 276286. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  34. [34] Bejnordi Ali Ehteshami and Krestel Ralf. 2020. Dynamic channel and layer gating in convolutional neural networks. In KI 2020: Advances in Artificial Intelligence (Lecture Notes in Computer Science), Schmid Ute, Klügl Franziska, and Wolter Diedrich (Eds.). Springer International Publishing, Cham, 3345. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. [35] Elbayad Maha, Gu Jiatao, Grave Edouard, and Auli Michael. 2020. Depth-Adaptive Transformer. DOI:arxiv:1910.10073 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  36. [36] Elkerdawy Sara, Elhoushi Mostafa, Zhang Hong, and Ray Nilanjan. 2022. Fire together wire together: A dynamic pruning approach with self-supervised mask prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1245412463.Google ScholarGoogle ScholarCross RefCross Ref
  37. [37] al. OpenAI et2023. GPT-4 Technical Report. Technical Report arXiv:2303.08774. DOI:arxiv:2303.08774 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  38. [38] Falchi Fabrizio, Lucchese Claudio, Orlando Salvatore, Perego Raffaele, and Rabitti Fausto. 2008. A metric cache for similarity search. In Proceedings of the 2008 ACM Workshop on Large-Scale Distributed Systems for Information Retrieval (LSDS-IR’08). Association for Computing Machinery, New York, NY, USA, 4350. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. [39] Falchi Fabrizio, Lucchese Claudio, Orlando Salvatore, Perego Raffaele, and Rabitti Fausto. 2009. Caching content-based queries for robust and efficient image retrieval. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology (EDBT’09). Association for Computing Machinery, New York, NY, USA, 780790. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. [40] Falchi Fabrizio, Lucchese Claudio, Orlando Salvatore, Perego Raffaele, and Rabitti Fausto. 2012. Similarity caching in large-scale image retrieval. Information Processing & Management 48, 5 (Sept.2012), 803818. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. [41] Fan H., Xu Z., Zhu L., Yan C., Ge J., and Yang Y.. 2018. Watching a Small Portion Could Be as Good as Watching All: Towards Efficient Video Classification.Google ScholarGoogle Scholar
  42. [42] Fang Biyi, Zeng Xiao, Zhang Faen, Xu Hui, and Zhang Mi. 2020. FlexDNN: Input-adaptive on-device deep learning for efficient mobile vision. In 2020 IEEE/ACM Symposium on Edge Computing (SEC). 8495. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  43. [43] Fang Yihao, Shalmani Shervin Manzuri, and Zheng Rong. 2020. CacheNet: A Model Caching Framework for Deep Learning Inference on the Edge. DOI:arxiv:2007.01793 [cs, eess].Google ScholarGoogle ScholarCross RefCross Ref
  44. [44] Fayyaz Mohsen, Koohpayegani Soroush Abbasi, Jafari Farnoush Rezaei, Sengupta Sunando, Joze Hamid Reza Vaezi, Sommerlade Eric, Pirsiavash Hamed, and Gall Jürgen. 2022. Adaptive token sampling for efficient vision transformers. In Computer Vision – ECCV 2022 (Lecture Notes in Computer Science), Avidan Shai, Brostow Gabriel, Cissé Moustapha, Farinella Giovanni Maria, and Hassner Tal (Eds.). Springer Nature Switzerland, Cham, 396414. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. [45] Figurnov Michael, Collins Maxwell D., Zhu Yukun, Zhang Li, Huang Jonathan, Vetrov Dmitry, and Salakhutdinov Ruslan. 2017. Spatially adaptive computation time for residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 10391048.Google ScholarGoogle ScholarCross RefCross Ref
  46. [46] Finamore Alessandro, Roberts James, Gallo Massimo, and Rossi Dario. 2022. Accelerating deep learning classification with error-controlled approximate-key caching. In IEEE INFOCOM 2022 - IEEE Conference on Computer Communications. 21182127. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. [47] Fu Jianlong, Zheng Heliang, and Mei Tao. 2017. Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 44384446.Google ScholarGoogle ScholarCross RefCross Ref
  48. [48] Fu Tsu-Jui and Ma Wei-Yun. 2018. Speed reading: Learning to read forbackward via shuttle. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium, 44394448. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  49. [49] Gao Xitong, Zhao Yiren, Dudziak Łukasz, Mullins Robert, and Xu Cheng-zhong. 2019. Dynamic Channel Pruning: Feature Boosting and Suppression. DOI:arxiv:1810.05331 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  50. [50] Ghanathe Nikhil P. and Wilton Steve. 2022. T-RECX: Tiny-Resource Efficient Convolutional Neural Networks with Early-Exit. DOI:arxiv:2207.06613 [cs, eess].Google ScholarGoogle ScholarCross RefCross Ref
  51. [51] Ghodrati Amir, Bejnordi Babak Ehteshami, and Habibian Amirhossein. 2021. FrameExit: Conditional early exiting for efficient video recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1560815618.Google ScholarGoogle ScholarCross RefCross Ref
  52. [52] Gilman Guin R., Ogden Samuel S., Walls Robert J., and Guo Tian. 2019. Challenges and opportunities of DNN model execution caching. In Proceedings of the Workshop on Distributed Infrastructures for Deep Learning (DIDL’19). Association for Computing Machinery, New York, NY, USA, 712. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. [53] Gong Chao, Lin Fuhong, Gong Xiaowen, and Lu Yueming. 2020. Intelligent cooperative edge computing in internet of things. IEEE Internet of Things Journal 7, 10 (Oct.2020), 93729382. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  54. [54] Gong Hongyu, Li Xian, and Genzel Dmitriy. 2022. Adaptive Sparse Transformer for Multilingual Translation. DOI:arxiv:2104.07358 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  55. [55] Graves Alex. 2017. Adaptive Computation Time for Recurrent Neural Networks. DOI:arxiv:1603.08983 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  56. [56] Guo Peizhen and Hu Wenjun. 2018. Potluck: Cross-application approximate deduplication for computation-intensive mobile applications. In Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’18). Association for Computing Machinery, New York, NY, USA, 271284. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. [57] Guo Peizhen, Li Rui, Hu Bo, and Hu Wenjun. 2018. FoggyCache: Cross-device approximate computation reuse. Living on the Edge (2018), 16.Google ScholarGoogle Scholar
  58. [58] Guo Qiushan, Yu Zhipeng, Wu Yichao, Liang Ding, Qin Haoyu, and Yan Junjie. 2019. Dynamic recursive neural network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 51475156.Google ScholarGoogle ScholarCross RefCross Ref
  59. [59] Guo Yunhui. 2018. A Survey on Methods and Theories of Quantized Neural Networks. DOI:arxiv:1808.04752 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  60. [60] Hadifar Amir, Deleu Johannes, Develder Chris, and Demeester Thomas. 2021. Exploration of block-wise dynamic sparseness. Pattern Recognition Letters 151 (Nov.2021), 187192. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. [61] Han Seungyeop, Shen Haichen, Philipose Matthai, Agarwal Sharad, Wolman Alec, and Krishnamurthy Arvind. 2016. MCDNN: An approximation-based execution framework for deep stream processing under resource constraints. In Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys’16). Association for Computing Machinery, New York, NY, USA, 123136. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. [62] Han Yizeng, Huang Gao, Song Shiji, Yang Le, Wang Honghui, and Wang Yulin. 2021. Dynamic neural networks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence (2021), 11. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  63. [63] Hansen Christian, Hansen Casper, Alstrup Stephen, Simonsen Jakob Grue, and Lioma Christina. 2019. Neural Speed Reading with Structural-Jump-LSTM. DOI:arxiv:1904.00761 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  64. [64] Haque Mirazul, Chauhan Anki, Liu Cong, and Yang Wei. 2020. ILFO: Adversarial attack on adaptive neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1426414273.Google ScholarGoogle ScholarCross RefCross Ref
  65. [65] Hazimeh Hussein, Ponomareva Natalia, Mol Petros, Tan Zhenyu, and Mazumder Rahul. 2020. The tree ensemble layer: Differentiability meets conditional computation. In Proceedings of the 37th International Conference on Machine Learning. PMLR, 41384148.Google ScholarGoogle Scholar
  66. [66] Herrmann Charles, Bowen Richard Strong, and Zabih Ramin. 2020. Channel selection using Gumbel Softmax. In Computer Vision – ECCV 2020 (Lecture Notes in Computer Science), Vedaldi Andrea, Bischof Horst, Brox Thomas, and Frahm Jan-Michael (Eds.). Springer International Publishing, Cham, 241257. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  67. [67] Hong Sanghyun, Kaya Yiğitcan, Modoranu Ionuţ-Vlad, and Dumitraş Tudor. 2021. A Panda? No, It’s a Sloth: Slowdown Attacks on Adaptive Multi-Exit Neural Network Inference. DOI:arxiv:2010.02432 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  68. [68] Hou Lu, Huang Zhiqi, Shang Lifeng, Jiang Xin, Chen Xiao, and Liu Qun. 2020. DynaBERT: Dynamic BERT with adaptive width and depth. In Advances in Neural Information Processing Systems, Vol. 33. Curran Associates, Inc., 97829793.Google ScholarGoogle Scholar
  69. [69] Hsieh Kevin, Ananthanarayanan Ganesh, Bodik Peter, Venkataraman Shivaram, Bahl Paramvir, Philipose Matthai, Gibbons Phillip B., and Mutlu Onur. 2018. Focus: Querying large video datasets with low latency and low cost. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). 269286.Google ScholarGoogle Scholar
  70. [70] Hu Hanzhang, Dey Debadeepta, Hebert Martial, and Bagnell J. Andrew. 2018. Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing. DOI:arxiv:1708.06832 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  71. [71] Hu Ting-Kuei, Chen Tianlong, Wang Haotao, and Wang Zhangyang. 2020. Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference. DOI:arxiv:2002.10025 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  72. [72] Hu Zilong, Tang Jinshan, Wang Ziming, Zhang Kai, Zhang Ling, and Sun Qingling. 2018. Deep learning for image-based cancer detection and diagnosis - a survey. Pattern Recognition 83 (Nov.2018), 134149. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  73. [73] Hua Weizhe, Zhou Yuan, Sa Christopher De, Zhang Zhiru, and Suh G. Edward. 2019. Boosting the performance of CNN accelerators with dynamic fine-grained channel gating. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’52). Association for Computing Machinery, New York, NY, USA, 139150. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  74. [74] Hua Weizhe, Zhou Yuan, Sa Christopher M. De, Zhang Zhiru, and Suh G. Edward. 2019. Channel gating neural networks. In Advances in Neural Information Processing Systems, Vol. 32. Curran Associates, Inc.Google ScholarGoogle Scholar
  75. [75] Huang Gao, Chen Danlu, Li Tianhong, Wu Felix, Maaten Laurens van der, and Weinberger Kilian Q.. 2018. Multi-Scale Dense Networks for Resource Efficient Image Classification. DOI:arxiv:1703.09844 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  76. [76] Huang Gao, Wang Yulin, Lv Kangchen, Jiang Haojun, Huang Wenhui, Qi Pengfei, and Song Shiji. 2022. Glance and Focus Networks for Dynamic Visual Recognition. DOI:arxiv:2201.03014 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  77. [77] Huang Zhengjie, Ye Zi, Li Shuangyin, and Pan Rong. 2017. Length adaptive recurrent model for text classification. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (CIKM’17). Association for Computing Machinery, New York, NY, USA, 10191027. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. [78] Huynh Loc N., Lee Youngki, and Balan Rajesh Krishna. 2017. DeepMon: Mobile GPU-based deep learning framework for continuous vision applications. In Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys’17). Association for Computing Machinery, New York, NY, USA, 8295. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  79. [79] Ioannou Yani, Robertson Duncan, Zikic Darko, Kontschieder Peter, Shotton Jamie, Brown Matthew, and Criminisi Antonio. 2016. Decision Forests, Convolutional Networks and the Models in-Between. DOI:arxiv:1603.01250 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  80. [80] Jain Samvit, Zhang Xun, Zhou Yuhao, Ananthanarayanan Ganesh, Jiang Junchen, Shu Yuanchao, Bahl Paramvir, and Gonzalez Joseph. 2020. Spatula: Efficient cross-camera video analytics on large camera networks. In 2020 IEEE/ACM Symposium on Edge Computing (SEC). 110124. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  81. [81] Jain Samvit, Zhang Xun, Zhou Yuhao, Ananthanarayanan Ganesh, Jiang Junchen, Shu Yuanchao, and Gonzalez Joseph. 2019. ReXCam: Resource-Efficient, Cross-Camera Video Analytics at Scale. DOI:arxiv:1811.01268 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  82. [82] Jernite Yacine, Grave Edouard, Joulin Armand, and Mikolov Tomas. 2017. Variable Computation in Recurrent Neural Networks. DOI:arxiv:1611.06188 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  83. [83] Jiang Junchen, Ananthanarayanan Ganesh, Bodik Peter, Sen Siddhartha, and Stoica Ion. 2018. Chameleon: Scalable adaptation of video analytics. In Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication (SIGCOMM’18). Association for Computing Machinery, New York, NY, USA, 253266. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  84. [84] Jiang Zutao, Li Changlin, Chang Xiaojun, Zhu Jihua, and Yang Yi. 2021. Dynamic Slimmable Denoising Network. DOI:arxiv:2110.08940 [cs, eess].Google ScholarGoogle ScholarCross RefCross Ref
  85. [85] Jin Qing, Yang Linjie, and Liao Zhenyu. 2020. AdaBits: Neural network quantization with adaptive bit-widths. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 21462156.Google ScholarGoogle ScholarCross RefCross Ref
  86. [86] Kang Daniel, Emmons John, Abuzaid Firas, Bailis Peter, and Zaharia Matei. 2017. NoScope: Optimizing Neural Network Queries over Video at Scale. DOI:arxiv:1703.02529 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  87. [87] Kaya Yigitcan, Hong Sanghyun, and Dumitras Tudor. 2019. Shallow-deep networks: Understanding and mitigating network overthinking. In Proceedings of the 36th International Conference on Machine Learning. PMLR, 33013310.Google ScholarGoogle Scholar
  88. [88] Kim Gyuwan and Cho Kyunghyun. 2021. Length-Adaptive Transformer: Train Once with Length Drop, Use Anytime with Search. DOI:arxiv:2010.07003 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  89. [89] Kirillov Alexander, Wu Yuxin, He Kaiming, and Girshick Ross. 2020. PointRend: Image segmentation as rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 97999808.Google ScholarGoogle ScholarCross RefCross Ref
  90. [90] Kong Shu and Fowlkes Charless. 2019. Pixel-wise attentional gating for scene parsing. In 2019 IEEE Winter Conference on Applications of Computer Vision (WACV). 10241033. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  91. [91] Kontschieder Peter, Fiterau Madalina, Criminisi Antonio, and Bulo Samuel Rota. 2015. Deep neural decision forests. In Proceedings of the IEEE International Conference on Computer Vision. 14671475.Google ScholarGoogle ScholarDigital LibraryDigital Library
  92. [92] Kouris Alexandros, Venieris Stylianos I., Laskaridis Stefanos, and Lane Nicholas D.. 2022. Multi-Exit Semantic Segmentation Networks. arxiv:2106.03527 [cs].Google ScholarGoogle Scholar
  93. [93] Krishna Tarun, Rai Ayush K., Djilali Yasser A. D., Smeaton Alan F., McGuinness Kevin, and O’Connor Noel E.. 2022. Dynamic Channel Selection in Self-Supervised Learning. DOI:arxiv:2207.12065 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  94. [94] Kuen Jason, Kong Xiangfei, Lin Zhe, Wang Gang, Yin Jianxiong, See Simon, and Tan Yap-Peng. 2018. Stochastic downsampling for cost-adjustable inference and improved regularization in convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 79297938.Google ScholarGoogle ScholarCross RefCross Ref
  95. [95] Laskaridis Stefanos, Kouris Alexandros, and Lane Nicholas D.. 2021. Adaptive inference through early-exit networks: Design, challenges and directions. In Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning (EMDL’21). Association for Computing Machinery, New York, NY, USA, 16. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  96. [96] Laskaridis Stefanos, Venieris Stylianos I., Almeida Mario, Leontiadis Ilias, and Lane Nicholas D.. 2020. SPINN: Synergistic progressive inference of neural networks over device and cloud. In Proceedings of the 26th Annual International Conference on Mobile Computing and Networking (MobiCom’20). Association for Computing Machinery, New York, NY, USA, 115. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  97. [97] Lee Changsik, Hong Seungwoo, Hong Sungback, and Kim Taeyeon. 2020. Performance analysis of local exit for distributed deep neural networks over cloud and edge computing. ETRI Journal 42, 5 (2020), 658668. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  98. [98] Lee Hankook and Shin Jinwoo. 2018. Anytime Neural Prediction via Slicing Networks Vertically. DOI:arxiv:1807.02609 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  99. [99] Lee Royson, Venieris Stylianos I., Dudziak Lukasz, Bhattacharya Sourav, and Lane Nicholas D.. 2019. MobiSR: Efficient on-device super-resolution through heterogeneous mobile processors. In The 25th Annual International Conference on Mobile Computing and Networking (MobiCom’19). Association for Computing Machinery, New York, NY, USA, 116. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  100. [100] Leroux Sam, Bohez Steven, Boom Cedric De, Coninck Elias De, Verbelen Tim, Vankeirsbilck Bert, Simoens Pieter, and Dhoedt Bart. 2016. Lazy Evaluation of Convolutional Filters. DOI:arxiv:1605.08543 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  101. [101] Leroux Sam, Bohez Steven, Coninck Elias De, Verbelen Tim, Vankeirsbilck Bert, Simoens Pieter, and Dhoedt Bart. 2017. The cascading neural network: Building the internet of smart things. Knowledge and Information Systems 52, 3 (Sept.2017), 791814. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  102. [102] Leroux Sam, Molchanov Pavlo, Simoens Pieter, Dhoedt Bart, Breuel Thomas, and Kautz Jan. 2018. IamNN: Iterative and Adaptive Mobile Neural Network for Efficient Image Classification. DOI:arxiv:1804.10123 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  103. [103] Li Changlin, Wang Guangrun, Wang Bing, Liang Xiaodan, Li Zhihui, and Chang Xiaojun. 2021. DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers. DOI:arxiv:2109.10060 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  104. [104] Li Changlin, Wang Guangrun, Wang Bing, Liang Xiaodan, Li Zhihui, and Chang Xiaojun. 2021. Dynamic slimmable network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 86078617.Google ScholarGoogle ScholarCross RefCross Ref
  105. [105] Li Hengduo, Wu Zuxuan, Shrivastava Abhinav, and Davis Larry S.. 2021. 2D or not 2D? Adaptive 3D convolution selection for efficient video recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 61556164.Google ScholarGoogle ScholarCross RefCross Ref
  106. [106] Li Hao, Zhang Hong, Qi Xiaojuan, Yang Ruigang, and Huang Gao. 2019. Improved techniques for training adaptive deep networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 18911900.Google ScholarGoogle ScholarCross RefCross Ref
  107. [107] Li Liangzhi, Ota Kaoru, and Dong Mianxiong. 2018. Deep learning for smart industry: Efficient manufacture inspection system with fog computing. IEEE Transactions on Industrial Informatics 14, 10 (Oct.2018), 46654673. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  108. [108] Li Xiaoxiao, Liu Ziwei, Luo Ping, Loy Chen Change, and Tang Xiaoou. 2017. Not all pixels are equal: Difficulty-aware semantic segmentation via deep layer cascade. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 31933202.Google ScholarGoogle ScholarCross RefCross Ref
  109. [109] Li Yanwei, Song Lin, Chen Yukang, Li Zeming, Zhang Xiangyu, Wang Xingang, and Sun Jian. 2020. Learning dynamic routing for semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 85538562.Google ScholarGoogle ScholarCross RefCross Ref
  110. [110] Li Zhichao, Yang Yi, Liu Xiao, Zhou Feng, Wen Shilei, and Xu Wei. 2017. Dynamic computational time for visual attention. In Proceedings of the IEEE International Conference on Computer Vision Workshops. 11991209.Google ScholarGoogle ScholarCross RefCross Ref
  111. [111] LiKamWa Robert and Zhong Lin. 2015. Starfish: Efficient concurrency support for computer vision applications. In Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys’15). Association for Computing Machinery, New York, NY, USA, 213226. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  112. [112] Lin Ji, Rao Yongming, Lu Jiwen, and Zhou Jie. 2017. Runtime neural pruning. In Advances in Neural Information Processing Systems, Vol. 30. Curran Associates, Inc.Google ScholarGoogle Scholar
  113. [113] Lin Yingyan, Sakr Charbel, Kim Yongjune, and Shanbhag Naresh. 2017. PredictiveNet: An energy-efficient convolutional neural network via zero prediction. In 2017 IEEE International Symposium on Circuits and Systems (ISCAS). 14. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  114. [114] Liu Chuanjian, Wang Yunhe, Han Kai, Xu Chunjing, and Xu Chang. 2019. Learning Instance-wise Sparsity for Accelerating Deep Models. DOI:arxiv:1907.11840 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  115. [115] Liu Lanlan and Deng Jia. 2018. Dynamic deep neural networks: Optimizing accuracy-efficiency trade-offs by selective execution. Proceedings of the AAAI Conference on Artificial Intelligence 32, 1 (April2018). DOI:Google ScholarGoogle ScholarCross RefCross Ref
  116. [116] Liu Luyang, Li Hongyu, and Gruteser Marco. 2019. Edge assisted real-time object detection for mobile augmented reality. In The 25th Annual International Conference on Mobile Computing and Networking (MobiCom’19). Association for Computing Machinery, New York, NY, USA, 116. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  117. [117] Liu Miaomiao, Ding Xianzhong, and Du Wan. 2020. Continuous, real-time object detection on mobile devices without offloading. In 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS). 976986. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  118. [118] Liu Sicong, Lin Yingyan, Zhou Zimu, Nan Kaiming, Liu Hui, and Du Junzhao. 2018. On-demand deep model compression for mobile devices: A usage-driven model selection framework. In Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys’18). Association for Computing Machinery, New York, NY, USA, 389400. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  119. [119] Liu Weijie, Zhou Peng, Zhao Zhe, Wang Zhiruo, Deng Haotang, and Ju Qi. 2020. FastBERT: A Self-distilling BERT with Adaptive Inference Time. DOI:arxiv:2004.02178 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  120. [120] Liu Xianggen, Mou Lili, Cui Haotian, Lu Zhengdong, and Song Sen. 2020. Finding decision jumps in text classification. Neurocomputing 371 (Jan.2020), 177187. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  121. [121] Lo Chi, Su Yu-Yi, Lee Chun-Yi, and Chang Shih-Chieh. 2017. A dynamic deep neural network design for efficient workload allocation in edge computing. In 2017 IEEE International Conference on Computer Design (ICCD). 273280. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  122. [122] Lou Wei, Xun Lei, Sabet Amin, Bi Jia, Hare Jonathon, and Merrett Geoff V.. 2021. Dynamic-OFA: Runtime DNN architecture switching for performance scaling on heterogeneous embedded platforms. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 31103118.Google ScholarGoogle ScholarCross RefCross Ref
  123. [123] Lovagnini Luca, Zhang Wenxiao, Bijarbooneh Farshid Hassani, and Hui Pan. 2018. CIRCE: Real-time caching for instance recognition on cloud environments and multi-core architectures. In Proceedings of the 26th ACM International Conference on Multimedia (MM’18). Association for Computing Machinery, New York, NY, USA, 346354. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  124. [124] Mao Jiachen, Yang Qing, Li Ang, Nixon Kent W., Li Hai, and Chen Yiran. 2022. Toward efficient and adaptive design of video detection system with deep neural networks. ACM Transactions on Embedded Computing Systems 21, 3 (July2022), 33:1–33:21. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  125. [125] Marco Vicent Sanz, Taylor Ben, Wang Zheng, and Elkhatib Yehia. 2020. Optimizing deep learning inference on embedded systems through adaptive model selection. ACM Transactions on Embedded Computing Systems 19, 1 (Feb.2020), 2:1–2:28. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  126. [126] Matsubara Yoshitomo, Levorato Marco, and Restuccia Francesco. 2022. Split computing and early exiting for deep learning applications: Survey and research challenges. Comput. Surveys 55, 5 (Dec.2022), 90:1–90:30. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  127. [127] McGill Mason and Perona Pietro. 2017. Deciding how to decide: Dynamic routing in artificial neural networks. In Proceedings of the 34th International Conference on Machine Learning. PMLR, 23632372.Google ScholarGoogle Scholar
  128. [128] Meng Lingchen, Li Hengduo, Chen Bor-Chun, Lan Shiyi, Wu Zuxuan, Jiang Yu-Gang, and Lim Ser-Nam. 2022. AdaViT: Adaptive vision transformers for efficient image recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1230912318.Google ScholarGoogle ScholarCross RefCross Ref
  129. [129] Meng Yue, Lin Chung-Ching, Panda Rameswar, Sattigeri Prasanna, Karlinsky Leonid, Oliva Aude, Saenko Kate, and Feris Rogerio. 2020. AR-Net: Adaptive frame resolution for efficient action recognition. In Computer Vision – ECCV 2020 (Lecture Notes in Computer Science), Vedaldi Andrea, Bischof Horst, Brox Thomas, and Frahm Jan-Michael (Eds.). Springer International Publishing, Cham, 86104. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  130. [130] Meng Yue, Panda Rameswar, Lin Chung-Ching, Sattigeri Prasanna, Karlinsky Leonid, Saenko Kate, Oliva Aude, and Feris Rogerio. 2021. AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition. DOI:arxiv:2102.05775 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  131. [131] Mnih Volodymyr, Heess Nicolas, Graves Alex, and Kavukcuoglu Koray. 2014. Recurrent models of visual attention. In Advances in Neural Information Processing Systems, Vol. 27. Curran Associates, Inc.Google ScholarGoogle ScholarDigital LibraryDigital Library
  132. [132] Mullapudi Ravi Teja, Mark William R., Shazeer Noam, and Fatahalian Kayvon. 2018. HydraNets: Specialized dynamic architectures for efficient inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 80808089.Google ScholarGoogle Scholar
  133. [133] Nalaie Keivan, Xu Renjie, and Zheng Rong. 2022. DeepScale: Online frame size adaptation for multi-object tracking on smart cameras and edge servers. In 2022 IEEE/ACM Seventh International Conference on Internet-of-Things Design and Implementation (IoTDI). 6779. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  134. [134] Namuduri Srikanth, Narayanan Barath Narayanan, Davuluru Venkata Salini Priyamvada, Burton Lamar, and Bhansali Shekhar. 2020. Review—Deep learning methods for sensor based predictive maintenance and future perspectives for electrochemical sensors. Journal of The Electrochemical Society 167, 3 (Jan.2020), 037552. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  135. [135] Neumann Mark, Stenetorp Pontus, and Riedel Sebastian. 2016. Learning to Reason with Adaptive Computation. DOI:arxiv:1610.07647 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  136. [136] O’Connor Peter and Welling Max. 2016. Sigma Delta Quantized Networks. DOI:arxiv:1611.02024 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  137. [137] Odena Augustus, Lawson Dieterich, and Olah Christopher. 2017. Changing Model Behavior at Test-Time Using Reinforcement Learning. DOI:arxiv:1702.07780 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  138. [138] Ogden Samuel S. and Guo Tian. 2018. {MODI}: Mobile deep inference made efficient by edge computing. In USENIX Workshop on Hot Topics in Edge Computing (HotEdge 18).Google ScholarGoogle Scholar
  139. [139] Pan Bowen, Lin Wuwei, Fang Xiaolin, Huang Chaoqin, Zhou Bolei, and Lu Cewu. 2018. Recurrent residual module for fast inference in videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 15361545.Google ScholarGoogle ScholarCross RefCross Ref
  140. [140] Pan Bowen, Panda Rameswar, Fosco Camilo, Lin Chung-Ching, Andonian Alex, Meng Yue, Saenko Kate, Oliva Aude, and Feris Rogerio. 2021. VA-RED$\(\hat2\)$: Video Adaptive Redundancy Reduction. DOI:arxiv:2102.07887 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  141. [141] Pan Bowen, Panda Rameswar, Jiang Yifan, Wang Zhangyang, Feris Rogerio, and Oliva Aude. 2021. IA-RED2: Interpretability-aware redundancy reduction for vision transformers. In Advances in Neural Information Processing Systems, Vol. 34. Curran Associates, Inc., 2489824911.Google ScholarGoogle Scholar
  142. [142] Panda Priyadarshini, Ankit Aayush, Wijesinghe Parami, and Roy Kaushik. 2017. FALCON: Feature driven selective classification for energy-efficient image recognition. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 36, 12 (Dec.2017), 20172029. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  143. [143] Panda Priyadarshini, Sengupta Abhronil, and Roy Kaushik. 2016. Conditional deep learning for energy-efficient and enhanced pattern recognition. In 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE). 475480.Google ScholarGoogle ScholarDigital LibraryDigital Library
  144. [144] Panda Priyadarshini, Sengupta Abhronil, and Roy Kaushik. 2017. Energy-efficient and improved image recognition with conditional deep learning. ACM Journal on Emerging Technologies in Computing Systems 13, 3 (Feb.2017), 33:1–33:21. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  145. [145] Parger Mathias, Tang Chengcheng, Twigg Christopher D., Keskin Cem, Wang Robert, and Steinberger Markus. 2022. DeltaCNN: End-to-End CNN inference of sparse frame differences in videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1249712506.Google ScholarGoogle ScholarCross RefCross Ref
  146. [146] Park Eunhyeok, Kim Dongyoung, Kim Soobeom, Kim Yong-Deok, Kim Gunhee, Yoon Sungroh, and Yoo Sungjoo. 2015. Big/little deep neural network for ultra low power inference. In 2015 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). 124132. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  147. [147] Rao Yongming, Liu Zuyan, Zhao Wenliang, Zhou Jie, and Lu Jiwen. 2022. Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks. DOI:arxiv:2207.01580 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  148. [148] Rao Yongming, Lu Jiwen, and Zhou Jie. 2017. Attention-aware deep reinforcement learning for video face recognition. In Proceedings of the IEEE International Conference on Computer Vision. 39313940.Google ScholarGoogle ScholarCross RefCross Ref
  149. [149] Rao Yongming, Zhao Wenliang, Liu Benlin, Lu Jiwen, Zhou Jie, and Hsieh Cho-Jui. 2021. DynamicViT: Efficient vision transformers with dynamic token sparsification. In Advances in Neural Information Processing Systems, Vol. 34. Curran Associates, Inc., 1393713949.Google ScholarGoogle Scholar
  150. [150] Rashid Nafiul, Demirel Berken Utku, Odema Mohanad, and Faruque Mohammad Abdullah Al. 2022. Template matching based early exit CNN for energy-efficient myocardial infarction detection on low-power wearable devices. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6, 2 (July2022), 68:1–68:22. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  151. [151] Ren Mengye, Pokrovsky Andrei, Yang Bin, and Urtasun Raquel. 2018. SBNet: Sparse blocks network for fast inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 87118720.Google ScholarGoogle ScholarCross RefCross Ref
  152. [152] Rosenbaum Clemens, Klinger Tim, and Riemer Matthew. 2017. Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning. DOI:arxiv:1711.01239 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  153. [153] Bulo Samuel Rota and Kontschieder Peter. 2014. Neural decision forests for semantic image labelling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 8188.Google ScholarGoogle ScholarDigital LibraryDigital Library
  154. [154] Sabetsarvestani Mohammadamin, Hare Jonathon, Al-Hashimi Bashir, and Merrett Geoff. 2021. Similarity-aware CNN for efficient video recognition at the edge. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (Dec.2021). DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  155. [155] Sabih Muhammad, Hannig Frank, and Teich Jürgen. 2022. DyFiP: Explainable AI-based dynamic filter pruning of convolutional neural networks. In Proceedings of the 2nd European Workshop on Machine Learning and Systems (EuroMLSys’22). Association for Computing Machinery, New York, NY, USA, 109115. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  156. [156] Salem Tareq Si, Neglia Giovanni, and Carra Damiano. 2021. AÇAI: Ascent similarity caching with approximate indexes. In 2021 33rd International Teletraffic Congress (ITC-33). 19.Google ScholarGoogle Scholar
  157. [157] Scardapane Simone, Scarpiniti Michele, Baccarelli Enzo, and Uncini Aurelio. 2020. Why should we add early exits to neural networks? Cognitive Computation 12, 5 (Sept.2020), 954966. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  158. [158] Schmerge Jordan, Mawhirter Daniel, Holmes Connor, McClurg Jedidiah, and Wu Bo. 2021. ELI\(\chi\)R: Eliminating computation redundancy in CNN-based video processing. In 2021 IEEE/ACM Redefining Scalability for Diversely Heterogeneous Architectures Workshop (RSDHA). 3444. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  159. [159] Schuster Tal, Fisch Adam, Gupta Jai, Dehghani Mostafa, Bahri Dara, Tran Vinh, Tay Yi, and Metzler Donald. 2022. Confident adaptive language modeling. Advances in Neural Information Processing Systems 35 (Dec.2022), 1745617472.Google ScholarGoogle Scholar
  160. [160] Schwartz Roy, Stanovsky Gabriel, Swayamdipta Swabha, Dodge Jesse, and Smith Noah A.. 2020. The Right Tool for the Job: Matching Model and Instance Complexities. DOI:arxiv:2004.07453 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  161. [161] Seo Minjoon, Min Sewon, Farhadi Ali, and Hajishirzi Hannaneh. 2018. Neural Speed Reading via Skim-RNN. DOI:arxiv:1711.02085 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  162. [162] Shazeer Noam, Mirhoseini Azalia, Maziarz Krzysztof, Davis Andy, Le Quoc, Hinton Geoffrey, and Dean Jeff. 2017. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. DOI:arxiv:1701.06538 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  163. [163] Shen Jianghao, Wang Yue, Xu Pengfei, Fu Yonggan, Wang Zhangyang, and Lin Yingyan. 2020. Fractional skipping: Towards finer-grained dynamic CNN inference. Proceedings of the AAAI Conference on Artificial Intelligence 34, 04 (April2020), 57005708. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  164. [164] Shi Mengnan, Liu Chang, Ye Qixiang, and Jiao Jianbin. 2021. Feature-Gate Coupling for Dynamic Network Pruning. DOI:arxiv:2111.14302 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  165. [165] Simonovsky Martin and Komodakis Nikos. 2017. Dynamic edge-conditioned filters in convolutional neural networks on graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 36933702.Google ScholarGoogle ScholarCross RefCross Ref
  166. [166] Song Zhuoran, Wu Feiyang, Liu Xueyuan, Ke Jing, Jing Naifeng, and Liang Xiaoyao. 2020. VR-DANN: Real-time video recognition via decoder-assisted neural network acceleration. In 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). 698710. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  167. [167] Stamoulis Dimitrios, Chin Ting-Wu Rudy, Prakash Anand Krishnan, Fang Haocheng, Sajja Sribhuvan, Bognar Mitchell, and Marculescu Diana. 2018. Designing adaptive neural networks for energy-constrained image classification. In 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). IEEE Press, San Diego, CA, USA, 18. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  168. [168] Su Yu-Chuan and Grauman Kristen. 2016. Leaving some stones unturned: Dynamic feature prioritization for activity detection in streaming video. In Computer Vision – ECCV 2016 (Lecture Notes in Computer Science), Leibe Bastian, Matas Jiri, Sebe Nicu, and Welling Max (Eds.). Springer International Publishing, Cham, 783800. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  169. [169] Sukhbaatar Sainbayar, Grave Edouard, Bojanowski Piotr, and Joulin Armand. 2019. Adaptive Attention Span in Transformers. arxiv:1905.07799 [cs, stat].Google ScholarGoogle Scholar
  170. [170] Sun Ximeng, Panda Rameswar, Chen Chun-Fu (Richard), Oliva Aude, Feris Rogerio, and Saenko Kate. 2021. Dynamic network quantization for efficient video inference. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 73757385.Google ScholarGoogle ScholarCross RefCross Ref
  171. [171] Takhirov Zafar, Wang Joseph, Saligrama Venkatesh, and Joshi Ajay. 2016. Energy-efficient adaptive classifier design for mobile systems. In Proceedings of the 2016 International Symposium on Low Power Electronics and Design (ISLPED’16). Association for Computing Machinery, New York, NY, USA, 5257. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  172. [172] Tan Tianxiang and Cao Guohong. 2021. Efficient execution of deep neural networks on mobile devices with NPU. In Proceedings of the 20th International Conference on Information Processing in Sensor Networks (Co-Located with CPS-IoT Week 2021) (IPSN’21). Association for Computing Machinery, New York, NY, USA, 283298. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  173. [173] Tang Chen, Sun Wenyu, Wang Wenxun, and Liu Yongpan. 2022. Dynamic CNN accelerator supporting efficient filter generator with kernel enhancement and online channel pruning. In 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC). 436441. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  174. [174] Tang Chen, Zhai Haoyu, Ouyang Kai, Wang Zhi, Zhu Yifei, and Zhu Wenwu. 2022. Arbitrary Bit-width Network: A Joint Layer-Wise Quantization and Adaptive Inference Approach. DOI:arxiv:2204.09992 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  175. [175] Tang Yansong, Tian Yi, Lu Jiwen, Li Peiyang, and Zhou Jie. 2018. Deep progressive reinforcement learning for skeleton-based action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 53235332.Google ScholarGoogle ScholarCross RefCross Ref
  176. [176] Tann Hokchhay, Hashemi Soheil, Bahar R. Iris, and Reda Sherief. 2016. Runtime configurable deep neural networks for energy-accuracy trade-off. In 2016 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). 110.Google ScholarGoogle ScholarDigital LibraryDigital Library
  177. [177] Tanno Ryutaro, Arulkumaran Kai, Alexander Daniel, Criminisi Antonio, and Nori Aditya. 2019. Adaptive neural trees. In Proceedings of the 36th International Conference on Machine Learning. PMLR, 61666175.Google ScholarGoogle Scholar
  178. [178] Taylor Ben, Marco Vicent Sanz, Wolff Willy, Elkhatib Yehia, and Wang Zheng. 2018. Adaptive deep learning model selection on embedded systems. ACM SIGPLAN Notices 53, 6 (June2018), 3143. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  179. [179] Teerapittayanon Surat, McDanel Bradley, and Kung H. T.. 2016. BranchyNet: Fast inference via early exiting from deep neural networks. In 2016 23rd International Conference on Pattern Recognition (ICPR). 24642469. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  180. [180] Teerapittayanon Surat, McDanel Bradley, and Kung H. T.. 2017. Distributed deep neural networks over the cloud, the edge and end devices. In 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). 328339. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  181. [181] Vaudaux-Ruth Guillaume, Chan-Hon-Tong Adrien, and Achard Catherine. 2021. ActionSpotter: Deep reinforcement learning framework for temporal action spotting in videos. In 2020 25th International Conference on Pattern Recognition (ICPR). 631–638.
  182. [182] Veit Andreas and Belongie Serge. 2018. Convolutional networks with adaptive inference graphs. In Proceedings of the European Conference on Computer Vision (ECCV). 3–18.
  183. [183] Venugopal Srikumar, Gazzetti Michele, Gkoufas Yiannis, and Katrinis Kostas. 2018. Shadow puppets: Cloud-level accurate AI inference at the speed and economy of edge. In USENIX Workshop on Hot Topics in Edge Computing (HotEdge 18).
  184. [184] Verelst Thomas and Tuytelaars Tinne. 2020. Dynamic convolutions: Exploiting spatial sparsity for faster inference. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2320–2329.
  185. [185] Verelst Thomas and Tuytelaars Tinne. 2021. BlockCopy: High-resolution video processing with block-sparse feature propagation and online policies. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5158–5167.
  186. [186] Wang Huiyu, Kembhavi Aniruddha, Farhadi Ali, Yuille Alan L., and Rastegari Mohammad. 2019. ELASTIC: Improving CNNs with dynamic scaling policies. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2258–2267.
  187. [187] Wang Junjue, Feng Ziqiang, Chen Zhuo, George Shilpa, Bala Mihir, Pillai Padmanabhan, Yang Shao-Wen, and Satyanarayanan Mahadev. 2018. Bandwidth-efficient live video analytics for drones via edge computing. In 2018 IEEE/ACM Symposium on Edge Computing (SEC). 159–173.
  188. [188] Wang Limin, Xiong Yuanjun, Wang Zhe, Qiao Yu, Lin Dahua, Tang Xiaoou, and Van Gool Luc. 2016. Temporal segment networks: Towards good practices for deep action recognition. In Computer Vision – ECCV 2016 (Lecture Notes in Computer Science), Leibe Bastian, Matas Jiri, Sebe Nicu, and Welling Max (Eds.). Springer International Publishing, Cham, 20–36.
  189. [189] Wang Qilong, Wu Banggu, Zhu Pengfei, Li Peihua, Zuo Wangmeng, and Hu Qinghua. 2020. ECA-Net: Efficient channel attention for deep convolutional neural networks. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 11531–11539.
  190. [190] Wang Xin, Yu Fisher, Dou Zi-Yi, Darrell Trevor, and Gonzalez Joseph E.. 2018. SkipNet: Learning dynamic routing in convolutional networks. In Proceedings of the European Conference on Computer Vision (ECCV). 409–424.
  191. [191] Wang Xin, Yu Fisher, Dunlap Lisa, Ma Yi-An, Wang Ruth, Mirhoseini Azalia, Darrell Trevor, and Gonzalez Joseph E.. 2020. Deep mixture of experts via shallow embedding. In Proceedings of The 35th Uncertainty in Artificial Intelligence Conference. PMLR, 552–562.
  192. [192] Wang Yulin, Chen Zhaoxi, Jiang Haojun, Song Shiji, Han Yizeng, and Huang Gao. 2021. Adaptive focus for efficient video recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 16249–16258.
  193. [193] Wang Yulin, Huang Rui, Song Shiji, Huang Zeyi, and Huang Gao. 2021. Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition. arXiv:2105.15075 [cs].
  194. [194] Wang Yue, Shen Jianghao, Hu Ting-Kuei, Xu Pengfei, Nguyen Tan, Baraniuk Richard, Wang Zhangyang, and Lin Yingyan. 2020. Dual dynamic inference: Enabling more efficient, adaptive, and controllable deep inference. IEEE Journal of Selected Topics in Signal Processing 14, 4 (May 2020), 623–633.
  195. [195] Wu Wenhao, He Dongliang, Tan Xiao, Chen Shifeng, and Wen Shilei. 2019. Multi-agent reinforcement learning based frame sampling for effective untrimmed video recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 6222–6231.
  196. [196] Wu Wenhao, He Dongliang, Tan Xiao, Chen Shifeng, Yang Yi, and Wen Shilei. 2020. Dynamic inference: A new approach toward efficient video action recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 676–677.
  197. [197] Wu Zuxuan, Nagarajan Tushar, Kumar Abhishek, Rennie Steven, Davis Larry S., Grauman Kristen, and Feris Rogerio. 2018. BlockDrop: Dynamic inference paths in residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 8817–8826.
  198. [198] Wu Zuxuan, Xiong Caiming, Jiang Yu-Gang, and Davis Larry S.. 2019. LiteEval: A coarse-to-fine framework for resource efficient video recognition. In Advances in Neural Information Processing Systems, Vol. 32. Curran Associates, Inc.
  199. [199] Wu Zuxuan, Xiong Caiming, Ma Chih-Yao, Socher Richard, and Davis Larry S.. 2019. AdaFrame: Adaptive frame selection for fast video recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1278–1287.
  200. [200] Wu Zhaofeng, Zhao Ding, Liang Qiao, Yu Jiahui, Gulati Anmol, and Pang Ruoming. 2021. Dynamic sparsity neural networks for automatic speech recognition. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 6014–6018.
  201. [201] Xia Wenhan, Yin Hongxu, Dai Xiaoliang, and Jha Niraj K.. 2022. Fully dynamic inference with deep neural networks. IEEE Transactions on Emerging Topics in Computing 10, 2 (April 2022), 962–972.
  202. [202] Xie Zhenda, Zhang Zheng, Zhu Xizhou, Huang Gao, and Lin Stephen. 2020. Spatially adaptive inference with stochastic feature sampling and interpolation. In Computer Vision – ECCV 2020 (Lecture Notes in Computer Science), Vedaldi Andrea, Bischof Horst, Brox Thomas, and Frahm Jan-Michael (Eds.). Springer International Publishing, Cham, 531–548.
  203. [203] Xin Ji, Tang Raphael, Lee Jaejun, Yu Yaoliang, and Lin Jimmy. 2020. DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference. arXiv:2004.12993 [cs].
  204. [204] Xin Ji, Tang Raphael, Yu Yaoliang, and Lin Jimmy. 2021. BERxiT: Early exiting for BERT with better fine-tuning and extension to regression. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Merlo Paola, Tiedemann Jorg, and Tsarfaty Reut (Eds.). Association for Computational Linguistics, Online, 91–104.
  205. [205] Xu Dianlei, Li Tong, Li Yong, Su Xiang, Tarkoma Sasu, Jiang Tao, Crowcroft Jon, and Hui Pan. 2021. Edge intelligence: Empowering intelligence to the edge of network. Proc. IEEE 109, 11 (Nov. 2021), 1778–1837.
  206. [206] Xu Kelvin, Ba Jimmy, Kiros Ryan, Cho Kyunghyun, Courville Aaron, Salakhudinov Ruslan, Zemel Rich, and Bengio Yoshua. 2015. Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the 32nd International Conference on Machine Learning. PMLR, 2048–2057.
  207. [207] Xu Lanyu, Iyengar Arun, and Shi Weisong. 2020. CHA: A caching framework for home-based voice assistant systems. In 2020 IEEE/ACM Symposium on Edge Computing (SEC). 293–306.
  208. [208] Xu Mengwei, Liu Xuanzhe, Liu Yunxin, and Lin Felix Xiaozhu. 2017. Accelerating convolutional neural networks for continuous mobile vision via cache reuse. CoRR abs/1712.01670 (2017). arXiv:1712.01670. http://arxiv.org/abs/1712.01670
  209. [209] Xu Mengwei, Zhu Mengze, Liu Yunxin, Lin Felix Xiaozhu, and Liu Xuanzhe. 2018. DeepCache: Principled cache for mobile deep vision. In Proceedings of the 24th Annual International Conference on Mobile Computing and Networking (MobiCom’18). Association for Computing Machinery, New York, NY, USA, 129–144.
  210. [210] Xun Lei, Tran-Thanh Long, Al-Hashimi Bashir M., and Merrett Geoff V.. 2019. Incremental training and group convolution pruning for runtime DNN performance scaling on heterogeneous embedded platforms. In 2019 ACM/IEEE 1st Workshop on Machine Learning for CAD (MLCAD). 1–6.
  211. [211] Xun Lei, Tran-Thanh Long, Al-Hashimi Bashir M., and Merrett Geoff V.. 2020. Optimising resource management for embedded machine learning. In 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE). 1556–1561.
  212. [212] Yan Zhicheng, Zhang Hao, Piramuthu Robinson, Jagadeesh Vignesh, DeCoste Dennis, Di Wei, and Yu Yizhou. 2015. HD-CNN: Hierarchical deep convolutional neural networks for large scale visual recognition. In Proceedings of the IEEE International Conference on Computer Vision. 2740–2748.
  213. [213] Yang Kang, Xing Tianzhang, Liu Yang, Li Zhenjiang, Gong Xiaoqing, Chen Xiaojiang, and Fang Dingyi. 2019. cDeepArch: A compact deep neural network architecture for mobile sensing. IEEE/ACM Transactions on Networking 27, 5 (Oct. 2019), 2043–2055.
  214. [214] Yang Kichang, Yi Juheon, Lee Kyungjin, and Lee Youngki. 2022. FlexPatch: Fast and accurate object detection for on-device high-resolution live video analytics. In IEEE INFOCOM 2022 - IEEE Conference on Computer Communications. 1898–1907.
  215. [215] Yang Le, Han Yizeng, Chen Xi, Song Shiji, Dai Jifeng, and Huang Gao. 2020. Resolution adaptive networks for efficient inference. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2369–2378.
  216. [216] Yang Yu, Liu Di, Fang Hui, Huang Yi-Xiong, Sun Ying, and Zhang Zhi-Yuan. 2022. Once for all skip: Efficient adaptive deep neural networks. In 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE). 568–571.
  217. [217] Yang Zerui, Xu Yuhui, Dai Wenrui, and Xiong Hongkai. 2019. Dynamic-stride-net: Deep convolutional neural network with dynamic stride. In Optoelectronic Imaging and Multimedia Technology VI, Vol. 11187. SPIE, 42–53.
  218. [218] Yeung Serena, Russakovsky Olga, Mori Greg, and Fei-Fei Li. 2016. End-to-end learning of action detection from frame glimpses in videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2678–2687.
  219. [219] Yin Hongxu, Vahdat Arash, Alvarez Jose M., Mallya Arun, Kautz Jan, and Molchanov Pavlo. 2022. A-ViT: Adaptive tokens for efficient vision transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 10809–10818.
  220. [220] Yousefzadeh Amirreza and Sifalakis Manolis. 2022. Delta Activation Layer Exploits Temporal Sparsity for Efficient Embedded Video Processing.
  221. [221] Yu Adams Wei, Lee Hongrae, and Le Quoc V.. 2017. Learning to Skim Text. arXiv:1704.06877 [cs].
  222. [222] Yu Haichao, Li Haoxiang, Shi Humphrey, Huang Thomas S., and Hua Gang. 2021. Any-precision deep neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 35, 12 (May 2021), 10763–10771.
  223. [223] Yu Jiahui and Huang Thomas S.. 2019. Universally slimmable networks and improved training techniques. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 1803–1811.
  224. [224] Yu Jiahui, Yang Linjie, Xu Ning, Yang Jianchao, and Huang Thomas. 2018. Slimmable Neural Networks. arXiv:1812.08928 [cs].
  225. [225] Yu Keyi, Liu Yang, Schwing Alexander G., and Peng Jian. 2022. Fast and accurate text classification: Skimming, rereading and early stopping. (Feb. 2022).
  226. [226] Yuan Kun, Li Quanquan, Guo Shaopeng, Chen Dapeng, Zhou Aojun, Yu Fengwei, and Liu Ziwei. 2021. Differentiable dynamic wirings for neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 327–336.
  227. [227] Yuan Zhihang, Wu Bingzhe, Sun Guangyu, Liang Zheng, Zhao Shiwan, and Bi Weichen. 2020. S2DNAS: Transforming static CNN model for dynamic inference via neural architecture search. In Computer Vision – ECCV 2020 (Lecture Notes in Computer Science), Vedaldi Andrea, Bischof Horst, Brox Thomas, and Frahm Jan-Michael (Eds.). Springer International Publishing, Cham, 175–192.
  228. [228] Zeng Liekang, Li En, Zhou Zhi, and Chen Xu. 2019. Boomerang: On-demand cooperative deep neural network inference for edge intelligence on the industrial internet of things. IEEE Network 33, 5 (Sept. 2019), 96–103.
  229. [229] Zhang Chen, Cao Qiang, Jiang Hong, Zhang Wenhui, Li Jingjun, and Yao Jie. 2018. FFS-VA: A fast filtering system for large-scale video analytics. In Proceedings of the 47th International Conference on Parallel Processing (ICPP 2018). Association for Computing Machinery, New York, NY, USA, 1–10.
  230. [230] Zhang Chen, Cao Qiang, Jiang Hong, Zhang Wenhui, Li Jingjun, and Yao Jie. 2020. A fast filtering mechanism to improve efficiency of large-scale video analytics. IEEE Trans. Comput. 69, 6 (June 2020), 914–928.
  231. [231] Zhang Jinrui, Zhang Deyu, Yang Huan, Liu Yunxin, Ren Ju, Xu Xiaohui, Jia Fucheng, and Zhang Yaoxue. 2022. MVPose: Realtime multi-person pose estimation using motion vector on mobile devices. IEEE Transactions on Mobile Computing (2022), 1–1.
  232. [232] Zhang Linfeng, Tan Zhanhong, Song Jiebo, Chen Jingwei, Bao Chenglong, and Ma Kaisheng. 2019. SCAN: A scalable neural networks framework towards compact and efficient models. In Advances in Neural Information Processing Systems, Vol. 32. Curran Associates, Inc.
  233. [233] Zhang Pei, Liang Tailin, Glossner John, Wang Lei, Shi Shaobo, and Zhang Xiaotong. 2021. Dynamic runtime feature map pruning. In Pattern Recognition and Computer Vision (Lecture Notes in Computer Science), Ma Huimin, Wang Liang, Zhang Changshui, Wu Fei, Tan Tieniu, Wang Yaonan, Lai Jianhuang, and Zhao Yao (Eds.). Springer International Publishing, Cham, 411–422.
  234. [234] Zhang Wuyang, He Zhezhi, Liu Luyang, Jia Zhenhua, Liu Yunxin, Gruteser Marco, Raychaudhuri Dipankar, and Zhang Yanyong. 2021. Elf: Accelerate high-resolution mobile deep vision with content-aware parallel offloading. In Proceedings of the 27th Annual International Conference on Mobile Computing and Networking (MobiCom’21). Association for Computing Machinery, New York, NY, USA, 201–214.
  235. [235] Zhang Yu, Liu Dajiang, and Xing Yongkang. 2021. Dynamic convolution pruning using pooling characteristic in convolution neural networks. In Neural Information Processing (Communications in Computer and Information Science), Mantoro Teddy, Lee Minho, Ayu Media Anugerah, Wong Kok Wai, and Hidayanto Achmad Nizar (Eds.). Springer International Publishing, Cham, 558–565.
  236. [236] Zheng Yin-Dong, Liu Zhaoyang, Lu Tong, and Wang Limin. 2020. Dynamic sampling networks for efficient action recognition in videos. IEEE Transactions on Image Processing 29 (2020), 7970–7983.
  237. [237] Zhou Wangchunshu, Xu Canwen, Ge Tao, McAuley Julian, Xu Ke, and Wei Furu. 2020. BERT loses patience: Fast and robust inference with early exit. In Advances in Neural Information Processing Systems, Vol. 33. Curran Associates, Inc., 18330–18341.
  238. [238] Zolfaghari Mohammadreza, Singh Kamaljeet, and Brox Thomas. 2018. ECO: Efficient convolutional network for online video understanding. In Proceedings of the European Conference on Computer Vision (ECCV). 695–712.
  239. [239] Get Your Footage. 2021. Hands Up Waving Hello Green Screen Effect | Gesture Say Hi Chroma Key in HD 4K.
  240. [240] PCV. 2022. Vehicle Detection Dataset. https://universe.roboflow.com/pcv-wndzh/vehicle-detection-bq16s. Visited on 2024-04-09.