
Adapting Neural Networks at Runtime: Current Trends in At-Runtime Optimizations for Deep Learning

Published: 14 May 2024

Abstract

Adaptive optimization methods for deep learning adjust the inference task to the current circumstances at runtime, reducing its resource footprint while maintaining the model’s performance. These methods are essential for the widespread adoption of deep learning because they lower the cost of inference while exploiting information about the current environment that only becomes available at runtime. This survey covers state-of-the-art at-runtime optimization methods, provides guidance for choosing the most suitable method for a specific use case, and highlights open research gaps in the field.
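To make the idea of adjusting the inference task at runtime concrete, the sketch below illustrates one common strategy from this family, confidence-based early exiting: a cheap auxiliary classifier attached partway through the network lets "easy" inputs stop early, so the remaining layers are executed only when the input demands them. This is a minimal illustrative example, not code from any surveyed system; the PyTorch framework, the class name EarlyExitNet, the layer sizes, and the 0.9 confidence threshold are all hypothetical choices made for the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EarlyExitNet(nn.Module):
    """Toy classifier with one auxiliary (early) exit and one final exit (illustrative only)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())
        self.exit1 = nn.Linear(128, num_classes)   # cheap auxiliary classifier
        self.stage2 = nn.Sequential(nn.Linear(128, 128), nn.ReLU())
        self.exit2 = nn.Linear(128, num_classes)   # full-cost final classifier

    @torch.no_grad()
    def adaptive_forward(self, x: torch.Tensor, threshold: float = 0.9):
        """Return a prediction from the first exit whose softmax confidence clears `threshold`."""
        h = self.stage1(x)
        conf, pred = F.softmax(self.exit1(h), dim=-1).max(dim=-1)
        if conf.item() >= threshold:                # "easy" input: skip the remaining layers
            return pred, "exit1"
        h = self.stage2(h)
        return self.exit2(h).argmax(dim=-1), "exit2"


model = EarlyExitNet().eval()
sample = torch.randn(1, 1, 28, 28)                  # one MNIST-sized input
prediction, exit_taken = model.adaptive_forward(sample)
print(prediction.item(), exit_taken)
```

The runtime decision here is the confidence check: by raising or lowering the threshold, the same trained model trades accuracy against latency and energy per input, which is the kind of adjustment the surveyed methods automate in more sophisticated ways.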

REFERENCES

  1. [1] Akhlaghi Vahideh, Yazdanbakhsh Amir, Samadi Kambiz, Gupta Rajesh K., and Esmaeilzadeh Hadi. 2018. SnaPEA: Predictive early activation for reducing computation in deep convolutional neural networks. In 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA). 662673. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. [2] Almahairi Amjad, Ballas Nicolas, Cooijmans Tim, Zheng Yin, Larochelle Hugo, and Courville Aaron. 2016. Dynamic capacity networks. In Proceedings of The 33rd International Conference on Machine Learning. PMLR, 25492558.Google ScholarGoogle Scholar
  3. [3] Alwassel Humam, Heilbron Fabian Caba, and Ghanem Bernard. 2018. Action search: Spotting actions in videos and its application to temporal action localization. In Proceedings of the European Conference on Computer Vision (ECCV). 251266.Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. [4] Alwis Udari De and Alioto Massimo. 2021. TempDiff: Temporal difference-based feature map-level sparsity induction in CNNs with \(\lt\)4% memory overhead. In 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS). 14. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  5. [5] Amthor Manuel, Rodner Erik, and Denzler Joachim. 2016. Impatient DNNs - Deep Neural Networks with Dynamic Time Budgets. DOI:arxiv:1610.02850 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  6. [6] Apicharttrisorn Kittipat, Ran Xukan, Chen Jiasi, Krishnamurthy Srikanth V., and Roy-Chowdhury Amit K.. 2019. Frugal following: Power thrifty object detection and tracking for mobile augmented reality. In Proceedings of the 17th Conference on Embedded Networked Sensor Systems (SenSys’19). Association for Computing Machinery, New York, NY, USA, 96109. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. [7] Bejnordi Babak Ehteshami, Blankevoort Tijmen, and Welling Max. 2020. Batch-Shaping for Learning Conditional Channel Gated Networks. DOI:arxiv:1907.06627 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  8. [8] Bengio Emmanuel, Bacon Pierre-Luc, Pineau Joelle, and Precup Doina. 2016. Conditional Computation in Neural Networks for Faster Models. DOI:arxiv:1511.06297 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  9. [9] Bolukbasi Tolga, Wang Joseph, Dekel Ofer, and Saligrama Venkatesh. 2017. Adaptive neural networks for efficient inference. In Proceedings of the 34th International Conference on Machine Learning. PMLR, 527536.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. [10] Bolukbasi Tolga, Wang Joseph, Dekel Ofer, and Saligrama Venkatesh. 2017. Adaptive neural networks for fast test-time prediction. (Feb.2017).Google ScholarGoogle Scholar
  11. [11] Buckler Mark, Bedoukian Philip, Jayasuriya Suren, and Sampson Adrian. 2018. EVA\(^2\): Exploiting temporal redundancy in live computer vision. In 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA). 533546. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. [12] Busia Paola, Theodorakopoulos Ilias, Pothos Vasileios, Fragoulis Nikos, and Meloni Paolo. 2022. Dynamic pruning for parsimonious CNN inference on embedded systems. In Design and Architecture for Signal and Image Processing (Lecture Notes in Computer Science), Desnos Karol and Pertuz Sergio (Eds.). Springer International Publishing, Cham, 4556. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. [13] Cai Shaofeng, Shu Yao, and Wang Wei. 2021. Dynamic routing networks. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 35883597.Google ScholarGoogle ScholarCross RefCross Ref
  14. [14] Campos Victor, Jou Brendan, Giro-i-Nieto Xavier, Torres Jordi, and Chang Shih-Fu. 2018. Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks. DOI:arxiv:1708.06834 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  15. [15] Canel Christopher, Kim Thomas, Zhou Giulio, Li Conglong, Lim Hyeontaek, Andersen David G., Kaminsky Michael, and Dulloor Subramanya. 2019. Scaling video analytics on constrained edge nodes. Proceedings of Machine Learning and Systems 1 (April2019), 406417.Google ScholarGoogle Scholar
  16. [16] Cao Shijie, Ma Lingxiao, Xiao Wencong, Zhang Chen, Liu Yunxin, Zhang Lintao, Nie Lanshun, and Yang Zhi. 2019. SeerNet: Predicting convolutional neural network feature-map sparsity through low-bit quantization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1121611225.Google ScholarGoogle ScholarCross RefCross Ref
  17. [17] Cavigelli Lukas and Benini Luca. 2020. CBinfer: Exploiting frame-to-frame locality for faster convolutional network inference on video streams. IEEE Transactions on Circuits and Systems for Video Technology 30, 5 (May2020), 14511465. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  18. [18] Cavigelli Lukas, Degen Philippe, and Benini Luca. 2017. CBinfer: Change-based inference for convolutional neural networks on video data. In Proceedings of the 11th International Conference on Distributed Smart Cameras (ICDSC 2017). Association for Computing Machinery, New York, NY, USA, 18. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. [19] Chen Jinting, Zhu Zhaocheng, Li Cheng, and Zhao Yuming. 2019. Self-adaptive network pruning. In Neural Information Processing (Lecture Notes in Computer Science), Gedeon Tom, Wong Kok Wai, and Lee Minho (Eds.). Springer International Publishing, Cham, 175186. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. [20] Chen Jou-An, Niu Wei, Ren Bin, Wang Yanzhi, and Shen Xipeng. 2023. Survey: Exploiting data redundancy for optimization of deep learning. Comput. Surveys 55, 10 (Feb.2023), 212:1–212:38. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. [21] Chen Tiffany Yu-Han, Ravindranath Lenin, Deng Shuo, Bahl Paramvir, and Balakrishnan Hari. 2015. Glimpse: Continuous, real-time object recognition on mobile devices. In Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems (SenSys’15). Association for Computing Machinery, New York, NY, USA, 155168. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. [22] Chen Yinpeng, Dai Xiyang, Liu Mengchen, Chen Dongdong, Yuan Lu, and Liu Zicheng. 2020. Dynamic ReLU. In Computer Vision – ECCV 2020 (Lecture Notes in Computer Science), Vedaldi Andrea, Bischof Horst, Brox Thomas, and Frahm Jan-Michael (Eds.). Springer International Publishing, Cham, 351367. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. [23] Chen Zhourong, Li Yang, Bengio Samy, and Si Si. 2019. You look twice: GaterNet for dynamic filter selection in CNNs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 91729180.Google ScholarGoogle ScholarCross RefCross Ref
  24. [24] Cheng An-Chieh, Lin Chieh Hubert, Juan Da-Cheng, Wei Wei, and Sun Min. 2020. InstaNAS: Instance-aware neural architecture search. Proceedings of the AAAI Conference on Artificial Intelligence 34, 04 (April2020), 35773584. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  25. [25] Chiang Chang-Han, Liu Pangfeng, Wang Da-Wei, Hong Ding-Yong, and Wu Jan-Jan. 2021. Optimal branch location for cost-effective inference on Branchynet. In 2021 IEEE International Conference on Big Data (Big Data). 50715080. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  26. [26] Chierichetti Flavio, Kumar Ravi, and Vassilvitskii Sergei. 2009. Similarity caching. In Proceedings of the Twenty-Eighth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS’09). Association for Computing Machinery, New York, NY, USA, 127136. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. [27] Cordonnier Jean-Baptiste, Mahendran Aravindh, Dosovitskiy Alexey, Weissenborn Dirk, Uszkoreit Jakob, and Unterthiner Thomas. 2021. Differentiable patch selection for image recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 23512360.Google ScholarGoogle ScholarCross RefCross Ref
  28. [28] Cox Bart, Birke Robert, and Chen Lydia Y.. 2022. Memory-aware and context-aware multi-DNN inference on the edge. Pervasive and Mobile Computing 83 (July2022), 101594. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. [29] Cruz Yarens J., Rivas Marcelino, Quiza Ramón, Haber Rodolfo E., Castaño Fernando, and Villalonga Alberto. 2022. A two-step machine learning approach for dynamic model selection: A case study on a micro milling process. Computers in Industry 143 (Dec.2022), 103764. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. [30] Dehghani Mostafa, Gouws Stephan, Vinyals Oriol, Uszkoreit Jakob, and Kaiser Łukasz. 2019. Universal Transformers. DOI:arxiv:1807.03819 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  31. [31] Dong Xuanyi, Huang Junshi, Yang Yi, and Yan Shuicheng. 2017. More is less: A more complicated network with less inference complexity. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 58405848.Google ScholarGoogle ScholarCross RefCross Ref
  32. [32] Utsav Drolia, Katherine Guo, and Priya Narasimhan. 2017. Precog: Prefetching for image recognition applications at the edge. In Proceedings of the Second ACM/IEEE Symposium on Edge Computing (SEC’17). Association for Computing Machinery, New York, NY, USA, 1–13. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. [33] Drolia Utsav, Guo Katherine, Tan Jiaqi, Gandhi Rajeev, and Narasimhan Priya. 2017. Cachier: Edge-caching for recognition applications. In 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). 276286. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  34. [34] Bejnordi Ali Ehteshami and Krestel Ralf. 2020. Dynamic channel and layer gating in convolutional neural networks. In KI 2020: Advances in Artificial Intelligence (Lecture Notes in Computer Science), Schmid Ute, Klügl Franziska, and Wolter Diedrich (Eds.). Springer International Publishing, Cham, 3345. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. [35] Elbayad Maha, Gu Jiatao, Grave Edouard, and Auli Michael. 2020. Depth-Adaptive Transformer. DOI:arxiv:1910.10073 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  36. [36] Elkerdawy Sara, Elhoushi Mostafa, Zhang Hong, and Ray Nilanjan. 2022. Fire together wire together: A dynamic pruning approach with self-supervised mask prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1245412463.Google ScholarGoogle ScholarCross RefCross Ref
  37. [37] al. OpenAI et2023. GPT-4 Technical Report. Technical Report arXiv:2303.08774. DOI:arxiv:2303.08774 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  38. [38] Falchi Fabrizio, Lucchese Claudio, Orlando Salvatore, Perego Raffaele, and Rabitti Fausto. 2008. A metric cache for similarity search. In Proceedings of the 2008 ACM Workshop on Large-Scale Distributed Systems for Information Retrieval (LSDS-IR’08). Association for Computing Machinery, New York, NY, USA, 4350. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. [39] Falchi Fabrizio, Lucchese Claudio, Orlando Salvatore, Perego Raffaele, and Rabitti Fausto. 2009. Caching content-based queries for robust and efficient image retrieval. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology (EDBT’09). Association for Computing Machinery, New York, NY, USA, 780790. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. [40] Falchi Fabrizio, Lucchese Claudio, Orlando Salvatore, Perego Raffaele, and Rabitti Fausto. 2012. Similarity caching in large-scale image retrieval. Information Processing & Management 48, 5 (Sept.2012), 803818. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. [41] Fan H., Xu Z., Zhu L., Yan C., Ge J., and Yang Y.. 2018. Watching a Small Portion Could Be as Good as Watching All: Towards Efficient Video Classification.Google ScholarGoogle Scholar
  42. [42] Fang Biyi, Zeng Xiao, Zhang Faen, Xu Hui, and Zhang Mi. 2020. FlexDNN: Input-adaptive on-device deep learning for efficient mobile vision. In 2020 IEEE/ACM Symposium on Edge Computing (SEC). 8495. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  43. [43] Fang Yihao, Shalmani Shervin Manzuri, and Zheng Rong. 2020. CacheNet: A Model Caching Framework for Deep Learning Inference on the Edge. DOI:arxiv:2007.01793 [cs, eess].Google ScholarGoogle ScholarCross RefCross Ref
  44. [44] Fayyaz Mohsen, Koohpayegani Soroush Abbasi, Jafari Farnoush Rezaei, Sengupta Sunando, Joze Hamid Reza Vaezi, Sommerlade Eric, Pirsiavash Hamed, and Gall Jürgen. 2022. Adaptive token sampling for efficient vision transformers. In Computer Vision – ECCV 2022 (Lecture Notes in Computer Science), Avidan Shai, Brostow Gabriel, Cissé Moustapha, Farinella Giovanni Maria, and Hassner Tal (Eds.). Springer Nature Switzerland, Cham, 396414. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. [45] Figurnov Michael, Collins Maxwell D., Zhu Yukun, Zhang Li, Huang Jonathan, Vetrov Dmitry, and Salakhutdinov Ruslan. 2017. Spatially adaptive computation time for residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 10391048.Google ScholarGoogle ScholarCross RefCross Ref
  46. [46] Finamore Alessandro, Roberts James, Gallo Massimo, and Rossi Dario. 2022. Accelerating deep learning classification with error-controlled approximate-key caching. In IEEE INFOCOM 2022 - IEEE Conference on Computer Communications. 21182127. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  47. [47] Fu Jianlong, Zheng Heliang, and Mei Tao. 2017. Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 44384446.Google ScholarGoogle ScholarCross RefCross Ref
  48. [48] Fu Tsu-Jui and Ma Wei-Yun. 2018. Speed reading: Learning to read forbackward via shuttle. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium, 44394448. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  49. [49] Gao Xitong, Zhao Yiren, Dudziak Łukasz, Mullins Robert, and Xu Cheng-zhong. 2019. Dynamic Channel Pruning: Feature Boosting and Suppression. DOI:arxiv:1810.05331 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  50. [50] Ghanathe Nikhil P. and Wilton Steve. 2022. T-RECX: Tiny-Resource Efficient Convolutional Neural Networks with Early-Exit. DOI:arxiv:2207.06613 [cs, eess].Google ScholarGoogle ScholarCross RefCross Ref
  51. [51] Ghodrati Amir, Bejnordi Babak Ehteshami, and Habibian Amirhossein. 2021. FrameExit: Conditional early exiting for efficient video recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1560815618.Google ScholarGoogle ScholarCross RefCross Ref
  52. [52] Gilman Guin R., Ogden Samuel S., Walls Robert J., and Guo Tian. 2019. Challenges and opportunities of DNN model execution caching. In Proceedings of the Workshop on Distributed Infrastructures for Deep Learning (DIDL’19). Association for Computing Machinery, New York, NY, USA, 712. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  53. [53] Gong Chao, Lin Fuhong, Gong Xiaowen, and Lu Yueming. 2020. Intelligent cooperative edge computing in internet of things. IEEE Internet of Things Journal 7, 10 (Oct.2020), 93729382. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  54. [54] Gong Hongyu, Li Xian, and Genzel Dmitriy. 2022. Adaptive Sparse Transformer for Multilingual Translation. DOI:arxiv:2104.07358 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  55. [55] Graves Alex. 2017. Adaptive Computation Time for Recurrent Neural Networks. DOI:arxiv:1603.08983 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  56. [56] Guo Peizhen and Hu Wenjun. 2018. Potluck: Cross-application approximate deduplication for computation-intensive mobile applications. In Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’18). Association for Computing Machinery, New York, NY, USA, 271284. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. [57] Guo Peizhen, Li Rui, Hu Bo, and Hu Wenjun. 2018. FoggyCache: Cross-device approximate computation reuse. Living on the Edge (2018), 16.Google ScholarGoogle Scholar
  58. [58] Guo Qiushan, Yu Zhipeng, Wu Yichao, Liang Ding, Qin Haoyu, and Yan Junjie. 2019. Dynamic recursive neural network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 51475156.Google ScholarGoogle ScholarCross RefCross Ref
  59. [59] Guo Yunhui. 2018. A Survey on Methods and Theories of Quantized Neural Networks. DOI:arxiv:1808.04752 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  60. [60] Hadifar Amir, Deleu Johannes, Develder Chris, and Demeester Thomas. 2021. Exploration of block-wise dynamic sparseness. Pattern Recognition Letters 151 (Nov.2021), 187192. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. [61] Han Seungyeop, Shen Haichen, Philipose Matthai, Agarwal Sharad, Wolman Alec, and Krishnamurthy Arvind. 2016. MCDNN: An approximation-based execution framework for deep stream processing under resource constraints. In Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys’16). Association for Computing Machinery, New York, NY, USA, 123136. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. [62] Han Yizeng, Huang Gao, Song Shiji, Yang Le, Wang Honghui, and Wang Yulin. 2021. Dynamic neural networks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence (2021), 11. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  63. [63] Hansen Christian, Hansen Casper, Alstrup Stephen, Simonsen Jakob Grue, and Lioma Christina. 2019. Neural Speed Reading with Structural-Jump-LSTM. DOI:arxiv:1904.00761 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  64. [64] Haque Mirazul, Chauhan Anki, Liu Cong, and Yang Wei. 2020. ILFO: Adversarial attack on adaptive neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1426414273.Google ScholarGoogle ScholarCross RefCross Ref
  65. [65] Hazimeh Hussein, Ponomareva Natalia, Mol Petros, Tan Zhenyu, and Mazumder Rahul. 2020. The tree ensemble layer: Differentiability meets conditional computation. In Proceedings of the 37th International Conference on Machine Learning. PMLR, 41384148.Google ScholarGoogle Scholar
  66. [66] Herrmann Charles, Bowen Richard Strong, and Zabih Ramin. 2020. Channel selection using Gumbel Softmax. In Computer Vision – ECCV 2020 (Lecture Notes in Computer Science), Vedaldi Andrea, Bischof Horst, Brox Thomas, and Frahm Jan-Michael (Eds.). Springer International Publishing, Cham, 241257. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  67. [67] Hong Sanghyun, Kaya Yiğitcan, Modoranu Ionuţ-Vlad, and Dumitraş Tudor. 2021. A Panda? No, It’s a Sloth: Slowdown Attacks on Adaptive Multi-Exit Neural Network Inference. DOI:arxiv:2010.02432 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  68. [68] Hou Lu, Huang Zhiqi, Shang Lifeng, Jiang Xin, Chen Xiao, and Liu Qun. 2020. DynaBERT: Dynamic BERT with adaptive width and depth. In Advances in Neural Information Processing Systems, Vol. 33. Curran Associates, Inc., 97829793.Google ScholarGoogle Scholar
  69. [69] Hsieh Kevin, Ananthanarayanan Ganesh, Bodik Peter, Venkataraman Shivaram, Bahl Paramvir, Philipose Matthai, Gibbons Phillip B., and Mutlu Onur. 2018. Focus: Querying large video datasets with low latency and low cost. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18). 269286.Google ScholarGoogle Scholar
  70. [70] Hu Hanzhang, Dey Debadeepta, Hebert Martial, and Bagnell J. Andrew. 2018. Learning Anytime Predictions in Neural Networks via Adaptive Loss Balancing. DOI:arxiv:1708.06832 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  71. [71] Hu Ting-Kuei, Chen Tianlong, Wang Haotao, and Wang Zhangyang. 2020. Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference. DOI:arxiv:2002.10025 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  72. [72] Hu Zilong, Tang Jinshan, Wang Ziming, Zhang Kai, Zhang Ling, and Sun Qingling. 2018. Deep learning for image-based cancer detection and diagnosis - a survey. Pattern Recognition 83 (Nov.2018), 134149. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  73. [73] Hua Weizhe, Zhou Yuan, Sa Christopher De, Zhang Zhiru, and Suh G. Edward. 2019. Boosting the performance of CNN accelerators with dynamic fine-grained channel gating. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’52). Association for Computing Machinery, New York, NY, USA, 139150. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  74. [74] Hua Weizhe, Zhou Yuan, Sa Christopher M. De, Zhang Zhiru, and Suh G. Edward. 2019. Channel gating neural networks. In Advances in Neural Information Processing Systems, Vol. 32. Curran Associates, Inc.Google ScholarGoogle Scholar
  75. [75] Huang Gao, Chen Danlu, Li Tianhong, Wu Felix, Maaten Laurens van der, and Weinberger Kilian Q.. 2018. Multi-Scale Dense Networks for Resource Efficient Image Classification. DOI:arxiv:1703.09844 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  76. [76] Huang Gao, Wang Yulin, Lv Kangchen, Jiang Haojun, Huang Wenhui, Qi Pengfei, and Song Shiji. 2022. Glance and Focus Networks for Dynamic Visual Recognition. DOI:arxiv:2201.03014 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  77. [77] Huang Zhengjie, Ye Zi, Li Shuangyin, and Pan Rong. 2017. Length adaptive recurrent model for text classification. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (CIKM’17). Association for Computing Machinery, New York, NY, USA, 10191027. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  78. [78] Huynh Loc N., Lee Youngki, and Balan Rajesh Krishna. 2017. DeepMon: Mobile GPU-based deep learning framework for continuous vision applications. In Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys’17). Association for Computing Machinery, New York, NY, USA, 8295. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  79. [79] Ioannou Yani, Robertson Duncan, Zikic Darko, Kontschieder Peter, Shotton Jamie, Brown Matthew, and Criminisi Antonio. 2016. Decision Forests, Convolutional Networks and the Models in-Between. DOI:arxiv:1603.01250 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  80. [80] Jain Samvit, Zhang Xun, Zhou Yuhao, Ananthanarayanan Ganesh, Jiang Junchen, Shu Yuanchao, Bahl Paramvir, and Gonzalez Joseph. 2020. Spatula: Efficient cross-camera video analytics on large camera networks. In 2020 IEEE/ACM Symposium on Edge Computing (SEC). 110124. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  81. [81] Jain Samvit, Zhang Xun, Zhou Yuhao, Ananthanarayanan Ganesh, Jiang Junchen, Shu Yuanchao, and Gonzalez Joseph. 2019. ReXCam: Resource-Efficient, Cross-Camera Video Analytics at Scale. DOI:arxiv:1811.01268 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  82. [82] Jernite Yacine, Grave Edouard, Joulin Armand, and Mikolov Tomas. 2017. Variable Computation in Recurrent Neural Networks. DOI:arxiv:1611.06188 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  83. [83] Jiang Junchen, Ananthanarayanan Ganesh, Bodik Peter, Sen Siddhartha, and Stoica Ion. 2018. Chameleon: Scalable adaptation of video analytics. In Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication (SIGCOMM’18). Association for Computing Machinery, New York, NY, USA, 253266. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  84. [84] Jiang Zutao, Li Changlin, Chang Xiaojun, Zhu Jihua, and Yang Yi. 2021. Dynamic Slimmable Denoising Network. DOI:arxiv:2110.08940 [cs, eess].Google ScholarGoogle ScholarCross RefCross Ref
  85. [85] Jin Qing, Yang Linjie, and Liao Zhenyu. 2020. AdaBits: Neural network quantization with adaptive bit-widths. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 21462156.Google ScholarGoogle ScholarCross RefCross Ref
  86. [86] Kang Daniel, Emmons John, Abuzaid Firas, Bailis Peter, and Zaharia Matei. 2017. NoScope: Optimizing Neural Network Queries over Video at Scale. DOI:arxiv:1703.02529 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  87. [87] Kaya Yigitcan, Hong Sanghyun, and Dumitras Tudor. 2019. Shallow-deep networks: Understanding and mitigating network overthinking. In Proceedings of the 36th International Conference on Machine Learning. PMLR, 33013310.Google ScholarGoogle Scholar
  88. [88] Kim Gyuwan and Cho Kyunghyun. 2021. Length-Adaptive Transformer: Train Once with Length Drop, Use Anytime with Search. DOI:arxiv:2010.07003 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  89. [89] Kirillov Alexander, Wu Yuxin, He Kaiming, and Girshick Ross. 2020. PointRend: Image segmentation as rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 97999808.Google ScholarGoogle ScholarCross RefCross Ref
  90. [90] Kong Shu and Fowlkes Charless. 2019. Pixel-wise attentional gating for scene parsing. In 2019 IEEE Winter Conference on Applications of Computer Vision (WACV). 10241033. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  91. [91] Kontschieder Peter, Fiterau Madalina, Criminisi Antonio, and Bulo Samuel Rota. 2015. Deep neural decision forests. In Proceedings of the IEEE International Conference on Computer Vision. 14671475.Google ScholarGoogle ScholarDigital LibraryDigital Library
  92. [92] Kouris Alexandros, Venieris Stylianos I., Laskaridis Stefanos, and Lane Nicholas D.. 2022. Multi-Exit Semantic Segmentation Networks. arxiv:2106.03527 [cs].Google ScholarGoogle Scholar
  93. [93] Krishna Tarun, Rai Ayush K., Djilali Yasser A. D., Smeaton Alan F., McGuinness Kevin, and O’Connor Noel E.. 2022. Dynamic Channel Selection in Self-Supervised Learning. DOI:arxiv:2207.12065 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  94. [94] Kuen Jason, Kong Xiangfei, Lin Zhe, Wang Gang, Yin Jianxiong, See Simon, and Tan Yap-Peng. 2018. Stochastic downsampling for cost-adjustable inference and improved regularization in convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 79297938.Google ScholarGoogle ScholarCross RefCross Ref
  95. [95] Laskaridis Stefanos, Kouris Alexandros, and Lane Nicholas D.. 2021. Adaptive inference through early-exit networks: Design, challenges and directions. In Proceedings of the 5th International Workshop on Embedded and Mobile Deep Learning (EMDL’21). Association for Computing Machinery, New York, NY, USA, 16. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  96. [96] Laskaridis Stefanos, Venieris Stylianos I., Almeida Mario, Leontiadis Ilias, and Lane Nicholas D.. 2020. SPINN: Synergistic progressive inference of neural networks over device and cloud. In Proceedings of the 26th Annual International Conference on Mobile Computing and Networking (MobiCom’20). Association for Computing Machinery, New York, NY, USA, 115. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  97. [97] Lee Changsik, Hong Seungwoo, Hong Sungback, and Kim Taeyeon. 2020. Performance analysis of local exit for distributed deep neural networks over cloud and edge computing. ETRI Journal 42, 5 (2020), 658668. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  98. [98] Lee Hankook and Shin Jinwoo. 2018. Anytime Neural Prediction via Slicing Networks Vertically. DOI:arxiv:1807.02609 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  99. [99] Lee Royson, Venieris Stylianos I., Dudziak Lukasz, Bhattacharya Sourav, and Lane Nicholas D.. 2019. MobiSR: Efficient on-device super-resolution through heterogeneous mobile processors. In The 25th Annual International Conference on Mobile Computing and Networking (MobiCom’19). Association for Computing Machinery, New York, NY, USA, 116. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  100. [100] Leroux Sam, Bohez Steven, Boom Cedric De, Coninck Elias De, Verbelen Tim, Vankeirsbilck Bert, Simoens Pieter, and Dhoedt Bart. 2016. Lazy Evaluation of Convolutional Filters. DOI:arxiv:1605.08543 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  101. [101] Leroux Sam, Bohez Steven, Coninck Elias De, Verbelen Tim, Vankeirsbilck Bert, Simoens Pieter, and Dhoedt Bart. 2017. The cascading neural network: Building the internet of smart things. Knowledge and Information Systems 52, 3 (Sept.2017), 791814. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  102. [102] Leroux Sam, Molchanov Pavlo, Simoens Pieter, Dhoedt Bart, Breuel Thomas, and Kautz Jan. 2018. IamNN: Iterative and Adaptive Mobile Neural Network for Efficient Image Classification. DOI:arxiv:1804.10123 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  103. [103] Li Changlin, Wang Guangrun, Wang Bing, Liang Xiaodan, Li Zhihui, and Chang Xiaojun. 2021. DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers. DOI:arxiv:2109.10060 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  104. [104] Li Changlin, Wang Guangrun, Wang Bing, Liang Xiaodan, Li Zhihui, and Chang Xiaojun. 2021. Dynamic slimmable network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 86078617.Google ScholarGoogle ScholarCross RefCross Ref
  105. [105] Li Hengduo, Wu Zuxuan, Shrivastava Abhinav, and Davis Larry S.. 2021. 2D or not 2D? Adaptive 3D convolution selection for efficient video recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 61556164.Google ScholarGoogle ScholarCross RefCross Ref
  106. [106] Li Hao, Zhang Hong, Qi Xiaojuan, Yang Ruigang, and Huang Gao. 2019. Improved techniques for training adaptive deep networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 18911900.Google ScholarGoogle ScholarCross RefCross Ref
  107. [107] Li Liangzhi, Ota Kaoru, and Dong Mianxiong. 2018. Deep learning for smart industry: Efficient manufacture inspection system with fog computing. IEEE Transactions on Industrial Informatics 14, 10 (Oct.2018), 46654673. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  108. [108] Li Xiaoxiao, Liu Ziwei, Luo Ping, Loy Chen Change, and Tang Xiaoou. 2017. Not all pixels are equal: Difficulty-aware semantic segmentation via deep layer cascade. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 31933202.Google ScholarGoogle ScholarCross RefCross Ref
  109. [109] Li Yanwei, Song Lin, Chen Yukang, Li Zeming, Zhang Xiangyu, Wang Xingang, and Sun Jian. 2020. Learning dynamic routing for semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 85538562.Google ScholarGoogle ScholarCross RefCross Ref
  110. [110] Li Zhichao, Yang Yi, Liu Xiao, Zhou Feng, Wen Shilei, and Xu Wei. 2017. Dynamic computational time for visual attention. In Proceedings of the IEEE International Conference on Computer Vision Workshops. 11991209.Google ScholarGoogle ScholarCross RefCross Ref
  111. [111] LiKamWa Robert and Zhong Lin. 2015. Starfish: Efficient concurrency support for computer vision applications. In Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys’15). Association for Computing Machinery, New York, NY, USA, 213226. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  112. [112] Lin Ji, Rao Yongming, Lu Jiwen, and Zhou Jie. 2017. Runtime neural pruning. In Advances in Neural Information Processing Systems, Vol. 30. Curran Associates, Inc.Google ScholarGoogle Scholar
  113. [113] Lin Yingyan, Sakr Charbel, Kim Yongjune, and Shanbhag Naresh. 2017. PredictiveNet: An energy-efficient convolutional neural network via zero prediction. In 2017 IEEE International Symposium on Circuits and Systems (ISCAS). 14. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  114. [114] Liu Chuanjian, Wang Yunhe, Han Kai, Xu Chunjing, and Xu Chang. 2019. Learning Instance-wise Sparsity for Accelerating Deep Models. DOI:arxiv:1907.11840 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  115. [115] Liu Lanlan and Deng Jia. 2018. Dynamic deep neural networks: Optimizing accuracy-efficiency trade-offs by selective execution. Proceedings of the AAAI Conference on Artificial Intelligence 32, 1 (April2018). DOI:Google ScholarGoogle ScholarCross RefCross Ref
  116. [116] Liu Luyang, Li Hongyu, and Gruteser Marco. 2019. Edge assisted real-time object detection for mobile augmented reality. In The 25th Annual International Conference on Mobile Computing and Networking (MobiCom’19). Association for Computing Machinery, New York, NY, USA, 116. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  117. [117] Liu Miaomiao, Ding Xianzhong, and Du Wan. 2020. Continuous, real-time object detection on mobile devices without offloading. In 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS). 976986. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  118. [118] Liu Sicong, Lin Yingyan, Zhou Zimu, Nan Kaiming, Liu Hui, and Du Junzhao. 2018. On-demand deep model compression for mobile devices: A usage-driven model selection framework. In Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys’18). Association for Computing Machinery, New York, NY, USA, 389400. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  119. [119] Liu Weijie, Zhou Peng, Zhao Zhe, Wang Zhiruo, Deng Haotang, and Ju Qi. 2020. FastBERT: A Self-distilling BERT with Adaptive Inference Time. DOI:arxiv:2004.02178 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  120. [120] Liu Xianggen, Mou Lili, Cui Haotian, Lu Zhengdong, and Song Sen. 2020. Finding decision jumps in text classification. Neurocomputing 371 (Jan.2020), 177187. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  121. [121] Lo Chi, Su Yu-Yi, Lee Chun-Yi, and Chang Shih-Chieh. 2017. A dynamic deep neural network design for efficient workload allocation in edge computing. In 2017 IEEE International Conference on Computer Design (ICCD). 273280. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  122. [122] Lou Wei, Xun Lei, Sabet Amin, Bi Jia, Hare Jonathon, and Merrett Geoff V.. 2021. Dynamic-OFA: Runtime DNN architecture switching for performance scaling on heterogeneous embedded platforms. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 31103118.Google ScholarGoogle ScholarCross RefCross Ref
  123. [123] Lovagnini Luca, Zhang Wenxiao, Bijarbooneh Farshid Hassani, and Hui Pan. 2018. CIRCE: Real-time caching for instance recognition on cloud environments and multi-core architectures. In Proceedings of the 26th ACM International Conference on Multimedia (MM’18). Association for Computing Machinery, New York, NY, USA, 346354. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  124. [124] Mao Jiachen, Yang Qing, Li Ang, Nixon Kent W., Li Hai, and Chen Yiran. 2022. Toward efficient and adaptive design of video detection system with deep neural networks. ACM Transactions on Embedded Computing Systems 21, 3 (July2022), 33:1–33:21. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  125. [125] Marco Vicent Sanz, Taylor Ben, Wang Zheng, and Elkhatib Yehia. 2020. Optimizing deep learning inference on embedded systems through adaptive model selection. ACM Transactions on Embedded Computing Systems 19, 1 (Feb.2020), 2:1–2:28. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  126. [126] Matsubara Yoshitomo, Levorato Marco, and Restuccia Francesco. 2022. Split computing and early exiting for deep learning applications: Survey and research challenges. Comput. Surveys 55, 5 (Dec.2022), 90:1–90:30. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  127. [127] McGill Mason and Perona Pietro. 2017. Deciding how to decide: Dynamic routing in artificial neural networks. In Proceedings of the 34th International Conference on Machine Learning. PMLR, 23632372.Google ScholarGoogle Scholar
  128. [128] Meng Lingchen, Li Hengduo, Chen Bor-Chun, Lan Shiyi, Wu Zuxuan, Jiang Yu-Gang, and Lim Ser-Nam. 2022. AdaViT: Adaptive vision transformers for efficient image recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1230912318.Google ScholarGoogle ScholarCross RefCross Ref
  129. [129] Meng Yue, Lin Chung-Ching, Panda Rameswar, Sattigeri Prasanna, Karlinsky Leonid, Oliva Aude, Saenko Kate, and Feris Rogerio. 2020. AR-Net: Adaptive frame resolution for efficient action recognition. In Computer Vision – ECCV 2020 (Lecture Notes in Computer Science), Vedaldi Andrea, Bischof Horst, Brox Thomas, and Frahm Jan-Michael (Eds.). Springer International Publishing, Cham, 86104. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  130. [130] Meng Yue, Panda Rameswar, Lin Chung-Ching, Sattigeri Prasanna, Karlinsky Leonid, Saenko Kate, Oliva Aude, and Feris Rogerio. 2021. AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition. DOI:arxiv:2102.05775 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  131. [131] Mnih Volodymyr, Heess Nicolas, Graves Alex, and Kavukcuoglu Koray. 2014. Recurrent models of visual attention. In Advances in Neural Information Processing Systems, Vol. 27. Curran Associates, Inc.Google ScholarGoogle ScholarDigital LibraryDigital Library
  132. [132] Mullapudi Ravi Teja, Mark William R., Shazeer Noam, and Fatahalian Kayvon. 2018. HydraNets: Specialized dynamic architectures for efficient inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 80808089.Google ScholarGoogle Scholar
  133. [133] Nalaie Keivan, Xu Renjie, and Zheng Rong. 2022. DeepScale: Online frame size adaptation for multi-object tracking on smart cameras and edge servers. In 2022 IEEE/ACM Seventh International Conference on Internet-of-Things Design and Implementation (IoTDI). 6779. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  134. [134] Namuduri Srikanth, Narayanan Barath Narayanan, Davuluru Venkata Salini Priyamvada, Burton Lamar, and Bhansali Shekhar. 2020. Review—Deep learning methods for sensor based predictive maintenance and future perspectives for electrochemical sensors. Journal of The Electrochemical Society 167, 3 (Jan.2020), 037552. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  135. [135] Neumann Mark, Stenetorp Pontus, and Riedel Sebastian. 2016. Learning to Reason with Adaptive Computation. DOI:arxiv:1610.07647 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  136. [136] O’Connor Peter and Welling Max. 2016. Sigma Delta Quantized Networks. DOI:arxiv:1611.02024 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  137. [137] Odena Augustus, Lawson Dieterich, and Olah Christopher. 2017. Changing Model Behavior at Test-Time Using Reinforcement Learning. DOI:arxiv:1702.07780 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  138. [138] Ogden Samuel S. and Guo Tian. 2018. {MODI}: Mobile deep inference made efficient by edge computing. In USENIX Workshop on Hot Topics in Edge Computing (HotEdge 18).Google ScholarGoogle Scholar
  139. [139] Pan Bowen, Lin Wuwei, Fang Xiaolin, Huang Chaoqin, Zhou Bolei, and Lu Cewu. 2018. Recurrent residual module for fast inference in videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 15361545.Google ScholarGoogle ScholarCross RefCross Ref
  140. [140] Pan Bowen, Panda Rameswar, Fosco Camilo, Lin Chung-Ching, Andonian Alex, Meng Yue, Saenko Kate, Oliva Aude, and Feris Rogerio. 2021. VA-RED$\(\hat2\)$: Video Adaptive Redundancy Reduction. DOI:arxiv:2102.07887 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  141. [141] Pan Bowen, Panda Rameswar, Jiang Yifan, Wang Zhangyang, Feris Rogerio, and Oliva Aude. 2021. IA-RED2: Interpretability-aware redundancy reduction for vision transformers. In Advances in Neural Information Processing Systems, Vol. 34. Curran Associates, Inc., 2489824911.Google ScholarGoogle Scholar
  142. [142] Panda Priyadarshini, Ankit Aayush, Wijesinghe Parami, and Roy Kaushik. 2017. FALCON: Feature driven selective classification for energy-efficient image recognition. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 36, 12 (Dec.2017), 20172029. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  143. [143] Panda Priyadarshini, Sengupta Abhronil, and Roy Kaushik. 2016. Conditional deep learning for energy-efficient and enhanced pattern recognition. In 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE). 475480.Google ScholarGoogle ScholarDigital LibraryDigital Library
  144. [144] Panda Priyadarshini, Sengupta Abhronil, and Roy Kaushik. 2017. Energy-efficient and improved image recognition with conditional deep learning. ACM Journal on Emerging Technologies in Computing Systems 13, 3 (Feb.2017), 33:1–33:21. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  145. [145] Parger Mathias, Tang Chengcheng, Twigg Christopher D., Keskin Cem, Wang Robert, and Steinberger Markus. 2022. DeltaCNN: End-to-End CNN inference of sparse frame differences in videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1249712506.Google ScholarGoogle ScholarCross RefCross Ref
  146. [146] Park Eunhyeok, Kim Dongyoung, Kim Soobeom, Kim Yong-Deok, Kim Gunhee, Yoon Sungroh, and Yoo Sungjoo. 2015. Big/little deep neural network for ultra low power inference. In 2015 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). 124132. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  147. [147] Rao Yongming, Liu Zuyan, Zhao Wenliang, Zhou Jie, and Lu Jiwen. 2022. Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks. DOI:arxiv:2207.01580 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  148. [148] Rao Yongming, Lu Jiwen, and Zhou Jie. 2017. Attention-aware deep reinforcement learning for video face recognition. In Proceedings of the IEEE International Conference on Computer Vision. 39313940.Google ScholarGoogle ScholarCross RefCross Ref
  149. [149] Rao Yongming, Zhao Wenliang, Liu Benlin, Lu Jiwen, Zhou Jie, and Hsieh Cho-Jui. 2021. DynamicViT: Efficient vision transformers with dynamic token sparsification. In Advances in Neural Information Processing Systems, Vol. 34. Curran Associates, Inc., 1393713949.Google ScholarGoogle Scholar
  150. [150] Rashid Nafiul, Demirel Berken Utku, Odema Mohanad, and Faruque Mohammad Abdullah Al. 2022. Template matching based early exit CNN for energy-efficient myocardial infarction detection on low-power wearable devices. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6, 2 (July2022), 68:1–68:22. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  151. [151] Ren Mengye, Pokrovsky Andrei, Yang Bin, and Urtasun Raquel. 2018. SBNet: Sparse blocks network for fast inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 87118720.Google ScholarGoogle ScholarCross RefCross Ref
  152. [152] Rosenbaum Clemens, Klinger Tim, and Riemer Matthew. 2017. Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning. DOI:arxiv:1711.01239 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  153. [153] Bulo Samuel Rota and Kontschieder Peter. 2014. Neural decision forests for semantic image labelling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 8188.Google ScholarGoogle ScholarDigital LibraryDigital Library
  154. [154] Sabetsarvestani Mohammadamin, Hare Jonathon, Al-Hashimi Bashir, and Merrett Geoff. 2021. Similarity-aware CNN for efficient video recognition at the edge. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (Dec.2021). DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  155. [155] Sabih Muhammad, Hannig Frank, and Teich Jürgen. 2022. DyFiP: Explainable AI-based dynamic filter pruning of convolutional neural networks. In Proceedings of the 2nd European Workshop on Machine Learning and Systems (EuroMLSys’22). Association for Computing Machinery, New York, NY, USA, 109115. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  156. [156] Salem Tareq Si, Neglia Giovanni, and Carra Damiano. 2021. AÇAI: Ascent similarity caching with approximate indexes. In 2021 33rd International Teletraffic Congress (ITC-33). 19.Google ScholarGoogle Scholar
  157. [157] Scardapane Simone, Scarpiniti Michele, Baccarelli Enzo, and Uncini Aurelio. 2020. Why should we add early exits to neural networks? Cognitive Computation 12, 5 (Sept.2020), 954966. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  158. [158] Schmerge Jordan, Mawhirter Daniel, Holmes Connor, McClurg Jedidiah, and Wu Bo. 2021. ELI\(\chi\)R: Eliminating computation redundancy in CNN-based video processing. In 2021 IEEE/ACM Redefining Scalability for Diversely Heterogeneous Architectures Workshop (RSDHA). 3444. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  159. [159] Schuster Tal, Fisch Adam, Gupta Jai, Dehghani Mostafa, Bahri Dara, Tran Vinh, Tay Yi, and Metzler Donald. 2022. Confident adaptive language modeling. Advances in Neural Information Processing Systems 35 (Dec.2022), 1745617472.Google ScholarGoogle Scholar
  160. [160] Schwartz Roy, Stanovsky Gabriel, Swayamdipta Swabha, Dodge Jesse, and Smith Noah A.. 2020. The Right Tool for the Job: Matching Model and Instance Complexities. DOI:arxiv:2004.07453 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  161. [161] Seo Minjoon, Min Sewon, Farhadi Ali, and Hajishirzi Hannaneh. 2018. Neural Speed Reading via Skim-RNN. DOI:arxiv:1711.02085 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  162. [162] Shazeer Noam, Mirhoseini Azalia, Maziarz Krzysztof, Davis Andy, Le Quoc, Hinton Geoffrey, and Dean Jeff. 2017. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. DOI:arxiv:1701.06538 [cs, stat].Google ScholarGoogle ScholarCross RefCross Ref
  163. [163] Shen Jianghao, Wang Yue, Xu Pengfei, Fu Yonggan, Wang Zhangyang, and Lin Yingyan. 2020. Fractional skipping: Towards finer-grained dynamic CNN inference. Proceedings of the AAAI Conference on Artificial Intelligence 34, 04 (April2020), 57005708. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  164. [164] Shi Mengnan, Liu Chang, Ye Qixiang, and Jiao Jianbin. 2021. Feature-Gate Coupling for Dynamic Network Pruning. DOI:arxiv:2111.14302 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  165. [165] Simonovsky Martin and Komodakis Nikos. 2017. Dynamic edge-conditioned filters in convolutional neural networks on graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 36933702.Google ScholarGoogle ScholarCross RefCross Ref
  166. [166] Song Zhuoran, Wu Feiyang, Liu Xueyuan, Ke Jing, Jing Naifeng, and Liang Xiaoyao. 2020. VR-DANN: Real-time video recognition via decoder-assisted neural network acceleration. In 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). 698710. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  167. [167] Stamoulis Dimitrios, Chin Ting-Wu Rudy, Prakash Anand Krishnan, Fang Haocheng, Sajja Sribhuvan, Bognar Mitchell, and Marculescu Diana. 2018. Designing adaptive neural networks for energy-constrained image classification. In 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). IEEE Press, San Diego, CA, USA, 18. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  168. [168] Su Yu-Chuan and Grauman Kristen. 2016. Leaving some stones unturned: Dynamic feature prioritization for activity detection in streaming video. In Computer Vision – ECCV 2016 (Lecture Notes in Computer Science), Leibe Bastian, Matas Jiri, Sebe Nicu, and Welling Max (Eds.). Springer International Publishing, Cham, 783800. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  169. [169] Sukhbaatar Sainbayar, Grave Edouard, Bojanowski Piotr, and Joulin Armand. 2019. Adaptive Attention Span in Transformers. arxiv:1905.07799 [cs, stat].Google ScholarGoogle Scholar
  170. [170] Sun Ximeng, Panda Rameswar, Chen Chun-Fu (Richard), Oliva Aude, Feris Rogerio, and Saenko Kate. 2021. Dynamic network quantization for efficient video inference. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 73757385.Google ScholarGoogle ScholarCross RefCross Ref
  171. [171] Takhirov Zafar, Wang Joseph, Saligrama Venkatesh, and Joshi Ajay. 2016. Energy-efficient adaptive classifier design for mobile systems. In Proceedings of the 2016 International Symposium on Low Power Electronics and Design (ISLPED’16). Association for Computing Machinery, New York, NY, USA, 5257. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  172. [172] Tan Tianxiang and Cao Guohong. 2021. Efficient execution of deep neural networks on mobile devices with NPU. In Proceedings of the 20th International Conference on Information Processing in Sensor Networks (Co-Located with CPS-IoT Week 2021) (IPSN’21). Association for Computing Machinery, New York, NY, USA, 283298. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  173. [173] Tang Chen, Sun Wenyu, Wang Wenxun, and Liu Yongpan. 2022. Dynamic CNN accelerator supporting efficient filter generator with kernel enhancement and online channel pruning. In 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC). 436441. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  174. [174] Tang Chen, Zhai Haoyu, Ouyang Kai, Wang Zhi, Zhu Yifei, and Zhu Wenwu. 2022. Arbitrary Bit-width Network: A Joint Layer-Wise Quantization and Adaptive Inference Approach. DOI:arxiv:2204.09992 [cs].Google ScholarGoogle ScholarCross RefCross Ref
  175. [175] Tang Yansong, Tian Yi, Lu Jiwen, Li Peiyang, and Zhou Jie. 2018. Deep progressive reinforcement learning for skeleton-based action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 53235332.Google ScholarGoogle ScholarCross RefCross Ref
  176. [176] Tann Hokchhay, Hashemi Soheil, Bahar R. Iris, and Reda Sherief. 2016. Runtime configurable deep neural networks for energy-accuracy trade-off. In 2016 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). 110.Google ScholarGoogle ScholarDigital LibraryDigital Library
  177. [177] Tanno Ryutaro, Arulkumaran Kai, Alexander Daniel, Criminisi Antonio, and Nori Aditya. 2019. Adaptive neural trees. In Proceedings of the 36th International Conference on Machine Learning. PMLR, 61666175.Google ScholarGoogle Scholar
  178. [178] Taylor Ben, Marco Vicent Sanz, Wolff Willy, Elkhatib Yehia, and Wang Zheng. 2018. Adaptive deep learning model selection on embedded systems. ACM SIGPLAN Notices 53, 6 (June2018), 3143. DOI:Google ScholarGoogle ScholarDigital LibraryDigital Library
  179. [179] Teerapittayanon Surat, McDanel Bradley, and Kung H. T.. 2016. BranchyNet: Fast inference via early exiting from deep neural networks. In 2016 23rd International Conference on Pattern Recognition (ICPR). 24642469. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  180. [180] Teerapittayanon Surat, McDanel Bradley, and Kung H. T.. 2017. Distributed deep neural networks over the cloud, the edge and end devices. In 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). 328339. DOI:Google ScholarGoogle ScholarCross RefCross Ref
  181. [181] Vaudaux-Ruth Guillaume, Chan-Hon-Tong Adrien, and Achard Catherine. 2021. ActionSpotter: Deep reinforcement learning framework for temporal action spotting in videos. In 2020 25th International Conference on Pattern Recognition (ICPR). 631–638.
  182. [182] Veit Andreas and Belongie Serge. 2018. Convolutional networks with adaptive inference graphs. In Proceedings of the European Conference on Computer Vision (ECCV). 3–18.
  183. [183] Venugopal Srikumar, Gazzetti Michele, Gkoufas Yiannis, and Katrinis Kostas. 2018. Shadow puppets: Cloud-level accurate AI inference at the speed and economy of edge. In USENIX Workshop on Hot Topics in Edge Computing (HotEdge 18).
  184. [184] Verelst Thomas and Tuytelaars Tinne. 2020. Dynamic convolutions: Exploiting spatial sparsity for faster inference. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2320–2329.
  185. [185] Verelst Thomas and Tuytelaars Tinne. 2021. BlockCopy: High-resolution video processing with block-sparse feature propagation and online policies. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 5158–5167.
  186. [186] Wang Huiyu, Kembhavi Aniruddha, Farhadi Ali, Yuille Alan L., and Rastegari Mohammad. 2019. ELASTIC: Improving CNNs with dynamic scaling policies. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2258–2267.
  187. [187] Wang Junjue, Feng Ziqiang, Chen Zhuo, George Shilpa, Bala Mihir, Pillai Padmanabhan, Yang Shao-Wen, and Satyanarayanan Mahadev. 2018. Bandwidth-efficient live video analytics for drones via edge computing. In 2018 IEEE/ACM Symposium on Edge Computing (SEC). 159–173.
  188. [188] Wang Limin, Xiong Yuanjun, Wang Zhe, Qiao Yu, Lin Dahua, Tang Xiaoou, and Van Gool Luc. 2016. Temporal segment networks: Towards good practices for deep action recognition. In Computer Vision – ECCV 2016 (Lecture Notes in Computer Science), Leibe Bastian, Matas Jiri, Sebe Nicu, and Welling Max (Eds.). Springer International Publishing, Cham, 20–36.
  189. [189] Wang Qilong, Wu Banggu, Zhu Pengfei, Li Peihua, Zuo Wangmeng, and Hu Qinghua. 2020. ECA-Net: Efficient channel attention for deep convolutional neural networks. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 11531–11539.
  190. [190] Wang Xin, Yu Fisher, Dou Zi-Yi, Darrell Trevor, and Gonzalez Joseph E.. 2018. SkipNet: Learning dynamic routing in convolutional networks. In Proceedings of the European Conference on Computer Vision (ECCV). 409–424.
  191. [191] Wang Xin, Yu Fisher, Dunlap Lisa, Ma Yi-An, Wang Ruth, Mirhoseini Azalia, Darrell Trevor, and Gonzalez Joseph E.. 2020. Deep mixture of experts via shallow embedding. In Proceedings of The 35th Uncertainty in Artificial Intelligence Conference. PMLR, 552–562.
  192. [192] Wang Yulin, Chen Zhaoxi, Jiang Haojun, Song Shiji, Han Yizeng, and Huang Gao. 2021. Adaptive focus for efficient video recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 16249–16258.
  193. [193] Wang Yulin, Huang Rui, Song Shiji, Huang Zeyi, and Huang Gao. 2021. Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition. arXiv:2105.15075 [cs].
  194. [194] Wang Yue, Shen Jianghao, Hu Ting-Kuei, Xu Pengfei, Nguyen Tan, Baraniuk Richard, Wang Zhangyang, and Lin Yingyan. 2020. Dual dynamic inference: Enabling more efficient, adaptive, and controllable deep inference. IEEE Journal of Selected Topics in Signal Processing 14, 4 (May 2020), 623–633.
  195. [195] Wu Wenhao, He Dongliang, Tan Xiao, Chen Shifeng, and Wen Shilei. 2019. Multi-agent reinforcement learning based frame sampling for effective untrimmed video recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 6222–6231.
  196. [196] Wu Wenhao, He Dongliang, Tan Xiao, Chen Shifeng, Yang Yi, and Wen Shilei. 2020. Dynamic inference: A new approach toward efficient video action recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 676–677.
  197. [197] Wu Zuxuan, Nagarajan Tushar, Kumar Abhishek, Rennie Steven, Davis Larry S., Grauman Kristen, and Feris Rogerio. 2018. BlockDrop: Dynamic inference paths in residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 8817–8826.
  198. [198] Wu Zuxuan, Xiong Caiming, Jiang Yu-Gang, and Davis Larry S.. 2019. LiteEval: A coarse-to-fine framework for resource efficient video recognition. In Advances in Neural Information Processing Systems, Vol. 32. Curran Associates, Inc.
  199. [199] Wu Zuxuan, Xiong Caiming, Ma Chih-Yao, Socher Richard, and Davis Larry S.. 2019. AdaFrame: Adaptive frame selection for fast video recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1278–1287.
  200. [200] Wu Zhaofeng, Zhao Ding, Liang Qiao, Yu Jiahui, Gulati Anmol, and Pang Ruoming. 2021. Dynamic sparsity neural networks for automatic speech recognition. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 6014–6018.
  201. [201] Xia Wenhan, Yin Hongxu, Dai Xiaoliang, and Jha Niraj K.. 2022. Fully dynamic inference with deep neural networks. IEEE Transactions on Emerging Topics in Computing 10, 2 (April 2022), 962–972.
  202. [202] Xie Zhenda, Zhang Zheng, Zhu Xizhou, Huang Gao, and Lin Stephen. 2020. Spatially adaptive inference with stochastic feature sampling and interpolation. In Computer Vision – ECCV 2020 (Lecture Notes in Computer Science), Vedaldi Andrea, Bischof Horst, Brox Thomas, and Frahm Jan-Michael (Eds.). Springer International Publishing, Cham, 531–548.
  203. [203] Xin Ji, Tang Raphael, Lee Jaejun, Yu Yaoliang, and Lin Jimmy. 2020. DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference. arXiv:2004.12993 [cs].
  204. [204] Xin Ji, Tang Raphael, Yu Yaoliang, and Lin Jimmy. 2021. BERxiT: Early exiting for BERT with better fine-tuning and extension to regression. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Merlo Paola, Tiedemann Jorg, and Tsarfaty Reut (Eds.). Association for Computational Linguistics, Online, 91–104.
  205. [205] Xu Dianlei, Li Tong, Li Yong, Su Xiang, Tarkoma Sasu, Jiang Tao, Crowcroft Jon, and Hui Pan. 2021. Edge intelligence: Empowering intelligence to the edge of network. Proc. IEEE 109, 11 (Nov. 2021), 1778–1837.
  206. [206] Xu Kelvin, Ba Jimmy, Kiros Ryan, Cho Kyunghyun, Courville Aaron, Salakhudinov Ruslan, Zemel Rich, and Bengio Yoshua. 2015. Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the 32nd International Conference on Machine Learning. PMLR, 2048–2057.
  207. [207] Xu Lanyu, Iyengar Arun, and Shi Weisong. 2020. CHA: A caching framework for home-based voice assistant systems. In 2020 IEEE/ACM Symposium on Edge Computing (SEC). 293–306.
  208. [208] Xu Mengwei, Liu Xuanzhe, Liu Yunxin, and Lin Felix Xiaozhu. 2017. Accelerating convolutional neural networks for continuous mobile vision via cache reuse. CoRR abs/1712.01670 (2017). arXiv:1712.01670. http://arxiv.org/abs/1712.01670
  209. [209] Xu Mengwei, Zhu Mengze, Liu Yunxin, Lin Felix Xiaozhu, and Liu Xuanzhe. 2018. DeepCache: Principled cache for mobile deep vision. In Proceedings of the 24th Annual International Conference on Mobile Computing and Networking (MobiCom’18). Association for Computing Machinery, New York, NY, USA, 129–144.
  210. [210] Xun Lei, Tran-Thanh Long, Al-Hashimi Bashir M., and Merrett Geoff V.. 2019. Incremental training and group convolution pruning for runtime DNN performance scaling on heterogeneous embedded platforms. In 2019 ACM/IEEE 1st Workshop on Machine Learning for CAD (MLCAD). 1–6.
  211. [211] Xun Lei, Tran-Thanh Long, Al-Hashimi Bashir M., and Merrett Geoff V.. 2020. Optimising resource management for embedded machine learning. In 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE). 1556–1561.
  212. [212] Yan Zhicheng, Zhang Hao, Piramuthu Robinson, Jagadeesh Vignesh, DeCoste Dennis, Di Wei, and Yu Yizhou. 2015. HD-CNN: Hierarchical deep convolutional neural networks for large scale visual recognition. In Proceedings of the IEEE International Conference on Computer Vision. 2740–2748.
  213. [213] Yang Kang, Xing Tianzhang, Liu Yang, Li Zhenjiang, Gong Xiaoqing, Chen Xiaojiang, and Fang Dingyi. 2019. cDeepArch: A compact deep neural network architecture for mobile sensing. IEEE/ACM Transactions on Networking 27, 5 (Oct. 2019), 2043–2055.
  214. [214] Yang Kichang, Yi Juheon, Lee Kyungjin, and Lee Youngki. 2022. FlexPatch: Fast and accurate object detection for on-device high-resolution live video analytics. In IEEE INFOCOM 2022 - IEEE Conference on Computer Communications. 1898–1907.
  215. [215] Yang Le, Han Yizeng, Chen Xi, Song Shiji, Dai Jifeng, and Huang Gao. 2020. Resolution adaptive networks for efficient inference. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2369–2378.
  216. [216] Yang Yu, Liu Di, Fang Hui, Huang Yi-Xiong, Sun Ying, and Zhang Zhi-Yuan. 2022. Once for all skip: Efficient adaptive deep neural networks. In 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE). 568–571.
  217. [217] Yang Zerui, Xu Yuhui, Dai Wenrui, and Xiong Hongkai. 2019. Dynamic-stride-net: Deep convolutional neural network with dynamic stride. In Optoelectronic Imaging and Multimedia Technology VI, Vol. 11187. SPIE, 42–53.
  218. [218] Yeung Serena, Russakovsky Olga, Mori Greg, and Fei-Fei Li. 2016. End-to-end learning of action detection from frame glimpses in videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2678–2687.
  219. [219] Yin Hongxu, Vahdat Arash, Alvarez Jose M., Mallya Arun, Kautz Jan, and Molchanov Pavlo. 2022. A-ViT: Adaptive tokens for efficient vision transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 10809–10818.
  220. [220] Yousefzadeh Amirreza and Sifalakis Manolis. 2022. Delta Activation Layer Exploits Temporal Sparsity for Efficient Embedded Video Processing.
  221. [221] Yu Adams Wei, Lee Hongrae, and Le Quoc V.. 2017. Learning to Skim Text. arXiv:1704.06877 [cs].
  222. [222] Yu Haichao, Li Haoxiang, Shi Humphrey, Huang Thomas S., and Hua Gang. 2021. Any-precision deep neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 35, 12 (May 2021), 10763–10771.
  223. [223] Yu Jiahui and Huang Thomas S.. 2019. Universally slimmable networks and improved training techniques. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 1803–1811.
  224. [224] Yu Jiahui, Yang Linjie, Xu Ning, Yang Jianchao, and Huang Thomas. 2018. Slimmable Neural Networks. arXiv:1812.08928 [cs].
  225. [225] Yu Keyi, Liu Yang, Schwing Alexander G., and Peng Jian. 2022. Fast and accurate text classification: Skimming, rereading and early stopping. (Feb. 2022).
  226. [226] Yuan Kun, Li Quanquan, Guo Shaopeng, Chen Dapeng, Zhou Aojun, Yu Fengwei, and Liu Ziwei. 2021. Differentiable dynamic wirings for neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 327–336.
  227. [227] Yuan Zhihang, Wu Bingzhe, Sun Guangyu, Liang Zheng, Zhao Shiwan, and Bi Weichen. 2020. S2DNAS: Transforming static CNN model for dynamic inference via neural architecture search. In Computer Vision – ECCV 2020 (Lecture Notes in Computer Science), Vedaldi Andrea, Bischof Horst, Brox Thomas, and Frahm Jan-Michael (Eds.). Springer International Publishing, Cham, 175–192.
  228. [228] Zeng Liekang, Li En, Zhou Zhi, and Chen Xu. 2019. Boomerang: On-demand cooperative deep neural network inference for edge intelligence on the industrial internet of things. IEEE Network 33, 5 (Sept. 2019), 96–103.
  229. [229] Zhang Chen, Cao Qiang, Jiang Hong, Zhang Wenhui, Li Jingjun, and Yao Jie. 2018. FFS-VA: A fast filtering system for large-scale video analytics. In Proceedings of the 47th International Conference on Parallel Processing (ICPP 2018). Association for Computing Machinery, New York, NY, USA, 1–10.
  230. [230] Zhang Chen, Cao Qiang, Jiang Hong, Zhang Wenhui, Li Jingjun, and Yao Jie. 2020. A fast filtering mechanism to improve efficiency of large-scale video analytics. IEEE Trans. Comput. 69, 6 (June 2020), 914–928.
  231. [231] Zhang Jinrui, Zhang Deyu, Yang Huan, Liu Yunxin, Ren Ju, Xu Xiaohui, Jia Fucheng, and Zhang Yaoxue. 2022. MVPose: Realtime multi-person pose estimation using motion vector on mobile devices. IEEE Transactions on Mobile Computing (2022), 1–1.
  232. [232] Zhang Linfeng, Tan Zhanhong, Song Jiebo, Chen Jingwei, Bao Chenglong, and Ma Kaisheng. 2019. SCAN: A scalable neural networks framework towards compact and efficient models. In Advances in Neural Information Processing Systems, Vol. 32. Curran Associates, Inc.
  233. [233] Zhang Pei, Liang Tailin, Glossner John, Wang Lei, Shi Shaobo, and Zhang Xiaotong. 2021. Dynamic runtime feature map pruning. In Pattern Recognition and Computer Vision (Lecture Notes in Computer Science), Ma Huimin, Wang Liang, Zhang Changshui, Wu Fei, Tan Tieniu, Wang Yaonan, Lai Jianhuang, and Zhao Yao (Eds.). Springer International Publishing, Cham, 411–422.
  234. [234] Zhang Wuyang, He Zhezhi, Liu Luyang, Jia Zhenhua, Liu Yunxin, Gruteser Marco, Raychaudhuri Dipankar, and Zhang Yanyong. 2021. Elf: Accelerate high-resolution mobile deep vision with content-aware parallel offloading. In Proceedings of the 27th Annual International Conference on Mobile Computing and Networking (MobiCom’21). Association for Computing Machinery, New York, NY, USA, 201–214.
  235. [235] Zhang Yu, Liu Dajiang, and Xing Yongkang. 2021. Dynamic convolution pruning using pooling characteristic in convolution neural networks. In Neural Information Processing (Communications in Computer and Information Science), Mantoro Teddy, Lee Minho, Ayu Media Anugerah, Wong Kok Wai, and Hidayanto Achmad Nizar (Eds.). Springer International Publishing, Cham, 558–565.
  236. [236] Zheng Yin-Dong, Liu Zhaoyang, Lu Tong, and Wang Limin. 2020. Dynamic sampling networks for efficient action recognition in videos. IEEE Transactions on Image Processing 29 (2020), 7970–7983.
  237. [237] Zhou Wangchunshu, Xu Canwen, Ge Tao, McAuley Julian, Xu Ke, and Wei Furu. 2020. BERT loses patience: Fast and robust inference with early exit. In Advances in Neural Information Processing Systems, Vol. 33. Curran Associates, Inc., 18330–18341.
  238. [238] Zolfaghari Mohammadreza, Singh Kamaljeet, and Brox Thomas. 2018. ECO: Efficient convolutional network for online video understanding. In Proceedings of the European Conference on Computer Vision (ECCV). 695–712.
  239. [239] Get Your Footage. 2021. Hands Up Waving Hello Green Screen Effect | Gesture Say Hi Chroma Key in HD 4K.
  240. [240] PCV. 2022. Vehicle Detection Dataset. https://universe.roboflow.com/pcv-wndzh/vehicle-detection-bq16s. Visited on 2024-04-09.