
          2020 Research Hotspots in Artificial Intelligence (1): Deep Learning
            Published: 2020-12-14

          By analyzing papers published between 2018 and 2020, combining indicators such as usage counts and citation counts, we identified the Top 20 research hotspots in artificial intelligence for 2020. This installment presents the data profile of one of those hotspots: deep learning.

          In recent years, deep learning has produced theoretical breakthroughs and shown promising applications across many fields. The main research institutions in deep learning (ranked by citations) include Google, the Chinese Academy of Sciences, the University of London, the University of Oxford, University College London, Facebook, Johns Hopkins University, the University of Chinese Academy of Sciences, the Institute of Automation of the Chinese Academy of Sciences, and the University of California system.



           

          The main researchers in deep learning (ranked by citations) include Chen, Liang-Chieh (Google Incorporated); Shen, Li (University of Oxford); Hu, Jie; Sun, Gang; Papandreou, George (Google Incorporated); Yuille, Alan L. (Johns Hopkins University); Murphy, Kevin (Google Incorporated); Kokkinos, Iasonas (University College London); He, Kaiming (Facebook Inc); and Girshick, Ross (Facebook Inc).

           

           

          The subject terms receiving the most attention in deep learning are Convolutional neural network, Object detection, Neural networks, Image classification, Transfer learning, Feature extraction, Semantic segmentation, Generative adversarial network, Domain adaptation, Deep neural networks, Computer vision, Task analysis, Image segmentation, Training, and Machine learning.

           

          The deep learning papers receiving the most attention in 2020 include:

           

          1. Title: Focal Loss for Dense Object Detection

          Authors: Lin, Tsung-Yi; Goyal, Priya; Girshick, Ross; et al.

          Journal: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

          Conference: 16th IEEE International Conference on Computer Vision (ICCV); Location: Venice, Italy; Dates: Oct 22-29, 2017

          Times cited: 1,341

          Abstract: The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors.

          Code is at: https://github.com/facebookresearch/Detectron.
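
          The loss reshaping described above is compact enough to sketch directly. Below is a minimal PyTorch version of the binary focal loss with alpha and gamma at the paper's defaults; the function name and flat-tensor framing are ours (RetinaNet applies this per anchor box over dense detection outputs):

              import torch
              import torch.nn.functional as F

              def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
                  # Per-element binary cross entropy on raw logits.
                  bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
                  p = torch.sigmoid(logits)
                  # p_t is the model's probability for the true class of each example.
                  p_t = p * targets + (1 - p) * (1 - targets)
                  alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
                  # (1 - p_t)^gamma shrinks the loss of well-classified (easy) examples,
                  # so training concentrates on the sparse set of hard examples.
                  return (alpha_t * (1 - p_t) ** gamma * bce).mean()

              loss = focal_loss(torch.randn(8), torch.randint(0, 2, (8,)).float())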


           

          2. Title: Squeeze-and-Excitation Networks

          Authors: Hu, Jie; Shen, Li; Albanie, Samuel; et al.

          Journal: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

          Conference: 31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); Location: Salt Lake City, UT; Dates: Jun 18-23, 2018

          Times cited: 1,186

          Abstract: The central building block of convolutional neural networks (CNNs) is the convolution operator, which enables networks to construct informative features by fusing both spatial and channel-wise information within local receptive fields at each layer. A broad range of prior research has investigated the spatial component of this relationship, seeking to strengthen the representational power of a CNN by enhancing the quality of spatial encodings throughout its feature hierarchy. In this work, we focus instead on the channel relationship and propose a novel architectural unit, which we term the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels. We show that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets. We further demonstrate that SE blocks bring significant improvements in performance for existing state-of-the-art CNNs at slight additional computational cost. Squeeze-and-Excitation Networks formed the foundation of our ILSVRC 2017 classification submission, which won first place and reduced the top-5 error to 2.251 percent, surpassing the winning entry of 2016 by a relative improvement of ~25 percent. Models and code are available at https://github.com/hujie-frank/SENet.
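
          An SE block is small enough to show in full. The sketch below is a minimal PyTorch rendering of the squeeze (global average pooling) and excitation (bottleneck MLP with sigmoid gating) steps, using the paper's default reduction ratio of 16; it mirrors typical public implementations rather than the authors' released code:

              import torch.nn as nn

              class SEBlock(nn.Module):
                  """Squeeze-and-Excitation: recalibrate channels using global context."""

                  def __init__(self, channels, reduction=16):
                      super().__init__()
                      self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: one value per channel
                      self.fc = nn.Sequential(             # excitation: bottleneck MLP
                          nn.Linear(channels, channels // reduction),
                          nn.ReLU(inplace=True),
                          nn.Linear(channels // reduction, channels),
                          nn.Sigmoid(),
                      )

                  def forward(self, x):
                      n, c, _, _ = x.shape
                      w = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
                      return x * w                         # channel-wise rescaling

          Because the block only rescales channels, it can be dropped in after almost any convolutional stage, which is why the abstract emphasizes stacking SE blocks into existing architectures at slight extra cost.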


           

          3. Title: Mask R-CNN

          Authors: He, Kaiming; Gkioxari, Georgia; Dollar, Piotr; et al.

          Journal: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

          Conference: 16th IEEE International Conference on Computer Vision (ICCV); Location: Venice, Italy; Dates: Oct 22-29, 2017

          Times cited: 431

          Abstract: We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition. Code has been made available at: https://github.com/facebookresearch/Detectron.
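
          torchvision ships a reference Mask R-CNN, so the parallel box and mask heads the abstract describes can be exercised in a few lines. A minimal inference sketch (this uses the legacy pretrained=True flag; newer torchvision releases take a weights argument instead):

              import torch
              import torchvision

              # Mask R-CNN with a ResNet-50 FPN backbone, pretrained on COCO.
              model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
              model.eval()

              # One RGB image as a float tensor in [0, 1], shape (3, H, W).
              image = torch.rand(3, 480, 640)

              with torch.no_grad():
                  out = model([image])[0]

              # Detection branch: boxes, labels, scores; added branch: per-instance masks.
              print(out["boxes"].shape, out["labels"].shape, out["masks"].shape)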


           

          4. Title: Deep Learning for Generic Object Detection: A Survey

          Authors: Liu, Li; Ouyang, Wanli; Wang, Xiaogang; et al.

          Journal: INTERNATIONAL JOURNAL OF COMPUTER VISION

          Times cited: 76

          Abstract: Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.


           

          5. Title: Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization

          Authors: Selvaraju, Ramprasaath R.; Cogswell, Michael; Das, Abhishek; et al.

          Journal: INTERNATIONAL JOURNAL OF COMPUTER VISION

          Conference: 16th IEEE International Conference on Computer Vision (ICCV); Location: Venice, Italy; Dates: Oct 22-29, 2017

          Times cited: 62

          Abstract: We propose a technique for producing 'visual explanations' for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable. Our approach, Gradient-weighted Class Activation Mapping (Grad-CAM), uses the gradients of any target concept (say 'dog' in a classification network or a sequence of words in a captioning network) flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept. Unlike previous approaches, Grad-CAM is applicable to a wide variety of CNN model families: (1) CNNs with fully-connected layers (e.g., VGG), (2) CNNs used for structured outputs (e.g., captioning), (3) CNNs used in tasks with multi-modal inputs (e.g., visual question answering) or reinforcement learning, all without architectural changes or re-training. We combine Grad-CAM with existing fine-grained visualizations to create a high-resolution class-discriminative visualization, Guided Grad-CAM, and apply it to image classification, image captioning, and visual question answering (VQA) models, including ResNet-based architectures. In the context of image classification models, our visualizations (a) lend insights into failure modes of these models (showing that seemingly unreasonable predictions have reasonable explanations), (b) outperform previous methods on the ILSVRC-15 weakly-supervised localization task, (c) are robust to adversarial perturbations, (d) are more faithful to the underlying model, and (e) help achieve model generalization by identifying dataset bias. For image captioning and VQA, our visualizations show that even non-attention-based models learn to localize discriminative regions of the input image. We devise a way to identify important neurons through Grad-CAM and combine it with neuron names (Bau et al. in Computer vision and pattern recognition, 2017) to provide textual explanations for model decisions. Finally, we design and conduct human studies to measure whether Grad-CAM explanations help users establish appropriate trust in predictions from deep networks, and show that Grad-CAM helps untrained users successfully discern a 'stronger' deep network from a 'weaker' one even when both make identical predictions. Our code is available at , along with a demo on CloudCV (Agrawal et al., in: Mobile cloud visual media computing, pp 265-290. Springer, 2015) () and a video at .
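
          The weighting scheme Grad-CAM describes can be sketched against any torchvision classifier. The schematic below hooks the last convolutional stage of a ResNet-50 and builds the coarse localization map; it is a generic re-implementation for illustration, not the authors' released code:

              import torch
              import torch.nn.functional as F
              from torchvision.models import resnet50

              model = resnet50(pretrained=True).eval()  # legacy flag; newer torchvision uses weights=
              feats, grads = {}, {}

              # Capture activations and gradients at the final convolutional stage.
              model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))
              model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

              x = torch.rand(1, 3, 224, 224)   # stand-in for a preprocessed image
              score = model(x)[0].max()        # score of the top predicted class
              score.backward()

              # Channel weights = global-average-pooled gradients; ReLU keeps only
              # features with a positive influence on the class score.
              w = grads["a"].mean(dim=(2, 3), keepdim=True)
              cam = F.relu((w * feats["a"]).sum(dim=1, keepdim=True))
              cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)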


           

          6. Title: ClothingOut: a category-supervised GAN model for clothing segmentation and retrieval

          Authors: Zhang, Haijun; Sun, Yanfang; Liu, Linlin; et al.

          Journal: NEURAL COMPUTING & APPLICATIONS

          Times cited: 61

          Abstract: This paper presents a new framework, ClothingOut, which utilizes a generative adversarial network (GAN) to generate tiled clothing images automatically. Specifically, we design a novel category-supervised GAN model by learning transformation rules between clothes on wearers and clothes that are tiled. Our method features adding a category attribute to a traditional GAN model. For model training, we built a large-scale dataset containing over 20,000 pairs of wearer images and their corresponding tiled clothing images. The learned model can be straightforwardly applied to video advertising and cross-scenario clothing image retrieval. We evaluated the generated images, which can be regarded as segmentations from the wearer images, from two aspects: authenticity and retrieval performance. Experimental results demonstrate the effectiveness of our method.
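
          "Adding a category attribute to a traditional GAN" is the conditional-GAN idea, so a toy PyTorch generator conditioned on a category label is sketched below. Every name and dimension here is ours, and the real ClothingOut is an image-to-image model, so this only illustrates the conditioning mechanism:

              import torch
              import torch.nn as nn

              class CategoryConditionedGenerator(nn.Module):
                  """Toy conditional generator: noise + category label -> image."""

                  def __init__(self, n_categories=10, z_dim=100, img_dim=64 * 64 * 3):
                      super().__init__()
                      self.embed = nn.Embedding(n_categories, 32)  # category attribute
                      self.net = nn.Sequential(
                          nn.Linear(z_dim + 32, 256),
                          nn.ReLU(inplace=True),
                          nn.Linear(256, img_dim),
                          nn.Tanh(),
                      )

                  def forward(self, z, category):
                      # Concatenate the noise vector with the category embedding,
                      # so generation is supervised by the category label.
                      h = torch.cat([z, self.embed(category)], dim=1)
                      return self.net(h).view(-1, 3, 64, 64)

              g = CategoryConditionedGenerator()
              imgs = g(torch.randn(4, 100), torch.tensor([0, 1, 2, 3]))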


           

          7. Title: Hierarchical LSTMs with Adaptive Attention for Visual Captioning

          Authors: Gao, Lianli; Li, Xiangpeng; Song, Jingkuan; et al.

          Journal: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

          Times cited: 42

          Abstract: Recent progress has been made in using attention-based encoder-decoder frameworks for image and video captioning. Most existing decoders apply the attention mechanism to every generated word, including both visual words (e.g., "gun" and "shooting") and non-visual words (e.g., "the", "a"). However, these non-visual words can be easily predicted using a natural language model without considering visual signals or attention. Imposing an attention mechanism on non-visual words could mislead and decrease the overall performance of visual captioning. Furthermore, a hierarchy of LSTMs enables more complex representation of visual data, capturing information at different scales. Considering these issues, we propose a hierarchical LSTM with adaptive attention (hLSTMat) approach for image and video captioning. Specifically, the proposed framework utilizes spatial or temporal attention for selecting specific regions or frames to predict the related words, while the adaptive attention decides whether to depend on the visual information or the language context information. Also, hierarchical LSTMs are designed to simultaneously consider both low-level visual information and high-level language context information to support caption generation. We design the hLSTMat model as a general framework, and we first instantiate it for the task of video captioning. Then, we further instantiate our hLSTMat, refine it, and apply it to the image captioning task. To demonstrate the effectiveness of our proposed framework, we test our method on both video and image captioning tasks. Experimental results show that our approach achieves state-of-the-art performance for most of the evaluation metrics on both tasks. The effect of important components is also explored in the ablation study.
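
          The core of the adaptive attention is a gate that decides, per generated word, how much to rely on attended visual features versus the decoder's language context. A schematic PyTorch sketch of that gate (names and dimensions are ours; the published hLSTMat adds hierarchical LSTMs and spatial/temporal attention on top):

              import torch
              import torch.nn as nn
              import torch.nn.functional as F

              class AdaptiveAttention(nn.Module):
                  """Blend attended visual features with a language-context vector."""

                  def __init__(self, dim):
                      super().__init__()
                      self.att = nn.Linear(dim, 1)   # scores each visual region
                      self.gate = nn.Linear(dim, 1)  # visual-vs-language gate

                  def forward(self, regions, hidden):
                      # regions: (N, R, D) visual features; hidden: (N, D) LSTM state.
                      scores = self.att(regions + hidden.unsqueeze(1)).squeeze(-1)
                      attn = F.softmax(scores, dim=1).unsqueeze(-1)
                      attended = (attn * regions).sum(dim=1)
                      # beta near 1 -> trust the visual signal (visual words);
                      # beta near 0 -> fall back on language context (non-visual words).
                      beta = torch.sigmoid(self.gate(hidden))
                      return beta * attended + (1 - beta) * hidden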


           

          8. Title: Alcoholism identification via convolutional neural network based on parametric ReLU, dropout, and batch normalization

          Authors: Wang, Shui-Hua; Muhammad, Khan; Hong, Jin; et al.

          Journal: NEURAL COMPUTING & APPLICATIONS

          Times cited: 40

          Abstract: Alcoholism changes the structure of the brain. Several somatic marker hypothesis network-related regions are known to be damaged in chronic alcoholism. Neuroimaging approaches can help us better understand the impairment discovered in alcohol-dependent subjects. In this research, we recruited subjects from participating hospitals. In total, 188 abstinent long-term chronic alcoholic participants (95 men and 93 women) and 191 non-alcoholic control participants (95 men and 96 women) were enrolled in our experiment; a computerized diagnostic interview schedule (version IV) and a medical history interview were employed to determine whether applicants should be enrolled or excluded. A Siemens Verio Tim 3.0 T MR scanner (Siemens Medical Solutions, Erlangen, Germany) was employed to scan the subjects. We then proposed a 10-layer convolutional neural network for imaging-based diagnosis, incorporating three advanced techniques: parametric rectified linear unit (PReLU), batch normalization, and dropout. The structure of the network was fine-tuned. The results show that our method secured a sensitivity of 97.73 +/- 1.04%, a specificity of 97.69 +/- 0.87%, and an accuracy of 97.71 +/- 0.68%. We observed that PReLU gives better performance than ordinary ReLU, clipped ReLU, and leaky ReLU. Batch normalization and dropout also enhanced performance, as batch normalization overcame internal covariate shift and dropout mitigated overfitting. The results show that our proposed 10-layer CNN model performs better than seven state-of-the-art approaches.
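
          The three techniques named in the title compose into a standard convolutional block. A toy PyTorch sketch follows; the layer count, channel widths, and single-channel input are our simplifications (the paper's network has 10 layers tuned to its MRI data):

              import torch.nn as nn

              def conv_block(c_in, c_out):
                  """Conv -> BatchNorm -> PReLU, the trio highlighted in the paper."""
                  return nn.Sequential(
                      nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                      nn.BatchNorm2d(c_out),  # counters internal covariate shift
                      nn.PReLU(),             # learnable negative slope, unlike plain ReLU
                      nn.MaxPool2d(2),
                  )

              model = nn.Sequential(
                  conv_block(1, 16),
                  conv_block(16, 32),
                  conv_block(32, 64),
                  nn.Flatten(),
                  nn.Dropout(0.5),    # regularizes the classifier head against overfitting
                  nn.LazyLinear(2),   # alcoholic vs. control
              )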


           

          9. Title: MultiResUNet: Rethinking the U-Net architecture for multimodal biomedical image segmentation

          Authors: Ibtehaz, Nabil; Rahman, M. Sohel

          Journal: NEURAL NETWORKS

          Times cited: 39

          Abstract: In recent years deep learning has brought about a breakthrough in medical image segmentation. In this regard, U-Net has been the most popular architecture in the medical imaging community. Despite outstanding overall performance in segmenting multimodal medical images, through extensive experimentation on some challenging datasets we demonstrate that the classical U-Net architecture seems to be lacking in certain aspects. Therefore, we propose some modifications to improve upon the already state-of-the-art U-Net model. Following these modifications, we develop a novel architecture, MultiResUNet, as the potential successor to the U-Net architecture. We have tested and compared MultiResUNet with the classical U-Net on a vast repertoire of multimodal medical images. Although only slight improvements in the cases of ideal images are noticed, remarkable gains in performance have been attained for the challenging ones. We have evaluated our model on five different datasets, each with their own unique challenges, and have obtained relative improvements in performance of 10.15%, 5.07%, 2.63%, 1.41%, and 0.62%, respectively. We have also discussed and highlighted some qualitatively superior aspects of MultiResUNet over the classical U-Net that are not really reflected in the quantitative measures.
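
          The paper's central modification is the MultiRes block: a chain of 3x3 convolutions whose intermediate outputs are concatenated (cheaply approximating 5x5 and 7x7 receptive fields), plus a 1x1 residual shortcut. A simplified PyTorch sketch of that block, omitting the paper's batch normalization and channel-scaling details:

              import torch
              import torch.nn as nn

              class MultiResBlock(nn.Module):
                  """Chained 3x3 convs, concatenated, with a 1x1 residual shortcut."""

                  def __init__(self, c_in, c_out):
                      super().__init__()
                      c = c_out // 3
                      self.c1 = nn.Conv2d(c_in, c, 3, padding=1)
                      self.c2 = nn.Conv2d(c, c, 3, padding=1)              # ~5x5 receptive field
                      self.c3 = nn.Conv2d(c, c_out - 2 * c, 3, padding=1)  # ~7x7 receptive field
                      self.shortcut = nn.Conv2d(c_in, c_out, 1)
                      self.act = nn.ReLU(inplace=True)

                  def forward(self, x):
                      a = self.act(self.c1(x))
                      b = self.act(self.c2(a))
                      d = self.act(self.c3(b))
                      # Concatenated multi-scale features plus the residual connection.
                      return self.act(torch.cat([a, b, d], dim=1) + self.shortcut(x))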


           

          10. Title: Feature Boosting Network for 3D Pose Estimation

          Authors: Liu, Jun; Ding, Henghui; Shahroudy, Amir; et al.

          Journal: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

          Times cited: 29

          Abstract: In this paper, a feature boosting network is proposed for estimating 3D hand pose and 3D body pose from a single RGB image. In this method, the features learned by the convolutional layers are boosted with a new long short-term dependence-aware (LSTD) module, which enables the intermediate convolutional feature maps to perceive the graphical long short-term dependency among different hand (or body) parts using the designed Graphical ConvLSTM. Learning a set of features that are reliable and discriminatively representative of the pose of a hand (or body) part is difficult due to the ambiguities, texture and illumination variation, and self-occlusion in real applications of 3D pose estimation. To improve the reliability of the features for representing each body part and to enhance the LSTD module, we further introduce a context consistency gate (CCG), with which the convolutional feature maps are modulated according to their consistency with the context representations. We evaluate the proposed method on challenging benchmark datasets for 3D hand pose estimation and 3D full-body pose estimation. Experimental results show the effectiveness of our method, which achieves state-of-the-art performance on both tasks.
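
          The modulation step can be read as a learned gate over the part features. The sketch below is our schematic guess at a minimal form of such a context consistency gate, not the published design:

              import torch
              import torch.nn as nn

              class ContextConsistencyGate(nn.Module):
                  """Suppress part features that disagree with a global context map."""

                  def __init__(self, channels):
                      super().__init__()
                      self.gate = nn.Sequential(
                          nn.Conv2d(2 * channels, channels, kernel_size=1),
                          nn.Sigmoid(),
                      )

                  def forward(self, part_feats, context):
                      # part_feats, context: (N, C, H, W). The gate lies in (0, 1)
                      # per channel and position; low values damp inconsistent features.
                      g = self.gate(torch.cat([part_feats, context], dim=1))
                      return part_feats * g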



           

          Concept Explanation

          A research hotspot is a research topic that, over a period of time, produces a large volume of output and carries high research value; bibliometrically, it shows rapid growth in both publications and citations. A research hotspot differs from a research front. Research fronts are hard to measure and capture, and even a captured front needs expert judgment; a hotspot still in its embryonic stage can be regarded as a research front, but once a topic becomes a hotspot it is usually no longer a front. Even so, identifying and analyzing research hotspots remains important: a hotspot is a research front that has withstood time and practical testing, matters for the innovation and development of scientific theory, and has high application value.

           

          Data Source

          Based on the Web of Science platform, we selected papers in the subject category COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE published between 2018 and 2020, used the Citation Topics module of the InCites database to identify research topics, and thereby obtained the Top 20 research hotspots in artificial intelligence.


           

          Written by: June

          Reviewed by: Intelligence Analysis and Research Department

          Original link:

          http://www.cnzenghonghua.cn/newlib/index.php?classid=12231&newsid=31519&t=show

           
