Add CNN section in State of the Art

This commit is contained in:
coolneng 2021-07-03 19:30:14 +02:00
parent 9bf910f33a
commit 3d03b276f8
Signed by: coolneng
GPG Key ID: 9893DA236405AF57
5 changed files with 161 additions and 1 deletions


@ -236,9 +236,30 @@ Where $L$ is the loss function that penalizes $g(f(x))$ for being different from
Traditionally, /autoencoders/ were used for dimensionality reduction or /feature learning/. Recently, certain theories connecting AEs with latent-variable models have brought autoencoders to the forefront of generative modeling cite:Goodfellow-et-al-2016.
Nowadays, /autoencoders/ are used for noise reduction, both in text cite:Lewis_2020 and in images cite:bigdeli17_image_restor_using_autoen_prior, unsupervised /clustering/ cite:makhzani15_adver_autoen, synthetic image generation cite:Yoo_2020, dimensionality reduction cite:makhzani15_adver_autoen and sequence-to-sequence prediction for machine translation cite:kaiser18_discr_autoen_sequen_model.
*** Convolutional neural networks (CNN)
A convolutional neural network (CNN) is a type of neural network specialized in processing data with a grid-like topology. The name "convolutional neural network" indicates that the network employs a mathematical operation called convolution. Convolutional networks have been enormously successful in practical applications cite:Goodfellow-et-al-2016.
#+CAPTION: Diagram of a CNN. A CNN is a multilayer neural network composed of two different types of layers, namely convolution layers (C-layers) and subsampling layers (S-layers) cite:LIU201711
#+ATTR_HTML: :height 20% :width 70%
#+NAME: fig:CNN
[[./assets/figures/CNN.png]]
In the context of a convolutional neural network, a convolution is a linear operation that involves multiplying a set of weights with the input, just like in a traditional neural network. The multiplication is performed between a matrix of input data and a two-dimensional matrix of weights, called a filter or kernel. The filter is smaller than the input data, which allows the same filter (set of weights) to be multiplied with the input matrix several times, at different positions of the input cite:brownlee_2020. The operation is depicted graphically in the following figure:
\clearpage
#+CAPTION: Representation of a two-dimensional convolution. We restrict the output to only those positions where the kernel lies entirely inside the image. The boxes with arrows indicate how the output tensor is formed by applying the kernel to the corresponding upper-left region of the input tensor cite:Goodfellow-et-al-2016
#+ATTR_HTML: :height 30% :width 70%
#+NAME: fig:convolution
[[./assets/figures/convolution.png]]
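
The convolution restricted to valid positions, as depicted in the figure above, can be sketched in a few lines of Python. This is a minimal illustrative sketch assuming a NumPy-based implementation; like most deep learning libraries, it actually computes a cross-correlation (the kernel is not flipped), which is what is conventionally called convolution in this context:

```python
import numpy as np

def conv2d_valid(x, k):
    """2D convolution restricted to 'valid' positions: the kernel is
    applied only where it lies entirely inside the input (no padding).
    As in most deep learning libraries, the kernel is not flipped."""
    kh, kw = k.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # element-wise product of the kernel and the input patch
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

x = np.arange(16, dtype=float).reshape(4, 4)  # 4x4 input grid
k = np.ones((2, 2))                           # 2x2 filter (set of weights)
y = conv2d_valid(x, k)
print(y.shape)  # (3, 3): one output per valid kernel position
```

A 4x4 input convolved with a 2x2 kernel yields a 3x3 output, since the kernel fits at $(4-2+1)^2$ positions.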
Convolution layers (C-layers) are used to extract features, while subsampling layers (S-layers) are essentially feature-mapping layers. However, when the dimensionality of the filter output equals that of the input, applying a classifier can cause /overfitting/ due to the high dimensionality. To solve this problem, a /pooling/ step, \ie subsampling or /down-sampling/, is introduced to reduce the overall size of the signal cite:LIU201711.
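
The /pooling/ step can likewise be sketched in Python. This is a minimal sketch assuming non-overlapping max pooling with NumPy; the 2x2 window and the divisibility of the input dimensions by the window size are assumptions made here for illustration:

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max pooling: downsamples the signal by keeping
    only the maximum of each size x size block (assumes both input
    dimensions are divisible by size)."""
    h, w = x.shape
    blocks = x.reshape(h // size, size, w // size, size)
    # take the maximum over each block's rows and columns
    return blocks.max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
y = max_pool2d(x)
print(y)  # the 4x4 input is reduced to a 2x2 map of block maxima
```

Each output value summarizes a whole 2x2 region, so the signal size shrinks by a factor of four while the strongest activations are preserved.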
Nowadays, CNNs are used for /computer vision/, both for image classification cite:howard17_mobil and for segmentation cite:ronneberger15_u_net, as well as for recommender systems cite:yuan18_simpl_convol_gener_networ_next_item_recom and sentiment analysis cite:sadr21_novel_deep_learn_method_textual_sentim_analy.
** Bioinformatics
* Objectives

Binary file not shown.


@ -986,6 +986,7 @@
ISBN = 9781728163956,
publisher = {IEEE}
}
@article{kaiser18_discr_autoen_sequen_model,
author = {Kaiser, Łukasz and Bengio, Samy},
title = {Discrete Autoencoders for Sequence Models},
@ -1014,3 +1015,141 @@
eprint = {1801.09797v1},
primaryClass = {cs.LG},
}
@misc{brownlee_2020,
title = {How Do Convolutional Layers Work in Deep Learning Neural
Networks?},
url = {https://machinelearningmastery.com/convolutional-layers-for-deep-learning-neural-networks/},
journal = {Machine Learning Mastery},
author = {Brownlee, Jason},
year = 2020,
month = {Apr}
}
@article{howard17_mobil,
author = {Howard, Andrew G. and Zhu, Menglong and Chen, Bo and
Kalenichenko, Dmitry and Wang, Weijun and Weyand, Tobias and
Andreetto, Marco and Adam, Hartwig},
title = {Mobilenets: Efficient Convolutional Neural Networks for
Mobile Vision Applications},
journal = {CoRR},
year = 2017,
url = {http://arxiv.org/abs/1704.04861v1},
abstract = {We present a class of efficient models called MobileNets
for mobile and embedded vision applications. MobileNets are
based on a streamlined architecture that uses depth-wise
separable convolutions to build light weight deep neural
networks. We introduce two simple global hyper-parameters that
efficiently trade off between latency and accuracy. These
hyper-parameters allow the model builder to choose the right
sized model for their application based on the constraints of
the problem. We present extensive experiments on resource and
accuracy tradeoffs and show strong performance compared to
other popular models on ImageNet classification. We then
demonstrate the effectiveness of MobileNets across a wide
range of applications and use cases including object
detection, finegrain classification, face attributes and large
scale geo-localization.},
archivePrefix = {arXiv},
eprint = {1704.04861v1},
primaryClass = {cs.CV},
}
@article{ronneberger15_u_net,
author = {Ronneberger, Olaf and Fischer, Philipp and Brox, Thomas},
title = {U-Net: Convolutional Networks for Biomedical Image
Segmentation},
journal = {CoRR},
year = 2015,
url = {http://arxiv.org/abs/1505.04597v1},
abstract = {There is large consent that successful training of deep
networks requires many thousand annotated training samples. In
this paper, we present a network and training strategy that
relies on the strong use of data augmentation to use the
available annotated samples more efficiently. The architecture
consists of a contracting path to capture context and a
symmetric expanding path that enables precise localization. We
show that such a network can be trained end-to-end from very
few images and outperforms the prior best method (a
sliding-window convolutional network) on the ISBI challenge
for segmentation of neuronal structures in electron
microscopic stacks. Using the same network trained on
transmitted light microscopy images (phase contrast and DIC)
we won the ISBI cell tracking challenge 2015 in these
categories by a large margin. Moreover, the network is fast.
Segmentation of a 512x512 image takes less than a second on a
recent GPU. The full implementation (based on Caffe) and the
trained networks are available at
http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .},
archivePrefix = {arXiv},
eprint = {1505.04597v1},
primaryClass = {cs.CV},
}
@article{yuan18_simpl_convol_gener_networ_next_item_recom,
author = {Yuan, Fajie and Karatzoglou, Alexandros and Arapakis,
Ioannis and Jose, Joemon M and He, Xiangnan},
title = {A Simple Convolutional Generative Network for Next Item
Recommendation},
journal = {CoRR},
year = 2018,
url = {http://arxiv.org/abs/1808.05163v4},
abstract = {Convolutional Neural Networks (CNNs) have been recently
introduced in the domain of session-based next item
recommendation. An ordered collection of past items the user
has interacted with in a session (or sequence) are embedded
into a 2-dimensional latent matrix, and treated as an image.
The convolution and pooling operations are then applied to the
mapped item embeddings. In this paper, we first examine the
typical session-based CNN recommender and show that both the
generative model and network architecture are suboptimal when
modeling long-range dependencies in the item sequence. To
address the issues, we introduce a simple, but very effective
generative model that is capable of learning high-level
representation from both short- and long-range item
dependencies. The network architecture of the proposed model
is formed of a stack of \emph{holed} convolutional layers,
which can efficiently increase the receptive fields without
relying on the pooling operation. Another contribution is the
effective use of residual block structure in recommender
systems, which can ease the optimization for much deeper
networks. The proposed generative model attains
state-of-the-art accuracy with less training time in the next
item recommendation task. It accordingly can be used as a
powerful recommendation baseline to beat in future, especially
when there are long sequences of user feedback.},
archivePrefix = {arXiv},
eprint = {1808.05163v4},
primaryClass = {cs.IR},
}
@article{sadr21_novel_deep_learn_method_textual_sentim_analy,
author = {Sadr, Hossein and Solimandarabi, Mozhdeh Nazari and Pedram,
Mir Mohsen and Teshnehlab, Mohammad},
title = {A Novel Deep Learning Method for Textual Sentiment
Analysis},
journal = {CoRR},
year = 2021,
url = {http://arxiv.org/abs/2102.11651v1},
abstract = {Sentiment analysis is known as one of the most crucial
tasks in the field of natural language processing and
Convolutional Neural Network (CNN) is one of those prominent
models that is commonly used for this aim. Although
convolutional neural networks have obtained remarkable results
in recent years, they are still confronted with some
limitations. Firstly, they consider that all words in a
sentence have equal contributions in the sentence meaning
representation and are not able to extract informative words.
Secondly, they require a large number of training data to
obtain considerable results while they have many parameters
that must be accurately adjusted. To this end, a convolutional
neural network integrated with a hierarchical attention layer
is proposed which is able to extract informative words and
assign them higher weight. Moreover, the effect of transfer
learning that transfers knowledge learned in the source domain
to the target domain with the aim of improving the performance
is also explored. Based on the empirical results, the proposed
model not only has higher classification accuracy and can
extract informative words but also applying incremental
transfer learning can significantly enhance the classification
performance.},
archivePrefix = {arXiv},
eprint = {2102.11651v1},
primaryClass = {cs.CL},
}

BIN  assets/figures/CNN.png (new file, 43 KiB)

BIN  (new binary file, 20 KiB)