Compare commits
1 Commits
ade1112c8f...991191097e
Author | SHA1 | Date |
---|---|---|
coolneng | 991191097e | |
@@ -318,7 +318,7 @@ Estas observaciones no son sorprendentes; en la práctica se ha comprobado que e
 
 * Diseño y descripción del sistema
 
-La finalidad de este proyecto es el desarrollo de un /pipeline/, con el objetivo de crear un algoritmo de /Deep Learning/ capaz de corregir errores de secuenciación en secuencias de ADN, en particular, en la región CDR3 del TCR.
+La finalidad de este proyecto es el desarrollo de un /pipeline/, con el objetivo de crear un algoritmo de /Deep Learning/ capaz de corregir errores de secuenciación en secuencias de ADN, en particular, en la región CDR3 del TCR. Por ende, el trabajo consiste en el desarrollo /end-to-end/ de un sistema de /machine learning/.
 
 El sistema se compone de 2 partes, dado que el algoritmo de /Deep Learning/ no es dependiente del /dataset/ generado /in silico/, y se podría entrenar con cualquier otro conjunto de datos.
 
BIN
Dissertation.pdf
Binary file not shown.
@@ -2,4 +2,4 @@ spanish-abstract: "Las nuevas técnicas de secuenciación de ADN (NGS) han revol
 spanish-keywords: "deep learning, corrección de errores, receptor de linfocitos T, secuenciación de ADN, inmunología"
 english-abstract: "Next generation sequencing (NGS) techniques have revolutionised genomic research. These technologies perform sequencing of millions of fragments of DNA in parallel, which are pieced together using bioinformatics analyses. Although these techniques are commonly applied, they have non-negligible error rates that are detrimental to the analysis of regions with a high degree of polimorphism. In this study we propose a novel computational method, locimend, based on a Deep Learning algorithm for DNA sequencing error correction. It is applied to the analysis of the complementarity determining region 3 (CDR3) of the T-cell receptor (TCR) found on the surface of lymphocytes, generated in silico and subsequently subjected to a sequencing simulator in order to produce sequencing errors. Using these data, we trained a depp neural network with the aim of generating a computational model that allows the detection and correction of sequencing errors. Our results show that locimend is a model that identifies and corrects DNA sequencing error patterns, obtaining an accuracy of 0,89 and an area under the curve (AUC) of 0,98. The implementation includes a REST API that performs the inference of the correct DNA sequence, from a DNA sequence with errors with the pre-trained model, in order to popularise its use in the scientific community."
 english-keywords: "deep learning, error correction, DNA sequencing, T-cell receptor, immunology"
-acknowdledgments: ""
+acknowledgments: "Este proyecto no podría haber sido posible sin el apoyo de numerosas personas. En particular, quiero agradecer especialmente a Carlos Cano Gutiérrez por depositar su voto de confianza al asignarme un proyecto de investigación, el cual no era una propuesta de Trabajo de Fin de Grado. Y a María Soledad Benítez Cantos por su mentorización sido invaluable a lo largo de este trabajo. Su afán por el conocimiento, sus revisiones y comentarios de retroalimentación, su habilidad para exponer un concepto complejo en una frase y su dedicación incondicional al proyecto han sido el pilar central que ha permitido un desenlace favorable de la investigación."
@@ -494,7 +494,7 @@
 \textbf{Keywords:} $english-keywords$
 \end{center}
 \chapter*{Agradecimientos}
-$acknowledgements$
+$acknowledgments$
 \tableofcontents
 \listoftables{}
 \listoffigures{}