Compare commits
1 Commits
e732fdada1...2b280e2bdf
| Author | SHA1 | Date |
|---|---|---|
| coolneng | 2b280e2bdf | |
54
README.md
@@ -2,10 +2,15 @@
 
 locimend is a tool that corrects DNA sequencing errors using Deep Learning.
 
+The goal is to provide a correct DNA sequence, when a sequence containing errors is provided.
+
+It provides both a command-line program and a REST API.
+
 ## Technologies
 
 - Tensorflow
 - Biopython
+- FastAPI
 
 ## Installation
 
@@ -48,8 +53,53 @@ contains all the needed dependencies.
 
 ## Usage
 
-The following command creates the dataset, trains the Deep Learning model and shows the accuracy:
+### Training the model
+
+The following command creates the trains the Deep Learning model and shows the accuracy and AUC:
 
 ```bash
-poetry run python src/model.py
+poetry run python src/main.py train <data file> <label file>
 ```
+
+- <data file>: FASTQ file containing the sequences with errors
+- <label file>: FASTQ file containing the sequences without errors
+
+Both files must contain the canonical and read simulated sequences in the same positions (same row).
+
+A dataset is provided to train the model, in order to proceed execute the following command:
+
+```bash
+poetry run python src/main.py train data/curesim-HVR.fastq data/HVR.fastq
+```
+
+
+### Inference
+
+A trained model is provided, which can be used to infer the correct sequences. There are two ways to interact with it:
+
+- Command-line execution
+- REST API
+
+#### Command-line
+
+The following command will infer the correct sequence, and print it:
+
+```bash
+poetry run python src/main.py infer "<dna sequence>"
+```
+
+#### REST API
+
+It is also possible to serve the model via a REST API, to start the web server run the following command:
+
+```bash
+poetry run api
+```
+
+The API can be accessed at http://localhost:8000, with either a GET or POST request:
+
+| Request | Endpoint | Payload |
+|:----:|:-----:|:-----:|
+| GET | /<sequence> | Sequence as a path parameter |
+| POST | /| JSON: {"sequence": "<sequence>"} |
+
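The training section in this diff requires the data and label FASTQ files to hold matching sequences at the same positions. The sketch below shows one way such a pair could be checked before training. It is not part of locimend: the function names `read_fastq_sequences` and `count_paired_records` are hypothetical, and the parser assumes plain four-line FASTQ records (header, sequence, `+`, quality) rather than using Biopython.

```python
from itertools import zip_longest


def read_fastq_sequences(path):
    """Yield the sequence line of each record, assuming 4-line FASTQ records."""
    with open(path) as handle:
        # Drop empty lines (e.g. a trailing newline) before grouping by four.
        lines = [line.rstrip("\n") for line in handle if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError(f"{path}: incomplete FASTQ record")
    for start in range(0, len(lines), 4):
        header, sequence = lines[start], lines[start + 1]
        if not header.startswith("@"):
            raise ValueError(f"{path}: record {start // 4} lacks an '@' header")
        yield sequence


def count_paired_records(data_file, label_file):
    """Return the number of record pairs; raise if the files differ in length."""
    missing = object()  # sentinel marking the shorter file running out
    pairs = 0
    for noisy, clean in zip_longest(
        read_fastq_sequences(data_file),
        read_fastq_sequences(label_file),
        fillvalue=missing,
    ):
        if noisy is missing or clean is missing:
            raise ValueError("data and label files hold different numbers of records")
        pairs += 1
    return pairs
```

A check like this only confirms that both files have the same number of well-formed records. Whether record N in one file really corresponds to record N in the other depends on how the files were generated.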
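The REST API table in this diff lists a GET endpoint that takes the sequence as a path parameter and a POST endpoint that takes a JSON body. A minimal client sketch using only the Python standard library could look like the following. The helper names (`build_get_request`, `build_post_request`, `send`) are illustrative, and the response format is not described in the README, so the body is returned as raw text.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # address stated in the README


def build_get_request(sequence, base_url=BASE_URL):
    """GET /<sequence>: the sequence travels as a path parameter."""
    return Request(f"{base_url}/{quote(sequence, safe='')}", method="GET")


def build_post_request(sequence, base_url=BASE_URL):
    """POST /: the sequence travels in a JSON body."""
    body = json.dumps({"sequence": sequence}).encode("utf-8")
    return Request(
        f"{base_url}/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the response body as text.

    The response schema is an unknown here, so no parsing is attempted.
    """
    with urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server started through `poetry run api`, calling `send(build_post_request("ACGTACGT"))` or `send(build_get_request("ACGTACGT"))` should return whatever the service responds with for that sequence.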