Compare commits

1 commit: 848a239f2c...760d78108c

| Author | SHA1 | Date |
|---|---|---|
| coolneng | 760d78108c | |

@@ -0,0 +1,67 @@
# locigenesis

locigenesis is a tool that generates a human T-cell receptor (TCR), runs
it through a sequence reader simulation tool and extracts CDR3.

The goal of this project is to generate HVR sequences both with and
without sequencing errors, in order to create datasets for a Machine
Learning algorithm.

## Technologies

- [immuneSIM](https://github.com/GreiffLab/immuneSIM/): in silico
  generation of human and mouse BCR and TCR repertoires
- [CuReSim](http://www.pegase-biosciences.com/curesim-a-customized-read-simulator/):
  read simulator that mimics Ion Torrent sequencing

## Installation

This project uses [Nix](https://nixos.org/) to ensure reproducible
builds.

1. Install Nix (compatible with macOS, Linux and
   [WSL](https://docs.microsoft.com/en-us/windows/wsl/about)):

   ```bash
   curl -L https://nixos.org/nix/install | sh
   ```

2. Clone the repository:

   ```bash
   git clone https://git.coolneng.duckdns.org/coolneng/locigenesis
   ```

3. Change the working directory to the project:

   ```bash
   cd locigenesis
   ```

4. Enter the nix-shell:

   ```bash
   nix-shell
   ```

After running these commands, you will find yourself in a shell that
contains all the needed dependencies.
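
One quick way to confirm that the last step worked is to look at the `IN_NIX_SHELL` environment variable, which `nix-shell` sets (to `pure` or `impure`) in the shell it starts. The following is a minimal sketch of such a check; the helper name `in_nix_shell` is hypothetical and not part of this project.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of locigenesis): succeeds only when the
# current shell has IN_NIX_SHELL set, as nix-shell does for its sessions.
in_nix_shell() {
    [ -n "${IN_NIX_SHELL:-}" ]
}

if in_nix_shell; then
    echo "Inside nix-shell (${IN_NIX_SHELL})"
else
    echo "Not inside nix-shell; run 'nix-shell' from the project directory first"
fi
```

The check only inspects an environment variable, so it is cheap to run at the top of any wrapper script that depends on the Nix-provided tools.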
## Usage

An execution script that accepts two parameters is provided. The following
command invokes it:

```bash
./generation.sh <number of sequences> <number of reads>
```

- \<number of sequences\>: an integer that specifies the number of
  different sequences to generate
- \<number of reads\>: an integer that specifies the number of reads
  to perform on each sequence

The script will generate two files under the `data` directory:

| File | Contents |
|---|---|
| HVR.fastq | Contains the original CDR3 sequence |
| CuReSim-HVR.fastq | Contains CDR3 after the read simulation, with sequencing errors |
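
As a rough illustration of the workflow above, the sketch below checks that both arguments are positive integers before calling the script, then counts the records in each output file. It assumes the outputs are standard four-line FASTQ records located at `data/HVR.fastq` and `data/CuReSim-HVR.fastq`. The helper names `is_positive_int` and `count_fastq_records` are hypothetical and not part of the project, and the values 100 and 5 are arbitrary examples.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of locigenesis.

# Succeeds if the argument is a positive integer (digits only, not zero).
is_positive_int() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -gt 0 ]
}

# Prints the number of records in a FASTQ file, assuming the usual
# four lines per record (header, sequence, '+', qualities).
count_fastq_records() {
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

sequences=100   # example value: number of different sequences
reads=5         # example value: reads per sequence

# Only run when the arguments are valid and the script is present.
if is_positive_int "$sequences" && is_positive_int "$reads" && [ -x ./generation.sh ]; then
    ./generation.sh "$sequences" "$reads"
    for f in data/HVR.fastq data/CuReSim-HVR.fastq; do
        echo "$f: $(count_fastq_records "$f") records"
    done
fi
```

Comparing the two counts can serve as a quick sanity check after a run; how the counts should relate to the two parameters depends on the script's behaviour, which is not specified here.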

56 README.org
@@ -1,56 +0,0 @@
* locigenesis

locigenesis is a tool that generates a human T-cell receptor (TCR), runs it through a sequence reader simulation tool and extracts CDR3.

The goal of this project is to generate both HVR sequences with and without sequencing errors, in order to create datasets for a Machine Learning algorithm.

** Technologies

- [[https://github.com/GreiffLab/immuneSIM/][immuneSIM]]: in silico generation of human and mouse BCR and TCR repertoires
- [[http://www.pegase-biosciences.com/curesim-a-customized-read-simulator/][CuReSim]]: read simulator that mimics Ion Torrent sequencing

** Installation

This project uses [[https://nixos.org/][Nix]] to ensure reproducible builds.

1. Install Nix (compatible with MacOS, Linux and [[https://docs.microsoft.com/en-us/windows/wsl/about][WSL]]):

#+begin_src shell
curl -L https://nixos.org/nix/install | sh
#+end_src

1. Clone the repository:

#+begin_src shell
git clone https://git.coolneng.duckdns.org/coolneng/locigenesis
#+end_src

3. Change the working directory to the project:

#+begin_src shell
cd locigenesis
#+end_src

4. Enter the nix-shell:

#+begin_src shell
nix-shell
#+end_src

After running these commands, you will find yourself in a shell that contains all the needed dependencies.

** Usage

An execution script that accepts 2 parameters is provided, the following command invokes it:

#+begin_src shell
./generation.sh <number of sequences> <number of reads>
#+end_src

- <number of sequences>: an integer that specifies the number of different sequences to generate
- <number of reads>: an integer that specifies the number of reads to perform on each sequence

The script will generate 2 files under the data directory:

| HVR.fastq         | Contains the original CDR3 sequence                             |
| CuReSim-HVR.fastq | Contains CDR3 after the read simulation, with sequencing errors |