diff --git a/README.md b/README.md
new file mode 100644
index 0000000..3814e6e
--- /dev/null
+++ b/README.md
@@ -0,0 +1,69 @@
+# locigenesis
+
+locigenesis is a tool that generates a human T-cell receptor (TCR), runs
+it through a sequence reader simulation tool and extracts CDR3.
+
+The goal of this project is to generate both HVR sequences with and
+without sequencing errors, in order to create datasets for a Machine
+Learning algorithm.
+
+## Technologies
+
+- [immuneSIM](https://github.com/GreiffLab/immuneSIM/): in silico
+  generation of human and mouse BCR and TCR repertoires
+- [CuReSim](http://www.pegase-biosciences.com/curesim-a-customized-read-simulator/):
+  read simulator that mimics Ion Torrent sequencing
+
+## Installation
+
+This project uses [Nix](https://nixos.org/) to ensure reproducible
+builds.
+
+1. Install Nix (compatible with MacOS, Linux and
+   [WSL](https://docs.microsoft.com/en-us/windows/wsl/about)):
+
+```bash
+curl -L https://nixos.org/nix/install | sh
+```
+
+2. Clone the repository:
+
+```bash
+git clone https://git.coolneng.duckdns.org/coolneng/locigenesis
+```
+
+3. Change the working directory to the project:
+
+```bash
+cd locigenesis
+```
+
+4. Enter the nix-shell:
+
+```bash
+nix-shell
+```
+
+After running these commands, you will find yourself in a shell that
+contains all the needed dependencies.
+
+## Usage
+
+An execution script that accepts 2 parameters is provided, the following
+command invokes it:
+
+```bash
+./generation.sh
+```
+
+- \: an integer that specifies the number of
+  different sequences to generate
+- \: an integer that specifies the number of reads
+  to perform on each sequence
+
+The script will generate 2 files under the data directory:
+
+  ------------------- -----------------------------------------------------------------
+  HVR.fastq           Contains the original CDR3 sequence
+  CuReSim-HVR.fastq   Contains CDR3 after the read simulation, with sequencing errors
+  ------------------- -----------------------------------------------------------------
diff --git a/README.org b/README.org
deleted file mode 100644
index f51bc55..0000000
--- a/README.org
+++ /dev/null
@@ -1,56 +0,0 @@
-* locigenesis
-
-locigenesis is a tool that generates a human T-cell receptor (TCR), runs it through a sequence reader simulation tool and extracts CDR3.
-
-The goal of this project is to generate both HVR sequences with and without sequencing errors, in order to create datasets for a Machine Learning algorithm.
-
-** Technologies
-
-- [[https://github.com/GreiffLab/immuneSIM/][immuneSIM]]: in silico generation of human and mouse BCR and TCR repertoires
-- [[http://www.pegase-biosciences.com/curesim-a-customized-read-simulator/][CuReSim]]: read simulator that mimics Ion Torrent sequencing
-
-** Installation
-
-This project uses [[https://nixos.org/][Nix]] to ensure reproducible builds.
-
-1. Install Nix (compatible with MacOS, Linux and [[https://docs.microsoft.com/en-us/windows/wsl/about][WSL]]):
-
-#+begin_src shell
-curl -L https://nixos.org/nix/install | sh
-#+end_src
-
-1. Clone the repository:
-
-#+begin_src shell
-git clone https://git.coolneng.duckdns.org/coolneng/locigenesis
-#+end_src
-
-3. Change the working directory to the project:
-
-#+begin_src shell
-cd locigenesis
-#+end_src
-
-4. Enter the nix-shell:
-
-#+begin_src shell
-nix-shell
-#+end_src
-
-After running these commands, you will find yourself in a shell that contains all the needed dependencies.
-
-** Usage
-
-An execution script that accepts 2 parameters is provided, the following command invokes it:
-
-#+begin_src shell
-./generation.sh
-#+end_src
-
-- : an integer that specifies the number of different sequences to generate
-- : an integer that specifies the number of reads to perform on each sequence
-
-The script will generate 2 files under the data directory:
-
-| HVR.fastq         | Contains the original CDR3 sequence                             |
-| CuReSim-HVR.fastq | Contains CDR3 after the read simulation, with sequencing errors |
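
Note (not part of the patch): the Usage section says the script writes two FASTQ files under the data directory. A minimal sketch of a sanity check on those outputs could look like the following. The helper name `count_fastq_records` is hypothetical, and it assumes the plain four-lines-per-record FASTQ layout with no wrapped sequence lines; the README does not confirm how the files are laid out.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository.
# Counts the records in a FASTQ file, assuming each record occupies exactly
# four lines (header, sequence, '+', qualities) and the file is uncompressed.
count_fastq_records() {
    local file="$1"
    local lines
    # Read through stdin so that wc prints only the number, without the filename.
    lines=$(wc -l < "$file")
    echo $(( lines / 4 ))
}
```

Assuming the files land in a relative `data/` directory, `count_fastq_records data/HVR.fastq` and `count_fastq_records data/CuReSim-HVR.fastq` would print one count each. Whether the two counts should match depends on how the read simulation is configured, which the README does not state.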