# Turterra

Turterra is a portal for analysing protein families. It consists of two main parts: turterra, which runs a web portal from a folder tree, and turterra-build, which generates any files missing from that folder tree, starting from a .fasta file and a directory of templates for homology modelling.

## Installation

Turterra and turterra-build are installed together as follows:

First, we recommend you install [Anaconda](https://www.anaconda.com/products/individual-b) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html). Then, create a new conda environment for turterra and activate it:

```sh
conda create -n turterra python=3.9
conda activate turterra
```

Next, clone the turterra repository into a location of your choice, navigate to the folder, and install turterra.

```sh
conda install -c bioconda epa-ng hmmer muscle fasttree
git clone https://github.com/TurtleTools/turterra.git
cd turterra
pip install .
```

The majority of turterra's dependencies are installed through the provided setup.py file. However, some dependencies will need to be installed through conda.

```sh
conda install -c bioconda epa-ng hmmer muscle fasttree
```
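
To confirm that the conda-installed tools are reachable from the active environment, a small check along the following lines may help. This is only a sketch: the executable names in `ASSUMED_EXECUTABLES` are assumptions about what the conda packages install, not something this README specifies, so adjust them to your setup.

```python
import shutil

# Assumed executable names; the conda packages may install them under
# different names, so adjust this list to match your environment.
ASSUMED_EXECUTABLES = ["epa-ng", "hmmbuild", "muscle", "FastTree"]


def missing_tools(names=ASSUMED_EXECUTABLES):
    """Return the executables from names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    for name in missing_tools():
        print(f"not found on PATH: {name}")
```

If nothing is printed, every listed executable was found on your `PATH`.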

Congratulations! Turterra was installed!

## Turterra folder architecture

In order to run turterra with your own data, create a folder called 'data' in the top-level folder called turterra. This folder should contain the following files and folders:

```
turterra
    |--data
        |--data.txt
        |--sequences.fasta
        |--sequence_alignment.fasta
        |--smiles.tsv
        |--structure_alignment.fasta
        |--structures
            |--sequence1_model.pdb
            |--sequence2.pdb
            |--sequence3_model.pdb
            |--...
        |--tree.txt
```
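
Before starting the portal, it can be useful to check which of these entries are still missing from your data folder. The sketch below is not part of turterra; the helper name `missing_entries` is hypothetical, and the expected names are copied from the tree above.

```python
from pathlib import Path

# Entries taken from the folder tree above; "structures" is a directory.
EXPECTED_FILES = [
    "data.txt",
    "sequences.fasta",
    "sequence_alignment.fasta",
    "smiles.tsv",
    "structure_alignment.fasta",
    "tree.txt",
]
EXPECTED_DIRS = ["structures"]


def missing_entries(data_dir):
    """Return the names of expected files and folders absent from data_dir."""
    data_dir = Path(data_dir)
    missing = [name for name in EXPECTED_FILES if not (data_dir / name).is_file()]
    missing += [name for name in EXPECTED_DIRS if not (data_dir / name).is_dir()]
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the top-level turterra folder.
    for name in missing_entries("data"):
        print(f"missing: {name}")
```

Any entries reported as missing are candidates for generation with turterra-build.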

| file name | file contents |
| ------ | ------ |
| data.txt | A tab-separated file with category names in the first row and the data for each sequence in the following rows. Any category can be defined; these are the categories that turterra can later filter your data on. Currently, the categories 'Accession', 'Species' and 'Compounds' should always be present. |
| sequences.fasta | A .fasta file containing all the sequences in the analysis, with the accessions specified in data.txt as headers. |
| sequence_alignment.fasta | A .fasta file containing an alignment of all sequences in the analysis, with the accessions specified in data.txt as headers. |
| smiles.tsv | A tab-separated file with the header 'Name\tSMILES', listing the names of all compounds in the analysis and their corresponding structures in [SMILES format](http://opensmiles.org/opensmiles.html). |
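
The requirements in the table can be checked with a short script before the data is loaded. The following is a minimal sketch, not turterra's own validation: all function names are hypothetical, and it assumes that each .fasta header line consists of the accession only.

```python
import csv

# Categories that the table above says should always be present.
REQUIRED_CATEGORIES = {"Accession", "Species", "Compounds"}


def read_accessions(data_path):
    """Read data.txt and return the values of its 'Accession' column."""
    with open(data_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        missing = REQUIRED_CATEGORIES - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"data.txt lacks categories: {sorted(missing)}")
        return [row["Accession"] for row in reader]


def read_fasta_headers(fasta_path):
    """Return the header of every record in a .fasta file, without the '>'."""
    with open(fasta_path) as handle:
        return [line[1:].strip() for line in handle if line.startswith(">")]


def check_smiles_header(smiles_path):
    """Return True if the first line of smiles.tsv is 'Name<TAB>SMILES'."""
    with open(smiles_path) as handle:
        return handle.readline().rstrip("\r\n") == "Name\tSMILES"


def unmatched_accessions(data_path, fasta_path):
    """Return accessions in data.txt that have no matching .fasta header.

    Assumes the whole header line (after '>') is the accession.
    """
    headers = set(read_fasta_headers(fasta_path))
    return [acc for acc in read_accessions(data_path) if acc not in headers]
```

For example, `unmatched_accessions("data/data.txt", "data/sequences.fasta")` returns an empty list when every accession in data.txt has a sequence, and the same call with `sequence_alignment.fasta` checks the alignment.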
