Library: Custom Species
By default, TCRconvert supports alpha-beta and gamma-delta V, D, J, and C genes for human, mouse, and rhesus macaque from the IMGT F+ORF+in-frame P references. For other species, follow these steps:
1. Create a folder of IMGT FASTA files
The simplest way is to download them from IMGT.
Details:
TCRconvert expects a folder containing files ending in .fasta or .fa with headers in the IMGT format:
>SomeText|TRBV10-1*02|MoreText|...
The sequences themselves are not used, so a text file that contains only headers and ends in .fa would also work.
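As a rough illustration of the header-only option described above, the sketch below writes a .fa file containing nothing but IMGT-style header lines and then reads the gene names back from the second pipe-delimited field. The helper names (write_header_only_fasta, gene_names_from_fasta), the file name custom_genes.fa, the "Placeholder" fields, and the second gene name are assumptions made for this example. They are not part of TCRconvert, and the sketch does not check whether TCRconvert requires any further header fields.

```python
import tempfile
from pathlib import Path


def write_header_only_fasta(folder, gene_names, filename="custom_genes.fa"):
    """Write one IMGT-style header line per gene name, with no sequences.

    Hypothetical helper for illustration; not a TCRconvert function.
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / filename
    # The gene name sits in the second pipe-delimited field of the header
    lines = [f">Placeholder|{gene}|Placeholder|" for gene in gene_names]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def gene_names_from_fasta(path):
    """Return the second pipe-delimited field of every header line."""
    names = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if line.startswith(">"):
            fields = line[1:].split("|")
            if len(fields) > 1:
                names.append(fields[1])
    return names


# Build a throwaway folder holding a single header-only file
example_dir = Path(tempfile.mkdtemp())
fasta_path = write_header_only_fasta(example_dir, ["TRBV10-1*02", "TRAJ12*01"])
print(gene_names_from_fasta(fasta_path))
```

A folder prepared this way could then be passed as the first argument to build_lookup_from_fastas() in the next step.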
2. Run build_lookup_from_fastas()
The species parameter should be the species name you’ll use when calling convert_gene().
[1]:
import tcrconvert
import os
# Using our example directory of fasta files
fastadir = tcrconvert.get_example_path('fasta_dir')
# Build lookup tables and note where they're being written to
new_lookup_dir = tcrconvert.build_lookup_from_fastas(fastadir, species='rabbit')
new_lookup_dir
[1]:
'/home/docs/.local/share/tcrconvert/rabbit'
Confirm they exist there:
[2]:
os.listdir(new_lookup_dir)
[2]:
['lookup.csv', 'lookup_from_adaptive.csv', 'lookup_from_tenx.csv']
Details:
species will also be the name of the folder storing lookup tables, so these characters are not allowed:
/ : * ? " < > | ~ ` \n \t
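To catch a bad name before building lookup tables, a small pre-check along these lines may help. It is only a sketch: invalid_species_chars is a hypothetical helper, not a TCRconvert function, and it treats the last two entries of the list above as the newline and tab characters. TCRconvert may apply its own validation with different rules.

```python
# Characters listed above as not allowed in a species name.
# Assumption: the final two entries are newline and tab.
DISALLOWED = set('/:*?"<>|~`\n\t')


def invalid_species_chars(species):
    """Return the sorted disallowed characters found in a species name.

    Hypothetical helper for illustration; an empty list means the name
    passes this particular check.
    """
    return sorted({ch for ch in species if ch in DISALLOWED})


print(invalid_species_chars("rabbit"))     # no disallowed characters
print(invalid_species_chars("rabbit/v2"))  # contains a forward slash
```

Returning the offending characters, instead of a plain True or False, makes it easier to report exactly what needs to change in the name.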