Use PSI-BLAST at NCBI to find all members of the aquaporin family. How many iterations do you need? What is the principal difference between BLAST and PSI-BLAST?
PSI-BLAST is a sensitive method, but it can be used only in some cases. When we have to rely on BLAST or FASTA, we need to know how sensitive they are to weak similarities (larger genetic distances). We will use simulation to find out at what distances the programs stop being reliable, that is, where their limits lie.
Create a program for mutating DNA and proteins. For simplicity, insertions and deletions do not need to be considered. For example, a DNA mutagenesis program could take as arguments the name of a FASTA file and either the number of mutations or the number of mutations per unit length. It should be possible to preserve the sequence composition (e.g. GC content) and to enable or disable multiple mutations at the same position. A protein mutagenesis program could (using switches) take into account the genetic relatedness of individual amino acids and keep the frequencies of individual amino acids or their groups (approximately) the same. (For non-programmers: use EMBOSS or SMS2.) The idea is for the programs to approximately simulate real biological processes.
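One possible starting point for the DNA part is sketched below. It is a minimal illustration, not a reference solution: the function name `mutate_dna` and its parameters are hypothetical, indels are ignored as allowed above, and "keeping composition" is approximated by drawing replacement bases from the base frequencies of the input sequence.

```python
import random

def mutate_dna(seq, n_mutations, allow_multiple=True, keep_composition=True, rng=None):
    """Introduce point substitutions into a DNA string (no insertions/deletions).

    allow_multiple=True lets the same position be hit more than once (so a
    later hit may even restore the original base); False forces distinct sites.
    keep_composition=True draws new bases from the input's own base
    frequencies, which keeps GC content approximately (not exactly) constant.
    """
    rng = rng or random.Random()
    seq = list(seq.upper())
    if not allow_multiple and n_mutations > len(seq):
        raise ValueError("more mutations requested than positions available")
    # Pool of replacement bases: observed composition, or uniform A/C/G/T.
    pool = [b for b in seq if b in "ACGT"] if keep_composition else list("ACGT")
    if allow_multiple:
        positions = [rng.randrange(len(seq)) for _ in range(n_mutations)]
    else:
        positions = rng.sample(range(len(seq)), n_mutations)
    for pos in positions:
        old = seq[pos]
        choices = [b for b in pool if b != old]
        if choices:  # empty only for a homopolymer with keep_composition=True
            seq[pos] = rng.choice(choices)
    return "".join(seq)
```

A rate per unit length can be converted to a count beforehand, for example `n_mutations = round(rate * len(seq))`. Reading the FASTA file and exposing the switches on the command line are left out of the sketch.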
Randomly select one gene from the bacterial gene database (/mnt/shared/GAA2024/db/gbbct1.seq.gz).
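The random choice can be made without loading the whole file into memory. The sketch below assumes the database is a gzipped GenBank flat file in which each coding sequence appears as a `CDS` feature line; the helper name `pick_random_cds` is hypothetical. It only reports the record name and the first line of the feature location, so extracting the actual sequence (including `complement(...)` or `join(...)` locations) still has to be done separately.

```python
import gzip
import random

def pick_random_cds(path, rng=None):
    """Reservoir-sample one CDS feature from a gzipped GenBank flat file.

    Returns (locus_name, location_string), or None if no CDS line was found.
    Only the first line of a multi-line location is captured.
    """
    rng = rng or random.Random()
    chosen, seen, locus = None, 0, None
    with gzip.open(path, "rt", encoding="ascii", errors="replace") as handle:
        for line in handle:
            if line.startswith("LOCUS"):
                locus = line.split()[1]
            elif line.startswith("     CDS "):
                seen += 1
                # Keep the k-th CDS with probability 1/k, so every CDS in the
                # file ends up equally likely after a single pass.
                if rng.randrange(seen) == 0:
                    chosen = (locus, line.split()[1])
    return chosen
```

Called as `pick_random_cds("/mnt/shared/GAA2024/db/gbbct1.seq.gz")`, it returns one record name and location to start from.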
Use the mutagenesis program to mutate it gradually and test for how long you can still find its unmutated version in the database using the FASTA and BLASTN programs. (We are mainly interested in how the E-value and bit score change and what the signal-to-noise ratio is.)
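The assignment does not define the signal-to-noise ratio, so the sketch below uses one possible working definition as an assumption: the best bit score of the true (unmutated) database entry divided by the best bit score of any other hit. It expects the default 12-column tabular output of BLAST+ (`-outfmt 6`), where column 2 is the subject ID and column 12 the bit score; the function name and the `true_id` parameter are hypothetical.

```python
def signal_to_noise(tabular_text, true_id):
    """Ratio of the true hit's best bit score to the best unrelated bit score.

    tabular_text: BLAST tabular output (12 tab-separated columns per line).
    true_id: subject ID of the unmutated source sequence in the database.
    Returns 0.0 if the true hit is missing and inf if there are no other hits.
    """
    best_true, best_other = None, None
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject, bitscore = fields[1], float(fields[11])
        if subject == true_id:
            best_true = bitscore if best_true is None else max(best_true, bitscore)
        else:
            best_other = bitscore if best_other is None else max(best_other, bitscore)
    if best_true is None:
        return 0.0
    if best_other is None:
        return float("inf")
    return best_true / best_other
```

Note that the "other" hits may include genuine homologs of the chosen gene, so with this definition the ratio is a conservative estimate of how well the true hit stands out.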
Translate the mutated gene into a protein and test the sensitivity using the TBLASTN and TFASTA programs.
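The translation step can be done with a few lines of code. The sketch below builds the standard genetic code table and translates frame 1 of the forward strand only; the function name `translate` and its `stop_at_first_stop` switch are hypothetical, and alternative bacterial start codons are not handled. Codons containing ambiguous bases are rendered as `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna, stop_at_first_stop=False):
    """Translate a DNA string in frame 1; '*' marks stop codons."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # unknown codon -> X
        if aa == "*" and stop_at_first_stop:
            break
        protein.append(aa)
    return "".join(protein)
```

Heavily mutated genes will often contain premature stop codons, so it is worth deciding in advance whether the protein query should be truncated at the first stop or translated through it.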
Search NCBI for the protein product of your gene (or translate the DNA), mutate the protein, and test the sensitivity of TBLASTN and TFASTA in the same way.
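For the protein mutagenesis, one simple way to respect amino acid similarity is to prefer substitutions within a physicochemical group. The sketch below is only an illustration under stated assumptions: the grouping in `GROUPS`, the function name `mutate_protein`, and the `conservative_prob` parameter are all hypothetical choices, each position is mutated at most once, and amino acid frequencies are not explicitly preserved.

```python
import random

# Assumption: an illustrative grouping of the 20 amino acids, not a standard.
GROUPS = ["AVLIMC", "FWY", "STNQ", "KRH", "DE", "GP"]
GROUP_OF = {aa: group for group in GROUPS for aa in group}
ALL_AMINO_ACIDS = "".join(GROUPS)

def mutate_protein(seq, n_mutations, conservative_prob=0.7, rng=None):
    """Substitute n_mutations distinct positions in a protein string.

    With probability conservative_prob the replacement comes from the same
    group as the original residue; otherwise from all 20 amino acids.
    """
    rng = rng or random.Random()
    seq = list(seq.upper())
    for pos in rng.sample(range(len(seq)), n_mutations):
        old = seq[pos]
        group = GROUP_OF.get(old)  # None for symbols such as X or *
        if group is not None and rng.random() < conservative_prob:
            candidates = group
        else:
            candidates = ALL_AMINO_ACIDS
        seq[pos] = rng.choice([aa for aa in candidates if aa != old])
    return "".join(seq)
```

A substitution matrix such as BLOSUM62 could replace the fixed groups if a finer model of amino acid relatedness is wanted.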
Prepare a brief, clear presentation of the results (PowerPoint, PDF, document, HTML, according to personal preference) that passes on the experience you gained to next year's students.