Project DNA Technology
SUBPROJECT IV
COMPUTER ANALYSIS OF NUCLEOTIDE SEQUENCE DATA
Background
Bioinformatics is currently one of the fastest growing fields in science. One branch of bioinformatics focuses on the computational analysis of molecular biological data and develops the software needed for it. A few simple examples are:
- predicting cutting sites in a DNA fragment for given restriction enzymes
- finding the coding sequences in mRNA sequences
- comparing an unknown protein or DNA sequence with all known sequences in a certain database
- predicting the location of genes in a genome
In this sub-project, you will get to know some of these applications. Two neighbouring human genes, αB-crystallin (HspB5) and HspB2, will be used as an example.
Aim
Collect molecular biological data about αB-crystallin and HspB2 using the appropriate software.
Method
Complete assignments 1 to 5. Save the answers to the assignments in a Word file and keep a printed version in your lab journal. Discuss the answers with your supervisor.
Note: the answers to assignment 5 are needed in sub-project I.
The following files with nucleotide-sequences are available:
- αB-HspB2-gen.seq: 6 kb genomic sequence (human) containing the genes for αB crystallin and HspB2
- αB-cDNA.seq: nucleotide sequence of human αB-crystallin cDNA
- HspB2-cDNA.seq: nucleotide sequence of human HspB2 cDNA
- pEGFP.seq: pEGFP plasmid sequence
The files are in FASTA format: the first line holds the name of the sequence and starts with the greater-than symbol (> name), and the sequence itself starts on the next line.
With these files and the appropriate software you can complete the assignments.
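For readers who want to inspect the files outside the web tools, the FASTA layout described above can be read with a few lines of Python. This is only a minimal sketch: the function name read_fasta and the toy record are illustrative assumptions, not part of the course material.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file or file-like object."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


# Toy record (made up for illustration); a real file would be opened with open(...).
example = StringIO(">toy_record\nATGGAC\natcgcc\n")
records = dict(read_fasta(example))
print(records)  # {'toy_record': 'ATGGACATCGCC'}
```

Sequence lines that belong to one record are joined, so a sequence wrapped over several lines comes back as a single string.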
Assignment 1:
Make a drawing of the two genes that includes the following information:
- The intron-exon structure of the αB-crystallin gene and the HspB2 gene
- The length of the exons
- The orientation of the genes, marked with an arrow. Are the start sites of the genes (5′ site) closest to each other (head to head), or the ends (3′ site, tail to tail), or do the genes have the same orientation (head to tail)?
- The length of the intergenic region (the space between the genes). What do you think is remarkable about this region? Which functional elements should be located in this region?
Instructions:
- Go to Basic Local Alignment Search Tool: BLAST
- Select Nucleotide BLAST
- Enter the query sequence: copy-paste the genomic sequence from αB-HspB2-gen.seq in the appropriate place
- Select the database Reference RNA Sequences. This database contains a well-annotated reference sequence of every transcript
- Select organism: human (taxid:9606)
- Begin search by clicking on BLAST
- Only use the top gene structure of αB-crystallin and HspB2
Assignment 2:
Determine the amino acid sequence of αB-crystallin and HspB2.
- Explain what an open reading frame is
- Give the length of the various open reading frames in the cDNA sequence of αB-crystallin and HspB2
- What is the correct open reading frame? Provide good arguments to back up your answer
- Give the amino acid sequence of αB-crystallin and HspB2.
- Calculate the theoretical Molecular Weight and the isoelectric point for both proteins
- To separate the two proteins you can use a size exclusion column (separates proteins based on size) or an ion exchange column (separates proteins based on difference in isoelectric point). Which column can best be used to separate the two proteins?
Instructions:
- Go to Open Reading Frame Finder tool: ORFfinder
- Copy-paste the cDNA sequences for αB-cDNA and HspB2-cDNA in the appropriate place
- Start ORFfinder
- Go to Mw/pI calculator to calculate the Molecular Weight and isoelectric point of αB-crystallin and HspB2
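To make the idea of an open reading frame concrete, the sketch below scans the three forward reading frames of a sequence for stretches that run from an ATG to the first in-frame stop codon and translates them with the standard genetic code. It is a simplified illustration, not a reimplementation of ORFfinder: the function names, the min_codons parameter and the toy sequence are assumptions made here, and the reverse strand is not searched.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_orfs(dna, min_codons=2):
    """Return (frame, start, end, protein) for ATG-to-stop ORFs, longest first.

    Positions are 1-based; 'end' includes the stop codon. Forward strand only.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and CODON_TABLE.get(codon) == "*":
                protein = translate(dna[start:pos])
                if len(protein) >= min_codons:
                    orfs.append((frame + 1, start + 1, pos + 3, protein))
                start = None
    return sorted(orfs, key=lambda orf: len(orf[3]), reverse=True)


# Toy sequence (made up for illustration, not one of the course files).
toy = "CCATGGACATCGCCATCTAAGG"
print(find_orfs(toy))  # [(3, 3, 20, 'MDIAI')]
```

On a real cDNA the list will usually contain several candidate ORFs in different frames, which is exactly why the assignment asks for arguments to select the correct one.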
Assignment 3:
HspB2 and αB-crystallin belong to the same family of proteins based on their homologous sequences. Determine the homologous regions of αB-crystallin and HspB2 in both the nucleotide sequence and the amino acid sequence.
- Make an alignment of the cDNA sequences of αB-crystallin and HspB2
- Make an alignment of the protein sequences of αB-crystallin and HspB2
- Indicate the regions where these proteins are homologous
- Why do people use the protein sequence rather than the cDNA sequence to determine gene/protein relationships?
Instructions:
- Go to Multiple Sequence Alignment tool: ClustalO
- Paste in the two sequences beneath each other. Each sequence must be given a name in FASTA format: the name starts with the greater-than symbol (> name) and the sequence starts on the next line. The files αB-cDNA and HspB2-cDNA already have names in this FASTA format. For the protein sequences, these names need to be added by hand or taken from ORFfinder, e.g. >lcl|ORF1.
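Once an alignment is available, a simple way to put a number on the similarity of two aligned sequences is the percentage of identical positions. The sketch below assumes the two sequences have already been aligned (for example by copying two rows out of the ClustalO result) and that gaps are written as "-". The function name and the toy strings are illustrative assumptions, and percent identity is only a rough proxy for homology.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical residues over the columns without a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns in which both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy alignment rows (made up for illustration).
print(percent_identity("ACGT-ACGT", "ACGTTACCT"))  # 87.5
```

Applying the same function to short stretches of the alignment, rather than to the whole length, gives an impression of which regions are conserved and which are not.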
Assignment 4:
In sub-project II you used an αB-pET16b construct with the αB-crystallin cDNA cloned into the NcoI and XhoI restriction sites (see page 10). You are not happy with the protein product, so you want to make another αB-pET16b construct. You perform a PCR to add new restriction sites to the αB cDNA, so that you can insert the cDNA at another position in the plasmid. You can use the primers listed below. Both primers contain a unique restriction site at the 5′ end (the first six nucleotides of each primer). The primers are designed so that only the coding region of the αB-crystallin cDNA is amplified. The PCR product obtained with these primers will be cut with the corresponding restriction enzymes and subsequently cloned into the pET16b plasmid.
Primer1: CAT ATG GAC ATC GCC ATC C
Primer2: CTC GAG CTA TTT CTT GGG
Instructions:
- Check which restriction sites of the pET16b plasmid should be used to insert the PCR product. Use the webcutter software.
- Determine which of the primers is the forward primer and which is the reverse primer. Do this by aligning each primer with the αB cDNA using ClustalO. Important: use the 'reverse complement' version of the primer to find the sequence of the complementary strand. Note: the restriction sites are not (yet) present in the αB-crystallin sequence, so they will not align!
- The αB-crystallin protein that can be expressed with this new construct has a new feature. What is this new feature and for what purpose can it be used? Hint: look at the map of the pET16b plasmid (see Appendix in the manual).
- What kind of protein product would you expect if you were to use genomic DNA to make this αB-crystallin-pET16b construct?
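The reverse-complement step in the instructions above can also be checked by hand or with a short script. The sketch below strips the restriction site from the 5′ end of a primer and tests whether the remaining annealing part matches the template directly (forward primer) or only as its reverse complement (reverse primer). The function names, the site_length parameter, the toy template and the toy primers are all assumptions made for illustration; only exact matches are found, so it is not a substitute for the ClustalO alignment.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_orientation(primer, template, site_length=6):
    """Classify a primer as forward or reverse against the given template strand.

    The first site_length bases (the added restriction site) are ignored,
    because they are not present in the template. Positions are 1-based.
    """
    anneal = primer.replace(" ", "").upper()[site_length:]
    template = template.upper()
    if anneal in template:
        return "forward", template.find(anneal) + 1
    rc = reverse_complement(anneal)
    if rc in template:
        return "reverse", template.find(rc) + 1
    return "no exact match", None


# Toy template and primers (made up for illustration, not the course sequences).
toy_template = "GGATGAAACCCGGGTTTTAACC"
print(primer_orientation("GAATTC ATGAAACCC", toy_template))  # ('forward', 3)
print(primer_orientation("AAGCTT TTAAAACCC", toy_template))  # ('reverse', 12)
```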
Assignment 5:
Determine which of the three restriction enzymes (BamHI, EcoRI or XhoI) can be used to determine the orientation of the αB-crystallin insert in the pEGFP plasmid (see sub-project I).
Instructions:
- Use the webcutter software to locate the restriction sites of the three enzymes in the αB-cDNA
- Also do this for the plasmid pEGFP or look them up in the map of the pEGFP plasmid in the Appendix of the manual
- Use this information to calculate the size of the fragments you will get with the three possible constructs cut with the different enzymes. The two orientations of the insert can only be distinguished if the fragments that differ between them are larger than 300 bp, because smaller fragments are not visible on an agarose gel. You will need this information later in sub-project I!
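The fragment-size calculation can be sketched as follows: locate every recognition site in the circular construct and take the distances between consecutive sites. The recognition sequences for BamHI, EcoRI and XhoI are the standard ones; everything else (function names, the toy vector and insert) is an assumption for illustration. Fragment lengths are measured from site to site, which on a circular molecule gives the same sizes as measuring between the actual cut positions.

```python
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC", "XhoI": "CTCGAG"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def site_positions(seq, site):
    """0-based start positions of a recognition site in a circular sequence."""
    seq = seq.upper()
    wrapped = seq + seq[:len(site) - 1]  # lets a site span the origin
    return [i for i in range(len(seq)) if wrapped.startswith(site, i)]


def circular_fragments(seq, site):
    """Fragment sizes (largest first) after complete digestion of a circle."""
    positions = site_positions(seq, site)
    n = len(seq)
    sizes = []
    for i, p in enumerate(positions):
        nxt = positions[(i + 1) % len(positions)]
        sizes.append((nxt - p) % n or n)  # a single site linearises the circle
    return sorted(sizes, reverse=True)


# Toy vector and insert (made up for illustration), each with one EcoRI site.
vector = "A" * 10 + "GAATTC" + "A" * 24
insert = "C" * 4 + "GAATTC" + "C" * 20
forward = vector + insert
reverse = vector + reverse_complement(insert)
print(circular_fragments(forward, SITES["EcoRI"]))  # [36, 34]
print(circular_fragments(reverse, SITES["EcoRI"]))  # [50, 20]
```

In the toy example the two orientations give different fragment patterns because the site in the insert is not in its centre. With the real sequences, the same comparison shows which enzyme gives a difference in fragments larger than 300 bp.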
Final Products
- Results of the assignments and answers to the questions in your lab journal. These need to be evaluated with your supervisor before the end of the practical course.