Busch Lab

Zebrafish Genome Literacy Workshop 2023

Exercise 1 - exploring the genome

Go to the region from 54,470,000 to 54,705,000 bp on chromosome 7.
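The coordinates above are written with thousands separators, but the browser's location box is usually given the compact form chromosome:start-end. The snippet below is a minimal sketch of that conversion; the function name `to_location` is made up for illustration and is not part of Ensembl.

```python
def to_location(chromosome: str, start: str, end: str) -> str:
    """Build a 'chromosome:start-end' string from human-readable coordinates."""
    # Strip the thousands separators used in the handout text
    start_bp = int(start.replace(",", ""))
    end_bp = int(end.replace(",", ""))
    if start_bp > end_bp:
        raise ValueError("start must not be greater than end")
    return f"{chromosome}:{start_bp}-{end_bp}"


def region_length(start: str, end: str) -> int:
    """Length in bp of a region whose start and end are both inclusive."""
    return int(end.replace(",", "")) - int(start.replace(",", "")) + 1


location = to_location("7", "54,470,000", "54,705,000")
length = region_length("54,470,000", "54,705,000")
print(location)  # 7:54470000-54705000
print(length)    # 235001
```

Because both ends are inclusive, the region is 235,001 bp long rather than 235,000.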

How many contigs make up this region? Are they clones or whole genome shotgun sequence or both?

2. Both are clones. Count only the contigs shown in the bottom panel; the middle panel also shows contigs from the surrounding area.

How many genes are in this region?

8. 5 on the forward strand. 3 on the reverse strand.

How many transcripts are in this region?

14. 7 on the forward strand. 7 on the reverse strand.

How many of the transcripts on the forward strand come from manual annotation and how many from automatic annotation? Note that manually annotated transcripts are given a source of "Havana", because that's the name of the team that did the manual annotation. Any other sources, like "Ensembl", indicate automated annotation run by Ensembl.

1 comes solely from automatic annotation [Ensembl]. 5 come from both manual and automatic annotation [Ensembl/Havana merge]. And 1 comes from manual annotation [Havana].
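The rule for reading the source label can be written down as a small tally. This is only a sketch: the label strings are the ones quoted in the answer above, and the function name `annotation_origin` is hypothetical. Labels in downloaded files may be spelled differently.

```python
from collections import Counter


def annotation_origin(source: str) -> str:
    """Classify a transcript source label as manual, automatic, or both."""
    label = source.strip().lower()
    if label == "havana":
        # Havana alone means manual annotation only
        return "manual"
    if "havana" in label:
        # A merged label such as "Ensembl/Havana merge" means both
        return "both"
    # Any other source indicates automated annotation
    return "automatic"


# The seven forward-strand transcripts, using the labels from the answer above
forward_sources = ["Ensembl"] + ["Ensembl/Havana merge"] * 5 + ["Havana"]
tally = Counter(annotation_origin(s) for s in forward_sources)
print(dict(tally))
```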

Zoom in on fgf4. There are a number of ways to do this. You could estimate the coordinates and type them in. Or you could draw a box round it on the middle panel or on the bottom panel and jump to it. Or you could draw a box round it and then mark it and then jump to the mark, which will also highlight this region so you can keep track of where you are as you navigate around. Or you could click on the transcript and then click on its location. Try a few of these methods to familiarise yourself with the navigation options.

Once you've zoomed in on fgf4, export the genomic sequence for this region as a FASTA sequence, choosing HTML as the output format. What information is in the header of the FASTA sequence file?

e.g. ">7 dna:chromosome chromosome:GRCz11:7:54617060:54624876:1". The header gives the chromosome, assembly, start, end, strand, etc.
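If you ever need these fields in a script, the header can be split on spaces and colons. The sketch below assumes the header always has the three-part layout of the example above; the field names in `RegionHeader` are our own labels, not official Ensembl terms.

```python
from typing import NamedTuple


class RegionHeader(NamedTuple):
    name: str
    seq_type: str
    coord_system: str
    assembly: str
    seq_region: str
    start: int
    end: int
    strand: int


def parse_header(header: str) -> RegionHeader:
    """Split an exported FASTA header into its parts."""
    # Three space-separated parts: name, sequence type, location
    name, seq_type, location = header.lstrip(">").split()
    # The location is colon-separated
    coord_system, assembly, seq_region, start, end, strand = location.split(":")
    return RegionHeader(name, seq_type, coord_system, assembly,
                        seq_region, int(start), int(end), int(strand))


parsed = parse_header(">7 dna:chromosome chromosome:GRCz11:7:54617060:54624876:1")
print(parsed.assembly, parsed.seq_region, parsed.start, parsed.end, parsed.strand)
# Both ends are inclusive, so add 1 to get the length
print(parsed.end - parsed.start + 1)  # 7817
```

A strand of 1 means the forward strand; your own header will differ if you zoomed in on slightly different coordinates.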

Export an image of the bottom panel as a PDF file and open it. Are there any differences between the browser and the PDF?

Just slight differences in appearance.

What is the stable ID of the fgf4 gene? What is the stable ID of the only transcript of fgf4? What is the version number of the transcript's stable ID? What's the stable ID of the fgf4 protein? How many amino acids does the protein have?

Click on the gene to find out. Gene: ENSDARG00000105230. Transcript: ENSDART00000158898. Transcript version: 2. Protein: ENSDARP00000135882. Protein length: 191 amino acids.
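The stable IDs above follow a regular pattern: "ENS", a species prefix ("DAR" for zebrafish), a letter for the feature type, an 11-digit number and, optionally, a version after a dot. The sketch below pulls those parts out with a regular expression. It only covers gene, transcript, protein and exon IDs, and the function name `parse_stable_id` is made up for illustration.

```python
import re

# ENS + optional 3-letter species prefix + feature letter + 11 digits + optional .version
STABLE_ID = re.compile(r"^ENS([A-Z]{3})?([GTPE])(\d{11})(?:\.(\d+))?$")

FEATURE_TYPES = {"G": "gene", "T": "transcript", "P": "protein", "E": "exon"}


def parse_stable_id(stable_id: str) -> dict:
    """Break a stable ID into species prefix, feature type, number and version."""
    match = STABLE_ID.match(stable_id)
    if match is None:
        raise ValueError(f"not a recognised stable ID: {stable_id}")
    prefix, feature, number, version = match.groups()
    return {
        "species_prefix": prefix,  # None when there is no prefix
        "feature": FEATURE_TYPES[feature],
        "number": int(number),
        "version": int(version) if version is not None else None,
    }


for stable_id in ("ENSDARG00000105230", "ENSDART00000158898.2", "ENSDARP00000135882"):
    print(stable_id, parse_stable_id(stable_id))
```

The version is not part of the ID proper, which is why the transcript is still found when you search for it without ".2".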

Go to the archive site for GRCz10 and find the equivalent region. Hint: You could search for the contig on which fgf4 is located. Look at the surrounding genes. What differences do you see compared with the current annotation on GRCz11?

fgf19 had an additional processed transcript. lto1 was called oraov1.

Use this sequence to run both BLASTN and BLAT against the zebrafish genome:

CAGAAGGCTGCTCCGTCATGCACTAGGTAGTGTGTGATGTAATTTCGCTGCAGATATAAATATCGTCGGC
AAGGGAAAGTATACAGCATGGCTGTGCGTGGCTAGGATTCACAGTCAAAATTTTCCAACTCTTTTTGACA
GCTTGAACAACAGTCTAAAGTATATTGTCTTGAGAGGAAACCCACTGAGCGCGACTTCATGCCGCTCACT
TGGGTTGACCGTAGCTCTGTCCGTGAGTAAGCTGTTTTGTGCGTCTTTTTGGCTGCCGACTCAAGCATAG
AGAAAAACGGTAGCCGGTGTCTTCAACTCCTTTTAGAAGGATGAGTGTCCAGTCGGCCCTCTTGCCAATC
CTGGTCTTAGGACTAATGACAAGCTCTGTGCGCTGCGCTCCGCTGCCCGGTGGACACAGCGGCCCCGTAG
AGCGACGCTGGGAGACCCTCTACTCGCGTTCACTAGCACGAATCCCCGGGGAAAAAAGAGATATCAGCAG
GGACAGTGATTATCTCACGGGCATCAAAAGACTCCGACGTCTCTACTGCAACGTTGGCATTGGGTTTCAT
CTTCAAGTATTACCGGGTGGTAAAATCACCGGCGTACACAACGAAAACCGCTACAGTCTTCTTGAGATAT
CTCCGGTGGAGAGGGGAGTTGTGACACTGTTTGGCGTGCGGAGCGGGCTATTCGTGGCCATGAACAGCAA
AGGGAAGCTTTACGGATCTGAGCAGTTCACAAACGAATGCAAATTCAGAGAAAAGCTCCTCGCAAATAAT

What gene is found? Why does it hit multiple times? Are there any differences between the results for BLASTN and BLAT?

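3 hits to fgf4. There are multiple hits because the sequence spans several exons, and each exon aligns to the genome separately. The lengths of the hits differ slightly between BLASTN and BLAT. BLASTN also has a match to fgf6b.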

Find your favourite gene on ZFIN and follow one of the links to Ensembl. Switch to the "Region in detail" view. Mark your gene and then explore the region around it. Is it on clone or whole genome shotgun sequence? Are the nearby genes manually or automatically annotated? What types of gene and transcript are represented? How long are the UTRs? If something looks weird then ask your neighbour about it and if they also think it looks weird then ask us about it!