AIP Hands-on Answer Key
Tip: At any point, you can consult the ThaleMine user guide for more information (https://www.araport.org/thalemine/user-guide).
Collecting a List of sucrose transporters from a Search
You are interested in the SUC family of sucrose transporters in Arabidopsis. To find these genes in ThaleMine, enter SUC* into either search box (the wildcard * is used because we know there is more than one SUC gene).
Look at the Hits by Category. How many genes have we found? How many SUC genes?
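As a small illustration of what the trailing wildcard means, the sketch below applies the same SUC* pattern to a handful of made-up symbols using Python's standard fnmatch module. The symbol list is hypothetical and is not ThaleMine output; the real search also looks at other fields (such as synonyms), so treat this only as a picture of the pattern itself.

```python
from fnmatch import fnmatchcase

# Hypothetical symbols for illustration only; not actual ThaleMine search results.
symbols = ["SUC1", "SUC7", "SUCB", "SUS1", "ASUC"]

# "SUC*" matches any symbol that starts with "SUC",
# followed by zero or more further characters.
hits = [s for s in symbols if fnmatchcase(s, "SUC*")]
print(hits)
```

Symbols that merely contain SUC somewhere other than the start (ASUC), or that differ in the third letter (SUS1), are not matched.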
(Hint: They are all in the top 10)
Clicking on a gene will take you to the Gene Report Page. Try SUC7. How many transcripts does it have? How many exons are in each transcript?
Where is SUC7 located in the cell? (Hint: Check GO terms under Function)
According to the eFP Visualization, where is it expressed?
Clicking on SUC7_ARATH in the Proteins stripe will take you to the SUC7 Protein Report Page. How many transmembrane domains are predicted?
Select all the SUC genes and create a list. This should take you to the List Analysis page for your new_Gene_list_1.
Take a look at the Report table. What are the 4 operations you can perform on the columns?
(Hint: Mouse over icons)
Sort this column
Remove this column
Toggle column visibility
Explore the List Analysis page. What are some GO Terms enriched in this list?
Biological process: sucrose metabolic process
Cellular component: integral to plasma membrane
Molecular function: sucrose transmembrane transporter activity
This list will be saved in your MyMine account. Give your list a Description and then verify that it has been saved by clicking the View button under the Lists tab at the top of your screen. You can also tag, rename, and share your lists from your MyMine tab. Rename your list “Sucrose Transporters” by clicking the Edit icon (this icon will be seen throughout ThaleMine).
Further research tells you there are 9 SUC genes. Do another search for SUC* to find the missing SUC genes.
(Hint: Look for the synonyms field on the Gene Report Pages for synonymous gene symbols.)
Add the missing genes to your list. (Hint: You can add genes to an existing list from the Gene Report Page or by creating a list and adding the contents of that list to an existing one.)
Find promoter cis-regulatory elements and confirm alternative splicing using JBrowse.
You want to know whether any SUC genes are responsive to auxin signaling, so you decide to search their promoters for Auxin Response Factor (ARF) binding sites (TGTCTC). You can collect this information from JBrowse.
First create a spreadsheet containing your list of genes. From the List Analysis page, download your list of 9 sucrose transporters into a spreadsheet with the following 4 columns:
DB Identifier | Chromosome . Primary Identifier | Chromosome Location . Start | Chromosome Location . End
(Hint: Format as Spreadsheet - comma separated values.)
Record in your spreadsheet which SUC genes have the motif TGTCTC in their promoter region, using the JBrowse sequence search track (File -> Add sequence search track).
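If you would like to double-check a JBrowse hit by hand, the motif search can be sketched in a few lines of Python. This is a minimal sketch, not how JBrowse implements its search track: the function names are hypothetical, the promoter string is a made-up sequence rather than real Arabidopsis sequence, and it assumes you want hits on both strands (the reverse complement of TGTCTC is GAGACA).

```python
MOTIF = "TGTCTC"  # ARF binding site named in the exercise

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_motif(seq, motif=MOTIF):
    """Return (0-based start, strand) for every motif occurrence on either strand."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            # Step by one so overlapping occurrences are also reported.
            start = seq.find(pattern, start + 1)
    return sorted(hits)

# Made-up sequence for illustration; not a real promoter.
promoter = "AATGTCTCGGGAGACATT"
print(find_motif(promoter))
```

Positions are 0-based offsets into the string you pass in, so they need to be shifted by the region's start coordinate before comparing them with the chromosome coordinates in your spreadsheet.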
You may have noticed earlier that SUC3 (AT2G02860) has two transcripts. Staying in JBrowse, determine what the difference is between the two transcripts.
Alternative start and 4th exon
You want to verify the alternative splicing with RNA-seq data. JBrowse provides the functionality of opening data in your local instance through uploads or URLs. Open an RNA-seq read mapping (BAM file and its index) at the following URLs:
(File -> Open -> Remote URLs)
Can you verify the alternative splicing with the mapped RNA-seq reads? (Hint: Reads mapped across introns are connected by black lines.)
Yes, reads show spliced mapping to both exon boundaries.
Which SUC gene is most likely important in pollen germination?
Suppose you are interested in sucrose transport in pollen germination. Template Queries can be run with either a single object or a list of objects. Run the Genes → Expression template with your saved list of sucrose transporters.
How many rows are in your table?
How else could you have retrieved this table?
The List Analysis page pre-computes the Genes → Expression template.
Use the column summary feature to constrain the table to only results from Experiments -> Title containing “pollen”. How many experiments were there and how many rows are now found?
2 experiments and 338 rows
Which gene is most highly expressed?
Now use the Filters button to limit the table to Expressions -> Average Ratio > 20.
What genes are still left?
AT1G66570, AT2G14670, AT5G06170, AT5G43610
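The two filtering steps above (experiment title containing "pollen", then Average Ratio > 20) amount to a simple row filter. The sketch below shows the logic on a hypothetical table: the gene names, the "Root development" title, and all ratio values are invented placeholders, and the dictionary keys are assumptions rather than ThaleMine column names. Only the two pollen experiment titles come from this exercise.

```python
# Placeholder rows for illustration only; values are not real expression data.
rows = [
    {"gene": "GENE_A", "experiment": "Expression data from Arabidopsis pollen", "average_ratio": 35.0},
    {"gene": "GENE_B", "experiment": "Pollen", "average_ratio": 4.2},
    {"gene": "GENE_C", "experiment": "Root development", "average_ratio": 50.0},
    {"gene": "GENE_A", "experiment": "Pollen", "average_ratio": 22.5},
]

def filter_rows(rows, keyword, min_ratio):
    """Keep rows whose experiment title contains keyword (case-insensitive)
    and whose average ratio is strictly greater than min_ratio."""
    keyword = keyword.lower()
    return [
        r for r in rows
        if keyword in r["experiment"].lower() and r["average_ratio"] > min_ratio
    ]

kept = filter_rows(rows, "pollen", 20)
genes_left = sorted({r["gene"] for r in kept})
print(len(kept), genes_left)
```

A gene can appear in several surviving rows, which is why the remaining genes are collected into a set before being listed.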
Which gene is most highly induced compared to its matched control?
Is this confirmed with the eFP Visualization?
(Hint: Go to the Gene Report Page in new window and select Data Source - Tissue Specific.)
Let’s change how the table looks. Try using the column operators to remove the Probes Name column and hide the Experiment Category column. Now use the Manage Columns button to add the Gene Symbol and rearrange the table so that the Gene Symbol is the second column and the Sample Description is the third. (Note: You can always Undo any changes.)
There are often a number of different ways to perform the same task in ThaleMine. Let’s see if we can recreate our Report table above by editing the Genes → Expression template in the Query Builder.
Go to the template and click Edit Query.
The Query Overview will give you hints on where to find the fields you want to constrain in the Model browser. Remember, we want:
1) Gene constrained by your list.
2) Gene Symbol added.
3) Experiment Title: ONE OF Expression data from Arabidopsis pollen, Pollen.
4) Expressions Average Ratio greater than 20.
5) Remove Gene Probes Name.
6) Set Gene Symbol as second column and Sample Description as the third.
You can see how you’re doing at any time by clicking Show results. Simply click Query in the upper left to return to the query builder.
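For readers who prefer to see the finished query written out, the six requirements can be sketched as query XML built with Python's standard library. This is only a sketch under stated assumptions: it follows the general PathQuery XML layout that InterMine-based sites can export, and every path in it (for example Gene.expressions.averageRatio) is a hypothetical placeholder. Take the real paths from the Model browser or from the XML exported by your own query before relying on it.

```python
import xml.etree.ElementTree as ET

# Output columns in display order. All paths are hypothetical placeholders:
# Gene Symbol is second, Sample Description third, and Probes Name is left out.
VIEW = [
    "Gene.primaryIdentifier",
    "Gene.symbol",
    "Gene.expressions.sample.description",
    "Gene.expressions.averageRatio",
]

query = ET.Element("query", {"model": "genomic", "view": " ".join(VIEW)})

# 1) Gene constrained by the saved list.
ET.SubElement(query, "constraint",
              {"path": "Gene", "op": "IN", "value": "Sucrose Transporters"})

# 3) Experiment title is ONE OF the two pollen experiments.
one_of = ET.SubElement(query, "constraint",
                       {"path": "Gene.expressions.experiment.title", "op": "ONE OF"})
for title in ("Expression data from Arabidopsis pollen", "Pollen"):
    ET.SubElement(one_of, "value").text = title

# 4) Average ratio greater than 20.
ET.SubElement(query, "constraint",
              {"path": "Gene.expressions.averageRatio", "op": ">", "value": "20"})

xml_text = ET.tostring(query, encoding="unicode")
print(xml_text)
```

Requirements 2, 5, and 6 are all expressed through the order and contents of the view list, while requirements 1, 3, and 4 each become one constraint.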
This concludes the exercises.