modMine Worm Guide
Introduction
The aim of this document is to help researchers access and use the data in modMine for the detailed analysis papers. It describes where to find specific sets of data, how to view and download them, and how to access and query specific subsets of the data. NOTE: this document is still a draft. Templates and lists can be added to modMine at any time, so if there is a query or list that you would like included, please let us know and we will try to add it for you. Please send any comments, or anything you would like added to this document, to rachel@flymine.org.
Section 1: Classes of elements:
Protein Coding Genes:
There are currently no C. elegans submissions that have generated Gene models.
Small regulatory RNAs / Novel non-coding RNAs
Project: The C. elegans Transcriptome
PI: Robert Waterston
Experiment: Identification of small RNAs in C. elegans: http://intermine.modencode.org/release-15/experiment.do?experiment=Identification%20of%20small%20RNAs%20in%20C.%20elegans
Experiment type: RNA-seq
Factors: Strain, Developmental stage
Features generated: This experiment has no features in modMine.
GBrowse tracks and Downloads: The experiment page (link above) provides links to GBrowse tracks and to the raw data files for each submission for download.
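The experiment links in this guide follow a regular pattern: the experiment name is percent-encoded and appended to experiment.do. A minimal sketch of building such a link in Python is shown below. The function name experiment_url is hypothetical, and the base URL is simply copied from the release-15 links in this guide, so it may not apply to other releases.

```python
from urllib.parse import quote

# Base URL copied from the release-15 links in this guide (an assumption
# for any other release).
BASE = "http://intermine.modencode.org/release-15"


def experiment_url(name, base=BASE):
    """Return the experiment page URL for an experiment name (illustrative helper)."""
    # quote() percent-encodes spaces as %20 and apostrophes as %27.
    return f"{base}/experiment.do?experiment={quote(name)}"


print(experiment_url("Identification of small RNAs in C. elegans"))
```

Note that the 3′UTR experiment link encodes a plain ASCII apostrophe (%27) in the name, not the prime character used in the text of this guide.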
Project: The 3′UTRome
PI: Fabio Piano
Experiment: Small RNA expression in C. elegans embryos
Experiment type: RNA-seq
Factors: Strain, Developmental stage, Temperature, Fixative, Fixation state
Features generated: This experiment has no features in modMine.
GBrowse tracks and Downloads: There is currently no data or GBrowse tracks available for this project.
3′UTR
Project: The 3′UTRome
PI: Fabio Piano
Experiment: Encyclopedia of C. elegans 3′UTRs and their regulatory elements: http://intermine.modencode.org/release-15/experiment.do?experiment=Encyclopedia%20of%20C.%20elegans%203%27UTRs%20and%20their%20regulatory%20elements
Experiment type:
Factors: Strain, Developmental stage, Temperature
Features generated: Gene, mRNA, 3′RACE clone, polyA site, CDS, EST, 3′RST, 3′UST, 3′UTR
GBrowse tracks and Downloads: The experiment page (link above) provides links to GBrowse tracks and to the raw data files for each submission for download.
Further Analysis:
How do I find/download all the 3′UTRs identified by the project together with their associated genes?
All the 3′UTRs identified by this experiment can be viewed and downloaded from the experiment page (link above). However, this does not give the genes associated with the 3′UTRs. To view or download the 3′UTRs together with their associated genes, run the following template (its default value is set to this experiment):
- Experiment –> 3′UTRs + Genes: http://intermine.modencode.org/query/template.do?name=Experiment_ThreePrimeUTR_Genes&scope=all
How do I find all the 3′UTRs on a particular chromosome and their associated genes?
Use the following template. Note that this template only finds the 3′UTRs identified by modENCODE submissions, not those identified by the genome projects.
- Chromosome –> 3′UTRs + Genes: http://intermine.modencode.org/query/template.do?name=Chromosome_ThreePrimeUTR_Gene&scope=all
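The template links in this guide all share the same shape: template.do followed by a name and a scope parameter. The sketch below assembles such a link from a template name. The helper template_url is hypothetical, and the base URL and default scope are taken from the links in this guide rather than from any documented API.

```python
from urllib.parse import urlencode

# Base URL as used by the template links in this guide.
QUERY_BASE = "http://intermine.modencode.org/query"


def template_url(name, scope="all", base=QUERY_BASE):
    """Return the link to a template page (illustrative helper)."""
    params = urlencode({"name": name, "scope": scope})
    return f"{base}/template.do?{params}"


print(template_url("Chromosome_ThreePrimeUTR_Gene"))
```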
How do I find the 3′UTR(s) identified for a particular gene or list of genes?
Currently the genes associated with the 3′UTRs submitted by this project are not merged with the WormBase genes in modMine. Genes analysed by this project can be identified by prefixing the WormBase identifier with ‘Gene:’ (e.g. Gene:WBGene00022277).
In the next release of modMine these genes should be merged with the WormBase genes and so will have the same identifier. In the meantime, if you wish to upload a list of worm genes to analyse against this data, you will need to take the WormBase identifier and prefix each one with ‘Gene:’.
The following template will find the 3′UTRs associated with a particular gene or list of genes:
- Gene(s) –> 3′UTRs: http://intermine.modencode.org/release-15/template.do?name=Gene_ThreePrimeUTR&scope=all
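If you have a long list of WormBase gene identifiers, adding ‘Gene:’ to each one by hand is tedious. The following is a minimal sketch of doing it in Python before uploading the list. The function name to_modmine_gene_ids is hypothetical, and the step should only be needed while the genes remain unmerged.

```python
def to_modmine_gene_ids(wormbase_ids):
    """Add the 'Gene:' prefix to each WormBase identifier unless already present."""
    prefixed = []
    for raw in wormbase_ids:
        gene_id = raw.strip()
        if not gene_id:
            continue  # skip blank entries
        if not gene_id.startswith("Gene:"):
            gene_id = "Gene:" + gene_id
        prefixed.append(gene_id)
    return prefixed


# One identifier per line, ready to paste into the list upload form.
print("\n".join(to_modmine_gene_ids(["WBGene00022277"])))
```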
PolII Binding Sites
Project: Chromatin function
PI: Jason Lieb
Experiment: Chromatin ChIP-chip: http://intermine.modencode.org/query/experiment.do?experiment=Chromatin%20ChIP-chip
Experiment type: ChIP-chip
Factors: Strain, Antibody
Features generated: No features were generated.
GBrowse tracks and Downloads:
Only one submission currently maps polII binding sites: 599: POL-2_N2_L4_(CVMMS126R_8WG16_N2_L4): http://intermine.modencode.org/release-15/objectDetails.do?id=80000313
No features were generated by this submission, but the data can be viewed in GBrowse via the submission or experiment page, where you will also find links to the raw data.
There will be further submissions mapping polII binding sites in the next modMine release; the GBrowse tracks for these can already be viewed via the above links.
TF Binding sites
Project: Regulatory Elements in C. elegans
PI: Michael Snyder
Experiment: ChIP-Seq Identification of C. elegans TF Binding Sites: http://intermine.modencode.org/query/experiment.do?experiment=ChIP-Seq%20Identification%20of%20C.%20elegans%20TF%20Binding%20Sites
Experiment type: ChIP-seq
Factors: Strain, Developmental stage, Temperature, Target gene
Features generated: Binding site
GBrowse tracks and Downloads: The experiment page (link above) provides links to GBrowse tracks and to the raw data files for each submission for download.
Binding sites for the following transcription factors were mapped:
- alr-1
- ama-1
- ceh-14
- daf-16
- dpy-27
- egl-5
- eor-1
- gei-11
- hlh-1
- hlh-8
Further analysis:
As binding sites for these transcription factors are mapped as ‘bindingSites’ in modMine, it is useful to make use of a list of the submissions that includes just this set. Such a list is available on the lists ‘view’ page and is called ‘PL Snyder_TranscriptionFactors’. This list enables you to analyse the properties of just that set of submissions using the list analysis page and by running template and query builder queries. Alternatively, some of the templates below have been created to specifically run on the Snyder data from this experiment.
How do I find all the binding sites for all the transcription factors mapped?
Use the following template and run it with the above list, PL Snyder_TranscriptionFactors:
- Template: Submission(s) –> Binding sites (run with public lists of submissions): http://intermine.modencode.org/query/template.do?name=Submission_bindingSites&scope=all
How do I find all the binding sites for a particular transcription factor?
The easiest way to view and download the binding sites for each transcription factor is via the experiment page, http://intermine.modencode.org/query/experiment.do?experiment=Chromatin%20Binding%20Site%20Mapping . Here you will find links to download or view data for each individual submission. Links include options to view binding sites as a results table or in GBrowse, or to download them in tab, csv or gff3 format. Sequences can also be downloaded. There is also the option to ‘Create list’ of all the binding sites from each submission. If you are logged in, such a list will be permanently saved to your account. You can then analyse this list in the following ways:
- The list analysis page gives you additional information about your list
- The list can be used to run template queries and query builder queries
- From the lists ‘view’ page you can carry out list operations with other lists – union, intersect and subtract.
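The list operations can also be reproduced locally on exported identifiers, which may help when checking what a combined list should contain. The sketch below uses Python sets with made-up identifiers. This guide does not say whether modMine's ‘subtract’ is the one-directional difference or the symmetric difference, so both are shown.

```python
# Hypothetical identifiers standing in for two exported binding-site lists.
list_a = {"site_1", "site_2", "site_3"}
list_b = {"site_2", "site_3", "site_4"}

union = list_a | list_b        # items in either list
intersect = list_a & list_b    # items in both lists
a_minus_b = list_a - list_b    # items in list_a but not in list_b
in_one_only = list_a ^ list_b  # items in exactly one of the two lists

for label, result in [
    ("union", union),
    ("intersect", intersect),
    ("a_minus_b", a_minus_b),
    ("in_one_only", in_one_only),
]:
    print(label, sorted(result))
```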
How do I find all binding sites at a particular developmental stage?
The following template query can be run with the list of all transcription factor submissions described above (PL Snyder_TranscriptionFactors).
- Submission(s) + developmental stage –> Binding sites (run with public lists of submissions): http://intermine.modencode.org/query/template.do?name=Submission_bindingSites&scope=all
How do I find all binding sites for a particular transcription factor at a particular developmental stage?
There are three possible templates depending on your starting point. NOTE: ‘target gene’ does not mean the target gene of the transcription factor. It is the gene the submission is targeting, i.e. the transcription factor itself. (The Snyder data uses GFP tagging and anti-GFP antibodies, so the target gene refers to the gene tagged with GFP.)
If you know the target gene you are interested in use the following template.
- Target gene + Developmental stage –> Binding sites: http://intermine.modencode.org/query/template.do?name=TargetGeneDevStage_bindingSites&scope=all
Alternatively, if you have a specific submission or list of submissions, use this template:
- Submission(s) + Developmental stage –> Binding sites: http://intermine.modencode.org/query/template.do?name=SubmissionDevStage_BindingSites&scope=user
or if your list of submissions is already constrained by developmental stage:
- Submission(s) –> Binding sites (run with public lists of submissions): http://intermine.modencode.org/query/template.do?name=Submission_bindingSites&scope=all
How do I find all transcription factor binding sites in a particular chromosomal location?
The following template query can be run with the list of all transcription factor submissions described above (PL Snyder_TranscriptionFactors).
- Submission(s) + Chromosome location –> binding sites: http://intermine.modencode.org/query/template.do?name=SubmissionChromosomeLocation_BindingSites&scope=all
How do I find the binding sites for a particular transcription factor in a particular chromosomal location?
There are two possible template queries depending on your starting point:
To start your query from the transcription factor use the following template:
- Target gene + chromosome location –> Binding sites: http://intermine.modencode.org/query/template.do?name=TargetGeneChrmLoc_bindingSites&scope=all
Alternatively, if you have a particular submission or list of submissions, use the following template:
- Submission(s) + Chromosome location –> binding sites: http://intermine.modencode.org/query/template.do?name=SubmissionChromosomeLocation_BindingSites&scope=all
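A location filter of this kind can also be applied locally to a GFF3 file downloaded from a submission. The sketch below keeps the features that overlap a region, relying only on the standard nine-column GFF3 layout. The function name sites_in_region and the example records are made up for illustration, and the chromosome naming in a real file (for example ‘I’ versus ‘chrI’) should be checked first.

```python
import io


def sites_in_region(gff3_lines, seqid, start, end):
    """Yield (seqid, start, end) for GFF3 features overlapping [start, end]."""
    for line in gff3_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF3 feature line
        f_seqid, f_start, f_end = cols[0], int(cols[3]), int(cols[4])
        # GFF3 coordinates are 1-based and inclusive at both ends.
        if f_seqid == seqid and f_start <= end and f_end >= start:
            yield f_seqid, f_start, f_end


# Made-up records; a real file would come from a submission's download links.
example = io.StringIO(
    "##gff-version 3\n"
    "I\texample\tbinding_site\t100\t200\t.\t.\t.\tID=site_1\n"
    "I\texample\tbinding_site\t5000\t5100\t.\t.\t.\tID=site_2\n"
    "II\texample\tbinding_site\t150\t250\t.\t.\t.\tID=site_3\n"
)
hits = list(sites_in_region(example, "I", 150, 1000))
print(hits)
```

To run this against a downloaded file, pass an open file handle in place of the example buffer.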
How do I find all binding sites at a particular developmental stage in a particular chromosomal location?
There are three possible template queries depending on your starting point:
To start your query from the gene use the following template:
- Target gene + Chromosome location + Developmental stage –> Binding sites: http://intermine.modencode.org/query/template.do?name=TargetGeneChrmLocDevStage_bindingSites&scope=all
Alternatively, if you have a particular submission or list of submissions, use the following template:
- Submission(s) + Chromosome location + Developmental stage –> binding sites: http://intermine.modencode.org/query/template.do?name=SubmissionChromosomeLocationDevStage_BindingSites&scope=user
Or, for all binding sites regardless of transcription factor or submission:
- Chromosome location + Developmental stage –> Binding sites: http://intermine.modencode.org/query/template.do?name=ChromosomeLocationDevStage_BindingSites&scope=all
How do I find all binding sites upstream of a particular gene or list of genes?
To find all binding sites use the template query:
- Gene –> Binding sites in the upstream intergenic region: http://intermine.modencode.org/query/template.do?name=Gene_TFBindingSitesUpstreamIntergenic&scope=all
To find specifically transcription factor binding sites use the following template, and set the submission to be either a list of transcription factor submissions (PL Snyder_TranscriptionFactors) or a specific submission (to find binding sites for a specific transcription factor).
- Gene + Submission(s) –> Binding sites in the upstream intergenic region: http://intermine.modencode.org/query/template.do?name=GeneSubmission_BindingSitesUpstream&scope=user
Defining broad domains for chromatin marks
Project: Chromatin function
PI: Jason Lieb
Experiment: Chromatin ChIP-chip: http://intermine.modencode.org/query/experiment.do?experiment=Chromatin%20ChIP-chip
Experiment type: ChIP-chip
Factors: Strain, Antibody
Features generated: Most submissions did not generate features in modMine. Computed peaks (binding sites) are available for one submission only (195: MES-4_FLAG_Early_Embryos_(SGF3165_FLAG_MES4FLAG_EEMB)).
GBrowse tracks and Downloads: The experiment page (link above) provides links to GBrowse tracks and to the raw data files for each submission for download.
Project: Histone variants
PI: Steven Henikoff
Experiment: Genome-wide chromatin profiling: http://intermine.modencode.org/query/experiment.do?experiment=Chromatin%20ChIP-chip
Experiment type: RNA tiling array
Factors: Sodium chloride concentration, Biochemical fraction, Developmental stage, Extraction time
Features generated: These submissions did not generate features in modMine.
GBrowse tracks and Downloads: The experiment page (link above) provides links to GBrowse tracks and to the raw data files for each submission for download.
This experiment includes submissions for both C. elegans and D. melanogaster. The following submissions are C. elegans:
- 2532: 600_mM_Embryonic_Salt_Extracted_Chromatin: http://intermine.modencode.org/release-15/objectDetails.do?id=81000233
- 2533: 600_mM_Embryonic_Salt_Extracted_Chromatin_Pellet: http://intermine.modencode.org/release-15/objectDetails.do?id=81000242
- 2531: 80_mM_Embryonic_Salt_Extracted_Chromati: http://intermine.modencode.org/release-15/objectDetails.do?id=81000219
- 2538: Adult_Mononucleosomes: http://intermine.modencode.org/release-15/objectDetails.do?id=81000281
- 2537: Embryonic_Mononucleosomes: http://intermine.modencode.org/release-15/objectDetails.do?id=81000274
- 2535: H3.3_350_mM_Salt_Extracted_Chromatin: http://intermine.modencode.org/release-15/objectDetails.do?id=81000256
- 2536: H3.3_600_mM_Salt_Extracted_Chromatin: http://intermine.modencode.org/release-15/objectDetails.do?id=81000267
- 2534: H3.3_80_mM_Salt_Extracted_Chromatin: http://intermine.modencode.org/release-15/objectDetails.do?id=81000249