modMine Worm Guide

February 3rd, 2010

Introduction

The aim of this document is to help researchers access and use the data in modMine for the detailed analysis papers. It explains where to find specific sets of data, how to view and download them, and how to access and query specific subsets of the data. NOTE: this document is currently still a draft version. Please also note that templates and lists can be added to modMine at any time, so if there is a query or list that you would like included, please let us know and we will try to add it for you. Please send comments, or anything you would like added to this document, to rachel@flymine.org.

Section 1: Classes of elements:


Protein Coding Genes:


There are currently no C. elegans submissions that have generated Gene models.


Small regulatory RNAs / Novel non-coding RNAs


Project: The C. elegans Transcriptome

PI: Robert Waterston

Experiment: Identification of small RNAs in C. elegans: http://intermine.modencode.org/release-15/experiment.do?experiment=Identification%20of%20small%20RNAs%20in%20C.%20elegans

Experiment type: RNA-seq

Factors: Strain, Developmental stage

Features generated: This experiment has no features in modMine.

GBrowse tracks and Downloads: The experiment page (link above) provides links to GBrowse tracks and to the raw data files for each submission for download.
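The release-15 experiment links in this guide all follow the same pattern: the experiment name is percent-encoded and passed as the experiment parameter. As a minimal sketch, assuming that pattern also holds for other experiments (the helper name experiment_url is hypothetical and not part of modMine), such a link can be rebuilt in Python:

```python
from urllib.parse import quote

# Base address copied from the release-15 links in this guide.
BASE_URL = "http://intermine.modencode.org/release-15/experiment.do"


def experiment_url(experiment_name: str) -> str:
    """Build an experiment page link from an experiment name (hypothetical helper)."""
    # quote() percent-encodes spaces as %20 and apostrophes as %27.
    return f"{BASE_URL}?experiment={quote(experiment_name, safe='')}"


print(experiment_url("Identification of small RNAs in C. elegans"))
```

For the experiment name above, this reproduces the link given for the experiment.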

Project: The 3′UTRome

PI: Fabio Piano

Experiment: Small RNA expression in C. elegans embryos

Experiment type: RNA-seq

Factors: Strain, Developmental stage, Temperature, Fixative, Fixation state

Features generated: This experiment has no features in modMine.

GBrowse tracks and Downloads: There are currently no data or GBrowse tracks available for this project.

3′UTR

Project: The 3′UTRome

PI: Fabio Piano

Experiment: Encyclopedia of C. elegans 3′UTRs and their regulatory elements: http://intermine.modencode.org/release-15/experiment.do?experiment=Encyclopedia%20of%20C.%20elegans%203%27UTRs%20and%20their%20regulatory%20elements

Experiment type:

Factors: Strain, Developmental stage, Temperature

Features generated: Gene, mRNA, 3′Race clone, polyA site, CDS, EST, 3′RST, 3′UST, 3′UTR

GBrowse tracks and Downloads: The experiment page (link above) provides links to GBrowse tracks and to the raw data files for each submission for download.

Further Analysis:

How do I find/download all the 3′UTRs identified by the project together with their associated genes?

All the 3′UTRs identified by this experiment can be viewed and downloaded from the experiment page (link above). However, this does not give the genes associated with the 3′UTRs. To view or download the 3′UTRs together with their associated genes, the following template can be run (the default value for this template is set to this experiment):

How do I find all the 3′UTRs on a particular chromosome and their associated genes?

Use the following template.  Note that this template only finds the 3′UTRs identified by modENCODE submissions, not those identified by the genome projects.

How do I find the 3′UTR(s) identified for a particular gene or list of genes?

Currently the genes associated with the 3′UTRs submitted by this project are not merged with the WormBase genes in modMine. Genes analysed by this project can be identified by prefixing the WormBase identifier with ‘Gene:’ (e.g. Gene:WBGene00022277).

In the next release of modMine these genes should be merged with the WormBase genes and so will have the same identifier. In the meantime, if you wish to upload a list of worm genes to analyse against this data, you will need to use the WormBase identifier and prefix each one with ‘Gene:’.
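Adding the prefix by hand is tedious for long lists. The following is a minimal Python sketch of that step; the function name to_modmine_gene_ids is hypothetical, and the only identifier used is the example given above:

```python
PREFIX = "Gene:"


def to_modmine_gene_ids(wormbase_ids):
    """Prefix each WormBase gene identifier with 'Gene:' unless it already has it."""
    result = []
    for raw in wormbase_ids:
        identifier = raw.strip()
        if not identifier:
            continue  # skip blank entries
        if not identifier.startswith(PREFIX):
            identifier = PREFIX + identifier
        result.append(identifier)
    return result


# One identifier per line, ready to paste into a list upload form.
print("\n".join(to_modmine_gene_ids(["WBGene00022277"])))
```

Identifiers that already carry the prefix are left unchanged, so the same list can safely be passed through more than once.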

The following template will find the 3′UTRs associated with a particular gene or list of genes:

PolII Binding Sites


Project: Chromatin function

PI: Jason Lieb

Experiment: Chromatin ChIP-chip: http://intermine.modencode.org/query/experiment.do?experiment=Chromatin%20ChIP-chip

Experiment type: ChIP-chip

Factors: Strain, Antibody

Features generated: No features were generated.

GBrowse tracks and Downloads:

Only one submission currently maps polII binding sites: 599: POL-2_N2_L4_(CVMMS126R_8WG16_N2_L4): http://intermine.modencode.org/release-15/objectDetails.do?id=80000313

No features were generated by this submission, but the data can be viewed in GBrowse via the submission or experiment page, where you will also find links to access the raw data.

There will be further submissions mapping polII binding sites in the next modMine release – the GBrowse tracks for these can already be viewed via the above links.

TF Binding sites


Project: Regulatory Elements in C. elegans

PI: Michael Snyder

Experiment: ChIP-Seq Identification of C. elegans TF Binding Sites: http://intermine.modencode.org/query/experiment.do?experiment=ChIP-Seq%20Identification%20of%20C.%20elegans%20TF%20Binding%20Sites

Experiment type: ChIP-seq

Factors: Strain, Developmental stage, Temperature, Target gene

Features generated: Binding site

GBrowse tracks and Downloads: The experiment page (link above) provides links to GBrowse tracks and to the raw data files for each submission for download.

Binding sites for the following transcription factors were mapped:

  • alr-1
  • ama-1
  • ceh-14
  • daf-16
  • dpy-27
  • egl-5
  • eor-1
  • gei-11
  • hlh-1
  • hlh-8

Further analysis:

Because binding sites for these transcription factors are mapped as ‘bindingSites’ in modMine, it is useful to work with a list of submissions that includes just this set. Such a list is available on the lists ‘view’ page and is called ‘PL Snyder_TranscriptionFactors’. This list enables you to analyse the properties of just that set of submissions using the list analysis page and by running template and query builder queries. Alternatively, some of the templates below have been created specifically to run on the Snyder data from this experiment.

How do I find all the binding sites for all the transcription factors mapped?

Use the following template and run it with the above list,  PL Snyder_TranscriptionFactors:

How do I find all the binding sites for a particular transcription factor?

The easiest way to view and download the binding sites for each transcription factor is via the experiment page (http://intermine.modencode.org/query/experiment.do?experiment=Chromatin%20Binding%20Site%20Mapping). Here you will find links to download or view data for each individual submission. Links include options to view binding sites as a results table or in GBrowse, or to download them in tab, csv or gff3 format. Sequences can also be downloaded. There is also the option to ‘Create list’ of all the binding sites from each submission. If you are logged in, such a list will be permanently saved to your account. You can now analyse this list in the following ways:

  • The list analysis page gives you additional information about your list
  • The list can be used to run template queries and query builder queries
  • From the lists ‘view’ page you can carry out list operations with other lists – union, intersect and subtract.
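The list operations can be pictured with ordinary sets. The sketch below uses hypothetical binding site identifiers in place of two saved lists. Note that this guide does not state whether modMine's subtract is one-sided or symmetric, so both readings are shown as assumptions:

```python
# Hypothetical binding site identifiers standing in for two saved lists.
list_a = {"site_1", "site_2", "site_3"}
list_b = {"site_2", "site_3", "site_4"}

union = list_a | list_b       # items in either list
intersect = list_a & list_b   # items in both lists

# Two possible readings of "subtract" (assumption: check which one modMine uses).
only_in_a = list_a - list_b       # one-sided: in A but not in B
in_one_only = list_a ^ list_b     # symmetric: in exactly one of the two lists

print(sorted(union))
print(sorted(intersect))
print(sorted(only_in_a))
print(sorted(in_one_only))
```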

How do I find all binding sites at a particular developmental stage?

The following template query can be run with the list of all transcription factor submissions described above (PL Snyder_TranscriptionFactors).

How do I find all binding sites for a particular transcription factor at a particular developmental stage?

There are three possible templates depending on your starting point. NOTE: TARGET GENE DOES NOT MEAN THE TARGET GENE OF THE TRANSCRIPTION FACTOR. IT IS THE GENE THE SUBMISSION IS TARGETING, I.E. THE ACTUAL TRANSCRIPTION FACTOR. (The Snyder data uses GFP tagging and anti-GFP antibodies, so the target gene refers to the gene being tagged by GFP.)

If you know the target gene you are interested in, use the following template:

Alternatively, if you have a specific submission or list of submissions, use this template:

or if your list of submissions is already constrained by developmental stage:

How do I find all transcription factor binding sites in a particular chromosomal location?

The following template query can be run with the list of all transcription factor submissions described above (PL Snyder_TranscriptionFactors).

How do I find the binding sites for a particular transcription factor in a particular chromosomal location?

There are two possible template queries depending on your starting point:

To start your query from the transcription factor use the following template:

Alternatively, if you have a particular submission or list of submissions, use the following template:

How do I find all binding sites at a particular developmental stage in a particular chromosomal location?

There are two possible template queries depending on your starting point:

To start your query from the  gene use the following template:

Alternatively, if you have a particular submission or list of submissions, use the following template:

Or, for all binding sites regardless of transcription factor or submission:

How do I find all binding sites upstream of a particular gene or list of genes?

To find all binding sites use the template query:

To find transcription factor binding sites specifically, use the following template, and set the submission to be either a list of transcription factor submissions (PL Snyder_TranscriptionFactors) or a specific submission (to find binding sites for a specific transcription factor).
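Binding sites downloaded in gff3 format from the experiment page can also be filtered by chromosomal location offline. The following Python sketch assumes the standard nine-column, tab-separated GFF3 layout; the sample records, the ‘example’ source name and the chromosome naming (‘I’, ‘II’) are hypothetical and may differ from what modMine exports:

```python
from typing import Iterable, Iterator, List, NamedTuple


class Feature(NamedTuple):
    seqid: str
    feature_type: str
    start: int
    end: int
    attributes: str


def parse_gff3(lines: Iterable[str]) -> Iterator[Feature]:
    """Yield one Feature per GFF3 record, skipping comments and malformed lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        yield Feature(cols[0], cols[2], int(cols[3]), int(cols[4]), cols[8])


def in_region(features: Iterable[Feature], seqid: str,
              region_start: int, region_end: int) -> List[Feature]:
    """Keep features on seqid that overlap the closed interval [region_start, region_end]."""
    return [f for f in features
            if f.seqid == seqid and f.start <= region_end and f.end >= region_start]


# Hypothetical records for illustration only.
SAMPLE = (
    "##gff-version 3\n"
    "I\texample\tbinding_site\t100\t200\t.\t.\t.\tID=site_1\n"
    "I\texample\tbinding_site\t5000\t5100\t.\t.\t.\tID=site_2\n"
    "II\texample\tbinding_site\t150\t250\t.\t.\t.\tID=site_3\n"
)

hits = in_region(parse_gff3(SAMPLE.splitlines()), "I", 150, 1000)
print([f.attributes for f in hits])
```

To use this on a real download, replace SAMPLE.splitlines() with an open file handle. The overlap test keeps any site that touches the region, not only sites that lie entirely within it.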


Defining broad domains for chromatin marks


Project: Chromatin function

PI: Jason Lieb

Experiment: Chromatin ChIP-chip: http://intermine.modencode.org/query/experiment.do?experiment=Chromatin%20ChIP-chip

Experiment type: ChIP-chip

Factors: Strain, Antibody

Features generated: Most submissions did not generate features in modMine. Computed peaks (binding sites) are available for one submission only (195: MES-4_FLAG_Early_Embryos_(SGF3165_FLAG_MES4FLAG_EEMB)).

GBrowse tracks and Downloads: The experiment page (link above) provides links to GBrowse tracks and to the raw data files for each submission for download.

Project: Histone variants

PI: Steven Henikoff

Experiment: Genome-wide chromatin profiling: http://intermine.modencode.org/query/experiment.do?experiment=Chromatin%20ChIP-chip

Experiment type: RNA tiling array

Factors: Sodium chloride concentration, Biochemical fraction, Developmental stage, Extraction time

Features generated: These submissions did not generate features in modMine.  

GBrowse tracks and Downloads: The experiment page (link above) provides links to GBrowse tracks and to the raw data files for each submission for download.

This experiment includes submissions for both C. elegans and D. melanogaster.  The following submissions are C. elegans:
