PipMaker Instructions

Overview

PipMaker computes alignments of similar regions in two DNA sequences. The resulting alignments are summarized with a "percent identity plot", or "pip" for short.

To generate a pip, such as the one shown above, PipMaker requires four user-supplied files. The first sequence file is depicted along the horizontal axis. Interspersed repeats in the first sequence are indicated by various kinds of triangles, whose locations are supplied by a mask file of the first sequence. (The user generates this file using the RepeatMasker program, available on the web at the Institute for Systems Biology.) A file of gene and exon positions allows PipMaker to draw the locations of exons and indicate the directionality of genes, shown as black boxes and long arrows, respectively. Finally, the user provides a second sequence file. CpG islands in the first sequence are independently determined by PipMaker and are shown as low boxes.

PipMaker compares the first and second sequences. Alignments are plotted according to the position in the first sequence file. The light horizontal line through the middle of the plot indicates 75% nucleotide identity. This version of PipMaker compares the first sequence with both the second sequence and its reverse complement, so matching regions need not occur in the same orientations and relative positions in the two sequences. (Advanced PipMaker can optionally enforce the condition that matching regions appear in the same relative order and/or in the same orientation.) To test PipMaker, copy the four provided files to your computer and submit them to PipMaker to generate the above sample pip.
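For readers unfamiliar with the term, the reverse complement of a DNA sequence is the sequence read from the opposite strand: the bases are complemented (A with T, C with G) and the order is reversed. The following is a minimal sketch in Python, not PipMaker's own code; the function name `reverse_complement` is chosen here purely for illustration.

```python
# Illustrative sketch only; this is not how PipMaker itself is implemented.
# Translation table mapping each base to its complement, in both cases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string.

    Characters other than A, C, G, T (for example N) are left unchanged.
    """
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("GATTACA"))  # prints TGTAATC
```

A region of the second sequence that matches the first only after this transformation lies on the opposite strand, which is why such matches can still appear in the pip.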

Input to PipMaker

PipMaker processes the contents of the following four files. For each of those, you can either paste the data into the multi-line textarea or, if your browser supports it, give the filename in the subsequent single-line field.

PipMaker Output

The following three files are returned as attachments to an email message. You will need a MIME-aware email program to read them.

Interpreting a Pip

The pip consists of rows that show sequence conservation and features along segments of the first sequence. Each short horizontal line inside the large box corresponds to a section of an alignment bounded by successive gaps (or an end of the alignment). For instance, suppose that one of the alignments computed for your two sequences begins as follows.

      0     .    :    .    :    .    :    .    :    .    :
  19163 GCGGCTCCATGTCACCTGCGGGCAAGGGGCTGGTGTGGAAAGCCCCACGG
        ||:||| || ::|||||||:||:: ||:-||| :::||||| |||||||
   6465 GCAGCTACAGACCACCTGCAGGTGTGGA CTGTCACGGAAACCCCCACGT

     50     .    :    .    :    .    :    .    :    .    :
  19213 CATGGTGGAAAGTCCGAAATTCTACAGGGGCCTCTTTGTTAAACCTC
         -||:||||||||||:||||||||||||:|:||:|:||||:| |::|---
   6514 G TGATGGAAAGTCCAAAATTCTACAGGAGTCTTTCTGTTGATCTCCAGT

        ...

In this portion of the alignment, there are three gap-free pieces. The first covers positions 19163-19190 of the first sequence at 64% nucleotide identity, the second spans 19192-19213 at 68% identity, and the third covers 19215-19259 at 78% identity. This would be depicted in the pip by three horizontal line segments that indicate the positions in the first sequence and the percent identity.
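The three segments and their percentages can be reproduced from the alignment rows shown above. The sketch below is a hedged illustration in Python, not PipMaker's implementation. It assumes the two rows are given as equal-length strings with gaps written as "-", and the function name `gap_free_segments` is hypothetical. The example strings are the two displayed rows joined together, truncated at the last base of the first sequence that is shown.

```python
# Illustrative sketch only; names and input conventions are assumptions.
def gap_free_segments(top, bottom, top_start, gap="-"):
    """Split an alignment into gap-free pieces.

    top, bottom: aligned rows of equal length, gaps written as `gap`.
    top_start:   position in the first sequence of the first base in `top`.
    Returns a list of (start, end, percent_identity) tuples, with start and
    end given as positions in the first sequence.
    """
    if len(top) != len(bottom):
        raise ValueError("alignment rows must have equal length")
    segments = []
    pos = top_start          # position of the next base of the first sequence
    seg_start = None
    matches = length = 0
    for a, b in zip(top.upper(), bottom.upper()):
        if a == gap or b == gap:
            # A gap in either row closes the current gap-free piece.
            if length:
                segments.append((seg_start, pos - 1, 100.0 * matches / length))
            seg_start, matches, length = None, 0, 0
        else:
            if seg_start is None:
                seg_start = pos
            length += 1
            matches += (a == b)
        if a != gap:
            pos += 1         # only bases of the first sequence advance pos
    if length:
        segments.append((seg_start, pos - 1, 100.0 * matches / length))
    return segments


if __name__ == "__main__":
    top = ("GCGGCTCCATGTCACCTGCGGGCAAGGGGCTGGTGTGGAAAGCCCCACGG"
           "CATGGTGGAAAGTCCGAAATTCTACAGGGGCCTCTTTGTTAAACCTC")
    bottom = ("GCAGCTACAGACCACCTGCAGGTGTGGA-CTGTCACGGAAACCCCCACGT"
              "G-TGATGGAAAGTCCAAAATTCTACAGGAGTCTTTCTGTTGATCTCC")
    for start, end, pct in gap_free_segments(top, bottom, 19163):
        print(f"{start}-{end}: {pct:.0f}% identity")
```

Each tuple corresponds to one short horizontal line in the pip: the start and end give its horizontal extent along the first sequence, and the percent identity gives its height.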

Icons along the top of the box have the following meanings.

Error Messages

You should receive precisely one email message from PipMaker -- either your results or an error message. Error messages begin with lines indicating how far the computation proceeded, and should end with a line explaining the cause of the failure. Normally, the email message should be sent within a few minutes of your submission. However, it is possible that your email program might refuse to accept the message from PipMaker in cases where a huge amount of output is produced.