Download Assemblies:
Maize Assembled Genomic Island (MAGI) Downloads
Publication:
Kalyanaraman A, Emrich SJ, Schnable PS, Aluru S (2006) Assembling genomes on large-scale parallel computers. Proceedings of
the IEEE International Parallel and Distributed Processing Symposium, April 25-29, 2006. [Full
Text PDF]
Publication:
Fu Y, Emrich SJ, Guo L, Wen T-J, Aluru S, Ashlock DA, Schnable PS (2005) Quality assessment of maize
assembled genomic islands (MAGIs) and experimental verification of predicted genes. Proceedings
of the National Academy of Sciences, 102(34): 12282-12287. [Full
Text PDF]
Emrich SJ, Aluru S, Fu Y, Narayanan M, Guo L, Ashlock DA, Schnable PS (2004) A strategy
for assembling the maize (Zea mays L.) genome. Bioinformatics, 20(2): 140-147. [Full
Text PDF]
Show Deprecated Assemblies
NOTE: MAGIs 3.1 and 4.0 are both available for BLAST and download. Please remember
that contig names are NOT conserved between assemblies 2.31, 3.1, and 4.0.
MAGIv4.0 Contigs Genes Predicted via FGENESH v2.6
To facilitate analyses of the maize gene space, gene predictions were performed on the 163,390 MAGIv4.0 contigs by
FGENESH v2.6 (Softberry, Inc.) using the monocots matrix and -GC -pmrna -pexons
-scip_prom -scip_term parameters. This resulted in the prediction of structures for 61,428 MAGIv4.0 genes. These predictions were parsed
to produce premature mRNAs, mRNAs, and ORFs. "MAGI Premature mRNAs" consist of genomic fragments that include predicted UTRs, exons, and introns.
"MAGI Premature mRNAs + 300 bp" additionally include 300 bases upstream and downstream of the predicted transcription start and end sites.
"MAGI mRNAs" include only predicted UTRs and exons. "MAGI mRNAs + 300 bp" additionally include 300 bases upstream and downstream of the
predicted UTR or first/last exon regions. "MAGI ORFs" consist of only exonic coding regions (i.e., mRNAs minus UTRs).
Schematics of the extracted structures are available here. Note that MAGIs can contain
truncated genes. Extracted sequences were masked using the MAGI version of RepeatMasker
(Emrich et al., 2004), in combination with our Statistically Defined Repeat (SDR) and Cereal Repeat
databases.
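The relationship between the extracted sequence types can be illustrated with a minimal Python sketch. It is not the pipeline used to build the downloads: the `GeneModel` container, the function names, the 0-based end-exclusive coordinates, and the forward-strand-only handling are all assumptions made for illustration. Flanks are clipped at the contig ends, which is one simple way to cope with genes that sit near a contig boundary.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GeneModel:
    """Hypothetical forward-strand gene prediction (0-based, end-exclusive)."""
    exons: List[Tuple[int, int]]  # sorted exon intervals, UTR portions included
    cds_start: int                # first coding base
    cds_end: int                  # one past the last coding base


def premature_mrna(contig: str, gene: GeneModel, flank: int = 0) -> str:
    """UTRs, exons and introns; flank=300 gives the '+ 300 bp' variant."""
    start = max(0, gene.exons[0][0] - flank)
    end = min(len(contig), gene.exons[-1][1] + flank)
    return contig[start:end]


def mrna(contig: str, gene: GeneModel, flank: int = 0) -> str:
    """UTRs and exons only (introns removed), with optional genomic flanks."""
    first_start = gene.exons[0][0]
    last_end = gene.exons[-1][1]
    upstream = contig[max(0, first_start - flank):first_start]
    downstream = contig[last_end:min(len(contig), last_end + flank)]
    spliced = "".join(contig[s:e] for s, e in gene.exons)
    return upstream + spliced + downstream


def orf(contig: str, gene: GeneModel) -> str:
    """Exonic coding sequence only, i.e. the mRNA minus its UTRs."""
    parts = []
    for s, e in gene.exons:
        cs, ce = max(s, gene.cds_start), min(e, gene.cds_end)
        if cs < ce:  # skip exons that are entirely UTR
            parts.append(contig[cs:ce])
    return "".join(parts)


# Toy contig: 5 bp flank, exon 1 (2 bp UTR + ATG), 4 bp intron,
# exon 2 (TGA + 2 bp UTR), 5 bp flank.
contig = "ttttt" + "GGATG" + "cccc" + "TGAGG" + "aaaaa"
gene = GeneModel(exons=[(5, 10), (14, 19)], cds_start=7, cds_end=17)
print(premature_mrna(contig, gene))  # GGATGccccTGAGG
print(mrna(contig, gene))            # GGATGTGAGG
print(orf(contig, gene))             # ATGTGA
```

Reverse-strand predictions would need the same slicing followed by reverse complementation, which is omitted here to keep the sketch short.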
MAGIv3.1 Contigs Genes Predicted via FGENESH v2.6
To facilitate analyses of the maize gene space, gene predictions were performed on the 114,173 MAGIv3.1 contigs by
FGENESH v2.6 (Softberry, Inc.) using the monocots matrix and -GC -pmrna -pexons
-scip_prom -scip_term parameters. This resulted in the prediction of structures for 43,707 MAGIv3.1 genes. These predictions were parsed
to produce premature mRNAs, mRNAs, and ORFs. "MAGI Premature mRNAs" consist of genomic fragments that include predicted UTRs, exons, and introns.
"MAGI Premature mRNAs + 300 bp" additionally include 300 bases upstream and downstream of the predicted transcription start and end sites.
"MAGI mRNAs" include only predicted UTRs and exons. "MAGI mRNAs + 300 bp" additionally include 300 bases upstream and downstream of the predicted
UTR or first/last exon regions. "MAGI ORFs" consist of only exonic coding regions (i.e., mRNAs minus UTRs).
Schematics of the extracted structures are available here. Note that MAGIs can contain
truncated genes. Extracted sequences were masked using the MAGI version of RepeatMasker
(Emrich et al., 2004), in combination with our Statistically Defined Repeat (SDR) and Cereal Repeat
databases.
Sorghum Assembled GenoMic Island (SAMI) Downloads
NOTE: SAMIs 1.0 and 2.0 are both available for BLAST and download. Please remember
that contig names are NOT conserved between assemblies 1.0 and 2.0.
Maize Expressed Genes (MEG) Downloads
Maize EST Contig (MEC) Downloads
418,638 Zea mays ESTs were downloaded from the dbEST division of GenBank in late January 2005, of which 11,521 sequences
were removed due to contamination. In addition, ~31,000 Shoot Apical Meristem (SAM) ESTs from the University of Georgia
and Iowa State University were processed, for a total of 438,223 ESTs. These ESTs were then clustered using PaCE with
an initial 30 bp exact match as the criterion. Overlaps with >= 87% identity over 80 bp were used to merge clusters. Two distinct
EST assemblies, named MEC_P98-May05 and MEC_P95-May05, were subsequently generated using the same PaCE clustering result.
MEC_P98-May05 and MEC_P95-May05 were built using CAP3 with the following parameters: >=98% (_P98) and >=95% (_P95) identities,
<=5% overhang length, >=60 bp clipping range, and an overlap length >= 50 bp. GeneSeqer EST alignment display cutoff: at least
one exon with similarity >= 95% and overall cDNA coverage >= 80%.
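The GeneSeqer display cutoff above amounts to a two-part filter on each EST alignment. The following Python sketch shows that logic only; the `EstAlignment` container and its field names are hypothetical and do not correspond to GeneSeqer's actual output format, which would need to be parsed separately.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EstAlignment:
    """Hypothetical summary of one EST-to-genome alignment."""
    exon_similarities: List[float]  # per-exon similarity scores, 0.0 to 1.0
    cdna_coverage: float            # fraction of the EST that aligned, 0.0 to 1.0


def passes_display_cutoff(
    aln: EstAlignment,
    min_exon_similarity: float = 0.95,
    min_coverage: float = 0.80,
) -> bool:
    """True if at least one exon meets the similarity threshold
    and the overall cDNA coverage meets the coverage threshold."""
    has_good_exon = any(s >= min_exon_similarity for s in aln.exon_similarities)
    return has_good_exon and aln.cdna_coverage >= min_coverage


good = EstAlignment(exon_similarities=[0.90, 0.97], cdna_coverage=0.85)
low_coverage = EstAlignment(exon_similarities=[0.99], cdna_coverage=0.70)
no_good_exon = EstAlignment(exon_similarities=[0.90, 0.94], cdna_coverage=0.95)
print(passes_display_cutoff(good))          # True
print(passes_display_cutoff(low_coverage))  # False
print(passes_display_cutoff(no_good_exon))  # False
```

Note that the exon criterion is "at least one", so an alignment with several weaker exons still passes as long as a single exon reaches 95%.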
NOTE: MEC Contigs (_P98 and _P95) clustered in May 2005 and March 2006 are all available for BLAST
and download. Please remember that contig names are NOT conserved between each set of contigs.
454 Transcriptome Downloads
Publication:
Emrich SJ, WB Barbazuk, L Li, PS Schnable (2007) Gene discovery and annotation using LCM-454 transcriptome sequencing.
Genome Research, 17(1): 69-73. (Epub: 2006 Nov 9). [Full
Text PDF]
Publication:
Barbazuk WB, SJ Emrich, HD Chen, PS Schnable (2007) SNP discovery via 454 transcriptome sequencing.
Plant Journal, 51(5): 910-918. [Full
Text PDF]
Other Maize EST Contig Downloads
Approximately 32,000 3' B73 EST sequences generated by the Schnable Lab (Qiu et al., 2003) were downloaded from GenBank.
Only those 30,356 ESTs with polyT prefixes of >7 bp (indicative of the presence of a polyA tail on the corresponding cDNA) were
assembled using CAP3 (Huang and Madan, 1999). CAP3 parameters: overlap identity >=98%, overlap length >=60 bp, clipping range <=20 bp,
and overhang <= 5%. PolyT prefixes were masked prior to clustering. This CAP3 analysis yielded 3,252 contigs and 16,202 singletons.
GeneSeqer (Usuka et al., 2000) was used for MAGI/EST alignments with the option for specifying a particular orientation for genomic
sequences (-f). GeneSeqer EST alignment display cutoff: at least one exon with similarity >= 95% and overall cDNA coverage >= 80%.
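The polyT selection and masking step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the original procedure: it treats a polyT prefix as an uninterrupted run of T at the very start of the read, and it masks with `N`, whereas the source does not say how imperfect runs were handled or which masking character was used. The function names are hypothetical.

```python
import re
from typing import Dict

# Assumption: a polyT prefix is an unbroken run of T/t at the start of the read.
_POLYT_PREFIX = re.compile(r"^T+", re.IGNORECASE)


def polyt_prefix_length(seq: str) -> int:
    """Length of the leading run of T bases (0 if the read does not start with T)."""
    match = _POLYT_PREFIX.match(seq)
    return match.end() if match else 0


def select_and_mask(
    ests: Dict[str, str], min_prefix: int = 8, mask_char: str = "N"
) -> Dict[str, str]:
    """Keep ESTs whose polyT prefix is longer than 7 bp and mask that prefix."""
    kept = {}
    for name, seq in ests.items():
        n = polyt_prefix_length(seq)
        if n >= min_prefix:  # ">7 bp" in the text, i.e. at least 8 bases
            kept[name] = mask_char * n + seq[n:]
    return kept


ests = {
    "est_a": "TTTTTTTTTACGT",  # 9 T's: kept and masked
    "est_b": "TTTTTTTACGT",    # 7 T's: below the cutoff
    "est_c": "ACGTTTTTTTTT",   # T run is not a prefix
}
print(select_and_mask(ests))  # {'est_a': 'NNNNNNNNNACGT'}
```

Masking keeps read lengths and coordinates unchanged, so the homopolymer cannot seed spurious overlaps during assembly while the rest of the read stays comparable to the unmasked original.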