Spectrum Mill - Protein Sequence Database Utilities

(After a re-install or indexing a database, if you don't see the database listed, click the "Update Database List" button.)

Newly downloaded database:
Existing databases:
Existing database to re-index:
Suffix for subset database:
Existing database:
MW of protein: (from Da to Da)
Database name:
Species:
Accession number:
Reading frame: (for DNA only)
Index number range: to
(After updating the database list, you must refresh the page to see any new database choices.)
Database 1:
Database 2:

When keeping/removing redundant entries (identical sequence), only the one nearest the top of the FASTA file is kept.
However, for UniProt databases, SwissProt entries are kept in preference to TrEMBL entries, regardless of their order in the FASTA file.
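
The redundancy rule described above can be illustrated with a minimal Python sketch. This is not Spectrum Mill's own code: the function names (parse_fasta, is_swissprot, remove_redundant) are hypothetical, and the sketch assumes that UniProt FASTA headers start with "sp|" for SwissProt entries and "tr|" for TrEMBL entries.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def is_swissprot(header):
    # Assumption: UniProt headers begin with "sp|" (SwissProt) or "tr|" (TrEMBL).
    return header.startswith("sp|")


def remove_redundant(entries, prefer_swissprot=False):
    """Keep one entry per identical sequence.

    By default the entry nearest the top of the file wins. With
    prefer_swissprot=True, a SwissProt entry replaces a TrEMBL entry
    that was seen earlier with the same sequence.
    """
    kept = {}  # sequence -> header; dicts preserve first-insertion order
    for header, seq in entries:
        if seq not in kept:
            kept[seq] = header
        elif prefer_swissprot and is_swissprot(header) and not is_swissprot(kept[seq]):
            # Swap the header but keep the position of the first occurrence.
            kept[seq] = header
    return [(header, seq) for seq, header in kept.items()]
```

In this sketch a replaced entry keeps the file position of the first occurrence of its sequence; whether the utility itself orders its output that way is not stated on this page.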
Suffix for subset database:
Existing database:
Accession numbers:
Create a FASTA file with the generic PA header format used in Spectrum Mill
New category file in seqdb directory:

The tab-delimited category file must contain two columns named accession_number and sequence.
Optional columns named entry_name and species, if present, are also included in the FASTA header.
Any additional columns are ignored.
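
As an illustration of the column rules above, here is a minimal Python sketch that converts a tab-delimited category file into FASTA text. The function name category_to_fasta is hypothetical, and the header layout it writes (accession, then entry name, then species, separated by spaces) is only a placeholder assumption: the actual PA header format used by Spectrum Mill is not specified on this page.

```python
import csv
import io

REQUIRED = ("accession_number", "sequence")
OPTIONAL = ("entry_name", "species")


def category_to_fasta(tsv_text, width=60):
    """Convert tab-delimited category text into FASTA text.

    The header layout below is a placeholder, not the real PA format.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("missing required column(s): " + ", ".join(missing))

    lines = []
    for row in reader:
        # Required accession first, then any optional columns that have a value.
        parts = [row["accession_number"]]
        parts += [row[c] for c in OPTIONAL if row.get(c)]
        lines.append(">" + " ".join(parts))
        # Wrap the sequence at a fixed width; other columns are ignored.
        seq = row["sequence"]
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

Columns other than the four named ones are read by the parser but never written, which mirrors the note that additional columns are ignored.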

Filename for new database (relative to SeqDB): (The name must begin with a supported database name prefix.)
Reference proteome in use:
ensembl_Ref/Ensembl_Homo_sapiens.GRCh37.pep.all.fasta (aka hg19)
ensembl_Ref/Ensembl_Homo_sapiens.GRCh37.pep.all_clean3.fasta (aka hg19)
Suffix for preferential nr reference database:
(version of reference database made non-redundant preferentially based on the variants in these .fa files)
Enter paths to .fa files (using * as wildcard) under SeqDB folder to concatenate:
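
For orientation, wildcard paths of this kind can be expanded and concatenated as in the following Python sketch. It is an assumption about how such patterns behave, not the utility's implementation; the function name concatenate_fa and the example folder and file names are hypothetical.

```python
from pathlib import Path


def concatenate_fa(seqdb_root, patterns, out_name):
    """Concatenate files matching wildcard patterns under seqdb_root.

    patterns are relative to seqdb_root, e.g. "variants/*.fa".
    Returns the list of files that were concatenated, in order.
    """
    root = Path(seqdb_root)
    out_path = root / out_name

    matched = []
    for pattern in patterns:
        for path in sorted(root.glob(pattern)):
            # Skip directories, the output file itself, and duplicates.
            if path.is_file() and path != out_path and path not in matched:
                matched.append(path)

    with open(out_path, "w") as out:
        for path in matched:
            text = path.read_text()
            out.write(text)
            # Guard against a file with no final newline, which would
            # otherwise glue its last sequence line to the next header.
            if text and not text.endswith("\n"):
                out.write("\n")
    return matched
```

Calling concatenate_fa("SeqDB", ["variants/*.fa"], "combined.fasta") would, under these assumptions, write every .fa file in SeqDB/variants into SeqDB/combined.fasta in sorted filename order.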