Only DNA sequences of 25,000 or fewer bases and protein or translated
sequences of 5,000 or fewer letters will be processed. If multiple sequences
are submitted at the same time, the total limit is 50,000 bases or 12,500 letters.
Please paste in a query sequence to see where it is located in
the genome. Multiple sequences can be searched at once if separated by a line
starting with > and the sequence name.
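
As a rough illustration of the input format and size limits described here, the following Python sketch splits pasted text into named sequences and checks them against the quoted limits. The function and constant names are hypothetical and are not part of BLAT or this server; the 12,500 figure is taken to be the combined limit for protein letters, and the combined limit is assumed to apply only when more than one sequence is submitted.

```python
# Limits quoted on this page. All names below are illustrative only.
MAX_SINGLE = {"dna": 25_000, "protein": 5_000}
MAX_TOTAL = {"dna": 50_000, "protein": 12_500}


def parse_fasta(text):
    """Split pasted text into (name, sequence) pairs.

    A line starting with '>' begins a new record. Whitespace inside the
    sequence is ignored. Text with no header line is treated as a single
    sequence named "query".
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Close the previous record, if any, before starting a new one.
            if name is not None or chunks:
                records.append((name or "query", "".join(chunks)))
            name, chunks = line[1:].strip() or "query", []
        else:
            chunks.append("".join(line.split()))
    if name is not None or chunks:
        records.append((name or "query", "".join(chunks)))
    return records


def check_limits(records, kind="dna"):
    """Return a list of problems; an empty list means within the limits."""
    problems = []
    for name, seq in records:
        if len(seq) > MAX_SINGLE[kind]:
            problems.append(f"{name}: {len(seq)} exceeds {MAX_SINGLE[kind]}")
    total = sum(len(seq) for _, seq in records)
    # Assumption: the combined limit matters only for multi-sequence input.
    if len(records) > 1 and total > MAX_TOTAL[kind]:
        problems.append(f"total {total} exceeds {MAX_TOTAL[kind]}")
    return problems
```

For example, `check_limits(parse_fasta(pasted_text), kind="dna")` returns an empty list when the submission fits within the limits above.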
Rather than pasting a sequence, you can choose to upload
a text file containing the sequence.
BLAT on DNA is designed to
quickly find sequences of 95% or greater similarity that are 40 bases or
longer. It may miss more divergent or shorter sequence alignments. It will find
perfect sequence matches of 33 bases, and sometimes find them down to 22 bases.
BLAT on proteins finds sequences of 80% or greater similarity that are 20 amino
acids or longer.
BLAT is not BLAST. DNA BLAT works by keeping an index of the entire genome
in memory. The index consists of all non-overlapping 11-mers except for
those heavily involved in repeats. The index takes up a bit less than
a gigabyte of RAM. The genome itself is not kept in memory, allowing
BLAT to deliver high performance on a reasonably priced Linux box.
The index is used to find areas of probable homology, which are then
loaded into memory for a detailed alignment. Protein BLAT works in a similar
manner, except with 4-mers rather than 11-mers. The protein index takes a little
more than 2 gigabytes.
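
The indexing idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not BLAT's actual implementation: the function names, the `max_occurrences` repeat filter, and the rule of requiring at least two tile hits on the same diagonal are all assumptions made for the example, and the whole genome is held in a string purely for simplicity.

```python
from collections import defaultdict

K = 11  # DNA tile size stated above; the protein index uses 4-mers


def build_index(genome, k=K, max_occurrences=None):
    """Map each non-overlapping k-mer (tile) to its start positions.

    Tiles start at 0, k, 2k, ... Tiles that occur more than
    max_occurrences times are dropped, a crude stand-in for leaving out
    tiles heavily involved in repeats (assumed, not BLAT's actual rule).
    """
    index = defaultdict(list)
    for pos in range(0, len(genome) - k + 1, k):
        index[genome[pos:pos + k]].append(pos)
    if max_occurrences is not None:
        return {t: p for t, p in index.items() if len(p) <= max_occurrences}
    return dict(index)


def find_hits(index, query, k=K):
    """Look up every overlapping k-mer of the query in the tile index.

    Returns (query_offset, genome_position) pairs.
    """
    hits = []
    for q in range(len(query) - k + 1):
        for g in index.get(query[q:q + k], ()):
            hits.append((q, g))
    return hits


def candidate_diagonals(hits, min_hits=2):
    """Group hits by diagonal (genome_position - query_offset).

    Diagonals with at least min_hits hits are returned as candidate
    regions that a detailed aligner would then examine.
    """
    counts = defaultdict(int)
    for q, g in hits:
        counts[g - q] += 1
    return sorted(d for d, n in counts.items() if n >= min_hits)
```

Under the assumed two-hit rule, any exact match of 33 bases contains at least two complete tiles whatever its offset in the genome, whereas a 22-base match contains two complete tiles only when it happens to start on a tile boundary. That may be one way to read the 33 and 22 base figures quoted earlier.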
BLAT was written by Jim Kent.
Like most of Jim's software, interactive use on this web server is free to all.
Sources and executables to run batch jobs on your own server are available free
for academic, personal, and non-profit purposes. Non-exclusive commercial
licenses are also available. Contact Jim for details.