Instructions for Query Protein

Query Protein is a web-based program that searches the web for information about protein sequences. It is distinctive because it is not limited to a single database; instead, it uses Google to capture protein information from across the Internet. Query Protein takes a protein sequence in combination with other search terms, finds similar sequences using NCBI's Protein-Protein BLAST, retrieves the descriptions of those matching proteins from NCBI's Entrez Protein database, and performs a series of Google searches, each combining your original search terms with one protein description. The percent sequence identity shown alongside each match indicates how much of your queried sequence is contained in the sequence it matches.
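The final step of that pipeline, pairing your search terms with each protein description, can be sketched roughly as follows. This is a minimal illustration and not Query Protein's actual code: the BlastHit container, the build_google_queries function, and the order in which terms and description are joined are all assumptions, and the sample hit uses placeholder values rather than real BLAST output.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """One matching protein (hypothetical container, not part of Query Protein)."""
    description: str         # description text, as retrieved from Entrez Protein
    percent_identity: float  # value displayed alongside the match


def build_google_queries(search_terms: str, hits: list[BlastHit]) -> list[str]:
    """Pair the user's search terms with each matching protein's description.

    Assumption: the terms come first, followed by the description.
    """
    return [f"{search_terms} {hit.description}" for hit in hits]


# Placeholder hit for illustration only; not real BLAST output.
hits = [BlastHit("hemoglobin beta [Homo sapiens]", 100.0)]
for query in build_google_queries('"genetic disease associated with"', hits):
    print(query)
```

One query is produced per matching protein, which is why a single submission results in a series of Google searches.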

The following example illustrates Query Protein's capabilities:

Protein Sequence: (numbers and spaces optional)
1 mvhltpeeks avtalwgkvn vdavggealg rllvvypwtq rffesfgdls tpdavmgnpk
61 vkahgkkvlg afsdglahld nlkgtfatls elhcdklhvd penfr

Search Terms:
"genetic disease associated with"

Notice that the search results identify the sequence as human hemoglobin beta, and the resulting Google searches retrieve "sickle cell" disease, a known genetic disease associated with human hemoglobin beta.
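Because numbers and spaces in the protein sequence are optional, the input is presumably cleaned before it is compared against other sequences. A minimal sketch of such a cleanup step is shown here; the normalize_sequence function is hypothetical and only assumes that digits and whitespace carry no sequence information.

```python
import re


def normalize_sequence(raw: str) -> str:
    """Remove position numbers and whitespace, leaving only residue letters."""
    return re.sub(r"[\d\s]+", "", raw)


# The example sequence from this page, with its position numbers and spaces.
raw = (
    "1 mvhltpeeks avtalwgkvn vdavggealg rllvvypwtq rffesfgdls tpdavmgnpk "
    "61 vkahgkkvlg afsdglahld nlkgtfatls elhcdklhvd penfr"
)
print(normalize_sequence(raw))
```

The same sequence pasted with or without its numbering yields the same cleaned string.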

Google searching is often more of an art than a science, so if your query does not yield the expected results, try rewording it or including extra adjectives. At present, only the first thirty-two words of the final query are searched. Click on the "Google Query" links in your search results to edit the original Google searches, or click on the "Google Scholar" links to search Google Scholar instead.
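To see how the thirty-two-word limit affects a long query, the truncation can be approximated as below. This is only a sketch: the truncate_query function is hypothetical, and it assumes that a "word" is any run of characters separated by whitespace, which may differ from how Google counts words.

```python
MAX_QUERY_WORDS = 32  # limit stated in the paragraph above


def truncate_query(query: str, max_words: int = MAX_QUERY_WORDS) -> str:
    """Keep only the first max_words whitespace-separated words of a query."""
    words = query.split()
    return " ".join(words[:max_words])


# A 40-word query is cut to its first 32 words; shorter queries are unchanged.
long_query = " ".join(f"word{i}" for i in range(40))
print(len(truncate_query(long_query).split()))
```

If a protein description is long, putting your most important search terms early in the query keeps them within the limit.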

Query Protein uses Google's Web API to perform Google searches and is limited to 1,000 Google searches per day under the terms of Google's licensing agreement. You are encouraged to obtain your own Google API key by registering at www.google.com/apis/ so that you can continue searching even after Query Protein exhausts its daily allotment (registration takes two minutes).

Thank you for using Query Protein! Also check out the Query Science homepage.

Please send questions, bug reports, and comments about this page to Justin Klekota (klekota@fas.harvard.edu).