BioScience Writers LLC

Guidelines for Formatting Gene and Protein Names

Release Date: January 30, 2014
Category: Scientific Writing
Author: Katherine A., Ph.D.

Within articles describing genetic studies, it is often difficult for readers to determine whether the authors are referring to a gene or its corresponding protein. This can be problematic when readers are trying to understand the details of complex molecular systems and the methodology used by the authors to probe those systems. To reduce this potential source of confusion for both peer reviewers and the larger audience of your published article, it is important to use accepted formatting conventions for gene and protein symbols in a consistent manner throughout your manuscript.

General formatting and writing guidelines

When possible, use standard gene names and symbols to reduce the proliferation of duplicative gene names. Standard names and symbols can be found in community databases that are specific to particular organisms (e.g., the databases for human, rat, mouse, zebrafish, fly, and worm genes). The use of standard gene names and symbols is often specifically required by scientific and medical journals. If a gene does not yet have an approved name or symbol, it may be possible to propose new name or symbol designations to the relevant database or its professional association.

In general, symbols for genes are italicized (e.g., IGF1 set in italic type), whereas symbols for proteins are not italicized (e.g., IGF1 set in roman type). The formatting of symbols for RNA and complementary DNA (cDNA) usually follows the same conventions as those for gene symbols. If many genes are listed together in a table, it is usually left to the authors' (or the journal's) discretion whether the symbols are italicized. Gene names that are written out in full are not italicized (e.g., insulin-like growth factor 1). Genotype designations should be italicized, whereas phenotype designations should not be italicized. Several formatting conventions also depend on the type of organism, and these are discussed in greater detail below.

Although expert readers may be familiar with gene and protein symbols, non-expert readers may not be certain about the particular genes or proteins that are being represented. Therefore, it is good practice to provide the full gene or protein name followed by its symbol in parentheses upon first usage (e.g., huntingtin gene (HTT)), particularly if your article is to be published in a journal with broad readership.

In addition to the formatting of gene and protein symbols, you can emphasize the difference between genes and proteins through careful word choice. For instance, it can be helpful to explicitly state whether you are referring to a gene or a protein, particularly within sentences in which both a gene and its product are mentioned (e.g., “We quantified APOE gene expression and APOE protein levels . . .”). You could also selectively use the term “expression” when referring to genes and the term “levels” when referring to RNA or proteins.

Organism-specific formatting guidelines

Although the general rule that gene symbols are italicized and protein symbols are not italicized holds true regardless of the type of organism, there are several variations among organisms in the composition and capitalization of alphanumeric characters within the gene and protein symbols.

Humans, non-human primates, chickens, and domestic species: Gene symbols contain three to six italicized characters that are all in upper-case (e.g., AFP). Gene symbols may be a combination of letters and Arabic numerals (e.g., 1, 2, 3), but should always begin with a letter; they generally do not contain Roman numerals (e.g., I, II, III), Greek letters (e.g., α, β, γ), or punctuation. Protein symbols are identical to their corresponding gene symbols except that they are not italicized (e.g., AFP).

Mice and rats: Gene symbols are italicized, with only the first letter in upper-case (e.g., Gfap). Protein symbols are not italicized, and all letters are in upper-case (e.g., GFAP).

Fish: In contrast to the general rule, full gene names are italicized (e.g., brass). Gene symbols are also italicized, with all letters in lower-case (e.g., brs). Protein symbols are not italicized, and the first letter is upper-case (e.g., Brs).

Flies: Gene names and symbols begin with an upper-case letter if: (1) the gene is named for a protein or (2) the gene was first named for a mutant phenotype that is dominant to the wild-type phenotype (e.g., Rpp30). Gene names and symbols begin with a lower-case letter if the gene was first named for a mutant phenotype that is recessive to the wild-type phenotype (e.g., kis). Gene symbols are italicized. Symbols for proteins that were named for genes begin with an upper-case letter, but there are no accepted formatting guidelines for proteins that were not named for genes. Protein symbols are not italicized.

Worms: Gene symbols are italicized and generally composed of three to four letters, a hyphen, and an Arabic number (e.g., abu-1). Protein symbols are not italicized, and all letters are in upper-case (e.g., ABU-1).

Bacteria: Gene symbols are typically composed of three lower-case, italicized letters that serve as an abbreviation of the process or pathway in which the gene product is involved (e.g., rpo genes encode RNA polymerase). To distinguish among the different genes involved in the same process or pathway, the abbreviation is followed by an upper-case letter (e.g., the rpoB gene encodes the β subunit of RNA polymerase). Protein symbols are not italicized, and the first letter is upper-case (e.g., RpoB).
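
For authors who prepare symbol lists or tables programmatically, the letter-case conventions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names CASE_RULES and format_symbol are hypothetical, the organism keys are simplified labels rather than official database categories, and the sketch handles capitalization only, because italic type has to be applied by the word processor or typesetting system. Flies are deliberately left out, since their capitalization depends on how the gene was first named and cannot be derived from the symbol alone.

```python
# Illustrative sketch only: letter-case conventions for gene and protein symbols.
# CASE_RULES and format_symbol are hypothetical names, not part of any library.
# Italics are NOT handled here; apply them in your typesetting tool.


def _bacterial_gene(symbol: str) -> str:
    # Three lower-case letters followed by an upper-case letter (e.g., rpoB).
    return symbol[:3].lower() + symbol[3:].upper()


def _bacterial_protein(symbol: str) -> str:
    # Same as the gene symbol, but with the first letter in upper-case (e.g., RpoB).
    return symbol[:1].upper() + symbol[1:3].lower() + symbol[3:].upper()


# Each entry maps an organism label to (gene rule, protein rule).
CASE_RULES = {
    "human": (str.upper, str.upper),          # AFP / AFP
    "mouse": (str.capitalize, str.upper),     # Gfap / GFAP
    "rat": (str.capitalize, str.upper),       # Gfap / GFAP
    "fish": (str.lower, str.capitalize),      # brs / Brs
    "worm": (str.lower, str.upper),           # abu-1 / ABU-1
    "bacteria": (_bacterial_gene, _bacterial_protein),  # rpoB / RpoB
}


def format_symbol(symbol: str, organism: str, kind: str) -> str:
    """Return the symbol with the letter case expected for a gene or a protein."""
    try:
        gene_rule, protein_rule = CASE_RULES[organism]
    except KeyError:
        raise ValueError(f"no case rule for organism: {organism!r}") from None
    if kind == "gene":
        return gene_rule(symbol)
    if kind == "protein":
        return protein_rule(symbol)
    raise ValueError(f"kind must be 'gene' or 'protein', got {kind!r}")


if __name__ == "__main__":
    for organism, symbol in [("mouse", "gfap"), ("fish", "BRS"), ("bacteria", "rpob")]:
        gene = format_symbol(symbol, organism, "gene")
        protein = format_symbol(symbol, organism, "protein")
        print(f"{organism}: gene {gene} (italic), protein {protein} (roman)")
```

A helper like this is useful mainly as a consistency check on long tables; the approved symbol in the relevant organism database and the target journal's instructions remain the authority.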

Journal-specific formatting guidelines

Although following these general guidelines improves the precision of your scientific writing and helps prevent confusion among your readers, it is important to keep in mind that different journals sometimes have different rules for how genetic terms (e.g., genes, RNA, cDNA, proteins, genotypes, phenotypes, designation of mutant alleles) should be formatted. Therefore, before submitting your manuscript to a journal, we recommend checking that your formatting adheres to the journal’s specific guidelines, which may be provided in the “Instructions for Authors” on the journal website.
