
So the paper I am reading (here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7094582/) uses densitometric units to quantify the western blots.

The authors mentioned that the unphosphorylated proteins were used as loading controls, and their quantities are represented by the densitometric graphs for each corresponding figure, but protein activity is measured by their phosphorylated counterparts which I believe is represented by the densitometric graph on the right of each corresponding figure.

Questions:

- Did I interpret that correctly?
- What exactly are loading controls?
- I am trying to use these data for parameter estimation purposes. Is it possible to find the basal mass quantity of a particular protein?

Answers to each question:

- Approximately, yes. In this case, phosphorylated Akt (p-Akt) is supposed to be the active form of the enzyme that goes around the cell doing its job, whereas non-phosphorylated Akt doesn't have activity that they are interested in. The authors do show that targets of Akt show characteristics of Akt activity when Akt is phosphorylated (Fig 2B, 2C).

Note that densitometry is just a way of measuring the intensity of some bands on immunoblots (westerns). In this case, it is normalized by the loading control (non-phospho Akt). So you could interpret the densitometry plots as showing the *proportion* of Akt that is phosphorylated. It is a little confusing because they subsequently normalize them to a specific timepoint. From the methods:

> The results were quantitated by densitometric analysis (ImageQuant, Molecular Dynamics and PDSI, GE Healthcare). The ratio of phosphoprotein to its respective internal control was normalized to the control level at 0.5 h, arbitrarily set to 1.
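To make the two normalization steps concrete, here is a minimal Python sketch. It assumes you already have band intensities for the phospho band and the total (loading control) band in each lane; the function name and the numbers are made up for illustration and are not taken from the paper.

```python
def normalize_densitometry(phospho, total, reference_index=0):
    """Return phospho/total ratios rescaled so the reference lane equals 1."""
    if len(phospho) != len(total):
        raise ValueError("phospho and total must have the same number of lanes")
    # Step 1: divide each phospho band by its own loading control.
    ratios = [p / t for p, t in zip(phospho, total)]
    # Step 2: express every ratio relative to the reference lane (e.g. 0.5 h control).
    reference = ratios[reference_index]
    return [r / reference for r in ratios]


# Hypothetical band intensities in arbitrary densitometric units, one per lane.
p_akt = [120.0, 300.0, 450.0]
total_akt = [1000.0, 1200.0, 900.0]

relative = normalize_densitometry(p_akt, total_akt)
print([round(r, 2) for r in relative])
```

The result is a fold change relative to the reference lane, so it carries no absolute units. That is why these plots cannot be converted directly into protein mass.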

Loading controls indicate how much protein of interest is present. For example, 1% of 100 mg of protein is the same amount as 10% of 10 mg of protein (each is 1 mg). However, the difference in interpretation between 1% and 10% may be quite large. So if what you are interested in is a percentage or proportion, you need to have a loading control to allow unbiased comparison or normalization.

I don't really know what "basal mass quantity" means, but you could look into the molecular weight of proteins (usually measured in kilodaltons). For example, the molecular weight of Akt is ~55 kDa. You can compute this from the protein sequence by adding up the weights of all the amino acids.
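If it helps, the "add up the amino acids" calculation can be sketched in a few lines of Python. The residue masses below are the commonly tabulated average masses in daltons (worth checking against a reference table before relying on them), and the example peptide is an arbitrary placeholder, not the Akt sequence.

```python
# Average residue masses in daltons (amino acid minus one water).
# Assumption: values are from standard tables; verify before serious use.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight(sequence):
    """Approximate average molecular weight (Da) of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


# Arbitrary example peptide, used only to show the call.
print(round(molecular_weight("DAVDKEN"), 1), "Da")
```

This ignores phosphorylation and other modifications, so a blot band can run at a somewhat different apparent size than the computed value.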

## Decoding JBC Publication Guidelines

To help you successfully publish your quantitative Western blot data, we have taken a closer look at the Journal of Biological Chemistry (JBC) guidelines and compiled resources to help you meet the documentation requirements.

JBC has spelled out specific requirements for authors pertaining to the analysis and submission of data from immunoblotting experiments. Many other journals and grant agencies have detailed requirements as well, so it is always best to check the appropriate website and review the specific journal’s instructions for authors.


## The necessity of and strategies for improving confidence in the accuracy of western blots

Western blotting is one of the most commonly used laboratory techniques for identifying proteins and semi-quantifying protein amounts; however, several recent findings suggest that western blots may not be as reliable as previously assumed. This is not surprising since many labs are unaware of the limitations of western blotting. In this manuscript, we review essential strategies for improving confidence in the accuracy of western blots. These strategies include selecting the best normalization standard, proper sample preparation, determining the linear range for antibodies and protein stains relevant to the sample of interest, confirming the quality of the primary antibody, preventing signal saturation and accurately quantifying the signal intensity of the target protein. Although western blotting is a powerful and indispensable scientific technique that can be used to accurately quantify relative protein levels, it is necessary that proper experimental techniques and strategies are employed.

### Financial & competing interests disclosure

*This work was supported by National Institutes of Health Grants HL096819 and HL080101 and UC Davis research funds. AV Gomes presented a talk on ‘Can Western blots be trusted?’ at the Experimental Biology 2014 meeting which was sponsored by Bio-Rad. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.*

*No writing assistance was utilized in the production of this manuscript.*

- Signal detection. The ease with which an x-ray film becomes saturated with signal gives investigators a false sense that film is more sensitive than digital imagers. Laboratories should be encouraged to switch from x-ray films to digital imagers or learn how to validate that signals are not saturated on x-ray films.
- The misconception that housekeeping proteins are the best normalization method for western blots needs to be addressed.
- Researchers need to load sample amounts in which the target detected by the antibody is within the linear range.
- A high volume of poor-quality antibodies is available.
- Positive and negative controls are needed to validate antibodies.
- The molecular weight of the target protein on blots needs to be determined and stated.
- An unbiased database is needed that allows researchers to document good- and poor-quality antibodies.
- Publications need to describe the western blotting technique utilized in more detail. A minimum reporting standard for western blotting data should be established.

## Interpreting Western Blot from Pull Down Analysis

Hi everyone, I'm having trouble understanding some aspects of this figure and would greatly appreciate it if anyone could provide some insight! Thank you in advance.

What are these figures basically saying? What's the general gist.

What does the "10% bound" vs "bound" mean?

Why are there multiple bands for the blots probed with anti-GST antibody? and why do the bands differ in size?

In essence they’re looking to see which parts of the proteins are required for interaction. When they write deltaN1-127, for example, that means they’re deleting amino acids 1-127 from the N terminus of Cdt1. When they write deltaC381-557, they’re deleting amino acids 381-557 at the C terminus. That’s why they’re different sizes. They purified it and pulled down Cdt1 to test for interaction with geminin on the left and MCM4/6/7 on the right. Wherever you see bands in the bound segment, that means those Cdt1 mutants were still capable of interacting. I imagine the 10% unbound basically means they’re taking the supernatant from the beads, so that would be protein that wasn’t found to interact with Cdt1. As for the multiple bands, it could be protein breakdown? I’m not quite sure.

## Primary antibody selection considerations for western blot analysis

#### Polyclonal vs monoclonal vs recombinant antibodies

| | Polyclonal | Monoclonal | Recombinant |
|---|---|---|---|
| Definition | Collection of antibodies from different B cells that recognize multiple epitopes on the same antigen | Single antibody produced by identical B cell clones that recognize one epitope on the same antigen | Single antibody derived from recombinant DNA. Can be modified on the DNA level or used to generate a defined antibody pool. |
| Advantages | Highly sensitive. Many antibodies in the polyclonal pool can bind epitopes on the antibody target. | Lot-to-lot consistency. Often well characterized, with historic knowledge of specific clones and publications for performance in western blotting. | Stable, long-term supply with lot-to-lot consistency. Not susceptible to cell-line drift. |
| Disadvantages | Lot-to-lot variability of the antibody pool can result in inconsistent detection. Epitopes similar to the target can contribute to detection of nonspecific bands. | Sensitivity of detection is dependent on abundance and exposure of a single epitope. Cell-line drift could result in subtle, long-term changes to the antibody. | Very specialized and epitope dependent. Longer development time. May need more upfront optimization. Usually higher price. |

Polyclonal, monoclonal and recombinant antibodies all work well for western blotting. Polyclonal antibodies are a pool of many monoclonal antibodies, which can vary from immunization to immunization and lot-to-lot. Polyclonal antibodies recognize multiple epitopes of an antigen and are therefore usually more sensitive than monoclonal antibodies that recognize only one epitope. This can be a benefit when epitope abundance, epitope masking or epitope exposure is a concern. Polyclonal antibodies are less expensive and less time-consuming to produce.

Monoclonal antibodies are valued for their lot-to-lot consistency and, in many cases, extensive characterization and publication history. Monoclonal antibodies are usually produced by cell lines that generate one individual antibody clone. These cell lines (or hybridomas) are grown in cell culture when the antibody is needed for production. Just like any often-propagated cell line, these cell lines could potentially undergo gradual changes affecting antibody production yields or even antibody characteristics.

Recombinant antibodies are the best option for consistent, animal origin-free antibody production and lot-to-lot consistency. Recombinant antibodies are produced by transfecting production cell lines with recombinant DNA that encodes the desired immunoglobulins. Recombinant antibodies have several benefits: they can be modified at specific sites to add desired characteristics to IgGs, and, unlike hybridoma-derived monoclonals, they are not subject to cell-line drift. Recombinant antibodies can be pooled to generate recombinant antibody pools, such as recombinant polyclonal primary antibodies or superclonal recombinant secondary antibodies.

Antibodies are usually provided purified in PBS or similar buffers; however, in some cases, crude antibody preparations such as serum or ascites fluid are necessary in order to maintain certain antibody characteristics or antibody yield. It is important to optimize western blotting protocols to minimize the impact of impurities present in crude antibody preparations on background.

### Direct vs indirect detection

Both direct and indirect methods of detection can be used in western blotting. Each method provides its own advantages and disadvantages. With the direct detection method, an enzyme- or fluorophore-conjugated primary antibody is used to detect the antigen of interest on the blot. In the indirect detection method, an unlabeled primary antibody is first used to bind to the antigen. Subsequently, the primary antibody is detected using an enzyme- or fluorophore-conjugated secondary antibody. The indirect method offers many advantages over the direct method, which are described below.

#### Direct detection: advantages

- Requires only one antibody
- Saves time by eliminating the secondary antibody incubation step
- Eliminates possible background from secondary antibody cross-reactivity in certain samples

#### Indirect detection: advantages

- Signal amplification, since multiple labeled secondaries bind to each primary
- Many options using HRP, Alexa Fluor or Alexa Fluor Plus labeled secondary antibodies
- Saves time with multiplexed detection using fluorescent secondary antibodies
- Further signal amplification possible with biotinylated secondaries and fluorescent or enzyme-labeled streptavidin

#### Direct detection: disadvantages

- Label may interfere with target binding, resulting in lower sensitivity and higher background
- Selection of directly conjugated primary antibodies is limited

#### Indirect detection: disadvantages

- Additional steps required
- Potential for non-specific binding that may increase background in certain samples

### Host species considerations

Multiple species are used to generate antibodies for western blot applications, most commonly mouse, rabbit, rat, goat, donkey and chicken. The choice of host species for the primary antibody depends on whether a single target or multiple targets are being probed in a multiplex western blot experiment. When investigating only one antigen at a time, theoretically any host species can be used; however, most primary research antibodies for western blotting are produced from immunized rabbits (polyclonal, monoclonal, recombinant) or mice (hybridoma-derived monoclonals). Some host species provide additional advantages over others, for example due to their size or immune biology. Compared with mice, for example, rabbits usually tolerate immunizations better and have a significantly longer life span. Furthermore, rabbits exhibit a more diverse natural repertoire of antibodies than mice, which makes rabbits a popular host for the generation of polyclonal, monoclonal and rabbit recombinant antibodies. When aiming to generate the most suitable monoclonal or recombinant antibody for western blotting, the greater repertoire of rabbit-produced antibodies allows for more successful screening, isolation and cloning of high-affinity recombinant antibodies. This is especially important when aiming to make western blot antibodies to more challenging epitopes that may not be feasible to produce with other systems.

When performing a multiplex western blot, use primary antibodies from different host species for each target being probed. Ideally, use a combination of antibodies from two distantly related species such as rat and rabbit, avoiding combinations like mouse and rat or goat and sheep. This will aid in the selection of appropriate secondary antibodies to minimize potential antibody cross-reactivity, which can lead to confusing results.

#### Multiplexing host considerations

| First primary antibody host species | Second primary antibody host species | Suitable combination |
|---|---|---|
| Rat | Rabbit | ✓ |
| Mouse | Rabbit | ✓ |
| Goat | Rabbit or mouse | ✓ |
| Mouse | Rat | X |
| Goat | Sheep | X |

### Western blot validated antibodies

Although antibodies are designed to recognize a specific target antigen, they may not work equally in all applications. Choose antibodies designated specifically for western blotting or that list western blotting as an application. In addition, it is important to confirm that the antibody is specific towards the native or denatured protein, to determine if SDS-PAGE or native PAGE should be performed.

#### When selecting a primary antibody, confirm:

- Antibody is validated for western blotting
- Antibody specificity towards the native or denatured protein

Invitrogen antibodies undergo a rigorous 2-part testing approach. Part 1: Target specificity verification, Part 2: Functional application validation. Target specificity verification helps ensure the antibody will bind to the correct target. Most antibodies were developed with specific applications in mind. Testing that an antibody generates acceptable results in a specific application is the second part of confirming antibody performance.

## 4. Probabilities and Proportions

### 4.1. Introduction

Sections 2 and 3 dealt exclusively with issues related to means. For many experiments conducted in our field, however, mean values are not the end goal. For example, we may seek to determine the *frequency* of a particular defect in a mutant background, which we typically report as either a *proportion* (e.g., 0.7) or a *percentage* (e.g., 70%). Moreover, we may want to calculate CIs for our sample percentages or may use a formal statistical test to determine if there is likely to be a real difference between the frequencies observed for two or more samples. In other cases, our analyses may be best served by determining *ratios* or *fold changes*, which may require specific statistical tools. Finally, it is often useful, particularly when carrying out genetic experiments, to be able to calculate the *probabilities* of various outcomes. This section will cover major points that are relevant to our field when dealing with these common situations.

### 4.2. Calculating simple probabilities

Most readers are likely proficient at calculating the probability of two *independent* events occurring through application of the *multiplication rule*. Namely, if event A occurs 20% of the time and event B occurs 40% of the time, then the probability of event A and B both occurring is 0.2 × 0.4 = 0.08 or 8%. More practically, we may wish to estimate the frequency of EcoRI restriction endonuclease sites in the genome. Because the EcoRI binding motif is GAATTC and each nucleotide has a roughly one-in-four chance of occurring at each position, then the chance that any six-nucleotide stretch in the genome will constitute a site for EcoRI is $(0.25)^6 = 0.000244140625$ or 1 in 4,096. Of course, if all nucleotides are not equally represented or if certain sequences are more or less prevalent within particular genomes, then this will change the true frequency of the site. In fact, GAATTC is over-represented in phage genomes but under-represented in human viral sequences (Burge et al., 1992). Thus, even when calculating straightforward probabilities, one should be careful not to make false assumptions regarding the independence of events.
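The multiplication rule for a sequence motif can be sketched in Python as follows. The function name and the skewed base composition in the second call are hypothetical, chosen only to show how unequal nucleotide frequencies change the answer; the sketch still assumes that positions are independent, which the text warns may not hold in real genomes.

```python
def motif_probability(motif, base_freqs=None):
    """Probability that a random stretch matches `motif`, assuming independent positions."""
    if base_freqs is None:
        base_freqs = {base: 0.25 for base in "ACGT"}  # equal base usage
    probability = 1.0
    for base in motif:
        probability *= base_freqs[base]  # multiplication rule, one factor per position
    return probability


p_equal = motif_probability("GAATTC")
print(p_equal, "or 1 in", 1 / p_equal)  # 0.000244140625 or 1 in 4096.0

# Hypothetical AT-rich genome: A and T at 0.3 each, G and C at 0.2 each.
at_rich = {"A": 0.3, "T": 0.3, "G": 0.2, "C": 0.2}
print(motif_probability("GAATTC", at_rich))
```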

In carrying out genetic studies, we will often want to determine the likelihood of obtaining a desired genotype. For example, if we are propagating an unbalanced recessive lethal mutation (*let*), we will need to pick phenotypically wild-type animals at each generation and then assess for the presence of the lethal mutation in the first-generation progeny. Basic Mendelian genetics (as applied to *C. elegans* hermaphrodites) states that the progeny of a *let/+* parent will be one-fourth *let/let*, one-half *let/+*, and one-fourth *+/+*. Thus, among the non-lethal progeny of a *let/+* parent, two-thirds will be *let/+* and one-third will be *+/+*. A practical question is how many wild-type animals we should single-clone at each generation to ensure that we pick at least one *let/+* animal. In this case, using the *complement* of the multiplication rule, also referred to as the probability of “*at least one*”, will be most germane. We start by asking what the probability is of an individual not being *let/+*, which in this case is one-third or 0.333. Therefore the probability of picking five animals, none of which are of genotype *let/+*, is $(0.333)^5$ or 0.41%, and the probability of picking at least one *let/+* would be 1 − 0.0041 = 0.9959 or 99.59%. Thus, picking five wild-type animals will nearly guarantee that at least one of the F1 progeny is of our desired genotype. Furthermore, there is a $(0.667)^5 \approx 0.132$, or 13.2%, chance that all five animals will be *let/+*.
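The "at least one" calculation, and the related question of how many animals to pick for a chosen level of assurance, can be sketched in Python. The function names and the 95% and 99% targets are illustrative assumptions, not values from the text.

```python
def prob_at_least_one(p_success, n_picks):
    """Complement rule: 1 minus the probability that every pick fails."""
    return 1.0 - (1.0 - p_success) ** n_picks


def min_picks_for(p_success, target):
    """Smallest number of picks giving at least `target` probability of one success."""
    if not (0.0 < p_success <= 1.0) or not (0.0 < target < 1.0):
        raise ValueError("p_success must be in (0, 1] and target in (0, 1)")
    n = 1
    while prob_at_least_one(p_success, n) < target:
        n += 1
    return n


p_het = 2 / 3  # chance that a wild-type-looking animal is let/+
print(f"{prob_at_least_one(p_het, 5):.4f}")  # about 0.9959
print(min_picks_for(p_het, 0.95), min_picks_for(p_het, 0.99))
```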

### 4.3. Calculating more-complex probabilities

To calculate probabilities for more-complex problems, it is often necessary to account for the total number of *combinations* or *permutations* that are possible in a given situation. In this vernacular, the three different arrangements of the letters ABC, ACB, BAC, are considered to be distinct permutations, but only one combination. Thus, for permutations the order matters, whereas for combinations it does not. Depending on the situation, either combinations or permutations may be most relevant. Because of the polarity inherent to DNA polymers, GAT and TAG are truly different sequences and thus permutations would be germane. So far as a standard mass spectroscopy is concerned, however, the peptides DAVDKEN and KENDAVD are identical, and thus combinations might be more relevant in this case.

To illustrate the process of calculating combinations and permutations, we'll first use an example involving peptides. If each of the twenty standard amino acids (aa) is used only once in the construction of a 20-aa peptide, how many distinct sequences can be assembled? We start by noting that the order of the amino acids will matter, and thus we are concerned with permutations. In addition, given the setup where each amino acid can be used only once, we are *sampling without replacement*. The solution can be calculated using the following generic formula: # of permutations = *n*!. Here *n* is the total number of items and “!” is the mathematical *factorial* symbol, such that *n*! = *n* × (*n* − 1) × (*n* − 2) … × 1. For example, 5! = 5 × 4 × 3 × 2 × 1 = 120. Also by convention, 1! and 0! are both equal to one. To solve this problem we therefore multiply 20 × 19 × 18 … 3 × 2 × 1 or $20! \approx 2.4 \times 10^{18}$, an impressively large number. Note that because we were sampling without replacement, the incremental decrease with each multiplier was necessary to reflect the reduced choice of available amino acids at each step. Had we been sampling *with* replacement, where each amino acid can be used any number of times, the equation would simply be $20^{20} \approx 1.0 \times 10^{26}$, an even more impressive number!
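Both counts are easy to reproduce with Python's standard library. This is only a sketch of the arithmetic described above; the variable names are arbitrary.

```python
from math import factorial

n_amino_acids = 20

# Without replacement: each amino acid used exactly once, so n! orderings.
without_replacement = factorial(n_amino_acids)

# With replacement: 20 choices at each of 20 positions, so n**n sequences.
with_replacement = n_amino_acids ** n_amino_acids

print(f"{without_replacement:.2e}")
print(f"{with_replacement:.2e}")
```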

Going back to the previous genetic example, one might wish to determine the probability of picking five progeny from a parent that is *let/+* where three are of genotype *let/+* and two are *+/+*. One thought would be to use the multiplication rule where we multiply 0.667 × 0.667 × 0.667 × 0.333 × 0.333, or more compactly, $(0.667)^3 (0.333)^2 = 0.0329$ or 3.29%. If this seems a bit lower than you might expect, your instincts are correct. The above calculation describes only the probability of obtaining any one particular sequence that produces three *let/+* (L) and two *+/+* (W) worms. For this reason, it underestimates the true frequency of interest, since there are multiple ways of getting the same combination. For example, one possible order would be WWLLL, but equally probable are WLWLL, WLLWL, WLLLW, LLLWW, LLWLW, LLWWL, LWLLW, LWWLL, and LWLWL, giving a total of ten possible permutations. Of course, unlike peptides or strands of DNA, all of the possible orders are equivalent with respect to the relevant outcome, obtaining three *let/+* and two *+/+* worms. Thus, we must take permutations into account in order to determine the frequency of the generic combination. Because deriving permutations by hand (as we did above) becomes cumbersome (if not impossible) very quickly, one can use the following equation, where *n* is the total number of items with $n_1$ that are alike and $n_2$ that are alike, etc., up through $n_k$.

$$\text{\# of permutations} = \frac{n!}{n_1!\, n_2! \cdots n_k!}$$

Thus plugging in the numbers for our example, we would have $5!/(3!\,2!) = 120/(6 \times 2) = 10$. Knowing the number of possible permutations we can then multiply this by the probability of getting any single arrangement of three *let/+* and two *+/+* worms calculated above, such that 0.0329 × 10 = 0.329 or 32.9%, a number that makes much more sense. This illustrates a more general rule regarding the probability (*Pr*) of obtaining specific combinations:

*Pr* combination = (# of permutations) × (probability of obtaining any single permutation)

Note, however, that we may often be interested in a slightly different question than the one just posed. For example, what is the probability that we will obtain at least three *let/+* animals with five picks from a *let/+* parent? In this case, we would have to sum the probabilities for three, four, and five out of five *let/+* animals, using for each term the probability of a single arrangement with that number of *let/+* animals: three out of five [$(5!/(3!\,2!)) \times (2/3)^3 (1/3)^2 = 10 \times 0.0329 = 0.329$], four out of five [$(5!/(4!\,1!)) \times (2/3)^4 (1/3) = 5 \times 0.0658 = 0.329$], and five out of five [$(2/3)^5 = 0.132$], giving us 0.329 + 0.329 + 0.132 = 0.790 or 79.0%.
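The combination rule can be checked numerically with a short Python sketch. It assumes each pick is independently *let/+* with probability 2/3, and it recomputes the single-arrangement probability separately for each count, since that probability differs between three, four, and five *let/+* animals. The function names are illustrative.

```python
from math import comb


def prob_exactly(k, n, p):
    """(# of arrangements) x (probability of any single arrangement)."""
    arrangements = comb(n, k)  # n! / (k! (n - k)!)
    single_arrangement = p ** k * (1 - p) ** (n - k)
    return arrangements * single_arrangement


def prob_at_least(k, n, p):
    """Sum the exact probabilities for k, k + 1, ..., n successes."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))


p_het = 2 / 3  # probability that a non-lethal progeny is let/+
print(round(prob_exactly(3, 5, p_het), 3))   # three let/+ and two +/+
print(round(prob_at_least(3, 5, p_het), 3))  # at least three let/+
```

Running the sum this way is a useful cross-check on hand calculations, because it is easy to reuse the wrong single-arrangement probability when adding several terms.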

The ability to calculate permutations can also be used to determine the number of different nucleotide sequences in a 20-mer where each of the four nucleotides (G, A, T, C) is used five times. Namely, $20!/(5!)^4 \approx 1.2 \times 10^{10}$. Finally, we can calculate the number of different peptides containing five amino acids where each of the twenty amino acids is chosen once without replacement. In this case, we can use a generic formula where *n* is the total number of items from which we select *r* items without replacement.

$$\text{\# of permutations} = \frac{n!}{(n - r)!}$$

This would give us 20!/(20 − 5)! = 20!/(15)! = 20 × 19 × 18 × 17 × 16 = 1,860,480. The same scenario carried out with replacement would simply be $(20)^5 = 3{,}200{,}000$. Thus, __using just a handful of formulas, we are empowered with the ability to make a wide range of predictions for the probabilities that we may encounter__. This is important because probabilities are not always intuitive, as illustrated by the classic “birthday problem”, which demonstrates that within a group of only 23 people, there is a >50% probability that at least two will share the same birthday.
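These counts, and the birthday problem, can be reproduced with a brief Python sketch. It assumes Python 3.8 or later for `math.perm`, 365 equally likely birthdays, and helper names that are purely illustrative.

```python
from math import factorial, perm


def multiset_permutations(counts):
    """n! / (n1! n2! ... nk!) for items that fall into groups of identical members."""
    result = factorial(sum(counts))
    for count in counts:
        result //= factorial(count)
    return result


def prob_shared_birthday(group_size, days=365):
    """Chance that at least two people share a birthday (complement of all distinct)."""
    all_distinct = perm(days, group_size) / days ** group_size
    return 1.0 - all_distinct


print(multiset_permutations([5, 5, 5, 5]))  # 20-mers using each nucleotide five times
print(perm(20, 5))                          # 5-aa peptides without replacement
print(20 ** 5)                              # 5-aa peptides with replacement
print(round(prob_shared_birthday(23), 3))
```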

### 4.4. The Poisson distribution

Certain types of probabilistic events can be modeled using a distribution developed by the French mathematician Siméon Denis Poisson. Specifically, the *Poisson distribution* can be used to predict the probability that a given number of *events* will occur over a stipulated *interval* of time, distance, space, or other related measure, when said events occur independently of one another. For example, given a known forward mutation rate caused by a chemical mutagen, what is the chance that three individuals from a collection of 1,000 F1s (derived from mutagenized P0 parents) will contain a mutation in gene X? Also, what is the chance that any F1 worm would contain two or more independent mutations within gene X?

The generic formula used to calculate such probabilities is shown below, where *μ* is the mean number of expected events, *x* is the number of times that the event occurs over a specified interval, and *e* is the base of the natural logarithm.

$$P(x) = \frac{e^{-\mu}\, \mu^{x}}{x!}$$

For this formula to predict probabilities accurately, it is required that the events be independent of each other and occur at a constant average rate over the given interval. If these criteria are violated, then the Poisson distribution will not provide a valid model. For example, imagine that we want to calculate the likelihood that a mutant worm that is prone to seizures will have two seizures (i.e., events or *x*) within a 5-minute interval. For this calculation, we rely on previous data showing that, on average, mutant worms have 6.2 seizures per hour. Thus, the average (*μ*) for a 5-minute interval would be 6.2/12 = 0.517. Plugging these numbers into the above formula we obtain *P(x)* = 0.080 or 8%. Note that if we were to follow 20 different worms for 5 minutes and observed six of them to have two seizures, this would suggest that the Poisson distribution is not a good model for our particular event. Rather, the data would suggest that multiple consecutive seizures occur at a frequency that is higher than predicted by the Poisson distribution, and thus the seizure events are not independent. In contrast, had only one or two of the 20 worms exhibited two seizures within the time interval, this would be consistent with a Poisson model.
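The seizure example can be reproduced with a minimal Python sketch of the standard Poisson probability mass function, $P(x) = e^{-\mu}\mu^{x}/x!$. The function name is illustrative, and the 20-worm expectation is simply 20 times the single-worm probability.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """Probability of exactly x events when the mean number of events is mu."""
    return exp(-mu) * mu ** x / factorial(x)


mu = 6.2 / 12  # 6.2 seizures per hour, rescaled to a 5-minute interval
p_two = poisson_pmf(2, mu)
print(round(p_two, 3))  # about 0.08

# Expected number of worms (out of 20) showing exactly two seizures.
print(round(20 * p_two, 1))
```

An expectation of between one and two worms out of 20 is why observing six such worms would argue against the Poisson model.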

### 4.5. Intuitive methods for calculating probabilities

Another useful strategy for calculating probabilities, as well as other parameters of interest that are governed by chance, is sometimes referred to as the *intuitive approach*. This includes the practice of plugging hypothetical numbers into scenarios to maximize the clarity of the calculations and conclusions. Our example here will involve efforts to maximize the efficiency of an F2-clonal genetic screen to identify recessive maternal-effect lethal or sterile mutations (Figure 11). For this experiment, we will specify that 100 P0 adults are to be cloned singly onto plates following mutagenesis. Then ten F1 progeny from each P0 will be single-cloned, some small fraction of which will be heterozygous for a desired class of mutation (*m/+*). To identify mutants of interest, however, F2s of genotype *m/m* must be single-cloned, and their F3 progeny must be inspected for the presence of the phenotype. The question is: what is the optimal number of F2s to single-clone from each F1 plate?

#### Figure 11

Schematic diagram of F2-clonal genetic screen for recessive mutations in *C. elegans*.

Mendelian genetics states that the chance of picking an *m/m* F2 from an *m/+* F1 parent is one in four or 25%, so picking more will of course increase the likelihood of obtaining the desired genotype. But will the returns prove diminishing and, if so, what is the most efficient practice? Table 4 plugs in real numbers to determine the frequency of obtaining *m/m* animals based on the number of cloned F2s. The first column shows the number of F2 animals picked per F1, which ranges from one to six. In the second column, the likelihood of picking at least one *m/m* animal is determined using the inverse multiplication rule. As expected, the likelihood increases with larger numbers of F2s, but diminishing returns are evident as the number of F2s increases. Columns 3–5 tabulate the number of worm plates required, the implication being that more plates are both more work and more expense. Columns six and eight calculate the expected number of *m/m* F2s that would be isolated given frequencies of (*m/+*) heterozygotes of 0.01 (10 in 1,000 F1s) or 0.001 (1 in 1,000 F1s), respectively. Here, a higher frequency would imply that the desired mutations of interest are more common. Finally, columns seven and nine show the predicted efficiencies of the screening strategies by dividing the number of isolated *m/m* F2s by the total number of F1 and F2 plates required (e.g., $2.50/2{,}000 = 1.25 \times 10^{-3}$).

#### Table

Table 4. Intuitive approach to determine the maximum efficiency of an F2-clonal genetic screen.

From this we can see that either two or three F2s is the most efficient use of plates and possibly time, although other considerations could influence the decision of how many F2s to pick. We can also see that the *relative efficiencies* are independent of the frequency of the mutation of interest. Importantly, this potentially useful insight was accomplished using basic intuition and a very rudimentary knowledge of probabilities. Of course, the outlined intuitive approach failed to address whether the optimal number of cloned F2s is 2.4 or 2.5, but as we haven't yet developed successful methods to pick or propagate fractions of *C. elegans*, such details are irrelevant!

We note that an online tool has been created by Shai Shaham (Shaham, 2007) that allows users to optimize the efficiency of genetic screens in *C. elegans*. To use the tool, users enter several parameters that describe the specific genetic approach (e.g., F1 versus F2 clonal). The website's algorithm then provides a recommended F2-to-F1 screening ratio. Entering parameters that match the example used above, the website suggests picking two F2s for each F1, which corresponds to the number we calculated using our intuitive approach. In addition, the website provides a useful tool for calculating the screen size necessary to achieve a desired level of genetic saturation. For example, one can determine the number of cloned F1s required to ensure that all possible genetic loci will be identified at least once during the course of screening with a 95% confidence level.
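The logic behind Table 4 can be sketched in Python. This reconstruction assumes 1,000 F1 plates (100 P0s × 10 F1s), one plate per cloned F2, and that the "expected number isolated" is the number of *m/+* F1s multiplied by the chance of recovering at least one *m/m* F2 from each; these assumptions reproduce the 2.50/2,000 example quoted in the text, but the function and its layout are illustrative rather than the authors' own table.

```python
def screen_efficiency(f2_per_f1, het_freq, n_f1=1000, p_mm=0.25):
    """Return (P(at least one m/m), total plates, expected m/m isolated, efficiency)."""
    p_hit = 1.0 - (1.0 - p_mm) ** f2_per_f1    # at least one m/m among the F2s picked
    total_plates = n_f1 + n_f1 * f2_per_f1      # F1 plates plus F2 plates
    expected = het_freq * n_f1 * p_hit          # m/+ F1s that yield an m/m F2
    return p_hit, total_plates, expected, expected / total_plates


for n in range(1, 7):
    p_hit, plates, expected, eff = screen_efficiency(n, het_freq=0.01)
    print(f"{n} F2s: P={p_hit:.3f} plates={plates} expected={expected:.2f} eff={eff:.2e}")

# Number of F2s per F1 with the highest efficiency under these assumptions.
best = max(range(1, 7), key=lambda n: screen_efficiency(n, 0.01)[3])
print("most efficient:", best)
```

Because the heterozygote frequency multiplies every row by the same factor, changing `het_freq` rescales the efficiencies without changing which row is best.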

### 4.6. Conditional probability: calculating probabilities when events are not independent

In many situations, the likelihood of two events occurring is not independent. This does not mean that the two events need be totally interdependent or mutually exclusive, just that one event occurring may increase or decrease the likelihood of the other. Put another way, having prior knowledge of one outcome may change the effective probability of a second outcome. Knowing that someone has a well-worn “*C. elegans* International Meeting” t-shirt in his drawer does not guarantee that he is an aging nerd, but it certainly does increase the probability! The area of statistics that handles such situations is known as *Bayesian analysis* or inference, after an early pioneer in this area, Thomas Bayes. More generally, *conditional probability* refers to the probability of an event occurring based on the condition that another event has occurred. Although conditional probabilities are extremely important in certain types of biomedical and epidemiological research, such as predicting disease states given a set of known factors, this issue doesn't arise too often for most *C. elegans* researchers. Bayesian models and networks have, however, been used in the worm field for applications that include phylogenetic gene tree construction (Hoogewijs et al., 2008; Agarwal and States, 1996), modeling developmental processes (Sun and Hong, 2007), and predicting genetic interactions (Zhong and Sternberg, 2006). Bayesian statistics is also used quite extensively in behavioral neuroscience (Knill and Pouget, 2004; Vilares and Kording, 2011), which is a growing area in the *C. elegans* field. We refer interested readers to textbooks or the web for additional information (see Appendix A).

### 4.7. Binomial proportions

It is common in our field to generate data that take the form of *binomial proportions*. Examples would include the percentage of mutant worms that arrest as embryos or that display ectopic expression of a GFP reporter. As the name implies, binomial proportions arise from data that fall into two categories such as heads or tails, on or off, and normal or abnormal. More generically, the two outcomes are often referred to as a *success or failure*. To properly qualify, data forming a binomial distribution must be acquired by *random sampling*, and each outcome must be *independent* of all other outcomes. Coin flips are a classic example where the result of any given flip has no influence on the outcome of any other flip. Also, when using a statistical method known as the *normal approximation* (discussed below), the binomial dataset should contain a minimum of ten outcomes in each category (although some texts may recommend a more relaxed minimum of five). This is generally an issue only when relatively rare events are being measured. For example, flipping a coin 50 times would certainly result in at least ten heads or ten tails, whereas a phenotype with very low penetrance might be detected only in three worms from a sample of 100. In this latter case, a larger sample size would be necessary for the approximation method to be valid. Lastly, sample sizes should not be >10% of the entire population. As we often deal with theoretical populations that are effectively infinite in size, however, this stipulation is generally irrelevant.

An aside on the role of normality in binomial proportions is also pertinent here. It might seem counterintuitive, but the distribution of sample proportions arising from data that are binary does have, with sufficient sample size, an approximately normal distribution. This concept is illustrated in Figure 12, which computationally simulates data drawn from a population with an underlying “success” rate of 0.25. As can be seen, the distribution becomes more normal with increasing sample size. How large a sample is required, you ask? The short answer is that the closer the underlying rate is to 50%, the smaller the required sample size is; with a more extreme rate (i.e., closer to 0% or 100%), a larger size is required. The requirements are reasonably met by the aforementioned *minimum of ten* rule.

### 4.8. Calculating confidence intervals for binomial proportions

To address the accuracy of proportions obtained through random sampling, we will typically want to provide an accompanying CI. For example, political polls will often report the percentage in favor of a candidate along with a 95% CI, which may encompass several percentage points to either side of the midpoint estimate 40 . As previously discussed in the context of means, determining CIs for sample proportions is important because in most cases we can never know the true proportion of the population under study. Although different confidence levels can be used, binomial data are often accompanied by 95% CIs. As for means, lower CIs (e.g., 90%) are associated with narrower intervals, whereas higher CIs (e.g., 99%) are associated with wider intervals. Once again, the meaning of a 95% CI is the same as that discussed earlier in the context of means. If one were to repeat the experiment 100 times and calculate 95% CIs for each repeat, on average 95 of the calculated CIs would contain the true population proportion. Thus, there is a 95% chance that the CI calculated for any given experiment contains the true population proportion.

Perhaps surprisingly, there is no perfect consensus among statisticians as to which of several methods is best for calculating CIs for binomial proportions 41 . Thus, different textbooks or websites may describe several different approaches. That said, for most purposes we can recommend a test that goes by several names including the adjusted Wald, the modified Wald, and the Agresti-Coull (A-C) method (Agresti and Coull, 1998; Agresti and Caffo, 2000). Reasons for recommending this test are: (1) it is widely accepted, (2) it is easy to implement (if the chosen confidence level is 95%), and (3) it gives more-accurate CIs than other straightforward methods commonly in use. Furthermore, even though this approach is based on the normal approximation method, the *minimum of ten* rule can be relaxed.

To implement the A-C method for a 95% CI (the most common choice), add two to both the number of successes and failures. Hence this is sometimes referred to as the *plus-four* or *“+4” method*. It then uses the doctored numbers, together with the normal approximation method, to determine the CI for the population proportion. Admittedly this sounds weird, if not outright suspicious, but empirical studies have shown that this method gives consistently better 95% CIs than would be created using the simpler method. For example, if in real life you assayed 83 animals and observed larval arrest in 22, you would change the total number of trials to 87 and the number of arrested larvae to 24. In addition, depending on the software or websites used, you may need to choose the normal approximation method and not something called the *exact method* for this to work as intended.

Importantly, the proportion and sample size that you report should be the actual proportion and sample size from what you observed; the doctored (i.e., plus-four) numbers are used exclusively to generate the 95% CI. Thus, in the case of the above example, you would report the proportion as 22/83 = 0.265 or 26.5%. The 95% CI would, however, be calculated using the numbers 24 and 87 to give a 95% CI of 18.2%–37.0%. Note that the +4 version of the A-C method is specific for 95% CIs and not for other intervals 42 . Finally, it is worth noting that CIs for proportions that are close to either 0% or 100% will get a bit funky. This is because the CI cannot include numbers <0% or >100%. Thus, CIs for proportions close to 0% or 100% will often be quite lopsided around the midpoint and may not be particularly accurate. Nevertheless, unless the obtained percentage is 0 or 100, we do not recommend doing anything about this, as measures used to compensate for this phenomenon have their own inherent set of issues. In other words, if the percentage ranges from 1% to 99%, use the A-C method for calculation of the CI. In cases where the percentage is 0% or 100%, instead use the exact method.
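As a minimal sketch, the plus-four calculation for the 22/83 larval arrest example can be written out in a few lines of Python. The function name `plus_four_ci` and the hard-coded z value of 1.96 (the usual normal quantile for a 95% interval) are illustrative choices and not part of the original text.

```python
import math

def plus_four_ci(successes, trials, z=1.96):
    """Approximate 95% CI using the plus-four (Agresti-Coull style) adjustment."""
    adj_successes = successes + 2   # add two successes...
    adj_trials = trials + 4         # ...and two failures (four trials in total)
    p_adj = adj_successes / adj_trials
    # Normal approximation applied to the adjusted counts
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / adj_trials)
    return p_adj - half_width, p_adj + half_width

observed = 22 / 83                  # the proportion that is actually reported
low, high = plus_four_ci(22, 83)    # the interval uses the adjusted 24/87
print(f"reported proportion: {observed:.1%}")
print(f"95% CI: {low:.1%} to {high:.1%}")
```

Note that the reported proportion still comes from the raw counts; only the interval uses the adjusted numbers.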

### 4.9. Tests for differences between two binomial proportions

Very often we will want to compare two proportions for differences. For example, we may observe 85% larval arrest in mutants grown on control RNAi plates and 67% arrest in mutants on RNAi-feeding plates targeting gene *X*. Is this difference significant from a statistical standpoint? To answer this, two distinct tests are commonly used. These are generically known as the *normal approximation* and *exact* methods. In fact, many website calculators or software programs will provide the *P*-value calculated by each method as a matter of course, although in some cases you may need to select one method. The approximation method (based on the so-called normal distribution) has been in general use much longer, and the theory behind this method is often outlined in some detail in statistical texts. The major reason for the historical popularity of the approximation method is that prior to the advent of powerful desktop computers, calculations using the exact method simply weren't feasible. Its continued use is partly due to convention, but also because the approximation and exact methods typically give very similar results. Unlike the normal approximation method, however, the exact method is valid in all situations, such as when the number of successes is less than five or ten, and can thus be recommended over the approximation method.

Regardless of the method used, the *P*-value derived from a test for differences between proportions will answer the following question: What is the probability that the two experimental samples were derived from the same population? Put another way, the null hypothesis would state that both samples are derived from a single population and that any differences between the sample proportions are due to chance sampling. Much like statistical tests for differences between means, proportions tests can be one- or two-tailed, depending on the nature of the question. For the purpose of most experiments in basic research, however, two-tailed tests are more conservative and tend to be the norm. In addition, analogous to tests with means, one can compare an experimentally derived proportion against a historically accepted standard, although this is rarely done in our field and comes with the possible caveats discussed in Section 2.3. Finally, some software programs will report a 95% CI for the difference between two proportions. In cases where no statistically significant difference is present, the 95% CI for the difference will always include zero.

### 4.10. Tests for differences between more than one binomial proportion

A question that may arise when comparing more than two binomial proportions is whether or not multiple comparisons should be factored into the statistical analysis. The issues here are very similar to those discussed in the context of comparing multiple means (Section 3). In the case of proportions, rather than carrying out an ANOVA, a *Chi-square* test (discussed below) could be used to determine if any of the proportions are significantly different from each other. Like an ANOVA, however, this may be a somewhat less-than-satisfying test in that a positive finding would not indicate which particular proportions are significantly different. In addition, FDR and Bonferroni-type corrections could also be applied at the level of *P*-value cutoffs, although these may prove to be too conservative and could reduce the ability to detect real differences (i.e., the *power* of the experiment).

In general, we can recommend that for findings confirmed by several independent repeats, corrections for multiple comparisons may not be necessary. We illustrate our rationale with the following example. Suppose you were to carry out a genome-wide RNAi screen to identify suppressors of larval arrest in the mutant *Y* background. A preliminary screen might identify ∼1,000 such clones ranging from very strong to very marginal suppressors. With retesting of these 1,000 clones, most of the false positives from the first round will fail to suppress in the second round and will be thrown out. A third round of retesting will then likely eliminate all but a few false positives, leaving mostly valid ones on the list. This effect can be quantified by imagining that we carry out an exact binomial test on each of ∼20,000 clones in the RNAi library, together with an appropriate negative control, and choose an α level (i.e., the statistical cutoff) of 0.05. By chance alone, 5% or 1,000 out of 20,000 would fall below the *P*-value threshold. In addition, let's imagine that 100 real positives would also be identified, giving us 1,100 positives in total. Admittedly, at this point, the large majority of identified clones would be characterized as false positives. In the second round of tests, however, the large majority of true positives would again be expected to exhibit statistically significant suppression, whereas only 50 of the 1,000 false positives will do so. Following the third round of testing, all but two or three of the false positives will have been eliminated. These, together with the ∼100 true positives, most of which will have passed all three tests, will leave a list of genes that is strongly enriched for true positives. Thus, by carrying out several experimental repeats, additional correction methods are not needed.
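The arithmetic behind this argument can be sketched as follows. The sketch assumes, as the example does, that each round of retesting is independent and that a clone with no real effect passes any one round with probability α. Like the text, it treats all 20,000 clones as nulls for simplicity. The function name is hypothetical.

```python
def expected_false_positives(null_clones, alpha, rounds):
    """Expected number of null clones still passing after each round of testing."""
    remaining = float(null_clones)
    survivors = []
    for _ in range(rounds):
        remaining *= alpha  # a fraction alpha of nulls passes each round by chance
        survivors.append(remaining)
    return survivors

survivors = expected_false_positives(null_clones=20_000, alpha=0.05, rounds=3)
for round_number, count in enumerate(survivors, start=1):
    print(f"round {round_number}: about {count:.1f} false positives expected")
```

The expected counts fall by a factor of 20 per round, which is why two or three repeats are enough to leave a list dominated by true positives.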

### 4.11. Probability calculations for binomial proportions

At times one may be interested in calculating the probability of obtaining a particular proportion or one that is more extreme, given an expected frequency. For example, what are the chances of tossing a coin 100 times and getting heads 55 times? This can be calculated using a small variation on the formulae already presented above.

$$\Pr(Y) = \frac{n!}{Y!\,(n-Y)!}\; p^{Y} (1-p)^{\,n-Y}$$

Here *n* is the number of trials, *Y* is the number of positive outcomes or successes, and *p* is the probability of a success occurring in each trial. Thus we can determine that the chance of getting exactly 55 heads is quite small, only 4.85%. Nevertheless, given an expected proportion of 0.5, we intuitively understand that 55 out of 100 heads is not an unusual result. In fact, we are probably most interested in knowing the probability of getting a result at least as or more extreme than 55 (whether that be 55 heads or 55 tails). Thus our probability calculations must also include the results where we get 56, 57, 58… heads as well as 45, 44, 43… heads. Adding up these probabilities then tells us that we have a 36.8% chance of obtaining at least 55 heads or tails in 100 tosses, which is certainly not unusual. Rather than having to calculate each probability and adding them up, however, a number of websites will do this for you. Nevertheless, be alert and understand the principles behind what you are doing, as some websites may only give you the probability of ≥55%, whereas what you really may need is the summed probability of both ≥55% and ≤45%.
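For readers who prefer to check the coin-toss numbers directly rather than rely on a website, here is a short sketch using only the Python standard library. The helper name `binom_pmf` is illustrative. It evaluates the standard binomial probability for exactly *Y* successes and then sums both tails.

```python
from math import comb

def binom_pmf(y, n, p):
    """Probability of exactly y successes in n independent trials."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

n, p = 100, 0.5
exactly_55 = binom_pmf(55, n, p)
# Two-sided: 55 or more heads, or 55 or more tails (i.e., 45 or fewer heads)
two_sided = sum(binom_pmf(y, n, p) for y in range(n + 1) if y >= 55 or y <= 45)

print(f"exactly 55 heads: {exactly_55:.2%}")
print(f"at least 55 heads or tails: {two_sided:.1%}")
```

Dropping the `y <= 45` condition gives the one-sided probability, which is the value some calculators report by default.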

### 4.12. Probability calculations when sample sizes are large relative to the population size

One of the assumptions for using the binomial distribution is that our population size must be very large relative to our sample size 43 . Thus, the act of sampling itself should not appreciably alter the course of future outcomes (i.e., the probability is fixed and does not change each trial). For example, if we had a (very large) jar with a million marbles, half of them black and half of them white, removing one black marble would not grossly alter the probability of the next marble being black or white. We can therefore treat these types of situations as though the populations were infinite or as though we were *sampling with replacement*. In contrast, with only ten marbles (five white and five black), picking a black marble first would reduce the probability of the next marble being black from 50% (5/10) to 44.4% (4/9), while increasing the probability of the next marble being white to 55.6% (5/9) 44 . For situations like this, in which the act of sampling noticeably affects the remaining population, the binomial is shelved in favor of something called the *hyper-geometric distribution*. For example, in the case of the ten marbles, the probability of picking out five marbles, all of which are black, is 0.0040 using the hypergeometric distribution. In contrast, the binomial applied to this situation gives an erroneously high value of 0.031.
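The ten-marble comparison can be reproduced with a minimal sketch. The function name and argument names below are illustrative. The hypergeometric probability counts the ways of drawing without replacement, whereas the binomial value treats each draw as an independent 50% chance.

```python
from math import comb

def hypergeom_pmf(k, population, tagged, draws):
    """P(exactly k tagged items) when drawing without replacement."""
    return (comb(tagged, k) * comb(population - tagged, draws - k)
            / comb(population, draws))

# Ten marbles, five of them black; draw five and ask that all five be black
hyper = hypergeom_pmf(k=5, population=10, tagged=5, draws=5)
binom = 0.5 ** 5  # binomial treatment: probability fixed at 0.5 for every draw

print(f"hypergeometric: {hyper:.4f}")
print(f"binomial:       {binom:.3f}")
```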

For our field, we often see hyper-geometric calculations applied to computational or genomics types of analyses. For example, imagine that we have carried out an RNA-seq experiment and have identified 1,000 genes that are mis-expressed in a particular mutant background. A gene ontology (GO) search reveals that 13 of these genes encode proteins with a RING domain. Given that there are 152 annotated RING domain–containing proteins in *C. elegans*, what is the probability that at least 13 would arise by chance in a dataset of this size? The rationale for applying a hyper-geometric distribution to this problem would be as follows. If one were to randomly pick one of ∼20,000 worm proteins out of a hat, the probability of it containing a RING domain would be 152/20,000 (0.00760). This leaves 151 remaining RING proteins that could be picked in future turns. The chance that the next protein would contain a RING domain is then 151/19,999 (0.00755). By the time we have come to the thirteenth RING protein in our sample of 1,000 differentially expressed genes, our chances might be something like 140/19,001 (0.00737). Plugging the required numbers into both binomial and hyper-geometric calculators 45 (available on the web), we get probabilities of 0.0458 and 0.0415, respectively. Admittedly, in this case the hyper-geometric method gives us only a slightly smaller probability than the binomial. Here the difference isn't dramatic because the population size (20,000) is not particularly small.
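In place of a web calculator, the RING domain comparison can be sketched with the standard library alone. The function names are illustrative, and the population of exactly 20,000 proteins is the rounded figure used in the text. Each tail probability is computed as one minus the probability of seeing fewer hits than the threshold.

```python
from math import comb

def binomial_tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    below = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min))
    return 1.0 - below

def hypergeometric_tail(k_min, population, tagged, draws):
    """P(X >= k_min) when drawing without replacement."""
    # Sum exact integer counts first and divide once, to avoid float overflow
    below = sum(comb(tagged, k) * comb(population - tagged, draws - k)
                for k in range(k_min))
    return 1.0 - below / comb(population, draws)

population, ring, sample = 20_000, 152, 1_000
binom_13 = binomial_tail(13, sample, ring / population)
hyper_13 = hypergeometric_tail(13, population, ring, sample)
binom_12 = binomial_tail(12, sample, ring / population)
hyper_12 = hypergeometric_tail(12, population, ring, sample)

print(f"at least 13 hits: binomial {binom_13:.4f}, hypergeometric {hyper_13:.4f}")
print(f"at least 12 hits: binomial {binom_12:.4f}, hypergeometric {hyper_12:.4f}")
```

The results should land close to the calculator values quoted in the text, with the hypergeometric probability slightly below the binomial one.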

We would also underscore the importance of being conservative in our interpretations of GO enrichment studies. In the above example with RING finger proteins, had the representation in our dataset of 1,000 genes been 12 instead of 13, the probability would have been greater than 0.05 (0.0791 and 0.0844 by hyper-geometric and binomial methods, respectively). Furthermore, we need to consider that several thousand distinct GO terms are currently used to classify *C. elegans* genes. Thus, the random chance of observing over-representation within any one particular GO class will be much less than that of observing over-representation within some (but no particular) GO class. Going back to marbles, if we had an urn with 100 marbles of ten different colors (10 marbles each), the chance that a random handful of 5 marbles would contain at least 3 of one particular color is only 0.00664. However, the chance of getting at least three out of five the same color is 0.0664. Thus, for GO over-representation to be meaningful, we should look for very low *P*-values (e.g., on the order of 10⁻⁵ or lower).
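The urn example can be verified with a brief sketch (function name illustrative). Because a handful of five marbles cannot contain three or more of two different colors at once, the ten per-color events are mutually exclusive, and the "any color" probability is simply ten times the single-color value.

```python
from math import comb

def at_least(k_min, population, tagged, draws):
    """P(X >= k_min) for a hypergeometric draw without replacement."""
    total = comb(population, draws)
    hits = sum(comb(tagged, k) * comb(population - tagged, draws - k)
               for k in range(k_min, min(tagged, draws) + 1))
    return hits / total

# 100 marbles, ten colors with 10 marbles each, handful of 5
one_color = at_least(3, population=100, tagged=10, draws=5)
any_color = 10 * one_color  # events for different colors cannot co-occur

print(f"at least 3 of one particular color: {one_color:.5f}")
print(f"at least 3 of any single color:     {any_color:.4f}")
```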

### 4.13. Tests for differences between multinomial proportions

*Multinomial* proportions or distributions refer to data sets where outcomes are divided into three or more discrete categories. A common textbook example involves the analysis of genetic crosses where either genotypic or phenotypic results are compared to what would be expected based on Mendel's laws. The standard prescribed statistical procedure in these situations is the *Chi-square* *goodness-of-fit* test, an approximation method that is analogous to the normal approximation test for binomials. The basic requirements for multinomial tests are similar to those described for binomial tests. Namely, the data must be acquired through random sampling and the outcome of any given trial must be independent of the outcome of other trials. In addition, a minimum of five outcomes is required for each category for the Chi-square goodness-of-fit test to be valid. To run the Chi-square goodness-of-fit test, one can use standard software programs or websites. These will require that you enter the number of expected or control outcomes for each category along with the number of experimental outcomes in each category. This procedure tests the null hypothesis that the experimental data were derived from the same population as the control or theoretical population and that any differences in the proportion of data within individual categories are due to chance sampling.
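As an illustration of what such software does internally, the sketch below computes the goodness-of-fit statistic for a dihybrid cross against a 9:3:3:1 expectation. The counts are illustrative, in the style of a classic pea cross, and are not taken from the text. The critical value of 7.815 is the standard 5% cutoff for three degrees of freedom. Obtaining an actual *P*-value would require a Chi-square distribution function from a statistics package.

```python
def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for observed counts vs. expected ratios."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected

observed_counts = [315, 108, 101, 32]   # illustrative counts for four phenotypes
statistic, expected = chi_square_gof(observed_counts, expected_ratio=[9, 3, 3, 1])

critical_value = 7.815  # Chi-square cutoff for alpha = 0.05 with 3 degrees of freedom
print(f"expected counts: {[round(e, 2) for e in expected]}")
print(f"chi-square statistic: {statistic:.3f}")
print("reject null" if statistic > critical_value else "consistent with 9:3:3:1")
```

Every expected count here is well above five, so the minimum-count requirement for the approximation is met.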

As is the case with binomials, exact tests can also be carried out for multinomial proportions. Such tests tend to be more accurate than approximation methods, particularly if the requirement of at least five outcomes in each category cannot be met. Because of the more-complicated calculations involved, however, exact multinomial tests are less commonly used than the exact binomial test, and web versions are currently difficult to come by. The good news is that, like the binomial tests, the approximate and exact methods for multinomials will largely yield identical results. Also make sure that you don't confuse the Chi-square goodness-of-fit test with the *Chi-square test of independence*, which also has an exact version termed the *Fisher's exact* test.

Although many of us will probably not require the Chi-square goodness-of-fit test to sort out if our proportion of yellow wrinkled peas is what we might have expected, understand that this test can be used for any kind of sample data where more than two categories are represented. Imagine that we are studying a gene in which mutations lead to pleiotropy, meaning that a spectrum of distinct phenotypes is observed. Here, the proportion of animals displaying each phenotype could be compared across different alleles of the same gene to determine if specific mutations affect certain developmental processes more than others. In other instances, numerical data, such as the number of mRNA molecules in an embryo, may also benefit from imposing some broader categorization. For example, embryos might be divided into four bins by transcript count: 0–10, 11–50, 51 up to some higher cutoff, and all counts above that cutoff. These outcomes could then be compared across different mutant backgrounds or at different developmental time points to identify broad categorical differences in expression. In all of the above cases, a Chi-square goodness-of-fit test would be appropriate to determine if any differences in the observed proportions are statistically significant.

## More Western Blotting

### Western Blotting University

**Courses designed to make you a western blotting expert.**

**Monday to Friday, April 19 to April 23, 2021, at 9:00 AM PDT (5:00 PM BST)**

Join us for an in-depth experience learning about western blotting principles and techniques. This five-part webinar series is for anyone who has ever asked how to perform a western blot and would like to have a detailed understanding of western blotting principles at each step of the workflow.

## Webinars

When using our interactive certificate of analysis search tool, follow these recommendations:

- Make sure you have entered the correct catalog number and lot number (or control number) in the search fields.
- If you are using a kit, try searching for the certificate of analysis using the kit information as well as the information for the individual components.
- The certificate of analysis you need may not be available on the website. In that case, contact your Bio-Rad representative or use the request form.

If you cannot find the certificate of analysis you need, use the request form at the link provided.

#### Where can I find the catalog number, item number, or product number?

The catalog number, item number, or product number is printed on the product label. The location of this information is shown in the sample below.

#### Where can I find the lot number or control number?

The lot number or control number (only one of the two) is printed on the product label. The location of this information is shown in the sample below.

#### Why are certificates of analysis not on the Documents tab?

Certificates of analysis are tied not only to a product but also to specific lots of that product. Several certificates of analysis may be available for each product, especially if the product line has been in production for a long time and several lots have been released over the years. With the certificate of analysis search tool, you can enter the catalog number and lot number (or control number) for the specific product you have in hand and download the certificate of analysis you need.

#### I have a serial number rather than a lot number or control number. How do I find the certificate of analysis for my product?

The serial number can be used in place of the lot number or control number. Use the certificate of analysis search tool: enter the catalog number as usual, and enter the serial number in place of the lot number or control number.

#### Can I get a certificate of analysis if my product has expired?

Yes. Although we periodically remove certificates of analysis during site maintenance, we try to keep them available for an extended period after the product's expiration date.

#### Why is the certificate of analysis in the search results marked "Product name not found"?

Certificates of analysis exist for products that have been discontinued or that are not available on the website. In such cases, the certificate of analysis is available for download, but other product details, such as the product name, are not.

#### Why is the certificate of analysis in the search results marked "Unavailable"?

This means that you entered the correct catalog number and lot number (or control number) and we found the certificate of analysis. However, for some reason the file itself is unavailable.

Use the request form or contact your Bio-Rad representative so that we can send you the certificate of analysis.

## Enzymes of Epigenetics, Part A

M.Y. Liu, … R.M. Kohli, in Methods in Enzymology, 2016

### 2.2 Qualitative Analysis by Dot Blotting

Dot blots are commonly used to probe for modified bases in gDNA. DNA is denatured to expose the bases, spotted onto an absorbent membrane, and probed with antibodies against each of the four cytosine modifications. Dot blots offer a clear visual result and can be performed using either serial dilutions or single concentrations of DNA. We consider the former to be semiquantitative, while the latter is only qualitative but still particularly useful for screening a large number of samples. Dot blotting also works for plasmids but is generally not well suited for short oligonucleotides, likely because these do not adhere consistently to membranes.

The first step is to determine the appropriate amount of DNA for blotting, considering the amount of expected modifications. For gDNA from HEK293T cells overexpressing TET, load 400 ng of gDNA into each well of a Bio-Dot microfiltration apparatus (Bio-Rad). Calculate the total amount of DNA needed (based on number of blots and number of serial dilutions) and dilute to 10 ng/μL in TE buffer (10 m*M* Tris–Cl, pH 8.0, 1 m*M* EDTA). Add 1/4 volume of 2 *M* NaOH/50 m*M* EDTA. Denature the DNA at 95°C for 10 min, transfer quickly to ice, and add 1 volume of ice-cold 2 *M* ammonium acetate to stabilize single strands. Serial dilutions may be performed at this point into TE buffer. Meanwhile, prepare membranes for blotting; we have found that Sequi-Blot PVDF membranes (Bio-Rad) give cleaner results than nitrocellulose. Wet membranes in methanol and equilibrate in TE buffer; then assemble the dot blotting apparatus, taping off any unused wells. Wash each well with 400 μL TE and draw through with gentle vacuum. Purge any air bubbles in the wells, as these can interfere with washing and spotting DNA, and release the vacuum gently to avoid regurgitation that can cross-contaminate wells. Apply 100 μL of DNA samples at the desired dilutions and wash with another 400 μL of TE. Carefully place the membranes into 50 mL conical tubes for blotting. Note that replicate membranes are needed for each separate mC, hmC, fC, and caC blot.
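The volumes implied by this protocol can be worked through with a small sketch. It assumes that "1/4 volume" refers to the volume of diluted DNA and that "1 volume" of ammonium acetate means a volume equal to the DNA/NaOH mixture. Both readings are interpretations rather than statements from the protocol, and the function name is hypothetical.

```python
def dot_blot_volumes(dna_ng, stock_ng_per_ul=10.0):
    """Per-well volumes (in microliters) for the denaturation steps."""
    dna_ul = dna_ng / stock_ng_per_ul      # DNA diluted to 10 ng/uL in TE
    naoh_ul = dna_ul / 4                   # 1/4 volume of 2 M NaOH / 50 mM EDTA
    denatured_ul = dna_ul + naoh_ul
    acetate_ul = denatured_ul              # assumed: 1 volume = equal to current mix
    return {
        "dna_ul": dna_ul,
        "naoh_edta_ul": naoh_ul,
        "ammonium_acetate_ul": acetate_ul,
        "total_ul": denatured_ul + acetate_ul,
    }

volumes = dot_blot_volumes(dna_ng=400)
for name, value in volumes.items():
    print(f"{name}: {value:.0f}")
```

Under these assumptions, 400 ng of gDNA ends up in a final volume of 100 μL, which matches the 100 μL applied to each well.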

The blotting procedure begins with blocking for 2 h in TBST buffer (50 m*M* Tris–Cl, pH 7.6, 150 m*M* NaCl, 0.5% Tween 20) with 5% (w/v) milk at room temperature. Then, wash three times with TBST and incubate at 4°C overnight with primary antibodies against each modified cytosine (Active Motif offers mouse monoclonal mC and rabbit polyclonal hmC, fC, and caC antibodies). We use the following antibody dilutions in 5% milk/TBST: 1:5000 mC, 1:10,000 hmC, 1:5000 fC, and 1:10,000 caC. Volumes should be enough to cover the membrane evenly, and solutions should be poured off cleanly between steps. Wash the blots three times with TBST for 5 min each and incubate with secondary 1:2000 goat anti-mouse IgG-HRP or 1:5000 goat anti-rabbit IgG-HRP (Santa Cruz Biotechnology) at room temperature for 2 h. Wash three times again and, just before imaging, apply Immobilon Western Chemiluminescent HRP Substrate (Millipore) evenly over the entire blot. Expose on an imager with chemiluminescent detection capabilities (we use a Fujifilm LAS-1000), taking care to smooth the blot over the imaging surface and remove air bubbles and excess HRP substrate. As positive and negative controls for this optimized protocol, we typically use gDNA from cells transfected with WT hTET2-CD or empty vector, respectively. Fig. 2 shows an example of dot blotting results for select TET constructs.

Fig. 2. Representative dot blots from analysis of gDNA from transfected HEK293T cells. Shown are (A) serial dilutions of gDNA from cells transfected with either empty expression vectors or hTET2-CD and (B) dot blots on 400 ng of gDNA from HEK293T cells transfected with empty expression vector, hTET1-CD, hTET2-CD, or mTet2-CD.

## Need help interpreting Western blot data - Biology

**Writing the Results Section**

This is the section in which you will want to present your findings to the reader in the most clear, consistent, orderly, and succinct fashion. As previously mentioned, we suggest that you write this section either first or second to the Materials and Methods section. Another possibility is that you could write them simultaneously, describing each experiment and the corresponding data. Whatever you find easiest is fine.

The results you collect will most likely contain a story that you want to tell to the reader in an interesting manner. Presenting these data in a clear and thorough fashion, however, is quite a responsibility, because you have many decisions to make as to how you want to tackle the ominous task. It must be done well, because without the results being understood, the credibility of the entire paper disintegrates before the reader's eyes. The task is a manageable one, provided that you sit down and think logically about what needs to be made unequivocally clear. By this we mean that common sense goes a long way. Include only what is necessary, and don't include extraneous information. If there is a datum that is important to the ultimate conclusion but is difficult to present, you must find a way to do it. Do not think that you can sweep some pertinent data under the rug and expect to get away with it. If something important is missing, the omission will stare the reader viciously in the face and he or she will be lost. Be sensible, include what you feel needs to be included, and do it in a clear and understandable way, for the results are the primary ingredients upon which your entire paper is based.

**Methods for Presenting Data**

The ways of presenting data vary depending upon what you want to present to the reader. The Results section should include all of the experimental data collected throughout the experiment that was necessary in reaching the ultimate conclusions drawn. This includes tables, graphs, Western blots, SDS-PAGE results, etc. Each set of data requires a logically selected label (e.g. Figure 1 or Table 1) and a descriptive title referring to the nature of the experiment. A brief paragraph of explanation should be included for each table or figure as well so that the reader knows exactly what he or she is looking at. Graphs and tables require some discretion in terms of what needs to be included and what doesn't.

You have to decide for yourself what information is essential for the reader's understanding of the paper, but do it carefully. Not enough information can confuse and lose the reader, but too much information can become monotonous for the reader. As a general rule, raw data does not need to be included; it should be formed into some sort of graph, whether that be a line graph, a bar graph, a pie graph, or whatever you feel is necessary to point out the important trends that help tell your story. You decide what the data calls for. Then, the proper labels must be assigned to each axis if you choose to use a bar or a line graph. Also with graphs, the standard deviation for each datum will sometimes be required by your professor. Without the level of error provided, the reader has no idea how consistent your findings are. However, in a laboratory class, often you will not obtain the data to calculate the standard deviation. It will depend on your professor and the experiment being performed. Also (just like every other table, picture, or graph), an explanatory paragraph must be included to guide the reader along.

**General guidelines for writing the results section**

**1. Do not be ambiguous**. Do not make the reader guess at what information you are trying to present.

**2. Organize the data in a logical fashion**. The reader must be able to follow the flow of the data; otherwise, the paper will mean nothing and will most likely frustrate the reader.

**3A. Do not describe methods used to obtain the data**. This belongs in the Materials and Methods section.

**3B. Do not attempt to interpret the data**. This belongs in the Discussion section.

**4. Point out certain trends or patterns that the data follow**. Data is organized in a manner that will point out trends that you want to make clear to the reader in order to help tell your story. You must call the reader's attention to these trends or they may be missed.

The following data includes two tables and two figures to demonstrate the points explained above. Each table or figure has a description of what is appropriate or what needs improvement.

**Protein Values**

| Experiment | Absorbance | Protein |
| --- | --- | --- |
| Media | 0.57 | 2.04 |
| Media/LPS | 0.60 | 2.16 |
| Vehicle | 0.61 | 2.20 |
| Vehicle/LPS | 0.66 | 2.36 |
| Drug | 0.69 | 2.50 |
| Drug/LPS | 0.61 | 2.22 |

*Review of Sample 1:* There are many problems with the presentation of this table, forcing the reader to guess about some of the data. First, it is not labeled as either a table or a figure. It is simply given a title (Protein Values) that doesn't even describe anything. Protein values of what and under what circumstances? The reader has no idea what he or she is looking at. Also, the column labels don't have the units of measurement included. The absorbance values mean nothing if the reader doesn't know at what level they were taken, and what does protein mean in the third column? Is that concentration, and if so, what are the units? All of these things need to be included to make clear to the reader what the data is.

Table 1. Absorbance Readings and Corresponding Protein Concentration Values

| Experimental group | Absorbance (595 nm) | Protein concentration (µg/µL) |
| --- | --- | --- |
| Media | 0.57 | 2.04 |
| Media/LPS | 0.60 | 2.16 |
| Vehicle | 0.61 | 2.20 |
| Vehicle/LPS | 0.66 | 2.36 |
| Drug | 0.69 | 2.50 |
| Drug/LPS | 0.61 | 2.22 |

This table demonstrates the protein concentration of each sample. The concentration of protein found in each sample is similar.

*Review of Sample 2:* This table is properly labeled Table 1, because it is the first table that appears in the paper, and it also has a descriptive title. All of the columns are clearly labeled with the unit of measurement for each one. Also note that there is a brief sentence describing what the numbers are and where they came from.
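The elements praised in Sample 2 (a numbered label, a descriptive title, and units in every column header) can be enforced when a table is generated from data. The following is a minimal sketch using the values from Table 1. The function name `render_table` and the pipe-separated layout are assumptions made for illustration, and the unit is written as µg/µL on the assumption that this is what the original abbreviation means.

```python
# Values copied from Table 1; the unit spelling (µg/µL) is an assumption.
headers = (
    "Experimental group",
    "Absorbance (595 nm)",
    "Protein concentration (µg/µL)",
)
rows = [
    ("Media", 0.57, 2.04),
    ("Media/LPS", 0.60, 2.16),
    ("Vehicle", 0.61, 2.20),
    ("Vehicle/LPS", 0.66, 2.36),
    ("Drug", 0.69, 2.50),
    ("Drug/LPS", 0.61, 2.22),
]

def render_table(number, title, headers, rows):
    """Build a labeled, titled, pipe-separated table as one string."""
    lines = [f"Table {number}. {title}"]
    lines.append(" | ".join(headers))
    lines.append(" | ".join("---" for _ in headers))
    for group, absorbance, protein in rows:
        # Two decimal places throughout keeps the columns consistent.
        lines.append(f"{group} | {absorbance:.2f} | {protein:.2f}")
    return "\n".join(lines)

table_text = render_table(
    1,
    "Absorbance Readings and Corresponding Protein Concentration Values",
    headers,
    rows,
)
print(table_text)
```

Keeping the units inside the header strings means they travel with the data, so the table cannot be printed without them.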

Figure One shows the absorbance values compared to the times of each tube in the experiment.

*Review of Sample 3:* There are two problems with the graph itself: neither axis is labeled with the proper unit of measurement, and none of the lines is marked to show which test tube it represents. As with the table in the previous example, the reader needs to know the wavelength at which the absorbance readings were taken. Also, the reader has no idea what the lines mean, because he or she cannot tell which line goes with which tube. Another problem with this figure is that the explanatory sentence is quite scanty. The author doesn't guide the reader along as to what results are being presented. Trends should be pointed out.

Figure One shows the absorbance values (read at 540 nm) of each of the three experimental tubes compared to the time in seconds. This is an indication of the rate at which catechol is being turned into benzoquinone in each tube. In the group in which the acidity was increased (tube 2), we see a steady increase in absorbance, then a slight dropoff, and then a return to the initial rate of increase. In the control group (tube 3), we see a gradual increase in absorbance values over time, and then it seems to level off. In the group in which the amount of the enzyme was increased (tube 4), we see a very slim, but noticeable, increase in absorbance values over the first three seconds.

*Review of Sample 4:* In this version of Figure One, the proper labels are on both the axes and the three curves. Also, the explanatory paragraph is much more descriptive and informative: it tells the reader what is occurring in each of the three tubes and points out specific trends in each of the three curves.

All citations from Pechenik, Jan A. *A short guide to writing about Biology*. pp. 54-102, Tufts University: Harper Collins College Publishers. 1993.