EMPIRICAL EXPRESSION ALGORITHM OUTPUT DEFINITIONS

Versions of Microarray Suite earlier than MAS5 (i.e. MAS4 and all
versions of the GeneChip Analysis Suite software) run the empirical
expression algorithm. The following descriptions of column headers were
obtained from Affymetrix (http://www.affymetrix.com).

SINGLE-ARRAY RESULTS

Positive & Negative:
--------------------
The number of probe pairs scored positive or negative. A probe pair is
called positive if the intensity of the PM probe cell is significantly
greater than that of the corresponding MM probe cell. A probe pair is
called negative if the intensity of the MM probe cell is significantly
greater than that of the corresponding PM probe cell. To evaluate the
intensity, the algorithm calculates the ratio and difference associated
with each probe pair and compares these values to the Statistical
Difference Threshold (SDT) and the Statistical Ratio Threshold (SRT).

A probe pair is Positive if:

    PM - MM > SDT and PM/MM > SRT

A probe pair is Negative if:

    MM - PM > SDT and MM/PM > SRT

The SDT is a function of the noise (Q) and is calculated by the
software:

    SDT = Q * SDTmult

The SDTmult and the SRT are user-modifiable parameters. The SDTmult is
set at 2.0 for the standard staining protocol or 4.0 for the antibody
amplification protocol (refer to the Expression Analysis Technical
Manual or the HuSNP(r) Mapping Assay User Manual). The default SRT value
is 1.5. Increasing the SDTmult and SRT increases analysis stringency;
reducing them decreases analysis stringency.

The numbers of positive and negative probe pairs are determined for
every probe set and are used to derive parameters that describe probe
set performance.

Pairs:
------
The number of probe pairs for a particular probe set on an array.

Pairs Used:
-----------
The number of probe pairs per probe set used in the analysis.
This may be the total number of probe pairs per probe set on the probe
array or the number of probe pairs in a pre-designated subset (for
example, probe pairs specified by a probe mask file and/or a masked
image).

    Pairs Used = total probe pairs per probe set
                 - (probe pairs masked in a mask file)
                 - (probe pairs masked in the image)

Pairs in Avg:
-------------
The number of probe pairs in the trimmed probe set, which excludes probe
pairs with extremely intense or weak signal from the analysis. If 8 or
fewer probe pairs are used, Pairs in Avg = Pairs Used (that is, the
number of probe pairs per probe set minus any that are masked).
Superscoring is performed if more than 8 probe pairs are used.

Superscoring is a process that excludes probe pairs from the calculation
of the Avg Diff and Log Avg Ratio if they are outside a given intensity
range. Microarray Suite calculates the mean and standard deviation of
the intensity differences (PM - MM) for an entire probe set (excluding
the highest and lowest values). Values within a set number of standard
deviations (STP) of the mean are included in the calculation of the Avg
Diff or Log Avg Ratio. The STP is a user-modifiable parameter with a
default value of 3.

Pos Fraction:
-------------
# positive probe pairs / # probe pairs used

Log Avg:
--------
Describes the hybridization performance of a probe set. It is determined
by calculating the ratio of the PM/MM intensities for each probe pair in
a probe set, taking the logs of the resulting values, and averaging them
for the probe set:

    Log Avg = 10 x {[Sum log (PM/MM)] / Pairs in Avg}

Log Avg = 0 indicates random cross hybridization. The higher the Log
Avg, the greater the confidence that the transcript is present.

Pos/Neg:
--------
The ratio of Positive probe pairs to Negative probe pairs in a probe set
(# Positive probe pairs / # Negative probe pairs).

Avg Diff:
---------
    Avg Diff = Sum (PM - MM) / Pairs in Avg

This parameter serves as a relative indicator of the level of expression
of a transcript.
It is used to determine the change in the hybridization intensity of a
given probe set between two different experiments. The Avg Diff is
calculated by taking the difference between the PM and MM of every probe
pair in a probe set (excluding the probe pairs where PM - MM is more
than STP standard deviations from the mean of PM - MM) and averaging the
differences for the entire probe set. The Avg Diff cannot be used to
compare the hybridization intensity levels of two different probe sets
on the same array.

Absolute Call:
--------------
Each transcript in a single-array analysis has three possible Absolute
Call outcomes: Present (P), Absent (A), or Marginal (M). The Absolute
Call is derived from the Pos/Neg, Pos Fraction, and Log Avg metrics.
Each of these metrics is weighted and entered into a decision matrix to
determine the status of the transcript.

COMPARISON ANALYSIS RESULTS

Increase:
---------
A probe pair is considered to increase if the intensity difference
between the PM and MM probe cells in the experimental sample is
significantly higher than in the baseline sample. Two criteria must be
met for a probe pair to show a significant increase:

    (PM - MM)exp - (PM - MM)base > Change Threshold (CT)

and

    [(PM - MM)exp - (PM - MM)base]
      / max[Q/2, min(|PM - MM|exp, |PM - MM|base)]
      > Percent Change Threshold / 100

Decrease:
---------
A probe pair is considered to decrease if the intensity difference
between the PM and MM probe cells in the experimental sample is
significantly lower than in the baseline sample. Two criteria must be
met for a probe pair to show a significant decrease:

    (PM - MM)base - (PM - MM)exp > Change Threshold (CT)

and

    [(PM - MM)base - (PM - MM)exp]
      / max[Q/2, min(|PM - MM|exp, |PM - MM|base)]
      > Percent Change Threshold / 100

The software calculates the Change Threshold (CT) using the SDT
(Statistical Difference Threshold) of both the experimental and baseline
data.
Alternatively, the user may define the CT by entering a value for the CT
Multiplier (in the Parameters tab of the Expression Analysis Settings
dialog box), which is multiplied by the noise (Q) of the baseline or
experimental data, whichever is greater. The Percent Change Threshold is
a user-specified value (also set in the Parameters tab of the Expression
Analysis Settings dialog box).

Inc Ratio:
----------
For each transcript: # Increased probe pairs / # probe Pairs Used

Dec Ratio:
----------
For each transcript: # Decreased probe pairs / # probe Pairs Used

Pos Change:
-----------
# Positive probe pairs(exp) - # Positive probe pairs(baseline)

Neg Change:
-----------
# Negative probe pairs(exp) - # Negative probe pairs(baseline)

Inc/Dec:
--------
For each transcript: # Increased probe pairs / # Decreased probe pairs

DPos-DNeg Ratio:
----------------
(Pos Change - Neg Change) / # probe Pairs Used

The DPos-DNeg Ratio and Log Avg Ratio Change are usually positive when a
transcript changes from a very low to a relatively high expression level
and are typically negative when the expression level changes from a high
to a very low or undetectable level. Both metrics may have values close
to zero if the transcript is present in both the baseline and
experimental samples, despite an increase or decrease in the level of
the transcript.

Log Avg Ratio Change:
---------------------
    Log Avg Ratio Change = Log Avg(exp) - Log Avg(base)

The difference between the Log Avg Ratio of the experimental and
baseline probe array data (in a comparison analysis) for each
transcript. The Log Avg Ratios are recomputed for each probe set based
on probe pairs used in both the baseline and experimental probe arrays
(the recomputed values are not displayed by the software).
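
The probe-pair criteria and the derived comparison metrics defined above
can be sketched in Python. This is only an illustration under stated
assumptions, not the Microarray Suite implementation: the function names
are hypothetical, the logarithm is assumed to be base 10 (the text does
not state the base), intensities and Q are assumed to be positive, the
same Q is used for both arrays in the Increase/Decrease test, and the
probe pairs passed in are assumed to be those already retained after
masking and superscoring. The weighting and decision matrix behind the
Absolute Call and Difference Call are not described here, so they are
not modelled.

```python
import math


def pair_call(pm, mm, sdt, srt=1.5):
    """Score one probe pair as 'positive', 'negative', or 'neutral'."""
    # pm > srt * mm is PM/MM > SRT rewritten to avoid dividing by zero.
    if pm - mm > sdt and pm > srt * mm:
        return "positive"
    if mm - pm > sdt and mm > srt * pm:
        return "negative"
    return "neutral"


def change_call(diff_exp, diff_base, ct, pct_threshold, q):
    """Score one probe pair as 'increase', 'decrease', or 'none'.

    diff_exp and diff_base are the PM - MM differences of the same probe
    pair in the experimental and baseline arrays.
    """
    floor = max(q / 2, min(abs(diff_exp), abs(diff_base)))
    delta = diff_exp - diff_base
    if delta > ct and delta / floor > pct_threshold / 100:
        return "increase"
    if -delta > ct and -delta / floor > pct_threshold / 100:
        return "decrease"
    return "none"


def log_avg(pairs):
    """Log Avg = 10 x mean of log(PM/MM); base 10 is an assumption."""
    return 10 * sum(math.log10(pm / mm) for pm, mm in pairs) / len(pairs)


def comparison_metrics(exp_pairs, base_pairs, sdt_exp, sdt_base,
                       ct, pct_threshold, q, srt=1.5):
    """Derive the per-transcript comparison metrics for one probe set.

    exp_pairs and base_pairs are equal-length lists of (PM, MM) tuples
    covering the probe pairs used in both arrays.
    """
    used = len(exp_pairs)
    changes = [change_call(pe - me, pb - mb, ct, pct_threshold, q)
               for (pe, me), (pb, mb) in zip(exp_pairs, base_pairs)]
    n_inc = changes.count("increase")
    n_dec = changes.count("decrease")

    exp_calls = [pair_call(pm, mm, sdt_exp, srt) for pm, mm in exp_pairs]
    base_calls = [pair_call(pm, mm, sdt_base, srt) for pm, mm in base_pairs]
    pos_change = exp_calls.count("positive") - base_calls.count("positive")
    neg_change = exp_calls.count("negative") - base_calls.count("negative")

    return {
        "inc_ratio": n_inc / used,
        "dec_ratio": n_dec / used,
        # The text does not say what happens with zero decreased pairs;
        # returning None is a choice made for this sketch.
        "inc_dec": n_inc / n_dec if n_dec else None,
        "pos_change": pos_change,
        "neg_change": neg_change,
        "dpos_dneg_ratio": (pos_change - neg_change) / used,
        "log_avg_ratio_change": log_avg(exp_pairs) - log_avg(base_pairs),
    }


# Made-up intensities for a three-pair probe set (illustration only).
exp_pairs = [(500, 100), (400, 100), (300, 280)]
base_pairs = [(150, 100), (380, 100), (290, 280)]
metrics = comparison_metrics(exp_pairs, base_pairs, sdt_exp=40,
                             sdt_base=40, ct=40, pct_threshold=80, q=20)
print(metrics)
```

In this made-up example only the first probe pair meets both Increase
criteria, so the Inc Ratio is 1/3 and the Dec Ratio is 0.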

Difference Call:
----------------
Each transcript in a comparison analysis has five possible Difference
Call outcomes: (1) Increase (I), (2) Marginally Increase (MI),
(3) Decrease (D), (4) Marginally Decrease (MD), and (5) No Change (NC).
The Difference Call is derived from the comparison metrics:
Max[Increase/Total, Decrease/Total], Increase/Decrease Ratio, Log Avg
Ratio Change, and DPos-DNeg Ratio. Each comparison metric is weighted
and entered into a decision matrix to determine the status of the
transcript.

Avg Diff Change:
----------------
The Avg Diff values are recomputed for each probe set based on probe
pairs used in both the baseline and experimental probe arrays (the
recomputed values are not displayed by the software).

    Avg Diff Change = Avg Diff(exp) - Avg Diff(base)

B=A:
----
An asterisk (*) in this column indicates the transcript is called absent
(A) in the baseline.

Fold Change:
------------
The Fold Change indicates the relative change in the expression levels
between the experiment and baseline targets. The Fold Change for a
transcript is a positive number when the expression level in the
experiment increases compared to the baseline and is a negative number
when the expression level in the experiment declines. The Fold Change
(FC) is calculated as:

    FC = Avg Diff Change / max[min(Avg Diff(base), Avg Diff(exp)), QM x QC]
         + { +1 if Avg Diff(exp) >= Avg Diff(base)
             -1 if Avg Diff(exp) <  Avg Diff(base) }

The normalized or scaled Avg Diff values are recomputed in both the
experimental and baseline data sets to include only probe pairs used in
both the baseline and experiment arrays.
Then the Avg Diff Change is calculated as:

    Avg Diff Change = Avg Diff(exp) - Avg Diff(base)

The noise terms in the Fold Change formula are:

    QC = max(Q(exp), Q(base))
    QM = 2.1 for a 50 micrometre feature or 2.8 for a 24 micrometre
         feature

If the noise (Q) of the experiment or baseline array is greater than the
Avg Diff of the transcript (the baseline or experimental data), the Fold
Change is calculated over the noise and is an approximation [a tilde
character (~) precedes the approximated Fold Change value in the *.chp
file].

Sort Score:
-----------
The Sort Score is a ranking based on the Fold Change and the Avg Diff
Change. The higher the Fold Change and the Avg Diff Change, the higher
the Sort Score.
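
As a worked illustration of the Fold Change definition, the following
Python sketch computes FC from two Avg Diff values and the two noise
values. The function name and the example numbers are hypothetical. The
"approximate" flag is set when the noise term QM x QC, rather than the
smaller Avg Diff, ends up as the denominator; that is one reading of the
tilde rule above and may not match the software exactly.

```python
def fold_change(avg_diff_exp, avg_diff_base, q_exp, q_base, qm=2.1):
    """Return (fold_change, is_approximate) for one transcript.

    qm defaults to 2.1 (50 micrometre feature); pass 2.8 for a
    24 micrometre feature. The Avg Diff inputs are assumed to be the
    values recomputed over probe pairs used in both arrays.
    """
    qc = max(q_exp, q_base)
    noise_term = qm * qc
    smaller = min(avg_diff_base, avg_diff_exp)
    denominator = max(smaller, noise_term)
    change = avg_diff_exp - avg_diff_base
    sign = 1 if avg_diff_exp >= avg_diff_base else -1
    # Assumption: FC counts as approximate when it is computed over noise.
    is_approximate = noise_term > smaller
    return change / denominator + sign, is_approximate


# Made-up values: Avg Diff rises from 400 to 1200 with Q of 25 and 20.
print(fold_change(1200, 400, q_exp=20, q_base=25))   # (3.0, False)
# The same change in the opposite direction gives a negative Fold Change.
print(fold_change(400, 1200, q_exp=20, q_base=25))   # (-3.0, False)
```

With these inputs the noise term is 2.1 x 25 = 52.5, which is below the
smaller Avg Diff (400), so the result is 800/400 + 1 = 3.0 and is not
flagged as approximate.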