Case 1:15-cr-00565-VEC Document 153-2 Filed 10/17/17 Page 1 of 13

EXHIBIT A

Declaration of Nathaniel Adams

I, Nathaniel Adams, declare pursuant to 28 U.S.C. § 1746, that I have personal knowledge of the following, and if called upon to do so, could and would testify competently to the matters contained herein.

1 Qualifications

I have a Bachelor of Science degree with a major in Computer Science from Wright State University (Dayton, Ohio). I am enrolled in the Graduate School at Wright State University, pursuing a Master of Science degree in Computer Science.

I am employed as a Systems Engineer at Forensic Bioinformatic Services, Inc. in Fairborn, Ohio. My responsibilities include analyzing electronic data generated during the course of forensic DNA testing; reviewing case materials; reviewing laboratory protocols; and performing calculations of statistical weights, including custom simulations for casework and research projects. I actively use, develop, and maintain a number of software programs to assist with these efforts.

I have been involved in several reviews of probabilistic genotyping analyses in criminal cases, including FST. In 2014 I attended a week-long workshop on interpreting forensic DNA mixtures using emerging software solutions. In January 2016, I was retained in a criminal case and inspected the source code of the commercially available probabilistic genotyping program STRmix™. Due to a non-disclosure agreement that I signed, I am not allowed to discuss the findings of my review of STRmix™ outside of that particular case.

2 Overview

I have been asked by Sylvie Levine and Christopher Flood to conduct a code review of the Forensic Statistical Tool (FST). I have previously provided statements in the case US v. Johnson, dated May 27, 2016 and October 27, 2016.

I have received Excel spreadsheets from Clint Hughes that reproduce FST’s likelihood ratio calculations at the D3, D13, and D16 loci for the “Pen B” mixture described in the FST Validation materials [see section 3.4 Results]. These Excel files use allele frequencies from the Caucasian subpopulation.

3 Departure from validation study – Pen B

I have further investigated the undocumented behavior described in section 5.6 of my October 27, 2016 statement. That section describes the removal of a locus from FST’s likelihood ratio (LR) calculation if the allele frequencies of observed alleles at that locus sum to greater than or equal to 0.97 in any of the four subpopulations used by FST (Asian, Black, Caucasian, and Hispanic).

3.1 Background

Section 7 of the FST Validation “Executive Summary” states:

All validation samples were compared against a database of DNA profiles of 1246 noncontributors. This database includes 646 profiles used by OCME for estimation of allele frequencies for calculation of random match probabilities plus 600 profiles obtained from John Butler at NIST (J. Forensic Sci. July 2003: 48(4):908-911).

The referenced publication is entitled, “Allele Frequencies for 15 Autosomal STR Loci on U.S. Caucasian, African American, and Hispanic Populations” and is publicly available 1 in the population studies section of the National Institute of Standards and Technology’s Short Tandem Repeat DNA Internet DataBase (STRBase) 2. The 600 genotypes underlying these allele frequencies are available from NIST in Excel format 3.

Section 5 of the FST Validation “Executive Summary” states:

Tables 3A and 3B contain the LR results obtained with the Forensic Statistical Tool and the manual data comparison results. For each potential true contributor, the lowest LR from the four ethnic groups was recorded.
From the population of 1,246 non contributors, the individual with the highest of these values was also recorded.

In Table 3B, the results of non-contributor comparisons are summarized. The highest value reported for sample “Pen B” was an LR of “1.57E+02” (equivalent to 157) when compared to genotype “JB” [see Figure 1]. In the genotype dataset underlying Butler, et al., the genotype “JB” appears in the Caucasian dataset. The results generated during the course of testing sample “Pen B” appear in Table 20D.2 of the FST validation materials [see Figure 2].

FST Validation, Volume I, part IV, “Database Comparisons,” states:

The FST program can compare a database of “suspect” profiles against an evidence profile. This function can be used to check for potential contamination by lab or crime scene personnel. … For all types of database comparisons, results include two Excel files that have been compressed into a .zip file. One file includes the ID number of each person in the database and the LRs that were obtained for that person as a “suspect” using each of the four population allele frequencies. This file also contains model information, such as deducibility and DNA quantity. The second file contains the evidence profile and the profiles of everyone in the database.

I have not received any Excel or .zip files generated by OCME during its FST validation study.

1 J. M. Butler, R. Schoske, P. M. Vallone, J. W. Redman, M. C. Kline, and others, “Allele frequencies for 15 autosomal STR loci on US Caucasian, African American, and Hispanic populations,” J. Forensic Sci., vol. 48, no. 4, pp. 908–911, 2003. Available at http://www.cstl.nist.gov/biotech/strbase/pub_pres/Butler2003a.pdf
2 http://www.cstl.nist.gov/biotech/strbase/
3 http://www.cstl.nist.gov/biotech/strbase/NISTpopdata/JFS2003IDresults.xls

Figure 1 - Pen B non-contributor results from Table 3B of the FST validation materials

Figure 2 - Pen B results generated during the course of testing

3.2 Prior independent reproductions of Pen B

I have been provided Excel spreadsheets underlying Clint Hughes’ reproduction of the 157 LR for the Pen B sample.

3.3 Evaluation

Using the genotyping results described in [Figure 2] and the version of FST I was provided, I attempted to reproduce the FST Validation evaluation of the three Pen B replicates, using the “JB” reference as a comparison profile. The “Forensic Statistic Comparison Report” generated for this evaluation is included in the Appendix as [Figure A.1].

3.4 Results

In Table 20D.2 of the FST validation materials, the lowest LR reported for the Pen B comparison against reference profile “JB” is 157. Performing this same evaluation with the version of FST provided to me, the lowest LR reported is 70.6, for the Caucasian subpopulation [Figure A.1].

FST is a deterministic system, which means that for any given set of inputs, it will produce the same output each time it is run. The difference in values, given the same input, indicates a change in behavior between the version of FST used to perform the validation study and the version of FST provided to me.

One explanation for this difference in expected vs. observed behaviors of FST is the undocumented behavior described in section 5.6 of my October 27, 2016 statement. An investigation into the calculations performed by the version of FST provided to me indicates that this version removed three loci (D3, D13, D16) from the calculation of the likelihood ratio for the Pen B mixture.
The version of FST provided to me reported the same likelihood ratios when the alleles observed in the three Pen B replicates were entered for all fifteen loci [Figure A.1] and when the alleles at D3, D13, and D16 were removed, but the alleles at the other twelve loci were still entered [Figure A.2]. As indicated by [Figure A.3], evaluation of the D3, D13, and D16 loci in isolation resulted in LRs of 1 for each of the four subpopulations used by FST.

The Excel spreadsheets provided to me by Clint Hughes contain locus likelihood ratios for the comparison of the JB reference profile to the three Pen B replicates, using Caucasian allele frequencies. The reported locus LRs are:

• D3: 0.534387
• D13: 3.129836
• D16: 1.327869

Locus likelihood ratios are multiplied together to calculate an overall (final) likelihood ratio. As described above, the final likelihood ratio reported by the version of FST provided to me was 70.6 for the Caucasian subpopulation, using only 12 of the 15 total locus likelihood ratios for this calculation. The product of the D3, D13, and D16 locus likelihood ratios taken from these Excel spreadsheets is 2.22.

Multiplying the likelihood ratio of 70.6 reported by the version of FST provided to me by the 2.22 product of the Excel locus likelihood ratios results in a 15-locus likelihood ratio of 157.

70.6 × 2.22 ≈ 157

70.6: 12 loci – Reported by the version of FST provided to me
2.22: 3 loci – Reported in the Hughes Excel files
157: 15 loci total – Overall likelihood ratio

Equation 1 – Combination of the 12-locus likelihood ratio reported by the version of FST provided to me and the three locus likelihood ratios for the removed loci (D3, D13, D16) as provided in the Hughes Excel files.

This calculated likelihood ratio of 157 matches the likelihood ratio reported in the FST Validation materials for the Pen B mixture [Figure 1].
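
The arithmetic behind Equation 1 can be checked with a short script. This is only an illustrative sketch: the variable names are hypothetical, the inputs are the values quoted in this section, and it is not taken from the FST source code or from the Hughes spreadsheets.

```python
from math import prod

# Locus likelihood ratios for the three removed loci, as quoted above from
# the Hughes Excel files (Caucasian allele frequencies).
removed_locus_lrs = {"D3": 0.534387, "D13": 3.129836, "D16": 1.327869}

# 12-locus likelihood ratio quoted above for the Caucasian subpopulation.
twelve_locus_lr = 70.6

# Locus likelihood ratios are multiplied together to form an overall ratio.
removed_product = prod(removed_locus_lrs.values())
fifteen_locus_lr = twelve_locus_lr * removed_product

print(f"Product of removed locus LRs: {removed_product:.2f}")
print(f"Combined 15-locus LR: {fifteen_locus_lr:.0f}")
```

Using the unrounded product of the three locus values rather than the rounded 2.22 gives the same result after rounding to a whole number.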
This supports the explanation that the undocumented behavior described in section 5.6 of my October 27, 2016 statement caused the difference in final likelihood ratios between the version of FST used during validation and the version of FST provided to me. This modification was necessarily made after data was generated for Table 20D.2 of the FST Validation materials. As stated in section 5.4.4 (“Version control”) of my October 27, 2016 statement, I am unaware of any version control system in place for FST. Consequently, I cannot be any more specific about when this undocumented behavior was introduced into the FST source code.

3.5 Significance of the removal of locus likelihood ratios

Likelihood ratios greater than 1 indicate greater relative support for the hypothesis represented in the numerator of the ratio. FST refers to the numerator hypothesis as “Hp,” shorthand for “prosecution hypothesis.” Likelihood ratios less than 1 indicate greater relative support for the hypothesis represented in the denominator of the ratio. FST refers to the denominator hypothesis as “Hd,” shorthand for “defense hypothesis.”

In the Excel spreadsheets provided to me by Clint Hughes, the locus likelihood ratios for D13 (3.13) and D16 (1.33) are both greater than 1, indicating support for the prosecution hypothesis at these two loci. The locus likelihood ratio for D3 (0.53) is less than 1, indicating support for the defense hypothesis at this locus.

From this investigation into the “Pen B” mixture and my examination of the FST source code provided to me, it is apparent that the removal of a locus from likelihood ratio calculations occurs before the likelihood ratio calculation is made. That is, FST does not “pick and choose” loci to remove from the likelihood ratio calculation based on the value of their locus likelihood ratios.
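
The removal rule described above can be illustrated with a minimal sketch. It assumes only what this declaration states: a locus is dropped for every subpopulation when the summed frequencies of the observed alleles reach 0.97 in any one subpopulation. The function and variable names are hypothetical and this is not FST's source code; the inputs are the D13 sums reported later in this section.

```python
# Threshold described in this declaration (sum greater than or equal to 0.97).
REMOVAL_THRESHOLD = 0.97

# Summed frequencies of the alleles observed at D13 across all three Pen B
# replicates, by subpopulation, as reported later in this section.
d13_summed_frequencies = {
    "Asian": 0.820,
    "Black": 0.980,
    "Caucasian": 0.979,
    "Hispanic": 0.937,
}


def locus_is_removed(summed_by_subpopulation, threshold=REMOVAL_THRESHOLD):
    """Hypothetical helper: True if any subpopulation's sum reaches the threshold.

    Only allele frequency sums are inspected; no locus likelihood ratio is
    needed, which mirrors the point that removal precedes the LR calculation.
    """
    return any(total >= threshold for total in summed_by_subpopulation.values())


# Subpopulations whose sums reach the threshold on their own.
triggering = [
    name
    for name, total in d13_summed_frequencies.items()
    if total >= REMOVAL_THRESHOLD
]

print(locus_is_removed(d13_summed_frequencies))
print(triggering)
```

Under these assumptions the Black and Caucasian sums each reach the threshold, so the sketch flags D13 for removal even though the Asian and Hispanic sums do not.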
The loci removed from the calculation could entirely support the prosecution hypothesis; entirely support the defense hypothesis; or variously support the prosecution and the defense hypotheses, as seen with the Caucasian likelihood ratios for the D3, D13, and D16 loci in the comparison of “JB” vs. the “Pen B” mixture.

It is difficult to predict when the version of FST provided to me will remove loci from the likelihood ratio calculation, given any particular sample. Allele frequencies need to sum to ≥ 0.97 in only one subpopulation for a locus to be removed. Consequently, it is conceivable that a locus removed from the calculation could support the prosecution hypothesis for one subpopulation and the defense hypothesis for another subpopulation.

For example, in the Pen B mixture, alleles (8, 9, 11, 12, 13, 14) are observed at the D13 locus across all three replicates. According to the allele frequencies provided to me in the databases accompanying the FST source code, the frequencies of these alleles sum to, by subpopulation:

• Asian: 0.820
• Black: 0.980
• Caucasian: 0.979
• Hispanic: 0.937

The Asian and Hispanic subpopulation summed allele frequencies did not reach the 0.97 threshold for removal, but the D13 locus was still removed from the overall likelihood ratio calculation for all subpopulations. Furthermore, the sum of allele frequencies within each Pen B replicate genotype never reached 0.97 at D13. This threshold was only reached when frequencies were summed for alleles observed once or more across all replicates.

It is difficult to describe the significance of the change in behavior between the version of FST used during the validation study and the version of FST I was provided.
This is due to a number of complicating factors:

• the variance of observing different alleles across replicate genotypes;
• the removal of a locus from the likelihood ratio calculation for all subpopulations when any subpopulation’s allele frequencies sum to ≥ 0.97;
• the reporting of only the lowest LR across all four subpopulations in the FST Validation materials; and
• FST not reporting individual locus likelihood ratios.

Ultimately, the behavior of the version of FST I was provided deviates from the expected behavior of FST as described in its validation study. Currently, I cannot comprehensively describe the significance of this deviation. Further investigation is merited.

3.6 Guidance on the validation of probabilistic genotyping systems

In 2015 the Scientific Working Group on DNA Analysis Methods (SWGDAM) published a guidance document entitled, “Guidelines for the Validation of Probabilistic Genotyping Systems.” 4 This document states, under section 5, “Modification to Software”:

A significant change(s) to the software, defined as that which may impact interpretation or the analytical process, shall require validation prior to implementation.

In 2016 the International Society for Forensic Genetics (ISFG) published a guidance document entitled, “DNA Commission of the International Society for Forensic Genetics: Recommendations on the validation of software programs performing biostatistical calculations for forensic genetics applications.” 5 This document states, under the section entitled “Additional Guidance on Software Usage and Application”:

Core changes to the implemented algorithms should be subjected to additional developmental validation prior to their release.

“Significant” and “core” are not scientific terms and should therefore be precisely defined before they are used.
That is, a scientific evaluation of the significance of such a change should involve answering the questions, “To what degree is this change significant?” and, “By what test are we judging significance?” Since no guidance body has published a framework or set of criteria to be used for determining the significance of changes to probabilistic genotyping software, these are open questions in the field.

It is worth considering how it could be determined that any given change is not “significant” or “core” (and therefore does not merit a re-validation study) unless that determination is founded on scientific support, such as a validation study. Without mathematical proofs or any type of empirical study, claims about the behavior of any computational system should, by default, be considered speculative rather than conclusive.

Changes to the likelihood ratio calculation portion of a probabilistic genotyping system can affect the sensitivity, specificity, precision, and accuracy of the system when comparing contributors and non-contributors to DNA mixtures. Due to the complexity of probabilistic genotyping algorithms, seemingly small modifications to the code can have unintended consequences and “knock-on” or compounding effects that alter the intended behavior of the algorithms. The significance of a change should not be underestimated.

4 Scientific Working Group on DNA Analysis Methods. Guidelines for the Validation of Probabilistic Genotyping Systems. Available at http://www.swgdam.org/publications
5 M. D. Coble, J. Buckleton, J. M. Butler, T. Egeland, R. Fimmers, P. Gill, L. Gusmão, B. Guttman, M. Krawczak, N. Morling, and others, “DNA Commission of the International Society for Forensic Genetics: Recommendations on the validation of software programs performing biostatistical calculations for forensic genetics applications,” Forensic Sci. Int. Genet., vol. 25, pp. 191–197, 2016.

I am unaware of any background or validation materials pertaining to FST that describe the removal of loci from likelihood ratio calculations. FST’s behavior changed between the version of FST used to generate data for FST Validation Table 20D.2 and the version of FST provided to me. If this departure in behavior should be considered “significant” or “core,” SWGDAM and ISFG guidance indicate that a re-validation is merited.

4 Notice to the FST operator

I have not received any notice or warning to indicate FST has removed loci from its likelihood ratio calculations.

5 Conclusion

There is at least one indication of a deviation in behavior between the version of FST used during its validation study and the version of FST provided to me. My investigation indicates that this deviation is due to the behavior described in section 5.6 of my October 27, 2016 statement, which describes conditions under which FST removes loci from its likelihood ratio calculations.

Due to the complexity of the issue, I cannot currently describe the full significance of this behavior; how frequently it can occur during casework; or how it quantitatively impacts FST’s likelihood ratio results. Further investigation is merited.

The greater field of forensic DNA analysis has not reached a consensus for a framework that could be used to determine if a modification to FST affecting its likelihood ratio calculations is a “significant” or “core” post-validation change to the program. Extant guidance language suggests that if this change is “significant” or “core,” a re-validation of the FST software is

Signed,
Nathaniel Adams
February 12, 2017

Appendix

Figure A.1 – Pen B reproduction with the version of FST provided in US v. Johnson

Figure A.2 – Pen B reproduction with the version of FST provided in US v. Johnson – no information provided at D3, D13, and D16 loci

Figure A.3 – Pen B reproduction with the version of FST provided in US v. Johnson – information provided at only D3, D13, and D16 loci