A Statistical Metric of University Quality Based Upon the NCES-IPEDS Data

William D. Wachob
William.Wachob@mi.rr.com
Lawrence Technological University
Southfield, Michigan 48075, USA

Abstract

The development of a quality metric for colleges and universities generates a great deal of discussion each year. The metrics that most colleges and universities use are normally localized to the school so that they stay relevant to the intended user or outcome. Evaluating the quality of a college or university involves many potential factors, which makes the whole process unwieldy. This paper focuses on the development of an academic quality metric based upon the national standardized tests. If a simple single metric could be identified, many other comparisons could then be made. This paper develops the concept of an rScore, a ranking score based upon the standardized testing scores of the incoming freshman class. The standardized testing scores are self-reported by the institutions in their NCES-IPEDS (National Center for Education Statistics - Integrated Postsecondary Education Data System) submissions.

Keywords: quality, ranking, metric, IPEDS, rScore

1. INTRODUCTION

There is an initial premise in this work, which is represented in the following figure:

Figure 1 - College Funding "Arms Race"

When we read the arrows as “attracts”, the intent of the overall process becomes clear. The relationship between these components is described in a paper by Cunningham and Cochi-Ficano (2002). This process also represents the basic funding philosophy of many colleges and universities. It results in a fundamental belief that causes extra focus to be placed on grants, philanthropy and tuition revenue. The “war chest” of endowed funds for some colleges is very large. This allows those institutions far more freedom in enhancing their faculty, facilities and image.
With sufficient application of financial resources, institutions can effectively outspend their competitors. That spending can be used to build new or remodeled facilities and update laboratory and computer equipment. It can even be used to subsidize proposal efforts for additional project funding. The spending can also be used to give top faculty and researchers an incentive to come to or stay with a school. The same practice appears in private industry. While private industry does not generally have endowed assets, the corporate coffers will most likely hold a “war chest” that can be used to make similar strategic moves when they are deemed necessary. This concept is discussed in Fuller (1993).

The process described in Figure 1 relies heavily upon the element of higher reputation or perceived quality. As there is no single accepted metric for rating the quality of colleges and universities, a statistical methodology for depicting the quality of an institution would be useful.

One primary area where such a metric could be used is in institutional reporting and analysis. The metric would provide another means to describe competing institutions. If one school wished to emulate a particular success of another, one element of its discussion would likely be an assessment of whether the students at one school or the other are likely to have better success in that particular discipline.

The concept of perceived quality carries through to the needs of the public and prospective students. Several college selection publications (see references) advise the student to include “quality” in the college selection process. In practice, this largely defaults to the unscientific lists of “best” colleges and to marketing from the schools. This project proposes a common metric that could be used in the decision process of college selection.

2. FOCAL THEORY

The focal theory of this work is that it is possible to express a measure of the quality of an institution in a single numeric score.
This score has been named the “ranking score”, or rScore for short. The institutions can also be grouped into bins for the purpose of analysis and comparison. These are called the “ranking bins”, or “rBins”. Both of these terms can be shaped to bias them towards the type of institutions you wish to favor. For example, developing a metric that gives higher weight to engineering and scientific schools is basically a matter of giving higher weights to the math scores of the SAT (Scholastic Aptitude Test) or the ACT (originally American College Test, changed to simply ACT in 1996). One definition of the rScore is as follows:

Equation 1 - rScore25 calculation

The SAT and ACT scores were obtained as data reported by the colleges for their incoming freshman class. The data only includes students who are attending college for the first time. The choice of the elements was made to create a small bias towards the SAT mathematics score, indicating that the students were likely to be more oriented towards science and engineering. The choice of the 25th percentile was made as that would indicate that 75% of the students had achieved the reported score or better. This effectively provides a measure of the likelihood of success of the incoming freshman class in mathematics and science.

It should be noted that the actual rScore for an institution is less important than the use of a repeatable methodology with which to rank or group the institutions under analysis. Creating bins for the rScore values also suggests that the data set could be broken into a limited number of groups. Each group can then be shown to be statistically different from the others using the Student’s t-statistic as a test indicator. A more complete discussion of the Student’s t-statistic is provided in two references (Anderson (2002) and Lipson (1973)).
The equation for calculating the t-statistic is:

Equation 2 - Student's t-statistic

When n is sufficiently large, the t-statistic values of interest become:

% Probability    t-value
95%              1.96
98%              2.33
99%              2.58

When the t-statistic is calculated, a value that exceeds the applicable threshold above suggests that the two sets come from different populations.

The methodology followed in this work is modeled on the methodology used in Tang (2004). In that work, the authors compared a “university quality” metric to the compensation of the university CEOs, tuition and operating expenditures. They were able to develop equations of these metrics and relate them to perceived university quality metrics. That work used the reputation rankings as published each year by the Newsweek organization. Tang (2004) expands upon the use of such a metric. The key relevance of that work is that it establishes that there is a statistical relationship between certain “perception metrics” and hard data associated with the colleges and universities they studied.

3. HYPOTHESIS

The hypotheses associated with this analysis were constructed such that they could support the development of the focal theory. These hypotheses were that:

* A relationship exhibiting a high degree of correlation exists between the rScore and admission rates of the institution [AdmissRate=f(rScore)]
* A relationship exhibiting a high degree of correlation exists between the rScore and faculty salaries [FacSalaries=f(rScore)]
* A relationship exhibiting a high degree of correlation exists between the rScore and the tuition charged by the institution [Tuition=f(rScore)]
* A relationship exhibiting a high degree of correlation exists between the rScore and the endowment assets of the institution [Endowment=f(rScore)]

A high degree of correlation would be accepted when the correlation coefficient was above 0.6.
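
The acceptance criterion above can be illustrated with a minimal sketch. It assumes the correlation coefficient is the ordinary Pearson coefficient and that the 0.6 threshold is applied to its magnitude, so that a strong inverse relationship also qualifies; the paper states neither detail explicitly. The function names `pearson_r` and `is_high_correlation` are hypothetical, and the sample values are the first five rows of Appendix I.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def is_high_correlation(xs, ys, threshold=0.6):
    """Apply the 0.6 acceptance threshold to |r| (magnitude is an assumption)."""
    return abs(pearson_r(xs, ys)) > threshold


# First five rows of Appendix I: rScore25 and average 9-month faculty salary.
rscore = [672.7, 600.3, 504.9, 483.6, 459.5]
salary = [108104, 90073, 64282, 88821, 65154]
print(round(pearson_r(rscore, salary), 3), is_high_correlation(rscore, salary))
```

Five rows are far too few to reproduce the reported coefficients; the call is shown only to illustrate how each hypothesis could be checked against the threshold.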
Earlier work by Wachob (2009) identified that the variability in reporting some of the data values precluded using these forms for predictive modeling. This does not mean that the existence of a relationship cannot be shown.

4. DATA ACQUISITION

To obtain the data for the basic calculations, it was necessary to access the National Center for Education Statistics - Integrated Postsecondary Education Data System (referred to as simply IPEDS). The data year used was 2005-2006 as that was the latest year that had a full set of data. The Dataset Cutting Tool (DCT) was used to extract all of the school information. The data fields extracted from the data set are shown in attachment 1. The database was then divided into several related tables as shown in the following table:

Data Table    Contents
inst          Basic information about the institution
endow         Value of the endowments of the institution
enroll        Enrollment counts, number of applications, number of acceptances and enrollments
salary        Number of faculty employees, average salary
staff         Number of staff employees and total salary
test          Reported scores for SAT and ACT standard tests
tuition       Reported tuition, fees and housing costs

Table 1 - MySQL data tables

The data tables were joined on the IPEDS identification number of the institution, which denormalized the data set for convenience of use. The initial data set consisted of 7,018 institutions. A number of filters were then applied to the data set to limit the number of institutions that would be considered. These filters were:

* Only four-year degree-granting institutions were included
* Only institutions that reported endowment figures for 2005-2006 were included
* Only institutions that reported their 25th percentile SAT and ACT scores were included
* Only institutions that reported faculty salaries for 2005-2006 were included

Application of these filters yielded a set of 1,063 schools.
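
The four filters above can be sketched as a single predicate applied to each institution record. This is only an illustration: the original work stored the data in MySQL tables, and the dictionary keys used here (`four_year_degree_granting`, `endowment`, `sat_25`, `act_25`, `faculty_salary`) are hypothetical names rather than actual IPEDS field names. A value of `None` stands for "not reported".

```python
# Hypothetical keys that must hold a reported (non-None) value.
REQUIRED_FIELDS = ("endowment", "sat_25", "act_25", "faculty_salary")


def passes_filters(inst):
    """Return True when an institution record satisfies all four filters."""
    if not inst.get("four_year_degree_granting", False):
        return False
    return all(inst.get(field) is not None for field in REQUIRED_FIELDS)


def apply_filters(institutions):
    """Keep only the institutions that pass every filter."""
    return [inst for inst in institutions if passes_filters(inst)]


# Made-up records, for illustration only.
records = [
    {"unit_id": 1, "four_year_degree_granting": True, "endowment": 5.0e6,
     "sat_25": 480, "act_25": 19, "faculty_salary": 52000},
    {"unit_id": 2, "four_year_degree_granting": False, "endowment": 1.0e6,
     "sat_25": 450, "act_25": 18, "faculty_salary": 41000},
    {"unit_id": 3, "four_year_degree_granting": True, "endowment": None,
     "sat_25": 500, "act_25": 21, "faculty_salary": 60000},
]
print([inst["unit_id"] for inst in apply_filters(records)])
```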
These schools had the following degree characteristics:

Degree Type    Number
Bachelor’s     474
Master’s       389
Doctoral       200

Table 2 - Distribution of data set by degree

Examination of the rScore25 values shows the following distribution:

rScore (min)    rScore (max)    Number
0               100             4
100             200             100
200             300             552
300             400             275
400             500             83
500             600             44
600+                            5

Table 3 - Distribution of institutions by rScore25

The shape of this distribution is what would be expected: fewer schools reach the higher rScores. A further analysis shows that only 12.4% of the institutions had a score of 400 or more.

In order to make data handling more manageable, a further subset of the 1,063 schools was chosen. A nominal sample size of thirty-five (35) institutions was chosen, which corresponds to 3.3% of the schools. The selection of the schools was made using the following algorithm:

1. Partition the data set into groups by the type of degree being granted
2. Rank order the resulting set in descending order by rScore25 values
3. Calculate the number of samples for that group by taking 3.3% of the group size
4. Round the number of samples to an integer
5. Beginning with the highest ranked institution, select every rth institution, where r is the group size divided by the number of samples calculated in step 4

When the process described above is followed, the number of samples becomes:

Classification    Number of Schools    Number of Samples
Bachelor’s        474                  16
Master’s          389                  13
Doctoral          200                  7

Table 4 - Sample sizes by degree type

The rounding in step 4 brings the total to thirty-six samples. A fourteenth sample was added to the Master’s classification in order to include the author’s institution. This accounts for the 37th sample. A full listing of the sample set is included in appendix I. The identification of the source institutions has been omitted.

5. DATA ANALYSIS

To make the data analysis, the segments represented in figure 1 need to be mapped into available datasets.
The datasets available through the National Center for Education Statistics - Integrated Postsecondary Education Data System provided the necessary information. The data from the thirty-seven samples was processed for linear correlation using Minitab and Excel with the following results:

                  Slope      Intercept    r2       r
Admiss Rate       -0.1115    103.3        0.473    0.688
Faculty Salary    120.4      20,183.0     0.646    0.804
Tuition           61.1       5,154.9      0.468    0.684
Endowment         3.354e6    -849.9e6     0.659    0.812
Log10(Endow)      0.00546    5.914        0.564    0.751

Table 5 - Linear correlation results

From these results we can see that the hypothesis of a strong correlation between the rScore25 and the reported observations provided by the institutions holds true: every r value exceeds the 0.6 acceptance threshold. The data and trend lines for each set were also graphed and are included in the appendices as figures 2 through 6. The data for the endowed assets was transformed by taking the log10 of the reported endowed assets before plotting.

Admissions Rate

Four-year degree granting institutions typically have enrollment policies that are based in part upon the likelihood of success for the student. Components of that assessment often include metrics such as the student’s high school grade point average and performance on one or both of the standardized national tests (ACT and SAT). Schools that have better reputations for their academic programs have the ability to be more selective in whom they admit to their programs.

When the admissions rates are compared to the rScore statistic, a good correlation coefficient of 0.688 (in magnitude) exists. This is shown in Figure 2 - Admissions Rate. The negative value of the slope is consistent with the observation that admissions rates fall as the rScore increases. If the rScore is a reflection of the likelihood of success for the students, then the idea that schools with higher rScore values can be more selective in their admissions policies is supported. Among the selected schools, institutions with “open enrollment” policies were not included.
This is not a major limitation because relatively few four-year degree granting institutions have open enrollment practices.

Faculty Salaries

The indicator used for the analysis was the reported faculty salaries adjusted for a nine-month school year. In this case, the positive slope of the relationship indicates that the reported salaries increase with increasing rScores. This data is represented graphically in Figure 3 - Faculty Salaries. While higher salaries do not necessarily mean that instruction is better, it is also likely that better known professors will draw the students with higher SAT and ACT scores. The reasoning here becomes circular except for the observation that the independent axis of this relationship is determined by the incoming freshman class. The rScore for the class is based upon standardized scores that are often earned in the high school junior year. The higher correlation between the rScore and the faculty salaries (0.804) may also be influenced by the higher tuition costs for more popular schools.

Tuition

Although the relationship between rScore and tuition has the lowest correlation coefficient, it still exhibits a good affinity based upon its r value of 0.684. This data is shown graphically in Figure 4. The tuition values were not adjusted for geographic differences across the country, nor for market related factors. The reported values are made up of two components. Tuition, books and fees, which are relatively constant within a region, comprise the first component. Local housing costs represent the second; these have a higher degree of variance across geographic regions.

Endowed Assets

In an initial analysis, the endowed asset data was normalized to a per-student basis. That analysis failed to show a high correlation coefficient. A second analysis was performed using only the raw reported endowment amounts.
This relationship had the strongest correlation of the data analyzed. When the data was plotted using the log10 of the reported amounts the relationship also had good correlation. This is depicted graphically in Figure 5. This graph also suggests that a higher order relationship may exist between the rScore and the endowment data. Fitting the values into a quadratic relationship (Figure 6) showed an exceptionally high correlation (0.902).

6. CONCLUSIONS

The initial premise of this paper was that a statistical metric could be constructed to indicate the quality of an institution. Webster’s dictionary defines quality as a “degree of excellence” or “superiority of kind”. In assessing higher educational institutions, many factors come to bear on the definition. It is likely that top students, faculty and benefactors will affiliate with institutions that have a better reputation. While an important element of this is the brand management of the institution, the never ending relationship depicted in Figure 1 still holds true. When the data is analyzed, a strong statistical relationship can be shown between the rScore25 and the observations appropriate to each segment of that figure.

One of the observations from the rScore is that the distribution of the results follows a normal distribution. The histogram of the distribution of the rScores of the sample data set is provided as figure 7.

The rScore statistic is perhaps most useful when performing comparison analysis between institutions. Small differences (<5%) should be considered as noise in the analytical process. Large differences raise the question of a difference in capability between the incoming freshman classes being compared. In this regard grouping of the institutions into rBins becomes an appropriate tool.

The listing of the top 10 schools as calculated by the rScore is included in Appendix VII. The list represents the rScore as calculated with the formula in equation 1.
This version of the rScore gives higher weighting to the SAT math25 score and results in a list that has a higher number of technology schools. This does not rule out liberal arts programs, as many students who score well in the math component also score well in the verbal and the ACT composite25 scores.

The rScore methodology provides a quick tool for evaluating the academic strength of an institution based upon the likelihood of success for its freshman class. The “arms race” hypothesis depicted in figure 1 has been validated against the rScore with a good level of correlation. The rScore can then be used in modeling to investigate its relationships to other common metrics. For example, the rScore has been correlated with technology infrastructure spending. The results were not as consistent as would be necessary for empirical modeling, but medium sized subsets of the results have shown promise.

Beyond using a statistic such as this for college selection decision making, the question becomes what, if anything, a school could do to improve its score. The data from this study suggest that a significant increase in endowments would be the best place to focus. This is hardly surprising as it is one of the critical elements in driving the whole “arms race” cycle.

The component of faculty salaries represents the quality of the faculty. Recruitment of top faculty is normally an objective for any school. The need to maintain salary structures and competitive tuition pricing will act as a dampening mechanism in this element of the process.

The acceptance policies of the school also play a part in this cycle. By increasing the SAT and ACT scores required for admission, the rScore could be increased. Student tuition has a similar relationship in this process. Increases in tuition and tightening of acceptance policies would undoubtedly have a negative effect on enrollment numbers.
Given the importance of having a critical mass of students to sustain school operations, this may be counter-productive.

7. REFERENCES

Anderson, Sweeney, Williams (2002), Statistics for Business and Economics, South-Western division of Thomson Learning, pp 560-567

Carter, Carol C. (2006), College Selection Criteria: Reading Between the Lines, Retrieved June 2006, from http://www.lifebound.com/lifebound-resources-for-parents/college-selection-criteria

Coates, Joseph F., Mahaffie, John B., Hines, Andy (1997), 2025: Scenarios of US and Global Society Reshaped by Science and Technology, Oakhill Press, Greensboro, NC, ISBN 1-886939-09-8

College Selection Criteria (2006), Retrieved June 2006, from http://bcsd.k12.ny.us/high/counseling/collselection.htm

Cunningham, Brendan M., Cochi-Ficano, Carlena K. (2002), The Determinants of Donative Revenue Flows from Alumni of Higher Education, The Journal of Human Resources, Vol. XXXVII, No. 3, pp 540-569

DePauw University Admissions (2006), College Selection Criteria, Retrieved June 2006, from http://www.depau.edu/admission/applying/tips/college-search/criteria.asp

Fuller, Mark B. (1993), Business as War, Fast Company Magazine, October 1993, Issue 00, pp 42, Retrieved December 2006 from http://www.fastcompany.com/online/00/war_printer_friendly.html

Highland Park High School College Planning Handbook (2006), Criteria for College Selection, Retrieved June 2006, from http://www.dist113.org/hphs/handbook/bcriteria.htm

Libby, Wendy (2005), College choice is more than a numbers game, Columbia Daily Tribune, Published on May 1, 2006, Retrieved June 2006 from http://www.showmenews.com/2005/May/20050501Comm008.asp

Lipson, Charles and Sheth, Narendra J. (1973), Statistical Design and Analysis of Engineering Experiments, McGraw-Hill Publishing

Pascarella, Ernest T. (2001), Identifying Excellence in Undergraduate Education: Are We Even Close?
Change, Vol. 33, No. 3, pp 18-23, May/June 2001

Tang, Thomas Li-Ping, Tang, David Shin-Hsiung and Tang, Cindy Shin-Yi (2004), College Tuition and Perceptions of Private University Quality, International Journal of Educational Management, Emerald Group Publishing, Vol. 18, No. 5, pp 304-316

Wachob, William (2009), An Investigation of Technology Funding in U.S. Colleges and Universities, (unpublished doctoral dissertation, Lawrence Technological University)

Appendix I Raw Data

rScore25  Admissions Rate (%)  Endowment        Tuition and Fees (in-state, residential)  Average 9-month Faculty Salary
672.7     14.3                 2,147,483,647.0  44,600                                    108,104
600.3     18.8                 1,300,081,000.0  42,624                                    90,073
504.9     50.7                 293,982,704.0    30,334                                    64,282
483.6     45.7                 1,411,813,000.0  42,478                                    88,821
459.5     67.5                 99,641,797.0     26,300                                    65,154
396.0     74.5                 56,687,176.0     40,632                                    66,827
391.2     77.4                 203,825,000.0    43,226                                    73,735
365.7     63.1                 123,593,848.0    29,294                                    61,662
363.4     67.6                 155,589,000.0    34,675                                    70,745
352.0     80.1                 293,281,661.0    32,586                                    67,907
332.2     63.5                 111,313,542.0    27,590                                    56,091
331.8     56.9                 17,097,792.0     16,578                                    58,268
321.3     78.8                 555,365,026.0    15,730                                    73,685
310.0     67.7                 47,520,162.0     11,495                                    60,808
308.0     96.2                 265,649,102.0    16,549                                    54,111
296.0     61.5                 78,358,300.0     18,412                                    51,730
294.0     54.2                 2,321,820.0      13,228                                    50,780
287.7     80.9                 4,929,041.0      19,435                                    36,756
282.0     61.6                 6,536,475.0      15,354                                    57,342
275.5     76.5                 48,696,961.0     35,360                                    66,646
269.8     66.7                 44,509,574.0     26,978                                    44,544
262.2     90.1                 64,280,276.0     19,142                                    59,241
261.2     74.7                 1,665,232.0      14,968                                    55,967
260.3     52.2                 21,735,726.0     26,161                                    66,008
260.0     84.6                 17,035,047.0     26,280                                    45,982
246.6     75.6                                  21,152                                    55,201
246.6     76.3                 9,343,165.0      14,403                                    55,586
237.6     59.5                 21,079,909.0     14,810                                    31,471
237.6     74.6                 2,493,000.0      15,945                                    47,082
231.2     89.7                 50,579,041.0     13,501                                    53,810
228.0     63.1                 23,636,920.0     24,060                                    33,549
222.7     59.4                 15,324,126.0     29,461                                    38,137
217.6     99.6                 83,065,650.0     12,560                                    52,508
210.8     66.1                 6,919,757.0      30,540                                    39,509
198.4     67.6                 2,378,919.0      16,792                                    69,824
192.0     98.4                                  14,022                                    26,287
160.2     75.2                 9,270,773.0      20,880                                    41,868

Table 6 - Raw Data

Appendix II Data Graphs
Figure 2 - Admissions Rate

Figure 3 - Faculty Salaries

Figure 4 - Reported Tuition
Tuition reported for an in state, residential student

Figure 5 - Endowed Assets (log scaled)

Figure 6 - Endowed Assets (quadratic)

Figure 7 - Distribution of rScore values
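
The trend lines in the linear figures above, and the slope, intercept and r2 columns of Table 5, are the kind of output an ordinary least-squares line fit produces. The paper's fits were produced with Minitab and Excel; the following is only a minimal sketch of the same calculation, assuming a simple least-squares fit of y on rScore25. The function name `fit_line` is hypothetical, and the five data points are the first rows of Appendix I, with the endowment transformed by log10 as described for Figure 5.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination from residual and total sums of squares.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return slope, intercept, r2


# First five rows of Appendix I: rScore25 and reported endowment.
rscore = [672.7, 600.3, 504.9, 483.6, 459.5]
endowment = [2147483647.0, 1300081000.0, 293982704.0, 1411813000.0, 99641797.0]
log_endowment = [math.log10(value) for value in endowment]

slope, intercept, r2 = fit_line(rscore, log_endowment)
print(f"slope={slope:.5f} intercept={intercept:.3f} r2={r2:.3f}")
```

With only five points the result will not match Table 5; rows with a missing endowment value would have to be excluded before the log10 transform.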