This project arose from an initiative at Open Data Delaware to explore ways in which data can be used to make better decisions about nursing home care. In a similar project, members of Chi Hack Night developed the Chicago Nursing Home Search website to address concerns that Illinois ranks among the lowest-performing states for nursing homes. The objective of this project is to provide consumers with the information they need to identify nursing homes that provide better patient care, and to assess the trade-off with cost. This is increasingly important with the shift to value-based reimbursement structures for healthcare, which are designed to reduce cost while maintaining the quality of patient care (1).

The Centers for Medicare & Medicaid Services (CMS) regularly surveys patients and staff to assess the standard of patient care in nursing homes, and conducts inspections to identify deficiencies in compliance with health and safety regulations. The data are compiled and made available to the public through the Nursing Home Compare (NHC) website. In accordance with the Five-Star Nursing Home Rating system, the data are used to rate each nursing home in three areas (health inspections, staffing, and quality of resident care), and those ratings are combined to obtain an overall rating. The ratings are provided on the NHC website to assist consumers, and serve as the basis for other services such as US News rankings and Elder Care Directory. However, quality of resident care is multidimensional, and performance measurement is challenging because of the complexity of quality in nursing homes, the diversity of nursing home residents, and the lack of knowledge of the organizational factors that affect quality of care (2–4). Consequently, a five-star rating for a nursing home has no simple interpretation in terms of quality of care: it is a composite quantity obtained by reducing multidimensional performance data to a one-dimensional representation, which entails a loss of information.

In their critical assessment of nursing home performance strategies, Phillips et al. point out that a fundamental limitation is the lack of “a gold standard to establish the validity of much of what we do” (2). Without a gold standard it is difficult to develop a measurement system for the quality of patient care. In this work, we use the CART classification tree method to analyze nursing home performance data and obtain an approximate solution to this problem. The potential of CART to uncover patterns in data and identify factors influencing outcomes in nursing research was reviewed by Kuhn et al. (5) in 2014. More recent applications of CART include the prediction of weight loss following radiotherapy (6) and of influenza infection in primary care patients (7). The CART algorithm produces a decision tree representation of the relation between a target or utility variable and independent input variables. Branches of the decision tree are obtained by recursive binary partitioning of the data: for each partition, the algorithm chooses the input variable and splitting value that best separate the values of the target variable (8). Groups of nursing homes with good or bad quality of patient care are identified from the branching criteria of the decision tree.
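
The recursive binary partitioning described above can be illustrated with a minimal sketch in Python. It is not the analysis code used in this work. It assumes Gini impurity as the splitting criterion (the text does not name one), and the two input variables (nurse staffing hours per resident day and number of inspection deficiencies), the six data rows, and the "good"/"poor" labels are hypothetical values chosen only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Search every input variable and threshold for the lowest weighted impurity.

    Returns (feature_index, threshold, impurity), or None if no split exists.
    """
    best = None
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or impurity < best[2]:
                best = (j, threshold, impurity)
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively partition the data; a leaf holds the majority label."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or gini(labels) == 0.0:
        return majority
    split = best_split(rows, labels)
    if split is None:
        return majority
    j, threshold, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > threshold]
    return {
        "feature": j,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the branching criteria from the root down to a leaf label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Hypothetical data: (staffing hours per resident day, number of deficiencies).
rows = [(4.5, 2), (4.0, 3), (3.5, 8), (2.5, 9), (3.0, 12), (4.25, 1)]
labels = ["good", "good", "poor", "poor", "poor", "good"]

tree = build_tree(rows, labels)
print(tree)
print(predict(tree, (4.2, 4)))
```

On this toy data a single split on the first variable at 3.75 separates the two groups, so the tree has one branching criterion and two leaves. In practice an established implementation (for example, a decision tree classifier from a statistics library) would normally be used, together with pruning or a depth limit to avoid overfitting.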


2. Phillips CD, Hawes C, Lieberman T, Koren MJ (2007) Where should Momma go? Current nursing home performance measurement strategies and a less ambitious approach. BMC Health Services Research 7(1):93.

3. Castle NG, Ferguson JC (2010) What is nursing home quality and how is it measured? The Gerontologist 50(4):426–42.

4. Frijters DH, et al. (2013) The calculation of quality indicators for long term care facilities in 8 countries (SHELTER project). BMC Health Services Research 13(1):138.

5. Kuhn L, Page K, Ward J, Worrall-Carter L (2014) The process and utility of classification and regression tree methodology in nursing research. Journal of Advanced Nursing. doi:10.1111/jan.12288.

6. Cheng Z, et al. (2017) Evaluation of classification and regression tree (CART) model in weight loss prediction following head and neck cancer radiotherapy. Advances in Radiation Oncology. doi:10.1016/j.adro.2017.11.006.

7. Zimmerman RK, et al. (2016) Classification and Regression Tree (CART) analysis to predict influenza in primary care patients. BMC Infectious Diseases 16(1):1–11.

8. Krzywinski M, Altman N (2017) Points of Significance: Classification and regression trees. Nature Methods 14(8):757–758.