# Selecting the Best Prediction Model for Readmission

## Article information

J Prev Med Public Health. 2012;45(4):259-266
Publication date (electronic): 2012 July 31
doi: https://doi.org/10.3961/jpmph.2012.45.4.259
College of Pharmacy, Gachon University, Incheon, Korea.
Corresponding author: Eun Whan Lee, MPH. 191 Hambangmoe-ro, Yeonsu-gu, Incheon 406-799, Korea. Tel: +82-32-899-6431, Fax: +82-32-820-4829, ewlee@gachon.ac.kr
Received 2012 February 24; Accepted 2012 April 30.

## Abstract

### Objectives

This study aims to determine the risk factors predicting rehospitalization by comparing three models and selecting the most successful model.

### Methods

In order to predict the risk of rehospitalization within 28 days after discharge, 11 951 inpatients were recruited into this study between January and December 2009. Predictive models were constructed with three methods, logistic regression analysis, a decision tree, and a neural network, and the models were compared and evaluated in light of their misclassification rate, root asymptotic standard error, lift chart, and receiver operating characteristic curve.

### Results

The decision tree was selected as the final model. The risk of rehospitalization was higher when the length of stay (LOS) was less than 2 days, route of admission was through the out-patient department (OPD), medical department was in internal medicine, 10th revision of the International Classification of Diseases code was neoplasm, LOS was relatively shorter, and the frequency of OPD visit was greater.

### Conclusions

When a patient is to be discharged within 2 days, the appropriateness of discharge should be considered, with particular attention to undiscovered complications and comorbidities. In particular, if the patient is admitted through the OPD, any suspected disease should be appropriately examined and test results should be obtained promptly. Moreover, for patients in internal medicine, comorbidities and complications caused by chronic illness should be given greater attention.

## INTRODUCTION

Readmission refers to being hospitalized again after being discharged and is significant for two reasons: quality and cost of health care [1]. In other words, readmission reflects relatively low quality [2] and also has negative social impacts [3-5]. Readmission is considered an indicator for evaluating the overall health care environment [6,7]. The Australian Council on Healthcare Standards and the National Health Service of the UK use readmission within 28 days after discharge as an indicator of the quality of health care [8,9]. The readmission rate is also one of the indicators of the Hospital Services Evaluation Program in Korea [10]. In addition, the rate of readmission was chosen as one of the grounds for measuring the quality of health care in the National Quality Forum of 2008, and the US Centers for Medicare and Medicaid pointed to readmission as a reason for excessive medical expenses [11].

The US Medicare Payment Advisory Commission stated that $12 billion per year is spent on preventable readmissions [4], and a study estimated the cost of readmission of Medicare patients to be $17.4 billion per year [5]. When the medical costs of admitted patients are analyzed, high-cost patients, who comprise 15% of all admitted patients, account for 55.3% of total medical costs, and readmitted patients account for the highest portion, 42.0% [12].

The frequency of readmission varies depending on the number of days after discharge. Research on Medicare patients in the US shows that the rate of readmission was 6.2% within 7 days, 11.3% within 15 days, and 17.6% within 30 days [13]. In another study, patients readmitted within 30 days accounted for 19.6% of all patients, or about one in five [5].

As described above, readmission not only degrades the quality of health care but also increases medical expenses. Consequently, it is important to identify the causes of readmission and to predict it in order to prevent it. In this study, models to predict the risk of readmission were constructed, compared, and evaluated in order to determine an optimum model. The final model was then applied to the research data to identify patients at risk of readmission and thereby help prevent it.

## METHODS

### Materials and Subjects

The database of patients admitted to a teaching hospital in Seoul from January 1 to December 31, 2009, was used. The subject of analysis was individual patients, and when a patient had been admitted several times, all cases were evaluated individually and included in the analysis. Among the 22 disease categories of the 10th revision of the International Classification of Diseases (ICD-10), categories that accounted for less than 5% of the sample were excluded. The final subjects of analysis were 11 951 hospitalized patients, out of a total of 16 347, admitted under the following 8 categories of diseases: "neoplasms," "endocrine, nutritional and metabolic," "circulatory system," "respiratory system," "digestive system," "musculoskeletal system and connective tissues," "genitourinary system," and "symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified."

### Variables

#### Dependent variable

Readmission is a dependent variable for which the period between discharge and readmission varies according to the researcher. However, governmental agencies in Canada, Australia, the UK, and New Zealand, as well as most recent studies, use 1 month (28-30 days) as the standard [8,9,14-18]. In this study, readmission within 28 days after discharge was used as the dependent variable.

#### Independent variable

The risk factors of readmission revealed in previous studies are composed mainly of "demographic," "treatment and clinical," and "health care utilization" factors. Demographic factors were sex [17], age [17,19-21], income [22], and level of education [16]. As for age, opinions about its significance differ among researchers [20,23-25]. Treatment and clinical factors identified in previous research include department of treatment [17,20], number of comorbidities [22,23], surgery [26,27], clinical test results [28], change in the amount of drug dose within 48 hours of discharge [20], experience of depression [21], and state of mental health [28]. Health care utilization factors previously identified were the length of hospitalization [20], frequency of hospitalization and use of the emergency room (ER) within 6 months prior to the index hospitalization [28], frequency of hospitalization within 1 year from the index hospitalization [19], type of insurance [18], and type of ward [27].

For this study, risk factors used in the demographic category were sex, age, and region of residence; for the treatment and clinical category, the risk factors were department of treatment, premium medical treatment (selecting a specific doctor), number of comorbidities, number of accompanying treatments, and whether the patient had surgery; the categories of main diagnosis of ICD-10 were included as independent variables; for the health care utilization category, type of insurance, type of patient room, admission route, length of stay (LOS), number of out-patient department visits and hospitalization, and use of ER within one year of the index hospitalization were taken into consideration as independent variables.

### Construction and Validation of the Model

Each of the three models in this study was constructed using a different method. The first model was constructed using logistic regression, which is widely used when the dependent variable is binomial. The second was built with a decision tree, which selects the most significant independent variable at each stage of predicting the dependent variable. The third model was constructed using a neural network, which is useful for classifying and predicting from a large database [29,30].

The complete analysis data were divided into 70% training data and 30% validation data. To decide which model had the highest predictive power, the misclassification rate and root asymptotic standard error (ASE) of each model were compared on both the training data and the validation data. A lift chart and receiver operating characteristic (ROC) curve were also used [29,30]. The model finally selected was then applied to the research data to identify patients with high readmission risk.
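The split and the two numeric criteria described above can be illustrated with a minimal Python sketch. The article gives no formulas or software details, so everything here is an assumption for illustration: the function names are hypothetical, the classification threshold of 0.5 is assumed, and root ASE is computed as the square root of the mean squared difference between the observed outcome (0 or 1) and the predicted probability, which is how data mining packages commonly define it.

```python
import math
import random


def split_train_validation(records, train_fraction=0.7, seed=0):
    """Shuffle the records and split them into training and validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def misclassification_rate(actual, predicted_prob, threshold=0.5):
    """Share of cases whose predicted class differs from the observed class."""
    wrong = sum((p >= threshold) != bool(y) for y, p in zip(actual, predicted_prob))
    return wrong / len(actual)


def root_ase(actual, predicted_prob):
    """Square root of the mean squared gap between outcome and predicted
    probability (assumed definition; the article does not state one)."""
    squared = sum((y - p) ** 2 for y, p in zip(actual, predicted_prob))
    return math.sqrt(squared / len(actual))


# Toy data, not taken from the study.
train, valid = split_train_validation(range(10))
actual = [1, 0, 0, 1]            # 1 = readmitted within 28 days
probs = [0.8, 0.3, 0.6, 0.4]     # hypothetical predicted probabilities

print(len(train), len(valid))
print(misclassification_rate(actual, probs))
print(round(root_ase(actual, probs), 3))
```

Both measures are "lower is better", and comparing them on training and validation data side by side shows whether a model fits the training data much better than unseen data.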

## RESULTS

### Characteristics of Subjects

The demographic characteristics of subjects were as follows: men accounted for 51.0%; by age, 22.6% were in their 60s, 20.4% in their 50s, and 18.4% were 70 or older; and more than half of the patients (56.3%) were residents of the city of Seoul (Table 1).

Treatment and clinical characteristics were as follows: 64.1% of the patients were treated in the Department of Internal Medicine and 93.0% chose premium medical treatment. Among the principal diagnoses, diseases of the digestive system comprised 20.6% of patients and neoplasms 18.2% (Table 2).

Health care utilization characteristics were as follows: patients with National Health Insurance accounted for 93.9%, and most of the patients (87.6%) stayed in rooms with 2 or more beds. Hospitalization via outpatient visits comprised 70.0%, and the average LOS was 8.5±11.2 days. The number of outpatient visits within one year before the index hospitalization was 14.7±20.0 visits, the number of hospitalizations within one year prior to the index admission was 2.5±3.4 times, and the use of the ER within one year of index hospitalization was nil for 89.5% of patients (Table 3).

### Construction and Validation of Models

The predictive power of the three models was evaluated by comparing the misclassification rate and root ASE (Table 4). With the training data, the root ASE was 0.385 for the logistic regression, 0.373 for the decision tree, and 0.384 for the neural network; with the validation data, it was 0.385, 0.369, and 0.383, respectively. The misclassification rate with the training data was 0.214 for the logistic regression, 0.180 for the decision tree, and 0.214 for the neural network; with the validation data, it was 0.217, 0.177, and 0.211, respectively. Thus, the decision tree had the lowest root ASE and the lowest misclassification rate, that is, the highest predictive power on both measures. The lift chart (Figure 1) and ROC curve (Figure 2), which are widely used to evaluate a model's predictive power, also indicated that the decision tree had the strongest predictive power. On the basis of this comparison, the decision tree was chosen to predict patients at risk of readmission.
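The selection step can be reproduced from the published numbers. The sketch below transcribes the values in Table 4 and picks the model with the lowest value of each measure; the dictionary layout and the `best_model` helper are illustrative names, not part of the study.

```python
# Values transcribed from Table 4 (lower is better for both measures).
TABLE_4 = {
    "Logistic regression": {
        "root_ase": {"training": 0.385, "validation": 0.385},
        "misclassification": {"training": 0.214, "validation": 0.217},
    },
    "Decision tree": {
        "root_ase": {"training": 0.373, "validation": 0.369},
        "misclassification": {"training": 0.180, "validation": 0.177},
    },
    "Neural network": {
        "root_ase": {"training": 0.384, "validation": 0.383},
        "misclassification": {"training": 0.214, "validation": 0.211},
    },
}


def best_model(measure, data="validation"):
    """Return the model with the lowest value of the given measure."""
    return min(TABLE_4, key=lambda model: TABLE_4[model][measure][data])


for measure in ("root_ase", "misclassification"):
    print(measure, "->", best_model(measure))
```

The decision tree comes out lowest on both measures, on the training data as well as the validation data.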

The decision tree identified the main risk factors of readmission as, in order, LOS (split at 2 days), route of admission, category of principal diagnosis (ICD-10), department, LOS (split at 5 days), and the frequency of outpatient visits (Figure 3).

The most important factor in the first stage was the LOS; if the LOS was less than 2 days, the risk of readmission was 64.9%. The important factor in the next stage was the route of admission, where hospitalization via an outpatient visit carried the highest risk. Risk was especially high (75.3%) for patients who had a LOS of less than 2 days and had been hospitalized via an outpatient visit. For the third stage, the department of treatment was the important factor, where the risk of readmission increased for the Department of Internal Medicine. In particular, the risk of readmission was 80.6% if the patient was hospitalized via an outpatient visit in the Department of Internal Medicine with less than 2 days of stay.

For patients with a LOS of more than 2 days, the important risk factor in the next stage was the category of principal diagnosis (ICD-10), where the risk of readmission for patients with neoplasms with a LOS over 2 days was 51.2%. The important factor in the third stage was the LOS, with increased risk when the LOS was less than 5 days. If the category of principal diagnosis was neoplasms with a LOS over 2 days but less than 5 days, the risk of readmission was 69.2%. For categories other than neoplasms, the risk factor of readmission was the number of outpatient visits within a year of the index hospitalization, where the risk increased if the number exceeded 10 times.
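The tree's branches can be read as nested rules. The sketch below is a hypothetical encoding that uses only the percentages reported in the Results and Discussion; it is not the fitted model. Several points are assumptions: a LOS of exactly 2 days is sent to the "more than 2 days" branch, exactly 10 outpatient visits counts as the lower-risk group, "not internal medicine" is treated as surgical, and branches for which the text gives no percentage return `None`.

```python
def reported_readmission_risk(los_days, admitted_via_opd, internal_medicine,
                              neoplasm, opd_visits_last_year):
    """Return the readmission risk (%) reported for the deepest tree node
    described in the article, or None where no figure is reported."""
    if los_days < 2:
        if not admitted_via_opd:
            return None  # emergency room branch: no percentage given in the text
        # Outpatient route, split by department of treatment.
        return 80.6 if internal_medicine else 39.1
    if neoplasm:
        # Neoplasms are split again on LOS at 5 days.
        return 69.2 if los_days < 5 else None
    # Other categories are split on outpatient visits in the previous year.
    return 27.3 if opd_visits_last_year > 10 else 12.5


# Short stay, admitted via the outpatient department, internal medicine.
print(reported_readmission_risk(1, True, True, False, 0))
# Neoplasm with a stay of 3 days.
print(reported_readmission_risk(3, True, True, True, 0))
# Non-neoplasm, longer stay, frequent outpatient visits.
print(reported_readmission_risk(7, False, False, False, 15))
```

Writing the rules out this way makes clear that each patient follows a single path, so a variable such as outpatient visits only matters on the branch where the tree actually splits on it.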

## DISCUSSION

The most important variable in predicting the risk of readmission was the LOS, where the risk was high when the LOS was less than two days (64.9%). Discharge within 24 or 36 hours of hospitalization was a risk increasing factor for various ages and disease categories due to decreased opportunity to discover comorbidities or complications as well as difficulty in managing pain for patients discharged on the same day as ambulatory surgery [31-33].

For patients with a LOS of less than 2 days, the variable that predicted the risk of readmission was the route of admission: the risk was high (75.3%) if the LOS was less than 2 days and the patient was admitted via an outpatient visit. The severity and urgency for patients who come to the ER are high, but the consultation time is short. The medical staff of the ER leave a variety of possibilities open and make diagnoses in greater detail than is done for outpatients, which reduces the possibility of misdiagnosis or undiscovered illness [34]. In addition, intensive treatment is carried out in a relatively short time on the basis of rapid diagnosis. In particular, based on this study, the diagnosis and treatment of comorbidities are likely to have been made faster for emergency patients with a LOS of less than 2 days than for those hospitalized via outpatient services. In other words, among patients who were discharged early (within 2 days of hospitalization), the risk of readmission is speculated to be higher, for the reasons mentioned above, for those hospitalized via an outpatient visit than for those admitted via the ER.

The department of treatment was the most important variable in predicting the risk of readmission for patients with a LOS of less than 2 days and hospitalization via an outpatient visit. The risk of readmission was 80.6% for patients from the Department of Internal Medicine and 39.1% from the Department of Surgery. This study supports previous studies that concluded that the risk of readmission for patients from the Internal Medicine Department is generally higher due to the relatively high proportion of patients with chronic disease and complications [14,17,20].

Therefore, in order to prevent readmission, the appropriateness of discharge within 2 days of hospitalization should be considered with respect to undiscovered comorbidities and complications. It is especially important to provide appropriate examinations and rapid results for suspected disorders to patients admitted through an outpatient visit; moreover, close attention should be given to patients from the Department of Internal Medicine for potential comorbidities and complications caused by chronic diseases.

The category of principal diagnosis (ICD-10) was the most important variable for patients with a LOS of over 2 days. In previous studies, the specific diagnosis was proposed as a factor in readmission, based on the finding that the risk factors for readmission vary according to the specific disease [35]. In a study of patients with pneumonia, congestive heart failure, and intracranial hemorrhage, analysis according to each type of disease correlated with the factors affecting readmission [14]. Hence, the risk of readmission can be predicted more accurately by breaking down and analyzing the target of research according to more detailed types of diseases. In the results of this study, diagnoses were split into "neoplasms" and "others"; the higher risk for neoplasms is likely due to regular readmission for chemotherapy as follow-up treatment, which is unavoidable [5].

LOS (standard: 5 days) was the key variable for predicting the risk of readmission for patients with LOS of more than 2 days and disease in the "neoplasms" category, where the shorter the LOS was, the higher the risk of readmission (69.2%). According to previous research, the risk of readmission within 1 year of index hospitalization increased with an increase in LOS, in a study of patients over 14 years of age in training hospitals [20] while an increased risk of readmission within 28 days of discharge was noted for a decrease in LOS for patients in academic hospitals [14]. In addition, LOS was not a significant factor in a study of patients over 65 years who were readmitted within 60 days of discharge [36]. This difference appears to be due to the difference in the subjects of study and the definition of the period from the day of discharge till readmission. For this study, the risk of readmission was higher when the LOS was relatively shorter with respect to the 5 day standard, which seems to be due to premature discharge without sufficient treatment. This phenomenon is caused by hospitals trying to reduce the LOS in order to increase bed turnover, which in turn increases profitability [37].

For patients classified in the "others" disease category, the factor of risk prediction was the number of outpatient visits within a year prior to the index hospitalization. Previous research shows that the risk of readmission is higher for patients who use medical services frequently; the risk of readmission increased with an increase in the frequency of hospitalization and use of ER within 6 months of index hospitalization as well as higher utilization of outpatient services within a year of index hospitalization [14,19,28]. The present study supports the findings of the previous studies, as the risk of readmission evaluated by this study was 12.5% for patients who used outpatient services less than 10 times within one year of the index hospitalization, whereas it was 27.3% for those who used it more than 10 times.

The first contribution of this study is that, unlike most prior studies that used a logistic regression, it used various methods to construct prediction models and then chose the model with the strongest predictive power. Second, it went beyond the simple task of identifying risk factors of readmission to finding the important predictive variables for each stage, thereby allowing a study of patterns and implications.

This study had some limitations. First, the results cannot be generalized, as the data were from hospitalized patients at a single academic hospital. Second, the study could not include all the variables mentioned in prior research that reflect the state of the patient, such as the severity of the disease, the specific comorbid diseases, and the process of providing health care services. Although the number of accompanying treatments and the number of comorbidities were included as independent variables to make up for this limitation, it remains difficult to directly identify the severity of the disease and the process of providing medical services. Third, the decision tree modeled the pattern of readmitted patients with only 6 splits, which means that it could not categorize patients according to all of the risk factors suggested in previous studies. In conclusion, the study provided results that help clarify a pattern in the way variables identify patients at risk of readmission, but it could not assess all the variables that affect readmission. It is recommended that further studies include variables such as the severity of disease and the process of providing medical services. The risk factors should also be studied with respect to more detailed types of disease.

## Notes

The author has no conflicts of interest with the material presented in this paper.

## References

1. Hasan M. Readmission of patients to hospital: still ill defined and poorly understood. Int J Qual Health Care 2001;13(3):177–179. 11476140.
2. Clarke A. Readmission to hospital: a measure of quality or outcome? Qual Saf Health Care 2004;13(1):10–11. 14757792.
3. Friedman B, Basu J. The rate and cost of hospital readmissions for preventable conditions. Med Care Res Rev 2004;61(2):225–240. 15155053.
4. Medicare Payment Advisory Commission. A path to bundled payment around a rehospitalization. Report to the Congress: reforming the delivery system 2008. Washington, DC: Medicare Payment Advisory Commission. p. 83–103.
5. Jencks SF, Williams MV, Coleman EA. Rehospitalizations among patients in the Medicare fee-for-service program. N Engl J Med 2009;360(14):1418–1428. 19339721.
6. Franklin PD, Noetscher CM, Murphy ME, Lagoe RJ. Using data to reduce hospital readmissions. J Nurs Care Qual 1999;14(1):67–85. 10616276.
7. Cullen C, Johnson DS, Cook G. Re-admission rates within 28 days of total hip replacement. Ann R Coll Surg Engl 2006;88(5):475–478. 17002854.
8. Australian Council on Healthcare Standards. Clinical indicators users' manual: hospital wide clinical indicators 2007. Ultimo: Australian Council on Healthcare Standards. p. 4–5.
9. Lakhani A, Olearnik H, Eayres D. Compendium of clinical and health indicators: data definitions and user guide for computer files 2009. London: NHS Information Centre for Health and Social Care. p. 97.
10. Ministry of Health & Welfare. Guidelines for hospital evaluation programme 2009 2009. Seoul: Ministry of Health & Welfare. p. 254. (Korean).
11. McBride S. Agency for Healthcare Research and Quality. Parameters for the appropriate definition of hospital readmissions. Workshop of Agency for Healthcare Research and Quality: using administrative data to answer state policy questions 2008. Rockville: Agency for Healthcare Research and Quality. p. 4–11.
12. Moon OR, Kang SH, Lee EP, Jwa YK, Lee HS. An analysis on the characteristics of high cost patients in the regional medical insurance program. Korean J Health Policy Adm 1993;3(1):53–83. (Korean).
13. Medicare Payment Advisory Commission. Report to the congress: promoting greater efficiency in medicare 2007. Washington, DC: Medicare Payment Advisory Commission. p. 103–120.
14. Lee E, Yu SH, Lee HJ, Kim S. Factors associated with unplanned hospital readmission. Korean J Hosp Manage 2010;15(4):125–142. (Korean).
15. Heggestad T, Lilleeng SE. Measuring readmissions: focus on the time factor. Int J Qual Health Care 2003;15(2):147–154. 12705708.
16. Jasti H, Mortensen EM, Obrosky DS, Kapoor WN, Fine MJ. Causes and risk factors for rehospitalization of patients hospitalized with community-acquired pneumonia. Clin Infect Dis 2008;46(4):550–556. 18194099.
17. Silverstein MD, Qin H, Mercer SQ, Fong J, Haydar Z. Risk factors for 30-day hospital readmission in patients ≥65 years of age. Proc (Bayl Univ Med Cent) 2008;21(4):363–372. 18982076.
18. Rumball-Smith J, Hider P, Graham P. The readmission rate as an indicator of the quality of elective surgical inpatient care for the elderly in New Zealand. N Z Med J 2009;122(1289):32–39. 19305447.
19. Reed RL, Pearlman RA, Buchner DM. Risk factors for early unplanned hospital readmission in the elderly. J Gen Intern Med 1991;6(3):223–228. 2066826.
20. Corrigan JM, Martin JB. Identification of factors associated with hospital readmission and development of a predictive model. Health Serv Res 1992;27(1):81–101. 1563955.
21. Marcantonio ER, McKean S, Goldfinger M, Kleefield S, Yurkofsky M, Brennan TA. Factors associated with unplanned hospital readmission among patients 65 years of age and older in a Medicare managed care plan. Am J Med 1999;107(1):13–17. 10403347.
22. Chu LW, Pei CK. Risk factors for early emergency hospital readmission in elderly medical patients. Gerontology 1999;45(4):220–226. 10394080.
23. Graham H, Livesley B. Can readmissions to a geriatric medical unit be prevented? Lancet 1983;1(8321):404–406. 6130389.
24. Gooding J, Jette AM. Hospital readmissions among the elderly. J Am Geriatr Soc 1985;33(9):595–601. 4031337.
25. Fethke CC, Smith IM, Johnson N. "Risk" factors affecting readmission of the elderly into the health care system. Med Care 1986;24(5):429–437. 3702502.
26. Holloway JJ, Thomas JW, Shapiro L. Clinical and sociodemographic risk factors for readmission of Medicare beneficiaries. Health Care Financ Rev 1988;10(1):27–36. 10312819.
27. Oh HJ, Yu SH. A case-control study of unexpected readmission in a university hospital. Korean J Prev Med 1999;32(3):289–296. (Korean).
28. Smith DM, Giobbie-Hurder A, Weinberger M, Oddone EZ, Henderson WG, Asch DA, et al. Department of Veterans Affairs Cooperative Study Group on Primary Care and Readmissions. Predicting non-elective hospital readmissions: a multi-site study. J Clin Epidemiol 2000;53(11):1113–1118. 11106884.
29. Steyerberg EW, Borsboom GJ, van Houwelingen HC, Eijkemans MJ, Habbema JD. Validation and updating of predictive logistic regression models: a study on sample size and shrinkage. Stat Med 2004;23(16):2567–2586. 15287085.
30. Bellazzi R, Zupan B. Predictive data mining in clinical medicine: current issues and guidelines. Int J Med Inform 2008;77(2):81–97. 17188928.
31. Shepperd S, Doll H, Broad J, Gladman J, Iliffe S, Langhorne P, et al. Early discharge hospital at home. Cochrane Database Syst Rev 2009;(1):CD000356. 19160179.
32. Lermitte J, Chung F. Patient selection in ambulatory surgery. Curr Opin Anaesthesiol 2005;18(6):598–602. 16534298.
33. Lock M, Ray JG. Higher neonatal morbidity after routine early hospital discharge: are we sending newborns home too early? CMAJ 1999;161(3):249–253. 10463045.
34. Terry NR. Physicians are talking about: the culture of defensive medicine 2010. cited 2012 Jul 6. Available from: http://www.medscape.com/viewarticle/718665.
35. Brown J, Grey CS. Stemming the tide of readmissions: patient, practice or practitioner? Rev Clin Gerontol 1998;8:173–181.
36. Burns R, Nichols LO. Factors predicting readmission of older general medicine patients. J Gen Intern Med 1991;6(5):389–393. 1744751.
37. Lee KJ, Lee WI, Kim YM, Seong SS, Chu HS, Lee AS. Korean Society of Quality Assurance in Health Care. Increasing of revenue through reduction of length of stay. Proceedings of the 2005 Conference of Korean Society of Quality Assurance in Health Care 2005. 2005 Nov 10-11; Daegu, Korea. Seongnam: Korean Society of Quality Assurance in Health Care; p. 361–362. (Korean).

## Article information Continued

### Figure 1

Comparison of statistical prediction models by Lift chart. 1Decision tree.

### Figure 2

Comparison of statistical prediction models by receiver operating characteristic curve. 1Decision tree.

### Figure 3

Prediction models in rehospitalization. Values are presented as number (%). No, nonrehospitalization; Yes, rehospitalization; OPD, out-patient department; ER, emergency room; ICD-10, 10th revision of the International Classification of Diseases.

### Table 1.

Demographic characteristics of subjects

| Characteristic | All | Nonrehospitalized | Rehospitalized | p-value (χ² or t-test) |
|---|---|---|---|---|
| **Sex** | | | | |
| Male | 6096 (51.0) | 4133 (48.6) | 1963 (56.9) | <0.001 |
| Female | 5855 (49.0) | 4368 (51.4) | 1487 (43.1) | |
| **Age** | | | | |
| 0-9 | 763 (6.4) | 675 (7.9) | 88 (2.6) | <0.001 |
| 10-19 | 364 (3.0) | 292 (3.4) | 72 (2.1) | |
| 20-29 | 675 (5.6) | 523 (6.2) | 152 (4.4) | |
| 30-39 | 1032 (8.6) | 805 (9.5) | 227 (6.6) | |
| 40-49 | 1784 (14.9) | 1169 (13.8) | 615 (17.8) | |
| 50-59 | 2440 (20.4) | 1596 (18.8) | 844 (24.5) | |
| 60-69 | 2697 (22.6) | 1698 (20.0) | 999 (29.0) | |
| 70+ | 2195 (18.4) | 1742 (20.5) | 453 (13.1) | |
| **Residential regions** | | | | |
| Seoul | 6723 (56.3) | 4991 (58.7) | 1732 (50.2) | <0.001 |
| Gyeonggi/Incheon | 2785 (23.3) | 1914 (22.5) | 871 (25.2) | |
| Others | 2443 (20.4) | 1596 (18.8) | 847 (24.6) | |
| **Total** | 11 951 (100.0) | 8501 (100.0) | 3450 (100.0) | |

Values are presented as number (%).
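As a worked illustration of the χ² comparisons reported in Table 1, the sketch below computes a Pearson χ² statistic for the sex by rehospitalization counts. The article does not say whether a continuity correction was applied, so the uncorrected statistic is an assumption, and `chi_square_2x2` is a hypothetical helper name. For one degree of freedom the p-value can be obtained from the complementary error function in the standard library.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]], with the p-value for 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Counts from Table 1: rows are male and female,
# columns are nonrehospitalized and rehospitalized.
stat, p_value = chi_square_2x2(4133, 1963, 4368, 1487)
print(round(stat, 1), p_value < 0.001)
```

The statistic is far above the 1-degree-of-freedom critical value, which is consistent with the p-value of <0.001 reported in the table.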

### Table 2.

Clinical characteristics of subjects

| Characteristic | All | Nonrehospitalized | Rehospitalized | p-value (χ² or t-test) |
|---|---|---|---|---|
| **Medical departments** | | | | |
| Internal | 7661 (64.1) | 5269 (62.0) | 2392 (69.3) | <0.001 |
| Surgical | 4290 (35.9) | 3232 (38.0) | 1058 (30.7) | |
| **Premium medical treatment** | | | | |
| Yes | 11 118 (93.0) | 7893 (92.8) | 3225 (93.5) | 0.24 |
| No | 833 (7.0) | 608 (7.2) | 225 (6.5) | |
| **ICD-10 disease categories** | | | | |
| Neoplasms | 2173 (18.2) | 986 (11.6) | 1187 (34.4) | <0.001 |
| Endocrine, nutritional and metabolic diseases | 813 (6.8) | 642 (7.6) | 171 (5.0) | |
| Circulatory system | 1248 (10.4) | 1035 (12.2) | 213 (6.2) | |
| Respiratory system | 1240 (10.4) | 1045 (12.3) | 195 (5.7) | |
| Digestive system | 2465 (20.6) | 1589 (18.7) | 876 (25.4) | |
| Musculoskeletal system and connective tissue | 901 (7.5) | 790 (9.3) | 111 (3.2) | |
| Genitourinary system | 1114 (9.3) | 958 (11.3) | 156 (4.5) | |
| Symptoms, clinical findings, etc., elsewhere classified | 1997 (16.7) | 1456 (17.1) | 541 (15.7) | |
| **No. of comorbidities** | 1.5±1.1 | 1.6±1.2 | 1.4±0.9 | <0.001 |
| **No. of accompanying treatments** | 1.4±3.1 | 1.5±3.2 | 1.3±2.9 | <0.001 |
| **Surgery** | | | | |
| Yes | 4451 (37.2) | 3375 (39.7) | 1076 (31.2) | <0.001 |
| No | 7500 (62.8) | 5126 (60.3) | 2374 (68.8) | |
| **Total** | 11 951 (100.0) | 8501 (100.0) | 3450 (100.0) | |

Values are presented as number (%) or mean±SD.

ICD-10, 10th revision of the International Classification of Diseases.

### Table 3.

Health care utilization characteristics of subjects

| Characteristic | All | Nonrehospitalized | Rehospitalized | p-value (χ² or t-test) |
|---|---|---|---|---|
| **Insurance** | | | | |
| National Health Insurance | 11 220 (93.9) | 7994 (94.0) | 3226 (93.5) | 0.002 |
| Medical Aid | 525 (4.4) | 346 (4.1) | 179 (5.2) | |
| Other | 206 (1.7) | 161 (1.9) | 45 (1.3) | |
| **Ward type** | | | | |
| Private room | 1144 (9.6) | 932 (11.0) | 212 (6.1) | <0.001 |
| Multi-bed room | 10 469 (87.6) | 7242 (85.2) | 3227 (93.5) | |
| Intensive care unit | 338 (2.8) | 327 (3.8) | 11 (0.3) | |
| **Route of admission** | | | | |
| Out-patient department | 8368 (70.0) | 5565 (65.5) | 2803 (81.2) | <0.001 |
| Emergency room | 3583 (30.0) | 2936 (34.5) | 647 (18.8) | |
| **Length of stay** | 8.5±11.2 | 9.0±11.3 | 7.5±10.9 | <0.001 |
| **No. of out-patient department visits** | 14.7±20.0 | 11.7±17.5 | 22.0±22.5 | <0.001 |
| **No. of admissions within previous 12 months** | 2.5±3.4 | 1.8±2.2 | 4.4±4.8 | <0.001 |
| **Utilization of emergency room within previous 12 months** | | | | |
| Yes | 1260 (10.5) | 828 (9.7) | 432 (12.5) | <0.001 |
| No | 10 691 (89.5) | 7673 (90.3) | 3018 (87.5) | |
| **Total** | 11 951 (100.0) | 8501 (100.0) | 3450 (100.0) | |

Values are presented as number (%) or mean±SD.
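For the continuous rows of Table 3, a t statistic can be approximated from the published means, standard deviations, and group sizes. The sketch below uses the length-of-stay row and assumes Welch's unequal-variance form with a normal approximation for the p-value, which is reasonable for groups this large. The article does not state which t-test variant was used, the inputs are rounded to one decimal, and `welch_t_from_summary` is a hypothetical helper, so the result is only indicative.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic from summary statistics, with a two-sided p-value
    from the normal approximation (adequate for large samples)."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t = (mean1 - mean2) / standard_error
    p_value = math.erfc(abs(t) / math.sqrt(2))
    return t, p_value


# Length of stay from Table 3: nonrehospitalized vs rehospitalized.
t, p_value = welch_t_from_summary(9.0, 11.3, 8501, 7.5, 10.9, 3450)
print(round(t, 2), p_value < 0.001)
```

The resulting statistic is large enough to be consistent with the p-value of <0.001 reported for length of stay.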

### Table 4.

Comparison of statistical prediction models by root ASE and misclassification rate

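| Measure | Data | Logistic regression | Decision tree | Neural network |
|---|---|---|---|---|
| Root ASE | Training data | 0.385 | 0.373 | 0.384 |
| Root ASE | Validation data | 0.385 | 0.369 | 0.383 |
| Misclassification rate | Training data | 0.214 | 0.180 | 0.214 |
| Misclassification rate | Validation data | 0.217 | 0.177 | 0.211 |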

ASE, asymptotic standard error.