Cohort profile: the National Health Insurance Service-National Health Screening Cohort (NHIS-HEALS) in Korea
  1. Sang Cheol Seong1,
  2. Yeon-Yong Kim2,
  3. Sue K Park3,4,5,
  4. Young Ho Khang6,7,
  5. Hyeon Chang Kim8,
  6. Jong Heon Park2,
  7. Hee-Jin Kang2,
  8. Cheol-Ho Do2,
  9. Jong-Sun Song2,
  10. Eun-Joo Lee2,
  11. Seongjun Ha2,
  12. Soon Ae Shin9,
  13. Seung-Lyeal Jeong2
  1. National Health Insurance Service, Wonju, Korea
  2. Big Data Steering Department, National Health Insurance Service, Wonju, Korea
  3. Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, Korea
  4. Department of Biomedical Science, Seoul National University College of Medicine, Seoul, Korea
  5. Cancer Research Institute, Seoul National University College of Medicine, Seoul, Korea
  6. Department of Health Policy and Management, Seoul National University College of Medicine, Seoul, Korea
  7. Institute of Health Policy and Management, Seoul National University Medical Research Center, Seoul, Korea
  8. Department of Preventive Medicine, Yonsei University College of Medicine, Seoul, Korea
  9. Gwanak Branch, National Health Insurance Service, Seoul, Korea

Correspondence to Seung-Lyeal Jeong; sljeong{at}nhis.or.kr

Abstract

Purpose The National Health Insurance Service-Health Screening Cohort (NHIS-HEALS) is a cohort of individuals who participated in health screening programmes provided by the NHIS in the Republic of Korea. The NHIS constructed the NHIS-HEALS cohort database in 2015. The purpose of this cohort is to offer relevant and useful data for health researchers, especially in the field of non-communicable diseases and health risk factors, and for policy-makers.

Participants To construct the NHIS-HEALS database, a sample cohort was first selected from the 2002 and 2003 health screening participants, who were aged between 40 and 79 in 2002 and followed up through 2013. This cohort included 514 866 health screening participants who comprised a random selection of 10% of all health screening participants in 2002 and 2003.

Findings to date The age-standardised prevalences of anaemia, diabetes mellitus, hypertension, obesity, hypercholesterolaemia and abnormal urine protein were 9.8%, 8.2%, 35.6%, 2.7%, 14.2% and 2.0%, respectively. The age-standardised mortality rate for the first 2 years (through 2004) was 442.0 per 100 000 person-years, while the rate for 10 years (through 2012) was 865.9 per 100 000 person-years. The most common cause of death was malignant neoplasm in both sexes (364.1 per 100 000 person-years for men, 128.3 per 100 000 person-years for women).

Future plans This database can be used to study the risk factors of non-communicable diseases and dental health problems, which are important health issues that have not yet been fully investigated. The cohort will be maintained and continuously updated by the NHIS.

  • Cohort Studies
  • risk factors
  • National Health Programs
  • administrative claims

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/

Strengths and limitations of this study

  • It is a cohort with a large sample size, with a relatively low rate of attrition over more than 10 years.

  • It contains the date and cause of death, which were determined using the national database, as well as extensive information on healthcare usage regarding inpatient and outpatient visits to healthcare institutions and medication histories.

  • Variables on health behaviours are limited since those data were obtained from self-reporting. In addition, the disease diagnoses in the claim data might not accurately reflect patients’ medical conditions.

Introduction

The National Health Insurance Service-Health Screening Cohort (NHIS-HEALS) is a cohort of individuals who participated in health screening programmes provided by the NHIS in the Republic of Korea (hereafter ‘Korea’). The purpose of this cohort is to offer relevant and useful data for a wide range of health researchers.

NHIS-HEALS is based on information obtained through the national health screening programmes of Korea. Since 1995, the NHIS has provided general national health screening programmes, including an oral health screening programme, to improve the health status of Koreans through the prevention and early detection of diseases.1 2 In 2007, a health screening programme for transitional ages, aimed at those aged 40 and 66 years, was also launched.3 NHIS-HEALS incorporates information from these three major health screening programmes for the adult Korean population (see online supplementary figure 1). All insured adults are eligible for the general health screening programme, which is conducted biennially (annually for manual workers). The participation rate in the general health screening programme among the eligible population was 74.8% in 2014.4 The general health screening programme is offered at least once every 2 years to the entire population of Korean adults aged 40 years or older. The healthcare institutions for screening are designated according to the Framework Act on Health Examinations, and must meet the standards for manpower, facilities and equipment.

The NHIS established the National Health Information Database (NHID) in 2011, which incorporates all data from the NHIS and consists of five databases:5 an eligibility database, a national health screening database, a healthcare usage database, a long-term care insurance database and a healthcare provider database. The NHID covers the entire population of Korea (50 million) and thus has proven unwieldy for researchers. The NHIS constructed a representative 2% sample cohort database, the NHIS-National Sample Cohort (NHIS-NSC),6 but the NHIS-NSC did not meet the high demand for research requiring both health screening data and long-term health outcomes. The NHIS therefore constructed the NHIS-HEALS cohort database in 2015 to support a wide range of public research. The NHIS-HEALS has been made publicly available to facilitate wider use of the health screening database, and includes a larger sample of health screening participants than the NHIS-NSC.

Cohort description

The participants of the cohort

The eligibility criteria for the general health screening programme provided by the NHIS varied according to the insurance type of beneficiaries. Employed individuals were eligible at all ages, while the self-employed were eligible if they were the head of household of a family. The dependents of the employed and family members of the self-employed heads of household were eligible only if they were aged 40 years or older. Among the beneficiaries of the medical aid programme, which is a tax-based governmental programme for low-income families that covers approximately 3% of all Koreans, heads of household 19–64 years of age and family members 41–64 years of age were eligible for the general health screening programme. Medical aid beneficiaries have been included in the general health screening programme since 2012.

To construct the NHIS-HEALS database, a sample cohort was first selected from the 2002 and 2003 health screening participants, who were aged between 40 and 79 in 2002 and followed up through 2013. This cohort included 514 866 health screening participants who comprised a 10% simple random sample of all health screening participants in 2002 and 2003. Since only a small proportion of people aged less than 40 participated in the health screening programme, and the response rate was very low among people aged 80 years or older, the NHIS-HEALS was limited to adults aged 40 to 79 years. Gender-specific and age-specific distributions of the cohort population, the source population (all health screening participants) and the overall Korean population are presented in online supplementary table 1.7 Under the current National Health Insurance Act, the data can be used without patients’ individual consent only for research purposes. Nevertheless, identification of individuals is difficult because the sample was drawn from the entire population and the data use deidentified individual keys that were created for the NHIS-HEALS.
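The sampling step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the age restriction (40 to 79 years in 2002) is applied before the 10% simple random sample is drawn; the text does not state the order of these steps. The function names, the seed and the synthetic records are hypothetical and are not part of the NHIS procedure.

```python
import random


def in_age_range(age_in_2002, low=40, high=79):
    """Return True if the participant falls in the cohort age range."""
    return low <= age_in_2002 <= high


def draw_simple_random_sample(participant_ids, fraction=0.10, seed=2002):
    """Draw a simple random sample without replacement.

    Every participant has the same selection probability. The seed is a
    hypothetical choice that only makes the illustration reproducible.
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    sample_size = round(len(ids) * fraction)
    return rng.sample(ids, sample_size)


# Synthetic records (deidentified key, age in 2002); not real NHIS data.
records = [(key, 30 + key % 60) for key in range(10_000)]
eligible = [key for key, age in records if in_age_range(age)]
sample = draw_simple_random_sample(eligible)
```

Because the draw is without replacement, no key can appear twice in `sample`, and the sample size is fixed at one tenth of the eligible source population.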

The general characteristics of the cohort population at baseline are presented in table 1. A total of 54.2% of the participants were men. The number of participants aged 40–44 years was highest among all age groups, accounting for a quarter of the sample (25.2%). A total of 55.3% of the participants lived in non-metropolitan areas, which cover some urban areas and all rural areas. The most common insurance type was health insurance for the employed. A total of 0.6% of the participants had a disability. The biennial screening participation rates ranged from 65.1% to 70.9% during the 2004–2013 period. Of the sample population, 31.6% participated six times in the health screening programmes during the follow-up period. A total of 42.3% of the men and 96.2% of the women were non-smokers. Nearly half of the men (45.7%) drank alcohol more than once per week, while most of the women (82.5%) rarely drank. Of the men, 49.7% did not exercise at least once per week, compared with 67.0% of the women.

Table 1

General characteristics of the National Health Insurance Service-Health Screening Cohort subjects at baseline (2002–2003)

Follow-up interval

The cohort was followed up through 2013. Eligibility information (including death information) and healthcare usage were tracked annually for all participants, whereas health screening information was collected biennially, and only for those who met the eligibility criteria for the screening programme and participated in it. Information on death (date and cause of death) from Statistics Korea was individually linked using unique personal identification numbers. By law, all deaths must be reported to Statistics Korea. Personal information regarding insurance contribution (a proxy for income), residential area and disability status was tracked every year from the eligibility database. The eligibility information was collected from the Public Information Sharing System, National Tax Service and Ministry of Health and Welfare of Korea, and managed by the NHIS, which has 178 regional branches and approximately 13 000 employees across Korea. As the NHIS covers the entire population of Korea, the healthcare usage information included all visits (inpatient, outpatient and pharmacy visits) to healthcare facilities that occurred in Korea. Information about the healthcare facilities was also monitored annually. Regarding the health screening follow-ups, 31.6% of the participants were monitored biennially until 2013, and 93.6% of the participants were examined at least once after a baseline screening. The cohort will be maintained and continuously updated by the NHIS.

The key variables

The key variables of the NHIS-HEALS, which were mainly constructed from the variables of the NHID, are presented in table 2 and online supplementary table 2. The eligibility database included information about income-based insurance contributions (a proxy for income), demographic variables, and date and cause of death. Variables for specific health problems and risk factors from questionnaires (cigarette smoking status/dose/duration, frequency per week and amount per day of alcohol drinking—regardless of the type of alcohol, type and days per week of physical activity, medical history and family history) and bioclinical laboratory results (blood pressure, fasting glucose, lipid profile, haemoglobin, urine stick test, creatinine, liver enzyme, body mass index and waist circumference) were included in the health screening database. Some variables changed during the follow-up period. The healthcare usage database was based on data collected during the process of claiming healthcare services and included information on records of inpatient and outpatient usage (diagnosis, length of stay, treatment costs and services received) and prescription records (drug code, days prescribed and daily dosage). The healthcare provider database included information on types of healthcare institutions, healthcare human resources and equipment.

Table 2

Major variables in the National Health Insurance Service-National Health Screening Cohort database

Findings to date

As the NHIS-HEALS was launched in December 2015, no noteworthy studies have yet been published. However, several studies using the health screening and healthcare usage database of the NHID have been published. Studies have examined the associations of body mass index with cancer risk8 and mortality,9 glucose levels with cancer risk10 and hospitalisation,11 smoking with cancer12 13 and diabetes mellitus,14 physical activities with body mass index15 and cholesterol levels with cancer risk.16 These research results have had positive impacts on health promotion by raising awareness of various public health issues, with an example being the lawsuit against the tobacco industry by the NHIS.17 Because the NHIS-HEALS, unlike the NHID, includes the cause of death, it will provide additional strong evidence regarding the issues that were assessed in previous studies using the NHID.

We herein present the basic statistics of NHIS-HEALS for future data users. We calculated the prevalence rates of various conditions, the incidence density of those conditions, healthcare usage rates and mortality. The rates were age-standardised using the census population of Statistics Korea in 2005 and the world standard population.18 The rates that were standardised using the world standard are presented below.
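Direct age standardisation, as referred to above, weights each age-specific rate by the share of that age group in a reference population. The following is a minimal sketch of that calculation. The age bands, counts and weights are invented for illustration and are neither the 2005 Korean census figures nor the world standard population values; the function name is hypothetical.

```python
def directly_standardised_rate(events, denominators, standard_weights):
    """Directly age-standardised rate.

    events, denominators and standard_weights are dicts keyed by age group.
    Each age-specific rate (events / denominator) is weighted by the share
    of that age group in the standard population.
    """
    total_weight = sum(standard_weights.values())
    rate = 0.0
    for age_group, weight in standard_weights.items():
        age_specific_rate = events[age_group] / denominators[age_group]
        rate += age_specific_rate * weight / total_weight
    return rate


# Illustrative numbers only (not taken from the cohort or any standard).
events = {"40-49": 10, "50-59": 30, "60-69": 60, "70-79": 80}
denominators = {"40-49": 1000, "50-59": 1000, "60-69": 1000, "70-79": 1000}
weights = {"40-49": 40, "50-59": 30, "60-69": 20, "70-79": 10}

crude = sum(events.values()) / sum(denominators.values())
standardised = directly_standardised_rate(events, denominators, weights)
```

In this toy example the standard population is younger than the study population, so the standardised rate falls below the crude rate. This is why rates standardised to different reference populations can differ for the same data.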

Prevalence rates for specific health problems identified from the health screening database at baseline (2002–2003) are presented in table 3. The age-standardised prevalence of anaemia in the NHIS-HEALS was 9.8%, with a higher rate in women (15.5%) than men (5.9%) (p<0.001). The age-standardised prevalence of diabetes mellitus was 8.4%, while the age-standardised prevalence of hypertension in the NHIS-HEALS was 36.1%. The prevalence of diabetes and hypertension was higher in men than women (p<0.001). The age-standardised prevalence of obesity (body mass index of 30 kg/m² or greater) in NHIS-HEALS was 2.7%, while the prevalence of overweight (body mass index of 25 kg/m² or greater, but less than 30 kg/m²) was 31.0%. The age-standardised prevalence of hypercholesterolaemia in the NHIS-HEALS was 14.3%; the rate was higher in women (16.0%) than men (12.4%) (p<0.001). The age-standardised prevalence of abnormal urine protein tests was 2.0%, and the rate was the same (2.0%) in both sexes. When we compared these results with those of the Korean National Health and Nutrition Examination Survey for participants aged 40 or over,19 generally similar levels of prevalence of anaemia, diabetes, hypertension, obesity and hypercholesterolaemia were found.

Table 3

Crude and age-standardised (with the 2005 Korean census and world standard populations as references) prevalence rates (%) for specific health problems in the health screening database of the National Health Insurance Service-National Health Screening Cohort database at baseline, 2002–2003

The incidence density for specific health problems based on information from the health screening database in 2005–2013 is presented in table 4. To identify incident cases, we excluded patients who were previously diagnosed in the first 3 years (2002–2004) of the study period, because the data did not include the baseline information (participants’ screening and healthcare usage records before 2002). With reference to previous studies,20–22 the exclusion period was set as the first 2 years, starting in 2002 (2002–2003) or 2003 (2003–2004). The incidence density was highest for hypertension (4.7 per 100 person-years), followed by anaemia (2.9 per 100 person-years), hypercholesterolaemia (2.6 per 100 person-years), abnormal urine blood (2.3 per 100 person-years) and diabetes mellitus (1.7 per 100 person-years).
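The washout logic described above can be sketched as follows. This is a simplified illustration, not the authors' analysis code: it works in whole calendar years, counts the year of diagnosis as time at risk, and ignores death and loss to follow-up. The field name `first_diagnosis_year` and the example participants are hypothetical.

```python
def incidence_density(participants, washout_end_year, follow_up_end_year):
    """Incident cases per 100 person-years after a washout period.

    Participants first diagnosed on or before washout_end_year are treated
    as prevalent cases and excluded. The remaining participants contribute
    person-years from the year after the washout until the year of first
    diagnosis or the end of follow-up (simplifying assumption: whole
    calendar years, no censoring for death).
    """
    start_year = washout_end_year + 1
    cases = 0
    person_years = 0
    for participant in participants:
        diagnosed = participant["first_diagnosis_year"]
        if diagnosed is not None and diagnosed <= washout_end_year:
            continue  # prevalent case, excluded from the population at risk
        if diagnosed is not None and diagnosed <= follow_up_end_year:
            cases += 1
            person_years += diagnosed - start_year + 1
        else:
            person_years += follow_up_end_year - start_year + 1
    return 100 * cases / person_years


# Invented example participants (not cohort data).
example = [
    {"first_diagnosis_year": 2003},  # excluded by the washout
    {"first_diagnosis_year": 2007},  # incident case, 3 person-years
    {"first_diagnosis_year": None},  # never diagnosed, 9 person-years
    {"first_diagnosis_year": None},  # never diagnosed, 9 person-years
    {"first_diagnosis_year": 2013},  # incident case, 9 person-years
]
density = incidence_density(example, washout_end_year=2004, follow_up_end_year=2013)
```

Here two incident cases occur over 30 person-years, giving roughly 6.7 per 100 person-years.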

Table 4

Crude and age-standardised (with the 2005 Korean census and world standard populations as references) incidence density (per 100 person-years) for specific health problems in the health screening database of the National Health Insurance Service-National Health Screening Cohort database, 2005–2013

The healthcare usage rates of 10 major diseases at baseline based on the healthcare usage database are presented in online supplementary table 3. The rates were highest for acute upper respiratory infections and influenza (46.5%), followed by dyspepsia and other diseases of the stomach and duodenum (29.7%) and other diseases of the eye and adnexa (22.3%).

The mortality rates of the cohort population are presented in table 5, and the survival curve of participants is presented in figure 1. We calculated mortality rates using the entire sample data of NHIS-HEALS from 2003 to 2013. The age-standardised (defined with reference to the Korean census population) mortality rate for the first 2 years (through 2004) was 463.6 per 100 000 person-years, while the rate for 5 years (through 2007) was 678.3 per 100 000 person-years and the rate for 10 years (through 2012) was 910.2 per 100 000 person-years. In men, the mortality rate was higher than in women (2-year mortality rates of 680.4 per 100 000 person-years for men and 250.8 per 100 000 person-years for women) (p<0.001).
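A crude mortality rate per 100 000 person-years, the unit used above, is the number of deaths divided by the total follow-up time. The sketch below shows one way this could be computed from entry and exit dates. The record layout, the function name and the use of 365.25 days per year are assumptions made for illustration; the example follow-up records are invented.

```python
from datetime import date


def crude_mortality_rate(follow_ups, per=100_000):
    """Deaths per `per` person-years.

    follow_ups is an iterable of (entry_date, exit_date, died) tuples, where
    exit_date is the date of death or the end of follow-up. Person-time is
    measured in years of 365.25 days (an assumed convention).
    """
    deaths = 0
    person_years = 0.0
    for entry_date, exit_date, died in follow_ups:
        person_years += (exit_date - entry_date).days / 365.25
        if died:
            deaths += 1
    return per * deaths / person_years


# Invented follow-up records (not cohort data).
follow_ups = [
    (date(2003, 1, 1), date(2007, 1, 1), True),   # died after 4 years
    (date(2003, 1, 1), date(2011, 1, 1), False),  # alive after 8 years
]
rate = crude_mortality_rate(follow_ups)
```

With one death over 12 person-years, the crude rate in this toy example is about 8333 per 100 000 person-years. An age-standardised rate would additionally require age-specific rates and a reference population.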

Table 5

Number of all-cause deaths through 2012 (10 years after baseline) and crude and age-standardised (with the 2005 Korean census and world standard populations as references) mortality rates (per 100 000 person-years) in the National Health Insurance Service-National Health Screening Cohort database

Figure 1

Survival curve of participants by sex in the National Health Insurance Service-National Health Screening Cohort database.

The major causes of death by sex during the follow-up period (2003–2013) are presented in table 6 and figure 2. Causes of death were classified using the list of 56 causes of death of Statistics Korea, which was derived from the list of 80 causes of death for the tabulation of mortality statistics recommended by WHO. The most common cause of death was malignant neoplasm in both sexes (406.6 per 100 000 person-years for men, 140.5 per 100 000 person-years for women). Heart disease was the second most common cause in men (91.0 per 100 000 person-years) and the third most common cause in women (50.8 per 100 000 person-years). Cerebrovascular diseases were the third most common cause in men (89.0 per 100 000 person-years) and the second most common cause in women (64.5 per 100 000 person-years). Suicide was the fourth most common cause overall (31.6 per 100 000 person-years), the fourth most common cause in men (45.6 per 100 000 person-years) and the fifth most common cause in women (16.9 per 100 000 person-years).

Table 6

Cause-specific death rates for leading causes of death (2003–2013) and age-standardised (with the 2005 Korean census and world standard populations as references) mortality rates (per 100 000 person-years) in the National Health Insurance Service-National Health Screening Cohort database

Figure 2

The major 10 causes of death by sex in the cohort sample of the National Health Insurance Service-National Health Screening Cohort database.

Strengths and limitations

The NHIS-HEALS has several strengths. First, it is a cohort with a large sample size (n=514 866), with a relatively low rate of attrition over more than 10 years of follow-up due to the nature of the national administration data. Second, a questionnaire survey, physical examination, dental health screening and clinical laboratory tests were performed for all cohort members. This database can be used to study the risk factors of non-communicable diseases and dental health problems, which are important health issues that have not yet been fully investigated. Third, the NHIS-HEALS contains the date and cause of death, which were determined using the national database for cause of death produced by Statistics Korea; this allows investigations such as burden-of-disease studies. Statistics Korea annually reports cause of death statistics, and a previous study reported the accuracy of the cause of death to be 92%.23 Fourth, the NHIS-HEALS contains extensive information on healthcare usage regarding inpatient and outpatient visits to healthcare institutions and medication histories.

The NHIS-HEALS also has weaknesses. The study subjects are slightly younger than the general population of Korea. Variables on health behaviours are limited since those data were obtained from self-reporting in nationwide health screenings. In addition, the disease diagnosis variables in the healthcare claim data might not accurately reflect patients’ medical conditions, but only healthcare usage sensitive to the Korean fee-for-service payment and reimbursement system.


Footnotes

  • Contributors SCS, SKP, YHK, HCK, SAS, S-LJ contributed to the conception of this article. Y-YK, JHP, C-HD, J-SS were involved in manuscript writing and revision. Y-YK, JHP, H-JK, E-JL, SH were involved in data analysis and interpretation. All authors read and approved the final manuscript.

  • Funding This work was supported by National Health Insurance Service (NHIS) in Korea.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement The data can be accessed on the National Health Insurance Data Sharing Service homepage of the NHIS (http://nhiss.nhis.or.kr). Applications to use the NHIS-HEALS data will be reviewed by the inquiry committee of research support and, once approved, raw data will be provided to the applicant for a fee. Although the data are coded in English and numbers, not in Korean (Hangul), use of individual data is currently allowed only for Korean researchers. Researchers outside the country may nevertheless gain access to the data by conducting a joint study with Korean researchers.