SLC30A8 Rare Variant Modify Contribution of Common Genetic and Lifestyle Factors toward Type 2 Diabetes Mellitus

Article information

Diabetes Metab J. 2026;50(2):385-395
Publication date (electronic) : 2025 August 13
doi : https://doi.org/10.4093/dmj.2024.0830
Division of Genome Science, Department of Precision Medicine, National Institute of Health, Cheongju, Korea
Corresponding authors: Bong-Jo Kim https://orcid.org/0000-0003-3562-2654 Division of Genome Science, Department of Precision Medicine, National Institute of Health, 187 Osongsaengmyeong 2-ro, Osong-eup, Heungdeok-gu, Cheongju 28159, Korea E-mail: kbj6181@korea.kr
Young Jin Kim https://orcid.org/0000-0002-4132-4437 Division of Genome Science, Department of Precision Medicine, National Institute of Health, 187 Osongsaengmyeong 2-ro, Osong-eup, Heungdeok-gu, Cheongju 28159, Korea E-mail: inthistime@korea.kr
Received 2024 December 20; Accepted 2025 May 6.

Abstract

Background

This study aimed to investigate the modifying effects of rare genetic variants on the risk of type 2 diabetes mellitus (T2DM) in the context of common genetic and lifestyle factors.

Methods

We conducted a comprehensive analysis of genetic and lifestyle factors associated with T2DM in a cohort of 146,284 Korean individuals. Among them, 4,603 individuals developed T2DM during the follow-up period of up to 17 years. We calculated a polygenic risk score (PRS) for T2DM and identified carriers of the rare allele I349F at SLC30A8. A Healthy Lifestyle Score (HLS) was also derived from physical activity, obesity, smoking, diet, and sodium intake levels. Using Cox proportional hazards models, we analyzed how PRS, HLS, and I349F influenced T2DM incidence.

Results

High PRS and poor lifestyle were each associated with increased T2DM risk. Notably, I349F carriers exhibited a lower T2DM prevalence than non-carriers (5.7% vs. 11.7%), and among individuals with high PRS, prevalence was 12.70% in carriers compared with 23.18% overall. This trend was consistent across HLS categories, with I349F carriers displaying a lower risk of T2DM.

Conclusion

The integration of common and rare genetic variants with lifestyle factors enhanced T2DM predictability in the Korean population. Our findings highlight the critical role of rare genetic variants in risk assessments and suggest that standard PRS and HLS metrics alone may be inadequate for predicting T2DM risk among carriers of such variants.

Highlights

• Polygenic risk score identified genetically high-risk groups of T2DM.

• SLC30A8 rare-allele carriers showed lower T2DM risk, independent of other risk factors.

• Rare variants may modify the effects of common genetic and lifestyle risk factors.

• Combined rare and common genetic and lifestyle factors showed additive effects.

• Genetic profiling enables T2DM subtyping for personalized interventions.

INTRODUCTION

There is growing concern regarding the global burden of diabetes mellitus, which is a leading cause of mortality and morbidity [1]. Type 2 diabetes mellitus (T2DM), which accounts for the majority of diabetes cases, is influenced by a complex interplay between genetic and environmental factors [1]. Over the past decade, genome-wide association studies (GWASs) focusing on variants with a minor allele frequency (MAF) greater than 1% have identified hundreds of loci associated with T2DM, explaining approximately half of its known heritability [2-4]. Polygenic risk scores (PRS) based on these associations have been used to summarize individual genetic risk by considering the number and effect sizes of risk alleles. Individuals in genetically high-risk groups showed approximately two to three times higher prevalence of T2DM compared to those in the remaining groups [2-6]. However, a recent study has highlighted the significant impact of rare variants (MAF <1%) with substantial genetic effects, resulting in a nearly 50% reduction in the prevalence of T2DM among individuals carrying the rare allele [7]. These studies are expected to contribute to the comprehensive identification of high-risk groups for T2DM and implementation of appropriate interventions [6,7].

In addition to the aforementioned genomic efforts, numerous studies have identified environmental factors associated with T2DM [8]. Recent investigations have proposed the healthy lifestyle score (HLS), which integrates individual risk factors, such as physical activity, dietary habits, smoking status, and alcohol consumption, as a means of assessing an individual’s risk of developing T2DM [9-12]. Individuals with an ‘unfavorable’ lifestyle showed an increased incidence of T2DM compared with the baseline group [10-12]. Subsets of individuals stratified by PRS and HLS showed varying levels of incident T2DM, suggesting an additive contribution of genetic and environmental factors to susceptibility [10-12]. However, despite the marked impact of rare variants on T2DM, previous studies have primarily focused on PRS derived from common variants and their interactions with HLS [7,9-12]. Currently, the integration of PRS, HLS, and rare alleles for T2DM risk stratification is limited because of insufficient large-scale genomic information with comprehensive coverage of rare variants across the human genome.

To explore the potential combined impact of PRS, HLS, and rare variants, we analyzed 146,284 samples from the Korean Genome and Epidemiology Study (KoGES) genotyped using the Korea Biobank Array (KBA). Previous GWASs utilizing KBA have demonstrated varying levels of T2DM prevalence for different combinations of common and rare genetic factors [7]. In this study, we performed association analyses for common and rare variants associated with T2DM and developed models that incorporated PRS, HLS, and the discovered rare variant. Our study shows that the rare allele modifies the contributions of common genetic and lifestyle factors to T2DM and that T2DM risk stratification can be enhanced by integrating PRS, HLS, and a rare allele.

METHODS

Study subjects

This study was approved by the Institutional Review Board of the Korea Disease Control and Prevention Agency, Republic of Korea (2022-03-03-PE-A and 2022-02-09-P-A). An overview of the study population used in this study is provided in Supplementary Fig. 1. In the KoGES, 211,725 participants were recruited from three population-based cohorts: the KoGES Ansan and Ansung study (n=10,030), KoGES Health Examinee study (HEXA, n=173,357), and KoGES Cardiovascular Disease Association Study (CAVAS, n=28,338). KoGES has been described previously [13,14]. Numerous variables, including epidemiological surveys, physical examinations, and laboratory tests, were examined. All participants aged 40 to 70 years provided written informed consent.

The T2DM cases and controls were defined according to the American Diabetes Association criteria. The cases included those with a fasting plasma glucose (FPG) concentration ≥126 mg/dL (7.0 mmol/L), an oral glucose tolerance test (OGTT) value ≥200 mg/dL (11.1 mmol/L), or a glycosylated hemoglobin (HbA1c) level ≥6.5% (48 mmol/mol). Participants who reported T2DM treatment were also included as cases. Controls included subjects without a history of diabetes who satisfied the following criteria: FPG concentration <100 mg/dL (5.6 mmol/L), OGTT <140 mg/dL (7.8 mmol/L), and HbA1c level <6% (42 mmol/mol). OGTT and HbA1c values were used when available. Among the genotyped subjects in the discovery study (n=123,822), 11,087 cases and 86,058 controls were selected for further analyses (Supplementary Table 1).
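The case and control definitions above can be written as a small classification rule. The following is a minimal sketch rather than the authors' code: the function name `classify_t2dm`, its argument names, and the `'excluded'` label are hypothetical, and it assumes that an unavailable OGTT or HbA1c value (passed as `None`) is simply skipped.

```python
from typing import Optional


def classify_t2dm(fpg: float,
                  ogtt: Optional[float] = None,
                  hba1c: Optional[float] = None,
                  treated: bool = False,
                  history: bool = False) -> str:
    """Return 'case', 'control', or 'excluded' (hypothetical labels).

    fpg and ogtt are in mg/dL, hba1c is in %. None means "not measured".
    """
    # Case: reported treatment, or any glycemic threshold reached.
    if (treated
            or fpg >= 126
            or (ogtt is not None and ogtt >= 200)
            or (hba1c is not None and hba1c >= 6.5)):
        return "case"
    # Control: no diabetes history and every available measure below its cut-off.
    if (not history
            and fpg < 100
            and (ogtt is None or ogtt < 140)
            and (hba1c is None or hba1c < 6.0)):
        return "control"
    # Assumption: everyone else (e.g. intermediate glycemia) is left out.
    return "excluded"
```

Under these assumptions, a participant with an FPG of 110 mg/dL and no other information is neither a case nor a control, which is consistent with the case and control counts not summing to the full discovery sample.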

Among all genotyped participants (n=146,284), incident cases of T2DM were identified from those who did not have T2DM at baseline recruitment (Supplementary Table 1). Among 130,590 participants without T2DM at baseline, 4,603 were regarded as incident cases because they met at least one of the following criteria during up to 17 years of follow-up: past diagnosis or prior T2DM treatment, FPG ≥126 mg/dL, OGTT ≥200 mg/dL, or HbA1c ≥6.5%.

Genotyping and quality control

Among KoGES participants, 134,721 samples were genotyped using the KBA, an optimized single-nucleotide polymorphism (SNP) microarray for genome studies in the Korean population [14]. Details of genotyping and quality control have been described previously [7]. Briefly, 123,822 samples, with informed consent at the time of the analysis and non-missing phenotypes, were retained for further analysis after quality-control processes based on the following criteria for each batch grouped by versions (v1.0 and v1.1) of KBA: (1) samples were excluded for gender discrepancy, low call rate (<97%), excessive heterozygosity, 2nd-degree relatedness, or being outliers in principal component analysis; (2) variants were excluded for low call rate (<95%), Hardy-Weinberg equilibrium (HWE) failure (P<10⁻⁶), or low MAF (<1%). As a result, fewer than 550 thousand (K) SNPs were retained for the phasing and imputation analyses. To study rare genotyped variants, 68,431 rare functional autosomal variants (MAF <1%) were used for further analysis after quality control. Among approximately 160 K initial functional variants (missense, frameshift, start/stop gain or loss, splice site donor or acceptor, and structural interaction), rare variants were filtered out for allele frequency discrepancy (>0.5% in comparison with any one of the batches, the 2,579 sequenced Korean samples [7], the 504 East Asian samples from the 1000 Genomes Project Phase 3 [15], and the 9,435 East Asian samples from the gnomAD database [16]), minor allele count <30, HWE failure (P<10⁻⁶), or missing rate (>30%).

Genotype imputation

A pre-phasing-based imputation analysis was performed on the quality-controlled (QCed) data. Eagle v2.3 [17] was used to phase the QCed data, and the phased data were imputed using Impute v4 [18] with a merged reference panel of 2,504 samples from the 1000 Genomes Project Phase 3 [15] and 397 samples from the Korean Reference Genome [14]. The GEN-formatted genotype file output by Impute v4 was converted to the variant call format (VCF) with imputed dosages using GEN2VCF [19]. For further analysis, 8.3 million (M) high-quality imputed common variants were obtained by excluding variants with imputation quality <0.8 or MAF <1%.

Replication study

For the replication study, approximately 24,000 samples from the HEXA cohort, a part of KoGES, were genotyped using the KBA, and 22,462 samples, with informed consent at the time of the analysis and non-missing phenotypes, were retained after quality-control procedures described above [7]. For further analysis, 8.1 M high-quality imputed variants were used after phasing and imputation analysis. The HLS for the replication dataset was calculated using the aforementioned protocol.

Calculation of PRS

PRS was calculated for all individuals analyzed in this study (n=146,284). To construct the PRS for T2DM, adjusted weights were obtained using PRS with continuous shrinkage (PRS-CS) priors [20] with the T2DM GWAS conducted by Biobank Japan [21]. A total of 970,263 HapMap phase 3 variants were used to calculate the PRS using the adjusted weights. The calculated PRS values were standardized by subtracting the mean and then dividing by the standard deviation, yielding scores with zero mean and unit variance. Based on the PRS, individuals were categorized into three groups: low (bottom 20%), intermediate (20%–80%), and high (80%–100%).
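The scaling and grouping steps described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the population standard deviation is an arbitrary choice, ties are broken by sort order, and the input values are toy numbers.

```python
import statistics


def standardize(scores):
    """Z-score: subtract the mean, then divide by the standard deviation."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)  # population SD; an assumption
    return [(s - mean) / sd for s in scores]


def prs_groups(scores):
    """Label each score 'low' (bottom 20%), 'intermediate', or 'high' (top 20%)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, idx in enumerate(order, start=1):
        # Integer comparisons avoid floating-point issues at the cut-offs.
        if 5 * rank <= n:
            labels[idx] = "low"
        elif 5 * rank > 4 * n:
            labels[idx] = "high"
        else:
            labels[idx] = "intermediate"
    return labels


raw = [0.3, -1.2, 2.5, 0.0, 1.1, -0.4, 0.9, -2.0, 1.8, 0.5]  # toy values
z = standardize(raw)
groups = prs_groups(z)
```

Because standardization is a monotone rescaling, the group labels are the same whether they are derived from the raw or the standardized scores.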

Construction of healthy lifestyle score

The HLS was calculated for all individuals analyzed in this study (n=146,284). To measure how healthy a participant’s lifestyle was, five healthy lifestyle-related factors were assessed, drawing on previous literature [9,11,22] and the disease burden owing to high sodium intake in Korea [22,23]. These lifestyle factors were physical activity, obesity, smoking status, healthy dietary patterns, and sodium intake status, as retrieved from the survey questionnaires. Each factor was coded as 1 (healthy lifestyle) if it met the criteria and 0 (unhealthy lifestyle) otherwise. The HLS for each participant was calculated by summing all five factors. An individual was regarded as having a healthy lifestyle for a given factor based on the following criteria: (1) physical activity of at least 30 minutes once a week; (2) body mass index (BMI) <25 kg/m²; (3) current non-smoker; (4) healthy dietary pattern (reaching the daily recommendation in ≥5 out of 9 items; Supplementary Table 2); and (5) daily sodium intake <2 g. Finally, HLS values ranging from 0 to 5 were categorized into three groups: unfavorable (HLS=0–1), intermediate (HLS=2–3), and favorable (HLS=4–5).
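As a concrete reading of the scoring rules above, the sketch below codes each of the five factors as 0 or 1, sums them, and assigns the category. The function and argument names are hypothetical, and the inputs are assumed to have already been derived from the questionnaire.

```python
def healthy_lifestyle_score(active_weekly: bool,
                            bmi: float,
                            current_smoker: bool,
                            diet_items_met: int,
                            sodium_g_per_day: float):
    """Return (score, category) following the five criteria in the text."""
    factors = [
        active_weekly,          # (1) >=30 min of activity at least once a week
        bmi < 25,               # (2) BMI < 25 kg/m^2
        not current_smoker,     # (3) current non-smoker
        diet_items_met >= 5,    # (4) >=5 of 9 dietary items met
        sodium_g_per_day < 2,   # (5) daily sodium intake < 2 g
    ]
    score = sum(1 for met in factors if met)
    if score <= 1:
        category = "unfavorable"
    elif score <= 3:
        category = "intermediate"
    else:
        category = "favorable"
    return score, category
```

For example, a physically active non-smoker with a BMI of 27 kg/m², three dietary items met, and a sodium intake of 3 g/day scores 2 and falls in the intermediate group.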

Statistical analysis

Single-variant association analysis (logistic regression) for T2DM was performed using Hail v0.2.126, assuming an additive mode of inheritance based on alternative allele count and adjusting for age, sex, and recruitment area. Cluster plots of the associated rare variants from single-variant and gene-based test results were visually inspected (solute carrier family 30 member 8 [SLC30A8] variants in Supplementary Fig. 2). An inverse variance-weighted meta-analysis was performed using METAL software (version 2011-03-25) [24] by combining the KBAv1.0 and KBAv1.1 datasets. A locus was defined by clustering associated variants (P<5×10⁻⁸) within a 500-kb range, and the lead signal of each locus was the most significant variant among its clustered variants. Gene-based burden tests for rare functional variants were performed using the Sequence Kernel Association Test and the Optimal unified test (SKAT-O) [25]. A logistic regression model was used to test the association between genetic factors and prevalent T2DM cases, adjusting for age, sex, and recruitment area. A Cox proportional hazards model using the R package ‘survival’ (R Foundation for Statistical Computing, Vienna, Austria) was used to test the association between genetic or lifestyle factors and incident T2DM events, adjusting for age, sex, and recruitment area [26]. To assess the discriminative performance of each prediction model for incident T2DM, we computed the concordance index (C-index) from Cox proportional hazards regression models. Six different models were constructed using combinations of predictors: a PRS, an HLS, and a rare allele. All models included age, sex, and recruitment area as covariates. Time-to-event was defined as the time from baseline to the diagnosis of T2DM, and individuals without T2DM were censored at the end of follow-up.
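To make the inverse variance-weighted step concrete, the sketch below pools per-batch effect estimates with the standard fixed-effect formulas. It is a hand-written illustration, not METAL itself: the function name `ivw_meta` is hypothetical, the P-value uses a normal approximation, and the two input estimates are toy numbers rather than study results.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis.

    Returns (pooled beta, pooled SE, two-sided P from a normal approximation).
    """
    weights = [1.0 / se ** 2 for se in ses]       # w_i = 1 / SE_i^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))        # two-sided normal P-value
    return beta, se, p


# Toy log-odds ratios and standard errors for two batches (not study data).
beta, se, p = ivw_meta([-0.90, -0.70], [0.15, 0.20])
```

The batch with the smaller standard error receives the larger weight, so the pooled estimate lies closer to it.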

RESULTS

GWAS on T2DM in the Korean population

The demographic characteristics of 123,822 samples analyzed in this study are summarized in Supplementary Table 1. Logistic regression analysis was conducted to identify the variants associated with T2DM in the Korean population. Among the 8.3 M high-quality imputed common variants (MAF ≥1%), 44 independent loci were identified at P<5×10⁻⁸ (Supplementary Table 3, Supplementary Fig. 3). These loci were found within a 1-Mb window of previously known loci (Supplementary Table 3) [3,27].

To investigate the contribution of rare variants to T2DM, logistic regression analysis was performed at the single-variant level, and a burden test using SKAT-O was conducted at the gene level [25]. At the genome-wide significance threshold (P<5×10⁻⁸), only the I349F variant in SLC30A8, previously discovered in the Korean population, showed an association with T2DM (odds ratio [OR], 0.40; P=7.30×10⁻¹⁶) (Supplementary Table 4). A gene-based test identified four gene associations (P<1×10⁻⁵) involving PSMB8, PSMB8-AS1, SLC30A8, and TAP1 (Supplementary Table 4). Among the variants used in the gene-based test, four, including I349F, were marginally significant (P<0.05) (Supplementary Table 5). Previously, some rare variants have shown non-independent associations owing to the correlated genetic architecture of nearby common and rare signals [7]. After adjusting for nearby common signals (rs56118007 at the major histocompatibility complex [MHC] and rs13266634 at SLC30A8), only the association with SLC30A8 remained significant (Supplementary Table 4).

To replicate the rare associations identified in the discovery dataset containing 124 K Korean samples, an independent set of 22,462 samples was genotyped to validate the results of the single-variant and burden tests. In the replication dataset, I349F showed a significant association (OR, 0.55; P=1.73×10⁻²) with a consistent direction of effect (Supplementary Table 4). However, the gene-based test in the replication dataset did not yield any significant association, possibly owing to the small number of rare-allele carriers (Supplementary Table 4). Based on the results of the rare variant analysis (Supplementary Tables 4 and 5), the rare variant I349F in SLC30A8, which has a protective effect against T2DM, was selected to define the group of rare-allele carriers. Among the 146,284 individuals, rare-allele carriers had a T2DM prevalence of 5.7%, whereas non-carriers had a prevalence of 11.7% (Supplementary Table 6). For incident T2DM, the incidence was 4.7% in carriers and 7.3% in non-carriers (Supplementary Table 6).

Polygenic prediction of T2DM and its modification by rare allele

PRS were calculated for KoGES participants using summary statistics from an independently conducted GWAS for T2DM at Biobank Japan [21]. The constructed T2DM-PRS showed a strong association with prevalent T2DM cases in 146,284 individuals of the KoGES (OR, 2.06; P=1.65×10⁻¹⁰¹⁷), explaining 8.81% of the variance (Table 1). T2DM prevalence increased as the T2DM-PRS increased (Supplementary Fig. 4). When comparing the group of individuals with the top 1% T2DM-PRS to the median group (40%–60%), the genetically high-risk group showed an approximately six-fold increase in T2DM prevalence (Table 1). The top 5% and 10% conferred 4.2- and 3.7-fold increases in T2DM prevalence, respectively. Among all individuals without T2DM at baseline, T2DM-PRS showed a strong association with incident T2DM cases (hazard ratio [HR], 1.52; P=1.49×10⁻¹⁶⁸) (Table 1). Similar to the results for prevalent T2DM, the top 1% and 5% conferred 2.6- and 2.3-fold increases in T2DM incidence, respectively (Table 1).

Impact of high T2DM-PRS in the Korean population

In a previous study, we demonstrated that the I349F variant at SLC30A8 modified the effect of the common variant-based genetic risk score (GRS), resulting in a decrease in T2DM prevalence by approximately half, regardless of the score [7]. As expected, the modification effect of the rare allele was additive to that of the common variant-based PRS (Fig. 1A). For example, among the samples in the discovery study, the T2DM prevalence in the top quintile of the T2DM-PRS decreased from 23.2% overall to 12.7% in individuals carrying the rare protective allele of SLC30A8, and from 4.6% to 1.8% in the bottom quintile (Supplementary Table 6).
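The size of this reduction can be checked directly from the four prevalence figures quoted above. The short sketch below computes the relative reduction in each quintile; the helper name `relative_reduction` is hypothetical, and only the percentages reported in the text are used.

```python
def relative_reduction(overall_pct: float, carrier_pct: float) -> float:
    """Relative reduction (in %) of carrier prevalence versus the overall figure."""
    return 100.0 * (overall_pct - carrier_pct) / overall_pct


top = relative_reduction(23.2, 12.7)    # top PRS quintile
bottom = relative_reduction(4.6, 1.8)   # bottom PRS quintile
print(f"top quintile: {top:.1f}%, bottom quintile: {bottom:.1f}%")
# prints "top quintile: 45.3%, bottom quintile: 60.9%"
```

Both values are in the neighborhood of the "approximately half" reduction reported previously for the GRS.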

Fig. 1.

Risk stratification of type 2 diabetes mellitus (T2DM) based on genetic and lifestyle factors. (A) Modified effect of T2DM-polygenic risk score (PRS) by protective rare allele. After sorting the T2DM-PRS scores in increasing order, the PRS bins were categorized as 1st bin (1%–20%), 2nd bin (21%–80%), and 3rd bin (81%–100%). For rare-allele carriers and non-carriers, all samples were categorized into three T2DM-PRS bins, and the prevalence of T2DM was calculated separately for rare-allele carriers and non-carriers. (B) Risk of future T2DM according to genetic and lifestyle risk factors. Associations among T2DM-PRS, healthy lifestyle score, and incident T2DM. The analyses were adjusted for age, sex, and the area of recruitment. HR, hazard ratio; CI, confidence interval.

The T2DM-PRS and rare-allele subsets were examined further to assess their ability to predict future T2DM. As shown in Supplementary Table 7 and Supplementary Fig. 5, participants with intermediate and high T2DM-PRS had HRs of 1.83 (P=4.30×10⁻³⁸) and 3.16 (P=7.86×10⁻¹¹³) for incident T2DM, respectively, compared to individuals in the bottom 20%. When considering the presence of the protective rare allele in addition to T2DM-PRS, non-carriers with intermediate and high PRS had HRs of 1.84 (P=7.07×10⁻³⁹) and 3.17 (P=1.09×10⁻¹¹²) for incident T2DM, respectively, compared to the bottom 20%. Meanwhile, carriers with intermediate and high PRS had HRs of 0.84 (P=5.01×10⁻¹) and 2.60 (P=2.59×10⁻⁴) for incident T2DM, respectively, compared to individuals in the bottom 20% of T2DM-PRS.

Contribution of a healthy lifestyle to T2DM

Among the 146,284 individuals, 86.32%, 65.88%, 35.71%, 45.34%, and 39.65% met the healthy-lifestyle criteria for current smoking status, obesity, healthy diet, physical activity, and sodium-intake status, respectively (Supplementary Table 8). All five healthy-lifestyle factors showed protective effects against incident T2DM, and all except sodium intake were statistically significant (P<0.05) (Supplementary Table 8). These results suggest that these lifestyle factors could be adequate predictors of future T2DM.

Based on the calculated HLS, 35,743 (24.43%), 91,607 (62.62%), and 18,934 (12.94%) individuals were in the favorable, intermediate, and unfavorable groups, respectively (Table 2). The HLS-unfavorable group showed an approximately 2.5-fold increase in incident T2DM compared to the favorable group (HR, 2.48; P=6.31×10⁻⁵¹) (Table 2). The HLS-intermediate group showed a less pronounced increase in incident T2DM (HR, 1.45; P=8.04×10⁻¹⁶).

Combinatorial effect of genetic factors and HLS on future T2DM

To assess the combined effect of PRS, rare allele, and HLS, we stratified all 146,284 individuals into groups defined by the three PRS groups, the presence of the rare allele, and the three HLS groups. However, combining all three factors left fewer than five individuals in the groups of rare-allele carriers with an unfavorable lifestyle and high PRS. Therefore, we focused only on the combinatorial effects of PRS-HLS and HLS-rare allele on future T2DM.

Compared with the reference (low PRS and favorable HLS), among 130,590 individuals without T2DM at baseline, incident T2DM risk increased as the risk category of either PRS or HLS increased (Table 3, Fig. 1B). For example, unfavorable HLS showed increased T2DM risk over the intermediate group, regardless of the PRS group. PRS and HLS increased T2DM risk in a roughly additive manner. Moreover, individuals with an unfavorable lifestyle and the top 20% of T2DM-PRS had the highest risk of incident T2DM (HR, 9.48; P=1.39×10⁻⁴³) (Table 3). Similar patterns were also observed for prevalent T2DM (Supplementary Fig. 6). These results were consistent with those of previously conducted studies [10,28].

Predictability of T2DM-PRS and HLS for future T2DM

Studying the combinatorial effect of HLS and the presence of the rare allele was limited by the small number of incident T2DM cases in the stratified groups. With non-carriers with a favorable HLS as the baseline group, non-carriers and carriers in the unfavorable and intermediate-HLS groups had HRs of 1.59 (P=7.88×10⁻²⁵) and 0.84 (P=3.97×10⁻¹), respectively, for incident T2DM (Table 4). Although the result for rare-allele carriers was not statistically significant, protective rare-allele carriers showed a decreased risk of incident T2DM compared with the baseline group. Fig. 2 shows that individuals carrying the rare protective allele within the unfavorable or intermediate-HLS group exhibited survival rates similar to those of non-carriers within the favorable-HLS group. When these analytical schemes were applied to prevalent T2DM cases, the combination of HLS and the presence of the rare allele showed patterns similar to those of the HLS and PRS relationship, with the two factors acting additively (P<1.02×10⁻⁶) (Table 4). Rare-allele carriers showed a decreased T2DM risk compared to non-carriers, regardless of the HLS group (Supplementary Table 9, Supplementary Fig. 7). For instance, non-carriers and carriers with intermediate HLS had ORs of 1.18 (P=8.03×10⁻¹²) and 0.47 (P=6.28×10⁻⁷) for prevalent T2DM, respectively, compared to the baseline group of non-carriers with a favorable HLS (Supplementary Table 9).

Predictability of HLS and I349F rare allele for future T2DM

Fig. 2.

Survival rate of incident type 2 diabetes mellitus (T2DM) according to genetic and lifestyle risk factors. Survival rate of incident T2DM, stratified by (A) T2DM-polygenic risk score (PRS) (bottom 20%, intermediate 20%–80%, and top 20%), (B) healthy lifestyle score (HLS) (favorable, intermediate, and unfavorable), (C) HLS×rare allele combination (favorable HLS and non-carrier; unfavorable or intermediate HLS and non-carrier; and unfavorable or intermediate HLS and carrier), and (D) T2DM-PRS and HLS combination (bottom 20% PRS and favorable HLS; intermediate level in both PRS and HLS; and top 20% PRS and unfavorable HLS).

In addition to evaluating the combinatorial effects of the I349F variant, PRS, and lifestyle factors as independent factors, we performed an expanded interaction analysis using cross-product terms in both logistic (for prevalent T2DM) and Cox proportional hazards (for incident T2DM) models (Supplementary Table 10). Overall, no significant interaction involving the I349F variant was observed for either prevalent or incident T2DM. By contrast, we found a significant interaction between PRS and HLS (P<6.77×10⁻⁴), suggesting that genetic risk and overall lifestyle factors may jointly modify T2DM risk. When each lifestyle component was examined separately, additional significant interactions emerged between PRS and obesity, as well as between PRS and sodium intake (P<9.85×10⁻³), in both the prevalent and incident analyses. Notably, sodium intake alone was not associated with T2DM (Supplementary Table 8), which highlights the possibility that an individual’s genetic background may heighten the impact of sodium consumption on disease risk. We also observed a marginally significant PRS×healthy diet interaction in prevalent T2DM (P=2.83×10⁻²), but this finding was not replicated in incident T2DM, indicating potential confounding or limited statistical power in cross-sectional comparisons.

When the linear model was constructed based on HLS, PRS, and rare allele, the combination of these factors showed increased predictability for incident T2DM over the baseline model that consisted of age and sex. The C-index values were 0.67, 0.64, 0.62, and 0.69 for the PRS, HLS, rare allele, and combined models, respectively, compared with 0.59 for the baseline model.
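For readers unfamiliar with the C-index, the sketch below shows one common way to compute it (Harrell's definition) for right-censored data. It is a simplified, hand-written illustration and not the R `survival` implementation used in the study: the function name is hypothetical, pairs with tied follow-up times are skipped, and ties in the risk score count as half-concordant.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times: follow-up time per individual
    events: 1 if T2DM occurred at that time, 0 if censored
    risk_scores: higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had the higher score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    # Assumes at least one comparable pair exists.
    return concordant / comparable
```

A value of 0.5 corresponds to uninformative predictions and 1.0 to perfect ranking, which is the scale on which the reported values of 0.59 to 0.69 should be read.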

DISCUSSION

Our study offers valuable insights into the relationship between genetic and lifestyle factors and the risk of T2DM in the Korean population. We identified significant genetic variants, including common and rare alleles, that contribute to this risk. Additionally, we demonstrated the crucial role of a healthy lifestyle in reducing the incidence of T2DM. Moreover, we highlighted how the rare protective allele modifies the effects of genetic and lifestyle factors related to T2DM, emphasizing the importance of considering various aspects of genetics and lifestyle in predictive models.

To the best of our knowledge, this is the first study to report the modification effect of a rare variant on common genetic and lifestyle factors related to incident T2DM. We showed that the rare variant has a substantial impact on T2DM in carriers of the rare allele. Furthermore, the effects of HLS were also modified by the rare protective allele. Although recent large-scale sequencing studies by Cao et al. [29] and Halldorsson et al. [30] have demonstrated that more than 94% of variants in the human genome are rare (MAF <1%), studies of rare variants are still in their early stages owing to the limited availability of large-scale sequencing data. Considering the vast number of rare variants in the human genome, studying these variants is crucial for accurately assessing the risk of various diseases. Our findings suggest that the rare allele carriers in the intermediate (20%–80%) and high (80%–100%) T2DM-PRS groups are at an elevated risk of developing T2DM, even though the number of T2DM cases among these carriers is relatively small (Supplementary Table 7). The low frequency of these rare variants inherently leads to fewer events, yet the statistically significant associations observed underscore their potential impact on disease risk. Moreover, our supplementary analysis (Supplementary Table 7) demonstrated consistent results when including prevalent T2DM cases, further supporting the robustness of these findings.

Herein, we have provided insights into the risk stratification of T2DM and offered clues for personalized treatment based on individual genetics and lifestyles. Fig. 2 illustrates the survival rate of incident T2DM based on Cox regression models for the risk factors analyzed in this study. Overall, individuals who do not carry the rare protective allele, have an unfavorable lifestyle, and have high T2DM-PRS may develop T2DM at an early age. Based on an individual’s specific risk profile, personalized treatment is possible by implementing appropriate lifestyle interventions for individuals with unfavorable lifestyles whose genetic risk falls below the 80th percentile of T2DM-PRS (i.e., outside the high-PRS group). Conversely, routine screening of individuals with high genetic risk should be conducted regardless of their lifestyle. Furthermore, the stratification strategy employed in this study complements recent efforts in subtyping T2DM. Ahlqvist et al. [31] demonstrated that clustering patients using six clinical parameters identified five distinct subtypes with different clinical characteristics, progression rates, and risk of complications of T2DM. With growing attention to a new era of subclassification analysis for T2DM, our study underscores the added value of integrating common genetic factors (via PRS) and rare alleles, such as the I349F variant, with healthy lifestyle factors (HLS). While Ahlqvist et al. [31] used a clustering approach based on diverse clinical parameters to identify diabetes subtypes, our findings suggest that incorporating genetic risk information, including both common and rare variants that have not been extensively studied previously, can further refine subclassification. By merging detailed clinical phenotyping with genetic insights, we can achieve a more nuanced understanding of the disease’s heterogeneity, thereby paving the way for more personalized and effective therapeutic strategies.

Our expanded interaction analysis yielded several notable findings. First, the I349F variant in SLC30A8 did not show a statistically significant interaction with either PRS or HLS, suggesting that although this rare allele exerts a protective effect and acts additively with the risks conferred by genetic and lifestyle factors, it may not substantially interact with overall genetic burden or lifestyle factors. Second, we detected significant PRS×HLS interactions, implying that individuals with higher genetic risk may be more susceptible or responsive to their aggregate lifestyle behaviors. Moreover, when we explored individual lifestyle components, obesity and sodium intake emerged as important effect modifiers of PRS for T2DM risk. Interestingly, sodium intake alone was not independently associated with T2DM (Supplementary Table 8), but the gene–sodium interaction was robust, pointing toward a potential synergy between genetic predisposition and sodium consumption patterns. These findings partially align with previous studies, which have similarly reported modest interactions between genetic factors and specific lifestyle components [12]. However, the absence of strong, consistent interactions across all lifestyle measures underscores the complexity of gene–environment interplay [32]. Furthermore, the marginal PRS×healthy diet interaction we observed in prevalent T2DM did not persist in incident analyses, suggesting that cross-sectional data might be confounded by reverse causation (i.e., individuals changing their diets post-diagnosis) or simply reflect insufficient statistical power. Additional large-scale, longitudinal studies with repeated measures of diet and other lifestyle factors may help clarify the full extent of these gene–lifestyle interactions and their role in T2DM pathogenesis.

In our study, the incremental C-index associated with the PRS was higher than that of the HLS, indicating a greater contribution of genetic factors to the predictive performance of our models. Similar observations have been reported in previous studies. For example, Schnurr et al. [11] evaluated the incremental predictive value of a genetic risk score (GRS) and a lifestyle score for T2DM. When added to a base model including age, sex, and BMI, the GRS increased the area under the curve by 1.04, whereas the lifestyle score led to an increase of 0.24. These findings are consistent with our results and suggest that genetic factors may offer a greater marginal improvement in risk prediction performance than lifestyle factors, at least in the context of models that already include demographic and anthropometric variables.
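
To make the notion of an incremental C-index concrete, the following is a minimal pure-Python sketch, assuming Harrell's concordance index for right-censored data (the text does not state which C-index variant was used). The toy follow-up times, event indicators, and risk scores are invented for illustration and are not study data.

```python
def harrell_c(times, events, risk):
    """Harrell's C: share of comparable pairs in which the subject with the
    earlier observed event also has the higher predicted risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only usable if the earlier time is an event
        for j in range(n):
            if times[j] > times[i]:  # subject j was still event-free at times[i]
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def incremental_c(times, events, base_risk, full_risk):
    """Gain in C-index when moving from a base model to an extended model."""
    return harrell_c(times, events, full_risk) - harrell_c(times, events, base_risk)


# Toy example (hypothetical values): 1 = incident T2DM, 0 = censored.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_base = [3, 5, 1, 2, 4]   # e.g. risk scores from a covariate-only model
toy_full = [5, 4, 3, 2, 1]   # e.g. risk scores after adding a further predictor
toy_gain = incremental_c(toy_times, toy_events, toy_base, toy_full)
```

Comparing the gain obtained by adding the PRS with the gain obtained by adding the HLS to the same base model gives the kind of contrast described in this paragraph.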

We acknowledge that there are some limitations to this study. First, the sample size for the replication analysis of rare variants and the assessment of combinatorial effects was relatively small, which may have limited the statistical power to detect significant associations. We recognize that caution must be taken in interpreting results given the relatively small number of cases among I349F carriers and the low frequency of the rare variant. Additional sequencing efforts in diverse populations are needed to validate these observations and fully elucidate the role of rare variants in T2DM susceptibility. Second, the lifestyle factors assessed in our study were self-reported and are subject to potential bias. The inherent limitations of survey-based questionnaires contribute to the potential misclassification or underestimation of healthy lifestyle behaviors. Third, baseline lifestyle may not accurately reflect participants' long-term lifestyle patterns. Indeed, many people adopt healthier behaviors (e.g., dietary changes or increased physical activity) following a diagnosis, which can mask or reverse the expected protective association in a cross-sectional context. For example, well-known lifestyle factors, including healthy diet and physical activity, were not associated with prevalent T2DM, whereas obesity, which is known to be strongly related to healthy diet and physical activity, was strongly associated with T2DM (Supplementary Table 8). These results imply possible confounding from changes in personal lifestyle in response to health status, clinical diagnosis, or underlying diseases. Although incident T2DM was clearly associated with healthy lifestyle components (Supplementary Table 8), our results do not reflect changes over time or long-term lifestyle patterns. Finally, considering the population-specific nature of rare variants, the results of this study may not be directly applicable to other populations. The rare variant discovered at SLC30A8 was found to be polymorphic in East Asians, yet monomorphic in other populations.

In conclusion, our findings highlight the importance of genetic variants, including common and rare alleles, in the development of T2DM in the Korean population. Combining common and rare genetic factors with lifestyle factors improves the prediction of T2DM. These results contribute to our understanding of the etiology of T2DM and may have implications for personalized prevention and intervention strategies for individuals at risk of developing this disease. By considering the interplay between genetics and lifestyle, we can better identify individuals who may benefit from targeted interventions and tailored treatment options.

SUPPLEMENTARY MATERIALS

Supplementary materials related to this article can be found online at https://doi.org/10.4093/dmj.2024.0830.

Supplementary Table 1.

Demographic characteristics of study samples

dmj-2024-0830-Supplementary-Table-1.pdf
Supplementary Table 2.

Healthy dietary items

dmj-2024-0830-Supplementary-Table-2.pdf
Supplementary Table 3.

Common variants associated with type 2 diabetes mellitus in the discovery stage

dmj-2024-0830-Supplementary-Table-3.pdf
Supplementary Table 4.

Rare variant association results

dmj-2024-0830-Supplementary-Table-4.pdf
Supplementary Table 5.

Single variant associations (genes from burden test)

dmj-2024-0830-Supplementary-Table-5.pdf
Supplementary Table 6.

T2DM-PRS and its modification by protective rare allele

dmj-2024-0830-Supplementary-Table-6.pdf
Supplementary Table 7.

Predictability of T2DM-PRS and rare allele for future T2DM

dmj-2024-0830-Supplementary-Table-7.pdf
Supplementary Table 8.

Impact of individual lifestyle factors on T2DM

dmj-2024-0830-Supplementary-Table-8.pdf
Supplementary Table 9.

Combinatorial effect of HLS and rare allele on prevalent T2DM

dmj-2024-0830-Supplementary-Table-9.pdf
Supplementary Table 10.

Interaction effect of genetic and lifestyle factors

dmj-2024-0830-Supplementary-Table-10.pdf
Supplementary Fig. 1.

Overview of study population and analysis workflow. ASAS, Ansung and Ansan; HEXA, Health Examinee Study; CAVAS, Cardiovascular Disease Association Study; PCA, principal component analysis; T2DM, type 2 diabetes mellitus.

dmj-2024-0830-Supplementary-Fig-1.pdf
Supplementary Fig. 2.

Cluster plots of rare variants at solute carrier family 30 member 8 (SLC30A8). (A) 8:118174092_A/C (S320R) discovery study, (B) 8:118174092_A/C (S320R) replication study, (C) 8:118184855_A/T (I349F) discovery study, and (D) 8:118184855_A/T (I349F) replication study.

dmj-2024-0830-Supplementary-Fig-2.pdf
Supplementary Fig. 3.

Manhattan plot of type 2 diabetes mellitus (T2DM) genome-wide association study. Manhattan plot shows logistic regression analysis results of common variants for T2DM. Red horizontal line indicates –log10(5e-8). Blue and orange colors indicate different chromosomes.

dmj-2024-0830-Supplementary-Fig-3.pdf
Supplementary Fig. 4.

Prevalence of type 2 diabetes mellitus (T2DM) by T2DM-polygenic risk score (PRS) group. Samples were grouped into 30 groups based on PRS scores in an increasing order. For each PRS bin, T2DM prevalence was calculated as number of T2DM samples divided by number of samples in the PRS bin.

dmj-2024-0830-Supplementary-Fig-4.pdf
Supplementary Fig. 5.

Combinatorial effect of polygenic risk score (PRS) and I349F rare allele on type 2 diabetes mellitus (T2DM). Bar plots depict the odds ratios (ORs) (A: prevalent T2DM) and hazard ratios (HRs) (B: incident T2DM) across PRS groups (0%–20%, 20%–80%, 80%–100%) and I349F carrier status (non-carrier in orange, carrier in sky blue). The reference group (0%–20% PRS) is set to 1. Error bars represent 95% confidence intervals (CIs). Data are from Supplementary Table 7.

dmj-2024-0830-Supplementary-Fig-5.pdf
Supplementary Fig. 6.

Risk of prevalent type 2 diabetes mellitus (T2DM), according to genetic and lifestyle risk. Association of T2DM-polygenic risk score and healthy lifestyle score with prevalent T2DM. Analyses were adjusted for age, sex, and recruitment area. OR, odds ratio; CI, confidence interval.

dmj-2024-0830-Supplementary-Fig-6.pdf
Supplementary Fig. 7.

Combinatorial effect of healthy lifestyle score (HLS) and I349F rare allele on type 2 diabetes mellitus (T2DM). Bar plots depict the odds ratios (ORs) (A: prevalent T2DM) and hazard ratios (HRs) (B: incident T2DM) across HLS groups (favorable, intermediate, unfavorable) and I349F carrier status (non-carrier in orange, carrier in sky blue). The reference group (favorable HLS and non-carrier) is set to 1. Error bars represent 95% confidence intervals (CIs). Data are from Supplementary Table 9.

dmj-2024-0830-Supplementary-Fig-7.pdf

Notes

CONFLICTS OF INTEREST

No potential conflict of interest relevant to this article was reported.

AUTHOR CONTRIBUTIONS

Conception or design: B.J.K., Y.J.K.

Acquisition, analysis, or interpretation of data: H.M.J., M.Y.H., Y.S.P., Y.J.K.

Drafting the work or revising: H.M.J., B.J.K., Y.J.K.

Final approval of the manuscript: all authors.

FUNDING

This study was supported by intramural grants from the National Institute of Health, Republic of Korea (grant numbers 2019-NI-097-02, 2022-NI-065-01, and 2022-NI-067-01).

ACKNOWLEDGMENTS

The Korea Biobank Array (KBA) data were provided by the Collaborative Genome Program for Fostering New Post-Genome Industry (3000-3031b).

The summary-level results generated in this study are available on the KNIH (Korea National Institute of Health) PheWeb website (https://coda.nih.go.kr/usab/pheweb/intro.do).

References

1. Lin X, Xu Y, Pan X, Xu J, Ding Y, Sun X, et al. Global, regional, and national burden and trend of diabetes in 195 countries and territories: an analysis from 1990 to 2025. Sci Rep 2020;10:14790.
2. Mahajan A, Taliun D, Thurner M, Robertson NR, Torres JM, Rayner NW, et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat Genet 2018;50:1505–13.
3. Spracklen CN, Horikoshi M, Kim YJ, Lin K, Bragg F, Moon S, et al. Identification of type 2 diabetes loci in 433,540 East Asian individuals. Nature 2020;582:240–5.
4. Mahajan A, Spracklen CN, Zhang W, Ng MC, Petty LE, Kitajima H, et al. Multi-ancestry genetic study of type 2 diabetes highlights the power of diverse populations for discovery and translation. Nat Genet 2022;54:560–72.
5. Khera AV, Chaffin M, Aragam KG, Haas ME, Roselli C, Choi SH, et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet 2018;50:1219–24.
6. Torkamani A, Wineinger NE, Topol EJ. The personal and clinical utility of polygenic risk scores. Nat Rev Genet 2018;19:581–90.
7. Kim YJ, Moon S, Hwang MY, Han S, Jang HM, Kong J, et al. The contribution of common and rare genetic variants to variation in metabolic traits in 288,137 East Asians. Nat Commun 2022;13:6642.
8. Ismail L, Materwala H, Al Kaabi J. Association of risk factors with type 2 diabetes: a systematic review. Comput Struct Biotechnol J 2021;19:1759–85.
9. Khera AV, Emdin CA, Drake I, Natarajan P, Bick AG, Cook NR, et al. Genetic risk, adherence to a healthy lifestyle, and coronary disease. N Engl J Med 2016;375:2349–58.
10. Said MA, Verweij N, van der Harst P. Associations of combined genetic and lifestyle risks with incident cardiovascular disease and diabetes in the UK biobank study. JAMA Cardiol 2018;3:693–702.
11. Schnurr TM, Jakupovic H, Carrasquilla GD, Angquist L, Grarup N, Sorensen TI, et al. Obesity, unfavourable lifestyle and genetic risk of type 2 diabetes: a case-cohort study. Diabetologia 2020;63:1324–32.
12. Li H, Khor CC, Fan J, Lv J, Yu C, Guo Y, et al. Genetic risk, adherence to a healthy lifestyle, and type 2 diabetes risk among 550,000 Chinese adults: results from 2 independent Asian cohorts. Am J Clin Nutr 2020;111:698–707.
13. Kim Y, Han BG; KoGES group. Cohort profile: the Korean Genome and Epidemiology Study (KoGES) consortium. Int J Epidemiol 2017;46:e20.
14. Moon S, Kim YJ, Han S, Hwang MY, Shin DM, Park MY, et al. The Korea Biobank Array: design and identification of coding variants associated with blood biochemical traits. Sci Rep 2019;9:1382.
15. 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A global reference for human genetic variation. Nature 2015;526:68–74.
16. Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alfoldi J, Wang Q, et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 2020;581:434–43.
17. Loh PR, Palamara PF, Price AL. Fast and accurate long-range phasing in a UK Biobank cohort. Nat Genet 2016;48:811–6.
18. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 2018;562:203–9.
19. Shin DM, Hwang MY, Kim BJ, Ryu KH, Kim YJ. GEN2VCF: a converter for human genome imputation output format to VCF format. Genes Genomics 2020;42:1163–8.
20. Ge T, Chen CY, Ni Y, Feng YA, Smoller JW. Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nat Commun 2019;10:1776.
21. Suzuki K, Akiyama M, Ishigaki K, Kanai M, Hosoe J, Shojima N, et al. Identification of 28 new susceptibility loci for type 2 diabetes in the Japanese population. Nat Genet 2019;51:379–86.
22. Park CY, Jo G, Lee J, Singh GM, Lee JT, Shin MJ. Association between dietary sodium intake and disease burden and mortality in Koreans between 1998 and 2016: the Korea National Health and Nutrition Examination Survey. Nutr Res Pract 2020;14:501–18.
23. Park HK, Lee Y, Kang BW, Kwon KI, Kim JW, Kwon OS, et al. Progress on sodium reduction in South Korea. BMJ Glob Health 2020;5(5)
24. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 2010;26:2190–1.
25. Lee S, Emond MJ, Bamshad MJ, Barnes KC, Rieder MJ, Nickerson DA, et al. Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. Am J Hum Genet 2012;91:224–37.
26. Therneau TM, Grambsch PM. Modeling survival data: extending the Cox model. New York: Springer; 2000.
27. Vujkovic M, Keaton JM, Lynch JA, Miller DR, Zhou J, Tcheandjieu C, et al. Discovery of 318 new risk loci for type 2 diabetes and related vascular outcomes among 1.4 million participants in a multi-ancestry meta-analysis. Nat Genet 2020;52:680–91.
28. Langenberg C, Sharp SJ, Franks PW, Scott RA, Deloukas P, Forouhi NG, et al. Gene-lifestyle interaction and type 2 diabetes: the EPIC interact case-cohort study. PLoS Med 2014;11:e1001647.
29. Cao Y, Li L, Xu M, Feng Z, Sun X, Lu J, et al. The ChinaMAP analytics of deep whole genome sequences in 10,588 individuals. Cell Res 2020;30:717–31.
30. Halldorsson BV, Eggertsson HP, Moore KH, Hauswedell H, Eiriksson O, Ulfarsson MO, et al. The sequences of 150,119 genomes in the UK Biobank. Nature 2022;607:732–40.
31. Ahlqvist E, Prasad RB, Groop L. Subtypes of type 2 diabetes determined from clinical parameters. Diabetes 2020;69:2086–93.
32. Zheng Y, Ley SH, Hu FB. Global aetiology and epidemiology of type 2 diabetes mellitus and its complications. Nat Rev Endocrinol 2018;14:88–98.

Fig. 1.

Risk stratification of type 2 diabetes mellitus (T2DM) based on genetic and lifestyle factors. (A) Modified effect of T2DM-polygenic risk score (PRS) by protective rare allele. After sorting the T2DM-PRS scores in increasing order, the PRS bins were categorized as 1st bin (1%–20%), 2nd bin (21%–80%), and 3rd bin (81%–100%). For rare-allele carriers and non-carriers, all samples were categorized into three T2DM-PRS bins, and the prevalence of T2DM was calculated separately for rare-allele carriers and non-carriers. (B) Risk of future T2DM according to genetic and lifestyle risk factors. Associations among T2DM-PRS, healthy lifestyle score, and incident T2DM. The analyses were adjusted for age, sex, and the area of recruitment. HR, hazard ratio; CI, confidence interval.

Fig. 2.

Survival rate of incident type 2 diabetes mellitus (T2DM) according to genetic and lifestyle risk factors. Survival rate of incident T2DM, stratified by (A) T2DM-polygenic risk score (PRS) (bottom 20%, intermediate 20%–80%, and top 20%), (B) healthy lifestyle score (HLS) (favorable, intermediate, and unfavorable), (C) HLS×rare allele combination (favorable HLS and non-carrier; unfavorable or intermediate HLS and non-carrier; and unfavorable or intermediate HLS and carrier), and (D) T2DM-PRS and HLS combination (bottom 20% PRS and favorable HLS; intermediate level in both PRS and HLS; and top 20% PRS and unfavorable HLS).

Table 1.

Impact of high T2DM-PRS in the Korean population

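| PRS | Prevalent T2DM: No. of cases | Prevalent T2DM: No. of controls | Prevalent T2DM: Odds ratio | Prevalent T2DM: P value | Prevalent T2DM: 95% CI | Prevalent T2DM: Pseudo R² | Incident T2DM: No. of cases | Incident T2DM: Hazard ratio | Incident T2DM: P value | Incident T2DM: 95% CI |
|---|---|---|---|---|---|---|---|---|---|---|
| T2DM-PRS | 13,220 | 100,453 | 2.06 | 1.65E-1017 | 2.01–2.10 | 8.81% | 4,603 | 1.52 | 1.49E-168 | 1.48–1.57 |
| High risk group/Reference group | | | | | | | | | | |
| Top 20%/Remaining 80% | 4,920 | 16,407 | 3.34 | 1.27E-719 | 3.21–3.48 | 5.53% | 1,300 | 1.97 | 2.42E-95 | 1.85–2.10 |
| Top 10%/Remaining 90% | 2,849 | 7,645 | 3.72 | 2.13E-568 | 3.53–3.91 | 4.15% | 691 | 2.10 | 3.65E-72 | 1.94–2.28 |
| Top 5%/Remaining 95% | 1,568 | 3,545 | 4.19 | 3.63E-392 | 3.93–4.48 | 2.78% | 375 | 2.31 | 2.94E-54 | 2.08–2.57 |
| Top 1%/Remaining 99% | 399 | 595 | 6.09 | 1.12E-148 | 5.32–6.98 | 1.04% | 73 | 2.61 | 4.70E-16 | 2.07–3.29 |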

T2DM, type 2 diabetes mellitus; PRS, polygenic risk score; CI, confidence interval.

Table 2.

Impact of high T2DM-PRS in the Korean population

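| Group | Lifestyle score | Total no. (%) | Prevalent T2DM: No. of cases | Prevalent T2DM: Odds ratio | Prevalent T2DM: P value | Prevalent T2DM: 95% CI | Incident T2DM: No. of cases | Incident T2DM: Hazard ratio | Incident T2DM: P value | Incident T2DM: 95% CI |
|---|---|---|---|---|---|---|---|---|---|---|
| Favorable (baseline) | 4, 5 | 35,743 (24.43) | 2,682 | - | - | - | 620 | - | - | - |
| Intermediate | 2, 3 | 91,607 (62.62) | 8,263 | 1.19 | 2.65E-12 | 1.13–1.25 | 2,818 | 1.45 | 8.04E-16 | 1.32–1.58 |
| Unfavorable | 0, 1 | 18,934 (12.94) | 2,275 | 1.67 | 3.87E-46 | 1.55–1.79 | 1,165 | 2.48 | 6.31E-51 | 2.20–2.79 |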

T2DM, type 2 diabetes mellitus; PRS, polygenic risk score; CI, confidence interval.

Table 3.

Predictability of T2DM-PRS and HLS for future T2DM

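| PRS group^a | HLS group^a | No. of cases | Hazard ratio | P value | 95% CI |
|---|---|---|---|---|---|
| Bottom 20% | Favorable | 55 | - | - | - |
| Bottom 20% | Intermediate | 345 | 1.72 | 2.86E-04 | 1.28–2.31 |
| Bottom 20% | Unfavorable | 151 | 3.42 | 5.92E-11 | 2.37–4.94 |
| 20%–80% | Favorable | 364 | 2.43 | 8.10E-10 | 1.83–3.23 |
| 20%–80% | Intermediate | 1,666 | 3.47 | 2.01E-19 | 2.65–4.54 |
| 20%–80% | Unfavorable | 722 | 6.05 | 7.55E-34 | 4.52–8.09 |
| Top 20% | Favorable | 201 | 4.87 | 3.21E-25 | 3.61–6.56 |
| Top 20% | Intermediate | 807 | 6.43 | 6.88E-40 | 4.88–8.47 |
| Top 20% | Unfavorable | 292 | 9.48 | 1.39E-43 | 6.89–13.03 |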

T2DM, type 2 diabetes mellitus; PRS, polygenic risk score; HLS, healthy lifestyle score; CI, confidence interval.

^a Each group was compared with the baseline (favorable HLS and bottom 20% T2DM-PRS).

Table 4.

Predictability of HLS and I349F rare allele for future T2DM

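| HLS group^a | I349F | Prevalent T2DM: No. of cases | Prevalent T2DM: Odds ratio | Prevalent T2DM: P value | Prevalent T2DM: 95% CI | Incident T2DM: No. of cases | Incident T2DM: Hazard ratio | Incident T2DM: P value | Incident T2DM: 95% CI |
|---|---|---|---|---|---|---|---|---|---|
| Favorable | Non-carrier | 2,659 | - | - | - | 613 | - | - | - |
| Intermediate+Unfavorable | Non-carrier | 10,422 | 1.25 | 1.22E-20 | 1.19–1.31 | 3,947 | 1.59 | 7.88E-25 | 1.45–1.74 |
| Intermediate+Unfavorable | Carrier | 69 | 0.53 | 1.02E-06 | 0.41–0.68 | 27 | 0.84 | 3.97E-01 | 0.57–1.25 |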

HLS, healthy lifestyle score; T2DM, type 2 diabetes mellitus; CI, confidence interval.

^a Each group was compared with the baseline (favorable HLS and non-carriers).