Global surveillance of antimicrobial resistance in food animals using priority drugs maps

This analysis is structured in five steps (Fig. 5a–e): (a) collection and extraction of epidemiological information from point prevalence surveys (PPS); (b) mapping distribution of resistance prevalence using three machine learning models; (c) ensembling predictions using Gaussian process stacked generalization; (d) mapping priority antimicrobials for surveillance; and (e) estimating prediction uncertainty of maps generated in steps c and d. The literature review was conducted using Zotero (version 5.0.96.2) and Microsoft Excel (version 16.53), and all data analysis was conducted using R (version 4.1.1).

Fig. 5: Modeling framework.

a Collect point prevalence surveys. b Map distribution of resistance prevalence using three machine learning models: boosted regression trees (BRT), LASSO logistic regression (LASSO), feed-forward neural network (NNR). c Ensemble predictions using Gaussian process stacked generalization. d Map priority antimicrobials for surveillance. e Estimate prediction uncertainty of maps generated in steps c and d.

Data collection and imputation

We extracted 1088 point prevalence surveys on AMR of E. coli and nontyphoidal Salmonella in healthy food animals across low- and middle-income countries (LMICs), conducted over two decades between 2000 and 2019 (Supplementary Table 3). These surveys were collected through three rounds of literature review of four databases (PubMed, Scopus, ISI Web of Science, and China National Knowledge Infrastructure). The process of data extraction is explained in detail in the Supplementary Information section “Literature review and data extraction”. These surveys were conducted on major food animal species including cattle (n = 409), pigs (n = 303), poultry (n = 570), sheep (n = 89), horses (n = 2), and goats (n = 2). The samples used to determine resistance prevalence were taken from meat (34% of total resistance prevalence estimates), swabs from living animals on farm or in wet markets (32%), food products such as milk and eggs (16%), swabs from slaughtered animals (9%), and fecal samples on farm (7%).

From each survey, we extracted information on resistance prevalence, the method used for antibiotic susceptibility testing (AST), the guideline document used for performing AST, the breakpoints used for assessing AST results, sample origin, the number of animal samples and bacterial isolates, as well as the geographic location and time of the survey. The majority (91%) of the studies used the performance standards for antimicrobial susceptibility testing developed by the Clinical and Laboratory Standards Institute (CLSI) or the European Committee on Antimicrobial Susceptibility Testing (EUCAST). Each set of performance standards defines breakpoints to classify resistance phenotypes, and these breakpoints are updated annually. The resulting variations in breakpoints were adjusted using methods developed by Van Boeckel and colleagues⁴, to maximize comparability between surveys.

For this analysis, we focused on 7 antimicrobial drugs: tetracycline (TET), ampicillin (AMP), sulfamethoxazole-trimethoprim (SXT), chloramphenicol (CHL), ciprofloxacin (CIP), gentamicin (GEN), and cefotaxime (CTX). The resistance prevalences of these drugs were the most frequently reported for their individual antimicrobial classes in the collected surveys (Supplementary Table 6), which ensured robustness in comparisons made between surveys. These antimicrobial classes were all classified as critically important in veterinary medicine²⁰, and were also classified as either critically important or highly important for human medicine¹⁴. For each of the 7 drugs, we used all PPS that reported its resistance prevalence individually to map its distribution, with methods explained in the next section. However, the subsequent prediction of priority antimicrobials requires complete resistance profiles with resistance prevalence of all 7 drugs. Therefore, 806 PPS that reported resistance prevalence of at least 4 out of these 7 drugs were included for this part of the analysis. For the unreported antimicrobials, we imputed their resistance prevalence based on correlations between antimicrobials, using multivariate imputation by chained equations³⁷ (MICE; Supplementary Methods; Fig. 5a). The MICE algorithm imputed plausible values for 21% out of 9,877 antimicrobial resistance prevalence estimates in these surveys, while also providing a mechanism for integrating the uncertainty of imputation in the following analysis, as explained in section “Uncertainty”.
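The inclusion rule above (at least 4 of the 7 drugs reported) can be illustrated with a minimal Python sketch. This is not the code used for the analysis, which was run in R; the function names and the toy surveys are hypothetical, and the MICE imputation step itself is not reproduced here.

```python
# Illustrative sketch only: names and data below are hypothetical.
DRUGS = ["TET", "AMP", "SXT", "CHL", "CIP", "GEN", "CTX"]

def n_reported(survey):
    """Count how many of the 7 drugs have a reported prevalence (not None)."""
    return sum(survey.get(d) is not None for d in DRUGS)

def select_for_imputation(surveys, min_reported=4):
    """Keep surveys that report at least `min_reported` of the 7 drugs."""
    return [s for s in surveys if n_reported(s) >= min_reported]

def missing_fraction(surveys):
    """Share of drug-level prevalence values that would need to be imputed."""
    total = len(surveys) * len(DRUGS)
    missing = sum(len(DRUGS) - n_reported(s) for s in surveys)
    return missing / total if total else 0.0

# Toy surveys (prevalence as a proportion; None or absent = not reported).
surveys = [
    {"TET": 0.70, "AMP": 0.75, "SXT": 0.60, "CHL": 0.55},            # 4 reported
    {"TET": 0.40, "CIP": 0.10},                                      # 2 reported
    {"TET": 0.55, "AMP": 0.50, "SXT": 0.45, "CHL": 0.30,
     "CIP": 0.20, "GEN": 0.10, "CTX": 0.05},                         # 7 reported
]

kept = select_for_imputation(surveys)
print(len(kept), missing_fraction(kept))
```

Only the values missing from the retained surveys would then be passed to an imputation routine.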

Trends of AMR for each antimicrobial class

We used logistic regression models to estimate temporal trends of resistance prevalence between 2000 and 2019 for each antimicrobial. For TET and AMP, we removed one outlier (DOI of PPS: 10.1264/jsme2.2000.173) out of 758 PPS reporting resistance for TET and 797 PPS reporting resistance for AMP, to ensure that the assumption of linearity between the logit of the dependent variable and the independent variable was met, based on the results of the Box-Tidwell test.
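A temporal trend of this kind can be sketched as a two-parameter logistic regression, logit(p) = b0 + b1 × (year − 2000). The Python example below is a minimal illustration fitted by Newton-Raphson on binomial counts; weighting each survey by its number of isolates, the function name, and the toy data are assumptions, not details taken from the analysis.

```python
import math

def fit_logistic_trend(years, resistant, tested, n_iter=50):
    """Fit logit(p) = b0 + b1 * (year - 2000) by Newton-Raphson.

    `resistant` and `tested` are per-survey counts (assumed weighting).
    """
    x = [y - 2000 for y in years]
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ri, ni in zip(x, resistant, tested):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = ni * p * (1.0 - p)          # binomial variance weight
            g0 += ri - ni * p               # score for the intercept
            g1 += (ri - ni * p) * xi        # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det    # Newton step: H^-1 * score
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1

# Toy data generated from known coefficients. Fractional "counts" are used
# so that the fit recovers the generating coefficients.
true_b0, true_b1 = -1.0, 0.05
years = [2000, 2005, 2010, 2015, 2019]
tested = [100] * 5
resistant = [n / (1.0 + math.exp(-(true_b0 + true_b1 * (y - 2000))))
             for y, n in zip(years, tested)]

b0, b1 = fit_logistic_trend(years, resistant, tested)
print(round(b0, 4), round(b1, 4))
```

A positive slope b1 corresponds to increasing resistance prevalence over the study period.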

We mapped the distribution of the prevalence of resistance for each antimicrobial at 10 × 10 kilometer resolution using Gaussian process stacked generalization, an ensemble approach that combines multiple models. This approach has been shown to increase prediction accuracy for disease mapping compared with other methods such as Gaussian process regression³⁸. The mapping procedure comprised two steps (Fig. 5b, c). In the first step, we trained three ‘child models’ to predict resistance prevalence based on a set of environmental and anthropogenic covariates, such as total antimicrobial use in 2013 and 2020, animal population density, and temperature (Supplementary Table 3; Supplementary Methods). For each antimicrobial class, we also included the quantities (kg) used in 2020¹⁸, disaggregated at 10 × 10 kilometer resolution, as a covariate. This was calculated by disaggregating the total antimicrobial use per country proportionally to the distribution of animals’ biomass in 2020¹⁸. Animals’ biomass was calculated as the population correction units of food animals in 2020, using methods described in Van Boeckel et al.³⁹. In the second step, the child model predictions were stacked using Gaussian process regression, fitted using integrated nested Laplace approximations (INLA)⁴⁰ (Supplementary Methods). This second step made it possible to capture simultaneously the influence of environmental and anthropogenic covariates and the residual spatial correlation.
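The proportional disaggregation of country-level antimicrobial use can be written as use_pixel = use_country × biomass_pixel / Σ biomass. The short Python sketch below illustrates that arithmetic only; the function name and the numbers are hypothetical, and the handling of a country with zero biomass is an assumption.

```python
def disaggregate_use(country_total_kg, pixel_biomass):
    """Split a country total across pixels in proportion to animal biomass."""
    total_biomass = sum(pixel_biomass)
    if total_biomass == 0:
        # Assumption: with no biomass anywhere, no use is allocated.
        return [0.0 for _ in pixel_biomass]
    return [country_total_kg * b / total_biomass for b in pixel_biomass]

# Toy country with three pixels and 1000 kg of total use.
pixel_use = disaggregate_use(1000.0, [1.0, 3.0, 6.0])
print(pixel_use)
```

By construction, the pixel-level quantities sum back to the country total.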

For each antimicrobial, we defined resistance hotspots as regions with resistance prevalence higher than the 95th percentile of all pixels on the map. We combined the drug-level resistance maps using three summary metrics for the overall AMR level, N10, N25, and N50, defined as the number of antimicrobials (out of 7) with resistance prevalence higher than 10%, 25%, or 50%, respectively, in each pixel. For the summary AMR level across antimicrobial classes, resistance hotspots were defined as regions with N50 ≥ 3.
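The summary metrics and hotspot rules can be sketched in a few lines of Python. The example is illustrative: the function names are hypothetical, and the nearest-rank definition of the 95th percentile is an assumption, since the percentile method is not specified here.

```python
DRUGS = ["TET", "AMP", "SXT", "CHL", "CIP", "GEN", "CTX"]

def n_above(profile, threshold):
    """Number of drugs (out of 7) with prevalence above `threshold`."""
    return sum(1 for d in DRUGS if profile[d] > threshold)

def is_summary_hotspot(profile):
    """Summary hotspot rule: N50 >= 3."""
    return n_above(profile, 0.50) >= 3

def percentile_95(values):
    """95th percentile using the nearest-rank method (an assumption)."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100   # integer ceil(0.95 * n)
    return ordered[rank - 1]

def drug_hotspots(values):
    """Flag pixels whose prevalence is strictly above the 95th percentile."""
    cutoff = percentile_95(values)
    return [v > cutoff for v in values]

# Example pixel with four drugs above 50%.
pixel = {"TET": 0.70, "AMP": 0.75, "SXT": 0.60, "CHL": 0.55,
         "CIP": 0.40, "GEN": 0.30, "CTX": 0.30}
print(n_above(pixel, 0.10), n_above(pixel, 0.25), n_above(pixel, 0.50))
print(is_summary_hotspot(pixel))
```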

Mapping priority antimicrobials for AMR surveillance

Priority antimicrobials for AMR surveillance were defined as the antimicrobials with the highest probability of their resistance prevalence exceeding a critical level (defined as 10%, 25%, or 50%) in the near future. Here, we assumed that the prevalence of resistance will continue to increase in the future, based on temporal trends of AMR between 2000 and 2019. We developed an approach to predict priority antimicrobials at each 10 × 10 kilometer pixel, based on local risk factors as well as patterns of co-resistance in PPS. In the following, we explain the modeling process using 50% as the critical resistance level; similar procedures were followed for the other cutoff values of resistance prevalence (10% or 25%). We illustrate the model formulation with the following example of a pixel with N50 = 4 (Fig. 6).

Fig. 6: LASSO logistic regression model to predict the probability that resistance prevalence of ciprofloxacin (CIP) will exceed 50% in the future, in pixels with predicted resistance profile (rp) of [1,1,1,1,0,0,0] (rp0) in 2015.

Firstly, we binarized the resistance profile in 2015 for a given pixel (e.g. TET 70%, AMP 75%, SXT 60%, CHL 55%, CIP 40%, GEN 30%, and CTX 30%) by reclassifying the antimicrobials with resistance higher than 50% as 1, and all others as 0, such that the resistance profile for the 7 drugs considered in this analysis was [1,1,1,1,0,0,0] (Fig. 6a). Secondly, for each of the three antimicrobials classified as 0 (here CIP, GEN, and CTX), we predicted whether their resistance prevalence will exceed 50% as a binary response variable (Fig. 6c), using covariates extracted from the collected surveys (Fig. 6d). The model considers future scenarios where only one additional antimicrobial will exceed 50% resistance (Fig. 6b). The model was constructed using the least absolute shrinkage and selection operator (LASSO) applied to logistic regression. Using CIP as an example, its resistance prevalence exceeds 50% in resistance profile rp1, while it does not in resistance profiles rp2 and rp3 (Fig. 6b, c). The covariates used to predict its presence and absence included two components. The first component considers patterns of co-resistance between antimicrobials, which imply that probabilities of occurrence vary between resistance profiles. This variation is captured by using the proportion of surveys recording rp1 out of all surveys recording rp1, rp2 or rp3 as a covariate (Fig. 6d.i). Patterns of co-resistance also imply that the development of resistance to CIP depends on resistance to other antimicrobials. This dependence is captured by using the number of antimicrobials with resistance higher than 50% in the resistance profile in 2015 as a covariate (Fig. 6d.ii). The second component of covariates includes risk factors for predicting the development of resistance. This includes the percentage of CIP use (kg) out of all three antimicrobials at the location of the survey (Fig. 6d.iii), as well as a set of environmental and anthropogenic covariates associated with the locations of the surveys, such as total antimicrobial use in 2013 and 2020, temperature, and animal density (Fig. 6d.iv; Supplementary Table 3).
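The binarization step and the two co-resistance covariates (Fig. 6d.i and 6d.ii) can be illustrated with the Python sketch below. It is a simplified reading of the procedure, not the code used for the analysis: the function names are hypothetical and the survey counts per profile are invented for the example.

```python
DRUGS = ["TET", "AMP", "SXT", "CHL", "CIP", "GEN", "CTX"]

def binarize(prevalence, critical=0.50):
    """Resistance profile: 1 if prevalence exceeds the critical level, else 0."""
    return tuple(1 if prevalence[d] > critical else 0 for d in DRUGS)

def next_profiles(profile):
    """Future profiles in which exactly one additional drug exceeds the level."""
    out = {}
    for i, flag in enumerate(profile):
        if flag == 0:
            nxt = list(profile)
            nxt[i] = 1
            out[DRUGS[i]] = tuple(nxt)
    return out

def coresistance_share(target, profile, survey_counts):
    """Share of surveys recording the target's future profile among all
    surveys recording any candidate future profile (covariate 6d.i)."""
    candidates = next_profiles(profile)
    total = sum(survey_counts.get(p, 0) for p in candidates.values())
    if total == 0:
        return 0.0
    return survey_counts.get(candidates[target], 0) / total

pixel = {"TET": 0.70, "AMP": 0.75, "SXT": 0.60, "CHL": 0.55,
         "CIP": 0.40, "GEN": 0.30, "CTX": 0.30}
rp0 = binarize(pixel)                  # (1, 1, 1, 1, 0, 0, 0)
futures = next_profiles(rp0)           # rp1 (CIP), rp2 (GEN), rp3 (CTX)

# Hypothetical numbers of surveys recording each future profile.
survey_counts = {futures["CIP"]: 30, futures["GEN"]: 15, futures["CTX"]: 5}

share_cip = coresistance_share("CIP", rp0, survey_counts)   # covariate 6d.i
n_resistant = sum(rp0)                                       # covariate 6d.ii
print(rp0, share_cip, n_resistant)
```

These two values would enter the LASSO logistic regression alongside the antimicrobial use and environmental covariates.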

The above example was based on the current resistance profile rp0 (Fig. 6a). For CIP, there were in total 64 permutations of current resistance profiles—all six antimicrobials apart from CIP could have resistance of 0 or 1. A complete model for CIP was trained by including all permutations in the procedure described in Fig. 6. This model was then applied to each pixel on the map where resistance to CIP has not yet exceeded 50%, to generate the probability that it will exceed 50% in the future. Similarly, the probabilities for the other antimicrobials were generated. Finally, at each pixel, we mapped the antimicrobial with the highest probability of its resistance prevalence exceeding 50% in the future.
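The count of 64 current profiles follows from the six other antimicrobials each taking the value 0 or 1 (2⁶ = 64). The Python sketch below enumerates them and shows the final selection step at a pixel; the function names and the probabilities are hypothetical, chosen only to illustrate the logic.

```python
from itertools import product

DRUGS = ["TET", "AMP", "SXT", "CHL", "CIP", "GEN", "CTX"]

def current_profiles_for(target):
    """All binary states of the six drugs other than `target`."""
    others = [d for d in DRUGS if d != target]
    return [dict(zip(others, bits))
            for bits in product((0, 1), repeat=len(others))]

def priority_antimicrobial(probabilities):
    """Drug with the highest predicted probability of exceeding the level."""
    return max(probabilities, key=probabilities.get)

profiles = current_profiles_for("CIP")
print(len(profiles))

# Hypothetical model outputs at one pixel where CIP, GEN and CTX
# have not yet exceeded 50%.
pixel_probabilities = {"CIP": 0.62, "GEN": 0.21, "CTX": 0.17}
print(priority_antimicrobial(pixel_probabilities))
```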

The accuracy of the models for each antimicrobial was quantified by calculating the area under the receiver operating characteristic curve (AUC) using four-fold spatial cross-validation⁴. The predictive capacity of the model came from two components of the covariates. The first component was based on co-resistance between drugs (Fig. 6d.i and Fig. 6d.ii). The second component consisted of environmental and anthropogenic covariates associated with resistance to individual drugs (Fig. 6d.iii and Fig. 6d.iv). We quantified the relative contribution of these two covariate components to the model prediction accuracy by calculating the drop in AUC following the withdrawal of each covariate component, compared with a full model including all covariates.
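The drop-in-AUC comparison can be sketched as follows. The Python example computes AUC from its pairwise (Mann-Whitney) definition and compares a full model with a reduced one; the function names and scores are hypothetical, and the spatial cross-validation folds are not reproduced here.

```python
def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative
    (ties count one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def auc_drop(labels, full_scores, reduced_scores):
    """Loss in AUC when one covariate component is withdrawn."""
    return auc(labels, full_scores) - auc(labels, reduced_scores)

# Toy held-out fold: predictions from a full model and from a model
# refitted without one covariate component (hypothetical values).
labels = [1, 1, 0, 0]
full_scores = [0.9, 0.8, 0.3, 0.1]
reduced_scores = [0.9, 0.2, 0.3, 0.1]

print(auc(labels, full_scores), auc_drop(labels, full_scores, reduced_scores))
```

A larger drop indicates that the withdrawn component contributed more to predictive accuracy.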

Furthermore, based on predictions of the priority antimicrobial for AMR surveillance at each 10 × 10 km pixel (Fig. 6), we estimated the time it takes for resistance prevalence of this antimicrobial to reach 50% in the future (Supplementary Fig. 16). Concretely, we extracted the current resistance prevalence estimated at each pixel, and calculated the time difference from the current resistance prevalence (Supplementary Fig. 16, time point a) until it reaches 50% (Supplementary Fig. 16, time point b), using the corresponding regression models fitted in section “Trends of AMR for each antimicrobial class”.
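One way to read this calculation: if the fitted trend is logit(p) = b0 + b1 × t, the time between a current prevalence p_a and the 50% level is (logit(0.5) − logit(p_a)) / b1. The Python sketch below implements that interpretation; it is an assumption about how the time difference was derived, and the function names and the slope value are hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def years_to_critical(current_prevalence, slope_per_year, critical=0.50):
    """Years until the fitted logistic trend reaches the critical level.

    Assumes the trend is linear on the logit scale with a fixed slope.
    """
    if current_prevalence >= critical:
        return 0.0                     # already at or above the level
    if slope_per_year <= 0:
        return math.inf                # a flat or declining trend never gets there
    return (logit(critical) - logit(current_prevalence)) / slope_per_year

# Example: current prevalence of 40% and a hypothetical slope of 0.05 per year.
years = years_to_critical(0.40, 0.05)
print(round(years, 1))
```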

Uncertainty

The uncertainty of the mapped predictions of resistance prevalence (Fig. 5c) was calculated as the variance of the posterior predictive distribution for each map. The uncertainty of the mapped priority antimicrobials was calculated in two steps. Firstly, we generated 15 Monte Carlo simulations of imputed datasets of resistance prevalence, to incorporate the uncertainty introduced by imputation in the following analyses. Secondly, using the imputed datasets, we generated 15 maps of priority antimicrobials. We quantified its uncertainty by calculating—at each pixel—the proportion of maps that generated different predictions of antimicrobials as compared with the final map:

$$\mathrm{Uncertainty}=\frac{N_{\mathrm{maps\ with\ different\ predictions}}}{m} \tag{1}$$
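Equation (1) can be evaluated pixel by pixel as in the Python sketch below, which assumes that m denotes the total number of simulated maps (15 in this analysis). The function name and the toy maps are hypothetical.

```python
def priority_uncertainty(final_map, simulated_maps):
    """Per-pixel share of simulated maps whose predicted priority
    antimicrobial differs from the final map (Eq. 1, m = number of maps)."""
    m = len(simulated_maps)
    return [
        sum(1 for sim in simulated_maps if sim[i] != final) / m
        for i, final in enumerate(final_map)
    ]

# Toy example with 3 pixels and 4 simulated maps.
final_map = ["CIP", "GEN", "CTX"]
simulated_maps = [
    ["CIP", "GEN", "CTX"],
    ["CIP", "CIP", "CTX"],
    ["CIP", "GEN", "GEN"],
    ["CIP", "CIP", "CIP"],
]
uncertainty = priority_uncertainty(final_map, simulated_maps)
print(uncertainty)
```

A value of 0 means all simulated maps agree with the final map at that pixel, and values closer to 1 indicate less stable predictions.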

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Rohit Malhotra

Rohit Malhotra is a medical expert and health journalist who offers evidence-based advice on fitness, nutrition, and mental well-being. His articles aim to help readers lead healthier lives.
