1. Introduction
The South African National Department of Health has committed to ‘The Global Alliance to end AIDS in children by 2030.’ This partnership aims to reinvigorate comprehensive HIV care and improve clinical outcomes in children and adolescents [1]. The work of the Global Alliance is aligned with four pillars, namely (1) early testing and optimal treatment and care for infants, children, and adolescents; (2) closing the treatment gap for pregnant and breastfeeding women living with HIV to eliminate vertical transmission; (3) preventing new HIV infections among pregnant and breastfeeding adolescent girls and women; and (4) addressing rights, gender equality, and the social and structural barriers that hinder access to services.
Progress in reducing vertical transmission of HIV has been remarkable since 2011, when the ‘Global Plan towards Elimination of New HIV Infections among Children and Keeping their Mothers Alive by 2015’ was launched [2]. However, criteria for the elimination of vertical transmission (EVT), which include reducing new HIV cases to fewer than 50 per 100,000 live births, have been difficult to achieve, with only a handful of countries succeeding [3]. Nevertheless, South Africa has made substantial progress in reducing the vertical transmission rate, with early transmission at around 6 weeks of age declining from 25–30% in 2001 to approximately 1.4% in 2016 [4]. Whereas South Africa’s paediatric HIV incidence has in the past been reported using epidemiological surveys, routine data sources continue to be used to gauge paediatric HIV testing coverage and vertical transmission rates. These include data from the National Health Laboratory Service (NHLS), as reported from the National Institute for Communicable Diseases (NICD) data warehouse, and the District Health Information System (DHIS). Data from the NHLS represent all clinical laboratory data from South Africa’s public health sector and are considered the most accurate source of laboratory test data. In contrast, the DHIS is a much broader source of public health data and is used to collect aggregated routine health data from over 4000 public healthcare facilities to support the planning and monitoring of health services. Additionally, the Thembisa mathematical model is widely used to monitor the country’s HIV programme [5]. The latest version of the model, Thembisa 4.6, has been revised to include assumptions associated with the introduction of routine birth and 6-month testing among HIV-exposed infants.
In this study, South African HIV information sources were triangulated to document trends, establish a baseline for early HIV testing coverage and vertical transmission, and facilitate the identification of gaps and monitoring of interventions, thereby contributing to the country’s Global Alliance workplan.
2. Materials and Methods
2.1. Diagnostic Setting
The South African National HIV Guidelines’ testing recommendations are mostly aligned with the scheduled clinic visits of the Expanded Program of Immunization (EPI) to facilitate an integrated approach to comprehensive child care services. HIV Polymerase Chain Reaction (PCR) testing is recommended for all HIV-exposed infants at birth to detect intrauterine transmission, at around 10-weeks to detect intrapartum transmission, at 6-months to detect early postnatal transmission, and at 6 weeks post cessation of breastfeeding to exclude postnatal transmission. Any child presenting with symptoms of HIV infection is eligible for an age-appropriate HIV test. All positive HIV PCR tests should be confirmed on a subsequent sample. The recommendation for testing at 6-months was introduced in December 2019 [6], just prior to the COVID-19 pandemic, and was informed by the South Africa Prevention of Mother-to-Child Transmission Evaluation study finding that 80% of all vertical HIV transmission in pregnant and breastfeeding women living with HIV (PBWHIV) diagnosed during antenatal care in South Africa occurs by 6-months of age [7]. These new guidelines also recommended that all children, regardless of HIV exposure, have a rapid HIV test or HIV ELISA at 18 months of age, which, if reactive, should be confirmed with an HIV PCR test (because maternal antibodies may be detected during infancy and early childhood, antibody-based HIV tests are only deemed diagnostic after 24 months of age) [8]. The rationale for universal testing at 18 months of age is to identify all vertically HIV-infected children at that time point and link them to care. Reasons for HIV-infected children only being identified at 18 months include being previously diagnosed but not linked to care or lost to follow-up post linkage, never having been tested, or being born to PBWHIV with undetected HIV infection or late seroconversion.
Essentially, all children living with HIV (CLHIV) <2 years of age should be diagnosed by HIV PCR testing, with guidelines supporting multiple testing opportunities to facilitate timely diagnosis and linkage to care during infancy and early childhood.
2.2. Data Sources and Age Disaggregation
The NHLS, which serves an estimated 80% of the South African population, performs laboratory diagnostic services in around 260 laboratories nationally. All these test data are stored centrally, and the NICD, a division of the NHLS, analyses the HIV test data to provide near real-time HIV programme ‘data for action’ reporting designed to assist in improving patient outcomes [9]. For this analysis, the total number of HIV PCR tests and the total number of first HIV PCR positive tests performed <2 years of age were extracted. The ‘first HIV PCR positive test’ was obtained using the NHLS Corporate Data Warehouse (CDW) algorithm, which links multiple tests belonging to a single patient based on a scored probability of how closely patient demographics, viz., name, surname, and date of birth, match [10]. The earliest HIV PCR positive test per patient is the ‘first HIV PCR positive test’, which theoretically equates to a newly infected child. However, poor data quality and a linking algorithm set to maintain high specificity translate into the under-matching of tests and therefore over-counting of newly-diagnosed HIV-infected children using the first HIV PCR positive test indicator. To remedy this, a manual deduplication exercise was undertaken on 4870 HIV PCR positive tests in 2021 to estimate the percentage overestimation of first PCR positive tests at different age intervals (personal communication: GG Sherman). Of the tests included in this deduplication exercise, 70% were HIV PCR-positive tests from the Western Cape Province, which is known to have the best data quality and therefore the highest chance of detecting duplicate tests. These findings were integrated into the NICD data by reducing the number of first HIV PCR positives by 10% and 30% at ages <7 days and 7 days–<7 months, respectively. Negligible numbers of duplicate HIV PCR positive tests were detected between 7 and 24 months of age.
NICD PCR positives were disaggregated by age such that birth, 10-week, and 6-month testing were defined as testing occurring at <7 days, 7 days–<3 months, and 5–7 months of age, respectively. The vast majority of HIV PCR tests performed yield negative results, for which no estimate of the underperformance of the CDW linking algorithm is available. Therefore, the NICD age ranges for total HIV tests performed have been narrowed to reduce the likelihood of overcounting. For example, total tests performed at birth are counted as those performed at <3 days, and total tests at 10-weeks as those performed at 2–3 months of age, to reduce the chances of counting potential duplicate testing performed at the 6-day postnatal or 6-week immunisation visits, respectively.
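The deduplication adjustment described above amounts to scaling each age band's count of first HIV PCR positives by a fixed factor. A minimal Python sketch follows, assuming counts have already been aggregated by age band. The band labels, the function name, and the example counts are hypothetical and are not drawn from the NICD data warehouse; only the 10% and 30% reductions come from the text.

```python
# Illustrative sketch only. Band labels, names, and example counts are
# hypothetical; only the 10% and 30% reductions come from the text.

# Fraction of first HIV PCR positives removed as presumed duplicates.
DEDUP_REDUCTION = {
    "<7 days": 0.10,
    "7 days-<7 months": 0.30,
    "7-24 months": 0.00,  # negligible duplicates detected in this range
}


def adjust_first_positives(raw_counts):
    """Return deduplication-adjusted first-positive counts per age band."""
    adjusted = {}
    for band, count in raw_counts.items():
        if band not in DEDUP_REDUCTION:
            raise KeyError(f"No reduction defined for age band: {band}")
        adjusted[band] = count * (1.0 - DEDUP_REDUCTION[band])
    return adjusted


if __name__ == "__main__":
    # Made-up counts, chosen only to make the scaling easy to read.
    example = {"<7 days": 1000, "7 days-<7 months": 1000, "7-24 months": 1000}
    for band, value in adjust_first_positives(example).items():
        print(f"{band}: {value:.0f}")
```

Keeping the reductions in a single lookup table makes it straightforward to revise them if a later validation exercise covering more provinces yields different estimates.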
DHIS indicators used for this analysis included the total number of HIV PCR tests performed and the number of HIV PCR positive tests at birth (0–<6 weeks of age) and around 10-weeks (6–<14 weeks of age); the total number of HIV rapid tests performed and positive results at 18 months of age; the total number of live births at a facility; the total number of live births to women living with HIV (WLHIV); and the public sector immunisation coverages at birth, 10-weeks, and 6 and 18 months of age. As indicated above, the age ranges for birth and around 10-week PCR testing differ between NICD and DHIS data. NICD defines total birth positives as <7 days, compared with DHIS at 0–<6 weeks, to further counteract inaccuracies in deduplication and achieve a more accurate estimate of neonates testing HIV PCR positive for the first time. This is based on NICD data demonstrating that 95% of the total HIV PCR tests performed at 0–<6 weeks of age occur at <7 days. Furthermore, by limiting the interval of the various NICD age categories, more accurate patient-level reporting can be achieved with the CDW record linking algorithm, as previously demonstrated [10]. For the years 2017–2019, HIV rapid test data was reported on DHIS. Although the 2019 HIV testing guidelines recommended that all children undergo HIV antibody testing at 18 months of age, the national indicator data set defined this indicator for HIV-exposed children only, which may have resulted in underreporting.
Thembisa 4.6 estimates used include new HIV diagnoses at <1 year and 1–<2 years of age, new HIV infections at/before birth and due to breastfeeding at <3 years of age, total births, and births to HIV-positive mothers. The model assumes vertical transmission rates at birth and during breastfeeding depend on the timing of maternal HIV acquisition (with transmission rates being particularly high during the acute phase of HIV infection), the timing of maternal antiretroviral treatment (ART) initiation, infant feeding practices (with lower postnatal transmission in the case of exclusive breastfeeding compared to mixed feeding), and the maternal CD4 count in untreated mothers [11]. Women who are HIV-diagnosed are assumed to have shorter durations of breastfeeding than undiagnosed HIV-positive mothers and HIV-negative mothers. Infants who acquire HIV at/before birth are assumed to have high rates of HIV disease progression and mortality in the absence of ART, while children who acquire HIV postnatally are assumed to have relatively slow disease progression. The model allows for early infant diagnosis, testing at 18 months, and testing in children at other times (most frequently because of HIV-related symptoms). The model is calibrated to HIV prevalence data from antenatal clinic surveys, and the paediatric component of the model is further calibrated to a number of additional data sources, including paediatric HIV prevalence in household surveys, recorded deaths, total antibody tests and testing yields, total children on ART, and the age distribution of children on ART [12]. There are a number of important differences between Thembisa estimates and DHIS and NICD data sources. For example, the number of CLHIV who are diagnosed according to the Thembisa model is the estimated number for whom positive test results are returned to the caregiver, whereas NICD and DHIS indicators are based on total positive results. More broadly, the Thembisa estimates are for the whole country, including the private healthcare sector, whereas HIV indicators from DHIS and NICD are restricted to the public sector.
NICD and DHIS monthly data were extracted for a five-year period spanning 1 July 2017 to 30 June 2022. Data were analysed such that flow variables (e.g., testing volumes and positivity) for each year were reported from the middle of the stated calendar year to the middle of the following calendar year to match the time intervals used in the Thembisa model. Output definitions, age ranges, and calculations are summarised in Table 1. Flow variables were categorised by age as <1 year referring to the first year of life, 1–<2 years referring to the second year of life, and <2 years referring to the first two years of life.
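The mid-year to mid-year reporting convention described above can be sketched as a small helper that assigns each monthly record to its stated year. This is an illustrative Python sketch; the function name is hypothetical, and it assumes that the 12 months starting on 1 July of a given calendar year are labelled with that year.

```python
from datetime import date


def reporting_year(record_date: date) -> int:
    """Return the stated year for a mid-year to mid-year reporting period.

    Assumption: the period 1 July Y to 30 June Y+1 is labelled Y.
    """
    return record_date.year if record_date.month >= 7 else record_date.year - 1


if __name__ == "__main__":
    # The first and last days of the five-year extraction window.
    print(reporting_year(date(2017, 7, 1)))
    print(reporting_year(date(2022, 6, 30)))
```

Under this convention, the extraction window of 1 July 2017 to 30 June 2022 maps onto the five stated years 2017 to 2021.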
2.3. Testing Coverage and Positivity
NICD and DHIS testing coverages at different ages were calculated by dividing the total number of HIV PCR tests performed by the total number of live births to WLHIV, except for the 18-month age group, where the total number of live births at a facility was used as the denominator (Table 1). The percentages of children immunised at the same time intervals as HIV testing (birth, 10-weeks, 6-months, and 18-months) were obtained from DHIS for the public health sector.
The number of infants with a positive result per annum for the period 2017–2021 is described for NICD and DHIS data sources at birth and 10-weeks, with data also available from NICD at 6 months of age. Additionally, for the period 2015–2019, DHIS reported on the number of positive HIV rapid tests at 18 months of age, thereby allowing a comparison with the NICD number of positive HIV PCR results between 18 and 24 months of age.
Case rates of newly infected children for the Thembisa model were calculated by dividing the new HIV infections at/before birth and due to breastfeeding, accounting for all vertical transmission among CLHIV <3 years of age, by the total number of live births in the country as estimated by the model. Additionally, Thembisa estimates for new HIV infections at <2 years of age were used as an alternate numerator that would be more comparable to the NICD data.
Case rates for the NICD and DHIS data were calculated by dividing the NICD total number of first HIV PCR positive tests at <2 years of age by the DHIS total number of live births at a facility. Case rates are expressed per 100,000 live births.
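As a worked illustration of the case-rate arithmetic, the sketch below scales a numerator of newly identified cases to a rate per 100,000 live births and compares it with the EVT criterion of fewer than 50 cases per 100,000 live births cited in the Introduction. The function names and the example inputs are hypothetical and are not study results.

```python
# Hypothetical helper names; the threshold reflects the EVT criterion of
# fewer than 50 cases per 100,000 live births cited in the Introduction.
EVT_CASE_RATE_THRESHOLD = 50.0


def case_rate_per_100k(new_cases, live_births):
    """Return the number of new cases per 100,000 live births."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return 100_000 * new_cases / live_births


def meets_evt_case_rate(new_cases, live_births):
    """Return True if the case rate is below the EVT threshold."""
    return case_rate_per_100k(new_cases, live_births) < EVT_CASE_RATE_THRESHOLD


if __name__ == "__main__":
    # Made-up inputs for illustration only.
    rate = case_rate_per_100k(750, 100_000)
    print(f"{rate:.0f} per 100,000 live births")
    print(meets_evt_case_rate(750, 100_000))
```

The same function applies to either numerator described above (first HIV PCR positive tests or modelled new infections), provided the matching live-birth denominator is supplied.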
Annual trends are presented, and direct comparisons between the datasets are assessed for the 12-month period of 1 July 2021–30 June 2022, unless otherwise stated.
2.4. Ethical Considerations
The National Institute for Communicable Diseases has approval from the Human Research Ethics Committee of the University of the Witwatersrand (M160667; M210752) to conduct communicable disease surveillance and analysis of routine laboratory data. All study methods were performed in accordance with the relevant guidelines and regulations. No patient-identifying data was extracted or used for this analysis.
4. Discussion
South Africa’s health data, especially for infants, urgently requires a universally used unique identifier. Until this is achieved, national longitudinal cohort monitoring is unattainable. Although the three sources of HIV estimates used in this comparison are distinctly different in the types and methods of data collection, their triangulation (with consideration of the caveats associated with each dataset) provides a more robust picture of the vertical transmission landscape than viewing each in isolation.
While the Thembisa model estimates that nearly half of HIV-infected infants remain undiagnosed during the first year of life, NICD and DHIS data demonstrate excellent testing coverage at birth but increasing testing gaps at 10-weeks and 6-months of age. Whereas the NICD 10-week testing coverage slowly increased between 2017 and 2021 from 80% to 86%, the 6-month testing coverage reached only 49%, reflecting implementation efforts that were likely stunted by the COVID-19 pandemic, which began three months after the release of the new guidelines in 2019. As the majority of vertical transmission cases are thought to occur by 6-months of age [7], increasing testing coverage at this time point is critical to improving early case finding.
The lower testing coverage estimates of DHIS compared with NICD at these time points, and especially the lower numbers of infected infants reported, are likely accounted for by challenges associated with the manual nature of data collection within DHIS. Accurate reporting of DHIS HIV PCR positive tests at birth is particularly problematic since DHIS reporting at hospital-level is generally less well performed than at clinic-level and because birth PCR test results are usually only available once the newborn has already been discharged, often to a local primary healthcare facility, for postnatal follow-up. On account of the additional deduplication and restricted age ranges applied to the NICD data, the NICD estimates are likely to be conservative, further highlighting the underreporting of these DHIS HIV PCR-positive indicators.
The reported HIV rapid testing coverage of only 33% at 18 months of age for 2019 is also concerning. Like 6-month HIV PCR testing, universal HIV antibody testing for all children was introduced in the 2019 guidelines, and implementation may also have been hampered by the COVID-19 pandemic. Although the definition of this indicator at the time was for HIV-exposed children only, it is unlikely that data collection was restricted to this population; the narrower definition may nonetheless have resulted in some underreporting. Improved testing coverage at this time point, including confirmation of positive antibody tests with HIV PCR tests, will be essential for closing the diagnostic gap, as Thembisa estimates that over a third of CLHIV <2 years of age remain undiagnosed.
Importantly, DHIS public-sector immunisation data suggest moderate vaccination coverage of >80% at 6 months of age. The difference between immunisation and early infant diagnosis coverage at this time point suggests a lack of comprehensive care at facilities, where infants are presenting for their immunisations but are not being identified for HIV testing. This is true even at 18 months of age, when two-thirds of children are being vaccinated but only one-third are being tested for HIV. This critical gap in HIV testing should prompt facility-based investigations to improve the integration of health services at the primary care level.
A deeper understanding of transmission routes is also fostered through data triangulation. Although the Thembisa model estimates breastfeeding to be the predominant mode of vertical transmission, NICD data demonstrate persistent intrauterine infections, which comprise more than a third of all diagnosed CLHIV <1 year of age. This highlights not only the need to prevent new HIV infections among pregnant and breastfeeding women but also the importance of early antenatal booking and comprehensive HIV testing services, as well as improved maternal virological control during the antenatal period, if South Africa is to eliminate vertical transmission.
As per World Health Organization criteria for validation of the elimination of vertical transmission, South Africa has to date achieved the impact target of Bronze tier status with ≤750 cases per 100,000 live births [13]. Although the antenatal HIV prevalence is expected to decrease, with HIV-exposed infants comprising approximately 16% of total live births by 2030 (from 25% in 2020), Thembisa estimates suggest that at best, only Silver-tier status of between 250 and 500 cases per 100,000 live births will be achieved by 2030. Hence, the target of fewer than 50 cases per 100,000 live births required for the elimination of vertical transmission will remain elusive unless additional interventions are introduced. One such potential intervention is enhanced access to pre-exposure prophylaxis (PrEP). A modelling study evaluating the impact of 80% PrEP coverage in South Africa between 2020 and 2030 has estimated a reduction in vertical transmission of approximately 40% [14]. Although this is very promising and will likely reduce the gap between new cases and those diagnosed, it is in itself not sufficient to bring down the case rates to the desired level. Hence, further strengthening of the healthcare system, in conjunction with additional innovations in the field such as long-acting injectable agents for both prevention and treatment, will be needed if South Africa is to eliminate vertical transmission.
Limitations exist for all three sets of estimates. Statistical modelling is dependent on reliable input data, so estimates are only as valid and up to date as the data used [15]. As the number of CLHIV who are diagnosed according to the Thembisa model refers to those cases where a diagnosis is received by caregivers, it is not directly comparable with NICD and DHIS data. However, as there has recently been considerable effort in the field to ensure all HIV PCR positive results are acted upon [9], it remains to be determined what proportion of CLHIV do not receive results. Furthermore, estimates presented in this study, such as the number and proportion of CLHIV who remain undiagnosed as well as case rates, are calculated using numerators and denominators that relate to different cohorts, resulting in potential inaccuracies (albeit minimal). Using routine programmatic data like the DHIS depends on every healthcare worker responsible for data collection at ±4000 facilities correctly and completely recording information and accurately collating it, which makes underreporting likely. The laboratory data reported by NICD excludes indeterminate HIV PCR results (as does DHIS data) as well as eligible infants who never accessed testing, such as when a mother’s HIV status is unknown or in cases where there is poor health-seeking behaviour or infant death. Indeterminate results comprise approximately 15% of ‘HIV-detected’ PCR results, with nearly half of patients with an indeterminate result subsequently found to be HIV-infected [16]. The decision to exclude these cases was taken to account for the reduced positive predictive value (i.e., increased probability of false-positive results) among early infant diagnostic assays within the context of a decreasing transmission rate, but is likely to have resulted in underreporting of the true number of HIV-infected infants who have an HIV-detected PCR result. On the other hand, the lack of a unique patient identifier from birth makes reporting patient-level data challenging, with potential over-reporting of testing coverage and positivity.
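The link between a decreasing transmission rate and a reduced positive predictive value follows from Bayes' theorem, and can be sketched as below. This is an illustrative Python example only: the sensitivity and specificity values are hypothetical placeholders, not the performance characteristics of any HIV PCR assay, and the prevalence values are chosen purely to show the direction of the effect.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return the probability that a positive test result is a true positive.

    Standard Bayes' theorem calculation from test sensitivity, test
    specificity, and the prevalence of infection among those tested.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    if true_positives + false_positives == 0.0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical assay characteristics, held fixed while prevalence falls.
    for prevalence in (0.10, 0.01):
        ppv = positive_predictive_value(0.99, 0.999, prevalence)
        print(f"prevalence {prevalence:.2f}: PPV {ppv:.3f}")
```

With test performance held constant, the share of positive results that are false positives grows as prevalence falls, which is the rationale given above for excluding indeterminate results.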
Some of these limitations can be addressed. For instance, the most recent revision of the Thembisa model has incorporated programmatic updates that strengthen diagnostic estimates. This has, for example, resulted in a higher estimate of the proportion of CLHIV <2 years of age who are diagnosed. Whereas the Cost-Effectiveness of Preventing AIDS Complications (CEPAC) Pediatric model estimates 56% of CLHIV 2 years of age were diagnosed in 2018, Thembisa 4.6 estimates 61% of CLHIV <2 years of age were diagnosed in 2018, and this further increased to 66% in 2021 [17]. Analysis of laboratory data has been refined by limiting the age ranges used to describe testing time points and adjusting positivity outputs following additional deduplication (albeit by generating assumptions from a validation exercise involving only one province and one district). An investigation of the discrepancy in the number of PCR-positive tests at birth and 10-weeks of age between DHIS and NICD is required to more accurately count the number of infants diagnosed. However, until the Health Patient Registration System (HPRS) or other unique health identifier is issued at birth, efforts to accurately enumerate children diagnosed with HIV and longitudinally monitor them in care will be hampered.