A mosquito (Culex pipiens) feeds on blood from human skin. (Photo by Ihor Hvozdetskyi on Shutterstock)
In A Nutshell
- Researchers created a predictive mapping framework that estimates the monthly risk of West Nile virus (WNV) activity in mosquitoes across the Northeast U.S. at ~4 km resolution.
- The model was trained on 20 years of Connecticut mosquito surveillance (261,092 Culex pipiens tested, 1,417 positive pools) paired with weather, drought, land cover, and population data.
- Predictions were validated against human case data: towns and counties with higher predicted probabilities of detecting WNV in mosquitoes consistently had more reported human cases.
- Authors emphasize that active mosquito surveillance remains the “gold standard,” but the model increases the utility of existing trap data to guide risk awareness in unmonitored areas.
LUBBOCK, Texas — For decades, health officials have relied on mosquito traps to track West Nile virus (WNV). But traps are costly to operate and can’t cover every community. That leaves many areas with little direct information about their seasonal risk.
A study published in PNAS Nexus introduces a predictive mapping framework designed to extend what trap data can tell us. By combining two decades of mosquito surveillance in Connecticut with environmental data such as temperature, rainfall, drought, land cover, and population density, the researchers built models that can spatially extrapolate the monthly risk of West Nile virus activity in mosquitoes across the Northeast at about 4-kilometer resolution.
The authors stress that this is not a replacement for real mosquito surveillance: “Active mosquito-based WNV surveillance remains the gold standard for assessing the risk of WNV activity.” Instead, the approach provides a way to expand the utility of existing data to places where traps don’t exist.
Two Decades Of Connecticut Mosquito Data
The Connecticut Agricultural Experiment Station (CAES) has maintained one of the longest continuous mosquito monitoring programs in the country. According to the study, “CAES established their mosquito and arbovirus surveillance systems for eastern equine encephalitis virus (EEEV) in 1997.” The system grew in response to new threats, with “major expansions… in 2001 (due to WNV) and 2019 (due to EEEV).”
Between 2001 and 2020, scientists captured and tested more than 261,000 Culex pipiens mosquitoes, the main species that carries WNV in the Northeast. These were grouped into nearly 15,000 test pools, of which about 1,400 tested positive for the virus.
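For a rough sense of scale, those counts can be expressed as positive pools per 1,000 mosquitoes tested, a figure often reported in mosquito surveillance as the minimum infection rate. The short Python sketch below uses the exact counts given in the summary at the top of this article (261,092 mosquitoes tested, 1,417 positive pools). The function name is illustrative, and the calculation is an assumption added here, not something taken from the paper.

```python
def positive_pools_per_thousand(positive_pools: int, mosquitoes_tested: int) -> float:
    """Positive pools per 1,000 mosquitoes tested.

    This treats each positive pool as containing at least one infected
    mosquito, so it is a lower bound on the true infection rate.
    """
    if mosquitoes_tested <= 0:
        raise ValueError("mosquitoes_tested must be positive")
    return 1000 * positive_pools / mosquitoes_tested


# Counts quoted in the article's summary (2001-2020, Connecticut).
rate = positive_pools_per_thousand(1_417, 261_092)
print(f"{rate:.2f} positive pools per 1,000 mosquitoes tested")
```

With these inputs the rate works out to roughly 5.4 per 1,000 over the full 20-year record; monthly or site-level values would vary widely around that average.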
The study team, based at Texas Tech University, CAES, and Indiana University, paired this dataset with weather records, drought indices, land cover classifications, and population density. They then applied boosted regression trees, a machine-learning method, to identify environmental factors linked to mosquito abundance and WNV detection.
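To give a feel for how boosting works, here is a minimal Python sketch, not the authors' code. It fits a sequence of one-split trees ("stumps") to a single made-up predictor, with each new stump correcting the errors left by the ones before it. The data are synthetic, the function names are hypothetical, and the squared-error loss is a simplification; the study's models used many predictors and real surveillance data.

```python
def fit_stump(x, residuals):
    """Find the one threshold on x that best splits the residuals (least squares)."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return t, left_mean, right_mean


def fit_boosted_stumps(x, y, n_trees=200, learning_rate=0.1):
    """Gradient boosting with stumps: each round fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((t, left_mean, right_mean))
        pred = [p + learning_rate * (left_mean if xi <= t else right_mean)
                for xi, p in zip(x, pred)]
    return base, stumps, learning_rate


def predict(model, xi):
    """Sum the base value and the shrunken contribution of every stump."""
    base, stumps, learning_rate = model
    return base + sum(learning_rate * (lm if xi <= t else rm)
                      for t, lm, rm in stumps)


# Synthetic example: a made-up drought-style index from -4 to 4, with a
# detection rate that peaks at moderate values and falls off at the extremes.
index = [i / 5 - 4 for i in range(41)]
detection = [0.4 - 0.025 * v * v for v in index]

model = fit_boosted_stumps(index, detection)
for v in (-3.5, 0.0, 3.5):
    print(f"index {v:+.1f}: predicted detection rate {predict(model, v):.2f}")
```

Because the fitted model is a sum of many small step functions, it can follow curved relationships without being told their shape in advance, which is one reason tree-based boosting suits ecological data.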
What the Models Found
The models showed that multiple factors combined to predict WNV detection. Urbanization and human population density were consistently important, though the authors note this might reflect both mosquito ecology and the design of the CAES surveillance system.
Relationships with weather were not simple. The study identified a “humped-shaped relationship with PDSI” (the Palmer Drought Severity Index) and a “u-shaped relationship with prior month average precipitation.” These findings mean that virus detection does not simply rise or fall as conditions become wetter or drier; it depends on timing and severity.
Seasonal timing also emerged: mosquito numbers peaked in July, virus detection in August, and reported human cases typically appeared around the same time or shortly afterward.
Testing Against Human Cases
To see if these predictions reflected real-world outcomes, the team compared model outputs with human case data.
In Connecticut towns (2001–2022), as the predicted probability of detecting WNV in mosquitoes increased from 0% up to 55%, the probability of observing at least one human case rose from 0% to about 40%, even after accounting for population size.
Across eight Northeastern states (2021–2022), counties with higher predicted mosquito detection probabilities also reported more human cases once population was considered.
“Our results confirm that detecting enzootic WNV transmission in the Northeast directly relates to risk of WNV to humans,” the authors write.
Mapping West Nile Virus Risk Across the Region
Using these models, the researchers generated maps showing monthly WNV risk across the Northeast. They found higher predicted probabilities in more urbanized areas, especially along southern and eastern coastal regions.
The model also captured known high-activity years. The authors note: “The spatial expansion and magnitude of risk that defined the 2018 transmission season is captured by the model predicting increased risk of WNV in the areas surrounding CT’s urban cores as well [as] very high detection probabilities within the cores; expansion of risk into eastern CT is also captured by the model, especially in the month of August.”
The paper also references 2004 as a low-prevalence year in contrast with 2018, illustrating how the model reflects both expansions and contractions of high-risk areas over time.
Why It Matters
Since West Nile virus first appeared in New York in 1999, more than 50,000 clinical cases have been reported nationwide. But, as the authors write, “it is estimated that only 1% of all human infections are reported.” Taken at face value, 50,000 reported cases at a 1% reporting rate implies a total burden of around 5 million infections, underscoring the importance of tools that can extend risk awareness beyond trap locations.
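The arithmetic behind that estimate is simple enough to check directly. The Python sketch below uses the two figures quoted above (50,000 reported cases, 1% reported); the function name is illustrative, and the result is a back-of-the-envelope figure that assumes the 1% reporting rate applies uniformly.

```python
def estimated_total_infections(reported_cases: int, percent_reported: float) -> float:
    """Scale reported cases up by the share of infections that get reported."""
    if not 0 < percent_reported <= 100:
        raise ValueError("percent_reported must be in (0, 100]")
    return reported_cases * 100 / percent_reported


total = estimated_total_infections(50_000, 1)
print(f"Estimated total infections: {total:,.0f}")
```

Since "more than 50,000" is a lower bound on reported cases, the resulting figure is best read as an order-of-magnitude estimate rather than a precise count.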
The predictive framework increases the usefulness of existing surveillance: “This methodology increases the utility of point-source mosquito surveillance data by creating a flexible workflow for predicting risk of WNV to humans across the Northeast United States using easily accessible online data sources.”
Limitations To Keep in Mind
The authors identify several key limitations:
- The models predict the probability of detecting West Nile virus in mosquitoes, not the size of human outbreaks.
- They focus on one species (Culex pipiens) in the Northeast, so results may not apply to other regions or vectors.
- The environmental relationships are non-linear, including the hump-shaped and u-shaped effects described above.
- Importantly, the study states: “no variables were included that specified the time of year (such as a monthly seasonality term) or directly identified the spatial location of a surveillance trap.”
Extending, Not Replacing, Mosquito Surveillance
The authors emphasize that predictive mapping cannot replace direct surveillance, but it can expand what health departments know between and beyond trap sites. By using publicly available environmental data alongside existing mosquito records, officials may be able to generate updated maps each summer that better guide mosquito control and public health warnings.
In short, the work shows that decades of trap data do more than record where the virus has been; they can also help forecast where risk is likely to appear next.
Paper Summary
Methodology
Researchers analyzed 20 years (2001-2020) of mosquito surveillance data from 87 sites across Connecticut, capturing and testing over 261,000 Culex pipiens mosquitoes. They combined this data with weather information (temperature, precipitation, drought conditions), land cover classifications, and human population density data. Using machine learning techniques called boosted regression trees, they identified the environmental factors that best predicted both mosquito abundance and West Nile virus detection. The team then created predictive models that could forecast virus activity across the entire Northeast United States at a 4-kilometer resolution.
Results
The models successfully predicted West Nile virus activity, with urban areas experiencing higher risk during hot, dry summer months (July-September). When validated against human case data from Connecticut (2001-2022) and northeastern states (2021-2022), areas with higher predicted probabilities of detecting the virus in mosquitoes consistently showed more human cases. The probability of observing at least one human case increased from nearly 0% to about 40% as the predicted detection probability in mosquitoes rose from 0% to 55%. Key predictors included drought conditions, human population density, specific wetland types, and precipitation patterns.
Limitations
The study focused specifically on the Northeast United States and Culex pipiens mosquitoes, so findings may not apply to other regions or mosquito species. The models did not include information about West Nile virus activity in other mosquito species or bird populations that also contribute to transmission. The research also used relatively coarse temporal (monthly) and spatial (4-kilometer) resolution, which might miss finer-scale transmission patterns.
Funding and Disclosures
This research was supported by the American Mosquito Control Association (2022 Research Fund number 2022-02) and the Centers for Disease Control and Prevention (Cooperative Agreement Number U01CK000509). The authors declared no competing interests. The contents are solely the responsibility of the authors and do not necessarily represent official views of the funding organizations.
Publication Information
McMillan, Joseph R., James Sun, Luis Fernando Chaves, and Philip M. Armstrong. “Using mosquito and arbovirus data to computationally predict West Nile virus in unsampled areas of the Northeast United States.” PNAS Nexus, vol. 4, no. 8, 2025, article pgaf227. Published online August 19, 2025. DOI: https://doi.org/10.1093/pnasnexus/pgaf227.







