On June 16, 2026, Google Research published a dataset that could quietly become one of the year's most practically useful for climate work: a vectorized version of its Farmscapes 2020 dataset. It covers over 130,000 km² of England and maps the hedgerows, stone walls, copses, and woodland boundaries that standard satellite land-cover products have systematically missed for decades.
The work comes from Michelangelo Conserva (Research Scientist) and Charlotte Stanton (Senior Program Manager) at Google Research, building on Google Earth AI infrastructure. The full research blog post is available at goo.gle/4eMWqG0.
This post unpacks why those "invisible" features matter enormously, how the AI pipeline works, what vectors give you that rasters never could, and what it means for the intersection of climate policy, food security, and open data.
The problem: a gap between what satellites see and what ecologists need
When climate scientists talk about land-cover mapping, they usually mean products like ESA's Sentinel-2 land cover map or the USGS National Land Cover Database. These are excellent at distinguishing forests from cropland from urban areas. At 10–30 meters per pixel, they are genuinely useful at landscape and national scale.
But they are structurally blind to a whole category of ecologically important features: the fine-scale woody structures that live between agricultural parcels rather than within them.
A traditional English hedgerow (a dense mix of hawthorn, blackthorn, ash, and oak along a field boundary) might be 2–4 meters wide. A drystone wall in the Yorkshire Dales is narrower still. An isolated copse of trees in the corner of an arable field is a few hundred square meters at most. At 10-meter resolution, all of these features blend into surrounding pixels or fall below the minimum mapping unit entirely.
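A back-of-the-envelope calculation makes the mixing problem concrete. The sketch below assumes a simplified geometry that is not from the original post: a straight strip running parallel to the pixel grid and crossing the whole pixel. The 3 m width is just the midpoint of the hedgerow range quoted above.

```python
def pixel_fraction(feature_width_m: float, pixel_size_m: float) -> float:
    """Share of one square pixel covered by a straight strip of the given
    width that runs parallel to the pixel grid and spans the whole pixel."""
    return min(feature_width_m, pixel_size_m) / pixel_size_m


for pixel in (10.0, 30.0):
    share = pixel_fraction(3.0, pixel)
    print(f"{pixel:.0f} m pixel: a 3 m hedgerow covers about {share:.0%}")
# 10 m pixel: a 3 m hedgerow covers about 30%
# 30 m pixel: a 3 m hedgerow covers about 10%
```

In both cases most of the pixel's signal comes from the surrounding field, which is why a per-pixel classifier tends to label it as cropland.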
The result is a systematic accounting gap. National forest inventories in England and across Europe reliably miss these structures. And because you cannot manage or protect what you cannot measure, they tend to fall outside carbon accounting frameworks, agri-environment payment schemes, and biodiversity baseline assessments—even though they collectively represent substantial ecological value.
This is the core problem Farmscapes sets out to solve.
From raster pixels to vectors: why the distinction matters
Most land-cover products are rasters: grids of pixels where each pixel carries a class label (forest, cropland, water). Rasters are computationally convenient and visually intuitive. But for fine-scale features like hedgerows, they have a fundamental limitation: a hedgerow that crosses dozens of pixels produces a fragmented chain of labels, and there is no way to query "give me all hedgerows longer than 500 meters" without expensive post-processing that is often inaccurate.
Vector representations encode features as polygons and lines with explicit geometry—start point, end point, vertices, area, perimeter. Every hedgerow becomes a single geometric object with queryable attributes. You can immediately filter by length, area, compactness, proximity to water, or intersection with a specific farm parcel. You can compute carbon density estimates per feature. You can detect where a hedgerow was removed between two survey years by comparing two vector datasets. You can feed feature geometries directly into carbon accounting models or agri-environment payment calculations.
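To make the "hedgerows longer than 500 meters" query concrete, here is a minimal sketch in plain Python. The feature records, field names, and coordinates are hypothetical and do not reflect the Farmscapes schema. It also assumes each hedgerow is available as a centerline in a projected coordinate system measured in meters, whereas the published features may be polygons.

```python
import math


def polyline_length(coords):
    """Length of a polyline given as [(x, y), ...] in a projected CRS (meters)."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


# Hypothetical features; coordinates are meters in a projected CRS.
features = [
    {"id": "h1", "kind": "hedgerow", "coords": [(0, 0), (300, 0), (300, 400)]},
    {"id": "h2", "kind": "hedgerow", "coords": [(0, 0), (120, 50)]},
    {"id": "w1", "kind": "woodland", "coords": [(0, 0), (900, 0)]},
]

# One pass over explicit geometries answers the question directly.
long_hedgerows = [
    f["id"]
    for f in features
    if f["kind"] == "hedgerow" and polyline_length(f["coords"]) > 500
]
print(long_hedgerows)  # ['h1']
```

With a raster, the same question would first require grouping pixels into connected objects and estimating their length, which is the expensive and error-prone step that vectors avoid.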
Vectors are the format that land managers, policymakers, and GIS analysts actually need. Farmscapes delivers them at national scale for the first time.
The AI/ML pipeline: RSF ViT, dual-layer labeling, and shape math
The backbone: Remote Sensing Foundations (RSF)
The central challenge in mapping fine-scale features nationally is data scarcity. Annotating every hedgerow in England by hand is not feasible. The team had only approximately 247 km² of carefully annotated training data, less than 0.2% of England's area (247 of roughly 130,000 km²).
The solution is transfer learning from Google Earth AI's Remote Sensing Foundations (RSF) model: a Vision-Transformer (ViT) architecture pre-trained on over 300 million global satellite images. This pre-training builds deep, generalizable representations of what the Earth looks like from above—texture patterns, spectral signatures, structural features—across every climate zone and land-cover type on the planet.
When you fine-tune RSF on a small annotated dataset of English hedgerows, the model does not start from random weights; it starts from rich Earth-observation representations built over 300 million examples. The result is a model that generalizes well to unannotated areas from very limited supervision—exactly the regime that national-scale mapping requires.
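The idea of reusing pre-trained representations can be illustrated with a deliberately tiny toy. This is not the Farmscapes training code, and it shows only the simplest variant of transfer learning: a frozen feature extractor with a small trainable head, whereas full fine-tuning typically updates the backbone weights as well. The `frozen_backbone` function, the pixel values, and the labels are all hypothetical stand-ins.

```python
import math


def frozen_backbone(pixel):
    """Stand-in for a pre-trained encoder. It is fixed and never updated
    below. Here it maps an (r, g, b) pixel to one 'greenness' feature."""
    r, g, b = pixel
    return g - r


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, lr=1.0, epochs=200):
    """Fit a logistic-regression head on top of the frozen features."""
    feats = [frozen_backbone(p) for p in samples]
    w, b = 0.0, 0.0
    n = len(feats)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(feats, labels):
            err = y - sigmoid(w * x + b)  # gradient of the log-likelihood
            grad_w += err * x
            grad_b += err
        w += lr * grad_w / n
        b += lr * grad_b / n
    return w, b


def predict(pixel, w, b):
    return sigmoid(w * frozen_backbone(pixel) + b) > 0.5


# Four hypothetical labelled pixels: 1 = woody vegetation, 0 = other.
samples = [(0.2, 0.5, 0.2), (0.3, 0.6, 0.3), (0.5, 0.3, 0.3), (0.6, 0.4, 0.4)]
labels = [1, 1, 0, 0]

w, b = train_head(samples, labels)
print([predict(p, w, b) for p in samples])
```

Only two numbers (`w` and `b`) are learned from the labelled data; everything else comes from the fixed feature extractor. That division of labor is what lets a small annotated set go a long way.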
Dual-layer labeling: imagery plus LiDAR
Aerial imagery alone has an inherent limitation for ecological mapping: it captures surface color and texture but not height. A stone wall and a low hedge may look similar in RGB imagery. A mown grass verge and a closely trimmed hedge may be hard to tell apart by color alone.
Farmscapes addresses this by fusing two complementary data sources:
- Submeter aerial imagery: captures fine-grained texture, color, and planimetric shape.
- 1-meter LiDAR point-cloud data: captures the height of everything above the ground surface—tree canopies, walls, hedges, buildings.
This dual-layer approach produces two output layers:
- Ground-level boundaries: farmed land parcels, water bodies, roads—the flat features.
- Above-ground features: trees, hedgerows, walls, copses—anything with vertical extent above ground.
Separating these two layers is critical for downstream use. A farmer calculating the area of their cultivable land needs the ground-level parcel boundary, not the canopy extent. A carbon accountant measuring woody biomass needs the above-ground feature layer. Conflating them produces errors in both use cases.
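One common way to separate flat from raised features with LiDAR is to subtract a bare-earth terrain model (DTM) from a surface model (DSM) and threshold the difference. The sketch below illustrates that general technique; the original post does not say the Farmscapes pipeline works this way. The grids, elevations, and the 1 m threshold are all hypothetical.

```python
def height_above_ground(dsm, dtm):
    """Per-cell height of the surface above bare earth (DSM minus DTM)."""
    return [[s - t for s, t in zip(srow, trow)] for srow, trow in zip(dsm, dtm)]


def above_ground_mask(dsm, dtm, min_height_m):
    """True where something stands at least min_height_m above the ground."""
    return [
        [h >= min_height_m for h in row]
        for row in height_above_ground(dsm, dtm)
    ]


# Hypothetical 1 m grids (elevations in meters): a hedge runs down column 1.
dsm = [[50.0, 52.5, 50.1],
       [50.2, 52.8, 50.2],
       [50.3, 53.0, 50.4]]
dtm = [[50.0, 50.0, 50.1],
       [50.2, 50.2, 50.2],
       [50.3, 50.3, 50.4]]

mask = above_ground_mask(dsm, dtm, min_height_m=1.0)
for row in mask:
    print(row)
# [False, True, False]
# [False, True, False]
# [False, True, False]
```

Cells flagged `True` would feed the above-ground layer, while the remaining cells describe the ground-level surface that parcel boundaries are drawn on.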
Polsby-Popper: borrowing from political science to classify ecology
Once the model produces candidate polygons for above-ground woody features, the pipeline needs to decide: is this a woodland, a small copse, or a hedgerow?
The team uses the Polsby-Popper compactness score, a formula originally developed to detect gerrymandered electoral districts. It compares a shape's area with the area of a circle that has the same perimeter: PP = 4πA / P², where A is the area and P is the perimeter. A perfect circle scores 1.0; a highly elongated or irregular shape scores close to 0.
The classification rules are:
- Woodlands: compact shapes (Polsby-Popper ≥ 0.5) with a substantial canopy of at least 30 meters in diameter.
- Woody patches: compact shapes with smaller canopy—individual trees or small copses.
- Linear woody features (hedgerows and shelterbelts): Polsby-Popper score below 0.5, indicating the shape is elongated rather than circular.
This is an elegant choice. Rather than training a separate classifier on feature type (which would require type-specific annotations), the pipeline infers structural type directly from geometry. Hedgerows are, by definition, long and thin. The Polsby-Popper score operationalizes that intuition into a consistent, scalable rule.
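The rules above can be sketched in a few lines of plain Python. The score is the standard Polsby-Popper formula, 4πA / P². The function names are illustrative, and because the post does not say how canopy diameter is measured, it is passed in here as a precomputed input.

```python
import math


def polsby_popper(ring):
    """4*pi*A / P**2 for a simple polygon given as [(x, y), ...] (unclosed)."""
    closed = ring + ring[:1]
    edges = list(zip(closed, closed[1:]))
    # Shoelace formula for the area, summed edge lengths for the perimeter.
    area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2.0
    perimeter = sum(math.dist(a, b) for a, b in edges)
    return 4.0 * math.pi * area / perimeter ** 2


def classify(ring, canopy_diameter_m, pp_threshold=0.5, min_woodland_m=30.0):
    """Apply the three rules from the text; thresholds default to its values."""
    if polsby_popper(ring) < pp_threshold:
        return "linear woody feature"
    if canopy_diameter_m >= min_woodland_m:
        return "woodland"
    return "woody patch"


square_100 = [(0, 0), (100, 0), (100, 100), (0, 100)]  # 100 m x 100 m block
square_10 = [(0, 0), (10, 0), (10, 10), (0, 10)]       # 10 m x 10 m clump
strip = [(0, 0), (200, 0), (200, 3), (0, 3)]           # 200 m x 3 m hedge

print(round(polsby_popper(square_100), 3))  # 0.785
print(round(polsby_popper(strip), 3))       # 0.046
print(classify(square_100, canopy_diameter_m=100.0))  # woodland
print(classify(square_10, canopy_diameter_m=10.0))    # woody patch
print(classify(strip, canopy_diameter_m=3.0))         # linear woody feature
```

A square scores π/4 (about 0.785) regardless of size, which is why the canopy-size test is needed to separate woodlands from small patches, while the 200 m strip falls far below the 0.5 cutoff.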
Tile-based processing and geometry merging
Processing 130,000 km² in one pass is computationally impractical. The pipeline uses Google Earth Engine to run inference in parallel across a grid of tiles defined by S2 cells. S2 is a hierarchical geographic indexing system that partitions the Earth's surface into cells of roughly similar area at each level of the hierarchy.
This introduces a boundary problem: features that straddle tile edges get artificially split into two polygon fragments. The pipeline includes a post-processing step that merges geometries across tile borders, reconnecting fragmented hedgerows and woodlands into single coherent objects. This is a non-trivial engineering detail—without it, every cross-border feature would appear as two truncated pieces in the output, corrupting length measurements and connectivity analyses.
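The merging step can be illustrated with a simplified sketch. It is not the Farmscapes implementation: it treats features as polylines rather than polygons, joins fragments whose endpoints coincide within a tolerance, and uses a hypothetical tile edge at x = 100. The pairwise comparison is quadratic, so a real pipeline would need a spatial index.

```python
import math
from collections import defaultdict


def merge_fragments(fragments, tol=1e-6):
    """Group polyline fragments that share an endpoint (within tol) and
    return the sorted total length of each merged group."""
    parent = list(range(len(fragments)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    ends = [(f[0], f[-1]) for f in fragments]
    for i in range(len(fragments)):
        for j in range(i + 1, len(fragments)):
            if any(math.dist(p, q) <= tol for p in ends[i] for q in ends[j]):
                parent[find(i)] = find(j)

    lengths = defaultdict(float)
    for i, f in enumerate(fragments):
        lengths[find(i)] += sum(math.dist(a, b) for a, b in zip(f, f[1:]))
    return sorted(lengths.values())


# Hypothetical hedgerow cut at tile edge x = 100, plus one unrelated hedge.
fragments = [
    [(40.0, 10.0), (100.0, 10.0)],   # piece from the left-hand tile
    [(100.0, 10.0), (180.0, 10.0)],  # piece from the right-hand tile
    [(20.0, 50.0), (60.0, 50.0)],    # separate feature, left untouched
]

print(merge_fragments(fragments))  # [40.0, 140.0]
```

Without the merge, the first hedgerow would be reported as two features of 60 m and 80 m instead of one of 140 m, which is exactly the kind of error that corrupts length statistics.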
The climate-biodiversity-food security triangle
The Farmscapes project sits at the intersection of three competing pressures that define land-use policy in the 2020s.
The climate pressure: reducing atmospheric carbon requires expanding carbon sinks. Trees and shrubs are among the most scalable biological carbon stores. Rewilding initiatives across Europe aim to restore forest cover on agricultural land.
The biodiversity pressure: decades of agricultural intensification have caused catastrophic declines in farmland birds, pollinators, and wildflower meadows. Restoring habitat connectivity—the networks of hedgerows, field margins, and copses that link larger habitat patches—is one of the most evidence-supported biodiversity interventions available.
The food security pressure: a growing global population requires sustained or increased agricultural output. Rewilding or afforestation programs that convert productive cropland to woodland directly reduce the land available for food production. In a world where food security is already stressed, this tradeoff is politically and practically fraught.
Fine-scale woody features like hedgerows and shelterbelts offer a third way through this triangle. They occupy field margins and boundaries—land that typically contributes little to yield—rather than the cultivable interior of fields. A well-maintained hedgerow network can store significant carbon, provide habitat connectivity for pollinators and birds, reduce soil erosion, and buffer fields against wind without removing a single hectare from food production.
But realizing this opportunity requires knowing where these features are, how much carbon they contain, where they have been lost, and where new ones could be established. That is precisely what Farmscapes provides.
Real-world applications: from pixels to policy
Carbon accounting and nature credits
The immediate practical application is integrating hedgerow and copse data into carbon accounting frameworks. Current UK Woodland Carbon Code rules have minimum size thresholds that exclude most fine-scale woody features. With a nationally consistent vector dataset at submeter precision, it becomes possible to accurately quantify carbon stocks in existing hedgerows, monitor changes over time, and build the evidence base needed to include them in accredited carbon accounting methodologies.
Agri-environment payment schemes
England's post-Brexit Environmental Land Management (ELM) scheme pays farmers for ecosystem services—but payment rates and eligibility depend on accurate measurement of the features being managed. Farmscapes data could underpin payment calculations for hedgerow maintenance, new hedgerow establishment, and connectivity restoration, making the payments both more accurate and harder to game.
Identifying leakage
One persistent problem in conservation programs is leakage: a farmer protects hedgerows on their land in exchange for a payment, while a neighboring farm removes hedgerows that are not under any scheme. The net result for biodiversity and carbon is near-zero, but the payment has been made. Detecting leakage requires systematic monitoring across entire landscapes, not just on enrolled parcels. A national-scale vector dataset updated at regular intervals makes landscape-level leakage detection feasible for the first time.
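A landscape-level check of this kind could look roughly like the sketch below. It is an illustration only: the parcel identifiers, hedgerow lengths, and enrollment set are hypothetical, and it assumes total hedgerow length per parcel has already been computed from two survey years.

```python
def net_change_by_enrollment(before, after, enrolled_parcels):
    """Sum hedgerow length change (after minus before, in meters) separately
    for enrolled and non-enrolled parcels. Inputs map parcel id -> length."""
    totals = {"enrolled": 0.0, "not_enrolled": 0.0}
    for parcel in set(before) | set(after):
        delta = after.get(parcel, 0.0) - before.get(parcel, 0.0)
        key = "enrolled" if parcel in enrolled_parcels else "not_enrolled"
        totals[key] += delta
    return totals


# Hypothetical totals of hedgerow length (meters) per parcel in two surveys.
before = {"farm_a": 1200.0, "farm_b": 900.0}
after = {"farm_a": 1500.0, "farm_b": 620.0}

totals = net_change_by_enrollment(before, after, enrolled_parcels={"farm_a"})
landscape_net = totals["enrolled"] + totals["not_enrolled"]

print(totals)         # {'enrolled': 300.0, 'not_enrolled': -280.0}
print(landscape_net)  # 20.0
```

Looking only at the enrolled parcel suggests a 300 m gain, while the landscape as a whole gained just 20 m. That gap is the leakage signal that monitoring limited to enrolled parcels cannot see.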
Silvopasture and agrisilviculture planning
The research team explicitly flags support for silvopasture (integrating trees into pasture) and agrisilviculture (integrating trees into arable systems) as a future direction. Both practices require detailed baseline mapping of existing woody cover—where trees already exist, how dense they are, and where gaps in connectivity could be filled with new planting. Farmscapes provides exactly this baseline.
What open data means for this work
The decision to release Farmscapes as open data rather than as a commercial or restricted-access product is not a footnote—it is a critical design choice that determines whether the work achieves impact.
Carbon accounting depends on reproducibility and third-party verification. Conservation payment schemes require transparency in the underlying data. Academic researchers studying land-use change need access to the baseline. Farmers assessing their own land use want to see their own hedgerows in the data. None of these use cases are compatible with restricted data access.
By making the dataset publicly available and processing it on Google Earth Engine—a platform with free access for research and non-commercial use—the team has ensured that the work can be used, scrutinized, extended, and challenged by the full ecosystem of people who need it: ecologists, climate scientists, policymakers, land managers, and farmers themselves.
This also positions Farmscapes as infrastructure. Future work can extend the pipeline to other regions of the UK, to Europe, or to any landscape where fine-scale woody features are present and underrepresented in existing inventories. The combination of RSF pre-training (global satellite data), the dual-layer labeling framework, and the Polsby-Popper classification logic is a reusable system, not a one-off experiment.
What comes next
The Google Research team outlined three near-term directions at the end of their post:
- Broader utility for nature-based solutions: the pipeline is not limited to hedgerows. The same approach could map riparian buffer strips, field margin vegetation, roadside verges, and other fine-scale features that are similarly invisible to standard inventories.
- Silvopasture and agrisilviculture: integrating trees into farming systems at scale requires detailed understanding of existing woody cover on productive agricultural land. Farmscapes provides the baseline.
- Leakage event identification: building temporal comparison into the pipeline would allow detection of where habitat gains in one area are offset by losses elsewhere, making conservation accounting more honest and more useful for policy evaluation.
The technical infrastructure—RSF ViT, dual-layer fusion, Polsby-Popper classification, Google Earth Engine at scale—is in place. The open dataset is released. The question now is how quickly the land management, carbon accounting, and biodiversity policy communities build on top of it.
For a field where the standard answer to "where are the hedgerows?" has long been "we don't really know," Farmscapes is a substantial step forward.
FAQ
What is the Google Farmscapes dataset? Farmscapes is a vectorized land-cover dataset released by Google Research covering 130,000+ km² of England. It uses a deep learning pipeline—built on the Remote Sensing Foundations (RSF) Vision-Transformer backbone pre-trained on 300M+ satellite images—to detect fine-scale ecological features like hedgerows, stone walls, copses, and woodlands that are invisible to standard national forest inventories.
Why can't standard satellites detect hedgerows? Most national forest inventories rely on moderate-resolution satellite imagery (10–30 m per pixel). A mature hedgerow or stone wall boundary may be only a few meters wide. At that resolution, such features blend into adjacent cropland pixels and cannot be reliably detected. Farmscapes uses submeter aerial imagery combined with 1-meter LiDAR data to resolve these features accurately.
What is the RSF ViT backbone? RSF (Remote Sensing Foundations) is a Vision-Transformer model developed as part of Google Earth AI and pre-trained on over 300 million global satellite images. This large-scale pre-training gives the model rich feature representations that transfer well to specific tasks—like hedgerow detection—even when annotated training data is scarce. The Farmscapes pipeline fine-tuned RSF on only ~247 km² of annotated data to achieve national-scale coverage.
What is the Polsby-Popper score and why does it matter here? The Polsby-Popper score is a shape-compactness metric originally used to detect gerrymandered electoral districts. A perfect circle scores 1.0; elongated or irregular shapes score lower. Farmscapes uses it to distinguish linear woody features (hedgerows, shelterbelts) from compact ones (woodlands, copses). Features with a score below 0.5 are classified as linear woody features such as hedgerows; more compact shapes become woodlands or woody patches, depending on canopy size.
Why do hedgerows matter for climate and biodiversity? Hedgerows occupy field boundaries rather than cropland interiors, meaning they can accumulate carbon and support wildlife without displacing food production. This makes them a practical "third way" between pure rewilding and pure intensification. Accurate mapping is the first step to counting them in carbon credits, agri-environment schemes, and national biodiversity baselines.
Is the Farmscapes dataset publicly available? Yes. Google Research released the vectorized Farmscapes 2020 dataset as open data, accessible to farmers, scientists, and policymakers. It was processed using Google Earth Engine for parallel tile-based computation at national scale across England.
Sources
- Google Research blog post (June 16, 2026): goo.gle/4eMWqG0
- Authors: Michelangelo Conserva (Research Scientist) and Charlotte Stanton (Senior Program Manager), Google Research
- Related: Google Earth AI and Remote Sensing Foundations (RSF) project
This post is based on the Google Research blog post published June 16, 2026. Dataset details, coverage, and future roadmap may be updated as the project evolves.