Intergration With 3rd-Party Extensions

Intergration With 3rd-Party Extensions#

3rd-party Xarray geospatial extensions may leverage XProj in different ways:


import pyproj
import xarray as xr
import xproj

xr.set_options(display_expand_indexes=True);

CRS-aware Xarray accessor#

Here below is a basic example of a custom “geo” Xarray Dataset accessor class that is also explictly registered with the xproj.register_accessor() decorator. It inherits from ProjAccessorMixin.

Registering this “geo” accessor allows executing custom logic from within the accessor (via the proj accessor interface) when calling xproj API.

Note

The xproj.register_accessor() decorator must be applied after (on top of) the Xarray register decorators.

@xproj.register_accessor
@xr.register_dataset_accessor("geo")
class GeoAccessor(xproj.ProjAccessorMixin):

    def __init__(self, obj):
        self._obj = obj

    @property
    def crs(self):
        # Just reusing XProj's API
        # (Assuming this accessor only supports single-CRS datasets)
        return self._obj.proj.crs

    def _proj_set_crs(self, crs_coord_name, crs):
        # Nothing much done here, just printing something before
        # returning the Xarray dataset unchanged

        print(f"from GeoAccessor: new CRS of {crs_coord_name!r} is {crs}!")
        return self._obj

Let’s see it in action with an Xarray tutorial dataset.

ds = xr.tutorial.load_dataset("air_temperature")

Assigning a new spatial reference coordinate with a CRS will also call the GeoAccessor._proj_set_crs method implemented above.

ds_wgs84 = ds.proj.assign_crs(spatial_ref="epsg:4326")
from GeoAccessor: new CRS of 'spatial_ref' is epsg:4326!
ds_wgs84
<xarray.Dataset> Size: 31MB
Dimensions:      (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat          (lat) float32 100B 75.0 72.5 70.0 67.5 ... 22.5 20.0 17.5 15.0
  * lon          (lon) float32 212B 200.0 202.5 205.0 ... 325.0 327.5 330.0
  * time         (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
  * spatial_ref  int64 8B 0
Data variables:
    air          (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Indexes:
    spatial_ref  CRSIndex (crs=EPSG:4326)
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

The CRS defined above can be accessed via the “geo” accessor:

ds_wgs84.geo.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

CRS-aware Xarray index#

Here below is a basic example of a CRS-aware index, i.e., a custom Xarray index that adds some CRS-dependent functionality (via the proj index interface) on top of Xarray’s default PandasIndex.

Note

The ProjIndexMixin class can be used to mark an Xarray index as formally implementing the proj index interface. However, XProj doesn’t require an Xarray index to explicitly inherit from this mixin class to be recognized as CRS-aware.

import warnings


class GeoIndex(xr.indexes.PandasIndex, xproj.ProjIndexMixin):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._crs = None

    def sel(self, *args, **kwargs):
        if self._crs is not None:
            warnings.warn(
                f"make sure that indexer labels have CRS {self._crs}!",
                UserWarning,
            )

        return super().sel(*args, **kwargs)

    @property
    def crs(self):
        return self._crs

    def _proj_set_crs(self, spatial_ref, crs):
        # note: `spatial_ref` is not used here
        print(f"set CRS of index {self!r} to crs={crs}!")

        self._crs = crs
        return self

    def _copy(self, deep=True, memo=None):
        # bug in PandasIndex? crs attribute not copied here
        obj = super()._copy(deep=deep, memo=memo)
        obj._crs = self._crs
        return obj

    def _repr_inline_(self, max_width=70):
        return f"{type(self).__name__} (crs={self._crs})"

    def __repr__(self):
        return f"{type(self).__name__} (crs={self._crs})"

Let’s see it in action by reusing the example dataset above, to which we replace the default indexes of the “lat” and “lon” coordinates with instances of the GeoIndex defined above.

ds_geo_wgs84 = (
    ds_wgs84
    .drop_indexes(["lat", "lon"])
    .set_xindex("lat", GeoIndex)
    .set_xindex("lon", GeoIndex)
)

Note that the CRS of the “lat” and “lon” indexes aren’t yet initialized despite the presence of the “spatial_ref” coordinate:

ds_geo_wgs84
<xarray.Dataset> Size: 31MB
Dimensions:      (lat: 25, time: 2920, lon: 53)
Coordinates:
  * lat          (lat) float32 100B 75.0 72.5 70.0 67.5 ... 22.5 20.0 17.5 15.0
  * lon          (lon) float32 212B 200.0 202.5 205.0 ... 325.0 327.5 330.0
  * time         (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
  * spatial_ref  int64 8B 0
Data variables:
    air          (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Indexes:
    spatial_ref  CRSIndex (crs=EPSG:4326)
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

Mapping the CRS of “spatial_ref” to the “lat” and “lon” geo-indexed coordinates has to be done manually using xarray.Dataset.proj.map_crs():

ds_geo_wgs84 = ds_geo_wgs84.proj.map_crs(spatial_ref=["lat", "lon"])
set CRS of index GeoIndex (crs=None) to crs=epsg:4326!
set CRS of index GeoIndex (crs=None) to crs=epsg:4326!

The CRS of the “lat” and “lon” geo-indexed coordinates is updated via the proj index interface implemented in GeoIndex. Data selection is now CRS-aware! (just a warning is emitted below).

ds_geo_wgs84.sel(lat=70)
/tmp/ipykernel_2519/75032568.py:12: UserWarning: make sure that indexer labels have CRS epsg:4326!
  warnings.warn(
<xarray.Dataset> Size: 1MB
Dimensions:      (time: 2920, lon: 53)
Coordinates:
  * time         (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
  * spatial_ref  int64 8B 0
    lat          float32 4B 70.0
  * lon          (lon) float32 212B 200.0 202.5 205.0 ... 325.0 327.5 330.0
Data variables:
    air          (time, lon) float64 1MB 250.0 249.8 248.9 ... 239.9 242.6 246.3
Indexes:
    spatial_ref  CRSIndex (crs=EPSG:4326)
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...

GeoIndex has a crs property (as required by ProjIndexMixin), which is possible to access also via the proj accessor like so:

ds_geo_wgs84.proj("lat").crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

Caveat#

Changing the CRS of a spatial reference coordinate via assign_crs() requires to manually call map_crs() again in order to synchronize the new CRS with the coordinate indexes.

temp = ds_geo_wgs84.proj.assign_crs(spatial_ref="epsg:4322", allow_override=True)

# note the CRS of the "lat" and "lon" indexes that hasn't changed
temp
from GeoAccessor: new CRS of 'spatial_ref' is epsg:4322!
<xarray.Dataset> Size: 31MB
Dimensions:      (time: 2920, lat: 25, lon: 53)
Coordinates:
  * time         (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
  * spatial_ref  int64 8B 0
  * lat          (lat) float32 100B 75.0 72.5 70.0 67.5 ... 22.5 20.0 17.5 15.0
  * lon          (lon) float32 212B 200.0 202.5 205.0 ... 325.0 327.5 330.0
Data variables:
    air          (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Indexes:
    spatial_ref  CRSIndex (crs=EPSG:4322)
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
ds_geo_wgs72 = temp.proj.map_crs(spatial_ref=["lat", "lon"], allow_override=True)

# up-to-date CRS
ds_geo_wgs72
set CRS of index GeoIndex (crs=epsg:4326) to crs=epsg:4322!
set CRS of index GeoIndex (crs=epsg:4326) to crs=epsg:4322!
<xarray.Dataset> Size: 31MB
Dimensions:      (time: 2920, lat: 25, lon: 53)
Coordinates:
  * time         (time) datetime64[ns] 23kB 2013-01-01 ... 2014-12-31T18:00:00
  * spatial_ref  int64 8B 0
  * lat          (lat) float32 100B 75.0 72.5 70.0 67.5 ... 22.5 20.0 17.5 15.0
  * lon          (lon) float32 212B 200.0 202.5 205.0 ... 325.0 327.5 330.0
Data variables:
    air          (time, lat, lon) float64 31MB 241.2 242.5 243.5 ... 296.2 295.7
Indexes:
    spatial_ref  CRSIndex (crs=EPSG:4322)
Attributes:
    Conventions:  COARDS
    title:        4x daily NMC reanalysis (1948)
    description:  Data is from NMC initialized reanalysis\n(4x/day).  These a...
    platform:     Model
    references:   http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...