Skip to content

depsi.classification

depsi.classification.ps_selection(slcs, threshold, method='nad', output_chunks=10000, mem_persist=False, ps_selection_start_date=None, ps_selection_end_date=None)

Select Persistent Scatterers (PS) from an SLC stack, and return a Space-Time Matrix.

The selection method is defined by method and threshold. The selected pixels will be reshaped to (space, time), where space is the number of selected pixels. The unselected pixels will be discarded. The original azimuth and range coordinates will be persisted. The computed NAD or NMAD will be added to the output dataset as a new variable. It can be persisted in memory if mem_persist is True. The original time axis will be preserved in all cases. If ps_selection_start_date and ps_selection_end_date are provided, the selection will only use the images in the provided time window. However, the full time axis will be preserved and returned. In this case, the layers time_selection_nmad / time_selection_nad and full_ts_nmad / full_ts_nad are thus different.

Parameters:

Name Type Description Default
slcs Dataset

Input SLC stack. It should have the following dimensions: ("azimuth", "range", "time"). There should be a amplitude variable in the dataset.

required
threshold float

Threshold value for selection for "nad" / "nmad".

required
method Literal['nad', 'nmad']

Method of selection, by default "nad". - "nad": Normalized Amplitude Dispersion - "nmad": Normalized median absolute deviation

'nad'
output_chunks int

Chunk size in the space dimension, by default 10000

10000
mem_persist bool

If true persist the NAD or NMAD in memory, by default False.

False
ps_selection_start_date datetime | str | None

the start date of the time window to be used for the ps_selection, in one of three formats: - datetime object - str object, formatted as YYYYMMDD - None, no cropping in time requested for the ps_selection (default)

None
ps_selection_end_date datetime | str | int | None

the end date of the time window to be used for the ps_selection, in one of four formats: - datetime object - str object, formatted as YYYYMMDD - int object, which is interpreted as the number of images intended in the crop (including the start date). If more images are requested than exist since the start date, all images from start_date until the last image are provided. - None, no cropping in time requested for the ps_selection (default)

None

Returns:

Type Description
Dataset

Selected STM, in form of an xarray.Dataset with dimensions: - space ( # PS selected) - time ( # epochs of input dataset) with coordinates: - time: epoch in np.datetime64 format - space: index of the PS - azimuth: azimuth coordinate of the PS - range: range coordinate of the PS with attributes: - ps_selection_start_date: the epoch of the first image used for the PS selection - ps_selection_end_date: the epoch of the last image used for the PS selection with variables: - h2ph (space, time): the height to phase conversion - lat (space): latitude of the PS - lon (space): longitude of the PS - complex (space, time): the complex value of the PS at each epoch - amplitude (space, time): the amplitude of the PS at each epoch - phase (space, time): the phase of the PS at each epoch - time_selection_nad / time_selection_nmad (space): the value used for selection of the PS, dependent on method - full_ts_nad (space): the Normalized Amplitude Dispersion of the PS - full_ts_nmad (space): the Normalized Median Amplitude Dispersion of the PS - pnt_class (space): 1 for all selected PS

Raises:

Type Description
NotImplementedError

Raised when an unsupported method is provided.

Source code in depsi/classification.py
def ps_selection(
    slcs: xr.Dataset,
    threshold: float,
    method: Literal["nad", "nmad"] = "nad",
    output_chunks: int = 10000,
    mem_persist: bool = False,
    ps_selection_start_date: datetime | str | None = None,
    ps_selection_end_date: datetime | str | int | None = None,
) -> xr.Dataset:
    """Select Persistent Scatterers (PS) from an SLC stack, and return a Space-Time Matrix.

    The selection method is defined by `method` and `threshold`.
    The selected pixels will be reshaped to (space, time), where `space` is the number of selected pixels.
    The unselected pixels will be discarded.
    The original `azimuth` and `range` coordinates will be persisted.
    The computed NAD or NMAD will be added to the output dataset as a new variable. It can be persisted in
    memory if `mem_persist` is True.
    The original time axis will be preserved in all cases. If `ps_selection_start_date` and `ps_selection_end_date`
    are provided, the selection will only use the images in the provided time window. However, the full time axis
    will be preserved and returned. In this case, the layers `time_selection_nmad` / `time_selection_nad` and
    `full_ts_nmad` / `full_ts_nad` are thus different.

    Parameters
    ----------
    slcs : xr.Dataset
        Input SLC stack. It should have the following dimensions: ("azimuth", "range", "time").
        There should be a `amplitude` variable in the dataset.
    threshold : float
        Threshold value for selection for "nad" / "nmad".
    method : Literal["nad", "nmad"], optional
        Method of selection, by default "nad".
        - "nad": Normalized Amplitude Dispersion
        - "nmad": Normalized median absolute deviation
    output_chunks : int, optional
        Chunk size in the `space` dimension, by default 10000
    mem_persist : bool, optional
        If true persist the NAD or NMAD in memory, by default False.
    ps_selection_start_date : datetime | str | None, optional
        the start date of the time window to be used for the ps_selection, in one of three formats:
        - datetime object
        - str object, formatted as YYYYMMDD
        - None, no cropping in time requested for the ps_selection (default)
    ps_selection_end_date : datetime | str | int | None, optional
        the end date of the time window to be used for the ps_selection, in one of four formats:
        - datetime object
        - str object, formatted as YYYYMMDD
        - int object, which is interpreted as the number of images intended in the crop (including the start date). If
            more images are requested than exist since the start date, all images from start_date until the last image
            are provided.
        - None, no cropping in time requested for the ps_selection (default)

    Returns
    -------
    xr.Dataset
        Selected STM, in form of an xarray.Dataset with dimensions:
        - space ( # PS selected)
        - time ( # epochs of input dataset)
        with coordinates:
        - time: epoch in np.datetime64 format
        - space: index of the PS
        - azimuth: azimuth coordinate of the PS
        - range: range coordinate of the PS
        with attributes:
        - ps_selection_start_date: the epoch of the first image used for the PS selection
        - ps_selection_end_date: the epoch of the last image used for the PS selection
        with variables:
        - h2ph (space, time): the height to phase conversion
        - lat (space): latitude of the PS
        - lon (space): longitude of the PS
        - complex (space, time): the complex value of the PS at each epoch
        - amplitude (space, time): the amplitude of the PS at each epoch
        - phase (space, time): the phase of the PS at each epoch
        - time_selection_nad / time_selection_nmad (space): the value used for selection of the PS, dependent on method
        - full_ts_nad (space): the Normalized Amplitude Dispersion of the PS
        - full_ts_nmad (space): the Normalized Median Amplitude Dispersion of the PS
        - pnt_class (space): 1 for all selected PS

    Raises
    ------
    NotImplementedError
        Raised when an unsupported method is provided.
    """
    # define the PS selection functions based on the method. This is in the function since it requires
    # _nad_block and _nmad_block to already be defined.
    ps_selection_functions = {
        "nad": _nad_block,
        "nmad": _nmad_block,
    }

    # Make sure there is no temporal chunk
    # since later a block function assumes all temporal data is available in a spatial block
    slcs = slcs.chunk({"time": -1})

    # Apply the time crop for the SLC selection if requested
    if ps_selection_start_date is not None:
        selection_slcs = crop_slc_spacetime(slcs, start_date=ps_selection_start_date, end_date=ps_selection_end_date)
    else:
        selection_slcs = slcs

    if method not in ps_selection_functions.keys():
        raise NotImplementedError(f"Know methods {ps_selection_functions.keys()} but {method} was requested!")

    # Calculate the selection mask
    nad_nmad = xr.map_blocks(
        ps_selection_functions[method],
        selection_slcs["amplitude"],
        template=selection_slcs["amplitude"].isel(time=0).drop_vars("time"),
    )
    nad_nmad = nad_nmad.compute() if mem_persist else nad_nmad
    slcs = slcs.assign({f"time_selection_{method}": nad_nmad})
    mask = nad_nmad < threshold

    # Get the 1D index on space dimension
    mask_1d = mask.stack(space=("azimuth", "range")).drop_vars(["azimuth", "range", "space"])  # Drop multi-index coords
    index = mask_1d["space"].where(mask_1d.compute(), other=0, drop=True)  # Evaluate the 1D mask to index

    # Reshape from Stack ("azimuth", "range", "time") to Space-Time Matrix  ("space", "time")
    stacked = slcs.stack(space=("azimuth", "range"))

    # Drop multi-index coords for space coordinates
    # This will also azimuth and range coordinates, as they are part of the multi-index coordinates
    stm = stacked.drop_vars(["space", "azimuth", "range"])

    # Assign a continuous index the space dimension
    # Assign azimuth and range back as coordinates
    stm = stm.assign_coords(
        {
            "space": (["space"], range(stm.sizes["space"])),
            "azimuth": (["space"], stacked["azimuth"].values),
            "range": (["space"], stacked["range"].values),
        }
    )  # keep azimuth and range as coordinates

    # Apply selection
    stm_masked = stm.sel(space=index)

    # Re-order the dimensions to community preferred ("space", "time") order
    stm_masked = stm_masked.transpose("space", "time")

    # Rechunk is needed because after apply masking, the chunksize will be inconsistent
    stm_masked = stm_masked.chunk(
        {
            "space": output_chunks,
            "time": -1,
        }
    )

    # Reset space coordinates
    stm_masked = stm_masked.assign_coords(
        {
            "space": (["space"], range(stm_masked.sizes["space"])),
        }
    )

    # add full timeseries NAD and NMAD
    nad = xr.map_blocks(
        _nad_block, stm_masked["amplitude"], template=stm_masked["amplitude"].isel(time=0).drop_vars("time")
    )
    nmad = xr.map_blocks(
        _nmad_block, stm_masked["amplitude"], template=stm_masked["amplitude"].isel(time=0).drop_vars("time")
    )
    stm_masked = stm_masked.assign({"full_ts_nmad": (["space"], nmad.data)})
    stm_masked = stm_masked.assign({"full_ts_nad": (["space"], nad.data)})

    # Add selection date attributes
    start_date = npdatetime64_to_datetime(selection_slcs["time"].values[0])
    end_date = npdatetime64_to_datetime(selection_slcs["time"].values[-1])
    stm_masked.attrs["ps_selection_start_date"] = start_date.strftime("%Y%m%d")
    stm_masked.attrs["ps_selection_end_date"] = end_date.strftime("%Y%m%d")

    # Add the classification flag
    stm_masked = stm_masked.assign({"pnt_class": (["space"], np.ones_like(stm_masked.space.values).astype(np.int8))})

    # Compute NAD or NMAD if mem_persist is True
    # This only evaluate a very short task graph, since NAD or NMAD is already in memory
    if mem_persist:
        for key in [f"time_selection_{method}", "full_ts_nad", "full_ts_nmad"]:
            stm_masked[key] = stm_masked[key].compute()

    return stm_masked

depsi.classification.network_stm_selection(stm, min_dist, include_index=None, sortby_var='time_selection_nmad', crs='radar', x_var='azimuth', y_var='range', azimuth_spacing=None, range_spacing=None)

Select a Space-Time Matrix (STM) from a candidate STM for network processing.

The selection is based on two criteria: 1. A minimum distance between selected points. 2. A sorting metric to select better points.

The candidate STM will be sorted by the sorting metric. The selection will be performed iteratively, starting from the best point. In each iteration, the best point will be selected, and points within the minimum distance will be removed. The process will continue until no points are left in the candidate STM.

Parameters:

Name Type Description Default
stm Dataset

candidate Space-Time Matrix (STM).

required
min_dist int | float

Minimum distance between selected points. The unit is determined by crs. When crs is "radar", the unit is the same as azimuth_spacing and range_spacing.

required
include_index list[int]

Index of points in the candidate STM that must be included in the selection, by default None

None
sortby_var str

Sorting metric for selecting points, by default "time_selection_nmad"

'time_selection_nmad'
crs int | str

EPSG code of Coordinate Reference System of x_var and y_var, by default "radar". If crs is "radar", the distance will be calculated based on radar coordinates, and azimuth_spacing and range_spacing must be provided.

'radar'
x_var str

Data variable name for x coordinate, by default "azimuth"

'azimuth'
y_var str

Data variable name for y coordinate, by default "range"

'range'
azimuth_spacing float

Azimuth pixel spacing, by default None. Required if crs is "radar".

None
range_spacing float

Range pixel spacing, by default None. Required if crs is "radar".

None

Returns:

Type Description
Dataset

Selected network Space-Time Matrix (STM).

Raises:

Type Description
ValueError

Raised when azimuth_spacing or range_spacing is not provided for radar coordinates.

NotImplementedError

Raised when an unsupported Coordinate Reference System is provided.

Source code in depsi/classification.py
def network_stm_selection(
    stm: xr.Dataset,
    min_dist: int | float,
    include_index: list[int] = None,
    sortby_var: str = "time_selection_nmad",
    crs: int | str = "radar",
    x_var: str = "azimuth",
    y_var: str = "range",
    azimuth_spacing: float = None,
    range_spacing: float = None,
):
    """Select a Space-Time Matrix (STM) from a candidate STM for network processing.

    The selection is based on two criteria:
    1. A minimum distance between selected points.
    2. A sorting metric to select better points.

    The candidate STM will be sorted by the sorting metric.
    The selection will be performed iteratively, starting from the best point.
    In each iteration, the best point will be selected, and points within the minimum distance will be removed.
    The process will continue until no points are left in the candidate STM.

    Parameters
    ----------
    stm : xr.Dataset
        candidate Space-Time Matrix (STM).
    min_dist : int | float
        Minimum distance between selected points. The unit is determined by `crs`.
        When `crs` is "radar", the unit is the same as `azimuth_spacing` and `range_spacing`.
    include_index : list[int], optional
        Index of points in the candidate STM that must be included in the selection, by default None
    sortby_var : str, optional
        Sorting metric for selecting points, by default "time_selection_nmad"
    crs : int | str, optional
        EPSG code of Coordinate Reference System of `x_var` and `y_var`, by default "radar".
        If crs is "radar", the distance will be calculated based on radar coordinates, and
        azimuth_spacing and range_spacing must be provided.
    x_var : str, optional
        Data variable name for x coordinate, by default "azimuth"
    y_var : str, optional
        Data variable name for y coordinate, by default "range"
    azimuth_spacing : float, optional
        Azimuth pixel spacing, by default None. Required if crs is "radar".
    range_spacing : float, optional
        Range pixel spacing, by default None. Required if crs is "radar".

    Returns
    -------
    xr.Dataset
        Selected network Space-Time Matrix (STM).

    Raises
    ------
    ValueError
        Raised when `azimuth_spacing` or `range_spacing` is not provided for radar coordinates.
    NotImplementedError
        Raised when an unsupported Coordinate Reference System is provided.
    """
    match crs:
        case "radar":
            if (azimuth_spacing is None) or (range_spacing is None):
                raise ValueError("Azimuth and range spacing must be provided for radar coordinates.")
        case _:
            raise NotImplementedError

    # Get coordinates and sorting metric, load them into memory
    stm_select = None
    stm_remain = stm[[x_var, y_var, sortby_var]].compute()

    # Select the include_index if provided
    if include_index is not None:
        stm_select = stm_remain.isel(space=include_index)

        # Remove points within min_dist of the included points
        coords_include = np.column_stack(
            [stm_select["azimuth"].values * azimuth_spacing, stm_select["range"].values * range_spacing]
        )
        coords_remain = np.column_stack(
            [stm_remain["azimuth"].values * azimuth_spacing, stm_remain["range"].values * range_spacing]
        )
        idx_drop = _idx_within_distance(coords_include, coords_remain, min_dist)
        if idx_drop is not None:
            stm_remain = stm_remain.where(~(stm_remain["space"].isin(idx_drop)), drop=True)

    # Reorder the remaining points by the sorting metric
    stm_remain = stm_remain.sortby(sortby_var)

    # Build a list of the index of selected points
    if stm_select is None:
        space_idx_sel = []
    else:
        space_idx_sel = stm_select["space"].values.tolist()

    while stm_remain.sizes["space"] > 0:
        # Select one point with best sorting metric
        stm_now = stm_remain.isel(space=0)

        # Append the selected point index
        space_idx_sel.append(stm_now["space"].values.tolist())

        # Remove the selected point from the remaining points
        stm_remain = stm_remain.isel(space=slice(1, None)).copy()

        # Remove points in stm_remain within min_dist of stm_now
        coords_remain = np.column_stack(
            [stm_remain["azimuth"].values * azimuth_spacing, stm_remain["range"].values * range_spacing]
        )
        coords_stmnow = np.column_stack(
            [stm_now["azimuth"].values * azimuth_spacing, stm_now["range"].values * range_spacing]
        )
        idx_drop = _idx_within_distance(coords_stmnow, coords_remain, min_dist)
        if idx_drop is not None:
            stm_drop = stm_remain.isel(space=idx_drop)
            stm_remain = stm_remain.where(~(stm_remain["space"].isin(stm_drop["space"])), drop=True)

    # Get the selected points by space index from the original stm
    stm_out = stm.sel(space=space_idx_sel)

    return stm_out