Spikes (or Outliers) in spatial data occur when a vertex in a feature has an x, y, or z value that is so incorrect as to result in a spike-like appearance.
Spikes are a fairly specific type of defect, and there is a specific FME transformer designed to handle them: the SpikeRemover.
However, the visibility of a spike depends on the direction in which the data is being viewed. For example, a spike in the Z values of a feature won't be visible when viewing the data from above in a traditional x/y view. Therefore the CoordinateSwapper transformer can also be useful in dealing with spikes.
In this example, we will look at identifying and fixing spikes in a dataset containing contour lines.
Spike Removal: Workspace as a Template
The source data is a MicroStation DGN dataset containing bathymetric data for English Bay in the City of Vancouver:
Map tiles by Stamen Design, under CC-BY-3.0. Data by OpenStreetMap, under CC-BY-SA.
There are points to denote depth (which we can ignore) and contour lines. Both are 2.5D (i.e. they have Z values for each vertex) with depths in fathoms (a fathom is equal to 6 feet). We'll investigate whether there are any spikes in the contours, particularly in the Z values, which are harder to inspect visually.
Follow these steps to learn how to locate and fix spike features with a SpikeRemover transformer. Note that the SpikeRemover transformer gives no way to locate a spike without also fixing it, apart from a point feature that indicates the removed vertex.
1. Start FME Workbench and begin with an empty canvas. Select Readers > Add Reader from the menubar.
Set the data format to Bentley MicroStation Design (V8). Select the attached MicroStation dataset as the source and click OK to add the reader.
When prompted, only select the Contours feature type (level) to add to the workspace.
2. Inspect the source data in the FME Data Inspector. There is one very obvious spike (shown in the screenshot at the beginning of this article), but there are other pieces of contour that might also count as spikes:
3. Place a SpikeRemover transformer on the canvas, connected to the reader feature type.
Check the transformer parameters. There are parameters that let us control the maximum angle and the maximum length of a spike. A spike will be removed when the angle it creates is less than or equal to the maximum angle, and when the line segment is no longer than the maximum line length.
Set the angle parameter to a value of 10 and the length parameter to 250; i.e. the maximum angle is 10 degrees and the maximum length is 250 metres (the angle is measured in degrees, and because the coordinate system is UTM, lengths are in metres).
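The angle and length test described above can be sketched in plain Python. This is only an illustration of the criterion as worded in this article, not the SpikeRemover's actual implementation. The function names are hypothetical, and the sketch assumes that both segments meeting at the vertex must be no longer than the maximum length.

```python
import math

def vertex_angle(prev, vertex, nxt):
    """Angle in degrees at `vertex` between the segments to `prev` and `nxt`."""
    ax, ay = prev[0] - vertex[0], prev[1] - vertex[1]
    bx, by = nxt[0] - vertex[0], nxt[1] - vertex[1]
    la, lb = math.hypot(ax, ay), math.hypot(bx, by)
    if la == 0 or lb == 0:
        return 180.0  # duplicate vertex: treat as "not a spike"
    cos_t = max(-1.0, min(1.0, (ax * bx + ay * by) / (la * lb)))
    return math.degrees(math.acos(cos_t))

def remove_spikes(line, max_angle, max_length):
    """Return (cleaned_line, removed_vertices) for a list of (x, y) tuples.

    Assumption: a vertex is a spike when the angle at it is <= max_angle
    and both adjacent segments are <= max_length.
    """
    if len(line) < 3:
        return list(line), []
    kept, removed = [line[0]], []
    for i in range(1, len(line) - 1):
        prev, vertex, nxt = kept[-1], line[i], line[i + 1]
        sharp = vertex_angle(prev, vertex, nxt) <= max_angle
        short = (math.dist(prev, vertex) <= max_length
                 and math.dist(vertex, nxt) <= max_length)
        if sharp and short:
            removed.append(vertex)  # plays the role of the Removed point
        else:
            kept.append(vertex)
    kept.append(line[-1])
    return kept, removed

# Hypothetical line with one narrow spike about 200 m long.
line = [(0, 0), (100, 0), (105, 200), (110, 0), (200, 0)]
cleaned, removed = remove_spikes(line, max_angle=10, max_length=250)
print(cleaned, removed)
```

Lowering `max_length` below the spike's length, or lowering `max_angle` below the angle at its tip, leaves the vertex in place, which is why both parameters need tuning together.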
4. Connect Inspector transformers to the output ports of the SpikeRemover transformer and run the workspace.
A single spike should be located and removed:
Notice that a point feature emerges from the Removed output port to denote where the spike vertex was removed. If the QA process is intended to identify spikes for fixing in a different application, then the Removed output port can be saved to act as a flag for where edits should take place.
If the requirement is for FME to fix the problems, then the Changed port outputs the line feature with the spike removed.
5. To try to remove less extreme spikes, let's experiment with the SpikeRemover parameters. Set the angle parameter to 45 and run the translation again. This time 7 line features have had 9 spikes removed:
Unfortunately, this has also removed some points that were not really spikes, like so:
It is therefore important to experiment with the parameters to find the setting that corrects the most spikes without removing valid vertices.
6. To count the number of vertices removed, add a StatisticsCalculator transformer to the SpikeRemover:Removed port. In the parameters pick any attribute to analyze (_spike_angle is convenient) and set the Total Count Attribute parameter to NumberSpikes.
The output of this transformer now contains an attribute to count the number of spikes fixed.
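The counting step can be pictured with a short Python sketch. The records below are hypothetical stand-ins for the point features leaving the Removed port; only the `_spike_angle` attribute name comes from this article, while `line_id` and all the values are made up for illustration.

```python
from collections import Counter

# Hypothetical removed-vertex records (not real output from the workspace).
removed_points = [
    {"line_id": "contour_a", "_spike_angle": 3.2},
    {"line_id": "contour_a", "_spike_angle": 41.0},
    {"line_id": "contour_b", "_spike_angle": 27.5},
]

def count_spikes(points, group_key="line_id"):
    """Return the total number of removed vertices and a per-line breakdown."""
    per_line = Counter(p[group_key] for p in points)
    return len(points), per_line

total, per_line = count_spikes(removed_points)
print("NumberSpikes:", total)
print("Lines affected:", len(per_line))
```

A per-line breakdown like this makes it easier to see whether a looser angle setting is removing many vertices from a few lines or a few vertices from many.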
7. In the FME Data Inspector, use the toolbar button to switch the view to 3D mode:
Pivot the display (use the "Orbit" tool on the toolbar) and you will notice a previously unnoticed spike in one Z value:
The SpikeRemover only operates in two dimensions, so we will need to do something different to handle this spike.
8. Add a BoundsExtractor transformer connected to the Unchanged and Changed output ports of the SpikeRemover, and follow it with a Tester transformer:
The BoundsExtractor will return the minimum and maximum values for x, y, and z coordinates. Set the Tester to test for features where the minimum z coordinate is less than zero (_zmin < 0). This will isolate the feature at fault.
9. Now add a CoordinateSwapper transformer connected to the Tester:Passed port.
Check the transformer's parameters. Notice that it allows you to swap the coordinates on a particular axis; for example swapping the x and y values. In this case set it up to swap the Y <-> Z axes.
Add an Inspector transformer and run the workspace. You'll now see the data as a side view. The long spike is what we need to fix. Use the measurement tool in the Data Inspector and you'll find it is nearly 10,000m in length.
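The effect of the Y <-> Z swap can be sketched in a few lines of Python. The coordinates here are hypothetical and the function name is ours; the point is only that after the swap the Z values occupy the second ordinate, so a 2D viewer or a 2D spike test can see them.

```python
import math

def swap_yz(coords):
    """Swap the y and z ordinates of each (x, y, z) vertex."""
    return [(x, z, y) for x, y, z in coords]

# Hypothetical contour with one bad Z value.
contour = [(0, 0, 10), (50, 5, 10), (100, 10, -9990), (150, 15, 10)]
side_view = swap_yz(contour)

# In the side view the spike is measurable as an ordinary 2D length.
tip, neighbour = side_view[2], side_view[1]
spike_length = math.dist(tip[:2], neighbour[:2])
print(round(spike_length))
```

Applying `swap_yz` a second time restores the original ordering, which is why a duplicate of the same transformer can be used later to switch the axes back.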
10. Add a second SpikeRemover transformer, this one connected to the CoordinateSwapper:Output port. Set the maximum angle to 1 and the maximum length to 10,100.
11. Add another CoordinateSwapper transformer, this one a duplicate of the first in order to switch back the Y and Z axes:
Add an Inspector to the second CoordinateSwapper transformer and run the translation. You'll notice that the sharp spike has been removed.
Also notice that the line has been smoothed out in other places too, due to the data being compressed on the Y axis after the first coordinate swap. This is why we need to isolate the bad feature, to minimize the unwanted changes.
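Steps 9 to 11 amount to a swap, a 2D spike test, and a swap back. The sketch below chains them in Python under the same assumptions as the article's description: a vertex counts as a spike when the angle at it is at most the maximum angle and, as an assumption of this sketch, both adjacent segments are no longer than the maximum length. The function names and sample coordinates are hypothetical, and this is not how FME implements the transformers.

```python
import math

def swap_yz(coords):
    """Swap the y and z ordinates of each (x, y, z) vertex."""
    return [(x, z, y) for x, y, z in coords]

def angle_2d(prev, vertex, nxt):
    """Angle in degrees at `vertex`, using only the first two ordinates."""
    ax, ay = prev[0] - vertex[0], prev[1] - vertex[1]
    bx, by = nxt[0] - vertex[0], nxt[1] - vertex[1]
    la, lb = math.hypot(ax, ay), math.hypot(bx, by)
    if la == 0 or lb == 0:
        return 180.0  # duplicate vertex: treat as "not a spike"
    cos_t = max(-1.0, min(1.0, (ax * bx + ay * by) / (la * lb)))
    return math.degrees(math.acos(cos_t))

def despike_2d(coords, max_angle, max_length):
    """Drop spike vertices, testing only the first two ordinates."""
    if len(coords) < 3:
        return list(coords)
    kept = [coords[0]]
    for i in range(1, len(coords) - 1):
        prev, vertex, nxt = kept[-1], coords[i], coords[i + 1]
        short = (math.dist(prev[:2], vertex[:2]) <= max_length
                 and math.dist(vertex[:2], nxt[:2]) <= max_length)
        if angle_2d(prev, vertex, nxt) <= max_angle and short:
            continue  # spike vertex: leave it out
        kept.append(vertex)
    kept.append(coords[-1])
    return kept

def despike_z(coords, max_angle, max_length):
    """Swap Y and Z, remove spikes in the side view, then swap back."""
    return swap_yz(despike_2d(swap_yz(coords), max_angle, max_length))

# Hypothetical contour with one Z spike of roughly 10,000 units.
contour = [(0, 0, 10), (50, 5, 10), (100, 10, -9990),
           (150, 15, 10), (200, 20, 10)]
print(despike_z(contour, max_angle=1, max_length=10100))
```

Because the 2D test ignores the original Y values after the swap, bends that were gentle in plan view can look sharper from the side, which is the smoothing side effect noted above and the reason for isolating the bad feature first.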
We now have a set of data that has been cleaned of spikes. If automatic cleaning is not desirable then the spike removal points can be used to identify places to check where spikes might be manually resolved.
The data used here originates from public domain data made available by the City of Vancouver, British Columbia.
© 2019 Safe Software Inc | Legal