Schematic Processor v2.0 documentation

Iteration with the Schematic Processor

As a standalone geoprocessing tool, the Process Schematic Network tool within the Schematic Processor Tools toolbox is designed to process each node and link in a schematic network only once per tool execution. However, workflows involving processing a time series of values or performing stochastic analyses may require iteration over a number of steps.

This document describes two techniques for running the schematic processor with iteration. The explanations assume that the reader already has a good understanding of how the schematic processor works and is familiar with the Arc Hydro data model and tools.

The two techniques for iteration are:

  • Use ModelBuilder iteration.
  • Write a processing op that performs iteration across all steps for a given schematic feature.

The advantage of using ModelBuilder iteration is that you are working within the geoprocessing framework in ArcGIS. Workflows are readable to users who are familiar with this framework, and saving them as geoprocessing models makes them self-documenting. Furthermore, the ability to introduce randomness into input parameters enables stochastic analyses. However, working within this framework also has its limitations. For example, the Total Value and Passed Value attributes of schematic links and nodes are overwritten in each iteration, so you must take care to preserve values from intermediate iteration steps if you need them.

The advantage of iteration within a processing op is that you can program an iteration algorithm that is as sophisticated as you need, without the limits of the ModelBuilder framework. The disadvantage is that you may have to write a substantial amount of code to enable iteration.

Both techniques can be combined for even more advanced applications. For example, you could use processing op iteration to route inflow stream discharge along stream links with data from a related time series table, and use ModelBuilder iteration to introduce randomness into the computations for stochastic analyses. The techniques are described in more detail below.

ModelBuilder Iteration

ModelBuilder in ArcGIS 10 includes several methods for iteration, such as iterating over a specified number of executions, a set of rows or features, a set of values (for example, a list of datetimes), or a list of files. For each iteration, the entire geoprocessing model is executed. The arcgis.rand() function can be used to introduce random variability in input parameters, enabling stochastic analyses. The ArcGIS help documents ModelBuilder's iteration capabilities in detail.

To iterate the schematic processor using ModelBuilder, you would perform these steps:

  1. Create a new ArcToolbox geoprocessing model.
  2. Drag the Process Schematic Network tool into the model.
  3. Drag other tools and data into the model to create the desired workflow.
  4. Add model iteration. This can be done via the Insert menu or via the Model Properties dialog.

The main challenge with this approach is managing the inputs and outputs of each iteration. If you were to iterate only on the Process Schematic Network tool, the tool would use the same inputs for each iteration, and it would overwrite the Total Value and Passed Value fields each time. Therefore, your model will likely include some tools before the Process Schematic Network tool to prepare the input data for each iteration, and some tools after it to store the results.

As an example, consider the model in the image below. This model is deliberately simple; its only purpose is to illustrate how iteration can be performed. The goal of the model is to accumulate some parameter throughout the network, and to store the results of three iterations in separate fields in the node feature class.

_images/iteration_model5.PNG

The workflow in the model proceeds as follows:

  1. The Calculate Input Field tool calculates the incremental values for all nodes to be (arcgis.rand("Normal 2 2"))*10. This calculates a random number with a normal distribution, with a mean of 2 and a standard deviation of 2, and then multiplies that random number by 10. This represents preparing input data for a given iteration.
  2. The Process Schematic Network tool includes no custom processing ops, which means the values are simply accumulated throughout the network. The result is that the nodes and links have values written to their Total Value and Passed Value fields.
  3. The Add Result Field tool adds a field named Result%n% to the nodes. The %n% is a ModelBuilder inline variable directing ModelBuilder to substitute the current model iteration index in its place. For example, for iteration number 3, the new field would be called Result3.
  4. The Calculate Result Field tool sets the values of the field created in the previous step equal to the values currently in the Total Value field of the nodes. Recall that the Total Value field is overwritten in each iteration, so the result is stored in a separate field for safekeeping.
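
The four steps above can be mimicked outside ArcGIS to see how the fields evolve. The following is a minimal sketch in plain Python, not ModelBuilder or arcpy code. The toy network in DOWNSTREAM, the helper names accumulate and run_iterations, and the use of random.gauss as a stand-in for arcgis.rand("Normal 2 2") are all assumptions made for illustration.

```python
import random

# Hypothetical toy network: each node maps to its downstream node (None = outlet).
DOWNSTREAM = {"A": "C", "B": "C", "C": "D", "D": None}


def accumulate(incremental, downstream):
    """Return each node's total: its own increment plus all upstream increments."""
    totals = dict(incremental)
    for node, value in incremental.items():
        nxt = downstream[node]
        # Push this node's increment to every node downstream of it.
        while nxt is not None:
            totals[nxt] += value
            nxt = downstream[nxt]
    return totals


def run_iterations(n_iterations, downstream, seed=None):
    """Mimic the model: new random inputs, accumulate, copy TotVal to Result<n>."""
    rng = random.Random(seed)
    nodes = {name: {} for name in downstream}
    for n in range(n_iterations):  # zero-based, like the %n% inline variable
        # Step 1: prepare inputs (normal distribution, mean 2, std dev 2, times 10).
        incremental = {name: rng.gauss(2, 2) * 10 for name in downstream}
        # Step 2: simple accumulation through the network.
        totals = accumulate(incremental, downstream)
        for name, total in totals.items():
            nodes[name]["TotVal"] = total          # overwritten every iteration
            nodes[name]["Result%d" % n] = total    # steps 3 and 4: kept per iteration
    return nodes


nodes = run_iterations(3, DOWNSTREAM, seed=1)
```

After the run, each node holds TotVal plus Result0, Result1, and Result2, and TotVal equals Result2 because only the last iteration's total survives in TotVal.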

The model is set to run for three iterations using the Model Properties dialog as shown in the image below.

_images/iteration_model_properties5.PNG

After the model executes, the user will find three fields added to the node feature class, storing the result of the analysis. (The iteration index is zero-based, so the indices for three iterations are 0, 1, and 2.) The image below shows what the result might look like for a few nodes. Notice how TotVal (the Total Value) matches Result2. Result0 and Result1 store the values calculated for TotVal in earlier model iterations.

_images/iteration_results5.PNG

If you run 1000 iterations, storing each result in its own field becomes unwieldy. As your workflow becomes more complex, you’ll likely have to take advantage of the suite of ArcToolbox tools (such as those related to table joins and queries) to store the results in a more manageable form. You could also write custom processing ops to handle the storage of results for you, which is similar to the technique used for processing op iteration described in the next section.
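
One way to keep many iterations manageable is a "long" table with one row per node per iteration instead of one field per iteration. The sketch below illustrates the idea with Python's built-in sqlite3 module rather than ArcToolbox tools. The table name results, its columns, the helper store_results, and the sample totals are all hypothetical.

```python
import sqlite3


def store_results(conn, iteration, totals):
    """Append one row per node for the given iteration."""
    conn.executemany(
        "INSERT INTO results (iteration, node_id, total_value) VALUES (?, ?, ?)",
        [(iteration, node_id, value) for node_id, value in totals.items()],
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (iteration INTEGER, node_id TEXT, total_value REAL)"
)

# Illustrative totals for two iterations (invented values, not real output).
store_results(conn, 0, {"A": 10.0, "B": 30.0})
store_results(conn, 1, {"A": 20.0, "B": 50.0})

# A query can then summarize any number of iterations per node.
mean_by_node = dict(
    conn.execute("SELECT node_id, AVG(total_value) FROM results GROUP BY node_id")
)
```

With this layout, adding more iterations adds rows rather than fields, and summaries such as a per-node mean come from a single query.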

Processing Op Iteration

The schematic processor calls upon processing ops to handle custom receiving or passing behaviors for links and nodes. It relies on the op to perform whatever calculations are required, and it expects the op to return a value that will be written to the attribute table of the current link or node. Thus, processing ops are free to do whatever they want behind the scenes, as long as they return a value. This is your opportunity to implement iteration, and because ops are written in Python, there are many iteration scenarios that you can program.

As an example, consider the case where schematic features are related to data in a TimeSeries table, where each row in the table represents a single value of a given variable (VarID) at a given time for a given feature. For this example, the TimeSeries table stores rainfall time series (VarID = 1) for schematic nodes that represent catchments. The desired behavior is for each node to compute a runoff time series from its rainfall time series, and to write the runoff time series (VarID = 2) as new rows in the TimeSeries table. This behavior can be implemented with a processing op. When the schematic processor reaches a catchment node over the course of its processing, it will call upon the processing op which will perform the following tasks:

  1. Select related rainfall time series rows from the TimeSeries table.
  2. Read data from the related rows into memory.
  3. Perform the rainfall-runoff calculation using the rainfall time series and catchment properties stored as attributes on the schematic node.
  4. Write the runoff time series to the TimeSeries table as new rows with VarID = 2.
  5. To indicate which time series represents runoff in the TimeSeries table, return a value of 2 to the schematic processor. The schematic processor stores this value in either the Total Value or Passed Value field of the current node and then moves on to the next feature in the network.
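
The five tasks above can be sketched as follows. This is a minimal illustration in plain Python, not the code from relate_to_timeseries.py and not the schematic processor's actual op interface. The TimeSeries table is represented by an in-memory list, and the names TimeSeriesRow, runoff_op, and runoff_coeff, as well as the constant-coefficient runoff calculation, are assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical in-memory stand-in for one row of the TimeSeries table.
TimeSeriesRow = namedtuple("TimeSeriesRow", "feature_id var_id ts_time ts_value")

RAINFALL_VAR_ID = 1
RUNOFF_VAR_ID = 2


def runoff_op(node, timeseries):
    """Compute a runoff series for one catchment node and return its VarID."""
    # Tasks 1 and 2: select the node's rainfall rows and read them into memory.
    rainfall = [
        row for row in timeseries
        if row.feature_id == node["id"] and row.var_id == RAINFALL_VAR_ID
    ]
    # Task 3: placeholder rainfall-runoff calculation (assumed: a constant
    # runoff coefficient stored as an attribute on the node).
    runoff = [
        TimeSeriesRow(node["id"], RUNOFF_VAR_ID, row.ts_time,
                      node["runoff_coeff"] * row.ts_value)
        for row in rainfall
    ]
    # Task 4: write the runoff series back to the table as new rows.
    timeseries.extend(runoff)
    # Task 5: return the VarID so the processor can record it on the node.
    return RUNOFF_VAR_ID


# Illustrative data: node 7 has two rainfall values; node 8 is unrelated.
table = [
    TimeSeriesRow(7, RAINFALL_VAR_ID, "t0", 4.0),
    TimeSeriesRow(7, RAINFALL_VAR_ID, "t1", 10.0),
    TimeSeriesRow(8, RAINFALL_VAR_ID, "t0", 1.0),
]
returned = runoff_op({"id": 7, "runoff_coeff": 0.5}, table)
```

The iteration over time steps happens entirely inside the op (the list comprehensions), while the schematic processor still visits the node only once and receives a single return value.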

In this example, the values that the schematic processor writes to the Total Value and Passed Value fields are not very important. The real values of interest are those that the processing op writes to the TimeSeries table.

To see this kind of example implemented in code, please refer to the relate_to_timeseries.py script in the ops folder. The script is well-commented to explain the logic behind each section of code. Note that this script performs computations much simpler than rainfall-runoff for the sake of brevity.