NovaStar / Core Data Filing Process / Validation

Introduction

The validation step checks scaled values for validity and assigns the following data flags:

V for valid data
Q for questionable data
B for a break

Validation is run automatically during data filing (such as from nsrecdata and sendrpt) and can be run manually if database values are manually edited.

TODO need to mention how edits in the Administrator and other programs are handled. I commented on that in the nsdbvalidate program doc.

Process

The process of validation has several different substeps, which are described in the following sections.

TODO all of these would greatly benefit from diagrams, especially showing the intervals. Maybe do a plot Administrator, put in PowerPoint and edit, or just show a simple case in PowerPoint.

Can a statement be made that a data value is considered valid unless one of the following conditions is met?

Finding Data Reports that are Close in Time

Validation looks at data reports that are close in time to determine whether a data report is valid. In some cases, data reports that have already been validated may have their validation status changed by a new report coming in.

To determine what reports are close in time, the following process is used:

The check interval is determined via the following steps:
1. The point for the data report to examine is checked for an interval field.
  1. If there is an interval, this is taken as the value to use in future steps.
  2. Otherwise, 86400 seconds (24 hours) is taken as the value to use in future steps. This is a historical default appropriate for legacy ALERT, where a daily regular report is common.
2. The point is checked for a change_interval field. If that value is greater than the value from the previous step, it is used in future steps. Otherwise, the value from the previous step remains.
3. The point is checked for a compress_interval field. If that value is greater than the value from the previous step, it is used in future steps. Otherwise, the value is unchanged.
4. If the data report is to be filed (true most of the time, false if -f is given to nsrecdata) and alarms should be checked (true most of the time, false if -a is given to nsrecdata, sendrpt, starpt, or nsdataimportrpt), get all alarm triggers for the point. TODO make any program references links.
  1. Each alarm trigger is checked in the following ways:
    1. If the alarm trigger type is not one of rise, fall, or report count, skip to the next trigger.
    2. Otherwise, check if the value from previous steps is less than the alarm trigger limit_interval plus the value from the first step. If so, use the limit_interval plus the value from the first step in future steps.
    3. If the alarm trigger reset_interval is null, skip to the next trigger.
    4. Check if the value from previous steps is less than the alarm trigger reset_interval plus the value from the first step. If so, use the reset_interval plus the value from the first step in future steps.
5. Check if the repeat_report_interval is set in the global configuration table. If not, use 2 seconds as the repeat report interval.
6. Check if the repeat report interval is larger than the value from the earlier steps. If so, use the repeat report interval instead.
Subtract the calculated check interval from the time of the report to be validated. This is the start time.
Add the calculated check interval to the time of the report to be validated. This is the end time.
Get all data reports for the point between the start and end time, ordered in ascending order of report time (oldest to newest).

The result of this is the report list, which is used in the checks described below.
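
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the NovaStar implementation: the `Point` and `AlarmTrigger` classes, the function names, and the treatment of a missing field as `None` are all hypothetical. Only the field names interval, change_interval, compress_interval, limit_interval, reset_interval, and repeat_report_interval come from the description above.

```python
from dataclasses import dataclass, field
from typing import Optional

DEFAULT_INTERVAL = 86400      # seconds (24 hours), the historical default
DEFAULT_REPEAT_INTERVAL = 2   # seconds, used when repeat_report_interval is not set
INTERVAL_TRIGGER_TYPES = {"rise", "fall", "report count"}


@dataclass
class AlarmTrigger:  # hypothetical stand-in for an alarm trigger record
    type: str
    limit_interval: int
    reset_interval: Optional[int] = None


@dataclass
class Point:  # hypothetical stand-in for a point record
    interval: Optional[int] = None
    change_interval: Optional[int] = None
    compress_interval: Optional[int] = None
    triggers: list = field(default_factory=list)


def check_interval(point, file_report=True, check_alarms=True,
                   repeat_report_interval=None):
    """Return the check interval in seconds for a point."""
    # Step 1: the point interval, or the 24 hour default.
    base = point.interval if point.interval is not None else DEFAULT_INTERVAL
    value = base
    # Steps 2 and 3: a larger change or compress interval replaces the value.
    for candidate in (point.change_interval, point.compress_interval):
        if candidate is not None and candidate > value:
            value = candidate
    # Step 4: alarm triggers are considered only when filing and checking alarms.
    if file_report and check_alarms:
        for trigger in point.triggers:
            if trigger.type not in INTERVAL_TRIGGER_TYPES:
                continue
            value = max(value, trigger.limit_interval + base)
            if trigger.reset_interval is None:
                continue
            value = max(value, trigger.reset_interval + base)
    # Steps 5 and 6: the repeat report interval is a lower bound.
    repeat = (repeat_report_interval if repeat_report_interval is not None
              else DEFAULT_REPEAT_INTERVAL)
    return max(value, repeat)


def report_window(report_time, interval):
    """Return the (start, end) times used to select nearby reports."""
    return report_time - interval, report_time + interval
```

For example, a point with an interval of 300 seconds, a change_interval of 3600 seconds, and one rise trigger with a limit_interval of 7200 seconds would get a check interval of 7500 seconds (7200 + 300), because that sum exceeds the 3600 seconds carried forward from the earlier steps.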

Zero Limit

For some calculations, values are checked not against zero itself, but the zero limit, which is calculated as follows:

Get the scaled digits from the point type for the point.
Divide 5 by 10 raised to the power of (scaled digits + 1). This is the zero limit.

For example, if the scaled digits are 0, the zero limit is 5/10, or 0.5. TODO Does the zero limit indicate zero? Null? NaN? I generally just do something like check for -.00000001 to .00000001. Maybe Dave Leader used some precision math to get to the same place?
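
A short Python sketch of the zero limit calculation follows. The function names are hypothetical, and reading the zero limit as a tolerance (a magnitude below it is treated as zero) is an assumption based on how the checks in later sections compare differences against it.

```python
def zero_limit(scaled_digits: int) -> float:
    """Return 5 / 10**(scaled_digits + 1)."""
    return 5 / 10 ** (scaled_digits + 1)


def is_effectively_zero(value: float, scaled_digits: int) -> bool:
    """Assumed interpretation: magnitudes below the zero limit count as zero."""
    return abs(value) < zero_limit(scaled_digits)
```

With 0 scaled digits the zero limit is 0.5, and with 2 scaled digits it is 0.005, which is half of the smallest increment that can be shown at that number of digits.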

TODO seems like this is background that should maybe go in the introduction or at the top of the process?

Bounds Checking

Bounds checks compare a scaled value against the limits defined for a point. For example, water level limits may be defined to match physical limits and precipitation limits may be defined based on a sensor's maximum accumulated value.

Check that the point for the data report exists. If not, note that this report is out of bounds and skip to the final step. TODO maybe use "indicate that" instead of "note that". Just need to make sure that "note that" is not a message to the reader but is an explanation of logic.
Check that the raw data for the report is a number. If not, note that this report is out of bounds and skip to the final step.
If the point class of the point of the report to be validated is status, loop through each point status.
1. If any point status has a use of fixed on with a corresponding data bit of 0, note that this report is out of bounds and skip to the final step. TODO maybe use ticks to format "fixed on" and "fixed off" to make it stand out?
2. If any point status has a use of fixed off with a corresponding data bit of 1, note that this report is out of bounds and skip to the final step.
If the point class of the point is counter, encoder, wind, or alertwind:
1. If the point class is alertwind, shift the data so that it is just wind run rather than wind run and wind direction combined.
2. If the point type cycle size (TODO maximum sensor value?) is >= the zero limit, check if the raw data is negative or >= the cycle size. If either is true, note that this report is out of bounds and skip to the final step.
Check that the calibrated data for the report is a number. If not, note that this report is out of bounds and skip to the final step. TODO meaning not NaN? Why wouldn't it be a number? Should "calibrated data" be "scaled value"?
If the point type upper limit is greater than 0 or the lower limit is greater than the zero limit, compare the calibrated data against both limits. If the calibrated data is greater than the upper limit or less than the lower limit, note that this report is out of bounds and skip to the final step.
If the report was noted to be out of bounds, check if the report is marked as verified (E) or maintenance (M). If not, remove any valid (V) flag on the report and add a questioned (Q) flag. TODO Maybe should use "scaled value" instead of "the report"? Use "the report" for entire record?
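
The bounds check can be sketched in Python as shown below. This is an illustration only, under several assumptions: the `Point` and `PointStatus` classes and all function names are hypothetical, "is a number" is taken to mean a numeric value that is not NaN, each point status is assumed to carry the index of its data bit, and the alertwind shift that isolates wind run is omitted because its layout is not described here.

```python
import math
from dataclasses import dataclass, field


@dataclass
class PointStatus:  # hypothetical stand-in for a point status record
    use: str        # for example "fixed on" or "fixed off"
    bit: int        # assumed: index of the corresponding bit in the raw data


@dataclass
class Point:        # hypothetical stand-in for a point and its point type
    point_class: str
    scaled_digits: int = 2
    cycle_size: float = 0.0
    lower_limit: float = 0.0
    upper_limit: float = 0.0
    statuses: list = field(default_factory=list)


def is_number(x) -> bool:
    """Assumed meaning of "is a number": numeric and not NaN."""
    return (isinstance(x, (int, float)) and not isinstance(x, bool)
            and not math.isnan(x))


def out_of_bounds(point, raw, calibrated) -> bool:
    """Return True if any bounds check fails."""
    if point is None or not is_number(raw):
        return True
    zero_limit = 5 / 10 ** (point.scaled_digits + 1)
    if point.point_class == "status":
        for status in point.statuses:
            bit = (int(raw) >> status.bit) & 1
            if status.use == "fixed on" and bit == 0:
                return True
            if status.use == "fixed off" and bit == 1:
                return True
    if point.point_class in {"counter", "encoder", "wind", "alertwind"}:
        # The alertwind shift is not shown; raw is used as given.
        if point.cycle_size >= zero_limit and (raw < 0 or raw >= point.cycle_size):
            return True
    if not is_number(calibrated):
        return True
    if point.upper_limit > 0 or point.lower_limit > zero_limit:
        if calibrated > point.upper_limit or calibrated < point.lower_limit:
            return True
    return False


def apply_bounds_flags(flags: set, oob: bool) -> set:
    """Final step: swap V for Q unless the report is verified or maintenance."""
    if oob and not ({"E", "M"} & flags):
        return (flags - {"V"}) | {"Q"}
    return set(flags)
```

Returning early from `out_of_bounds` plays the role of "skip to the final step", and `apply_bounds_flags` is that final step.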

Main Process

The data report is checked for the questioned (Q) or maintenance (M) flag. If it has either one, no further processing is done. (Note that if being run via nsdbvalidate, any questioned flags are removed before reaching this step.) TODO does this mean that values only get changed from V to Q and never Q to V? When does E come in because I thought it made values sticky?
Run the bounds check. If the bounds check marks the report as questioned, stop processing. Maybe say no further checks are performed rather than "stop processing"?
Set the allowed change value to the change value for the point.
If the allowed change value is greater than the zero limit: TODO insert "check the change value"? Does zero limit mean the equivalent of zero, null, or NaN? I'm trying to understand.
1. Loop through the nearest reports before the report being validated from the retrieved report list, stopping either when there are no more retrieved reports or after 10 reports. If out of reports and the allowed change value is less than the zero limit, set the change flag to true. On each loop, do the following: TODO meaning loop backwards?
2. If the amount of time between the report selected for this loop iteration and the report after it is larger than the interval field of the point:
  1. If this is the first time through the loop, note that the report to be validated should have a break (B) flag added. TODO should have? Or set the flag to B? Need to indicate in the data model documentation that a break is defined as a gap in data longer than ? and the first value bounding the gap is marked with B.
  2. For all other times through the loop, note that the older report should have a break (B) flag added.
3. If the report selected for this loop iteration is marked as maintenance (M) or is marked as questioned by the bounds check, go immediately to the next loop iteration.
4. If the point class of the point is status TODO (binary on/off as 0/1):
  1. If the absolute value of the difference between the scaled values of the report to be validated and the report selected for this iteration of the loop is less than the zero limit, increment the valid report counter. If that counter is equal to the rounded value of the change allowed by the point, note that the older report should have a valid (V) flag added, set the change flag to true, and exit the loop.
  2. If the absolute value of the difference between the scaled values of the current and previous reports is greater than or equal to the zero limit:
    1. If the previous report has a valid (V) flag, then increment the not valid report counter. If this counter is equal to the rounded value of the change allowed by the point, exit the loop.
    2. Otherwise, if the previous report does not have a questioned (Q) flag, note that the previous report should have a valid (V) flag added, note that the previous report should be updated, and clear any notes that the previous report should have a break flag added.
5. For all other point classes:
  1. If the allowed change value is less than the zero limit, set the change flag to true, and exit the loop. TODO meaning what? Need to explain to the reader what is going on without getting confused by "zero limit"? Does "change flag" mean need to write the record?
  2. If the absolute value of the difference between the scaled values of the current and previous reports is less than or equal to the allowed change value (and, if the point class is counter, the change is positive), TODO this is a valid value:
    1. Note that the older report should have a valid (V) flag added. I find "note that" to be confusing. Does this mean that the older report's data flag IS set to V? Maybe say somewhere up front that the process is run for each data report and the last step is to update the database data report.
    2. Note that the older report should be updated, and clear any notes that the older report should have a break flag added. TODO Rather than "clear any notes" say "indicate that the".
    3. Set the change flag to true. TODO maybe avoid "flag" because there are data flags. Maybe say "Set an internal flag that the scaled value has exceeded the change limit?
    4. Increment the valid report counter, and save the position of this report as the last known valid report.
    5. Mark this report to be updated, and clear any notes to add a break flag to this report. TODO maybe say "any indicators" instead of "any notes"?
    6. If this is the second or later iteration of the loop and a report from a previous loop was valid, calculate the difference between the scaled value of the last known valid report and the scaled value of the report for this iteration of the loop. If this is greater than the zero limit, add the point change value to the allowed change value. Maybe say somewhere that the change in scaled values is computed for the cumulative change in values for the change interval and values can go up and down in the interval.
    7. Otherwise, calculate the difference between the scaled value of the report for this iteration of the loop and the scaled value for the report to be validated. If this is greater than the zero limit, add the point change value to the allowed change value. TODO I don't understand this. Why?
  3. Otherwise TODO do what, can a short statement be inserted here?:
    1. If the report selected for this iteration of the loop has the valid (V) flag:
      1. If this is the second or later iteration of the loop and a report from a previous loop was valid, calculate the difference between the scaled value of the last known valid report and the scaled value of the report for this iteration of the loop. If this is greater than the zero limit, add the point change value to the allowed change value. TODO again, zero limit is confusing. Checking for null or Nan?
      2. Otherwise, calculate the difference between the scaled value of the report for this iteration of the loop and the scaled value for the report to be validated. If this is greater than the zero limit, add the point change value to the allowed change value.
    2. Mark that the report selected for this iteration of the loop should have the valid (V) flag removed. TODO maybe say "Indicate that". Try to be consistent because elsewhere "note that" is used.
    3. Increment the not valid report counter. TODO what is the valid report counter?
    4. If this is not the first time through the loop and the report selected on the last iteration was marked to have the valid flag added:
      1. If the difference between the scaled value of the report selected the last time through the loop and the scaled value of the report selected this time through the loop is larger than the allowed change (or, if the point class is counter, the change is negative) and the report selected the last time through the loop does not have the valid flag, note that the report selected the last time through the loop should have the break (B) flag added.
  4. If the report selected for this iteration of the loop already has the valid (V) flag:
    1. If the valid report counter is greater than or equal to 3, exit the loop.
    2. Otherwise, if the not valid report counter is greater than or equal to the valid report counter plus 3, exit the loop.
Otherwise set the change flag to true. TODO kind of deep to remember what "otherwise" is. Maybe say "otherwise, explain the case, ...".
If the change for the point is greater than the zero limit:
1. If the position of the last valid report was saved (see step 4.5.2.4, where the position of the last known valid report is saved), the last valid report was not the last report checked in the loop, the valid report counter is greater than 0, and the not valid report counter is greater than the valid report counter:
  1. Note that the last valid report should have the break (B) flag added.
  2. Iterate through the reports between the report being validated and the last valid report. If an intermediate report does not have the verified (E) or maintenance (M) flag, or has no note to add a valid flag, mark the report to be updated and remove any notes to have the break flag added.
2. If the valid report counter is greater than or equal to the not valid report counter and the not valid report counter is less than 3:
  1. Loop backwards through all reports checked in the loop in step 4.1.
    1. If the report has the valid (V) flag or has a note to add a valid flag, skip to the next loop iteration. TODO can be confusing with multiple loops. Maybe say "skip to the previous report".
    2. If the report has the maintenance (M) flag, skip to the next loop iteration.
    3. If the report fails the bounds check, skip to the next loop iteration.
    4. Set the allowed change to the change value for the point.
    5. Loop through all reports older than the report in the 6.2.1 loop:
      1. If an older report does not have a note to add a valid flag, continue to the next iteration of the loop.
      2. Set the next oldest report as the before report.
      3. Set the next newest report that has a note to add a valid flag from the one in the 6.2.1 loop as the after report.
      4. If the loop report scaled value minus the before report scaled value is larger than the allowed change (or, if the point class is counter, the difference is negative), remove any notes on the loop report saying to add the valid flag.
      5. Otherwise:
        1. Calculate the difference between the after report scaled value and the loop report scaled value.
        2. If the point class is not counter, or the point class is counter and the difference in the above step is positive, add a note to the loop report to add the valid flag.
        3. Otherwise, if the loop report already has a note to add the valid flag, remove any notes to add the valid flag to the after report.
        4. If neither of the above conditions applies, remove any notes to add the valid flag to the loop report.
      6. If the change calculated in step 4 above is not equal to zero, add the point change value to the allowed change.
    6. If the loop report does not have a note to add a break flag:
      1. Mark the report to be updated.
      2. If the loop report has a note to add the valid flag and it is the newest report retrieved overall, add a note to add the break flag.
If the change flag is true, the change limit of the point is greater than the zero limit, the point change interval is not equal to 0, and reports were found when getting the list of nearby reports:
1. Initialize the rate to 0.
2. Set the last data to the scaled data from the report to be validated.
3. Set the last break status to whether the report to be validated has the break (B) flag.
4. Loop through the reports that are older than the report to be validated:
  1. If the report does not have the valid (V) flag and does not have a note to add the valid flag, continue to the next iteration of the loop.
  2. If the last break status was false, add to the rate the difference between the last data and the loop report scaled data.
  3. Set the last data to the loop report scaled data.
  4. Set the last break status to whether the loop report has the break (B) flag.
  5. If the loop report's time is earlier than the time of the report to be validated minus the point check interval, exit the loop.
5. If the absolute value of the rate is larger than the absolute value of the point change limit, set the change flag to false, and if the report to be validated does not have the maintenance (M) or verified (E) flag, remove the valid (V) flag and add the questioned (Q) flag.
If the change flag is true and any report was processed in the 4.1 loop, loop through the reports processed in that loop:
1. If the loop report has been marked to be updated and it does not have the maintenance (M) or verified (E) flags:
  1. If the loop report has been marked to have the valid flag added:
    1. If the loop report does not have the valid flag or has been marked to have the break flag added:
      1. Remove the questioned (Q) flag from the loop report and add the valid (V) flag to the loop report.
      2. If the loop report has been marked to have the break flag added, add it here.
      3. If the report should be filed, update the report in the database.
      4. Set the flag on the loop report for it to be recalibrated.
      5. If the loop report has the break (B) flag, exit the loop.
  2. Otherwise if the loop report does not have the questioned (Q) flag:
    1. Remove the valid (V) flag from the loop report.
    2. Add the questioned (Q) flag to the loop report.
    3. If the report should be filed, update the report in the database.
If the change flag is true:
1. If the report to be validated does not have the maintenance (M) or verified (E) flags:
  1. Add the valid (V) flag to the report to be validated.
  2. If step 4.2.1 noted that the report to be validated should have a break flag added, add the break (B) flag to the report to be validated.
Otherwise if at least one report was processed in the 4.1 loop:
1. If the report to be validated does not have the maintenance (M) or verified (E) flags:
  1. Remove the valid (V) flag from the report to be validated.
  2. Add the questioned (Q) flag to the report to be validated.
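
As an illustration of the accumulated change check described above (the step that initializes the rate to 0 and walks back through older reports), a minimal Python sketch follows. It is not the NovaStar implementation: the `Report` class, the `pending_valid` attribute standing in for "a note to add the valid flag", and the function names are hypothetical, and the older reports are assumed to be supplied newest first.

```python
from dataclasses import dataclass, field


@dataclass
class Report:  # hypothetical stand-in for a data report
    time: int                      # report time in seconds
    scaled: float                  # scaled value
    flags: set = field(default_factory=set)
    pending_valid: bool = False    # assumed: "note to add the valid flag"


def accumulated_change(target, older_reports, check_interval):
    """Sum the changes between consecutive valid reports, skipping breaks."""
    rate = 0.0
    last_data = target.scaled
    last_break = "B" in target.flags
    for report in older_reports:  # assumed order: newest to oldest
        if "V" not in report.flags and not report.pending_valid:
            continue
        if not last_break:
            rate += last_data - report.scaled
        last_data = report.scaled
        last_break = "B" in report.flags
        if report.time < target.time - check_interval:
            break
    return rate


def apply_rate_check(target, older_reports, change_limit, check_interval):
    """Return the resulting change flag; question the target if over the limit."""
    rate = accumulated_change(target, older_reports, check_interval)
    if abs(rate) > abs(change_limit):
        if not ({"M", "E"} & target.flags):
            target.flags.discard("V")
            target.flags.add("Q")
        return False
    return True
```

In this sketch a report carrying the break (B) flag stops the change across that gap from being added to the total, which matches the "last break status" handling in the steps above.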