NovaStar Program Reference / Data Collection / Overview

Data collection programs communicate with stations and web services to collect and load real-time data reports into the NovaStar database. These programs focus on station observations and are more highly managed than Data Loader and Data Import programs.


Introduction

A NovaStar system can be configured to ingest data from multiple sources and formats. Data collection programs depend on station dataloggers (data collection units), communications protocols, and data formats that must be decoded. The NovaStar system provides programs for specific protocols and data formats, and other programs that process simple data formats.

Reasons to ingest data into NovaStar include:

  • Some protocols and formats, such as ALERT2, are tightly integrated with the flood warning system and are not supported by other systems.
  • NovaStar alarms can only be processed when data are loaded into NovaStar.
  • Data analysis using NovaStar features requires that all input data are in the NovaStar database.
  • Ingesting data into NovaStar provides the smallest latency between data collection and access for visualization, analysis, and notification.
  • It is desirable to maintain a local archive of third-party data. A local data archive does present challenges, such as how to synchronize changes in the third-party data.

Reasons to not ingest data into NovaStar include:

  • Third-party historical and real-time (provisional) data can be easily accessed, for example using web services.
  • Ingestion of third-party data results in excessive data records or other stress on the system. For example, flood warning systems typically focus on event-driven data with periodic regular reports (e.g., every hour). There is often no need for regular reports at small intervals such as 1-minute "continuous" monitoring.
  • Analysis tools are able to read data from NovaStar and third-party data sources without storing data in the database.
  • There is no need to define alarms for the data source.

Data collection uses the concept of "lines" to identify the data source. For example, multiple nsrecdata programs can be configured, each listening to a different source, to ensure that data are being received from different communication paths. The line is added to the data report as a flag.
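The idea of flagging each report with its line can be sketched as follows. This is only an illustration: the port numbers, line labels, and the `Report`, `tag_with_line`, and `reports_per_line` names are hypothetical and are not part of NovaStar.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical mapping of listening port to line label; the real
# configuration lives in NovaStar and is not shown in this overview.
LINE_BY_PORT = {5001: "1", 5002: "2"}


@dataclass
class Report:
    station_id: str
    value: float
    line: str = ""  # flag identifying the communication path


def tag_with_line(report: Report, port: int) -> Report:
    """Return a copy of the report flagged with the line for the port."""
    return Report(report.station_id, report.value, LINE_BY_PORT[port])


def reports_per_line(reports):
    """Count reports by line so that a silent communication path stands out."""
    counts = Counter({line: 0 for line in LINE_BY_PORT.values()})
    counts.update(r.line for r in reports)
    return dict(counts)
```

A line whose count stays at zero while other lines keep receiving reports suggests that its communication path needs attention.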

Stations that provide real-time data typically use a program listed on this page for normal operations. Data that are loaded at system initialization or to backfill when there is an outage may require that data are processed and loaded using one of the simple approaches or a Data Import program.

Main Data Collection Approaches

Approach When to Use
Event-based data collection
  • Remote stations push data to NovaStar, for example ALERT, ALERT2, and custom data protocol transmissions. NovaStar listens on a port.
  • Typically small latency between data measurement and availability for decisions.
Station polling (interrogation)
  • Station provides a "dial in" capability and an API that accepts commands to query data.
  • Web services including REST, cgibin, updating web resources.
  • System tracks the time of last polling for each polled station and adjusts the polling period accordingly.
Manual data load
  • Station provides a "dial in" capability and an API that accepts commands to query data.
  • Web services including REST, cgibin, updating web resources.

Event-based Data Collection

Some data collection configurations such as ALERT/ALERT2 are capable of sending event-based data and typically provide minimal latency between measurement and data visibility in the base station software. The data are typically pushed to NovaStar from decoder hardware, and NovaStar can also pull from decoders. ALERT2 systems typically use multiple radio frequencies that are handled by decoders that push data to NovaStar using different port numbers.

Simple and custom formats can also be enabled, for example by programming data loggers to push data to an nsrecdata process that is listening on a specific port.
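As a rough illustration of a process listening on a port for pushed data, the sketch below uses Python's standard `socketserver` module. It is not the nsrecdata implementation, and the `station_id,sensor_id,value` line format, the `parse_report` function, and the `ReportHandler` class are assumptions made up for this example.

```python
import socketserver


def parse_report(line: str) -> dict:
    """Parse one hypothetical 'station_id,sensor_id,value' text line."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}")
    station_id, sensor_id, value = parts
    return {"station_id": station_id, "sensor_id": sensor_id, "value": float(value)}


class ReportHandler(socketserver.StreamRequestHandler):
    """Read newline-terminated reports pushed by a data logger."""

    def handle(self):
        for raw in self.rfile:
            try:
                report = parse_report(raw.decode("ascii", errors="replace"))
            except ValueError:
                continue  # skip malformed lines rather than stopping the listener
            print(report)  # a real system would file the report in the database


def serve(port: int) -> None:
    """Listen on the given TCP port until interrupted."""
    with socketserver.TCPServer(("", port), ReportHandler) as server:
        server.serve_forever()
```

Calling `serve()` with a port number blocks while waiting for connections; `parse_report` can be exercised on its own to check the decoding.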

ALERT, ALERT2, and Custom Data Collection

The following program is typically used for development and testing:

  • nsalertxmt - transmit ALERT reports on a serial port (use nsrecdata in production)

Station Polling (Interrogation)

Data sources can be queried using a polling (interrogation) approach by enabling polling in the NovaStar system configuration. The main differences between polling and data import are that the last poll time of each polled station is tracked in the database and that control over polling is very granular. The last poll time is used to determine the period to query data. Data imports, by contrast, use a fixed query period with a scheduled process and may load a data file that has been exported by another system.
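The difference in how the query period is chosen can be sketched as follows. The function names and the fallback lookback for a station that has never been polled are assumptions for illustration, not documented NovaStar behavior.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def polling_query_period(
    last_poll: Optional[datetime],
    now: datetime,
    default_lookback: timedelta = timedelta(hours=1),  # assumed fallback
) -> Tuple[datetime, datetime]:
    """Polling: query from the station's last poll time up to now."""
    start = last_poll if last_poll is not None else now - default_lookback
    return start, now


def import_query_period(now: datetime, fixed_period: timedelta) -> Tuple[datetime, datetime]:
    """Scheduled import: always query the same fixed period ending now."""
    return now - fixed_period, now
```

With polling, a station that was last polled 10 minutes ago is queried for 10 minutes of data, while a scheduled import with a 1-hour period requests the full hour each time it runs.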

The data source provides a custom API, file format, web service endpoint, etc. Specific polling programs are needed when the communication protocol is specific to the hardware or when the data format (such as a web service response) requires specific processing, in contrast to standard formats such as ALERT and comma-separated values.

Station polling is enabled by setting automatic_interrogation=true in the NovaStar system configuration table and is implemented as follows:

  • A single nsautointer program is run as a service. The NovaStar system will restart the service if it stops for some reason.
  • Stations that have polling configured (polling command, polling interval, etc.) will cause the nsautointer service program to run the polling program:
    • The nsautointer service program evaluates on a 1-second interval whether to poll stations. The actual time between polls for a station is determined by the station's data.
    • The station type for a polled station provides default polling configuration information, which will be used if not specified in a station's data.
    • The station's polling interval is used to evaluate when to run the polling program.
    • The station's polling order is used to determine the sequence order for polling once a list of stations to poll is determined.
    • The station's polling command will be run to poll the station data. See the nsautointer documentation for standard command parameters that are passed to polling programs.
    • The mapping of external station/site identifier depends on station configuration, for example using the remote tag to specify the USGS site identifier for the nspollusgswaterdata program. See the documentation for each polling program for specific configuration requirements.
    • The mapping of external sensor/parameter/data type depends on point configuration, for example using the point parameter to specify the USGS parameter code for the nspollusgswaterdata program.
    • See individual program documentation for command parameters that are specific to the program.
    • Stations that do not have polling configured in the database are not polled (e.g., ALERT data are instead received using the nsrecdata program, and data may also be ingested using scheduled import programs). To avoid confusion and conflicts, a station typically reports data using only one protocol.
    • A station's polling timeout is not applied when launching the polling program (it must be handled in the individual polling program).
    • Once a polling program is launched, the nsautointer service does not monitor the polling program process other than by checking the station's last polling time.
  • Each polling program:
    • May be implemented as a compiled program or a script.
    • Must handle the standard command parameters passed by the nsautointer service.
    • Reads station and point polling configuration data to understand how to query data.
    • Reads the data from the station, web service, etc.
    • Inserts data reports into the NovaStar database using the filing code, which handles quality control, alarms, etc. If the polling program is a script, the insert typically uses a simple data loader program such as sendrpt or starpt.
    • A point's configuration information may be used during reading and inserts, for example to indicate the parameter position in polled data, if such information is required based on the station query.
    • Updates the station's last polling time, to allow the nsautointer service to check when the next polling session should occur.
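The selection logic described in the list above (station type defaults, polling interval, polling order) can be sketched as follows. This is a simplified illustration, not the nsautointer implementation; the `StationType` and `Station` classes and the `stations_due` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class StationType:
    # Defaults used when a station does not set its own values.
    polling_interval: Optional[timedelta] = None
    polling_command: Optional[str] = None


@dataclass
class Station:
    name: str
    station_type: StationType
    polling_order: int = 0
    last_poll: Optional[datetime] = None
    polling_interval: Optional[timedelta] = None
    polling_command: Optional[str] = None

    def effective_interval(self) -> Optional[timedelta]:
        if self.polling_interval is not None:
            return self.polling_interval
        return self.station_type.polling_interval

    def effective_command(self) -> Optional[str]:
        return self.polling_command or self.station_type.polling_command


def stations_due(stations: List[Station], now: datetime) -> List[Station]:
    """Return stations that should be polled now, in polling order."""
    due = []
    for s in stations:
        interval, command = s.effective_interval(), s.effective_command()
        if interval is None or command is None:
            continue  # polling is not configured, so the station is not polled
        if s.last_poll is None or now - s.last_poll >= interval:
            due.append(s)
    return sorted(due, key=lambda s: s.polling_order)
```

A service loop would call a function like `stations_due` about once per second and launch each station's polling command. Launching is omitted here because the standard command parameters are described in the nsautointer documentation.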

An alternative to using polling is to run a program in the scheduler, independent of nsautointer, for example by creating a simple data file and calling sendrpt. See Data Import programs.

As of NovaStar 5.4.1.3, TriLynx Systems is updating polling and import programs to support both polling and scheduled import run modes. This will provide flexibility to allow a data load program to be used as appropriate.

HydroLynx 50386 Data Collection

HydroLynx 50386 data can be polled with nsautointer using the following program:

Other software is also available to integrate NovaStar with 50386 hardware:

HydroLynx 5096 Data Collection

HydroLynx 5400 Data Collection

The HydroLynx 5400 hardware configuration is used to communicate with the Orbcomm satellite data source.

Poll Comma Separated Value (CSV) Data

  • nspollcsv - poll station data from CSV file

Poll Corps of Engineers (COE) Reservoir Data

Poll RAWS Data

Poll USGS NWIS Data

Poll (Import) XML Data

Simple Data Loading Programs

The following programs provide basic data loading capabilities and are often called by scripts that convert data into simple formats for loading. For example, data polling and import scripts often call one of the following programs.

  • sendrpt - file data report
  • starpt - file data report for station and sensor ID

Data Import Programs

Data import programs are run on a schedule or are run one time on the command line. The main differences between polling and data import are that the last poll time of each polled station is tracked in the database and that control over polling is very granular. The last poll time is used to determine the period to query data. Data imports, by contrast, use a fixed query period with a scheduled process and may load a data file that has been exported by another system.

The NovaStar system has in the past made a distinction between interrogation (polling) programs that are managed by the nsautointer service and data import programs that are run by the system scheduler. As of NovaStar 5.3.1.4, polling and import programs are being updated to allow running in either mode. Programs that have traditionally been referred to as "import" programs are documented separately:

Utilities - Overdue Report Check

  • nsoverduechk - check for overdue data reports and create "no report" alarms
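The general idea of an overdue report check can be sketched as follows. This is not the nsoverduechk implementation; the `find_overdue` function, its inputs, and the single `max_age` threshold are assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def find_overdue(
    last_report: Dict[str, Optional[datetime]],
    now: datetime,
    max_age: timedelta,
) -> List[str]:
    """Return station identifiers with no report within max_age of now."""
    overdue = [
        station_id
        for station_id, reported in last_report.items()
        # A station that has never reported is treated as overdue.
        if reported is None or now - reported > max_age
    ]
    return sorted(overdue)
```

Each identifier returned would be a candidate for a "no report" alarm.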