NovaStar Program Reference / Data Collection / Overview
Data collection programs communicate with stations and web services to collect and load real-time data reports into the NovaStar database. These programs focus on station observations and are more highly managed than Data Loader and Data Import programs.
- Introduction
- Event-based Data Collection
- Station Polling (Interrogation)
- Simple Data Loading Programs
- Data Import Programs
- Utilities - Overdue Report Check
Introduction
A NovaStar system can be configured to ingest data from multiple sources and formats. Data collection programs depend on station dataloggers (data collection units), communications protocols, and data formats that must be decoded. The NovaStar system provides programs for specific protocols and data formats, and other programs that process simple data formats.
See also:
- Data Loader programs - optimized data loading
- Data Import programs - traditional NovaStar data import
- Data Export programs - traditional NovaStar data export
Reasons to ingest data into NovaStar include:
- Some protocols and formats, such as ALERT2, are tightly integrated with the flood warning system and are not supported by other systems.
- NovaStar alarms can only be processed when data are loaded into NovaStar.
- Data analysis using NovaStar features requires that all input data are in the NovaStar database.
- Ingesting data into NovaStar provides the smallest latency between data collection and access for visualization, analysis, and notification.
- It is desirable to maintain a local archive of third-party data. A local data archive does present challenges, such as how to synchronize changes in the third-party data.
Reasons to not ingest data into NovaStar include:
- Third-party historical and real-time (provisional) data can be easily accessed, for example using web services.
- Ingestion of third-party data results in excessive data records or other stress on the system. For example, flood warning systems typically focus on event-driven data with periodic regular reports (e.g., every hour). There is often no need for regular reports at small intervals such as 1-minute "continuous" monitoring.
- Analysis tools are able to read data from NovaStar and third-party data sources without storing data in the database.
- There is not a need to define alarms for the data source.
Data collection uses the concept of "lines" to identify the data source. For example, multiple nsrecdata programs can be configured, each listening to a different source, to ensure that data are being received from different communication paths. The line is added to the data report as a flag.
Stations that provide real-time data typically use a program listed on this page for normal operations. Data that are loaded at system initialization or to backfill when there is an outage may require that data are processed and loaded using one of the simple approaches or a Data Import program.
Main Data Collection Approaches
Approach | When to Use |
---|---|
Event-based data collection | |
Station polling (interrogation) | |
Manual data load | |
Event-based Data Collection
Some data collection configurations such as ALERT/ALERT2 are capable of sending event-based data and typically provide minimal latency between measurement and data visibility in the base station software. The data are typically pushed to NovaStar from decoder hardware, and NovaStar can also pull from decoders. ALERT2 systems typically use multiple radio frequencies that are handled by decoders that push data to NovaStar using different port numbers.
Simple and custom formats can also be enabled, for example by programming data loggers to push data to an nsrecdata process that is listening on a specific port.
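The push approach described above can be illustrated with a minimal sketch of the logger side. This page does not specify the transport or record layout that nsrecdata expects, so the TCP connection, the comma-delimited `station,sensor,value` layout, and the function names below are all assumptions made for illustration only.

```python
import socket


def format_report(station_id: str, sensor_id: str, value: float) -> bytes:
    """Build one newline-terminated, comma-delimited record.

    The field order and formatting are hypothetical, not a documented
    NovaStar format.
    """
    return f"{station_id},{sensor_id},{value:.2f}\n".encode("ascii")


def push_report(host: str, port: int, record: bytes, timeout: float = 5.0) -> None:
    """Connect to a listening collector (assumed to accept TCP) and send one record."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(record)
```

A logger script would call `push_report()` with the host and port on which the collection process has been configured to listen. The actual record format must match the format configured for that process.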
ALERT, ALERT2, and Custom Data Collection
- ALERT and ALERT2 data collection:
  - nsrecdata - ALERT, ALERT2, delimited data format (used with modems and some loggers), custom format, etc.
  - recdatalogrefile - simple program to manually refile data from nsrecdata log file(s)
  - nsrecdata-log-buffer-refile - refile buffered data from the current nsrecdata log file, as scheduled process
  - see Data Import nsrecdatalogimport program - used to import data from nsrecdata logs
The following program is typically used for development and testing:
- nsalertxmt - transmit ALERT reports on a serial port (use nsrecdata in production)
Station Polling (Interrogation)
Data sources can be queried using a polling (interrogation) approach by enabling polling in the NovaStar system configuration. The main differences between polling and data import are that polled stations track the last poll time in the database and that control over polling is very granular. The last poll time is used to determine the period to query data. Data imports, by contrast, use a fixed query period with a scheduled process and may load a data file that has been exported by another system.
The data source provides a custom API, file format, web service endpoint, etc. Specific polling programs are needed when communication protocols are specific to hardware, or when data formats (such as those returned by web services) require specific processing, in contrast to standard formats such as ALERT and comma-separated values.
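The difference in query period between polling and scheduled import can be sketched as follows. This is an illustration only, not NovaStar code: the function names and the `default_lookback` used for a station that has never been polled are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def polling_window(last_poll: Optional[datetime], now: datetime,
                   default_lookback: timedelta) -> Tuple[datetime, datetime]:
    """Polling: query from the station's last poll time up to now.

    If the station has never been polled, fall back to a lookback period
    (an assumption; actual behavior depends on the polling program).
    """
    start = last_poll if last_poll is not None else now - default_lookback
    return start, now


def import_window(now: datetime, fixed_period: timedelta) -> Tuple[datetime, datetime]:
    """Scheduled import: always query the same fixed period ending now."""
    return now - fixed_period, now
```

With polling, the query period shrinks or grows with the time since the last successful poll, so an outage is backfilled on the next poll. With a fixed import period, the period must be long enough to cover the longest expected gap between scheduled runs.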
Station polling is enabled by setting automatic_interrogation=true in the NovaStar system configuration table and is implemented as follows:
- A single nsautointer program is run as a service. The NovaStar system will restart the service if it stops for some reason.
- Stations that have polling configured (polling command, polling interval, etc.) will cause the nsautointer service program to run the polling program:
  - The nsautointer service program evaluates on a 1-second interval whether to poll stations. The actual time between polls for a station is determined by the station's data.
  - The station type for a polled station provides default polling configuration information, which will be used if not specified in a station's data.
  - The station's polling interval is used to evaluate when to run the polling program.
  - The station's polling order is used to determine the sequence order for polling once a list of stations to poll is determined.
  - The station's polling command will be run to poll the station data. See the nsautointer documentation for standard command parameters that are passed to polling programs.
  - The mapping of external station/site identifier depends on station configuration, for example using the remote tag to specify the USGS site identifier for the nspollusgswaterdata program. See the documentation for each polling program for specific configuration requirements.
  - The mapping of external sensor/parameter/data type depends on point configuration, for example using the point parameter to specify the USGS parameter code for the nspollusgswaterdata program.
  - See individual program documentation for command parameters that are specific to the program.
  - Stations that do not have polling configured in the database are not polled (e.g., ALERT data are instead queried using the nsrecdata program and data may also be ingested using scheduled import programs). To avoid confusion and conflicts, a station typically only reports data using one protocol.
  - A station's polling timeout is not applied when launching the polling program (it must be handled in the individual polling program).
  - Once launched, the nsautointer service does not monitor the polling program processes other than checking the station's last polling time.
- Each polling program:
  - May be implemented as a compiled program or a script.
  - Must handle the standard command parameters passed by the nsautointer service.
  - Reads station and point polling configuration data to understand how to query data.
  - Reads the data from the station, web service, etc.
  - Inserts data reports into the NovaStar database using the filing code, which handles quality control, alarms, etc. If the polling program is a script, the insert typically uses a simple data loader program such as sendrpt or starpt.
  - May use a point's configuration information during reading and inserts, for example to indicate the parameter position in polled data, if such information is required based on the station query.
  - Updates the station's last polling time, to allow the nsautointer service to check when the next polling session should occur.
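The selection logic described in the list above (interval check, station type defaults, polling order) can be sketched as follows. This is a simplified illustration, not the nsautointer implementation: the class and field names are hypothetical, and the sketch assumes every station in the list has polling configured.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class StationType:
    # Defaults used when a station does not set its own values.
    polling_command: str
    polling_interval: timedelta


@dataclass
class Station:
    name: str
    station_type: str
    polling_order: int
    last_poll: Optional[datetime] = None
    polling_command: Optional[str] = None
    polling_interval: Optional[timedelta] = None


def interval_for(station: Station, types: Dict[str, StationType]) -> timedelta:
    """Use the station's interval, else the station type default."""
    if station.polling_interval is not None:
        return station.polling_interval
    return types[station.station_type].polling_interval


def command_for(station: Station, types: Dict[str, StationType]) -> str:
    """Use the station's polling command, else the station type default."""
    if station.polling_command is not None:
        return station.polling_command
    return types[station.station_type].polling_command


def stations_due(stations: List[Station], types: Dict[str, StationType],
                 now: datetime) -> List[Station]:
    """Return stations whose interval has elapsed, sorted by polling order."""
    due = [
        st for st in stations
        if st.last_poll is None or now - st.last_poll >= interval_for(st, types)
    ]
    return sorted(due, key=lambda st: st.polling_order)
```

A service loop would call `stations_due()` on each evaluation cycle and launch `command_for()` for each returned station. Because the last polling time is updated by the polling program itself, a polling program that fails before updating it would be selected again on a later cycle in this sketch.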
An alternative to using polling is to run a program in the scheduler, independent of nsautointer, for example by creating a simple data file and calling sendrpt. See Data Import programs.
As of NovaStar 5.4.1.3, TriLynx Systems is updating polling and import programs to support both polling and scheduled import run modes. This will provide flexibility to allow a data load program to be used as appropriate.
HydroLynx 50386 Data Collection
HydroLynx 50386 data can be polled with nsautointer using the following program:
- nspollhydro50386 - poll a HydroLynx 50386 logger
Other software is also available to integrate NovaStar with 50386 hardware:
- ns50386server - need to document
- ns50386xmt - transmit 50386 data packets on a serial port
- nscmdhydro50386 - send a command to a HydroLynx 50386 data collection unit (DCU)
- nsctrlhydro50386 - send a 50386 ScadaLynx control to a remote terminal unit (RTU)
- nsrcvhydro50386 - HydroLynx 50386 receive data
- nstimesync50386 - set time on HydroLynx 50386 DCU
- nstimesync50386all - get list of stations with HydroLynx 50386 poll command and time sync enabled
HydroLynx 5096 Data Collection
- nspollhydro5096 - poll a HydroLynx 5096 logger
HydroLynx 5400 Data Collection
The HydroLynx 5400 hardware configuration is used to communicate with an Orbcomm satellite data source.
- nspollhydro5400 - poll a HydroLynx 5400 station
- nspollhydro5400logsort - sort nspollhydro5400 log file
Poll Comma Separated Value (CSV) Data
- nspollcsv - poll station data from CSV file
Poll Corps of Engineers (COE) Reservoir Data
- nspollcoeresvdata - poll station data for COE reservoir
Poll RAWS Data
- nspollrawsdata - poll CDEC/NOAA RAWS station collect data
Poll USGS NWIS Data
- nspollusgswaterdata - poll USGS NWIS data
Poll (Import) XML Data
- nsxmlimport - poll (import) XML data
- nsxmltranslate - translate XML data - need to document
Simple Data Loading Programs
The following programs provide basic data loading capabilities and are often called by scripts that convert data into simple formats for loading. For example, data polling and import scripts often call one of the following programs.
Data Import Programs
Data import programs are run on a schedule or are run one time on the command line. The main differences between polling and data import are that polled stations track the last poll time in the database and that control over polling is very granular. The last poll time is used to determine the period to query data. Data imports, by contrast, use a fixed query period with a scheduled process and may load a data file that has been exported by another system.
The NovaStar system has in the past made a distinction between interrogation (polling) programs that are managed by the nsautointer service and data import programs that are run by the system scheduler. As of NovaStar 5.3.1.4, polling and import programs are being updated to allow running in either mode.
Programs that have traditionally been referred to as "import" programs are documented separately:
Utilities - Overdue Report Check
- nsoverduechk - check for overdue data reports and create "no report" alarms