Upload Missing or Bad Data

The majority of the air quality, meteorological, and sand motion data collected by the OLIVE system are inserted automatically via the telemetry system. Occasionally, however, failures in the telemetry, timing of station visits, a misconfigured station, or other circumstances may lead to incomplete or corrupted data sets. Additionally, select stations are not equipped with radios and all data must be inserted manually.

Data Insertion Process

OLIVE has an automated script that detects, validates, and processes data files that are uploaded to the Cloud service. Each time the script runs, it attempts to match incoming data with an existing station and a defined file format. If a match is found, it next checks if data already exist for the time period of the incoming file. It then makes the following decisions:

  1. If data exist, and all data are QC Level 0 (i.e., unreviewed), the existing data will be deleted and replaced with the incoming data.
  2. If data exist, and not all data are QC Level 0, it will attempt a "row-level insert." For each time-value pair, Level 0 data are replaced, missing data are inserted, and Level 1+ data are skipped. This can take a considerable amount of time if the data set is large.
  3. If no data exist for the period, incoming data are bulk inserted.
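
The three decisions above can be sketched in a few lines of Python. This is only an illustrative model, not the actual OLIVE script: the names InsertMode, choose_insert_mode, and row_level_action are hypothetical, and the QC levels are assumed to be plain integers.

```python
from enum import Enum


class InsertMode(Enum):
    """Hypothetical labels for the three outcomes described in the list."""
    REPLACE_ALL = "delete existing data, then insert incoming data"
    ROW_LEVEL = "row-level insert"
    BULK_INSERT = "bulk insert"


def choose_insert_mode(existing_qc_levels):
    """Pick an insert mode from the QC levels already stored for the period.

    existing_qc_levels: list of integer QC levels, one per stored row;
    an empty list means no data exist for the period.
    """
    if not existing_qc_levels:
        return InsertMode.BULK_INSERT
    if all(level == 0 for level in existing_qc_levels):
        return InsertMode.REPLACE_ALL
    return InsertMode.ROW_LEVEL


def row_level_action(existing_qc_level):
    """Decide what happens to one time-value pair during a row-level insert.

    existing_qc_level: None if no value is stored, otherwise its QC level.
    """
    if existing_qc_level is None:
        return "insert"
    if existing_qc_level == 0:
        return "replace"
    return "skip"  # Level 1+ data are left untouched
```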

If no match is found, an issue will be created (see the Issues Page). Data are generally processed immediately, but a large data file that requires row-level processing may take several minutes to fully process. You can use the Data Review Page to monitor the progress of a large insert.

Upload data that reflect the complete time span of interest, and no more. The upload script will time out after 15 minutes, so it is important not to create unnecessary overhead. The upload process detects the minimum and maximum time in the file and deletes all existing Level 0 data in between before inserting the data from the incoming file. Skipping time steps in a file in an attempt to upload only the missing hours will delete all Level 0 data in that span, leaving only the hours contained in the file!
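
To see why a file with skipped time steps is risky, here is a simplified model of the behavior described above. It is an assumption-laden sketch, not the real upload script: simulate_upload is a hypothetical name, hours are represented as integers, and the minimum and maximum times are assumed to be inclusive.

```python
def simulate_upload(existing, incoming):
    """Model the delete-then-insert behavior for one station.

    existing: dict mapping hour -> (value, qc_level) already stored.
    incoming: dict mapping hour -> value from the uploaded file.
    Returns the stored data after the upload.
    """
    start, end = min(incoming), max(incoming)

    # Delete every Level 0 row between the file's first and last time step.
    result = {
        hour: row
        for hour, row in existing.items()
        if not (start <= hour <= end and row[1] == 0)
    }

    # Insert incoming rows; reviewed (Level 1+) rows that survived are skipped.
    for hour, value in incoming.items():
        if hour not in result:
            result[hour] = (value, 0)
    return result


# Hours 1 and 4 are missing, so the file contains only those two hours.
stored = {0: (1.0, 0), 2: (1.2, 0), 3: (1.3, 0), 5: (1.5, 0)}
after = simulate_upload(stored, {1: 1.1, 4: 1.4})
print(sorted(after))  # [0, 1, 4, 5] : hours 2 and 3 were deleted
```

In this model, uploading hours 1 through 4 in full would have preserved the complete record.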

How to Manually Upload Data

Uploading data files to OLIVE is straightforward. Attach the files to an email and send to: upload@automation.oliveair.net
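
If you prefer to script the email, the message could be assembled with Python's standard library along these lines. This is a minimal sketch: build_upload_email is a hypothetical helper, and the subject line, body text, and attachment MIME type are assumptions, since nothing above says the upload service inspects them.

```python
from email.message import EmailMessage
from pathlib import Path

UPLOAD_ADDRESS = "upload@automation.oliveair.net"


def build_upload_email(sender, paths):
    """Build an email to the upload address with each file attached."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = UPLOAD_ADDRESS
    msg["Subject"] = "OLIVE data upload"  # assumed; may not matter
    msg.set_content("Data files attached.")
    for path in map(Path, paths):
        # Attach the raw bytes and keep the original filename, since the
        # filename is what gets matched against the file patterns.
        msg.add_attachment(
            path.read_bytes(),
            maintype="application",
            subtype="octet-stream",
            filename=path.name,
        )
    return msg
```

Sending is left out because it depends on your mail server; with the standard library it would typically be smtplib.SMTP(host).send_message(msg), where host is your own outgoing mail server.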

That said, the file must match an existing file pattern that has been set up in OLIDS. See the section on setting up file patterns for more details.

Setting the Correct Filename

The file patterns listed in OLIVE (the Files tab on the Loggernet Page) use regular expressions. Below are two examples to help decipher how to set the filename properly.

You can explore the file patterns below further by visiting this website.

Filename Example 1

^(1101_Min05)(_[0-9]+)+.dat$

  • The leading ^ means must start with. So, the filename must start with 1101_Min05. The parentheses represent expression evaluation groups and should not be included when naming a file.
  • The next group, (_[0-9]+), means an underscore followed by one or more digits from 0 to 9. In practice, this is usually the datetime that the file was created, written YYYYmmddHHMM, but it could be any string of digits. So, the filename now looks like 1101_Min05_202401031341.
  • The + after the second group means the group must appear one or more times. This accommodates, for example, a date where each part is separated by an underscore, e.g., 2024_01_03_13_41.
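  • The trailing $ means must end with, and it applies to the characters that follow the last group, so in this case the filename must end with .dat. (Strictly speaking, an unescaped . in a regular expression matches any single character, not only a literal period.)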

With that, the following are valid filenames:

  • 1101_Min05_202401031341.dat
  • 1101_Min05_2024_01_03_13_41.dat
  • 1101_Min05_3425987059872350987235097.dat

The following are NOT valid filenames:

  • 1101_min05_202401031341.dat (lowercase m)
  • a1101_Min05_2024_01_03_13_41.dat (name does not begin with 1101)
  • 1101_Min05_2024_old.dat (second group pattern violated: _old are letters not digits)
  • 1101_Min05_2024-asdf3445tegf.dat (extra characters between second group pattern and .dat)
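
You can check a candidate filename against this pattern before uploading. The sketch below uses Python's re module and assumes OLIVE's regular expression engine behaves the same way for this pattern; matches_example_1 is a hypothetical helper name.

```python
import re

# Pattern copied verbatim from Filename Example 1.
PATTERN_1 = re.compile(r"^(1101_Min05)(_[0-9]+)+.dat$")


def matches_example_1(filename):
    """Return True if the filename matches the Example 1 pattern."""
    return PATTERN_1.search(filename) is not None


valid = [
    "1101_Min05_202401031341.dat",
    "1101_Min05_2024_01_03_13_41.dat",
    "1101_Min05_3425987059872350987235097.dat",
]
invalid = [
    "1101_min05_202401031341.dat",       # lowercase m
    "a1101_Min05_2024_01_03_13_41.dat",  # does not begin with 1101
    "1101_Min05_2024_old.dat",           # letters where digits are required
    "1101_Min05_2024-asdf3445tegf.dat",  # extra characters before .dat
]

for name in valid + invalid:
    print(name, matches_example_1(name))
```

Because the . before dat is not escaped, the pattern as written also accepts a name such as 1101_Min05_2024xdat; sticking to a literal .dat extension avoids any ambiguity.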

Filename Example 2

^(1101)_((H|h){1}our)_((M|m){1}anual).+.dat$

We won't repeat what we learned in Example 1. This pattern introduces a few more expressions that add flexibility to the naming, although they make it look more daunting.

  • The pattern (H|h) means an uppercase or lowercase h. Recall in Example 1 that a capital M was mandatory.
  • The pattern {1} immediately after the above expression means exactly once. Essentially, the pattern wants to see either Hour or hour.
  • The pattern .+ after the third grouping means one or more of any character. So, at least one valid filename character must go between manual and .dat.

For this example, a filename is valid as long as it starts with 1101, is followed by _Hour_Manual, _hour_manual, _Hour_manual, or _hour_Manual, then by at least one more character, and ends with .dat.
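
The same kind of check works for this pattern. Again, this is a sketch that assumes Python's re module interprets the pattern the same way OLIVE does; matches_example_2 and the sample filenames are made up for illustration.

```python
import re

# Pattern copied verbatim from Filename Example 2.
PATTERN_2 = re.compile(r"^(1101)_((H|h){1}our)_((M|m){1}anual).+.dat$")


def matches_example_2(filename):
    """Return True if the filename matches the Example 2 pattern."""
    return PATTERN_2.search(filename) is not None


candidates = [
    "1101_Hour_Manual_202401031341.dat",  # matches
    "1101_hour_manual_202401031341.dat",  # matches: either case of h and m
    "1101_HOUR_Manual_202401031341.dat",  # no match: only the first letter may vary
    "1101_Hour_Manual.dat",               # no match: .+ needs at least one character
]

for name in candidates:
    print(name, matches_example_2(name))
```

Note the last case: because .+ requires at least one character, something (typically the datetime string) has to sit between Manual and .dat.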

For all filenames, the patterns are meant to consistently match the correct station and format while still allowing each filename to be unique. It is best practice to include a datetime string or other unique string of digits to ensure that an existing file in the archive with the same name won't be overwritten.
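
One simple way to get a unique name is to append the creation time in the YYYYmmddHHMM form used in Example 1. The helper below is a hypothetical sketch; the prefix must still be whatever the station's file pattern expects.

```python
from datetime import datetime


def timestamped_name(prefix, when, extension=".dat"):
    """Build a filename such as 1101_Min05_202401031341.dat.

    prefix: the fixed part required by the file pattern, e.g. "1101_Min05".
    when: the datetime to embed, formatted as YYYYmmddHHMM.
    """
    return f"{prefix}_{when.strftime('%Y%m%d%H%M')}{extension}"


print(timestamped_name("1101_Min05", datetime(2024, 1, 3, 13, 41)))
# 1101_Min05_202401031341.dat
```

Minute resolution is unique only if you create at most one file per prefix per minute; add seconds to the format if that might not hold.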