Item data storage

The data values for each item are reordered according to the selected spatial index tree. This makes data retrieval less chunky: with grids, for example, a query can read a much smaller amount of data than it would by reading full rows.
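The effect of reordering can be illustrated with a small sketch. It assumes a Z-order (Morton) curve as a stand-in for the spatial index tree; the actual ordering used by the storage is not specified here, and the grid size and query window are hypothetical.

```python
def morton_index(row: int, col: int, bits: int = 16) -> int:
    """Interleave the bits of row and col into one Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((col >> i) & 1) << (2 * i)
        code |= ((row >> i) & 1) << (2 * i + 1)
    return code


def contiguous_runs(positions) -> int:
    """Count runs of consecutive storage positions, i.e. separate reads."""
    ordered = sorted(positions)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b != a + 1)


size = 8  # hypothetical 8 x 8 grid
cells = [(r, c) for r in range(size) for c in range(size)]

# Storage position of every cell under the two layouts.
row_major = {(r, c): r * size + c for r, c in cells}
z_rank = {
    cell: i
    for i, cell in enumerate(sorted(cells, key=lambda rc: morton_index(*rc)))
}

# Query window: the 4 x 4 block covering rows 4..7 and columns 4..7.
window = [(r, c) for r in range(4, 8) for c in range(4, 8)]

row_major_runs = contiguous_runs(row_major[cell] for cell in window)
z_order_runs = contiguous_runs(z_rank[cell] for cell in window)
```

For this window the row-major layout needs 4 separate reads and the Z-order layout needs 1. The window is aligned with the curve on purpose; an unaligned window splits into more runs, but it still tends to stay more compact than full rows.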

Data storage strategies/options

Based on the dataset definition and the query/usage requirements, item data can be stored in one or more blobs. There are two typical query types with regard to the temporal domain:

  • Time step oriented: gets the data of 1 time step in 1 request; the spatial domain can be big.
  • Time series oriented: gets the data of many time steps in 1 request; the spatial domain should be smaller to avoid extra large results.

Single timestep storage per file

  • Optimized for Timestep retrieval.
  • Each Timestep is stored in its own blob.

Time series optimized storage

  • Optimized for Timeseries retrieval.
  • Both the temporal domain and the spatial domain are divided into groups; each spatio-temporal group is stored in 1 file.

  • The byte array is ordered first by time, then by spatial position.
  • This storage strategy is used for a larger number of timesteps.
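The grouping and byte ordering described above can be sketched as an address calculation. This is a minimal sketch, not the actual file format: the group sizes, the 4-byte value size and the names `GroupLayout` and `locate` are all assumptions made for illustration, and `cell` stands for a position in the spatial-index order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupLayout:
    steps_per_group: int  # time steps stored in one file (hypothetical)
    cells_per_group: int  # spatial positions stored in one file (hypothetical)
    value_size: int = 4   # bytes per value, e.g. float32 (assumed)


def locate(layout: GroupLayout, time_step: int, cell: int):
    """Return ((time_group, spatial_group), byte_offset) for one value."""
    t_group, t_local = divmod(time_step, layout.steps_per_group)
    s_group, s_local = divmod(cell, layout.cells_per_group)
    # Time-major order inside the file: all cells of the first time step,
    # then all cells of the next time step, and so on.
    offset = (t_local * layout.cells_per_group + s_local) * layout.value_size
    return (t_group, s_group), offset


layout = GroupLayout(steps_per_group=24, cells_per_group=100)
group, offset = locate(layout, time_step=30, cell=250)
```

With these numbers, time step 30 at cell 250 lands in group `(1, 2)` at byte offset 2600. A time series for one cell stays within a single spatial group, so it touches only one file per time group.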

Timeseries storage supports 2 block types:

Fixed block timeseries storage

| Pros | Cons |
|------|------|
| Flexible updates, even a single timestep can be added | Always allocates fixed block size |
| Allows gaps | Uses more space |
| Designed for incremental forecasts | Slower import/update operation |

Variable block timeseries storage

| Pros | Cons |
|------|------|
| Smaller storage size, only imported data is stored | Only full blocks can be updated |
| Faster import/update operation | Gaps cannot be easily filled later |
| Designed for historical data load | |
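The space trade-off between the two block types can be estimated with a rough calculation. This sketch assumes that a fixed block is always allocated at full size once any of its timesteps is written, that the imported timesteps are contiguous from the start of the first block, and that values take 4 bytes; the function names and parameters are hypothetical.

```python
def fixed_block_bytes(imported_steps: int, block_steps: int,
                      cells: int, value_size: int = 4) -> int:
    """Fixed blocks: every touched block is allocated at full size."""
    blocks = -(-imported_steps // block_steps)  # ceiling division
    return blocks * block_steps * cells * value_size


def variable_block_bytes(imported_steps: int, cells: int,
                         value_size: int = 4) -> int:
    """Variable blocks: only the imported timesteps take up space."""
    return imported_steps * cells * value_size


fixed = fixed_block_bytes(imported_steps=30, block_steps=24, cells=100)
variable = variable_block_bytes(imported_steps=30, cells=100)
```

Under these assumptions, 30 imported timesteps of 100 cells occupy 19200 bytes in fixed blocks and 12000 bytes in variable blocks. The unused space in the fixed blocks is what later allows single timesteps to be added without rewriting a block.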

Use cases

  • Datasets with large number of timesteps queried by isolated points.
  • Forecast results with multiple timesteps queried by isolated points.

Multi timestep storage per file

Used for internal storage efficiency in the case of small spatial domains.

  • Optimized for Timestep retrieval.
  • Temporal domain is divided into groups. Each timestep group is stored in 1 file to avoid large number of small files.
  • This storage strategy is used for smaller spatial domains with a larger number of timesteps.
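Locating a timestep inside a grouped file can be sketched as follows. The group size, the 4-byte value size and the function name `time_step_location` are assumptions for illustration only; the real file layout may differ.

```python
def time_step_location(time_step: int, steps_per_file: int,
                       cells: int, value_size: int = 4):
    """Return (file_index, start_byte, end_byte) for one whole timestep."""
    file_index, local_step = divmod(time_step, steps_per_file)
    step_bytes = cells * value_size  # size of one timestep of the spatial domain
    start = local_step * step_bytes
    return file_index, start, start + step_bytes


location = time_step_location(time_step=130, steps_per_file=50, cells=200)
```

Here timestep 130 is found in file 2 at bytes 24000 to 24800, so a timestep request is still a single contiguous read. With 50 timesteps per file, a dataset of 10000 timesteps needs 200 files instead of 10000.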