pygeobase package

Submodules

pygeobase.io_base module

class pygeobase.io_base.GriddedBase(path, grid, ioclass, mode='r', fn_format='{:04d}', ioclass_kws=None)[source]

Bases: object

The GriddedBase class uses another IO class together with a grid object to read/write a dataset under the given path.

Parameters:
  • path (string) – Path to dataset.

  • grid (pygeogrids.BasicGrid or CellGrid instance) – Grid on which the time series data is stored.

  • ioclass (class) – IO class.

  • mode (str, optional) – File mode: read ‘r’, write ‘w’ or append ‘a’. Default: ‘r’

  • fn_format (str, optional) – The string format of the cell files. Default: ‘{:04d}’

  • ioclass_kws (dict, optional) – Additional keyword arguments for the ioclass. Default: None
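As an illustration of fn_format, the default ‘{:04d}’ zero-pads a cell number to four digits. The sketch below assumes the formatted string is used as the stem of a per-cell file under path; the helper name cell_filename and the ‘.nc’ extension are hypothetical and not part of pygeobase.

```python
import os


def cell_filename(path, cell, fn_format="{:04d}", ext=".nc"):
    """Build a hypothetical per-cell file path from a cell number."""
    # str.format applies the zero-padded integer format to the cell number
    return os.path.join(path, fn_format.format(cell) + ext)


stem = "{:04d}".format(29)          # zero-padded to four digits
full = cell_filename("data", 29)    # joined with the dataset path
```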

close()[source]

Close file.

flush()[source]

Flush data.

get_spatial_subset(gpis=None, cells=None, ll_bbox=None, grid=None)[source]

Select spatial subset and return data set with new grid.

Parameters:
  • gpis (numpy.ndarray) – Grid point indices.

  • cells (numpy.ndarray) – Cell number.

  • ll_bbox (tuple (latmin, latmax, lonmin, lonmax)) – Lat/Lon bounding box

  • grid (pygeogrids.CellGrid) – Grid object.

Returns:

dataset – New data set for the spatial subset.

Return type:

GriddedBase or child
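To make the ll_bbox tuple order concrete, here is a minimal sketch of a bounding-box selection. It uses a plain dict as a toy stand-in for a grid object; the function gpis_in_bbox is hypothetical and does not reflect how pygeobase or pygeogrids perform the selection internally.

```python
def gpis_in_bbox(points, ll_bbox):
    """Return the grid point indices whose coordinates fall inside ll_bbox.

    points: dict mapping gpi -> (lon, lat); a toy stand-in for a grid object.
    ll_bbox: (latmin, latmax, lonmin, lonmax), the order documented above.
    """
    latmin, latmax, lonmin, lonmax = ll_bbox
    return [
        gpi
        for gpi, (lon, lat) in sorted(points.items())
        if latmin <= lat <= latmax and lonmin <= lon <= lonmax
    ]


points = {0: (16.4, 48.2), 1: (2.3, 48.9), 2: (-74.0, 40.7)}
subset = gpis_in_bbox(points, (45.0, 50.0, 0.0, 20.0))
```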

iter_gp(**kwargs)[source]

Yield all values for all grid points.

Yields:
  • data (pandas.DataFrame) – Data set.

  • gp (int) – Grid point.

read(*args, **kwargs)[source]

Takes either 1 or 2 arguments and calls the correct function, which is either reading the gpi directly or finding the nearest gpi from the given lat, lon coordinates and then reading it.

write(*args, **kwargs)[source]

Takes either 1 or 2 arguments and calls the correct function which is either writing the gpi directly or finding the nearest gpi from given lat,lon coordinates and then writing it.
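The argument-count dispatch described for read and write can be sketched as follows. This is a toy stand-in, not pygeobase code: the class ToyGriddedReader, its dict-based storage, the nearest-neighbour search in degrees, and the (lon, lat) argument order are all assumptions made for illustration.

```python
class ToyGriddedReader:
    """Toy stand-in illustrating the 1-or-2 argument dispatch."""

    def __init__(self, points, data):
        self.points = points  # gpi -> (lon, lat)
        self.data = data      # gpi -> stored value

    def _nearest_gpi(self, lon, lat):
        # Squared distance in degrees is enough for a toy nearest-neighbour search
        def dist2(gpi):
            plon, plat = self.points[gpi]
            return (plon - lon) ** 2 + (plat - lat) ** 2

        return min(self.points, key=dist2)

    def read(self, *args):
        if len(args) == 1:
            # One argument: treat it as a grid point index
            return self.data[args[0]]
        if len(args) == 2:
            # Two arguments: treat them as coordinates and look up the nearest gpi
            lon, lat = args
            return self.data[self._nearest_gpi(lon, lat)]
        raise ValueError("read() expects a gpi or a (lon, lat) pair")


reader = ToyGriddedReader(
    points={0: (16.4, 48.2), 1: (2.3, 48.9)},
    data={0: "vienna", 1: "paris"},
)
by_gpi = reader.read(1)
by_coords = reader.read(16.0, 48.0)
```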

class pygeobase.io_base.GriddedTsBase(path, grid, ioclass, mode='r', fn_format='{:04d}', ioclass_kws=None)[source]

Bases: GriddedBase

The GriddedTsBase class uses another IO class together with a grid object to read/write a time series dataset under the given path.

class pygeobase.io_base.ImageBase(filename, mode='r', **kwargs)[source]

Bases: object

ImageBase class serves as a template for i/o objects used for reading and writing image data.

Parameters:
  • filename (str) – Filename path.

  • mode (str, optional) – Opening mode. Default: ‘r’

abstract close()[source]

Close file.

abstract flush()[source]

Flush data.

abstract read(**kwargs)[source]

Read data of an image file.

Returns:

image – pygeobase.object_base.Image object

Return type:

object

read_masked_data(**kwargs)[source]

Read data of an image file and mask the data according to specifications.

Returns:

image – pygeobase.object_base.Image object

Return type:

object

resample_data(image, index, distance, weights, **kwargs)[source]

Takes an image and resamples (interpolates) the image data to arbitrarily defined locations given by index and distance.

The default implementation just takes the weighted mean of all defined distances.

Parameters:
  • image (pygeobase.object_base.Image or numpy.recarray) – Image or numpy.recarray-like object with shape = (x, )

  • index (np.array) – Index into image data defining a look-up table for data elements used in the interpolation process for each defined target location. For each point in image the neighbors in the target grid are in the index array. This array is of shape (x, max_neighbors).

  • distance (np.array) – Array representing the distances of the image data to the arbitrarily defined locations. The distances of points not to use are set to np.inf. This array is of shape (x, max_neighbors).

  • weights (np.array) – Array representing the weights of the image data that should be used during resampling. The weights of points not to use are set to np.nan. This array is of shape (x, max_neighbors).

Returns:

target – Dictionary with a numpy.ndarray for each field in the input image. We cannot return an image here since we do not know the target latitudes and longitudes.

Return type:

dict
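The weighted mean mentioned above can be sketched without any dependencies. This is an illustration of the technique, not the library's implementation: the function name weighted_mean_resample is hypothetical, plain lists stand in for numpy arrays, and the sketch assumes that each row of index and weights belongs to one output location and that NaN weights mark neighbours to skip.

```python
import math


def weighted_mean_resample(data, index, weights):
    """Resample each field of `data` with a weighted mean.

    data: dict mapping field name -> list of source values.
    index: one row per output location, holding positions into the source values.
    weights: same shape as index; NaN marks neighbours that must be ignored.
    """
    target = {}
    for name, values in data.items():
        resampled = []
        for idx_row, w_row in zip(index, weights):
            # Keep only neighbours with a usable (non-NaN) weight
            pairs = [(values[i], w) for i, w in zip(idx_row, w_row) if not math.isnan(w)]
            total = sum(w for _, w in pairs)
            # No usable neighbour: the result is undefined for this location
            resampled.append(sum(v * w for v, w in pairs) / total if total else math.nan)
        target[name] = resampled
    return target


data = {"sm": [1.0, 3.0, 5.0]}
index = [[0, 1], [1, 2]]
weights = [[1.0, 1.0], [3.0, math.nan]]
target = weighted_mean_resample(data, index, weights)
```

As in the documented return value, the result is a dict with one entry per input field rather than an image, because the target coordinates are not known at this point.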

abstract write(image, **kwargs)[source]

Write data to an image file.

Parameters:

image (object) – pygeobase.object_base.Image object

class pygeobase.io_base.IntervalReadingMixin(*args, **kwargs)[source]

Bases: object

This class overrides methods to enable reading multiple images in a time interval as one chunk, e.g. reading 3-minute files in 50-minute half-orbit chunks.

read(interval, **kwargs)[source]

Return an image for a specific interval.

Parameters:

interval (tuple) – (start, end)

Returns:

image – pygeobase.object_base.Image object

Return type:

object

tstamps_for_daterange(startdate, enddate)[source]

Split the period between startdate and enddate into intervals of size self.chunk_minutes. These interval reference dates are then translated to the actual file dates when the chunks are read.

Returns:

intervals – list of (start, end) of intervals

Return type:

list of tuples

class pygeobase.io_base.MultiTemporalImageBase(path, ioclass, mode='r', fname_templ='', datetime_format='', subpath_templ=None, ioclass_kws=None, exact_templ=True, dtime_placeholder='datetime')[source]

Bases: object

The MultiTemporalImageBase class makes use of an ImageBase object to read/write a sequence of multi-temporal images under a given path.

Parameters:
  • path (string) – Path to dataset.

  • ioclass (class) – IO class.

  • mode (str, optional) – File mode and can be read ‘r’, write ‘w’ or append ‘a’. Default: ‘r’

  • fname_templ (str) – Filename template of the data to read. Default placeholder for parsing datetime information into the fname_templ is “{datetime}”. e.g. “ASCAT_{datetime}_image.nc” will be translated into the filename ASCAT_20070101_image.nc for the date 2007-01-01.

  • datetime_format (str) – String specifying the format in which the datetime object is inserted into fname_templ. E.g. “%Y/%m” will result in 2007/01 for datetime 2007-01-01 12:15:00.

  • subpath_templ (list, optional) – If given, it is used to generate sub-paths from the given timestamp. Each item in the list represents one folder level. For example, if the files for May 2007 are in the folders 2007/05/, they can be accessed via the list [‘%Y’, ‘%m’].

  • ioclass_kws (dict) – Additional keyword arguments for the ioclass.

  • exact_templ (boolean, optional) – If True then the fname_templ matches the filename exactly. If False then the fname_templ will be used in glob to find the file.

  • dtime_placeholder (str) – String used in fname_templ as placeholder for datetime. Default value is “datetime”.
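How fname_templ, datetime_format, subpath_templ and dtime_placeholder combine can be sketched as below. The function build_filename is a hypothetical helper written for this illustration; it only mirrors the behaviour described in the parameter list for the exact_templ=True case and is not the library's own code.

```python
import os
from datetime import datetime


def build_filename(path, timestamp, fname_templ, datetime_format,
                   subpath_templ=None, dtime_placeholder="datetime"):
    """Assemble a file path from a timestamp and the documented templates."""
    # Fill the placeholder in the filename template with the formatted timestamp
    fname = fname_templ.format(**{dtime_placeholder: timestamp.strftime(datetime_format)})
    # Each subpath_templ item is a strftime pattern for one folder level
    subdirs = [timestamp.strftime(level) for level in (subpath_templ or [])]
    return os.path.join(path, *subdirs, fname)


full_path = build_filename(
    "/data",
    datetime(2007, 5, 1),
    fname_templ="ASCAT_{datetime}_image.nc",
    datetime_format="%Y%m%d",
    subpath_templ=["%Y", "%m"],
)
```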

close()[source]

Close file.

daily_images(day, **kwargs)[source]

Yield all images for a day.

Parameters:

day (datetime.date) –

Returns:

img – pygeobase.object_base.Image object

Return type:

object

flush()[source]

Flush data.

get_tstamp_from_filename(filename)[source]

Return the timestamp contained in a given file name in accordance with the defined fname_templ.

Parameters:

filename (string) – File name.

Returns:

tstamp – Time stamp according to fname_templ as datetime object.

Return type:

datetime.datetime
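One way to recover a timestamp from a file name is to strip the fixed parts of the template and parse the remainder. The sketch below takes that approach; the function tstamp_from_filename is hypothetical, and the library may extract the date string differently.

```python
import os
from datetime import datetime


def tstamp_from_filename(filename, fname_templ, datetime_format,
                         dtime_placeholder="datetime"):
    """Parse the timestamp embedded in a file name that follows fname_templ."""
    # Split the template into the literal text before and after the placeholder
    prefix, _, suffix = fname_templ.partition("{" + dtime_placeholder + "}")
    name = os.path.basename(filename)
    if not (name.startswith(prefix) and name.endswith(suffix)):
        raise ValueError("file name does not match the template")
    # Whatever sits between prefix and suffix is the formatted timestamp
    stamp = name[len(prefix):len(name) - len(suffix)]
    return datetime.strptime(stamp, datetime_format)


tstamp = tstamp_from_filename(
    "ASCAT_20070101_image.nc", "ASCAT_{datetime}_image.nc", "%Y%m%d"
)
```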

iter_images(start_date, end_date, **kwargs)[source]

Yield all images for a given date range.

Parameters:
  • start_date – Start of the date range.

  • end_date – End of the date range.

Returns:

image – pygeobase.object_base.Image object

Return type:

object

read(timestamp, **kwargs)[source]

Return an image for a specific timestamp.

Parameters:

timestamp (datetime.datetime) – Time stamp.

Returns:

image – pygeobase.object_base.Image object

Return type:

object

resample_image(*args, **kwargs)[source]

tstamps_for_daterange(start_date, end_date)[source]

Return all valid timestamps in a given date range. This method must be implemented if iteration over images should be possible.

Parameters:
  • start_date – Start of the date range.

  • end_date – End of the date range.

Returns:

dates – list of datetimes

Return type:

list
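The relationship between tstamps_for_daterange and image iteration can be sketched with a toy class. ToyDailyImages is a hypothetical stand-in that assumes one image per day and returns plain dicts instead of Image objects; it only shows why iteration depends on this method being implemented.

```python
from datetime import datetime, timedelta


class ToyDailyImages:
    """Toy stand-in with one image per day; not a pygeobase class."""

    def tstamps_for_daterange(self, start_date, end_date):
        # One timestamp per day, including both ends of the range
        days = (end_date - start_date).days
        return [start_date + timedelta(days=d) for d in range(days + 1)]

    def read(self, timestamp, **kwargs):
        # A real reader would open the file for this timestamp
        return {"timestamp": timestamp}

    def iter_images(self, start_date, end_date, **kwargs):
        # Iteration is driven entirely by the timestamps returned above
        for timestamp in self.tstamps_for_daterange(start_date, end_date):
            yield self.read(timestamp, **kwargs)


images = list(ToyDailyImages().iter_images(datetime(2007, 1, 1), datetime(2007, 1, 3)))
```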

write(timestamp, data, **kwargs)[source]

Write image data for a given timestamp.

Parameters:
  • timestamp (datetime.datetime) – exact timestamp of the image

  • data (object) – pygeobase.object_base.Image object

class pygeobase.io_base.StaticBase(filename, mode='r', **kwargs)[source]

Bases: object

The StaticBase class serves as a template for i/o objects used in GriddedStaticBase.

Parameters:
  • filename (str) – File name.

  • mode (str, optional) – Opening mode. Default: ‘r’

abstract close()[source]

Close file.

abstract flush()[source]

Flush data.

abstract read(gpi)[source]

Read data for given grid point.

Parameters:

gpi (int) – Grid point index.

Returns:

data – Data set.

Return type:

numpy.ndarray

abstract write(data)[source]

Write data.

Parameters:

data (numpy.ndarray) – Data records.

class pygeobase.io_base.TsBase(filename, mode='r', **kwargs)[source]

Bases: object

The TsBase class serves as a template for i/o objects used in GriddedTsBase.

Parameters:
  • filename (str) – File name.

  • mode (str, optional) – Opening mode. Default: ‘r’

close()[source]

Close file.

flush()[source]

Flush data.

abstract read(gpi, **kwargs)[source]

Read time series data for given grid point.

Parameters:

gpi (int) – Grid point index.

Returns:

data – pygeobase.object_base.TS object.

Return type:

object

abstract write(gpi, data, **kwargs)[source]

Write data.

Parameters:
  • gpi (int) – Grid point index.

  • data (object) – pygeobase.object_base.TS object.
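The template pattern used by TsBase (abstract read and write, concrete close and flush) can be sketched with the standard abc module. To stay self-contained the example does not import pygeobase: TsTemplate is a hypothetical stand-in that mirrors the documented signatures, and InMemoryTs is a toy subclass that stores data in a dict instead of a file.

```python
from abc import ABC, abstractmethod


class TsTemplate(ABC):
    """Hypothetical stand-in mirroring the documented TsBase interface."""

    def __init__(self, filename, mode="r", **kwargs):
        self.filename = filename
        self.mode = mode
        self.kwargs = kwargs

    @abstractmethod
    def read(self, gpi, **kwargs):
        """Read time series data for the given grid point."""

    @abstractmethod
    def write(self, gpi, data, **kwargs):
        """Write time series data for the given grid point."""

    def flush(self):
        """Nothing to flush by default."""

    def close(self):
        """Nothing to close by default."""


class InMemoryTs(TsTemplate):
    """Toy implementation that keeps data in a dict instead of a file."""

    def __init__(self, filename, mode="r", **kwargs):
        super().__init__(filename, mode=mode, **kwargs)
        self._store = {}

    def read(self, gpi, **kwargs):
        return self._store[gpi]

    def write(self, gpi, data, **kwargs):
        if self.mode == "r":
            raise IOError("opened read-only")
        self._store[gpi] = data


ts = InMemoryTs("unused.bin", mode="w")
ts.write(42, [0.1, 0.2])
```

A subclass only has to supply read and write; the base cannot be instantiated until both are defined.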

pygeobase.object_base module

class pygeobase.object_base.Image(lon, lat, data, metadata, timestamp, timekey=None)[source]

Bases: object

The Image class represents the base object of an image.

Parameters:
  • lon (numpy.array) – array of longitudes

  • lat (numpy.array) – array of latitudes

  • data (dict) – dictionary of numpy arrays that holds the image data for each variable of the dataset

  • metadata (dict) – dictionary that holds metadata

  • timestamp (datetime.datetime) – exact timestamp of the image

  • timekey (str, optional) – Key of the time variable, if available, stored in data dictionary.

property dtype

Fake numpy recarray dtype field based on the dictionary keys and the dtype of the numpy array.

class pygeobase.object_base.TS(gpi, lon, lat, data, metadata)[source]

Bases: object

The TS class represents the base object of a time series.

Parameters:
  • gpi (int) – Grid point index.

  • lon (float) – Longitude of the time series

  • lat (float) – Latitude of the time series

  • data (pandas.DataFrame) – Pandas DataFrame that holds data for each variable of the time series

  • metadata (dict) – dictionary that holds metadata

plot(*args, **kwargs)[source]

Wrapper for pandas.DataFrame.plot which adds a title to the plot and drops NaN values for plotting.

Returns:

ax – matplotlib axes of the plot

Return type:

axes

pygeobase.utils module

pygeobase.utils.split_daterange_in_intervals(start, end, mi)[source]

Split a date range into non-overlapping intervals, each mi minutes minus 1 microsecond long.

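Parameters:
  • start – Start of the date range.

  • end – End of the date range.

  • mi – Interval length in minutes.
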
Returns:

intervals – list of (start, end) of intervals

Return type:

list of tuples
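The interval splitting can be sketched with the standard datetime module. This is an independent illustration rather than the library's code: the function name split_into_intervals is hypothetical, and stopping once the next start reaches end (so the final interval is not clipped to end) is an assumption about boundary handling.

```python
from datetime import datetime, timedelta


def split_into_intervals(start, end, minutes):
    """Split [start, end) into consecutive (start, end) tuples of `minutes` length."""
    step = timedelta(minutes=minutes)
    # Ending each interval 1 microsecond early keeps neighbours from overlapping
    span = step - timedelta(microseconds=1)
    intervals = []
    current = start
    while current < end:
        intervals.append((current, current + span))
        current += step
    return intervals


intervals = split_into_intervals(
    datetime(2007, 1, 1, 0, 0), datetime(2007, 1, 1, 1, 0), 30
)
```

Each interval ends one microsecond before the next one starts, so a timestamp on a boundary belongs to exactly one interval.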

Module contents