netCDF I/O module
netCDF4 functions to copy a netcdf file while doing some transformations on variables and dimensions.
An elaborate example of copying a netcdf file could remove one variable (removevar), rename another variable (renamevar), replace the name and values of a third variable (replacevar), set some attributes of old and new variables (replaceatt), and leave the output file open for further modification (noclose):
import numpy as np
import pyjams.ncio as ncio
newarr = np.full((720, 360), 273.15)
fo = ncio.copy_file('infile.nc', 'outfile.nc',
                    removevar=['var0'],
                    renamevar={'var1': 'varnew1'},
                    replacevar={'var2': {'varnew2': newarr}},
                    replaceatt={'varnew1': {'long_name': 'renamed var1'},
                                'varnew2': {'long_name': 'new variable for var2',
                                            'units': 'arbitrary'}},
                    noclose=True)
# change var3 afterwards
ovar = fo.variables['var3']
ovar[:] = ovar[:] * 3.
fo.close()
The individual routines for manipulating dimensions and variables can be combined, for example, to copy all variables of a file while making latitudinal means, i.e. averaging over the longitudes (assumed to be the last dimension), and to add a new variable:
import numpy as np
import netCDF4 as nc
import pyjams.ncio as ncio
ifile = 'input.nc' # name of input file
vtime = 'time' # name of time variable
vlon = 'lon' # name of longitude variable
# open files
ofile = ncio.set_output_filename(ifile, '-latmean.nc')
fi = nc.Dataset(ifile, 'r')
if 'file_format' in dir(fi):
    fo = nc.Dataset(ofile, 'w', format=fi.file_format)
else:
    fo = nc.Dataset(ofile, 'w', format='NETCDF4')
ntime = fi.dimensions[vtime].size
# meta data
ncio.copy_global_attributes(fi, fo, add={'history': 'latitudinal mean'})
# copy dimensions
ncio.copy_dimensions(fi, fo, removedim=[vlon])
# create variables
# this could be one command (time=None or keyword time left out)
# but I like to have the non-time dependent variables at the beginning
# of the netcdf file
# create static variables (independent of time)
ncio.create_variables(fi, fo, time=False, timedim=vtime, fill=True,
                      removedim=[vlon])
# create dynamic variables (time dependent)
ncio.create_variables(fi, fo, time=True, timedim=vtime, fill=True,
                      removedim=[vlon])
# create new variable
dims = list(fi.variables['var1'].dimensions)
dims = dims[:-1]
odict = {'name': 'var2',
         'dtype': fi.variables['var1'].dtype,
         'standard_name': 'var2',
         'long_name': '2nd variable',
         'dimensions': dims}
var2 = ncio.create_new_variable(odict, fo, izip=True, fill=True)
# copy static variables (not time-dependent) making latitudinal means
for ivar in fi.variables.values():
    if vtime not in ivar.dimensions:
        ovar = fo.variables[ivar.name]
        if vlon in ivar.dimensions:
            # read the data first: netCDF4 variables have no mean method
            ovar[:] = ivar[:].mean(axis=-1)
        else:
            ovar[:] = ivar[:]
# copy dynamic variables (time-dependent) making latitudinal means
for tt in range(ntime):
    for ivar in fi.variables.values():
        if vtime in ivar.dimensions:
            ovar = fo.variables[ivar.name]
            if vlon in ivar.dimensions:
                ovar[tt, ...] = ivar[tt, ...].mean(axis=-1)
            else:
                if ivar.ndim == 1:
                    ovar[tt] = ivar[tt]
                else:
                    ovar[tt, ...] = ivar[tt, ...]
# set new variable
shape1 = fi.variables['var1'].shape
shape1 = shape1[:-1]
var2[:] = np.arange(np.prod(shape1)).reshape(shape1)
# finish
fi.close()
fo.close()
- copyright:
Copyright 2020-2022 Matthias Cuntz, see AUTHORS.rst for details.
- license:
MIT License, see LICENSE for details.
Subpackages
netCDF4 functions to copy a netcdf file while doing some transformations on variables and dimensions.
- copy_dimensions(fi, fo, removedim=[], renamedim={}, changedim={}, adddim={})
Create dimensions in output file from dimensions in input file.
- Parameters:
fi (file_handle) – File handle of opened netcdf input file
fo (file_handle) – File handle of opened netcdf output file
removedim (list of str, optional) – Do not create dimensions given in removedim in output file.
renamedim (dict, optional) – Rename dimensions in output file compared to input file. Dimension names in input file are given as dictionary keys, corresponding dimension names of output file are given as dictionary values.
changedim (dict, optional) – Change the size of the output dimension compared to the input file. Dimension names are given as dictionary keys, corresponding dimension sizes are given as dictionary values.
adddim (dict, optional) – Add dimension to output file. New dimension names are given as dictionary keys and new dimension sizes are given as dictionary values.
- Returns:
The output file will have the altered and unaltered dimensions of the input file.
- Return type:
nothing
Examples
copy_dimensions(fi, fo, removedim=['patch'], renamedim={'x': 'lon', 'y': 'lat'}, changedim={'mland': 1})
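How the four keywords combine can be sketched on a plain dictionary of dimension sizes, without opening any file. The helper plan_dimensions below is hypothetical and not part of pyjams.ncio; it assumes that changedim is keyed by the dimension names of the input file and that dimensions given in adddim are appended last.

```python
def plan_dimensions(dims, removedim=(), renamedim=None, changedim=None,
                    adddim=None):
    """Return the output dimensions as a {name: size} dictionary.

    dims maps the input dimension names to their sizes.
    Hypothetical helper for illustration, not part of pyjams.ncio.
    """
    renamedim = renamedim or {}
    changedim = changedim or {}
    adddim = adddim or {}
    out = {}
    for name, size in dims.items():
        if name in removedim:
            continue  # dimension is not created in the output
        # assumption: changedim is keyed by the input dimension name
        size = changedim.get(name, size)
        out[renamedim.get(name, name)] = size
    out.update(adddim)  # assumption: new dimensions come last
    return out


dims_in = {'x': 4, 'y': 3, 'patch': 2, 'mland': 10}
dims_out = plan_dimensions(dims_in, removedim=['patch'],
                           renamedim={'x': 'lon', 'y': 'lat'},
                           changedim={'mland': 1})
print(dims_out)  # {'lon': 4, 'lat': 3, 'mland': 1}
```

The call mirrors the keywords of the example above: patch is dropped, x and y are renamed, and mland shrinks to size 1.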
- copy_file(ifile, ofile, timedim='time', removevar=[], renamevar={}, replacevar={}, replaceatt={}, addglobalatt={}, noclose=False)
Copy variables from input file into output file.
- Parameters:
ifile (str) – File name of netcdf input file
ofile (str) – File name of netcdf output file
timedim (str, optional) – Name of time dimension in input file (default: ‘time’).
removevar (list of str, optional) – Do not copy variables given in removevar to output file.
renamevar (dict, optional) – Copy variables from input file with different name in output file. Variable names in input file are given as dictionary keys, corresponding variable names of output file are given as dictionary values.
replacevar (dict, optional) – Replace existing variables with variables in dictionary. Variable names in input file are given as dictionary keys, dictionary values are also dictionaries where keys are the output variable name and values are the output variable values.
replaceatt (dict, optional) – Replace or set attributes of variables in dictionary keys (case sensitive). Dictionary values are also dictionaries with {‘attribute_name’: attribute_value}. Dictionary keys are the names of the output variables after renaming and replacing.
addglobalatt (dict, optional) – Create or add to global file attributes. dict values will be given to attributes given in dict keys. Attributes will be created if they do not exist yet.
noclose (bool, optional) – Return file handle of opened output file for further manipulation if True (default: False)
- Returns:
The output file will have the altered or unaltered variables copied from the input file.
- Return type:
nothing or file_handle
Examples
ovar = np.arange(100)
copy_file('in.nc', 'out.nc',
          renamevar={'lon': 'longitude'},
          replacevar={'var1': {'arange': ovar}},
          replaceatt={'arange': {'long_name': 'A range', 'unit': '-'}})
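Because replaceatt is keyed by the output names after renaming and replacing, it can help to work out these names first. The following is a minimal sketch on plain Python lists and dictionaries; output_names is a hypothetical helper, not part of pyjams.ncio, and it assumes that replacevar takes precedence over renamevar and that each inner dictionary of replacevar holds exactly one entry.

```python
def output_names(invars, removevar=(), renamevar=None, replacevar=None):
    """Map input variable names to the names used in the output file.

    Hypothetical helper for illustration, not part of pyjams.ncio.
    """
    renamevar = renamevar or {}
    replacevar = replacevar or {}
    names = {}
    for name in invars:
        if name in removevar:
            continue  # variable is not copied
        if name in replacevar:
            # inner dictionary has the form {output name: output values}
            names[name] = next(iter(replacevar[name]))
        else:
            names[name] = renamevar.get(name, name)
    return names


mapping = output_names(['lon', 'lat', 'var0', 'var1'],
                       removevar=['var0'],
                       renamevar={'lon': 'longitude'},
                       replacevar={'var1': {'arange': list(range(100))}})
print(mapping)  # {'lon': 'longitude', 'lat': 'lat', 'var1': 'arange'}
```

With this mapping, attributes for the replaced variable would be given under 'arange' in replaceatt, not under 'var1'.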
- copy_global_attributes(fi, fo, add={}, remove=[])
Create global output file attributes from input global file attributes.
- Parameters:
fi (file_handle) – File handle of opened netcdf input file
fo (file_handle) – File handle of opened netcdf output file
add (dict, optional) – dict values will be given to attributes given in dict keys. Attributes will be created if they do not exist yet.
remove (list, optional) – Do not create global attributes given in remove in the output file.
- Returns:
Output will have global file attributes
- Return type:
nothing
Examples
copy_global_attributes(fi, fo, add={'history': time.asctime() + ': ' + ' '.join(sys.argv)})
- copy_variables(fi, fo, time=None, timedim='time', removevar=[], renamevar={})
Copy variables from input file into output file.
- Parameters:
fi (file_handle) – File handle of opened netcdf input file
fo (file_handle) – File handle of opened netcdf output file
time (None or bool, optional) – None: copy all variables (default). True: copy only variables having dimension timedim. False: copy only variables that do not have dimension timedim.
timedim (str, optional) – Name of time dimension (default: ‘time’).
removevar (list of str, optional) – Do not copy variables given in removevar to output file.
renamevar (dict, optional) – Copy variables from input file with different name in output file. Variable names in input file are given as dictionary keys, corresponding variable names of output file are given as dictionary values.
- Returns:
The output file will have the altered or unaltered variables copied from the input file.
- Return type:
nothing
Examples
copy_variables(fi, fo, renamevar={'lon': 'longitude'})
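The three states of the time keyword can be illustrated without a netcdf file. This is a sketch only: select_by_time is a hypothetical helper, not part of pyjams.ncio, and the variables are represented by a dictionary that maps each variable name to its tuple of dimension names.

```python
def select_by_time(variables, time=None, timedim='time'):
    """Return the names of the variables selected by the time keyword.

    variables maps variable names to tuples of dimension names.
    Hypothetical helper for illustration, not part of pyjams.ncio.
    """
    if time is None:
        return list(variables)  # all variables
    # True: only variables with timedim, False: only variables without
    return [name for name, dims in variables.items()
            if (timedim in dims) == time]


variables = {'lat': ('lat',),
             'lon': ('lon',),
             'time': ('time',),
             'GPP': ('time', 'lat', 'lon')}
print(select_by_time(variables))              # ['lat', 'lon', 'time', 'GPP']
print(select_by_time(variables, time=True))   # ['time', 'GPP']
print(select_by_time(variables, time=False))  # ['lat', 'lon']
```

Calling a routine once with time=False and once with time=True therefore covers every variable exactly once, with the static variables handled first.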
- create_new_variable(invardef, fo, izip=False, fill=None, chunksizes=True)
Create variable in output file from dictionary with variable attributes.
- Parameters:
invardef (dict) – Dictionary with ‘name’ and ‘dtype’ plus, optionally, further keys that are used in netCDF4.Dataset.createVariable: ‘dimensions’, ‘zlib’, ‘complevel’, ‘shuffle’, ‘fletcher32’, ‘contiguous’, ‘chunksizes’, ‘endian’, ‘least_significant_digit’, ‘fill_value’, ‘chunk_cache’. All other entries are set as variable attributes.
fo (file_handle) – File handle of opened netcdf output file
izip (bool, optional) – True: the data will be compressed in the netCDF file using gzip compression independent of ‘zlib’ entry in input dictionary invardef (default: False).
fill (float, bool or None, optional) – Determine the behaviour if variable has no _FillValue or missing_value. If None or False: no _FillValue will be set. If True: _FillValue will be set to default value of the Python netCDF4 package for this type. If number: _FillValue will be set to number.
chunksizes (bool, optional) – True: include possible chunksizes in output file (default). False: do not include chunksize information from input file in output file, even if given in input dictionary invardef.
- Returns:
Handle to newly created variable in output file.
- Return type:
variable handle
Examples
nvar = {'name': 'new_field',
        'dtype': np.dtype(float),
        'dimensions': ('time', 'y', 'x'),
        'units': 'kg/m2/s'}
ovar = create_new_variable(nvar, fo, fill=True, izip=True)
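The division of invardef into entries for netCDF4.Dataset.createVariable and entries that become variable attributes can be sketched as follows. split_definition and CREATE_KEYS are hypothetical names, not part of pyjams.ncio; the key list is the one given in the parameter description above, and the dtype is written as the string 'f8' only to keep the sketch free of imports.

```python
# keys named in the description of invardef above
CREATE_KEYS = ('dimensions', 'zlib', 'complevel', 'shuffle', 'fletcher32',
               'contiguous', 'chunksizes', 'endian',
               'least_significant_digit', 'fill_value', 'chunk_cache')


def split_definition(invardef):
    """Split a variable definition into its four parts.

    Returns name, dtype, the entries meant for createVariable, and
    the remaining entries that would be set as variable attributes.
    Hypothetical helper for illustration, not part of pyjams.ncio.
    """
    create = {k: v for k, v in invardef.items() if k in CREATE_KEYS}
    attrs = {k: v for k, v in invardef.items()
             if k not in CREATE_KEYS and k not in ('name', 'dtype')}
    return invardef['name'], invardef['dtype'], create, attrs


nvar = {'name': 'new_field',
        'dtype': 'f8',
        'dimensions': ('time', 'y', 'x'),
        'units': 'kg/m2/s'}
name, dtype, create, attrs = split_definition(nvar)
print(create)  # {'dimensions': ('time', 'y', 'x')}
print(attrs)   # {'units': 'kg/m2/s'}
```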
- create_variables(fi, fo, time=None, timedim='time', izip=False, fill=None, chunksizes=True, removevar=[], renamevar={}, removedim=[], renamedim={}, replacedim={})
Create variables in output from variables in input file.
- Parameters:
fi (file_handle) – File handle of opened netcdf input file
fo (file_handle) – File handle of opened netcdf output file
time (None or bool, optional) – None: create all variables (default). True: create only variables having dimension timedim. False: create only variables that do not have dimension timedim.
timedim (str, optional) – Name of time dimension (default: ‘time’).
izip (bool, optional) – True: the data will be compressed in the netCDF file using gzip compression (default: False).
fill (float, bool or None, optional) – Determine the behaviour if a variable has no _FillValue or missing_value. If None or False: no _FillValue will be set. If True: _FillValue will be set to default value of the Python package netCDF4 for this type. If number: _FillValue will be set to number.
chunksizes (bool, optional) – True: include possible chunksizes in output file (default). False: do not include chunksize information from input file in output file. Set to False, for example, if dimension size gets changed because the chunksize on a dimension can not be greater than the dimension size.
removevar (list of str, optional) – Do not create variables given in removevar in output file.
renamevar (dict, optional) – Rename variables in output file compared to input file. Variable names in input file are given as dictionary keys, corresponding variable names of output file are given as dictionary values.
removedim (list of str, optional) – Remove dimensions from variable definitions in output file.
renamedim (dict, optional) – Rename dimensions for variables in output file. Dimension names in input file are given as dictionary keys, corresponding dimension names of output file are given as dictionary values.
replacedim (dict, optional) – Replace dimensions for variables in output file. Dimension names in input file are given as dictionary keys, corresponding dimension names of output file are given as dictionary values. The output names can be tuples or lists to extend dimensions of a variable.
- Returns:
The output file will have the altered or unaltered variables of the input file defined.
- Return type:
nothing
Examples
create_variables(fi, fo, fill=True, izip=True, removedim=['patch'], renamevar={'lon': 'longitude'}, replacedim={'land': ('y', 'x')})
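The effect of removedim, renamedim and replacedim on the dimensions of a single variable can be sketched on a plain tuple of dimension names. transform_dims is a hypothetical helper, not part of pyjams.ncio; the precedence used here (remove, then replace, then rename) is an assumption for illustration.

```python
def transform_dims(dims, removedim=(), renamedim=None, replacedim=None):
    """Return the dimension names of a variable in the output file.

    Hypothetical helper for illustration, not part of pyjams.ncio.
    """
    renamedim = renamedim or {}
    replacedim = replacedim or {}
    out = []
    for dim in dims:
        if dim in removedim:
            continue  # dimension is dropped from the variable
        if dim in replacedim:
            new = replacedim[dim]
            if isinstance(new, (tuple, list)):
                out.extend(new)  # one dimension extended to several
            else:
                out.append(new)
        else:
            out.append(renamedim.get(dim, dim))
    return tuple(out)


odims = transform_dims(('time', 'patch', 'land'),
                       removedim=['patch'],
                       replacedim={'land': ('y', 'x')})
print(odims)  # ('time', 'y', 'x')
```

A variable on a compressed land vector thus becomes a variable on a (y, x) grid in the output definition, which is the case shown in the example above.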
- get_fill_value_for_dtype(dtype)
Get default _FillValue of netCDF4 for the given data type.
- Parameters:
dtype (np.dtype) – numpy data type
- Return type:
default _FillValue of given numpy data type
Examples
fill_value = get_fill_value_for_dtype(var.dtype)
- get_variable_definition(ncvar)
Collect information on input variable.
- Parameters:
ncvar (netcdf4 variable) – Variable of input file
- Returns:
Dictionary containing information on input variable with key/value pairs. The following keys are returned: ‘name’, ‘dtype’, ‘dimensions’, ‘fill_value’, ‘chunksizes’
- Return type:
dict
Examples
get_variable_definition(fi.variables['GPP'])
- set_output_filename(ifile, ext)
Create output file name from input file name by adding ext before the file suffix.
- Parameters:
ifile (str) – Name of input file
ext (str) – Extension added to the file name before the file suffix
- Returns:
output filename with ext before file suffix
- Return type:
str
>>> set_output_filename('in.nc', '-no_patch')
in-no_patch.nc
>>> set_output_filename('in.nc', '.nop')
in.nop.nc
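The described behaviour can be reproduced with the standard library. This is a minimal sketch, not the implementation in pyjams.ncio; sketch_output_filename is a hypothetical name, and the sketch assumes that the file suffix is whatever os.path.splitext splits off.

```python
import os


def sketch_output_filename(ifile, ext):
    """Insert ext between the root of the file name and its suffix.

    Hypothetical helper for illustration, not part of pyjams.ncio.
    """
    root, suffix = os.path.splitext(ifile)
    return root + ext + suffix


print(sketch_output_filename('in.nc', '-no_patch'))  # in-no_patch.nc
print(sketch_output_filename('in.nc', '.nop'))       # in.nop.nc
```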