seabirdfilehandler.cnvfile module

class seabirdfilehandler.cnvfile.CnvFile(path_to_file, only_header=False, create_dataframe=False, absolute_time_calculation=False, event_log_column=False, coordinate_columns=False)

Bases: DataFile

A representation of a cnv-file as used by SeaBird.

This class extracts and organizes the different types of data and metadata present in such a file. Downstream libraries should be able to use this representation for all applications involving cnv files, such as data processing, transformation, or visualization.

To achieve that, the metadata header is organized by the parent class, DataFile, while the data table is extracted by this class. The data can be represented as a numpy array or a pandas DataFrame. Most data handling is done inside parameters, a representation of the data and metadata of each individual measurement parameter.

This class can also write the edited data and metadata back to the original .cnv file format. This allows custom data processing with this representation while still being able to run Sea-Bird's original software on the output, and it keeps results comparable with other parsers and methods.

Parameters:
  • path_to_file (Path | str) – the path to the file

  • only_header (bool) – whether to stop reading the file after the metadata header

  • create_dataframe (bool) – whether to create a pandas DataFrame from the data table

  • absolute_time_calculation (bool) – whether to use a real timestamp instead of the second count

  • event_log_column (bool) – whether to add a station and device event column from DSHIP

  • coordinate_columns (bool) – whether to add longitude and latitude from the extra metadata header
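
As a rough illustration of how the constructor flags combine, the sketch below wraps the call in a small helper. The import path and keyword names come from the signature above; the helper `load_cast`, the `FULL_READ` defaults, and the file name in the usage comment are hypothetical and only serve the example. The library is imported lazily so the snippet can be read and loaded without it.

```python
from pathlib import Path

# Hypothetical set of defaults for a "full" read; keyword names are taken
# from the CnvFile signature documented above.
FULL_READ = {
    "only_header": False,
    "create_dataframe": True,
    "absolute_time_calculation": True,
    "event_log_column": False,
    "coordinate_columns": False,
}


def load_cast(path, **overrides):
    """Open a cnv file with FULL_READ options, optionally overridden."""
    # Imported here so that defining this helper does not require the library.
    from seabirdfilehandler.cnvfile import CnvFile

    options = {**FULL_READ, **overrides}
    return CnvFile(Path(path), **options)


# Usage (hypothetical file name):
# cnv = load_cast("cast_001.cnv", only_header=True)
```

Passing `only_header=True` would be the cheaper choice when only the metadata is of interest, since reading stops after the header.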

create_dataframe()

Plain dataframe creator.

Return type:

DataFrame

reading_start_time()

Extracts the cast start time from the metadata header. Returns None if no start time is found.

Return type:

datetime | None
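
The idea behind this lookup can be sketched with the standard library alone. This is not the library's implementation: the function name `parse_start_time` is hypothetical, and the header layout (`# start_time = Mar 15 2021 08:30:12 [...]`) is an assumption about how the cast start time typically appears in a cnv header.

```python
import re
from datetime import datetime

# Assumed header layout: "# start_time = Mar 15 2021 08:30:12 [...]".
START_TIME = re.compile(
    r"^#\s*start_time\s*=\s*(\w{3} \d{1,2} \d{4} \d{2}:\d{2}:\d{2})"
)


def parse_start_time(header_lines):
    """Return the cast start time, or None if no start_time line exists."""
    for line in header_lines:
        match = START_TIME.match(line)
        if match:
            # English month abbreviations are assumed (default C locale).
            return datetime.strptime(match.group(1), "%b %d %Y %H:%M:%S")
    return None


header = [
    "* Sea-Bird SBE 9 Data File:",
    "# nquan = 3",
    "# start_time = Mar 15 2021 08:30:12 [Instrument's time stamp, header]",
]
start = parse_start_time(header)
```

Returning None instead of raising mirrors the documented return type, so callers can decide how to handle headers without a start time.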

absolute_time_calculation()

Replaces the basic cnv time representation, which counts relative to the cast's start point, with real UTC timestamps. This operation acts directly on the dataframe.

Return type:

bool
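
The conversion itself amounts to adding each elapsed-time value to the cast start time. The sketch below shows that arithmetic on plain lists; the function name `to_absolute_times` is hypothetical, and it assumes the time column holds elapsed seconds and that the start time is already in UTC.

```python
from datetime import datetime, timedelta, timezone


def to_absolute_times(start_time, elapsed_seconds):
    """Turn seconds since cast start into absolute timestamps."""
    return [start_time + timedelta(seconds=s) for s in elapsed_seconds]


# Assumed inputs: a UTC start time and a column of elapsed seconds.
start = datetime(2021, 3, 15, 8, 30, 12, tzinfo=timezone.utc)
stamps = to_absolute_times(start, [0.0, 0.5, 1.5])
```

Keeping the timestamps timezone-aware avoids ambiguity when the result is later joined with data from other sources.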

add_start_time()

Adds the cast start time to the dataframe. This is required for joins on time.

Return type:

bool

get_processing_step_infos()

Collects the individual validation modules and their respective information, usually present in key-value pairs.

Return type:

CnvProcessingSteps

df2cnv(df=None)

Parses a pandas dataframe into a list that represents the lines inside of a cnv data table.

Parameters:

df (DataFrame) – the DataFrame to export; defaults to self.df

Return type:

a list of lines in the cnv data table format
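
To make the target format concrete, the following sketch turns rows of numbers into fixed-width text lines. It is an illustration rather than the library's code: the function name `rows_to_cnv_lines` is hypothetical, and the column width of 11 characters and the 4 decimal places are assumptions about a typical cnv data table.

```python
def rows_to_cnv_lines(rows, width=11, precision=4):
    """Format each row of numbers as one fixed-width, right-aligned line."""
    return [
        "".join(f"{value:>{width}.{precision}f}" for value in row)
        for row in rows
    ]


# Three columns per scan; the values are made up for the example.
rows = [(1.0, 12.3456, 35.01), (2.0, 12.3401, 35.02)]
lines = rows_to_cnv_lines(rows)
```

Each resulting line has the same length, which is what allows fixed-width readers to split the columns again without a delimiter.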

array2cnv()

Return type:

list

to_cnv(file_name=None, use_dataframe=True)

Writes the values of this instance to disk as a new cnv file.

Parameters:
  • file_name (Path) – the new file name to use for writing

  • use_dataframe (bool) – whether to use the current dataframe as data table

  • use_current_validation_header (bool) – whether to use the current processing module list

  • header_list (list) – the data columns to use for the export
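
Conceptually, writing a cnv file means emitting the header lines, the `*END*` marker that separates header and data, and then the data table. The sketch below shows that assembly with the standard library; the function name `write_cnv`, the UTF-8 encoding, and the sample lines are assumptions made for the example, not the library's behavior.

```python
import tempfile
from pathlib import Path


def write_cnv(path, header_lines, data_lines):
    """Write header lines, the *END* marker and the data table to path."""
    text = "\n".join([*header_lines, "*END*", *data_lines]) + "\n"
    path = Path(path)
    # Encoding is an assumption; real files may use a different one.
    path.write_text(text, encoding="utf-8")
    return path


header = ["* Sea-Bird SBE 9 Data File:", "# nquan = 2"]
data = ["     1.0000    12.3456", "     2.0000    12.3401"]
out = write_cnv(Path(tempfile.mkdtemp()) / "example_out.cnv", header, data)
```

Building the full text first and writing it in one call keeps the output consistent even if formatting one of the lines fails.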

add_processing_metadata(addition)

Adds new processing lines to the list of processing module information.

Parameters:

addition (str) – the new information line

add_station_and_event_column()

Adds a column with the DSHIP station and device event numbers to the dataframe. These must be present inside the extra metadata header.

Return type:

bool

add_position_columns()

Adds longitude and latitude columns to the dataframe. Both values must be present inside the extra metadata header.

Return type:

bool

add_cast_number(number=None)

Adds a column with the cast number to the dataframe.

Parameters:

number (int) – the cast number of this file's cast

Return type:

bool