seabirdfilehandler.datafiles module

Inheritance diagram of seabirdfilehandler.datafiles
class seabirdfilehandler.datafiles.DataFile(path_to_file, only_header=False)[source]

Bases: object

The base class for all Sea-Bird data files: .cnv, .btl, and .bl. One instance of this class, or of one of its children, represents one data text file. The different pieces of information in such a file are structured into individual lists or dictionaries. The data table is loaded as a NumPy array and can be converted to a pandas DataFrame. Datatype-specific behavior is implemented in the subclasses.

Parameters:
  • path_to_file (Path | str :) – The path to the data file.

  • only_header (bool :) – Whether to stop reading the file after the metadata header.

read_file()[source]

Reads and structures all the different information present in the file, using lists and dictionaries as the data structures of choice. Basic prefix checking distinguishes the different kinds of header information.
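The prefix checking described above can be illustrated with a minimal sketch. It is not the library's implementation: the function name, the bucket names, and the exact prefixes ("*" for instrument lines, "**" for custom metadata, "#" for processing information, "*END*" as the end of the header) are assumptions based on the common layout of Sea-Bird .cnv files.

```python
def classify_header_lines(lines):
    """Sort header lines into buckets by their leading characters.

    Hypothetical helper; the prefixes are assumed, not taken from the library.
    """
    header = {"instrument": [], "custom_metadata": [], "processing": []}
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("*END*"):
            break  # end of the metadata header; the data table follows
        # Check "**" before "*", since every "**" line also starts with "*".
        if stripped.startswith("**"):
            header["custom_metadata"].append(stripped[2:].strip())
        elif stripped.startswith("*"):
            header["instrument"].append(stripped[1:].strip())
        elif stripped.startswith("#"):
            header["processing"].append(stripped[1:].strip())
    return header


sample_lines = [
    "* Sea-Bird SBE 9 Data File:",
    "** Station = 042",
    "# nquan = 3",
    "*END*",
    "  1.0  2.0  3.0",
]
parsed = classify_header_lines(sample_lines)
```

Stopping at the end-of-header marker is also roughly what an only_header=True read would need to do, since nothing after that line belongs to the metadata.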

sensor_xml_to_flattened_dict(sensor_data)[source]

Parses the raw XML sensor input into a multilevel dictionary, dropping the two outermost levels, as each of them holds only a single entry.

Parameters:

sensor_data (str :) – The raw XML sensor data.

Return type:

A list of sensor entries, each of which is a structured dict.
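As a rough illustration of this flattening, the sketch below uses the standard library's xml.etree.ElementTree rather than whatever parser the library itself relies on. The helper names, the "@" prefix for attributes, and the sample XML (a Sensors root wrapping sensor elements) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Recursively convert an XML element into nested dicts (hypothetical helper).

    Attributes are stored under "@name" keys. Elements without attributes
    or children collapse to their text. Text on elements that also carry
    attributes or children is ignored in this sketch.
    """
    node = {f"@{name}": value for name, value in element.attrib.items()}
    for child in element:
        node[child.tag] = element_to_dict(child)
    if not node:
        return (element.text or "").strip()
    return node


def sensors_to_list(sensor_xml):
    """Drop the outer wrapper levels and return one dict per sensor."""
    root = ET.fromstring(sensor_xml.strip())
    return [element_to_dict(sensor) for sensor in root.findall("sensor")]


sample_xml = """
<Sensors count="2">
  <sensor Channel="1">
    <TemperatureSensor SensorID="55">
      <SerialNumber>1234</SerialNumber>
    </TemperatureSensor>
  </sensor>
  <sensor Channel="2">
    <ConductivitySensor SensorID="3">
      <SerialNumber>5678</SerialNumber>
    </ConductivitySensor>
  </sensor>
</Sensors>
"""
sensors = sensors_to_list(sample_xml)
```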

structure_metadata(metadata_list)[source]

Creates a dictionary to store custom metadata, of which Sea-Bird allows 12 lines in each file.

Parameters:

metadata_list (list :) – a list of the individual lines of metadata found in the file

Return type:

a dictionary of the lines of metadata divided into key-value pairs
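A minimal sketch of such a key-value split follows. It assumes that each metadata line has the form "key = value" and that lines without "=" can be skipped; both the separator and the function name are assumptions, not the library's documented behavior.

```python
def split_metadata_lines(metadata_list):
    """Turn 'key = value' lines into a dictionary (hypothetical helper)."""
    metadata = {}
    for line in metadata_list:
        # partition splits on the first "=" only, so values may contain "=".
        key, separator, value = line.partition("=")
        if not separator:
            continue  # no "=" found; this sketch skips free-text lines
        metadata[key.strip()] = value.strip()
    return metadata


metadata = split_metadata_lines(
    ["Station = 042", "Operator = J. Doe", "free text note"]
)
```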

define_output_path(file_path=None, file_name=None, file_type='.csv')[source]

Creates a Path object holding the desired output path.

Parameters:
  • file_path (Path :) – directory the file sits in (Default value = self.file_dir)

  • file_name (str :) – the original file name (Default value = self.file_name)

  • file_type (str :) – the output file type (Default value = '.csv')

Return type:

a Path object consisting of the full path of the new file
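One plausible way to assemble such a path with pathlib is sketched below. The function name is hypothetical, and the fallback to self.file_dir and self.file_name described above is left out, so both arguments are required here.

```python
from pathlib import Path


def build_output_path(file_path, file_name, file_type=".csv"):
    """Join directory and file name, then swap the extension.

    Hypothetical helper. Path.with_suffix expects the new suffix to start
    with a dot and replaces only the last suffix of the file name.
    """
    return Path(file_path).joinpath(file_name).with_suffix(file_type)


output = build_output_path("/data/cruise", "cast_001.cnv")
```

With these inputs the original .cnv extension is replaced, so the output file sits next to the source file under the same stem.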

to_csv(data, with_header=True, output_file_path=None, output_file_name=None)[source]

Writes a CSV file from the given data.

Parameters:
  • data (pd.DataFrame | np.ndarray :) – The source data to use.

  • with_header (boolean :) – indicating whether the header shall appear in the output (Default value = True)

  • output_file_path (Path :) – file directory (Default value = None)

  • output_file_name (str :) – original file name (Default value = None)

selecting_columns(list_of_columns, df)[source]

Alters the dataframe to only hold the given columns.

Parameters:
  • list_of_columns (list or str :) – a collection of columns

  • df (pandas.DataFrame :) – the DataFrame to reduce to the given columns