visiannot.tools.data_loader

Summary

Module with functions for loading and saving data files

Data

visiannot.tools.data_loader.SEEK_CUR

visiannot.tools.data_loader.SEEK_END

Functions

visiannot.tools.data_loader.convert_intervals_to_time_series(…)

Converts intervals as 2D array to a time series of 0 and 1 (1D array)

visiannot.tools.data_loader.convert_time_series_to_intervals(…)

Gets the intervals of a 1D signal with a specific value

visiannot.tools.data_loader.get_attribute_generic(…)

Gets an attribute in a .mat or .h5 file

visiannot.tools.data_loader.get_attribute_h5(…)

Gets an attribute in a .h5 file

visiannot.tools.data_loader.get_data_duration(…)

Gets the duration of a data file (.mat, .h5 or .txt)

visiannot.tools.data_loader.get_data_generic(path)

Loads data from a file (.h5, .mat or .txt)

visiannot.tools.data_loader.get_data_h5(…)

Reads a dataset in a .h5 file

visiannot.tools.data_loader.get_data_interval(path)

Loads file containing temporal intervals, output shape (n_{intervals},2)

visiannot.tools.data_loader.get_data_interval_as_time_series(path)

Loads file containing temporal intervals, output shape (n_{samples},)

visiannot.tools.data_loader.get_data_mat(…)

Loads data from a .mat file

visiannot.tools.data_loader.get_data_txt(path)

Loads data from a .txt file

visiannot.tools.data_loader.get_last_sample_generic(path)

Gets the last sample in a data file (.mat, .h5 or .txt)

visiannot.tools.data_loader.get_nb_samples_generic(path)

Gets number of samples in a data file (.mat, .h5 or .txt)

visiannot.tools.data_loader.get_txt_lines(path)

Loads a file as a list of lines

visiannot.tools.data_loader.get_working_directory(path)

Gets the working directory when ViSiAnnoT is launched, which depends on whether it is launched as a Python script or as an executable (generated with PyInstaller)

visiannot.tools.data_loader.slice_dataset(dataset)

Slices a dataset

API

Data

visiannot.tools.data_loader.SEEK_CUR = 1

visiannot.tools.data_loader.SEEK_END = 2

Functions

visiannot.tools.data_loader.convert_intervals_to_time_series(intervals, n_samples=0)

Converts intervals as 2D array to a time series of 0 and 1 (1D array)

Parameters
  • intervals (numpy array or list) – intervals in frame numbers, shape (n_{intervals}, 2)

  • n_samples (int) – number of frames in the time series; with the default value 0, the end frame of the last interval is used

Returns

intervals as a time series, shape (n_{frames},)

Return type

numpy array

If the end time of an interval (second column of intervals) is -1, then the end time is set to n_samples.

Example:

>>> a = np.array([[4, 5], [9, 12], [16, -1]])
>>> convert_intervals_to_time_series(a, 20)
array([0., 0., 0., 0., 1., 0., 0., 0., 0., 1., 1., 1., 0., 0., 0., 0.,
       1., 1., 1., 1.])
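
The conversion rule described above, including the -1 convention, can be sketched in plain Python. This is an illustrative re-implementation under assumptions, not the library's code: the name intervals_to_time_series_sketch is hypothetical, and it returns a list rather than a numpy array so that it runs without dependencies.

```python
def intervals_to_time_series_sketch(intervals, n_samples=0):
    """Hypothetical sketch of the documented conversion (not the library code)."""
    intervals = [list(interval) for interval in intervals]
    if n_samples == 0 and intervals:
        # Assumed default: length is the end frame of the last interval
        n_samples = max(intervals[-1][1], 0)
    series = [0.0] * n_samples
    for start, stop in intervals:
        if stop == -1:
            # An end frame of -1 means "until the end of the series"
            stop = n_samples
        # The end frame is excluded, as with range in Python
        for frame in range(start, min(stop, n_samples)):
            series[frame] = 1.0
    return series


a = [[4, 5], [9, 12], [16, -1]]
result = intervals_to_time_series_sketch(a, 20)
print(result)
```

With the same input as the example above, the sketch marks frames 4, 9 to 11 and 16 to 19 with 1.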

visiannot.tools.data_loader.convert_time_series_to_intervals(data, value)

Gets the intervals of a 1D signal with a specific value

Parameters
  • data (numpy array) – 1D array

  • value – value that defines the intervals to retrieve from data

Returns

2D array with indexes of intervals (ending index is not included in the interval, as with range in Python)

Return type

numpy array

Example:

>>> a = np.array([0, 0, 0, 0, 5, 1, 1, 1, 1, 5, 5, 5, 0, 0, 0, 0])
>>> convert_time_series_to_intervals(a,0)
array([[ 0,  4],
       [12, 16]])
>>> convert_time_series_to_intervals(a,1)
array([[5, 9]])
>>> convert_time_series_to_intervals(a,5)
array([[ 4,  5],
       [ 9, 12]])
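
The run detection illustrated by the example can be sketched as a single pass over the signal. This is a hedged illustration, not the library's implementation: the name time_series_to_intervals_sketch is hypothetical, and it works on plain lists and returns a list of [start, stop] pairs instead of a numpy array.

```python
def time_series_to_intervals_sketch(data, value):
    """Hypothetical sketch: find runs of `value` in a 1D sequence."""
    intervals = []
    start = None  # start index of the run currently open, if any
    for index, sample in enumerate(data):
        if sample == value and start is None:
            start = index  # a new run begins
        elif sample != value and start is not None:
            # The run ends; the ending index is excluded, as with range
            intervals.append([start, index])
            start = None
    if start is not None:
        # The signal ends while a run is still open
        intervals.append([start, len(data)])
    return intervals


a = [0, 0, 0, 0, 5, 1, 1, 1, 1, 5, 5, 5, 0, 0, 0, 0]
intervals_0 = time_series_to_intervals_sketch(a, 0)
intervals_1 = time_series_to_intervals_sketch(a, 1)
intervals_5 = time_series_to_intervals_sketch(a, 5)
print(intervals_0, intervals_1, intervals_5)
```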

visiannot.tools.data_loader.get_attribute_generic(path, key)

Gets an attribute in a .mat or .h5 file

Parameters
  • path (str) – path to the file

  • key (str) – path to the attribute in the file

Returns

attribute (if the file is not .mat or .h5, it returns key)

visiannot.tools.data_loader.get_attribute_h5(path, key_path)

Gets an attribute in a .h5 file

Parameters
  • path (str) – path to the file

  • key_path (str) – path to the attribute in the file

Returns

attribute

visiannot.tools.data_loader.get_data_duration(path, freq, key='', flag_interval=False, **kwargs)

Gets the duration of a data file (.mat, .h5 or .txt)

It raises an exception if the format is not supported.

The beginning date-time must be in the path of the data files.

Parameters
  • path (str) – path to the data file

  • freq (float) – data frequency, set to 0 if signal not regularly sampled

  • key (str) – key to access the data (in case of .mat or .h5)

  • flag_interval (bool) – specify if data is intervals

  • kwargs – keyword arguments of get_nb_samples_generic()

Returns

duration of the data file in seconds

Return type

float

visiannot.tools.data_loader.get_data_generic(path, key='', **kwargs)

Loads data from a file (.h5, .mat or .txt)

It raises an exception if the format is not supported.

Parameters
  • path (str) – path to the data file

  • key (str) – key to access the data (in case of .mat or .h5)

  • kwargs – keyword arguments of get_data_mat(), get_data_h5(), get_data_txt() or audio.get_data_audio(), depending on file format

Returns

data

Return type

numpy array
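
Dispatching on the file extension, as described above, can be sketched as follows. Everything here is an assumption made for illustration: the names load_generic_sketch, fake_loader and the loaders mapping are hypothetical, and the exception type actually raised by the library is not stated in this page (ValueError is used as a placeholder).

```python
import os


def load_generic_sketch(path, loaders, key="", **kwargs):
    """Hypothetical sketch: pick a loader from the file extension."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in loaders:
        # Placeholder exception type; the library's own choice is not documented here
        raise ValueError("unsupported data format: " + repr(extension))
    return loaders[extension](path, key, **kwargs)


def fake_loader(path, key, **kwargs):
    """Stand-in for a real loader; it only echoes its arguments."""
    return (path, key, kwargs)


loaders = {".h5": fake_loader, ".mat": fake_loader, ".txt": fake_loader}
loaded = load_generic_sketch("signal.h5", loaders, key="ecg", slicing=(0, 10))
print(loaded)
```

Keeping the mapping from extension to loader in one dictionary makes the list of supported formats explicit and keeps the error path in a single place.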

visiannot.tools.data_loader.get_data_h5(path, key, **kwargs)

Reads a dataset in a .h5 file

Parameters
  • path (str) – path to the file

  • key (str) – path to the H5 dataset to load

  • kwargs – keyword arguments of slice_dataset()

Returns

dataset

Return type

numpy array

visiannot.tools.data_loader.get_data_interval(path, key='')

Loads file containing temporal intervals, output shape (n_{intervals},2)

The file format must be supported by get_data_generic().

The data can be stored in two ways:

  • shape (n_{intervals},2), where each line contains the start frame and end frame of an interval, then no conversion is needed

  • shape (n_{samples},) with 0 and 1, then it is converted to shape (n_{intervals},2)

Parameters
  • path (str) – path to the data file

  • key (str) – key to access the data in case of mat or h5 file, for txt file it is ignored

Returns

numpy array of shape (n_{intervals},2) with intervals in frame numbers

Return type

numpy array

visiannot.tools.data_loader.get_data_interval_as_time_series(path, n_samples=0, key='', **kwargs)

Loads file containing temporal intervals, output shape (n_{samples},)

The data can be stored in two ways:

  • shape (n_{intervals},2), where each line contains the start frame and end frame of an interval, then it is converted to shape (n_{samples},), so the number of frames must be specified (allowed formats: txt, mat, h5)

  • shape (n_{samples},) with 0 and 1, then no conversion is needed (allowed formats: mat, h5)

Parameters
  • path (str) – path to the data file

  • n_samples (int) – number of samples of the time series, see convert_intervals_to_time_series()

  • key (str) – key to access the data in case of mat or h5 file, for txt file it is ignored

  • kwargs – keyword arguments of slice_dataset()

Returns

numpy array of shape (n_{samples},) with intervals as a time series of 0 and 1

Return type

numpy array

visiannot.tools.data_loader.get_data_mat(path, key, **kwargs)

Loads data from a .mat file

Parameters
  • path (str) – path to the data file

  • key (str) – key to access the data

  • kwargs – keyword arguments of slice_dataset()

Returns

data

Return type

numpy array

visiannot.tools.data_loader.get_data_txt(path, slicing=(), **kwargs)

Loads data from a .txt file

Parameters
  • path (str) – path to the data file

  • slicing (tuple) – see keyword argument of slice_dataset()

  • kwargs – keyword arguments of numpy.loadtxt

Returns

data

Return type

numpy array

visiannot.tools.data_loader.get_last_sample_generic(path, key='')

Gets the last sample in a data file (.mat, .h5 or .txt)

It raises an exception if the format is not supported.

Parameters
  • path (str) – path to the data file

  • key (str) – key to access the data (in case of .mat or .h5)

Returns

last sample, returns 0 if no data found

Return type

float or str

visiannot.tools.data_loader.get_nb_samples_generic(path, key='', **kwargs)

Gets number of samples in a data file (.mat, .h5 or .txt)

It raises an exception if the format is not supported.

Parameters
  • path (str) – path to the data file

  • key (str) – key to access the data (in case of .mat or .h5)

Returns

number of samples

Return type

int

visiannot.tools.data_loader.get_txt_lines(path)

Loads a file as a list of lines

Parameters

path (str) – path to the text file

Returns

list of strings with the lines of the file

Return type

list
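
A minimal sketch of reading a text file into a list of lines is given below. It is not the library's code: the name read_lines_sketch is hypothetical, and stripping the trailing newline characters is an assumption, since this page does not say whether the returned strings keep them.

```python
import os
import tempfile


def read_lines_sketch(path):
    """Hypothetical sketch: return the lines of a text file as a list of str."""
    with open(path, "r", encoding="utf-8") as file:
        # splitlines() drops the line terminators (assumed behavior)
        return file.read().splitlines()


# Demonstration on a temporary file
with tempfile.TemporaryDirectory() as directory:
    example_path = os.path.join(directory, "example.txt")
    with open(example_path, "w", encoding="utf-8") as file:
        file.write("first\nsecond\n")
    lines = read_lines_sketch(example_path)

print(lines)
```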

visiannot.tools.data_loader.get_working_directory(path)

Gets the working directory when ViSiAnnoT is launched, which depends on whether it is launched as a Python script or as an executable (generated with PyInstaller)

Typically, path is the path to a Python module of visiannot that is being executed.

In case it is launched as a Python script, it returns the absolute path to the directory containing the module.

In case it is launched as an executable generated with PyInstaller, it returns the path to the temporary directory created by PyInstaller, where the source code and related data files are placed.

Parameters

path (str) – typically __file__
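
The two cases above can be sketched as follows. The function name working_directory_sketch is hypothetical and this is not the library's code; the sketch relies on the PyInstaller convention of setting sys.frozen and storing the bundle directory in sys._MEIPASS at run time.

```python
import os
import sys


def working_directory_sketch(path):
    """Hypothetical sketch: resolve the working directory for both launch modes."""
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        # Launched as a PyInstaller executable: use the bundle directory
        return sys._MEIPASS
    # Launched as a Python script: directory containing the module
    return os.path.dirname(os.path.abspath(path))


directory = working_directory_sketch(os.path.join("package", "module.py"))
print(directory)
```

When run as a plain script, the result is the absolute path of the directory holding the given module path.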

visiannot.tools.data_loader.slice_dataset(dataset, slicing=())

Slices a dataset

Parameters
  • dataset (numpy array or h5py.Dataset) – dataset to slice, might be a numpy array or a dataset in a HDF5 file

  • slicing (tuple or list or numpy array) –

    indexes for slicing output data:

    • (): no slicing

    • (start,): data[start:]

    • (start, stop): data[start:stop]

    • ("row", ind): data[ind]

    • ("col", ind): data[:, ind] (2D array only)

    • (ind, start, stop): data[:, start:stop]

    • directly a list or numpy array of indexes on first dimension: data[slicing]

Returns

sliced dataset

Return type

numpy array
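
Most of the slicing forms listed above can be illustrated with a small dispatcher over nested lists. This is a sketch under assumptions, not the library's implementation: the name slice_sketch is hypothetical, nested lists stand in for numpy arrays or HDF5 datasets, a 2-tuple is treated as ("row", ind) or ("col", ind) only when its first element is that string, and the three-element form is left out.

```python
def slice_sketch(data, slicing=()):
    """Hypothetical sketch of the documented slicing forms on nested lists."""
    if isinstance(slicing, list):
        # Direct list of indexes on the first dimension
        return [data[index] for index in slicing]
    if len(slicing) == 0:
        return data[:]  # no slicing
    if len(slicing) == 1:
        return data[slicing[0]:]  # (start,)
    if len(slicing) == 2:
        first, second = slicing
        if first == "row":
            return data[second]  # ("row", ind)
        if first == "col":
            return [row[second] for row in data]  # ("col", ind), 2D only
        return data[first:second]  # (start, stop)
    raise ValueError("slicing form not covered by this sketch")


data = [[1, 2], [3, 4], [5, 6]]
print(slice_sketch(data, (1,)))
print(slice_sketch(data, ("col", 0)))
```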