vocalpy.utils package

Submodules

vocalpy.utils.image_processing module

VocalPy - Vocal analysis framework

vocalpy.utils.image_processing.bradley_roth(image, s=None, t=None)

Implements the Bradley-Roth adaptive thresholding algorithm described in the paper ‘Adaptive Thresholding Using the Integral Image’: https://people.scs.carleton.ca/~roth/iit-publications-iti/docs/gerh-50002.pdf

Parameters:
  • image (ndarray) – image to be thresholded

  • s (int) – window size

  • t (int) – thresholding sensitivity

Returns:

image – thresholded image with binary values

Return type:

ndarray
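The idea behind the algorithm can be sketched with NumPy alone. This is an illustrative sketch, not the VocalPy implementation: the function name bradley_roth_sketch is hypothetical, the defaults (a window of 1/8 of the image width and a sensitivity of 15 percent) are the values suggested in the paper and are assumed here, and the 0/1 output encoding is also an assumption.

```python
import numpy as np


def bradley_roth_sketch(image, s=None, t=None):
    """Binarize a 2-D grayscale array against the mean of a local window."""
    image = np.asarray(image, dtype=np.float64)
    height, width = image.shape
    # Assumed defaults (values suggested in the paper, not confirmed for VocalPy).
    if s is None:
        s = max(width // 8, 1)
    if t is None:
        t = 15
    half = s // 2

    # Integral image padded with a leading zero row and column, so the sum
    # over rows y0..y1-1 and columns x0..x1-1 needs only four lookups.
    integral = np.zeros((height + 1, width + 1))
    integral[1:, 1:] = image.cumsum(axis=0).cumsum(axis=1)

    # Window bounds for every pixel, clipped at the image borders.
    ys, xs = np.arange(height), np.arange(width)
    y0 = np.clip(ys - half, 0, height)[:, None]
    y1 = np.clip(ys + half + 1, 0, height)[:, None]
    x0 = np.clip(xs - half, 0, width)[None, :]
    x1 = np.clip(xs + half + 1, 0, width)[None, :]

    count = (y1 - y0) * (x1 - x0)
    window_sum = (
        integral[y1, x1] - integral[y0, x1] - integral[y1, x0] + integral[y0, x0]
    )

    # A pixel is set to 0 when it is more than t percent darker than the
    # mean of its window, and to 1 otherwise.
    return (image * count > window_sum * (100 - t) / 100).astype(np.uint8)
```

Because the integral image is computed once, the cost per pixel does not depend on the window size s.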

vocalpy.utils.image_processing.contrast_adjustment(data, lower_percentile, upper_percentile)

Contrast adjustment by saturating extreme values

Parameters:
  • data (ndarray) – input data

  • lower_percentile (int) – values below this percentile will be set to 0

  • upper_percentile (int) – values above this percentile will be set to 1

Returns:

data – original data with extreme values saturated

Return type:

ndarray
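One common way to saturate extreme values is to clip the data at the two percentiles and rescale linearly in between. The sketch below assumes that behaviour; contrast_adjustment_sketch is a hypothetical name, and the handling of constant input is a choice made for this example only.

```python
import numpy as np


def contrast_adjustment_sketch(data, lower_percentile, upper_percentile):
    """Map the lower percentile to 0 and the upper percentile to 1, clipping the rest."""
    data = np.asarray(data, dtype=np.float64)
    low, high = np.percentile(data, [lower_percentile, upper_percentile])
    if high == low:
        # Constant input: nothing to stretch (assumed behaviour).
        return np.zeros_like(data)
    return np.clip((data - low) / (high - low), 0.0, 1.0)
```

With percentiles such as 1 and 99, a handful of outliers no longer compresses the useful range of a spectrogram into a few gray levels.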

vocalpy.utils.image_processing.normalize(data)

Rescales data to be in the range (0,1)

Parameters:

data (ndarray) – data to be rescaled

Returns:

data – rescaled data

Return type:

ndarray
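A rescaling of this kind is usually a min-max normalization. The following sketch assumes that is what is meant; normalize_sketch is a hypothetical name, and returning zeros for constant input is an assumption.

```python
import numpy as np


def normalize_sketch(data):
    """Rescale data linearly so that its minimum maps to 0 and its maximum to 1."""
    data = np.asarray(data, dtype=np.float64)
    low, high = data.min(), data.max()
    if high == low:
        # Avoid division by zero for constant input (assumed behaviour).
        return np.zeros_like(data)
    return (data - low) / (high - low)
```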

vocalpy.utils.image_processing.numpy_to_grayscale_image(data)

Converts a numpy array to a grayscale PIL Image

Parameters:

data (ndarray) – data array

Returns:

img – data array as a grayscale image

Return type:

PIL.Image

vocalpy.utils.image_processing.scatter_over_spectrogram(spectrogram, coordinates)

Creates a PIL Image with a scatter plot of the segmentation overlaid on the spectrogram

Parameters:
  • spectrogram (ndarray) – array containing the spectrogram data

  • coordinates (ndarray) – array containing the segmentation coordinates

Returns:

img – image containing the segmentation as a scatter plot over the spectrogram

Return type:

PIL.Image

vocalpy.utils.io module

vocalpy.utils.io.create_directory(path)

Creates a directory at the provided path

Parameters:

path (str) – path to be created

vocalpy.utils.io.create_output_directory_structure(list_of_files)

Creates directory structure for output files from VocalPy

Parameters:

list_of_files (List[str]) – list of files provided by the user

vocalpy.utils.io.get_output_directory_for_audio_file(path)
vocalpy.utils.io.get_vocalpy_file_path(filename, path)
vocalpy.utils.io.load_checkpoint(checkpoint, model, device, optimizer=None)

Loads model parameters (state_dict) from a checkpoint file. If an optimizer is provided, its state_dict is also loaded, assuming it is present in the checkpoint.

Parameters:
  • checkpoint (str) – filename of the checkpoint to be loaded

  • model (torch.nn.Module) – model for which the parameters are loaded

  • optimizer (torch.optim.Optimizer, optional) – optimizer to resume from the checkpoint

vocalpy.utils.io.load_model(model_path, device)

Directly loads a pretrained PyTorch model

Parameters:
  • model_path (str) – path to model

  • device (torch.device) – device to run on (CPU or GPU)

vocalpy.utils.io.load_pickle_file(filename, path, expected_object_type=None)

Loads vocalpy pickle object from a path

Parameters:
  • filename (str) – object filename

  • path (str) – path to object

Raises:

ValueError – if the path does not exist

vocalpy.utils.io.load_recording_data(path)

Loads vocalpy recording object from a path

Parameters:

path (str) – full path to object

Raises:

ValueError – if the file does not exist

vocalpy.utils.io.load_vocalpy_file(path, expected_object_type=None)

Load a VocalPy serialized object with legacy raw-pickle compatibility.

vocalpy.utils.io.parse_input_path(path=None, search_tree=False)

Parses the input path. If it is a directory, returns a list of files; if it is a file, returns the file path

Parameters:

path (str) – path provided by the user
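The directory-or-file behaviour described above can be sketched with pathlib. Several details here are assumptions rather than documented behaviour: the name parse_input_path_sketch is hypothetical, search_tree is taken to mean a recursive search, a single file is returned as a one-element list so that callers can always iterate, and a missing path raises ValueError.

```python
from pathlib import Path


def parse_input_path_sketch(path, search_tree=False):
    """Return the files found at path as a sorted list of strings."""
    target = Path(path)
    if target.is_file():
        # Assumption: wrap a single file in a list for a uniform return type.
        return [str(target)]
    if target.is_dir():
        # Assumption: search_tree=True descends into subdirectories.
        pattern = "**/*" if search_tree else "*"
        return sorted(str(p) for p in target.glob(pattern) if p.is_file())
    raise ValueError(f"path does not exist: {path}")
```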

vocalpy.utils.io.read_audio(path_to_file, start=0, stop=None)

Reads audio and metadata using SoundFile

Parameters:

path_to_file (str) – path to audio file

Returns:

(samples, sample_rate) – mono audio samples (always first channel) and audio sampling frequency

Return type:

(ndarray, int)

vocalpy.utils.io.read_audio_information(path_to_file)

Reads audio metadata using SoundFile

Parameters:

path_to_file (str) – path to audio file

Returns:

metadata – audio metadata in a dictionary including:
  • path to file

  • sampling rate

  • number of channels

  • duration

  • format

Return type:

dict

vocalpy.utils.io.read_yaml(path_to_file)

Loads a YAML configuration file

Parameters:

path_to_file (str) – path to YAML file

Returns:

yml_data – YAML file data read into a dictionary

Return type:

dict

vocalpy.utils.io.remove_directory(path)

Removes a directory at the provided path

Parameters:

path (str) – path to be removed

vocalpy.utils.io.rewrite_vocalpy_file(path, expected_object_type=None, object_type=None)

Rewrite a VocalPy artifact through the current versioned envelope.

vocalpy.utils.io.save_dataframe_as_csv(dataframe, path, filename)

Saves a Pandas DataFrame to disk

Parameters:
  • dataframe (pandas.DataFrame) – dataframe to be saved to disk

  • path (str) – target path

  • filename (str) – CSV file name

Raises:

ValueError – if target path does not exist

vocalpy.utils.io.save_image_to_disk(image, path, filename, img_format='png')

Saves PIL Image to disk

Parameters:
  • image (PIL.Image) – image to be saved to disk

  • path (str) – target path

  • filename (str) – image file name

  • img_format (str, optional) – image encoding format (png, jpg, gif, …)

Raises:

ValueError – if target path does not exist

vocalpy.utils.io.write_pickle_file(file, filename, path, object_type=None)

Writes vocalpy pickle object to a path

Parameters:
  • file (Object) – object to be saved

  • filename (str) – filename to be used

  • path (str) – path to save the object

Raises:

ValueError – if the path does not exist

vocalpy.utils.io.write_vocalpy_file(payload, filename, path, object_type=None)

Serialize a VocalPy object with versioned metadata.
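The on-disk layout is not documented here, so the following is only a sketch of what a versioned envelope with a legacy fallback could look like. Everything in it is hypothetical: the function names, the FORMAT_VERSION constant, and the envelope keys format_version, object_type and payload are chosen for illustration and may not match the real file format.

```python
import os
import pickle

FORMAT_VERSION = 1  # hypothetical version number


def write_versioned_sketch(payload, filename, path, object_type=None):
    """Pickle the payload wrapped in a small metadata envelope."""
    envelope = {
        "format_version": FORMAT_VERSION,
        "object_type": object_type,
        "payload": payload,
    }
    full_path = os.path.join(path, filename)
    with open(full_path, "wb") as handle:
        pickle.dump(envelope, handle)
    return full_path


def load_versioned_sketch(path, expected_object_type=None):
    """Load an envelope, or fall back to a legacy file holding the raw object."""
    with open(path, "rb") as handle:
        loaded = pickle.load(handle)
    is_envelope = isinstance(loaded, dict) and set(loaded) >= {
        "format_version",
        "payload",
    }
    if not is_envelope:
        # Legacy file: the pickle contains the object itself.
        return loaded
    if (
        expected_object_type is not None
        and loaded.get("object_type") != expected_object_type
    ):
        raise ValueError(f"unexpected object type: {loaded.get('object_type')}")
    return loaded["payload"]
```

Storing a version number next to the payload lets a loader decide how to interpret older files. Note that unpickling can execute arbitrary code, so files of this kind should only be loaded from trusted sources.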

vocalpy.utils.io.write_yaml(data, path_to_file)

Writes a YAML configuration file

Parameters:
  • data (dict) – dict data to write as a YAML file

  • path_to_file (str) – path to YAML file

vocalpy.utils.misc module

vocalpy.utils.misc.create_dataframe_from_list_of_vocals(list_of_vocals)

Creates a Pandas DataFrame from a ListOfVocals

Parameters:

list_of_vocals (ListOfVocals) – a ListOfVocals instance

Returns:

dataframe – dataframe containing all vocals from the list of vocals

Return type:

pandas.DataFrame

vocalpy.utils.misc.create_logger(args=None, out_dir=None)

Creates a logger to log information during execution

vocalpy.utils.misc.validate_animal(animal)
vocalpy.utils.misc.validate_arguments(args)

Validates arguments passed by the user

Parameters:

args (ArgumentParser) – arguments passed by the user

vocalpy.utils.misc.validate_bin_size(bin_size)
vocalpy.utils.misc.validate_frequency_range(lower_frequency, higher_frequency, animal)
vocalpy.utils.misc.validate_segmentation_arguments(args)
vocalpy.utils.misc.validate_thread_count(threads)

vocalpy.utils.signal_processing module

vocalpy.utils.signal_processing.butter_bandpass(lower_frequency_cutoff, higher_frequency_cutoff, fs, order=25)

Creates a Butterworth bandpass filter

Parameters:
  • lower_frequency_cutoff (int) – lower frequency cutoff for the Butterworth filter

  • higher_frequency_cutoff (int) – higher frequency cutoff for the Butterworth filter

  • fs (int) – signal sampling rate

  • order (int, optional) – Butterworth filter order

Returns:

(b, a) – numerator (b) and denominator (a) polynomials of the IIR filter

Return type:

(ndarray, ndarray)

vocalpy.utils.signal_processing.butter_bandpass_filter(samples, lower_frequency_cutoff, higher_frequency_cutoff, fs, order=25)

Applies a Butterworth bandpass filter to a signal

Parameters:
  • samples (ndarray) – source signal to be filtered

  • lower_frequency_cutoff (int) – lower frequency cutoff for the Butterworth filter

  • higher_frequency_cutoff (int) – higher frequency cutoff for the Butterworth filter

  • fs (int) – signal sampling rate

  • order (int, optional) – Butterworth filter order

Returns:

y – source signal filtered using the Butterworth filter

Return type:

ndarray
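A bandpass filter of this kind can be sketched with scipy.signal.butter and scipy.signal.lfilter. This is an assumption about the approach, not the VocalPy source: the function names are hypothetical, and whether the library filters with lfilter, filtfilt or another routine is not stated here. The sketch defaults to order 4 rather than 25, because high-order filters in (b, a) form can become numerically unstable; SciPy's documentation recommends second-order sections (output="sos") for general-purpose filtering.

```python
import numpy as np
from scipy.signal import butter, lfilter


def butter_bandpass_sketch(lower_frequency_cutoff, higher_frequency_cutoff, fs, order=4):
    """Design a Butterworth bandpass filter and return its (b, a) polynomials."""
    nyquist = 0.5 * fs
    # butter() expects cutoffs normalized to the Nyquist frequency when fs is not passed.
    band = [lower_frequency_cutoff / nyquist, higher_frequency_cutoff / nyquist]
    return butter(order, band, btype="band")


def butter_bandpass_filter_sketch(samples, lower_frequency_cutoff,
                                  higher_frequency_cutoff, fs, order=4):
    """Apply the bandpass filter to a 1-D signal (single forward pass)."""
    b, a = butter_bandpass_sketch(lower_frequency_cutoff, higher_frequency_cutoff, fs, order)
    return lfilter(b, a, samples)
```

Passing btype="high" or btype="low" with a single normalized cutoff gives the corresponding highpass and lowpass designs.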

vocalpy.utils.signal_processing.butter_highpass(higher_frequency_cutoff, fs, order=25)

Creates a Butterworth highpass filter

Parameters:
  • higher_frequency_cutoff (int) – higher frequency cutoff for the Butterworth filter

  • fs (int) – signal sampling rate

  • order (int, optional) – Butterworth filter order

Returns:

(b, a) – numerator (b) and denominator (a) polynomials of the IIR filter

Return type:

(ndarray, ndarray)

vocalpy.utils.signal_processing.butter_highpass_filter(samples, higher_frequency_cutoff, fs, order=25)

Applies a Butterworth highpass filter to a signal

Parameters:
  • samples (ndarray) – source signal to be filtered

  • higher_frequency_cutoff (int) – higher frequency cutoff for the Butterworth filter

  • fs (int) – signal sampling rate

  • order (int, optional) – Butterworth filter order

Returns:

y – source signal filtered using the Butterworth filter

Return type:

ndarray

vocalpy.utils.signal_processing.butter_lowpass(lower_frequency_cutoff, fs, order=25)

Creates a Butterworth lowpass filter

Parameters:
  • lower_frequency_cutoff (int) – lower frequency cutoff for the Butterworth filter

  • fs (int) – signal sampling rate

  • order (int, optional) – Butterworth filter order

Returns:

(b, a) – numerator (b) and denominator (a) polynomials of the IIR filter

Return type:

(ndarray, ndarray)

vocalpy.utils.signal_processing.butter_lowpass_filter(samples, lower_frequency_cutoff, fs, order=25)

Applies a Butterworth lowpass filter to a signal

Parameters:
  • samples (ndarray) – source signal to be filtered

  • lower_frequency_cutoff (int) – lower frequency cutoff for the Butterworth filter

  • fs (int) – signal sampling rate

  • order (int, optional) – Butterworth filter order

Returns:

y – source signal filtered using the Butterworth filter

Return type:

ndarray

vocalpy.utils.signal_processing.compute_multitaper_spectrogram(samples, fs, window_size=512, window_halfbandwidth=4, window_count=6, noverlap=256, nfft=1024, lower_frequency_cutoff=None, higher_frequency_cutoff=None)

Computes a multitaper spectrogram, applies a frequency cutoff, and converts power values to decibels

Parameters:
  • samples (ndarray) – audio samples time series

  • fs (int) – sampling frequency of the audio

  • window_size (int, optional) – window size in points

  • window_halfbandwidth (int, optional) – time half-bandwidth for spheroidal sequences (2*NW = BW/f0)

  • window_count (int, optional) – number of windows (spheroidal sequences) to use

  • noverlap (int, optional) – number of points to overlap between segments

  • nfft (int, optional) – number of points used to compute the STFT

  • lower_frequency_cutoff (int, optional) – frequencies lower than this limit will be removed

  • higher_frequency_cutoff (int, optional) – frequencies higher than this limit will be removed

Returns:

(f, t, Pxx) – f contains the frequency bins, t contains the time bins, and Pxx contains the power values for each bin in decibels

Return type:

(ndarray, ndarray, ndarray)
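A multitaper estimate averages several spectrograms, each computed with a different discrete prolate spheroidal (DPSS) window. The sketch below illustrates that idea with scipy.signal.windows.dpss and scipy.signal.spectrogram. It is an assumption about the approach, not the VocalPy source: multitaper_spectrogram_sketch is a hypothetical name, the tapers are averaged with equal weights, the frequency cutoff step is omitted for brevity, and the small floor applied before the logarithm is a choice made for this example.

```python
import numpy as np
from scipy.signal import spectrogram
from scipy.signal.windows import dpss


def multitaper_spectrogram_sketch(samples, fs, window_size=512, window_halfbandwidth=4,
                                  window_count=6, noverlap=256, nfft=1024):
    """Average the spectrograms obtained with each DPSS taper, in decibels."""
    # dpss returns an array of shape (window_count, window_size).
    tapers = dpss(window_size, window_halfbandwidth, Kmax=window_count)
    estimates = []
    for taper in tapers:
        f, t, pxx = spectrogram(samples, fs=fs, window=taper, nperseg=window_size,
                                noverlap=noverlap, nfft=nfft)
        estimates.append(pxx)
    # Equal-weight average across tapers (assumed; weighted schemes also exist).
    pxx_mean = np.mean(estimates, axis=0)
    # Floor the power before taking the logarithm to avoid log10(0).
    pxx_db = 10.0 * np.log10(np.maximum(pxx_mean, np.finfo(float).tiny))
    return f, t, pxx_db
```

Averaging over tapers lowers the variance of the estimate at the cost of some frequency resolution, which is controlled by window_halfbandwidth.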

vocalpy.utils.signal_processing.compute_spectrogram(samples, fs, window_type='hamming', window_size=512, noverlap=256, nfft=1024, lower_frequency_cutoff=None, higher_frequency_cutoff=None)

Computes the spectrogram, applies a frequency cutoff, and converts power values to decibels

Parameters:
  • samples (ndarray) – audio samples time series

  • fs (int) – sampling frequency of the audio

  • window_type (str, optional) – windowing function; see https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.get_window.html#scipy.signal.get_window

  • window_size (int, optional) – window size

  • noverlap (int, optional) – number of points to overlap between segments

  • nfft (int, optional) – number of points used to compute the STFT

  • lower_frequency_cutoff (int, optional) – frequencies lower than this limit will be removed

  • higher_frequency_cutoff (int, optional) – frequencies higher than this limit will be removed

Returns:

(f, t, Pxx) – f contains the frequency bins, t contains the time bins, and Pxx contains the power values for each bin in decibels

Return type:

(ndarray, ndarray, ndarray)
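The three steps named above (spectrogram, frequency cutoff, decibel conversion) can be sketched with scipy.signal.spectrogram. This is an illustration under assumptions, not the VocalPy source: compute_spectrogram_sketch is a hypothetical name, the cutoffs are treated as inclusive, and the small floor applied before the logarithm is a choice made for this example.

```python
import numpy as np
from scipy.signal import spectrogram


def compute_spectrogram_sketch(samples, fs, window_type="hamming", window_size=512,
                               noverlap=256, nfft=1024, lower_frequency_cutoff=None,
                               higher_frequency_cutoff=None):
    """Return (f, t, Pxx) with Pxx in decibels, restricted to the requested band."""
    f, t, pxx = spectrogram(samples, fs=fs, window=window_type, nperseg=window_size,
                            noverlap=noverlap, nfft=nfft)
    # Keep only the frequency bins inside the requested band (inclusive, assumed).
    keep = np.ones(f.shape, dtype=bool)
    if lower_frequency_cutoff is not None:
        keep &= f >= lower_frequency_cutoff
    if higher_frequency_cutoff is not None:
        keep &= f <= higher_frequency_cutoff
    f, pxx = f[keep], pxx[keep, :]
    # Floor the power before taking the logarithm to avoid log10(0).
    pxx_db = 10.0 * np.log10(np.maximum(pxx, np.finfo(float).tiny))
    return f, t, pxx_db
```

With nfft larger than window_size each segment is zero-padded, which interpolates the frequency axis without adding resolution.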

vocalpy.utils.signal_processing.filter_band_pass(data, frequencies, lower_frequency_cutoff, higher_frequency_cutoff)
vocalpy.utils.signal_processing.filter_high_pass(data, frequencies, frequency_cutoff)
vocalpy.utils.signal_processing.filter_low_pass(data, frequencies, frequency_cutoff)
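These three functions are listed without descriptions. Their signatures, which take data together with a frequencies array, suggest that they select rows of a spectrogram by frequency rather than filter a time-domain signal. The sketch below illustrates that reading only; it is an assumption, and the name filter_band_pass_sketch, the inclusive bounds and the returned tuple are all hypothetical.

```python
import numpy as np


def filter_band_pass_sketch(data, frequencies, lower_frequency_cutoff,
                            higher_frequency_cutoff):
    """Keep the rows of data whose frequency lies inside the band (inclusive)."""
    data = np.asarray(data)
    frequencies = np.asarray(frequencies)
    keep = (frequencies >= lower_frequency_cutoff) & (frequencies <= higher_frequency_cutoff)
    # Assumption: rows of data correspond to entries of frequencies.
    return data[keep, :], frequencies[keep]
```

Under the same assumption, a high-pass or low-pass variant would keep only one of the two comparisons.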

Module contents

Utility helpers for I/O, baseline validation, and signal/image processing.