Pliers API Reference

This is the full API reference for all user-facing classes and functions in the pliers package.

Converters (pliers.converters)

The Converter hierarchy contains Transformer classes that take a Stim of one type as input and return a Stim of a different type as output.


ComplexTextIterator([name]) Iterates elements in a ComplexTextStim as TextStims.
IBMSpeechAPIConverter([username, password, …]) Uses the IBM Watson Speech to Text API to run speech-to-text transcription on an audio file.
GoogleSpeechAPIConverter([language_code, …]) Uses the Google Speech API to do speech-to-text transcription.
GoogleVisionAPITextConverter([…]) Detects text within images using the Google Cloud Vision API.
MicrosoftAPITextConverter([language, …]) Detects text within images using the Microsoft Vision API.
TesseractConverter([name]) Uses the Tesseract library to extract text from images.
VideoFrameCollectionIterator([name]) Iterates frames in a DerivedVideoStim as ImageStims.
VideoFrameIterator([name]) Iterates frames in a VideoStim as ImageStims.
VideoToAudioConverter([name]) Converts a VideoStim to an AudioStim by extracting the audio track using moviepy.
VideoToComplexTextConverter([steps]) Converts a VideoStim directly to a ComplexTextStim.
VideoToTextConverter([steps]) Converts a VideoStim directly to a TextStim.
WitTranscriptionConverter([api_key, rate_limit]) Speech-to-text transcription via the Wit.ai API.


get_converter(in_type, out_type, *args, **kwargs) Scans the list of available Converters and returns an instantiation of the first one whose input and output types match those passed in.
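The lookup rule described for get_converter (scan the available Converters and return an instance of the first one whose input and output types match) can be sketched in plain Python. The classes below are hypothetical stand-ins rather than pliers classes, and the attribute names input_type and output_type are assumptions made for illustration; the real function may differ in details such as how it matches subclasses.

```python
class ToyConverter:
    """Hypothetical stand-in for a converter base class (not part of pliers)."""
    input_type = None
    output_type = None


class VideoToAudio(ToyConverter):
    input_type, output_type = "video", "audio"


class AudioToText(ToyConverter):
    input_type, output_type = "audio", "text"


class AudioToTextAlt(ToyConverter):
    input_type, output_type = "audio", "text"


# Order matters: the scan returns the first match it finds.
AVAILABLE = [VideoToAudio, AudioToText, AudioToTextAlt]


def find_converter(in_type, out_type, *args, **kwargs):
    """Return an instance of the first class matching both types, or None."""
    for cls in AVAILABLE:
        if cls.input_type == in_type and cls.output_type == out_type:
            return cls(*args, **kwargs)
    return None


conv = find_converter("audio", "text")
print(type(conv).__name__)  # AudioToText
```

Because the first match wins, two converters with the same input and output types are distinguished only by their position in the list.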

Dataset utilities (pliers.datasets)

The datasets module contains utilities for working with datasets (mostly remote text datasets).


fetch_dictionary(name[, url, format, index, …]) Retrieve a dictionary of text norms from the web or local storage.

Diagnostic utilities (pliers.diagnostics)

The diagnostics module contains functions for computing basic metrics that may be of use in determining the quality of Extractor results.


Diagnostics(data[, columns])


correlation_matrix(df) Returns a pandas DataFrame with the pair-wise correlations of the columns.
eigenvalues(df) Returns a pandas Series with eigenvalues of the correlation matrix.
condition_indices(df) Returns a pandas Series with condition indices of the df columns.
variance_inflation_factors(df) Computes the variance inflation factor (VIF) for each column in the df.
mahalanobis_distances(df[, axis]) Returns a pandas Series with Mahalanobis distances for each sample on the axis.
variances(df) Returns a pandas Series with variances for each column.
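As a rough illustration of two of the quantities listed above, the following standard-library sketch computes a pair-wise Pearson correlation and the variance inflation factor for the special case of exactly two columns, where VIF = 1 / (1 - r^2). The function names pearson and vif_two_columns are hypothetical, and this is not how pliers itself computes these values; the pliers functions operate on pandas DataFrames with any number of columns.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_columns(x, y):
    """VIF shared by both columns when there are exactly two: 1 / (1 - r^2)."""
    r = pearson(x, y)
    return 1.0 / (1.0 - r ** 2)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson(x, y)
vif = vif_two_columns(x, y)
print(round(r, 3), round(vif, 3))  # 0.8 2.778
```

A VIF close to 1 indicates little collinearity between the two columns; it grows without bound as the correlation approaches 1 or -1.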

Extractors (pliers.extractors)

The Extractor hierarchy contains Transformer classes that take a Stim of any type as input and return extracted feature information (rather than another Stim instance).


Base extractors and associated objects

Extractor([name]) Base class for all pliers Extractors.
ExtractorResult(data, stim, extractor[, …]) Stores feature data produced by an Extractor.

Audio feature extractors

ChromaCENSExtractor([n_chroma]) Extracts a “Chroma Energy Normalized” (CENS) chromagram, a chroma variant, from audio using the Librosa library.
ChromaCQTExtractor([n_chroma]) Extracts a constant-Q chromagram from audio using the Librosa library.
ChromaSTFTExtractor([n_chroma]) Extracts a chromagram from an audio waveform using the Librosa library.
MeanAmplitudeExtractor([name]) Mean amplitude extractor for blocks of audio with transcription.
MelspectrogramExtractor([n_mels]) Extracts mel-scaled spectrogram from audio using the Librosa library.
MFCCExtractor([n_mfcc]) Extracts Mel Frequency Cepstral Coefficients from audio using the Librosa library.
PolyFeaturesExtractor([order]) Extracts the coefficients of fitting an nth-order polynomial to the columns of an audio’s spectrogram (via Librosa).
SpectralCentroidExtractor([feature, hop_length]) Extracts the spectral centroids from audio using the Librosa library.
SpectralBandwidthExtractor([feature, hop_length]) Extracts the p’th-order spectral bandwidth from audio using the Librosa library.
SpectralContrastExtractor([n_bands]) Extracts the spectral contrast from audio using the Librosa library.
SpectralRolloffExtractor([feature, hop_length]) Extracts the roll-off frequency from audio using the Librosa library.
STFTAudioExtractor([frame_size, hop_size, …]) Short-time Fourier Transform extractor.
TempogramExtractor([win_length]) Extracts a tempogram from audio using the Librosa library.
TonnetzExtractor([feature, hop_length]) Extracts the tonal centroids (tonnetz) from audio using the Librosa library.
ZeroCrossingRateExtractor([feature, hop_length]) Extracts the zero-crossing rate of audio using the Librosa library.

Image feature extractors

BrightnessExtractor([name]) Gets the average luminosity of the pixels in the image.
ClarifaiAPIImageExtractor([api_key, model, …]) Uses the Clarifai API to extract tags of images.
ClarifaiAPIVideoExtractor([api_key, model, …])
FaceRecognitionFaceEncodingsExtractor(…) Uses the face_recognition package to extract a 128-dimensional encoding for every face detected in an image.
FaceRecognitionFaceLandmarksExtractor(…) Uses the face_recognition package to extract the locations of named features of faces in the image.
FaceRecognitionFaceLocationsExtractor(…) Uses the face_recognition package to extract bounding boxes for all faces in an image.
GoogleVisionAPIFaceExtractor([…]) Identifies faces in images using the Google Cloud Vision API.
GoogleVisionAPILabelExtractor([…]) Labels objects in images using the Google Cloud Vision API.
GoogleVisionAPIPropertyExtractor([…]) Extracts image properties using the Google Cloud Vision API.
GoogleVisionAPISafeSearchExtractor([…]) Extracts safe search detection using the Google Cloud Vision API.
GoogleVisionAPIWebEntitiesExtractor([…]) Extracts web entities using the Google Cloud Vision API.
IndicoAPIImageExtractor([api_key, models, …]) Uses the Indico API to extract features from images, such as facial emotion recognition or content filtering.
MicrosoftAPIFaceExtractor([face_id, …]) Extracts face features (location, emotion, accessories, etc.).
MicrosoftAPIFaceEmotionExtractor([face_id, …]) Extracts facial emotions from images using the Microsoft API.
MicrosoftVisionAPIExtractor([features, …]) Base MicrosoftVisionAPIExtractor class.
MicrosoftVisionAPITagExtractor([…]) Extracts image tags using the Microsoft API.
MicrosoftVisionAPICategoryExtractor([…]) Extracts image categories using the Microsoft API.
MicrosoftVisionAPIImageTypeExtractor([…]) Extracts image types (clipart, etc.) using the Microsoft API.
MicrosoftVisionAPIColorExtractor([…]) Extracts image color attributes using the Microsoft API.
MicrosoftVisionAPIAdultExtractor([…]) Detects the presence of adult content using the Microsoft API.
SaliencyExtractor([name]) Determines the saliency of the image using the Itti & Koch (1998) algorithm.
SharpnessExtractor([name]) Gets the degree of blur/sharpness of the image.
VibranceExtractor([name]) Gets the variance of color channels of the image.

Text feature extractors

ComplexTextExtractor([name]) Base ComplexTextStim Extractor class; all subclasses can only be applied to ComplexTextStim instances.
DictionaryExtractor(dictionary[, variables, …]) A generic dictionary-based extractor that supports extraction of arbitrary features contained in a lookup table.
IndicoAPITextExtractor([api_key, models, …]) Uses the Indico API to extract features from text, such as sentiment extraction.
LengthExtractor([name]) Extracts the length of the text in characters.
NumUniqueWordsExtractor([tokenizer]) Extracts the number of unique words used in the text.
PartOfSpeechExtractor([batch_size]) Tags parts of speech in text with nltk.
PredefinedDictionaryExtractor(variables[, …]) A generic Extractor that maps words onto values via one or more pre-defined dictionaries accessed via the web.
TextVectorizerExtractor([vectorizer]) Uses a scikit-learn Vectorizer to extract bag-of-features from text.
VADERSentimentExtractor() Uses nltk’s VADER lexicon to extract (0.0-1.0) values for the positive, neutral, and negative sentiment of a TextStim.
WordEmbeddingExtractor(embedding_file[, …]) An extractor that uses a word embedding file to look up embedding vectors for text.

Video feature extractors

FarnebackOpticalFlowExtractor([pyr_scale, …]) Extracts total amount of dense optical flow between every pair of video frames.


merge_results(results[, format, timing, …]) Merges a list of ExtractorResult instances and returns a pandas DataFrame.

Filters (pliers.filters)

The Filter hierarchy contains Transformer classes that take a Stim of one type as input and return a Stim of the same type as output (but with some changes to its data).


FrameSamplingFilter([every, hertz, top_n]) Samples frames from video stimuli, to improve efficiency.
ImageCroppingFilter([box]) Crops an image.
PillowImageFilter([image_filter]) Uses the ImageFilter module from PIL to run a pre-defined image enhancement filter on an ImageStim.
PunctuationRemovalFilter([name]) Removes punctuation from a TextStim.
TokenizingFilter([tokenizer]) Tokenizes a TextStim into several word TextStims.
TokenRemovalFilter([tokens, language]) Removes tokens (e.g., stopwords, common words, punctuation) from a TextStim.
WordStemmingFilter([stemmer, tokenize]) NLTK-based word stemming Filter.
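The defining property of a Filter (same Stim type in and out, with modified data) can be illustrated with a toy punctuation-removal step. ToyTextStim and remove_punctuation are hypothetical names used only for this sketch; they are not the pliers TextStim or PunctuationRemovalFilter, whose behavior may differ (for example, in which characters count as punctuation).

```python
import string
from dataclasses import dataclass


@dataclass
class ToyTextStim:
    """Hypothetical minimal text stimulus (not the pliers TextStim)."""
    text: str


def remove_punctuation(stim):
    """Return a new ToyTextStim with ASCII punctuation stripped from its text."""
    table = str.maketrans("", "", string.punctuation)
    return ToyTextStim(text=stim.text.translate(table))


original = ToyTextStim("Hello, world! It's here.")
filtered = remove_punctuation(original)
print(filtered.text)  # Hello world Its here
```

The input object is left unchanged and a new object of the same type is returned, which mirrors the "same type in, same type out" contract described above.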

Graph construction (pliers.graph)

The graph module contains tools for constructing and executing graphs of pliers Transformers.


Graph([nodes, spec]) Graph-like structure that represents an entire pliers workflow.
Node(transformer[, name]) A graph node/vertex.
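To make the idea of a graph of transformers concrete, here is a minimal standard-library sketch in which each node wraps a callable and passes its output to its children, and the outputs of the leaf nodes are collected by name. ToyNode and run are hypothetical names; the pliers Graph and Node classes have their own interfaces and additional features that this sketch does not attempt to reproduce.

```python
class ToyNode:
    """Hypothetical graph node wrapping a callable and its child nodes."""

    def __init__(self, transformer, name=None):
        self.transformer = transformer
        self.name = name or transformer.__name__
        self.children = []

    def add_child(self, node):
        self.children.append(node)
        return node


def run(node, value):
    """Apply the node, feed its output to each child, collect leaf outputs."""
    output = node.transformer(value)
    if not node.children:
        return {node.name: output}
    results = {}
    for child in node.children:
        results.update(run(child, output))
    return results


def tokenize(text):
    return text.split()


def count_tokens(tokens):
    return len(tokens)


def longest_token(tokens):
    return max(tokens, key=len)


# One root node fans out to two leaf nodes.
root = ToyNode(tokenize)
root.add_child(ToyNode(count_tokens))
root.add_child(ToyNode(longest_token))

results = run(root, "pliers extracts features from stimuli")
print(results)  # {'count_tokens': 5, 'longest_token': 'extracts'}
```

The shared tokenize step runs once and its output is reused by both children, which is the main efficiency argument for expressing a workflow as a graph rather than as independent pipelines.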

Stimuli (pliers.stimuli)

The Stim hierarchy contains pliers representations of any object from which features can potentially be extracted.


AudioStim([filename, onset, sampling_rate, …]) Represents an audio clip.
ComplexTextStim([filename, onset, duration, …]) A collection of text stims (e.g., a story), typically ordered and with onsets and/or durations associated with each element.
CompoundStim(elements) A container for an arbitrary set of Stim elements.
ImageStim([filename, onset, duration, data, url]) Represents a static image.
TextStim([filename, text, onset, duration, …]) Any simple text stimulus, most commonly a single word.
TweetStimFactory([consumer_key, …]) An object from which to generate TweetStims, creates an Api instance from
TweetStim(status) Represents the text and associated media from a single tweet.
TranscribedAudioCompoundStim(audio, text) An AudioStim with an associated text transcription.
VideoFrameCollectionStim([filename, …]) A collection of video frames.
VideoFrameStim(video, frame_num[, duration, …]) A single frame of video.
VideoStim([filename, onset, url, clip]) A video.


load_stims(source[, dtype, fail_silently]) Load one or more stimuli directly from file, inferring/extracting metadata as needed.
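One plausible way to infer a stimulus type from a file name is to look at its MIME type. The sketch below uses only the standard-library mimetypes module; infer_stim_kind and KIND_BY_MAIN_TYPE are hypothetical names, and this is an assumption about the general approach rather than a description of how load_stims is implemented.

```python
import mimetypes

# Hypothetical mapping from MIME main type to a stimulus kind label.
KIND_BY_MAIN_TYPE = {
    "video": "video",
    "audio": "audio",
    "image": "image",
    "text": "text",
}


def infer_stim_kind(filename):
    """Guess a stimulus kind from the file extension, or None if unknown."""
    mime, _encoding = mimetypes.guess_type(filename)
    if mime is None:
        return None
    main_type = mime.split("/")[0]
    return KIND_BY_MAIN_TYPE.get(main_type)


for name in ["movie.mp4", "photo.png", "notes.txt", "clip.wav"]:
    print(name, infer_stim_kind(name))
```

Extension-based guessing is cheap but can be fooled by misnamed files; inspecting file contents is more robust at the cost of an extra dependency.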

Transformers (pliers.transformers)

The transformers module contains the base Transformer class from which all other pliers transformers inherit, as well as Transformer subclasses that have multiple subclasses spanning different modules (e.g., Google Cloud extractors that span audio, image, etc.).


BatchTransformerMixin([batch_size]) A mixin that overrides the default implicit iteration behavior.
GoogleAPITransformer([discovery_file, …]) Base GoogleAPITransformer class.
GoogleVisionAPITransformer([discovery_file, …]) Base class for transformers using the Google Vision API.
Transformer([name]) Base class for all pliers Transformers.


get_transformer(name[, base]) Scans the list of currently available Transformer classes and returns an instantiation of the first one whose name exactly matches (case-insensitive).