Simple voice activity detection based endpointing.

#include <pocketsphinx/endpointer.h>

Public Member Functions
POCKETSPHINX_EXPORT ps_endpointer_t *	ps_endpointer_init (double window, double ratio, ps_vad_mode_t mode, int sample_rate, double frame_length)

POCKETSPHINX_EXPORT ps_endpointer_t *	ps_endpointer_retain (ps_endpointer_t *ep)

POCKETSPHINX_EXPORT int	ps_endpointer_free (ps_endpointer_t *ep)

POCKETSPHINX_EXPORT ps_vad_t *	ps_endpointer_vad (ps_endpointer_t *ep)

POCKETSPHINX_EXPORT const int16 *	ps_endpointer_process (ps_endpointer_t *ep, const int16 *frame)

POCKETSPHINX_EXPORT const int16 *	ps_endpointer_end_stream (ps_endpointer_t *ep, const int16 *frame, size_t nsamp, size_t *out_nsamp)

POCKETSPHINX_EXPORT int	ps_endpointer_in_speech (ps_endpointer_t *ep)

POCKETSPHINX_EXPORT double	ps_endpointer_speech_start (ps_endpointer_t *ep)

POCKETSPHINX_EXPORT double	ps_endpointer_speech_end (ps_endpointer_t *ep)

POCKETSPHINX_EXPORT int	ps_endpointer_set_timestamp_func (ps_endpointer_t *ep, ps_endpointer_timestamp_cb_t cb, void *user_data)

POCKETSPHINX_EXPORT double	ps_endpointer_timestamp (ps_endpointer_t *ep)

Detailed Description

Simple voice activity detection based endpointing.

Examples: live.c, live_portaudio.c, live_pulseaudio.c, and live_win32.c.

Member Function Documentation

◆ ps_endpointer_init()

POCKETSPHINX_EXPORT ps_endpointer_t * ps_endpointer_init	(	double	window,
		double	ratio,
		ps_vad_mode_t	mode,
		int	sample_rate,
		double	frame_length
	)

Initialize endpointing.

Parameters

window	Seconds of audio to use in speech start/end decision, or 0 to use the default (PS_ENDPOINTER_DEFAULT_WINDOW).
ratio	Ratio of frames needed to trigger start/end decision, or 0 for the default (PS_ENDPOINTER_DEFAULT_RATIO).
mode	"Aggressiveness" of voice activity detection. Stricter values (see ps_vad_mode_t) are less likely to misclassify non-speech as speech.
sample_rate	Sampling rate of input, or 0 for the default (which can be obtained with ps_vad_sample_rate()). Only 8000, 16000, 32000, and 48000 are directly supported; other values use the closest supported rate (within reason). As a result, the actual frame length may not be exactly the one requested, so you must always use the value returned by ps_endpointer_frame_size() (in samples) or ps_endpointer_frame_length() (in seconds).
frame_length	Requested frame length in seconds, or 0.0 for the default. Only 0.01, 0.02, and 0.03 are currently supported. The actual frame length may differ, so you must always use ps_endpointer_frame_length() to obtain it.

Returns: Endpointer object, or NULL on failure (for instance, an invalid parameter).
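The relationship between sampling rate and frame length described above comes down to simple arithmetic, sketched below. The helper nominal_frame_samples() is a hypothetical name and not part of PocketSphinx; it only computes the nominal sample count for a requested rate and frame length. Because the endpointer may settle on a different rate or length, the value returned by ps_endpointer_frame_size() remains the one to use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PocketSphinx): nominal number of
 * samples in one frame for a given sampling rate and frame length.
 * The value actually in use must come from ps_endpointer_frame_size(). */
size_t
nominal_frame_samples(int sample_rate, double frame_length)
{
    assert(sample_rate > 0 && frame_length > 0.0);
    /* Round to nearest to absorb floating-point error in values
     * such as 0.03, which has no exact binary representation. */
    return (size_t)(sample_rate * frame_length + 0.5);
}
```

For example, 16000 Hz with a 0.03 second frame gives 480 samples, and 8000 Hz with a 0.01 second frame gives 80 samples.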

◆ ps_endpointer_retain()

POCKETSPHINX_EXPORT ps_endpointer_t * ps_endpointer_retain ( ps_endpointer_t * ep )

Retain a pointer to the endpointer.

Parameters

ep	Endpointer.

Returns: Endpointer with incremented reference count.

◆ ps_endpointer_free()

POCKETSPHINX_EXPORT int ps_endpointer_free ( ps_endpointer_t * ep )

Release a pointer to the endpointer.

Parameters

ep	Endpointer.

Returns: New reference count (0 if freed).
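The retain/free contract documented above (retain returns the same object with one more reference, free returns the new count and 0 once the object is gone) can be illustrated with a small stand-in type. Everything named demo_obj below is hypothetical; it is not the layout or implementation of ps_endpointer_t, only a sketch of the reference-counting behaviour the return values describe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object. This is NOT
 * how ps_endpointer_t is defined; it only mirrors the documented
 * retain/free return values. */
typedef struct demo_obj_s {
    int refcount;
} demo_obj_t;

demo_obj_t *
demo_obj_init(void)
{
    demo_obj_t *obj = calloc(1, sizeof(*obj));
    if (obj != NULL)
        obj->refcount = 1;  /* the creator holds the first reference */
    return obj;
}

demo_obj_t *
demo_obj_retain(demo_obj_t *obj)
{
    assert(obj != NULL);
    ++obj->refcount;
    return obj;  /* same pointer, one more owner */
}

int
demo_obj_free(demo_obj_t *obj)
{
    if (obj == NULL)
        return 0;
    if (--obj->refcount > 0)
        return obj->refcount;  /* still referenced elsewhere */
    free(obj);
    return 0;  /* last reference released, object freed */
}
```

In this sketch, an object that has been retained once needs two calls to the free function: the first returns 1 and the second returns 0.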

◆ ps_endpointer_vad()

POCKETSPHINX_EXPORT ps_vad_t * ps_endpointer_vad ( ps_endpointer_t * ep )

Get the voice activity detector used by the endpointer.

Returns: VAD object. The endpointer retains ownership of this object, so you must use ps_vad_retain() if you wish to use it outside of the lifetime of the endpointer.

◆ ps_endpointer_process()

POCKETSPHINX_EXPORT const int16 * ps_endpointer_process	(	ps_endpointer_t *	ep,
		const int16 *	frame
	)

Process a frame of audio, returning a frame if in a speech region.

Note that the endpointer is not thread-safe. You must call all endpointer functions from the same thread.

Parameters

ep	Endpointer.
frame	Frame of data; must contain exactly ps_endpointer_frame_size() samples.

Returns: NULL if no speech is available, or a pointer to a frame of ps_endpointer_frame_size() samples (no more and no less).

◆ ps_endpointer_end_stream()

POCKETSPHINX_EXPORT const int16 * ps_endpointer_end_stream	(	ps_endpointer_t *	ep,
		const int16 *	frame,
		size_t	nsamp,
		size_t *	out_nsamp
	)

Process remaining samples at end of stream.

Note that the endpointer is not thread-safe. You must call all endpointer functions from the same thread.

Parameters

ep	Endpointer.
frame	Frame of data; must contain ps_endpointer_frame_size() samples or fewer.
nsamp	Number of samples in frame.
out_nsamp	Output, number of samples available.

Returns: Pointer to available samples, or NULL if none available.
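Since ps_endpointer_process() expects full frames and ps_endpointer_end_stream() accepts a final, possibly shorter frame, a caller holding a buffer of arbitrary length has to split it accordingly. The sketch below shows only that bookkeeping; split_into_frames() is a hypothetical helper, not a PocketSphinx function, and the frame size passed to it is assumed to come from ps_endpointer_frame_size().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PocketSphinx): split a buffer of
 * `total` samples into full frames of `frame_size` samples plus a
 * shorter remainder. */
void
split_into_frames(size_t total, size_t frame_size,
                  size_t *out_nfull, size_t *out_remainder)
{
    assert(frame_size > 0);
    assert(out_nfull != NULL && out_remainder != NULL);
    /* Each full frame would be passed to ps_endpointer_process(). */
    *out_nfull = total / frame_size;
    /* The leftover samples would be passed, with their count as
     * nsamp, to ps_endpointer_end_stream(). */
    *out_remainder = total % frame_size;
}
```

With 1000 samples and a 480-sample frame, this yields 2 full frames and a 40-sample remainder.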

◆ ps_endpointer_in_speech()

POCKETSPHINX_EXPORT int ps_endpointer_in_speech ( ps_endpointer_t * ep )

Get the current state (speech/not-speech) of the endpointer.

This function can be used to detect speech/non-speech transitions. If it returns 0, and a subsequent call to ps_endpointer_process() returns non-NULL, this indicates a transition to speech. Conversely, if ps_endpointer_process() returns non-NULL and a subsequent call to this function returns 0, this indicates a transition to non-speech.

Parameters

ep	Endpointer.

Returns: Non-zero if in a speech segment after processing the last frame of data.
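The transition rules described above can be written as a small pure function, which may make them easier to follow. This is a sketch: classify_transition() and the SPEECH_* constants are hypothetical names, and the three arguments are assumed to be captured by the caller around a single call to ps_endpointer_process().

```c
/* Hypothetical event flags (not part of PocketSphinx). */
enum { SPEECH_NONE = 0, SPEECH_START = 1, SPEECH_END = 2 };

/* was_in_speech: ps_endpointer_in_speech() before processing the frame.
 * got_speech:    non-zero if ps_endpointer_process() returned non-NULL.
 * now_in_speech: ps_endpointer_in_speech() after processing the frame.
 * Returns a bitmask of SPEECH_START and SPEECH_END. */
int
classify_transition(int was_in_speech, int got_speech, int now_in_speech)
{
    int events = SPEECH_NONE;

    /* Not in speech before, but this frame produced speech data. */
    if (!was_in_speech && got_speech)
        events |= SPEECH_START;
    /* Speech data was produced, but the endpointer has now left speech. */
    if (got_speech && !now_in_speech)
        events |= SPEECH_END;
    return events;
}
```

A caller might query ps_endpointer_speech_start() when SPEECH_START is set and ps_endpointer_speech_end() when SPEECH_END is set.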

◆ ps_endpointer_speech_start()

POCKETSPHINX_EXPORT double ps_endpointer_speech_start ( ps_endpointer_t * ep )

Get the start time of the last speech segment.

◆ ps_endpointer_speech_end()

POCKETSPHINX_EXPORT double ps_endpointer_speech_end ( ps_endpointer_t * ep )

Get the end time of the last speech segment.

◆ ps_endpointer_set_timestamp_func()

POCKETSPHINX_EXPORT int ps_endpointer_set_timestamp_func	(	ps_endpointer_t *	ep,
		ps_endpointer_timestamp_cb_t	cb,
		void *	user_data
	)

Set a callback function to provide external timestamps.

By default, the endpointer uses audio sample counts to calculate timestamps. This can cause drift between audio time and system time when the audio clock is not synchronized with the system clock.

This function allows you to provide an external timestamp source (e.g., system time, monotonic clock) to avoid this drift.

Parameters

ep	Endpointer.
cb	Callback function that returns current time in seconds, or NULL to revert to audio-based timestamps.
user_data	User data to pass to the callback function.

Returns: 0 for success, -1 for error.
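A callback that reports system time might look like the sketch below. The exact definition of ps_endpointer_timestamp_cb_t is not shown on this page, so the signature used here (a void * user_data argument and a double return value in seconds) is an assumption that should be checked against endpointer.h. The function name system_time_seconds() and the use of user_data as an optional time origin are likewise illustrative choices, not library conventions.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical callback. ASSUMPTION: the callback type takes the
 * user_data pointer and returns the current time in seconds as a
 * double; verify against ps_endpointer_timestamp_cb_t. */
double
system_time_seconds(void *user_data)
{
    /* Optional origin (in seconds) to subtract; an illustrative use
     * of user_data, not something the library requires. */
    const double *origin = user_data;
    struct timespec ts;
    double now;

    /* timespec_get() is standard C11 and reports calendar time. */
    if (timespec_get(&ts, TIME_UTC) != TIME_UTC)
        return -1.0;  /* error value chosen for this sketch only */
    now = (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
    return origin != NULL ? now - *origin : now;
}
```

Under that assumption, it would be registered with ps_endpointer_set_timestamp_func(ep, system_time_seconds, NULL), and passing NULL as the callback would restore audio-based timestamps. A monotonic clock is usually a better source where one is available, since calendar time can be adjusted while the program runs.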

◆ ps_endpointer_timestamp()

POCKETSPHINX_EXPORT double ps_endpointer_timestamp ( ps_endpointer_t * ep )

Get the current timestamp from the endpointer.

This returns the current timestamp used by the endpointer, which may be based on audio samples or an external timestamp source if one has been set with ps_endpointer_set_timestamp_func().

Parameters

ep	Endpointer.

Returns: Current timestamp in seconds.
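To see why audio-based timestamps can drift from system time, consider the arithmetic in the sketch below. It assumes that an audio-based timestamp is the number of samples consumed divided by the nominal sampling rate; the helper names and the example rates are hypothetical and are not taken from the library.

```c
#include <assert.h>

/* Hypothetical helper: timestamp in seconds derived from a sample
 * count. ASSUMPTION: audio time = samples / nominal rate. */
double
audio_timestamp(long long nsamples, int nominal_rate)
{
    assert(nominal_rate > 0);
    return (double)nsamples / nominal_rate;
}

/* Hypothetical helper: difference between audio time and wall-clock
 * time after wall_seconds, when the device really delivers true_rate
 * samples per second instead of nominal_rate. */
double
clock_drift(double wall_seconds, double true_rate, int nominal_rate)
{
    long long delivered = (long long)(wall_seconds * true_rate);
    return audio_timestamp(delivered, nominal_rate) - wall_seconds;
}
```

For instance, a device that nominally runs at 16000 Hz but actually delivers 16001.6 samples per second would put the audio-based timestamp roughly 0.36 seconds ahead of the wall clock after one hour.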

The documentation for this struct was generated from the following file:

pocketsphinx/endpointer.h
