IEEE ICASSP 2020 Virtual Conference May 2020

Thu, 16 July, 2020

IEEE MemberUS $11.00
Society MemberUS $0.00
IEEE Student MemberUS $11.00
Non-IEEE MemberUS $15.00

Purchase

Impact Of A Shift-Invariant Harmonic Phase Model In Fully Parametric Harmonic Voice Representation And Time/Frequency Synthesis

00:15:19

0 views

Harmonic representation models are widely used, notably in speech coding and synthesis. In this paper, we describe two fully parametric harmonic representation and signal reconstruction alternatives that rely on a shift-invariant harmonic phase model and

Linear Model-Based Intra Prediction In VVC Test Model

00:12:48

0 views

This paper studies a new intra prediction method based on a linear model for improving the intra prediction performance of the Versatile Video Coding (H.266/VVC) standard. The Linear Model-based Intra Prediction (LMIP) method in this work attempts to model th

Augmentation Data Synthesis Via GANs: Boosting Latent Fingerprint Reconstruction

00:12:57

0 views

Latent fingerprint reconstruction is a vital preprocessing step for fingerprint identification. The task is very challenging owing to both complicated degradation patterns and the scarcity of paired training data. To address these challenges, we

Misspecified Cramér-Rao Bound For Delay Estimation With A Mismatched Waveform: A Case Study

00:14:34

0 views

In this paper we investigate the problem of time of arrival estimation which occurs in many real-world applications, such as indoor localization or non-destructive testing via ultrasound or radar. A problem that is often overlooked when analyzing these sy

In-Domain And Out-Of-Domain Data Augmentation To Improve Children's Speaker Verification System In Limited Data Scenario

00:13:38

0 views

In this paper, we present our efforts towards developing a robust automatic speaker verification (ASV) system for children when the domain-specific data is limited. For that purpose, we have studied the effect of in-domain and out-of-domain data augmentat

Multi-Task Learning In Autonomous Driving Scenarios Via Adaptive Feature Refinement Networks

00:13:50

0 views

Many deep learning applications benefit from multi-task learning with several related objectives. In autonomous driving scenarios, being able to accurately infer motion and spatial information is essential for scene understanding. In this paper, we combin

Anti-Jamming Routing For Internet Of Satellites: A Reinforcement Learning Approach

00:13:45

0 views

Anti-jamming routing for the Internet of Satellites (IoS) has drawn increasing attention due to unknown interrupts, unexpected congestion and smart jamming. This paper investigates an anti-jamming routing scheme for heterogeneous IoS, with the aim o

Frequency And Temporal Convolutional Attention For Text-Independent Speaker Recognition

00:11:27

0 views

The majority of recent approaches for text-independent speaker recognition apply attention or similar techniques for aggregation of frame-level feature descriptors generated by a deep neural network (DNN) front-end. In this paper, we propose methods of co

ADMM-Based One-Bit Quantized Signal Detection For Massive MIMO Systems With Hardware Impairments

00:13:53

0 views

This paper considers signal detection in massive multiple-input multiple-output (MIMO) systems with general additive hardware impairments and one-bit quantization. First, we present the quantization-unaware and Bussgang decomposition-based linear receiver

Back-To-Back Butterfly Network, An Adaptive Permutation Network For New Communication Standards

00:09:56

0 views

In this paper, we introduce an adaptive Back-to-Back Butterfly Network (B²BN) dedicated to next communication standards. It can perform any kind of permutation, and its architecture is based on a concatenation of basic networks. However, for a set of permu

Low-Rank Tensor Ring Model For Completing Missing Visual Data

00:13:21

0 views

Low-rank tensor factorization can be viewed as a higher-order generalization of low-rank matrix factorization, both of which have been used for image and video representation and reconstruction from compressive measurements. In this paper, we present an a

Robust Unsupervised Audio-Visual Speech Enhancement Using A Mixture Of Variational Autoencoders

00:12:47

0 views

Recently, an audio-visual speech generative model based on variational autoencoder (VAE) has been proposed, which is combined with a non-negative matrix factorization (NMF) model for noise variance to perform unsupervised speech enhancement. When visual d

Coded Illumination And Multiplexing For Lensless Imaging

00:09:13

0 views

Mask-based lensless cameras offer an alternative option to conventional cameras. Compared to conventional cameras, lensless cameras can be extremely thin, flexible, and light-weight. Despite these advantages, the quality of images recovered from the lensl

Detecting Multiple Speech Disfluencies Using A Deep Residual Network With Bidirectional Long Short-Term Memory

00:12:08

0 views

Stuttering is a speech impediment affecting tens of millions of people on an everyday basis. Despite its prevalence, there is minimal data and research on the identification and classification of stuttered speech. This paper tackles the problem of dete

Transformer-Based Acoustic Modeling For Hybrid Speech Recognition

[2 Videos ]

We propose and evaluate transformer-based acoustic models for hybrid speech recognition. Several modeling choices are discussed in this work, including various positional embedding methods and an iterated loss to enable training deep transformers. We also

00:18:28

0 views

Speaker Embeddings Incorporating Acoustic Conditions For Diarization

00:14:59

0 views

We present our work on training speaker embeddings that are especially effective for speaker diarization. For various speaker recognition tasks, extracting speaker embeddings using Deep Neural Networks (DNNs) has become a major method. These embeddings are general

Learning Perception And Planning With Deep Active Inference

00:08:17

0 views

Active inference is a process theory of the brain that states that all living organisms infer actions in order to minimize their (expected) free energy. However, current experiments are limited to predefined, often discrete, state spaces. In this paper we

Hearing Aid Research Data Set For Acoustic Environment Recognition

00:13:21

1 view

State-of-the-art hearing aids (HA) are limited in recognizing acoustic environments. Much effort is spent on research to improve listening experience for HA users in every acoustic situation. There is, however, no dedicated public database to train acoust

Recurrent Neural Audiovisual Word Embeddings For Synchronized Speech And Real-Time MRI

00:14:21

0 views

In this paper, the use of word embeddings for the segments found in audio and real-time magnetic resonance imaging (rtMRI) videos is addressed. In this study, word embeddings are created to store and retrieve data efficiently, and their representation pow

1.5 Gbit/s 4.9 W Hyperspectral Image Encoders On A Low-Power Parallel Heterogeneous Processing Platform

00:11:18

3 views

This work explores the utilization of low-power heterogeneous devices for parallelizing the compute-intensive hyperspectral and multispectral image compression CCSDS-123 entropy encoders. Multithread processing allows for the near-optimal system's bandwi

Improving Music Transcription By Pre-Stacking A U-Net

00:10:00

1 view

We propose to pre-stack a U-Net as a way of improving the polyphonic music transcription performance of various baseline Convolutional Neural Networks (CNNs). The U-Net, a network architecture based on skip-connections between layers, acts as a transformat

Privacy Aware Acoustic Scene Synthesis Using Deep Spectral Feature Inversion

00:11:24

0 views

Gathering information about the acoustic environment of urban areas is now possible and is being studied in many major cities around the world. Part of the research is to find ways to inform citizens about their sound environment while ensuring their privacy. We study

Cloud-Driven Multi-Way Multiple-Antenna Relay Systems: Best-User-Link Selection And Joint MMSE Detection

00:18:03

0 views

In this work, we present a cloud-driven uplink framework for multi-way multiple-antenna relay systems which facilitates joint linear Minimum Mean Square Error (MMSE) symbol detection in the cloud and where users are selected to simultaneously transmit to

Sparse Branch And Bound For Exact Optimization Of L0-Norm Penalized Least Squares

00:12:01

0 views

We propose a global optimization approach to solve l_0-norm penalized least-squares problems, using a dedicated branch-and-bound methodology. A specific tree search strategy is built, with branching rules inspired by greedy exploration techniques. We sh

An LSTM Based Architecture To Relate Speech Stimulus To EEG

00:13:48

0 views

Modeling the relationship between natural speech and a recorded electroencephalogram (EEG) helps us understand how the brain processes speech and has various applications in neuroscience and brain-computer interfaces. In this context, so far mainly linear

A Composite DNN Architecture For Speech Enhancement

00:14:16

0 views

In speech enhancement, the use of supervised algorithms in the form of deep neural networks (DNNs) has become tremendously popular in recent years. The target function of the DNN (and the associated estimators) is often either a masking function applied t

Low-Rank Gradient Approximation For Memory-Efficient On-Device Training Of Deep Neural Network

00:12:20

0 views

Training machine learning models on mobile devices has the potential of improving both privacy and accuracy of the models. However, one of the major obstacles to achieving this goal is the memory limitation of mobile devices. Reducing training memory enab

Image De-Raining Via RDL: When Reweighted Convolutional Sparse Coding Meets Deep Learning

00:12:19

0 views

Over the past few decades, image de-raining has witnessed substantial progress due to the development of priors and deep learning based methods. However, few studies combine the merits of both. In this paper, we argue that domain expertise of conventional

Minimum Latency Training Strategies For Streaming Sequence-To-Sequence ASR

00:16:19

0 views

Recently, a few novel streaming attention-based sequence-to-sequence (S2S) models have been proposed to perform online speech recognition with linear-time decoding complexity. However, in these models, the decisions to generate tokens are delayed compared

LQAID: Localized Quality Aware Image Denoising Using Deep Convolutional Neural Networks

00:12:01

0 views

In this paper we propose the Localized Quality Aware Image Denoising (LQAID) technique for image denoising using deep convolutional neural networks (CNNs). LQAID relies on local quality estimates over global cues like noise standard deviation since the pe

Decomposed CycleGAN For Single Image Deraining With Unpaired Data

00:14:19

0 views

Most previous learning-based methods required paired rain image data. In practice, however, paired rain data cannot be collected. Inspired by the use of unpaired data in translation tasks, in this paper we present a new method for rain removal using unpai

One-Bit Normalized Scatter Matrix Estimation For Complex Elliptically Symmetric Distributions

00:14:51

0 views

One-bit quantization has attracted attention in massive MIMO, radar, and array processing, due to its simplicity, low cost, and capability of parameter estimation. Specifically, the shape of the covariance of the unquantized data can be estimated from the

Proximal Multitask Learning Over Distributed Networks With Jointly Sparse Structure

00:15:34

0 views

Modeling relations between local optimum parameter vectors in multitask networks has attracted much attention in recent years. This work considers a distributed optimization problem for parameter vectors with a jointly sparse structure among nodes, th

Generating Multilingual Voices Using Speaker Space Translation Based On Bilingual Speaker Data

[2 Videos ]

We present progress towards bilingual Text-to-Speech which is able to transform a monolingual voice to speak a second language while preserving speaker voice quality. We demonstrate that a bilingual speaker embedding space contains a separate distribution

00:13:15

0 views

Neural Percussive Synthesis Parameterised By High-Level Timbral Features

00:12:10

0 views

We present a deep neural network-based methodology for synthesising percussive sounds with control over high-level timbral characteristics of the sounds. This approach allows for intuitive control of a synthesizer, enabling the user to shape sounds withou

Robustness Of Sparse Bayesian Learning In Correlated Environments

00:14:22

0 views

In this paper we explore the robustness of Sparse Bayesian Learning (SBL) in an environment with correlated sources. We provide two new perspectives to understand SBL's strategy for handling correlated sources. Using a Minimum Power Distortionless Respons

End-To-End Multi-Person Audio/Visual Automatic Speech Recognition

00:15:17

0 views

Traditionally, audio-visual automatic speech recognition has been studied under the assumption that the speaking face on the visual signal is the face matching the audio. However, in a more realistic setting, when multiple faces are potentially on screen

BBAND Index: A No-Reference Banding Artifact Predictor

00:13:47

0 views

Banding artifact, or false contouring, is a common video compression impairment that tends to appear on large flat regions in encoded videos. These staircase-shaped color bands can be very noticeable in high-definition videos. Here we study this artifact,

Gender Differences On The Perception And Production Of Utterances With Willingness And Reluctance In Chinese

00:13:11

0 views

This study intends to explore the effects of gender differences on the perception and production of emotional intonation with willingness and reluctance. In the perceptual study, 20 native Mandarin listeners were instructed to rate perceived degree of wil

Experiments In Creating Online Course Content For Signal Processing Education

00:13:06

0 views

The creation of the NPTEL platform in India has led to a vast population of engineering students getting access to quality online content for Signal Processing. These courses are globally accessible, free of cost, and also provide a means of obtaining cer

Gait Phase Segmentation Using Weighted Dynamic Time Warping And K-Nearest Neighbors Graph Embedding

00:12:12

0 views

Gait phase segmentation is the process of identifying the start and end of different phases within a gait cycle. It is essential to many medical applications, such as disease diagnosis or rehabilitation. This work utilizes inertial measurement units (IMUs

Towards Blind Quality Assessment Of Concert Audio Recordings Using Deep Neural Networks

00:10:18

0 views

Live music audio and video recordings represent a large percentage of the huge amount of User Generated Content (UGC) that is available on the internet today. Applications and services related to the management and consumption of this content may signific

On The Limit Distribution Of The Canonical Correlation Coefficients Between The Past And The Future Of A High-Dimensional White Noise

00:13:03

0 views

It is shown that the distribution of the estimated canonical correlation coefficients between the past and the future of a high-dimensional multivariate white noise sequence converges almost surely towards a limit distribution whose density is given in cl

Weight Sharing And Deep Learning For Spectral Data

00:12:03

0 views

We propose a novel method to co-train deep convolutional neural networks for data sets of differing position-specific data. This is an advantage in chemometrics where individual measurements represent exact chemical compounds, e.g. for given wavelengths,

Pathloss Prediction Using Deep Learning With Applications To Cellular Optimization And Efficient D2D Link Scheduling

00:14:56

0 views

In this paper we propose a highly efficient and very accurate method for estimating the propagation pathloss from a point x to all points y on the 2D plane. Our method, termed RadioUNet, is a deep neural network. For applications such as user-cell site as

Audio-Based Detection Of Explicit Content In Music

00:14:55

0 views

We present a novel automatic system for performing explicit content detection directly on the audio signal. Our modular approach uses an audio-to-character recognition model, a keyword spotting model associated with a dictionary of carefully chosen keywor

Ordinal Learning For Emotion Recognition In Customer Service Calls

00:13:49

0 views

Approaches toward ordinal speech emotion recognition (SER) tasks are commonly based on categorical classification algorithms, where the rank-order emotions are arbitrarily treated as independent categories. To employ the ordinal information between em
