IEEE ICASSP 2020 Virtual Conference May 2020

Thu, 16 July, 2020

IEEE Member: US $11.00
Society Member: US $0.00
IEEE Student Member: US $11.00
Non-IEEE Member: US $15.00

Pre-Training For Query Rewriting In A Spoken Language Understanding System

00:15:25

0 views

Query rewriting (QR) is an increasingly important technique for reducing customer friction resulting from errors in a spoken language understanding pipeline originating from various sources such as speech recognition errors, language understanding errors

Learning Data Representation And Emotion Assessment From Physiological Data

00:14:39

0 views

Aiming at a deeper understanding of human emotional states, we explore deep learning techniques for the analysis of physiological data. In this work, two-channel pre-frontal raw electroencephalography and photoplethysmography signals of 25 subjects were c

Weighted Speech Distortion Losses For Neural-Network-Based Real-Time Speech Enhancement

00:14:56

3 views

This paper investigates several aspects of training an RNN (recurrent neural network) that impact the objective and subjective quality of enhanced speech for real-time single-channel speech enhancement. Specifically, we focus on an RNN that enhances short-t

Effective WaveNet Adaptation For Voice Conversion With Limited Data

00:11:58

0 views

WaveNet has shown its great potential as a direct conversion model in voice conversion. However, due to the model complexity, WaveNet always requires a large amount of training data, which has limited its applications in voice conversion, where training d

ScalpNet: Detection Of Spatiotemporal Abnormal Intervals In Epileptic EEG Using Convolutional Neural Networks

00:14:18

0 views

We propose ScalpNet, a deep neural network that detects spatiotemporal abnormal intervals in EEGs of epilepsy patients. Since the number of trained clinicians is very limited, it is crucial to establish automatic detection of abnormal signals caused b

Video Frame Interpolation Via Exceptional Motion-Aware Synthesis

00:13:16

0 views

In this paper, we propose a novel video frame interpolation method via exceptional motion-aware synthesis, in which accurate optical flow could be estimated even with exceptional motion patterns. Specifically, we devise two deep learning modules: exceptio

Fixed-Point Optimization Of Transformer Neural Network

00:12:51

377 views

The Transformer model adopts a self-attention structure and shows very good performance in various natural language processing tasks. However, it is difficult to implement the Transformer in embedded systems because of its very large model size. In this s

Bangla Voice Command Recognition In End-To-End System Using Topic Modeling Based Contextual Rescoring

00:14:39

0 views

In this work, we perform contextual rescoring using multi-label topic modeling to improve the performance of an End-to-End Bangla voice command recognition system. We use a hybrid of Connectionist Temporal Classification (CTC) and Attention mechanism in o

Study Of Closed Phase Resonance Bandwidths For Oral And Nasal Tracts Using Zero Time Windowing

00:13:45

0 views

The periodic opening and closing of the vibrating vocal folds changes the production system continuously during the production of voiced speech. The subglottal and supraglottal cavities have distinct structure and impedance. A coupling and decoupling of

A Robust Speaker Clustering Method Based On Discrete Tied Variational Autoencoder

00:15:20

828 views

Speaker clustering based on agglomerative hierarchical clustering (AHC) has recently become a common method for solving two main problems: clustering without a preset number of categories and clustering with a fixed number of categories. In general, the model takes features like i-vectors as

Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation

00:14:54

0 views

We investigate the problem of simultaneous machine translation of long-form speech content. We target a continuous speech-to-text scenario, generating translated captions for a live audio feed, such as a lecture or play-by-play commentary. As this scenari

Investigating Generalization In Neural Networks Under Optimally Evolved Training Perturbations

00:15:05

0 views

In this paper, we study the generalization properties of neural networks under input perturbations and show that minimal training data corruption by a few pixel modifications can cause drastic overfitting. We propose an evolutionary algorithm to search fo

Sequence-To-Sequence Singing Synthesis Using The Feed-Forward Transformer

00:13:19

501 views

We propose a sequence-to-sequence singing synthesizer, which avoids the need for training data with pre-aligned phonetic and acoustic features. Rather than the more common approach of a content-based attention mechanism combined with an autoregressive dec

ASR Error Correction And Domain Adaptation Using Machine Translation

00:13:09

0 views

Off-the-shelf pre-trained Automatic Speech Recognition (ASR) systems are an increasingly viable service for companies of any size building speech-based products. While these ASR systems are trained on large amounts of data, domain mismatch is still an iss

Anisotropic Guided Filtering

00:19:10

834 views

The guided filter and its derivatives have been widely employed in many image processing and computer vision applications due to their low complexity and good edge-preservation properties. Despite this success, these variants are unable to handle more agg

Fast High-Dimensional Kernel Filtering

00:12:35

0 views

The bilateral and nonlocal means filters are instances of kernel-based filters that are popularly used in image processing. It was recently shown that fast and accurate bilateral filtering of grayscale images can be performed using a low-rank approximatio

Enriched Speech For Effortless Listening

00:13:42

1 view

Human-machine speech interaction is increasingly common in the industrialised world. A (natural or synthetic) speech output that is optimised for high intelligibility and low cognitive load is of interest for both academia and industry: ENRICH (www.enrich

Video-Driven Speech Reconstruction

00:15:00

0 views

This demo will showcase our video-to-audio model which attempts to reconstruct speech from short videos of spoken statements. Our model does so in a completely end-to-end manner where raw audio is generated based on the input video. This approach bypasses

Clustering Of Nonnegative Data And An Application To Matrix Completion

00:00:02

0 views

In this paper, we propose a simple algorithm to cluster nonnegative data lying in disjoint subspaces. We analyze its performance in relation to a certain measure of correlation between said subspaces. We use our clustering algorithm to develop a matrix co

SNDCNN: Self-Normalizing Deep CNNs With Scaled Exponential Linear Units For Speech Recognition

00:14:45

0 views

Very deep CNNs achieve state-of-the-art results in both computer vision and speech recognition, but are difficult to train. The most popular way to train very deep CNNs is to use shortcut connections (SC) together with batch normalization (BN). Inspired

Exploring Entity-Level Spatial Relationships For Image-Text Matching

00:10:19

0 views

Exploring spatial relationships among entities (i.e., objects in an image, words in a text) contributes to a precise understanding of multimedia content. The neglect of spatial information in previous works probably leads to misunderstandings of image con

Equalization Of OFDM Waveforms With Insufficient Cyclic Prefix

00:11:53

1 view

In this paper, a simple equalization strategy for OFDM waveforms is proposed that specifically targets the case where the cyclic prefix is insufficient to span the whole channel duration. The proposed architecture can be very efficiently implemented in th

An Improved Frame-Unit-Selection Based Voice Conversion System Without Parallel Training Data

00:13:42

0 views

A frame-unit-selection based voice conversion system proposed earlier by us is revisited here to enhance its performance in both speech naturalness and speaker similarity. Speaker independent, bilingual (Mandarin Chinese and American English) deep neural

Modelling Sea Clutter In SAR Images Using Laplace-Rician Distribution

00:17:35

0 views

This paper presents a novel statistical model for the characterisation of synthetic aperture radar (SAR) images of the sea surface. The analysis of ocean surface is widely performed using satellite imagery as it produces information for wide areas under v

Dual-Path RNN: Efficient Long Sequence Modeling For Time-Domain Single-Channel Speech Separation

00:12:39

0 views

Recent studies in deep learning-based speech separation have proven the superiority of time-domain approaches to conventional time-frequency-based methods. Unlike the time-frequency domain approaches, the time-domain separation systems often receive input

Faster-Than-Nyquist Signaling Via Spatiotemporal Symbol-Level Precoding For Multi-User MISO Redundant Transmissions

00:15:20

0 views

This paper tackles the problem of both multi-user and intersymbol interference stemming from co-channel users transmitting at a faster-than-Nyquist (FTN) rate in multi-antenna downlink transmissions. We propose a framework for redundant block-based symbol

A Differential Approach For Rain Field Tomographic Reconstruction Using Microwave Signals From LEO Satellites

00:14:38

0 views

A differential approach is proposed for tomographic rain field reconstruction using the estimated signal-to-noise ratio of microwave signals from low earth orbit satellites at the ground receivers, with the unknown baseline values eliminated before using

Allocation Of Computing Tasks In Distributed MEC Servers Co-Powered By Renewable Sources And The Power Grid

00:12:02

0 views

We consider a Multiaccess Edge Computing (MEC) network where distributed servers have energy harvesting (e.g., solar) and storage (e.g., batteries) capabilities. Energy from a connected power grid is also available, in case that harvested from ambient sou

SSTNet: Detecting Manipulated Faces Through Spatial, Steganalysis And Temporal Features

00:13:51

0 views

Compared to conventional object detection which focuses on high-level image content, face manipulation detection pays more attention to low-level artifacts and temporal discrepancies. However, there are few methods considering both of these two characteri

Neural Time Warping For Multiple Sequence Alignment

00:12:13

0 views

Multiple sequence alignment (MSA) is a traditional and still challenging task for time-series analyses. The MSA problem is intrinsically a discrete optimization and, in principle, dynamic programming is available for solving MSA. However, the computation

Smoothing Graph Signals Via Random Spanning Forests

00:14:55

0 views

Another facet of the elegant link between random processes on graphs and Laplacian-based numerical linear algebra is uncovered: based on random spanning forests, novel Monte-Carlo estimators for graph signal smoothing are proposed. These random forests ar

Large-Scale Weakly-Supervised Content Embeddings For Music Recommendation And Tagging

00:14:37

0 views

We explore content-based representation learning strategies tailored for large-scale, uncurated music collections that afford only weak supervision through unstructured natural language metadata and co-listen statistics. At the core is a hybrid training s

Large-Scale Time Series Clustering With K-ARs

00:12:26

0 views

Time-series clustering involves grouping homogeneous time series together based on certain similarity measures. The mixture AR model (MxAR) has already been developed for time series clustering, as well as an associated EM algorithm. However, this EM clus

BOFFIN TTS: Few-Shot Speaker Adaptation By Bayesian Optimization

00:12:50

0 views

We present BOFFIN TTS (Bayesian Optimization For FIne-tuning Neural Text To Speech), a novel approach for few-shot speaker adaptation. Here, the task is to fine-tune a pre-trained TTS model to mimic a new speaker using a small corpus of target utterances.

Multitask Learning With Capsule Networks For Speech-To-Intent Applications

00:14:51

0 views

Voice-controlled applications can be a great aid to society, especially for physically challenged people. However, this requires robustness to all kinds of variations in speech. A spoken language understanding system that learns from interaction with and d

Electro-Magnetic Side-Channel Attack Through Learned Denoising And Classification

00:14:58

0 views

This paper proposes an upgraded Electro-Magnetic (EM) side-channel attack that automatically reconstructs the intercepted data. A novel system is introduced, running in parallel with leakage signal interception and catching compromising data on the fly. L

Adversarial Detection Of Counterfeited Printable Graphical Codes: Towards

00:12:53

0 views

This paper addresses the problem of anti-counterfeiting of physical objects and investigates the possibility of detecting counterfeited printable graphical codes from a machine learning perspective. We investigate fake generation via two different

Regularized Partial Phase Synchrony Index Applied To Dynamical Functional Connectivity Estimation

00:14:06

0 views

We study the inference of the conditional independence graph from the partial Phase Locking Value (PLV) index of multivariate time series. A typical application is the inference of temporal functional connectivity from brain data. We extend the recently propo

The Fractional Quaternion Fourier Number Transform

00:13:02

0 views

In this paper, we define a fractional version of the quaternion Fourier number transform (QFNT). With this purpose, we first study the eigenstructure of the QFNT; this is used to obtain the eigendecomposition of the corresponding transform matrix, from wh

AdvMS: A Multi-Source Multi-Cost Defense Against Adversarial Attacks

00:14:47

0 views

Designing effective defenses against adversarial attacks is a crucial topic, as deep neural networks have proliferated rapidly in many security-critical domains such as malware detection and self-driving cars. Conventional defense methods, although sho

Joint Phoneme-Grapheme Model For End-To-End Speech Recognition

00:14:05

0 views

This paper proposes methods to improve a commonly used end-to-end speech recognition model, Listen-Attend-Spell (LAS). The proposed methods use multi-task learning to improve generalization of the model by leveraging information from multiple labels. T

Adaptive Normalization For Forecasting Limit Order Book Data Using Convolutional Neural Networks

00:13:40

0 views

Deep learning models are capable of achieving state-of-the-art performance on a wide range of time series analysis tasks. However, their performance crucially depends on the employed normalization scheme, while they are usually unable to efficiently handl

Image Super-Resolution Using Residual Global Context Network

00:11:52

0 views

Recent studies have shown that convolutional neural networks (CNNs) can effectively improve the performance of single image super-resolution (SR). However, previous methods rarely considered long-range dependencies between pixels and channel-wise interdep

A Gated Hypernet Decoder For Polar Codes

00:12:28

0 views

Hypernetworks were recently shown to improve the performance of message passing algorithms for decoding error correcting codes. In this work, we demonstrate how hypernetworks can be applied to decode polar codes by employing a new formalization of the pol

A Novel Method For Obtaining Diffuse Field Measurements For Microphone Calibration

[2 Videos]

We propose a straightforward and cost-effective method to perform diffuse soundfield measurements for calibrating the magnitude response of a microphone array. Typically, such calibration is performed in a diffuse soundfield created in reverberation chamb

A Novel Method For Obtaining Diffuse Field Measurements For Microphone Calibration

00:14:44

0 views

A Novel Method For Obtaining Diffuse Field Measurements For Microphone Calibration

00:05:48

0 views

NOVELTY OF THE DEMO: Is it possible to obtain a diffused field response of a microphone array and perform calibration in under a minute? If such a method exists, is it possible to achieve an accuracy of half a dB from the expected response? The answer to

Shape From Bandwidth: Central Projection Case

00:12:03

0 views

Consider an unknown surface painted with a band-limited texture. We show that only the knowledge of the bandwidth of the texture is enough to estimate the shape of the surface from a single image taken by a camera. We model the problem as a central projec
