IEEE ICASSP 2020 Virtual Conference May 2020

Thu, 16 July, 2020

Showing 1801 - 1850 of 1951

IEEE Member: US $11.00
Society Member: US $0.00
IEEE Student Member: US $11.00
Non-IEEE Member: US $15.00

Purchase

A Return To Dereverberation In The Frequency Domain Using A Joint Learning Approach

00:12:53

0 views

Dereverberation is often performed in the time-frequency domain using mostly deep learning approaches. Time-frequency domain processing, however, may not be necessary when reverberation is modeled by the convolution operation. In this paper, we investigat

Adaptation Of RNN Transducer With Text-To-Speech Technology For Keyword Spotting

00:13:57

1 view

With the advent of the recurrent neural network transducer (RNN-T) model, the performance of keyword spotting (KWS) systems has greatly improved. However, KWS systems employed for wake-word detection still rely on the availability of keyword specific tr

Regression Before Classification For Temporal Action Detection

00:12:01

0 views

Action classification combined with location regression is a widely-utilized mechanism in existing temporal action detection methods. However, there exists an inconsistency problem between locations and categories of action instances in this mechanism. Mo

Resource Management In The Multibeam NOMA-Based Satellite Downlink

00:15:41

0 views

A beam-free approach to channel allocation in a multi-beam four-color satellite coverage area is taken. Non-Orthogonal Multiple Access (NOMA) and Orthogonal Multiple Access (OMA) are compared as methods to serve users not necessarily located on the refere

IQ-STAN: Image Quality Guided Spatio-Temporal Attention Network For License Plate Recognition

00:13:55

0 views

License plate recognition (LPR) is one of the essential components in intelligent transportation systems. Although the image processing algorithms for LPR have been extensively studied in the past several years, the recognition performance is still not sa

A Unified Sequence-To-Sequence Front-End Model For Mandarin Text-To-Speech Synthesis

00:15:40

0 views

In a Mandarin text-to-speech (TTS) system, the front-end text processing module significantly influences the intelligibility and naturalness of synthesized speech. Building a typical pipeline-based front-end which consists of multiple individual components

Unsupervised Key Hand Shape Discovery Of Sign Language Videos With Correspondence Sparse Autoencoders

00:12:03

0 views

Recognition of sign language is a difficult task which often requires tedious annotations by sign language experts. End-to-end learning attempts that bypass frame level annotations have achieved some success in limited datasets, but it has been shown that

Self-Supervised Learning For Audio-Visual Speaker Diarization

00:12:23

0 views

Speaker diarization, which is to find the speech segments of specific speakers, has been widely used in human-centered applications such as video conferences or human-computer interaction systems. In this paper, we propose a self-supervised audio-video sy

Balanced Binary Neural Networks With Gated Residual

00:12:16

0 views

Binary neural networks have attracted considerable attention in recent years. However, mainly due to the information loss stemming from biased binarization, preserving the accuracy of networks remains a critical issue. In this paper, we attempt

Robust Speaker Recognition Using Unsupervised Adversarial Invariance

00:11:47

0 views

In this paper, we address the problem of speaker recognition in challenging acoustic conditions using a novel method to extract robust speaker-discriminative speech representations. We adopt a recently proposed unsupervised adversarial invariance architec

Text Adaptation For Speaker Verification With Speaker-Text Factorized Embeddings

00:12:02

1 view

Text mismatch between pre-collected data, either training data or enrollment data, and the actual test data can significantly hurt text-dependent speaker verification (SV) system performance. Although this problem can be solved by carefully collecting dat

Multi-View Clustering Via Mixed Embedding Approximation

00:12:21

0 views

This paper tackles multi-view clustering by proposing a novel mixed embedding approximation (MEA) method. Formally, we aim to learn a uniform orthogonal embedding based on the orthogonal pre-embeddings of each view. At first, we hope that the uniform emb

Multilinear Generalized Singular Value Decomposition (ML-GSVD) With Application To Coordinated Beamforming In Multi-User MIMO Systems

00:14:31

0 views

In this paper, we propose a new Multilinear Generalized Singular Value Decomposition (ML-GSVD) which allows a set of matrices with one common dimension to be jointly factorized. The ML-GSVD is an extension of the Generalized Singular Value Decomposition (GSVD

WInD: Wasserstein Inception Distance For Evaluating Generative Adversarial Network Performance

00:09:32

0 views

In this paper, we present Wasserstein Inception Distance (WInD), a novel metric for evaluating performance of Generative Adversarial Networks (GANs). The proposed metric extends the rationale of the previously proposed Fréchet Inception Distance (FID),

GCI Detection From Raw Speech Using A Fully-Convolutional Network

00:13:34

0 views

Glottal Closure Instants (GCI) detection consists in automatically detecting temporal locations of most significant excitation of the vocal tract from the speech signal. It is used in many speech analysis and processing applications, and various algorithm

Oh, Jeez! Or Uh-Huh? A Listener-Aware Backchannel Predictor On ASR Transcriptions

00:12:04

0 views

This paper presents our latest investigation on modeling backchannel in conversations. Motivated by a proactive backchanneling theory, we aim at developing a system which acts as a proactive listener by inserting backchannels, such as continuers and asses

GraphTTS: Graph-To-Sequence Modelling In Neural Text-To-Speech

00:12:18

528 views

This paper leverages the graph-to-sequence method in neural text-to-speech (GraphTTS), which maps the graph embedding of the input sequence to spectrograms. The graphical inputs consist of node and edge representations constructed from input texts. The en

IndyLSTMs: Independently Recurrent LSTMs

00:14:56

0 views

We introduce Independently Recurrent Long Short-term Memory cells: IndyLSTMs. These differ from regular LSTM cells in that the recurrent weights are not modeled as a full matrix, but as a diagonal matrix, i.e. the output and state of each LSTM cell depend

Bipartite Belief Propagation Polar Decoding With Bit-Flipping

00:12:22

0 views

For the scenarios with high throughput requirements, the belief propagation (BP) decoding is one of the most promising decoding strategies for polar codes. By pruning the redundant variable nodes (VNs) and check nodes (CNs) in the original factor graph, t

High-Resolution Attention Network With Acoustic Segment Model For Acoustic Scene Classification

00:00:00

703 views

The spectral information of acoustic scenes is diverse and complex, which poses challenges for acoustic scene tasks. To improve the classification performance, a variety of convolutional neural networks (CNNs) are proposed to extract richer semantic infor

Low-Complexity Accurate mmWave Positioning For Single-Antenna Users Based On Angle-Of-Departure And Adaptive Beamforming

00:13:23

0 views

The problem of position estimation of a mobile user equipped with a single antenna receiver using downlink transmissions is addressed. The advantages of this setup compared to the classical MIMO and uplink scenarios are analyzed in terms of achievable the

Analysis Of Acoustic Features For Speech Sound Based Classification Of Asthmatic And Healthy Subjects

00:14:40

0 views

Non-speech sounds (cough, wheeze) are typically known to perform better than speech sounds for asthmatic and healthy subject classification. In this work, we use sustained phonations of speech sounds, namely, /A:/, /i:/, /u:/, /eI/, /oU/, /s/, and /z/ fro

Context And Uncertainty Modeling For Online Speaker Change Detection

00:14:19

0 views

Speaker change detection is often addressed as a key component in speaker diarization systems. In this work we focus on online speaker change detection as a standalone task which is required for online closed captioning of broadcast television. Contrary t

Differentially Modulated Spectrally Efficient Frequency-Division Multiplexing

00:12:03

0 views

This letter proposes a differentially modulated non-orthogonal spectrally efficient frequency-division multiplexing (D-SEFDM) architecture, which allows us to dispense with any pilot overhead needed for channel estimation at the receiver, while increasing

Speech Enhancement Using A Two-Stage Network For An Efficient Boosting Strategy

00:00:01

0 views

A novel neural network architecture, called two-stage network (TSN), with a multi-objective learning (MOL) method for an efficient boosting strategy (BS) is proposed for speech enhancement. BS is an ensemble method using multiple base predictions (MBPs) f

Real-Time Sound Event Detection On The Edge: Porting VGGish On Low-Power IoT Microcontrollers

00:11:12

0 views

Internet of Things (IoT) applications typically require a large number of heterogeneous devices to be distributed in the environment, which can generate large amounts of data for wireless transmission, affecting the energy requirements and lifetime of the

Learning To Transfer Multi-Speaker Emotional Prosody To A Neutral Speaker

00:14:59

417 views

Most recent emotional speech synthesizers have been studied with large amounts of training data. These systems require a sufficient number of audio samples to be recorded for each emotion and each speaker. Acquiring emotional speech is more expensive th

Attentive Item2Vec: Neural Attentive User Representations

00:14:04

0 views

Factorization methods for recommender systems tend to represent users as a single latent vector. However, user behavior and interests may change in the context of the recommendations that are presented to the user. For example, in the case of movie recomm

Supervised Canonical Correlation Analysis Of Data On Symmetric Positive Definite Manifolds By Riemannian Dimensionality Reduction

00:14:10

0 views

Most computer vision problems entail data that reside on Riemannian manifolds. Canonical correlation analysis (CCA) is a powerful method that captures correlations between any two sets of matrices. In this paper, we propose a framework for a supervised CC

Dynamic Oversampling In 1-Bit Quantized Asynchronous Large-Scale Multiple-Antenna Systems For Sustainable IoT Networks

00:21:13

0 views

In this paper, we propose a dynamic oversampling technique for asynchronous large-scale multiple-antenna systems with 1-bit analog-to-digital converters at the base station that is suitable for sustainable internet of things and cellular networks. To the

Conditional Density Driven Grid Design In Point-Mass Filter

00:13:18

0 views

The paper is devoted to the state estimation of nonlinear stochastic dynamic systems. The stress is laid on a grid-based numerical solution to the Bayesian recursive relations using the point-mass filter (PMF). In the paper, a novel conditional density dr

Camera Configuration Design In Cooperative Active Visual 3D Reconstruction: A Statistical Approach

00:13:20

0 views

Visual 3D reconstruction is an essential technique in computer vision which restores the 3D model of the scene from multi-view images. In this paper, we propose a statistical framework for the active visual 3D reconstruction. We first derive a closed-form

A Real Time Implementation Of A Bayer Domain Image Deblurring Core For Optical Blur Compensation

00:12:36

0 views

In this letter, we present an implementation of deblurring hardware to mitigate blur incurred by optical aberrations in a real-time manner to increase resolution for mobile camera modules. As optical aberrations tend to be variant according to spatial loc

Trace Norm Generative Adversarial Networks For Sensor Generation And Feature Extraction

00:12:42

0 views

Generative Adversarial Networks (GANs) have been shown to be effective at generating sufficiently realistic sensor data for industrial failure prediction. Compared to computer vision problems, where it is very common to have more than 1000 classes, the number of classe

A Multichannel Kalman-Based Wiener Filter Approach For Speaker Interference Reduction In Meetings

00:14:47

0 views

Recording a meeting and obtaining clean speech signals of each speaker is a challenging task. Even with a multichannel recording, in which all speakers are equipped with a close-talk microphone, speech of an active speaker still couples not only into his

Simplified Dynamic SC-Flip Polar Decoding

00:14:47

0 views

SC-Flip (SCF) decoding is a low-complexity polar code decoding algorithm alternative to the SC-List (SCL) algorithm with small list sizes. To achieve the performance of the SCL algorithm with large list sizes, the Dynamic SC-Flip (DSCF) algorithm was proposed

Full Reference Video Quality Measures Improvement Using Neural Networks

00:12:39

0 views

The accuracy of video quality metrics (VQMs) is an important issue for several applications. In this work, first we observe that the accuracy of several video quality metrics (VQMs) is strongly related to the spatial complexity index (SI) of the source. I

Non-Uniform Video Time-Lapse Method Based On Motion Scenario And Stabilization Constraint

00:13:29

0 views

Time-lapse of user-captured video has recently become popular in many applications. Non-uniform sampling and digital video stabilization (VS) are usually two independent steps to keep meaningful contents and provide stabilized output. However, non-uniform sa

Federated Learning With Quantization Constraints

00:15:55

0 views

Traditional deep learning models are trained on centralized servers using labeled sample data collected from edge devices. This data often includes private information, which the users may not be willing to share. Federated learning (FL) is an emerging ap

Estimating The Degree Of Sleepiness By Integrating Articulatory Feature Knowledge In Raw Waveform Based CNNs

00:13:21

0 views

Speech-based degree of sleepiness estimation is an emerging research problem. This paper investigates an end-to-end approach, where given raw waveform as input, a convolutional neural network (CNN) estimates at its output the degree of sleepiness. Within

Triplet Loss Feature Aggregation For Scalable Hash

00:14:43

0 views

The increasing demand for high resolution and quality aggravates the heavy burden on cluster storage and on restricted bandwidth resources. Hence, video de-duplication in storage and transmission is becoming an important feature for video cloud

Sequential Semi-Orthogonal Multi-Level NMF With Negative Residual Reduction For Network Embedding

00:13:00

0 views

Network embedding is intended to produce low-dimensional vector representations of nodes in a network to preserve and extract the latent network structure, which has higher robustness to noise, outliers, and redundant data. Although a recently proposed mu

Improved Probability Modelling For Exception Handling In Lossless Screen Content Coding

[2 Videos ]

Competitive methods for lossless screen content coding are based on modelling of probability distributions. The most effective approach for losslessly compressing images with up to 90000 colours is known as 'soft context formation' (SCF). It scans the ima

Improved Probability Modelling For Exception Handling In Lossless Screen Content Coding

00:13:44

0 views

Improved Probability Modelling For Exception Handling In Lossless Screen Content Coding

00:00:00

0 views

Ensemble Network For Ranking Images Based On Visual Appeal

00:12:04

0 views

We propose a computational framework for ranking images (group photos) taken at the same event within a short time span. The ranking is expected to correspond with human perception of overall appeal of the images. We hypothesize (and provide evidence thro

A Framework For The Robust Evaluation Of Sound Event Detection

00:15:00

0 views

This work defines a new framework for performance evaluation of polyphonic sound event detection (SED) systems, which overcomes the limitations of the conventional collar-based event decisions, event F-scores and event error rates. The proposed framework

Compressing Flow Fields With Edge-Aware Homogeneous Diffusion Inpainting

00:13:25

0 views

In spite of the fact that efficient compression methods for dense two-dimensional flow fields would be very useful for modern video codecs, hardly any research has been performed in this area so far. Our paper addresses this problem by proposing the first

Audio Feature Extraction For Vehicle Engine Noise Classification

00:13:50

0 views

In this paper we propose a new scheme for vehicle engine noise classification as a more privacy-preserving alternative to classifying vehicles based on video recordings. We establish two scenarios: diesel vs. petrol and heavy goods vehicle vs. personal ca
