- IEEE Member: US $11.00
- Society Member: US $0.00
- IEEE Student Member: US $11.00
- Non-IEEE Member: US $15.00
Libri-Adapt: A New Speech Dataset For Unsupervised Domain Adaptation
This paper introduces a new dataset, Libri-Adapt, to support unsupervised domain adaptation research on speech recognition models. Built on top of the LibriSpeech corpus, Libri-Adapt contains 7200 hours of English speech recorded on mobile and embedded-sc
Accurate Semidefinite Relaxation Method For 3-D Rigid Body Localization Using AOA
This paper addresses the rigid body localization problem using angle-of-arrival measurements. We formulate the problem as a constrained weighted least squares (CWLS) minimization problem with the rotation matrix and position vector as variables, which is
Attention Guided Region Division For Crowd Counting
Crowd counting has drawn more and more attention in computer vision. There are two mainstream approaches to deal with crowd counting tasks, regression and detection. Regression-based methods usually overestimate the count in sparse areas, while detection-
Disentangled Speech Embeddings Using Cross-Modal Self-Supervision
The objective of this paper is to learn representations of speaker identity without access to manually annotated data. To do so, we develop a self-supervised learning objective that exploits the natural cross-modal synchrony between faces and audio in vid
A Learning Approach To Cooperative Communication System Design
The cooperative relay network is a type of multi-terminal communication system. We present in this paper a Neural Network (NN)-based autoencoder (AE) approach to optimize its design. This approach implements a classical three-node cooperative system as on
Learning Recurrent Neural Network Language Models With Context-Sensitive Label Smoothing For Automatic Speech Recognition
Recurrent neural network language models (RNNLMs) have become very successful in many natural language processing tasks. However, RNNLMs trained with a cross entropy loss function and hard output targets are prone to overfitting, which weakens the languag
End-To-End Multi-Talker Overlapping Speech Recognition
In this paper we present an end-to-end speech recognition system that can recognize single-channel speech where multiple talkers can speak at the same time (overlapping speech) by using a neural network model based on Recurrent Neural Network Transducer (
Multi-Polarization Information Fusion For Object Contour Display In Passive Millimeter-Wave And Terahertz Security Imaging
Passive millimeter-wave/terahertz (PMMW/PTW) imaging has been widely developed for personal security screening in recent years. In PMMW/PTW images, object contours are an important feature for object detection and recognition. In this paper, a physical-ba
Intra Frame Rate Control For Versatile Video Coding With Quadratic Rate-Distortion Modelling
With numerous coding tools adopted in the forthcoming Versatile Video Coding (VVC) standard, much less work has been dedicated to study the corresponding Rate-Distortion (R-D) characteristics. This paper proposes a new quadratic R-D model for Versatile Vi
Tensor-To-Vector Regression For Multi-Channel Speech Enhancement Based On Tensor-Train Network
We propose a tensor-to-vector regression approach to multi-channel speech enhancement in order to address the issue of input size explosion and hidden-layer size expansion. The key idea is to cast the conventional deep neural network (DNN) based vector-to
Adaptive Region Aggregation Network: Unsupervised Domain Adaptation With Adversarial Training For ECG Delineation
Electrocardiogram (ECG) delineation, which provides clinically useful information for the diagnosis of cardiovascular disease, is an essential task in automated ECG analysis. The discrepancies among ECG signals from different datasets, namely domain shift
Toward Better Speaker Embeddings: Automated Collection Of Speech Samples From Unknown Distinct Speakers
The accuracy of speaker verification and diarization models depends on the quality of the speaker embeddings used to separate audio samples from different speakers. With the goal of training better embedding models, we devise an automatic pipeline for l
Joint Blind Calibration And Time-Delay Estimation For Multiband Ranging
In this paper, we focus on the problem of blind joint calibration of multiband transceivers and time-delay (TD) estimation of multipath channels. We show that this problem can be formulated as a particular case of covariance matching. Although this proble
From Symbols To Signals: Symbolic Variational Autoencoders
We introduce Symbolic Variational Autoencoders which generate images from symbols that represent semantic concepts. Unlike generic Variational Autoencoders (VAEs) or Generative Adversarial Networks (GANs), the latent distribution from the Symbolic Variati
Federating Solar, Storage And Communications In The Electric Grid And Internet Of Things
A futuristic infrastructure model is envisioned with distributed modules that can produce solar energy, have a storage system and provide services of lighting, electric-vehicle charging and communications. A stochastic model is formulated for the solar po
Channel Charting: An Euclidean Distance Matrix Completion Perspective
Channel charting (CC) is an emerging machine learning framework that aims at learning lower-dimensional representations of the radio geometry from collected channel state information (CSI) in an area of interest, such that spatial relations of the represe
Image Processing In DNA
The main obstacles for the practical deployment of DNA-based data storage platforms are the prohibitively high cost of synthetic DNA and the large number of errors introduced during synthesis. In particular, synthetic DNA products contain both individual
Joint Optimization Of Sampling Patterns And Deep Priors For Improved Parallel MRI
Multichannel imaging techniques are widely used in MRI to reduce the scan time. These schemes typically perform undersampled acquisition and utilize compressed-sensing based regularized reconstruction algorithms. Model-based deep learning (MoDL) framework
Speaker Independence Of Neural Vocoders And Their Effect On Parametric Resynthesis Speech Enhancement
Traditional speech enhancement systems produce speech with compromised quality. Here we propose to use the high quality speech generation capability of neural vocoders for better quality speech enhancement. We term this parametric resynthesis (PR). In pre
An Unsupervised Retinal Vessel Extraction And Segmentation Method Based On A Tube Marked Point Process Model
Retinal vessel extraction and segmentation is essential for supporting diagnosis of eye-related diseases. In recent years, deep learning has been applied to vessel segmentation and achieved excellent performance. However, these supervised methods require
Graph Construction From Data By Non-Negative Kernel Regression
Data-driven graph constructions are often used in machine learning applications. However, learning an optimal graph from data is still a challenging task. $K$-nearest neighbor and $\epsilon$-neighborhood methods are among the most common graph constructio
Submodular Rank Aggregation On Score-Based Permutations For Distributed Automatic Speech Recognition
Distributed automatic speech recognition (ASR) requires aggregating the outputs of distributed deep neural network (DNN)-based models. This work studies the use of submodular functions to design a rank aggregation on score-based permutations, which can be us
Two-Step Sound Source Separation: Training On Learned Latent Targets
In this paper, we propose a two-step training procedure for source separation via a deep neural network. In the first step we learn a transform (and its inverse) to a latent space where masking-based separation performance using oracles is optimal. For t
Blind Multi-Spectral Image Pan-Sharpening
We address the problem of sharpening low spatial-resolution multi-spectral (MS) images with their associated misaligned high spatial-resolution panchromatic (PAN) image, based on priors on the spatial blur kernel and on the cross-channel relationship. In
Unified Signal Compression Using Generative Adversarial Networks
We propose a unified compression framework that uses generative adversarial networks (GAN) to compress image and speech signals. The compressed signal is represented by a latent vector fed into a generator network which is trained to produce high quality
A Dual-Staged Context Aggregation Method Towards Efficient End-To-End Speech Enhancement
In speech enhancement, an end-to-end deep neural network converts a noisy speech signal to a clean speech directly in time domain without time-frequency transformation or mask estimation. However, aggregating contextual information from a high-resolution
Multi-Microphone Complex Spectral Mapping For Speech Dereverberation
This study proposes a multi-microphone complex spectral mapping approach for speech dereverberation on a fixed array geometry. In the proposed approach, a deep neural network (DNN) is trained to predict the real and imaginary (RI) components of direct sou
Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis
This paper proposes a hierarchical, fine-grained and interpretable latent variable model for prosody based on the Tacotron 2 text-to-speech model. It achieves multi-resolution modeling of prosody by conditioning finer level representations on coarser leve
Addressing Challenges In Building Web-Scale Content Classification Systems
Understanding the semantic meaning of content on the web through the lens of a taxonomy has many practical advantages. However, when building large-scale content classification systems, practitioners are faced with unique challenges involving finding the
Low-Complexity Fixed-Point Convolutional Neural Networks For Automatic Target Recognition
There has been growing interest in developing neural network based automatic target recognition systems for synthetic aperture radar applications. However, these networks are typically complex in terms of storage and computation which inhibits their deplo
An Online Kernel Scalar Quantization Scheme For Signal Classification
Distributed relay networking is one way of enabling connectivity between users that lack the necessary infrastructure to communicate with each other. An important advantage of such networks is the restoration of wireless communication coverage in the case
Robust Parameter Estimation Of Contaminated Damped Exponentials
Parameter estimation of damped exponential signals has wide applications including fault detection and system parameter identification, etc. However, existing methods for estimating parameters of damped exponentials are either sensitive to noise or restri
A Generalization Of Principal Component Analysis
Conventional principal component analysis (PCA) finds a principal vector that maximizes the sum of second powers of principal components. We consider a generalized PCA that aims at maximizing the sum of an arbitrary convex function of principal components
Matching Pursuit Based Dynamic Phase-Amplitude Coupling Measure
Long-distance neuronal communication in the brain is enabled by the interactions across various oscillatory frequencies. One interaction that is gaining importance during cognitive brain functions is phase amplitude coupling (PAC), where the phase of a sl
Training Deep Spiking Neural Networks For Energy-Efficient Neuromorphic Computing
Spiking Neural Networks (SNNs) encode input information temporally using sparse spiking events, which can be harnessed to achieve higher computational efficiency. However, considering the rapid strides in accuracy enabled by Analog Neural Networks (ANNs),
Semi-Supervised Learning Of Processes Over Multi-Relational Graphs
Semi-supervised learning (SSL) of dynamic processes over graphs is encountered in several applications of network science. Most of the existing approaches are unable to handle graphs with multiple relations, which arise in various real-world networks. Thi
Detection Of Adversarial Attacks And Characterization Of Adversarial Subspace
Adversarial attacks have always been a serious threat for any data-driven model. In this paper, we explore subspaces of adversarial examples in unitary vector domain, and we propose a novel detector for defending our models trained for environmental sound
Latent Fused Lasso
The fused lasso norm is classically adopted to model sparse piecewise constant signals; however, it is not the convex hull of the best representation of such simultaneously structured signal. In this paper, we propose a convex variational norm for better model
Improving The Scalability Of Deep Reinforcement Learning-Based Routing With Control On Partial Nodes
Machine Learning (ML)-based routing optimization has been proposed to optimize the performance of flow routing for future networks, such as Software-Defined Networks (SDNs). However, existing studies are either hard to converge for large networks or vulne
Frame-Level Phoneme-Invariant Speaker Embedding For Text-Independent Speaker Recognition On Extremely Short Utterances
This paper investigates a phoneme-invariant speaker embedding approach for speaker recognition on extremely short utterances. Intuitively, phonemes are nuisance information for text-independent speaker recognition task since the contents of the speech are
Channel Attention Based Generative Network For Robust Visual Tracking
In recent years, Siamese trackers have achieved great success in visual tracking. Siamese networks can achieve competitive performance in both accuracy and speed. However, they may suffer from the performance degradation due to the case of large pose vari
Label Reuse For Efficient Semi-Supervised Learning
In this paper, we propose a new learning strategy for semi-supervised deep learning algorithms, called label reuse, aiming to significantly reduce the expensive computational cost of pseudo label generation and the like for each unlabeled training instanc
Rate-Invariant Autoencoding Of Time-Series
For time-series classification and retrieval applications, an important requirement is to develop representations/metrics that are robust to re-parametrization of the time-axis. Temporal re-parametrization as a model can account for variability in the und
QoS-Aware Flow Control For Power-Efficient Data Center Networks With Deep Reinforcement Learning
Reducing the power consumption and maintaining the Flow Completion Time (FCT) for the Quality of Service (QoS) of applications in Data Center Networks (DCNs) are two major concerns for data center operators. However, existing works either fail in guarante
Super-Resolution Of 3D Color Point Clouds Via Fast Graph Total Variation
3D point clouds acquired by low-cost sensors are often in lower spatial resolutions than desired for rendering images on high-resolution displays. In this paper, we propose a fast super-resolution (SR) algorithm for color 3D point clouds. We first populat
Truth-To-Estimate Ratio Mask: A Post-Processing Method For Speech Enhancement Direct At Low Signal-To-Noise Ratios
This study proposes a bi-directional recurrent neural network (Bi-RNN) post-processing method for speech enhancement (SE) at low signal-to-noise ratios (SNR). Current speech enhancement solutions perform poorly under low-SNR conditions. Loizou and Kim pr