- IEEE Member: US $11.00
- Society Member: US $0.00
- IEEE Student Member: US $11.00
- Non-IEEE Member: US $15.00
Zero-Shot Multi-Speaker Text-To-Speech With State-Of-The-Art Neural Speaker Embeddings
While speaker adaptation for end-to-end speech synthesis using speaker embeddings can produce good speaker similarity for speakers seen during training, there remains a gap for zero-shot adaptation to unseen speakers. We investigate multi-speaker modeling
SRZoo: An Integrated Repository For Super-Resolution Using Deep Learning
Deep learning-based image processing algorithms, including image super-resolution methods, have been proposed with significant improvement in performance in recent years. However, their implementations and evaluations are dispersed in terms of various dee
Detection And Analysis Of T/D Deletion In LibriSpeech
In this study, we developed a new method for automatic identification of t/d deletion. Our method achieved 94% accuracy on TIMIT and 87% on human-annotated data from LibriSpeech. We then conducted an analysis of t/d deletion on more than half a million
Multi-Task Learning For Speaker Verification And Voice Trigger Detection
Automatic speech transcription and speaker recognition are usually treated as separate tasks even though they are interdependent. In this study, we investigate training a single network to perform both tasks jointly. We train the network in a supervised m
Alternative Half-Sample Interpolation Filters For Versatile Video Coding
To reduce the residual energy of a video signal, motion compensated prediction with fractional-sample accuracy has been successfully employed in modern video coding technology. In contrast to the fixed quarter-sample motion vector resolution for the luma
Reversal No Longer Matters: Attention-Based Arrhythmia Detection With Lead-Reversal ECG Data
In this paper, we propose an attention-based multi-scale neural network for arrhythmia detection with lead-reversal electrocardiogram data. An electrocardiogram with a set of 12 waveforms (known as a 12-lead ECG) measures myocardial electrophysiological activit
Message Transmission Through Underspread Time-Varying Linear Channels
It is common to model rapidly varying communication channels by time-varying linear systems. The output of a time-varying linear system can be described by a superposition of time-frequency (delay-Doppler) shifts of the input signal. This paper investigat
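The delay-Doppler description in this abstract can be illustrated with a minimal sketch. It assumes a discrete-time signal, a finite list of paths, integer delays applied circularly, and Doppler shifts given in cycles per sample; the function name `apply_channel` and the path format are hypothetical and not taken from the paper.

```python
import cmath


def apply_channel(x, paths):
    """Superpose delay-Doppler shifted copies of the input signal x.

    paths is a list of (gain, delay, doppler) tuples (an assumed format):
    delay is an integer number of samples, applied circularly for simplicity,
    and doppler is a frequency shift in cycles per sample.
    """
    n_samples = len(x)
    y = [0j] * n_samples
    for gain, delay, doppler in paths:
        for n in range(n_samples):
            shifted = x[(n - delay) % n_samples]             # time (delay) shift
            phase = cmath.exp(2j * cmath.pi * doppler * n)   # frequency (Doppler) shift
            y[n] += gain * shifted * phase
    return y
```

A single path with unit gain, zero delay, and zero Doppler returns the input unchanged; adding more tuples to `paths` superposes further shifted copies.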
The Compressed Nested Array For Underdetermined DOA Estimation By Fourth-Order Difference Coarray
In this paper, a new sparse array structure is proposed for fourth-order cumulant based direction of arrival (DOA) estimation; it further improves the degrees of freedom (DOFs) and enhances the DOA estimation performance. The new-formed array is h
Improved Large-Margin Softmax Loss For Speaker Diarisation
Speaker diarisation systems nowadays use embeddings generated from speech segments in a bottleneck layer, which need to be discriminative for unseen speakers. It is well known that large-margin training can improve the generalisation ability to unse
Conditional Mutual Information Neural Estimator
Several recent works in communication systems have proposed to leverage the power of neural networks in the design of encoders and decoders. In this approach, these blocks can be tailored to maximize the transmission rate based on aggregated samples from
Propeller Noise Detection With Deep Learning
Due to the complexity of environment and source modelling, underwater target detection is a rather challenging task. In the Deep Learning community, many attempts were made to deal with this problem, mainly through expert features, but few assessed the be
Approximate Bayesian Computation With The Sliced-Wasserstein Distance
Approximate Bayesian Computation (ABC) is a popular method for approximate inference in generative models with intractable but easy-to-sample likelihood. It constructs an approximate posterior distribution by finding parameters for which the simulated dat
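As a rough illustration of the two ideas named in this entry, the sketch below pairs plain rejection ABC with a Monte Carlo estimate of an order-1 sliced-Wasserstein distance between equal-size samples. It is a generic textbook-style sketch, not the paper's algorithm; the function names, the acceptance threshold `eps`, and the user-supplied `simulate` and `prior_sample` callables are all assumptions.

```python
import math
import random


def wasserstein_1d(u, v):
    """Order-1 Wasserstein distance between equal-size 1-D samples (via sorting)."""
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)


def sliced_wasserstein(xs, ys, n_proj=50, rng=None):
    """Average 1-D Wasserstein distance over random unit-vector projections."""
    rng = rng or random.Random(0)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_proj):
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in direction))
        direction = [c / norm for c in direction]
        proj_x = [sum(a * b for a, b in zip(p, direction)) for p in xs]
        proj_y = [sum(a * b for a, b in zip(p, direction)) for p in ys]
        total += wasserstein_1d(proj_x, proj_y)
    return total / n_proj


def abc_rejection(observed, simulate, prior_sample, eps, n_draws, rng):
    """Keep prior draws whose simulated data lie within eps of the observations."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        simulated = simulate(theta, rng)
        if sliced_wasserstein(observed, simulated, rng=rng) <= eps:
            accepted.append(theta)
    return accepted
```

The accepted parameters form a sample from the approximate posterior; a smaller `eps` tightens the approximation at the cost of fewer accepted draws.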
Learning Signed Graphs From Data
Signed graphs have recently been found to offer advantages over unsigned graphs in a variety of tasks. However, the problem of learning graph topologies has only been considered for the unsigned case. In this paper, we propose a conceptually simple and fl
Sampling Strategies For GAN Synthetic Data
Generative Adversarial Networks (GANs) have been used widely to generate large volumes of synthetic data. This data is used to augment real examples when training deep Convolutional Neural Networks (CNNs). Studies have shown that th
Automatic Data Augmentation Via Deep Reinforcement Learning For Effective Kidney Tumor Segmentation
Conventional data augmentation realized by performing simple pre-processing operations (e.g., rotation, crop, etc.) has been shown to improve the performance of medical image segmentation. However, the data generated by these conv
Robust Phase Retrieval With Outliers
An outlier-resistant phase retrieval algorithm based on the alternating direction method of multipliers (ADMM) is devised in this paper. Instead of the widely used least squares criterion that is only optimal in Gaussian noise environments, we adopt the leas
Relative Cost Based Model Selection For Sparse High-Dimensional Linear Regression Models
In this paper, we propose a novel model selection method named multi-beta-test (MBT) for the sparse high-dimensional linear regression model. The estimation of the correct subset in the linear regression problem is formulated as a series of hypothesis tes
A Neural Network-Based Spike Sorting Feature Map That Resolves Spike Overlap In The Feature Space
When an electrode array is inserted in the brain, its electrodes record so-called 'spikes', which are generated by the neurons in the neighbourhood of the array. Spike sorting is the process of detecting and assigning these recorded spikes to their puta
Accurate And Scalable Version Identification Using Musically-Motivated Embeddings
The version identification (VI) task deals with the automatic detection of recordings that correspond to the same underlying musical piece. Despite many efforts, VI is still an open problem, with much room for improvement, especially with regard to combini
Blind Inference Of Centrality Rankings From Graph Signals
We study the blind centrality ranking problem, where our goal is to infer the eigenvector centrality ranking of nodes solely from nodal observations, i.e., without information about the topology of the network. We formalize these nodal observations as gra
Unsupervised Image-To-Image Translation Via Fair Representation Of Gender Bias
Fairness has become a critical issue in computer vision for reducing discriminatory factors in various systems. Among computer vision tasks, image-to-image translation for facial attribute editing can yield discriminatory results. The unexpected gender changed
Better Safe Than Sorry: Risk-Aware Nonlinear Bayesian Estimation
Despite the simplicity and intuitive interpretation of minimum mean squared error (MMSE) estimators, their effectiveness in certain scenarios is questionable. Indeed, minimizing squared errors on average does not provide any form of stability, as the vola
Mixture Factorized Auto-Encoder For Unsupervised Hierarchical Deep Factorization Of Speech Signal
A speech signal is composed of various informative factors, such as linguistic content and speaker characteristics. There have been notable recent studies attempting to factorize speech signals into these individual factors without requirin
Active Semi-Supervised Learning For Diffusions On Graphs
Diffusion-based semi-supervised learning on graphs consists of diffusing labeled information of a few nodes to infer the labels on the remaining ones. The performance of these methods heavily relies on the initial labeled set, which is either generated ra
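The diffusion step described in this abstract can be sketched as follows. This is one common variant (a random-walk diffusion with restart to the labeled seeds), chosen here for illustration; the function name `diffuse_labels`, the restart weight `alpha`, and the iteration count are assumptions and do not reflect the paper's active sampling strategy.

```python
def diffuse_labels(neighbors, seeds, n_classes, alpha=0.5, n_iters=20):
    """Spread seed labels over an undirected graph and predict every node.

    neighbors: dict mapping each node to a list of its neighbors.
    seeds: dict mapping the few labeled nodes to their class index.
    Returns a dict mapping each node to its predicted class.
    """
    nodes = sorted(neighbors)
    # One-hot class scores for labeled nodes, all zeros for unlabeled ones.
    seed_scores = {
        v: [1.0 if seeds.get(v) == c else 0.0 for c in range(n_classes)]
        for v in nodes
    }
    scores = {v: list(seed_scores[v]) for v in nodes}
    for _ in range(n_iters):
        new_scores = {}
        for v in nodes:
            nbrs = neighbors[v]
            new_scores[v] = [
                # Average the neighbors' scores, then pull back toward the seeds.
                alpha * (sum(scores[u][c] for u in nbrs) / len(nbrs) if nbrs else 0.0)
                + (1 - alpha) * seed_scores[v][c]
                for c in range(n_classes)
            ]
        scores = new_scores
    return {v: max(range(n_classes), key=lambda c: scores[v][c]) for v in nodes}
```

On a four-node path graph with the two end nodes labeled, each interior node takes the label of the nearer end, which shows why the choice of the initial labeled set matters.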
Towards An Efficient And General Framework Of Robust Training For Graph Neural Networks
Graph Neural Networks (GNNs) have made significant advances on several fundamental inference tasks. As a result, there is a surge of interest in using these models for making potentially important decisions in high-regret applications. However, despite GN
Gyroscope Aided Video Stabilization Using Nonlinear Regression On Special Orthogonal Group
This paper presents a novel approach for gyroscope-aided video stabilization. The raw 3D rotational motion captured by a gyroscope is smoothed through nonlinear regression directly on the Special Orthogonal Group. Instead of solving a variat
Acoustic Model Adaptation For Presentation Transcription And Intelligent Meeting Assistant Systems
We present our solution for unsupervised rapid speaker adaptation in a state-of-the-art presentation and intelligent meeting transcription system. We adopt the Kullback-Leibler (KL) divergence regularized model adaptation paradigm. For the adaptation architec
SSGD: Sparsity-Promoting Stochastic Gradient Descent Algorithm For Unbiased DNN Pruning
While deep neural networks (DNNs) have achieved state-of-the-art results in many fields, they are typically over-parameterized. Parameter redundancy, in turn, leads to inefficiency. Sparse signal recovery (SSR) techniques, on the other hand, find compact
Training ASR Models By Generation Of Contextual Information
Supervised ASR models have reached unprecedented levels of accuracy, thanks in part to ever-increasing amounts of labelled training data. However, in many applications and locales, only moderate amounts of data are available, which has led to a surge in s
Speech-Based Parameter Estimation Of An Asymmetric Vocal Fold Oscillation Model And Its Application In Discriminating Vocal Fold Pathologies
So far, several physical models have been proposed for the study of vocal fold oscillations during phonation. The parameters of these models, such as vocal fold elasticity, resistance, etc. are traditionally determined through the observation and measurem
Deep Clustering For Domain Adaptation
We address the heterogeneous domain adaptation task: adapting a classifier trained on data from one domain to operate on another domain that also has a different label space. We consider two settings that both exhibit label scarcity of some form---one whe
Sparse Recovery With Non-Linear Fourier Features
Random non-linear Fourier features have recently shown remarkable performance in a wide range of regression and classification applications. Motivated by this success, this article focuses on a sparse non-linear Fourier feature (NFF) model. We provide a c
Audio-Attention Discriminative Language Model For ASR Rescoring
End-to-end approaches for automatic speech recognition benefit from modeling the probability of the word sequence given the input audio stream directly in a single neural network. However, compared to conventional ASR systems, these models typically requi
Image Recovery From Rotational And Translational Invariants
We introduce a framework for recovering an image from its rotationally and translationally invariant features based on autocorrelation analysis. This work is an instance of the multi-target detection statistical model, which is mainly used to study the ma
Attention-Based Gated Scaling Adaptive Acoustic Model For CTC-Based Speech Recognition
In this paper, we propose a novel adaptive technique that uses an attention-based gated scaling (AGS) scheme to improve deep feature learning for connectionist temporal classification (CTC) acoustic modeling. In AGS, the outputs of each hidden layer of th
A Large-Scale Deep Architecture For Personalized Grocery Basket Recommendations
With growing consumer adoption of online grocery shopping through platforms such as Amazon Fresh, Instacart, and Walmart Grocery, there is a pressing business need to provide relevant recommendations throughout the customer journey. In this paper, we intr
A Learning Approach To Cooperative Communication System Design
The cooperative relay network is a type of multi-terminal communication system. We present in this paper a Neural Network (NN)-based autoencoder (AE) approach to optimize its design. This approach implements a classical three-node cooperative system as on
Graph Construction From Data By Non-Negative Kernel Regression
Data-driven graph constructions are often used in machine learning applications. However, learning an optimal graph from data is still a challenging task. $K$-nearest neighbor and $\epsilon$-neighborhood methods are among the most common graph constructio
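The two baseline constructions named in this abstract can be sketched in a few lines. The sketch assumes Euclidean distance and points given as coordinate tuples; the function names are hypothetical, and it shows only the common baselines, not the paper's non-negative kernel regression method.

```python
import math


def knn_graph(points, k):
    """Connect each point to its k nearest neighbors (Euclidean distance)."""
    graph = {}
    for i, p in enumerate(points):
        # Sort all other points by distance to p; ties break on index.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph[i] = [j for _, j in dists[:k]]
    return graph


def epsilon_graph(points, eps):
    """Connect each point to every other point within distance eps."""
    return {
        i: [j for j, q in enumerate(points) if j != i and math.dist(p, q) <= eps]
        for i, p in enumerate(points)
    }
```

Note that the k-nearest neighbor relation is not symmetric in general, and an isolated point gets no edges under the epsilon rule; both results depend on a single hand-picked parameter.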
QoS-Aware Flow Control For Power-Efficient Data Center Networks With Deep Reinforcement Learning
Reducing the power consumption and maintaining the Flow Completion Time (FCT) for the Quality of Service (QoS) of applications in Data Center Networks (DCNs) are two major concerns for data center operators. However, existing works either fail in guarante
Latent Fused Lasso
The fused lasso norm is classically adopted to model sparse piecewise constant signals; however, it is not the convex hull of the best representation of such simultaneously structured signals. In this paper, we propose a convex variational norm for better model
A Linear Time Partitioning Algorithm For Frequency Weighted Impurity Functions
Partitioning algorithms play a key role in machine learning, signal processing, and communications. They are used in many well-known NP-hard problems such as k-means clustering and vector quantization. The goodness of a partition scheme is measured by a g
Low-Rank Toeplitz Matrix Estimation Via Random Ultra-Sparse Rulers
We study how to estimate a nearly low-rank Toeplitz covariance matrix T from compressed measurements. Recent work of Qiao and Pal addresses this problem by combining sparse rulers (sparse linear arrays) with frequency finding (sparse Fourier transform) al
Automotive Collision Risk Estimation Under Cooperative Sensing
This paper offers a technique for estimating collision risk for automated ground vehicles engaged in cooperative sensing. The technique allows quantification of (i) risk reduced due to cooperation, and (ii) the increased accuracy of risk assessment due to
A Practical Two-Stage Training Strategy For Multi-Stream End-To-End Speech Recognition
The multi-stream paradigm of audio processing, in which several sources are simultaneously considered, has been an active research area for information fusion. Our previous study offered a promising direction within end-to-end automatic speech recognition
Towards High-Performance Object Detection: Task-Specific Design Considering Classification And Localization Separation
Object detection performs two tasks (classification and localization) simultaneously. The two tasks share a similarity: they need robust features that effectively represent the visual appearance of the objects. However, the two tasks also have different propertie
Byzantine-Robust Decentralized Stochastic Optimization
In this paper, we consider the Byzantine-robust stochastic optimization problem defined over a decentralized network, where the agents collaboratively minimize the summation of expectations of stochastic local cost functions, but some of the agents are un
Stochastic ADMM For Byzantine-Robust Distributed Learning
In this paper, we aim at solving a distributed machine learning problem under Byzantine attacks. In the distributed system, a number of workers (termed Byzantine workers) could send arbitrary messages to the master and bias the learning process, due to
Synthetic Crowd And Pedestrian Generator For Deep Learning Problems
Deep neural networks (DNNs) dominate the state-of-the-art results in computer vision (CV) and other fields. One of the primary reasons why DNNs outperform existing algorithms is that they produce superior results when more labelled data are used, unlike classi
A Deep Multimodal Approach For Map Image Classification
Map images (e.g., illustrated maps, historical maps, and geographic maps) have been published around the world, not only to give locations but also to attract tourists or hand down the histories of locations. The management of map data, however, has bee