- IEEE MemberUS $11.00
- Society MemberUS $0.00
- IEEE Student MemberUS $11.00
- Non-IEEE MemberUS $15.00
Improving Efficiency In Large-Scale Decentralized Distributed Training
Decentralized Parallel SGD (D-PSGD) and its asynchronous variant Asynchronous Parallel SGD (AD-PSGD) are a family of distributed learning algorithms that have been demonstrated to perform well for large-scale deep learning tasks. One drawback of (A)D-PSGD
Supervised Deep Hashing For Efficient Audio Event Retrieval
Efficient retrieval of audio events can facilitate real-time implementation of numerous query and search-based systems. This work investigates the potency of different hashing techniques for efficient audio event retrieval. Multiple state-of-the-art weak
Multimodal Speaker Diarization Of Real-World Meetings Using D-Vectors With Spatial Features
Deep neural network based audio embeddings (d-vectors) have demonstrated superior performance in audio-only speaker diarization compared to traditional acoustic features such as mel-frequency cepstral coefficients (MFCCs) and i-vectors. However, there has
Language Independent Gender Identification From Raw Waveform Using Multi-Scale Convolutional Neural Networks
In this work, we propose a raw waveform based multi-scale convolution neural network approach for language-independent gender identification. Our approach uses raw audio waveform as input to the 1-dimensional multi-scale convolutional neural network ins
End-To-End Multi-Speaker Speech Recognition With Transformer
Recently, fully recurrent neural network (RNN) based end-to-end models have been proven to be effective for multi-speaker speech recognition in both the single-channel and multi-channel scenarios. In this work, we explore the use of Transformer models for
A Model Of Double Descent For High-Dimensional Logistic Regression
We consider a model for logistic regression where only a subset of features of size $p$ is used for training a linear classifier over $n$ training samples. The classifier is obtained by running gradient-descent (GD) on logistic-loss. For this model, we in
Low-Complexity Levenberg-Marquardt Algorithm For Tensor Canonical Polyadic Decomposition
In this paper, we propose CPD-fLM++, a fast implementation of the Levenberg-Marquardt (LM) algorithm for the tensor canonical polyadic decomposition. The overall algorithmic framework follows exactly the LM approach, which enjoys locally a super-linear co
The Discrete Stockwell Transforms For Infinite-Length Signals And Their Real-Time Implementations
The various forms of the Stockwell transforms (ST) introduced in the literature have been developed for off-line signal processing on finite-length signals. However, in many applications such as audio, medical or radar signal processing, signals to be ana
Graph Vertex Sampling With Arbitrary Graph Signal Hilbert Spaces
Graph vertex sampling set selection aims at selecting a set of vertices of a graph such that the space of graph signals that can be reconstructed exactly from those samples alone is maximal. In this context, we propose to extend sampling set selection bas
Rethinking Retinal Landmark Localization As Pose Estimation: Naive Single Stacked Network For Optic Disk And Fovea Detection
Automatic detection of optic disk and fovea, the two fundamental biological landmarks of the retinal system, is crucial to track the disease progression in a diabetic patient. Recent advances in this direction were mostly limited to applying CNN based net
DNN-Based Speech Recognition For GlobalPhone Languages
This paper describes new reference benchmark results based on hybrid Hidden Markov Model and Deep Neural Networks (HMM-DNN) for the GlobalPhone (GP) multilingual text and speech database. GP is a multilingual database of high-quality read speech with corr
Efficient Bird Sound Detection On The Bela Embedded System
Monitoring wildlife is an important aspect of conservation initiatives. Deep learning detectors can help with this, although it is not yet clear whether they can run efficiently on an embedded system in the wild. This paper proposes an automatic detection
Deep Rainrate Estimation From Highly Attenuated Downlink Signals Of Ground-Based Communications Satellite Terminals
While the use of weather radars to continuously monitor the spatio-temporal dynamics of precipitation has grown in recent years, these systems are expensive and sparsely deployed across the world. In this context, densely located ground-based terminals fo
Estimation Of Information In Parallel Gaussian Channels Via Model Order Selection
We study the problem of estimating the overall mutual information in M independent parallel discrete-time memory-less Gaussian channels from N independent data sample pairs per channel (inputs and outputs). We focus on the case where the number of active
Distributed Tracking And Circumnavigation Using Bearing Measurements
This paper is concerned with the problem of bearings based multi-agent circumnavigation of a maneuvering target. Agents are assumed to have access to their own individual bearing measurements as well as the ones from their immediate neighbors. The aim is
Continual Learning For Infinite Hierarchical Change-Point Detection
Change-point detection (CPD) aims to locate abrupt transitions in the generative model of a sequence of observations. When Bayesian methods are considered, the standard practice is to infer the posterior distribution of the location of change points. Howe
Sound Event Detection In Synthetic Domestic Environments
We present a comparative analysis of the performance of state-of-the-art sound event detection systems. In particular, we study the robustness of the systems to noise and signal degradation, which is known to impact model generalization. Our analysis is b
Instance-Based Model Adaptation For Direct Speech Translation
Despite recent technology advancements, the effectiveness of neural approaches to end-to-end speech-to-text translation is still limited by the paucity of publicly available training corpora. We tackle this limitation with a method to improve data exploit
Optimal Window Design For Joint Spatial-Spectral Domain Filtering Of Signals On The Sphere
We present the optimal design of an azimuthally symmetric window signal for carrying out joint spatial-spectral domain filtering of a spherical (source) signal contaminated by a realization of an anisotropic noise process. The resulting window is used in
Keyword Search For Sign Language
Keyword search is the search for a written query in an archive, which is often assumed to be a collection from a spoken language. Yet, the main languages of the Deaf, i.e. sign languages, are mostly neglected in this definition due to being visual languag
Multichannel Signal Processing For Road Surface Identification
The development of autonomous or semi-autonomous car technology has attracted much attention in recent years. An important aspect of this research is automatic identification of road surfaces, since adjustments can be made to improve the safety of the car
Improving LPCNet-Based Text-To-Speech With Linear Prediction-Structured Mixture Density Network
In this paper, we propose an improved LPCNet vocoder using a linear prediction (LP)-structured mixture density network (MDN). The recently proposed LPCNet vocoder has successfully achieved high-quality and lightweight speech synthesis systems by combining
Information Theoretic Approach For Waveform Design In Coexisting MIMO Radar And MIMO Communications
We investigate waveform design for coexistence between a multiple input multiple-output (MIMO) radar and MIMO communications (MRMC), with a radar-centric criterion that leads to a minimal interference in the communications system. The communications use t
Channel Invariant Speaker Embedding Learning With Joint Multi-Task And Adversarial Training
Using deep neural networks to extract speaker embeddings has significantly improved the speaker verification task. However, such embeddings are still vulnerable to channel variability. Previous works have used adversarial training to suppress channel inform
Effect Of Undersampling On Non-Negative Blind Deconvolution With Autoregressive Filters
This paper considers the problem of blind deconvolution where the input signal is non-negative and sparse, and the unknown convolutional kernel is a first order autoregressive filter. Our objective is to understand if it is possible to recover both the si
Image Fusion Using Joint Sparse Representations And Coupled Dictionary Learning
The image fusion problem consists in combining complementary parts of multiple images captured, for example, with different focal settings into one image of higher quality. This requires the identification of the sharpest areas in sets of input images. Re
Effects Of Spectral Tilt On Listeners' Preferences And Intelligibility
High intelligibility can be achieved when listening to synthetic or artificially-produced speech under adverse conditions. But can listener preferences reveal any extra information when intelligibility is at ceiling? This paper describes a real-time speec
The Picasso Algorithm For Bayesian Localization Via Paired Comparisons In A Union Of Subspaces Model
We develop a framework for localizing an unknown point $\w$ using paired comparisons of the form ``$\w$ is closer to point $\x_i$ than to $\x_j$'' when the points lie in a union of known subspaces. This model, which extends a broad class of existing metho
Polyphonic Sound Event Detection Using Transposed Convolutional Recurrent Neural Network
In this paper we propose a Transposed Convolutional Recurrent Neural Network (TCRNN) architecture for polyphonic sound event recognition. Transposed convolution layer, which carries out a regular convolution operation but reverts the spatial transformation
Precise Performance Analysis Of The Box-Elastic Net Under Matrix Uncertainties
In this letter, we consider the problem of recovering an unknown sparse signal from noisy linear measurements, using an enhanced version of the popular Elastic-Net (EN) method. We modify the EN by adding a box-constraint, and we call it the Box-Elastic Net
Real-Time Epileptic Seizure Detection During Sleep Using Passive Infrared (PIR) Sensors
According to the World Health Organization (WHO), millions of people suffer from epilepsy, which is a chronic disorder of the brain. Sudden Unexplained Death in Epilepsy (SUDEP) is considered one of the most dangerous threats to the patients who suffer fro
Self-Tuning Algorithms For Multisensor-Multitarget Tracking Using Belief Propagation
Situation-aware technologies enabled by multitarget tracking algorithms will create new services and applications in emerging fields such as autonomous navigation and maritime surveillance. The system models underlying multitarget tracking algorithms ofte
A Single-Wavelength Real-Time Material-Sensing Camera Based On Time-Of-Flight Measurements
Time-of-Flight (ToF) cameras provide a fast and robust way of acquiring the 3D shape of real scenes. Dense depth images can be generated at tens of frames per second. 3D shapes can then be segmented and objects classified, but can we directly sense the obj
Convolutional Beamspace For Array Signal Processing
A new type of beamspace for array processing is introduced called convolutional beamspace. It enjoys the advantages of traditional beamspace such as lower computational complexity, increased parallelism of subband processing, and improved resolution thres
Automatic Lyrics Alignment And Transcription In Polyphonic Music: Does Background Music Help?
Automatic lyrics alignment and transcription in polyphonic music are challenging tasks because the singing vocals are corrupted by the background music. In this work, we propose to learn music genre-specific characteristics to train polyphonic acoustic mo
Data-Driven Model Set Design For Model Averaged Particle Filter
This paper is concerned with sequential state filtering in the presence of nonlinearity, non-Gaussianity and model uncertainty. For this problem, the Bayesian model averaged particle filter (BMAPF) is perhaps one of the most efficient solutions. Major adv
Robust Multi-Channel Speech Recognition Using Frequency Aligned Network
Conventional speech enhancement techniques such as beamforming have known benefits for far-field speech recognition. Our own work in frequency-domain multi-channel acoustic modeling has shown additional improvements by training a spatial filtering layer joi
C3DVQA: Full-Reference Video Quality Assessment With 3D Convolutional Neural Network
Traditional video quality assessment (VQA) methods evaluate localized picture quality and video score is predicted by temporally aggregating frame scores. However, video quality exhibits different characteristics from static image quality due to the exist
Lightweight Hardware Implementation Of VVC Transform Block For ASIC Decoder
Versatile Video Coding (VVC) is the next generation video coding standard expected by the end of 2020. Compared to its predecessor, VVC introduces new coding tools and techniques to make compression more efficient at the expense of higher computational com
Blood Pressure Estimation From PPG Signals Using Convolutional Neural Networks And Siamese Network
Blood pressure (BP) is a vital sign of the human body and an important parameter for early detection of cardiovascular diseases. It is usually measured using cuff-based devices or monitored invasively in critically-ill patients. This paper presents two te
Object Detection With Color And Depth Images With Multi-Reduced Region Proposal Network And Multi-Pooling
Object detection technology has received increasing research attention with recent developments in automation technology. Most studies in this field, however, use RGB images as input to deep-learning classifiers, and they rarely use depth information. So,
Subject Transfer Framework Based On Source Selection And Semi-Supervised Style Transfer Mapping For sEMG Pattern Recognition
To construct subject-specific feature extractors and classifiers for a new subject using pooled datasets, overcoming inter-subject variabilities is required. In this study, we investigate the efficiency of the proposed subject transfer framework, which ap
Using VAEs And Normalizing Flows For One-Shot Text-To-Speech Synthesis Of Expressive Speech
We propose a Text-to-Speech method to create an unseen expressive style using one utterance of expressive speech of around one second. Specifically, we enhance the disentanglement capabilities of a state-of-the-art sequence-to-sequence based system with a
A Novel Two-Pathway Encoder-Decoder Network For 3D Face Reconstruction
3D Morphable Model (3DMM) is a statistical tool widely employed in reconstructing 3D face shape. Existing methods are aimed at predicting 3DMM shape parameters with a single encoder but suffer from unclear distinction of different attributes. To address th
Mixup-Breakdown: A Consistency Training Method For Improving Generalization Of Speech Separation Models
Deep-learning based speech separation models confront a poor generalization problem: even the state-of-the-art models can abruptly fail when evaluated in mismatched conditions. To address this problem, we propose an easy-to-implement yet effective
A Whiteness Test Based On The Spectral Measure Of Large Non-Hermitian Random Matrices
In the context of multivariate time series, a whiteness test against an MA(1) correlation model is proposed. This test is built on the eigenvalue distribution (spectral measure) of the non-Hermitian one-lag sample autocovariance matrix, instead of its sin