- IEEE Member: US $11.00
- Society Member: US $0.00
- IEEE Student Member: US $11.00
- Non-IEEE Member: US $15.00
A Fast Reduced-Rank Sound Zone Control Algorithm Using The Conjugate Gradient Method
Sound zone control enables different users to enjoy different audio contents in the same acoustic environment. Generalized eigenvalue decomposition (GEVD)-based methods allow us to control the trade-off between the acoustic contrast (AC) and signal distor
A Prototypical Triplet Loss For Cover Detection
Automatic cover detection -- the task of finding in an audio dataset all covers of a query track -- has long been a challenging theoretical problem in the MIR community. It also became a practical need for music composers societies requiring to detect automati
Performance Bounds For Displaced Sensor Automotive Radar Imaging
In automotive radar imaging, displaced sensors offer improvement in localization accuracy by jointly processing the data acquired from multiple radar units, each of which may have limited individual resources. In this paper, we derive performance bounds o
Analyzing ASR Pretraining For Low-Resource Speech-To-Text Translation
Previous work has shown that for low-resource source languages, automatic speech-to-text translation (AST) can be improved by pretraining an end-to-end model on automatic speech recognition (ASR) data from a high-resource language. However, it is not clea
Stability Of Graph Neural Networks To Relative Perturbations
Graph neural networks (GNNs), consisting of a cascade of layers applying a graph convolution followed by a pointwise nonlinearity, have become a powerful architecture to process signals supported on graphs. Graph convolutions (and thus, GNNs), rely heavil
A New Multihypothesis Prediction Scheme For Compressed Video Sensing Reconstruction
For multihypothesis-based compressed video sensing schemes, the low accuracy of weight prediction and degradation of recovery quality for high-motion videos are open challenges. To solve this problem, this paper proposes a new multihypothesis prediction s
MSpeC-Net: Multi-Domain Speech Conversion Network
In this paper, we present a multi-domain speech conversion technique by proposing a Multi-domain Speech Conversion Network (MSpeC-Net) architecture for solving the less-explored area of Non-Audible Murmur-to-SPeeCH (NAM2-SPCH) conversion. The murmur produ
Full-Sum Decoding For Hybrid HMM Based Speech Recognition Using LSTM Language Model
In hybrid HMM based speech recognition, LSTM language models have been widely applied and achieved large improvements. The theoretical capability of modeling any unlimited context suggests that no recombination should be applied in decoding. This motivate
Using Personalized Speech Synthesis And Neural Language Generator For Rapid Speaker Adaptation
We propose to use the personalized speech synthesis and the neural language generator to synthesize content relevant personalized speech for rapid speaker adaptation. It has two distinct aspects: First, it relieves the general data sparsity issue in rapid
Signal Sensing And Reconstruction Paradigms For A Novel Multi-Source Static Computed Tomography System
Conventional Computed Tomography (CT) systems use a single X-ray source and an arc of detectors mounted on a rotating gantry to acquire a set of projection data. Novel CT systems are now being pioneered in which a complete ring of distributed X-ray source
Computation Of "Best" Interpolants In The Lp Sense
We study a variant of the interpolation problem where the continuously defined solution is regularized by minimizing the Lp-norm of its second-order derivative. For this continuous-domain problem, we propose an exact discretization scheme that restricts t
Fast Intent Classification For Spoken Language Understanding Systems
Spoken Language Understanding (SLU) systems consist of several machine learning components operating together (e.g. intent classification, named entity resolution and recognition). Deep learning models have obtained state of the art results on several of
Exploiting Channel Locality For Adaptive Massive MIMO Signal Detection
We propose MMNet, a deep learning MIMO detection scheme that significantly outperforms existing approaches on realistic channels with the same or lower computational complexity. MMNet's design builds on the theory of iterative soft-thresholding algorithms
A Hybrid Model For Bipolar Disorder Classification From Visual Information
Bipolar Disorder (BD) is one of the most prevalent mental illnesses in the world. It has a negative impact on people's social and personal functions. The principal indicator of BD is the extreme swing in the mood ranging from manic to depressive states. T
Densely Connected Neural Network With Dilated Convolutions For Real-Time Speech Enhancement In The Time Domain
In this work, we propose a fully convolutional neural network for real-time speech enhancement in the time domain. The proposed network is an encoder-decoder based architecture with skip connections. The layers in the encoder and the decoder are followed
Person Identification Using Deep Convolutional Neural Networks On Short-Term Signals From Wearable Sensors
In this work, we explore the discriminating ability of short-term signal patterns (e.g. few minutes long) with respect to the person identification task. We focus on signals recorded by simple wearable devices, such as smartwatches, which can measure move
Multimodal Learning For Classroom Activity Detection
Classroom activity detection (CAD) focuses on accurately classifying whether the teacher or student is speaking and recording the length of individual utterances during a class. A CAD solution helps teachers get instant feedback on their pedagogical
Attention Guided Region Division For Crowd Counting
Crowd counting has drawn more and more attention in computer vision. There are two mainstream approaches to deal with crowd counting tasks, regression and detection. Regression-based methods usually overestimate the count in sparse areas, while detection-
Speaker Independence Of Neural Vocoders And Their Effect On Parametric Resynthesis Speech Enhancement
Traditional speech enhancement systems produce speech with compromised quality. Here we propose to use the high quality speech generation capability of neural vocoders for better quality speech enhancement. We term this parametric resynthesis (PR). In pre
Unsupervised Content-Preserved Adaptation Network For Classification Of Pulmonary Textures From Different CT Scanners
Deep network based methods have been proposed for accurate classification of pulmonary textures on CT images. However, such methods well-trained on CT data from one scanner cannot perform well when they are directly applied to the data from other scanners
Improving The Scalability Of Deep Reinforcement Learning-Based Routing With Control On Partial Nodes
[2 Videos ]
Machine Learning (ML)-based routing optimization has been proposed to optimize the performance of flow routing for future networks, such as Software-Defined Networks (SDNs). However, existing studies are either hard to converge for large networks or vulne
Q-GADMM: Quantized Group ADMM For Communication Efficient Decentralized Machine Learning
In this paper, we propose a communication-efficient decentralized machine learning (ML) algorithm, coined quantized group ADMM (Q-GADMM). Every worker in Q-GADMM communicates only with two neighbors, and updates its model via the group alternating direct
Speaker Diarization Using Latent Space Clustering In Generative Adversarial Network
In this work, we propose deep latent space clustering for speaker diarization using generative adversarial network (GAN) back-projection with the help of an encoder network. The proposed diarization system is trained jointly with GAN loss, latent variable
Adaptive Distributed Stochastic Gradient Descent For Minimizing Delay In The Presence Of Stragglers
We consider the setting where a master wants to run a distributed stochastic gradient descent (SGD) algorithm on $n$ workers each having a subset of the data. Distributed SGD may suffer from the effect of stragglers, i.e., slow or unresponsive workers who
GPU-Accelerated Viterbi Exact Lattice Decoder For Batched Online And Offline Speech Recognition
We present an optimized weighted finite-state transducer (WFST) decoder capable of online streaming and offline batch processing of audio using Graphics Processing Units (GPUs). The decoder is efficient in memory utilization, input/output (I/O) bandwidth,
Tree Of Shapes Cut For Material Segmentation Guided By A Design
In manufacturing, the monitoring of the fabrication process is crucial in order to be sure that objects are compliant. For nano-objects, most of this monitoring is done manually. In this paper, we propose a method to segment different materials in a manuf
Variational Student: Learning Compact And Sparser Networks In Knowledge Distillation Framework
The holy grail in deep neural network research is porting the memory- and computation-intensive network models on embedded platforms with a minimal compromise in model accuracy. To this end, we propose Variational Student where we reap the benefits of com
RDE-MOGA: Automatic Selection Of Rate-Distortion-Energy Control Points For Video Encoders Using Multi-Objective Genetic Algorithm
Controlling energy consumption of video encoders is a complex multi-objective optimization problem of great importance. In this work we propose the RDE-MOGA, a multi-objective genetic algorithm capable of finding energetically efficient configurations for
Emotional Voice Conversion Using Multitask Learning With Text-To-Speech
Voice conversion (VC) is a task that alters the voice of a person to suit different styles while conserving the linguistic content. Previous state-of-the-art technology used in VC was based on the sequence-to-sequence (seq2seq) model, which could lose lin
A Novel Approach For Intelligibility Assessment In Dysarthric Subjects
Dysarthria is a motor speech impairment caused by muscle weakness. Individuals with this condition are unable to control rapid movement of the velum, leading to a reduction in intelligibility, audibility, naturalness and efficiency of vocal communication.
Sensor Selection For Model-Free Source Localization: Where Less Is More
The ability for a wireless network to precisely localize the radio nodes composing it is a great tool towards system optimization and is increasingly seen as a basic service requirement. In the past, model-free algorithms such as weighted centroid localiz
Epigraphical Reformulation For Non-Proximable Mixed Norms
This paper proposes an epigraphical reformulation (ER) technique for non-proximable mixed norm regularization. Various regularization methods using "mixed norms" have been proposed, where their optimization relies on efficient computation of the proximity
A Study Of Child Speech Extraction Using Joint Speech Enhancement And Separation In Realistic Conditions
In this paper, we design a novel joint framework of speech enhancement and speech separation for child speech extraction in realistic conditions, targeting the problem of extracting child speech from daily conversations in BabyTrain mega corpus. To the be
Back-And-Forth Prediction For Deep Tensor Compression
Recent AI applications such as Collaborative Intelligence with neural networks involve transferring deep feature tensors between various computing devices. This necessitates tensor compression in order to optimize the usage of bandwidth-constrained channe
Transfer Learning From YouTube Soundtracks To Tag Arctic Ecoacoustic Recordings
Sound provides a valuable tool for long-term monitoring of sensitive animal habitats at a spatial scale larger than camera traps or field observations, while also providing more details than satellite imagery. Currently, the ability to collect such record
Preconditioned Ghost Imaging Via Sparsity Constraint
Ghost imaging via sparsity constraint (GISC) can recover objects from the intensity fluctuation of light fields at a sampling rate far below the Nyquist rate. However, its imaging quality may degrade severely when the coherence of sampling matrices is lar
Learning-Based Content Caching And User Clustering: A Deep Deterministic Policy Gradient Approach
The joint design of content caching and user clustering (JCC) in cache-enabled heterogeneous networks is challenging, due to various unknown, possibly time-varying, system parameters which potentially give rise to various design tradeoffs in practice. Thi
Upgrading CRFs To JRFs And Its Benefits To Sequence Modeling And Labeling
Two important sequence tasks are sequence modeling and labeling. Sequence modeling involves determining the probabilities of sequences, e.g. language modeling. It is still difficult to improve language modeling with additional relevant tags, e.g. part-of-
Accent Estimation Of Japanese Words From Their Surfaces And Romanizations For Building Large Vocabulary Accent Dictionaries
In Japanese text-to-speech (TTS), it is necessary to add accent information to the input sentence. However, there are a limited number of publicly available accent dictionaries, and those dictionaries, e.g. UniDic, do not contain many compound words, prope
Communication Constrained Learning With Uncertain Models
We consider the problem of distributed inference of a group of agents in a social network, where the agents construct, share, and update beliefs in a non-Bayesian framework to identify the underlying true state of the world. We build upon the concept of u
Leveraging Ordinal Regression With Soft Labels For 3D Head Pose Estimation From Point Sets
Head pose estimation from depth images is a challenging problem, considering the large pose variations, severe occlusions, and low quality of depth data. In contrast to existing approaches that take 2D depth image as input, we propose a novel deep regress
Optimal Window Design For W-OFDM
Windowing is an effective approach to reduce out-of-band radiation (OBR) in multicarrier systems in order to avoid adjacent channel interference. However, commonly used window functions are chosen in an ad hoc manner and fixed. We present an optimal windo
Multi-MotifGAN (MMGAN): Motif-Targeted Graph Generation And Prediction
Generative graph models create instances of graphs that mimic the properties of real-world networks. Generative models are successful at retaining pairwise associations in the underlying networks but often fail to capture higher-order connectivity pattern
A Study Of Generalization Of Stochastic Mirror Descent Algorithms On Overparameterized Nonlinear Models
We study the convergence, the implicit regularization and the generalization of stochastic mirror descent (SMD) algorithms in overparameterized nonlinear models, where the number of model parameters exceeds the number of training data points. Due to overp
Nonlinear Spatial Filtering For Multichannel Speech Enhancement In Inhomogeneous Noise Fields
A common processing pipeline for multichannel speech enhancement is to combine a linear spatial filter with a single-channel postfilter. In fact, it can be shown that such a combination is optimal in the minimum mean square error (MMSE) sense if the noise
A New Application Of Ultrasound Signal Processing For Archaeological Ceramic Classification
Identifying archaeological ceramic pieces is a challenging problem for archaeologists, since fragments of archaeological pottery from the same site might have been made in different distant locations from the site. The pieces look very similar and context
One-Bit DOA Estimation Via Sparse Linear Arrays
Parameter estimation from noisy and quantized received signals has become an important topic in signal processing, as it offers low cost and low complexity in the implementation. Techniques to achieve high estimation performance in spite of the coarse qua