# Electrical Engineering and Systems Science

## New submissions


### New submissions for Fri, 23 Feb 18

[1]
Title: Bounds for Tracking Error in Constant Stepsize Stochastic Approximation
Subjects: Signal Processing (eess.SP)

This work revisits the constant stepsize stochastic approximation algorithm for tracking a slowly moving target and obtains a bound for the tracking error that is valid for all time, using the Alekseev non-linear variation of constants formula.
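
As a rough illustration of the algorithm class this abstract refers to, the following sketch runs a scalar constant stepsize recursion against a slowly drifting target. The drift model, noise level, stepsize, and the function name `track` are all assumptions chosen for illustration. It does not reproduce the paper's bound or the Alekseev formula.

```python
import random


def track(steps=5000, stepsize=0.05, drift=1e-3, noise_std=0.5, seed=0):
    """Constant stepsize stochastic approximation on a drifting scalar target.

    All parameter values are hypothetical; they are not taken from the paper.
    Returns the absolute tracking error at every step.
    """
    rng = random.Random(seed)
    target = 0.0
    estimate = 0.0
    errors = []
    for _ in range(steps):
        # Slow deterministic drift (an assumed target model).
        target += drift
        # Noisy observation of the current target.
        observation = target + rng.gauss(0.0, noise_std)
        # Constant stepsize update: move a fixed fraction toward the observation.
        estimate += stepsize * (observation - estimate)
        errors.append(abs(target - estimate))
    return errors


errors = track()
tail = errors[-1000:]
mean_tail_error = sum(tail) / len(tail)
print(f"mean tracking error over the last 1000 steps: {mean_tail_error:.4f}")
```

In this toy setting the error does not vanish: a larger stepsize shrinks the lag behind the drifting target but amplifies the observation noise, which is the trade-off a tracking-error bound has to capture.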

[2]
Title: A new optimization problem in FSO communication system
Subjects: Signal Processing (eess.SP)

Given the physical phenomena of atmospheric channels and wave propagation, the performance of wireless communication systems can be optimized simply by adjusting their parameters. This approach is more economical than consuming more power or applying processing techniques. In this paper, an optimization problem is developed for the first time on the performance of a free-space optical multi-input multi-output (FSO-MIMO) communication system. It is also the first time that FSO optimization is developed under saturated atmospheric turbulence. To bring the results closer to practice, the effect of pointing error is taken into consideration. Assuming MPSK and DPSK modulation schemes, new closed-form expressions are derived for the bit error rate (BER) of the proposed structure. Furthermore, an unconstrained optimization is formulated with the beam width as the variable parameter and the BER as the objective function. The results can help FSO-MIMO system designers limit the effects of pointing error and atmospheric turbulence and thus achieve optimum performance.

[3]
Title: MEC-assisted End-to-End Latency Evaluations for C-V2X Communications
Comments: Submitted to EuCNC'18
Subjects: Signal Processing (eess.SP)

The efficient design of fifth-generation (5G) mobile networks is driven by the need to support the dynamic proliferation of several vertical market segments. In the automotive sector, industry and the research community have identified different Cellular Vehicle-to-Everything (C-V2X) use cases covering infotainment, automated driving and road safety. A common characteristic of these use cases is the need to exploit collective awareness of the road environment in order to satisfy performance requirements. One of these requirements is the End-to-End (E2E) latency when, for instance, Vulnerable Road Users (VRUs) inform vehicles about their status (e.g., location) and activity, assisted by the cellular network. In this paper, focusing on a freeway-based VRU scenario, we argue that, in contrast to the conventional, remote cloud-based cellular architecture, the deployment of Multi-access Edge Computing (MEC) infrastructure can substantially reduce the E2E communication latency. Our argument is supported by an extensive simulation-based performance comparison between the conventional and the MEC-assisted network architectures.

[4]
Title: On the Effects of Resistive and Reactive Loads on Signal Amplification
Comments: A working manuscript with 13 pages and 13 figures
Subjects: Signal Processing (eess.SP)

The effects of reactive loads on amplification are studied. A simplified common-emitter circuit configuration was adopted, and the respective time-independent and time-dependent voltage and current equations were obtained. Because the non-linearity rules out phasor analysis, the voltage at the capacitor was represented in terms of the respective integral, which calls for a numerical approach. The effect of purely resistive loads was investigated first, and it was shown that the fanned structure of the transistor isolines can severely distort the amplification, especially for small $V_a$ and large $s$. The total harmonic distortion was found not to depend on $V_a$, being determined by $s$ and the load resistance $R$. An expression was obtained for the current gain in terms of the base current, and it was shown that the gain decreases in an almost perfectly linear fashion with $I_B$. Remarkably, no gain variation, and hence perfectly linear amplification, is obtained when $R=0$, provided maximum power dissipation limits are not exceeded. Capacitive loads cause the circuit trajectory to depart from a straight line into an "ellipsoidal"-like loop, which implies a gain asymmetry along the upper and lower arcs of this loop. By using the time-dependent circuit equations, it was possible to show numerically and by an analytical approximation that, at least for the adopted circuit and parameter values, the asymmetry induced by capacitive loads is not substantial. However, capacitive loads introduce a lag between the output voltage and current and, hence, low-pass filtering. It was shown that smaller $V_a$ and larger $s$ can substantially reduce the phase lag, but at the cost of severe distortion.
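
The total harmonic distortion mentioned in this abstract is, in its usual definition, the ratio of the root-sum-square of the harmonic amplitudes to the amplitude of the fundamental. The sketch below computes it for a sampled waveform. It assumes the samples cover exactly one fundamental period, and the helper names and the test waveforms are hypothetical. It does not model the paper's transistor circuit.

```python
import cmath
import math


def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic via a single DFT bin.

    Assumes the samples span exactly one fundamental period.
    """
    n = len(samples)
    coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples))
    return 2.0 * abs(coeff) / n


def total_harmonic_distortion(samples, max_harmonic=10):
    """THD = sqrt(sum of squared harmonic amplitudes, k >= 2) / fundamental."""
    fundamental = harmonic_amplitude(samples, 1)
    harmonics = [harmonic_amplitude(samples, k)
                 for k in range(2, max_harmonic + 1)]
    return math.sqrt(sum(a * a for a in harmonics)) / fundamental


n = 256
# A pure sine, and the same sine with a second harmonic at one tenth the amplitude.
clean = [math.sin(2 * math.pi * i / n) for i in range(n)]
distorted = [math.sin(2 * math.pi * i / n) + 0.1 * math.sin(4 * math.pi * i / n)
             for i in range(n)]

print(total_harmonic_distortion(clean))
print(total_harmonic_distortion(distorted))
```

Truncating the sum at `max_harmonic` is a practical choice for this sketch; a measured waveform would also need windowing or period alignment before the DFT bins line up with the harmonics.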

[5]
Title: Sliding Bidirectional Recurrent Neural Networks for Sequence Detection in Communication Systems
Comments: accepted for publication in the proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2018. arXiv admin note: text overlap with arXiv:1802.02046 and arXiv:1705.08044
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Learning (cs.LG)

The design and analysis of communication systems typically rely on the development of mathematical models that describe the underlying communication channel. However, in some systems, such as molecular communication systems where chemical signals are used for transfer of information, the underlying channel models are unknown. In these scenarios, a completely new approach to design and analysis is required. In this work, we focus on one important aspect of communication systems, the detection algorithms, and demonstrate that by using tools from deep learning, it is possible to train detectors that perform well without any knowledge of the underlying channel models. We propose a technique we call sliding bidirectional recurrent neural network (SBRNN) for real-time sequence detection. We evaluate this algorithm using experimental data that is collected by a chemical communication platform, where the channel model is unknown and difficult to model analytically. We show that deep learning algorithms perform significantly better than a detector proposed in previous works, and the SBRNN outperforms other techniques considered in this work.

[6]
Title: Analysis of Fourier ptychographic microscopy with half of the captured images
Subjects: Image and Video Processing (eess.IV); Optics (physics.optics)

Fourier ptychographic microscopy (FPM) is a new computational imaging technique that can provide gigapixel images with both high resolution and a wide field of view (FOV). However, the time-consuming data-acquisition process is a critical issue. In this paper, we analyze the FPM imaging system using half of the captured images. Based on the image analysis of the conventional FPM system, we compare images reconstructed from different numbers of captured images. Simulation and experimental results show that the image reconstructed from half of the captured data shows no obvious resolution degradation compared with that reconstructed from all the captured data, apart from a reduction in contrast. In particular, when the object is close to phase-only or amplitude-only, the quality of the image reconstructed from half of the captured data is nearly as good as that of the image reconstructed from the full data.

### Cross-lists for Fri, 23 Feb 18

[7]  arXiv:1802.07860 (cross-list from cs.SD)
Title: Neural Predictive Coding using Convolutional Neural Networks towards Unsupervised Learning of Speaker Characteristics
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)

Learning speaker-specific features is vital in many applications such as speaker recognition, diarization and speech recognition. This paper presents a novel approach, which we term Neural Predictive Coding (NPC), to learn speaker-specific characteristics in a completely unsupervised manner from large amounts of unlabeled training data that may even contain multi-speaker audio streams. The NPC framework exploits the proposed short-term active-speaker stationarity hypothesis, which assumes that two temporally close short speech segments belong to the same speaker, so that a common representation encoding the commonalities of both segments should capture the vocal characteristics of that speaker. We train a convolutional deep siamese network to produce "speaker embeddings" by optimizing a loss function that increases between-speaker variability and decreases within-speaker variability. The trained NPC model can produce these embeddings by projecting any test audio stream into a high-dimensional manifold where speech frames of the same speaker lie closer together than they do in the raw feature space. Results of the frame-level speaker classification experiment, along with the visualization of the embeddings, demonstrate the distinctive ability of the NPC model to learn short-term speaker-specific features as compared to raw MFCC features and i-vectors. The utterance-level speaker classification experiments show that concatenating simple statistics of the short-term NPC embeddings over the whole utterance with the utterance-level i-vectors provides useful complementary information to the i-vectors and boosts the classification accuracy. The results also show the efficacy of this technique in learning those characteristics from a large unlabeled training set that carries no prior information about the environment of the test set.
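
The abstract does not state which loss the siamese network optimizes. As a stand-in, the sketch below uses the common margin-based contrastive loss, which pulls embeddings of same-speaker pairs together and pushes different-speaker pairs at least a margin apart. The function names, the margin value, and the toy embeddings are assumptions for illustration, not the authors' method.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(emb_a, emb_b, same_speaker, margin=1.0):
    """Margin-based contrastive loss for one pair of embeddings.

    Same-speaker pairs are penalized by their squared distance
    (decreasing within-speaker variability). Different-speaker pairs
    are penalized only while closer than `margin` (increasing
    between-speaker variability). The margin is a hypothetical value.
    """
    d = euclidean(emb_a, emb_b)
    if same_speaker:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy embeddings: under the stationarity hypothesis, two temporally close
# segments would be labeled as a same-speaker pair.
seg_t = [0.2, 0.9]
seg_t_next = [0.25, 0.85]
seg_far = [0.9, 0.1]

print(contrastive_loss(seg_t, seg_t_next, same_speaker=True))
print(contrastive_loss(seg_t, seg_far, same_speaker=False))
```

In an unsupervised setup of this kind, the pair labels would come from temporal proximity rather than from speaker annotations, so some same-speaker labels would be wrong near speaker changes.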

[8]  arXiv:1802.08008 (cross-list from cs.SD)
Title: Sounderfeit: Cloning a Physical Model with Conditional Adversarial Autoencoders
Authors: Stephen Sinclair
Comments: Published in the Brazilian Symposium on Computer Music (SBCM 2017)
Journal-ref: Proc. Brazilian Symp. on Comp. Music., 2017. p. 67--74
Subjects: Sound (cs.SD); Learning (cs.LG); Audio and Speech Processing (eess.AS)

An adversarial autoencoder conditioned on known parameters of a physical-modeling bowed-string synthesizer is evaluated for use in parameter estimation and resynthesis tasks. Latent dimensions are provided to capture variance not explained by the conditional parameters. Results are compared with and without adversarial training, and a system capable of "copying" a given bidirectional parameter-signal relationship is examined. A real-time synthesis system built on a generative, conditioned and regularized neural network is presented, allowing the construction of engaging sound synthesizers based purely on recorded data.

### Replacements for Fri, 23 Feb 18

[9]  arXiv:1710.07654 (replaced)
Title: Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning
Comments: Published as a conference paper at ICLR 2018. (v3 changed paper title)
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Learning (cs.LG); Audio and Speech Processing (eess.AS)
[10]  arXiv:1710.10224 (replaced)
Title: BridgeNets: Student-Teacher Transfer Learning Based on Recursive Neural Networks and its Application to Distant Speech Recognition
Comments: Accepted to 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018)
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[11]  arXiv:1801.05458 (replaced)
Title: Deep Network for Simultaneous Decomposition and Classification in UWB-SAR Imagery
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)