Detection scheme for linear dispersion space-time block codes using modified SQRD decomposition

In this work, a novel architecture for the detection of hybrid MIMO systems over Rayleigh fading channels is presented. The hybrid MIMO scheme consists of one Alamouti space-time block code unit plus antennas operating as V-BLAST layers at the transmitter. The proposed receiver is based on the sorted QR decomposition (SQRD) algorithm presented in (Wubben et al., 2001). The proposal significantly reduces complexity while maintaining the advantages of the SQRD scheme, and obtains a considerable gain in bit error rate (BER) performance at the expense of a lower data rate.


Introduction
The demand for communication systems that effectively exploit the wireless channel's limited capacity (Telatar, 1999) has grown rapidly during the last decade. In recent years, multiple-input, multiple-output (MIMO) systems have emerged as an attractive technique to increase the bit rate without increasing either power or bandwidth. A MIMO system employs multiple antennas at both the transmitter and the receiver, adding an extra degree of freedom to the design of communication systems. In particular, two techniques have been developed to take advantage of MIMO systems: spatial multiplexing and diversity transmission. The first technique aims at increasing the spectral efficiency. One of its main representatives is sorted QR decomposition (SQRD), first introduced in (Wubben et al., 2001). The second technique aims at increasing the diversity gain, which is achieved by space-time block codes (STBC) (Tarokh et al., 1999). STBC decoders are relatively easy to implement, but their spectral efficiency is low. A popular scheme that reaches full diversity and full rate was proposed in (Alamouti, 1998).
SQRD can attain very high spectral efficiency while keeping receiver complexity very low. However, because it does not obtain the optimal detection order and uses successive interference cancellation (SIC), it suffers great degradation from error propagation in its decision feedback. In (Raza et al., 2004), a scheme called KIL-VBLAST was proposed whose main goal is to reduce error propagation between layers. It sacrifices spectral efficiency because it transmits a symbol known to the receiver, placed in the first sub-stream to be detected. This increases the total diversity of V-BLAST and yields better performance; its main disadvantage is that hardware resources are spent transmitting known information. In (Jiang et al., 2005), a scheme called GMD-VBLAST was proposed, which uses a joint transceiver design at the transmitter and the receiver based on the geometric mean decomposition (GMD). It achieves better performance than V-BLAST while maintaining its high spectral efficiency. However, the complexity of both the receiver and the transmitter increases considerably, and the scheme also needs channel state information (CSI) to work, so its hardware implementation can be complicated and expensive.
An alternative approach, known as hybrid coding (HC) (Mao, Motani, 2005), (Meng, Tuqan, 2007), (Cortez, et al., 2007), (Longoria, et al., 2007), (Bazdresch, et al., 2012), consists of selecting some of the available transmit antennas to work in STBC mode while the remaining ones operate as V-BLAST. In particular, the STBC-VBLAST scheme (Mao, Motani, 2005), (Meng, Tuqan, 2007) is an interesting example of hybrid coding: it uses Alamouti layers and spatial layers, which increases spectral efficiency over pure STBC, and its structure allows ordered SIC (OSIC) decoding based on the QR decomposition. The main disadvantage of (Meng, Tuqan, 2007) is its high receiver complexity, and that of (Mao, Motani, 2005) is that it needs two kinds of decoders in the receiver: an STBC decoder and a V-BLAST decoder. Also, the spectral efficiency of the system diminishes greatly as the number of Alamouti encoders in the transmitter increases.
This paper presents a modification of the hybrid coding system proposed in (Mao, Motani, 2005). The scheme consists of only one Alamouti space-time block code unit in the last layer, plus antennas operating as V-BLAST at the transmitter. We obtain an equivalent channel matrix using the linear dispersion space-time code (LDC) techniques proposed in (Hassibi, Hochwald, 2002). With this rearrangement we can achieve the benefits of the schemes described previously (Wubben, et al., 2001), (Raza, et al., 2004).
The receiver is based on the scheme proposed in (Wubben, et al., 2001). We refer to this new architecture as ZF-SQRD-LDSTBC. The estimation and detection of the transmitted symbols are carried out in the same way as in SQRD (Wubben, et al., 2001). The symbols exhibiting diversity gain (Alamouti layer) are detected first, followed by the ordered spatially multiplexed symbols.
The proposed scheme increases the total diversity of the system, mainly that of the first layer to be detected. Since the receiver is based on the efficient, low-complexity detector proposed in (Wubben, et al., 2001), better performance is obtained, and no CSI is needed at the transmitter. Therefore, a hardware implementation can be carried out at lower cost. The sorted QR decomposition is implemented using the modified Gram-Schmidt (MGS) algorithm (Golub, Van Loan, 1996). The scheme exploits the special matrix structure that comes from the LDC representation to further reduce the complexity of the receiver proposed in (Wubben, et al., 2001).
This proposal was compared with the schemes proposed in (Wubben, et al., 2001) and (Raza, et al., 2004). The results show that, at the same spectral efficiency and with the same number of transmit and receive antennas, the proposal outperforms both schemes in terms of bit error rate (BER). Compared with the scheme proposed in (Jiang, et al., 2005), the proposal achieves lower performance but with lower computational complexity. The following sections present the system model, the method and the results.

Method
A space-time block code (STBC) is a mapping from a vector of symbols to a space-time code matrix that specifies how symbols are transmitted over the available antennas and time intervals.
The STBC-VBLAST linear space-time block code transmits 2  + 1 complex symbols over T = 2 symbol intervals and   transmit antennas. We assume the receiver uses   antennas. A full description is given in this section.
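As a rough illustration of what such a mapping can look like, the following sketch builds a T = 2 codeword for a hybrid transmitter. The layout is an assumption made for illustration only (the original coding table is not reproduced here): the first `n_t - 2` antennas send new V-BLAST symbols in each interval, and the last two antennas send one Alamouti block. The function name `hybrid_codeword` and the symbol ordering are hypothetical, and the power distribution between layers is ignored.

```python
def hybrid_codeword(symbols, n_t):
    """Build a T = 2 by n_t space-time matrix (rows: time, columns: antennas).

    Assumed layout (hypothetical, for illustration): the first n_t - 2
    antennas send fresh V-BLAST symbols in each interval, and the last two
    antennas send one Alamouti block built from the last two symbols.
    """
    n_s = n_t - 2  # number of V-BLAST antennas
    if n_s < 0 or len(symbols) != 2 * n_s + 2:
        raise ValueError("expected 2 * (n_t - 2) + 2 symbols")
    slot1 = list(symbols[:n_s])          # V-BLAST symbols, first interval
    slot2 = list(symbols[n_s:2 * n_s])   # V-BLAST symbols, second interval
    a, b = complex(symbols[-2]), complex(symbols[-1])
    # Alamouti block: (a, b) in the first interval, (-b*, a*) in the second.
    return [slot1 + [a, b],
            slot2 + [-b.conjugate(), a.conjugate()]]


S = hybrid_codeword([1, 2, 3, 4, 5 + 1j, 6 - 2j], n_t=4)
for row in S:
    print(row)
```

Under this assumed layout, each V-BLAST antenna carries two different symbols per codeword, while the two Alamouti antennas together carry only two symbols, which is where the rate loss relative to pure spatial multiplexing comes from.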

a. Linear dispersion codes (LDC)
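An LDC codeword is defined as a matrix $\mathbf{S}$ given by:

$$\mathbf{S} = \sum_{q=1}^{n_{sym}} \left( \alpha_q \mathbf{A}_q + j \beta_q \mathbf{B}_q \right) \quad (1)$$

where we have assumed that $n_{sym}$ symbols are transmitted in a codeword, each symbol is $s_q = \alpha_q + j\beta_q$, and the complex linear dispersion matrices $\mathbf{A}_q$ and $\mathbf{B}_q$ (of size $T \times n_T$, with $T$ symbol intervals and $n_T$ transmit antennas) specify the code. The columns of $\mathbf{A}_q$ and $\mathbf{B}_q$ represent the antenna through which the symbol is sent over the channel, and the rows indicate the time instant at which it is sent.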
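To make the role of the dispersion matrices concrete, the sketch below assumes the usual linear dispersion form of (Hassibi, Hochwald, 2002), in which the real part of each symbol weights one dispersion matrix and the imaginary part (times j) weights a second one. It then checks that a particular choice of matrices reproduces the Alamouti code. The function name `ldc_codeword` and the variable names are illustrative, not taken from the original.

```python
def ldc_codeword(symbols, A, B):
    """Return S = sum_q (Re(s_q) * A[q] + 1j * Im(s_q) * B[q]).

    A and B are lists of T x n_t matrices (nested lists); rows index time
    and columns index transmit antennas.
    """
    T, n_t = len(A[0]), len(A[0][0])
    S = [[0j] * n_t for _ in range(T)]
    for s, A_q, B_q in zip(symbols, A, B):
        s = complex(s)
        for t in range(T):
            for a in range(n_t):
                S[t][a] += s.real * A_q[t][a] + 1j * s.imag * B_q[t][a]
    return S


# Dispersion matrices that yield the Alamouti block [[s1, s2], [-s2*, s1*]].
A_alamouti = [[[1, 0], [0, 1]], [[0, 1], [-1, 0]]]
B_alamouti = [[[1, 0], [0, -1]], [[0, 1], [1, 0]]]

s1, s2 = 1 + 2j, 3 - 1j
S = ldc_codeword([s1, s2], A_alamouti, B_alamouti)
print(S)
```

The same function can describe a V-BLAST layer by using dispersion matrices with a single nonzero entry, which is what allows a hybrid scheme to be written as one linear dispersion code.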

b. Channel model
The propagation channel between each transmit and receive antenna can be modeled as a narrowband, stationary Rayleigh stochastic process. This work also assumes that the channel is scatterer-rich at both the transmitter and the receiver ends.
The MIMO channel can be modeled as a random matrix $\mathbf{H}$ of size $n_R \times n_T$, where $n_R$ and $n_T$ represent the number of antennas at the receiver and transmitter, respectively. The elements of $\mathbf{H}$ are denoted $h_{ij}$, for $i = 1, \ldots, n_R$, $j = 1, \ldots, n_T$. Each element of $\mathbf{H}$ is a sample of a complex Gaussian random variable with zero mean and variance 0.5 per dimension. We assume that the channel fading is slow, and that the channel matrix remains constant during the transmission of one space-time codeword of duration $T$. A new realization of the channel matrix, independent of the previous one, is then generated for each new space-time codeword.
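A minimal sketch of how one such channel realization could be drawn is given below, using only the Python standard library. The function name `rayleigh_channel` and its parameters are illustrative choices, not part of the original work; each entry gets independent real and imaginary parts with variance 0.5, so its average power is 1.

```python
import random


def rayleigh_channel(n_r, n_t, rng=None):
    """Draw an n_r x n_t matrix of i.i.d. zero-mean complex Gaussian entries."""
    rng = rng or random.Random()
    sigma = 0.5 ** 0.5  # standard deviation for variance 0.5 per dimension
    return [[complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
             for _ in range(n_t)]
            for _ in range(n_r)]


# One independent realization per space-time codeword (block fading).
rng = random.Random(2024)
for codeword_index in range(3):
    H = rayleigh_channel(4, 4, rng)
    print(codeword_index, H[0][0])
```

Drawing a new matrix for every codeword, as in the loop above, corresponds to the slow (block) fading assumption: the channel is constant within a codeword and independent between codewords.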
We assume that all antennas transmit information symbols from the same M-QAM constellation, and that the receiver has perfect CSI and is perfectly synchronized. The total transmitted power is normalized to 1 watt. The LDSTBC system, based on the scheme proposed in (Bazdresch, et al., 2012), is shown in Figure 1.

a. ZF-SQRD-LDSTBC Transmitter
The received signal is represented by the matrix: (2) where is a matrix of complex Gaussian random noise variables with zero mean and independent real and imaginary parts with variance per dimension. Table I shows how the proposed scheme codes the symbol sequence , …, in space and time over the antenna arrangement, where and .

ZF-SQRD-LDSTBC Receiver
Under the assumption that the channel state information (CSI) is perfectly known by the receiver and in the presence of Gaussian noise, for the detection and decoding of the transmitted signal vector , at the time block slice, where and the minimum number of two antennas per STBC, , the received signal in Equation 2 can be expressed as: (3) In Equation 3, and from now on, the subscript indicates the corresponding transmit or receive antenna, and the superscript indicates the corresponding block emission time , where is defined as: (4) and its respective blocks have the following structure: (5) and (6) In Equations 5 and 6, the matrix on the left indicates the symbol arrangement over the antennas, and the notation introduced in the matrix on the right is used to simplify the explanation of the detection algorithm. Note that the matrix corresponds to the symbols transmitted by the V-BLAST layers and to the Alamouti STBC encoder.
The system in Equation 3 can be reformulated as an LDSTBC code according to (Hassibi, Hochwald, 2002). The resulting expression is: (7) Equation 7 can be rewritten in compact form as: (8) where is called the linear dispersion matrix, and its submatrices are defined by: (9) where (10) for and . (11) where each element of Equation 11 is given by: (12) for . The matrix is the portion of that links the spatial antenna with the receiver antenna. The same applies to , which links the Alamouti block to the receiver antenna. A matrix structure similar to Equation 11 is shown in (Longoria, et al., 2007). Reformulating the system in Equation 3 as Equation 8 leads to the following rearrangement of the matrix : (13) where (14) and (15) Reformulating Equation 3 as Equation 8 allows the MIMO system to be treated as an equivalent system with $n_{sym}$ transmit antennas, ignoring any distinction between the STBC block and the V-BLAST layers.
In this way, we propose a modified and optimized SQRD that can be applied directly to , so that simple ordered successive interference cancellation (OSIC) detection can be used, applying the output permutation vector during the reordering stage, as presented in (Wubben, et al., 2001). The power was distributed between the antennas using the following equations: (16) where and correspond to the power fed to the spatial (V-BLAST) antennas and to the antennas used in the Alamouti layer, respectively. The objective of this distribution is to send all the symbols with the same power.
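The detection step mentioned above can be sketched as follows, assuming that a sorted QR factorization of the equivalent channel is already available in the form "permuted channel = Q R". The sketch performs zero-forcing detection with successive interference cancellation by back-substitution and then undoes the permutation. The function name `sic_detect`, its argument layout and the hard-decision slicer are illustrative assumptions, not the authors' Algorithm 1.

```python
def sic_detect(Q_cols, R, p, y, constellation):
    """Zero-forcing detection with successive interference cancellation.

    Q_cols: list of orthonormal columns of Q; R: upper-triangular matrix;
    p: permutation such that column i of Q R is column p[i] of the channel;
    y: received vector; constellation: list of candidate symbols.
    Returns the detected symbols in the original (unpermuted) order.
    """
    n = len(R)
    # z = Q^H y: project the received vector onto the columns of Q.
    z = [sum(q.conjugate() * yi for q, yi in zip(col, y)) for col in Q_cols]
    s_perm = [0j] * n
    for i in reversed(range(n)):  # the last layer is detected first
        interference = sum(R[i][j] * s_perm[j] for j in range(i + 1, n))
        estimate = (z[i] - interference) / R[i][i]
        # Hard decision: nearest constellation point.
        s_perm[i] = min(constellation, key=lambda c: abs(c - estimate))
    detected = [0j] * n
    for i, original_index in enumerate(p):
        detected[original_index] = s_perm[i]
    return detected


# Noiseless toy example with a hand-made factorization (Q = identity).
qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
Q_cols = [[1, 0], [0, 1]]
R = [[2, 0.5 + 0.5j], [0, 1]]
p = [1, 0]
sent = [1 + 1j, -1 - 1j]
sent_perm = [sent[k] for k in p]
y = [sum(R[i][j] * sent_perm[j] for j in range(2)) for i in range(2)]
print(sic_detect(Q_cols, R, p, y, qpsk))
```

Because back-substitution starts from the last row of R, the layer placed last in the factorization is decided first and its decision is fed back to all other layers, which is why the reliability of that first decision matters so much for error propagation.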

b. ZF-SQRD-LDSTBC as Linear dispersion space-time code
The transmission matrix for ZF-SQRD-LDSTBC can be specified by the following expression: The dispersion matrices and for and for are: for .
where and are defined for the spatial layers as: , and for the Alamouti layer as:

c. Modified SQRD
To detect the $n_{sym}$ transmitted symbols, it is necessary to calculate the QR decomposition of . If the MGS algorithm were applied directly, the complexity would increase significantly because of the dimensions of .
By taking advantage of the structure of , it is possible to split the QR decomposition into two parts: first the part corresponding to the spatial layers is obtained, and then the part corresponding to the Alamouti layer. It is important to mention that we work with the original matrix and obtain , from which it is possible to obtain by applying the equivalences defined in (Longoria, et al., 2007), (Bazdresch, et al., 2012); it is therefore possible to reach a complexity similar to that of the algorithm shown in (Wubben, et al., 2001). The complete algorithm is shown in Algorithm 1. The detection ordering is applied only to the part of the scheme corresponding to the V-BLAST layers. We also modify the calculation of the norms (which we call ) of the columns of , which are used to establish the order in which the symbols are detected. This modification saves a significant number of flops.
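For reference, the baseline that this section modifies can be sketched as below: a sorted QR decomposition computed with modified Gram-Schmidt, in the form commonly attributed to (Wubben, et al., 2001). This is only the generic SQRD applied to an arbitrary complex matrix; it does not include the two-part split or the norm estimator described above, and the function name `sqrd` and the data layout are illustrative assumptions.

```python
import math


def sqrd(H):
    """Sorted QR decomposition via modified Gram-Schmidt.

    H is a list of rows (n_r x n_t). Returns (Q_cols, R, p), where Q_cols is
    a list of orthonormal columns, R is upper triangular, and column i of
    Q R equals column p[i] of H.
    """
    n_r, n_t = len(H), len(H[0])
    Q = [[complex(H[r][c]) for r in range(n_r)] for c in range(n_t)]
    R = [[0j] * n_t for _ in range(n_t)]
    p = list(range(n_t))
    norms = [sum(x.real ** 2 + x.imag ** 2 for x in col) for col in Q]
    for i in range(n_t):
        # Pick the remaining column with the smallest residual energy.
        k = min(range(i, n_t), key=lambda l: norms[l])
        Q[i], Q[k] = Q[k], Q[i]
        norms[i], norms[k] = norms[k], norms[i]
        p[i], p[k] = p[k], p[i]
        for row in R:
            row[i], row[k] = row[k], row[i]
        r_ii = math.sqrt(norms[i])
        R[i][i] = complex(r_ii)
        Q[i] = [x / r_ii for x in Q[i]]
        # Remove the new direction from every remaining column.
        for l in range(i + 1, n_t):
            r_il = sum(a.conjugate() * b for a, b in zip(Q[i], Q[l]))
            R[i][l] = r_il
            Q[l] = [b - r_il * a for a, b in zip(Q[i], Q[l])]
            norms[l] -= abs(r_il) ** 2
    return Q, R, p


H = [[1 + 1j, 2, 0.5j],
     [0, 1 - 1j, 3],
     [2j, 1, 1 + 2j],
     [1, -1, 0]]
Q, R, p = sqrd(H)
print("detection permutation:", p)
```

Selecting the weakest remaining column at each step tends to push the strongest layers toward the end of the factorization, so that they are detected first during back-substitution. The sorting costs only the column norms, which are computed once and then updated inside the loop.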

G. calculate
In order to decrease the complexity of the QR decomposition, and because the elements of have a Gaussian distribution, we can use an estimator to calculate the energy of and obtain the detection order, as described in (Kim, et al., 2006). We have found that the performance is the same using the estimator or the exact equation, so a significant saving of flops is achieved. The equation to calculate the exact energy of a column vector is: (21) The estimator that we used to calculate the energy of a column vector is (Kim, et al., 2006), (Le, et al., 2005): (22) The estimator requires no multiplications, only sums, so a saving in complexity is achieved without loss of performance in our proposed scheme.
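The idea of ordering columns with a multiplication-free energy proxy can be illustrated as follows. The exact formula of the estimator used in this work is not reproduced here, so the sketch assumes one common multiplication-free choice, the sum of the absolute values of the real and imaginary parts; the function names `exact_energy`, `abs_sum_estimate` and `detection_order` are hypothetical.

```python
def exact_energy(column):
    """Exact energy: sum of squared magnitudes (needs multiplications)."""
    return sum(x.real ** 2 + x.imag ** 2 for x in column)


def abs_sum_estimate(column):
    """Assumed multiplication-free proxy: sum of |Re| + |Im| (additions only)."""
    return sum(abs(x.real) + abs(x.imag) for x in column)


def detection_order(columns, metric):
    """Indices of the columns sorted by increasing metric."""
    return sorted(range(len(columns)), key=lambda k: metric(columns[k]))


columns = [[3 + 4j, 1 - 1j],
           [0.5 + 0.5j, 1 + 0j],
           [2 - 2j, 2 + 2j]]
print(detection_order(columns, exact_energy))
print(detection_order(columns, abs_sum_estimate))
```

The two metrics are not equal in value and need not always produce the same ordering; what matters for the detector is only how often the resulting order differs enough to change the error rate, which the text reports as negligible in its simulations.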

A. BER Performance
To demonstrate the advantages of the proposed scheme, we performed several simulations to compare the BER performance of different MIMO systems under the conditions described above, using 16-QAM and 32-QAM modulation. Throughout this paper, the block length is fixed to and all simulations were run until 2000 block errors were found. The BER is represented as a function of the average SNR, where and is the average symbol energy. Figure 2 shows the BER performance comparison between SQRD (Wubben, et al., 2001), the KIL-VBLAST algorithm (Raza, et al., 2004), GMD-VBLAST (Jiang, et al., 2005) and the proposed scheme. As can be seen from Figure 2, the proposal improves on SQRD by around 7.5 dB and 10.5 dB when 32-QAM and 16-QAM are used, respectively, for a . This performance is achieved without requiring CSI at the transmitter and without increasing the complexity of the algorithm. With respect to KIL-VBLAST, we achieve an improvement of almost 4 dB for a . As shown, GMD-VBLAST outperforms the proposal by around 4 dB when 32-QAM modulation is used, and by 1.3 dB for 16-QAM modulation, but this gain has a penalty: high complexity on both sides of the link, and the need for CSI to operate. For this reason, we highlight that the proposed scheme considerably increases the performance of SQRD while maintaining its complexity and ease of detection; we therefore consider ZF-SQRD-LDSTBC a scheme that can be efficiently implemented in hardware.

B. Complexity analysis
To compare the complexity of the different schemes, we consider only the number of arithmetic operations necessary to obtain the QR decomposition. We assign 1 flop per addition or multiplication of real numbers, and for square roots and divisions we take the flop counts assigned in (Bazdresch, et al., 2012): 8 flops per division of real numbers and 30 flops per square root. The results for schemes of different sizes are presented in Table III. As can be seen, our proposal reduces the complexity by around twenty percent with respect to the scheme proposed in (Wubben, et al., 2001). This improvement is mainly due to the use of the energy estimator in the detection ordering. With respect to KIL-VBLAST, the complexity of calculating the QR decomposition is the same as in our proposal.
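The flop weights above can be turned into a small tally helper, sketched below. The weights (1, 1, 8 and 30) come from the text; the function names and the operation breakdown used for the example (computing the norm of a complex column and dividing the column by it) are assumptions made for illustration, not the accounting used to produce Table III.

```python
FLOPS_PER_DIVISION = 8   # per real division, as stated in the text
FLOPS_PER_SQRT = 30      # per square root, as stated in the text


def flop_count(adds=0, mults=0, divs=0, sqrts=0):
    """Weighted flop total: 1 per real addition or multiplication."""
    return adds + mults + FLOPS_PER_DIVISION * divs + FLOPS_PER_SQRT * sqrts


def column_normalization_flops(n):
    """Assumed cost of normalizing a length-n complex column.

    Squared norm: 2n real multiplications and 2n - 1 real additions.
    Then one square root and 2n real divisions (real and imaginary parts).
    """
    return flop_count(adds=2 * n - 1, mults=2 * n, divs=2 * n, sqrts=1)


for n in (2, 4, 8):
    print(n, column_normalization_flops(n))
```

With these weights, divisions and square roots dominate the cost of small operations, so savings in the norm calculations used for ordering can account for a visible share of the total.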

Table II. Spatial code rate.

Table III. Complexity comparison.