Channel and Delay Estimation for Base-station–based Cooperative Communications in Frequency-selective Fading Channels

A channel and delay estimation algorithm for both positive and negative delay, based on the distributed Alamouti scheme, has been recently discussed for base-station–based asynchronous cooperative systems in frequency-flat fading channels. This paper extends the algorithm, the maximum likelihood estimator, to work in frequency-selective fading channels. The minimum mean square error (MMSE) performance of channel estimation for both packet schemes and normal schemes is discussed in this paper. The symbol error rate (SER) performance of equalisation and detection for both time-reversal space-time block code (STBC) and single-carrier STBC is also discussed in this paper. The MMSE simulation results demonstrated the superior performance of the packet scheme over the normal scheme with an improvement in performance of up to 6 dB when feedback was used in the frequency-selective channel at an MSE of 3 x 10^-2. The SER simulation results showed that, although both the normal and packet schemes achieved similar diversity orders, the packet scheme demonstrated a 1 dB coding gain over the normal scheme at an SER of 10^-5. Finally, the SER simulations showed that the frequency-selective fading system outperformed the frequency-flat fading system.


Introduction
Space-time block coding is a technique used to achieve spatial diversity in both synchronous multiple-input multiple-output and synchronous cooperative communication systems in frequency-flat fading channels. 1 However, cooperative communication is generally asynchronous, not synchronous. This is because different relays have different locations, and the transmitted signals from different relays may arrive at different times. In asynchronous cooperative communication systems, when the relays use orthogonal space-time block codes (STBC) to forward the received data from source to destination, the code structure at the receiver is not orthogonal. 2 The system can only achieve a diversity order of 1. Therefore, new transmission schemes based on STBC are required, and the estimation of the relative delays between different paths at the destination is needed. 2 New transmission schemes based on STBC have been studied, but these studies did not take into account delay estimation. 2,3,4,5 In this paper, we focus on channel and delay estimation for asynchronous cooperative communication systems.
A typical example of asynchronous cooperative communication systems is a base-station-based cooperative communication in macrocell downlink networks, proposed by Skjevling et al. 6 Tourki and Deneire 7 proposed a channel and delay estimation algorithm that achieved a lower Cramér-Rao bound (CRB) in Skjevling et al.'s system. However, Tourki and Deneire's system was derived only for positive delays, that is, when Transmitter one's data always arrives at the receiver before Transmitter two's data. 7 Recently, Xu and Padayachee 8 extended Tourki and Deneire's scheme to accommodate negative delays (i.e. when Transmitter two's data arrives at the receiver before Transmitter one's data). However, Xu and Padayachee's scheme 8 works only in frequency-flat fading channels.
The main motivation of this paper was to extend the scheme described in Xu and Padayachee 8 to work in frequency-selective fading channels. Extending the scheme to accommodate transmission in frequency-selective channels is pivotal because current and future broadband wireless communication systems aim to have high data rates, which gives rise to frequency-selective propagation effects. A typical example is that of a mobile communicating with a base station. As a result of reflections from buildings, hills, cars and other obstacles, there are multiple delayed receptions of the transmitted signals at the receiver. This causes frequency-selective propagation effects. Mheidat et al. 9 have already discussed cooperative communication over frequency-selective channels, but their scheme assumed perfect channel knowledge at the receiver, as well as perfect synchronisation. Simeone and Spagnolini 10 discussed channel estimation in frequency-selective fading channels but still maintained the assumption of synchronicity. Sirbu 11 addressed channel and delay estimation in wireless communication systems, but assumed that the delay was fixed. Very recently Li et al. 12 proposed a simple orthogonal space-time coding scheme for asynchronous cooperative systems for frequency-selective fading channels. Their proposed system used an orthogonal frequency division multiplexing technique to convert the delay in the time domain into phase in the frequency domain, but they assumed that full channel state information was known at the receiver.

The channel impulse responses of the frequency-selective channels from BS1 and BS2 to the mobile node are given by h_1 = [h_1(0), h_1(1), ..., h_1(L_1)] and h_2 = [h_2(0), h_2(1), ..., h_2(L_2)], where L_i, i = 1,2, is the channel memory length, which is assumed to be known at the receiver. These channel impulse responses are also assumed to be Rayleigh block flat-fading, that is, the amplitude of the fading envelope follows a Rayleigh distribution which stays constant for the duration of each frame but varies independently from frame to frame. Each h_i(l_i), where i = 1,2 and l_i = 0,1,..., L_i, is assumed to be an independently and identically distributed zero-mean complex Gaussian random variable with a variance of 1/(L_i + 1). 14 For simplicity it is also assumed that L_1 = L_2 = L_c in this paper.

When designing the system model, the aim was to follow this technique for linear equalisation 7,15 :
Step 1 - model the received signals as

System model
The base-station-based cooperative diversity system shown in Figure 1 is a macrocell consisting of two base stations transmitting wirelessly to a single mobile receiver. 6,7 The base stations (BS1, BS2) and mobile node have only one transceiver each. It is assumed that both base stations have the same set of data to transfer to the mobile receiver. This can be achieved in reality via either a wired high-speed connection such as Ethernet, or a wireless transmission of the data between the base stations preceding the transmission scheme.
To begin with, the data set to be transmitted is parsed into two blocks of N symbols each,

Step 3 - define a unitary matrix K_TR as 16 [Eqn 6] where ⊗ represents the Kronecker product and I_A denotes a square identity matrix of size A.
Step 4 - because multiplying by unitary matrices does not result in any loss of decoding optimality, 14 where b_n^out represents the filtered noise, which is still zero-mean complex Gaussian noise with each entry having a variance of N_0/2 per dimension. 16 The decoupled data streams can then be sent to a standard equaliser for detection.
In order to extend the scheme in Xu and Padayachee 8 to follow the above technique, the received signal must be broken up into two parts, namely r_n and r_{n+1}, which are then treated separately. A new (N + 2L) x (N + 2L) delay matrix P(τ) is then introduced to account for the asynchronicity in the space-time block coded system. P(τ) is given by

The asynchronicity introduced by cooperative communication poses a threat to the system's orthogonality. One method proposed here to ensure orthogonality is to impose certain conditions on the choice of pilot symbols, which will enable the receiver to employ linear equalisation. This is explained in Figure 3 and the subsequent discussion. The insertion of P(τ) into the system model accurately describes the system if and only if d_2 = ts_1* when τ ≥ 0 and d_1 = -ts_2* when τ < 0. ts_1 and ts_2 are given by ts_1 = T_s d_1 and ts_2 = T_s d_2, and T_s is an L x L time-reversal matrix.
Figure 3 shows the difference between the actual system and system model for τ ≥ 0. In Figure 3, T_N is an (N + L) x (N + L) time-reversal matrix. It can be seen that when there is a positive delay in the actual system, the last τ symbols of ts_1* are received at the beginning of the current frame interval of BS2's transmission, which destroys the orthogonality of the STBC. In the system model, however, P(τ) performs a cyclic shift on s_{n+1}, which results in the last τ symbols of d_2 being placed at the beginning of BS2's frame. To overcome this discrepancy between the actual system and system model, setting d_2 = ts_1* when designing the pilot symbols removes the difference between the system model and the actual system for positive delays, thus allowing the insertion of P(τ) to restore the orthogonality of the system. Using similar reasoning, one can see the need for d_1 = -ts_2* for negative delays.
FIGURE 3: An illustration showing the differences between the actual system and the system model for Base station 1 (BS1) and Base station 2 (BS2) when the difference between the arrival times of the two signals from BS1 and BS2 is greater than or equal to zero (τ ≥ 0). Labels recovered from the figure: actual system, system model, previous frame, current frame.

Channel and delay estimation algorithm
The maximum likelihood estimation algorithm in this paper essentially follows the same steps as those used by Tourki and Deneire 7 and Sirbu 11 :
Step 1 - define ss_1, ss_2 and S
In order to account for the effects of the frequency-selective channels, ss_1, ss_2 and S have to be defined differently from the definitions used by Tourki and Deneire 7 . Before considering how ss_1 and ss_2 are defined by equations, it helps to understand how the pilot symbols are used to create these matrices. For a clearer understanding, the effect of P(τ), an L x L delay matrix, is shown in Figure 4 for τ = 1.
The composition of S is illustrated in Figure 5 for L = 4, L_c = 2 and τ = 1. Note that d_2 = ts_1* and d_1 = -ts_2* are used in Figure 5.
In Figure 5, the symbols denoted by 'τ' and 'x' are unknown at the receiver and hence cannot be used in the estimation process, because S is the region of the received signal where only pilot symbols overlap.
where ĥ is the final estimate of h based on the delay estimate τ̂.
The mean square error (MSE) is a widely used evaluation criterion for channel estimation. It is given by

Equalisation and detection
Once the channel parameters and delay have been estimated, the next step is to equalise the received signal before extracting the data.In this paper, two equalisation techniques will be presented: time-reversal STBC 16 and single-carrier STBC. 14

Time-reversal STBC
The first step in the time-reversal equalisation process is to conjugate and time reverse r_{n+1} as defined in [Eqn 10] for τ ≥ 0, then multiply the result by the delay matrix P(τ̂). This will produce the following equation:

and x and y are the row and column numbers of Q, respectively. In this paper it is assumed that the size of the DFT matrix, N + 2L, is a power of 2 and thus the terms FFT (fast Fourier transform) and DFT are interchangeable. 14 The frequency domain equations are given by

Pilot symbol design for frequency-selective fading channels
The Cramér-Rao bound (CRB) of the channel estimation for a given τ is 17 [Eqn 56] where tr(.) is the trace operator.
If optimal sequences are used, then the trace in the CRB definition in [Eqn 56] will be minimised, and the lowest achievable MSE of the channel estimation will thus also be reduced. It was shown that the use of optimal sequences in a similar synchronous system will result in the following condition 18 : [Eqn 57] where (2L - L_c) is the number of rows of S. The number of rows of S is directly related to the number of pilot symbols available for use in the estimation process. As expected, the estimation performance improves when more pilot symbols are available. Using similar reasoning, the condition for optimality for an asynchronous system can be defined as: [Eqn 58] where (2L - L_c - |τ|) is the number of rows of S(τ).
Once again, because the CRB is inversely proportional to [Eqn 58], it can be seen that increased pilot sequence lengths and decreased channel memory lengths and delay values will decrease the CRB and improve the channel estimation performance. [Eqn 58] also demonstrates that the orthogonality of S(τ) hinges upon the channel memory lengths and the delay value, hence one would need to know the channel memory lengths and delay value before attempting to design optimal pilot sequences. This implies that the base stations would require knowledge of the channel and delay. Generally, however, the base stations do not have access to this information, and hence it is not possible to satisfy [Eqn 58]. In these scenarios, using an exhaustive search to identify a pilot sequence that has an impulse-like autocorrelation is suggested. The impulse-like autocorrelation criterion will not minimise the MSE of the channel estimation, but it will optimise the delay estimation and hence improve the channel estimation and overall performance of the system.
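The exhaustive search suggested above can be sketched as follows. This is a minimal illustration rather than the procedure used for the reported results: the 4-PSK alphabet matches the simulations, but the use of the aperiodic autocorrelation, the peak-sidelobe score, the function names and the short sequence length are all assumptions made to keep the example small.

```python
from itertools import product

# Assumed pilot alphabet: unit-energy 4-PSK symbols.
ALPHABET = (1 + 0j, 1j, -1 + 0j, -1j)


def autocorrelation(seq, lag):
    """Aperiodic autocorrelation of seq at a non-negative lag."""
    return sum(seq[k + lag] * seq[k].conjugate() for k in range(len(seq) - lag))


def peak_sidelobe(seq):
    """Largest autocorrelation magnitude away from lag 0 (smaller is more impulse-like)."""
    return max(abs(autocorrelation(seq, lag)) for lag in range(1, len(seq)))


def search_pilot(length):
    """Exhaustively score every 4-PSK sequence of the given length (length >= 2).

    The first symbol is fixed to 1 because a common phase rotation does not
    change the autocorrelation magnitudes, which cuts the search by a factor of 4.
    """
    candidates = ((ALPHABET[0],) + tail for tail in product(ALPHABET, repeat=length - 1))
    return min(candidates, key=peak_sidelobe)


best = search_pilot(5)
print(best, peak_sidelobe(best))
```

The number of candidates grows as 4^(L - 1), so a search of this kind would normally be run offline, once per pilot length.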
It was mentioned that the number of rows of S(τ), in this case (2L - L_c - |τ|), must be even, 8 because it would otherwise be impossible to obtain zeros in the off-diagonal terms of (S(τ))^H S(τ) and hence [Eqn 58] would not be satisfied. This implies that optimal sequences only exist if the channel memory lengths and delay are both even or both odd. When this condition is not satisfied, suboptimal sequences for neighbouring delay values are used.
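The parity condition above is easy to check numerically. The sketch below uses hypothetical function names and simply evaluates the row count (2L - L_c - |τ|) for the simulation values L = 14 and L_c = 3; an even row count is treated as a necessary condition only, not as a guarantee that an optimal sequence exists.

```python
def pilot_rows(pilot_len, memory, tau):
    """Number of rows of S(tau): 2L - L_c - |tau|."""
    return 2 * pilot_len - memory - abs(tau)


def may_admit_optimal_pilots(pilot_len, memory, tau):
    """Necessary (not sufficient) condition: the row count must be even."""
    return pilot_rows(pilot_len, memory, tau) % 2 == 0


# Simulation values from this paper: L = 14, L_c = 3, delays in -(L - 1), ..., (L - 1).
L, L_C = 14, 3
even_row_delays = [tau for tau in range(-(L - 1), L)
                   if may_admit_optimal_pilots(L, L_C, tau)]
print(even_row_delays)
```

With an odd channel memory length such as L_c = 3, only the odd delays pass the check, which agrees with the statement that the memory length and the delay must be both even or both odd.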

An additional condition specific to the system model described in this paper is that d_2 = T_s d_1* for τ ≥ 0 and d_1 = -T_s d_2* for τ < 0. While these conditions were stipulated in order to maintain the orthogonality of the system as a whole, they reduce the likelihood of finding sequences that will ensure the orthogonality of S(τ). Although all the above conditions generally result in only suboptimal sequences being found, they greatly reduce the size of the search space and hence decrease the computational complexity required to find these suboptimal sequences.
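The effect of this pairing constraint on the search can be illustrated with a short sketch. It assumes that T_s is a plain order-reversal (anti-diagonal) matrix and uses hypothetical function names; once one pilot sequence is chosen, the other follows from it, so only one sequence needs to be searched.

```python
def partner_for_nonnegative_delay(d1):
    """d_2 = T_s d_1*: conjugate d_1 and reverse its order (tau >= 0)."""
    return [complex(x).conjugate() for x in reversed(d1)]


def partner_for_negative_delay(d2):
    """d_1 = -T_s d_2*: conjugate d_2, reverse its order and negate (tau < 0)."""
    return [-complex(x).conjugate() for x in reversed(d2)]


def search_space_sizes(pilot_len, alphabet_size=4):
    """(unconstrained, constrained) candidate counts for the pair (d_1, d_2)."""
    return alphabet_size ** (2 * pilot_len), alphabet_size ** pilot_len


d1 = [1 + 0j, 1j, -1 + 0j]          # made-up 4-PSK pilots of length 3
d2 = partner_for_nonnegative_delay(d1)
unconstrained, constrained = search_space_sizes(14)
print(d2, unconstrained // constrained)
```

For 4-PSK pilots of length L, the constraint shrinks the joint search over (d_1, d_2) from 4^(2L) to 4^L candidates.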

Simulation results
In our simulation, 4-PSK modulation was used with N = 100, L = 14 and L_c = 3. The channel and noise parameters used in this paper were as described for [Eqn 1] and [Eqn 2]. The channels were assumed to be unknown and hence needed to be estimated. Unless otherwise stated, the delay was assumed to remain constant over each frame but was allowed to vary randomly from frame to frame. The delays were uniformly distributed between -(L - 1) and (L - 1). The signal-to-noise ratio (SNR) values used refer to the ratio between symbol and noise energy. The single-input single-output (SISO) equaliser used was an MMSE equaliser.
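The per-frame random quantities described above can be drawn as in the following sketch. It is an illustration of the stated parameters only; the function names, the seed, the 4-PSK constellation rotation and the convention of unit symbol energy (so that N_0 = 10^(-SNR/10)) are assumptions.

```python
import cmath
import math
import random

N, L, L_C = 100, 14, 3  # data symbols per block, pilot length, channel memory


def draw_channel(rng, memory=L_C):
    """memory + 1 i.i.d. zero-mean complex Gaussian taps, each of variance 1/(memory + 1)."""
    sigma = math.sqrt(1.0 / (2 * (memory + 1)))  # standard deviation per real dimension
    return [complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(memory + 1)]


def draw_delay(rng, pilot_len=L):
    """Integer delay drawn uniformly from -(L - 1), ..., (L - 1)."""
    return rng.randint(-(pilot_len - 1), pilot_len - 1)


def draw_4psk(rng, count):
    """Unit-energy 4-PSK symbols at odd multiples of pi/4 (assumed rotation)."""
    return [cmath.exp(1j * (math.pi / 4 + (math.pi / 2) * rng.randrange(4)))
            for _ in range(count)]


def draw_noise(rng, count, snr_db):
    """Complex Gaussian noise with variance N_0/2 per dimension, assuming E_s = 1."""
    n0 = 10 ** (-snr_db / 10)
    sigma = math.sqrt(n0 / 2)
    return [complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(count)]


rng = random.Random(2024)
h1, h2 = draw_channel(rng), draw_channel(rng)  # redrawn independently for every frame
tau = draw_delay(rng)
data = draw_4psk(rng, N)
noise = draw_noise(rng, N + 2 * L, snr_db=10)
```

With a tap variance of 1/(L_c + 1), the expected total energy of each channel impulse response is 1, so the frequency-selective and frequency-flat cases are compared at the same average received power.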
Two schemes, a normal scheme and a packet scheme, were used for data transmission in this paper. Normal schemes transmit pilot symbols, followed by data, whilst packet schemes transmit x pilot frames after every y normal frames.
The packet scheme can be divided into three phases. In the first phase, the base stations transmit x pilot frames consisting of pilot symbols only. The mobile receiver then uses these frames to obtain an average estimate of the channel delay, and then transmits the estimated delay back to the base stations via the feedback channel in phase two. In phase three, the base stations use this delay estimate to select the corresponding optimal pilot sequences for channel estimation to be used in the next y normal frames consisting of data and pilot symbols. It is assumed that the delay remains constant over each round of the above three phases, but is allowed to vary from round to round.
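The three phases can be outlined in code as below. This is a sketch with hypothetical names: the frame scheduling follows the description above, but the way the per-frame delay estimates are averaged (rounding the arithmetic mean) and the lookup table of pilot sequences are assumptions.

```python
import math


def packet_schedule(x_pilot, y_normal, rounds):
    """Yield (round, frame_type): x pilot-only frames, then y normal frames, per round."""
    for round_index in range(rounds):
        for _ in range(x_pilot):
            yield round_index, "pilot"
        for _ in range(y_normal):
            yield round_index, "normal"


def average_delay(per_frame_estimates):
    """Phase one: combine the per-pilot-frame delay estimates into one integer delay.

    Rounding the arithmetic mean to the nearest integer is an assumption.
    """
    mean = sum(per_frame_estimates) / len(per_frame_estimates)
    return math.floor(mean + 0.5)


def select_pilots(pilot_table, fed_back_delay, default_pilots):
    """Phase three: look up the pilot pair designed for the fed-back delay."""
    return pilot_table.get(fed_back_delay, default_pilots)


# One round with x = 5 pilot frames; the estimates below are made-up values.
estimates = [3, 3, 4, 3, 3]
delay = average_delay(estimates)  # phase one, at the mobile receiver
pilots = select_pilots({3: "pilots designed for tau = 3"}, delay, "fixed pilots")
print(delay, pilots)
```

The fallback argument of `select_pilots` stands in for the case without feedback, in which a fixed pilot sequence is used regardless of the delay.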
The first simulation was done for a packet transmission scheme, which shows how the delay error probability varies in the presence of a frequency-selective channel as a function of the number of pilot frames, x, and the SNR at which these frames are transmitted. Figure 7 shows the delay estimation performance for a varying number of pilot frames. As can be seen, the delay estimation performance improves as the number of pilot frames increases. Depending on the bandwidth and power requirements of the system, as well as its error tolerances, the user may choose the number of pilot frames and transmission power that is best suited for their system. It is also worth noting that in a system with a relatively high SNR, users may transmit the pilot frames at different and lower power levels than the data frames, depending on the desired error performance.
The next simulation was done for both a normal transmission scheme and a packet transmission scheme. The simulation investigated the performance of channel estimation in a frequency-selective fading channel.
The CRB given by [Eqn 56] was derived assuming that the delay was a known parameter; it was therefore once again used here only to obtain a tractable comparison. Figure 8 shows the MSE of the channel estimation for varying scenarios. It has the following plots:
• MSE (normal) - MSE of the channel estimation in the normal system where the packet scheme was not used.
• MSE (no feedback) - MSE of the channel estimation when the packet scheme was used but no feedback was employed.
• CRB (no feedback) - CRB for the channel estimation when the packet scheme was used but no feedback was employed, that is, a fixed pilot sequence was used for channel estimation.
• MSE (with feedback) - MSE of the channel estimation when the packet scheme was used and feedback was employed.
• CRB (with feedback) - CRB for the channel estimation when the packet scheme was used and feedback was employed, that is, the base stations used specific pilot sequences for channel estimation based on the delay estimates that they received.
From Figure 8 it can immediately be seen that the use of the packet scheme provided a substantial decrease in the MSE of the channel estimation when compared to the normal scheme: an approximate 5 dB decrease without feedback, and a further 2 dB decrease when feedback was employed. It is also important to note that the MSE of the packet scheme overlapped with the respective CRB, whether feedback was employed or not. This highlights the fact that the use of the packet scheme resulted in a near zero delay error probability, as the CRB used was derived assuming the delay was a known parameter.
Figure 9 shows the symbol error rate (SER) performance of the normal scheme compared to that of the packet scheme. Figure 9 also shows the performance curves of the system under ideal conditions, that is, when the system is assumed to be synchronous and the channel parameters are assumed to be known at the receiver. Figure 9 contains the following plots:
• SER TR (normal) - SER performance when the time-reversal equalisation technique was used and the packet scheme was not employed.
• SER SC (normal) - SER performance when the single-carrier equalisation technique was used and the packet scheme was not employed.
• SER TR (packet) - SER performance when the time-reversal equalisation technique was used and the packet scheme was employed. After every 200 normal frames, 5 pilot frames at an SNR of 10 dB were transmitted.
• SER SC (packet) - SER performance when the single-carrier equalisation technique was used and the packet scheme was employed. After every 200 normal frames, 5 pilot frames at an SNR of 10 dB were transmitted.
• SER TR (ideal) - SER performance of the system under ideal conditions with time-reversal equalisation used at the receiver.
• SER SC (ideal) - SER performance of the system under ideal conditions with single-carrier equalisation used at the receiver.
• SER SC (ref) - SER performance of a similar system as presented by Mheidat and Uysal 16 , in which ideal conditions were assumed.
The first notable observation from Figure 9 is that the performances of the two equalisation techniques overlap regardless of which scheme was used. Mheidat and Uysal 16 also reported the same results. It can also be seen that the performance of the system presented here, when ideal conditions were assumed, overlapped with the performance of the system presented by Mheidat and Uysal 16 , as expected.
The packet scheme shows a 1 dB improvement in performance over the normal scheme and is only between 1 dB and 1.5 dB 'off ideal' performance. This reiterates the packet scheme's credibility and further justifies the minimal increase in overhead that the system may incur as a result of its employment.
Finally, to demonstrate the benefits of multipath diversity, Figure 10 shows the SER performance of the system when transmitting over frequency-flat fading channels (SER FF) versus the performance of the system when transmitting over frequency-selective fading channels (SER FS). As can be seen, even though there were more parameters to estimate, multipath diversity enabled the frequency-selective fading system to outperform the frequency-flat fading system by 6 dB at an SER of 8 x 10^-4, even when ideal conditions were assumed in the frequency-flat fading system.

Conclusion
The channel and delay estimation algorithm proposed by Xu and Padayachee 8 was extended to accommodate frequency-selective channels. The new time-reversal and single-carrier equalisation techniques proposed were capable of accommodating the asynchronicity of the system. The required criteria to design optimal pilot sequences for frequency-selective channels were presented, with the imposed conditions suggesting that suboptimal sequences will have to be used in most cases. The MSE simulation results demonstrated the superior performance of the packet scheme over the normal scheme with an improvement in performance of up to 5 dB when feedback was used. The SER simulations showed that, although both the normal and packet schemes achieved similar diversity orders, the packet scheme demonstrated a 1 dB gain over the normal scheme. It is also important to note that, although there were more fading coefficients to estimate, the frequency-selective fading channel model still outperformed the frequency-flat fading channel model in terms of SER performance because of the diversity benefits provided by the multipath nature of the frequency-selective fading channel.

FIGURE 8: The mean square error (MSE) of the channel estimation with an increasing signal-to-noise ratio (SNR, dB) under different scenarios: Cramér-Rao bound (CRB) (no feedback) - CRB for the channel estimation when the packet scheme was used but no feedback was employed; CRB (with feedback) - CRB for the channel estimation when the packet scheme was used and feedback was employed; MSE (normal) - MSE of the channel estimation in the normal system where the packet scheme was not used; MSE (no feedback) - MSE of the channel estimation when the packet scheme was used but no feedback was employed; and MSE (with feedback) - MSE of the channel estimation when the packet scheme was used and feedback was employed.

FIGURE 9: The symbol error rate (SER) performance with an increasing signal-to-noise ratio (SNR, dB) of the normal scheme compared to the packet scheme under different scenarios: SER TR (normal) - SER performance when the time-reversal equalisation technique was used and the packet scheme was not employed; SER SC (normal) - SER performance when the single-carrier equalisation technique was used and the packet scheme was not employed; SER TR (packet) - SER performance when the time-reversal equalisation technique was used and the packet scheme was employed; SER SC (packet) - SER performance when the single-carrier equalisation technique was used and the packet scheme was employed; SER TR (ideal) - SER performance of the system under ideal conditions when time-reversal equalisation was used at the receiver; SER SC (ideal) - SER performance of the system under ideal conditions when single-carrier equalisation was used at the receiver; and SER SC (ref) - SER performance of a system such as that presented by Mheidat and Uysal 16 , in which ideal conditions were assumed.

FIGURE 10: The symbol error rate (SER) performances of the frequency-flat fading channels (SER FF and Ideal FF) compared to that of the frequency-selective fading channels (SER FS and Ideal FS) with an increasing signal-to-noise ratio (SNR, dB), demonstrating the benefits of multipath diversity.

r_n and r_{n+1} are (N + 2L) x 1 received symbol vectors. b_n and b_{n+1} represent (N + 2L) x 1 zero-mean complex Gaussian noise vectors with each entry having a variance of N_0/2 per dimension, T is an (N + 2L) x (N + 2L) time-reversal matrix as described by [Eqn 3] and H_1 and H_2 are (N + 2L) x (N + 2L) circulant channel matrices given by [Eqn 4]
Step 2 - conjugate and time-reverse r_{n+1} and allow the received signals r_n and r_{n+1} to be grouped in matrix form by using the identity T H_i* T = H_i^H, 9 which is shown in [Eqn 5] where H_i^H, i = 1,2, denotes the Hermitian transpose of H_i.
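The identity T H_i* T = H_i^H used in Step 2 can be checked numerically for a small circulant matrix. The sketch below is an illustration under stated assumptions: T is taken to be the anti-diagonal order-reversal permutation (the exact form of [Eqn 3] is not reproduced here), the matrix size and taps are arbitrary, and the helper names are hypothetical.

```python
import random


def circulant(taps, size):
    """size x size circulant matrix whose first column is the taps followed by zeros."""
    column = list(taps) + [0j] * (size - len(taps))
    return [[column[(i - j) % size] for j in range(size)] for i in range(size)]


def time_reversal(size):
    """Anti-diagonal permutation matrix (assumed form of T): reverses a vector's order."""
    return [[1 if i + j == size - 1 else 0 for j in range(size)] for i in range(size)]


def matmul(a, b):
    """Plain matrix product for small illustrative sizes."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def conjugate(a):
    """Element-wise complex conjugate."""
    return [[x.conjugate() for x in row] for row in a]


def hermitian(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]


def max_abs_difference(a, b):
    """Largest element-wise magnitude of a - b."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


rng = random.Random(0)
taps = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(3)]  # L_c = 2
H = circulant(taps, 8)
T = time_reversal(8)
identity_error = max_abs_difference(matmul(matmul(T, conjugate(H)), T), hermitian(H))
print(identity_error)
```

The identity relies on H being circulant, which is why the cyclic prefix of training symbols is inserted before transmission.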

d_n and d_{n+1}, where n is the block number. Training symbols d_1 and d_2, each of length L, are then added at the end of d_n and d_{n+1}, respectively, to form two (N + L) x 1 vectors. To insert a cyclic prefix of training symbols between any two successive blocks, these vectors are pre-multiplied by a precoding matrix F_p. Inserting the pilot sequences in clusters in the middle of the data frame minimises the CRB. 13 Multiplying the vectors by F_p results in two (N + 2L) x 1 blocks s_n and s_{n+1}. The blocks are then transmitted according to the time-reversed block form of Alamouti's scheme shown in Figure 2. In Figure 2, T is a time-reversal matrix which is described in [Eqn 3].

FIGURE 1: An illustration of the system model of the base-station-based cooperative diversity system, showing the macrocell consisting of two base stations transmitting wirelessly to a single mobile receiver. BS, base station; h, channel impulse response of the frequency-selective channel from the base station to the mobile receiver.

FIGURE 2: A block transmission of the time-reversed block form of Alamouti's scheme. BS, base station; T denotes a time-reversal matrix; (s_n)* denotes the conjugate and n is the block number.
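The frame construction and block transmission described above can be sketched as follows. The sketch makes two assumptions that are not confirmed by the text: that pre-multiplying by F_p copies the trailing L training symbols to the front of the block, and that the two transmission intervals follow one common sign and ordering convention for time-reversed Alamouti transmission. All function names are hypothetical.

```python
def build_frame(data, pilots):
    """Append the pilots to the data block, then copy them in front as a cyclic prefix.

    Returns the (N + 2L)-symbol block [pilots, data, pilots]; this is the assumed
    effect of pre-multiplying the (N + L) x 1 vector by F_p. Requires len(pilots) >= 1.
    """
    block = list(data) + list(pilots)      # (N + L) symbols
    return block[-len(pilots):] + block    # (N + 2L) symbols


def time_reverse_conjugate(frame):
    """Apply T and conjugate: reverse the symbol order and conjugate each symbol."""
    return [complex(x).conjugate() for x in reversed(frame)]


def alamouti_slots(s_n, s_n1):
    """Two transmission intervals as (BS1 block, BS2 block) pairs.

    The sign and ordering used in the second interval are an assumption.
    """
    first = (list(s_n), list(s_n1))
    second = ([-x for x in time_reverse_conjugate(s_n1)], time_reverse_conjugate(s_n))
    return first, second


# Toy sizes: N = 3 data symbols and L = 2 training symbols per block.
s_n = build_frame([1, 2, 3], [9, 8])
s_n1 = build_frame([4, 5, 6], [7, 6])
first_interval, second_interval = alamouti_slots(s_n, s_n1)
print(s_n, len(s_n))
```

With the simulation values N = 100 and L = 14, each block has N + 2L = 128 symbols, which is a power of 2.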

[Eqn 8] where |.| denotes the absolute value operator and 0_{AxB} denotes a matrix of zeros of dimensions (A x B). P(τ) is essentially a circularly shifted identity matrix used to circularly shift a symbol vector by τ symbols. τ is the difference between the arrival times of the two signals, τ_1 and τ_2, where τ_1 and τ_2 are the arrival times of the first (BS1) and second (BS2) signals, respectively. With the inclusion of P(τ), the asynchronous received signal can be modelled as follows: When τ ≥ 0,

[Eqn 9] to [Eqn 12] are derived simply by noting that for positive delays the symbols from BS2 are shifted with respect to BS1, whilst for negative delays the symbols from BS1 are shifted with respect to BS2.
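The description of P(τ) as a circularly shifted identity matrix can be made concrete with a small sketch. The shift direction chosen here (a positive τ wraps the last τ symbols to the front of the vector) is an assumption, and the function names are hypothetical.

```python
def delay_matrix(size, tau):
    """Return a size x size circularly shifted identity matrix.

    Row i has its single 1 in column (i - tau) mod size, so multiplying a
    column vector by the matrix moves every entry tau places down (tau > 0)
    or |tau| places up (tau < 0), wrapping around at the ends.
    """
    return [[1 if (i - tau) % size == j else 0 for j in range(size)]
            for i in range(size)]


def apply_matrix(matrix, vector):
    """Plain matrix-vector product for small illustrative sizes."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


symbols = [10, 20, 30, 40, 50]
print(apply_matrix(delay_matrix(5, 1), symbols))   # [50, 10, 20, 30, 40]
print(apply_matrix(delay_matrix(5, -2), symbols))  # [30, 40, 50, 10, 20]
```

Because the matrix is a permutation, applying `delay_matrix(size, -tau)` undoes the shift introduced by `delay_matrix(size, tau)`.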

FIGURE 4: An illustration showing the effect of the L x L delay matrix, P(τ), when the difference between the arrival times of the two signals from BS1 and BS2 equals 1 (τ = 1).

FIGURE 5: An illustration showing the composition of the S matrices for Base station 1 (BS1) and Base station 2 (BS2) when the channel memory length equals 2 (L_c = 2), the length of the training symbol equals 4 (L = 4) and the difference between the arrival times of the two signals from BS1 and BS2 equals 1 (τ = 1).

[Eqn 34]
The identity, [Eqn 35], is proven in Appendix 1: [Eqn 35]
Using [Eqn 35] and the identity T H_i* T = H_i^H, 9 [Eqn 34] can be reduced to [Eqn 36]
Combining [Eqn 9] and [Eqn 36] in matrix form yields [Eqn 37]
Next we define the unitary matrix K_TR as 16 [Eqn 38]
[Eqn 38] is identical to the definition in [Eqn 6] and is just repeated here for convenience. Multiplying [Eqn 37] by K_TR allows for the decoupling of the output streams according to [Eqn 39] and [Eqn 40] where b_n^out and b_{n+1} still represent zero-mean complex Gaussian noise with each entry having a variance of N_0/2 per dimension. Because the output streams are decoupled, standard equalisation techniques such as minimum mean square error (MMSE) or maximum likelihood sequence estimation (MLSE) may be employed. The analysis that yielded [Eqn 39] and [Eqn 40] was carried out assuming that τ ≥ 0. A similar analysis for τ < 0 will yield the following decoupled output equations: [Eqn 41] and [Eqn 42]

Single-carrier STBC
The first step in the single-carrier equalisation process is to once again conjugate r_{n+1} and then multiply it by P(τ)T to obtain [Eqn 34]. Next, transfer r_n and [Eqn 34] to the frequency domain by multiplying them by the discrete Fourier transform (DFT) matrix Q where [Eqn 43]

FIGURE 7: The delay error probability with an increase in the signal-to-noise ratio (SNR, dB) for a varying number of pilot frames (x = 1, ..., 5).