
\documentclass[12pt, intlimits, headings=normal, usenames, dvipsnames]{scrartcl}

\usepackage[english]{babel} \usepackage[utf8]{inputenc} \usepackage{lmodern}  % \usepackage{amsmath} \usepackage{graphicx} \usepackage{amssymb} \usepackage{amsfonts} \usepackage[colorinlistoftodos]{todonotes} \usepackage{bm} \usepackage{csquotes} \usepackage{xurl}

\usepackage{trfsigns}

\usepackage[colorlinks=true,

           citecolor=NavyBlue,
           linkcolor=Purple,
           urlcolor=BlueViolet
           ]{hyperref}  % hidelinks

\usepackage{enumitem} \setlist[enumerate]{itemsep=0mm} \setlist[itemize]{itemsep=0mm}

\usepackage{caption} \captionsetup{format=plain}

% command for making vectors, etc., boldface \newcommand{\vex}[1]{\boldsymbol{#1}}  % vs. mathbf{} \newcommand{\tp}{^{\mathsf{T}}}  % transposition \newcommand{\herm}{^{\mathsf{H}}}  % conjugate transpose

% \newcommand{\ept}{\mathcal{E}}  % expectation operator \newcommand{\ept}{\mathrm{E}}  % expectation operator \newcommand{\cov}{\mathrm{Cov}}  % \newcommand{\cor}{\mathrm{Cor}}  % \newcommand{\var}{\mathrm{Var}}  % \newcommand{\sd}{\mathrm{Sd}}  %

\newcommand{\rv}[1]{\mathsf{#1}}  % \newcommand{\est}[1]{\hat{#1}}  %

\newcommand{\diff}{\mathrm{d}}

\newcommand{\field}[1]{\mathbb{#1}} \newcommand{\C}{\field{C}} \newcommand{\R}{\field{R}} \newcommand{\Z}{\field{Z}} \newcommand{\N}{\field{N}} \newcommand{\Prime}{\field{P}}

% shorthand for GF() \newcommand{\gf}[1]{\mathrm{GF}(#1)}

\providecommand{\bone}{\quad \laplace \quad}

% ------------------------------------------------------------------------------------------- \usepackage{exsheets} % \RenewQuSolPair{question}[Question]{solution} \SetupExSheets{question/name=Question} \SetupExSheets{solution/name=Answer} \SetupExSheets{solution/print=true} % -------------------------------------------------------------------------------------------

% ------------------------------------------------------------------------------------------- % -------------------------------------------------------------------------------------------

\title{\LARGE{Digital Communications 2}\\

      \Large{Formulary, Summaries, Questions and Answers}}

\author{ $\sim$ondelette$\sim$ } \date{\large \today}

% -------------------------------------------------------------------------------------------

% ----------------------------------------------------------------------- \begin{document} \maketitle % \clearpage % -----------------------------------------------------------------------

\tableofcontents \clearpage % -----------------------------------------------------------------------

% =========================================================================================== \section*{What is this supposed to be?}

Glad you asked. A humble student's attempt at coping with this lecture. All summarized stuff below was in one way or another taken and adapted from the official lecture notes, a quite old version of those actually. If you find a page number or similar, pointing at the lecture notes, chances are that they won't help you much if you have a more recent version. But don't be angry, rather put the new page number there and everyone will rejoice!

There is absolutely no warranty whatsoever that the stuff in this document is correct (although it was done with care). It may help you to get an idea of what are the most important things to know and look at.


% =========================================================================================== \section{Possible Questions Oral Exams} Those were handed down through generations of students taking this class. Not an exhaustive list. Not all of them are worked out in the following, but the material necessary to answer most of them is summarized further below.

\begin{itemize}

   \item HISO channel model, how the characteristic is described and how this changes if the channel is memoryless and Gaussian.
   HISO channel and its statistical characterization, how does it change if the channel is memoryless and when it is time-invariant memoryless.
   \item Catastrophic encoding of convolutional codes.
   Basics of convolutional codes, what does catastrophic mean and what is the criterion.
   \item Dual code.
    Check matrix and dual code: just a basic overview, not too specific
   \item Check matrix
   \item MD and BMD decoder - how they work and error probability
   \item Polynomial matrix description and non-catastrophic encoder.
   Polynomial generator matrix of convolutional codes
   \item Turbo Codes - Encoder, performance, why use IIR filters, weight distribution, why interleaver.
   Turbo codes: encoder, type of constituent codes, weight distribution, interleaver
   \item Primitive cyclic codes: principles of cyclic codes, primitive block-length, splitting field, factorization of generator and check polynomial, DFT over GF($q^m$), special properties of RS codes

\end{itemize}


% ===================================================================

\noindent\rule{\textwidth}{1pt} % ------------------------------------------------------------------- % ===================================================================

% ------------------------------------------------- \begin{question} HISO channel model, how the characteristic is described and how this changes if the channel is memoryless and Gaussian. \\ HISO channel and its statistical characterization, how does it change if the channel is memoryless and when it is time-invariant memoryless. \\ (Basically: know everything about the characterization of the channels.) \end{question}  % -------------------------------------------------

\begin{solution} % ------------------------------------------------- HISO = Hard Input, Soft Output.

TO DO: alternative, equivalent descriptions from symbols, etc., to the soft pseudosymbols.

Different versions and their assumptions \begin{itemize}

   \item HISO: the transition pdf of the channel is used for characterizing the channel.
   The channel can be scalar- or vector-valued.
   \item Memoryless HISO: can be time-variant or time-invariant.
   Memorylessness is characterized by independent scalar uses of the scalar channel.
   \item Gaussian memoryless HISO: assumes memorylessness, time-invariance of the channel, and Gaussian noise.

\end{itemize}

TO DO: what is the connection between memorylessness and ISI? How does Gaussian noise relate to the memorylessness? Also check DC1 lecture notes if necessary.

\end{solution}  % -------------------------------------------------

\noindent\rule{\textwidth}{1pt} % -------------------------------------------------------------------

% ===================================================================

% ------------------------------------------------- \begin{question} HIHO channel model, how the characteristic is described and how this changes if the channel is memoryless and ... . \\ HIHO channel and its statistical characterization, how does it change if the channel is memoryless and when it is time-invariant memoryless. \\ (Basically: know everything about the characterization of the channels.) \end{question}  % -------------------------------------------------

\begin{solution} % ------------------------------------------------- HIHO = Hard Input, Hard Output.

TO DO: alternative, equivalent descriptions from symbols, etc.

Different versions and their assumptions \begin{itemize}

   \item HIHO: the transition pmf of the channel is used for characterizing the channel.
   The channel can be scalar- or vector-valued.

\end{itemize}


\end{solution}  % -------------------------------------------------

\noindent\rule{\textwidth}{1pt} % -------------------------------------------------------------------


% ===================================================================

\begin{question} % ------------------------------------------------- Dual code. Check Matrix and Dual Code: just basic overview, not too specific. \end{question}  % -------------------------------------------------


\begin{solution} % -------------------------------------------------

p. 134 (Linear Block Codes), p. 184, 194 (Cyclic BC).

\minisec{Dual Code, Linear Block Codes} A linear $(N, K)_q$ code $\mathcal{C}$ is a $K$-dimensional linear space (a linear subspace of $[\gf{q}]^N$). There exists an orthogonal subspace of dimension $N-K$ that is again a code; this is the dual code of the original code. The dual code is defined over the same symbol alphabet and has the same blocklength $N$, but a different dimension, i.e., a different number of parity symbols.

The rate of the dual code is $1-R$, where $R$ is the rate of the original code.

The dual of the dual code is the original code.

The check matrix of the dual code is the generator matrix of the original code, and vice-versa.

Weight enumerators of a code and its dual code are related by the MacWilliams identity. This can be used to calculate the weight enumerator via the dual code, which can be easier, especially for high-rate codes.
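For reference, a common one-variable form of the MacWilliams identity for a linear $(N, K)_q$ code with weight enumerator $A(x) = \sum_{w=0}^{N} A_w x^w$ (not spelled out in this summary; double-check against the lecture notes): \begin{equation}

   A^{\perp}(x) = \frac{1}{\lvert \mathcal{C} \rvert} \, \bigl(1 + (q-1) x\bigr)^N \, A\!\left( \frac{1-x}{1+(q-1)x} \right).

\end{equation}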


\minisec{Dual Code, Cyclic Block Codes} The dual code of a cyclic code is again a cyclic code. The check polynomial of a cyclic code is closely related to the generator polynomial of the dual code: they are reciprocal polynomials of each other.

These relations also show up between the check and generator matrices when matrix descriptions are used, especially in their cyclic structure (they are Toeplitz matrices).


\end{solution}  % -------------------------------------------------

\noindent\rule{\textwidth}{1pt} % -------------------------------------------------------------------

% ===================================================================


\begin{question} % ------------------------------------------------- Basics of convolutional codes, what does catastrophic mean and what is the criterion. Catastrophic encoding of convolutional codes. \end{question}  % -------------------------------------------------

\begin{solution} % -------------------------------------------------

Catastrophic encoders: p. 248f, Noncatastrophic encoders: p. 266f (criterion for obtaining noncatastrophic encoders).

\minisec{Basics of Convolutional Codes} [p. 242f]

Not a block code, i.e., it does not map a finite number of data symbols to a finite number of code symbols. Instead, a continuous stream of data symbols is mapped to a continuous stream of code symbols.

Internally, there is a frame structure: each data frame of length $K$ is mapped to a code frame of length $N$. The frames themselves are finite, but the sequence of frames may be infinite, i.e., contain infinitely many frames.

For block codes, $K$ and $N$ may be very large, whereas for convolutional codes they are very small, in the range of 1, 2, 3.

Encoding is not memoryless: each code frame depends explicitly on the current and a certain number of past data frames. This is the memory of the encoder.

CHECK: does this yield a channel that is not memoryless? I assume so, as the encoder is part of the channel as well?!

TO DO: check the constraint length of the code $\nu$ [p. 248].

Properties: \begin{itemize}

   \item linearity
   \item time-invariance
   \item causality
   \item rate
   \item frame memory order
   \item constraint length
   \item systematic encoding
   \item catastrophic encoding

\end{itemize}


Different ways of describing conv. codes: \begin{itemize}

   \item Matrix description: generator and check matrices
   \item Polynomial description
   \item Polynomial matrix description
   \item Trellis representation, graph searching decoders. Viterbi algorithm for HI and SI decoding.
   \item (Not relevant for exam) Trellis Coded Modulation (TCM)

\end{itemize}


\minisec{Catastrophic encoder} Encodes at least one data sequence of infinite weight into a code sequence of finite weight (that is then transmitted over a channel). It could happen that the finite number of (nonzero) code symbols get zeroed during the transmission over the channel. This would result in an infinite number of symbol errors caused by a finite number of transmission errors on the channel (because the encoded data sequence has infinite weight and thus infinite length).

Note that this is a property of the encoder, not the code.

A catastrophic encoder can be detected when the state diagram exhibits a cycle labeled with input data symbols that are not all zero but code symbols that are all zero [p. 276].


\minisec{Noncatastrophic encoder}

There exists a necessary and sufficient condition on the polynomials formed from the tap constellation vectors. These can be arranged in a polynomial generator matrix.

The GCD of the generator polynomials needs to be of the form $x^L$, in practice $1$ (i.e., $L = 0$), for the encoder to be noncatastrophic.



\end{solution}  % -------------------------------------------------

\noindent\rule{\textwidth}{1pt} % -------------------------------------------------------------------

% ===================================================================


\begin{question} % ------------------------------------------------- Polynomial matrix description and non-catastrophic encoder. Polynomial generator matrix of convolutional codes. \end{question}  % -------------------------------------------------

\begin{solution} % -------------------------------------------------

Section 7.4, pp. 262, 265.

One can form polynomials for the data and code streams, and finite-degree polynomials for the tap coefficient vectors.


\end{solution}  % -------------------------------------------------

\noindent\rule{\textwidth}{1pt} % -------------------------------------------------------------------

% ===================================================================


\begin{question} % ------------------------------------------------- MD and BMD decoder: how they work and error probability. \end{question}  % -------------------------------------------------

\begin{solution} % -------------------------------------------------

Section 3.3, pp. 85, 89.

Both the MD and the BMD decoder are based on purely deterministic principles; they do not suppose any knowledge of the statistical characteristics of the channel (transition pdf or pmf) or of the source pmf. They have sub-optimum performance in general, but a guaranteed error correction capability $t$. They can coincide with optimum decoders in certain cases.

% ------------------------------------------------------------------- \minisec{Minimum Distance (MD)}

Chooses codeword closest to senseword: \begin{equation}

   \hat{\vex{c}}_{\mathrm{MD}} = \arg \min_{\vex{c} \in \mathcal{C}} d_H (\vex{c}, \vex{v}).

\end{equation} The MD decoder is equal to the ML decoder for the symmetric DMC and the BSC. These decoders minimize the block error probability when all codewords are equally likely.

If $m$ errors occur, $d_H(\vex{c}, \vex{v}) = m$.

The MD decoder makes use of the Hamming distance $d_H$, thus we are interested in the minimum distance of the code, $d_{\mathrm{min}}$. From this we can compute the quantity $t$, the so-called random error correction capability (random as in: errors can occur at random places in the codeword): \begin{equation}

   t = \left\lfloor \frac{d_{\mathrm{min}} - 1}{2} \right\rfloor.

\end{equation} The MD decoder can \textbf{correct} all possible sensewords with up to $t$ symbol errors, this is guaranteed. If there are more than $t$ errors, then decoding errors are possible, i.e., the decoder may choose a wrong codeword for the received senseword.

Decoding requires an exhaustive search over all $q^K$ codewords for each given senseword $\vex{v}$. This is often impractical.


% ------------------------------------------------------------------- \noindent \textbf{Block Error Probability}

To state an error probability, a channel model has to be assumed. In the notes, a DMC is assumed, which makes the MD and the ML decoder equivalent here.

The probability of a block error is then approximately given by the product of the mean number of nearest neighbors of a codeword and $\mathcal{Q}(d_{\min}, p)$, where $p$ is the probability of a symbol error. $\mathcal{Q}(d_{\min}, p)$ is a sum over products $\binom{d_{\min}}{m} p^m (1-p)^{d_{\min}-m}$.

% ------------------------------------------------------------------- \minisec{Bounded Minimum Distance (BMD)}

The bounded minimum distance decoder only considers a limited number of sensewords around each codeword as candidates for decoding to the respective codeword. This leads to a smaller search space as compared to the MD decoder that needs to compare a given senseword to each codeword in order to find the one with minimum Hamming distance.

How is this actually done for the BMD? Does it have a dictionary of codewords up to some maximum Hamming distance $r$ for each senseword, such that all codewords are included in the search? Going through all the codewords is not an option; that would be the same as the MD decoder again.

The MD decoder is able to correct some sensewords with more than $t$ errors, whereas the BMD decoder is not. It will always declare a decoding failure for these sensewords.

BMD only considers sensewords within a Hamming sphere of radius $r \leq t$ around each codeword. Between the spheres around the codewords lies the interstitial region. The spheres associated with the codewords are called decoding spheres. Because the minimum distance between any 2 codewords is $d_{\min}$, and $r \leq t$, the decoding spheres never intersect.

In comparison: \begin{enumerate}

   \item The BMD decoder can correct all possible sensewords with up to $r \leq t$ symbol errors, but no sensewords with more than $r$ errors; some of these result in decoding failures.
   \item The MD decoder can correct all sensewords with up to $t$ errors, and also some with more than $t$ errors, as long as they are still closer to the correct codeword than to any other codeword.

\end{enumerate}


% ------------------------------------------------------------------- \noindent \textbf{Block Error Probability}

Again assuming the symmetric DMC, with total symbol error probability $P = (|\mathcal{D}|-1)p$ (where $p$ is the probability of each specific wrong symbol) and correct-symbol probability $1-P$:

\begin{align}

   P_{\mathrm{BMD}}\{ \mathcal{E} \} &= \sum_{m=r+1}^{N} \binom{N}{m} P^m (1-P)^{N-m} \\
                            &= 1 - P_{\mathrm{BMD}}\{ \mathcal{C} \} = 1 - \sum_{m=0}^{r} \binom{N}{m} P^m (1-P)^{N-m}.

\end{align}
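A quick numerical sanity check of this formula (a minimal sketch; the values of $N$, $r$, $P$ below are arbitrary examples, not from the notes):

\begin{verbatim}
from math import comb

def bmd_block_error_prob(N, r, P):
    """P_BMD{E} on a symmetric DMC: senseword has more than r symbol errors."""
    return sum(comb(N, m) * P**m * (1 - P)**(N - m)
               for m in range(r + 1, N + 1))

# Example: N = 7, decoding radius r = 1, symbol error probability P = 0.01.
print(bmd_block_error_prob(7, 1, 0.01))   # ~2.0e-3
\end{verbatim}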


\end{solution}  % -------------------------------------------------

\noindent\rule{\textwidth}{1pt} % -------------------------------------------------------------------

% ===================================================================


\begin{question} % ------------------------------------------------- Primitive cyclic codes: principles of cyclic codes, primitive block-length, splitting field, factorization of generator and check polynomial, DFT over GF($q^m$). Special properties of RS codes. \end{question}  % -------------------------------------------------

\begin{solution} % -------------------------------------------------

Section 6.


\end{solution}  % -------------------------------------------------

\noindent\rule{\textwidth}{1pt} % -------------------------------------------------------------------

% ===================================================================

\begin{question} % ------------------------------------------------- Turbo Codes: Encoder, performance, why use IIR filters, weight distribution, why interleaver. \\ Turbo codes: encoder, type of constituent codes, weight distribution, interleaver. \end{question}  % -------------------------------------------------

\begin{solution} % -------------------------------------------------

Constituent codes: typically terminated binary convolutional codes; usually the same codes with $K_1 = K_2 = K$ are used. Typically, $K$ (which is also the interleaver length!) should be large, and so should be $N$.

Identical recursive (IIR filter) encoders are used for systematic encoding. The second encoder is preceded by an interleaver. This is used to ensure few low-weight codewords of the overall code: if one branch's parity word has low weight, the other branch's parity word should have high(er) weight, so that the overall codeword has good weight. This effect of having few low-weight words is called spectral thinning.

BER performance shows a coding gain of a few dB over other codes with encoders of comparable complexity. However, very low BER is not easy to achieve (directly) because of an error floor, which is due to the not-too-high $d_{\min}$. Concatenation with an outer code can improve BER performance at the cost of a slightly reduced rate.


\end{solution}  % -------------------------------------------------

\noindent\rule{\textwidth}{1pt} % -------------------------------------------------------------------


% ===================================================================




% =========================================================================================== \section{Notation and Variables}

\begin{align*}

   & \nu = \frac{R_b}{B_T} \ldots \text{spectral efficiency} \\

\end{align*}

\begin{align*}

   & \mathcal{D} \ldots \text{data (and code) symbol alphabet} \\
   & \mathcal{A} \ldots \text{transmit symbol alphabet} \\
   & \lvert \mathcal{D} \rvert \ldots \text{symbol alphabet size} \\
   & N \ldots \text{length of codeword } \vex{c} \\
   & K \ldots \text{length of dataword } \vex{u}, N \geq K  \\
   & u_k \ldots \text{data, information symbols, $\in \mathcal{D}$}, \quad k=0, \ldots, K-1 \\
   & \vex{u} \ldots \text{dataword}, \quad K \times 1 \\
   & c_n \ldots \text{code symbols, $\in \mathcal{D}$}, \quad n=0, \ldots, N-1 \\
   & \vex{c} \ldots \text{codeword}, \quad N \times 1 \\
   & a_l \ldots \text{transmit symbols, $\in \mathcal{A}$}, \quad l=0, \ldots, L-1 \\
   & \vex{a} \ldots \text{transmit symbol vector}, \quad L \times 1 \\
   & q_l \ldots \text{(receive) pseudo-symbols, $\in \R$ or $\in \C$}, \quad l=0, \ldots, L-1 \\
   & \vex{q} \ldots \text{pseudo-symbol vector}, \quad L \times 1 \\
   & v_n \ldots \text{sense symbols, $\in \mathcal{D}$}, \quad n=0, \ldots, N-1 \\
   & \vex{v} \ldots \text{senseword}, \quad N \times 1 \\
   & s(t) \ldots \text{transmit signal} \\
   & n(t) \ldots \text{noise signal} \\
   & y(t) \ldots \text{receive signal} \\
   & \mathcal{C} \ldots \text{a block code, set of all codewords} \\
   & \lvert \mathcal{C} \rvert = \lvert \mathcal{D} \rvert^K \ldots \text{code size, i.e. number of codewords} \\
   & d_{H, \min}(\mathcal{C}) = w_{H, \min}(\mathcal{C}) \ldots \text{a property of linear codes} \\

\end{align*}


% =========================================================================================== \section{Basic Properties of Codes}

\begin{itemize}

   \item Symbol alphabet size $q = |\mathcal{D}|$
   \item blocklength $N$, dataword length $K$
   \item Rate $R = \frac{K}{N}$
   \item minimum distance $d_{\min}$
   \item Algebraic structure: linear, cyclic, primitive cyclic, etc.
   \item Equivalent code: codes equal up to a fixed permutation of the code symbols within a block/frame/...
   \item Dual code

\end{itemize}

Random error \textbf{correction} capability: \begin{equation}

   t = \left\lfloor \frac{d_{\mathrm{min}} - 1}{2} \right\rfloor.

\end{equation} $t$ is also called the \textbf{packing radius} of the block code. This is the largest radius such that each senseword $\vex{v}$ is in at most one decoding sphere. These are spheres around the codewords $\vex{c}$ with a certain radius $r$. For $r = t$, we get the largest possible spheres that do not intersect.
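For example, $d_{\min} = 7$ gives $t = \lfloor 6/2 \rfloor = 3$; the binary $(23, 12, 7)$ Golay code is a classical code with this correction capability (and happens to be a perfect code, see below).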

There is also the \textbf{covering radius}, which yields spheres such that each senseword is in at least one sphere. Typically the covering radius is larger than the packing radius, except for perfect codes, where the two are equal.

There is also the random error \textbf{detection} capability. For all patterns of up to $m$ symbol errors, detection is possible iff: \begin{equation}

   1 \leq m \leq d_{\mathrm{min}} - 1.

\end{equation} It may be possible to detect more error patterns even if $m \geq d_{\mathrm{min}}$.

For the BMD it is possible to trade off error correction and error detection.


For \textbf{perfect codes}, the interstitial region is empty.

In general, for $m$ errors and $\rho$ erasures, a suitable MD decoder can correct the errors and fill the erasures iff: \begin{equation}

   2 m + \rho \leq d_{\mathrm{min}} - 1.

\end{equation}
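For example, a code with $d_{\mathrm{min}} = 7$ can handle $m = 2$ errors together with $\rho = 2$ erasures, since $2 \cdot 2 + 2 = 6 \leq d_{\mathrm{min}} - 1 = 6$.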



% =========================================================================================== \section{Types of Codes considered in class}

% ------------------------------------------------------------------------------------------- \minisec{Block Codes} A block code is the set $\mathcal{C}$ of all codewords $\vex{c}^{(1)}$ to $\vex{c}^{(M)}$. The code itself does not specify an encoding mapping from the datawords to the codewords. There are $M = q^K$ different codewords, where $q = | \mathcal{D}|$. CHECK


Denoted $(N, K, d)_q$, $d = d_{H, \min}(\mathcal{C})$

$q \ldots$ order (size) of $\gf{q}$, with $q = p^m$, $p \in \Prime$ (the characteristic), $m \in \N$.

\noindent \textbf{Linear Block Codes} \begin{itemize}

   \item structure: weighted sum of codewords is again a codeword, weights from underlying $\gf{q}$
   \item the $\vex{0}$ codeword is part of any linear code
   \item $\mathcal{C}$ is a $K$-dimensional linear subspace of $[ \gf{q}]^N$
   \item standard array: way of tabulating all sensewords $\vex{v}$ relative to given linear block code $\mathcal{C}$
   \item offer matrix description: encoding, check matrix
   \item dual codes
   \item $d_{H, \min}(\mathcal{C}) = w_{H, \min}(\mathcal{C})$. Typically it is much easier to determine $w_{H, \min}(\mathcal{C})$ than the minimum distance, so this is a useful property.
   \item Weight distribution: number $A_w$ of codewords of weight $w$.
   The weight enumerator is another representation of the weight distribution, using a polynomial.

\end{itemize}

% ------------------------------------------------------------------------------------------- \minisec{Cyclic Block Codes} All cyclic block codes are linear codes (p. 114).

A block code $\mathcal{C}$ is cyclic if every cyclic shift of a codeword $\vex{c}$ is again a codeword in $\mathcal{C}$.


Descriptions: \begin{itemize}

   \item Polynomial: codeword, generator, check polynomials
   \item Matrix: generator, check matrices
   \item Shift-register circuits

\end{itemize}

Variants, with possible additional representations: \begin{itemize}

   \item Primitive Cyclic Codes: DFT representation
   \item Reed-Solomon (RS) Codes
   \item BCH Codes

\end{itemize}

\noindent \textbf{Primitive Cyclic Codes}

Have primitive blocklength $N = q^m - 1$, $m \in \N$, $m \geq 1$, where $q$ is the size of the symbol alphabet and $N$ is the number of entries in the codeword $\vex{c} \in \mathcal{C}$.

Allow for describing the zeros of the generator polynomial $g(x)$ in closed form.

Allow for a \enquote{frequency domain} description of codes using the DFT.


% ------------------------------------------------------------------------------------------- \minisec{Convolutional Codes}

Encoding with memory, non-recursive (FIR) or recursive (IIR).

Matrix description, polynomial description, trellis description


% ------------------------------------------------------------------------------------------- \minisec{Turbo Codes} TO DO


% =========================================================================================== \section{Error detection, Error correction capabilities}

\begin{itemize}

   \item Random errors: errors can occur in any position, all positions independent of each other
   \item Random erasures: as above, can occur in any position
   \item Burst errors: errors are confined to some burst interval of length $l \ll N$

\end{itemize}


Random error channels: \begin{itemize}

   \item line of sight wireless channels
   \item deep space channels
   \item many satellite channels

\end{itemize}

Note that burst-error channels are \textbf{channels with memory}. Types of burst-error channels: \begin{itemize}

   \item mobile wireless channels
   \item cable channels
   \item magnetic recording channels

\end{itemize} There are specific codes designed for burst-error channels.



% =========================================================================================== \section{Linear Block Codes}

Generator matrix $\vex{G}: N \times K$: \begin{equation}

   \vex{G} = (\vex{g}_0, \vex{g}_1, \ldots, \vex{g}_{K-1})

\end{equation} The column vectors of the matrix are linearly independent and thus form a basis of the code. A generator matrix is a quite efficient way to generate all codewords of a code from the datawords: much less storage is needed for the matrix elements than, e.g., for all codewords.

Encoding from the datawords $\vex{u}$: \begin{equation}

   \vex{c} = \vex{G} \vex{u}

\end{equation}

Systematic encoder (any generator matrix can be brought to this form using elementary row and column operations): \begin{equation}

   \vex{G} = \begin{pmatrix}
               \vex{I}_K \\
               \vex{P}
             \end{pmatrix},

\end{equation} where $\vex{P}$ is $(N-K) \times K$.

There is an orthogonal space $\mathcal{C}^{\perp}$, an ($N-K$)-dimensional subspace of $[\gf{q}]^N$, that holds all vectors orthogonal to all codewords and thus to $\mathcal{C}$. Over $\gf{q}$ there may be nonzero vectors that are orthogonal to themselves and can thus occur in both subspaces $\mathcal{C}$ and $\mathcal{C}^{\perp}$; the two subspaces are therefore not necessarily disjoint.

Check matrix $\vex{H}: N \times (N-K)$ (for the systematic encoder): \begin{equation}

   \vex{H} = \begin{pmatrix}
                   -\vex{P}\tp \\
                   \vex{I}_{N-K}
             \end{pmatrix}

\end{equation} The check matrix can also be used for decoding. The check matrix is built from linearly independent column vectors from $\mathcal{C}^{\perp}$.

\begin{align}

   \vex{H}\tp \vex{c} &= \vex{0}, \quad \forall \vex{c} \in \mathcal{C},\\
   \vex{H}\tp \vex{v} &= \vex{0}, \quad \iff \vex{v} \in \mathcal{C}

\end{align} The latter allows one to determine whether the senseword $\vex{v}$ is a codeword of $\mathcal{C}$ or not. This thus allows \textbf{error detection}.

In general, we have: \begin{equation}

   \vex{G}\tp \vex{H} = \vex{0}_{K \times (N-K)}.

\end{equation}

The check matrix of a code has $w$ linearly dependent rows \textbf{iff} there exists a codeword of Hamming weight $w$. Since minimum Hamming distance and weight are equal for a linear code, with a given check matrix, the minimum distance can be found by finding the minimum number of linearly dependent rows of the check matrix.
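A minimal numerical sketch of these relations over $\gf{2}$ (assuming one common choice of the parity block $\vex{P}$, which yields the systematic $(7,4)$ Hamming code; any full-rank $\vex{P}$ works the same way):

\begin{verbatim}
import numpy as np
from itertools import product

P = np.array([[1, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 1, 1]])                 # (N-K) x K = 3 x 4
G = np.vstack([np.eye(4, dtype=int), P])     # G = [I_K; P], N x K
H = np.vstack([P.T, np.eye(3, dtype=int)])   # H = [-P^T; I_{N-K}]; -1 = 1 in GF(2)

u = np.array([1, 0, 1, 1])                   # dataword
c = G @ u % 2                                # encoding: c = G u
assert np.all(H.T @ c % 2 == 0)              # H^T c = 0 for every codeword
assert np.all(G.T @ H % 2 == 0)              # G^T H = 0

# d_min = minimum weight of a nonzero codeword (linear code):
d_min = min((G @ np.array(w) % 2).sum()
            for w in product([0, 1], repeat=4) if any(w))
print(d_min)                                 # 3 for the (7,4) Hamming code
\end{verbatim}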


% ------------------------------------------------------------------------------------------- \minisec{Dual Codes}

For an $(N, K)_q$ code, there exists an $(N, N-K)_q$ code, the dual code.

\begin{align}

   \vex{G} &= \vex{H}_{\perp} \\
   \vex{H} &= \vex{G}_{\perp}

\end{align}

\begin{equation}

   R_{\perp} = 1 - R

\end{equation}

It can be easier to find the weight enumerator and weight distribution of the dual code and then obtain those quantities for the original code from them (via the MacWilliams identity).


% ------------------------------------------------------------------------------------------- \minisec{Syndromes}

Can be used for decoding (error correction): \begin{equation}

   \vex{s} = \vex{H}\tp \vex{v}, \quad \vex{s} = \vex{0} \iff \vex{v} \in \mathcal{C}

\end{equation}


Syndrome $\vex{s}$, error word $\vex{e}$: \begin{equation}

   \vex{s} = \vex{H}\tp \vex{e}

\end{equation}

Syndrome decoding, equivalent to MD decoding but with much less complexity (though still impractical for too large $N-K$); a code sketch follows the list: \begin{enumerate}

   \item calculate syndrome from senseword
   \item lookup minimum weight error word $\hat{\vex{e}}$ that yields the same syndrome via the check matrix as the senseword
   \item decoder output: $\hat{\vex{c}} = \vex{v} - \hat{\vex{e}}$

\end{enumerate}
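A minimal sketch of these three steps over $\gf{2}$ for the $(7,4)$ Hamming code from the sketch above (the lookup table maps each syndrome to a minimum-weight error word, i.e., a coset leader):

\begin{verbatim}
import numpy as np
from itertools import product

P = np.array([[1, 1, 0, 1], [1, 0, 1, 1], [0, 1, 1, 1]])
G = np.vstack([np.eye(4, dtype=int), P])
H = np.vstack([P.T, np.eye(3, dtype=int)])

# Precompute: for each syndrome, a minimum-weight error word (coset leader).
table = {}
for e in product([0, 1], repeat=7):
    e = np.array(e)
    s = tuple(H.T @ e % 2)
    if s not in table or e.sum() < table[s].sum():
        table[s] = e

c = G @ np.array([1, 0, 1, 1]) % 2
v = (c + np.eye(7, dtype=int)[2]) % 2   # senseword with one symbol error
s = tuple(H.T @ v % 2)                  # 1) syndrome of the senseword
e_hat = table[s]                        # 2) minimum-weight error word lookup
c_hat = (v - e_hat) % 2                 # 3) decoder output
assert np.all(c_hat == c)
\end{verbatim}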


% % ------------------------------------------------------------------------------------------- % \subsection{}

% % ------------------------------------------------------------------------------------------- % \subsection{}



% =========================================================================================== \section{Cyclic Codes}

Linear block codes with additional property: cyclic shift of codeword is again a codeword. Description of code using polynomials. The additional structure offers more efficient implementations based on polynomial calculations, implemented using shift-registers, filters.

Important examples: \begin{itemize}

   \item Reed-Solomon (RS) codes
   \item Cyclic Redundancy Check (CRC) codes

\end{itemize}

Polynomial division: \begin{equation}

   a(x) : g(x) = q(x) + \frac{r(x)}{g(x)}.

\end{equation} The remainder of the polynomial division always has smaller degree than the polynomial divisor $g(x)$.
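A small sketch of polynomial division over $\gf{q}$ for prime $q$ (coefficient lists, lowest degree first; the inverse of the leading coefficient is obtained via Fermat's little theorem):

\begin{verbatim}
def poly_divmod(a, g, q):
    """Return (quotient, remainder) of a(x) / g(x) over GF(q), q prime."""
    a = list(a)
    dg = len(g) - 1
    inv_lead = pow(g[-1], q - 2, q)           # 1 / leading coefficient in GF(q)
    quot = [0] * max(len(a) - dg, 1)
    for i in range(len(a) - 1, dg - 1, -1):
        coef = a[i] * inv_lead % q
        quot[i - dg] = coef
        for k in range(dg + 1):               # subtract coef * x^(i-dg) * g(x)
            a[i - dg + k] = (a[i - dg + k] - coef * g[k]) % q
    return quot, a[:dg]

# Example over GF(2): (x^3 + x + 1) / (x + 1) = x^2 + x, remainder 1.
print(poly_divmod([1, 1, 0, 1], [1, 1], 2))   # ([0, 1, 1], [1])
\end{verbatim}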

Some stuff about polynomials: \begin{itemize}

   \item $a(x)$ is divisible by $g(x)$ if $a(x) = q(x) g(x)$, i.e., the remainder $r(x)$ of the polynomial division of $a(x)$ by $g(x)$ is zero.
   The degree of $g(x)$ is then $\leq$ the degree of $a(x)$.
   \item A polynomial that is not divisible by any polynomial of degree $\geq 1$ (of lower degree than its own) is \textbf{irreducible}.
   Then, no factorization is possible in the underlying field.
   If it is also a monic polynomial, it is a \textbf{prime} polynomial.
   \item Any polynomial can be uniquely factorized into prime polynomials and a scalar factor; the factorization consists of a single prime polynomial in case the polynomial to be factorized is irreducible.

\end{itemize}


% ------------------------------------------------------------------------------------------- \subsection{Polynomials for Cyclic Codes} Codewords $\vex{c}$ as polynomials $c(x)$.

% ------------------------------------------------------------------------------------------- \minisec{Generator polynomial}

Generator polynomial $g(x)$: the unique monic nonzero codeword polynomial of smallest degree ($= N-K$). The generator polynomial is thus itself a codeword. (CHECK)

For any cyclic code of blocklength $N$, $g(x)$ is a factor of $x^N-1$ (the famous \emph{magic polynomial}), because $g(x) h(x) = x^N-1$, see below.

Codewords $c(x)$ can be generated using any polynomial $a(x)$ with max. degree $K-1$ and the generator polynomial: \begin{equation}

   c(x) = a(x) g(x)

\end{equation} The properties and performance of the code rely on $g(x)$, e.g. minimum distance, weight distribution. The above equation can be directly used for non-systematic encoding.

Equivalently, using $\vex{u}'$, i.e., the dataword $\vex{u}$ zero-padded to length $N$: \begin{equation}

   \vex{c} = \vex{u}' \circledast \vex{g} = \vex{u}' \ast \vex{g}

\end{equation} where $\circledast$ is cyclic convolution.
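A small sketch over $\gf{2}$, assuming the cyclic $(7,4)$ Hamming code with $g(x) = 1 + x + x^3$ (a factor of $x^7 - 1$); it also checks $\vex{v} \circledast \vex{h} = \vex{0}$ with the corresponding check polynomial $h(x) = 1 + x + x^2 + x^4$:

\begin{verbatim}
N = 7
g = [1, 1, 0, 1, 0, 0, 0]    # g(x) = 1 + x + x^3, zero-padded to length N
h = [1, 1, 1, 0, 1, 0, 0]    # h(x) = 1 + x + x^2 + x^4, g(x) h(x) = x^7 - 1
u = [1, 0, 1, 1, 0, 0, 0]    # dataword (K = 4), zero-padded to length N

def cconv(a, b):             # cyclic convolution over GF(2)
    return [sum(a[k] * b[(n - k) % N] for k in range(N)) % 2
            for n in range(N)]

c = cconv(u, g)                 # non-systematic encoding
assert cconv(c, h) == [0] * N   # check equation: c(x) h(x) = 0 mod x^N - 1
\end{verbatim}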

% ------------------------------------------------------------------------------------------- \minisec{Check polynomial}

Check polynomial $h(x)$, of degree $K$: \begin{equation}

   g(x) h(x) = x^N-1.

\end{equation} Equivalently ($\vex{h}$ is the zero-padded length-$N$ vector version of $h(x)$): \begin{equation}

   \vex{g} \circledast \vex{h} = \vex{0}.

\end{equation} Also (useful for decoding): \begin{equation}

   \vex{v} \circledast \vex{h} = \vex{0}.

\end{equation}

Another check equation: \begin{equation}

   R_{x^N-1}\{ c(x) h(x) \} = 0

\end{equation} The analogous condition is necessary and sufficient for a polynomial $v(x)$ (of degree at most $N-1$) to be a codeword: \begin{equation}

   R_{x^N-1}\{ v(x) h(x) \} = 0

\end{equation}

% ------------------------------------------------------------------------------------------- \minisec{Syndrome Polynomials} There are also syndrome polynomials.

With $v(x) = c(x) + e(x)$: \begin{equation}

   s(x) = R_{g(x)}\{ c(x) + e(x) \} = R_{g(x)}\{ e(x) \}

\end{equation} There is a 1:1 correspondence between each error polynomial of weight $m \leq t$ and a single syndrome polynomial. This is the basis for a syndrome decoding approach.

Alternative syndrome polynomial $\overset{\sim}{s}(x)$: \begin{equation}

   \overset{\sim}{s}(x) = s(x) h(x)

\end{equation} Then \begin{equation}

   \overset{\sim}{s}(x) = R_{x^N-1}\{ v(x) \, h(x) \},

\end{equation} equivalently ($\overset{\sim}{\vex{s}}$ is of length $N$, $\vex{s}$ of length $N-K$): \begin{equation}

   \overset{\sim}{\vex{s}} = \vex{v} \circledast \vex{h}.

\end{equation}

For decoding using the syndromes, again a decoding table with the syndrome polynomials and their associated minimum weight error polynomials needs to be created and stored. For decoding, the syndrome is computed from the senseword (by dividing by $g(x)$), then the minimum weight error word is found from the table and subtracted from the senseword to form the estimated codeword. This is an implementation of the BMD decoder with decoding radius $t$ (smaller radii are also possible), similar to the one for general linear block codes (using syndromes). It is more efficient than for general block codes, but still too costly for large blocklengths.


% ------------------------------------------------------------------------------------------- \subsection{Matrix Description for Cyclic Codes}

Basis for the code is built from single generator or check polynomials by cyclic shift. This is possible here due to the cyclic structure of the code. Thus, a single polynomial defines the whole basis and in turn generator (or check) matrix.

\begin{equation}

   g_k(x) = x^k g(x), \quad k = 0, \ldots, K-1.

\end{equation} From the $g_k(x)$, one can form $\vex{g}_k$, the column vectors of the generator matrix. The check matrix is formed from vectors created from cyclically shifted versions of the reciprocal check polynomial: \begin{equation}

   h_k(x) = x^k \, \overline{h}(x) = x^{K + k} \, h(x^{-1}), \quad k = 0, \ldots, N-K-1.

\end{equation}

The resulting matrices are Toeplitz matrices. They are at first in non-systematic form but can be brought to systematic form by elementary column and row operations.



% ------------------------------------------------------------------------------------------- \subsection{Dual Codes}

Dual code of a cyclic code is again a cyclic code.

\begin{align}

   g^{\perp}(x) &= \overline{h}(x)  =  x^K \, h(x^{-1}) \\
   h^{\perp}(x) &= \overline{g}(x)  =  x^{N-K} \, g(x^{-1})

\end{align}



% =========================================================================================== \section{Primitive Cyclic Codes}

For $q$ the size of the symbol alphabet, the block-length is now related to it: $N = q^m -1$, where $m \in \N$ and $m \geq 1$. For primitive codes, the zeros of the generator polynomial can be characterized in closed form. Also, such codes can intuitively be described using the Fourier Transform.

Examples: \begin{itemize}

   \item CRC codes
   \item Reed-Solomon codes
   \item BCH codes

\end{itemize}

% ------------------------------------------------------------------------------------------- \subsection{Primitive Elements and Stuff}

Every Galois field has at least one primitive element. If $q-1 \in \Prime$, i.e., a prime number, then \emph{every} element of the GF other than 0 and 1 is a primitive element. Given a primitive element $z$, one can generate all nonzero elements of the GF via its powers: $x = z^k$ for $k = 0, 1, \ldots, q-2$. Every GF can thus be represented using the powers of one of its (at least one existing) primitive elements.

Because of \begin{equation}

   z^{q-1} = 1,

\end{equation} the field elements repeat for higher powers than $q-2$, i.e., \begin{equation}

   z^k = z^{k \mod(q-1) }.

\end{equation}
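A tiny sketch in $\gf{7}$, where $3$ is a primitive element:

\begin{verbatim}
q, z = 7, 3                      # GF(7); z = 3 is a primitive element
powers = [pow(z, k, q) for k in range(q - 1)]
print(powers)                    # [1, 3, 2, 6, 4, 5]: all nonzero elements
assert pow(z, q - 1, q) == 1     # z^(q-1) = 1, so z^k = z^(k mod (q-1))
\end{verbatim}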


% ------------------------------------------------------------------------------------------- \subsection{Extension and Splitting Fields}

Consider $\gf{q}$ with characteristic $p \in \Prime$ and $q = p^{m_0}$ for $m_0 \in \N$. We can also have $q' = q^m = (p^{m_0})^m = p^{m_0 \, m}$, i.e., $q' > q$. We can further have $q'' = p^{m_0 \, m \, n}$, with $n \in \N$. Then: \begin{itemize}

   \item $\gf{q''} = \gf{  p^{m_0 \, m \, n} }$ is an extension field of both $\gf{q}$ and $\gf{q'}$
   \item $\gf{q'}$ is an extension field of $\gf{q}$
   \item both $\gf{q}$ and $\gf{q'}$ are subfields of $\gf{q''}$.

\end{itemize}

In general, some polynomial $a(x)$ will have more zeros in an extension field $\gf{q^m}$ of its underlying field $\gf{q}$. For any given polynomial $a(x)$ with degree $D$ over some $\gf{q}$, there exists an extension field $\gf{q^m}$ where the polynomial has $D$ zeros and thus can be completely factorized into \emph{linear factors}. Such an extension field is called a \textbf{splitting field}. Every polynomial has a splitting field.


% ------------------------------------------------------------------------------------------- \subsection{Primitive Polynomials}

A primitive polynomial of degree $m$ is irreducible over $\gf{q}$ (cannot be factored into non-constant polynomials with coefficients in the underlying field $\gf{q}$). Page 169 says: irreducible means that a polynomial is not divisible (i.e., division with zero remainder) by any polynomial of degree $\geq 1$.

A primitive polynomial $f(x)$ over $\gf{q}$ of degree $m$ is: \begin{itemize}

   \item irreducible over $\gf{q}$
   \item all its $m$ zeros (they are distinct) $z_l \in \gf{q^m}$ (the extension field, a splitting field for the polynomial) are primitive elements of $\gf{q^m}$.

\end{itemize}

$f(x)$ can be factorized completely in $\gf{q^m}$ into linear factors, using its zeros $z_l = z^{(q^l)}$ (the conjugates of one zero $z$) for $l = 0, \ldots, m-1$: \begin{equation}

   f(x) = \prod_{l=0}^{m-1} (x-z^{(q^l)})

\end{equation}


% ------------------------------------------------------------------------------------------- \subsection{Defining Set Stuff}

For primitive blocklength $N = q^m -1$, in $\gf{q^m}$, $x^N-1$ can be completely factorized into linear factors, where the $N$ distinct zeros are all the nonzero elements of $\gf{q^m}$. \begin{equation}

   x^N-1 = \prod_{k=0}^{N-1} (x-z^k), \quad N = q^m-1,

\end{equation} where $z$ is a primitive element of $\gf{q^m}$ and all zeros are $z_k = z^k$.

Because of $x^N-1 = g(x) h(x)$, $g(x)$ must consist of some of the zeros of $x^N-1$, i.e., some $z^{k_i}$ for $i = 1, \ldots, N-K$: \begin{equation}

   g(x) = \prod_{i=1}^{N-K} (x-z^{k_i})

\end{equation} The zeros $\beta_i = z^{k_i}$ define the generator polynomial and thus the whole code $\mathcal{C}$ completely (and the remaining $K$ zeros define the check polynomial $h(x)$?). They are therefore called the \textbf{defining set} $\mathcal{D} = \{\beta_1, \ldots, \beta_{N-K} \}$.

% ------------------------------------------------------------------------------------------- \minisec{Checking Sensewords using Zeros}

$v(x)$ is a codeword \textbf{iff} all $\beta_i$ are zeros of $v(x)$: \begin{equation}

   v(x) \in \mathcal{C}  \Leftrightarrow v(\beta_i) = 0, \quad \forall \beta_i \in \mathcal{D}

\end{equation} This also holds for $c(x)$: all zeros of $g(x)$ are also zeros of $c(x)$, but $c(x)$ may have $K-1$ additional zeros, as it has max. degree $N-1$.

% ------------------------------------------------------------------------------------------- \subsection{DFT Representation}

The DFT uses a primitive element $z$ as the \enquote{basis function} for the projection. The $z$ in all of the following equations is the same primitive element used for the DFT.

Vectors $\vex{v} \bone \vex{V}$ can be described as polynomials $v(x) \bone V(x)$ over $\gf{q}$ and $\gf{q^m}$ in general, respectively, of degree at most $N-1$: (TO DO: CHECK) \begin{align}

   v(x) &= v_0 + v_1 x + v_2 x^2 + \ldots + v_{N-1} x^{N-1}  \\
   V(x) &= V_0 + V_1 x + V_2 x^2 + \ldots + V_{N-1} x^{N-1}

\end{align} Note that the vector representations may need to be zero padded appropriately if the degree is smaller than $N-1$.

\begin{align}

   V_k &= v(z^k) \\
   v_n &= \frac{1}{\overset{\sim}{N}} V(z^{-n})

\end{align}
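A small numerical sketch of this transform pair in $\gf{7}$ (prime $q$, so $N = q - 1 = 6$, primitive element $z = 3$; here $\overset{\sim}{N}$ is simply $N \bmod q$):

\begin{verbatim}
q, z = 7, 3                  # GF(7), primitive element z = 3
N = q - 1                    # primitive blocklength

def dft(v):                  # V_k = v(z^k)
    return [sum(v[n] * pow(z, k * n % N, q) for n in range(N)) % q
            for k in range(N)]

def idft(V):                 # v_n = (1/N~) V(z^-n); inverses via Fermat
    Ninv = pow(N % q, q - 2, q)
    return [Ninv * sum(V[k] * pow(z, (-n * k) % N, q) for k in range(N)) % q
            for n in range(N)]

v = [1, 0, 2, 0, 0, 5]
assert idft(dft(v)) == v     # the transform pair is consistent
\end{verbatim}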

Zeros in the polynomials \begin{align}

   v(z^k) &= 0 \quad \Leftrightarrow \quad V_k = 0  \\
   V(z^{-n}) &= 0  \quad \Leftrightarrow \quad v_n = 0

\end{align}

Instead of \begin{equation}

   \vex{c} = \vex{a} \circledast \vex{g},

\end{equation} we can use, with the DFT representations (this would be non-systematic encoding): \begin{equation}

   C_k = A_k G_k, \quad k = 0, 1, \ldots, N-1.

\end{equation}

$g(x)$ has $N-K$ zeros $\beta_i = z^{k_i}$ in $\gf{q^m}$ ($z$ any primitive element of it), where $q^m-1 = N$ is the primitive block-length of the primitive cyclic code. The set of zeros of $g(x)$ is the \textbf{defining set} of the code, as it fixes $g(x)$ and thus the whole code $\mathcal{C}$. \begin{equation}

   g(z^{k_i}) = 0, \quad i = 1, \ldots, N-K.

\end{equation}

The zeros of $g(x)$ are also zeros of any codeword, and any polynomial is a codeword iff it has the same zeros as $g(x)$. With $G_k = g(z^k)$ it follows that the zeros of $g(x)$ are zeros of the DFT representations, at frequencies $k=k_i$, called check frequencies, and there are again $N-K$ zeros.

The frequency domain representations of the polynomials from $\gf{q}$ exhibit a conjugacy property: \begin{equation}

   C_k^q = C_{qk \mod N}, \quad k = 0, \ldots, N-1,

\end{equation} which is necessary and sufficient for $c(x)$ to be from $\gf{q}$.

Cyclic shift of codeword results in another codeword, this also works in the frequency domain: \begin{equation}

   C_k' = z^{lk} \, C_k, \quad k = 0, \ldots, N-1.

\end{equation}

\begin{equation}

   G_k H_k = 0, \quad k = 0, \ldots, N-1.

\end{equation}


Any $V(x)$ satisfying the conjugacy property is the DFT of a codeword iff: \begin{equation}

   V_k H_k = 0, \quad k = 0, \ldots, N-1.

\end{equation} The DFT of any codeword satisfies: \begin{equation}

   C_k H_k = 0, \quad k = 0, \ldots, N-1.

\end{equation}


% ------------------------------------------------------------------------------------------- \minisec{Dual Code}

Dual code of primitive cyclic code is again a primitive cyclic code. There is a relation between the check frequencies of the code and its dual code.




% =========================================================================================== \section{Reed-Solomon Codes}

RS codes: primitive cyclic codes over $\gf{q}$, with $q > 2$, i.e., non-binary codes. Popular for error correction. Used and analyzed in the frequency domain. They are maximum-distance codes, i.e., they attain the Singleton bound. The weight distribution of RS codes has a closed-form expression. \begin{itemize}

   \item minimum (designed) distance $d_{\min} = d$ is a design parameter
   \item primitive block length $N = q - 1$, i.e. $m = 1$
   \item closed-form weight distribution has been derived
   \item good burst-error correction performance
   \item efficient encoding and decoding techniques available
   \item $K =N-d+1$
   \item $N-K = d-1$
   \item $R = 1 - \frac{d-1}{N}$
   \item $d_{\min} = d = N-K+1$

\end{itemize}
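A well-known numerical instance of the parameters listed above: over $\gf{2^8}$, $N = 255$ and designed distance $d = 33$ give $K = 223$ and $R = 223/255 \approx 0.87$; this $(255, 223)$ RS code can correct $t = 16$ symbol errors.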

Code $\mathcal{C}$ consists of all codewords $\vex{c} \in [\gf{q}]^N$ whose DFT over $\gf{q}$, denoted $\vex{c} \bone \vex{C}$, is zero in a fixed frequency band $\mathcal{B}$ of $d - 1$ cyclically consecutive frequencies.

Blocklength $N$ and symbol alphabet size $q$ are related via the primitive blocklength equation $N = q-1$.

RS codes use the DFT representation over the base field $\gf{q}$, not an extension field.

The check frequencies correspond to the zero band $\mathcal{B}$.

The zeros of the generator polynomial are all from the base field. Thus, $g(x)$ can be completely factorized into linear factors in the base field, which is therefore a splitting field of $g(x)$!

Codewords can be \enquote{built} in the frequency domain, filling the $K$ nonzero positions appropriately with elements from $\gf{q}$. This gives a possibility for filling these positions directly with the DFT representation of the dataword, thus achieving systematic encoding in the frequency domain.

The conjugacy property does not have to be checked/fulfilled as the elements in the codeword are already from the base field $\gf{q}$.

\minisec{Zeros, Defining set}

There is the zero band $\mathcal{B}$ of $d-1$ consecutive zeros in the frequency domain; the corresponding zeros are all elements of $\gf{q}$: \begin{equation}

   \beta_i = z^{(k_1 + i -1 ) \mod N}, \quad i = 1, \ldots , d-1.

\end{equation} Stating the zero band completely specifies $g(x)$ and thus the code.
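A minimal sketch over $\gf{7}$ ($N = 6$, designed distance $d = 3$, hence $K = 4$), reusing the DFT pair from the sketch above; the codeword is built in the frequency domain by placing a zero band of $d-1$ frequencies and filling the remaining $K$ positions with data symbols:

\begin{verbatim}
q, z, N, d = 7, 3, 6, 3      # RS over GF(7): N = q-1 = 6, d = 3, K = N-d+1 = 4

def dft(v):
    return [sum(v[n] * pow(z, k * n % N, q) for n in range(N)) % q
            for k in range(N)]

def idft(V):
    Ninv = pow(N % q, q - 2, q)
    return [Ninv * sum(V[k] * pow(z, (-n * k) % N, q) for k in range(N)) % q
            for n in range(N)]

C = [0, 0] + [4, 1, 0, 2]    # spectrum: zero band at k = 0, 1, then K data symbols
c = idft(C)                  # time-domain RS codeword over GF(7)
assert dft(c)[:d - 1] == [0, 0]
\end{verbatim}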


% =========================================================================================== \section{Convolutional Codes}

We now look at streams of data and code symbols. The possibly infinite streams are split up into frames. There are data frames of length $K$ and code frames of length $N$. Encoding of \enquote{new} data frames depends on past data frames, i.e., this is encoding with \textbf{memory}.

The encoders can be seen as FIR filters, or feed-forward shift registers, but also recursive encoders, i.e., IIR filters, can be used. Systematic encoders exist, but may need to be implemented using IIR filters.

For an $(N, K)$ encoder there are: \begin{itemize}

   \item $K$ internal data subsequences; these are \enquote{parallel} paths starting from the input S/P converter, indexed with $j = 0, \ldots, K-1$.
   \item $K$ shift registers (memory cell chains) of not necessarily the same length $m_j$, $j = 0, \ldots, K-1$.
   \item $N$ internal code subsequences; these end in the output P/S converter, indexed with $i = 0, \ldots, N-1$.
   \item $KN$ tap constellation vectors $\vex{g}^{(i, j)}$ describing the encoder structure

\end{itemize}

There are different clocks / rates involved with the encoder: \begin{itemize}

   \item input data clock  $k$
   \item internal frame clock $l$
   \item output code clock $n$

\end{itemize} For $N$, $K$, we have these relationships between the clocks: \begin{align}

   n K &= k N \\
   n &= l N  \\
   k &= l K

\end{align}


Tap constellation vectors: \begin{equation}

   \vex{g}^{(i, j)} = (g_0^{(i, j)}, g_1^{(i, j)}, \ldots, g_{m_j}^{(i, j)})

\end{equation} Each of these describes the \enquote{path} from one output of the input S/P-converter to one input of the output P/S-converter, i.e., one tap constellation (with some memory cells somewhere along the path).

The encoding relation using the tap coefficients can be described as: \begin{equation}

   c_l^{(i)} = \sum_{j=0}^{K-1}  \left( \sum_{l'=0}^{m}  g_{l'}^{(i, j)} u_{l-l'}^{(j)} \right) = \sum_{l'=0}^{m} \left( \sum_{j=0}^{K-1}  g_{l'}^{(i, j)} u_{l-l'}^{(j)} \right),

\end{equation} the latter formulation is suitable for a certain matrix description, see below.
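A small sketch of this convolution for a rate-$1/2$ ($K = 1$, $N = 2$) binary encoder, assuming the standard $(7,5)_8$ tap constellations $\vex{g}^{(0)} = (1,1,1)$, $\vex{g}^{(1)} = (1,0,1)$ as an example:

\begin{verbatim}
def conv_encode(u, g=((1, 1, 1), (1, 0, 1))):
    """Rate-1/2 binary convolutional encoder; taps ordered by increasing delay."""
    m = len(g[0]) - 1                       # frame memory order
    u = list(u) + [0] * m                   # termination with m zero frames
    c = []
    for l in range(len(u)):                 # frame clock l
        for taps in g:                      # N = 2 code symbols per frame
            c.append(sum(taps[k] * u[l - k]
                         for k in range(m + 1) if l - k >= 0) % 2)
    return c

print(conv_encode([1, 0, 1, 1]))  # 12 code symbols for 4 data bits + 2 tail bits
\end{verbatim}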

Some properties: \begin{itemize}

   \item encoding is with memory, the past $m$ data frames and the current data frame (so $m+1$ in total) influence the $l$-th code frame $\vex{c}_l$
   \item frame memory order $m = \max_{j = 0, \ldots, K-1} m_j$
   \item property of the code: constraint length $\nu = \min \sum_{j = 0}^{K-1} m_j$ over all possible encoders for the same code
   \item convolutional code encoding is linear
   \item encoding is time-invariant at the frame, but not at the symbol level
   \item encoding is causal at the frame level
   \item rate $R = \frac{K}{N}$, as for block codes; typically quite small, as $K$ is often 1.

\end{itemize}


% ------------------------------------------------------------------------------------------- \subsection{Matrix Description}

\begin{equation}

   \vex{c} = \vex{G} \vex{u}

\end{equation} All of these are of infinite size.

Matrix of tap constellation coefficients for all $(i, j)$ for a single time-step $l$, with $l = 0, \ldots, m$, has shape $N \times K$: \begin{equation}

   \vex{G}_l = (g_l^{(i, j)})

\end{equation} An infinitely large overall generator matrix $\vex{G}$ can then be created as a block-Toeplitz matrix from the $m+1$ matrices $\vex{G}_l$.

There is a special form of the $\vex{G}_l$ for systematic encoding.

Similar formulations are available for check matrices $\vex{H}_l$ and an overall infinitely large matrix $\vex{H}$.

% ------------------------------------------------------------------------------------------- \minisec{Truncation, Termination}

Truncation: $(N, K)$ convolutional code to $(N_b, K_b) = (PN, PK)$ block code. The overall generator matrix is then a truncated block-Toeplitz matrix of shape $PN \times PK$. It is no longer infinite in size.

Termination: $(N, K)$ convolutional code to $(N_b, K_b) = ((P+m) N, PK)$ block code; the additional length is due to the zero-forcing frames. The overall matrix again has a block-Toeplitz structure. It is no longer infinite in size.
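For example, a $(2, 1)$ code with $m = 2$ and $P = 100$ data frames yields a terminated $(204, 100)$ block code; the rate drops from $1/2$ to $100/204$.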


% ------------------------------------------------------------------------------------------- \subsection{Polynomial representation}

Data and code sequences can be represented as power series (infinite degree polynomials): \begin{equation}

   u^{(j)}(x) = \sum_{l=0}^{\infty} u_l^{(j)} x^l, \quad j = 0, \ldots, K-1.

\end{equation} \begin{equation}

   c^{(i)}(x) = \sum_{l=0}^{\infty} c_l^{(i)} x^l, \quad i = 0, \ldots, N-1.

\end{equation} Note that there is a polynomial for each branch leaving an S/P converter or ending in a P/S converter.

Tap coefficient sequences of finite length are written as finite length polynomials of maximum degree $m_j$: \begin{equation}

   g^{(i, j)}(x) = \sum_{l=0}^{m_j} g_l^{(i, j)} x^l, \quad i = 0, \ldots, N-1, \; j = 0, \ldots, K-1.

\end{equation}


Encoding convolution relation can then be written as (sum of) products of data and tap coefficient polynomials: \begin{equation}

   c^{(i)}(x) = \sum_{j=0}^{K-1} g^{(i, j)}(x) u^{(j)}(x), \quad i = 0, \ldots, N-1.

\end{equation}


% ------------------------------------------------------------------------------------------- \minisec{Polynomial Generator Matrix}

Is of shape $N \times K$ and holds as elements the polynomials (!) $g^{(i, j)}(x)$ (of finite degree $m_j$): \begin{equation}

   \vex{G}(x) = (g^{(i, j)}(x))

\end{equation} There is also a systematic form available for the generator matrix.

The polynomials $u^{(j)}(x)$, $\forall j$, can then be stacked into a $K \times 1$ vector $\vex{u}(x)$ of polynomials, and equivalently for the codeword polynomials, yielding $\vex{c}(x)$. They can also be viewed as polynomials in the data frames $\vex{u}_l$, and equivalently for the code frames. Furthermore, $\vex{G}(x) = \sum_{l=0}^{m} \vex{G}_l x^l$, i.e., a matrix polynomial formed from the per-time-step matrices $\vex{G}_l$ of tap constellation coefficients.

Encoding: \begin{equation}

   \vex{c}(x) = \vex{G}(x) \vex{u}(x)

\end{equation}


% ------------------------------------------------------------------------------------------- \subsection{Non-catastrophic Encoder}

In practice (avoiding unnecessary delays in the memory chains), a necessary and sufficient condition for a polynomial generator matrix $\vex{G}(x)$ to describe a non-catastrophic encoder is that the GCD of all tap coefficient polynomials is 1 (corresponding to $x^L = x^0 = 1$).
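A small sketch of this criterion over $\gf{2}$, representing polynomials as Python integers (bit $k$ = coefficient of $x^k$); the catastrophic example $g^{(0)} = 1+x$, $g^{(1)} = 1+x^2 = (1+x)^2$ is a textbook case, used here for illustration:

\begin{verbatim}
def poly_mod(a, b):
    """Remainder of a(x) mod b(x) over GF(2); polynomials as bit masks."""
    while b and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def poly_gcd(a, b):
    while b:
        a, b = b, poly_mod(a, b)
    return a

# (7,5)_8 encoder: g0 = 1+x+x^2 (0b111), g1 = 1+x^2 (0b101):
assert poly_gcd(0b111, 0b101) == 0b1    # GCD = 1 -> non-catastrophic
# g0 = 1+x (0b11), g1 = 1+x^2 = (1+x)^2 (0b101):
assert poly_gcd(0b11, 0b101) == 0b11    # GCD = 1+x != 1 -> catastrophic
\end{verbatim}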

Note that a systematic encoder is non-catastrophic, as it always includes the data symbol sequences directly in the output: if the input sequences have infinite weight, then so do the output sequences.

The non-catastrophic property of a non-recursive encoder is necessary for the existence of a left inverse of the polynomial generator matrix $\vex{G}(x)$ and thus of a non-recursive inverse encoder.



% =========================================================================================== \section{Turbo Codes}


Two fundamental ideas: \begin{enumerate}

   \item design code with random-like properties
   \item decoder that uses 2 or more SISO constituent decoders within an iterative scheme

\end{enumerate}

Design of the interleaver for the second (or multiple) branch(es) is quite important to achieve pseudorandom properties, as is a large interleaver length $K$.

Ideally, spectral thinning is achieved, i.e., a low count of low-weight codewords.



% =========================================================================================== \section{Modifications of (Block) Codes}

Important? TO DO.


% =========================================================================================== \section{HISO Channels Family Tree}


% ------------------------------------------------------------------------------------------- \subsection{HISO} Soft pseudosymbols $Q$, transmit symbols $A$. We use a transition pdf of the channel for description.

In general, we consider the \enquote{block} channel transition pdf: \begin{equation}

   f_{\vex{Q}|\vex{A}} (\vex{q}|\vex{a}).

\end{equation}

If we assume the output symbols $Q_l$ at different times $l$ to be conditionally statistically independent given the input, the pdf can be factored as: \begin{equation}

   f_{\vex{Q}|\vex{A}} (\vex{q}|\vex{a}) = \prod_{l=0}^{L-1} f_{Q_l|\vex{A}}(q_l|\vex{a})

\end{equation} If we furthermore assume that each output symbol $Q_l$ only depends on the input symbol $A_l$, and is statistically independent of all other $A_i$, $i \neq l$, we can write for the single output symbol $Q_l$: \begin{equation}

   f_{Q_l|\vex{A}}(q_l| \vex{a}) =  f_{Q_l|A_l}(q_l|a_l)

\end{equation} Both equations above together yield the description of a memoryless HISO channel, where each $Q_l$ only depends on its corresponding $A_l$. Note that the pdf could change with each time-step $l$.


% ------------------------------------------------------------------------------------------- \subsection{Memoryless HISO} Time-variant and time-invariant versions exist.

Memoryless, time-variant: \begin{equation}

   f_{\vex{Q}|\vex{A}} (\vex{q}|\vex{a}) = \prod_{l=0}^{L-1} f_{Q_l|A_l}(q_l|a_l)

\end{equation} Memoryless, time-invariant; $L$ independent uses of scalar HISO channel $A \rightarrow Q$: \begin{equation}

   f_{\vex{Q}|\vex{A}} (\vex{q}|\vex{a}) = \prod_{l=0}^{L-1} f_{Q|A}(q_l|a_l)

\end{equation}


% ------------------------------------------------------------------------------------------- \subsection{Gaussian Memoryless HISO} Assumes time-invariance, no ISI.

Memoryless, time-invariant; $L$ independent uses of scalar HISO channel $A \rightarrow Q$ (as above): \begin{equation}

   f_{\vex{Q}|\vex{A}} (\vex{q}|\vex{a}) = \prod_{l=0}^{L-1} f_{Q|A}(q_l|a_l)

\end{equation}

Specialized here to Gaussian noise (written for a single scalar channel use): \begin{equation}

   f_{Q|A} (q|a) = \frac{1}{\pi N_0 E_h} \exp \left( - \frac{\lvert q - E_h a \rvert^2 }{N_0 E_h}\right).

\end{equation} Here, $N_0$ is the noise power spectral density (check the current lecture notes for whether their convention is one- or two-sided), and $E_h$ is most plausibly the energy of the overall pulse $h(t)$, $E_h = \int \lvert h(t) \rvert^2 \, \diff t$ (verify against the notes). The noise is scalar, complex ($\C$) Gaussian with variance $N_0 E_h$.
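A useful consequence (standard reasoning, spelled out here for convenience): since the exponential is monotonic, ML symbol detection on this channel reduces to a minimum-distance rule, \begin{equation}

   \est{a}_{\mathrm{ML}} = \arg \max_{a \in \mathcal{A}} f_{Q|A}(q|a) = \arg \min_{a \in \mathcal{A}} \lvert q - E_h a \rvert^2,

\end{equation} where $\mathcal{A}$ denotes the symbol alphabet.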



% =========================================================================================== \section{HIHO Channels Family Tree}


% ------------------------------------------------------------------------------------------- \subsection{HIHO} Senseword $\vex{V}$, codeword $\vex{C}$. The channel is described by its transition pmf: \begin{equation}

   p_{\vex{V}|\vex{C}} (\vex{v}|\vex{c}).

\end{equation}

If we assume the output symbols $V_n$ at different times $n$ to be statistically independent, the pmf can be factored as: \begin{equation}

   p_{\vex{V}|\vex{C}} (\vex{v}|\vex{c}) = \prod_{n=0}^{N-1} p_{V_n|\vex{C}}(v_n|\vex{c}).

\end{equation} If we furthermore assume that each output symbol $V_n$ only depends on the input symbol $C_n$, and is statistically independent of all other $C_i$, $i \neq n$, we can write for the single output symbol $V_n$: \begin{equation}

   p_{V_n|\vex{C}}(v_n| \vex{c}) =  p_{V_n|C_n}(v_n|c_n).

\end{equation} Both equations above together yield the description of a memoryless HIHO channel, where each $V_n$ only depends on its corresponding $C_n$. Note that the pmf could change with each time-step $n$, as indicated by $V_n$, $C_n$ in the factors of the total pmf for the memoryless, time-variant HIHO channel: \begin{equation}

   p_{\vex{V}|\vex{C}} (\vex{v}|\vex{c}) = \prod_{n=0}^{N-1} p_{V_n|C_n}(v_n|c_n).

\end{equation}


% ------------------------------------------------------------------------------------------- \subsection{Discrete Memoryless Channel (DMC)} This is a memoryless, time-invariant HIHO channel.

\begin{equation}

   p_{\vex{V}|\vex{C}} (\vex{v}|\vex{c}) = \prod_{n=0}^{N-1} p_{V|C}(v_n|c_n)

\end{equation}

This corresponds to $N$ independent uses of a scalar HIHO channel $C \rightarrow V$ [p. 38].

% ------------------------------------------------------------------------------------------- \subsection{Symmetric DMC} Assumes time-invariance. All symbol errors are equally likely, with transition probability $p$ for each error case $v \neq c$.

\begin{equation}

   p_{\vex{V}|\vex{C}} (\vex{v}|\vex{c}) = \prod_{n=0}^{N-1} p_{V|C}(v_n|c_n) = (1 - P)^{N - d_H(\vex{c}, \vex{v})} \, p^{d_H(\vex{c}, \vex{v})}.

\end{equation} The first factor in the last equation accounts for the correctly received symbols, the second for the erroneous ones.

Here, $P = (|\mathcal{D}| - 1)\, p$ is the total symbol error probability, with $|\mathcal{D}|$ the size of the symbol alphabet. Binary case: $P = p$, $1-P = 1-p$. % Here, $p_e$ is the probability of an error, here a symbol error.
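A quick numeric illustration (my own arithmetic, with made-up values): for a quaternary symmetric DMC ($|\mathcal{D}| = 4$), $p = 0.01$, $N = 5$, and a senseword with $d_H(\vex{c}, \vex{v}) = 1$: \begin{equation}

   p_{\vex{V}|\vex{C}}(\vex{v}|\vex{c}) = (1 - 3 \cdot 0.01)^{4} \cdot 0.01 \approx 8.85 \cdot 10^{-3}.

\end{equation}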


% ------------------------------------------------------------------------------------------- \subsection{Binary Symmetric Channel (BSC)} Assumes time-invariance. All symbol errors are equally likely, with error probability $p$. Symbol alphabet is binary.

% ------------------------------------------------------------------------------------------- \minisec{Scalar}

\begin{equation}

   p_{V|C}(v|c) = \begin{cases}
                       1-p, & v = c \quad (0|0 \text{ or } 1|1),\\
                       p,   & v \neq c \quad (0|1 \text{ or } 1|0).
                   \end{cases}

\end{equation}


% ------------------------------------------------------------------------------------------- \minisec{Vector} With the error word $\vex{e} = \vex{v} \oplus \vex{c}$ (componentwise modulo-2 addition), so that $w_H(\vex{e}) = d_H(\vex{c}, \vex{v})$:

\begin{equation}

   p_{\vex{V}|\vex{C}}(\vex{v}|\vex{c}) = (1 - p)^{N - w_H(\vex{e})} \, p^{w_H(\vex{e})}.

\end{equation}


Alternatively: \begin{equation}

   p_{\vex{V}|\vex{C}}(\vex{v}|\vex{c}) = \left( \frac{p}{1-p} \right)^{d_H(\vex{c}, \vex{v})} (1-p)^N.

\end{equation}
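A standard consequence worth spelling out (textbook reasoning, added here for convenience): for $p < 1/2$ we have $p/(1-p) < 1$, so maximizing the likelihood over the codewords is equivalent to minimizing the Hamming distance, i.e., ML decoding on the BSC is minimum-distance decoding: \begin{equation}

   \est{\vex{c}}_{\mathrm{ML}} = \arg \max_{\vex{c} \in \mathcal{C}} p_{\vex{V}|\vex{C}}(\vex{v}|\vex{c}) = \arg \min_{\vex{c} \in \mathcal{C}} d_H(\vex{c}, \vex{v}).

\end{equation}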



% =========================================================================================== \section{Galois-Fields}

In general, we denote a Galois-field as $\gf{q}$, where $q = p^m$ with $m \in \N$ and $p \in \Prime$, i.e. $p$ is a prime number, called the characteristic of $\gf{q}$. If $m = 1$, i.e., $q = p$, we just write $\gf{p}$. Note that for $m \neq 1$, $q \not \in \Prime$. For some $q$, we have $q - 1 \in \Prime$.

% We will look at two different cases of Galois-fields, either $\gf{p}$ or $\gf{q}$. % The first case is where $p \in \Prime$, i.e., $p$ is a prime number. % The prime numbers will be denoted as $\Prime$.

% The other case is where we have $q = p^m$, with $p \in \Prime$, $q \not \in \Prime$ (but $q \in \N$) and $m \in \N$. % (The case $m = 1$ is covered as $\gf{p}$).


Some examples: \begin{itemize}

   \item $\gf{2}$: $q = p = 2$,               $q - 1 = 1 \not \in \Prime$
   \item $\gf{3}$: $q = p = 3$,               $q - 1 = 2 \in \Prime$
   \item $\gf{4}$: $q = 4$, $p = 2$, $m = 2$, $q - 1 = 3 \in \Prime$
   \item $\gf{5}$: $q = p = 5$,               $q - 1 = 4 \not \in \Prime$
   \item $\gf{6}$: does not exist ($6$ is not a prime power), hence not relevant for us
   \item $\gf{7}$: $q = p = 7$,               $q - 1 = 6 \not \in \Prime$
   \item $\gf{8}$: $q = 8$, $p = 2$, $m = 3$, $q - 1 = 7 \in \Prime$

\end{itemize}

Some properties: \begin{itemize}

   \item For $\gf{p}$: Addition ($+$) and multiplication ($\cdot$) are carried out modulo $p$.
   \item For $\gf{2^m} = \gf{q}$, i.e. fields with characteristic $p = 2$, addition and subtraction are identical.
   This also means $a = -a$, for field element $a$.
   \item For $\gf{q}$: If $q - 1 \in \Prime$, then every element of $\gf{q}$ not in $\{0, 1\}$ is a primitive element.
   \item Every $\gf{q}$ has at least 1 primitive element.
   \item Every non-zero element of a $\gf{q}$ can be represented as a power of a primitive element $z$:
   for every $x \in \gf{q} \setminus \{0\}$ there is a $k \in \{0, 1, 2, \ldots, q-2\}$ with $z^k = x$.
   As there can be multiple primitive elements, there may be different ways to generate the elements.

\end{itemize}
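A quick sanity check of the last two properties (simple arithmetic, not taken from the notes): in $\gf{5}$, the element $z = 2$ is primitive, since \begin{equation}

   2^0 = 1, \quad 2^1 = 2, \quad 2^2 = 4, \quad 2^3 = 8 \bmod 5 = 3,

\end{equation} i.e., the powers of $2$ generate all of $\gf{5} \setminus \{0\}$. In contrast, $4$ is not primitive, since already $4^2 = 16 \bmod 5 = 1$. (This is consistent with the property above: $q - 1 = 4 \not \in \Prime$, so not every element outside $\{0, 1\}$ needs to be primitive.)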


\subsection{Constructing a Galois-field}

For constructing the extension field $\gf{q'}$ with $q' = q^m$ over the base field $\gf{q}$: \begin{itemize}

   \item Get a primitive polynomial $f(x)$ of degree $m$ over $\gf{q}$ (irreducible over $\gf{q}$; its zeros $z_l$ lie in $\gf{q^m}$ and are primitive elements).
   \item For a primitive element $z$, we know that $f(z) = 0$.
   All non-zero elements of the field are the powers $z^k$, $k = 0, 1, \ldots, q' - 2$.
   \item Example ($q = 2$, $m = 3$, $q' = 8$): $f(x) = x^3 + x + 1$, $\Rightarrow f(z) = z^3 + z + 1 = 0$, i.e. $z^3 = z + 1$.
   Represent all other elements as powers of $z$, i.e., derive all $z^k = \ldots$ equations.
   The easiest way is to multiply $z^3 = z + 1$ consecutively by $z, z^2, \ldots$.
   From these equations, construct the addition table.
   \item In case $q' - 1 = q^m - 1$ is a prime number, any element not in $\{0, 1\}$, i.e. in $\{2, 3, \ldots, q' - 1 \}$, is a primitive element and can be chosen as $z$ for constructing the table.
   \item $z^k = z^{k \bmod (q' - 1)}$

\end{itemize}
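To cross-check this procedure mechanically, here is a minimal Python sketch (my own illustration, not taken from the lecture notes) that builds $\gf{8}$ from $f(x) = x^3 + x + 1$ exactly as described above; each field element is represented by the bit pattern of its polynomial coefficients:

\begin{verbatim}
# Minimal sketch: constructing GF(2^3) from the primitive polynomial
# f(x) = x^3 + x + 1. Elements are 3-bit integers whose bits are the
# coefficients of 1, z, z^2.

M = 3                # extension degree m
Q = 2 ** M           # field size q' = 2^m = 8
POLY = 0b1011        # x^3 + x + 1 (bit i = coefficient of x^i)

# Power table z^k, k = 0 .. q'-2: repeatedly multiply by z (left shift)
# and reduce with f(z) = 0, i.e. substitute z^3 = z + 1.
power = [1]          # z^0 = 1
for _ in range(Q - 2):
    nxt = power[-1] << 1       # multiply by z
    if nxt & Q:                # degree 3 reached: substitute z^3 = z + 1
        nxt ^= POLY            # XOR = addition (= subtraction) in GF(2^m)
    power.append(nxt)

log = {v: k for k, v in enumerate(power)}      # discrete logarithm table

def gf_add(a, b):
    return a ^ b               # addition is bitwise XOR (characteristic 2)

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return power[(log[a] + log[b]) % (Q - 1)]  # z^k = z^{k mod (q'-1)}

print("powers of z:", power)   # -> [1, 2, 4, 3, 6, 7, 5]
assert sorted(power) == list(range(1, Q))      # every non-zero element once
\end{verbatim}

The printed power table reproduces the hand-derived equations $z^3 = z + 1$ (integer 3), $z^4 = z^2 + z$ (6), and so on; every non-zero element appears exactly once, as claimed above.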


\end{document}