

Attention-Based Quantizer Formation

In the first stage of the coding, a three-level spatio-temporal wavelet transform is applied to the motion sequences. Following [8], a spatio-temporal orientation tree naturally defines the spatio-temporal relationship on the 3D subband pyramid that results from the wavelet transformation. Reference [8] gives the only reasonable parent-offspring linkage.
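As an illustration, a three-level separable 3D wavelet transform of an image sequence can be computed as in the following minimal sketch. It assumes PyWavelets' Haar filters and a synthetic test sequence; the paper's actual filter bank and subband layout are not specified in this section.

\begin{verbatim}
import numpy as np
import pywt

# Hypothetical test sequence: 16 frames of 64x64 pixels, axes ordered (t, y, x).
sequence = np.random.rand(16, 64, 64).astype(np.float32)

# Three-level separable wavelet transform taken jointly over time and space.
coeffs = pywt.wavedecn(sequence, wavelet='haar', level=3)

# coeffs[0] holds the coarsest (temporal/spatial low-frequency) approximation;
# coeffs[1:] are dicts of detail subbands ('aad', ..., 'ddd') per level, on
# which a spatio-temporal orientation tree can be defined as in [8].
print(coeffs[0].shape)            # (2, 8, 8) after three Haar levels
print(sorted(coeffs[1].keys()))   # detail subband keys at the coarsest level
\end{verbatim}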

Once the three-level spatio-temporal wavelet transform is applied to the image sequences, most of the energy is concentrated in the temporal low-frequency band, but residual spatial redundancy remains in the high temporal frequency band because of the motion. That is, there is not only spatial similarity within each frame across scales, but also temporal similarity between frames. Quantizer formation allows this self-similarity to be exploited across a spatio-temporal orientation tree using zerotree coding. The grouping and segregation of wavelet coefficients to form quantizers is achieved by attention-based quantizer formation: the coefficients of the 3D wavelet transform are partitioned into different quantizers by attention thresholding, which compacts the points of maximum attention into a small number of high-attention quantizers.

Points $(i,j)$ of maximum attention can be calculated by searching for peaks in a local attention function. The local attention function should be defined according to the particular application of interest. Here we are primarily interested in the transmission of sequences of moving targets (Section V). Since reference [9] demonstrates that velocity alone accounted for forty-eight percent of the variation in the probability of target detection, the local attention function is computed here to measure the velocity at any given point.
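A minimal sketch of one possible local attention function is given below, assuming velocity is approximated by the temporal frame-difference magnitude; the velocity measure actually used (for example, an optical-flow estimate) is not detailed in this section, and the helper name attention_map is hypothetical.

\begin{verbatim}
import numpy as np

def attention_map(sequence):
    """Return e(i,j) for each frame as the absolute temporal difference,
    a crude stand-in for the pointwise velocity magnitude."""
    e = np.zeros_like(sequence)
    e[1:] = np.abs(np.diff(sequence, axis=0))   # |frame_t - frame_{t-1}|
    return e
\end{verbatim}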

Points $(i,j)$ of maximum attention are then signaled by peaks in the attention function $e(i,j)$; to detect them, the values $e(i,j)$ can be thresholded using a global threshold:

\begin{displaymath}
s(i,j) = \left\{ \begin{array}{ll}
1 & \mbox{if $e(i,j) > T$} \\
0 & \mbox{otherwise}
\end{array} \right.
\end{displaymath} (5)

where the value of $T$ is selected using a performance rule on a sample of sequences. The performance rule is a top-down rule in which the threshold is adjusted so as to improve target detection performance. The model is then applied using the same threshold without further adjustment.
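The thresholding of Eq. (5) amounts to a pointwise comparison, as in the following sketch; the function name is hypothetical and the threshold passed in is an illustrative placeholder rather than the value tuned by the performance rule.

\begin{verbatim}
import numpy as np

def maximum_attention_points(e, T):
    """s(i,j) = 1 where e(i,j) > T, and 0 otherwise (Eq. 5)."""
    return (e > T).astype(np.uint8)

# Example usage with the sketches above:
# s = maximum_attention_points(attention_map(sequence), T=0.1)
\end{verbatim}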

The grouping and segregation of wavelet coefficients into a small number of quantizers $Q_1, Q_2, \cdots, Q_n, \cdots, Q_m$ is achieved through quantizer formation, using the detected points of maximum attention as follows (a code sketch is given after the algorithm):

Quantizer Formation Algorithm:

  1. Detect the points $(i,j)$ of maximum attention, i.e., those with $s(i,j)=1$.
  2. For each 3D-region $R_n$ of the sequence of frames: form a new quantizer $Q_n$ corresponding to the wavelet coefficients at the points $(i,j)$ of maximum attention in $R_n$; the remaining coefficients are set to zero.
  3. Form a last quantizer $Q_m$ corresponding to the wavelet coefficients that were set to zero in the quantizers $Q_n$ of every 3D-region $R_n$; the remaining coefficients are set to zero.
  4. Stop.
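A minimal sketch of the quantizer formation step is given below, assuming the wavelet coefficients are held in a single array aligned with the attention mask $s$ (in practice $s$ would have to be mapped onto the subband grid), and that the 3D-regions $R_n$ are supplied as boolean masks of the same shape; the function name form_quantizers is hypothetical.

\begin{verbatim}
import numpy as np

def form_quantizers(coeffs, s, regions):
    """Split coeffs into high-attention quantizers Q_1..Q_n and a final Q_m."""
    quantizers = []
    covered = np.zeros(coeffs.shape, dtype=bool)
    for region in regions:
        keep = region & (s == 1)            # maximum-attention points inside R_n
        q_n = np.where(keep, coeffs, 0.0)   # remaining coefficients set to zero
        quantizers.append(q_n)
        covered |= keep
    q_m = np.where(covered, 0.0, coeffs)    # coefficients zeroed in every Q_n
    quantizers.append(q_m)
    return quantizers
\end{verbatim}

With non-overlapping regions the quantizers sum back to the original coefficient array, so the partition discards no information; it only determines the quantizer in which each coefficient is carried.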

