Abstract:
Video, or Multi-Frame, Super-Resolution (VSR or MFSR) techniques aim to
generate high-resolution reconstructions of the corresponding low-resolution
frames. These techniques differ from Single-Image, or Single-Frame, Super-Resolution
(SISR or SFSR) techniques in that they can additionally exploit the temporal
nature of the input data. The advantage of having additional temporal information
does not translate directly into an easier problem to solve. The challenge
lies in the optimal extraction of this additional information. Current
state-of-the-art methods still rely on some variant of optical flow estimation
plus warping to extract and integrate temporal information, but
it is well known that much of the high-frequency information is lost
during the warping process. We investigated alternative architectures to alleviate or
suppress this problem, but they only perform on par with or slightly worse than
current SOTA networks.
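
The loss of high-frequency information during warping can be illustrated with a minimal sketch. It assumes a 1-D signal standing in for one image row, a uniform sub-pixel shift standing in for an estimated flow field, and linear interpolation as the resampling kernel. The helper names `warp_linear` and `high_freq_energy` are hypothetical and are not taken from any method discussed here.

```python
import numpy as np


def warp_linear(signal, shift):
    """Resample a 1-D signal at positions i + shift using linear interpolation.

    Borders wrap around (circular indexing) to keep the example short.
    """
    n = signal.shape[0]
    coords = np.arange(n) + shift            # sampling positions
    left = np.floor(coords).astype(int)      # left neighbour of each position
    frac = coords - left                     # fractional offset in [0, 1)
    return (1.0 - frac) * signal[left % n] + frac * signal[(left + 1) % n]


def high_freq_energy(signal, cutoff_fraction=0.5):
    """Spectral energy in the upper part of the one-sided spectrum."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    cutoff = int(len(spectrum) * cutoff_fraction)
    return float(spectrum[cutoff:].sum())


rng = np.random.default_rng(0)
frame_row = rng.standard_normal(256)         # broadband stand-in for an image row

# A half-pixel shift averages neighbouring samples, which acts as a low-pass filter.
half_pixel = warp_linear(frame_row, 0.5)
ratio_half = high_freq_energy(half_pixel) / high_freq_energy(frame_row)

# An integer shift only moves samples, so the spectrum magnitude is unchanged.
whole_pixel = warp_linear(frame_row, 1.0)
ratio_whole = high_freq_energy(whole_pixel) / high_freq_energy(frame_row)

print(f"high-frequency energy kept, shift 0.5: {ratio_half:.2f}")
print(f"high-frequency energy kept, shift 1.0: {ratio_whole:.2f}")
```

In this toy setting the half-pixel warp keeps well under half of the high-frequency energy, while the integer shift keeps all of it. Real flow fields are spatially varying and mostly fractional, so some attenuation of this kind is expected whenever bilinear warping is used.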