Neural OmniVideo: Fusing World Knowledge into Smart Video-Specific Models
The field of computer vision has made unprecedented progress in applying Deep Learning (DL) to images. Nevertheless, extending this progress to videos lags dramatically behind, due to two key challenges: (i) video data is highly complex and diverse, requiring orders of magnitude more training data than images, and (ii) raw video data is extremely high dimensional. These challenges make the processing of entire video pixel-volumes at scale prohibitively expensive and ineffective. Thus, applying DL at scale to video is restricted to short clips or aggressively sub-sampled videos.
On the other side of the spectrum, video-specific models (a single or a few neural networks trained on a single video) exhibit several key properties: (i) they facilitate effective video representations (e.g., layers) that make video analysis and editing significantly more tractable, (ii) they enable long-range temporal analysis by encoding the video through the network, and (iii) they are not restricted to the distribution of training data. Nevertheless, the capabilities, applicability and robustness of such models are hampered by having access to only low-level information in the video.
We propose to combine the power of these two approaches through the new concept of Neural OmniVideo Models: DL-based frameworks that effectively represent the dynamics of a given video, coupled with the vast knowledge learned by an ensemble of external models. We aim to pioneer novel methodologies for developing such models for video analysis and synthesis tasks. Our approach will have several important outcomes:
• Give rise to fundamentally novel effective video representations.
• Go beyond state-of-the-art in classical video analysis tasks that involve long-range temporal analysis.
• Enhance the perception of our dynamic world through new synthesis capabilities.
• Gain a profound understanding of the internal representation learned by state-of-the-art large-scale models, and unveil new priors about our dynamic world.