4th International Workshop on

Compact and Efficient Feature Representation and Learning in Computer Vision 2019

in conjunction with ICCV 2019
Seoul, Korea, October 27~November 2, 2019

Feature representation is at the core of many computer vision and pattern recognition applications, such as image classification, object detection, image and video retrieval, and image matching. For years, milestone engineered feature descriptors such as SIFT, SURF, HOG and LBP dominated various domains of computer vision. The design of feature descriptors with low computational complexity has attracted considerable attention, and a number of efficient descriptors including BRIEF, FREAK, BRISK and DAISY have been presented. The past few years have seen significant progress in feature representation and learning. Traditional handcrafted features appear to have been overtaken in popularity by Deep Convolutional Neural Networks (DeepCNNs), which can learn powerful features automatically from data and have brought about breakthroughs in various computer vision problems. However, these advances rely on deep networks with millions or even billions of parameters, and the availability of GPUs with very high computation capability and of large-scale labeled datasets plays a key role in their success. In other words, powerful DeepCNNs are data hungry and energy hungry.

Given the exponentially increasing number of images and videos, the emerging phenomenon of big dimensionality exposes the inadequacies of existing approaches, whether traditional handcrafted features or recent deep learning-based ones. Thus, there is a pressing need for new scalable and efficient approaches that can cope with this explosion of dimensionality.

In addition, the prevalence of social media networks, and of the portable / mobile / wearable devices used to access them, raises the concern of the limited resources (e.g., battery life, memory, storage space, computational power, and bandwidth) that these devices offer. The demand for sophisticated portable / mobile / wearable device applications that handle large-scale visual data is rising. In such applications, real-time performance is of the utmost importance to users, who are unwilling to wait. Therefore, there is a growing need for feature descriptors that are fast to compute and memory efficient, yet exhibit good discriminability and robustness.

Given sufficient annotated data, existing features, especially those produced by deep CNNs, have yielded good performance. Nonetheless, there are many applications where only limited amounts of annotated training data can be gathered (as in many visual inspection or medical diagnostics tasks). Such applications are challenging for many existing feature representations and require sample-efficient techniques to learn good representations.

A number of efforts, such as compact binary features, DCNN network quantization and compression, energy efficient network architectures, binary hashing techniques and data efficient techniques like meta learning, have appeared at top conferences (including CVPR, ICCV, ECCV, NIPS and ICLR) and top journals (including TPAMI and IJCV). The workshop aims at stimulating computer vision researchers to discuss the next steps in this important research area.

Important Dates (Tentative)

Event                          Date
Paper Submission Deadline      July 30 August 7, 2019
Notification of Acceptance     August 25, 2019
Camera-ready due               August 30, 2019
Workshop (Full day)            October 27 November 2, 2019


We encourage researchers to study and develop new feature representations that are fast to compute, memory efficient, and data efficient, while exhibiting good discriminability and robustness. We also encourage the presentation of new theories and applications related to feature representation and learning for dealing with these challenges. We are soliciting original contributions that address a wide range of theoretical and practical issues including, but not limited to:

1. New features (handcrafted features, lightweight network architectures, deep model compression/quantization, and feature learning in a supervised, weakly supervised or unsupervised way) that are fast to compute, memory efficient and suitable for large-scale problems;

2. New compact and efficient features that are suitable for wearable devices (e.g., smart glasses, smart phones, smart watches) with strict requirements for computational efficiency and low power consumption;

3. Hashing/binary code learning and its applications in different domains, e.g., content-based retrieval;

4. Evaluations of current traditional descriptors and features learned by deep learning;

5. Hybrid methods combining the strengths of handcrafted and learning-based approaches;

6. Sample-efficient feature learning methods, e.g., meta learning, few-shot learning;

7. New applications of existing features in different domains, e.g., medical domain.

Invited Speakers

(1) Professor Ramin Zabih (Confirmed) (Email: rdz@cs.cornell.edu)

Ramin Zabih received undergraduate degrees from MIT in computer science and math, and the PhD degree from Stanford in computer science. He is a professor of computer science at Cornell University, at Cornell NYC Tech. He and his students developed graph cut methods for computer vision that have been widely used in both academia and industry. He received the Helmholtz Prize at ICCV in 2013, the Koenderink prize at ECCV in 2012, and Best Paper Awards at ECCV in 2002. He was a program chair for CVPR in 2007, a general chair for CVPR in 2013, and a general chair for ECCV in 2018. He served as an editor-in-chief of the IEEE Transactions on Pattern Analysis and Machine Intelligence from 2009 to 2012, and since 2013 has chaired the PAMI TC. He is also the president of the nonprofit Computer Vision Foundation. He is a fellow of the ACM and the IEEE.

(2) Doctor Jifeng Dai (Email: daijifeng@sensetime.com)

Title: Deep Feature Flow for High Performance Video Recognition

Abstract: Recent years have witnessed the significant success of deep convolutional neural networks (CNNs) for image recognition. With their success, recognition tasks have been extended from the image domain to the video domain, for example video semantic segmentation and video object detection. Fast and accurate video recognition is crucial for high-value scenarios, e.g., autonomous driving and video surveillance. Nevertheless, applying existing image recognition networks to individual video frames not only introduces unaffordable computational cost for most applications, but also suffers from deteriorated object appearances in videos, such as motion blur, video defocus, and rare poses.
It is widely recognized that image content varies slowly over video frames, especially the high-level semantics. We developed a principled approach, called Deep Feature Flow, to exploit such data redundancy and continuity. In it, an end-to-end trainable motion estimation module is built into the network architecture, so as to align features across multiple frames. Deep Feature Flow can effectively reduce the computational overhead and improve the recognition accuracy on videos. This talk will cover our series of efforts in this direction.

(3) Professor Liang Wang (Email: wangliang@nlpr.ia.ac.cn)

Liang Wang received the PhD degree in Pattern Recognition and Intelligent System from the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CAS), China, in 2004. After graduation, he worked as a Research Assistant at Imperial College London, United Kingdom, and at Monash University, Australia, and as a Research Fellow at the University of Melbourne, Australia. Before returning to China, he was a Lecturer with the Department of Computer Science, University of Bath, United Kingdom. Currently, he is a Professor in the Hundred Talents Program of CAS at the Institute of Automation, Chinese Academy of Sciences, P. R. China. His major research interests include machine learning, pattern recognition, computer vision, multimedia processing, and data mining.

Program outline

Time           Event
8:55~9:00      Welcome Introduction
9:00~9:45      Invited Talk (Ramin Zabih)
9:50~10:35     Invited Talk (Jifeng Dai)
10:35~11:15    Oral Session (2 presentations: 20min each)
11:15~11:30    Coffee break
11:30~12:10    Oral Session (2 presentations: 20min each)
14:15~15:00    Invited Talk (Liang Wang)
15:00~16:00    Poster Session
16:00~17:00    Oral Session (3 presentations: 20min each)
17:00~17:15    Closing Remarks

Oral Session 1 (10:35~11:15)

Oral Session 2 (11:30~12:10)

Oral Session 3 (16:00~17:00)

Poster Session (15:00~16:00)

Paper Submission Information

All submissions will be handled electronically via the workshop’s CMT Website. Click the following link to go to the submission site: https://cmt3.research.microsoft.com/CEFRL42019.

Papers should describe original and unpublished work on the related topics. Each paper will receive double-blind reviews, moderated by the workshop chairs. Authors should take into account the following:

The authors will submit full length papers (ICCV format) online, including:

(1) Title of paper and short abstract summarizing the main contribution,

(2) Names and contact info of all authors, also specifying the contact author,

(3) Contributions must be written and presented in English, and

(4) The paper in PDF format.

All submissions will be peer-reviewed by at least 3 members of the program committee.

Poster guideline: The poster panels measure 1950 mm (width) x 950 mm (height), the same as those of the main conference.


Dr. Li Liu
(University of Oulu & NUDT)
Dr. Yu Liu
(PSI group of KU Leuven)
Dr. Wanli Ouyang
(University of Sydney)
Dr. Jiwen Lu
(Tsinghua University)
Prof. Matti Pietikäinen
(University of Oulu)

Previous CEFRL Workshop

· 3rd CEFRL Workshop in conjunction with CVPR 2019

· 2nd CEFRL Workshop in conjunction with ECCV 2018

· 1st CEFRL Workshop in conjunction with ICCV 2017

Please contact Li Liu if you have any questions. The webpage template is courtesy of the awesome Georgia.