The OBELIX team seminar is currently held on Thursdays at 11:30 am, every two weeks, at the IRISA lab, Tohannic campus, room D106 (bât. ENSIBS). The presentation usually lasts 30 minutes and is followed by a discussion with the team.
The seminar is coordinated by Chloé Friguet.
Upcoming seminars (2020-2021)
- Date: May, 21st (Friday)
- Time: 13:00
- Room: online (videoconference)
- Speaker: Ahmed Nassar (PhD OBELIX & ETH Zurich) – PhD defense
- Title: Learning to map street-side objects using multiple views
- Abstract: Creating inventories of street-side objects and monitoring them in cities is a labor-intensive and costly process. Field workers typically record the properties of each object on-site; for a tree, these properties can be its location, species, height, and health. Gathering such information at city scale is challenging. With the abundance of imagery, adequate coverage of a city can be achieved from the different views provided by online mapping services (e.g., Google Maps, Street View, Mapillary). This imagery allows street-side object inventories to be created and updated efficiently using computer vision methods such as object detection and multiple object tracking.
This thesis aims at detecting and geo-localizing street-side objects, especially trees and street signs, from multiple views. Solving this with an object detector raises the usual invariance problems in computer vision: occlusion, lighting, pose, viewpoint, and background. We rely on multiple views coupled with coarse pose information to address these problems and to gather more information about each object from the different views. Using multiple views brings another challenge: re-identifying the objects across views so that their information can be aggregated without duplicating any particular object. A further challenge is that the data sets acquired or used in our work contain imagery captured at a large baseline, contrary to data sets for person re-identification or self-driving, which consist of sequences of video frames. We propose several deep learning-based approaches to better detect, re-identify, and geo-localize objects and to tackle these challenges. Our first approach investigates whether soft geometric constraints coupled with image evidence improve the re-identification (matching) accuracy of objects across views, to overcome the large-baseline obstacle. It relies on image crops of the objects from ground-level imagery, together with geometric metadata extracted from the images, fed to a novel Siamese convolutional neural network architecture that matches the crops. Having confirmed that infusing the model with soft geometric constraints is beneficial, our second approach pursues the same objective with an end-to-end model: it takes a full image as input and outputs geo-localized bounding box detections tagged with identities across the different views.
To achieve this, we built a tool to annotate and create a data set of urban trees. Our final approach introduces another end-to-end model that relies on graph neural networks to improve flexibility and efficiency compared to the previous one; it is also the first of our approaches to include aerial imagery as an additional input. For all three proposed approaches, we perform extensive experiments on curated data sets to demonstrate the proposed systems' effectiveness.
Keywords: Deep Learning, Computer Vision, Object detection, Re-identification, Graph Neural Networks, Urban objects, Multi-view
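The core idea of the abstract, matching objects across views by combining image evidence with soft geometric constraints, can be sketched as follows. This is a minimal illustration, not the thesis's actual Siamese architecture: the fused score, the Gaussian geometric prior, the greedy assignment, and all function names are our own assumptions.

```python
import numpy as np

def match_objects(feat_a, feat_b, pos_a, pos_b, alpha=0.5, sigma=10.0):
    """Illustrative cross-view matcher (not the thesis's model).

    Scores every pair (i, j) by appearance similarity (cosine of the
    embeddings) softly weighted by geographic proximity, then assigns
    matches greedily, one-to-one.
    """
    # Cosine similarity between L2-normalised appearance embeddings.
    fa = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    fb = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    appearance = fa @ fb.T
    # Soft geometric prior: Gaussian falloff with pairwise distance.
    d = np.linalg.norm(pos_a[:, None, :] - pos_b[None, :, :], axis=2)
    geometry = np.exp(-(d / sigma) ** 2)
    # Fuse the two cues into one score matrix.
    score = alpha * appearance + (1.0 - alpha) * geometry
    # Greedy one-to-one assignment on the fused score.
    matches, used = [], set()
    for i in range(score.shape[0]):
        for j in np.argsort(-score[i]):
            if j not in used:
                used.add(j)
                matches.append((int(i), int(j)))
                break
    return matches
```

A real system would learn the embeddings and the fusion end to end, and would typically replace the greedy step with an optimal assignment (e.g., the Hungarian algorithm); the sketch only shows why a geometric prior disambiguates visually similar objects seen at a large baseline.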
- Date: POSTPONED
- Time: 11:30
- Room:
- Speaker: Yann Soulard (MCF LETG Rennes)
- Date: POSTPONED
- Time: 11:30
- Room:
- Speaker: Jeremy Cohen (CR PANAMA, IRISA Rennes)
- Title: Learning with Low Rank Approximations
- Abstract: Matrix and tensor factorizations are widespread techniques to extract structure from data, potentially in a blind manner. However, several issues arise: (i) these models were often designed for blind scenarios, whereas many applications now feature extensive training databases; (ii) their output may not be interpretable because of the lack of identifiability of the parameters; and (iii) computing a good solution can be difficult. In this talk, after describing the link between separable functions and tensor/matrix factorizations, we will show that separability and low-rank approximations are actually already at the core of many machine learning problems, such as dictionary learning or simultaneous factorizations.
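The low-rank approximation at the heart of this abstract can be illustrated with a minimal truncated-SVD sketch. The function name and the rank-2 example are our own, not taken from the talk; by the Eckart-Young theorem, the truncated SVD gives the best rank-r approximation in the least-squares sense.

```python
import numpy as np

def low_rank_approx(X, r):
    """Best rank-r approximation of X in the Frobenius norm
    (Eckart-Young), obtained by truncating the SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# A matrix that is exactly rank 2: its rank-2 approximation
# recovers it up to floating-point error.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 30))
A2 = low_rank_approx(A, 2)
rel_err = np.linalg.norm(A - A2) / np.linalg.norm(A)
```

Factorization models such as NMF or dictionary learning replace the SVD's orthogonal factors with constrained ones (nonnegative, sparse, ...), which is where the identifiability and computational issues mentioned in the abstract come from.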