Streetview Stitching and Augmented Reality on the Android Phone
MENTOR: Gary Bradski
This project takes some background. Since the algorithms are known, the technical difficulty is medium, but the implementation difficulty of making this work right is advanced.
This project builds the ability to reliably detect and recognize feature points between two overlapping images. This technology can be used here for three somewhat related projects, given in order of priority:
- Image stitching to find and stitch together images of a given scene into a larger panorama. By creating a graph between panoramas, a user can create their own additions to streetview.
- Similar to the above, but doing Visual SLAM to get the bearing between two panoramas, on up to full 3D Visual SLAM.
- Augmented reality. By finding points on a known surface, we can deduce where the camera is and render graphics in a realistic way onto a real scene.
These projects can use 3 people: a computer vision person who builds on feature recognition to create an image stitching or camera location application, a person who has experience with Visual SLAM and sparse bundle adjustment, and a person who understands Android (iPhone a plus) along with web programming. Specifically:
- The vision person must know C++ well.
  - You should have had a computer vision course, and understand:
    - geometric vision
    - alpha blending and color blending
    - image registration and stitching
- The VSLAM person must know optimization and sparse bundle adjustment, and have experience with VSLAM.
- The "web" person should know how to code Android apps (iPhone a plus), along with how to deal with large amounts of (image) data on a server.
  - You need to be able to get time, compass and GPS data associated with images on the phones.
  - Write an app to collect the images and invoke the stitching or camera location algorithm on them.
  - Negotiate with the streetview team for how to interface the linked panoramas into streetview.
Demo:
- Project 1: We want to see stitched panoramas and, hopefully,
- Project 2: linked graphs of panoramas, each with a central pixel carrying a proper heading, served from the street view server.
- Project 3: We want to view a game board, report the homography back to a server and pass back data to be blended in 3D onto the scene.
Minimum goal:
- Project 1: Out of a collection of images, stitch them into an alpha-blended panorama.
- Project 2: Show that you can get a proper heading between 2 nearby panoramas. Say you take 3 panoramas walking north; the headings should show north and form a line. Show that this can be done going indoors.
- Project 3: Find points on a game board and report these back to a server.
Medium goal: By end of summer, show the demos, the first 2 on streetview.
Maximum goal:
- Project 1,2: We'd like to be able to reconstruct the 3D model of a scene, similar to Photosynth, and have that served on streetview.
- Project 3: We'd like the demo to run on a street scene that's accurately registered on Google Maps/streetview.
References:
- Richard Szeliski's textbook, Computer Vision: Algorithms and Applications.
- The OpenCV book, Learning OpenCV: Computer Vision with the OpenCV Library.
PROJECT 1,2: Streetview
We want to enable anyone with an Android cell phone (and hopefully an iPhone as well) to extend the imagery in Google's streetview. In particular, we want a business to be able to add its exterior and interior to streetview if they want. We want people to be able to add trails, gardens and parks to streetview.
Demo: By end of summer, we want to see linked graphs of cell phone acquired panoramas to be served on streetview.
- Standing in one spot, collect overlapping images, send them to a server and have the server stitch them into a panorama.
- This will use SURF and Calonder features in OpenCV, as well as a new feature type being developed by Gary Bradski (a minimal stitching sketch follows this list).
- The VSLAM component simply tries to get consistent heading and location between stitched panoramas. Take 3 or more evenly spaced panoramas in a line going in some compass direction and show that they form correctly spaced points going in the right direction (that is, do this outdoors and show that the visual cues align with the phone's GPS and compass information; then do this going from outdoors to indoors and see if it makes spatial sense). A GPS-bearing check is sketched after this list.
- We want to link panoramas together. This is how streetview works -- it's just a graph of panoramas where the central pixel has a proper location and heading (in their case, registered to a map).
- These will be discoverable by using the same features as used to stitch related images together but detecting scale change perhaps augmented by time, accelerometer and compass data along with user input.
- Once we can reliably find feature points, we can use structure from motion to create 3D models and do Visual SLAM.
- Such work is active now at Willow Garage for robots, under the names "visual odometry" and "Visual SLAM" (Visual Simultaneous Localization and Mapping). This involves a routine called sparse bundle adjustment, which tries to minimize the registration errors of the whole set of images at once; its objective is sketched after this list.
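As a concrete starting point for the stitching step, here is a minimal sketch of pairwise alignment with SURF features, assuming an OpenCV build that includes the nonfree SURF module; the function name stitchPair, the Hessian threshold of 400, and the plain paste (a real pipeline would alpha-blend across the seam) are illustrative only:

```cpp
#include <opencv2/opencv.hpp>
#include <opencv2/xfeatures2d.hpp>  // SURF lives in the nonfree module
#include <vector>

// Warp img2 onto img1's image plane via a SURF-based homography.
cv::Mat stitchPair(const cv::Mat& img1, const cv::Mat& img2)
{
    // Detect SURF keypoints and compute descriptors in both images.
    cv::Ptr<cv::xfeatures2d::SURF> surf = cv::xfeatures2d::SURF::create(400.0);
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat d1, d2;
    surf->detectAndCompute(img1, cv::noArray(), kp1, d1);
    surf->detectAndCompute(img2, cv::noArray(), kp2, d2);

    // Match descriptors; keep matches that pass Lowe's ratio test.
    cv::BFMatcher matcher(cv::NORM_L2);
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(d2, d1, knn, 2);
    std::vector<cv::Point2f> src, dst;
    for (const std::vector<cv::DMatch>& m : knn)
        if (m.size() == 2 && m[0].distance < 0.7f * m[1].distance) {
            src.push_back(kp2[m[0].queryIdx].pt);
            dst.push_back(kp1[m[0].trainIdx].pt);
        }

    // Robustly estimate the homography mapping img2 into img1.
    cv::Mat H = cv::findHomography(src, dst, cv::RANSAC, 3.0);

    // Warp img2 onto a shared canvas and paste img1 on top.
    // A real pipeline would alpha-blend the overlap instead.
    cv::Mat canvas;
    cv::warpPerspective(img2, canvas, H,
                        cv::Size(img1.cols + img2.cols, img1.rows));
    img1.copyTo(canvas(cv::Rect(0, 0, img1.cols, img1.rows)));
    return canvas;
}
```

A homography only suffices because the images are taken standing in one spot (pure rotation); for a full panorama the aligned images would be projected to a cylinder or sphere before blending.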
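For the heading check, the visually recovered bearing between two panoramas can be compared against the bearing implied by their GPS fixes. A small sketch of that bookkeeping, with an illustrative PanoNode structure for the panorama graph:

```cpp
#include <cmath>

// Illustrative node in the panorama graph: phone GPS position plus
// the compass heading assigned to the panorama's central pixel.
struct PanoNode {
    double latDeg, lonDeg;   // GPS position from the phone
    double headingDeg;       // compass heading of the central pixel
};

// Initial great-circle bearing from node a to node b, in degrees
// clockwise from north; the visually estimated heading between two
// stitched panoramas should agree with this (outdoors, at least).
double gpsBearingDeg(const PanoNode& a, const PanoNode& b)
{
    const double kDegToRad = 3.14159265358979323846 / 180.0;
    double phi1 = a.latDeg * kDegToRad, phi2 = b.latDeg * kDegToRad;
    double dLam = (b.lonDeg - a.lonDeg) * kDegToRad;
    double y = std::sin(dLam) * std::cos(phi2);
    double x = std::cos(phi1) * std::sin(phi2) -
               std::sin(phi1) * std::cos(phi2) * std::cos(dLam);
    double bearing = std::atan2(y, x) / kDegToRad;  // in [-180, 180)
    return std::fmod(bearing + 360.0, 360.0);       // wrap to [0, 360)
}
```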
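For reference, sparse bundle adjustment jointly refines all camera poses and 3D points by minimizing total reprojection error; in generic notation, with $\pi$ the camera projection, $x_{ij}$ the feature observed for point $j$ in image $i$, and $v_{ij}$ a visibility indicator ($1$ if point $j$ is seen in image $i$, else $0$):

$$\min_{\{C_i\},\{X_j\}} \; \sum_i \sum_j v_{ij}\,\bigl\| x_{ij} - \pi(C_i, X_j) \bigr\|^2$$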
- To be provided
PROJECT 3: Augmented Reality on Android
We want to be able to register a cellphone to a game board or a visual scene and overlay the scene with augmented reality graphics.
Demo: A known game board will be viewed by cell phone video. At least 4 key points will be found and sent back to a server. The server will compute the homography and render some 3D graphic objects onto the game board scene and send these back to the camera. These will be combined with the actual image to make an augmented reality scene.
- Take videos with a cell phone camera of a game board on a table; identify 4 known points on the board and send those back to a server, which will invoke OpenCV's homography computation code to relate the image plane to the game board plane (see the sketch after this list). The found points will be sent back and rendered as dots into the scene.
- Render graphical objects onto the game board.
- Instead of a game board, we want to be able to do this outside on a street and render the objects onto streetview imagery.
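A minimal sketch of the server-side homography step, assuming the 4 detected points are the corners of a board with known geometry; the 100x100 board coordinate frame and the function name renderBoardOverlay are illustrative:

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Map known game-board points into the camera image and draw dots.
// imagePts are the 4 detected board corners in the cell phone frame;
// board coordinates here use an illustrative 100x100 unit board.
void renderBoardOverlay(cv::Mat& frame,
                        const std::vector<cv::Point2f>& imagePts)
{
    std::vector<cv::Point2f> boardPts = {
        {0, 0}, {100, 0}, {100, 100}, {0, 100}   // board plane corners
    };

    // Plane-to-plane homography: board coordinates -> image pixels.
    cv::Mat H = cv::findHomography(boardPts, imagePts);

    // Project some board-plane points (e.g. game piece positions)
    // into the image and render them as dots.
    std::vector<cv::Point2f> pieces = {{25, 25}, {50, 50}, {75, 25}};
    std::vector<cv::Point2f> projected;
    cv::perspectiveTransform(pieces, projected, H);
    for (const cv::Point2f& p : projected)
        cv::circle(frame, p, 5, cv::Scalar(0, 255, 0), cv::FILLED);
}
```

With exactly 4 correspondences, cv::findHomography solves the plane-to-plane mapping directly; on video, tracking more points and passing cv::RANSAC would make the registration robust to misdetections.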
- To be provided