The idea behind OuterCircle – 360-degree image capture on iOS – was to create a 360-degree view of a car. A similar media concept can be described as a sphere with a fixed centre, represented by the observer.
The problem was broken down into several parts:
- Generating both the images and their spatial coordinates;
- Storing data in a flexible format for display on both mobile and web;
- Displaying the data in a custom player with support for iOS, Android and web.
The data gathering phase requires the device camera and its motion sensors. In this phase, camera, motion and positioning data are collected and analysed so that the highest-quality photos available can be stored and used. The principle behind capturing the 360-degree view is taking images at specific locations – the image capture is triggered by the device’s movement. Because it is the user who needs to move, the application provides an innovative user interface that helps the user move around the object in a way suitable for this kind of recording.
Most devices have two cameras, one on the back and one on the front. For the purposes of this project the back camera is preferred because it takes higher-quality photos.
There are several outputs for the feed:
The app requires two outputs: one for the actual 360-degree image capture and one for the real-time preview that guides the user while recording – the application augments this preview with motion sensor data to help the user take high-quality 360-degree views of the stationary object. The preview output is processed using AVCaptureVideoPreviewLayer.
For performance reasons we use framebuffers, as they are the least processed output AVFoundation can provide. The framebuffers have to be processed as soon as they are received from the iOS SDK, and the implementation of this real-time processing needs to take into account the processing and memory limits of the devices.
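The usual way to respect those limits is to drop frames when the processing pipeline is busy instead of buffering them. The actual implementation uses the iOS SDK; the sketch below illustrates the drop-when-busy principle in Python with a bounded queue (all names are illustrative):

```python
import queue

# A bounded queue models the realtime constraint: if the consumer
# (frame processing) cannot keep up, new framebuffers are dropped
# rather than buffered, keeping memory and latency bounded.
frames = queue.Queue(maxsize=2)

def on_frame_received(frame):
    """Called by the capture callback for every new framebuffer."""
    try:
        frames.put_nowait(frame)   # accept the frame if there is room
        return True
    except queue.Full:
        return False               # drop the frame: processing is behind
```

With `maxsize=2`, a third frame arriving before any has been consumed is simply discarded, which mirrors how late video frames are discarded on the device.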
For the purposes of this application we need to record the time each framebuffer was captured and correlate it with the motion sensor data. Framebuffers are delivered as they are recorded, but the delay for each one can vary, and processing more buffers increases delays as it increases the load on the CPU and GPU.
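Correlating a framebuffer with the motion sample recorded closest to its timestamp can be sketched as a binary search over time-sorted motion samples (Python for illustration; function and variable names are assumptions):

```python
from bisect import bisect_left

def closest_motion_sample(motion_samples, frame_time):
    """motion_samples: list of (timestamp, attitude) tuples sorted by
    timestamp. Returns the sample whose timestamp is nearest to the
    time the framebuffer was recorded."""
    times = [t for t, _ in motion_samples]
    i = bisect_left(times, frame_time)
    if i == 0:
        return motion_samples[0]
    if i == len(times):
        return motion_samples[-1]
    before, after = motion_samples[i - 1], motion_samples[i]
    # pick whichever neighbour is closer in time
    return before if frame_time - before[0] <= after[0] - frame_time else after
```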
CoreMotion provides data on the device’s motion and rotation in space and around the stationary object.
There are three axes of interest for us: pitch, roll and yaw.
Depending on the device orientation, two axes will be considered adjustments and one will provide the angle. For Landscape Right, yaw provides the angle, while pitch and roll are considered deviations and should stay as close to 0 as possible for an optimal position.
The device will store the start point of the recording in order to determine:
- deviation for pitch and roll;
- progress, based on the yaw angle.
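With the starting attitude stored, deviations and progress reduce to angle differences with wrap-around. A sketch under the Landscape Right assumption above, where yaw carries the angle (Python for illustration, working in degrees):

```python
def signed_diff(a, b):
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def capture_state(start, current):
    """start/current: dicts with 'yaw', 'pitch', 'roll' in degrees.
    Returns progress around the object and the two deviations,
    which should stay close to 0 for an optimal position."""
    return {
        "progress": (current["yaw"] - start["yaw"]) % 360.0,
        "pitch_deviation": signed_diff(current["pitch"], start["pitch"]),
        "roll_deviation": signed_diff(current["roll"], start["roll"]),
    }
```

The modulo keeps progress monotonically meaningful even when the yaw wraps past 360 relative to the starting point.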
We assume the CoreMotion data is provided in real time. The acceleration is also monitored to avoid positions that might cause image blur.
Once a framebuffer is received, the app checks the position at the time it was recorded.
The buffer will be discarded if:
- there is no positioning data;
- the position is already filled, and the new image’s quality is lower than the previous one’s.
The angle determines whether a position has been filled by an image. If the 360-degree image capture is set to be composed of 72 images, there should be an image every 5 degrees.
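Mapping an angle to its position is then integer division by the angular step, and a new buffer replaces a stored one only if its quality is higher. A sketch with the 72-image example (Python for illustration; names are assumptions):

```python
STEP = 360.0 / 72          # 5 degrees between consecutive images

def slot_for_angle(angle):
    """Index of the position an image at `angle` (degrees) fills."""
    return int((angle % 360.0) // STEP)

slots = {}  # slot index -> (quality, image)

def accept(angle, quality, image):
    """Keep the image only if its slot is empty or it beats the stored one."""
    s = slot_for_angle(angle)
    if s not in slots or quality > slots[s][0]:
        slots[s] = (quality, image)
        return True
    return False  # discard: position already filled with a better image
```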
Buffer quality for this recording is given by:
- roll deviation;
- pitch deviation;
- angle – which needs to be as balanced as possible.
The quality value is the geometric distance in 4D; weights are added to improve the results, and tuning them takes trial and error.
Buffers whose quality value falls below a specified threshold will be discarded outright.
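The weighted 4D distance can be sketched as a Euclidean norm over the deviation components; here we assume the fourth component is the monitored acceleration mentioned earlier, and formulate quality as a distance, so buffers farther than the threshold from the ideal position are discarded. The weights and threshold are purely illustrative and would need the trial-and-error tuning noted above:

```python
import math

# Illustrative weights; tuning them takes trial and error.
WEIGHTS = {"roll": 1.0, "pitch": 1.0, "angle": 2.0, "accel": 0.5}
THRESHOLD = 10.0  # illustrative cutoff distance

def buffer_distance(roll_dev, pitch_dev, angle_dev, accel):
    """Weighted 4D distance from the ideal capture position (all zeros)."""
    return math.sqrt(
        (WEIGHTS["roll"] * roll_dev) ** 2
        + (WEIGHTS["pitch"] * pitch_dev) ** 2
        + (WEIGHTS["angle"] * angle_dev) ** 2
        + (WEIGHTS["accel"] * accel) ** 2
    )

def keep_buffer(roll_dev, pitch_dev, angle_dev, accel):
    """A buffer is kept when it is close enough to the ideal position."""
    return buffer_distance(roll_dev, pitch_dev, angle_dev, accel) <= THRESHOLD
```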
Each 360 degrees image capture contains the following metadata:
- uid – unique identifier;
- date – unixtime;
- radius – the optimal angular distance between two images (5 degrees in the previous example);
- list of elements, each element being the data of an image:
  - time – relative to the sphere starting point;
The images are stored on disk, and only the file name is used as an identifier.
The entire structure is exported into two components:
- a json file;
- a series of images.
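The exported JSON file might look like the following; the fields follow the metadata list above, while all concrete values and file names are illustrative:

```python
import json

capture = {
    "uid": "c2f1a9e0",      # unique identifier (illustrative)
    "date": 1489399200,     # unixtime
    "radius": 5,            # optimal angular distance between two images
    "elements": [
        # one entry per image; the file name doubles as the identifier
        {"name": "img_000.jpg", "angle": 355.0, "time": 0.00},
        {"name": "img_001.jpg", "angle": 1.0,   "time": 0.35},
    ],
}

exported = json.dumps(capture, indent=2)
```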
During the export, the images are reordered according to angle rather than time, because a person's natural movement involves both forward and backward motions. If there are gaps in the images, the remaining ones are still reordered. This process is critical for incomplete recordings.
Ex: if the angles are 1, 7, 355, 10, 19, 14 -> 355, 1, 7, 10, 14, 19
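The reordering in the example treats angles just below 360 as positions just before the start, so the sort key is the signed offset from the starting angle. A sketch in Python, assuming a start angle of 0 as in the example:

```python
def reorder_by_angle(angles, start=0.0):
    """Sort capture angles by their signed offset from the start angle,
    so 355 (i.e. -5 relative to the start) comes before 1."""
    return sorted(angles, key=lambda a: ((a - start + 180.0) % 360.0) - 180.0)

reorder_by_angle([1, 7, 355, 10, 19, 14])  # -> [355, 1, 7, 10, 14, 19]
```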
The images are loaded from disk based on order, not angle.
Each image corresponds to a user angle. If some angles are missing, the closest image is displayed, based on its angle.
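Selecting the displayed image for the current user angle is then a nearest-neighbour lookup using circular distance, which fills any gaps in the recording. An illustrative sketch (Python; names are assumptions):

```python
def circular_distance(a, b):
    """Shortest angular distance between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def image_for_angle(images, user_angle):
    """images: list of (angle, name). Returns the name of the image whose
    angle is closest to the user's current angle."""
    return min(images, key=lambda img: circular_distance(img[0], user_angle))[1]
```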
There are several fine adjustments to be considered:
- preventing CoreMotion floods;
- handling data on separate threads;
- calibrating the buffer values;
- adjusting the recording angle.
There are no adjustments that can compensate for vertical movement of the device. If the user keeps the device's rotation stable while moving it vertically or horizontally, the images taken will receive the same quality value. Using a gimbal might help record better 360-degree images of stationary objects.
The 360-degree image recordings can be used in e-commerce and m-commerce for the presentation of products, including large objects such as cars, houses and appliances. Other applications are in social media and tourism, and there are also possibilities of including the technology in games.
Any application that includes media can profit from the prototype.
HyperSense has also developed a 360-degree image capture prototype for Android, using a similar but not identical algorithm.
For any questions or comments please contact us.