Digilite: Towards An Enriching Image-Viewing Experience

Contributor:  Saksham Singh 

The Need

Until now, users on online platforms have had to view cropped images uploaded by the seller, which might not give sufficient information about the room. These photos often cover only small parts of the room and thus do not accurately convey what the apartment looks like. On the other hand, it is not feasible for every seller/broker to do a 3D photo shoot of the room, as that is a costly and time-consuming affair. Also, for apartments that are not yet fully built, sellers might want to hide some details, which might not be possible in a 3D shoot.

Digilite provides a middle ground: with minimal effort, sellers can click and upload images, which we then use to give the user a 3D-like view. This approach lets the user get an overall view of a room without requiring much effort on the seller's side.


The Approach

The approach we took can be broadly divided into two steps:

  1. Assuming that we have the images in the desired format, how to create the viewing experience with the tools we already have so as to give the user a better view of the room. We initially tried creating a stitched panoramic image (Google Panorama) by taking multiple consecutive images and combining them with a stitching algorithm. However, this would have required the seller/broker to click around 20 images, which was not feasible. Also, for objects near the camera that appeared in two different images, the stitching did not produce good panoramas. We therefore adopted an alternative approach that involved no panorama creation: captured images were required to have a certain overlap between them, and the idea was to transition between them in a way that gives a 3D experience.

  2. How to get the seller/broker to click and upload the images in the required format, and how to get the required data about the images. This required the web browser on a smartphone to be able to access the camera, and the smartphone to be able to provide orientation values.

The Viewing Experience

For the viewing experience, we first had to find a way to place the images so that we could easily transition between them. Second, we had to figure out how to make those transitions seamless.


Placing the Images

Three.js has been used to place the images, which are applied as textures to planes. Placing these planes required finding their positions relative to each other and to the camera. Since the planes are equidistant from the camera, we can assume that each plane acts as a tangent to a sphere with the camera at its center. Seen in 2D, the sphere becomes a circle and the planes become tangents to it. From the parametric equation of a circle:

P = (R sin θ, R cos θ)

where P is the point where a tangent touches the circle (the center of the image plane), R is the radius of the circle, and θ is the angle that the line from the center of the circle to the center of the tangent forms with the y-axis.

Using the above equation, we can find the center of each plane from its distance from the camera and its relative orientation.
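As an illustration, here is a minimal Three.js sketch of this placement (not the actual Digilite code; the plane size and radius are assumed values, and the 2D circle is taken to lie in the horizontal x-z plane):

    import * as THREE from 'three';

    const scene = new THREE.Scene();
    const R = 5; // assumed distance from the camera (at the origin) to every plane

    // Place one image plane per capture, tangent to the circle of radius R.
    // theta is the angle measured from the reference axis, as in the equation above.
    function placeImagePlane(texture, theta) {
      const material = new THREE.MeshBasicMaterial({ map: texture, transparent: true });
      const plane = new THREE.Mesh(new THREE.PlaneGeometry(4, 3), material);

      // Center of the tangent plane: P = (R·sinθ, R·cosθ), mapped to x and z.
      plane.position.set(R * Math.sin(theta), 0, R * Math.cos(theta));

      // Keep the plane tangent to the circle by facing it toward the camera.
      plane.lookAt(0, 0, 0);

      scene.add(plane);
      return plane;
    }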

The Transition

Now that we had images placed at their respective positions, the next step was to figure out how to transition between consecutive images so as to give the user a 3D experience. This required tweaking the internal properties of the planes and interpolating their values as the camera rotates from one image to the next. After some trial and error, we decided to change the opacity of the two planes involved in a transition. However, we did not change the opacity of the whole plane; instead, only the part lying in the overlap region was faded in and out. This prevents unnecessary blacking out of areas. Also, since the overlapping areas show the same content, differing only in orientation, this gives a nice toggle effect.

Since we had to change the opacity of only a part of each plane, we had to write custom shaders for the material. WebGL shaders have two tightly coupled but distinct parts: a vertex shader and a fragment shader. The vertex shader calculates the position of each vertex, while the fragment shader determines the colour of each pixel.

The opacity of each pixel depended on how far it was from the overlap boundary and on the direction in which the camera was rotating.
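A minimal sketch of such a fade as a Three.js ShaderMaterial (the uniform names, the overlap convention and the fade curve here are assumptions, not the actual Digilite shaders):

    import * as THREE from 'three';

    const fadeMaterial = new THREE.ShaderMaterial({
      transparent: true,
      uniforms: {
        map: { value: null },           // image texture for this plane
        progress: { value: 0.0 },       // 0 → 1 as the camera rotates to the next image
        overlapWidth: { value: 0.25 },  // assumed fraction of the plane inside the overlap
      },
      vertexShader: `
        varying vec2 vUv;
        void main() {
          vUv = uv;
          gl_Position = projectionMatrix * modelViewMatrix * vec4(position, 1.0);
        }
      `,
      fragmentShader: `
        uniform sampler2D map;
        uniform float progress;
        uniform float overlapWidth;
        varying vec2 vUv;
        void main() {
          vec4 color = texture2D(map, vUv);
          float alpha = 1.0;
          if (vUv.x < overlapWidth) {
            // Normalised distance of this pixel from the edge of the overlap strip.
            float t = vUv.x / overlapWidth;
            // A fade front sweeps across the strip as the transition
            // progresses, so the whole strip is transparent at progress = 1.
            alpha = 1.0 - smoothstep(t, t + 0.2, progress * 1.2);
          }
          gl_FragColor = vec4(color.rgb, color.a * alpha);
        }
      `,
    });

During a transition, the progress uniform would be tweened from 0 to 1 on the outgoing plane while the incoming plane fades in the opposite way.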

Put simply, when we transition to the next image, the pixels in the overlapping part of the front plane change opacity based on their distance from the overlap boundary, and by the end of the transition the entire overlap region is fully transparent. The result is shown below:

Although the transition is smooth and there are no blackout areas, objects that are split across the two images tend to stand out. To deal with this, we introduced a blur shader which, as the name suggests, blurs the image during rotation, giving the effect of very fast motion. The blur was applied to pixels using the same principle we used for opacity. The end result is shown below:
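As a sketch of the blur idea (a simple directional blur along the rotation direction; the uniform names and sample count are assumptions):

    // GLSL fragment shader, embedded as a string for a Three.js ShaderMaterial.
    const blurFragmentShader = `
      uniform sampler2D map;
      uniform float strength;  // 0 = sharp; grows with rotation speed
      uniform vec2 direction;  // unit vector along the camera's rotation
      varying vec2 vUv;

      void main() {
        vec4 sum = vec4(0.0);
        const int SAMPLES = 8;
        // Average a few samples along the rotation direction to
        // approximate the smear of very fast motion.
        for (int i = 0; i < SAMPLES; i++) {
          float offset = (float(i) / float(SAMPLES - 1) - 0.5) * strength;
          sum += texture2D(map, vUv + direction * offset);
        }
        gl_FragColor = sum / float(SAMPLES);
      }
    `;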


Capturing the Images

Now it was time to collect the actual data we would use. To achieve this, we first had to figure out the following: a way to let smartphone web browsers capture good-quality photos; how to get the orientation details of each image so that the images can be placed accurately; and how to make sure that all images had a reasonable amount of overlap and that the device was not tilted or turned during capture.

Using the Smartphone Camera

ImageCapture is an API, supported in Chrome, for capturing still images and configuring camera hardware settings. It allows capturing images at the maximum resolution the device camera supports, and it enables control over camera features such as zoom, brightness, contrast, ISO and white balance. Capture can be done in two ways: takePhoto(), which uses the full resolution of the camera, and grabFrame(), which only takes a snapshot of the live feed we receive from user media.

For our purposes, we used takePhoto() to capture high-resolution images. navigator.mediaDevices.enumerateDevices() gives us the list of available media devices, from which we can select the back camera. The live stream is then passed to a video tag, which displays the feed from the camera. Every time a picture is taken, a bitmap image of the appropriate width and height is created. On save, an API call is sent to store all the captured images.
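A minimal sketch of this capture flow (Chrome's ImageCapture API; error handling and the save call are omitted):

    // Show the back camera's live feed in a <video> element, then wrap the
    // video track in an ImageCapture object for full-resolution photos.
    async function startCapture() {
      const stream = await navigator.mediaDevices.getUserMedia({
        video: { facingMode: 'environment' }, // prefer the rear camera
      });
      document.querySelector('video').srcObject = stream;

      const [track] = stream.getVideoTracks();
      return new ImageCapture(track);
    }

    // takePhoto() returns a full-resolution Blob; grabFrame() would instead
    // return an ImageBitmap of the current live-feed frame.
    async function capturePhoto(imageCapture) {
      const blob = await imageCapture.takePhoto();
      return createImageBitmap(blob); // bitmap with the photo's width and height
    }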

Getting the Orientation

The Generic Sensor API exposes sensor devices to the web platform. Chrome supports several sensors, such as:

Motion sensors: Accelerometer, Gyroscope, LinearAccelerationSensor, AbsoluteOrientationSensor, RelativeOrientationSensor

Environmental sensors: AmbientLightSensor, Magnetometer

One of these is the RelativeOrientationSensor, which gives us the orientation of the device with respect to a stationary reference coordinate system. The sensor reports the orientation as a quaternion, which can be used directly in Three.js to compute the angle between two planes and hence the center of each plane.
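A minimal sketch of reading this sensor and converting it to a Three.js quaternion (permission and availability checks are omitted):

    import * as THREE from 'three';

    // RelativeOrientationSensor reports a unit quaternion [x, y, z, w]
    // relative to a stationary reference frame.
    const sensor = new RelativeOrientationSensor({ frequency: 60 });
    const deviceQuaternion = new THREE.Quaternion();

    sensor.addEventListener('reading', () => {
      deviceQuaternion.fromArray(sensor.quaternion);
    });
    sensor.addEventListener('error', (e) => console.error(e.error));
    sensor.start();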

Checking for Overlap and Tilt

Since consecutive images should have a certain amount of overlap, we had to make sure that while capturing, the user neither exceeded the expected overlap nor made it too small. To facilitate this, the angle between the device's orientation at the last capture and its current orientation is computed. If the angle does not lie in the expected range, an error message is displayed on screen.
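For instance, with two such quaternions, the rotation between captures can be checked like this (the angle bounds and showError() are assumptions):

    // lastCaptureQuaternion: THREE.Quaternion saved at the previous capture.
    // Angle (in radians) between that orientation and the current one.
    const angle = lastCaptureQuaternion.angleTo(deviceQuaternion);

    // Assumed bounds for a reasonable overlap between consecutive images.
    const MIN_ANGLE = THREE.MathUtils.degToRad(20);
    const MAX_ANGLE = THREE.MathUtils.degToRad(40);

    if (angle < MIN_ANGLE) {
      showError('Too much overlap: rotate a bit further.');   // hypothetical UI helper
    } else if (angle > MAX_ANGLE) {
      showError('Too little overlap: rotate back slightly.');
    }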

To check for tilt or turn of the device, we used the device orientation events. A deviceorientation event provides three values: alpha (rotation around the z-axis), beta (rotation around the x-axis) and gamma (rotation around the y-axis). Using these, we can identify whether the device is tilted and show an appropriate error message.
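A sketch of such a tilt check (the upright-portrait assumption and the 10° tolerance are illustrative):

    // In portrait orientation, beta ≈ 90° when the phone is held upright and
    // gamma ≈ 0° when it is not rolled sideways.
    const TOLERANCE = 10; // degrees, assumed

    window.addEventListener('deviceorientation', (event) => {
      const tilted = Math.abs(event.beta - 90) > TOLERANCE ||
                     Math.abs(event.gamma) > TOLERANCE;
      if (tilted) {
        showError('Hold the phone upright and level.'); // hypothetical UI helper
      }
    });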

Conclusion

We have successfully captured several projects using Digilite and obtained good final products. The idea of the project was to improve the viewing experience for the user and, at the same time, devise a capture process that does not require much effort. A good amount of thought went into keeping the capture process simple by providing step-by-step instructions that are easy to follow. The transitions in the viewer feel seamless for medium to large overlaps, but for smaller overlaps between images they can sometimes feel abrupt.

Overall, we believe that through Digilite we have been able to solve both of the problems mentioned above and have developed something with the potential to serve this use case.

You can open the link below to view already created projects or to create your own: Digilite