Your AR App Sucks – OR – The Current Sad State of Augmented Reality

Right now, as of early 2018, Augmented Reality sucks. It’s not fun, it’s not useful or engaging, and it doesn’t solve any problems or provide any value.

Allow me to explain. Let’s start at the beginning.

History

Augmented Reality has been around for a very long time. Back in 2005 I was hired as the R&D manager and lead developer for EyeClick, which specialized in projection based AR for public locations. It wasn’t labeled as “AR” back then, but it was identical to AR systems today. Here is a video of the stuff we made:

AR using smartphones is also ancient. I don’t know exactly when it got started, but I remember seeing a demo at an expo in Chicago around 2010. It used a 2D marker to show 3D content on a smartphone screen.

What is AR anyway?

The term Augmented Reality has become very fluid this past year or so. Pretty much anything that has a display might be labeled as AR, or in many cases, mislabeled. The different types of AR have some common properties but they are also very different and have wildly different use cases. Let’s go over the list:

Interactive Displays

This is the oldest form of AR and probably the least AR’ish of the lot. Honestly, it’s debatable whether this should be labeled AR at all, but I’ll go with the flow. These are systems that react to touch and/or movement, using sensors like cameras or touchfoils. The very first system that I developed used a cheap Logitech webcam hacked to operate in IR so it wouldn’t see the projection itself, just the people walking over it.

These systems work well, but they never really took off. There are a few reasons why:

Installation is difficult and expensive, projectors are costly to run.
Doesn’t scale up easily
Not really appealing to the general public
Falls into most people’s “advertisement blind spot”

The last two are the most important. We found that many just didn’t care. Best case scenario they would poke around the system for a few seconds, then move on and never touch it again. Worst case they wouldn't even see it. I’ve seen people instinctively walk around a floor projection system without even glancing at it.

The only audience that took to these displays was kids, which is why EyeClick eventually re-branded itself as Beam and now caters its products exclusively to children.

0DOF Headsets / Glasses

DOF means Degrees Of Freedom. Headsets that are 3DOF can track orientation (where you are looking). 6DOF means tracking both orientation and position (I’ve explained this in detail in a previous blog post).

Zero DOF headsets do not track anything. They overlay a digital image on top of whatever you are looking at, and while this image can be animated, it can’t be “attached” to a real-world object or position. I personally don’t consider these AR at all, but they are often labeled as AR so I will include them.
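To make the 0DOF / 3DOF / 6DOF distinction concrete, here is a minimal Python sketch. The `HeadPose` class and `degrees_of_freedom` function are hypothetical names made up for illustration, not part of any headset SDK.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class HeadPose:
    """Hypothetical pose record, for illustration only."""
    # (yaw, pitch, roll) in degrees; None if the headset does not track it.
    orientation: Optional[Tuple[float, float, float]] = None
    # (x, y, z) in meters; None if the headset does not track it.
    position: Optional[Tuple[float, float, float]] = None


def degrees_of_freedom(pose: HeadPose) -> int:
    """Count the tracked degrees of freedom: 3 for orientation, 3 for position."""
    dof = 0
    if pose.orientation is not None:
        dof += 3
    if pose.position is not None:
        dof += 3
    return dof


zero_dof = HeadPose()                                     # nothing tracked
three_dof = HeadPose(orientation=(30.0, 0.0, 0.0))        # where you are looking
six_dof = HeadPose(orientation=(30.0, 0.0, 0.0),
                   position=(1.0, 1.7, 0.5))              # ...and where you are

for pose in (zero_dof, three_dof, six_dof):
    print(degrees_of_freedom(pose))
```

A 0DOF headset has nothing to feed the renderer, which is why its image can only float in front of your face instead of sticking to the world.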

The most well-known example is Google Glass which was originally a flop but has recently regained traction. There are a few others, like the Sony SED-E1, which offers stereo displays but is also much bulkier. I’ve personally used both and found them lacking. The display is floating in space and fixed to your head, making it very distracting. It’s hard to walk around and focus on the displayed content at the same time. We did a few tests and found it’s at most as useful as a tablet, but at a much greater cost and complexity.

Smartphone AR, with and without a marker

The most common platform for AR is the smartphone, and the most common method is to use a 2D marker. The marker can be anything, but it’s best if it’s high contrast and has a good amount of detail on it. Here is our very own business card AR app, mostly used at parties and events:

About a year ago Apple released ARKit, and Google shortly followed suit with ARCore. Both allow developers to create AR apps that don’t require markers. The downside is that both require high-end smartphones (as of 2018) and neither offers location-pinned AR. Meaning, I can place a virtual object on any flat surface, but the system can’t distinguish between the kitchen counter top and the garage floor.

Here is a notable example, The Ring made in AR by the super talented and insanely prolific Abhi Shek.

The problem with these apps is that they look great in video, but in real life they are just… unfun. Let’s do an experiment. Take out your phone, open the camera app, and now walk around your office or home, while looking only at the phone screen. How long do you think you can keep that up? A minute? Maybe. 10 minutes? An hour? Probably not. It’s exhausting, hard on your eyes, back and arms. It’s just not fun. It’s a gimmick that usually has a very short engagement period.

But wait, you’ll say. There are lots of successful AR apps. There’s Pokemon Go and … mmm… yeah. That’s it. Pokemon Go was a wild (if somewhat brief) success, but not because of AR. In fact, it’s barely AR at all. All it does is display an animated image over the camera feed. There is no tracking, and the only sensory input the app cares about is GPS position (which has an error of about 5 meters).

3DOF and 6DOF Headsets

On the high end of the AR spectrum are the 3DOF and 6DOF headsets, and the best one is still the HoloLens (despite being released in March 2016). There are other contenders, like the DAQRI (which we bought and returned due to a poorly designed head band that was super painful to use) or the Meta 2. I consider Magic Leap to be vaporware until they actually release something – anything – a video of the headset, specs or a release date.

The HoloLens is a ground-breaking device and a marvel of miniaturization, but it’s not a good AR headset. In fact, it’s terrible.

The biggest problem is the field of view, which is around 32°. For comparison, the HTC Vive is 110°, and even that feels limiting. 32° is downright unusable. If you want to visualize what it looks like, hold your phone about 20cm from your face. The first time I tried the HoloLens was at a conference, and the attendant told me to pick up an object from a virtual table. I looked around and asked “What table?”. It was right in front of me and I missed it, because the field of view is so tiny. You literally have to scan your environment back and forth like a radar if you want to find something.
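The phone comparison holds up to a quick back-of-the-envelope check. The sketch below uses basic trigonometry and assumes a phone about 12 cm long; that figure is an assumption for illustration, not a measurement from any particular device.

```python
import math


def angular_size_deg(size_cm: float, distance_cm: float) -> float:
    """Angle in degrees covered by an object of the given size, seen head-on."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def visible_width_m(fov_deg: float, distance_m: float) -> float:
    """Width of the scene that fits inside a given field of view at a distance."""
    return 2 * distance_m * math.tan(math.radians(fov_deg) / 2)


# Assumed phone length: 12 cm, held 20 cm from the face.
phone_angle = angular_size_deg(12, 20)
print(f"Phone at 20 cm covers about {phone_angle:.0f} degrees")

# How much of a wall 1 m away fits in view, using the FOV figures quoted above.
for name, fov in (("HoloLens", 32), ("HTC Vive", 110)):
    print(f"{name}: {visible_width_m(fov, 1.0):.2f} m wide at 1 m")
```

Under that assumption the phone covers roughly 33°, which is close to the HoloLens figure. At one meter away, a 32° view fits a bit under 0.6 m of wall, so a virtual table right in front of you can easily fall outside it.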

The display itself is additive. It’s a transparent screen that overlays digital information on top of light coming from the real-world. The key word is overlay, it cannot block light and it can’t display black, the best it can do is not display anything. That means you can always see through the display, and the more well lit your environment, the weaker the AR display. Outdoors, during the day, it’s almost invisible. It works best in the dark, but that totally negates the purpose of AR.
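A toy model shows why an additive display can’t show black and washes out in bright light. This is a simplified sketch, not how any real headset computes its output: colors are RGB values between 0.0 and 1.0, and the function name and sample values are made up for illustration.

```python
def seen_through_additive_display(real_rgb, display_rgb):
    """Light reaching the eye: real-world light plus display light, clipped at 1.0.

    The display can only add light; it has no way to subtract any.
    """
    return tuple(min(1.0, r + d) for r, d in zip(real_rgb, display_rgb))


# Hypothetical sample values, chosen only to illustrate the effect.
grey_wall = (0.5, 0.5, 0.5)
dark_room = (0.05, 0.05, 0.05)
sunlit_street = (0.9, 0.9, 0.9)

black_pixel = (0.0, 0.0, 0.0)
red_pixel = (0.5, 0.0, 0.0)

# A "black" pixel adds nothing, so you simply see the wall.
print(seen_through_additive_display(grey_wall, black_pixel))

# The same red pixel stands out in the dark and nearly vanishes in daylight.
print(seen_through_additive_display(dark_room, red_pixel))
print(seen_through_additive_display(sunlit_street, red_pixel))
```

In the dark room the red channel ends up many times brighter than the background. On the sunlit street it clips at full brightness while the other channels are already at 0.9, so the pixel is barely distinguishable.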

On top of these issues, the HoloLens is also bulky, expensive and the tracking is sometimes spotty. The transparent visor reflects things from behind you (really distracting) and it’s hard to use if you require glasses.

This picture is from the official HoloLens website. Notice something weird? The model is wearing it wrong. The headband should go on your forehead. The proper way to wear a HoloLens looks so silly that most publicity shots are taken with the headset worn incorrectly.

The correct way to wear a HoloLens. Doesn’t look very sexy.

That was a bit ranty, so I will offset it by saying that Headset AR is the most exciting and promising tech since the internet. The potential and market size are beyond measure, much bigger than VR. However, we are not there yet. The HoloLens was supposed to be a developer kit, but Microsoft decided to market and sell it as a final product, and it’s simply not ready.

A good AR headset is insanely difficult to make, an order of magnitude harder than a VR headset. The HoloLens V2 release date was pushed back 1-2 years, because Microsoft couldn’t finish all the fixes and improvements they wanted. Here is a list of features that are required of an AR headset, all of which are exceedingly difficult or even outright impossible with current tech:

Transparent display with a wide FOV (at least 100 degrees), which is also high resolution, high refresh rate, low latency and has almost no ghosting
High resolution “blocking layer” that can stop outside light to prevent ghosting
Small high resolution 3D scanners that can map the world around the user
Tiny computer capable of driving the 3D scanner in real time AND rendering stereo 3D content, all without getting super hot

When can we expect production-ready AR headsets? I predict 4-5 years, around 2023.

Honorable Mention: Pass-through camera AR

A hybrid approach is the camera pass-through AR. This is essentially a VR headset with one or two cameras, built-in or attached as an add-on. Two examples are the MergeVR and the Zed Mini, which I reviewed here.

On paper, this is the best of both worlds. AR without the complexity of transparent displays. However, the concept is fundamentally flawed:

Cameras have latency. They need to capture an image, digitize it, and send it to the display. This doesn’t take long, and in any other application you’d be hard pressed to even notice it. Not in VR though. Anything over 20ms latency will make people really sick really fast. I did a few experiments with Vuforia and pass-through AR and I literally had to tear the headset off my face after 10 seconds, it was so nauseating.
Unlike your eyes, the cameras are located outside the headset. That means they don’t move like your eyes do, when you move your head. This is a small difference but it’s noticeable and it makes everything seem very weird. This too will make most people sick.
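To get a feel for why camera latency matters so much, here is a rough sketch of how far the image lags behind your head while you turn. It assumes a constant turn speed and ignores everything else in the pipeline. The 100°/s head speed and the 100 ms camera latency are hypothetical figures picked for illustration; only the 20 ms threshold comes from the text above.

```python
def image_lag_deg(head_speed_deg_per_s: float, latency_ms: float) -> float:
    """How many degrees the displayed image trails the real world during a head turn."""
    return head_speed_deg_per_s * latency_ms / 1000.0


HEAD_SPEED = 100.0  # assumed head turn speed in degrees per second

# 20 ms is the comfort threshold mentioned above; 100 ms is a made-up slow camera pipeline.
for latency in (20, 100):
    lag = image_lag_deg(HEAD_SPEED, latency)
    print(f"{latency} ms latency -> image lags by {lag:.1f} degrees")
```

Even at the 20 ms threshold the world trails your head by a couple of degrees in this model, and the lag grows linearly with latency. Your inner ear notices the mismatch long before you consciously do.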

Conclusion

VR looks underwhelming in video but is amazing in real-life, AR looks amazing in video but is underwhelming in real-life.

I’m super excited about AR and I think it will disrupt so many industries in the near future, but we are not quite there yet. Currently AR has two uses:

To generate snazzy video and gain some PR. “Look at our engineers using HoloLens to design cars!” The headset is then taken off after filming and never touched again.
As a gimmicky crowd-pleaser. Nothing wrong with a good gimmick, as long as that’s your goal. It puts a smile on people’s faces, just don’t expect them to remain engaged for more than a few minutes.

I’m still waiting for my AR headset. And jetpack.