Welcome back to the In Reality podcast, covering all things Augmented & Virtual Reality. In Reality features industry news, commentary, and perspective from AR/VR veterans and experts.
In Reality is co-hosted by Marxent’s Joe Johnson and Joe Bardi. Johnson is the Creative Director at Marxent, and has been in the AR/VR industry for four and a half years following a stint on Microsoft’s Office UX team. Bardi is Marxent’s Senior Content Strategist, and has been in the industry for a year, after having spent more than a decade in print and TV media.
In this week’s episode, the Joes sit down with Dr. Ken Moser, a giant in the industry with 7+ years of experience in HMDs, computer vision, and a whole slew of other specializations. What’s really in ARKit’s toolbag, and what does it mean for the AR & VR industry at large? Dr. Ken explains it all.
Thanks for listening. We really do appreciate it! Enjoyed the show? Please check out previous episodes on the In Reality SoundCloud page, or subscribe via the iTunes Store, Stitcher, or Google Play Music.
Questions, comments, concerns? Email us at email@example.com.
Check back soon for another episode of In Reality.
FULL TRANSCRIPT OF THIS EPISODE:
00:08 Speaker 1: Welcome to the In Reality Podcast. Now starting in three, two, one.
00:08 Joe Johnson: Covering all things augmented and virtual reality, the In Reality Podcast is hosted by Joe Bardi and Joe Johnson, and features news, commentary, and perspective from industry veterans and experts. First up, introductions. I’m Joe Johnson, creative director at Marxent, and I’ve been in the AR and VR industry for four-and-a-half years now since coming from Microsoft’s Office UX team.
00:25 Joe Bardi: And I’m Joe Bardi. I spent more than a decade in print and TV media before joining Marxent a little more than a year ago.
00:32 JJ: We’ve got a very special guest this week, Dr. Ken Moser, our resident scientist, PhD, and all around much smarter than Joe and Joe know-it-all. Ken’s PhD comes with special interests in computer vision, tracking, and head-mounted displays. He’s got more than seven years of active participation in the AR and VR community, including VR research projects at the Naval Research Lab in Virginia, as well as the Interactive Media Design Lab at the Nara Institute of Science and Technology in Nara, Japan. I am officially in the presence of a higher intelligence, so thanks for joining us today, Ken.
01:00 Dr. Ken Moser: Thank you guys for having me. I am a long-time listener, first-time interviewee.
01:04 JJ: Mega-dittos.
01:05 JB: Thank you for that, Ken.
01:06 JJ: This week’s episode mostly consists of poking your brain, Ken, and seeing what oozes out. So this week’s format is gonna be a little different. Everyone ready to go?
01:13 JB: Yes.
01:14 DM: I am, indeed.
01:14 JJ: Alright.
01:24 JB: Alright. Let’s start with ARKit, Ken. It was announced at the Apple Keynote a few weeks ago. You’ve actually gotten a chance to play with the SDK. Why don’t you give us your initial reactions to it and your observations about it so far?
01:37 DM: Right. We were in the office when, of course, the WWDC was going and they made the announcement for it. Honestly, the only reason we were really watching it was for the announcement for their AR ’cause we weren’t sure if it was gonna be a depth camera or they had something more software-related.
01:52 JJ: Yeah, there were rumors flying around beforehand that it was going to be some sort of AR announcement, so I guess we really got what we were hoping for.
02:00 JB: And you guys didn’t want a really expensive, tiny speaker?
02:03 DM: Well, some people might have. I’m not really into those things. I’m not surprised, honestly, they added that to their repertoire of gadgets and gizmos.
02:13 JJ: We’re talking about the Alexa clone?
02:14 DM: Exactly.
02:14 JB: Yes. The HomePod, as it were.
02:16 DM: They wanna make sure they have an OS for everything. Everything’s gonna run on an Apple OS: TvOS, watchOS. There’s gonna be an OS for tiny home speakers.
02:25 JJ: So they did make an AR announcement.
02:27 DM: They did.
02:28 JJ: What are they bringing to the table?
02:29 DM: They announced ARKit, and in their initial presentation of the library itself, I thought they honestly didn’t devote enough time to it compared to the other things they were announcing. After having used it now, I think they underplayed it considerably. [chuckle]
02:49 JB: Why do you say that?
02:50 DM: Their presentation, and forgive me, we may look it up and we may need to go back and add this in there so I can say the guy’s name right, whatever the guy that presented it was.
02:58 JJ: We’re just gonna use that.
03:00 DM: Okay.
03:00 JJ: I’m just gonna tell you right now, Ken, we’re just using that.
03:02 DM: Here in the office, we call them “Not Steve Job,” one through whatever number it is. So Not Steve Job Number Two…
03:08 JB: I think that’s Craig.
03:09 DM: Presented this. He, of course, had his phone, I presume it was an iPhone 7 or up, 7 Plus or whatever, and he was just showing the teacup or a candlestick, one of those two objects, just on the little desk there. And at first, we were like, “Okay.” Honestly, we’ve seen that kind of demo a hundred million times before. There’s lots of people that have done a simple thing on a table. We’ve done it. Everyone’s done it.
03:32 JJ: Basic tracking.
03:33 DM: It’s very basic stuff. And it’s like, “Okay, whatever. That’s nice.” And then, of course, they had the guys from…
03:39 JJ: I believe the name of that company is Wingnut.
03:41 DM: Oh, Wingnut. That’s it. Yes. Wingnut AR. Yes. In that particular demo, they showed it on the table, and some people might play it off as a game that you could play in it. It was obviously a pre-rendered scene. I don’t believe the guy was doing any interaction on the scene at all.
03:55 JJ: Oh, is that right?
03:56 DM: He didn’t appear to be. He was looking more around it. If he was doing interaction, it wasn’t obvious he was doing anything.
04:00 JJ: So he was just watching a scene.
04:01 DM: It looked like he was just watching a scene that was pre-animated, which is fine. It was a cut scene. It was like a cut scene. It happens in games all the time.
04:05 JB: I wonder if that has any play moving forward in passive entertainment styles, like movies or something like that. Is there a case to be made that we’re gonna be watching movies in AR at some point?
04:16 JB: I mean…
04:16 JJ: You can’t see this, but Joe just shrugged dismissively.
04:19 JB: Yeah, I mean to me, the point of that entire demo was, “Look at how cool this is,” but, much to Ken’s point, it was just essentially a narrated movie that went on, where it was like, “Look at this outpost. Oh, there’s people walking around. Oh, some ship showed up. Oh, they’re fighting.”
04:30 JJ: Okay. So we’re not actually really seeing anything new there?
04:32 JB: I think it was more of a proof-of-concept of the idea of, “Look, you can do the sort of storytelling thing and you could have all of these elements in the scene that know where everything is on this big table,” and it does point to all sorts of things in your living room that you could do, gaming-wise and whatever, but there was no actual gaming element to it.
04:47 JJ: Well, since I’ve interrupted Ken about useful things, what else did you see there?
04:50 DM: Right, right. Just tying onto what Joe B. said, and I’m not sure… [chuckle]
04:54 JJ: Just say Joe. It’s funnier that way.
04:55 DM: Just say Joe, exactly. As Joe had noted, the only other thing they might have benefited from showing, perhaps, other than the hundreds of thousands of dollars I’m presuming they’d spent on Wingnut AR to devise that demo, was that ARKit does tie into game engines like Unreal and Unity. Otherwise, it was no more complicated than the teacup demo that was playing, ’cause if it’s tracking, it’s tracking. All ARKit is, and I think it’s important to note for anyone listening out there, they call it ARKit, of course, but really it’s just tracking, so it’s tracking of the device. Whether or not you show the video feed behind what you’re doing is really up to you. You don’t have to show the video. If you show the video, then theoretically it becomes AR, but it’s really just tracking, so it’s device tracking. Again, they’re calling it AR, ARKit. It doesn’t provide any renderings. You, of course, have to do your own renderings, or tie into OpenGL, or Metal, or Unity, or Unreal. But it does provide the tracking of the device within the world, in world space, in world scale.
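Ken’s point here is that the tracker hands your app a device pose each frame, and everything else, including rendering, is your job. A framework-agnostic sketch in plain Python (not the actual ARKit API; the pose is simplified to a 3D position plus a yaw angle) shows what an app does with that pose:

```python
import math

def world_to_camera(point, cam_pos, cam_yaw):
    """Express a world-space point in the camera's frame.

    A tracker like ARKit reports the device pose; transforming your
    virtual content so it stays put as the device moves is entirely
    the application's (or engine's) job.
    """
    # Translate into the camera's local origin...
    dx = point[0] - cam_pos[0]
    dy = point[1] - cam_pos[1]
    dz = point[2] - cam_pos[2]
    # ...then undo the camera's rotation about the vertical axis.
    c, s = math.cos(-cam_yaw), math.sin(-cam_yaw)
    return (c * dx + s * dz, dy, -s * dx + c * dz)

# A virtual teacup anchored 2 m in front of where the session started.
cup = (0.0, 0.0, -2.0)
# Device hasn't moved yet -> the cup sits 2 m straight ahead.
print(world_to_camera(cup, (0.0, 0.0, 0.0), 0.0))
```

Whether the camera feed is drawn behind the result is a separate choice, which is exactly why the same tracking works for AR and for handheld "magic window" VR.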
05:55 JJ: I think this is an opportune time to talk about what you think is AR, what would make what they’re doing augmented reality, specifically.
06:01 DM: Right. I can get on lots of long rants [chuckle] and had done so…
06:06 JJ: Please do.
06:07 JB: I promise.
06:08 DM: For Joe in many emails about mixed reality. [chuckle]
06:10 JB: We never do show notes or anything like that, but I promise on this one, I’ll include links. Ken has done… I’ve worked with him on a couple of things, including an entire explainer on mixed reality and the virtual reality spectrum and all of that stuff, and I’ll include those in the post so that people can actually click through. There’s a ton of really good information on what these things really, scientifically, are.
06:31 DM: Augmented reality, of course, and again, if you go through the links, is a subset of mixed reality. Mixed reality just refers to this: you have the real world, you have virtual reality, virtual space, and whenever you have elements of both simultaneously, you have mixed reality. Augmented reality tends to be more towards the end where you have more real things than you have virtual things. That is this…
06:54 JJ: Well, to an uneducated observer or to somebody who’s not in the space, what they’re doing looks like, what you’re describing is augmented reality, so why is that not augmented reality?
07:02 DM: I guess the difference is between something actually being augmented reality and really just rendering some virtual stuff on top of a video feed. In that particular case, ARKit does have the ability to track planar surfaces. It was tracking the tabletop, okay, so I guess it was tracking the tabletop so the guy could move around it and view it. But otherwise, what he was watching didn’t really have anything to do with the camera feed, and the camera feed wasn’t really affecting it any, other than tracking. Whether you display the camera feed or not, it doesn’t affect the tracking any.
07:40 JB: So you’re saying it’s sort of like a question of interaction with the environment. If the 3D models and digital stuff was actually interacting with the environment, now you’re talking about more…
07:47 JJ: Or the user, who is using the device.
07:48 JB: Or the user, right.
07:48 DM: Or the user, exactly.
07:50 JJ: So the camera moving doesn’t qualify in this case.
07:53 DM: Right. So if the virtual content, you add some extra content. Again, the name is augmented reality so you’re augmenting reality. I guess the word “augment” depends, it could be lots of things, especially in this day and age…
08:06 JJ: We can argue semantics all day, trust me.
08:06 DM: There are very subjective, very subjective definitions of these terms. Technically, I guess, they were augmenting their reality. They were adding things to it. But what they were displaying could’ve really just been a VR scene. They could’ve just rendered the planet and had a nice little environment map behind it, a nice cubemap where it’s just sky. And it’s basically what we would call a magic window, it’s handheld VR, and they could walk around the stage. It would still…
08:31 JJ: Oh, that’s a fair point. Yeah.
08:32 DM: It would still have tracked, so it would’ve looked beautiful on the screen. The fact they had the video camera behind it didn’t really add anything to it, other than, of course, they’re just showing the tracking looks good, you can do things with it. But otherwise, what they explicitly showed wasn’t necessarily augmented reality. Maybe the teacup, ’cause you’re saying, “Okay, well, here’s a teacup. I wanna see what a teacup looks like on my table,” perhaps. Or it was like to another cup I had on my table.
08:54 JJ: It almost sounds like the use case is what defines it here.
08:57 DM: Basically, I would agree that it is indeed the use case. I can guarantee you 100% that when it first comes out later this year, and people already have it for development, there’s going to be a storm of applications on the App Store that will just have random things. Probably games, I’m sure, where you’re jumping around, looking at stuff, or planting things on top of a video feed. That really could just be using the tracking to do probably something even better without showing the video feed behind it and just be virtual reality. So again, ARKit is tracking, like the VIVE, which you guys have both used, I’m sure. And people…
09:30 JJ: I have extensively, yes.
09:31 DM: People at home, ’cause I know people at home are listening to this and not professionals in the workspace. It’s just random moms and pops, I’m sure, listening to your podcast. The folks at home, I’m sure, have also used VR, used the VIVE. You’re, of course, being tracked in the VIVE, but there’s no video feed behind anything normally, though the VIVE does have a camera, so you could do that.
09:49 JJ: Yeah, I’m sure you’re gonna see more of those kind of applications in the future, too. Like the VIVE becomes a window into your actual world through a camera.
09:57 JB: Yeah.
09:57 DM: So, let’s presume you did that, so the VIVE is using the camera. That’s what we’ve got with ARKit. It’s just, I think a lot of the experiences that are first gonna come out are going to be degraded versions of what they could’ve been, ’cause people will just put it on top of the camera feed instead of making it just be VR, just be handheld VR with good tracking. Move around the VR world.
10:13 JJ: That’s an interesting analogy for helping people understand the difference between… Or to understand what they’re looking at. If you just considered basically a VIVE with a camera turned on, that’s an actually really good handle for people to understand.
10:26 JB: That explains what you mean by when you say that after the presentation, you were like, “Wow, it was actually underwhelming.” Or, “They undersold what they had actually produced.” So explain what are some of the things that you could actually do with that that you would have liked to have seen that they avoided in favor of just a marketing-style demo.
10:47 JJ: Or maybe they haven’t really identified for themselves.
10:49 JB: Yeah. Well, I think they’re leaving it up to the developers.
10:52 JJ: Well, I don’t wanna give Apple too much credit. You know how I am.
10:54 JB: Yes, that’s right.
10:55 DM: Yeah, and I would also agree. I’m sure Steve Jobs, and Not Steve Jobs One through Seven, would probably agree with you. And so, kind of rewinding back, you can insert this wherever you’d like to. I’m gonna try to make it a non sequitur.
11:06 JB: Beware, Ken, ’cause he’ll leave that in the podcast.
11:11 DM: So, I’m trying to think of… Wait, [chuckle] I lost my train of thought.
11:14 JJ: What we were doing is we were running back to talk about stuff that they could have done in their demo that would not have undersold what they were doing, something that would have really displayed the power and capability of what they’ve built, which you do seem to be impressed by.
11:27 DM: Right. Right. We have our [11:28] ____ tracking solution. We’re kind of new players to the tracking space. We’re not like Vuforia, and Qualcomm, PTC, that have been in for it for quite some time. We’re kinda new players, but we’ve jumped in there pretty good, pretty heavily. Our aim, of course for our technology, was to be a bridge, ’cause Tango’s been out for quite some time, the HoloLens, they’re all using these depth cameras. Depth cameras are gonna be the eventuality. Things are gonna move to depth cameras, gonna move to the hardware. There’s only so many things you can do with an RGB camera that you can’t do with an RGB-D camera ’cause, of course if you have the precise depth you get a lot more measurement, you can get actual 3D models of things, you can save maps and reload them.
12:06 DM: So there’s a lot you can do with RGB-D cameras, or time-of-flight cameras, whatever you wanna call the nomenclature. Those are going to be the eventuality of what we’re gonna move to. With our tracking, we were just kind of hoping to build that bridge until those became more ubiquitous, ’cause they were already out. Tango’s already out. Lenovo had a tablet with a Tango device in it. The HoloLens came out. At the time, Microsoft was touting their holographic, I don’t wanna say API, but their holographic engine, and they were kind of promoting that it was gonna be in devices, it’s gonna be out there, it’s gonna be in their Surface tablets, which have yet to be announced as being in anything other than the HoloLens. So we…
12:41 JJ: And then HoloLens 3 which comes out in two years.
12:44 DM: Exactly. So it may be the thing. Of course, they have their own VR headsets. They’re paired with HTC. And then in one article that Joe had actually sent me a link to, the guy seemed to, again, be confused about what mixed reality was.
12:56 JB: Yes, that’s right.
12:57 DM: He would switch between AR and VR like they were the same thing. He didn’t know what was going on. He just called them everything mixed reality.
13:02 JB: A little background, Ken and I specialize in sending each other articles where we laugh at the way the people define mixed reality.
13:08 DM: We were, of course, presuming our technology would be a bridge to these RGB-D devices. We were definitely not anticipating Apple to really come out with something, for sure not this year, you had mentioned before they were kinda playing around with the notion that there might be some depth cameras in the newest iPhone or the newest iPad…
13:25 JJ: And we know that they picked up Metaio. Meta-io? Metayo?
13:28 JB: Yeah, we call it Metaio.
13:29 JJ: Something like that, yeah. We knew that they had picked them up, so they were gonna make some sort of AR play.
13:33 DM: Right, exactly. We weren’t sure what it was gonna be, and they’d had them for quite some time, two years or more, and of course were in talks before then. Metaio, of course, had been around for quite some time before that. We were anticipating it to be a depth camera jump, not just something software-related like ARKit is. It doesn’t require multiple cameras at all, just a single camera, and of course, probably, their co-processor.
13:56 JB: Dig into that a little bit ’cause I think it’s an important conversation.
14:01 DM: In their announcement, Not Steve Jobs Number Two announced that it, of course, would run on the A9 processors and up, which is not the iPhone 6 but the 6S and 6S Plus, and then the first-generation iPad Pros and up. So that’s basically anything named iPad: not the iPad Air or the iPad Minis, but the iPad Pros, and then the newest generation of iPads this year, which are just called iPads. The first generation of Minis, perhaps, I’m not really exactly sure which processor it is. Maybe the A6 processors. They had a co-processor with those, the M, the motion co-processors. And their numbers, I believe, coincide with the A processors, like the A9 has an M9 and the A10 has an M10, the Fusions have a Fusion co-processor. Those motion co-processors, I’m presuming, are… Obviously, it’s in the name, they’re co-processors, meaning they can take on extra processing that the main processor doesn’t have to do, so things can run faster.
14:52 JJ: Basically what you’re saying is their hardware’s giving them a leg up in terms of what they can develop.
14:56 DM: Right, exactly. They of course own that, so they know that in-and-out. They, I’m presuming, have probably designed it and just sent it out to whoever’s manufacturing it. And these co-processors are, more likely than not, the secret sauce behind the performance of their tracking. Because they announced that it only is gonna be the A9 processor and up, it would tend to make someone infer that it’s gonna be relying heavily on this co-processor, that they’ve either added new hardware for, specifically for this processing, or somehow is integrated into other sensors that are also doing some processing as well.
15:40 JJ: It has the added benefit of if there are a slew of useful AR apps developed for the devices, it handily obsoletes a bunch of their old hardware at the same time.
15:50 DM: Exactly. Going back to what you were talking about Apple’s play, what they care about these things. Their play is either it can run on other devices and they specifically don’t want it to do that so people upgrade to the A9s and up, meaning there is no secret sauce with these current motion processors. The secret sauce is “cha-ching” in getting newer updated devices.
16:13 JB: Which knowing how many stories have been written about how are they gonna get people to upgrade their iPads, not a bad guess.
16:18 DM: No, not a bad play. Not a bad play.
16:20 JJ: And then the other play is?
16:22 DM: Right, and the other play is people, one, may need updated devices, but it’s something new, something shiny. Apple, of course, makes money from people uploading apps to the App Store. Whether or not you make money on your app, Apple’s making money on your app, ’cause you have to have a developer profile or an enterprise profile, which we have, and you have to pay for that.
16:39 JJ: Oh, that’s an interesting observation.
16:40 DM: You have to have a paid profile to upload things to the App Store. So…
16:44 JJ: Do they get a slew of money if a bunch of developers make a bunch of chaff apps?
16:45 DM: I’m pretty sure… Well, if they’re uploading apps to the App Store, they have to have a paid profile. And of course, if your app is paid, I’m pretty sure they get some monetary compensation. I don’t know if it’s a percentage or if it’s just a fixed amount. I’ve never actually uploaded a paid app.
16:58 JB: For every dollar the app earns, they take 30 cents.
17:02 DM: So it’s 30%, 30 cents.
17:02 JJ: 30 cents?
17:04 JB: Yeah, it’s a 70-30 split.
17:05 DM: So they are going to make…
17:06 JJ: I did not know that Apple was a pimp.
17:08 JB: Yes, Apple’s a pimp.
17:09 DM: That’s more than PayPal. So they’re going to make billions of dollars on just apps being uploaded in the first year. Those apps, again, are going to be hit or miss. A lot of them, most of them, probably 80%, are going to be one-offs. It’s like, “Oh, this is cool,” but they’re all gonna be the same: some guy just running around, not really using anything about AR. Some will probably be VR; some will probably take heed and realize that it’s just tracking and it can be used for non-AR-related things. It can be used for VR…
17:34 JJ: I could see gaming being big that way.
17:34 JB: Or GPS.
17:36 DM: Or your mobilization… Or localization, rather, like for maps, ’cause they can be registered through your compass. Apple, of course, has their own mapping API you can link into.
17:47 JJ: Great app, by the way, Apple Maps.
17:48 DM: Exactly. [chuckle] They’ll put a path on the ground so you can see if it’s going into the river before you begin driving.
17:53 JJ: That’s helpful. I watched that episode of The Office too.
17:56 DM: Exactly. So they’re gonna make a billion dollars on the App Store. They are gonna do that no matter what. So whether the play was to upgrade the hardware or not, for sure their play was, “We’re gonna get a lot of apps on the App Store.”
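The 70-30 split Joe describes is simple to compute. A toy sketch (the app price and sales figures here are hypothetical, and this ignores Apple’s separate annual developer-program fee):

```python
def app_store_split(gross_revenue, platform_rate=0.30):
    """Split paid-app revenue between the platform and the developer.

    Standard store terms as described in the episode: the platform
    keeps 30%, the developer keeps 70%.
    """
    platform_cut = gross_revenue * platform_rate
    return platform_cut, gross_revenue - platform_cut

# A hypothetical $2.99 app selling 10,000 copies:
apple, dev = app_store_split(2.99 * 10_000)
print(f"Apple: ${apple:,.2f}  Developer: ${dev:,.2f}")
```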
18:06 JJ: What about their SDK are they downplaying? Or are they not hyping enough?
18:11 DM: Right, right. They showed a little bit like Pokemon Go, and then they showed one other app that obviously didn’t make that big of an impression because I don’t remember what it was.
18:19 JJ: It was the IKEA app, actually.
18:20 DM: Oh, the IKEA app.
18:22 DM: IKEA’s been around… IKEA had been in Tango. IKEA’s been in everything, all these furniture companies have been in everything. That doesn’t really impress me. That whole presentation by Not Steve Jobs Number Two was maybe 15 minutes. They spent a lot of time on the guy, the same… I think the same Not Steve Jobs, giving presentations on High Sierra or whatever, talking about just the name of the… all the jokes he made about the name of the operating system, and then ARKit comes in, he’s there for 10, 15 minutes and is out, he’s out the door.
18:47 JJ: Yeah.
18:48 DM: Our first impressions, again, we weren’t able to download it immediately; you don’t have access to the beta stuff until after the presentation. And then afterwards, of course, people are downloading [18:55] ____ time, we did get it the same day, and then the next day we were able to load up a sample application. After loading up a sample application, you realize that it is actually, in reality, a lot better than what they showed there on the stage. Now granted, their stage was dark, and that wasn’t really conducive to live demos where it does require the video feed. So anyone at home wondering, I guess for the at-home listeners, not the people in their offices, so only people at home: if you’re wondering whether it’s using the IMU, the motion, or the video feed, it is indeed using both. If you cover up the camera and just walk around like you think the IMU’s doing everything, it of course fails; it doesn’t operate. You have to have the camera feed going, so it is indeed using the camera feed and the IMU, of course, for orientation.
19:45 JJ: Okay. What’s the IMU again?
19:47 DM: The IMU is the inertial measurement unit.
19:50 JJ: Okay, thanks.
19:51 DM: That’s just a gyroscope, an accelerometer, and there may be some others, some GPS. Sometimes they lump lots of other sensors into the IMU as well.
20:00 JJ: Okay.
20:00 DM: It’s just basically the standard sensor package that exists on every mobile device.
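As an aside, this is why trackers fuse the IMU with the camera rather than trusting either alone: integrating the gyroscope accumulates drift, so the estimate needs an absolute reference (gravity, or visual features in ARKit’s case) to pull it back. A textbook complementary filter illustrates the idea; this is a simplified single-angle sketch, not ARKit’s actual algorithm:

```python
def complementary_filter(angle, gyro_rate, accel_angle, dt, alpha=0.98):
    """One step of a complementary filter for a single tilt angle.

    The gyro is fast but drifts; the accelerometer is noisy but
    drift-free. Blending them keeps the gyro's responsiveness while
    the absolute reference slowly corrects the estimate.
    """
    gyro_estimate = angle + gyro_rate * dt          # fast, but drifts
    return alpha * gyro_estimate + (1 - alpha) * accel_angle

# A stationary device with a small gyro bias: the raw integral drifts
# without bound, while the filtered estimate settles near zero.
angle, drift_only = 0.0, 0.0
for _ in range(1000):
    drift_only += 0.01 * 0.01            # 0.01 rad/s bias, 10 ms steps
    angle = complementary_filter(angle, 0.01, 0.0, 0.01)
print(drift_only, angle)
```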
20:03 JJ: So it’s using both the feed and the data from the IMU, what are some other features that they did not roll out?
20:09 DM: Right. You saw, more or less, the planar tracking, so they’re tracking the desk. Just…
20:14 JJ: Multiple planar tracking, which is something I saw, which was really cool.
20:17 DM: Right. So these… It actually uses what it calls “anchor points.” And these anchor points are just reference points in the world that, of course, have a height for [20:26] ____ where you’re at, and you can, of course, devise your own plane from that. But you’re right, it does have multiple anchor points. You could have anchor points at various heights and at various locations, and utilize those in your applications once you detect those anchor points. But also… I think it’s just the tracking in general, so it doesn’t actually require you to do planar tracking to track. You can actually just begin tracking, and you can actually not even use the anchor points or planes at all. It will just track based on where the device began, so wherever you begin the application is zero.
20:58 JJ: So basically just a 3D track? Tracking in 3D space.
21:00 DM: Exactly. It’s 3D, in the world, exactly. Which, if you were doing a VR app, is what you’d probably do: wherever you start, wherever the device is in the air, that’s the zero, the (0, 0, 0) of the 3D coordinate system.
21:10 JJ: Somebody might call that magic window or something.
21:12 DM: Exactly, exactly.
21:12 JJ: Yeah. A really well tracked magic window.
21:14 DM: Exactly, so it doesn’t require you to do planar tracking. It can just track. There’s 3D tracking, in the world, without having any knowledge of your space, once it starts. I think they underplayed that considerably. Again, it was a very dark stage, so perhaps they were afraid that in the darkness, there wouldn’t be a lot of easy-to-get features.
21:32 JB: They didn’t want things to break.
21:32 JJ: As somebody who films AR apps…
21:34 JJ: That was a bit of a high-wire act for me. I was watching like, “Are they gonna screw this up? How bad are they gonna screw this up? They didn’t screw it up, holy crap!”
21:40 JB: No, and even after the VR demo, the person almost fell down.
21:44 JJ: Oh yeah, I caught that.
21:45 JB: They tripped, so it was after that had happened, I was like, “Oh, god.” Again, knowing how these AR demos, how difficult they are to actually produce, it really was a high-wire act.
21:55 DM: I believe, anyway, their ARKit is using SLAM, so they’re getting feature points from the world that are 3D, so they have 3D locations, and they’re deriving the location of the device by back-calculating from those feature points, that inside-out kind of tracking. So they have this point cloud of all these feature points around them, and they do give you access to this point cloud. Whether it’s full access to the complete point cloud or just filtered point cloud data they’re giving you, you do have access to it. And with that, you can do, more or less, collision detection with, in quotation marks, “the environment around you.” So…
22:31 JJ: If you could give me an example of what that enables you to do, that’d be great.
22:34 DM: Yeah, there is a Unity plug-in that you can actually download. For those of you at home, again, who are into Unity, you can just search “ARKit Unity plug-in.” Unity, of course, has a plug-in they’re releasing that’s from the Unity developers themselves, the company Unity. You can download and install it, and they have a built-in demo where you can basically cast a cube, an innocuous cube, out to collide with the world. So I’m walking around…
23:00 JJ: That sounds like some actual augmented reality.
23:01 DM: So that would be… Yes, that could be augmented reality. Maybe you’re putting tags, so instead of a cube, it’d be something more useful. But yes, you would then be adding information to, augmenting your information about, the real world. So in this particular demo, you would tap on the screen, and of course it sends a ray out into this point cloud that then intersects with the point cloud, and it puts a cube down.
23:23 JJ: And then if it hits something that the software has detected exists in the quote unquote “real world… ”
23:28 DM: Exactly.
23:28 JJ: It will respond to it.
23:29 DM: Exactly.
23:29 JJ: Yeah.
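That tap-to-place flow, a ray cast from the camera into the sparse feature point cloud, can be sketched without any AR framework. This toy version is plain Python with invented names and thresholds, not the actual ARKit or Unity API; it treats the nearest feature point lying close enough to the ray as the hit:

```python
def hit_test(ray_origin, ray_dir, point_cloud, max_dist=0.1):
    """Return the feature point nearest along a ray, or None.

    A tap on the screen becomes a ray into the scene; the sparse SLAM
    point cloud stands in for "the real world," so a feature point
    within max_dist of the ray counts as a collision.
    ray_dir is assumed to be a unit vector.
    """
    best, best_t = None, None
    for p in point_cloud:
        # Vector from the ray origin to the candidate point.
        v = [p[i] - ray_origin[i] for i in range(3)]
        # Distance along the ray (projection); skip points behind us.
        t = sum(v[i] * ray_dir[i] for i in range(3))
        if t <= 0:
            continue
        # Perpendicular distance from the point to the ray.
        closest = [ray_origin[i] + t * ray_dir[i] for i in range(3)]
        d = sum((p[i] - closest[i]) ** 2 for i in range(3)) ** 0.5
        if d <= max_dist and (best_t is None or t < best_t):
            best, best_t = p, t
    return best

# Tap straight ahead (-z); one feature point lies near the ray's path.
cloud = [(0.02, 0.0, -1.5), (1.0, 1.0, -3.0)]
print(hit_test((0, 0, 0), (0.0, 0.0, -1.0), cloud))  # → (0.02, 0.0, -1.5)
```

The cube from the Unity demo would then be placed at the returned point.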
23:30 DM: So the reason I say they may or may not have wanted to show that, is that from my tests so far, it’s… I’m not going to say “unreliable,” but it is “dynamic in accuracy.”
23:43 JJ: That is the best euphemism for unreliable I’ve ever heard.
23:45 JB: Yes, that is amazing. I’m gonna use that the next time I ever get in trouble anywhere in the world, yeah.
23:49 JJ: Dynamic in accuracy?
23:50 DM: It is dynamic in accuracy. So even…
23:52 JB: “You’re lying!” “No, I’m just being dynamically accurate.”
23:55 DM: Exactly.
23:56 JJ: Alternate facts.
23:58 DM: Exactly. So even with the same point, even testing the same point, you will often get varying results as to where it actually collided. With the point…
24:07 JJ: The Heisenberg uncertainty principle of augmented reality.
24:10 DM: Exactly.
24:11 JJ: Got it.
24:11 DM: Which would leave me to believe that the point cloud they’re, at least, giving you access to, is dynamic. Meaning they’re not keeping it. It’s not a true map.
24:20 JJ: They’re checking every now and then.
24:22 DM: Exactly. They’re updating it within some timeframe, and then it may get worse or it may get better, but it doesn’t persist.
24:30 JJ: Okay.
24:32 DM: It may need to be recreated as you go back to the scene. Again, I would say that because from my testing with the point cloud, if you’re testing intersections, even on the same place, the same point, over, and over, and over again, with the device mounted on an unmoving tether…
24:46 JJ: Say, a tripod?
24:47 DM: Exactly. You will get differing results in sampling with this point cloud. The reason, again, I was saying it's updating dynamically is that the more you move around, the more accurate it seems to be, which kind of makes sense, because the more you're moving around, the more 3D information it can get between the frames it's sampling.
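One practical response to the "dynamic accuracy" described above: instead of trusting a single hit test, sample the same point several times and average the results before placing content. A minimal sketch (the sampling function is a hypothetical stand-in for whatever the tracker returns):

```python
import numpy as np

def stable_hit(sample_hit, n=10):
    """Average n noisy hit-test results into one placement point."""
    samples = np.array([sample_hit() for _ in range(n)])
    return samples.mean(axis=0)
```

With a tripod-mounted device, as in the test Dr. Moser describes, each call to `sample_hit` would return a slightly different intersection; averaging damps that jitter.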
25:03 JJ: Did you just say that their tech breaks if you’re not moving?
25:06 DM: Right. I think it is important to note that it, of course, is not infallible. It can be broken. Like I mentioned earlier, if you cover the camera up and it's only using the IMU, it will not perform. If you're in a dark environment, it will not perform. If you're a developer out there, you can, of course, sample the tracking quality and the tracking error type, so it can tell you if it's low light or not enough features. If you're looking at a ceiling that's all white, you're not gonna get any features.
25:30 JJ: They’ll give you feedback that way?
25:32 DM: Yeah, exactly.
25:32 JJ: That’s good.
25:32 DM: So you can get… It’s an indication.
25:33 JJ: It will help people develop better applications, yeah.
25:36 DM: Exactly. So you can recover from poor tracking scenarios or let the user know that this is not a good area to be tracking in, which is important to do.
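The recover-or-warn loop Dr. Moser describes is simple to sketch: the tracker reports a degraded state plus a reason, and the app maps that to user guidance. The reason names below echo the kinds of conditions he lists (low light / not enough features, etc.), but this is illustrative Python, not Apple's API.

```python
from enum import Enum

class TrackingReason(Enum):
    NORMAL = "normal"
    INITIALIZING = "initializing"
    INSUFFICIENT_FEATURES = "insufficient_features"  # e.g. a blank white ceiling
    EXCESSIVE_MOTION = "excessive_motion"

def user_guidance(reason: TrackingReason) -> str:
    """Map a degraded tracking state to a message the app can show."""
    return {
        TrackingReason.NORMAL: "",
        TrackingReason.INITIALIZING: "Move the device slowly to initialize tracking.",
        TrackingReason.INSUFFICIENT_FEATURES: "Point at a textured, well-lit surface.",
        TrackingReason.EXCESSIVE_MOTION: "Slow down; the camera is moving too fast.",
    }[reason]
```

Surfacing this feedback is what lets apps "recover from poor tracking scenarios" rather than silently drifting.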
25:44 JJ: It sounds like they’ve built a pretty robust tool.
25:47 DM: Right, they have done a very good job, for millions of dollars. They've done a very good job in development. Again, of course, they purchased, like we mentioned, a lot of other companies that had been working on this before, and I'm presuming brought in the people who knew the pitfalls. And in two years and millions of dollars, it is very good…
26:07 JB: But there’s room for improvement.
26:08 JB: I feel like you’re about to say, “But there’s room… ” So what would you improve about it?
26:12 DM: Right. I’m not gonna say I’d improve upon it, ’cause I couldn’t… Within the timeframe they did it, it would be difficult to do that.
26:17 JB: I don't mean for you to personally go and improve it. I'm just saying, in general.
26:22 JJ: Where do you see opportunities for improvement? Is what he was asking. Hang on, let me prompt you, so there’s not so much talking. So where are the opportunities for improvement there?
26:29 DM: Right. It is a phenomenal piece of technology they put out there for developers, and for a first generation it, unfortunately for everyone else out there, beats anything else out there. Again, it is confined exclusively to iOS 11, so if you don't upgrade your iOS 10 device to 11, you cannot use it. And of course, if you don't have an A9 or newer device, you cannot use it. So for the time being, it is limited. That, of course, will change. For their first foray, it is very good. And I have no doubt that they're doing this…
27:05 JJ: You are qualifying the shit out of this statement right now.
27:08 DM: Exactly. I…
27:09 JJ: You are hesitant to just say, “I think this could get better.”
27:11 DM: Right, right. I know that they, of course, are working on the future. Again, I presume their future is going to be depth cameras in the devices, making this particular type of tracking they're using obsolete, in the sense that they no longer need this more vision-based approach… Even depth cameras still use SLAM, but it's no longer RGB feature-based; it's 3D feature-based tracking.
27:35 JJ: Did Apple just make a stopgap solution?
27:38 DM: Yes. Apple has the best solution I've seen so far until the depth camera, which they will undoubtedly do. Again, on your question of how it could improve, like I said, the point cloud is not that accurate, meaning you can't just do occlusion of AR content.
27:53 JJ: For everyone out there, that means when things in the real world pass in front of your 3D model that’s in the virtual space, it occludes part of that model so that it looks like it’s integrated into the world you’re looking at.
28:04 DM: Exactly, and anyone can go out today, if you don't have an iOS 11 device, and download Snapchat. They have their World Lenses, so you can put some little stuff in the world and then, of course, move your hand in front of the screen, and that stuff is still in front of your hand. You can see it no matter what. It does break immersion, it does break the realism of the augmentations. But that is a very hard problem. Occlusion is very hard, 'cause you have to know information about the world.
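At its core, the occlusion problem above is a per-pixel depth test: draw a virtual fragment only where it is closer to the camera than the real world — which is exactly the information a plain RGB camera doesn't give you. A toy CPU sketch with hypothetical arrays (a real renderer would do this on the GPU):

```python
import numpy as np

def composite(camera_rgb, virtual_rgb, real_depth, virtual_depth):
    """Overlay virtual content, hiding it wherever real geometry is closer."""
    draw = virtual_depth < real_depth   # True where the model is in front of the world
    out = camera_rgb.copy()
    out[draw] = virtual_rgb[draw]       # boolean mask applies per pixel
    return out
```

Without `real_depth`, the only safe choice is to draw the virtual content everywhere — which is why your hand never hides a Snapchat World Lens.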
28:28 JJ: Joe, you had something? Or am I…
28:31 JB: What I was gonna say is have you seen any — and I don’t know that you naturally would in the SDK — reference to, or indications of, a coming depth-sensing camera, because the rumors around the next iPhone are that there is going to be a depth-sensing camera on it, although I don’t know that it’s meant for AR applications as much as it may be meant for facial recognition.
28:51 DM: Yeah, I’ve also heard the rumors it would be a front-facing camera.
28:53 JB: Yeah exactly, which is why I’m like…
28:55 DM: For security, like what Microsoft does to let you into your phone, and to do Snapchat kinds of things. I suspect, whether they do that in the front, or it's another year or two before the depth cameras come out, this gives them now… At least the SDK is out there to the developers. Developers are developing with it. When they come out with depth-sensing cameras and technology, there will probably be no transition for apps. It would use the exact same function calls and API calls…
29:20 JJ: They’ll just have better data.
29:21 DM: Exactly. Using ARKit right now is very simple, very easy. Again, it's just tracking. For the developers and computer graphics people out there, tracking is just the location of the rendering camera in your scene. That's all it gives you. Now just remember, it's not putting things in the world; that's your Unity, and your OpenGL, and your Unreal. It's just giving you tracking. So if you have a depth camera and it's giving you the same exact information, it does not change your application at all. Your application is immediately compatible with the newer stuff.
29:52 JJ: They just provided a better tracking solution for the app you’ve already made.
29:55 DM: Exactly, the information you’re getting is just more robust. And it, of course, may have some extra features, like a mesh to put in your environment for occlusion, and things like that you can add in there. But the tracking, of course, would not change for your app. So I think…
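The point above in miniature: "tracking" is just a camera pose. If the tracker — vision-based today, depth-based tomorrow — keeps handing the renderer the same 4×4 camera transform, the app code never changes. All names here are illustrative, not ARKit's API.

```python
import numpy as np

def render_frame(get_camera_pose, draw_scene):
    """App loop step: ask the tracker for a pose, hand it to the renderer."""
    view = np.linalg.inv(get_camera_pose())  # camera-to-world -> view matrix
    return draw_scene(view)

# Swapping trackers just swaps get_camera_pose; draw_scene is untouched.
identity_tracker = lambda: np.eye(4)
```

That separation is why apps built against the current SDK could, as Dr. Moser suggests, pick up a future depth-based tracker with no code changes — only the quality of the pose improves.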
30:06 JJ: That's a good breakdown of the features.
30:08 DM: Yeah. So I think that they have a very good intro play. Lots of people are gonna be using it, getting familiar with the SDK, so when there are upgrades and changes, no doubt, it’ll be very easy to transition to those [30:19] ____ features.
30:20 JJ: Well, thanks so much for joining us today, Ken.
30:22 DM: Thank you guys for having me.
30:23 JJ: Your insights are invaluable and occasionally incomprehensible, but we appreciate it.
30:27 JB: Thank you.
30:28 JJ: Alright, guys. Well, thanks for listening and we’ll be around next week for more scintillating discussion about VR and AR. Have a good day.
30:34 DM: Here’s Sting to play you out.