The Optimistic Case for the Metaverse

Mike Mills · Published in HCVC · Feb 7, 2022

When Mark Zuckerberg talks about the Metaverse, I cringe. His vision of the future feels a bit like a targeted ad strapped to a human face, forever. I’m not alone in my negative reaction; almost every commentator, critic, technologist, and person not otherwise engaged in the business of selling ads or NFTs seems to share it. I asked my network of legal, tech, and business experts for their optimistic cases for the future of the Metaverse, and only two brave souls were willing to give me a positive view.

Indeed, it’s not hard to find negative predictions about the Metaverse. Most of the iconic science fiction that explores the idea of the Metaverse is dystopian. Ready Player One is a dire warning about the risks of ecosystem collapse and crushing ’80s nostalgia, not an exciting look at the future of VR. We’re all biased against the concept of the Metaverse.

I’d like to explore whether this bias is wrong, whether it’s the result of a misunderstanding of what the Metaverse is, and what it is going to become. So over a few articles I’d like to build the Optimistic Case for the Metaverse (without ever mentioning NFTs).

We already live in the Metaverse

Any article like this begins, as it must, with deference to Neal Stephenson’s coining of the term Metaverse in the book Snow Crash. Such articles usually forget to mention that he also popularized (though did not coin) the word Avatar. Avatar comes from Sanskrit, referring to the material presence of a deity on Earth. In turn, our avatars are the digital presence of our material selves. In the Metaverse of Snow Crash, an avatar is a 3D VR representation. But an avatar could be any other digital representation. So long as we understand that a specific marker represents a material person, what’s the difference between a polygon, an animoji, a gif, or a simple username?

Avatars past and present. UO Paperdoll from UOGuide.com

You’re already living in the Metaverse

In fact, you’ve been living in it for a long time now. Here’s how Meta (né The Facebook) describes The Metaverse:

“The ‘metaverse’ is a set of virtual spaces where you can create and explore with other people who aren’t in the same physical space as you.”

If we take this definition, well, you’re in it right now as you read this article. Medium is a virtual space where we create, explore, and connect, without sharing physical connection. Like most children of the information era, I grew up with a Metaverse. Every day I came home from school and logged into Ultima Online, where I hung out with people I’d never met in real life, but who were still among my closest friends. Within the game world, we fought orcs and dueled in front of our guild’s castle. Most of the game world was static. We would “chop down” a tree, only to find it still standing. Orcs would respawn. When we logged off, our characters would simply disappear into thin air. The only persistent change we could make to the game world was in the form of a single castle we plopped down in the middle of a giant field. The castle became our clubhouse, where we would chat, duel, and launch each other through portals to lands unknown. We had other worlds where we interacted. Our forum was a hot mess of PC building arguments, discussion of future in-game quests, and cat photos. Our ICQ chats were conversations about teenage love lives, homework, and future careers in espionage (Hi Phish!). We casually flitted between virtual spaces, as our own identities were represented, alternately, by sprites in a digital world, names on a webpage, and usernames in a chat box. We were creating and exploring in multiple virtual spaces. Was this not the Metaverse?

Perhaps you don’t think that was quite enough. Or it’s a bit too nerdy. Today we all come and go through the Metaverse on a daily basis. Google Maps presents us with a digital abstraction of our position in space, along with extra layers of information that allow us to connect with one another: You can see real time analytics on the number of people at a restaurant, and read reviews about it while the app directs you how to walk there. Real time data from phones around the city tells you how quickly traffic moves. A friend can drop a pin to let you know exactly where they are. Google presents four different layers of abstraction (a digital recreation of the city, an aggregation of the people there, a list of reviews, and a digital twin in the form of a dot that represents your friend), and you are entirely accustomed to shifting between each layer and the real world. In fact, you probably make the transition between all of these layers without conscious effort. You are creating and exploring across space and time among people you will never meet. Is this not the Metaverse?

In fact, we can build a sort of table of different types of Metaverses based on the level of abstraction from the real world, using three different popular games.

A taxonomy of hours

Microsoft Flight Simulator takes Microsoft’s extensive data about the real world, from maps to weather data, and recreates it for people to fly planes in a shared digital universe. Pokémon Go abstracts the real world into a Pokémon game, where players can find a Pokémon in the park and then compete with other players as they walk by a nearby café. World of Warcraft builds a universe entirely separate from the real world, where player avatars carry on quests together.

Satya Nadella, CEO of Microsoft, sees avatars as fluid. In an interview about the anticipated acquisition of Activision, he digs into this idea:

Take some of the research we’re doing on how should people relate to avatars. You and me may have a particular understanding of what an avatar is versus somebody who is a young kid who has already built their avatar in Minecraft or in Forza. [They would say] Oh yeah, I want to use an avatar, because having . . . multiple identities for different contexts is a much more expected thing if you’re going from gaming world to a gaming world.

A chat box, a forum, a digital map, a video game: all of these are a Metaverse. Children have been creating, sharing, and exploring in Minecraft for more than a decade. VR itself has been around for more than a decade. Which raises the obvious question: if these things have been around for so long, what’s unique now that makes everyone talk about The Metaverse as though it’s something new and worth investing in?

Is it possible that all this hype is a multibillion dollar rebranding exercise?

Let’s go back to that definition:

“The ‘metaverse’ is a set of virtual spaces where you can create and explore with other people who aren’t in the same physical space as you.”

Emphasis mine. When I played Ultima Online with my friends, I lived in three metaverses that were connected only through a shared fiction, held in my mind and the minds of the people I played with, that the avatar in the game was the avatar on the message board was the avatar on ICQ. The Metaverse existed within these mental connections. But we could imagine a technical feat that allowed these connections to exist not just within our minds, but as an essential element of the internet. We’ve already created countless virtual spaces. There’s not much innovation left in their creation. But there is a lot of work to be done in the intentional design of a fluid interconnection between a set of virtual spaces.

Yanis Varoufakis, in a complex interview about cryptocurrencies and economics, spoke about this exact emergence of the Metaverse:

Team Fortress players were obsessed with digital hats. Initially part of free drops, some hats that were discontinued later became collectibles. Players began bartering within the game (e.g. I will give you two laser guns for this one hat of yours). Then, when the demand for some hat rose sufficiently, the players would step out of the game, meet up on eBay, trade the hat for (sometimes) thousands of dollars, before, finally, returning to the game where the vendor would hand the hat over to the buyer.

Ultimately, Varoufakis says, Valve stepped in to put itself in the middle, so that hat transactions could happen within its platform. The future is in the seamless movement of digital assets and avatars.

A dapper look in any world.

What would permit this interconnection is a collapsing of the shared fiction that three different avatars are all the same person, by means of a technical reality that allows a singular avatar to have a digital presence across many spaces. This means a technological innovation that allows physical and virtual spaces to blend together, such that the avatar and the person become interchangeable in some spaces.

I would offer that three things make the Metaverse of the future distinct from the digital spaces we experience today, and make the Metaverse something more than just VR:

  1. Fluid interconnectedness between the real world and multiple digital worlds.
  2. Those digital worlds being both dynamic and persistent.
  3. The construction of new physical tools to allow 1 and 2 to work seamlessly.

So let’s try a slightly more clarifying definition of The Metaverse, and explore what that really means:

“The ‘metaverse’ is the seamless interconnection of multiple virtual and physical spaces where you can create and explore with other people who aren’t in the same physical space as you.”

In this way, the Metaverse is more than a VR headset, in the same way that the internet is more than a laptop. And just as there’s an optimistic case for the internet, there’s an optimistic case for the Metaverse.

What we talk about when we talk Metaverse

Meta has a vision of the future that relies heavily on VR. As it’s the most vocal proponent of the Metaverse, it’s no wonder that when we talk about the Metaverse, we think VR. But I think this misses the point about the possible innovation to come. Critics rightly point out that VR has been here for a long time, and has never quite taken off.

Nobody wants this.

So let’s dive a bit deeper into my definition and figure out what we’re really talking about.

It’s entirely normal to speak of what we see in a web browser as “The Internet,” but it’s also entirely incorrect. The Internet is more than HTTP, more than the World Wide Web. The technological revolution of a global network of servers, switches, cables, and communication protocols can feel, at times, like nothing more than reading the news in a web browser.

Similarly, to speak of the Metaverse as a 3D VR space is to miss the underlying technological innovation. For the people at Meta, this is what the Metaverse will be for most users: a 3D space in which to interact. But I suggest that it’s more than that:

“The ‘metaverse’ is the seamless interconnection of multiple virtual and physical spaces where you can create and explore with other people who aren’t in the same physical space as you.”

The Internet is a billion webpages, video streams, API calls, and so on, made possible at scale. The Metaverse is a billion digital spaces made possible at scale.

The Metaverse has three core features that do not fully exist yet, each of which requires new innovation. As I mentioned, these are the three things that distinguish the Metaverse of today from the Metaverse of tomorrow:

  1. Fluid interconnectedness between the real world and multiple digital worlds.
  2. Those digital worlds being both dynamic and persistent.
  3. The construction of new physical tools to allow 1 and 2 to work seamlessly.

Fluid Interconnectedness

We already have a sort of interconnectedness of our identity in Web 2.0. We regularly use Facebook, Google, or Apple accounts to set up and log into accounts on disparate services across the web.
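For context, that Web 2.0 identity layer mostly runs on OAuth 2.0 and OpenID Connect: the service you’re signing into redirects you to an identity provider, which hands back a token asserting who you are. Here is a minimal TypeScript sketch of that redirect; the endpoint, client ID, and callback URL are placeholders, not real values.

```typescript
// Minimal sketch of an OpenID Connect authorization request, the mechanism
// behind "Log in with Google/Facebook/Apple". The endpoint, client_id, and
// redirect_uri below are placeholders, not real credentials.
const authorizationEndpoint = "https://accounts.example-idp.com/authorize";

const params = new URLSearchParams({
  response_type: "code",            // ask for an authorization code
  client_id: "my-metaverse-app",    // placeholder app identifier
  redirect_uri: "https://app.example.com/callback",
  scope: "openid profile",          // request the user's basic identity
  state: crypto.randomUUID(),       // CSRF protection, checked on return
});

// The user is sent to the identity provider; after consenting, they return
// to redirect_uri with a code the app exchanges for an ID token.
window.location.assign(`${authorizationEndpoint}?${params.toString()}`);
```

That kind of federated login is a form of interconnected identity, but it is a thin one: it carries a name and an email address between services, not a presence.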

What I mean here is an interconnectedness that allows us to move seamlessly between the real world and separate digital spaces. This could be as simple as carrying an avatar from Fortnite to Minecraft and back, or as futuristic as a real-time projection of a colleague through Augmented Reality glasses. Just as we can move seamlessly between the real world and the digital world of Google Maps, a Metaverse should make it possible to move seamlessly between sitting at a desk in one’s office and sitting virtually in a room across the planet.

Any interconnectedness requires a complex set of standards regarding what and how information can be shared. It also requires bandwidth on a level we’re simply not used to having.
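To make “standards” slightly more concrete, here is a purely hypothetical sketch of the kind of interchange record such a standard might define: a small, signed description of an avatar that any world could read. Every field name below is invented for illustration; nothing like this is standardized today.

```typescript
// Hypothetical interchange format for a portable avatar; field names are
// invented for illustration, not drawn from any existing specification.
interface PortableAvatar {
  id: string;                 // globally unique identity, e.g. a URL
  displayName: string;
  // Each world renders whichever representation it supports.
  representations: {
    kind: "username" | "sprite" | "mesh3d";
    uri: string;              // where the asset lives
    checksum: string;         // so a world can verify what it fetched
  }[];
  issuedBy: string;           // identity provider vouching for this record
  signature: string;          // lets a receiving world verify authenticity
}

// A world that only supports text chat can ignore the 3D mesh entirely
// and fall back to the username representation.
function pickRepresentation(avatar: PortableAvatar, supported: string[]) {
  return avatar.representations.find(r => supported.includes(r.kind))
      ?? avatar.representations[0];
}
```

The hard part is not writing a schema like this; it is getting Fortnite, Minecraft, and everyone else to agree on one, and building the pipes fast enough to move the assets it points to.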

Dynamic and Persistent

For readers who are unfamiliar with it, Minecraft is one of the most popular games ever made. In it, players are presented with an infinite, procedurally generated world, where they can collect resources and use them to build anything they can imagine. Most people experience Minecraft as a multiplayer game, where the world is hosted on a single server and friends collaborate to explore and build together. What makes Minecraft so amazing is not just that it allows players to find creative solutions within a few broadly defined rules of the world, but that it allows the changes players make to persist. When a player logs off, the world does not reset. The house they built remains standing, for all to see, explore, and edit.

It is the dynamic and persistent nature of the Minecraft world that allows creativity and ownership to flourish. But it is also what poses a substantial problem for scale. Nobody wants to walk through the same lifeless rooms in a thousand virtual worlds. But lifeless rooms are easy to scale. Any arbitrary object in a 3D world is reducible to a set of vectors and textures. Most 3D worlds scale by putting a copy of those vectors and textures on the computer of every user, and sending only basic information such as the position of each object. This is easy when the universe is static and the set of objects is limited. It is exceedingly difficult when the world is dynamic and persistent.
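A rough sketch of why the static case is cheap and the dynamic case is not: when every client already has the geometry on disk, the server only ships a reference and a transform; once players can reshape objects, the payload has to carry the changed geometry itself. The message shapes below are illustrative and not taken from any real engine.

```typescript
// Illustrative network messages; not taken from any real engine.

// Static world: every client already has the asset on disk, so the server
// only needs to say where to put it. A few dozen bytes per object.
interface StaticObjectUpdate {
  assetId: string;                              // e.g. "oak_tree_03"
  position: [number, number, number];
  rotation: [number, number, number, number];   // quaternion
}

// Dynamic, persistent world: the player reshaped the object, so the server
// has to ship (and store, indefinitely) the modified geometry itself.
interface DynamicObjectUpdate {
  objectId: string;
  vertices: Float32Array;    // potentially thousands of entries
  indices: Uint32Array;
  textureDelta?: ArrayBuffer;
}
```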

The birth of any Minecraft world comes from a seed value. All computers, given the same seed, will generate the same world. The same trees in the same positions, next to the same flowing rivers. But the moment a user cuts down a tree, places a block, redirects a river, it is no longer possible to rebuild the world in parallel. The amount of data required to maintain parallel copies would be immense. A Minecraft world roughly the size of Earth would take up over 100 petabytes.
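A toy sketch of the seed idea, and of what breaks it: the same seed always reproduces the same terrain, so at first the only thing worth sending is the seed; after the first edit, the world state becomes the seed plus an ever-growing log of changes. The hash function below and the storage estimate at the end are rough assumptions for illustration, not real Minecraft internals.

```typescript
// Toy deterministic world generation: same seed in, same world out.
// (A real engine would use layered noise, not this one-liner hash.)
function heightAt(seed: number, x: number, z: number): number {
  let h = seed ^ (x * 374761393) ^ (z * 668265263); // simple integer hash
  h = Math.imul(h ^ (h >>> 13), 1274126177);
  return (h >>> 0) % 64;                            // terrain height 0..63
}

// Once players intervene, determinism is gone: the world is now
// seed + an ever-growing edit log that must be stored and replayed.
type Edit = { x: number; y: number; z: number; block: number };
const editLog: Edit[] = [];

function placeBlock(x: number, y: number, z: number, block: number): void {
  editLog.push({ x, y, z, block });
}

// Back-of-envelope for the "100 petabytes" figure (rough, assumption-laden):
// Earth's surface is ~5.1e14 m². At one block column per m² and ~200 bytes
// of stored data per column, that is ~1e17 bytes, on the order of 100 PB.
```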

To make 3D digital spaces that can persist across sessions, and among thousands or millions of people, will require intense amounts of bandwidth, storage, and computing power.

Dynamic connections create other scale problems. A Minecraft character has only a few possible moves, all of which are pre-rendered and easily modeled. Most video games have a thin set of possible motions for characters. Exceptionally complex games use motion capture to model how characters move in space, but they still use only a limited set of possible motions. If we ever want to inhabit a 3D space with avatars that cross the uncanny valley and look plausibly human, so that a VR meeting can feel “real,” we will need to model hundreds of muscles, tiny facial expressions, and minute changes in the movement of a head. All of this is beyond what we can do at scale today.

Technical Tools

We can imagine a photorealistic version of an animoji, capable of dynamic movement, accurate representation, and believable presence in a 3D world. We cannot, however, build it just yet. Several technical hurdles must first be overcome. But we can imagine it because we sit just before the convergence of several trends in hardware: exponential increases in bandwidth, similarly exponential increases in rendering capability, the growth of edge computing, and improved machine vision. Let’s talk about the latter two.

Edge computing is distributed computing between the cloud and local nodes. You experience edge computing every time you stream Netflix; Netflix strategically places nodes with content and bandwidth close to its users. Successful creation and maintenance of elaborate digital spaces will require edge computing, moving from the server-client model of Minecraft to a server-node-client model. Whether the edge is a local server, a phone in our pocket, or an AR headset strapped to our faces, the future of computing is a distributed platform where the cloud works seamlessly with local nodes.
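A simplified, browser-side sketch of what server-node-client means in practice: the client measures latency to a few candidate edge nodes and attaches to the closest one, falling back to the origin cloud when nothing nearby is fast enough. The hostnames, the /ping endpoint, and the 50 ms threshold are all invented for illustration.

```typescript
// Illustrative edge-node selection; hostnames and thresholds are made up.
const EDGE_NODES = [
  "https://edge-paris.example.net",
  "https://edge-frankfurt.example.net",
  "https://edge-nyc.example.net",
];
const ORIGIN = "https://origin.example.net";

async function measureLatency(url: string): Promise<number> {
  const start = performance.now();
  try {
    await fetch(`${url}/ping`, { method: "HEAD" }); // assumes a /ping endpoint
  } catch {
    return Number.POSITIVE_INFINITY; // unreachable nodes are never chosen
  }
  return performance.now() - start;
}

async function pickServer(): Promise<string> {
  const results = await Promise.all(
    EDGE_NODES.map(async url => ({ url, ms: await measureLatency(url) }))
  );
  results.sort((a, b) => a.ms - b.ms);
  // Only use the edge if it is meaningfully close; otherwise go to origin.
  return results[0].ms < 50 ? results[0].url : ORIGIN;
}
```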

Machine vision is critical to the establishment of digital twins. Today we use basic panoramic photos in the form of Street View in Google Maps. But to create a real digital twin of a physical space, we will require orders of magnitude more data, starting with depth maps and extending into object recognition. To build digital twins of machinery, we will need accuracy down to the millimeter. If a model of street traffic is to work, it will need to recognize cars, bikes, and pedestrians.
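To give a flavor of the data involved: a depth map plus the camera’s intrinsics is enough to turn every pixel into a 3D point, which is the raw material of a digital twin. Below is a minimal sketch of that back-projection using the standard pinhole camera model.

```typescript
// Standard pinhole-camera back-projection: pixel (u, v) with depth Z
// becomes a 3D point in the camera's coordinate frame.
interface Intrinsics {
  fx: number; fy: number; // focal lengths in pixels
  cx: number; cy: number; // principal point (roughly the image center)
}

function pixelToPoint(
  u: number, v: number, depth: number, cam: Intrinsics
): [number, number, number] {
  const x = ((u - cam.cx) / cam.fx) * depth;
  const y = ((v - cam.cy) / cam.fy) * depth;
  return [x, y, depth];
}

// Running this over every pixel of a depth map yields a point cloud;
// object recognition then has to label which points are cars, bikes,
// or pedestrians before a live traffic model is possible.
```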

Sending a Fax from the Beach

In 1993, AT&T, a US telecom, put out a series of ads describing some ideas of what the future would look like.

“Have you ever borrowed a book from thousands of miles away? Crossed the country without stopping for directions? Or sent someone a fax from the beach? You will …”

The video is both shockingly prescient and amusingly wrong. Elements of the digital future are here, but they’re still locked into the 1993 vision of how we would interact with physical space and physical objects. Knowing that a digital book is limitless in supply, the idea of borrowing a physical book over a video link seems a bit weird. And most of us aren’t sending faxes these days.

When Meta shows us people sitting in a VR room that looks as inspiring as an airport hotel conference room, or touring a natural disaster in VR, they’re doing the same thing AT&T did in its video (though with a little less style): giving us a glimpse of the future, but one that’s still wrapped up in outmoded thinking. We don’t want to send a fax from the beach, we want to communicate from the beach. We don’t want to sit in a VR conference room with people far away, we want to connect with people far away.

Nobody wants this either.

The optimistic case for the Metaverse begins in the same place the Internet was in 1993. Let go of the models we have today, and ask how we can accomplish things more effectively than we do now. Here are some ways the Metaverse can improve how we do basic things.

Effective Communication at a Distance

While a person today can sit in a VR-generated room to discuss work with a colleague in another city, in the Metaverse that user could stand up, walk out of the office, and continue the conversation while walking down the street alongside a virtual representation of that colleague. The user could move from a banking platform to the bank, from a virtual kitchen to the real kitchen.

Blurring Borders

During the first months of Covid-19, I sat in on a conversation between consultant friends living in Nigeria and Poland, who had an immensely optimistic take on what the virus would do for them. They knew that most of their work could be done remotely, but they had been held back from lucrative clientele by borders. They were right: Covid-19 drove a global change toward permanent remote working. There’s not much need for a coder to commute into San Francisco if everything can be done with Zoom, Slack, and GitHub.

As connections between digital and physical worlds improve, we open up the possibility for even more jobs to be remote. An engineer in Sub-Saharan Africa can work just as effectively from home, opening up opportunities around the world and unlocking talent.

New Models of Collaboration

The promise is not just in communication, but also in modeling. Boeing says it plans to build its future airplanes in the Metaverse. Today, complex engineering projects are spread across multiple digital spaces, from CAD renderings to elaborate spreadsheets to Git repositories to simulation models. The seamless interconnection of digital spaces could allow Boeing to build an airplane in a fully realized 3D space, collaborating in real time as users run simulations against it to see the impact on every part.

In my next article I’ll talk a bit about Meta and what I think we can learn from their mistakes.


No, not that Mike Mills, though I get his email. Nerd. Expat. Lawyer. VC. Mayo-Hater. General Counsel at HCVC.