Social Audio: Connection over Community

Why social audio networks will be about connection not community, Clubhouse as a Commodity, Roadtrip, and the new developer-first social audio company yet to be made.

Voice and video have always been fascinating. Before Covid, though, I couldn’t figure out why the maker community, businesses and consumers hadn’t fully embraced the technology. VoIP was invented in 1973 for ARPANET (a US government project that pretty much laid the groundwork for the internet). WebRTC - the technology that lets people build browser and mobile apps with real-time video and audio - came out in 2011. The point is that the technology supporting this revolution isn’t new - but the circumstances of the times are.

Now is the perfect time to rally behind technology that has been quietly enabling real-time video and audio for a while. That’s part of why Clubhouse - the iOS audio-only app that recently raised $10MM at a $100MM valuation from one of Silicon Valley’s top investors - has blown up recently. Admittedly, I don’t even have access to the app, but I’m still talking about it. It’s an important lesson about software - a lot of the time it’s not about the tech, but about the circumstances and distribution that surround it. Clubhouse had the perfect storm:

  1. A release that coincided with the pandemic: people were lusting for social connection - not just by texting but through a more personal medium - voice.

  2. COVID has created a unique situation: not only is this the first time in modern history that we’ve been completely isolated, it’s the first time we’ve been completely isolated physically while completely connected virtually. I like to think of it like this:

Whether we can be physically together or not, we desire connection. Our separation has only forced us to adapt our habits of connection. Sure, my fancy graph above isn’t perfectly accurate, but you get the gist. 

Oh, and Clubhouse nailed distribution. How do you incubate the hype of a social network in days, not months? Extreme FOMO, super-connector users who will champion the app, and a little mystery.

Whenever I talk to someone about Clubhouse, the first thing they ask is:

“Do you have access?” 

They’ve built a hype machine solely through word of mouth, some mystery, and killer supporters. Notice how their social and web presence is nearly non-existent but they’ve got stans all over Twitter, the major tech outlets and people who’ve never used the app advocating on their behalf. 

I think it’s worth looking at this Google Trends report for more context though. The search term I’m looking at is “Clubhouse App”.

The first major spike according to the graph was April 19th - the day after Josh Constine published his piece on social audio on TechCrunch - with Clubhouse as the headliner.

As for the second spike, I’d guess it’s the result of continued momentum, high engagement and retention among current users leading to more chatter and word-of-mouth spread, and the sizable $10MM A round they raised from a16z.

And of course, you’ll see the region with the highest interest is SF.

No surprise there.

Regardless of whether there’s covid or not, people still desire to be connected in some way. In the past few months we’ve just had to re-balance the “connection equation”. As physical connection has gone down, virtual connection has gone up. 

So why does all this matter?

Life for the foreseeable future is increasingly virtual - we need to build for that reality.

Does Clubhouse scale beyond the tech bubble that it was incubated in though? 

We’ll see. The new social audio space is too nascent to really know yet. Maybe it works great at scale. Maybe it’s crap. Time will tell. One thing we know for sure: social audio is becoming a commodity.


Over the weekend (as of when this was written), Matt Mazzeo and Brian Wagner dropped their side project for beta in TestFlight. Someone tweeted it was sick. I asked for an invite. And a couple of minutes later I joined as roughly the 100th beta user.

A cross between the couch, the radio and your friends

Newcomers think that Roadtrip is like Clubhouse.


Roadtrip anchors around an activity. For their MVP/beta they’re focusing on music. That’s why you need to log in with your Spotify Premium account to fully participate.

When you first open the app, it prompts you to log in with your Spotify Premium account, lets you import your contacts, and then drops you into the homepage.

After being dropped into the homepage, you really have two choices: you can join a room or you can create your own. Each room is like a mini collaborative online radio station - as the host, you can build your own queue of tracks from Spotify to broadcast to the rest of the room. It’s like you’re DJing for everyone who’s joined in on the action. Or, if you’re more of a follower, you can join someone else’s room.

Here’s a taste of what the experience is like:

Much like any other social network, you can follow people. Here you see the rooms that your friends have created.

Then you have a For You page. Think of this like the chronological feed that Instagram used to have. It shows you all the latest rooms that are popping off with visitors in them.

Here’s what it looks like inside a room. In this room, Colton is the DJ and is the only one who can speak and control the music. Everyone else and I are just here for the ride. We can hear the music and anything Colton says.

When you join a room, there are three levels you can be at. You are either a host, who can manage the playlist, manage the audience and speakers, and talk; a speaker, who can talk with the other speakers over the music; or an audience member, who is just listening to the speakers and the tunes.
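To make the three levels concrete, here’s a minimal Python sketch of how they could map to permissions. Every name in it (Role, Permission, ROLE_PERMISSIONS, can) is made up for illustration; I have no idea how Roadtrip actually models this under the hood.

```python
from enum import Enum, Flag, auto


class Permission(Flag):
    """Hypothetical capabilities inside a room."""
    LISTEN = auto()
    SPEAK = auto()
    MANAGE_PLAYLIST = auto()
    MANAGE_MEMBERS = auto()


class Role(Enum):
    """The three levels described above."""
    AUDIENCE = "audience"
    SPEAKER = "speaker"
    HOST = "host"


# Assumption: each level is a superset of the one below it.
ROLE_PERMISSIONS = {
    Role.AUDIENCE: Permission.LISTEN,
    Role.SPEAKER: Permission.LISTEN | Permission.SPEAK,
    Role.HOST: (
        Permission.LISTEN
        | Permission.SPEAK
        | Permission.MANAGE_PLAYLIST
        | Permission.MANAGE_MEMBERS
    ),
}


def can(role: Role, permission: Permission) -> bool:
    """Return True if the given role includes the given permission."""
    return permission in ROLE_PERMISSIONS[role]
```

With this shape, promoting someone from audience to speaker is just swapping their role, and every check in the app goes through one function.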

By combining socially networked music, listening and audio chatting, Roadtrip has created a product where there’s serendipity amongst strangers. Casual conversations from music to business flow in and out of the rooms, normally awkward silences are occupied by music which fills the void, and the vibe is unmatched. 

I like to think of Roadtrip as the metaphorical vibe - it’s for when you're in your flow state but still want some connection - you want to chill on the couch with friends - or you want to jam out with your coworkers - it’s literally the vibe.

There’s something to be said about inventing a new category — sure, if you’re the first one there you get the initial attention - but I don’t think that will matter in the long run with social audio. Clubhouse was the first, it’s not the last and I’m not sure it will be the ultimate winner. Clubhouse proved that social audio works on at least a small scale, but there’s a lot more to be proven. Comparing Roadtrip and Clubhouse isn’t even that fair. Clubhouse didn’t build in an anchor. Roadtrip did. 

Music as the first anchor makes a ton of sense. Everyone can vibe to some dope tunes. It’s apolitical, doesn’t require thought leadership and is just fun. Jamming is something that people from almost all walks of life can get on board with. By using music as a unifying anchor, Roadtrip is able to appeal to those looking for connections outside thought-leadership Twitter and Clubhouse.

Clubhouse is for the “cool kids” whereas Roadtrip is for the people.

Look at the names. Clubhouse implies exclusivity. The Roadtrip, however, is for the everyman. Everyone can vibe with a Roadtrip.

Now this isn’t to say that I believe Clubhouse won’t succeed, but I do believe that they’ll need to put that $10MM to good use and think about how they can anchor their users to more than just each other. In Clubhouse, the room dies as people leave. In Roadtrip, people can be anchored to the music. People will come and go and conversations will bubble and sink - but the music serves as the constant that anchors the rooms.

For example, I was in a room one night chatting for an hour or two and jamming, and eventually it got late and people hopped off - but I didn’t close out the app. I kept jamming to my tunes and working. I’ve seen this with others too in the app. Matt Mazzeo, the cofounder, even said “there’s no prescription” for the app, so people are defining how it’s used as we go. Many have been finding it useful as a social music network - allowing them to leverage their friends and strangers' taste in music to explore new tunes. Others have been using it at work to jam and talk with coworkers. 

The possibilities are endless. 


That’s been the question that’s come up in so many different rooms across the app. I joined as roughly the 100th user and now there are well over 700. There’s anywhere from 1 to 20 people in a room at any given time, and discovery has been getting harder and harder as more people join and are active.

Scaling (outside the technical challenges of scaling voice) will not be easy - but I think it’s a complex yet worthy problem to solve. In my opinion, discovery will become even more important as the app scales to 10k users and beyond. People will be lost if options that are unique to them aren’t served up on a platter.

Connection Over Community

I’ll reiterate: connection over community. I truly believe that this new world will optimize for connection over community - at least with audio apps. Case in point with this app: I’ve jammed with tons of people I just met and had a blast - it was the personal connection through voice that made the experience so pleasant and fun. So, building discovery will require some ingenuity. I think part of the puzzle will be building social graphs based on people’s existing social profiles and surfacing mutuals. Another part of discovery will need to be based on people’s interests. And dare I say this, the final part will need to be based on a score. As the app scales, the only way to maintain quality conversations will be to filter out the trolls, the lurkers and those who aren’t contributing to the connection. I think this can be filtered pretty easily with a rating system like the one Uber has for passengers/drivers.
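Here’s a rough Python sketch of how those three signals (mutuals, interests, a rating) could be blended into a single discovery score. Everything in it is an assumption for illustration: the User fields, the Jaccard overlap as the similarity measure, the 1-5 rating scale and the weights are all hypothetical and would need real tuning.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class User:
    name: str
    follows: Set[str] = field(default_factory=set)    # accounts this user follows
    interests: Set[str] = field(default_factory=set)  # e.g. music genres
    rating: float = 5.0                               # average peer rating, 1-5


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two sets, from 0.0 (nothing shared) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def room_score(viewer: User, host: User,
               w_mutuals: float = 0.4,
               w_interests: float = 0.4,
               w_rating: float = 0.2) -> float:
    """Blend shared follows, shared interests and the host's rating into 0..1."""
    mutuals = jaccard(viewer.follows, host.follows)
    interests = jaccard(viewer.interests, host.interests)
    rating = (host.rating - 1.0) / 4.0  # rescale 1-5 stars to 0..1
    return w_mutuals * mutuals + w_interests * interests + w_rating * rating


def rank_hosts(viewer: User, hosts: List[User]) -> List[User]:
    """Order hosts (and so their rooms) from most to least relevant."""
    return sorted(hosts, key=lambda h: room_score(viewer, h), reverse=True)
```

The rating term is what pushes a low-rated host down the feed even when the social and interest overlap is high, which is the troll filter I’m describing.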

And the Product

Product-wise they knocked it out of the park for a side-project. Sure, it’s still buggy as hell and will need some work to improve stability over time but I commend Matt and Brian for shipping. Baked into the beta there’s a lot of attention to detail that makes for a delightful experience. A few of those details that I’ll highlight:

  1. Auto-ducking for the music -- based on whether anyone is speaking, the music will automatically increase/decrease the volume. This creates a smooth listening experience that accommodates talking and jamming together.

  2. Importing contacts -- you can see pretty quickly who is and isn’t on Roadtrip from your contacts -- and if they aren’t, you can invite them with one tap. This is also great for network effects for the app.

  3. Leaderboard for DJs/hosts -- adding in a public leaderboard of the best DJs is like the icing on the cake. People love a little gamification and this helps incentivize people to focus on creating quality music rooms - no one wants a crappy room with crappy music.
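For the curious, the auto-ducking in the first detail can be sketched in a few lines of Python. This is not Roadtrip’s implementation, just the general idea: assume something upstream (a voice-activity detector) hands you one talking/not-talking flag per audio frame, then ease the music gain toward a lower target while someone talks and back up when they stop. The function name and the gain and smoothing values are all hypothetical.

```python
from typing import Iterable, List


def duck_gains(speech_active: Iterable[bool],
               ducked_gain: float = 0.25,
               full_gain: float = 1.0,
               attack: float = 0.5,
               release: float = 0.1) -> List[float]:
    """Return one music gain per frame for a sequence of speech flags.

    attack  - fraction of the remaining distance covered per frame when
              ducking (high, so the music gets out of the way quickly)
    release - same, when restoring the volume (low, so it fades back gently)
    """
    gain = full_gain
    gains = []
    for talking in speech_active:
        target = ducked_gain if talking else full_gain
        rate = attack if talking else release
        gain += (target - gain) * rate  # exponential smoothing toward target
        gains.append(gain)
    return gains


# Three quiet frames, ten frames of talking, then silence again.
frames = [False] * 3 + [True] * 10 + [False] * 50
music_gain = duck_gains(frames)
```

Each value in music_gain would be multiplied into the music samples for that frame. The asymmetric attack and release are the design choice that keeps it from sounding choppy.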

Some Feedback

  1. Create a more intuitive UI with easier navigation - I’ve had to walk multiple people through navigating the screens as they get into the app. Everything should be intuitive. It’s a consumer app, not mission control ;).

  2. Add some content/education about what Roadtrip actually is and what it’s for, even if it just says Roadtrip leaves the decision about how to use it up to the user - most people I talk to just need something to go on.

  3. Add a web app. I’ve talked to numerous people who’d jam with Roadtrip in the background while they’re working. If you make it a seamless experience like Spotify on the desktop with the added benefit of jamming with others, it’s a no-brainer for a lot of people who are lacking connection while they’re working right now.

Roadtrip certainly has a long road ahead of it to get it to where it needs to be, but the future certainly is promising (Clubhouse, albeit overvalued, has promise too). I’m not necessarily ready to say either will be super successful in their current form but I do think they both offer a lot of great potential. I’m really excited to see where social audio takes the world. Social audio starts to bridge the gap between pure messaging and more meaningful connection and conversation. The world won’t be going back to complete normalcy anytime soon and people will need something to stay sane and connected. Now, it’s just a matter of time before we see who will be able to take social audio from the newest commodity to the next FB acquisition or the next company to be ripped off by Instagram. 

But I’d argue there’s more opportunity focusing efforts somewhere else if you’re going to build in this space going forward.

The real winner: Audio Networks as a Service

I have a feeling that nailing the right combination of integrations with other apps to create an audio-first social network will be really messy at first. I’m not saying it’s impossible or not worth doing; it just might take a few years to gain “real” traction as a social network.

I believe the short-term winner in this game will be a new company we haven’t seen yet: a developer-first audio API built for bringing social audio networks to existing apps. Think of all the games and apps that could be dramatically improved if there was in-app audio chat. Look at Fortnite - Epic didn’t succeed just because they built a dope game - they built a social network with audio within the game.

The voice chatting in rooms/games is key — it creates an addictive feedback loop. Kids playing the game and chatting with their friends are killing 2 birds with one stone. They socialize and they game at the same time. After a while, just like psychology tells us, they start to associate hanging with friends with playing Fortnite. This association isn’t an accident - by building audio first social networks into games/apps you open yourself up to a world way beyond just in app usage. You’re adding real social value to the consumer at that point -- and in some ways leveraging the social addiction to keep them coming back to your app.

I strongly believe that a new company which builds the technology and a simple API to allow a drop in audio first social network solution will win the day. If you can be the Plaid of audio first social, you’ll kill it. Audio technology is a PITA to build, scale and maintain - for games and apps where that’s not part of the core business — it’s not worth building. But if they can pay a single provider to layer this on top of their existing app/game, then it’s a no brainer. This won’t be a need for every app. But there are a lot that could benefit from something like this.  

What does this look like from a product perspective?

  • The ability to create audio rooms at scale

  • The ability to move users in and out of rooms quickly

  • Be able to have 1 to X participants

  • Ability to add roles and permissions for each member of a room

  • Automatic garbage collection for stale rooms

  • The ability to control background audio for each room (add in music or other noise) - i.e., make a call from the server to inject an audio snippet at a certain time

  • Be able to boot people from a room dynamically

  • Create drop-in SDKs across web, mobile and server side so developers can easily integrate the platform into their existing codebase

  • Provide drop in and customizable base UI elements for web and mobile for the developer to overlay if wanted — allows developers to get up and running without having to build their own UI at first

  • Pliability - make it easy for developers to customize everything 

  • Developer first

  • And plenty of other stuff I’m sure I’m blowing over
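To give a feel for the developer experience I’m picturing, here’s a toy in-memory sketch in Python of the server-side piece: creating rooms, joining with a role, booting people and garbage-collecting stale rooms. No such product exists yet, so RoomService and every method on it are hypothetical names, and a real version would sit on top of actual audio infrastructure rather than a dict.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Room:
    room_id: str
    capacity: int
    members: Dict[str, str] = field(default_factory=dict)  # user -> role
    last_active: float = 0.0


class RoomService:
    """Hypothetical room manager; keeps everything in memory."""

    def __init__(self, stale_after_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._rooms: Dict[str, Room] = {}
        self._stale_after_s = stale_after_s
        self._clock = clock  # injectable so tests can control time

    def create_room(self, room_id: str, host: str, capacity: int = 20) -> Room:
        if room_id in self._rooms:
            raise ValueError(f"room {room_id!r} already exists")
        room = Room(room_id, capacity, {host: "host"}, self._clock())
        self._rooms[room_id] = room
        return room

    def join(self, room_id: str, user: str, role: str = "audience") -> None:
        room = self._rooms[room_id]
        if user not in room.members and len(room.members) >= room.capacity:
            raise RuntimeError("room is full")
        room.members[user] = role
        room.last_active = self._clock()

    def kick(self, room_id: str, actor: str, target: str) -> None:
        """Boot a member; only a host is allowed to do this."""
        room = self._rooms[room_id]
        if room.members.get(actor) != "host":
            raise PermissionError("only a host can remove members")
        room.members.pop(target, None)
        room.last_active = self._clock()

    def collect_stale(self) -> List[str]:
        """Delete rooms with no activity for stale_after_s; return their ids."""
        now = self._clock()
        stale = [rid for rid, room in self._rooms.items()
                 if now - room.last_active > self._stale_after_s]
        for rid in stale:
            del self._rooms[rid]
        return stale
```

A game or app would call something like create_room and join from its backend and let the SDK handle the audio itself. The point is that the integrating developer only ever touches a handful of calls like these.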

There’s a lot to be done in the audio space and the person who builds for the creators in this space will undoubtedly be very successful. 

If you’ve got any thoughts, questions or feedback please drop me a line - would love to chat! You can find me at twitter or