Here, there, and back again: Sourcegraph Cloud

Organization
Sourcegraph

Role
Staff Product Designer

Date
2021 – 2022

Sourcegraph is a universal code search and intelligence platform. More than 800,000 developers in some of the largest companies around the world rely on Sourcegraph to search across and navigate their company’s code on all of their code hosts.

As an early-stage startup, Sourcegraph started as a cloud SaaS product before realizing that the company needed to do more to earn trust with companies’ private code. This led to a pivot towards making Sourcegraph open source and self-hosted. Companies could look at Sourcegraph source code for themselves and run their Sourcegraph instance in their own trusted environment.

This early decision, as well as a focus on Enterprise-scale customers, was crucial to gaining traction. Today, many of the world’s largest and most technologically advanced companies like Dropbox, General Electric, PayPal, and many more trust Sourcegraph with their private code.

I joined Sourcegraph shortly after the company closed a Series B funding round, and helped the company grow to a $125M Series D funding round aimed at bringing universal code search to every dev at every company of every size and scale. We heard again and again that for smaller companies with lean engineering teams that already use SaaS products across their stack, the ease of deploying through SaaS outweighs the benefits of self-hosting. To reach every one of those devs, Sourcegraph needed to be effortless for these smaller companies to deploy.

In response, we decided to make a big bet and build out Sourcegraph Cloud on top of Sourcegraph.com. This presented a big challenge: everything about Sourcegraph's product was built on the assumption that Sourcegraph is a self-hosted, single-tenant product.

Breaking assumptions

Building a self-hosted product involves a lot of assumptions that break when turning it into a cloud SaaS:

All code on the instance belongs to the same organization.
There is a person or team who can set up, orchestrate, and administrate Sourcegraph on the organization’s own stack and network.
The instance admin will connect Sourcegraph with code hosts, and choose which repositories to add to Sourcegraph and make sure they sync properly.
Running a global search on Sourcegraph will search across only your company’s repositories.
All users on the Sourcegraph instance belong to the same company.
A URL to a search result copied and pasted to someone else will “just work,” because everyone has access to the same code.
There’s a high-touch, multi-stage sales process tailored for each customer.

The extent of this list of assumptions gave us an idea of just how big this challenge was, especially when we knew we wanted to keep “one Sourcegraph”—we didn’t want to split out into separate “Cloud” and “Self-hosted” products.

Big vision, incremental action

We needed to strike a balance between the long-term vision of what Sourcegraph Cloud would become and the concrete steps that would get us there. Our approach was to orient around the problems, identify the foundations, and be really thoughtful about what we needed when.

Orienting around problems

Instead of orienting around action items or tactical steps, we oriented our efforts around problems that, when solved, would represent the “minimum lovable product” at each iteration, ultimately helping us to validate the overriding assumption that companies would trust us with their private code in a cloud environment.

At the highest level, these problems and their dependencies looked something like:

Sourcegraph Cloud is the easiest way for organizations to get started with Sourcegraph.

Small-to-medium organizations can search across their code together on Sourcegraph Cloud.

Individual users can search across their private code on Sourcegraph Cloud.

Sourcegraph team members can search across their private code on Sourcegraph Cloud.

Individual users can search across their public code on Sourcegraph Cloud.

Each problem was inherently cross-disciplinary and revealed a set of sub-problems we’d have to solve.

Each problem we solved would support the next incremental step forward, all in a low-risk, high-confidence way that would help us to build momentum rather than rehashing the same problems again and again.

Identifying foundations

Once we had a roadmap of problems to solve, we took time to identify what we considered the “foundations”—those decisions that would influence all other decisions and ultimately accelerate future efforts, even if we didn’t act on them right away. Without identifying these foundations, we risked working ourselves into a corner by solving each problem in isolation, rather than as an incremental step within a bigger context.

For us, these foundations turned out to be mostly conceptual: what does it mean to “add code?” What about “connecting with code hosts?” How might that be the same, but different, for an individual user versus an organization? How does code visibility work? What’s the difference between what we’re doing here, and multitenancy? What decisions do we make now that will open up a future path to multitenancy?

We captured these decisions in async artifacts, such as an exploration of how multitenancy affects existing assumptions around roles and permissions, and an information architecture summary.

Sourcegraph Multi-tenancy: Initial thoughts on how multi-tenancy affects existing assumptions around roles and permissions
View document

Sourcegraph Cloud Information Architecture Summary
View document

Async artifacts like these are a core part of effective asynchronous, remote collaboration. They make it easy to share and provide thoughtful, considered feedback, and to consistently revisit and build on our shared understanding of these foundations.

You ain’t gonna need it (yet)

Having both a clear roadmap of problems and a good idea of the foundations helped us to keep moving forward without feeling like we needed to solve every problem all at once. A big part of this meant constantly revisiting what we needed to do now, and what could be done later.

We knew we’d need a strong invitation flow for organizations. The way we connected with code hosts could keep improving. And we knew we’d eventually need things like self-service payments and subscription management. But we weren’t going to need them yet.

A peek inside the iterative process of solving problems

Our approach to each major milestone involved breaking it down into sub-problems and carrying out an end-to-end design process, from discovery and definition through design and implementation. The general process was similar and predictable for each problem: as a team, we’d create clear problem statements and definitions of success, then move into low-fidelity design with heavy cross-disciplinary collaboration. We’d often test these low-fidelity design prototypes through hallway testing, and once aligned on the solution, we’d use high-fidelity design artifacts as the async source of truth to move into implementation.

Of course, this was an evolving and adaptable process. The goal was to use process to create predictability in how the team collaborates, rather than to define a linear, step-by-step approach. I consistently take the pragmatic approach in efforts like these: do what makes sense.

First milestone: Individual users can search across their public code on Sourcegraph Cloud

This milestone was all about defining the foundations and aligning the team around a common understanding of how we would get from here to there.

Success for this milestone was defined as: individual users can add and search across their public code from GitHub.com and GitLab.com.

This opened a bunch of questions like, “What does it mean to add my code to Sourcegraph.com?” Much of the initial collaboration here was with engineering, to understand and define the information architecture and to figure out how to evolve the current Sourcegraph product to support this in a sustainable way.

At the same time, we identified a likely future project that would need to be carried out by a different engineering team. Existing data showed us that users who search code they care about (typically their own code) are more likely to have early success with onboarding and understanding the product value. But once users started adding their own code to Sourcegraph Cloud, they’d be searching for it among the nearly 2.2 million repositories indexed on Cloud. We needed a way to make it effortless for a new user to add their code and have success searching across it right away. This led to proposing and building search contexts (case study coming soon).

Second milestone: Individual users can search across their private code on Sourcegraph Cloud

This milestone was all about building on the foundation to enable users to add their private code to Sourcegraph Cloud.

Success for this milestone was defined as: individual users on Sourcegraph Cloud can add their private and public code from GitLab.com and GitHub.com, and effortlessly search across it in their first search.

This required more extensive backend work: we needed to build out code host connections, the permissions handling system, and reach a pretty solid level of confidence that the permissions system would never leak private code to someone who shouldn’t see it.
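
The core idea behind a permissions check like this can be sketched in a few lines. This is only an illustration under stated assumptions, not how Sourcegraph implements it: the `Repo` and `User` types, the `readable_private_repos` field, and the `visible_repos` function are all hypothetical names invented for this example. The point it shows is "default deny": a private repository is searchable only when the user has an explicit grant for it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Repo:
    # Hypothetical model of a repository synced from a code host.
    name: str
    private: bool


@dataclass
class User:
    # Hypothetical model of a user on a shared, multi-user instance.
    name: str
    # Names of private repos the code host reports this user can read.
    readable_private_repos: set[str] = field(default_factory=set)


def visible_repos(user: User, repos: list[Repo]) -> list[Repo]:
    """Return only the repos this user may search.

    Default deny: public repos are always visible, and a private repo
    is visible only if the user holds an explicit grant for it.
    """
    return [
        repo
        for repo in repos
        if not repo.private or repo.name in user.readable_private_repos
    ]


if __name__ == "__main__":
    repos = [Repo("acme/public-site", private=False), Repo("acme/billing", private=True)]
    alice = User("alice", {"acme/billing"})
    bob = User("bob")
    print([r.name for r in visible_repos(alice, repos)])
    print([r.name for r in visible_repos(bob, repos)])
```

In this sketch, every search would run only over the result of `visible_repos`, so a user with no grants never sees a private repository, even by accident.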

We also implemented the rest of the interfaces needed to manage repositories, and a simple onboarding flow to make it easy for users to connect with code hosts and add repos they care about before conducting their first search.

Third milestone: Small-to-medium organizations can search across their code together on Sourcegraph Cloud

After launching private code for individual users on Cloud, we did a collaborative story-mapping workshop to break out what we knew we’d need to do to support organizations into a series of iterations that represented a fully functional, lovable product outcome.

Some things we’d have to solve included:

Organizations can add private and public code, and organization members can search across that code.
System state is visible to users so that they have confidence they are searching across the most up-to-date code.
Users can join organizations and invite others to them.
Organizations can be given early access so that we can test and improve the product before GA.

An interesting small effort that came out of this was the need to introduce helpful friction into the process of connecting organizations with code hosts (read the case study).

Initial results

We launched Sourcegraph Cloud for small-to-medium businesses in early 2022. Within a day, we went from zero to one with our first contracted customer. Over the course of the quarter, we exceeded our customer targets by more than 100%.

The pivot

While my team was working on Sourcegraph Cloud as a multitenant product, the company was also exploring “managed instances,” a way of deploying Sourcegraph where we set up and managed the instance ourselves, effectively delivering a different approach to a “cloud-based SaaS deployment.” This didn’t scale, because it was hands-on and relied heavily on engineering to manage the instances, but there was growing customer appetite for managed instances.

Between the initial success of multitenant Cloud after launch and the growing demand for managed instances, we felt that our overall assumption was correct: customers would trust us with their private code in the Cloud.

The cross-disciplinary triad working on Sourcegraph Cloud (including myself, the product manager, the engineering manager, and a staff engineer) frequently revisited our goals and objectives for our work on multitenant Cloud.

In early 2022, we decided to challenge whether it was the right decision to continue with Cloud as a multitenant product. While we had paying customers and were growing our commercial customer base, we felt that relative to the business priorities, capabilities, and revenue potential, it didn’t make sense as a strategic direction. The cost and complexity of continuing to build out multitenant Cloud to match the self-hosted experience massively outstripped its potential revenue, especially compared with the reduced complexity of pivoting to single-tenant Cloud based on managed instances.

This was a decision we felt needed to be owned at the VP and Director level, because it had significant impact on roadmap planning and organizational structure. I created the initial RFC together with the product manager on the team, and we made the case to senior leadership.

RFC XXX: Pivoting Sourcegraph’s SaaS product from shared multitenancy to single tenancy
View document

Ultimately, this recommendation was accepted, and the company executed on the pivot. Today, single-tenant Sourcegraph Cloud is the default deployment option for new customers, and many of our existing self-hosted customers are migrating to Cloud.