System Design Questions to ask in an interview
Fair warning, this is a long one. Brew a pot of coffee (or tea!) and read on.
This page is about questions to ask in a system design interview, not answers. Ask them, but be careful not to dive too deep into a topic if you don’t know the answers: topics get deep very quickly when tackling distributed systems.
Gathering requirements
At this stage we want to gather requirements so that we have a better understanding of the system we’re designing.
Functional requirements
- Who is it for, and why do we need to build it?
- What are the features we need to build to solve the users’ problem?
- Identify a few problem areas to solve, and home in on one or two
Non-functional requirements
- How many active users? Are there power users that upload most of the content or is it spread evenly?
- How do we partition this data and handle noisy neighbours?
- Are users distributed across the world?
- Is there ever burst traffic?
- Are there users who upload single files that are GBs in size?
- Is the application used during work hours? When is there higher general traffic?
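Once these questions are answered, the numbers can be turned into rough load figures. Here is a minimal back-of-envelope sketch. Every input (20 requests per user per day, a 3x peak factor, 500,000 uploads per day at 50 MiB each) is an assumption made up for illustration, not a figure from any real system.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass
class LoadEstimate:
    average_rps: float
    peak_rps: float
    storage_bytes_per_day: int


def estimate_load(
    daily_active_users: int,
    requests_per_user: int,
    peak_factor: float,
    uploads_per_day: int,
    avg_upload_bytes: int,
) -> LoadEstimate:
    """Turn interview answers into rough requests/second and storage growth."""
    average_rps = daily_active_users * requests_per_user / SECONDS_PER_DAY
    return LoadEstimate(
        average_rps=average_rps,
        # Burst traffic is modelled as a simple multiple of the average.
        peak_rps=average_rps * peak_factor,
        storage_bytes_per_day=uploads_per_day * avg_upload_bytes,
    )


# All inputs below are illustrative assumptions.
estimate = estimate_load(
    daily_active_users=5_000_000,
    requests_per_user=20,
    peak_factor=3.0,
    uploads_per_day=500_000,
    avg_upload_bytes=50 * 1024**2,
)
print(f"average: {estimate.average_rps:.0f} req/s")
print(f"peak:    {estimate.peak_rps:.0f} req/s")
print(f"storage: {estimate.storage_bytes_per_day / 1024**4:.1f} TiB/day")
```

The point is not precision. Even rough figures like these tell you whether a single database can plausibly cope or whether partitioning and sharding belong in the conversation.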
Availability and consistency requirements
- Accuracy is different from consistency.
- e.g. an eventually consistent system still ends up accurate once its replicas converge, whereas inaccuracy means the processed data doesn’t have to match what the users provided.
- Does it need to be highly available?
- Touch on the CAP theorem
- Durability: how much data is acceptable to lose?
Response time and latency
- How quickly does a response need to come back, and how long does it need to stay available (e.g. cached) afterwards?
- Freshness: what actions are performed to keep items fresh (jobs, etc.)?
API design
- What is needed to provide each feature?
- What will the following look like:
- Endpoints
- Request body
- HTTP methods used
- Inputs and outputs
- Dive into the API design
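To make the list above concrete, here is a minimal sketch of one endpoint written down as typed inputs and outputs. The endpoint (`POST /videos`), the field names, the 5 GiB size limit, and the `upload_url` shape are all hypothetical, chosen only to show the level of detail worth reaching in an interview.

```python
import uuid
from dataclasses import asdict, dataclass
from typing import Dict, Tuple

MAX_UPLOAD_BYTES = 5 * 1024**3  # assumed limit, for illustration only


@dataclass
class CreateVideoRequest:
    title: str
    size_bytes: int


@dataclass
class CreateVideoResponse:
    video_id: str
    upload_url: str


def create_video(body: Dict) -> Tuple[int, Dict]:
    """Hypothetical handler for POST /videos.

    Returns (HTTP status code, JSON-serializable response body).
    """
    try:
        request = CreateVideoRequest(
            title=str(body["title"]),
            size_bytes=int(body["size_bytes"]),
        )
    except (KeyError, TypeError, ValueError):
        return 400, {"error": "title and size_bytes are required"}

    if not request.title or not 0 < request.size_bytes <= MAX_UPLOAD_BYTES:
        return 400, {"error": "invalid title or size_bytes"}

    video_id = uuid.uuid4().hex
    response = CreateVideoResponse(
        video_id=video_id,
        upload_url=f"/videos/{video_id}/content",
    )
    return 201, asdict(response)


status, payload = create_video({"title": "intro", "size_bytes": 1024})
print(status, payload)
```

Writing down the method, path, request body, and response for each feature like this also surfaces missing entities before you get to the data model.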
High level diagrams
- What clients will use this API? Logical blocks: API gateway, database, app server.
- Repeat per API
- Keep the diagram simple at the start
- Walk through the diagram and how it functions
Draw the diagram early and move quickly. A concrete diagram allows for real discussion around choices and prevents flip-flopping. A few guidelines:
- Keep it presentable
- Don’t try to use vendor-specific images; rely on generic names (e.g. “Database” instead of AWS RDS/Aurora)
Data model and schema
Here we define how we are going to store our data in our data store of choice. We make tradeoffs based on schema vs. schemaless approaches.
Questions to work through:
- Do we have all the entities we need? Is anything missing?
- How are we going to identify the key properties of each entity?
- What are the relationships between these entities?
- If using a relational data store: how normalized should the database be? More normalized means less redundancy, but reads that have to join more tables can be slower.
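As a small illustration of these questions, here is a sketch of a normalized two-entity schema using Python's built-in `sqlite3` module. The `users` and `videos` tables and their columns are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Each entity gets a key property (id) that identifies it.
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- The relationship "a user owns many videos" is a foreign key.
    CREATE TABLE videos (
        id       INTEGER PRIMARY KEY,
        owner_id INTEGER NOT NULL REFERENCES users(id),
        title    TEXT NOT NULL
    );
    """
)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'ada')")
conn.execute("INSERT INTO videos (id, owner_id, title) VALUES (10, 1, 'intro')")

# Normalized: the owner's name is stored once, so reading it costs a join.
rows = conn.execute(
    """
    SELECT v.title, u.name
    FROM videos AS v
    JOIN users AS u ON u.id = v.owner_id
    """
).fetchall()
print(rows)
```

A denormalized alternative would copy the owner's name into `videos` to avoid the join, at the cost of having to update every copy when the name changes.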
Deep dive design
This is where we scale up the design. You should have numbers already (e.g. 5 million users).
Ask these questions, but also be ready to provide answers. Everything mentioned here is fair game depending on what you bring up.
The following are questions you may be asked, and if not you should ask and answer them yourself.
As we scale, our database(s) will come under extreme load. How do we deal with this?
Are we making good use of indexes?
Are we using database partitions? Vertical vs horizontal partitions?
- We can break the tables down into smaller pieces, then attach them to a main table
- Vertical: split by column, e.g. move a large column such as a blob into its own tablespace, possibly on a separate drive
- Horizontal: split by row, using a range or a list, e.g. videos_200k, videos_400k, etc., where the suffix is the last sequential key stored in that table
What partitioning types are there? If we jump to partitioning, remember to talk about the tradeoffs!
- Range: dates, id, etc. - we could do this for video metadata, as older videos would not be watched as much
- List: discrete values like states or zipcodes. e.g. this partition is for everyone in CA
- Hash: a hash function applied to the key picks the partition
- Partition Advantages:
- Improves query performance when accessing a single partition
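- A sequential scan of one small partition can beat a scattered index scan over a huge table
- Easy bulk loading by attaching a pre-loaded table as a partition; support and syntax vary by engine (e.g. PostgreSQL has ATTACH PARTITION, MySQL has EXCHANGE PARTITION)
- Old data that is barely accessed can be archived into cheap storage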
- Partition Disadvantages:
- Updates can move rows from one partition to another, which can be slow
- Inefficient queries can scan all partitions
- Schema changes can be challenging if not planned for
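The three partitioning types above can be sketched as routing functions that decide where a row lives. This is an application-level illustration only, since a real database engine does this routing internally. The bounds, the partition names, and the count of four hash partitions are assumptions.

```python
import bisect
import zlib

# Range: each bound is the last sequential key held by that partition.
RANGE_UPPER_BOUNDS = [200_000, 400_000, 600_000]

# List: discrete values map to named partitions (names are hypothetical).
LIST_PARTITIONS = {"CA": "users_ca", "NY": "users_ny"}

HASH_PARTITION_COUNT = 4


def range_partition(video_id: int) -> str:
    """Pick the first partition whose upper bound is >= video_id."""
    index = bisect.bisect_left(RANGE_UPPER_BOUNDS, video_id)
    if video_id < 1 or index == len(RANGE_UPPER_BOUNDS):
        raise ValueError(f"no partition covers id {video_id}")
    return f"videos_{RANGE_UPPER_BOUNDS[index] // 1000}k"


def list_partition(state: str) -> str:
    """Look the value up; unknown values fall into a default partition."""
    return LIST_PARTITIONS.get(state, "users_other")


def hash_partition(key: str) -> int:
    """crc32 is stable across runs, unlike Python's built-in hash()."""
    return zlib.crc32(key.encode("utf-8")) % HASH_PARTITION_COUNT


print(range_partition(150_000))
print(range_partition(200_001))
print(list_partition("CA"))
print(hash_partition("video-42"))
```

Note how a query that filters on the partition key only needs one of these lookups, while a query that filters on anything else has to visit every partition.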
Are we using database sharding?
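Sharding is a row-based split of a table across databases, e.g. 200k rows in db 1, 200k in db 2, …
We need hashing that is consistent, so that a given key always leads us to the instance that holds its data
- Turn the input into a number (input -> bytes -> int), then take it modulo the number of nodes. The remainder plus a base number gives the port to connect to. Strictly speaking this is modulo hashing: it remaps most keys when the node count changes, which is the problem a consistent hashing ring is designed to limit
- In other words: input goes in, the instance comes back
- Hashing of input(1) will always go to db 1, input(2) will always go to db 2, …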
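The hash-then-modulo scheme described above can be sketched as follows. The base port (5432), the shard count (3), and the choice of SHA-256 are assumptions for illustration; any stable hash would do.

```python
import hashlib

BASE_PORT = 5432  # assumed: shard N listens on BASE_PORT + N
NUM_SHARDS = 3    # assumed shard count


def shard_index(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Input -> bytes -> int -> modulo the number of nodes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_port(key: str) -> int:
    """The remainder plus a base number gives the port to connect to."""
    return BASE_PORT + shard_index(key)


# The same key always maps to the same instance.
print(shard_port("user-1"), shard_port("user-1"))

# Count how many keys land on a different shard if a fourth shard is added.
keys = [f"user-{n}" for n in range(1000)]
moved = sum(shard_index(k, 3) != shard_index(k, 4) for k in keys)
print(f"{moved} of {len(keys)} keys change shard when going from 3 to 4 shards")
```

Running the last two lines is a quick way to show an interviewer why resharding is expensive with this scheme, and it is a natural lead-in to discussing alternatives.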
What is the difference between sharding and partitioning?
- Partitioning splits a table into multiple tables in the same database, so the table name or schema that gets queried changes
- Sharding splits a table into multiple tables on multiple database servers. Everything stays the same except for the database server the client connects to
Addendum
- Is there concurrency control on our database engine? Most engines have this.
- Is there data that requires two-phase locking?
- Are we using read/write replicas?
- Are we using the proper data store now? Is the database engine the right one for our needs? Why?
- What happens when we have deadlocks?
In closing
Hope this gives you a solid set of questions to work through at each stage of a system design interview. Use it as a checklist going in, and over time you’ll find you need to look at it less and less.