Three Tables Every Analyst Needs
Sep 3, 2025

Introduction
If you’re one of the first analysts joining a company, you will probably arrive to find an overwhelming amount of data. You’ll almost certainly have some relational data (probably powering your application’s backend), and likely have event data too (possibly through a third party like Segment).
This can make it difficult to know where to start with each analytical request. You might be tempted to create a new dataset for each analysis, but this will add to the amount of data and quickly becomes difficult to manage.
Fortunately, you can gather all of your data into three tables that can then be used to answer the majority of user related questions.
The SQL snippets in this post apply to Google BigQuery but can easily be adapted for other data warehouses.
Table 1: User Stats
To start, you need a table that tells you all about your users. This table will contain dimensions (e.g. user account creation timestamp) and metrics (e.g. user lifetime transaction count). It might be simple to begin with, but as your company grows and gets more complex, you’ll keep adding to this table.
Key point: user_stats
has one row per user

What kind of questions can this table be used to answer?
How many users signed up this week?
How many users have made at least 5 transactions?
How many users have logged in within the past 24 hours?
Table 2: Daily user stats
The need for this table might not be so obvious, but it turns out that having a table describing each user’s activity every single day since they signed up (even if they were not active at all on that day) is really useful. There are four dimensions you’ll definitely want in this table:
user_id
: a unique identifier for your users, that you can link back touser_stats
date
: what date does this row correspond today_n
: what day (in the lifetime of the user) does this date represent. On the day a user signs upday_n = 1
, the day after sign upday_n = 2
etcwas_active
: was the user active on this day? (Exactly how you choose to define active should depend on the specifics of your business. If you are creating a messaging app, you may define "active" as having sent at least one message, for example)
As with user_stats
, this table will probably end up with many more than these 4 fields as your business becomes more complex.
Get Daniel Lee’s stories in your inboxJoin Medium for free to get updates from this writer.Subscribe
Key point: daily_user_stats
has one row per user per day (for all days since they signed up)

What kind of questions can this table be used to answer?
What percentage of users are active one week after they signed up?
How has the percentage of users active the day after they sign up trended over time?
What does our user retention curve look like?
Table 3: Sessions
The sessions
table is a place to store your event data. Events are user interactions on your app or site (e.g. button clicks, form submissions, message sends etc). You probably track many different events, and finding a consistent approach to analysing them can be tricky. Sessions is a table to organise the chaos!
A session is a collection of events that represent one end to end interaction with your product: user opens app > user opens product abc page > user clicks add to cart > user closes app. How you combine a user’s events into a session depends on your product, but a standard approach is to end a session after a period of 30 minutes of inactivity. Any subsequent events from that user will be added to a new session.
Using some relatively complex SQL, you can split your raw events into sessions. Your sessions table should at least have these dimensions:
user_id
: unique user identifiersession_id
: Starting at 1 (the user's first ever session), incrementing for each further sessionstart_time
: timestamp of first event in sessionend_time
: timestamp of last event in sessionevents
: ordered list/array of all events in the session (this is well supported in BigQuery, other query languages may need to approach this slightly differently, for example using JSON)
Key point: sessions
has one row for each session a user has had

What kind of questions can this table be used to answer?
How has average session length changed over time?
What share of sessions contain a transaction?
What chain of events lead to a user leaving the app and never returning?
Summary
These three well structured tables can help answer most questions:
user_stats
: who is in my user basedaily_user_stats
: how many active users do I have, and how well do they retainsessions
: what journey are users taking through my product
Once you’ve written the queries to combine your other datasets into these three simple tables you should set up a schedule to keep them up to date with any new changes. You’ll soon start relying on these tables for a large share of your analytics.
Why this still matters today
When we work with startups at Outlier, this is exactly the kind of simplification we do: taking messy, scattered data and boiling it down into a few well-structured tables that unlock powerful insights.
In the AI age, this clarity matters more than ever — the simpler your data foundations, the easier it is for tools (and people) to answer questions about your business without error.
If you’re scaling and want a fractional data partner to help get you there, that’s what we do every day. Book an initial chat with us here.