Analyze website traffic, user engagement, and purchase behavior to answer 8 business questions and turn raw analytics data into clear insights.
- π― Business Question: Which traffic sources drive the most revenue - and how do user engagement patterns differ between purchasers and non-purchasers?
- π¬ Domain: E-commerce & Digital Marketing
- π οΈ Tools: SQL (Google BigQuery)
π€ Author: BαΊ‘ch Minh Nam
π― Objective:
- This project uses SQL (Google BigQuery) to analyze Google Analytics 4 (GA4) data from the Google Merchandise Store e-commerce website
- It answers 8 specific business questions covering Traffic Performance, User Engagement, Revenue Analysis, and Conversion Funnel Optimization
- The goal is to turn raw session and event data into clear, actionable insights for marketing and product teams
β Main business question:
This project uses SQL to analyze website traffic, engagement, and revenue data from Google Analytics to:
- Track changes in visits, pageviews, and transactions over time
- Evaluate which traffic sources generate the most revenue and engagement
- Compare user behavior between purchasers and non-purchasers
- Identify cross-selling opportunities and conversion funnel bottlenecks
π€ Who is this project for?
- Data analysts & business analysts who want a reference for writing analytical SQL (CTEs, window functions, cohort analysis, UNNEST operations)
- Digital marketing teams who need insights into traffic source performance and ROI
- E-commerce managers & stakeholders who need quick insights into revenue trends, user engagement, and conversion rates
- Business intelligence teams building dashboards and reporting systems
The analysis is based on Google Analytics 4 (GA4) data exported to Google BigQuery, representing the Google Merchandise Store, a real e-commerce website selling branded merchandise. It contains data on user sessions, page views, product interactions, transactions, and revenue across multiple months in 2017.
To answer the 8 business questions in this project, 6 core data structures from the GA4 export schema were used. The table below lists only the columns that were actually used in the queries.
| Schema | Table / Struct | Columns Used | Used In | Purpose |
|---|---|---|---|---|
| Sessions | ga_sessions_2017* |
date, fullVisitorId |
Q1, Q2, Q4, Q5, Q6, Q8 | Base session table tracking unique users and session timestamps for all temporal analysis. |
| Sessions | totals |
visits, pageviews, transactions, bounces |
Q1, Q2, Q4, Q5, Q6 | Aggregate metrics per session - visits, pageviews, bounce count, transaction count for KPI calculations. |
| Sessions | trafficSource |
source |
Q2, Q3 | Identifies traffic channel origin (organic search, direct, referral, paid ads) to analyze channel performance. |
| Hits | hits |
eCommerceAction |
Q8 | Unnested to capture individual user actions within a session (product view, add to cart, purchase). |
| Hits | eCommerceAction |
action_type |
Q8 | Action type codes ('2'=View, '3'=Add to Cart, '6'=Purchase) to build conversion funnel analysis. |
| Product | product |
v2ProductName, productRevenue, productQuantity |
Q3, Q4, Q6, Q7, Q8 | Unnested product-level data to track revenue, quantities sold, and product-specific insights. |
π Full Documentation: For the complete explanation of all available fields in the GA4 BigQuery export schema, please refer to the Official Google Analytics BigQuery Export schema.
Question: Calculate total visits, pageviews, and transactions for January, February, and March 2017.
Tracking monthly traffic metrics helps the business understand seasonal demand patterns and measure the impact of marketing campaigns across the first quarter.
SELECT
FORMAT_DATE('%Y%m', PARSE_DATE('%Y%m%d', date)) AS month,
COUNT(totals.visits) AS visits,
SUM(totals.pageviews) AS pageviews,
SUM(totals.transactions) AS transactions
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_2017*`
WHERE _table_suffix BETWEEN '0101' AND '0331'
GROUP BY month
ORDER BY monthπ‘ Observations:
Visits grew from 64K (Jan) to 70K (Mar), pageviews from 257K to 294K, and transactions jumped from 700 to nearly 1,000. Growth acceleration in March suggests successful marketing campaign or new product launch. Investigate March drivers and replicate.
Question: Calculate the bounce rate per traffic source in July 2017.
High bounce rates indicate poor landing page relevance or user experience issues. Identifying which traffic sources bounce most helps prioritize optimization efforts and reallocate budget from underperforming channels.
SELECT
trafficSource.source AS source,
SUM(totals.visits) AS total_visits,
SUM(totals.bounces) AS total_no_of_bounces,
ROUND(SUM(totals.bounces) / SUM(totals.visits) * 100, 3) AS bounce_rate
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_201707*`
GROUP BY source
ORDER BY total_visits DESCπ‘ Observations:
Google sends 38K visits but 51% bounce. YouTube bounces worst at 67%. Direct traffic (43% bounce) converts best-loyal customers. Landing page misalignment for Google/YouTube traffic requires urgent optimization.
Question: Calculate revenue by traffic source by week and by month in June 2017.
Breaking revenue down by traffic source and time period reveals which channels are most profitable and when peak revenue occurs. This guides budget allocation and campaign timing decisions.
WITH month_data AS (
SELECT
'Month' AS time_type,
FORMAT_DATE('%Y%m', PARSE_DATE('%Y%m%d', date)) AS month,
trafficSource.source AS source,
SUM(p.productRevenue) / 1000000 AS revenue
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_201706*`,
UNNEST(hits) AS hits,
UNNEST(product) AS p
WHERE p.productRevenue IS NOT NULL
GROUP BY 1, 2, 3
),
week_data AS (
SELECT
'Week' AS time_type,
FORMAT_DATE('%Y%W', PARSE_DATE('%Y%m%d', date)) AS week,
trafficSource.source AS source,
SUM(p.productRevenue) / 1000000 AS revenue
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_201706*`,
UNNEST(hits) AS hits,
UNNEST(product) AS p
WHERE p.productRevenue IS NOT NULL
GROUP BY 1, 2, 3
)
SELECT * FROM month_data
UNION ALL
SELECT * FROM week_data
ORDER BY time_type, revenue DESCπ‘ Observations:
Direct traffic dominates revenue ($97K in June), followed by Google ($18.7K). YouTube/referral traffic bring visitors but zero revenue conversion. Shift budget from non-converting to high-ROI channels (direct, Google).
Question: Calculate average number of pageviews by purchaser type (purchasers vs non-purchasers) in June and July 2017.
Comparing engagement between buyers and non-buyers reveals the page view threshold needed to drive conversion. Higher pageview counts among purchasers signal deeper product exploration before purchase.
WITH
base AS (
SELECT
FORMAT_DATE('%Y%m', PARSE_DATE('%Y%m%d', date)) AS month,
totals.transactions,
product.productRevenue,
totals.pageviews,
fullVisitorId
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_2017*`,
UNNEST(hits) AS hits,
UNNEST(product) AS product
WHERE _table_suffix BETWEEN '0601' AND '0731'
),
purchase AS (
SELECT
month,
ROUND(SUM(pageviews) / COUNT(DISTINCT fullVisitorId), 8) AS avg_pageviews_purchase
FROM base
WHERE transactions >= 1
AND productRevenue IS NOT NULL
GROUP BY month
),
non_purchase AS (
SELECT
month,
ROUND(SUM(pageviews) / COUNT(DISTINCT fullVisitorId), 8) AS avg_pageviews_non_purchase
FROM base
WHERE transactions IS NULL
AND productRevenue IS NULL
GROUP BY month
)
SELECT *
FROM purchase
FULL JOIN non_purchase USING (month)
ORDER BY monthπ‘ Observations:
Non-purchasers view 3x more pages (317 in June) than purchasers (94 pages). High pageviews β conversion. Suggest focusing on engagement quality (time-on-page, cart adds) over quantity metrics.
Question: Calculate the average number of transactions per user that made a purchase in July 2017.
Understanding repeat purchase frequency within a month reveals customer loyalty and multi-purchase behavior. Higher repeat rates indicate strong product satisfaction and cross-sell success.
SELECT
FORMAT_DATE('%Y%m', PARSE_DATE('%Y%m%d', date)) AS month,
ROUND(SUM(totals.transactions) / COUNT(DISTINCT fullVisitorId), 4) AS avg_total_transactions_per_user
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_201707*`,
UNNEST(hits) AS hits,
UNNEST(product) AS product
WHERE totals.transactions >= 1
AND product.productRevenue IS NOT NULL
GROUP BY monthπ‘ Observations:
Customers average 4 transactions per user in July-strong repeat purchase behavior. Loyalty programs and personalized email recommendations should amplify this.
Question: Calculate the average amount of money spent per session (purchasers only) in July 2017.
Revenue per session reveals the monetary value each visit generates. Higher values indicate strong product pricing, effective upselling, or high-value customer segments.
SELECT
FORMAT_DATE('%Y%m', PARSE_DATE('%Y%m%d', date)) AS month,
ROUND((SUM(product.productRevenue) / SUM(totals.visits)) / 1000000, 2) AS avg_revenue_by_user_per_visit
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_201707*`,
UNNEST(hits) AS hits,
UNNEST(product) AS product
WHERE totals.transactions >= 1
AND product.productRevenue IS NOT NULL
GROUP BY monthπ‘ Observations:
Each purchasing session generates ~$44 revenue-strong AOV. Prioritize marketing spend on high-intent channels (Google branded search, email, direct).
Question: Calculate other products purchased by customers who also bought "YouTube Men's Vintage Henley" in July 2017.
Market basket analysis identifies which products are frequently purchased together. This drives product bundling, upsell strategies, and personalized recommendation engine training.
WITH
buyer_list AS (
SELECT DISTINCT fullVisitorId
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_201707*`,
UNNEST(hits) AS hits,
UNNEST(product) AS product
WHERE product.v2ProductName = "YouTube Men's Vintage Henley"
AND totals.transactions >= 1
AND product.productRevenue IS NOT NULL
)
SELECT
product.v2ProductName AS other_purchased_products,
SUM(product.productQuantity) AS quantity
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_201707*`,
UNNEST(hits) AS hits,
UNNEST(product) AS product
JOIN buyer_list USING (fullVisitorId)
WHERE product.v2ProductName != "YouTube Men's Vintage Henley"
AND product.productRevenue IS NOT NULL
AND totals.transactions >= 1
GROUP BY other_purchased_products
ORDER BY quantity DESCπ‘ Observations:
Henley buyers also purchase Sunglasses (20x), Hero Tee (7x), and Lip Balm (6x). Create bundled promotions pairing complementary products to increase AOV.
Question: Generate a cohort map of the checkout funnel (Product View β Add to Cart β Purchase) for JanβMar 2017.
Conversion funnel analysis identifies where users drop off during the purchase journey. High drop-off rates at specific funnel stages highlight optimization priorities (e.g., cart abandonment recovery, checkout simplification).
WITH
data_overview AS (
SELECT
FORMAT_DATE('%Y%m', PARSE_DATE('%Y%m%d', date)) AS month,
eCommerceAction.action_type AS action_type,
totals.transactions,
product.productRevenue
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_2017*`,
UNNEST(hits) AS hits,
UNNEST(product) AS product
WHERE _table_suffix BETWEEN '0101' AND '0331'
),
data_count AS (
SELECT
month,
COUNTIF(action_type = '2') AS num_product_view,
COUNTIF(action_type = '3') AS num_addtocart,
COUNTIF(action_type = '6' AND productRevenue IS NOT NULL) AS num_purchase
FROM data_overview
GROUP BY month
ORDER BY month
)
SELECT
*,
ROUND(num_addtocart / num_product_view * 100.0, 2) AS add_to_cart_rate,
ROUND(num_purchase / num_product_view * 100.0, 2) AS purchase_rate
FROM data_countπ‘ Observations:
Jan: 28% add-to-cart, 8% purchase. Mar: 37% add-to-cart, 13% purchase. Biggest drop-off is product view β add-to-cart (72% abandon). Add reviews, reduce friction, apply urgency tactics (low stock warnings).
1. Traffic Growth with Revenue Spike:
- Q1 traffic grew 10% (64Kβ70K visits) but transactions surged 43% (700β1K), indicating March marketing campaign effectiveness.
- Action: Replicate successful March tactics.
2. Channel Quality Gaps:
- Direct traffic (43% bounce) and Google (51% bounce) dominate volume, but YouTube (67% bounce) bleeds budget without ROI.
- Action: Reallocate spend to direct + Google branded search.
3. Revenue Concentration in Direct:
- Direct traffic generates $97K (June), 5x Google ($18.7K). Non-converting channels (YouTube, referral) waste marketing dollars.
- Action: Shift to high-intent channels.
4. Repeat Purchase Strength:
- Customers average 4 purchases/month ($44 per session), showing strong loyalty. Upsell and bundling opportunities underutilized.
- Action: Implement loyalty programs and personalized recommendations.
5. Engagement β Conversion:
- Non-purchasers view 3x more pages (317 vs 94), pageviews are poor conversion signal.
- Action: Focus on quality metrics (time-on-page, add-to-cart rate).
6. Cross-Sell Opportunity:
- Henley buyers also buy Sunglasses (20x), Tees (7x), Lip Balm (6x).
- Action: Bundle complementary products to increase AOV.
7. Funnel Bottleneck:
- 72% abandon at product view stage (25.7K views β 7.3K cart adds).
- Action: Add reviews, simplify add-to-cart, show social proof to improve step-1 conversion.
8. Urgent: Cart Abandonment:
- 28% add-to-cart conversion with only 8% purchase (Jan) means 3.5K daily carts abandoned.
- Action: Implement email recovery sequences and one-click checkout.








