Web Analytics - Data Model

1 - About

A periodic process will process the raw data event to load them into a SQL database.

When data is loaded:

  • the data is inserted and updated
  • the schema is adjusted

Data are broken down into tables and views where:

  • Tables contain duplicate data,
  • Views do not.

Example:

  • A table = type of event
    • each tracked event has its own table (pageview, click button,…)
    • each property maps to a column
  • Partition = one day of data

3 - Platform

3.1 - Segment

  • Web Analytics - Segment.io - Doc

3.2 - Google Analytics

Google Analytics works with a one table format exported each day where:

  • each row within a table corresponds to a session
  • each hit data is saved in a record format (array)

See the full definition at BigQuery Export schema (Column Definition)

Extract of the Example that can be found in the guide

  • Average number of transactions per user that made a purchase in July 2017
SELECT
  (SUM (total_transactions_per_user) / COUNT(fullVisitorId) ) AS avg_total_transactions_per_user
FROM (
  SELECT
    fullVisitorId,
    SUM (totals.transactions) AS total_transactions_per_user
  FROM
    `bigquery-public-data.google_analytics_sample.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170701'
    AND '20170731'
    AND totals.transactions IS NOT NULL
  GROUP BY
    fullVisitorId )
  • Total transactions per device browser in July 2017
SELECT
  device.browser,
  SUM ( totals.transactions ) AS total_transactions
FROM
  `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
  _TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
GROUP BY
  device.browser
ORDER BY
  total_transactions DESC

4 - Documentation / Reference


Data Science
Data Analysis
Statistics
Data Science
Linear Algebra Mathematics
Trigonometry

Powered by ComboStrap