Models
dim_legacy_devices
Legacy device slowly changing dimension.
Model tests: dbt_utils.mutually_exclusive_ranges
| Column | Description | Tests |
|---|---|---|
legacy_devices_key |
Unique legacy device key for slowly changing dimension. | unique, not_null |
device_id |
Unique source id for legacy devices. | |
valid_from |
Timestamp that reflects the valid start date of the record. | |
valid_to |
Timestamp that reflects the valid end date of the record. | |
is_current |
Flag to denote this record is the current record for the device. | |
customer_id |
Owning Tincan customer id for the device. | |
device_mac |
Device MAC address. | |
extension |
Internal extension number assigned to the device. | |
did_number |
Direct inward dial (DID) or outbound caller id number. | |
caller_id |
Caller id display name for this device. | |
enabled_911 |
Flag indicating whether 911 emergency calling is enabled. | |
is_discoverable |
Flag indicating whether the device is discoverable in the app. | |
device_name |
Human-readable device name. | |
device_meta |
JSON metadata about the device. | |
device_avatar |
URL or identifier for the device avatar. | |
external_network |
Flag indicating whether the device is on an external network. | |
status |
Raw device status value in Tincan. | |
created_at |
Timestamp when the device record was created. | |
updated_at |
Timestamp when the device record was last updated. | |
subscription_type |
Subscription type associated with the device. | |
dnd_until |
Timestamp until which do-not-disturb is active. | |
dnd_status |
Current do-not-disturb status for the device. | |
onboarding_complete |
Flag indicating whether device onboarding is complete. | |
device_status |
Operational device status (for example, online or offline). | |
device_type |
Device type (for example, desk phone or softphone). | |
country |
Country associated with the device address. | |
city |
City associated with the device address. | |
zip_postal_code |
Zip or postal code for the device address. | |
state_province |
State or province for the device address. | |
street |
Primary street address for the device. | |
street_secondary |
Secondary street address line for the device. | |
twilio_sid |
Twilio SID for the associated phone number. | |
twilio_emergency_address_sid |
Twilio emergency address SID for the device. | |
emergency_address_confirmed_date |
Date when the emergency address was confirmed. | |
emergency_address_validated_date |
Date when the emergency address was validated. | |
service_version |
Service version of device. | |
first_online |
First online / install date. | |
stripe_subscription_id |
Stripe payment system subscription id for device. |
daily_sales
One row per calendar day (America/Los_Angeles) with daily sales metrics from Shopify. Aggregates orders and refunds into items_sold, gross_sales, discounts, refunds, and net_sales_revenue. Used as the base for blended CAC. Orders use order date and refunds use refund date.
Model tests: dbt_utils.unique_combination_of_columns, dbt_utils.recency, dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart
| Column | Description | Tests |
|---|---|---|
date |
Calendar date (Pacific time) for the daily aggregate. Orders use order date and refunds use refund date. | unique, not_null |
items_sold |
Total units sold (before refunds) for the day. | |
items_refunded |
Total units refunded for the day. | |
net_items_sold |
items_sold minus items_refunded. | |
gross_sales |
Total sales before discounts and refunds. | |
discounts |
Total order discounts applied. | |
refunds |
Total refund amount for the day. | |
net_sales_revenue |
gross_sales minus discounts minus refunds. |
daily_sales_blended_cac
One row per calendar day with ad spend (Meta + Google), attributed purchases, and sales. Includes blended CAC, ROAS, and month-to-date metrics (mtd_ad_spend, mtd_items_sold, mtd_net_revenue, mtd_cac, mtd_roas). Built from daily_sales and Meta/Google ad sources.
Model tests: dbt_utils.unique_combination_of_columns, dbt_utils.recency, dbt_expectations.expect_row_values_to_have_data_for_every_n_datepart
| Column | Description | Tests |
|---|---|---|
date |
Calendar date for the daily row. | unique, not_null |
total_ad_spend |
Combined Meta + Google ad spend for the day. | |
direct_ad_attributed_sales |
Purchases attributed to Meta and Google ads. | |
total_units_sold |
Net units sold (from daily_sales) for the day. | |
net_sales_revenue |
Net sales revenue for the day. | |
mtd_ad_spend |
Month-to-date ad spend through this date. | |
mtd_items_sold |
Month-to-date net units sold through this date. | |
mtd_net_revenue |
Month-to-date net revenue through this date. | |
blended_cac |
Total ad spend divided by total units sold for the day (null when no ad spend). | |
roas |
Net sales revenue divided by total ad spend for the day (null when no ad spend). | |
mtd_cac |
Month-to-date ad spend divided by month-to-date units sold. | |
mtd_roas |
Month-to-date net revenue divided by month-to-date ad spend (null when mtd ad spend is zero). |
combined_google_ad_data
One row per (date, Google Ads creative unit). Union of src_google_ads.ad_performance
(Search/Display/Shopping/Video, grained at ad_group_ad) and src_google_ads.pmax_ad_performance
(Performance Max, grained at asset_group). PMax campaigns don't have ad_groups or ads —
the asset_group is the closest analog, so creative_unit_type disambiguates the two:
'ad' for rows from ad_performance and 'asset_group' for rows from pmax_ad_performance.
Campaign fields and metrics are harmonized across both sources; search-only columns
(ad_group_, ad_) are null on PMax rows and pmax-only columns (asset_group_*) are null
on search rows. Cost is converted from micros to dollars.
Model tests: dbt_utils.unique_combination_of_columns, dbt_utils.recency
| Column | Description | Tests |
|---|---|---|
date |
Reporting date for the row (from SEGMENTS.DATE). |
not_null |
customer_name |
Descriptive name of the Google Ads customer/account. | |
campaign_id |
Google Ads campaign id. | not_null |
campaign_name |
Campaign display name. | |
campaign_status |
Campaign status (for example, ENABLED, PAUSED, REMOVED). | |
bidding_strategy_type |
Bidding strategy type applied to the campaign. | |
advertising_channel_type |
Advertising channel type (for example, SEARCH, DISPLAY, SHOPPING, VIDEO, PERFORMANCE_MAX). | |
advertising_channel_sub_type |
Advertising channel sub-type, when applicable. | |
creative_unit_type |
Discriminator for the source feed and creative unit grain: 'ad' for rows from src_google_ads.ad_performance (Search/Display/etc., grained at ad_group_ad), or 'asset_group' for rows from src_google_ads.pmax_ad_performance (PMax, grained at asset_group). |
not_null, accepted_values |
creative_unit_id |
Unified creative unit id — ad id on 'ad' rows, asset_group id on 'asset_group' rows. |
not_null |
creative_unit_name |
Unified creative unit display name — ad name or asset_group name. | |
creative_unit_status |
Unified creative unit status — ad status (from AD_GROUP_AD.STATUS) or asset_group status. |
|
creative_unit_final_urls |
Unified array of final landing-page URLs — ad final_urls or asset_group final_urls. | |
ad_group_id |
Ad group id. Populated on 'ad' rows; null on 'asset_group' rows (PMax has no ad groups). |
|
ad_group_name |
Ad group display name. Null on 'asset_group' rows. |
|
ad_group_type |
Ad group type. Null on 'asset_group' rows. |
|
ad_group_status |
Ad group status. Null on 'asset_group' rows. |
|
ad_id |
Ad id (matches creative_unit_id on 'ad' rows). Null on 'asset_group' rows. |
|
ad_name |
Ad display name. Null on 'asset_group' rows. |
|
ad_type |
Ad type (for example, RESPONSIVE_SEARCH_AD). Null on 'asset_group' rows. |
|
ad_status |
Ad status within its ad group. Null on 'asset_group' rows. |
|
asset_group_id |
Asset group id (matches creative_unit_id on 'asset_group' rows). Null on 'ad' rows. |
|
asset_group_name |
Asset group display name. Null on 'ad' rows. |
|
asset_group_status |
Asset group status. Null on 'ad' rows. |
|
asset_group_ad_strength |
Google's ad-strength rating for the asset group (for example, POOR, AVERAGE, GOOD, EXCELLENT). Null on 'ad' rows. |
|
clicks |
Number of clicks for the row. | |
impressions |
Number of impressions for the row. | |
cost |
Spend in currency units for the row, converted from METRICS.COST_MICROS (divided by 1,000,000). |
|
ctr |
Click-through rate (clicks / impressions) for the row. | |
average_cpc |
Average cost per click for the row (currency units). | |
conversions |
Number of primary conversions attributed to the row. | |
conversions_value |
Monetary value of primary conversions for the row. | |
all_conversions |
Number of all conversions (primary + secondary) for the row. | |
all_conversions_value |
Monetary value of all conversions for the row. | |
view_through_conversions |
View-through conversions for the row (mainly meaningful for display/video/PMax inventory). | |
airbyte_extracted_at |
Extraction timestamp from Airbyte for the underlying source row. |
call_logs
One row per call from Asterar CDR, filtered and deduplicated by linkedid. Joins CDR to Tincan devices (by device_id / DID) to resolve call_from and call_to labels. Excludes AppDial/voicemail-check noise; filters to calls from 2025-05-05 onward. Output includes call_classification (can_to_can, can_to_external, external_to_can, failed_call, etc.), disposition, call_was_answered, talk_time_seconds, and left_voicemail.
Model tests: dbt_utils.unique_combination_of_columns, dbt_utils.recency
| Column | Description | Tests |
|---|---|---|
unique_id |
Asterar unique call id (from CDR). | unique, not_null |
call_date |
Timestamp of the call (group first calldate) in PST. | |
call_from |
Source identifier (device_id for calls from other Tin Cans, raw src for calls from other types of phones) of the device that initiated the call. | |
call_from_name |
Human-readable label for call_from (caller_id or external number of device that initiated the call). | |
is_call_from_tin_can |
True if call_from matches a Tincan device or DID. | |
destination_id |
Raw destination identifier used for routing. | |
destination_number |
Raw destination number (E.164-style when external). | |
call_to |
Destination identifier (device_id for calls to other Tin Cans, raw src for calls to other types of phones) for the device to which the call was made. | |
call_to_name |
Human-readable label for call_to (caller_id or external number of device to which the call was made). | |
is_call_to_tin_can |
True if call_to matches a Tincan device or DID. | |
call_classification |
Type of call (can_to_can, can_to_external, external_to_can, failed_call, cancelled_call, voicemail_check, UNKNOWN). | |
is_voicemail_check |
True if call was a voicemail check (*97 or src = dst). | |
disposition |
Cleaned disposition (NA for voicemail check, else ANSWERED/NO ANSWER/etc). | |
is_call_answered |
True if the call had a human conversation (answered, not voicemail-only). | |
talk_time_seconds |
Billable talk time in seconds; null when not answered and no voicemail. | |
is_voicemail |
True if caller left a voicemail. | |
cdr_key_concat |
CDR composite key used for joins. | |
call_from_service_version |
Service version of the caller-side Tin Can device at call time; defaults to v1 when caller is Tin Can and version is unavailable. | |
call_to_service_version |
Service version of the recipient-side Tin Can device at call time; defaults to v1 when recipient is Tin Can and version is unavailable. |
call_logs_full_history
Union of (1) FreePBX fused call logs (bi_call_logs_fusion_freepbx), (2) Twilio external-to-Canada calls, and (3) call_logs (Asterar from 2025-05-05). Same schema as call_logs; provides full historical call set.
Model tests: dbt_utils.unique_combination_of_columns
| Column | Description | Tests |
|---|---|---|
unique_id |
Unique call id from source (FreePBX, Twilio, or Asterar). | unique, not_null |
call_date |
Timestamp of the call in PST. | |
call_from |
Source identifier (device_id for calls from other Tin Cans, raw src for calls from other types of phones) of the device that initiated the call. | |
call_from_name |
Human-readable label for call_from (caller_id or external number of device that initiated the call). | |
is_call_from_tin_can |
True if call_from matches a Tincan device or DID. | |
destination_id |
Raw destination identifier. | |
destination_number |
Raw destination number. | |
call_to |
Destination identifier (device_id for calls to other Tin Cans, raw src for calls to other types of phones) for the device to which the call was made. | |
call_to_name |
Human-readable label for call_to (caller_id or external number of device to which the call was made). | |
is_call_to_tin_can |
True if call_to matches a Tincan device or DID. | |
call_classification |
Type of call (can_to_can, can_to_external, external_to_can, etc.). | |
is_voicemail_check |
True if call was a voicemail check. | |
disposition |
Cleaned disposition. | |
is_call_answered |
True if the call had a human conversation. | |
talk_time_seconds |
Billable talk time in seconds; null when not answered and no voicemail. | |
is_voicemail |
True if caller left a voicemail. | |
cdr_key_concat |
CDR composite key (uniqueid in FreePBX branch). |
customers_to_stripe
One row per Tincan customer. Maps each customer to a Stripe customer via (1) explicit stripe_customer_id, or (2) email match, or (3) phone match. Best match chosen by priority_order and stripe created date. Used to join Tincan entities to Stripe billing data.
Model tests: dbt_utils.unique_combination_of_columns
| Column | Description | Tests |
|---|---|---|
customer_id |
Tincan customer id. | unique, not_null |
stripe_customer_id |
Matched Stripe customer id (null if no match). | |
match_type |
How the match was made (base, email, or phone). | |
stripe_customer_created_date |
Stripe customer creation timestamp (null when no match). |
devices_joined_full_history
Unified device list: full outer join of Tincan devices and historical dial numbers (all_dial_numbers). One row per device_id. Uses Tincan data when present, else historical; coalesces extension, did_number, caller_id, first_online, etc. Supports participant and call-log models that need full device history.
Model tests: dbt_utils.unique_combination_of_columns
| Column | Description | Tests |
|---|---|---|
device_id |
Device identifier (Tincan or historical dial number). | unique, not_null |
customer_id |
Owning customer id. | |
device_mac |
Device MAC address. | |
extension |
Extension number. | |
did_number |
DID or outbound caller id number. | |
caller_id |
Caller id display name. | |
enabled_911 |
Whether 911 is enabled. | |
is_discoverable |
Whether device is discoverable. | |
device_name |
Device display name. | |
device_meta |
Device metadata (JSON). | |
device_avatar |
Avatar URL. | |
external_network |
External network flag. | |
status |
Device status. | |
created_at |
Record creation timestamp. | |
updated_at |
Record update timestamp. | |
subscription_type |
Subscription type. | |
dnd_until |
DND until timestamp. | |
dnd_status |
DND status. | |
onboarding_complete |
Whether onboarding is complete. | |
device_status |
Device status. | |
device_type |
Device type. | |
country |
Country. | |
city |
City. | |
zip_postal_code |
Zip or postal code. | |
state_province |
State or province. | |
street |
Street address. | |
street_secondary |
Secondary street. | |
twilio_sid |
Twilio SID. | |
twilio_emergency_address_sid |
Twilio emergency address SID. | |
emergency_address_confirmed_date |
Emergency address confirmed date. | |
emergency_address_validated_date |
Emergency address validated date. | |
first_online |
First online / install date. |
participant_daily_activity
One row per (participant device, call_day). Built from call_logs and devices_joined_full_history. Aggregates per-device, per-day metrics: talk_time_seconds, successful_calls_participated_in, calls_made, calls_received, num_external_calls, num_voicemails_left/received, num_voicemail_checks. Excludes device 3635. Pacific time used for call_day.
Model tests: dbt_utils.recency, dbt_utils.unique_combination_of_columns
| Column | Description | Tests |
|---|---|---|
call_day |
Date (Pacific) for the daily aggregate. | not_null |
day_of_week |
Day of week (mon, tues, …, sun). | |
this_participant_name |
Participant caller_id or name. | |
this_participant_device_id |
Participant device id. | not_null |
this_participant_service_version |
Service version for the participant device used for this daily rollup row. | |
talk_time_seconds |
Sum of talk_time_seconds for the day. Measured in seconds. | |
successful_calls_participated_in |
Count of answered, non–voicemail-check calls participated in. | |
calls_made |
Outbound calls placed by this participant. | |
successful_calls_made |
Answered outbound calls. | |
calls_received |
Inbound calls received by this participant. | |
successful_calls_received |
Answered inbound calls. | |
num_interlocutors |
Distinct call counterparties (call_from/call_to) for the day. | |
num_external_calls |
Count of can_to_external or external_to_can calls. | |
num_successful_external_calls |
Answered external calls. | |
num_voicemails_left |
Voicemails left by this participant. | |
num_voicemails_received |
Voicemails received by this participant. | |
diff_from_first_call_day |
call_day minus first call day for this participant. | |
num_voicemail_checks |
Count of voicemail-check calls. |
participant_daily_activity_full_history
Union of participant_daily_activity and historical view_tc_participant_slice_daily_summary. Excludes historical rows where (call_day, this_participant_device_id) already exists in participant_daily_activity. Provides full-history participant daily metrics for reporting.
Model tests: dbt_utils.unique_combination_of_columns
| Column | Description | Tests |
|---|---|---|
call_day |
Date for the daily aggregate. | not_null |
day_of_week |
Day of week (mon, tues, …, sun). | |
this_participant_name |
Participant name. | |
this_participant_device_id |
Participant device or dial number id. | not_null |
this_participant_service_version |
Service version for the participant device in full-history daily activity (v1 for legacy historical rows where assigned). | |
talk_time_seconds |
Sum of talk time for the day. Measured in seconds. | |
successful_calls_participated_in |
Answered, non–voicemail-check calls participated in. | |
calls_made |
Outbound calls by this participant. | |
successful_calls_made |
Answered outbound calls. | |
calls_received |
Inbound calls received. | |
successful_calls_received |
Answered inbound calls. | |
num_interlocutors |
Distinct counterparties for the day. | |
num_external_calls |
External (can_to_external / external_to_can) calls. | |
num_successful_external_calls |
Answered external calls. | |
num_voicemails_left |
Voicemails left. | |
num_voicemails_received |
Voicemails received. | |
diff_from_first_call_day |
Days since first call day for this participant. | |
num_voicemail_checks |
Voicemail-check count (null in historical slice). |
participant_logs
Call-level participant view: one row per participant per call. Built from call_logs joined to Tincan devices (call_from or call_to). Includes this_participant_device_id, this_participant_name, call_day, num_entries_forcall, and all call fields. Excludes device 3635; uses Pacific time for call_day.
Model tests: dbt_utils.unique_combination_of_columns
| Column | Description | Tests |
|---|---|---|
this_participant_device_id |
Participant device id. | not_null |
this_participant_name |
Participant caller_id or name. | |
call_day |
Date (Pacific) of the call. | not_null |
num_entries_forcall |
Number of participant rows for this call (1 for non–voicemail-check, or 1 for voicemail check per participant). | |
unique_id |
Call unique id. | not_null |
call_date |
Call timestamp. | |
call_from |
Source identifier (device_id for calls from other Tin Cans, raw src for calls from other types of phones) of the device that initiated the call. | |
call_from_name |
Human-readable label for call_from (caller_id or external number of device that initiated the call). | |
is_call_from_tin_can |
True if call_from is Tincan. | |
destination_id |
Raw destination id. | |
destination_number |
Raw destination number. | |
call_to |
Destination identifier (device_id for calls to other Tin Cans, raw src for calls to other types of phones) for the device to which the call was made. | |
call_to_name |
Human-readable label for call_to (caller_id or external number of device to which the call was made). | |
is_call_to_tin_can |
True if call_to is Tincan. | |
call_classification |
Type of call. | |
is_voicemail_check |
True if voicemail check. | |
disposition |
Cleaned disposition. | |
is_call_answered |
True if answered. | |
talk_time_seconds |
Billable talk time in seconds. | |
is_voicemail |
True if voicemail left. | |
cdr_key_concat |
CDR key for the call. |
participant_logs_full_history
Full-history participant-side call log. Each row represents one Tin Can participant on one call, not one deduplicated underlying call. Calls between two Tin Can devices can therefore appear once for each participant. Historical participant-slice records and current participant_logs records are unioned into the same schema.
Model tests: dbt_utils.unique_combination_of_columns
| Column | Description | Tests |
|---|---|---|
this_participant_device_id |
Participant device or dial number id. | not_null |
this_participant_name |
Participant name. | |
call_day |
Date of the call. | not_null |
num_entries_forcall |
Participant rows per call. | |
unique_id |
Call unique id. | not_null |
call_date |
Call timestamp. | |
call_from |
Source identifier (device_id for calls from other Tin Cans, raw src for calls from other types of phones) of the device that initiated the call. | |
call_from_name |
Human-readable label for call_from (caller_id or external number of device that initiated the call). | |
is_call_from_tin_can |
True if call_from is Tincan. | |
destination_id |
Raw destination id. | |
destination_number |
Raw destination number. | |
call_to |
Destination identifier (device_id for calls to other Tin Cans, raw src for calls to other types of phones) for the device to which the call was made. | |
call_to_name |
Human-readable label for call_to (caller_id or external number of device to which the call was made). | |
is_call_to_tin_can |
True if call_to is Tincan. | |
call_classification |
Type of call. | |
is_voicemail_check |
True if voicemail check. | |
disposition |
Cleaned disposition. | |
is_call_answered |
True if answered. | |
talk_time_seconds |
Billable talk time in seconds. | |
is_voicemail |
True if voicemail left. | |
cdr_key_concat |
CDR key for the call. |
fct_ga_events
Event-level Google Analytics 4 fact table from src_google_analytics.events_all.
Includes event metadata, page context, device/geo dimensions, last-click campaign attribution,
and ecommerce revenue fields with event timestamps converted to America/Los_Angeles.
Data begins on April 3, 2026 and should be populated continuously going forward from that date. This data source will never contain traffic data for dates before April 3 2026.
Model tests: dbt_utils.unique_combination_of_columns
| Column | Description | Tests |
|---|---|---|
_airbyte_raw_id |
Raw Airbyte record identifier from the source event row. | |
event_timestamp |
Event timestamp converted from UTC to America/Los_Angeles. | |
event_name |
GA4 event name. | |
user_pseudo_id |
Anonymous GA4 user identifier. | |
ga_session_id |
GA4 session identifier extracted from event parameters. | |
platform |
Platform where the event occurred (for example, WEB, IOS, or ANDROID). | |
page_host |
Host/domain parsed from page_location. | |
page_path |
URL path parsed from page_location. | |
page_title |
Page title extracted from event parameters. | |
referrer_page_host |
Host/domain parsed from page_referrer. | |
referrer_page_path |
URL path parsed from page_referrer. | |
device_category |
Device category reported by GA4 (for example, desktop, mobile, or tablet). | |
device_mobile_brand_name |
Mobile device brand name. | |
device_mobile_model_name |
Mobile device model name. | |
device_operating_system |
Device operating system. | |
device_operating_system_version |
Device operating system version. | |
device_browser |
Browser name from GA4 web_info. | |
device_browser_version |
Browser version from GA4 web_info. | |
geo_city |
Event city from GA4 geo metadata. | |
geo_region |
Event region/state from GA4 geo metadata. | |
geo_country |
Event country from GA4 geo metadata. | |
geo_metro |
Event metro area from GA4 geo metadata. | |
geo_continent |
Event continent from GA4 geo metadata. | |
first_click_medium |
First-touch traffic medium from GA4 traffic_source. | |
first_click_source |
First-touch traffic source from GA4 traffic_source. | |
first_click_campaign_name |
First-touch campaign name from GA4 traffic_source. | |
event_medium |
Event-level manual medium from GA4 collected_traffic_source parameters. | |
event_source |
Event-level manual source from GA4 collected_traffic_source parameters. | |
event_campaign_name |
Event-level manual campaign name from GA4 collected_traffic_source parameters. | |
event_campaign_id |
Event-level manual campaign id from GA4 collected_traffic_source parameters. | |
event_term |
Event-level manual term from GA4 collected_traffic_source parameters. | |
event_content |
Event-level manual ad/content tag from GA4 collected_traffic_source parameters. | |
last_click_channel |
Last-click primary channel group from GA4 cross-channel campaign attribution. | |
last_click_platform |
Last-click source platform from GA4 cross-channel campaign attribution. | |
last_click_medium |
Last-click medium from GA4 cross-channel campaign attribution. | |
last_click_source |
Last-click source from GA4 cross-channel campaign attribution. | |
last_click_campaign_name |
Last-click campaign name from GA4 cross-channel campaign attribution. | |
last_click_campaign_id |
Last-click campaign id from GA4 cross-channel campaign attribution. | |
items |
Array of ecommerce item objects associated with the event. | |
transaction_id |
Ecommerce transaction identifier. | |
revenue_amount |
Ecommerce purchase revenue in USD; defaults to 0 when missing. | |
shipping_amount |
Ecommerce shipping amount in USD; defaults to 0 when missing. | |
tax_amount |
Ecommerce tax amount in USD; defaults to 0 when missing. |
fct_marketing_attributed_orders
One row per Shopify order with last-click marketing attribution applied. Built from fct_ga_events purchase events joined to shopify.orders (GA <-> Shopify order_id match rate is 100%; the Shopify join filters out test orders). GA occasionally double-fires the purchase event on confirmation-page refresh, so events are deduped to keep the earliest per transaction.
Attribution: GA4's last-non-direct-click model (default 30-day lookback). Hybrid implementation — channel, ad_platform, source, medium, campaign_id, and campaign_name are read straight off GA's native LAST_CLICK_ columns; ad_set_id and ad_id are not exposed by GA at the purchase grain and so are re-derived by finding the matching session_start (same user, same source/medium/campaign_id as LAST_CLICK_, latest such session before the purchase) and reading event_term and event_content off it.
Revenue convention: revenue_gross is the cart price net of shipping and tax (both pass-throughs, not company revenue) and post-discount (GA's revenue_amount is already discount-netted, matching Shopify's total_price at ~97%). Does NOT yet subtract returns; once returns are wired in we'll add a true revenue_net column. Subscription revenue is excluded for now.
UTM migration state (as of 2026-04-30): Meta fully migrated; Google migrated 2026-04-24 so pre-migration paid Google orders have NULL ad_set_id and ad_id; TikTok utm_term (AID) is missing entirely so ad_set_id is always NULL on TikTok rows, and TikTok's utm_content (CID) is producing creative names rather than IDs.
Model tests: dbt_utils.unique_combination_of_columns
| Column | Description | Tests |
|---|---|---|
order_id |
Shopify order id (parsed from GA's transaction_id). One row per order. | unique, not_null |
user_pseudo_id |
GA4 anonymous user identifier of the buyer at purchase time. | |
order_ts |
Order timestamp (Pacific time), from GA's purchase event. | |
order_date |
Order date (Pacific time), derived from order_ts. | not_null |
cart_price_with_taxes_and_shipping |
Total cart price including taxes and shipping. Deliberately verbose; NOT a usable revenue figure. Kept for reconciliation and audit only. | |
shipping |
Shipping amount on the order (USD pass-through, not company revenue). | |
tax |
Tax amount on the order (USD pass-through, not company revenue). | |
revenue_gross |
Cart price less shipping and tax. Post-discount (GA's revenue_amount already nets discounts). Does NOT yet subtract returns; true revenue_net awaits a returns join. | |
units_ordered |
Total units in the order, summed from quantities in the GA items array. Useful for cost-per-unit-sold metrics on a single-SKU business where one order may include multiple phones. | |
channel |
GA4 last-click channel group (for example, Organic Search, Direct, Paid Social). | |
ad_platform |
Business-level paid-ad platform: 'Meta Ads', 'Google Ads', 'TikTok Ads', or NULL for non-paid traffic. Derived from source/medium/channel rather than GA's native LAST_CLICK_PLATFORM (which only distinguishes auto-tagged from UTM-tagged traffic). | accepted_values |
source |
GA4 last-click source. | |
medium |
GA4 last-click medium. | |
campaign_id |
GA4 last-click campaign id. | |
campaign_name |
GA4 last-click campaign name. | |
ad_set_id |
Unified ad-set-level identifier of the last-click session, from event_term: adset_id (Meta) | ad_group_id (Google standard) |
ad_id |
Ad id of the last-click session, from event_content. NULL for non-paid traffic, for Google PMax (which has no ad concept), and for paid traffic where UTM tagging hasn't yet flowed IDs. Some Meta rows carry an ad name rather than a numeric id due to URL-template misconfig (e.g., ad.name instead of ad.id). |
fct_marketing_spend
One row per (date, ad platform, campaign, ad-set-or-asset-group, ad). Pure-spend model: spend, clicks, and impressions only. Platform-claimed conversions/revenue belong in a separate reconciliation model so our house ROAS (= attributed revenue / this spend) stays cleanly computable without platform self-attribution leaking in.
Sources: - Meta: src_facebook_ads.custom_ad_performance (native ad-grain). - Google: combined_google_ad_data, which unions PMAX_AD_PERFORMANCE and AD_PERFORMANCE, normalizes column names, and converts cost from micros to USD. - TikTok: src_tiktok_ads.ads_reports_daily for daily metrics, joined to src_tiktok_ads.ads to resolve parent campaign_id, ad_group_id, and the corresponding names. Spend/clicks/impressions live in the metrics JSON OBJECT as string-y values and are cast to FLOAT/NUMBER on extraction.
Grain notes by source: Meta rows have campaign_id, ad_set_id, and ad_id all populated. Google PMax rows have campaign_id and ad_set_id (= the asset_group_id) populated, with ad_id NULL because PMax has no ad concept — the asset group is the lowest creative grain. Google non-PMax rows (when running) populate campaign_id, ad_set_id (= ad_group_id), and ad_id. TikTok rows have campaign_id, ad_set_id (= adgroup_id), and ad_id all populated. ad_platform values match the same-named column on fct_marketing_attributed_orders so the two models join cleanly. All IDs are cast to TEXT for a uniform join key against GA UTM values.
Model tests: dbt_utils.unique_combination_of_columns
| Column | Description | Tests |
|---|---|---|
spend_date |
Date of ad spend. | not_null |
ad_platform |
Paid-ad platform: 'Meta Ads', 'Google Ads', or 'TikTok Ads'. Matches the same-named column on fct_marketing_attributed_orders so the two models join cleanly. | not_null, accepted_values |
campaign_id |
Campaign id (TEXT). Meta's campaign_id, Google Ads CAMPAIGN.ID cast to text, or TikTok's campaign_id (resolved from the ADS dimension table). | not_null |
campaign_name |
Campaign display name. | |
ad_set_id |
Unified ad-set-level identifier (TEXT): adset_id (Meta), ad_group_id (Google non-PMax), asset_group_id (Google PMax), or adgroup_id (TikTok). | |
ad_set_name |
Ad set / ad group / asset group display name. | |
ad_id |
Ad id (TEXT). Populated for Meta, Google non-PMax, and TikTok rows. NULL for Google PMax rows because PMax has no ad concept — the asset group is the lowest creative grain. | |
ad_name |
Ad display name. NULL for Google PMax rows. | |
spend |
Spend in USD for the row. | |
clicks |
Number of clicks for the row. | |
impressions |
Number of impressions for the row. |