Eureka AI · Alpha Data
Alpha Data Dictionary Hub
Schema reference and field definitions for Eureka AI's proprietary digital signal dataset. Captures web and mobile touchpoints across consumer applications — mapped to companies, GICS-classified equities, and Eureka's alpha signal taxonomy for investment intelligence.
Markets & availability
| Market | Dataset | Status | Latest | Fields | Rows |
|---|---|---|---|---|---|
| 🇬🇧 United Kingdom | alpha_uk_YYYYMMDD | Live | May 2026 | 19 | 135,793 |
| 🇺🇸 United States | alpha_us_YYYYMMDD | Coming soon | - | 19 | - |
| 🇪🇺 Europe | alpha_eu_YYYYMMDD | Coming soon | - | 19 | - |
| 🌏 APAC | alpha_apac_YYYYMMDD | Coming soon | - | 19 | - |
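
The dataset names above follow an alpha_<market>_YYYYMMDD pattern. As a minimal sketch, assuming the name is available as a plain string and that the market codes are exactly the four listed here, a name can be split into its market and publication date. The `AlphaRelease` type and `parse_release_name` function are hypothetical helpers, not part of any Eureka AI tooling.

```python
import re
from datetime import date, datetime
from typing import NamedTuple

# Assumed pattern: alpha_<market>_<YYYYMMDD>, with the four market codes
# listed in "Markets & availability".
_NAME_RE = re.compile(r"alpha_(uk|us|eu|apac)_(\d{8})")


class AlphaRelease(NamedTuple):
    """Hypothetical container for a parsed dataset name."""
    market: str
    published: date


def parse_release_name(name: str) -> AlphaRelease:
    """Split a dataset name such as 'alpha_uk_20260519' into market and date."""
    match = _NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not an Alpha dataset name: {name!r}")
    market, stamp = match.groups()
    # strptime raises ValueError if the eight digits are not a real calendar date.
    return AlphaRelease(market, datetime.strptime(stamp, "%Y%m%d").date())


release = parse_release_name("alpha_uk_20260519")
print(release.market, release.published.isoformat())
```

Rejecting anything that does not match the full pattern keeps a mistyped market code or a malformed date from being read as a valid release.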
Field reference — 19 fields
| Field name | Data type | Nullable | Description & notes |
|---|---|---|---|
| host_url | String | No | The observed domain or host URL. Primary grain of the Alpha dataset — each row represents one domain entity. Enables linkage to company, sector and signal classifications. |
| reg_domain | String | No | Registered root domain (e.g. google.com). Groups subdomains under a single registrable entity for aggregation and deduplication. |
| tpe | String — Categorical | No | Touchpoint Endpoint type — classifies the functional role of the URL (e.g. Main Website / Portal, Content Delivery (CDN), Ad Service, Authentication). 25+ types. Same taxonomy as Telco dataset. |
| app | String | No | Application or product name associated with the observed domain. Enables app-level aggregation across multiple subdomains belonging to the same product. |
| company | String | No | Corporate entity operating the observed app or domain. Provides entity-level grouping above app for company-wide analysis and equity linkage. |
| ticker | String | No | Stock exchange ticker in [TICKER] [EXCHANGE] format (Bloomberg convention). 'Private' for unlisted companies. Enables direct linkage to equities. Same format as Telco dataset — cross-dataset joins supported. |
| micro_category | String — Categorical | No | Granular two-level categorisation in [Macro] / [Sub-type] format. Most specific thematic classification available. Cross-dataset comparable with Telco micro field. |
| macro_category | String — Categorical | No | High-level sector vertical (e.g. AI, Infrastructure, Shopping, Finance). Primary field for sector-level aggregation and thematic signal construction. 90+ verticals. Same taxonomy as Telco macro. |
| alpha_signal | String — Categorical | Yes | Eureka AI proprietary signal classification. 15 signal types mapped to investment intelligence categories. 12.6% null rate — rows without a signal classification are unclassified domains. |
| funnel | Integer (1–15) | No | Funnel stage position (1–15) corresponding to alpha_signal. Maps each signal to its position in Eureka's proprietary consumer funnel model for sequential analysis. |
| alpha_point | Float | Yes | Eureka AI proprietary signal score. 99% null rate — only populated for a small subset of high-conviction classified domains. When present, provides a numeric strength indicator for the alpha signal. |
| notes | String | Yes | Free-text annotation field for analyst notes, caveats or contextual information about the domain classification. |
| gics_company | String | No | Company name as classified in the GICS (Global Industry Classification Standard) taxonomy. Enables standardised entity resolution across data sources. |
| gics_sectors | String — Categorical | No | GICS Level 1 sector classification (e.g. Information Technology, Consumer Discretionary). Top of the four-tier GICS hierarchy. Enables institutional-grade sector benchmarking. |
| gics_industries | String — Categorical | No | GICS Level 3 industry classification. Third tier of the GICS hierarchy, sitting between Industry Group (L2) and Sub-Industry (L4). |
| gics_sub_industry | String — Categorical | No | GICS Level 4 sub-industry classification. Most granular level of the GICS hierarchy. Enables the finest institutional classification available for equity analysis. |
| reviewed | Boolean | No | Indicates whether this domain record has been manually reviewed and validated by the Eureka AI research team. True = reviewed; False = auto-classified only. |
| start_date | Date (YYYY-MM-DD) | Yes | Date from which the domain record is considered valid. Used for temporal scoping and version tracking of the classification. |
| end_date | Date (YYYY-MM-DD) | Yes | Date until which the domain record is considered valid. Null indicates the record is currently active. Use start_date / end_date together for point-in-time analysis. |
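
Several of the field conventions in the table lend themselves to small helpers. The sketch below assumes each record has already been loaded as a Python dict keyed by field name, with dates parsed to `datetime.date` and nulls represented as `None`. The function names and the sample values (such as "GOOGL US") are illustrative, not taken from the dataset. Two further assumptions: a null start_date is treated as unbounded, and end_date is treated as inclusive.

```python
from datetime import date
from typing import Optional, Tuple


def split_ticker(ticker: str) -> Optional[Tuple[str, str]]:
    """Split a '[TICKER] [EXCHANGE]' value; return None for 'Private'."""
    if ticker == "Private":
        return None
    # Split on the last space so the exchange code is always the final token.
    symbol, _, exchange = ticker.rpartition(" ")
    if not symbol or not exchange:
        raise ValueError(f"unexpected ticker format: {ticker!r}")
    return symbol, exchange


def is_classified(row: dict) -> bool:
    """True when the row carries an alpha_signal classification."""
    return row.get("alpha_signal") is not None


def is_valid_on(row: dict, as_of: date) -> bool:
    """Point-in-time check using start_date and end_date.

    Assumptions: a null start_date means no lower bound, and end_date is
    inclusive. A null end_date means the record is currently active.
    """
    start = row.get("start_date")
    end = row.get("end_date")
    if start is not None and as_of < start:
        return False
    if end is not None and as_of > end:
        return False
    return True


# Illustrative record; the values are made up for the example.
row = {
    "ticker": "GOOGL US",
    "alpha_signal": None,
    "start_date": date(2025, 1, 1),
    "end_date": None,
}
print(split_ticker(row["ticker"]), is_classified(row), is_valid_on(row, date(2026, 5, 19)))
```

Whether end_date should be inclusive or exclusive is not stated in the dictionary, so the comparison in `is_valid_on` may need adjusting once that is confirmed.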
How to use this hub
Data grain
Each row represents one host_url, a single domain entity. This is a reference dataset (not time-series), so each domain appears once. Use start_date and end_date for temporal scoping.
Signal taxonomy
The alpha_signal field contains 15 proprietary signal types mapped to Eureka's consumer funnel model via the funnel field (1–15). Filter to non-null signals for classified domains only.
Cross-dataset joins
The ticker, macro_category, and tpe fields use identical taxonomy to the Telco dataset, enabling direct joins between Alpha and Telco data on company, sector, and endpoint type.
Changelog
2026-05-19
alpha_uk_20260519 published
UK Alpha dataset dictionary published. 135,793 rows, 19 fields covering digital signal taxonomy, GICS classification, and equity linkage.