Banno Usage Data - Vendor
Introduction
The goal of this pipeline is to make the Banno Usage Data dataset (previously called the Mixpanel dataset) available in JH-managed Google BigQuery, where it can be queried through the SQL interface for business purposes and used to drive downstream business processes.
Scenarios
The Banno Usage Data dataset consists of two tables: events and people. Both tables hold multi-tenant data. Tenants are distinguished by a single key, a GUID that uniquely identifies a specific Financial Institution. Each row also includes the name of the Financial Institution.
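Because both tables are multi-tenant, most queries will filter on the Financial Institution GUID. The following is a minimal sketch of building such a query in Python, not the actual schema: the column name `institution_id`, the helper `build_tenant_query`, and the project and dataset names in the usage note are all hypothetical placeholders. Only the table names events and people come from this page. The GUID is passed as a named query parameter (`@institution_id`) rather than being concatenated into the SQL text.

```python
from dataclasses import dataclass

# The two tables named on this page.
ALLOWED_TABLES = {"events", "people"}


@dataclass(frozen=True)
class TenantQuery:
    """SQL text plus the named parameters to bind when the query is run."""
    sql: str
    params: dict


def build_tenant_query(project, dataset, table, institution_guid, limit=100):
    """Build a query restricted to one Financial Institution.

    ASSUMPTION: `institution_id` is a hypothetical column name for the GUID
    key; replace it with the real column from the dataset schema.
    """
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {table!r}")
    row_limit = int(limit)  # inline only a validated integer
    if row_limit < 0:
        raise ValueError("limit must be non-negative")
    sql = (
        f"SELECT * FROM `{project}.{dataset}.{table}` "
        "WHERE institution_id = @institution_id "
        f"LIMIT {row_limit}"
    )
    return TenantQuery(sql=sql, params={"institution_id": institution_guid})
```

For example, `build_tenant_query("my-project", "banno_usage", "events", guid)` returns the SQL text and a parameter dictionary that can then be bound by whichever BigQuery client is in use.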
Pipelines
The pipeline consists of two Google Data Transfer jobs. Once created, the pipeline performs the initial load and the subsequent incremental loads of data into the two tables in real time. JH provides the empty dataset and grants the DAA service account permissions on it. DAA exposes a REST API that is used to create and manage the pipeline. The REST API is authenticated with a project key provided to the JH leadership team.
Details of the pipeline design and architecture, BigQuery objects, and service account details are provided here.
Development
The REST API used to create the Mixpanel BigQuery real-time export pipeline is documented in detail here.
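As a rough illustration of what a pipeline-creation call could look like, the sketch below builds (but does not send) an authenticated POST request using only the Python standard library. Everything specific in it is an assumption: the helper name `build_create_pipeline_request`, the payload field names (`bq_project`, `bq_dataset`, `tables`), and the use of HTTP Basic authentication with the project key as the username are hypothetical. The real endpoint URL, payload, and authentication scheme are those given in the linked API documentation.

```python
import base64
import json
import urllib.request


def build_create_pipeline_request(api_url, project_key, bq_project, bq_dataset):
    """Build a POST request that asks the export API to create the pipeline.

    ASSUMPTIONS: Basic auth with the project key as username, and the payload
    field names below, are hypothetical; check the API documentation.
    """
    # HTTP Basic auth: base64("username:password"), here with an empty password.
    token = base64.b64encode(f"{project_key}:".encode("utf-8")).decode("ascii")
    payload = {
        "bq_project": bq_project,        # hypothetical field name
        "bq_dataset": bq_dataset,        # hypothetical field name
        "tables": ["events", "people"],  # the two tables described on this page
    }
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )
```

The returned object can be inspected before use and, once the real URL and fields are filled in, sent with `urllib.request.urlopen(request)`. Keeping request construction separate from sending makes it easier to verify the payload without touching the live pipeline.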