The datasets created here represent typical data from banking systems.  Three
independent datasets are created, each belonging to a different organization.
The three organizations are "SAN", "JPM" and "PNB".

Each dataset holds 27 columns -- 26 data columns and one "target" column -- and
60,000 records.  The "target" is a boolean indicating whether the customer
represented by that row subsequently performed a specific action.

These datasets are built from the Santander Customer Transaction Prediction
data, originally part of https://www.kaggle.com/c/santander-customer-transaction-prediction/data.
This is a 200,000-record, 200-column dataset contained in a CSV file.  The
string ID_code column is dropped, 10% of the records are held out for testing
purposes, and the meaningless "var_X" column names are replaced with more
meaningful, although arbitrary, column names.
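The preparation steps above can be sketched with pandas. This is a minimal
sketch, not the actual build script: the "feature_X" replacement names, the
simple shuffled 90/10 split, and the tiny synthetic stand-in frame are all
assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def prepare(df: pd.DataFrame, seed: int = 0):
    """Drop ID_code, rename var_X columns, and hold out 10% for testing."""
    df = df.drop(columns=["ID_code"])
    # Replace the generic "var_X" names; "feature_X" is a placeholder here,
    # standing in for whatever arbitrary-but-meaningful names are used.
    df = df.rename(columns={c: c.replace("var_", "feature_") for c in df.columns})
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n_test = len(shuffled) // 10          # 10% held out for testing
    return shuffled.iloc[n_test:], shuffled.iloc[:n_test]

# Tiny synthetic stand-in for the real 200,000 x 200 Kaggle CSV.
raw = pd.DataFrame({"ID_code": [f"train_{i}" for i in range(100)],
                    "target": np.random.randint(0, 2, 100),
                    **{f"var_{j}": np.random.randn(100) for j in range(5)}})
train, test = prepare(raw)
```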

Ultimately this outputs the datasets:
 * "EXAMPLE - SAN Customer Database"
 * "EXAMPLE - JPM Customer Database"
 * "EXAMPLE - PNB Customer Database"

And an equivalent but much smaller group:
 * "TEST - SAN Customer Database"
 * "TEST - JPM Customer Database"
 * "TEST - PNB Customer Database"

Two additional testing files are also created at this time.  These are
stashed in Google Drive for download when needed:
 * test.csv (10% of the total data)
 * test_small.csv (just 50 records)
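The three-way organizational split and the export of the two testing files
might look like the following sketch. The even row-wise split across
organizations and the exact export logic are assumptions; only the
organization names, file names, and the 50-record size of test_small.csv
come from the text above.

```python
import pandas as pd

ORGS = ["SAN", "JPM", "PNB"]

def split_and_export(train: pd.DataFrame, test: pd.DataFrame, out_dir: str = "."):
    """Split training records evenly across the three organizations and
    write the two testing files."""
    n = len(train) // len(ORGS)
    shards = {org: train.iloc[i * n:(i + 1) * n].reset_index(drop=True)
              for i, org in enumerate(ORGS)}
    test.to_csv(f"{out_dir}/test.csv", index=False)            # 10% of the data
    test.head(50).to_csv(f"{out_dir}/test_small.csv", index=False)  # 50 records
    return shards

# Tiny stand-in frames; the real training set has 180,000 records.
train = pd.DataFrame({"feature_0": range(90), "target": [0, 1] * 45})
test = pd.DataFrame({"feature_0": range(60), "target": [1, 0] * 30})
shards = split_and_export(train, test)
```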

This dataset is used by the following examples:
 * Tabular_Data
 * Transfer_Learning
 * XGBoost
