Skip to main content
/_layouts/images/titlegraphic.gif

Public

Go Search
Public
  
Public > Documentation > Demo File Loader  

Demo File Loader

.Demo file Loader, —Documentation

Build a Demo File

() Basic Nuts and Bolts

Four things are needed to do a demo build:

a) Source Data
b) Template
c) Ini file
d) The Demofile Loader

() Table Format for Source Data

The tables described below can be in dBase, .csv (with a header row) or Access(Tables and Queries) format.  Table references to dBase tables should include the ".dbf" extension.  Table references to CSV tables should include the ".csv" extension.  Table references for Access tables should be "FileName.mdb,TableName".  There is no limit to the number of fields in a CSV file, so it is a very good choice if you are going to use the Allocate engine to produce data from polygons that you are going to want to add into a .demo file.

() Type of data needed

For an application to use the Allocate engine it must build its application-specific data into a Demo File.  

Data must be in either the general format 

1) "Geography, Fieldname1, FieldName2, ..."
2) "Geography, FieldName, Values".  (a "Vertical Build"..this tends to be a bit more complicated in practice.  Refer to the SRC Data Team for more information if needed).

In both formats, Geography is one of those built into the Allocate dataset with which you are working (e.g. BlockGroup, Tract, zip, country, state, MSA, or DMA), and is a fixed length (e.g. IF building ZIP codes, ZIP code 1015 must be padded with a leading 0:  01015).

 FieldName# is a column of demographic information (aka a Variable or Var) for that geography (example: the population for the zip code).  However in the first example, the data is crosstabbed (each GeoID occurs only once) and in the second, there is a row for every GeoID/Var combination.  You specify which kind of table you have with arguments in the INI file.

You will almost always need tables of the same demographic information for each level of geography.  These are all built into the same demo file.  If you don't have a table for a geography, the Allocate engine will fail if asked to report a demographic variable at that level of geography.

This whole process is driven by the DemoKeys  file.  Any geography that is included in the DemoKeys file can be built into a demo file.  Each demographic data file requires that it has the exact same keys as the DemoKeys and that they be in sorted order.  Note that the DemoKeys is built by SRC, and the order of the geographies (Level0, Level1, etc as seen below) MUST be given to you by SRC.

Here's an example of the US geographies:

The following keys (char fields) are required for the tables of demographic information at each level of geography:

(example)
For the Demo File to Build Data the Geographies must be in the Following Order.

LevelNum Name Key columns (can be separate columns combined during load, but recommended as a single field)
0 BlockGroups County FIPs+Tract+BlockGroup codes (12 digits)
1 Census Tracts County FIPS+Tract (11 digits)
2 Counties County FIPs code (5 digits)
3 States State FIPs or normal 2-ltr state code. (2 digits)

() Template

To build a demo file you must create a table describing the fields, units, and summary methods for each demographic variable you want to build into the demo file.  We call this a Template

ALL FIELDS SHOULD BE SET TO TEXT FIELDS WITHIN ACCESS OR DBF FORMAT.  NUMERIC FIELDS CAN CAUSE PROBLEMS, notably the YEAR field.  Be sure that it's set as a Text field.

The structure of the table is: 

Column Type Description Level
LEVELX C Where X is the level # in the Demokeys file [Header] table.  This specifies the name of the table.  This works for any DemoKeysFile.  Note: If you choose you can have the BuildDemo.mdb contain links to each of the source tables or queries.  If you do this, you must proved a Select statement: "BuildDemo.mdb,select * from McLaganMSA".
FIELDNAME C Name of output field.  Also, column name checked in the input tables.

Notes:

  • FieldName must be all upper case, start with a letter, and cannot end with an underscore.
  • Length must be <=10 if you want to be able to export to .dbf format.  All the "official" .demo files distributed w/ the shrink-wrap product follow this.
  • Length must be <=59 for the field to be used via the Allocate ODBC driver (ODBC buffers are 64 - 1 null terminator - 4 for leading "IDX_" = 59).  This is not enforced during load.
ALT_NAME1 C Alternate column name checked in input tables.  Optional (but column must exist).
ALT_NAME2 C Alternate column name checked in input tables.  Optional (but column must exist).
FORMULA C Formula based on FIELDNAMEs previously described in the database table.  The formula is stored directly in the demo file and calculated on the fly when used in a report.  The purpose is to provide something that acts like the other demographic fields and only take a small amount of storage.  Ex: "TOT_POP_PJ/CSQ_MILES".  If this is filled in, the Table name fields and Alt column name fields must be blank.   The column is required, contents are optional.  Again, note that if the formula is not proprietary, consider using the Virtual Variable File (vvf) for more flexibility.
CATEGORY C Top-Level grouping.  Ex. "Population".

Note:  Length of this field + length of the internal demo file name must be <=62.  Note this is NOT checked during the .demo file build.  The problem w/ a longer name is that the ODBC driver will not be able to access the table DemoFileInternalName_Category (however you can still get to the data via the DemoFileInternalName_All or Theme_All tables). Required

VINTAGE C Next-level grouping.  Specifies the source or style of the data.   Ex. "5 Year Projections", "Census".  Required.
YEAR C Year of data vintage.  Usually displayed in parens after Vintage.   Ex. "2003".  Note:  Must be 4-digit (the builder assumes this and your data will appear incorrectly in Allocate if you use anything else).  Required
SUBCAT C Further categorization.  Ex. "Male By Age".   Optional.  If categorizations in this field are separated by a colons (:), it will create a subfolder for each colon.  Ex: "Age:Female"  Thus, this field can be used to further nest variables within the Allocate tree where needed for organization.
DESC C Description of item displayed at the end of the above categorizations.  This is what the user ultimately selects.   Ex. "Age 25 - 34".  Required
UNITS** C The units of the value being stored.  Determine the precision of the storage and precision and format of output.
Units are defined as follows (Size and Format fields can be disregarded): Required
Units Desc Decimals MaxVal and Format Size (2^n) Format
Dollars Dollars. 0 $4294967296 32 $%.0f
Households Number of households. 0 4294967296 32 %.0f
Persons Fractional number of people (eg 2.43 children).  Two digits of precision. 2 42949672.96 32 %.2f
Population Number of people. 0 4294967296 32 %.0f
Percent Percentage.  Two digits of precision. 2 42949672.96% 32 %.2f%%
Index Indexed value.  Two digits of precision. 2 42949672.96 32 %.2f
Integer Integer. 0 4294967296 32 %.0f
LatLong Latitude or longitude.  Six digits of precision. 6 4294.969999 32 %.6f
Round 10 Integer rounded to the nearest 10.  Values up to 20 billion. -1 42949672960 32 %.0f
Round 100 Integer rounded to the nearest 100.  Values up to 200 billion. -2 429496729600 32 %.0f
Round 1,000 Integer rounded to the nearest 1,000.  Values up to 2 trillion. -3 4294967296000 32 %.0f
Round 10,000 Integer rounded to the nearest 10,000.  Values up to 20 trillion. -4 42949672960000 32 %.0f
Physical2 Square miles.  Two units of precision. 2 42949672.96 32 %.2f
Dollars Round 1,000 Dollars rounded to the closest $1,000. -3 $4294967296000 32 $%.0f
Integer64 Integers up to 2^64. 0 18446744073709600000 64 %.0f
String: MUST HAVE THE WEIGHT=COUNT! String values. 0 Hello world   %s
Dollars64 Dollar values up to 2^64. 0 $18446744073709600000.00 64 $%.0f
Fixed2_64 Numbers up to 2^64 / 100.  Two digits of precision. 2 184467440737096000.00 64 %.2f
Fixed4_64 Numbers up to 2^64 / 10,000.  Four digits of precision. 4 1844674407370960.0000 64 %.4f
Fixed6_64 Numbers up to 2^64 / 1,000,000.  Six digits of precision. 6 18446744073709.600000 64 %.6f
Dollars2_64 Dollar values up to 2^64/100.  Two digits of precision. 2 $184467440737096000.00 64 $%.2f
Fixed1_32 Numbers up to 2^32 / 10.  One digits of precision. 1 429496729.6 32 %.1f
Fixed2_32 Numbers up to 2^32 / 100.  Two digits of precision. 2 42949672.96 32 %.2f
Fixed3_32 Numbers up to 2^32 / 1000.  Three digits of precision. 3 4294967.300 32 %.3f
Fixed4_32 Numbers up to 2^32 / 10000.  Four digits of precision. 4 429496.7300 32 %.4f
Fixed5_32 Numbers up to 2^32 / 100000.  Five digits of precision. 5 42949.67000 32 %.5f
Fixed6_32 Numbers up to 2^32 / 1000000.  Six digits of precision. 6 4294.970000 32 %.6f
Pounds United Kingdom Pounds up to £2^32. 0 4,294,967,296.00 32 £%.0f
Pounds Round 1,000 United Kingdom Pounds rounded to the closest £1,000. -3 4,294,967,296,000.00 32 £%.0f
Pounds64 United Kingdom Pounds up to £2^64. 0 18,446,744,073,709,600,000.00 64 £%.0f
Pounds2_64 United Kingdom Pounds up to £2^64/100.  Two digits of precision. 2 184,467,440,737,096,000.00 64 £%.2f
Euros Euro up to €2^32. 0 4,294,967,296.00 32 €%.0f
Euros Round 1,000 Euro rounded to the closest €1,000. -3 4,294,967,296,000.00 32 €%.0f
Euros64 Euro up to €2^64. 0 18,446,744,073,709,600,000.00 64 €%.0f
Euros2_64 Euro up to €2^64/100.  Two digits of precision. 2 184,467,440,737,096,000.00 64 €%.2f

 

 

 

ACTION C How to summarize the values when a report is grouping multiple geographies.  Required
"Action" Description Example
"Sum" Values for each geography are summed.  Use this when the value is an absolute number.

A test that should fail for values to be summed: "can the value for an individual geography be compared to the US-wide number".

"Total Population" should be summed since it is an absolute value.

In odd cases this could be a percentage: "state's percent of GNP").

"Average" Values for each geography are averaged.  The averaging is weighted based on the type in the Weight column.  Use this when the value is a relative number.

A test that should succeed for values to be averaged: "can the value for an individual geography be compared to the US-wide number".

Demo vars that are percentages, indexed, averages, or (as a fudge) medians (you can't really average median values).
"Min" The minimum value is returned when geographies are summarized. We haven't used this.
"Max" The maximum value is returned when geographies are summarized. We haven't used this.
WEIGHT C Determines how to weight values that are being summarized.  Required. If the Unit is Household or Population, then it's usually (always) the same for the Weight.

This is used for reports that summarize using the Action "Average".

It's also used for variables using the Action "Sum" when the summation is for a custom polygon or radius (note: this type of summation requires that there is BlockGroup data in the demo file).   When the polygon or radius cuts across a BlockGroup Allocate determines the Block centroids that fall within the polygon or radius and weights the BlockGroup''s demographic value based on the percent of the Population/NumHouseholds for the BlockGroup that are in that Block.  (The Population and NumHousehold data for Blocks is part of Allocate from 2000 census figures.)

U.S. NOTE: Note that due to lack of data at the block level, and variables with a weighting assignment of BusEmpCount or BusEstCount will default to "Count".

THESE WEIGHTS MUST RELATE TO VARIABLES IN THE "BASE" demographic file.  See below for further clarification.

 

"Weight" Description Example
"Population" Weight based on population. PerCapitaIncome, "Units" are "Population" and Action is "Average".  We want to weight based on the population in the Block.
"Households" Weight based on Households.  This is used for all consumer expenditure data even when this data may seem based on Population (like "Spending on shoes"). NumHousingUnitsVacant, "Units" are "Households" and Action is "Sum".  We want to weight based on the households in the Block.
"Count" No weighting is used. SquareMiles, Units are "Physical" and Action is "Sum".  There should be no special weighting of this number for the BlockGroups that are included in the polygon or radius, just treat them equally as one of the X Blocks in the BlockGroup.
"BusEstCount" Weight based on Business Establishments Number of Establishments, 100,000 Employess and greater
BusEmpCount Weight based on Employee Count Employees by Occupation: Clerical

Tips

  • Absolute values are generally preferable to percentages.  However, if the percentages have "Units" of "Population" or "Households" they can be accurately averaged in reports.  
  • Don't include values that are simply calculated from other columns.  You can achieve these calculations by adding a Formula to the Virtual Variable File (vvf) or be specifying the formula in a report. Unless there are issues of privacy/copyright, the Virtual Variable File is recommended for any formulas that can be derived from the values in the Demo file. 

Example of a Demo File (Source) Template For DemoBuilds

Level1 Level2 Level3 FIELDNAME FORMULA CATEGORY VINTAGE YEAR SUBCAT DESCRIPTION ACTION WEIGHT UNITS ALT_NAME1 ALT_NAME2
BGCEX.DBF TRCEX.DBF COCEX.DBF STCEX.DBF XCYHF1   Consumer Expenditures (Ave Hhld Exp) Current Year Estimates 2000   Household Textiles Average Households Dollars    
BGCEX.DBF TRCEX.DBF COCEX.DBF STCEX.DBF XPYFB1   Consumer Expenditures (Ave Hhld Exp) Five Year Projections 2005   Food At Home Average Households Dollars    
BGCEN2.DBF TRCEN2.DBF COCEN2.DBF STCEN2.DBF HOU90RENT   Housing Units US Census 1990   Renter-Occupied Sum Households Households    
BGDEMPY1.DBF TRDEMPY1.DBF CODEMPY1.DBF STDEMPY1.DBF DWLPYRENT   Housing Units Five Year Projections 2005   Renter-Occupied Sum Households Households    
BGDEMCY2.DBF TRDEMCY2.DBF CODEMCY2.DBF STDEMCY2.DBF DWLCYRENT   Housing Units Current Year Estimates 2000   Renter-Occupied Sum Households Households    

 

 

() INI file for loader

Before running the loader you must create a  BuildDemoFile.INI file that points to all file locations, sets key field names, and sets a few options.

The ini file keys (e.g., tablePath=.\, etc.) are:

Key Description
tablePath Location of source databases referenced in the definition table.   You can include multiple paths delimited with ";".  Be sure paths end with "\"
fieldInfoFile Location and name of the definition table.  For Access this is the pathname for the .mdb, comma, the name of the table.
outputFileName Location and name of the output demo file (.demo).
demokeysFileName Location and name of the DemoKeys file that contains lists of all the known geographies.  The demo file builder uses these lists to query the source tables for the specific geographies.
CompressVersion2 T/F. Use the newer fancier MUCH better compression.  Builds run 1/2 the speed.  File will only be compatible with version 3.5 and above.  This should generally be on for larger (>100MB) files.  Recommend that all full builds be set to True
  The following three parameters are used for testing/diagnostic purposes only.
isTest T/F.  When True will only load the first 64 items for each level of geography (which is all states).  The demo file that's built is completely useable and can be tested from Allocate, and log files can be reviewed for build errors.  Test runs and log file reviews should be considered mandatory prior to running a full build.
numTestRecords The number of records at each level of geo. to test.  Works with isTest.  Default=64.
skipToLevel Another diagnostic.   Skips loading any data until it gets to the specified level from the demoKeys file.  Not used often.
  FOR TESTING PURPOSES, the most common settings are commenting OUT the CompressVersion2, and comment IN isTest.  numTestRecords and skipToLevel are rarely used.
LevelIDNameX where X is the level # in the DemoKeys [Header] table.   The field name (or names) to look for the ID field for the geographic level.  Note: All field names can be combined with "+" and comma seperated to look for more than one column name.
  Three column source data tables ("Vertical Build") are not the norm.  The following two parameters must be included when building a three column-based demo build.
VerticalFieldIdent If using three column source data tables, this specifies the name of the column that holds the fieldnames referred to in the template. (Note: Because of a bug in the BuildDemoFile program, you must terminate a "three column" input CSV file with an invalid, dummy, record to force processing of the final valid data group.  pct, 2001/03/02)
ValueFieldIdent If using three column source data tables, this specifies the name of the column that holds the values for the corresponding fieldname/geoID combination.
   
username SRC supplied username for authentication.
password SRC supplied password for authentication.
   
vendorDesc Internal name of demo file.  Can be used to dot-specify variables on reports.  This is necessary to disambiguate variable names when more than one demo file contains the same variable name (try to avoid that!).  If there is a space in the name it will be converted to a "_".  Do NOT include any commas here (the file will build but Allocate will not be able to build or use an index base b/c the comma will be incorrectly parsed).  Note:  The length of this name + length of the Category must be less than 59 characters, so keep this concise
vendorCopyright Output in reports.  
dataVersion Internal use only.  Don't believe this is currently used anywhere a user might see.
  The following 4 parameters are intended for use with "Primary" or "Base" Demographic files.  Do not use during typical builds.
PopField The name of the demo field assigned in the definition table for total population.  Required for Primary Demo files.  Optional for supplemental files.
HHField The name of the demo field assigned in the definition table for total households.  Required for Primary Demo files.  Optional for supplemental files.
BUSESTCOUNTFIELD
The name of the demo field assigned in the definition table for Business Establishments.  This is only needed if business variables are referencing (within the demo file) the BusEstCount variable for Weighting
BUSESTCOUNTFIELD The name of the demo field assigned in the definition table for Business Employees.  This is only needed if business variables are referencing (within the demo file) the BusEmpCount variable for Weighting
  Additional, optional parameters (not often used)
dependentDemoFilen= Where n is an integer from 1 to n.  If a formula in the demo file being built is referencing a fieldname in ANOTHER demo file, that referenced demo file must be listed here.  This is convenient when a build is too large for a single demo file, and there are numerous formulas being built in.  By "dot-specifying" the demo file within the formula field in the template, the demo file can be broken into multiple files that will reference each other seamlessly.  Consider solving the problem with a VVF instead.
   
TestOnly1Field Builds the specified variable only.  TestOnly1Field=TOTALPOP. Never Used
testFirst10 Builds the first 10 fieldnames in the demo file. testFirst10=True.  Never used
  Any additional parameters in the log file are not currently supported

(Note) All field names can be combined with "+" and comma separated to look for more than one column name.

Example file: buildDemoFile.ini

Note on BuildDemoFile.ini: 
  • Comments are semicolon;
  • Logical "OR" for fields of LevelID# is comma;
  • All files used for data must be sorted on geography key (the LevelIDName# column);

 

The Following is an example of a demo file ini and its corresponding geographies.

tablePath=\\bobcat\e\AGS\2003_01\ExtractedData\
fieldInfoFile=\\bobcat\e\AGS\2003_01\Base_Demographics\Template.mdb,Template_2000_Base
outputFileName=\\bobcat\e\AGS\2003_01\SRC_US_03A_AGS_Base_Demographics_03B.demo
demokeysFileName=\\Zappa\w\Data_Master_Copy\SRCData\SRC_US_03A\Base\SRC_US_03A_DemoKeys_B.src

;istest=True
CompressVersion2=True

;skiptolevel=7

LevelIdName0=key,id,blockgroup
LevelIdName1=key,id,tract
LevelIdName2=key,id,COUNTY
LevelIdName3=key,id,STATE


Username=mike 
password=#$%^

venderDesc=AGS_Base_Demographics
venderCopyright=(C) 2002 Applied Geographic Solutions, Inc. All Rights Reserved, SRC, LLC.
dataVersion=2002

HHField=HHDCY
PopField=POPCY
BUSESTCOUNTFIELD=BUSCYEST
BUSEMPCOUNTFIELD=BUSCYEMP

 

() Running loader

  • Ensure that you have a recent BuildDemoFile.exe.
  • Prepare your ini file with the correct paths and parameters.
  • Ensure the Template is ready.
  • Run the application.  It can be located anywhere that is accessible on your network, but run it from within the folder where the BuildDemoFile.ini file is located.  Run with the isTest parameter: isTest=T in the ini file for the first run and test the demo file in Allocate).
  • Look at the .log file created (for the Test run and the full run).
    1. The first section shows "valid/invalid" in finding the known Demo keys from the Demo Keys file (StateFIPS, Zip, etc) in your source tables.  The numbers are calculated for the total number of accesses, which is dependant on the builder's algorithms (in brief, the number of geographies within any given level * the number of variables being built.  For example, if you have 51 States, and 200 variables, number of total accesses would be 200 * 51 = 1050).  In test mode, failure of most or all zips is ok (since the first 64 zips are often the weird National Parks, which Allocate ignores).  Failure for a handful of MSA's may be ok if your data has some weird MSAs.  Failure for any States is bad b/c your list of states should match the known list of states 100% (i.e., why are you missing states).  Regardless, any Ratios other than 0 need to be understood and determined to be acceptable.
    2. The next section shows lists of "Good/Bad Fields".  Search the file for: 100% Bad or for a series of equal signs (e.g. "=====").  If any show up, these are fields you defined in the Definition Table that were not found in the actual source tables.   These probably represent misspellings in field names and must be corrected.

() Registering the demo file

Drag the demo file to the icon for Allocate to register the demo file (if you only have one of the applications installed you can double-click the demo file, the extension is registered to start the Allocate application).  You can also register the demo file from within the Allocate interface.

Once registered, the demo file appears as a folder in the list of variables.  The other top level folders include the Categories from other demo files already registered.

To unregister a demo file, either drag it to the application icon (or dbl-click it) or delete it (you'll be prompted to unregister it the next time the application is run), or unregister from within the Allocate interface.

Build up a Demo File for all geographies based on BG-only data - ADVANCED DATA BUILDING TECHNIQUES

You can build up a demo file to all levels of geography based on data at the BG level only.

  1. Within the Geography tab of Allocate, add a custom polygon file.
  2. If you are going to build all levels for your variables, load all the TAB files.  Otherwise, just load the ones you want.
  3. For each geographic level, create a CSV file to be used as input to the "Build demo file" process described above as follows:
    1. Select a single polygon file, making sure to pick the correct Name and Key fields.
    2. Select the variables you need from the "Variables" tab.  (Note: Most standard "demo" files are already created at all levels.)
    3. Select Output tab, Create Output File:  Enter path and filename for the CSV file to be created.
  4. Build the new demo file from the CSV files you created. 

A "better" way to do this, is to minimize block centroid retrieval (and commensurate rounding error), where possible, by building vgfs that are mapped to block groups directly. - ADVANCED DATA BUILDING TECHNIQUES

1) Build a BG-only level DemoFile.  Register in Allocate
2) Using VGF_Create_CMD, build a vgf that is a mapped vgf:  For example, list all the blockgroups for every county.  See VGF documentation for more info.  Register in Allocate.
3) Select this VGF as your output, and choose all the variables that were in the original BG-only build.  The result will be an extraction of these variables directly from the BG level data, but "rolled up" to the county level.  Summable variables are summed exactly, Averages are averaged (of course..).
4) Use this method for any relevant level (i.e. Tract, County, State..some would argue DMAs map to BG directly as well). 
5) If the DemoKeys also contains "non-standard" geographies, such as ZIPS, the data would have to be extracted via a polygon file and block centroid retrieval.  Thus, for many complete demo builds, a hybrid approach is required for completion.

Last modified at 7/23/2008 2:34 PM  by Nathalie Sharpe