The yellow output ranges you see are created from two white input cells by three formula cells. This is the power of arrays.
Purpose
In one of my deeper dives into family history I discovered a man named Henry Putnam, my sixth great-grandfather, who was killed in the Battle of Lexington. He inspired me to join the Sons of the American Revolution (SAR). Since then I've discovered several more patriot ancestors.
Like other lineage societies, SAR membership requires proof of the ancestor's participation in the signature event, and of the direct bloodline to the applicant. After completing a few of these applications, I thought to myself, "I could do this in Excel," recalling my work on the Family History Timeline almost a year ago. I was in fact able to reuse much of it.
As always, the reason I refine and share my personal projects with you, my readers, is to illustrate the process of conceptualizing, designing and constructing elegant Excel applications with practical examples. This project is a great showcase of what can be done with dynamic array formulas to construct any desired output.
Agenda
Today's topic is the new output report itself. As with any query-based application, the prerequisite is a source data table with a "key" for unique record identification. In the Timeline project I identified duplicate names and gave them each a numeric suffix based on birth year order. As it turns out, the resulting names were almost unique, not quite. Lessons learned, the final process to uniquify the records and other internal processes will be the subject of future posts.
Source Data
The data are downloaded from Family Tree Maker (FTM), including Name, Sex, Birth Date, Birth Place, Marriage Date, Marriage Place, Death Date, Death Place, Father name, Mother name, Spouse name. All fields require transformation to uniquify or format appropriately for lookup and output. Some of these transformations are covered in the Timeline Part 1, with more detail to come soon.
Ancestor Table
This table is the subject of the Timeline Part 3, also with more detail coming soon. Since it is integral to the Bloodline Report, I'll briefly recap it now.
The clever trick of this table is to assign a binary id to the father and mother of every generation, then convert it to F's and M's to represent every possible bloodline through fathers and mothers for as many generations as the table contains. Every name in the table is a lookup of the father or mother of each member of the previous generation.
The table now contains 13 generations starting from the self.
Bloodline Report
The Bloodline Report is a self-contained sheet with the following features:
Two input cells for descendant and ancestor, and several user option controls
A short list of the names connecting the descendant and the ancestor
A full report of the vital facts of each generation in the lineage
The sheet background fill is light blue, outputs are conditionally formatted light gold, and only input cells are white. Static and conditional formatting are designed to provide an intuitive and aesthetically pleasant experience.
User Inputs and Controls
There are two input cells, one for descendant and one for ancestor. Both use Data Validation to create a drop-down picklist for selection. The picklists ensure correct spelling and simplify the entry; mouse or Alt+↓ to open and scroll the list. The descendant picklist comes from FTM[UName]. As a reminder, Data Validation doesn't accept table notation, so a range name is defined as a proxy. That is, the name source_ref refers to FTM[UName], and the DV list source is =source_ref.
Once the descendant name is selected, the ancestor picklist is calculated only to include that person's ancestors. This comes from an array formula, which DV is also unfriendly to, which is therefore stored on the Ancestors sheet in a hidden column off to the right of the table, referenced in DV as name anc_list referring to =Ancestors!$N$4#. The array formula incorporates the sort and filter options to customize the picklist to the user selections.
Sort is by generation or by name. If by generation, a 2-digit generation number is prefixed to the names in the picklist as the primary sort; within each generation, the sort is according to the Ancestors table formulas, paternal side of the previous generation followed by the maternal side. Sort by name removes the generation number and is sorted by LAST, First.
Filter options are by sex; if neither box is checked or if both boxes are checked, all individuals are included. If just one box is checked, the list is filtered for that sex.
Ancestor picklist formula:
2 lname the list of known (non-blank) ancestor names
3 gprfx the generation prefix
4 lwgen the same list as 2 with generation prefix
5 lmale list of male ancestors
6 IndSort the sort indicator (linked cell IndSort), 1=gen, 2=name
7 * used as "AND" operator to combine criteria
9 lfeml list of female ancestors, as above
13 optio combines three option cells into an ordinal selection (see below)
14 (output) CHOOSE picks which array to output depending on options selected
Line 13 shows a way to combine the values from user controls. Here, IndSort (radio buttons) has value 1 or 2, FlagMale and FlagFem (checkboxes) have boolean values TRUE or FALSE. Multiplication by 1 converts TRUE/FALSE to 1/0; concatenation of the two values yields {"00","01","10","11"}; BIN2DEC yields {0,1,2,3}; + 1 yields {1,2,3,4}. Adding 0 if IndSort=1 and 4 if IndSort=2 yields 1 to 8. Thus, this single formula incorporates all eight (2 states each of 3 options =2^3) possible option combinations.
Here are some examples of the picklist with various options:
Sort by gen
Sort by name
Filter by males
Short-Form Bloodline
To the right and below the ancestor selection cell are a short form of the bloodline, showing names on the left and the father/mother chain on the right. The chain cell is one formula and the two columns below are another.
These formulas are written in module b for bloodline. This keeps the main user sheet clean and simple, and the formulas safe from fat fingers.
Ancestor chain formula:
The chain formula AncChain in cell anc_chain (single cell to the right of the Ancestor pick cell) is a lookup of the Ancestors table, matching on the ancestor name, which is cell anc_name, stripped of the generation prefix.
The AncLine formula in cell anc_line allows for the option to include generation 1 or not. Generation 1 is the selected descendant, normally the self. This formula builds an array with two columns and a dynamic number of rows.
Ancestor line formula:
72 Gen2Up stores the left x characters of the chain, from 1 to all
73 Gen1Up stacks ref "R" atop the chain, if FlagGen1
74 AncDir looks up direct ancestor names from Ancestors table
77 chooses return field depending on FlagProper
79 output stacks names and chains; if none, returns message with blank chain
Here are some examples:
Without generation 1
With generation 1
No bloodline
Full Bloodline Report
The full bloodline report is designed to imitate the SAR application form. This does not enable mass copy/paste but prepares the data efficiently and helps prevent errors. It has eight columns, with five rows per generation, all defined by one formula written in module b. As seen by the green font below, it pulls data from the Ancestors table and the FTM table.
Bloodline formula:
2 ixdi Ancestor table index of chain from each row of the short line above
4 ixsp spouse's index
5 ixar 5 rows per generation; 2 rows husband index, 3 rows wife index
6 irow ordinal index of the report rows
7 gene Ancestor table generation number
8 rela Ancestor table relationship
9 name Ancestor table name
10 prop Ancestor table proper name (First Middle Last)
11 mods mod of irow (see 6)
12 fact repeating sequence of fact labels
13 surn surname
14 part M or F from chain
15 rsex FTM table sex of reference descendant name
17 soda sons and daughters
18 sdof array of soda
23 numb FTM table position of name
24 bidt FTM table birth date
25 bipl FTM table birth place
26 madt FTM table marriage date
27 mapl FTM table marriage place
28 dedt FTM table death date
29 depl FTM table death place
30 date assign dates to b, d, m fact rows
31 plac assign places to b, d, m fact rows
32 output stack eight columns (see 5, 7, 8, 18, 9 or 10, 12, 30, 31)
Conditional Formatting
Conditional Formatting is designed to highlight relevant information and hide irrelevant information (i.e. duplicates). Highlighting is by fill color, hiding is by number format ";;;". It also ensures that the output regions are properly outlined.
Stay tuned for more details in the coming weeks!
Comments