# Multiple Correspondence Analysis: A Political Example

### MULTIPLE CORRESPONDENCE ANALYSIS

Correspondence analysis and multiple correspondence analysis are techniques from multivariate analysis. You can use the techniques to find clusters in a data set. Both are similar to principal component analysis in that the analysis attempts to reduce the dimension (the number of columns or rows) of a set of intercorrelated variables, so that a smaller set of variables explains most of the variation in the original variables. However, correspondence and multiple correspondence analysis are for categorical variables rather than the numerical variables of principal component analysis. Correspondence analysis was developed by Jean-Paul Benzécri and measures similarities of patterns in contingency tables.

#### The Mathematics Behind Correspondence Analysis

Correspondence analysis is used in the analysis of just two categorical variables. In correspondence analysis, the reduced variables are found by applying singular value decomposition to a transformation of the contingency table created from the two original variables. The transformation replaces the value in each cell of the contingency table by the original value minus the product of the row total and the column total divided by the overall total, with the difference divided by the square root of the product of the row total and the column total. The resulting cells then contain the signed square roots of the terms used in the calculation of the chi square test for independence for a contingency table, divided by the square root of the overall total.
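The transformation described above can be sketched in a few lines of Python. This is only an illustration: the 2 by 3 table of counts is made up, and the function names are my own. The sketch also checks the stated relationship to the chi-square statistic, namely that the squared cells sum to the chi-square statistic divided by the overall total.

```python
import math

def standardized_residuals(table):
    """Return (n_ij - r_i * c_j / n) / sqrt(r_i * c_j) for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [
        [(cell - r * c / n) / math.sqrt(r * c) for cell, c in zip(row, col_totals)]
        for row, r in zip(table, row_totals)
    ]

def chi_square(table):
    """Pearson chi-square statistic for independence in a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (cell - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(table, row_totals)
        for cell, c in zip(row, col_totals)
    )

# Hypothetical counts, used only to exercise the functions.
counts = [[20, 10, 5],
          [10, 15, 20]]
n_total = sum(sum(row) for row in counts)

residuals = standardized_residuals(counts)
sum_of_squares = sum(x * x for row in residuals for x in row)

print(sum_of_squares, chi_square(counts) / n_total)
```

The singular value decomposition would then be applied to `residuals`, for example with a library routine such as `numpy.linalg.svd`; that step is left out here to keep the sketch free of dependencies.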

#### The Mathematics Behind Multiple Correspondence Analysis

For multiple correspondence analysis, more than two categorical variables are reduced. The reduced set of variables is found by applying correspondence analysis to one of two matrices. The first is a matrix with observations in the rows and indicator variables in the columns, where an indicator variable takes the value one if the observation has the quality measured by the variable and zero if it does not. For example, say there are three variables, ‘gender’, ‘hair color’, and ‘skin tone’. Say that the categories for gender are ‘female’, ‘male’, and ‘prefer not to answer’; for hair color, ‘red’, ‘blond’, ‘brown’, and ‘black’; and for skin tone, ‘light’, ‘medium’, and ‘dark’. Then there would be three columns associated with gender, four with hair color, and three with skin tone. For a given person, exactly one column within each group would contain a one and the others would contain zeros.
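As a rough sketch of the indicator matrix for the example above, the Python below builds the ten columns and fills in one row per person. The three people are hypothetical, and the function name `indicator_matrix` is my own.

```python
def indicator_matrix(observations, categories):
    """Build a 0/1 matrix with one column per (variable, level) pair."""
    columns = [(var, level) for var, levels in categories.items() for level in levels]
    rows = [
        [1 if person[var] == level else 0 for var, level in columns]
        for person in observations
    ]
    return columns, rows

# Levels taken from the example in the text.
categories = {
    "gender": ["female", "male", "prefer not to answer"],
    "hair color": ["red", "blond", "brown", "black"],
    "skin tone": ["light", "medium", "dark"],
}

# Hypothetical people, for illustration only.
people = [
    {"gender": "female", "hair color": "brown", "skin tone": "light"},
    {"gender": "male", "hair color": "black", "skin tone": "dark"},
    {"gender": "prefer not to answer", "hair color": "red", "skin tone": "medium"},
]

columns, rows = indicator_matrix(people, categories)
for row in rows:
    print(row)
```

Each row contains exactly three ones, one for each of the three variables.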

The second type of matrix is a Burt table. A Burt table is a sort of multiple contingency table. The contingency tables between the variables make up the blocks of the matrix. From the example above, the first block is the contingency table of gender by gender, which is a diagonal matrix with the counts of females, males, and those who did not want to answer on the diagonal. The second block, going horizontally, is the contingency table of gender by hair color. The third block, going horizontally, is the contingency table of gender by skin tone. The second block, going vertically, is the contingency table of hair color by gender. The rest of the blocks are found similarly.
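One way to see the block structure is to compute the Burt table as the cross product of the indicator matrix with itself (the transpose of the indicator matrix times the indicator matrix). The sketch below does this in Python for five hypothetical people, using the levels from the example.

```python
# Levels from the example in the text; the five people are hypothetical.
levels = {
    "gender": ["female", "male", "prefer not to answer"],
    "hair color": ["red", "blond", "brown", "black"],
    "skin tone": ["light", "medium", "dark"],
}
people = [
    ("female", "brown", "light"),
    ("male", "black", "dark"),
    ("female", "red", "light"),
    ("male", "brown", "medium"),
    ("prefer not to answer", "blond", "medium"),
]

variables = list(levels)
columns = [(var, lev) for var, levs in levels.items() for lev in levs]

# Indicator matrix: one row per person, one column per (variable, level).
indicator = [
    [1 if person[variables.index(var)] == lev else 0 for var, lev in columns]
    for person in people
]

# Burt table: entry (i, j) counts the people who have both attribute i
# and attribute j.
size = len(columns)
burt = [
    [sum(row[i] * row[j] for row in indicator) for j in range(size)]
    for i in range(size)
]

for label, row in zip(columns, burt):
    print(label, row)
```

The upper left 3 by 3 block is the diagonal gender by gender table, and the table as a whole is symmetric.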

#### Plotting for Clustering

Once the singular value decomposition is done and the reduced variables are found, the variables are usually plotted to look for clustering of attributes. (For the above example, some of the attributes are brown hair, male, red hair, and light skin tone, each of which would be one point on the plot.) Usually just the first two dimensions of the reduced matrix are plotted, though more dimensions can be plotted. The dimensions are ordered by how much of the variation in the input matrix each dimension explains, so the first two dimensions are the dimensions that explain the most variation. With correspondence analysis, the reduced dimensions are with respect to the contingency table. Both the attributes of the rows and the attributes of the columns are plotted on the same plot. For multiple correspondence analysis, the reduced dimensions are with respect to the matrix used in the calculation. If one uses the indicator variable matrix, one can plot just the attributes for the columns, or one can also plot labels for the rows on the same plot (or plot just the labels for the rows). If one uses the Burt table, one can only plot attributes for the columns. For multiple correspondence analysis, the plots for the columns are the same by either method, though the scaling may be different.

#### Interpreting the Plot

The interpretation of the row and column variables in correspondence analysis is done separately. Row variables are compared as to level down columns and column variables are compared as to level across rows. However, if a row variable is near a column variable in the plot, then both are represented at similar relative levels in the contingency table.

The interpretation of the relationship between the variables in the indicator or Burt table is a bit subtle. Within an attribute, the levels of the attribute are compared to each other on the plot. Between attributes, points that are close together are seen at similar levels.

#### An Example Using the Deficit by Political Party Data Set

Below is a multiple correspondence plot for which only the column reduction is plotted. We look at the size of the deficit (-) / surplus (+) using the political affiliation of the President and the controlling parties of the Senate and the House of Representatives. The size of the on-budget deficit (-) / surplus (+) as a percentage of gross domestic product for the years 1947 to 2008 was classified into four classes. The largest deficit over the years was -6.04 percent and the largest surplus was 4.11 percent. The class breaks were -6.2, -4, -2, 0, and 4.2, which gives four classes. (There was only one observation with a surplus greater than 2, so years of surplus were all classed together.) Each of the President, Senate, and House variables was classified into two classes, either Democrat or Republican. Since the budget is created in the year before the budget is in effect, the deficit (-) / surplus (+) percentages are associated with the political parties in power the year before the end of the budget year.
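The classing of the deficit percentages can be sketched as follows. This is a minimal Python illustration of cutting a numeric value into the four right-closed classes given by the breaks above; the function name `deficit_class` is my own, and only the two extreme values (-6.04 and 4.11) come from the data.

```python
from bisect import bisect_left

# Class breaks from the text; each class is open on the left, closed on the right.
BREAKS = [-6.2, -4, -2, 0, 4.2]
LABELS = [f"({lo},{hi}]" for lo, hi in zip(BREAKS, BREAKS[1:])]

def deficit_class(percent_of_gdp):
    """Return the label of the class that contains the given percentage."""
    i = bisect_left(BREAKS, percent_of_gdp)
    if i == 0 or i == len(BREAKS):
        raise ValueError(f"{percent_of_gdp} is outside the class breaks")
    return LABELS[i - 1]

print(LABELS)
print(deficit_class(-6.04))  # largest deficit in the data
print(deficit_class(4.11))   # largest surplus in the data
```

A value that falls exactly on a break, such as 0, goes to the class below it, which matches the (-2,0] notation.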

The first principal axis appears to measure distance between the relative counts for the parties of the Senate and House. On the plot, the greatest distances on the first principal axis are between the Republican and Democratic Senates and the Republican and Democratic Houses. Looking at the tables below, the lowest counts were for the Republican Senates and Houses, which means the highest counts were for the Democratic Senates and Houses. Among the deficit classes, class (0,4.2] is the smallest class in the table and, like the Republican Senates and Houses, is to the left on the plot. Still, there is not much difference between the deficit classes on the first principal axis (or between the parties of the Presidents).

The second principal axis appears to show how the differing levels of deficit (or surplus) are associated with the political parties of the Presidents, Senates, and Houses. Looking at the Senates and Houses, there is not a lot of difference between the Democrats and the Republicans on the second principal axis; both cluster around the deficit class (-4,-2]. For the Presidents, however, there is a great difference. Democratic Presidents are near the deficit classes (-2,0] and (0,4.2] on the second principal axis, while Republican Presidents are between the deficit classes (-4,-2] and (-6.2,-4].

Below are two classifications of the data.

| (-6.2,-4] | (-4,-2] | (-2,0] | (0,4.2] |
|---|---|---|---|
| 11 | 22 | 21 | 8 |

| Party | President | Senate | House |
|---|---|---|---|
| Democrat | 27 | 42 | 48 |
| Republican | 35 | 20 | 14 |

In this example, we have treated the deficit, which is a numeric variable, as a categorical variable. In the post before this post, an example of principal component analysis, the same data is analyzed, with all of the variables treated as numeric rather than categorical.

# CHERNOFF FACES IN R

Chernoff faces is a technique from multivariate analysis. The technique was developed by Herman Chernoff and was presented in a paper in 1973 in the Journal of the American Statistical Association. Chernoff faces provide an intuitive way of looking at variation within a data set where the data is in a matrix of rows and columns (for example, a two-way contingency table). A different face is created from each row in the data set. The differences between the faces are based on the columns of the data set. Each column is associated with one part of a facial expression (the first column is associated with the height of the face, the second with the width of the face, etc.). Chernoff designed the method for up to 18 facial components. The faces() function in the TeachingDemos package of R uses 15 facial components, which were used here.

### Components of the Faces

From the help page for the function faces(), the fifteen components in R are the height of the face, width of the face, shape of the face, height of the mouth, width of the mouth, curve of the smile, height of the eyes, width of the eyes, height of the hair, width of the hair, styling of the hair, height of the nose, width of the nose, width of the ears, and height of the ears. If the number of columns in the matrix is less than 15, the function will cycle back through the columns until 15 columns are used. One way around the problem is to put a constant value in the excess columns. In what is presented here, cycling was done.
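To make the cycling concrete, here is a small Python sketch that assigns the fifteen components, in the order listed above, to the columns of a six-column matrix by wrapping around. It only mimics the cycling as described and is not the code inside faces(); the six column names are the facial expression classes used in the plots in this post.

```python
# The fifteen components, in the order given on the faces() help page.
components = [
    "height of the face", "width of the face", "shape of the face",
    "height of the mouth", "width of the mouth", "curve of the smile",
    "height of the eyes", "width of the eyes", "height of the hair",
    "width of the hair", "styling of the hair", "height of the nose",
    "width of the nose", "width of the ears", "height of the ears",
]

# Six data columns, so the components wrap around after the sixth.
data_columns = [
    "open smiles", "closed smiles", "open neutrals",
    "closed neutrals", "open frowns", "closed frowns",
]

# Component k (counting from zero) is driven by column k modulo six.
assignment = {
    component: data_columns[k % len(data_columns)]
    for k, component in enumerate(components)
}

for column in data_columns:
    driven = [c for c, col in assignment.items() if col == column]
    print(f"{column}: {', '.join(driven)}")
```

With six columns, the first three columns each control three components and the last three each control two.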

### Description of the Data

The Chernoff faces below were generated using the function faces() in the TeachingDemos package of R and are plots of the differences between facial expressions in the Astrofaces Data Set. The Astrofaces Data Set was created from the pictures at the website http://www.astrofaces.com, where an astrological group has gathered photographs from over 4700 persons, along with the placement of the Sun, Moon, and Ascendant for each person. In the spring of 2002, I went through the photographs and classed the faces as to whether the face had an open smile, a closed smile, an open-mouthed neutral expression, a closed-mouthed neutral expression, an open frown, or a closed frown. For each person, I recorded the expression and the elements of the Sun, Moon, and Ascendant (the elements in astrology are air, earth, fire, and water). There were 2015 photographs in the data set at the time.

I have used a variety of techniques over the years to try to find a relationship between the elements of the Sun, Moon, and Ascendant and the expressions in the photographs, with little to show for my effort. So far, correspondence analysis has given the best results. (That said, I encourage anyone who thinks that there is nothing to astrology to visit the Astrofaces website and look at groups of photos.)  Few persons had open frowns.  The expressions on the persons at the Astrofaces website are not to be confused with the facial expressions of the Chernoff faces.

The input file for the faces() function in R was a table of normalized counts with astrological elements in the rows and facial expressions in the columns. In the four rows were counts of the number within the element with each of the six facial expressions divided by the total number of persons in the element. The plots are for the Sun, Moon, and Ascendant data, each done separately. In the post for astrological researchers posted earlier on this site, there is a plot of the combined data.

### The Sun Faces

The first column – open smiles – controls height of face, height of eyes, and width of nose; the second – closed smiles – width of face, width of eyes, and width of ears; the third – open neutrals – shape of face, height of hair, and height of ears; the fourth – closed neutrals – height of mouth and width of hair; the fifth – open frowns – width of mouth and styling of hair; and the sixth – closed frowns – curve of smile and height of nose.

In the Sun plot, we can see that the air and fire faces have similar sizes and shapes but different expressions, so air and fire are somewhat similar with regard to the first three columns (open smile, closed smile, and open neutral), but not with regard to the last three columns (closed neutral, open frown, and closed frown). Water and earth have the same shape, so they are similar with respect to open neutrals, but they are different sizes, so they differ with respect to open and closed smiles. Other than the open neutrals, the two elements are different.

The raw table of counts for the Sun data is given below.  The zero under open frowns for the water element causes a degeneracy in the water face.

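| | Open Smile | Closed Smile | Open Neutral | Closed Neutral | Open Frown | Closed Frown |
|---|---|---|---|---|---|---|
| Air | 104 | 31 | 32 | 100 | 3 | 26 |
| Earth | 97 | 35 | 40 | 104 | 1 | 28 |
| Fire | 108 | 32 | 31 | 101 | 2 | 29 |
| Water | 107 | 41 | 36 | 93 | 0 | 34 |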
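The normalized counts that were passed to faces() are each count divided by the total for its element. As a sketch, the Python below computes those row proportions from the raw Sun counts above. The plots themselves were made in R; this only illustrates the arithmetic.

```python
expressions = [
    "Open Smile", "Closed Smile", "Open Neutral",
    "Closed Neutral", "Open Frown", "Closed Frown",
]

# Raw Sun counts from the table above.
sun_counts = {
    "Air":   [104, 31, 32, 100, 3, 26],
    "Earth": [97, 35, 40, 104, 1, 28],
    "Fire":  [108, 32, 31, 101, 2, 29],
    "Water": [107, 41, 36, 93, 0, 34],
}

# Divide each count by the total number of persons in the element.
sun_proportions = {
    element: [count / sum(counts) for count in counts]
    for element, counts in sun_counts.items()
}

print(" " * 6 + "  ".join(f"{name:>14}" for name in expressions))
for element, proportions in sun_proportions.items():
    print(f"{element:<6}" + "  ".join(f"{p:14.3f}" for p in proportions))
```

Each row of proportions sums to one, and the zero for open frowns in the water row carries through unchanged.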

### The Moon and Ascendant Faces

The Chernoff faces for the Moon and Ascendant are plotted below.

In the Moon and Ascendant plots, we can follow the same procedure as with the Sun to evaluate the expressions.

### Conclusions

We can see from the plots for the Sun, Moon, and Ascendant that the faces tell us something intuitive about the differences between the four astrological elements with regard to the six facial expression classes in the Astrofaces dataset. Looking between plots, we can also see similarities across the Sun, Moon, and Ascendant for differing elements.

# FOR ASTROLOGICAL RESEARCHERS:

### Analysis of Count Data – Clustering – Hypotheses Testing – Plotting

#### In this Plot You can see how Plots can Convey Information about the Relationship between Placements:

The data for this plot comes from www.astrofaces.com. Several years ago, I went through the photos at the Astrofaces website and classed the photos by expression and by the element of the Sun, Moon, and Ascendant. The plot is of the classed data. From the plot, Fire has fewer Closed Smiles and Air has fewer Closed Frowns; otherwise, there is not much difference between the expressions.

#### Chernoff Faces show You Differences between Rows of data across Columns within a matrix by using Facial Expressions: (not to be confused with the expressions in the data set)

This plot is based on the data on expressions from www.astrofaces.com. (One reason I included this plot is because the expressions on the faces seem to have an astrological take – pure coincidence.)  You can get Chernoff faces for your data at Vanward Statistics.

#### I can do Multiple Correspondence Plots for you to show Clustering for Categorical Variables:

For the Facial Expression data, the plot shows that different element / point combinations are associated with different facial expressions.

#### I can Model and Test Hypotheses for Astrological Questions for You, Using the Swiss Ephemeris:

In an article in Today’s Astrologer, by Cowell Jeffery, Volume 73, Number 13, Page 23, Mr. Jeffery hypothesized that if an aspect between two planets is present at birth, the same planets are likely to be in aspect at death. Mr. Jeffery used the Elizabethans Henry VIII, Queen Elizabeth I, Robert Devereux, William Cecil, and Mary Stuart, for an example. I modeled the problem for the major aspects, using an orb of 8 degrees for all of the planets and the Sun and an orb of 12 degrees for the Moon, where the planets are the eight usual planets (including Pluto, excluding Chiron).

I estimated the expected number of aspects for each point combination (there are 42 of them, since I excluded aspects between Uranus, Neptune, and Pluto) if the births and deaths occurred randomly over the period from the last decade of the 15th century to the end of the 16th century. I also found estimates of the variances and covariances between the point combinations. Different planetary combinations have different expected numbers of aspects and different variances and covariances. I then estimated the number of matches to expect, and the variance of the estimate of the number of matches to expect, if the count at birth is independent of the count at death. Using the results, I was able to test if the five Elizabethans had unusually high numbers of matches. The difference was positive but not significantly different from zero. One would need a very strong effect for the p-value of a test to be very small given a sample of five. The number seen for the Elizabethans is larger than the estimated expected count, which is encouraging, but the difference is too small to be considered significantly different from zero.
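The count of 42 point combinations can be checked by listing the pairs. The Python sketch below assumes that the ten points are the Sun, the Moon, and the eight planets Mercury through Pluto, and drops the three pairs among Uranus, Neptune, and Pluto.

```python
from itertools import combinations

# Assumed list of points: Sun, Moon, and the eight usual planets.
points = [
    "Sun", "Moon", "Mercury", "Venus", "Mars",
    "Jupiter", "Saturn", "Uranus", "Neptune", "Pluto",
]
outer = {"Uranus", "Neptune", "Pluto"}

# Keep every pair except those made up of two outer planets.
pairs = [
    (a, b) for a, b in combinations(points, 2)
    if not (a in outer and b in outer)
]

print(len(pairs))  # 45 possible pairs minus the 3 excluded pairs
```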

I did not try to model seasonal, locational, or time of day patterns for the Elizabethans, which would affect the results. This is an area of ongoing research for me. Different parts of the world (or of the US, for US data) have differing distributions of births (on which planetary placements depend) over the year, so currently I am only looking at data in localized places (in Iowa). Also, I now use the distributions of births in a data set and the seasonal patterns of births from data given to me by the State of Iowa to estimate expected counts and the covariances of the counts. I use an empirical density function generated from the birth observations to do the estimations. Since the data I am using does not have times of birth, I have not tried to account for time of day.

#### Links to a Vulcan Cent Ephemeris:

I have been experimenting with points I call Vulcan and Cent. The longitude of Vulcan is the average over the shortest arc of the longitudes of Ceres, Vesta, Pallas, and Juno. The declination of the point is found using the longitude and the average of the latitudes of the four. Cent is the closer midpoint of the Centaurs Chiron and Pholus – the only two civil Centaurs in Greek mythology. An ephemeris from 1900 to 2050 for the two points can be found in two files, one in rich text format and the other in pdf format. The original files I put up contained an error. The files now should be correct. I did not realize ‘abs’ was an integer function in C in computing the original function. I used the Swiss Ephemeris for the placements in longitude and latitude of Ceres, Vesta, Pallas, Juno, Chiron, and Pholus.
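As an illustration of a closer midpoint, the Python sketch below finds the midpoint of two longitudes along the shorter of the two arcs between them. It assumes that is what is meant by the closer midpoint, and it is not the C code used for the ephemeris; the function name is my own.

```python
def closer_midpoint(lon_a, lon_b):
    """Midpoint of two longitudes (in degrees) along the shorter arc."""
    # Signed difference from lon_a to lon_b, wrapped into [-180, 180).
    diff = ((lon_b - lon_a + 180.0) % 360.0) - 180.0
    return (lon_a + diff / 2.0) % 360.0

# Two points on either side of 0 degrees: the midpoint is at 0, not 180.
print(closer_midpoint(350.0, 10.0))
print(closer_midpoint(10.0, 50.0))
```

When the two points are exactly 180 degrees apart, both midpoints are equally close, and this sketch simply returns one of them.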

Earlier in this blog, there is a description of doing regression analysis with autocorrelated errors. The point Vulcan jumps 90 or 180 degrees from time to time and I look at divorce numbers (in Iowa) as related to Vulcan jumps.

#### For the Basics of Statistical Theory:

Gator Talk Slides

Here I have provided you with an introduction to some statistical results. Click on the link above to see the PowerPoint slides that I used for a talk to Alphee Lavoie’s AstroInvestigators. The slides cover the Normal and Chi Square distributions, plus the Central Limit Theorem, which are applied to some Astrological Data.

#### Sometimes, You might Find that just Plotting Astronomical Information is Interesting:

Using sidereal longitudinal limits for the constellations of the zodiac, I plotted the placements of the zero degree points of the tropical signs within the constellations.

# Some Tools for Politicians – Researchers – Pundits

### Word clouds tell you what people are saying about candidates:

You can see that tweets containing Hillary or Clinton and Bernie or Sanders have more words than tweets containing just Cruz or just Trump. Tweets are centered on Indiana, which was just having its primary. Each word cloud was based on 100 tweets.

#### You can use Box plots to provide visual information about a Data Set:

For example, you might find it useful to know how the different mixes of political parties performed with respect to the budget deficit, using the majority party of the House and Senate and the political party of the President.

#### Bar graphs compare categories:

Comparing the different tools governments use to raise money might be of interest to you for setting or evaluating policy on taxes.

#### Principal component plots show how numeric data clusters:

You can see from the plot that the parties of the Senate and House cluster together – Republican  Houses and Senates are similar and Democratic Houses and Senates are similar – and the  party of the President clusters with the Budget Deficit. You could find this information very useful.  These plots do not include the Obama years or the years of WWII.

#### Multiple correspondence plots show how categorical data clusters:

You can see that the House and Senate cluster together for both Republicans and Democrats, while Democratic presidents cluster with smaller deficits than Republican presidents. This might be of interest for your campaign or analysis.

#### Log linear modeling allows comparisons between classes:

Expected Cell Counts Normalized to Percentages by Row
For the On Budget Deficit (-) (Surplus (+)) per GDP
by the Controlling Political Parties
of the Presidency, Senate, and House
Budget Deficit from 1947 to 2008
Political Parties Lagged One Year
Rows Sum to 100 Except for Rounding

| President | Senate | House | (-6.2,-4] | (-4,-2] | (-2,0] | (0,4.2] |
|---|---|---|---|---|---|---|
| D | D | D | 0% | 30% | 55% | 15% |
| D | D | R | | | | |
| D | R | D | | | | |
| D | R | R | 0% | 29% | 42% | 29% |
| R | D | D | 33% | 40% | 21% | 7% |
| R | D | R | 26% | 42% | 17% | 15% |
| R | R | D | 33% | 40% | 21% | 7% |
| R | R | R | 26% | 42% | 17% | 15% |

In the table, each row is a combination of the Presidential, Senate, and House parties. The rows give the log linear modeled percentage of years in which that combination falls in each budget deficit class. The party of the Senate does not affect the outcome, while the parties of the President and House do: Republican Houses do a bit better and Democratic Presidents do much better, which might be of interest to you.

#### Odds ratios can be used to compare parties:

Iowa 2013 Legislature
Odds Legislator is Male

| | House | Senate |
|---|---|---|
| Democrats | 3 : 2 | 10 : 3 |
| Republicans | 16 : 2 | 10 : 2 |

You could be interested in the odds a legislator is a male, by party. For both the House and Senate in Iowa, Democrats have lower odds of being male than Republicans. Depending on your political philosophy, this could be bad or good.
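To compare the parties directly, the odds in the table can be turned into odds ratios. The Python sketch below divides the Republican odds by the Democratic odds for each chamber, using exact fractions and taking the odds as given in the table.

```python
from fractions import Fraction

# Odds that a legislator is male, as given in the table above.
odds_male = {
    "House":  {"Democrats": Fraction(3, 2),  "Republicans": Fraction(16, 2)},
    "Senate": {"Democrats": Fraction(10, 3), "Republicans": Fraction(10, 2)},
}

# Odds ratio: Republican odds divided by Democratic odds, by chamber.
odds_ratios = {
    chamber: odds["Republicans"] / odds["Democrats"]
    for chamber, odds in odds_male.items()
}

for chamber, ratio in odds_ratios.items():
    print(f"{chamber}: odds ratio {ratio} (about {float(ratio):.2f})")
```

An odds ratio above one means the Republican odds of being male are higher than the Democratic odds in that chamber, which is the case for both the House and the Senate.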