Scatter plots of scores from various analyses have a range of specific options that can be controlled using the popup menu of the graph. These are explained below.
For explanations of the generic menu items Print, Export to SVG File and Close Panel, follow this link.
The variables to be displayed in the scatter plot can be selected with the menu items Choose ... for the Horizontal Axis and Choose ... for the Vertical Axis. Selecting either one of them will bring up a dialog box like the following (from a CVA in this example):
The drop-down menu contains all the available variables. Select one of them and click OK to change the graph or click Cancel to leave the scatter plot unchanged.
If this option is selected, MorphoJ scales the horizontal and
vertical axes of the plot so that they have equal scaling for the
respective variables (same distance per unit of each variable).
This is very desirable if the different axes are in the same units
(e.g. scores in a PCA or CVA). It is less relevant if the
axes are scaled in different units (e.g. the scores for different
blocks in a PLS analysis, or the dependent and independent
variables in a regression).
If this option is not selected, MorphoJ scales the axes
of the plot to fit the data into the available space of the
screen. As a consequence, the scaling of the horizontal and
vertical axes is not the same.
By selecting Use Same Scaling for Both Axes,
the user can specify that both axes of the plot are scaled
equally, so that distances in the plot are preserved and the
relative amounts of variation in both directions are directly
visible.
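The difference between the two scaling modes can be illustrated with a minimal sketch. It is not MorphoJ's own code: the function name `axis_scales`, the window size in pixels and the example scores are all made up for illustration, and it assumes that neither variable has a range of zero.

```python
def axis_scales(x, y, width_px, height_px, same_scaling=True):
    """Return (pixels per unit on the horizontal axis, pixels per unit on the vertical axis).

    Hypothetical helper for illustration; assumes both ranges are non-zero.
    """
    x_range = max(x) - min(x)
    y_range = max(y) - min(y)
    # Without the option: each axis is stretched independently to fill the window.
    sx = width_px / x_range
    sy = height_px / y_range
    if same_scaling:
        # With the option: both axes share the smaller factor, so that
        # one unit covers the same distance horizontally and vertically
        # and all data points still fit into the window.
        sx = sy = min(sx, sy)
    return sx, sy


# Made-up scores for two axes in the same units (e.g. two PCs).
pc1 = [-0.06, -0.02, 0.01, 0.04, 0.06]
pc2 = [-0.01, 0.02, -0.02, 0.01, 0.00]

print(axis_scales(pc1, pc2, 600, 400, same_scaling=False))
print(axis_scales(pc1, pc2, 600, 400, same_scaling=True))
```

In this example the second axis has a much smaller range than the first, so independent scaling would stretch it and exaggerate its variation, whereas the shared factor leaves part of the window unused but keeps distances comparable.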
This option ensures that the scaling of the axes of all
possible scatter plots is equal. This is useful if the
investigator wants to combine multiple plots for comparison. As a
consequence, however, some plots may not fill the space available
in the graphics window. Note that it is assumed that the first
axis (in most analyses, the one with the greatest range of
values) is used as the horizontal axis, as is conventional.
Changing the size of the MorphoJ frame (and therefore the graphics
window) changes the scaling of all plots. To obtain multiple
plots that are scaled consistently, the user should therefore not
resize the MorphoJ window while producing them.
Selecting Use Same Scaling for ALL Axes automatically also selects Use Same Scaling for Both Axes. Likewise, unselecting Use Same Scaling for Both Axes automatically unselects Use Same Scaling for ALL Axes.
MorphoJ can color the data points of a scatter plot according to
the values of a classifier variable. For instance, such a
classifier might indicate species, geographic origin or sex. To
set up the association of the colors and the groups indicated by a
classifier, select Color the Data Points from
the popup menu. This command also changes the coloring for
confidence ellipses or convex hulls, if the option for coloring
according to groups defined by a classifier variable was chosen.
If the check box labeled 'Use a classifier variable ...' is selected, a drop-down menu is activated, in which the user can select the variable that is to be used as the grouping criterion (this classifier is 'Species' in the screen shot above). A new choice of this variable eliminates any existing choice of colors.
If a classifier variable has been chosen, the list labeled 'Classes' contains all the values that occur for this classifier (in the example, these are 's1/f', 's1/m' etc.). Initially, each of these values is shown in a different color. This color will be used for the data points of the respective group in the scatter plots.
To change the colors assigned to specific groups, select one or more of the values in the list, choose a new color in the color selection interface to the right, and click the button Use Color.
Finally, click OK to use the changed colors, or Cancel to discard the changes.
If the check box labeled 'Use a classifier variable ...' is not selected, the dialog box can be used to select a color for all data points. Select the new color, click Use Color, and then click OK.
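The logic of this dialog box (one initial color per class, which can then be overridden for selected classes) can be sketched as follows. This is only an illustration and not MorphoJ's implementation: the function names `initial_colors` and `use_color`, the way the initial colors are generated and the class labels 's2/f' and 's2/m' are assumptions made for the example.

```python
import colorsys


def initial_colors(classes):
    """Give every distinct classifier value its own evenly spaced hue (RGB, 0-255)."""
    distinct = sorted(set(classes))
    colors = {}
    for i, value in enumerate(distinct):
        r, g, b = colorsys.hsv_to_rgb(i / len(distinct), 1.0, 0.9)
        colors[value] = (round(r * 255), round(g * 255), round(b * 255))
    return colors


def use_color(colors, selected, new_color):
    """Return a copy of the mapping with new_color assigned to the selected classes."""
    updated = dict(colors)
    for value in selected:
        if value not in updated:
            raise KeyError(f"unknown class: {value}")
        updated[value] = new_color
    return updated


# Made-up classifier values, one per observation.
labels = ["s1/f", "s1/m", "s2/f", "s2/m", "s1/f"]

palette = initial_colors(labels)
# Equivalent of selecting two classes in the list and clicking Use Color.
palette = use_color(palette, ["s1/f", "s1/m"], (0, 0, 255))

# Each data point is drawn in the color of its class.
point_colors = [palette[label] for label in labels]
print(point_colors)
```

Choosing a different classifier corresponds to calling `initial_colors` again with the new labels, which is why a new choice of classifier discards any colors selected earlier.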
To change the size of the dots that indicate the data points, use the menu item Resize Data Points. A dialog box like the following will appear:
The text field indicates the current diameter of the dots in pixels. This value can be changed by the user. Click OK to update the graph with the new size of dots or click Cancel to leave it unchanged.
If the check box for Label Data Points is selected, the identifier strings for all the observations are shown in the graph. This is useful if there are only a few observations, but the labels quickly clutter the graph with bigger datasets. If you have many observations, do not select this option, but check the identity of individual data points by shift-clicking on them (this will invoke a dialog box with the identifier string).
It is possible to add confidence or equal frequency ellipses to a scatter graph. To do so, select Confidence Ellipses from the popup menu. A dialog box like the following will appear:
The checkbox at the top, Draw ellipse(s), determines whether or not ellipses are going to be drawn in the graph.
The two radio buttons Equal frequency ellipse(s) and Confidence ellipse(s) for mean(s) determine what type of ellipses are to be drawn.
For a given probability level, the equal frequency
ellipse is the ellipse that is expected to contain a data point
drawn at random from the sampled population with that probability.
In other words, it encloses approximately the proportion of data
points in the sample that corresponds to the probability (e.g., the
90% equal frequency ellipse contains about 90% of the data points,
and each data point has a probability of 0.9 of falling within the
ellipse).
The confidence ellipse for the mean at a given
probability is the ellipse that, if the sampling process were
repeated over and over, would contain the true population mean
with that probability (note that this is not the same
as saying that the true population mean has that probability of
lying within the particular confidence ellipse computed from one
sample).
The calculations of both the equal frequency and confidence
ellipses assume that the data (or more specifically, the scores in
the graph) follow a multivariate normal distribution. This is
usually a reasonable approximation (because the scores are
computed as linear combinations of Procrustes coordinates, the
central limit theorem can be invoked). Still, confidence and equal
frequency ellipses should be interpreted cautiously.
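Under the assumption of bivariate normality, both types of ellipses can be derived from the sample mean and covariance matrix of the two plotted scores. The following is a minimal sketch of one common textbook construction, not necessarily the calculation that MorphoJ performs. It uses the chi-square distribution with two degrees of freedom (whose quantile is -2 ln(1 - p)) and therefore treats the sample estimates as if they were the true parameters; an exact small-sample version would use an F distribution instead. The function names and the example data are made up.

```python
import math


def mean_and_cov(xs, ys):
    """Sample mean and (unbiased) covariance terms of two score variables."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def ellipse(xs, ys, prob=0.9, for_mean=False):
    """Return centre, (semi-major, semi-minor) axis lengths and angle in radians.

    for_mean=False: equal frequency ellipse; for_mean=True: confidence
    ellipse for the mean (large-sample approximation, an assumption).
    """
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three observations")
    centre, (sxx, sxy, syy) = mean_and_cov(xs, ys)
    # Quantile of the chi-square distribution with 2 degrees of freedom.
    q = -2.0 * math.log(1.0 - prob)
    if for_mean:
        # The covariance matrix of the mean is approximately S / n.
        q /= n
    # Eigenvalues of the 2 x 2 covariance matrix give the axis lengths.
    half_trace = (sxx + syy) / 2.0
    root = math.hypot((sxx - syy) / 2.0, sxy)
    lam1 = half_trace + root
    lam2 = max(half_trace - root, 0.0)  # guard against rounding below zero
    # Direction of the major axis relative to the horizontal axis.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return centre, (math.sqrt(q * lam1), math.sqrt(q * lam2)), angle


# Made-up scores for one group.
cv1 = [0.8, 1.5, 2.1, 2.9, 3.4, 4.2]
cv2 = [1.1, 0.7, 1.9, 2.2, 3.1, 2.8]

print(ellipse(cv1, cv2, prob=0.9))                 # equal frequency ellipse
print(ellipse(cv1, cv2, prob=0.9, for_mean=True))  # confidence ellipse for the mean
```

In this sketch the two ellipses share the same centre and orientation, and the confidence ellipse for the mean is smaller by a factor of the square root of the sample size, which is why it shrinks as more observations are added while the equal frequency ellipse does not.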
The check box Use a classifier as a criterion for grouping observations determines whether a single ellipse is drawn for the entire dataset or separate ellipses are drawn for several groups. To draw separate ellipses, select the check box and use the drop-down menu below it to select the classifier to be used as the criterion for forming groups.
To use the same classifier for choosing colors for distinguishing
groups, select the option Use this classifier to
determine the colors of the ellipses and data points.
Note that this selection may override colors that were already
chosen for the same scatter plot (it will do so if the user
chooses a classifier different from the one previously used as a
coloring criterion). For adjusting the colors per se, use Color
the Data Points.
Because the ellipses may extend outside the area of the graph, there is an option Clip the ellipse(s) at the margins of the graph. If this option is selected, the ellipses are drawn only inside the rectangular boundaries of the scatter plot.
Finally, the option Show the data points can be de-selected if a scatter plot contains so many data points and groups that showing all of them would be confusing. In this situation, it may be clearer to show the ellipses only (click the check box to select or unselect this option).
Click OK to update the graph with the new selections for ellipses or click Cancel to leave it unchanged.
Note that ellipses are only drawn for groups with three or more observations.
Convex hulls are an alternative to equal frequency ellipses, especially when sample sizes are relatively small. A convex hull is a convex polygon (an area without any 'dents' where the contour is concave toward the outside) enclosing all the data points in a scatter plot, and is usually drawn for groups of items in the plot (e.g. taxa, populations, etc.). If large sample sizes are available, equal frequency ellipses tend to give a better impression of the distribution of points because they take into account all data points in a sample, whereas convex hulls, by definition, focus only on the extreme points in every direction.
To add convex hulls to a scatter plot, select Convex Hulls in the popup menu.
The checkbox at the top, Draw convex hull(s), determines whether or not convex hulls are going to be drawn in the graph.
The check box Use a classifier as a criterion for grouping observations determines whether a single convex hull is drawn for the entire dataset or separate convex hulls are drawn for several groups. To draw separate convex hulls, select the check box and use the drop-down menu below it to select the classifier to be used as the criterion for forming groups.
To use the same classifier for choosing colors for distinguishing groups, select the option Use this classifier to determine the colors of the convex hulls and data points. Note that this selection may override colors that were already chosen for the same scatter plot (it will do so if the user chooses a classifier different from the one previously used as a coloring criterion). For adjusting the colors per se, use Color the Data Points.
Finally, the option Show the data points can be de-selected if a scatter plot contains very many data points and many overlapping groups, so that it would be confusing to have all the data points in the plot. In this situation, it may be easier to show the convex hulls only (click the check box to select or unselect this option).
Click OK to update the graph with the new selections for convex hulls or click Cancel to leave it unchanged.
For groups containing only two data points, convex hulls are lines between the corresponding pairs of points. Convex hulls do not exist for single data points.
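For readers who want to see how a convex hull is obtained from the data points, here is a minimal sketch using Andrew's monotone chain algorithm, a standard method that is not necessarily the one MorphoJ uses. The function names and the example coordinates and group labels are made up for illustration.

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order.

    One point is returned as is (no hull to draw); two points are
    returned as the end points of a line segment.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive if the turn o -> a -> b is counter-clockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def hulls_by_group(points, labels):
    """Compute a separate convex hull for every value of a classifier."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {label: convex_hull(pts) for label, pts in groups.items()}


# Made-up scores and group labels.
scores = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (5, 5), (6, 7)]
groups = ["A", "A", "A", "A", "A", "B", "B"]

print(hulls_by_group(scores, groups))
```

Only the extreme points of each group end up as vertices: the interior point (1, 1) of group A does not appear in its hull, and group B, which has two points, yields a line segment.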