PROJECT
Dataset
Explanatory Variables (volume of two particular brain regions):
- amygdala
- acc
Response Variable:
- orientation (scale 1 [very conservative] to 5 [very liberal])
EDA - Data Distribution
Let's compare a histogram, a KDE, and a fitted Gaussian to estimate the frequency distribution of these brain region volumes.
KDE (kernel density estimation) provides a probability density estimate that shows where data points are likely to occur. It is particularly useful when the shape of the distribution is unknown and may not follow a neat, symmetric bell curve like the Gaussian.
The Gaussian (normal) distribution also gives a probability density estimate, but it assumes the data is normally distributed, that is, bell-shaped and symmetric. It may be a poor fit for skewed data, multimodal distributions, heavy tails, or outliers.
Therefore, KDE or other non-parametric methods are recommended when the data is more complex or deviates from a normal distribution. Let's see if the normality assumption holds.
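The comparison described above can be sketched in a few lines. This is only an illustration under assumptions: the sample is synthetic (it is not the study data), the variable name `volume` is hypothetical, and SciPy's `gaussian_kde` with its default bandwidth stands in for whatever KDE settings were used for the plots.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for one brain-region column (NOT the real data).
rng = np.random.default_rng(0)
volume = rng.normal(loc=0.0, scale=1.0, size=90)

# 1) Histogram, normalised to a density so it is comparable with the curves.
hist_density, bin_edges = np.histogram(volume, bins=15, density=True)

# 2) Parametric estimate: a Gaussian defined by the fitted mean and sd.
mu, sigma = stats.norm.fit(volume)

# 3) Non-parametric estimate: Gaussian-kernel KDE, default bandwidth.
kde = stats.gaussian_kde(volume)

# Evaluate both curves on a common grid that extends past the data range.
grid = np.linspace(volume.min() - 3 * sigma, volume.max() + 3 * sigma, 400)
gaussian_pdf = stats.norm.pdf(grid, loc=mu, scale=sigma)
kde_pdf = kde(grid)

# Largest pointwise gap between the two curves. A small gap means the
# Gaussian describes the sample about as well as the KDE does.
max_gap = float(np.max(np.abs(kde_pdf - gaussian_pdf)))
print(f"mean={mu:.3f}, sd={sigma:.3f}, max |KDE - Gaussian| = {max_gap:.3f}")
```

If the KDE curve departs clearly from the fitted Gaussian (a second bump, a long tail), that is a hint the normality assumption is too strong for that variable.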
[Distribution of amygdala in 1-dimensional]
[Distribution of acc in 1-dimensional]
The histogram roughly forms a bell-shaped curve that is symmetric around the mean with a single peak. The KDE plot also shows a bell-shaped curve, which indicates the data may be close to normally distributed.
Note that we can also check the normality assumption with a boxplot and a QQ-plot.
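As a rough sketch of those two checks, the snippet below computes the numbers behind a QQ-plot and a boxplot without drawing them. The data is synthetic and `volume` is a hypothetical name; `scipy.stats.probplot` is used for the QQ-plot quantiles, and the 1.5 x IQR whisker rule is the usual boxplot convention, assumed here.

```python
import numpy as np
from scipy import stats

# Synthetic placeholder sample (NOT the real data).
rng = np.random.default_rng(1)
volume = rng.normal(size=90)

# QQ-plot ingredients: ordered sample values against theoretical normal
# quantiles, plus a least-squares line through them. An r value close to 1
# means the points lie near a straight line, i.e. close to normal.
(theoretical_q, ordered_sample), (slope, intercept, r) = stats.probplot(
    volume, dist="norm"
)

# Boxplot ingredients: quartiles and the 1.5 * IQR whisker fences.
q1, q3 = np.percentile(volume, [25, 75])
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = volume[(volume < lower_fence) | (volume > upper_fence)]

print(f"QQ-plot correlation r = {r:.3f}")
print(f"points outside the boxplot whiskers: {outliers.size}")
```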
EDA - Joint Distribution
Now, let's plot 2-dimensional plots to show the relationship between the two variables, amygdala and acc. We'll explore the joint distribution, correlation, and dependence.
[Distribution of acc & amygdala in 2-dimensional]
The 2-dimensional histogram is not clear enough to determine bimodality or multimodality. However, the 2-dimensional KDE plot shows one primary region where the data points concentrate (unimodal). Therefore, we can assume that the joint distribution of amygdala and acc forms a single cluster.
The KDE also shows a few outliers, which appear as very light, disconnected regions separate from the main concentration of points (red circles in the plot).
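A minimal sketch of a 2-dimensional KDE follows, assuming SciPy's `gaussian_kde` with its default bandwidth and a synthetic two-column sample in place of the real data. Flagging the 5% lowest-density points as outlier candidates is an illustrative choice made here, not the rule used for the red circles in the plot.

```python
import numpy as np
from scipy import stats

# Synthetic stand-ins for the two columns (NOT the real data).
rng = np.random.default_rng(2)
amygdala = rng.normal(size=90)
acc = rng.normal(size=90)

# gaussian_kde expects an array of shape (n_dimensions, n_points).
points = np.vstack([amygdala, acc])
kde2d = stats.gaussian_kde(points)

# Evaluate the joint density on a regular grid (the data behind a contour plot).
xs = np.linspace(-5.0, 5.0, 120)
ys = np.linspace(-5.0, 5.0, 120)
xx, yy = np.meshgrid(xs, ys)
density = kde2d(np.vstack([xx.ravel(), yy.ravel()])).reshape(xx.shape)

# Location of the highest-density grid cell (rows index y, columns index x).
iy, ix = np.unravel_index(np.argmax(density), density.shape)
mode = (float(xs[ix]), float(ys[iy]))

# Observations sitting in the lowest-density 5% are outlier candidates.
point_density = kde2d(points)
threshold = np.quantile(point_density, 0.05)
candidates = np.flatnonzero(point_density < threshold)

print("densest grid location (amygdala, acc):", mode)
print("indices of low-density points:", candidates)
```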
The 2-dimensional Gaussian plot shows nearly circular contours (not elongated ellipses), which indicates the variables are not strongly correlated. I double-checked with the Pearson correlation coefficient to see whether the two variables are linearly related. The coefficient [-0.128] shows a weak negative correlation, and the p-value [0.227] shows that the observed correlation is not statistically significant.
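For reference, a Pearson check of this kind can be sketched with `scipy.stats.pearsonr`. The arrays below are synthetic placeholders, so the printed numbers will not match the values reported above.

```python
import numpy as np
from scipy import stats

# Synthetic stand-ins for the two columns (NOT the real data).
rng = np.random.default_rng(5)
amygdala = rng.normal(size=90)
acc = rng.normal(size=90)

# pearsonr returns the correlation coefficient and a two-sided p-value for
# the null hypothesis that the two variables are uncorrelated.
r, p_value = stats.pearsonr(amygdala, acc)

# Conventional 5% significance level, assumed here for illustration.
alpha = 0.05
verdict = "significant" if p_value < alpha else "not significant"
print(f"r = {r:.3f}, p = {p_value:.3f} ({verdict} at alpha = {alpha})")
```

Keep in mind that Pearson's r only measures linear association, so a value near zero does not rule out a non-linear dependence.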
EDA - Conditional Distribution
Now, let's analyze the conditional distributions of amygdala and acc volumes given political orientation. By conditioning on political orientation, we can see whether the volume of specific brain regions varies with orientation.
I'm going to use KDE because it provides a smooth estimate of the distribution, making it easier to visualize and identify trends and patterns. As I mentioned earlier, KDE is non-parametric, which is ideal when the data may not conform to a standard distributional form, allowing for a more accurate representation.
FYI, I initially tried a bandwidth of 0.5 for the KDE, but the result was a bit too noisy. After switching to 0.7, the results were much improved, as shown below.
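To illustrate how the bandwidth changes the picture, here is a small sketch that fits the same sample with both settings. It rests on assumptions: the post does not say which library produced the plots or how it scales the bandwidth, so this uses SciPy's `gaussian_kde`, where a scalar `bw_method` multiplies the sample standard deviation. The data is synthetic and `count_modes` is a hypothetical helper.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in sample (NOT the study data): two nearby bumps.
rng = np.random.default_rng(3)
sample = np.concatenate([rng.normal(-1.0, 0.6, 30), rng.normal(1.0, 0.6, 30)])

grid = np.linspace(sample.min() - 4.0, sample.max() + 4.0, 600)
dx = grid[1] - grid[0]


def count_modes(density):
    """Count strict local maxima of a density evaluated on a 1-D grid."""
    interior = density[1:-1]
    return int(np.sum((interior > density[:-2]) & (interior > density[2:])))


results = {}
for factor in (0.5, 0.7):
    # In SciPy, a scalar bw_method is a multiplier on the sample sd.
    kde = stats.gaussian_kde(sample, bw_method=factor)
    density = kde(grid)
    results[factor] = {
        "kernel_sd": float(kde.factor * sample.std(ddof=1)),
        "modes": count_modes(density),
        "mass": float(density.sum() * dx),  # should be close to 1
    }
    print(factor, results[factor])
```

A smaller bandwidth follows individual points more closely and tends to show extra bumps, while a larger one smooths them away, so the apparent number of modes depends on this choice.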
- Amygdala
  - Orientation 2 & 5
    - The density is bimodal, which suggests two subgroups within political orientation 2 (not all individuals share the same traits). In other words, some people in orientation 2 have noticeably larger or smaller amygdala volumes than others.
  - Orientation 3 & 4
    - I interpret this plot as unimodal with some noise, since it does not show two distinct high peaks in the density and there is no clear separation between them.
    - Most data points cluster around a specific value, indicating that amygdala volume is consistent within these groups, with most individuals sharing similar values.
- ACC
  - Orientation 2, 3, 4 & 5
    - The density is unimodal: most data points cluster around a specific value, indicating that ACC volume is consistent within these groups.
    - For orientations 3 & 4, the wider distribution indicates greater variability in ACC volume within these groups.
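The per-orientation reading above can be approximated numerically by fitting one KDE per orientation level and counting its peaks. This is a sketch under assumptions: the data and group sizes are synthetic, the 0.7 bandwidth is interpreted as SciPy's `bw_method` factor, and counting local maxima on a grid is a crude stand-in for judging modality by eye.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (NOT the study data); group sizes are arbitrary.
rng = np.random.default_rng(4)
orientation = np.repeat([2, 3, 4, 5], [20, 30, 25, 15])
amygdala = rng.normal(size=orientation.size)

grid = np.linspace(amygdala.min() - 3.0, amygdala.max() + 3.0, 400)

summary = {}
for level in np.unique(orientation):
    # Conditioning = keeping only the rows with this orientation value.
    values = amygdala[orientation == level]
    density = stats.gaussian_kde(values, bw_method=0.7)(grid)

    # Count strict local maxima of the estimated density.
    interior = density[1:-1]
    n_modes = int(np.sum((interior > density[:-2]) & (interior > density[2:])))

    summary[int(level)] = {
        "n": int(values.size),
        "modes": n_modes,
        "sd": float(values.std(ddof=1)),  # spread within the group
    }

for level, row in summary.items():
    print(level, row)
```

With groups this small, an extra peak can easily come from a handful of points, so a bimodal shape is better treated as a prompt for further checks than as evidence of subgroups.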
Inferences from the results
- The bimodal amygdala distribution in orientation 2 suggests that there are differences in how this group's brain structure relates to their political beliefs.
- The unimodal ACC distribution across all orientations suggests that ACC volume is less strongly associated with political orientation than amygdala volume is.
Further consideration
- For the amygdala volumes in orientations 2 & 5, are there demographic or psychological factors that could explain this variation?
- Why is ACC more consistent across political orientations, while amygdala shows variability?
EDA - Conditional Joint Distribution
- Orientations 2 and 4 show a bimodal distribution, suggesting the presence of subgroups within each of these orientations (red circle).
- The contours appear compact and clustered, indicating that individuals within these orientation groups tend to have similar brain volumes.
- The peaks of the distributions are located near zero, while Orientation 5 shows a slight shift, which may indicate differences in brain structure.
- I'm uncertain whether Orientation 4 represents an outlier.
Conclusion
This analysis explored the relationship between brain structure (amygdala and anterior cingulate cortex volumes) and political orientation. One-dimensional histograms, kernel density estimation (KDE), Gaussian fits, and two-dimensional KDE were used to examine the distributions and potential correlations.