1. T-test:
- A statistical test used to determine whether the means of two groups (or a sample mean and a reference value) differ significantly. The t-test produces a t-value, which is compared against a t-distribution to obtain a p-value; the result is then judged against a chosen significance level (commonly 0.05).
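To make the t-value concrete, here is a minimal Python sketch of Welch's variant of the two-sample t-test, which does not assume equal variances. The function name `welch_t` and the sample values are made up for illustration; a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` would also return the p-value.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Squared standard errors of each group mean (sample variance / n).
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df

# Hypothetical measurements for two groups.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
group_b = [4.2, 4.8, 4.5, 5.0, 4.1, 4.6]

t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

A large absolute t-value relative to the t-distribution with `df` degrees of freedom corresponds to a small p-value.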
2. P-value:
- The probability of obtaining results at least as extreme as the observed results, assuming the null hypothesis is true. A smaller p-value (conventionally less than 0.05) indicates stronger evidence against the null hypothesis. It is not the probability that the null hypothesis is true.
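As a small worked case, the sketch below computes an exact two-sided p-value for a coin-flip experiment under the null hypothesis that the coin is fair. The function name `binomial_two_sided_p` and the counts (15 heads in 20 flips) are assumptions chosen for illustration.

```python
from math import comb

def binomial_two_sided_p(heads, flips):
    """Exact two-sided p-value for H0: the coin is fair (p = 0.5)."""
    distance = abs(heads - flips / 2)
    # Count every outcome at least as far from the expected value as the one observed.
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= distance
    )
    return extreme / 2 ** flips

p_value = binomial_two_sided_p(15, 20)
print(f"p-value = {p_value:.4f}")
```

Here the p-value is about 0.041, so at a 0.05 significance level the fair-coin hypothesis would be rejected, though only narrowly.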
3. Confidence Interval:
- A range of values, computed from sample data, used to estimate the true value of a population parameter. A 95% confidence interval, for example, comes from a procedure that would capture the true parameter in about 95% of repeated samples.
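The repeated-sampling interpretation can be checked by simulation. The sketch below is illustrative only: `mean_ci` is a hypothetical helper that uses the normal approximation (reasonable for larger samples; small samples would call for a t-based interval), and the population parameters are made up.

```python
import math
import random
import statistics

def mean_ci(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    center = statistics.mean(sample)
    half_width = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return center - half_width, center + half_width

random.seed(0)
TRUE_MEAN = 10.0   # assumed population mean for the simulation
trials = 1000
hits = 0
for _ in range(trials):
    sample = [random.gauss(TRUE_MEAN, 2.0) for _ in range(50)]
    low, high = mean_ci(sample)
    hits += low <= TRUE_MEAN <= high   # True counts as 1

coverage = hits / trials
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The printed fraction should land close to 0.95, which is what the confidence level refers to.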
4. Simpson's Paradox:
- A phenomenon where a trend appears in different groups of data but disappears or reverses when these groups are combined. It highlights the importance of considering confounding variables in statistical analysis.
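A short numeric sketch shows how the reversal can arise. The counts below are illustrative (successes and patients for two treatments, split by a hypothetical severity subgroup); the helper names are assumptions for this example.

```python
# (successes, patients) per treatment and subgroup; illustrative counts.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    return successes / total

def overall_rate(groups):
    """Success rate after pooling all subgroups of one treatment."""
    successes = sum(s for s, _ in groups.values())
    total = sum(n for _, n in groups.values())
    return successes / total

for subgroup in ("small", "large"):
    a = rate(*data["A"][subgroup])
    b = rate(*data["B"][subgroup])
    print(f"{subgroup}: A = {a:.2f}, B = {b:.2f}")

print(f"combined: A = {overall_rate(data['A']):.2f}, "
      f"B = {overall_rate(data['B']):.2f}")
```

Treatment A has the higher success rate within each subgroup, yet B looks better once the subgroups are pooled, because A was applied mostly to the harder cases.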
5. Hypothesis Testing:
- A statistical method used to make inferences about population parameters based on sample data. It involves formulating a null hypothesis and an alternative hypothesis and testing the null hypothesis using statistical tests.
6. Central Limit Theorem:
- A fundamental result in statistics stating that, regardless of the shape of the original population distribution (provided it has finite variance), the sampling distribution of the mean of independent observations will be approximately normal for sufficiently large sample sizes.
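A quick simulation can illustrate the theorem. This sketch draws repeated samples from a strongly skewed exponential distribution and inspects the resulting sample means; the sample size, number of repetitions, and seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(1)
n = 50        # observations per sample
reps = 2000   # number of samples drawn

# Exponential(1) is right-skewed, with mean 1 and standard deviation 1.
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

center = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)
expected_spread = 1 / n ** 0.5   # population sd divided by sqrt(n)

# For a normal distribution, about 95% of values lie within 1.96 sd of the mean.
within = sum(abs(m - center) <= 1.96 * spread for m in sample_means) / reps

print(f"mean of sample means: {center:.3f}")
print(f"sd of sample means: {spread:.3f} (theory: {expected_spread:.3f})")
print(f"fraction within 1.96 sd: {within:.3f}")
```

Even though individual observations are far from normal, the sample means cluster around the population mean with roughly the spread and coverage a normal distribution would give.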
7. Ordinary Least Squares (OLS):
- A method used in linear regression to find the line that minimizes the sum of the squared differences between the observed and predicted values. OLS is commonly used to estimate the coefficients of a linear regression model.
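For a single predictor, the OLS coefficients have a closed form: the slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept follows from the means. The sketch below applies that formula; `ols_fit` and the data points are hypothetical.

```python
import statistics

def ols_fit(x, y):
    """Closed-form OLS estimates (slope, intercept) for one predictor."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical observations.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = ols_fit(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"sum of squared residuals = {sum(r ** 2 for r in residuals):.4f}")
```

With an intercept in the model, the residuals of an OLS fit sum to zero, which is a handy sanity check on any implementation.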
8. Bayes' Theorem:
- A mathematical formula that describes how to update the probability of an event given new evidence, based on prior knowledge of conditions that might be related to the event. In symbols, P(A|B) = P(B|A) P(A) / P(B). It is fundamental to Bayesian statistics.
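A common illustration is a diagnostic test for a rare condition. The sketch below is a minimal example under assumed numbers (1% prevalence, 95% sensitivity, 5% false positive rate); the function and parameter names are hypothetical.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    # Total probability of a positive test: true positives plus false positives.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

p_condition_given_positive = posterior(
    prior=0.01,                # assumed prevalence
    likelihood=0.95,           # assumed P(positive | condition)
    false_positive_rate=0.05,  # assumed P(positive | no condition)
)
print(f"P(condition | positive) = {p_condition_given_positive:.3f}")
```

Under these assumptions a positive result raises the probability from 1% to only about 16%, because false positives from the much larger healthy group outnumber the true positives.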
9. Sampling:
- The process of selecting a subset of elements from a larger population. Sampling is crucial in statistics because it allows researchers to make inferences about a population based on data collected from a subset.
10. Power Analysis:
- A statistical method used to determine the probability that a study will detect a true effect (or difference) of a given size, that is, the probability of rejecting a false null hypothesis. It involves considering factors such as sample size, effect size, variability, and significance level.
11. Sample Size Determination:
- The process of determining the number of observations or participants needed for a study to achieve a desired level of statistical power and precision.
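Power analysis and sample size determination are two views of the same calculation. The sketch below uses the standard normal approximation for comparing two group means with a standardized effect size; it is a rough planning aid rather than an exact t-based computation (which typically gives a slightly larger n), and the function names are assumptions.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

def approx_power(effect_size, n, alpha=0.05):
    """Approximate power with n observations per group (far tail ignored)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(effect_size * math.sqrt(n / 2) - z_alpha)

n = n_per_group(0.5)   # medium standardized effect, 80% power, alpha = 0.05
print(f"participants needed per group: {n}")
print(f"approximate power at that size: {approx_power(0.5, n):.3f}")
```

Smaller effect sizes, stricter significance levels, or higher target power all push the required sample size up.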
12. Regression Modeling and Assumptions:
- Regression modeling is a statistical technique used to examine the relationship between one dependent variable and one or more independent variables. Common assumptions of linear regression include linearity, independence of errors, homoscedasticity (constant error variance), and normality of residuals.
13. Type 1 and Type 2 Error:
- Type 1 Error: Rejecting a true null hypothesis (false positive).
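- Type 2 Error: Failing to reject a false null hypothesis (false negative). The significance level (alpha) sets the probability of a Type 1 error, while the probability of a Type 2 error (beta) equals 1 minus the power of the test. For a fixed sample size and effect size, lowering alpha increases beta.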
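Both error rates can be estimated by simulation. The sketch below uses a two-sided z-test with known standard deviation so that only the standard library is needed; the helper names, sample size, and the assumed true effect of 0.5 are illustrative choices.

```python
import math
import random
from statistics import NormalDist, mean

def rejects_null(sample, sigma=1.0, alpha=0.05):
    """Two-sided z-test of H0: population mean = 0, with known sigma."""
    z = mean(sample) * math.sqrt(len(sample)) / sigma
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def rejection_rate(true_mean, n=30, trials=2000):
    """Fraction of simulated studies in which H0 is rejected."""
    rejections = sum(
        rejects_null([random.gauss(true_mean, 1.0) for _ in range(n)])
        for _ in range(trials)
    )
    return rejections / trials

random.seed(2)
type1_rate = rejection_rate(true_mean=0.0)  # H0 true: every rejection is a Type 1 error
power = rejection_rate(true_mean=0.5)       # H0 false: rejections are correct decisions
type2_rate = 1 - power                      # H0 false but not rejected

print(f"Type 1 error rate: {type1_rate:.3f}")
print(f"Type 2 error rate: {type2_rate:.3f} (power: {power:.3f})")
```

The Type 1 error rate should come out near the chosen alpha of 0.05, while the Type 2 error rate depends on the true effect and the sample size.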
Understanding these statistical terms is essential for designing experiments, analyzing data, and drawing meaningful conclusions in various fields of study.