Top 100 Mathematics & Statistics Interview Questions for Business Analysts (12 LPA Ready) Part 1: Descriptive Statistics & Probability for Business Analysts
- IOTA ACADEMY

- Oct 17
Introduction
Mathematics and Statistics are the foundation of every Business Analyst’s problem-solving toolkit. Whether you’re analyzing sales data, customer behavior, or market trends, these concepts help you translate raw numbers into meaningful insights.
In Part 1 of our Top 100 Mathematics & Statistics Interview Questions for Business Analysts (12 LPA Ready) series, we’ll focus on Descriptive Statistics and Probability — the building blocks of analytical thinking.
This section covers everything from averages, dispersion, and normal distribution to probability rules, conditional probability, and sampling concepts. Each question includes a simple, easy-to-understand explanation and practical business examples to help you remember and apply them in interviews.
By mastering these fundamentals, you’ll be able to interpret data distributions, detect anomalies, and make confident, evidence-based business recommendations — a must-have skill for any analyst aiming for high-paying roles in analytics and consulting.

SECTION 1: DESCRIPTIVE STATISTICS (Q1–20)
1. What is the difference between quantitative and qualitative data?
Quantitative data represents numbers that can be measured, such as revenue, sales, or customer age. Qualitative data describes characteristics or categories like gender, city, or product type. For example, “Monthly Sales = ₹10,000” is quantitative, while “Region = North” is qualitative. Business Analysts often use quantitative data for metrics and KPIs, while qualitative data is used for segmentation and customer profiling. Understanding both helps analysts tell complete stories from data: not just the numbers but also their context.
2. What are mean, median, and mode? When do you prefer median over mean?
The mean is the arithmetic average, the median is the middle value of the sorted data, and the mode is the most frequent value in a dataset. For example, if incomes are ₹20k, ₹25k, ₹25k, ₹30k, and ₹5L, the mean (₹1.2L) is distorted by the ₹5L outlier. Here, the median (₹25k) represents the data better. Analysts prefer the median when data is skewed, such as salary distributions or property prices, because it is far less affected by extreme values.
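As a quick check of the income example, here is a minimal sketch using Python's standard `statistics` module. The variable names are illustrative only.

```python
import statistics

# Incomes from the example above, in rupees (₹5L = 500,000).
incomes = [20_000, 25_000, 25_000, 30_000, 500_000]

mean_income = statistics.mean(incomes)      # pulled upward by the outlier
median_income = statistics.median(incomes)  # middle value of the sorted list
mode_income = statistics.mode(incomes)      # most frequent value

print(f"mean={mean_income}, median={median_income}, mode={mode_income}")
```

The mean lands far above what four of the five people earn, while the median and mode stay at ₹25k.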
3. What is range, variance, and standard deviation?
The range is the difference between the highest and lowest value. Variance is the average squared deviation from the mean (so it is in squared units), while standard deviation is its square root, which puts it back in the original units and makes it easier to interpret. For example, if two stores have the same average sales but one has a higher standard deviation, its performance is more inconsistent. Analysts use SD to compare stability, e.g., steady revenue growth vs fluctuating sales.
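The two-store comparison can be sketched as follows. The sales figures are made up for illustration, and the population formulas (`pvariance`, `pstdev`) are used on the assumption that each list is the complete data for its store.

```python
import statistics

def summarize(sales):
    """Return mean, range, population variance and standard deviation."""
    return {
        "mean": statistics.mean(sales),
        "range": max(sales) - min(sales),
        "variance": statistics.pvariance(sales),
        "std_dev": statistics.pstdev(sales),
    }

# Hypothetical monthly sales (₹ thousands) with the same average.
summary_a = summarize([98, 100, 102, 100])   # steady store
summary_b = summarize([60, 140, 80, 120])    # fluctuating store

print(summary_a)
print(summary_b)
```

Both stores average 100, but store B's standard deviation is many times larger, which is the inconsistency the answer describes.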
4. What is the 68–95–99.7 rule?
In a normal distribution, about 68% of values fall within one standard deviation of the mean, 95% within two, and 99.7% within three. For example, if monthly sales are approximately normal with a mean of ₹1,00,000 and an SD of ₹10,000, around 95% of months will have sales between ₹80,000 and ₹1,20,000. This rule helps analysts detect anomalies: if a month’s sales fall far outside this range, it may indicate an unusual event or a data error.
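The percentages in the rule can be reproduced with `statistics.NormalDist` (Python 3.8+). This is a small sketch assuming sales are exactly normal with the mean and SD from the example.

```python
from statistics import NormalDist

# Monthly sales modeled as normal: mean ₹1,00,000, SD ₹10,000.
sales = NormalDist(mu=100_000, sigma=10_000)

def share_within(dist, k):
    """Probability of landing within k standard deviations of the mean."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(sales, k):.4f}")
```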
5. What is Interquartile Range (IQR) and how is it used to detect outliers?
The IQR is the difference between the 75th (Q3) and 25th (Q1) percentiles. It captures the middle 50% of data. Outliers are values that lie below Q1 – 1.5×IQR or above Q3 + 1.5×IQR. For instance, if most customer ages fall between 25–35, but one record shows 95, it’s likely an outlier. Business Analysts use IQR to clean data before analysis, ensuring decisions aren’t skewed by incorrect or extreme values.
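The 1.5×IQR fences can be sketched like this. The ages are hypothetical, and `statistics.quantiles` uses its default interpolation method, so the exact Q1 and Q3 may differ slightly from what a spreadsheet or another library reports.

```python
import statistics

def iqr_outliers(values):
    """Return the values outside the 1.5 x IQR fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical customer ages with one suspicious record.
ages = [25, 27, 28, 30, 31, 33, 35, 95]
outliers = iqr_outliers(ages)
print(outliers)
```

Flagged values should be reviewed rather than deleted automatically, since an extreme value can be a real customer and not a data entry error.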
6. What is skewness and kurtosis?
Skewness measures the asymmetry of a distribution. If the right tail is longer, it’s positively skewed (e.g., income data). If the left tail is longer, it’s negatively skewed. Kurtosis measures how “heavy” the tails are compared to a normal distribution. High kurtosis means heavier tails, so extreme values are more likely. In business, understanding skewness and kurtosis helps analysts choose the right measures, e.g., the median for skewed data or trimming outliers for better reporting.
7. What is data dispersion and why is it important?
Dispersion measures how spread out data is. Even if two teams have the same average performance, one may be consistent while the other fluctuates a lot. For instance, two sales teams may both average ₹1 lakh per month, but if one team’s sales vary between ₹60k–₹1.4L and another stays around ₹95k–₹1.05L, the second team is more stable. Business Analysts use dispersion to understand reliability and predictability in KPIs such as revenue, churn, or expenses.
8. What is the difference between a histogram and a bar chart?
A histogram is used for continuous data (like height or sales amounts) and shows how data is distributed over ranges, so its bars touch each other. A bar chart is for categorical data (like departments or product types), and its bars are separated. For example, a histogram can show the distribution of monthly income among customers, while a bar chart can compare average income across regions. Analysts use histograms to detect skewness, normality, or gaps in data.
9. What is a percentile or quartile?
A percentile indicates the relative position of a value in a dataset. For example, if your sales are in the 90th percentile, you’re performing better than 90% of peers. Quartiles divide the data into four equal parts: Q1 (25%), Q2 (median), Q3 (75%). Business Analysts use percentiles to benchmark KPIs like response times, delivery speeds, or customer satisfaction, identifying top and bottom performers.
10. What is a weighted average? Give an example.
A weighted average assigns more importance (weight) to certain data points. For instance, when calculating overall satisfaction, a Business Analyst might give 70% weight to premium customers and 30% to regular ones. Weighted averages are more realistic when data points don’t all carry equal importance, e.g., computing average sales considering store size or calculating average ratings based on review counts.
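A minimal sketch of the 70/30 satisfaction example follows. The scores 4.5 and 3.8 are assumed for illustration and do not come from real data.

```python
def weighted_average(values, weights):
    """Sum of value x weight, divided by the total weight."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical satisfaction scores: premium = 4.5, regular = 3.8.
overall = weighted_average([4.5, 3.8], [0.7, 0.3])
print(round(overall, 2))
```

Dividing by the total weight means the same function works when the weights are counts (such as review counts) rather than percentages.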
11. What is geometric mean and where is it used?
The geometric mean is the nth root of the product of n values. It’s useful when data grows multiplicatively, like interest rates or revenue growth. Example: if a company’s annual growth rates are 10%, 20%, and 30%, the geometric mean of the growth factors, (1.10 × 1.20 × 1.30)^(1/3) ≈ 1.197, gives the average consistent rate (≈19.7%), not the arithmetic average (20%). Business Analysts use it to calculate CAGR (Compound Annual Growth Rate), which gives a realistic measure of multi-year performance.
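The growth-rate example can be checked with `statistics.geometric_mean` (Python 3.8+). The key step in this sketch is converting each rate to a growth factor (1 + rate) before averaging and converting back afterwards.

```python
import statistics

growth_rates = [0.10, 0.20, 0.30]             # 10%, 20%, 30% per year
growth_factors = [1 + r for r in growth_rates]

# Average compounding rate: geometric mean of the factors, minus 1.
avg_growth = statistics.geometric_mean(growth_factors) - 1
arithmetic_avg = statistics.mean(growth_rates)

print(f"geometric: {avg_growth:.4f}, arithmetic: {arithmetic_avg:.4f}")
```

The geometric result comes out a little below the arithmetic average of 20%, at roughly 19.7%.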
12. What is the difference between population and sample?
A population includes every member of a group (e.g., all customers). A sample is a smaller subset used for analysis when studying the entire population is impractical. For instance, analyzing feedback from 1,000 customers instead of 1,00,000 helps save time and cost. Analysts ensure the sample is representative, covering different regions, genders, and preferences, so conclusions generalize well to the full population.
13. What is the Central Limit Theorem (CLT)? Why is it important?
The CLT states that the distribution of sample means approaches a normal distribution as sample size increases, even if the population itself isn’t normal (provided it has finite variance). This means analysts can apply statistical inference (like confidence intervals and hypothesis tests) even on non-normal business data. For example, if you repeatedly draw samples of daily sales from different stores, the sample means will approximately follow a normal curve.
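A small simulation can make the theorem concrete. This sketch assumes a strongly skewed population (exponential with mean 1) and repeatedly averages samples of 50 values. The seed, sample size, and number of repetitions are arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

def sample_mean(size):
    """Mean of `size` draws from a skewed exponential population (mean 1)."""
    return statistics.fmean(rng.expovariate(1.0) for _ in range(size))

sample_means = [sample_mean(50) for _ in range(2000)]

center = statistics.fmean(sample_means)   # should sit near the population mean, 1.0
spread = statistics.stdev(sample_means)   # should sit near 1 / sqrt(50)

print(f"center={center:.3f}, spread={spread:.3f}")
```

A histogram of `sample_means` would look roughly bell-shaped even though the underlying draws are heavily right-skewed.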
14. What is law of large numbers?
The law of large numbers says that as sample size grows, the sample mean gets closer to the population mean. For instance, flipping a coin 10 times may give 7 heads, but over 1,000 flips the proportion of heads will typically be close to 50%. Business Analysts use this concept in forecasting: more transactions or data points mean more reliable estimates and less influence from random noise.
15. What is a Z-score? How is it used?
A Z-score shows how far a data point is from the mean, measured in standard deviations. Formula: Z = (Value – Mean) / SD. For example, if a store’s sales Z-score is +2, it’s performing 2 SDs above average, which makes it a top performer. Analysts use Z-scores to detect outliers, standardize metrics from different scales, or compare KPIs across departments.
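The formula translates directly into code. In this sketch the store figures are hypothetical, and the ±3 cut-off for flagging outliers is a common convention, not a fixed rule.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies from `mean`."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / sd

# Hypothetical store: sales of ₹1,20,000 against a chain average of
# ₹1,00,000 with SD ₹10,000.
z = z_score(120_000, 100_000, 10_000)
is_outlier = abs(z) > 3   # assumed threshold for flagging

print(z, is_outlier)
```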
16. What is a scatter plot? How is it useful?
A scatter plot shows the relationship between two numeric variables. Each point represents one observation. For example, plotting “Advertising Spend vs Sales” helps visualize whether higher spending is associated with more sales. A strong upward trend indicates positive correlation. Business Analysts use scatter plots to detect relationships before building regression models or making marketing decisions.
17. What is covariance vs correlation?
Covariance measures how two variables move together: positive means both tend to increase together, negative means one tends to decrease as the other increases. Correlation is a standardized version (between –1 and +1), making it easier to interpret. For example, a correlation of +0.85 between “Ad Budget” and “Sales” indicates a strong positive link. While covariance shows direction only, correlation shows both direction and strength, so analysts prefer correlation for comparison.
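To show how correlation is just covariance rescaled, here is a hand-rolled sketch using the sample (n – 1) formulas. The ad budget and sales figures are invented for illustration.

```python
import statistics

def sample_covariance(xs, ys):
    """Average product of deviations from the means (n - 1 denominator)."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

def correlation(xs, ys):
    """Covariance divided by both standard deviations: always in [-1, 1]."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

# Hypothetical ad budget and sales figures (₹ thousands).
ad_budget = [10, 20, 30, 40, 50]
sales = [25, 40, 65, 80, 95]

cov = sample_covariance(ad_budget, sales)
corr = correlation(ad_budget, sales)
print(f"covariance={cov:.1f}, correlation={corr:.3f}")
```

Rescaling sales to rupees instead of thousands would multiply the covariance by 1,000 but leave the correlation unchanged, which is why correlation is easier to compare across metrics.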
18. What is the difference between sample mean and population mean?
A sample mean is calculated from a subset of data, while the population mean is the true average of the entire group. In business, analysts often use sample means to estimate population behavior, e.g., average satisfaction from a survey sample to represent all customers. Understanding this difference is key to interpreting whether results are representative or may contain sampling error.
19. What is data normalization or standardization?
Normalization scales values into a fixed range (like 0–1), while standardization transforms data to have mean 0 and SD 1. For example, when combining “Revenue (₹)” and “Customer Count,” normalizing puts both on a comparable scale so that neither dominates the analysis. Analysts use it in dashboards, clustering, and regression to avoid bias from differing scales.
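Both transformations fit in a few lines. This sketch uses min-max scaling for normalization and the population SD for standardization. The revenue values are placeholders.

```python
import statistics

def min_max_scale(values):
    """Normalization: rescale values linearly into the 0-1 range."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("cannot scale a constant series")
    return [(v - low) / (high - low) for v in values]

def standardize(values):
    """Standardization: subtract the mean, divide by the (population) SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

revenue = [10, 20, 30, 40, 50]   # hypothetical revenue (₹ lakh)
normalized = min_max_scale(revenue)
standardized = standardize(revenue)

print(normalized)
print([round(z, 3) for z in standardized])
```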
20. What is the importance of descriptive statistics in business analysis?
Descriptive statistics summarize large datasets into understandable numbers like averages, spreads, and distributions. They help analysts quickly identify trends, outliers, or inconsistencies. For example, before analyzing sales growth, a BA may check mean sales, median order value, and variance to understand baseline performance. These insights guide next steps: whether to deep-dive, clean data, or test hypotheses.
SECTION 2: PROBABILITY & DECISION LOGIC (Q21–35)
21. What is probability and why is it important for Business Analysts?
Probability measures how likely an event is to occur, ranging from 0 (impossible) to 1 (certain). For Business Analysts, probability underpins risk estimation and forecasting. Example: The probability of a customer defaulting helps decide loan approval limits. Probability enables decisions under uncertainty, which is key in pricing, demand prediction, and customer segmentation.
22. Explain independent and dependent events with examples.
Independent events: One does not affect the other (e.g., coin tosses).
Dependent events: One influences the other (e.g., drawing cards without replacement).
In business, “customer revisiting” depends on satisfaction — a dependent event, while “website uptime” and “exchange rate” might be independent.
23. What is conditional probability?
It’s the probability of event A occurring given that event B has already occurred. Formula: P(A|B) = P(A ∩ B) / P(B), where P(B) > 0.
Example: The probability a customer upgrades given they used a free trial. Conditional probabilities help model targeted offers and retention analysis.
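The free-trial example can be worked through with counts. All the numbers in this sketch are assumed for illustration: 1,000 customers, 400 of whom used a trial, and 120 who both used a trial and upgraded.

```python
# Hypothetical customer counts.
total_customers = 1000
used_trial = 400             # event B: used the free trial
trial_and_upgraded = 120     # events A and B: used the trial AND upgraded

p_trial = used_trial / total_customers         # P(B)
p_both = trial_and_upgraded / total_customers  # P(A and B)

# P(A | B) = P(A and B) / P(B)
p_upgrade_given_trial = p_both / p_trial

print(f"P(upgrade | trial) = {p_upgrade_given_trial:.2f}")
```

The result equals 120 / 400: conditioning on B simply restricts attention to the customers who used the trial.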
24. Explain Bayes’ Theorem and its business application.
Bayes’ Theorem updates a prior probability after new evidence:
P(A|B) = [P(B|A) × P(A)] / P(B)
Example: In fraud detection, if a transaction pattern (B) is observed, Bayes updates the probability that it’s fraudulent (A). It is used in spam filtering, risk scoring, and recommendation systems.
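Here is a sketch of the fraud example with assumed rates: 1% of transactions are fraudulent, the pattern appears in 90% of fraudulent transactions, and it also appears in 5% of legitimate ones. None of these figures come from real data.

```python
def posterior(prior, p_evidence_given_a, p_evidence_given_not_a):
    """Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)."""
    # Total probability of the evidence B across both cases.
    p_evidence = (p_evidence_given_a * prior
                  + p_evidence_given_not_a * (1 - prior))
    return p_evidence_given_a * prior / p_evidence

# Assumed rates for illustration only.
p_fraud_given_pattern = posterior(
    prior=0.01,                   # P(fraud)
    p_evidence_given_a=0.90,      # P(pattern | fraud)
    p_evidence_given_not_a=0.05,  # P(pattern | legitimate)
)
print(f"{p_fraud_given_pattern:.3f}")
```

Even with a pattern that catches 90% of fraud, the updated probability is only about 15% because fraud is rare to begin with. Interviewers often probe this base-rate effect.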
25. What is the difference between mutually exclusive and exhaustive events?
Mutually exclusive: Events cannot occur together (e.g., win or lose a bid).
Exhaustive: All possible outcomes covered (win, lose, or tie).
Understanding both ensures analysts model complete decision scenarios.
26. What is expected value (EV) and how is it used in business?
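EV is the long-term average outcome weighted by probabilities. Formula: EV = Σ (probability of outcome × value of outcome).
Example: A campaign that returns ₹10,000 if it succeeds (40% chance) and nothing otherwise has EV = 0.4 × ₹10,000 + 0.6 × ₹0 = ₹4,000. EV is used for evaluating marketing ROI, pricing risk, and investment decisions.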
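The campaign example can be written as a sum over outcomes. This sketch assumes the ₹10,000 is the payoff on success and that failure pays nothing, which is one reading of the example.

```python
def expected_value(outcomes):
    """Sum of probability x payoff over all (probability, payoff) pairs."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * payoff for p, payoff in outcomes)

# Assumed reading: 40% chance of a ₹10,000 return, 60% chance of nothing.
campaign_ev = expected_value([(0.4, 10_000), (0.6, 0)])
print(campaign_ev)
```

Listing every outcome, including the zero-payoff one, keeps the probabilities honest: the check raises an error if they do not sum to 1.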
27. What is a covariance matrix and how is it used in analytics?
A covariance matrix is a square table showing covariances between multiple variables. Each cell represents how two variables move together: positive values indicate they increase together, negative values mean one increases while the other decreases. For example, a covariance matrix between “Ad Spend,” “Sales,” and “Leads” helps identify interdependence between marketing metrics. Business Analysts use it in multivariate analysis, PCA (Principal Component Analysis), and risk modeling to understand variable relationships in datasets with many features.
28. What is the difference between permutation and combination?
Permutation: Order matters (arranging passwords).
Combination: Order doesn’t matter (selecting products for offer bundles).
Example: A campaign choosing 3 offers from 10 possible uses combinations, not permutations.
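The difference shows up clearly in the counts. A short sketch using `math.comb` and `math.perm` (Python 3.8+) for the 3-from-10 example:

```python
import math

offers_available = 10
offers_chosen = 3

# Order does not matter: which 3 offers go into the bundle.
bundles = math.comb(offers_available, offers_chosen)

# Order matters: e.g. which offer is shown first, second, third.
ordered_line_ups = math.perm(offers_available, offers_chosen)

print(bundles, ordered_line_ups)
```

Each bundle of 3 offers can be arranged in 3! = 6 orders, which is why the permutation count is six times the combination count.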
29. What is a probability distribution?
A probability distribution is a function describing the likelihood of different outcomes. Example: Sales data following a normal distribution helps predict average performance, while a Poisson distribution models counts (like daily transactions). Understanding distributions helps in forecasting and risk modeling.
30. Explain normal distribution and its business importance.
A bell-shaped, symmetric curve where most values cluster around the mean. The 68–95–99.7 rule applies for deviations of 1σ, 2σ, and 3σ. Example: Customer delivery times are often modeled as approximately normal; analysts use this for SLA reliability estimation.
31. What is sampling and why is it used?
Sampling selects a subset of data to represent the entire population. Example: Surveying 500 out of 10,000 customers to estimate satisfaction. It saves time, cost, and effort while still enabling statistically valid conclusions if sampling is random.
32. What are the main types of sampling methods?
Simple random sampling: Every member has an equal chance of selection.
Stratified: Divide the population into groups, then sample from each (e.g., by region).
Systematic: Select every nth record.
Cluster: Randomly pick entire groups.
Stratified sampling is often used in business surveys to ensure each key group is represented.
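The four methods listed above can be sketched with the standard `random` module. The customer list, region names, and sample sizes here are all hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable
REGIONS = ("North", "South", "East", "West")

# Hypothetical population: 1,000 customers spread evenly over 4 regions.
customers = [{"id": i, "region": REGIONS[i % 4]} for i in range(1000)]

# Simple random: every customer has the same chance of selection.
simple = rng.sample(customers, 100)

# Systematic: random starting point, then every 10th record.
step = 10
systematic = customers[rng.randrange(step)::step]

# Group customers by region for the last two methods.
by_region = {}
for customer in customers:
    by_region.setdefault(customer["region"], []).append(customer)

# Stratified: draw 25 customers from each region.
stratified = [c for members in by_region.values()
              for c in rng.sample(members, 25)]

# Cluster: pick 2 whole regions at random and keep everyone in them.
chosen_regions = rng.sample(REGIONS, 2)
cluster = [c for region in chosen_regions for c in by_region[region]]

print(len(simple), len(systematic), len(stratified), len(cluster))
```

Only the stratified sample guarantees a fixed number of customers per region. The simple random sample will usually be close to balanced, but not exactly.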
33. What is Monte Carlo simulation and how is it used in business decisions?
Monte Carlo simulation uses random sampling and probability distributions to estimate possible outcomes of uncertain events. For example, a Business Analyst can model future profit by simulating thousands of possible sales, cost, and price combinations. This technique helps quantify risk, forecast ranges instead of single-point estimates, and make more informed decisions in finance, supply chain, and project management.
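A toy version of the profit simulation is sketched below. Every input is an assumption chosen for illustration: a fixed price of ₹50, units sold drawn from a normal distribution (mean 1,000, SD 100), unit cost uniform between ₹28 and ₹32, and fixed costs of ₹5,000.

```python
import random
import statistics

rng = random.Random(123)  # fixed seed so the run is repeatable

PRICE = 50.0          # assumed selling price per unit (₹)
FIXED_COSTS = 5000.0  # assumed fixed costs (₹)

def simulate_profit():
    """One random scenario: draw demand and unit cost, return the profit."""
    units = max(0.0, rng.gauss(1000, 100))  # demand cannot go negative
    unit_cost = rng.uniform(28, 32)
    return units * (PRICE - unit_cost) - FIXED_COSTS

profits = [simulate_profit() for _ in range(10_000)]

mean_profit = statistics.fmean(profits)
cut_points = statistics.quantiles(profits, n=20)  # 5%, 10%, ..., 95%
p5, p95 = cut_points[0], cut_points[-1]

print(f"mean={mean_profit:,.0f}, 5th pct={p5:,.0f}, 95th pct={p95:,.0f}")
```

Reporting the 5th to 95th percentile band alongside the mean is what turns a single-point forecast into a range.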
34. What is the difference between theoretical and empirical probability?
Theoretical: Based on reasoning (e.g., the chance of any given face on a fair die = 1/6).
Empirical: Based on observed data (e.g., 16% cart abandonment).
Business analysts rely more on empirical probability derived from customer behavior and market data.
35. What is decision tree analysis in business?
A graphical method to evaluate decisions under uncertainty. Each branch represents a choice or outcome with its probability and payoff. Example: Deciding whether to launch a new product, where branches show success or failure with associated profits and likelihoods. It helps compare expected values and choose the best option.
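A two-branch launch decision can be evaluated by comparing expected values. The probabilities and payoffs in this sketch are assumptions: a 60% chance of ₹5,00,000 profit and a 40% chance of a ₹2,00,000 loss if the product launches, and zero if it does not.

```python
# Each decision maps to its possible outcomes as (probability, payoff in ₹).
decisions = {
    "launch": [(0.6, 500_000), (0.4, -200_000)],  # success / failure
    "do not launch": [(1.0, 0)],                   # status quo
}

expected_values = {
    name: sum(p * payoff for p, payoff in outcomes)
    for name, outcomes in decisions.items()
}
best_decision = max(expected_values, key=expected_values.get)

print(expected_values, best_decision)
```

Expected value ignores risk appetite. A firm that cannot absorb a ₹2,00,000 loss might still decline the launch despite its higher EV.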
Part 2: Hypothesis Testing, Regression & Forecasting for Business Analysts and Data Analysts




