top of page

Top 100 SQL, MySQL, SQL Server, PostgreSQL, and MongoDB Interview Questions & Answers — A Complete Guide for Aspiring Business Analysts (12 LPA Ready)

Updated: Oct 28

In the world of business analytics, data is the foundation of every decision — and SQL remains the most essential skill for anyone aspiring to become a Business Analyst. Whether you’re working with structured databases like MySQL, SQL Server, or PostgreSQL, or unstructured systems like MongoDB, the ability to extract, clean, and analyze data efficiently is what differentiates a good analyst from a great one.


This comprehensive guide — “Top 100 SQL, MySQL, SQL Server, PostgreSQL, and MongoDB Interview Questions & Answers” — is carefully curated to help you prepare for technical interviews at top organizations offering starting packages around 12 LPA. The questions are designed from an interviewer’s perspective, focusing on practical knowledge, clarity of concepts, and real-world use cases relevant to Business Analysts.


Each question is accompanied by an easy-to-understand answer and concise explanation, ensuring you not only memorize the syntax but also grasp when and why to use it. The collection covers everything from database fundamentals and joins to transactions, indexing, performance optimization, and NoSQL concepts.


Whether you’re a fresher stepping into analytics or an experienced learner aiming to strengthen your foundations, this guide will equip you with the confidence and technical depth to ace your BI and analytics interviews.



Holographic display in a server room shows "Top 100 Interview Questions" with SQL and MySQL logos, glowing blue text and circuitry.

 

 

SQL / Relational Database Basics


  1. What is a database? What is an RDBMS?

    Answer:

    A database is a structured collection of data. An RDBMS (Relational Database Management System) stores data in tables (relations) with rows and columns, supports SQL, relationships (keys), constraints, transactions.

    Explanation: In RDBMS you define schemas, enforce integrity, query via SQL.


  2. What is SQL?

    Answer:

    SQL (Structured Query Language) is the standard language used to communicate with relational databases for querying, inserting, updating, deleting, and managing schema.

    Explanation: SQL has sublanguages like DDL, DML, DCL, TCL.


  3. What are tables, rows, columns in SQL?

    Answer:

    • Table: structure (like “Customer”)

    • Row / record: one entry in the table (customer A)

    • Column / field: attribute (Name, Age, etc.)


  4. What is a primary key? Why is it important?

    Answer:

    A primary key uniquely identifies each row in a table (e.g. CustomerID). It must be non-null, unique.

    Explanation: Enforces entity integrity, used in relationships.


  5. What is a foreign key?

    Answer:

    A foreign key is a column (or set of columns) in one table that refers to the primary key in another table, enforcing referential integrity (e.g. Order.CustomerID refers to Customer.CustomerID).


  6. What is normalization? What are normal forms (1NF, 2NF, 3NF)?

    Answer:

    Normalization is organizing data to reduce redundancy and dependency.

    • 1NF: atomic values (no repeating groups)

    • 2NF: in 1NF, and no partial dependency (non-key attributes depend on full primary key)

    • 3NF: in 2NF, and no transitive dependency (non-key depends on non-key)

      Explanation: Normal forms help maintain data integrity and reduce anomalies.


  7. When might you denormalize?

    Answer:

    You may denormalize to improve read performance, simplify queries, or avoid many joins in reporting scenarios (common in BI). But it trades off update anomalies.


  8. What is a unique constraint / unique key?

    Answer:

    A unique constraint ensures that values in a column (or set) are distinct (no duplicates), though they may allow one null (depending on DB).


  9. What is an index? Why use it?

    Answer:

    An index is a data structure that speeds up query searches on a column (or columns). It helps SELECT queries, filtering, joins.

    Explanation / caveat: But too many indexes slow down INSERT/UPDATE/DELETE operations. Also, index on low-cardinality columns may not help.


  10. What is the difference between clustered and non-clustered index (SQL Server)?

    Answer:

    • Clustered index: determines physical order of data in the table (only one per table).

    • Non-clustered: separate structure, pointer to table rows (multiple allowed).


  11. What is SQL SELECT syntax (basic)?

    Answer:

    SQL query code snippet in black and purple: "SELECT column1, column2 FROM TableName WHERE condition ORDER BY column1;".

  1. What is the difference between WHERE and HAVING?

    Answer:

    • WHERE filters rows before aggregation (applies to underlying table).

    • HAVING filters after aggregation (applies to grouped results).


  2. What is GROUP BY?

    Answer:

    GROUP BY groups rows with the same values in specified columns so you can compute aggregates (SUM, COUNT) per group.


  3. What is the use of aggregate functions? Name some.

    Answer:

    Aggregate functions compute summary metrics: SUM(), COUNT(), AVG(), MIN(), MAX().


  4. What is a join? Types of joins?

    Answer:

    A join links two tables based on a related column. Types:

    • INNER JOIN (only matching rows)

    • LEFT (OUTER) JOIN (all from left + matching from right)

    • RIGHT (OUTER) JOIN

    • FULL (OUTER) JOIN

    • CROSS JOIN (Cartesian product)


  5. What is a self-join?

    Answer:

    Self-join is joining a table with itself (using aliases) to compare rows in same table (e.g. employee-manager relationship).


  6. What is a subquery / nested query?

    Answer:

    A query inside another query (in SELECT, FROM, or WHERE). Used to filter, derive, or compute intermediate results.


  7. What is a correlated subquery?

    Answer:

    A subquery that references columns from the outer query, so it’s evaluated per row of the outer query.


  8. What is a union vs union all?

    Answer:

    • UNION combines results and removes duplicates

    • UNION ALL combines all, keeps duplicates


  9. What is EXCEPT / INTERSECT?

    Answer:

    • INTERSECT: returns common rows between two queries

    • EXCEPT: returns rows in first query not in second (some DBs use MINUS)


  10. What is a view?

    Answer:

    A virtual / stored query whose result acts like a table (you can query from it). It does not store data (unless materialized).


  11. What is a materialized view?

    Answer:

    A view whose results are stored (materialized) and refreshed periodically, so query performance is faster.


  12. What is a stored procedure?

    Answer:

    A precompiled SQL program / routine stored in the database that can take parameters, perform logic (queries, updates) and return results.


  13. What is a trigger?

    Answer:

    A trigger is a procedure automatically invoked (fired) in response to certain events (INSERT, UPDATE, DELETE) on a table.


  14. What is transaction? What are ACID properties?

    Answer:

    A transaction is a unit of work (one or more SQL operations) executed as a single logical operation.

    ACID:

    • Atomicity: all or nothing

    • Consistency: data transitions from one valid state to another

    • Isolation: parallel transactions don’t interfere

    • Durability: committed changes persist


  15. What is isolation level? Name some levels.

    Answer:

    Isolation levels define how/when the changes by one transaction become visible to others. Examples:

    • READ UNCOMMITTED

    • READ COMMITTED

    • REPEATABLE READ

    • SERIALIZABLE

      Some DBs also have Snapshot Isolation (PostgreSQL, SQL Server).


  16. What are deadlocks? How are they resolved?

    Answer:

    Deadlock is a situation where two (or more) transactions wait for each other, causing a standstill. DB’s deadlock detection mechanism kills one transaction (rollback) to resolve.


  17. What is normalization vs denormalization trade-off?

    Answer:

    Normalization reduces redundancy and anomalies but can lead to many joins (slower reads). Denormalization reduces joins and can speed reads, but introduces redundancy and risk of inconsistencies.


  18. What is partitioning?

    Answer:

    Splitting a large table into smaller, manageable pieces (partitions) usually by a key (date, range). This improves performance and manageability.


  19. What is indexing strategy for reporting / analytics?

    Answer:

    Use indexes on key filter/join columns, avoid indexing columns with low cardinality, use composite indexes (multiple columns) for multi-column filters, consider covering indexes to include non-key columns.


  20. What is optimizer / query plan / execution plan?

    Answer:

    The optimizer is the DB engine component that selects an efficient execution strategy (join order, index usage). The execution plan is its output (how query will run). You can view it (EXPLAIN in many DBs).


  21. How to read / use EXPLAIN / EXPLAIN ANALYZE?

    Answer:

    EXPLAIN shows the plan (estimated cost). EXPLAIN ANALYZE executes and shows actual runtime. Use these to find bottlenecks (full scans, expensive joins).


  22. What is a common table expression (CTE)?

    Answer:

    A temporary named result (using WITH) you can reference within the main query. Helps modularize complex queries and recursion.


  23. What is window / analytic function? Give examples.

    Answer:

    Functions that operate over a “window” of rows (relative to current row) without collapsing rows. Examples:

    • ROW_NUMBER() OVER (PARTITION BY X ORDER BY Y)

    • RANK(), DENSE_RANK(), LAG(), LEAD(), SUM(...) OVER (...)


  24. What is CASE expression?

    Answer:

    Conditional logic inside SQL, similar to “if … then”:

    SQL code snippet displaying a CASE statement with conditions, results, and a default. Text is highlighted in purple and black.

MySQL / SQL Server / PostgreSQL Specific / Advanced


  1. What are differences between MySQL, SQL Server, PostgreSQL?

    Answer (compare key features):

    • MySQL is commonly used as open-source, often used in web stacks; older versions had limitations in transactions (InnoDB vs MyISAM)

    • SQL Server (by Microsoft) has deep integration in Windows, strong tooling, T-SQL extensions, CLR, etc.

    • PostgreSQL is highly standards-compliant, supports advanced features (JSON, window functions, full text search), extensible, open source.

      Note / update: PostgreSQL continues to add capabilities (e.g. improved partitioning, parallelism) in newer versions (v14, v15, etc.).


  2. What storage engines exist in MySQL?

    Answer:

    Common engines:

    • InnoDB (transactional, supports foreign keys)

    • MyISAM (non-transactional, faster reads but no foreign keys)

    • Others: MEMORY, CSV, etc.


  3. What is T-SQL in SQL Server? What are its extensions over standard SQL?

    Answer:

    T-SQL is Microsoft’s procedural extension to SQL used in SQL Server. It has procedural constructs (IF, WHILE), system functions, built-in stored procedures, error handling, etc.


  4. What is PL/pgSQL in PostgreSQL?

    Answer:

    It is PostgreSQL’s procedural language allowing stored procedures / functions, loops, variables, control flow inside the database.


  5. How to limit results / pagination (different DBs)?

    Answer:

    • MySQL / PostgreSQL: LIMIT offset, count or LIMIT count OFFSET offset

    • SQL Server (2012+): OFFSET ... FETCH NEXT ...

    • Older SQL Server: use ROW_NUMBER() / subquery or TOP + subquery


  6. What is SERIAL / BIGSERIAL in PostgreSQL?

    Answer:

    SERIAL is a shorthand for auto-increment integer (integer + sequence). BIGSERIAL is for larger range. Use these for auto-generated primary keys.


  7. What is IDENTITY in SQL Server?

    Answer:

    IDENTITY(seed, increment) defines an auto-increment column. SQL Server’s analog for SERIAL.


  8. What is AUTO_INCREMENT in MySQL?

    Answer:

    A column attribute to automatically generate incremental values (used often for primary key).


  9. What is schema / database / catalog?

    Answer:

    • Database: container of schemas / data

    • Schema: namespace inside a database (group of tables etc.)

    • Catalog: high-level container (varies by DB).

      In PostgreSQL, a database contains schemas; in SQL Server, you can reference database.schema.table.


  10. How to handle case sensitivity / quoting?

    Answer:

    SQL standard: identifiers are case-insensitive by default. Some DBs (PostgreSQL) fold to lowercase unless quoted. Use double quotes / backticks when needed.


  11. What is JSON / JSONB support (PostgreSQL)?

    Answer:

    PostgreSQL supports json and jsonb data types. jsonb is binary, indexed, more efficient for queries. You can query JSON fields (e.g. ->>, ->) and index on them (GIN indexes).


  12. What is full-text search?

    Answer:

    A feature in many DBs to index and efficiently query text content (words, phrases) using inverted index, ranking, stemming.

    Example: SQL Server (Full-Text Index), PostgreSQL (tsvector, tsquery), MySQL (FULLTEXT index in MyISAM / InnoDB).


  13. What is EXPLAIN / EXPLAIN ANALYZE in PostgreSQL / MySQL / SQL Server?

    Answer:

    Similar to earlier: show execution plan. In PostgreSQL, EXPLAIN ANALYZE runs the query and shows actual timing. In MySQL, EXPLAIN shows type, key used, etc. SQL Server has “Show Execution Plan”.


  14. What are window functions in PostgreSQL & SQL Server?

    Answer:

    Same as earlier: ROW_NUMBER() OVER (…), RANK(), LEAD(), LAG(), cumulative sums with OVER (PARTITION / ORDER BY).


  15. What is CTE recursive query?

    Answer:

    Using WITH RECURSIVE you can write recursive queries (e.g. hierarchical data, such as parent-child).


    Example (PostgreSQL / SQL Server):

    ree

  1. What is a window frame (ROWS vs RANGE)?

    Answer:

    When using window functions, you can define the frame:

    • ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

    • RANGE uses value-based frame (values rather than row counts). Behavior differs in ORDER BY and value duplicates.


  2. What is UPSERT / merge functionality?

    Answer:

    UPSERT means “insert or update if exists”.

    • PostgreSQL: INSERT ... ON CONFLICT (key) DO UPDATE ...

    • MySQL: INSERT ... ON DUPLICATE KEY UPDATE ...

    • SQL Server: MERGE statement


  3. What are transactions in Postgres / SQL Server / MySQL?

    Answer:

    All support transactions. In MySQL, use InnoDB engine. Default isolation levels differ (MySQL default is REPEATABLE READ). In SQL Server default is READ COMMITTED.


  4. What is VACUUM / ANALYZE in PostgreSQL?

    Answer:

    VACUUM reclaims storage from deleted / outdated rows (PostgreSQL MVCC).

    ANALYZE updates statistics for query planner. In newer versions, autovacuum runs automatically.


  5. What are extensions in PostgreSQL?

    Answer:

    Extensions add new functionality (e.g. postgis, uuid-ossp). You can install via CREATE EXTENSION.


  6. What are user-defined functions (UDF)?

    Answer:

    Functions you create (in T-SQL, PL/pgSQL, or MySQL) to encapsulate reusable logic returning scalar or table.


  7. What is indexing on JSON / JSONB (PostgreSQL)?

    Answer:

    You can use GIN or GiST indexes on JSONB data to speed query on keys, existence, text search.


  8. What is a sequence / generator?

    Answer:

    Objects that generate unique incremental numbers (used in PostgreSQL). SERIAL is shorthand creating a sequence. You can use sequences manually (NEXTVAL / CURRVAL).


  9. Difference between DELETE and TRUNCATE?

    Answer:

    • DELETE: deletes rows, logs each deletion, can have WHERE, triggers fire.

    • TRUNCATE: faster, deallocates whole table, no row-level triggers, resets identity (depending on DB).


  10. Difference between DROP and DELETE?

    Answer:

    • DROP TABLE: removes the table structure and data entirely.

    • DELETE: just deletes data (or some via WHERE).


  11. What is ON DELETE CASCADE / ON UPDATE CASCADE in FK constraints?

    Answer:

    These are referential actions. ON DELETE CASCADE means when parent row is deleted, child rows are also deleted automatically. Similar for update.


  12. What are views vs materialized views (in PostgreSQL / SQL Server)?

    Answer:

    As earlier: views are virtual; materialized views are stored. In PostgreSQL, you refresh materialized views manually (or via REFRESH MATERIALIZED VIEW). SQL Server calls them indexed views sometimes.


  13. What is WITH (INDEX ...) hint in SQL Server?

    Answer:

    Query hints to force using a specific index. Use carefully; optimizer typically knows best.


  14. What is table locking / row locking?

    Answer:

    When operations run, DB locks data for consistency. Some operations lock entire tables (e.g. DDL). Row-level locks are more granular (less blocking). SQL Server, PostgreSQL support row-level locks.


  15. What is pg_hba.conf in PostgreSQL?

    Answer:

    It is the host-based authentication file controlling which hosts / users / databases can connect, and which authentication method (md5, trust, peer, etc.) is allowed.


MongoDB / NoSQL


  1. What is MongoDB?

    Answer:

    MongoDB is a NoSQL, document-oriented database. It stores data as JSON-like documents (BSON). It is schema-flexible.


  2. Why use MongoDB / NoSQL vs relational DB?

    Answer:

    Use when you need flexible schema, horizontal scaling (sharding), handling unstructured / semi-structured data, fast writes, nested data. Relational is stronger for strict consistency, relationships, complex transactions.


  3. What is a document and a collection in MongoDB?

    Answer:

    • Document: a JSON / BSON object (analogous to a row)

    • Collection: group of documents (analogous to table)


  4. What is _id field in MongoDB?

    Answer:

    A unique primary key field automatically created for each document (type ObjectId by default).


  5. What is schema design in MongoDB (embedding vs referencing)?

    Answer:

    • Embedding: nest related data inside document (good for one-to-few)

    • Referencing: store relationships via foreign keys / references (manual) for many-to-many or large arrays.


  6. What is indexing in MongoDB?

    Answer:

    You can create indexes on fields in documents (single field, compound, multi-key, text). Indexes speed query/filter operations.


  7. What is a compound index?

    Answer:

    An index on multiple fields (e.g., { a:1, b:-1 }). Helps queries filtering/sorting on both fields.


  8. What is a multikey index?

    Answer:

    An index over an array field: if a document field is an array, multikey index indexes each element in the array.


  9. What is text index / full-text search in MongoDB?

    Answer:

    You can create a text index on string fields to enable text search, with weights, language stemmers, phrase matching.


  10. What is aggregation pipeline?

    Answer:

    A sequence of stages ($match, $group, $project, $sort, $limit, $lookup) that processes documents in steps to compute results.


  11. What is $match, $project, $group in pipeline?

    Answer:

    • $match: filter documents

    • $project: shape/select fields, compute new fields

    • $group: aggregate by key(s) (like SQL GROUP BY)

      Other stages help sort, join, limit etc.


  12. What is $lookup in the pipeline?

    Answer:

    $lookup is like a left join: join documents from another collection into the pipeline based on matching fields.


  13. What is $unwind?

    Answer:

    $unwind “flattens” an array field so that each element becomes a separate document in the pipeline.


  14. What is update operators ($set, $inc, $push, $pull)?

    Answer:

    • $set: set/replace value of a field

    • $inc: increment numeric field

    • $push: add element to an array

    • $pull: remove matching element from array


  15. What is atomic operation in MongoDB?

    Answer:

    Single-document operations are atomic (updates to one document). Multi-document operations are not atomic by default (unless using transactions in WiredTiger).


  16. What is a replica set?

    Answer:

    A set of MongoDB servers (primary + secondaries) replicating data for high availability and failover.


  17. What is sharding?

    Answer:

    Horizontal partitioning by distributing data across multiple machines using a shard key. Helps scale large datasets.


  18. What are transactions in MongoDB?

    Answer:

    Since version 4.0+, MongoDB supports multi-document ACID transactions (in replica sets). In version 4.2+, support across sharded clusters as well.


  19. What is TTL index?

    Answer:

    Time to live index: automatically removes documents after a specified time (based on a date field). Useful for event logs, sessions.


  20. What is explain() in MongoDB?

    Answer:

    It returns execution stats / query plan for a query or aggregation pipeline, to help analyze performance and index usage.


  21. What is collation?

    Answer:

    Collation defines string comparison rules (case sensitivity, accent sensitivity) in queries / indexes.


  22. What is a capped collection?

    Answer:

    A fixed-size collection that behaves like a circular queue: new inserts overwrite oldest when size limit reached.


  23. What is schema validation in MongoDB?

    Answer:

    You can define JSON Schema or rules on collections to enforce field types, required fields, value constraints, etc.


  24. What are change streams?

    Answer:

    Change streams allow applications to subscribe to real-time changes (insert, update, delete) in a collection or database.


  25. What is GridFS?

    Answer:

    A MongoDB specification and driver support for storing large files (bigger than BSON document size) by splitting them into chunks.


  26. What is difference between findOne() and find()?

    Answer:

    findOne() returns a single document (first match), find() returns a cursor over multiple documents.


  27. How do you do pagination in MongoDB?

    Answer:

    Use .skip() + .limit() (but this is inefficient for large offsets), or use range-based pagination with index (e.g. query > last_id LIMIT N).


  28. How to update many documents?

    Answer:

    Use updateMany() or update() with { multi: true } depending on driver, specifying filter and update operators.


  29. What is write concern and read concern?

    Answer:

    • Write concern: acknowledgment level for writes (e.g. w:1, w:majority)

    • Read concern: level of isolation for reads (e.g. local, majority, linearizable)


  30. What is majority read / causal consistency?

    Answer:

    • Majority read: reads from nodes that have reached write majority, ensuring you see the latest durable data

    • Causal consistency: ensures operations see results in a causal order


  31. What is profiler / monitoring in MongoDB?

    Answer:

    MongoDB has a database profiler to log slow operations. You can also use monitoring tools like mongostat, mongotop, Atlas metrics.


  32. What is TTL delete behavior?

    Answer:

    TTL background thread periodically deletes expired documents (not instant at expiry time).


  33. What is the maximum BSON document size?

    Answer:

    16 MB by default (as of current stable versions). Large documents may require GridFS or chunking.


  34. What is the difference between embedding and referencing in schema design — give example?

    Answer:

    • Embedding: In an Order document, embed items array with product info.

    • Referencing: In Order, store product_ids and fetch from Products collection via $lookup.

      Use embedding for fast reads when data is small and local; referencing when data is large, changing, or many-to-many.


  35. What are trade-offs in MongoDB schema design?

    Answer:

    - Flexibility vs data integrity

    - Read performance (embedding) vs update complexity

    - Indexing complexity

    - Storage overhead

    - Transaction costs if multi-document updates


Tips & Additional Notes


  • While memorizing, focus more on concepts and when to use which feature than verbatim syntax.

  • For SQL relational ones, practice on sample datasets using MySQL / PostgreSQL / SQL Server to get hands-on.

  • For MongoDB, set up a local instance (or use cloud) and try CRUD, indexing, aggregation pipelines.

  • Stay updated: e.g. PostgreSQL gets new features (partitioning, performance) often; MongoDB adds features like multi-shard transactions in newer versions.

  • Be ready for follow-up or “why” questions — interviewers expect you to reason.

  • Use real-world examples (sales data, customers, orders) while explaining to show your thinking.

Comments


bottom of page