Sampling: Choosing Who to Study and How Many

Cluster Post 6  |  Module 3: Research Methodologies

From Concept to Submission Series  |  2026


The module overview introduced probability and purposive sampling with brief descriptions. This post goes deeper: the logic behind each sampling approach and how it determines what conclusions you can draw, how to write a sampling justification that satisfies reviewers, what saturation actually means and how to demonstrate it, and the most common sampling mistakes that generate examiner queries.

The Fundamental Logic: What Sampling Is For

Sampling is not primarily a practical problem — it is an epistemological one. The sampling decisions you make determine what population your findings apply to, what conclusions you can legitimately draw, and what claims of generalisability you can make. Getting the logic wrong produces findings that are technically sound but intellectually indefensible.

The core question every sampling decision must answer is: what do I need my sample to be in order for my findings to be valid? For quantitative research, validity usually requires representativeness — the sample should be structured so that its characteristics reflect those of the population you want to generalise to. For qualitative research, validity requires informativeness — the sample should be selected so that participants can provide rich, relevant insight into the phenomenon you are studying.

These are different logics, and they produce different sampling strategies. Applying representativeness logic to qualitative sampling — selecting participants to be demographically representative of a population — is a category error that produces a sample too large for in-depth analysis and too varied for thematic depth. Applying informativeness logic to quantitative sampling — selecting participants because they have interesting things to say — produces a biased sample that cannot support statistical inference.

Probability Sampling: When and How

Probability sampling gives every member of the target population a known, non-zero chance of selection. This property is what enables statistical inference from a sample to a population — the mathematical logic of inferential statistics requires that observations are drawn by a random process.

Simple random sampling

Every member of the population has an equal probability of selection. This requires a complete sampling frame: a list of every member of the population from which to sample randomly. In practice, complete sampling frames are rarely available for the populations social scientists study. A complete list of all first-year students at Indian government colleges does not exist centrally; a list of students at a specific college does.
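A minimal sketch of a simple random draw from a college-level frame is shown here. The enrolment IDs, frame size, sample size, and seed are all hypothetical; the point is only that the draw is made by a documented random process over a complete list.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement.

    Every unit in the frame has the same probability of selection.
    """
    if n > len(frame):
        raise ValueError("sample size exceeds the sampling frame")
    rng = random.Random(seed)  # a recorded seed makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical frame: enrolment IDs of first-year students at one college.
frame = [f"STU{i:04d}" for i in range(1, 1201)]
sample = simple_random_sample(frame, 120, seed=2026)
```

Recording the seed alongside the frame lets an examiner verify that the sample was drawn as described.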

Stratified random sampling

The population is divided into subgroups (strata) defined by a characteristic relevant to the research question, and random samples are drawn from each stratum. Stratification ensures representation of each subgroup and increases precision when the characteristic used to stratify is correlated with the outcome of interest.

Example: in a study examining peer mentoring and retention across disciplines, stratify by discipline (Arts, Science, Commerce) and sample randomly from each stratum in proportion to its share of the population (for instance 40% Arts, 30% Science, 30% Commerce). This ensures that discipline-level variation is represented and, provided each stratum's subsample is large enough, allows discipline-specific analyses.
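Proportional allocation can be sketched as follows. The stratum sizes (2,000 Arts, 1,500 Science, 1,500 Commerce) are invented to match the 40/30/30 split above, and the rounding rule (largest remainder) is one reasonable choice among several, not a prescribed standard.

```python
import random


def proportional_allocation(stratum_sizes, total_n):
    """Split total_n across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    exact = {s: total_n * size / population for s, size in stratum_sizes.items()}
    alloc = {s: int(x) for s, x in exact.items()}  # round down first
    leftover = total_n - sum(alloc.values())
    # Give any remaining places to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc


def stratified_sample(frame_by_stratum, total_n, seed=None):
    """Draw a simple random sample within each stratum."""
    rng = random.Random(seed)
    sizes = {s: len(units) for s, units in frame_by_stratum.items()}
    alloc = proportional_allocation(sizes, total_n)
    return {s: rng.sample(units, alloc[s]) for s, units in frame_by_stratum.items()}


# Hypothetical frames, one list of student IDs per discipline.
frame_by_stratum = {
    "Arts": [f"A{i}" for i in range(2000)],
    "Science": [f"S{i}" for i in range(1500)],
    "Commerce": [f"C{i}" for i in range(1500)],
}
stratified = stratified_sample(frame_by_stratum, 300, seed=1)
```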

Cluster sampling — and its hidden limitation

When a complete individual-level sampling frame does not exist but a frame of groups (clusters) does, randomly sample clusters and then study all or a random sample of members within selected clusters. Useful for geographically dispersed populations: randomly sample colleges from a district, then survey all first-year students at selected colleges.

The limitation: cluster sampling introduces design effects — members of the same cluster are more similar to each other than to members of other clusters, which reduces effective sample size. Standard errors must be adjusted for clustering in the analysis, using software that accounts for this (Stata, R’s survey package). Many researchers using cluster samples apply standard error formulas appropriate for simple random samples, which underestimates uncertainty and produces misleadingly narrow confidence intervals.
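The size of the problem can be approximated with the standard design effect formula for roughly equal-sized clusters, DEFF = 1 + (m - 1) × ICC, where m is the average cluster size and ICC is the intraclass correlation. The sketch below applies that approximation; the figures (ten colleges of 45 students each, ICC of 0.05) are purely illustrative, and a real analysis should estimate clustering from the data using survey-aware software.

```python
from math import sqrt


def design_effect(avg_cluster_size, icc):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    if not 0 <= icc <= 1:
        raise ValueError("this sketch assumes an ICC between 0 and 1")
    return 1 + (avg_cluster_size - 1) * icc


def effective_sample_size(n, avg_cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n / design_effect(avg_cluster_size, icc)


def se_inflation(avg_cluster_size, icc):
    """Factor by which simple-random-sample standard errors are too small."""
    return sqrt(design_effect(avg_cluster_size, icc))


# Illustrative values only: 10 colleges x 45 students, ICC = 0.05.
deff = design_effect(45, 0.05)
n_eff = effective_sample_size(450, 45, 0.05)
```

With these illustrative numbers the design effect is about 3.2, so 450 clustered respondents carry roughly the information of 141 independent ones. Even a small intraclass correlation matters when clusters are large.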

Convenience sampling: when and how to defend it

Much quantitative research in Indian universities relies on convenience samples: participants who happen to be available rather than randomly selected. This is a pragmatic reality, not automatically a fatal flaw, but it requires explicit acknowledgement and careful limitation of the conclusions drawn.

A convenience sample cannot support claims about a defined population beyond the sample itself. A study of 450 students at three Rajasthan government colleges cannot claim to represent Indian government college students generally. It can claim to examine the phenomenon in a specific, described context, and it can generate findings that are plausible candidates for generalisation pending replication in other contexts.

The methodology chapter must: describe the sampling frame precisely (which colleges, which year groups, which departments), explain why this sample was accessible and what it is a sample of, and constrain the generalisability claims in the discussion to match the sample actually studied.

Purposive Sampling: The Logic of Qualitative Selection

Purposive sampling selects participants or cases deliberately, based on characteristics relevant to the research question, rather than randomly. This is not a compromise forced by the impossibility of random sampling — it is the logically correct approach when the goal is insight rather than statistical representativeness.

Purposive strategies and when to use each:

  • Maximum variation: Select participants who vary widely on characteristics of interest. Patterns that hold across maximum variation are more robust than patterns found only in a narrow population. Good for exploratory research where you want to map the full range of a phenomenon.
  • Homogeneous: Select participants who share a key characteristic. Useful when you want to understand a specific subgroup in depth without variation on the characteristic of interest.
  • Typical case: Select participants or cases that are representative of the common experience. Useful when you want to describe what is typical rather than map the full range.
  • Critical case: Select a case where the phenomenon should be especially visible or where your theory should especially apply. If it does not hold here, it is unlikely to hold elsewhere.
  • Extreme or deviant case: Select unusual cases to understand boundary conditions. What happens at the extremes illuminates what happens in the middle.
  • Snowball: Ask initial participants to refer others. Useful for reaching populations that are hard to identify or access: stigmatised groups, professional networks, covert communities.

The key to writing a purposive sampling justification is explaining not just what strategy you used but why that strategy fits your research question. Maximum variation sampling is appropriate when you want to identify patterns that hold across diverse cases; it is not appropriate when you want to understand a specific, bounded experience. The justification connects the sampling logic to the research question.

Weak: “Purposive sampling was used to select thirty participants.”

Strong: “Purposive sampling using a maximum variation strategy was used to select thirty participants who varied across three dimensions identified as theoretically relevant: discipline (Arts, Science, Commerce), geographic origin (urban, semi-urban, rural), and first-generation status (first-generation vs. continuing-generation). Maximum variation was chosen because the research question concerns peer mentoring experience broadly across the government college population; selecting a homogeneous sample would risk generating findings applicable only to a narrow subgroup.”
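One way to back a maximum variation claim with evidence is to tabulate recruited participants against the dimensions named in the justification. The sketch below does this for the three dimensions in the example; the field names and the two sample records are hypothetical. Maximum variation does not require every combination to be filled, but the tabulation shows where the gaps are so they can be acknowledged.

```python
from collections import Counter
from itertools import product

# Dimensions taken from the example justification; field names are assumptions.
DIMENSIONS = {
    "discipline": ["Arts", "Science", "Commerce"],
    "origin": ["urban", "semi-urban", "rural"],
    "first_generation": [True, False],
}


def variation_coverage(participants, dimensions=DIMENSIONS):
    """Count participants per combination of dimension levels and list empty cells."""
    keys = list(dimensions)
    counts = Counter(tuple(p[k] for k in keys) for p in participants)
    empty = [cell for cell in product(*dimensions.values()) if counts[cell] == 0]
    return counts, empty


# Two hypothetical participant records.
participants = [
    {"discipline": "Arts", "origin": "rural", "first_generation": True},
    {"discipline": "Science", "origin": "urban", "first_generation": False},
]
counts, empty_cells = variation_coverage(participants)
```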

Sample Size: The Right Answer for Each Approach

Quantitative sample size: power analysis

The module overview gives rough guidelines — 30 to 50 per group for basic comparisons, several hundred for survey research. These are useful starting points but not substitutes for a power analysis. As explained in Cluster Post 2, a power analysis takes your expected effect size, your alpha level, and your desired power and calculates the minimum sample size needed to detect the effect reliably.

Report your power analysis in the methodology chapter. State the expected effect size and its source (prior research, published meta-analysis, or Cohen’s benchmarks with justification), your alpha level, your desired power (conventionally .80), and the calculated minimum N. If your achieved sample falls below the minimum, report the power that sample provides for the effect size you originally expected, not for the effect you happened to observe: power computed from the observed effect is simply a restatement of the p-value. Entering the expected effect size, alpha, and achieved N into G*Power’s post-hoc mode takes five minutes and adds considerable credibility to a methodology chapter.
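For intuition about what such a calculation does, here is a rough sketch for the simplest case: a two-sided comparison of two group means with effect size expressed as Cohen's d. It uses the normal approximation rather than the exact t-based calculation that tools like G*Power perform, so it gives a slightly smaller N; treat it as a check on your understanding, not as a replacement for the proper tool.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants per group to detect effect size d (two-sided test)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


def achieved_power(d, n_each, alpha=0.05):
    """Approximate power for the expected effect size d with n_each per group.

    Ignores the negligible probability of rejecting in the wrong direction.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(d * sqrt(n_each / 2) - z_alpha)


# Medium expected effect (d = 0.5), conventional alpha and power.
required = n_per_group(0.5)
# Power if only 40 per group were actually recruited.
shortfall_power = achieved_power(0.5, 40)
```

Under this approximation a medium effect needs 63 participants per group, and a sample of 40 per group has power of roughly .61 for that same expected effect.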

Qualitative sample size: the saturation standard

Saturation — the point at which new participants are not generating new codes, themes, or theoretical insights — is the qualitative equivalent of statistical power as a sample size criterion. It is the right standard, but it is widely misunderstood and inconsistently applied.

Saturation is not a single moment that arrives cleanly. It is a judgment the researcher makes, incrementally, about whether the pattern of what is being heard is stable. You do not know you have reached saturation until you have gone past it — collected several additional interviews and found that they are confirming, not extending, your existing themes. Saturation is therefore not a stopping rule you apply in advance; it is a retrospective judgment you make after analysis.

How to demonstrate saturation in your methodology chapter: report the number of participants, the number who were interviewed before themes stabilised, and the number of additional interviews conducted after that point to confirm stability. This demonstrates that saturation was actively tested rather than assumed.

“Thematic stability was reached after approximately eighteen interviews, when new participants were elaborating existing themes rather than introducing new ones. Six additional interviews were conducted beyond this point to confirm stability; none generated themes not already identified. The final sample of twenty-four participants therefore exceeds the saturation point, providing confidence that the theme structure is stable within this population and context.”
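The figures in a statement like this are easier to defend if they come from a log kept during analysis. A minimal sketch, assuming you record the set of codes applied to each interview in the order the interviews were analysed, is given below. It tracks new codes, which is a proxy: whether themes have stabilised remains a judgment the researcher makes. The code labels in the example are invented.

```python
def saturation_summary(codes_per_interview):
    """Summarise when new codes stopped appearing.

    codes_per_interview: list of sets of code labels, in analysis order.
    """
    seen = set()
    new_counts = []
    for codes in codes_per_interview:
        new = set(codes) - seen
        new_counts.append(len(new))
        seen |= new
    # 1-based position of the last interview that contributed a new code.
    last_new = max((i + 1 for i, c in enumerate(new_counts) if c > 0), default=0)
    return {
        "new_codes_per_interview": new_counts,
        "last_interview_with_new_code": last_new,
        "confirming_interviews": len(codes_per_interview) - last_new,
        "total_codes": len(seen),
    }


# Hypothetical log of six interviews.
log = [
    {"belonging", "time pressure"},
    {"time pressure", "trust"},
    {"trust", "language barrier"},
    {"belonging"},
    {"time pressure", "language barrier"},
    {"trust"},
]
summary = saturation_summary(log)
```

In this invented log, the third interview is the last to add a new code and the remaining three only confirm existing ones, which is the pattern the methodology statement needs to report.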

The guidelines debate: what counts as enough for qualitative research

The module overview mentions that PhD committees often expect 15 to 30 interviews. This is accurate but misleading if taken as a target. Fifteen interviews with highly knowledgeable participants in a well-bounded context may reach saturation comfortably; thirty interviews with participants who have limited direct experience of the phenomenon may never reach it.

The appropriate number of participants depends on three things: the tradition (interpretative phenomenological analysis, IPA, typically works with smaller samples of 4–10; grounded theory may require 20–30 or more), the heterogeneity of the population (diverse populations require more participants to reach saturation), and the complexity of the phenomenon (phenomena with multiple dimensions require more data to map fully than simpler ones). Cite Braun and Clarke, Creswell, or the tradition-specific methodology texts to justify your sample size, not just a committee guideline.

The Most Common Sampling Mistakes

  • Calling a convenience sample random: The most consequential sampling error. Convenience samples cannot support statistical inference to a population in the same way random samples can. Mislabelling a convenience sample as random invalidates the inferential claims built on it.
  • Purposive sampling without a rationale: Stating that purposive sampling was used without explaining what characteristics participants were selected for and why those characteristics are relevant to the research question.
  • Over-claiming generalisability from a convenience or purposive sample: Drawing population-level conclusions from a sample that cannot support them. Conclusions must be scoped to the sample that was actually studied.
  • Declaring saturation without demonstrating it: Stating “theoretical saturation was reached” without describing the process by which it was determined. Examiners increasingly expect to see evidence of how saturation was assessed, not just a claim that it occurred.
  • Ignoring attrition: For longitudinal or multi-phase designs, failing to report how many participants dropped out between phases, and whether those who dropped out differed systematically from those who remained. Non-random attrition is a serious threat to validity that must be addressed directly, not merely mentioned as a minor limitation.

For Law Students

Sampling in doctrinal legal research is rarely discussed in methodology terms, but it is present in every doctrinal study — in the form of case selection. What cases do you include in your analysis? Why those and not others? What does your selection mean for the conclusions you can draw?

Case selection as purposive sampling

Selecting cases for doctrinal analysis is structurally identical to purposive sampling in qualitative research. You are choosing which data to include and exclude based on criteria that should be theoretically justified and explicitly stated. The most common doctrinal case selection strategies map directly onto purposive sampling strategies:

  • Maximum variation: Selecting cases from different benches, time periods, or factual contexts to identify principles that hold across variation.
  • Critical case: Selecting landmark cases where the doctrine was most directly and authoritatively stated.
  • Typical case: Selecting cases representative of how the courts commonly handle the issue, rather than exceptional cases.
  • Extreme case: Selecting cases at the doctrinal boundaries — where courts extended or refused to extend a principle — to understand the limits of the doctrine.

Regardless of which strategy you use, the methodology chapter must specify: the search terms and databases used to identify candidate cases, the inclusion and exclusion criteria applied, the total number of cases identified and the number retained for analysis, and the rationale for the criteria.

The SCC Online search protocol

Many Indian law theses identify their cases through SCC Online, Manupatra, or similar databases without describing the search process. This is inadequate for methodological transparency. A replicable search protocol specifies: the database searched, the search terms used (exact strings), the filters applied (court level, date range, subject matter), and the number of results returned before and after applying inclusion criteria.

Example search protocol: “Primary cases were identified through SCC Online using the search string ‘Article 21’ AND (‘surveillance’ OR ‘privacy’) AND (‘technology’ OR ‘digital’ OR ‘AI’), filtered to Supreme Court of India decisions from 1 January 2017 to 31 December 2025. This search returned 94 results. After reviewing titles and headnotes, 41 cases were excluded because privacy was mentioned only incidentally (fewer than two substantive paragraphs engaging with the right) and 12 were excluded as repetitive of principles fully established in other retained cases. The final dataset comprises 41 cases analysed in depth.”

This level of specificity allows a reader to assess whether your case selection is systematic and replicable — the legal equivalent of the quantitative sampling transparency that reviewers expect.
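A simple way to keep these figures consistent is to record the protocol as structured data and derive the retained count from it, rather than typing each number by hand. The sketch below is one possible layout using the figures from the example protocol; the class and field names are illustrative, not a standard.

```python
from dataclasses import dataclass, field


@dataclass
class SearchProtocol:
    """Record of one database search and the exclusions applied to its results."""
    database: str
    search_string: str
    filters: dict
    results_returned: int
    exclusions: dict = field(default_factory=dict)  # reason -> number excluded

    @property
    def retained(self):
        """Cases left after all exclusions; should match the analysed dataset."""
        return self.results_returned - sum(self.exclusions.values())

    def report(self):
        """Plain-text summary suitable for pasting into a methodology appendix."""
        lines = [
            f"Database: {self.database}",
            f"Search string: {self.search_string}",
            f"Results returned: {self.results_returned}",
        ]
        lines += [f"Excluded ({reason}): {n}" for reason, n in self.exclusions.items()]
        lines.append(f"Retained for analysis: {self.retained}")
        return "\n".join(lines)


protocol = SearchProtocol(
    database="SCC Online",
    search_string="'Article 21' AND ('surveillance' OR 'privacy') "
                  "AND ('technology' OR 'digital' OR 'AI')",
    filters={"court": "Supreme Court of India", "from": "2017-01-01", "to": "2025-12-31"},
    results_returned=94,
    exclusions={"privacy mentioned incidentally": 41, "repetitive of retained cases": 12},
)
```

Here the retained count works out to 41, matching the final dataset stated in the example. If the numbers in a draft protocol do not reconcile this way, an examiner will notice.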


Next: Cluster Post 7 — Research Ethics in Practice: What Ethics Forms Don’t Tell You
