How to Evaluate in GCSE Psychology
Evaluation is the skill that separates good Psychology students from great ones. AQA examiners want you to critique the METHOD, not just describe the findings. This guide explains exactly how to write evaluation that reaches the top mark bands — with real examples from the studies you need to know.
What This Question Asks
Evaluation questions in AQA GCSE Psychology are typically worth 4 to 9 marks and use command words such as "Evaluate...", "Assess the strengths and limitations of...", "Discuss the methodology of...", or "To what extent...". These questions require you to weigh the strengths and limitations of psychological studies, theories, or research methods, assessing their validity, reliability, generalisability, and ethical standing.
The most important thing to understand is the distinction between AO1 and AO3 in these questions. AO1 is knowledge: describing what a study found. AO3 is evaluation: critically assessing what that study means, how it was conducted, and whether its conclusions are justified. Writing "Milgram found that 65% of participants continued to the maximum 450-volt shock" is AO1, a description of findings. Writing "The laboratory setting used by Milgram lacks ecological validity because participants knew they were in an experiment and the situation of obeying an authority figure in a lab does not reflect obedience in real-life settings such as the workplace" is AO3, an evaluation of the method. Examiners are specifically trained to look for this distinction. No matter how detailed your description of findings, it cannot score AO3 marks.
Mark Scheme Breakdown
Level 1
- Basic evaluation points stated with little or no explanation or development.
- Evaluation focuses on findings rather than methodology, describing what the study found rather than critiquing how it was conducted.
- Terms like "valid" or "reliable" may be used but without explanation of their meaning or application.
- Example: "The study was not very ethical. Also, the sample was small so it might not apply to everyone."
Level 2
- Evaluation points are clear and relevant with some explanation of their methodological significance.
- A strength and a limitation are identified and partially explained in terms of the study's methodology.
- Psychological terminology is used with some accuracy: validity, reliability, sample bias, or ethics are explained, not just named.
- Evaluation may be one-sided or lack balance, with only strengths or only limitations addressed.
- Example: "The study lacks ecological validity because it was conducted in a lab, which does not reflect real life. However, it was a controlled experiment, which means extraneous variables were controlled and it is easier to establish cause and effect."
Level 3
- Balanced, well-developed evaluation addressing both strengths and limitations with clear methodological reasoning.
- Psychological concepts (validity, reliability, generalisability, ethics, sample bias) are accurately applied and explained in relation to the specific study.
- Evaluation is explicitly linked to the implications for the study's conclusions: not just identifying a limitation but explaining what it means for what we can conclude.
- A judgement or overall assessment is reached, weighing the methodological strengths against limitations.
- Precise references to the study's design (laboratory vs. field experiment, sample characteristics, operationalisation of variables) support the evaluation.
- Example: "While Milgram's study has high internal validity due to standardised procedures (all participants received the same scripted prods from the experimenter), the artificial laboratory setting reduces ecological validity. Obedience to an authority figure in a Yale University basement may not generalise to real-world authority situations where social pressures, relationships, and consequences differ significantly. Furthermore, the all-male, American sample introduces sampling bias, limiting cross-cultural and gender generalisability."
How to Structure Your Answer
Identify whether you are being asked to evaluate the STUDY, the METHOD, or the THEORY
AQA evaluation questions may ask you to evaluate a specific study (e.g. Milgram's obedience study), a research method (e.g. laboratory experiments), or a theory or explanation (e.g. the social learning theory account of aggression). The core evaluative concepts are the same, but your examples and focus will differ. Always read the question carefully and underline what you are being asked to evaluate.
- "Evaluate the use of laboratory experiments in psychological research." → Evaluate the method itself, using studies as examples.
- "Evaluate Asch's research into conformity." → Evaluate the specific study and its methodology.
- "To what extent can the cognitive approach explain memory?" → Evaluate the theory and its supporting evidence.
Plan your evaluation points before writing
For most evaluation questions, plan two to three evaluation points — a mix of strengths and limitations. For each point, note: the concept (e.g. ecological validity), your explanation of what it means in this context, and the implication for conclusions. A structured plan prevents the common mistake of listing findings instead of evaluating methods.
Evaluating Milgram (1963):
- Strength: Standardised procedure → high internal validity → can establish cause and effect.
- Limitation 1: Lab setting → low ecological validity → cannot generalise to real-life obedience.
- Limitation 2: All-male American sample → sampling bias → limited gender and cultural generalisability.
- Limitation 3: Deception and psychological harm → ethical issues → the study was widely criticised on ethical grounds after publication.
Write each evaluation point using PEEL: Point, Evidence, Explanation, Link
Each evaluation paragraph should follow this structure. State your evaluation point (e.g. "One limitation is the low ecological validity of the study"). Provide evidence by referencing the specific aspect of the study's design (e.g. "The study was conducted in a laboratory at Yale University using a fake shock generator"). Explain the methodological significance (e.g. "Participants were aware they were in a study, and the artificial setting does not reflect real-world obedience situations"). Link back to the implications for conclusions (e.g. "This means we cannot be certain that the obedience rates Milgram observed would be replicated in natural settings such as workplaces or military contexts").
"One limitation of Milgram's study is its low ecological validity. The experiment was conducted in a controlled laboratory setting at Yale University, where participants believed they were giving electric shocks to a confederate using a fake shock generator. This artificial situation does not reflect the complexity of real-life obedience, where factors such as personal relationships, long-term consequences, and social norms would influence behaviour differently. As a result, we cannot be certain that the 65% obedience rate observed in the lab would be replicated in real-world authority situations."
Balance your evaluation — include both strengths AND limitations
For Level 3, you must evaluate from both sides. Even if a study has significant methodological limitations, most studies also have genuine strengths: standardised procedures, controlled variables, quantitative data that allows comparison, or ethical safeguards. Evaluate both and avoid the mistake of writing only a list of criticisms. The examiner wants to see that you can weigh evidence, not just attack it.
"A strength of Milgram's study is its high internal validity. Because the experiment used standardised procedures (the same scripted prods, the same shock generator, the same room), extraneous variables were controlled. This means that differences in obedience levels across Milgram's later variations could more confidently be attributed to the independent variable (for example, proximity of the authority figure or learner) rather than to other factors. The controlled design allowed Milgram to draw causal conclusions about obedience that would not be possible from a correlational study or naturalistic observation."
Use precise psychological terminology — and define it in context
Key evaluation terms you should use accurately: ecological validity, internal validity, external validity, reliability, replicability, generalisability, sampling bias, operationalisation, demand characteristics, investigator effects, ethical issues (deception, right to withdraw, psychological harm, informed consent), and standardisation. Do not just insert these words; define them briefly in the context of the specific study. "The study lacks ecological validity" is weak. "The study lacks ecological validity because the artificial laboratory setting may create demand characteristics: cues that let participants guess the aim of the study, so they may behave differently than they would in a natural context" is strong.
Reach an overall evaluative judgement
For higher-tariff evaluation questions (6 marks and above), conclude with an overall judgement that weighs the strengths against the limitations. This is the hallmark of Level 3 evaluation — committing to a position rather than simply listing points on both sides. Use evaluative language: "Overall, despite its methodological limitations, Milgram's study remains significant because...", "On balance, the limitations outweigh the strengths because..."
"Overall, Milgram's research provides valuable insights into obedience to authority, and its controlled methodology means the findings are internally valid. However, the significant ethical concerns — particularly the psychological distress experienced by participants, many of whom showed signs of extreme stress — and the limited generalisability of an all-male American sample mean that the study's conclusions should be applied cautiously. The study opened important questions about human obedience but cannot be taken as a definitive account of how obedience operates across all social contexts."
Practise This Question Type
"Evaluate the methodology of Milgram's (1963) research into obedience to authority." [9 marks]