A compendium of methods and stats resources for (social) psychologists

This page helps me recover papers or websites that I use regularly when planning or analyzing research studies. I hope you will find it useful as well. I am open to suggestions for things to add or remove (okleinATulb.ac.be or @olivier_klein on Twitter).

Olivier Klein


List of the topics covered in this page:

Research Design

Causality

Software for designing experiments

Online Surveys

Measurement

Sampling

Power estimation and effect sizes

Qualitative Methods/Text Analysis

Preregistration

Data Preparation

Displaying data

Testing

Inference

Linear Regression and General Linear Model (t-test, ANOVA, etc.)

Mediation and Moderation

SEM

Bayesian

Mixed Models

Logistic Regression

Meta-Analysis

Statistical Software

SPSS

R

Learning R

General Reference

Markdown

Freely Available Datasets

Other

Social networks (sharing resources & knowledge)

Ethics

Other stuff

 

Research design

Causality/Endogeneity/Theory

  • Bill McGuire’s important annual review chapter about creative hypothesis generation in psychology.
  • Cronbach & Meehl’s classic paper on construct validity (the foundation of  psychological theorizing thereafter).
  • Paul Meehl’s 1990 paper on appraising theories in psychology, where he qualifies his earlier paper comparing theorizing in physics and psych. See also this more accessible 1990 paper on the uninterpretability of research summaries in psychology.
  • Psychbrief made summaries of Paul Meehl’s lectures (which are available online) as well as of some of his most important papers. Impressive! All of this is available here.
  • Barry Markovsky on evolution and nebulousness in social psychological theories.
  • Klaus Fiedler’s thoughts on the cycle of theory formation in (social) psychology (paywall).
  • Gerd Gigerenzer’s personal reflections on theory and psychology.
  • Yarkoni and Westfall’s radical take on the matter: we should choose prediction (using machine learning) rather than explanation in psychology.
  • General and simple introduction to the problem posed by “endogeneity” (i.e., when a causal variable is correlated with the error term of the DV, as is usually the case in non-experimental designs) in testing causal relationships, and how to deal with it, by Antonakis et al. This is applied to leadership research but the points made apply to social psych as well.
  • If you only have a few minutes to get acquainted with this crucial issue, check out this very short paper by Hernan called “The C-Word: Scientific Euphemisms Do Not Improve Causal Inference From Observational Data”.
  • How to move from statistical association to causation? Paper by Gennetian et al. introducing researchers in developmental psych. to the “instrumental variables” approach to deal with this issue. Again, very accessible and useful for social psychologists as well. The more extensive paper on instrumental variables in the social sciences by Kenneth Bollen is available here (paywalled).
  • Causal Models and Learning from Data: paper by Maya Petersen & Mark Van der Laan on applying causal modeling to epidemiological data.
  • Julia Rohrer’s excellent paper “Thinking Clearly About Correlations and Causation: Graphical Causal Models for Observational Data”. Proposes new (to social psychologists) tools to infer causation from observational data.
  • In a similar vein, Judea Pearl, who wrote the remarkable “Book of Why” and is the “father” of such graphical models, explains in this short paper how to properly assess causality using linear models using path diagrams. It remains challenging for psychologists not used to equations but is worth the effort.
  • Are randomized controlled trials the panacea for establishing causality? Deaton and Cartwright argue against it in this paper (preprint) extracted from a special issue of Social Science and Medicine that addresses this topic at length (including a response to this paper; but the rest of the issue is behind a paywall).
  • Thinking about causality in complex interventions. Paper by Rickles (2009), paywalled. See this on a similar topic.
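The third-variable problem that runs through these readings is easy to see in a short simulation. The sketch below (Python with numpy; all coefficients and sample sizes are made up for illustration) builds the classic confounding structure from graphical causal models: Z causes both X and Y, X has no causal effect on Y, yet X and Y correlate until Z is adjusted for.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Z is a confounder: it causes both X and Y; X has NO causal effect on Y.
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 0.8 * z + rng.normal(size=n)

# Naive analysis: X and Y are clearly correlated...
r_naive = np.corrcoef(x, y)[0, 1]

# ...but after adjusting for Z (residualizing both variables on Z,
# equivalent to including Z as a covariate), the association vanishes.
x_res = x - np.polyval(np.polyfit(z, x, 1), z)
y_res = y - np.polyval(np.polyfit(z, y, 1), z)
r_adjusted = np.corrcoef(x_res, y_res)[0, 1]

print(f"naive r = {r_naive:.3f}, adjusted r = {r_adjusted:.3f}")
```

Of course, as the readings above stress, adjusting only helps when the adjusted variable really is a confounder (conditioning on a collider makes things worse), which is exactly what the graphical tools are for.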

      Many thanks to Holger Steinmetz & Djouaria Ghilani for the suggestions in this section! 
Experimental Design

  • Paper by Pirlott & McKinnon on how to design studies with mediation in mind
  • The classic paper by Spencer, Zanna & Fong on why designing experiments is preferable to mediational analysis in examining causal processes.
  • Document intended for Ph.D. students at the University of Guelph explaining how to avoid questionable research practices when planning and reporting experiments.
  • Paper by Biglan et al. on time series experiments: a solution when randomized controlled trials are too costly or difficult.
  • When you can’t access large samples, randomization tests may be a solution. Check out this and this.

Software for designing experiments

  • Psytoolkit: free to use software to conduct experiments online.
  • Psychopy: open-source software for conducting psychology experiments (offline). There are two interfaces: a user-friendly one and a more powerful one using the Python programming language.
  • Psyscope: software for designing experiments on macOS.
  • Gorilla: user-friendly software for conducting experiments online. Note that it is not free: the researcher pays per participant.

Sampling

  • A lucid paper by Neil Stewart et al. introducing readers to crowdsourcing platforms in cognitive science with their benefits and drawbacks (another version without paywall here).
  • Paper by Goodman et al. on the strengths and weaknesses of Amazon Mechanical Turk samples (applied to consumer research).
  • This is the “classic” and much-cited 2011 paper by Buhrmester et al. suggesting that AMT provides high-quality data.
  • A list of crowdsourcing platforms provided by Gorilla Science.
  • Classic paper by Henrich et al. on the use of samples from Western, Educated, Industrialized, Rich and Democratic (WEIRD) societies in psychology.

Online surveys

  • Paper by Zhou & Fishbach showing validity problems caused by attrition in online surveys and how to deal with them.
  • Identifying careless responses in survey data. Paper by Meade et al.
  • Seriousness checks to improve the reliability of online surveys. Paper by Aust et al.
  • Detecting and deterring insufficient efforts in responding to surveys. Paper by Huang et al.
  • New paper by Meyer et al. comparing several methods for detecting bots responding to online surveys. Matt Motyl developed a script to detect bots. There is an effort by Andy Wood along the same lines. Dennis et al. argue that rather than bots, low-quality answers are provided by humans using virtual private networks (VPNs), and offer solutions for detecting them.
  • Randomly assigning people to different conditions in limesurvey.
  • A series of videos from the Beonline event covering issues in conducting online research: ensuring data quality, technical issues, recruiting participants…
  • To get into the mind of MTurkers, you can devise questionnaires. You can also check out this reddit page where they share their qualms.

Measurement

Power estimation & Effect sizes

  • A practical primer by Perugini et al. covering power analysis in simple experimental designs (including mediation, interactions involving continuous predictors, etc.). Will cover most of your needs and is supplemented by online material (excel sheets, etc.).
  • Clear introduction to estimating and reporting effect sizes by Daniël Lakens (Frontiers). Check Katherine Wood’s online version (shiny): it is even more user-friendly.
  • In the same vein, here is MOTE, a comprehensive online app by Erin Buchanan et al. for computing effect sizes based on a variety of designs and measurement levels. Play around with it. It’s fantastic!
  • G*Power, the free software that computes power for a variety of designs. The manual is here.
  • Declaredesign: Interactive R program that computes power based on your design (using simulations).
  • Pangea: A web applet that computes power for General Anova designs. By Jake Westfall. Thanks, Almudena Claassen, for pointing me to this!
  • Computing power for interactions involving one continuous and one dichotomous variable. Online application designed by myself.
  • When there is more than one within-subject factor, G*Power can’t compute power. The best solution is to run simulations, but that requires programming skills. D’Amico et al. proposed a method using SPSS’s MANOVA procedure. Here is another paper using this method for regression, correlation and simple ANOVA designs.
  • Here is how to calculate power for a 3-way ANOVA in G*Power.
  • Converting effect sizes (e.g., from d to eta squared, etc.). Excel page here.
  • Online calculator for power estimation in mixed models.
  • Powerlmm is an R package designed by Kristofer Magnusson for computing power in multilevel models. It can also help you assess whether you really need such a model.
  • On the same topic, see this paper by Lane & Hennes on estimating sample size in multilevel models (in relationship research).
  • Sequential data analysis. Great method for maximizing power and minimizing sample size at the same time. EJSP paper by Daniël Lakens.
  • How many participants do I need to test a moderation of the effect I found in my first study? Many. Post on Data Colada explaining this. 
  • A tutorial by myself on determining sample size in social psychology (with tables). In French.
  • Sample size planning adjusting for publication bias and uncertainty by Anderson et al.
  • Sample size planning for cognition and perception. Where Jeff Rouder explains that relatively small sample (around 30) can be OK if you use repeated measures.
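As noted above, simulation is the most flexible way to estimate power when canned tools like G*Power fall short. A minimal sketch of the idea in Python (numpy/scipy; the effect size, sample size and number of simulations are arbitrary illustrations): draw many samples under the assumed effect, and count how often the test is significant.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def simulated_power(n_per_group, d, n_sims=2000, alpha=0.05):
    """Estimate power for a two-group comparison by simulation:
    draw samples under the assumed effect size d (in SD units),
    run the test each time, and count significant results."""
    hits = 0
    for _ in range(n_sims):
        g1 = rng.normal(0.0, 1.0, n_per_group)
        g2 = rng.normal(d, 1.0, n_per_group)
        p = stats.ttest_ind(g1, g2, equal_var=False).pvalue  # Welch test
        hits += p < alpha
    return hits / n_sims

power = simulated_power(n_per_group=64, d=0.5)
print(f"Estimated power: {power:.2f}")
```

The same skeleton extends to designs G*Power cannot handle (several within-subject factors, unequal variances, etc.): just change the data-generating step and the test.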

Qualitative methods / Text analysis

Preregistration

Displaying data

  • Using graphs instead of tables in political science: great paper by Kastellec & Leoni showing how to replace tables with graphs. Applies to social psychology as well.
  • Plotting the confidence interval for regression estimates in R.
  • Tools to enhance plots made by GGplot with results of statistical tests. Developed by Indrajeet Patil.
  • Paper by McCabe and al with resources for plotting interactions optimally.
  • A list of free tools for creating more transparent figures for small datasets (well, the kind we mostly rely on in social psych).
  • University of Minnesota page showing examples of well formatted tables and graphs according to APA standards.
  • Seeing Theory. Great website using splendid visualizations to introduce basic concepts in probability and stats.
  • Making it pretty: plotting 2-way interactions with ggplot2. Nice tutorial & code (R).
  • Implementing Edward Tufte’s recommendations for cool-looking graphs using R (Tufte is the absolute master of visualization).
  • A getting-started guide for ggplot2. In French.
  • The summarytools package (R). Provides summaries of variables in a data frame like this. Intro here.
  • Useful Tips from the @realscientists twitter account (a science illustrator for National Geographic) on improving figures (twitter thread)
  • Paper by Allen et al. on raincloud plots (much better than bar graphs!) and resources (code, tutorials…) for drawing them.

Data preparation

  • Our paper “Detecting outliers: do not use standard deviation around the mean, use absolute deviation around the median”
  • Our paper on the detection of multivariate outliers.
  • Bakker & Wicherts’s paper arguing against outlier removal.
  • Codebook, an R package by Ruben Arslan that creates a codebook based on an SPSS file (haven’t tried it yet but looks great!).
  • Kai Horstmann provided very useful resources for constructing codebooks.
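The MAD-based rule from our outlier paper above is simple enough to implement yourself. A sketch in Python (numpy; the data and the threshold of 3 are made-up illustrations, and the paper discusses how to choose the threshold):

```python
import numpy as np

def mad_outliers(x, threshold=3.0):
    """Flag outliers using the median absolute deviation (MAD)
    around the median, instead of SDs around the mean.
    1.4826 scales the MAD to be consistent with the SD under normality."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = 1.4826 * np.median(np.abs(x - med))
    return np.abs(x - med) / mad > threshold

scores = [2.1, 2.4, 1.9, 2.2, 2.0, 2.3, 9.5]  # made-up data, one extreme value
print(mad_outliers(scores))
```

Because the median and the MAD are themselves barely affected by the extreme value, the rule keeps working even when several outliers are present, which is exactly where the mean-and-SD rule breaks down.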

Testing

Statistical Inference (Fisher/Neyman-Pearson)

  • Jacob Cohen’s paper, Things I’ve learned so far. Essential reading that covers many of the core issues psychologists should be attuned to when conducting (inferential) statistical analyses.
  • Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Paper by Greenland et al. See also American Statistical Association’s statement on p values.
  • One simple effect is significant, the other is not, but there is no interaction. Paper by Gelman on this.
  • Is it a problem to use parametric stats on Likert scales when the sample size is low or the distribution far from normal? Usually not, according to this paper by Geoff Norman.
  • Using covariates when testing for interactions. Paper by Yzerbyt et al.
  • p curve
  • Testing that the null is true without Bayes. Blog post by Daniël Lakens on Equivalence testing and the paper now in SPPS. See also a tutorial by Lakens et al. In the same vein, a concise open access paper by Etienne Quertemont (2011), “How to Statistically Show the Absence of an Effect”, covers equivalence testing, power analysis & use of confidence interval in a single paper. Useful for students especially.
  • A post by Heino Matti on false expectations about the relation between p values and sample size. Includes great visualizations.
  • Aligning scientific reasoning and statistical inference: Short “Science” paper by Steven Goodman on misunderstandings in statistical inference and their impact on scientific progress.
  • Great paper by Miller and Chapman on misunderstandings surrounding the interpretation of ANCOVA.
  • Ever wondered about the meaning of degrees of freedom? Check out this excellent tutorial by Ron Dotsch (thanks, Rui Costa Lopes for the suggestion!) .
  • Beyond Statistics: Testing the null in mature sciences. Very informative paper by Morey et al.
  • Tukey’s 1991 paper on the philosophy of multiple comparisons, where he rejects the dichotomous logic of significance testing and recommends the use of confidence intervals.
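The equivalence-testing logic from the Lakens papers above boils down to two one-sided tests (TOST): reject "the effect is below the lower bound" and "the effect is above the upper bound", and you may conclude the effect is too small to care about. A rough one-sample sketch in Python (scipy; the bounds of ±0.2 and the simulated data are arbitrary illustrations):

```python
import numpy as np
from scipy import stats

def tost_one_sample(x, low, high, alpha=0.05):
    """Two one-sided tests (TOST) for equivalence of a mean to zero
    within the bounds [low, high] (in raw units)."""
    p_lower = stats.ttest_1samp(x, low, alternative='greater').pvalue
    p_upper = stats.ttest_1samp(x, high, alternative='less').pvalue
    p = max(p_lower, p_upper)  # both one-sided tests must reject
    return p, p < alpha

rng = np.random.default_rng(7)
x = rng.normal(0.0, 1.0, 1000)   # simulated data; true mean is 0
p, equivalent = tost_one_sample(x, low=-0.2, high=0.2)
print(f"TOST p = {p:.4f}, equivalent: {equivalent}")
```

Note the asymmetry with ordinary testing: a nonsignificant t test alone says nothing, whereas a significant TOST positively supports "no effect worth bothering about" within the chosen bounds.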

Regression / General Linear Model (t-test, ANOVA, etc.)

  • Partial Least Squares Regression: I am not very knowledgeable on this one, but colleagues recommended that I include this method for testing causal models on this page. So here are two links: one recommended by Davide Del Cason and the other recommended by Carole Fantini.
  • Don’t use the Student t test anymore (the Welch test is better). If you are not convinced, check this paper by Delacre et al.
  • The perfect t-test. R program that reports the results of a t-test completely formatted, with graphs, tests of assumptions, etc. By Daniël Lakens.
  • Computing confidence intervals for multiple regression estimates.
  • Appropriate categorical variables coding schemes for linear regression in R. Be careful to read it because the R defaults don’t match usual practices in social psych.
  • Bob Abelson’s paper on contrast coding for testing interactions.
  • Ordinal regression tutorial by Paul Bürkner using his BRMS package.
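In most software, switching from Student to Welch (as Delacre et al. recommend) is a single argument. A quick illustration in Python (scipy; the simulated data are chosen to have the unequal variances and unequal group sizes under which Student's test misbehaves):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Unequal group sizes AND unequal variances: the situation where
# Student's t test is unreliable and Welch's test should be used.
small_noisy = rng.normal(0.0, 4.0, 20)    # small group, SD = 4
large_quiet = rng.normal(0.0, 1.0, 200)   # large group, SD = 1

student = stats.ttest_ind(small_noisy, large_quiet, equal_var=True)
welch = stats.ttest_ind(small_noisy, large_quiet, equal_var=False)

print(f"Student p = {student.pvalue:.3f}, Welch p = {welch.pvalue:.3f}")
```

Since the two tests coincide when variances and group sizes are equal, there is little to lose by making Welch (`equal_var=False`) the default.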

Mixed Models

  • Mixed Models: Introduction to treating stimuli as random factors and code for common statistical software by Westfall et al.
  • Follow-up on mixed models: Annual Review chapter by the same authors addressing various research designs.
  • Significance testing in lme4.
  • Should you fit the “maximal model”? Parsimony in model construction. Paper by Bates et al.
  • Centering predictors in mixed models. Paper by Enders & Tofighi.
  • Wonderful tutorial by Sommet and Morselli on multilevel logistic regression with scripts in R, Stata, MPlus and SPSS. Applied to Justin Bieber and a very fun read!
  • Paper by Fisher et al. showing that correlations between variables across individuals do not match correlations between these same variables (measured on many occasions) within individuals.
  • Explained variance measures for multilevel models. Paper by LaHuis et al.
  • Bodo Winter’s tutorials on mixed models using R.
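A quick way to see whether clustering matters at all (and hence whether you need a mixed model, the question the powerlmm package also addresses) is the intraclass correlation. A sketch of the classic one-way ANOVA estimator, ICC(1), in Python (numpy; all simulation numbers are made up):

```python
import numpy as np

rng = np.random.default_rng(11)

# Simulate clustered data: 100 clusters of 20 observations, random
# intercepts with SD .5, residual SD 1 -> true ICC = .25 / 1.25 = .20.
n_groups, n_per = 100, 20
intercepts = rng.normal(0.0, 0.5, n_groups)
data = intercepts[:, None] + rng.normal(0.0, 1.0, (n_groups, n_per))

# One-way ANOVA estimator of the intraclass correlation, ICC(1):
group_means = data.mean(axis=1)
msb = n_per * group_means.var(ddof=1)                         # between-cluster MS
msw = ((data - group_means[:, None]) ** 2).sum() / (n_groups * (n_per - 1))
icc = (msb - msw) / (msb + (n_per - 1) * msw)
print(f"Estimated ICC = {icc:.3f}")
```

Even a small ICC inflates Type I error badly when observations within a cluster are treated as independent, which is the core argument for the mixed-model resources above.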

Mediation & Moderation

  • Yzerbyt et al.’s new recommendations for testing indirect effects in mediation models. In press in JPSP (06/2018): one should test the paths separately.
  • David Kenny’s simple and excellent mediation page.
  • Broader overview of mediation and moderation. By Judd et al (2014).
  • Interactions do not only tell us when, but can also tell us how. Nice paper by Jacoby & Sassenberg (2011).
  • A paper by Rik Pieters explaining when mediation analysis is warranted.
  • Why testing reverse mediation to check for directionality is a terrible idea: here (Gollwitzer et al.) and here (Thoemmes).
  • Answers to the question “what’s the go-to-paper for why mediation analyses shouldn’t be reported as process evidence?” on twitter.
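The "test the paths separately" recommendation above can be sketched with two plain regressions: the a path (X predicting M) and the b path (M predicting Y, controlling for X). A Python illustration (numpy/scipy; the simulated data and coefficients are made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 400

# Simulate a true mediation chain: X -> M -> Y (coefficients invented).
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.5 * m + rng.normal(size=n)

def ols_t(X, y):
    """OLS with an intercept; returns coefficients and their p values."""
    X = np.column_stack([np.ones(len(y)), *np.atleast_2d(X)])
    b, res, *_ = np.linalg.lstsq(X, y, rcond=None)
    df = len(y) - X.shape[1]
    sigma2 = res[0] / df
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    p = 2 * stats.t.sf(np.abs(b / se), df)
    return b, p

b_a, p_a = ols_t(x, y=m)                # a path: X -> M
b_b, p_b = ols_t(np.vstack([m, x]), y)  # b path: M -> Y controlling for X
print(f"a path p = {p_a[1]:.4g}, b path p = {p_b[1]:.4g}")
```

Of course, as the papers above insist, significant paths in observational data do not by themselves establish the causal process; the design has to license the causal reading.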

Bayesian approaches

  • Zoltan Dienes’ very useful webpage on Bayesian stats for beginners (including online calculators).
  • How to get the most out of nonsignificant results? Paper by Zoltan Dienes based on a Bayesian approach.
  • Is there a free lunch in inference? Forceful advocacy of the Bayesian approach by Rouder et al. Very clear for nonspecialists.
  • Short intro to Bayesian stats with R examples by Fabian Dablander.
  • Tutorial for performing Bayesian t tests and ANOVAs, by Richard Morey.
  • Richard McElreath wrote a fantastic book, “Statistical Rethinking”, which, as its name suggests, invites readers to update their view of stats (from a Bayesian perspective). Even if you don’t have the book, he provides plenty of resources, including recorded lectures and slides, for pursuing this path.
  • BRMS is an R package designed by Paul Bürkner that fits linear mixed models using a Bayesian approach. It uses the same syntax as lme4.
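For readers new to the logic behind these resources, the simplest worked Bayesian case is a conjugate beta-binomial update: prior beliefs about a proportion, plus data, give a posterior in closed form. A sketch in Python (scipy; the prior and the data are hypothetical):

```python
from scipy import stats

# Beta-binomial: a Beta(1, 1) (uniform) prior on a proportion,
# updated after observing 60 successes in 100 trials.
a_prior, b_prior = 1, 1
successes, trials = 60, 100

# Conjugacy: posterior is Beta(a + successes, b + failures).
a_post = a_prior + successes
b_post = b_prior + trials - successes
posterior = stats.beta(a_post, b_post)

print(f"Posterior mean = {posterior.mean():.3f}")
print(f"95% credible interval = {posterior.interval(0.95)}")
```

Unlike a confidence interval, the credible interval can be read the way students always want to read intervals: given the prior and the data, the proportion lies in this range with 95% probability.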

Structural Equation modeling

  • Paper by Goodboy & Kline: Statistical and practical concerns with research featuring structural equation modeling. Good primer on some of the errors you want to avoid!

Logistic Regression

Meta-Analysis

  • Metafun: excel spreadsheet that allows you to easily implement meta-analyses in R using the “metafor” package. Resources here.
  • Meta-analysis on SPSS. See Andy Field’s paper.
  • Computing the meta-analytic effect size manually. Paper by Goh et al. (2016)
  • Nice post by Chris Holdgraf on designing and interpreting funnel plots.
  • How to conduct a mixed effect meta-analysis in R? If this question haunts you, check out this video by Sara Locatelli.
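Computing a meta-analytic effect size "manually", as in the Goh et al. paper, comes down to weighting each study's effect by the inverse of its sampling variance. A fixed-effect sketch in Python (numpy; the per-study effects and sample sizes are invented, and the variance formula for d is the standard large-sample approximation):

```python
import numpy as np

# Hypothetical per-study effect sizes (Cohen's d) and per-group sample sizes.
d = np.array([0.30, 0.55, 0.10, 0.42])
n1 = np.array([40, 25, 60, 33])
n2 = np.array([40, 30, 55, 35])

# Approximate sampling variance of each d:
var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))

# Fixed-effect (inverse-variance weighted) meta-analytic estimate:
w = 1 / var_d
d_meta = np.sum(w * d) / np.sum(w)
se_meta = np.sqrt(1 / np.sum(w))
ci = (d_meta - 1.96 * se_meta, d_meta + 1.96 * se_meta)
print(f"d = {d_meta:.3f}, 95% CI [{ci[0]:.3f}, {ci[1]:.3f}]")
```

A random-effects analysis (the default in metafor and usually the right choice in social psych) additionally estimates the between-study variance and adds it to each study's weight denominator.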

Social Networks

Statistical Software

SPSS

  • Annotated SPSS Output for Logistic Regression
  • Laerd stats: Online resources for learning Stats on SPSS. It’s not free but apparently, it’s worth the cost (I must say I prefer R but some people can’t live without SPSS).

R

Learning

General Reference

  • R for data science. A comprehensive and authoritative book on the subject. Online.
  • This post on the facebook page “R Users psychology” has a wonderful list of resources to learn R.

Markdown

  • Video tutorial by Michael C. Frank on using RMarkdown for integrating paper writing and data analysis. Handouts here.
  • RMarkdown the definitive guide: Book with free online version.
  • Papaja: an add-on to RMarkdown that formats papers in line with APA style requirements. Fantastic work by Frederik Aust. And if you’d rather watch a video tutorial, here is one by Nick Fox.

Other

  • JASP: a free statistical software that also performs Bayesian tests.
  • Jamovi: Even better. Open source.

Datasets

  • Open Stats Lab: freely available datasets to play around with. Each comes with the corresponding article. Ideal for teaching purposes.
  • This response to a tweet by Vicky Boykis has plenty of suggestions for freely available datasets to use in R or elsewhere.

Ethics

Other stuff

  • Undergraduate statistics with JASP. Resources compiled by Erin Buchanan.
  • Learn Statistics: materials from Erin Buchanan’s class. Covers most areas of stats of interest to social psychologists (including SEM and an emphasis on reproducibility).
  • An open-access introductory stats textbook with great animated visualizations (coded in R & ggplot) by Matt Crump et al.
  • Our practical guide to transparency in psychological science: research planning / preregistration / data and material sharing / publication. With extensive supplementary material. Includes an example “model” project on the OSF.
  • Shiny Web Apps for designing experiments and analysing data
  • Rpsychologist: all kinds of visualizations of common statistical procedures. Splendid for pedagogical purposes especially.
  • Paper by Butts et al. on the source of common errors in the interpretation of cutoff criteria for widely used stats (including .70 for Cronbach’s alpha).
  • Blog post by Dick Morey on the dangers of averaging data.
  • On a similar topic, paper by Colin Leach on how to integrate the person with macro-level processes (and look differently at regression plots).
  • Paper by LeBel et al. summarizing criticisms of the open science and high-powered research movement in social psych and addressing them. Useful for finding all the important recent references.
  • Statcheck: online tool to check whether there are any errors in your paper. Just upload a PDF.
  • SPRITE: method developed by James Heathers et al. to reconstruct datasets based on available means, SDs and Ns. A great tool for detecting errors. The shiny app is here.
  • Learning to use GitHub. Module 5 of an excellent Open Science MOOC.
  • GitHub for R users: excellent tutorial by Jenny Bryan.
  • ESSurvey: R package for downloading and analyzing data from the European Social Survey.
  • Paul Minda’s lab manual (OSF)
