Part I: Foundations

Part I of this book lays the groundwork for everything that follows. Before we can build and evaluate sophisticated statistical models, we need to be confident with the basic concepts, tools, and habits that underpin all sound data analysis. These first five chapters are designed to provide that foundation: introducing the role of statistics in the social sciences, familiarising you with the R programming environment, showing how to explore data effectively, explaining the logic of statistical inference, and developing the practical skills for transforming raw data into a form ready for analysis.

Chapter 1  Introducing statistics begins with the “why” of statistics. We show how social science questions — from the effects of hybrid work to the impact of school closures — cannot be answered convincingly without statistical reasoning. Here, you will meet the core ideas of populations and samples, modelling, uncertainty, and inference, and see why they are indispensable in research that deals with inherently variable and imperfect real-world data.

Chapter 2  Introducing R & RStudio introduces you to the tools you will use throughout the book: the R language and the RStudio interface. You will learn how to install and navigate RStudio, organise your work, run your first pieces of R code, and use packages to extend R’s capabilities. This chapter is not about turning you into a programmer overnight; it is about giving you a working familiarity with the software so that it becomes a natural part of your analytical toolkit.

Chapter 3  Exploratory data analysis focuses on exploratory data analysis (EDA). Before any modelling begins, you need to understand the structure, types, and patterns in your data. You will learn how to look at data from multiple angles (numerical summaries, visualisations of distributions, and plots of relationships) so that you can spot trends, anomalies, and possible data quality issues early.

Chapter 4  Introducing inference sets out the logic of statistical inference. Here, concepts like sampling distributions, p-values, and confidence intervals are presented as practical tools for reasoning from a sample to a population. This is where the link between probability and evidence becomes clear, and where you begin to see how uncertainty is quantified and communicated in research.

Chapter 5  Data wrangling rounds out the foundations with the practical work of preparing data for analysis. Raw data are often messy, incomplete, or in the wrong format. Using the dplyr and tidyr packages, you will learn how to select, filter, transform, join, and reshape datasets so they are ready for analysis. These skills are as critical as any statistical test because most real-world analyses begin with making sense of, and bringing order to, complex data sources.

Together, these chapters give you the conceptual understanding, technical skills, and computational tools to approach statistical analysis with confidence. They are the essential base on which the rest of the book is built, and by the end of Part I, you will be ready to move from understanding the principles of statistics to applying them in real-world modelling.