POL269 Political Data Research
Javier Sajuria
22.01.2024

Dr Javier Sajuria
Reader in Comparative Politics
📧 : j.sajuria@qmul.ac.uk
💻 : www.sajuria.com
📍 : ArtsOne 2.29
A&F hours: Mondays 3.30pm - 4.30pm. Book via Calendly and by appointment

Dr Elizabeth Simon
Postdoctoral Researcher in British Politics
📧 : e.simon@qmul.ac.uk
💻 : QM profile
📍 : TBC
A&F hours: TBC
First Part
What are the course goals?
How will your grade be determined?
What resources will be available to you?
Syllabus review
Questions?
Introductions
Second Part
Become familiar with RStudio
Become familiar with R
Do calculations: +, -, *, /
Create objects: <-, ""
Use functions: (), sqrt(), #
Seminars are mandatory. They will take 50 minutes.
They will take place on Mondays and Tuesdays; check your timetable to find the right time and place
40% of the course mark is based on a TikTok video
60% of the course mark is based on a final take-home research project
The two assignments will require you to:
Details will follow during the term
Course website: pol269.sajuria.com
There you will find lecture slides, seminar activities, and solutions.
QMPlus access is essential for this course
Students should be automatically enrolled, but please let me know if you have problems accessing it
Elena Llaudet and Kosuke Imai. Data Analysis for Social Science: A Friendly and Practical Introduction, Princeton University Press, 2022
When doing the readings:
Pay close attention to the TIPs; they are written especially for students with no prior experience
Skip the FORMULAS IN DETAIL, unless I tell you otherwise; they contain more advanced-level material
Pay special attention to the key concepts and the R operators and functions
When working on seminars and other exercises, you are required to collaborate with your assigned partner
In the next week, I will send an email to each of you introducing you to your assigned partner
Collaboration can be in person, via Zoom, by phone, email, text, pigeons…
The important thing is that you can ask and answer each other’s questions on a regular basis
Take a few minutes to look over the important module information on QMPlus. Notice that:
Material is cumulative: lectures later in the course assume that you know what was covered earlier in the course
Make sure to keep up with the material and take some time to review each week!
If you miss class, make sure to watch the recording
How are you supposed to fill those 10 hours of work a week?
Attend all seminars and lectures
Review old material: make sure you understand everything we have already covered by reviewing all previous lecture notes in sequence
Learn new material: do the new readings, following along with the exercises on your own computer, and attend lectures and seminars
As part of Homework #0, you should have installed two programs on your computer:
R is the statistical program that will perform calculations and create graphics for us (it’s the engine)
RStudio is the user-friendly interface that we will use to communicate with R
We will never open R directly; we will always start by opening RStudio (RStudio will open R by itself)
Go ahead and open RStudio 
Then, open a new R script:
What is an R script?
R Script (upper left window): where we write and run code
R Console (lower left window): where R provides the executed code and its outputs, including errors
Environment (upper right window): storage room of current R session; lists objects that we have created
Help and Plots tabs (lower right window)
To use R, we need to learn its language
Learning a programming language is like learning a foreign language
Do calculations
Create objects
Use functions
We can use R as a calculator
+, -, *, /
Let’s ask R to calculate 20 plus 5
First, we type in the R script (upper left window): 20+5
Then, to run this code: we highlight it and either manually hit the run icon
or use the shortcut command+enter in Mac or ctrl+enter in Windows
Go ahead and do it
In the Console, you should see the following:
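As a short sketch of the calculator step, the lines below use each of the four arithmetic operators listed earlier. Only the first line (20 plus 5) comes from the slide; the other three are extra illustrations. The [1] that R prints before a result marks the position of the first value in the output.

```r
20 + 5   # addition: the Console prints [1] 25
20 - 5   # subtraction
20 * 5   # multiplication
20 / 5   # division
```

Each line can be run on its own by placing the cursor on it and using the run shortcut.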
R stores information in the form of objects
In order to analyse data, we will need to create objects
An object is like a box that can contain anything

To create one, we need to:
give it a name
specify its contents
use the assignment operator
In R, we use the assignment operator <- to create an object:
To its left, we specify the name of the object
name cannot begin with a number or contain spaces or special symbols like $ or % that are reserved for other purposes
name can contain _ underscores, which are good substitutes for spaces
To its right, we specify the content of the object
For example, type and run:
After running this code, the object twentyfive will show up in the Environment (the upper right window of RStudio)
To find out the contents of an object, you can run the name of the object in R:
This is equivalent to asking R: what is inside twentyfive?
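A minimal sketch of creating an object and then looking inside it. The name twentyfive comes from the slide and its content is assumed to be the number 25; my_number is a hypothetical extra name, included only to show an underscore in use.

```r
twentyfive <- 25      # name on the left, assignment operator, content on the right
twentyfive            # running the name prints the content: [1] 25

my_number <- 20 + 5   # hypothetical name; the content can be the result of a calculation
my_number             # [1] 25
```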
Objects can contain text as well as numbers. Run for example:
Now in the environment there should be two objects
What are they? Note that in this last piece of code we used " around the contents, but we did not use " in the previous piece of code
When do we need to use " when writing code in R?
the names of objects, names of functions, and names of arguments as well as special values such as TRUE, FALSE, NA, and NULL should NOT be in quotes
all other text should be in quotes
numbers should never be in quotes unless you want R to treat them as text
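The quoting rules above can be checked with a few lines. This is only a sketch: the content of class is assumed to be "pol269", as_text is a hypothetical object name, and is.numeric() and is.character() are base R functions that report how R is treating a value.

```r
class <- "pol269"        # text goes in quotes
twentyfive <- 25         # a number, no quotes
as_text <- "25"          # hypothetical object: a number in quotes is treated as text

is.numeric(twentyfive)   # TRUE
is.character(class)      # TRUE
is.character(as_text)    # TRUE: R sees "25" as text, not as a number
```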
What would happen if you run instead: class <- pol269?
Without the ", R thinks that pol269 is the name of an object and returns an error. R is right: there is no object called pol269 in the environment
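One way to look at this error without stopping the script is sketched below. tryCatch() is a base R function that goes beyond what this lecture covers; it is used here only to capture the error message as text. The sketch assumes that no object called pol269 exists in the session.

```r
# Try the unquoted assignment and capture the error message instead of halting
msg <- tryCatch(
  {
    class <- pol269   # pol269 without quotes is read as an object name
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg   # in an English-language session: "object 'pol269' not found"
```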
Running into errors is part of the coding process
do not be discouraged
if you have problems figuring out what a particular error means, google it; there are lots of Q&A sites
if that doesn’t help, post the code and error on our discussion board
R will overwrite objects if you assign new content to an existing object name
For example, after running class <- "data analysis", class will contain the text “data analysis” instead of “pol269”
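A small sketch of overwriting, assuming class was first created with the text "pol269":

```r
class <- "pol269"
class                      # [1] "pol269"

class <- "data analysis"   # same name, new content: the old content is replaced
class                      # [1] "data analysis"
```

R gives no warning when this happens, so it is worth checking the Environment if an object does not contain what you expect.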
R is case sensitive:
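To illustrate, the sketch below creates twentyfive and then asks whether objects with two differently capitalised names exist. exists() is a base R function; the sketch assumes no object called Twentyfive has been created in the session.

```r
twentyfive <- 25

exists("twentyfive")   # TRUE
exists("Twentyfive")   # FALSE: a capital T makes it a different name
```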
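A function takes input(s), performs an action, and produces an output
Functions we will use include: sqrt(), setwd(), read.csv(), View(), head(), dim(), mean(), ifelse(), table(), prop.table(), na.omit(), hist(), median(), sd(), var(), plot(), abline(), cor(), lm(), log(), c(), sample(), rnorm(), pnorm(), print(), nrow(), predict(), abs(), summary(), among others
The name of a function is always followed by parentheses: function_name()
Inside the parentheses, we specify the arguments, separated by commas: function_name(argument1, argument2)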
We typically write code in one of these two formats:
or
Imagine a function called bake() that, by default, bakes the specified ingredient for 60 minutes at 180°C
It has two optional arguments, degrees and minutes, to change the default temperature and duration of the bake, respectively
degrees=200 changes temperature to 200°C
minutes=30 changes duration of bake to 30 minutes
The following code would ask R to bake a cake mix for 30 minutes at 350°F, so that we can have cake as the output:
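bake() is not a real R function, so the sketch below defines a toy version to make the example runnable. The definition, the argument name ingredient, and the text it returns are all assumptions made for illustration. The second call changes only the duration, so the temperature stays at the default of 180°C, which is close to 350°F.

```r
# Hypothetical bake(): defined here only to illustrate default arguments
bake <- function(ingredient, degrees = 180, minutes = 60) {
  paste0(ingredient, " baked for ", minutes, " minutes at ", degrees, " degrees C")
}

bake("cake mix")                                # both defaults: 60 minutes at 180
bake("cake mix", minutes = 30)                  # changes only the duration
bake("cake mix", degrees = 200, minutes = 30)   # changes both
```

Arguments that have a default can be left out; naming them (minutes = 30) tells R which default to replace.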
Example: sqrt() computes the square root of the argument specified inside the parentheses. To compute \(\sqrt{25}\), run:
sqrt is the name of the function, which, like all function names, is followed by parentheses ()
25 is the required argument
5 is the output
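The sqrt() example written out as code. The object name root is hypothetical; it is there to show that the output of a function can be stored in an object and reused.

```r
sqrt(25)           # function name, then the argument in parentheses: [1] 5

root <- sqrt(25)   # store the output in an object
root + 20          # [1] 25
```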
Introductions and Housekeeping
R and RStudio, scripts, console and the environment
R calculations, objects, functions
What are data/datasets?
What is an observation?
What is a variable?
Types of variables based on their content
How to load and make sense of data
Computing and interpreting means
