Introductory Statistics
Basic Introduction to Statistics
What is statistics, you ask? Well, statistics (or stats) is defined as the math of collecting, analyzing, managing, and presenting data. That’s a very broad definition, which means that stats is a very broad and complex field.
Here we’ll cover just the bare basics of stats, so don’t worry about things going over your head. It’ll be fine, I promise.
First of all, what’s data?
In relation to stats, data is any sort of information you can think of in mathematical terms. A collection of shoe sizes of all the people in a company is data. The colours of the different M&M candies in a small bag are data. The running speeds of top athletes in the Olympic 100-m dash are data.

There are 2 types of data. The first type we’ll call discrete data. This type of data is relatively simple, since it can take only separate, countable values. Shoe sizes, number of coffee cups, anything that can’t be divided infinitely is discrete data. You’ll also hear about categorical data, like the M&M colours above, where the values are labels rather than numbers; it is often grouped together with discrete data. The other type is continuous data. This includes things like speed, light intensity, time, and everything that can be divided into infinitely small intervals.
So what do we do with data? We first collect it.
How do we collect data?
Whenever you collect data, you first need to define what you’re looking for (and preferably why). Define your population: the whole range of things you’re checking (people, a certain species of animal, a type of chemical, etc.). Of course, it’s hard to check the entire population, especially when it’s very large, so you can use sampling, which means checking only part of the population. Common sampling methods include:

- Simple random sampling, where you randomly select members of the population and gather data from them.
- Systematic sampling, where you sample at a fixed interval, like every 10th person from the phone book, rather than completely at random.
- Cluster sampling, where you choose a smaller area (a cluster) of the population and randomly sample within that area.
- Convenience sampling, where you grab whatever samples are convenient to you in terms of location or accessibility. This isn’t random, so it’s especially prone to bias.
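The four sampling methods above can be sketched in a few lines of Python. This is only an illustration: the population of 100 numbered employees, the cluster size of 20, and the sample size of 10 are all made-up assumptions, not values from this lesson.

```python
import random

# Hypothetical population: employee IDs 1..100 (an assumption for illustration).
population = list(range(1, 101))
rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simple random sampling: every member has the same chance of being picked.
simple_sample = rng.sample(population, 10)

# Systematic sampling: pick a random starting point, then take every 10th member.
step = 10
start = rng.randrange(step)
systematic_sample = population[start::step]

# Cluster sampling: split the population into groups of 20,
# choose one group at random, then sample randomly inside it.
clusters = [population[i:i + 20] for i in range(0, len(population), 20)]
chosen_cluster = rng.choice(clusters)
cluster_sample = rng.sample(chosen_cluster, 10)

# Convenience sampling: take whoever is easiest to reach, here the first 10.
convenience_sample = population[:10]

print("simple:     ", sorted(simple_sample))
print("systematic: ", systematic_sample)
print("cluster:    ", sorted(cluster_sample))
print("convenience:", convenience_sample)
```

Notice that the convenience sample always contains the same 10 people, which is exactly why it tends to be biased.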
There are many ways to collect data. You can physically conduct experiments and measurements to find things out. Of course, you have to design these experiments so that they’re objective and valid; otherwise nobody will trust your data.
A good way to find out data about people is statistical surveying. Surveys can be conducted by phone, by mail (like a population census), online, or even in person. The good thing about surveys is that they can be used to get a large amount of data pretty easily, and there are methods of verifying their statistical accuracy and relevance. The bad side is that they can encourage some forms of bias.
Bias is what happens when a sampling method isn’t completely random, leading some types of data to be more probable than others for no statistical reason. For example, if you use cluster sampling on an area that doesn’t represent the whole population accurately, your data will become biased. A good example of bias is saying that America is full of violence and crime because that’s what is seen on international news. This is very biased, since boring people who don’t commit crimes won’t be shown on the news, so the population isn’t sampled fairly and accurately.

The final important piece of information used for data gathering is sample size: how many data points did you actually collect? In general, the more data you have, the better. Consider this: I’m doing a survey to see how many people believe in ghosts. I ask 3 people, and 2 of them happen to believe in ghosts, so I publish a paper saying 2 out of every 3 people in Canada believe in ghosts. Clearly, I haven’t sampled enough people. If I found that out of 3 million people 2 million believe in ghosts, my paper would make more sense.
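To put a rough number on why 3 people aren’t enough, here is a minimal sketch that uses the common approximate formula for the 95% margin of error of a proportion, 1.96 × sqrt(p(1 − p) / n). The formula isn’t covered in this lesson, it assumes simple random sampling, and it is really only reliable for large samples, so treat the n = 3 result as an illustration of how wide the uncertainty is. The function name `margin_of_error` is my own.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p based on n answers."""
    return z * math.sqrt(p * (1 - p) / n)

p = 2 / 3  # share of people who said they believe in ghosts

small = margin_of_error(p, 3)
large = margin_of_error(p, 3_000_000)

print(f"n = 3:         {p:.2f} +/- {small:.2f}")
print(f"n = 3,000,000: {p:.2f} +/- {large:.4f}")
```

With 3 people the uncertainty is more than 50 percentage points either way; with 3 million it shrinks by a factor of 1000, because the margin scales with one over the square root of the sample size.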
But let’s say we used good sampling, a large sample size, eliminated bias, and now we have a collection of lots of data. Now what?
It’s time for getting organized
There are several ways to organize data that make it easy to analyze:
- Ordered list. Simply create a list of your data, in order.
- Table. You can use whatever criteria you want to sort the data. Here’s an example: I collect shoe size samples from my office, and find this: 7, 11, 9, 10, 9, 6, 10.5, 7.5, 9. I can organize them in a table like so:
| Range of shoe sizes | Number of employees |
| --- | --- |
| [6-7) | 1 |
| [7-8) | 2 |
| [8-9) | 0 |
| [9-10) | 3 |
| [10-11) | 2 |
| 11+ | 1 |
- Graph. Graphs are nice, visually effective ways of organizing data. For discrete data like mine, we use a histogram, since it’s simple and convenient:

- For continuous data, we can use regular graphs. For example, say I measure the time it takes to climb the stairs after you drink x cups of coffee. Here’s a table of the data:
| Number of coffee cups | Time to climb stairs |
| --- | --- |
| 0 | 13.6 |
| 1 | 12.1 |
| 2 | 11.7 |
| 3 | 9.5 |
| 4 | 12.2 |
| 5 | 13.8 |
| 6 | 15.7 |
And here’s the lovely resulting graph:

As you can see, we have organized the data in a simple but effective manner. And now comes the important part.
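Before moving on, here is a minimal sketch of how the shoe-size table and a text version of its histogram could be produced in Python. The helper name `bin_label` is my own; the data and the ranges are the ones from the shoe-size example.

```python
import math
from collections import Counter

shoe_sizes = [7, 11, 9, 10, 9, 6, 10.5, 7.5, 9]  # data from the lesson

def bin_label(size):
    """Return the table row a shoe size falls into, e.g. 7.5 -> '[7-8)'."""
    if size >= 11:
        return "11+"
    low = math.floor(size)
    return f"[{low}-{low + 1})"

# Rows of the table, in order, including ranges that end up empty.
labels = [f"[{low}-{low + 1})" for low in range(6, 11)] + ["11+"]
counts = Counter(bin_label(size) for size in shoe_sizes)

# Print each range, its count, and a bar of '#' characters as a text histogram.
for label in labels:
    print(f"{label:<8} {counts[label]}  {'#' * counts[label]}")
```

A `Counter` returns 0 for a range nobody falls into, so the empty [8-9) row still shows up.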
Analyzing the data using stats methods
There are many things we can use to analyze data.
For one-variable data, like the shoe size experiment, there are a lot of things we can find out. For example, we can see that the mean (adding everything up and dividing by the number of entries) is 79 / 9, or about 8.78. We can also find the standard deviation, which measures the dispersion of the data around the mean. Together, they tell us how the data is distributed. If a new worker joined the office, we could expect their shoe size to fall close to the mean. We can also find the median, the middle value once the data is sorted, which is 9, and the mode, the most frequent value, which is also 9. They tell us that at least half the office has shoe size 9 or above and at least half has 9 or below, and that 9 is the most popular shoe size.
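These numbers can be checked with Python’s built-in `statistics` module. This is a minimal sketch; note that `statistics.stdev` computes the sample standard deviation (dividing by n − 1), which is one of two common conventions, and the lesson doesn’t say which one it has in mind.

```python
import statistics

shoe_sizes = [7, 11, 9, 10, 9, 6, 10.5, 7.5, 9]  # data from the lesson

mean = statistics.mean(shoe_sizes)      # sum of the values divided by their count
median = statistics.median(shoe_sizes)  # middle value of the sorted data
mode = statistics.mode(shoe_sizes)      # most frequent value
spread = statistics.stdev(shoe_sizes)   # sample standard deviation (n - 1)

print(f"mean   = {mean:.2f}")
print(f"median = {median}")
print(f"mode   = {mode}")
print(f"stdev  = {spread:.2f}")
```

The standard deviation comes out at about 1.66, so most shoe sizes in the office sit within a size or two of the mean.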

For 2-variable data, like the coffee cups and stairway climbing time, statistical analysis can tell us how these variables relate. Using a method called least squares, we can draw a curve or line of best fit. On the graph above, I’ve used a parabola since it seems appropriate for the data. The parabola suggests that up to about 3 cups of coffee, each extra cup lets you climb the stairs faster, but once you go over 3 cups you’ll start climbing slower and slower. From this data we can look at the ingredients and effects of coffee on people and start drawing conclusions.
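Here is a minimal sketch of a least-squares parabola fit for the coffee data, written in plain Python. It is one way to do the fit, not necessarily how the original graph was made: it builds the standard normal equations for y = a·x² + b·x + c and solves them by elimination. The function name `fit_parabola` is my own.

```python
def fit_parabola(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c via the normal equations."""
    # Sums of powers of x, and sums of powers of x times y.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [v - factor * p for v, p in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

cups = [0, 1, 2, 3, 4, 5, 6]
times = [13.6, 12.1, 11.7, 9.5, 12.2, 13.8, 15.7]  # data from the lesson

a, b, c = fit_parabola(cups, times)
fastest = -b / (2 * a)  # x-coordinate of the parabola's lowest point

print(f"time = {a:.3f}*cups^2 + {b:.3f}*cups + {c:.3f}")
print(f"fastest climb at about {fastest:.1f} cups")
```

Since `a` is positive the parabola opens upward, and its lowest point lands between 2 and 3 cups, which matches the trend described above.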

One important note: correlation does not equal causation. This means that even though coffee cups seem to be correlated with climbing speed, it doesn’t mean that drinking coffee causes you to go faster or slower. It’s possible that it’s just a coincidence, or else that other factors are involved. Further research and analysis are required to make a substantial, statistically valid conclusion.
That’s it for basic stats!
Of course, statistics includes a lot more. Regression analysis, hypothesis testing, and various other tools and methods are involved. If this lesson seems interesting to you, you should consider learning more about stats!
10 out of every 9 people love stats!
Photo Credits:
- Beijing Olympics- Usain Bolt, by rich115
- Free Samples, by avlxyz
- Graphitti: Bias, by Franco Folini
- Shoes, by davitydave
- Stairs, by AlexPears