## Datascience

# Statistics

Statistics: numbers that summarise raw facts and figures in a meaningful way

They let you analyse data to draw conclusions

## Why?

- Make objective decisions
- Convey meaning
- Make accurate predictions

Statistics are based on facts, but they can be misleading

## Visualising Data

### Pie Chart

- Divide your data into groups whose frequencies, when combined, come to 100% of the total.
- They show **proportions**
- **Useful** if you want to compare basic proportions
- **Less useful** if all the slices have a similar size
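The proportions behind a pie chart can be sketched in a few lines. This is a minimal illustration, assuming made-up genre frequencies; the `frequencies` data and the `proportions` helper are hypothetical names, not part of these notes.

```python
# Hypothetical genre frequencies (illustrative data only)
frequencies = {
    "Sport": 27500,
    "Strategy": 11500,
    "Action": 6000,
    "Shooter": 3500,
    "Other": 1500,
}


def proportions(freqs):
    """Return each group's share of the total as a percentage."""
    total = sum(freqs.values())
    return {group: 100 * count / total for group, count in freqs.items()}


shares = proportions(frequencies)
for group, pct in shares.items():
    # Each slice of the pie corresponds to one of these percentages
    print(f"{group}: {pct:.1f}%")
```

The shares always add up to 100%, which is what lets each group be drawn as a slice of one circle.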

### Bar Chart

- Let you compare relative sizes, with the advantage of **allowing a greater degree of precision**
- Ideal for when categories are roughly the same size
- Horizontal or vertical bar graphs can be used
- Horizontal is better for long field names

The golden rule for charts that show percentages is to indicate the frequencies too, either on the chart or just next to it

Otherwise a percentage taken from a few respondents can be compared with another percentage taken from many respondents, as if the two carried equal weight
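A small sketch of why the frequencies matter, assuming two made-up survey groups; the `describe` helper and the counts are hypothetical. Both groups report the same percentage, but one rests on 4 respondents and the other on 1000.

```python
def describe(label, satisfied, respondents):
    """Format a percentage together with the frequency behind it."""
    pct = 100 * satisfied / respondents
    return f"{label}: {pct:.0f}% satisfied ({satisfied} of {respondents})"


# Hypothetical survey counts (illustrative data only)
small = describe("Genre A", 3, 4)
large = describe("Genre B", 750, 1000)

print(small)
print(large)
```

Printing the counts next to the percentage makes it obvious that the two figures should not be trusted equally.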

What you sometimes need is to show both percentages and frequencies…

### Split-category Chart

For each genre, you can split the respondents into those who are satisfied and those who are dissatisfied, giving each group its own bar

- Useful for comparing frequencies but difficult to see proportions and percentages

### Segmented Bar Chart

Similar to the split-category chart, but the satisfied and dissatisfied groups are shown as segments of a single bar
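The difference between the two chart types can be sketched with text output. This is an illustration under stated assumptions: the `survey` counts and the `segmented_bar` helper are hypothetical, and the segmented bar here is scaled to a fixed width so that it shows proportions (a segmented bar can also be drawn at frequency scale).

```python
# Hypothetical survey counts per genre: (satisfied, dissatisfied)
survey = {"Sport": (180, 20), "Strategy": (30, 70)}


def segmented_bar(satisfied, dissatisfied, width=20):
    """Draw one fixed-width text bar: '#' for satisfied, '.' for dissatisfied."""
    total = satisfied + dissatisfied
    filled = round(width * satisfied / total)
    return "#" * filled + "." * (width - filled)


for genre, (sat, dis) in survey.items():
    # Split-category view: two separate frequencies per genre
    print(f"{genre}: satisfied={sat}, dissatisfied={dis}")
    # Segmented view: one bar per genre, divided by proportion
    print(f"{genre}: {segmented_bar(sat, dis)}")
```

The split-category lines make the raw frequencies easy to compare, while the single bar makes the proportion within each genre easy to see.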

## Categories vs Numbers

Categorical (qualitative) data is split into categories that describe qualities or characteristics.

E.g. genre, breed or type

Numerical (quantitative) data deals with numbers, measurements and counts. It describes quantities.

E.g. weight, length, time

### Histogram

Because the data is numeric, we can display ranges of scores on a continuous scale along an axis.

Histograms are like bar charts but the area of each bar is proportional to frequency and there are **no gaps** between bars.
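One common way to make area proportional to frequency is to set each bar's height to the frequency divided by the width of its range (often called frequency density). The sketch below assumes made-up class intervals; `classes` and `frequency_density` are hypothetical names.

```python
# Hypothetical class intervals: (lower bound, upper bound, frequency)
classes = [(0, 10, 5), (10, 20, 15), (20, 50, 30)]


def frequency_density(lower, upper, frequency):
    """Bar height such that height * width equals the frequency."""
    return frequency / (upper - lower)


heights = []
for lower, upper, freq in classes:
    height = frequency_density(lower, upper, freq)
    heights.append(height)
    # The area of the bar recovers the original frequency
    area = height * (upper - lower)
    print(f"{lower}-{upper}: height={height}, area={area}")
```

The widest range (20 to 50) holds the most scores, yet its bar is not the tallest, because its frequency is spread over a wider interval.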