# Correlations

Correlation measures the linear relationship between two variables. The first measure used to check how two variables move together is covariance. Covariance has drawbacks: its value can range from minus infinity to plus infinity and depends on the units of the variables, so its magnitude says nothing about the strength of the relationship. Correlation overcomes these drawbacks. It is the covariance divided by the product of the standard deviations of the two variables, which makes it the standardized version of covariance. The correlation between two variables always lies between -1 and +1.
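
Written as a formula, the standardization described above might look as follows. This is the usual sample version; the symbols $r_{XY}$, $s_X$, $s_Y$, $\bar{x}$, $\bar{y}$ and $n$ are notation assumed here and do not appear in the original text.

$$
\operatorname{cov}(X, Y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
r_{XY} = \frac{\operatorname{cov}(X, Y)}{s_X \, s_Y},
\qquad
-1 \le r_{XY} \le 1
$$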

Implication of correlation values

Correlation captures only the linear relationship between the variables. A zero correlation does not mean that the variables are unrelated; it means that there is no linear relationship between them, and they can still be related non-linearly.

A value of +1 means that the variables are perfectly positively correlated: if one variable increases, the other also increases. A value of -1 means that they are perfectly negatively correlated: if one variable increases, the other decreases. Correlation does not establish causality; it gives only a directional view, so a fall in one variable does not mean that it was caused by the other variable (Keller, 2011).
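
The implications above can be checked numerically. The following is a minimal sketch in plain Python, assuming the sample (n - 1) definitions of covariance and standard deviation; the function names and the small data sets are made up for illustration.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sample covariance of two equally long lists (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the two sample standard deviations."""
    sx = sqrt(covariance(xs, xs))  # the covariance of a variable with itself is its variance
    sy = sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sx * sy)


# Hypothetical data, chosen only to illustrate the three cases.
x = [1, 2, 3, 4, 5]
up = [2 * v + 1 for v in x]      # rises with x along a straight line
down = [10 - 3 * v for v in x]   # falls as x rises, along a straight line
x_sym = [-2, -1, 0, 1, 2]
curve = [v * v for v in x_sym]   # exact but non-linear (quadratic) relationship

r_up = correlation(x, up)
r_down = correlation(x, down)
r_curve = correlation(x_sym, curve)

# Rescaling x multiplies the covariance by the same factor but leaves the correlation unchanged.
cov_small = covariance(x, up)
cov_large = covariance([100 * v for v in x], up)
r_rescaled = correlation([100 * v for v in x], up)

print(r_up, r_down, r_curve)
print(cov_small, cov_large, r_rescaled)
```

The quadratic case is the one to notice: `curve` is completely determined by `x_sym`, yet the correlation comes out as zero because the relationship is not linear.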

## Regression

Regression is a statistical method for finding a relationship between variables. There is a single dependent variable and there can be many independent variables. With one independent variable it is called simple regression; with more than one it is called multiple regression.

The user attempts to understand how the dependent variable will change as the independent variable changes.

Regression can be used to explain the variation in the dependent variable with the help of the independent variables. Here, variation refers to the variability of the variable around its mean value. There are different types of regression depending on the data, and each type comes with assumptions that need to be considered when running the regression. Simple regression, time series regression and dummy variable regression are some examples. Regression uses the least squares method to fit its equation: it chooses the line for which the sum of the squared distances between the line and the observed data points is smallest.
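
As an illustration of the least squares idea, the sketch below fits a simple regression line using the standard closed-form formulas for one independent variable. The data (`hours`, `score`) and the function names are hypothetical.

```python
def fit_simple_regression(xs, ys):
    """Return (slope, intercept) of the least squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the line and the data points."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def r_squared(xs, ys, slope, intercept):
    """Share of the variation of y around its mean that the line explains."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sum_squared_residuals(xs, ys, slope, intercept) / ss_total


# Hypothetical data: hours studied (independent) and test score (dependent).
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

slope, intercept = fit_simple_regression(hours, score)
best = sum_squared_residuals(hours, score, slope, intercept)
# Any other line, for example one with a slightly steeper slope, fits worse.
worse = sum_squared_residuals(hours, score, slope + 0.5, intercept)
r2 = r_squared(hours, score, slope, intercept)

print(slope, intercept, best, worse, r2)
```

Comparing `best` with `worse` shows what "least squares" means in practice: nudging the fitted line in any direction increases the sum of squared distances.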

### Pareto Chart

The above diagram gives an idea of how a Pareto chart looks when plotted from data. It combines two basic diagrams, a bar chart and a line diagram. The vertical axis of the bar chart is on the left-hand side and the vertical axis of the line diagram is on the right-hand side.

The bars are arranged in descending order and show the individual values of the data. The line diagram shows the cumulative total of those individual values.

The main purpose of a Pareto chart is to find the major contributors, or main factors, in the data that affect the user. For example, the chart above shows what causes late arrivals: the most important factor is traffic and the least important factor is emergency. For other data, the user can identify the major factors in the same way by looking at the Pareto diagram.
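
The numbers behind a Pareto chart can be sketched in a few lines of Python. Only the ranking of traffic (largest) and emergency (smallest) comes from the text; the other categories and all the counts below are invented for illustration.

```python
def pareto_table(counts):
    """Return rows of (category, count, cumulative percent), largest count first."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(counts.values())
    rows = []
    running = 0
    for category, count in ordered:
        running += count  # running total drives the cumulative line
        rows.append((category, count, round(100 * running / total, 1)))
    return rows


# Hypothetical counts of reasons for late arrival.
late_arrivals = {
    "Emergency": 4,
    "Traffic": 48,
    "Overslept": 18,
    "Child care": 22,
    "Public transport": 8,
}

rows = pareto_table(late_arrivals)
for category, count, cumulative in rows:
    print(f"{category:<18}{count:>4}{cumulative:>8}%")
```

The `count` column corresponds to the bars (left axis) and the cumulative percentage to the line (right axis).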

### Flow Chart

A flow chart is a diagram that represents an algorithm or a process. It uses different shapes of boxes, and each shape represents a certain kind of step. The boxes are connected with arrows. An oval represents the start or end of the flow chart, a diamond represents a decision, a rectangle represents a general process step, a parallelogram represents input or output, and a subroutine is represented by a rectangle with double-struck vertical edges.

A flow chart can be used to represent the normal flow of a problem or the procedure used to carry out a certain decision. By looking at the flow chart, the user can get an idea of how a certain procedure is executed (Keller, 2011).

### Scatter Diagram

A scatter diagram is a collection of points on a graph. Each point represents a pair of values of two variables (i.e. an X/Y pair).

Correlation, the standardized version of covariance, is always between -1 and +1 for any two variables (Keller, Warrack & Bartel, 1988). The scatter diagram with data is shown in the diagram below:

This graph can be used to interpret the correlation between the two variables. If the points slope upward, there is a positive correlation between the two variables. If they slope downward, there is a negative correlation. If the dots are scattered all over the graph with no visible pattern, the correlation is close to zero. Thus, by looking at the scatter diagram, the user can easily identify the relationship between the two variables.

### Sampling

Sampling is a method of selecting a subset of data from the universe of data so that the subset is representative of the population for which the statistical analysis is required. It is usually impractical to collect data on the whole population, so the user collects sample data that represent the characteristics of the population, which allows the results of the analysis to be applied to the whole population. There are different sampling methods for obtaining the required data. Random sampling, stratified sampling and systematic sampling are some examples (Keller, 2011).
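
The three methods named above can be sketched with Python's standard `random` module. This is a simplified illustration: the population of 100 numbered units, the "low"/"high" strata and the helper names are all assumptions made for the example.

```python
import random


def simple_random_sample(population, k, rng):
    """Every unit has the same chance of being chosen."""
    return rng.sample(population, k)


def systematic_sample(population, k, rng):
    """Pick a random start, then take every step-th unit."""
    step = len(population) // k
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(k)]


def stratified_sample(population, stratum_of, fraction, rng):
    """Draw the same fraction at random from each stratum (group)."""
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample.extend(rng.sample(members, k))
    return sample


rng = random.Random(0)            # fixed seed so the run is repeatable
population = list(range(1, 101))  # hypothetical population of 100 numbered units


def stratum(unit):
    # Hypothetical grouping: units 1-30 form one stratum, 31-100 the other.
    return "low" if unit <= 30 else "high"


srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(population, stratum, 0.1, rng)

print(srs)
print(systematic)
print(stratified)
```

Stratified sampling guarantees that each group appears in the sample in proportion to its size (here 3 "low" and 7 "high" units), which simple random sampling does not.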

### Fishbone Diagram

This diagram is also called the Ishikawa diagram or the cause and effect diagram, because it is used to analyze what causes an event. The reasons leading to an event are plotted on the diagram, which can then be used to analyze which reasons are most important. The causes are generally grouped into categories to give a better idea of each cause (Cressie & Cassie, 1993).

### Brainstorming

Brainstorming is a technique for generating new ideas for purposes such as solving an issue or finding a new method of execution. A group of individuals sit together and share ideas as they spontaneously come to mind. In the end these ideas are collated to reach a new conclusion. In this method all the ideas are accepted by the group.

This is a very useful method of generating ideas for solving a problem. It helps the stakeholders to think for themselves and to build on the ideas of other people.

### Histograms

A histogram is a graphical representation of how the data are distributed. When the data are continuous, it gives the user a quick idea of the shape of the probability distribution. The data are divided into intervals and each vertical bar represents one interval. The higher the bar, the more data are concentrated in that range. Thus the user can easily judge the concentration of the data (Cressie & Cassie, 1993).
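
The interval counting behind a histogram can be sketched as follows. The equal-width binning rule, the function name and the `heights` data are assumptions chosen for illustration, and the sketch assumes the values are not all identical.

```python
def histogram(values, n_bins):
    """Split the range of the data into equal-width intervals and count values in each.

    Returns (edges, counts). Assumes max(values) > min(values).
    """
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The largest value would land one past the last bin, so clamp it into the last bin.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical heights in centimetres.
heights = [150, 152, 155, 158, 160, 161, 163, 165, 166, 168, 170, 171, 175, 180, 190]

edges, counts = histogram(heights, 4)
for left, right, count in zip(edges, edges[1:], counts):
    # One text bar per interval; a longer bar means more data in that range.
    print(f"{left:6.1f} to {right:6.1f} | {'#' * count}")
```

The longest bar marks the interval where the data are most concentrated, which is the reading described above.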

### Affinity Diagram

An affinity diagram is useful for organizing ideas and is mainly used in project management. The ideas are organized into groups according to their relationship, or affinity, which gives the user a clearer picture and makes the ideas easier to use for decision making. It can be the next step after brainstorming: once the brainstorming session is done, the diagram can be used to organize the ideas and come to a conclusion. It is also useful for organizing notes and data collected from different research. Thus it is a useful project management tool for organizing data and then making decisions based on it.

### References

Keller, G. (2011). Statistics for management and economics. Cengage Learning.

Cressie, N. A., & Cassie, N. A. (1993). Statistics for spatial data (Vol. 900). New York: Wiley.

Keller, G., Warrack, B., & Bartel, H. (1988). Statistics for management and economics: A systematic approach. Wadsworth Publishing Company.