Importing data into SPSS

Importing some data

If you have a small amount of data you want to get into SPSS, the easiest way is to simply copy and paste it into SPSS. Check that everything ends up in the right cell and that you really have all the data you need. The safest way to get data into SPSS is to import it, especially if you have a lot of data.

Importing a lot of data

SPSS can import data from a number of file formats. In general it is handy, though optional, to have a header row at the top of your file with the names of the columns. This means that you add one line before all the cases, in which you put the name of each column (or variable).
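To show what such a header row looks like, here is a minimal sketch in Python (not SPSS itself). The file contents and the variable names q1, q2 and q3a are made up for this illustration.

```python
import csv
import io

# Hypothetical file contents: the first line is the header row,
# every following line is one case.
raw_text = "q1,q2,q3a\n1,4,2\n2,3,1\n"

reader = csv.DictReader(io.StringIO(raw_text))
variable_names = reader.fieldnames  # names taken from the header row
cases = list(reader)                # one dictionary per case

print(variable_names)
print(len(cases), "cases read")
```

Without the header row, the importing program has no names to use, and you would have to name every column by hand afterwards.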

Importing data from an Excel file
Importing data from an ASCII file
Importing data from a tab delimited text file

5 comments May 3rd, 2006 andris

How to handle multiple response questions

In a lot of research, multiple answers can be given to a single question. For example:

“What kind of food do you like?”
o soup
o rice
o salad

How do you analyse this type of question? Let’s assume you want to make a table with the answers. How do you combine them? Even though it may seem like the easiest thing to do, this is pretty difficult stuff. SPSS is good at analyzing unique combinations of variables (answers to questions) and unique cases (people in a survey). More than one answer per person to a single question does not fit into that logic, so we have to be a little creative.

There are – at least – four different ways to analyze these results. They all have advantages and disadvantages, so it really comes down to your preferences:

1) Create separate variables for each answer 
varsoup (0 = not ticked, 1 = ticked)
varrice (0 = not ticked, 1 = ticked)
varsalad (0 = not ticked, 1 = ticked)

If you’re not entering the data yourself, chances are that this is what your data set will look like if there was a multiple response question in the survey. With three separate variables, you can create three tables using the Frequencies command. The advantage of this approach is its simplicity; the disadvantage is that you get three separate tables, so you cannot tell whether the three answers were in any way linked to one another.
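As an illustration of option 1, here is a small Python sketch (not SPSS syntax) that builds one frequency table per variable. The five respondents are invented for the example.

```python
from collections import Counter

# Hypothetical answers of five respondents, coded as in option 1
# (0 = not ticked, 1 = ticked).
cases = [
    {"varsoup": 1, "varrice": 0, "varsalad": 1},
    {"varsoup": 1, "varrice": 1, "varsalad": 0},
    {"varsoup": 0, "varrice": 1, "varsalad": 0},
    {"varsoup": 1, "varrice": 0, "varsalad": 0},
    {"varsoup": 0, "varrice": 0, "varsalad": 1},
]

def frequency_table(cases, variable):
    """Count how often each value (0 or 1) occurs for one variable."""
    return Counter(case[variable] for case in cases)

# One separate table per variable, just like three Frequencies tables.
for variable in ("varsoup", "varrice", "varsalad"):
    table = frequency_table(cases, variable)
    print(variable, "ticked:", table[1], "not ticked:", table[0])
```

Each table only knows about its own variable, which is exactly why this approach cannot show how the answers go together.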

2) Make a ‘grouping variable’
a. Follow the steps in option 1, creating separate variables for each answer
b. Create a new variable: the ‘grouping variable’. Select Analyze > Tables > Multiple Response Sets… Select all the variables you want to group together from the list and move them into the right-hand window, ‘Variables in set’. Choose the ‘counted value’: this is the value that you want to count as ‘yes’. In our example, this is the value 1. Give the new variable a name, for instance ‘varfood’. Click ‘Add’, and ‘OK’. Your new grouping variable will appear with a $ in front in the list on the right of the screen: ‘$varfood’. The $ sign tells you this is a variable containing several variables.

To make a table, select Analyze > Tables > Multiple Response Tables… In the window at the bottom left of the screen, you see the grouping variables. Select the one you want and move it into the ‘Rows’ box. Click OK and you’re done.
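To give an idea of what a grouping variable adds, the following Python sketch (again not SPSS syntax) counts the value 1 across the three variables and puts the results in a single table. The data are hypothetical. The two percentage columns are the ones multiple response tables commonly report, and are included here as an assumption about what you will want to see.

```python
# Hypothetical answers of five respondents (0 = not ticked, 1 = ticked).
cases = [
    {"varsoup": 1, "varrice": 0, "varsalad": 1},
    {"varsoup": 1, "varrice": 1, "varsalad": 0},
    {"varsoup": 0, "varrice": 1, "varsalad": 0},
    {"varsoup": 1, "varrice": 0, "varsalad": 0},
    {"varsoup": 0, "varrice": 0, "varsalad": 1},
]

set_variables = ["varsoup", "varrice", "varsalad"]  # the 'variables in set'
counted_value = 1                                   # the value that means 'yes'

# How often each answer was ticked.
counts = {
    v: sum(1 for case in cases if case[v] == counted_value)
    for v in set_variables
}
total_responses = sum(counts.values())

# Respondents who ticked at least one answer.
valid_cases = sum(
    1 for case in cases
    if any(case[v] == counted_value for v in set_variables)
)

for v in set_variables:
    pct_of_responses = 100 * counts[v] / total_responses
    pct_of_cases = 100 * counts[v] / valid_cases
    print(f"{v}: n={counts[v]}, "
          f"{pct_of_responses:.1f}% of responses, "
          f"{pct_of_cases:.1f}% of cases")
```

Because people can tick more than one answer, the percentages of cases add up to more than 100, while the percentages of responses add up to exactly 100.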

3) Several variables with a hierarchy
To use this method, it helps if your survey also asked respondents to rank the three items. If it did not, you will have to decide for yourself which answer is most important and label that one as the first answer.

This will lead to:

varfood1 (1 = soup, 2 = rice, 3 = salad) label “First answer”
varfood2 (1 = soup, 2 = rice, 3 = salad) label “Second answer”
varfood3 (1 = soup, 2 = rice, 3 = salad) label “Third answer”

4) Create a variable with a single value for each possible combination
This will lead to:

varfood

1 = soup,
2 = rice,
3 = salad,
4 = soup and rice,
5 = soup and salad,
6 = rice and salad,
7 = soup and rice and salad

The disadvantage is that this can be quite a lot of work when you’ve got more than three answers. You also run the risk of not being able to interpret your resulting table at a glance, because the number of cases per option can be quite small. This is really only a serious option if you want to know exactly what each person answered. Pay attention to the order of the answers: if you start with the combinations that were most popular, they will be at the top of your table, making it easier to interpret the results.
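The coding scheme of option 4 can be sketched in Python like this. The function name and the lookup table are our own invention for the illustration; the codes 1 to 7 are the ones listed above.

```python
# Codes for every non-empty combination, as listed for 'varfood'.
COMBINATION_CODES = {
    ("soup",): 1,
    ("rice",): 2,
    ("salad",): 3,
    ("soup", "rice"): 4,
    ("soup", "salad"): 5,
    ("rice", "salad"): 6,
    ("soup", "rice", "salad"): 7,
}

def combination_code(varsoup, varrice, varsalad):
    """Turn three 0/1 variables into a single combination code.

    Returns None when nothing was ticked (no code is defined for that).
    """
    flags = (("soup", varsoup), ("rice", varrice), ("salad", varsalad))
    ticked = tuple(name for name, flag in flags if flag == 1)
    return COMBINATION_CODES.get(ticked)

print(combination_code(1, 0, 1))  # soup and salad
print(combination_code(1, 1, 1))  # all three
```

With n possible answers there are 2^n - 1 non-empty combinations: 7 for three answers, but already 31 for five. That is why this option quickly becomes a lot of work.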

(Thanks to Sander for answering this question)

27 comments May 2nd, 2006

Entering data in SPSS

We write this post assuming that you have already defined your variables, and that you have not already entered data in another program. If you have, please refer to the post about importing data into SPSS.

Open your SPSS file, and make sure that the tab selected at the bottom is the “Data View” tab. In this view, the rows represent the observations (or respondents, cases) and the columns represent your variables.

SPSS Empty data sheet

In this example we will use the data which you can find in our SPSSlog.com example data sheet (PDF file). One handy thing when filling in data is to set SPSS to show the labels. You do this by choosing View -> Value Labels. If you have filled in everything from the data sheet, and chosen this option, your SPSS file should look like this:

SPSS Filled in data sheet

When you have to enter a lot of data, be sure to save your file now and then. Typing in data is not the coolest job, and you certainly don’t want to spend hours typing in the same data again. After you have filled in your data, don’t forget to check the data for errors!

Files used for this example:

SPSSlog.com example data sheet
SPSSlog.com SPSS file
SPSSLog.com example questionnaire

1 comment May 1st, 2006 andris

How to define your variables

Start SPSS, and create a new data file (choose “Type in data” in the first dialog window). You now see a file that looks like a Microsoft Excel file. At the bottom of your screen, you see two tabs. The tab “Data View” is selected. This is the place to type in your data. Before we take that step, we have to define our variables. Next to the tab “Data View” you see the tab “Variable View”. When you click this tab, you get the following screen:

SPSS Variable View

For each variable there is a row in this view. In the columns you can find several properties of a variable, such as name, type and width. Below you will find an explanation of the most important columns.
In the Name column you have to define the name of the variable. Names are subject to a number of restrictions, of which the following are the most important:

- The name must start with a letter.
- The name should be 8 characters max (depending on your version of SPSS).
- The name should not be one of the keywords that SPSS uses to make statistical calculations (like AND, NOT, EQ, BY, WITH and ALL).

You can find the other limitations in the SPSS help file. Personally I always try to name my variables as follows: for question 1, I name the variable q1, for question 3a, variable q3a etc.
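If you want to check a list of names before typing them into SPSS, the three rules above can be sketched in Python as follows. This is an illustration only: the function name is made up, the keyword list contains just the examples mentioned in this post (it is not complete), and the limit of 8 characters depends on your version of SPSS.

```python
# Only the example keywords from this post; the real list is longer.
RESERVED_EXAMPLES = {"AND", "NOT", "EQ", "BY", "WITH", "ALL"}

def check_variable_name(name, max_length=8):
    """Return a list of problems with a variable name (empty if none)."""
    problems = []
    if not name or not name[0].isalpha():
        problems.append("must start with a letter")
    if len(name) > max_length:
        problems.append(f"longer than {max_length} characters")
    if name.upper() in RESERVED_EXAMPLES:
        problems.append("is a reserved keyword")
    return problems

for candidate in ("q1", "q3a", "1st", "by", "satisfaction_score"):
    print(candidate, "->", check_variable_name(candidate) or "ok")
```

Short names like q1 and q3a pass all three checks, which is one more reason to use them.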
After you have typed in a name, and pressed “Enter“, the other columns (except Label) are filled in automatically:

SPSS Question 1


The next thing you have to do is adjust these columns (where necessary):

- Adjusting the Type
When you have hit Enter, the focus goes to the Type column, and a grey box appears in the right part of this cell. If you click on this box, you can select the variable type from a dialog window. The types you should focus on in the beginning are Numeric and String. Use the String type for questions with open answers. Use the Numeric type for questions where, for example, the respondent chooses from a limited number of answers which you have defined in advance.

- Adjusting Width
If you have selected the String type, then adjust the width to the maximum (255 characters in some versions of SPSS, more in others). If you have chosen Numeric, you can go with the standard width of 8 characters.

- Adjusting decimals
In most cases it is not necessary to adjust this.

- Adjusting Label
This is the place where you can fill in a descriptive text for your variable. This can be the text of your question or, in some cases, of the answer category. This label will appear in the output.

- Adding Value Labels
For Numeric type questions, you can predefine the answers. To add value labels, click the grey box at the right of the cell. In the dialog window you can add the value labels:

Insert value labels


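- Adding Missing values
Here you can define values that SPSS should treat as missing, for example a code you use for ‘no answer’. These values are then left out of your analyses.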

- Adjusting column align and width
These columns do not have to be changed when you are starting out with SPSS, so we leave them untouched.

- Adjusting Measurement
This is the last column, where you choose the right measurement level for the variable. You can choose between scale, ordinal and nominal. You can read here which measurement level to choose for which question.

Repeat these steps for all of your questions, and you have defined all your variables!

Files used for this example:

SPSSlog.com example data sheet
SPSSlog.com SPSS file
SPSSLog.com example questionnaire 

4 comments April 29th, 2006 andris

Get SPSS in your own language

Today Giuseppe from Italy sent us the following question:

“How can i translate spss into the italian language?”

SPSS is available in several different languages, namely English, Japanese, French, German, Italian, Spanish, Chinese, Polish, Korean, and Russian. The website of SPSS says you should “contact your local office to find out version information and more”. Visit the SPSS website to find a list of local offices. And Giuseppe, especially for you, the SPSS website in Italian. :)
If you have any more questions about SPSS, please submit your question!

April 29th, 2006 andris

One sample t-test

Many visitors of our blog are searching for information about the one sample t-test.

You perform a one-sample t-test when you want to determine if the mean value of a target variable is different from a hypothesized value.

To perform a one-sample t-test in SPSS, choose Analyze > Compare Means > One-Sample T Test.

ttest1.JPG

Move the variable of interest to the Test variable(s) box. Change the test value to the hypothesized value. Click the OK button.

The output from this analysis will contain the following sections.

One-Sample Statistics. Provides the sample size, mean, standard deviation, and standard error of the mean for the target variable.

One-Sample Test. Provides the results of a t-test comparing the mean of the target variable to the hypothesized value.

A significant test statistic indicates that the sample mean differs from the hypothesized value. This section also contains the upper and lower bounds of a 95% confidence interval for the difference between the sample mean and the hypothesized value.
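To see where the numbers in the output come from, here is a small Python sketch (not SPSS) that computes the same basic statistics by hand. The scores and the test value of 4 are made up. The sketch stops at the t value and the degrees of freedom: the significance level and the confidence interval need the t distribution, which Python’s standard library does not provide.

```python
import math
import statistics

def one_sample_t(values, test_value):
    """Compute the basic one-sample t-test statistics by hand."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)    # sample standard deviation (n - 1)
    se = sd / math.sqrt(n)           # standard error of the mean
    t = (mean - test_value) / se     # t statistic
    return {"n": n, "mean": mean, "sd": sd, "se": se, "t": t, "df": n - 1}

scores = [4, 5, 6, 5, 5]             # hypothetical data
result = one_sample_t(scores, test_value=4)

for name, value in result.items():
    print(name, "=", round(value, 4))
```

The larger the t value (in either direction) for a given number of degrees of freedom, the less likely it is that the difference from the hypothesized value is due to chance.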

If you have a question about the one-sample t-test, submit your question here.

3 comments April 26th, 2006

Performing a MANOVA in SPSS

Ivy sent us an e-mail about investigating the interaction effect of independent variables. MANOVA (multivariate analysis of variance) is a statistical procedure that allows you to determine if a set of categorical predictor variables can explain the variability in a set of continuous response variables.

In SPSS you can perform a MANOVA as follows:

- Choose Analyze -> General Linear Model -> Multivariate.
- Move the DVs (dependent variables) you want to examine to the Dependent Variables box.
- Move any categorical IVs (independent variables) to the Fixed Factor(s) box.
- Move any continuous IVs to the Covariate(s) box.
- Click OK and there you have your output.

If you have any more questions about MANOVA or ANOVA, submit your questions!

18 comments April 25th, 2006 andris

100% stacked bar graph problem

This Friday we got a question from Els, who has a problem with making her stacked bar graphs look good:

“As a trainee I am now analysing the results of a customer satisfactory investigation. Many people advised me to use SPSS, so I did.

Most of the questionnaire questions are built the same way (very dissatisfied, dissatisfied, satisfied, very satisfied).

The report will be devised in subjects (price, quality, reaction speed,etc). Each subject contains around 5 questions.

For each subject I made a horizontal graph in which all 5 questions regarding that subject are being displayed. This way I will analyze around 5 questions in each graph, as in this example.

bar1.gif

The stacked graphs I made myself are horizontal, 100% stacked, so the bar fills the entire graph horizontally (like in the example). So far, so good, it looks great!

Then the problem: Inside the bars I would like to show the exact percentage over that question.(like in the example).

The problem is, that these figures are incorrect. Now I see the percentage concerning the entire graph, all the questions together, in stead of per question.

Can you tell me what I have to do to change the figures inside the bars from a percentage over the entire graph to a percentage over the bar?”

To solve the problem we will describe step by step how to make a 100% stacked bar chart, and how to get the exact percentages into it. First select Tables > Tables of Frequencies.

Now drag the five questions you want to chart to the ‘Frequencies for’ box. Click on Statistics and choose Percent. Click OK, and you are back at the ‘Tables of Frequencies’ screen. Click OK again to get a table.
Select the table by double-clicking it, then right-click and select Create Graph > Bar. You will now get the following screen:

stackedbar2.jpg
Click the button marked in red in the picture above. In this screen, change the option after ‘Color’ from Cluster to Stack. Go back to the graph and click the Horizontal orientation button.

After this, double-click one of the bars. A new screen will open. In this screen you can click on values in bar labels and then choose the location of the percent value. Click Inside base and then click OK. You will then have 100% stacked bars, with the exact percentage for each question inside.
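The difference between the two kinds of percentages in Els’s question is easy to show with a little arithmetic. The following Python sketch uses invented counts for two of the subjects she mentions. It only illustrates the calculation, not what SPSS does internally.

```python
# Hypothetical number of respondents per answer, for two questions.
counts = {
    "price": {
        "very dissatisfied": 5, "dissatisfied": 15,
        "satisfied": 20, "very satisfied": 10,
    },
    "quality": {
        "very dissatisfied": 10, "dissatisfied": 10,
        "satisfied": 20, "very satisfied": 10,
    },
}

# All answers to all questions together.
grand_total = sum(sum(answers.values()) for answers in counts.values())

def percent_of_graph(question, answer):
    """Percentage over the entire graph (the wrong figure)."""
    return 100 * counts[question][answer] / grand_total

def percent_of_bar(question, answer):
    """Percentage over one bar, so one question (the wanted figure)."""
    return 100 * counts[question][answer] / sum(counts[question].values())

print(percent_of_graph("price", "satisfied"))
print(percent_of_bar("price", "satisfied"))
```

Only the percentages over the bar add up to 100 within each question, which is what you expect to read inside a 100% stacked bar.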

If you also need an answer to your SPSS question, submit your question here.

17 comments April 22nd, 2006

Looking for patterns in data

This weekend we got a question from Kat, who is desperately ;) looking for our help. She is working on a project with a lot of data in Excel:

“I am in the middle of a project for which I have constructed a large table in Microsoft Excel.

The table consists of variables going across the top, and cases down the side. The cells contain numbers (ie. the frequency of each variable within each case). Many of the cells have no number or the variable has zero frequency in that case.

I want to look for patterns with in the data, that is , similarities between cases. It has been suggested that I do this visually, however the table is so big it would take forever. I am wondering if you know of an application within SPSS which looks for patterns in this way. It would be an enormous help to my work if there is such a thing.”

Well Kat, to begin, of course you should always use SPSS, and never Excel for analysing your data. :)
There are two ways to solve your problem:

1. Do it the hard way by comparing all answers with a correlation test. This means you would have to compare all the questions (and their answers) to see if there is a correlation between them. Doing this, you will in the end have found all relationships and be able to find the patterns.

2. There is an easy way, but for this you need an extra piece of software from SPSS, called “SPSS Categories“. In the software you can put in ratings and categories, and the software will display graphically (and in figures) the possible relationships in your data.
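To give an idea of what the first approach involves, here is a Python sketch (not SPSS) that calculates the Pearson correlation for every pair of variables in a small table. The variable names and numbers are invented, and we assume Pearson’s correlation is the test you want. Whether it suits Kat’s frequency data is a separate question.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers.

    Fails with a division by zero if one of the lists is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)

# Hypothetical table: one list of frequencies per variable.
data = {
    "var_a": [1, 2, 3, 4, 5],
    "var_b": [2, 4, 6, 8, 10],
    "var_c": [5, 4, 3, 2, 1],
}

# Compare every variable with every other variable.
correlations = {
    (a, b): pearson(data[a], data[b])
    for a, b in combinations(data, 2)
}

for pair, r in correlations.items():
    print(pair, round(r, 3))
```

With three variables there are only three pairs to compare, but with 50 variables there are already 1,225. That is why we call this the hard way.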

Good luck analysing your data!

April 19th, 2006 andris

New categories and difficulty rating

We have a great website visitor programme, called Google Analytics. With this we get a lot of information about you, our visitor. Nothing too personal, however, so you don’t have to worry. We get to see what you search for, how long you stay, whether you come back and how many times. We also see which links you click on. We found out that most people click on the category “Questions and answers”. We decided that we should help you a little bit, and make it easier to search within this category. To do so, we have added subcategories to this category, based on the steps people normally take when using SPSS. Every question we answer and every tip or trick we write will be placed in one of these subcategories. We hope this will make life with SPSS a little bit easier for you. :)

We have also added a rating to the Questions and answers posts. You can find the rating just below the title of a post. This rating shows what level of SPSS knowledge we think you need in order to use the information in the post. So if you are a beginner, and see a post with a 4 or 5 star rating, for example “Finding correlations using the Pearson correlation analysis.”, then you know you should stay far away from that one.

Happy SPSS-ing! And oh yeah, Happy Easter to everyone.

April 13th, 2006 andris
