Monday, October 5, 2015

How do you import data in an Excel spreadsheet into R?

MS Excel is an excellent data cruncher which also has statistics related tools to process data in the sheets. R language has package can do staitstical processing of data. Once the data is processed it can be exported so as to create reports. This import and export can be frustrating in some cases taking more time than the statistical processing. R language is language of choice for statistical processing but for not for large scale data.

The easiest type of data that can be imported is the data on a text file. Text file based data is for small and medium amount of data.
I will describe three methods of importing data from a Excel spreadsheet.

First method:
Let us take an example of data on a Excel spreadsheet as shown here:


ExcelOri

Save it as text file as shown in a previous post.

Launch R and in the prompt type as shown:
Enter the location of your .CSV file as shown and clilc Enter


You get an error:
Error: '\U' used without hex digits in character string starting ""C:\U"

We need to change the slash character as shown. Click Enter
Now you get the second error which shows all the needed attributes for calling reading a text file:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 2 did not have 3 elements


Just like Read.table(), Scan() is another function. In fact Read.table() calls Scan() to do the job. The sep in the list refers to what kind of a separator was used. In the .CSV file it is a comma.

Modify the statement to indicate that the separator is a comma and click enter.
Now the result is displayed. Row numbers and Column heading are added.

Use sep = " "  spaces or newlines
Use sep ="\t" for tab

Second method:
 
Open the Excel file shown at the top and copy the column heading and the data as shown:


R_clip
The contents are now in the "Clipboard".

Now in R enter the code as shown. After the error modify the separator (tab instead of comma)
-----------
> mydat <-read .table="" br="" file="clipboard" sep=",">> mydat
                             V1
1 First Name\tLast Name\tAGE\tRent
2         Chris \tLanger\t40\t2500
3          Jean\tSimmons\t80\t1200
4          Tom \tHiggins\t35\t4000
> mydat <-read .table="" br="" file="clipboard" sep="\t">> mydat
          V1        V2  V3   V4
1 First Name Last Name AGE Rent
2     Chris     Langer  40 2500
3       Jean   Simmons  80 1200
4       Tom    Higgins  35 4000
>

-------------------------------
Third method:

You can also use a statement like the following to display the contents of a .CSV file:
> read.csv("C:/Users/mysorian/Desktop/R_Related/names.csv")
  First.Name Last.Name AGE Rent
1     Chris     Langer  40 2500
2       Jean   Simmons  80 1200
3       Tom    Higgins  35 4000
>
=============================

No comments: