Showing posts with label Comma Separated Value. Show all posts

Friday, December 8, 2017

How do you plot data with ggplot?

After understanding your data, visualizing it is the next most important step.

Report writing, data analysis, and data mining all require data visualization. It is a must for everyone who works with data, including data scientists.

There are various data visualization tools, such as Power BI, SQL Server Reporting Services, and Tableau, but ggplot2 outshines them in many ways besides being free. It is also one of the most widely used plotting tools in science and statistics.

Let us get some data to plot using ggplot2.

In a previous post I created a CSV file that we can use. Read about this CSV file here:

http://hodentekhelp.blogspot.com/2017/02/how-do-you-import-text-file-into.html


Launch R GUI from your Microsoft R folder here:


Get this data into a data frame using this code in R:

> df <- read.csv("C:/Users/Owner/Desktop/log2017.csv", header=TRUE)
The data can be displayed in R as follows:


ggplot_data2
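
Since the screenshot is not reproduced here, the following is a minimal sketch of loading and inspecting a data frame. The rows are made up for illustration, and a temporary file stands in for the real CSV file; only the column names ProductName and Quantity are taken from the plotting code in this post.

```r
# Hypothetical stand-in for the real CSV file: three made-up rows in a temp file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("ProductName,Quantity",
             "Widget,10",
             "Gadget,25",
             "Gizmo,5"), csv_path)

# header = TRUE tells read.csv() that the first line holds the column names.
df <- read.csv(csv_path, header = TRUE)

str(df)    # column names and types
head(df)   # first few rows
```

Typing df on its own at the prompt prints the whole data frame, which is practical only for small files.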

Now I assume you have the ggplot2 package. If you do not have it, get it as shown here:

http://hodentekhelp.blogspot.com/2015/11/what-is-needed-to-visualize-data-in-r.html

Load the ggplot2 library as shown here:

> library(ggplot2)


Now run the following code in R:
> z=ggplot(data=df, aes(x=ProductName, y=Quantity))
This only passes the data and the aesthetic mappings to ggplot; it does not plot the data. You need to specify what kind of geometric object to draw, which here is done with geom_point(). It is a somewhat unintuitive approach, but that is how it works.
The code to plot would be as shown:
-----------
> y=ggplot(data=df, aes(x=ProductName, y=Quantity))+geom_point()
> y
--------------
This brings up the R graphics window as shown (if you do not see it, click on the Windows menu item in R).


ggplot23_plot.png


Admittedly, this is not a great data set, but it is enough to illustrate the most basic step of visualizing data with ggplot().
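
The difference between the two statements can be checked directly on the plot objects. This is a small sketch with made-up data (only the column names match the ones used above); it assumes the ggplot2 package is installed and counts the layers attached to each object.

```r
library(ggplot2)

# Made-up data; only the column names match the ones used in this post.
df <- data.frame(ProductName = c("Widget", "Gadget", "Gizmo"),
                 Quantity = c(10, 25, 5))

# Data and aesthetic mappings only: no layers yet, so no data is drawn.
base_plot <- ggplot(data = df, aes(x = ProductName, y = Quantity))
n_before <- length(base_plot$layers)

# Adding geom_point() appends one layer that draws a point per row.
point_plot <- base_plot + geom_point()
n_after <- length(point_plot$layers)

c(before = n_before, after = n_after)

# Printing the object (or typing its name at the prompt) renders it:
# print(point_plot)
```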

The function aes() defines the aesthetic mappings, that is, which columns of the data map to which visual properties of the plot. You will learn that the name of this function is quite appropriate. That will be for another day.

Monday, October 5, 2015

How do you import data in an Excel spreadsheet into R?

MS Excel is an excellent data cruncher that also has statistics-related tools to process data in its sheets. The R language has packages that can do statistical processing of data. Once the data is processed, it can be exported to create reports. This import and export can be frustrating, in some cases taking more time than the statistical processing itself. R is a language of choice for statistical processing, but not for large-scale data.

The easiest type of data to import is data in a text file. Text files are suitable for small and medium amounts of data.
I will describe three methods of importing data from an Excel spreadsheet.

First method:
Let us take an example of data in an Excel spreadsheet as shown here:


ExcelOri

Save it as a text file as shown in a previous post.

Launch R and type at the prompt as shown:
Enter the location of your .CSV file as shown and press Enter.


You get an error:
Error: '\U' used without hex digits in character string starting ""C:\U"

We need to change the backslashes to forward slashes as shown. Press Enter.
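
The first error arises because R treats a backslash inside a quoted string as the start of an escape sequence, and \U expects hexadecimal digits after it. A short sketch of the two usual ways to write a Windows path follows; the folder and file name are hypothetical.

```r
# Hypothetical path; substitute the location of your own file.
with_forward_slashes <- "C:/Users/yourname/Desktop/names.csv"
with_doubled_backslashes <- "C:\\Users\\yourname\\Desktop\\names.csv"

# "\\" in the source code is a single backslash in the stored string.
cat(with_doubled_backslashes, "\n")

# Both spellings name the same file once the separators are normalized.
same <- identical(gsub("\\", "/", with_doubled_backslashes, fixed = TRUE),
                  with_forward_slashes)
same
```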
Now you get a second error, which lists the arguments used when reading a text file:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 2 did not have 3 elements


Just like read.table(), scan() is another function for reading data. In fact, read.table() calls scan() to do the job. The sep in the list refers to the kind of separator used. In the .CSV file it is a comma.

Modify the statement to indicate that the separator is a comma and press Enter.
Now the result is displayed. Row numbers and column headings are added.

Use sep = "" for white space (spaces or newlines)
Use sep = "\t" for tabs
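
The steps of the first method can be reproduced without the screenshots. This is a minimal sketch that writes the sample names data to a temporary file (an assumption made so the example is self-contained) and then reads it without and with the separator. The header = TRUE argument in the last call is an addition not used in the steps above; it makes read.table() treat the first line as column names.

```r
# Write the sample data to a temporary file standing in for names.csv.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("First Name,Last Name,AGE,Rent",
             "Chris ,Langer,40,2500",
             "Jean,Simmons,80,1200",
             "Tom ,Higgins,35,4000"), csv_path)

# Without sep, read.table() splits on white space, so the lines yield
# different numbers of fields and the read fails.
failed <- tryCatch({
  read.table(csv_path)
  FALSE
}, error = function(e) {
  message("read.table failed: ", conditionMessage(e))
  TRUE
})

# With sep = "," every line splits into four fields (columns V1 to V4).
mydat <- read.table(csv_path, sep = ",")
mydat

# header = TRUE uses the first line as column names instead of data.
mydat_named <- read.table(csv_path, sep = ",", header = TRUE)
names(mydat_named)
```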

Second method:
 
Open the Excel file shown at the top and copy the column heading and the data as shown:


R_clip
The contents are now in the "Clipboard".

Now in R enter the code as shown. The first attempt, with a comma separator, puts everything into a single column; change the separator to a tab instead of a comma.
-----------
> mydat <- read.table(file="clipboard", sep=",")
> mydat
                             V1
1 First Name\tLast Name\tAGE\tRent
2         Chris \tLanger\t40\t2500
3          Jean\tSimmons\t80\t1200
4          Tom \tHiggins\t35\t4000
> mydat <- read.table(file="clipboard", sep="\t")
> mydat
          V1        V2  V3   V4
1 First Name Last Name AGE Rent
2     Chris     Langer  40 2500
3       Jean   Simmons  80 1200
4       Tom    Higgins  35 4000
>

-------------------------------
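The clipboard contents depend on what was copied last, so the session above is hard to repeat exactly. The following sketch passes the same tab-separated text through the text argument of read.table() instead. The string is typed in by hand to mimic what Excel places on the clipboard, and header = TRUE is an addition that turns the first line into column names.

```r
# Tab-separated text as Excel would place it on the clipboard (typed by hand here).
clip_text <- paste("First Name\tLast Name\tAGE\tRent",
                   "Chris \tLanger\t40\t2500",
                   "Jean\tSimmons\t80\t1200",
                   "Tom \tHiggins\t35\t4000",
                   sep = "\n")

# Wrong separator: there are no commas, so each line stays in one column.
one_column <- read.table(text = clip_text, sep = ",")
ncol(one_column)

# Right separator: each tab starts a new column.
mydat <- read.table(text = clip_text, sep = "\t", header = TRUE)
mydat
```
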
Third method:

You can also use a statement like the following to display the contents of a .CSV file:
> read.csv("C:/Users/mysorian/Desktop/R_Related/names.csv")
  First.Name Last.Name AGE Rent
1     Chris     Langer  40 2500
2       Jean   Simmons  80 1200
3       Tom    Higgins  35 4000
>
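
Displaying the file is rarely the end goal. A small sketch of keeping the result in a variable and working with its columns follows. It uses a temporary copy of the sample data rather than the path above, and strip.white = TRUE is an optional extra that removes the trailing spaces after "Chris" and "Tom" in the sample file.

```r
# Temporary copy of the sample data, standing in for names.csv.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("First Name,Last Name,AGE,Rent",
             "Chris ,Langer,40,2500",
             "Jean,Simmons,80,1200",
             "Tom ,Higgins,35,4000"), csv_path)

# read.csv() assumes a comma separator and a header line by default.
names_df <- read.csv(csv_path, strip.white = TRUE)

names_df$First.Name                      # trailing spaces removed
mean(names_df$Rent)                      # average of the Rent column
older <- names_df[names_df$AGE > 36, ]   # rows where AGE exceeds 36
older
```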
=============================

Wednesday, September 23, 2015

How to create a CSV file?

The present post describes creating a Comma Separated Value (CSV) file using Microsoft Excel.

CSV files are a very popular and frequently used data exchange format, since legacy data is usually of this type. In recent times, XML- and JSON-formatted data has been replacing them in many applications.

However, there is a whole lot of legacy data that needs to be loaded into more recent databases. Hence, most database vendors provide a program to accomplish this conversion. Programs also exist that take a CSV file and convert it into an XML file. Perhaps this is another route one can take in converting legacy data.

You can create a CSV file using any version of Microsoft Excel.
Here is an example of a CSV file.
------------
First Name,Last Name,AGE,Rent
Chris ,Langer,40,2500
Jean,Simmons,80,1200
Tom ,Higgins,35,4000

--------------
The first row above is the header (providing column names) and the rest is data.
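
For readers who already have R at hand, the same file can also be produced without Excel. This is a hedged alternative and not part of the Excel steps that follow; the temporary file path is an assumption, and check.names = FALSE and quote = FALSE are chosen so that the output matches the listing above.

```r
# Build the sample table; check.names = FALSE keeps the spaces in the headers.
names_df <- data.frame("First Name" = c("Chris ", "Jean", "Tom "),
                       "Last Name" = c("Langer", "Simmons", "Higgins"),
                       AGE = c(40L, 80L, 35L),
                       Rent = c(2500L, 1200L, 4000L),
                       check.names = FALSE)

# Write to a temporary file: no row numbers and no quotes around text.
csv_path <- tempfile(fileext = ".csv")
write.csv(names_df, csv_path, row.names = FALSE, quote = FALSE)

# Read the raw lines back to see what was written.
csv_lines <- readLines(csv_path)
csv_lines
```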

Step 1:
Launch Microsoft Excel (Excel 2010 is used here) and create an Excel file as shown by typing in the cell entries.

namesExcel.png

Step 2:
Click File to display the drop-down menu. You will be saving the file as names in the CSV format.


ExcelSaveAs.png

 
Step 3:
Click Save As to open the Save As dialog as shown. You have a variety of options to choose from. Pick CSV (MS-DOS) as the file type as shown.


ExcelSaveOptions

 
Step 4:
Provide a name for the file and accept the default folder. You get the following warning:

Excel Permissions

Accept the provided location (My Documents) by clicking Yes. The document gets saved to that location.

A word of caution: if you have multiple sheets (a newly launched workbook usually has three), delete the two extra sheets and keep just one. You will get a warning message if you have more than one sheet while saving as a CSV file, because only the active sheet can be saved in this format.