Friday, 29 March 2013

ITBAL Session 10: Plotting in R


Assignment 1: Create three vectors x, y, z of equal length, bind them together and create three-dimensional plots of them.

--> First create random data of 60 items with mean = 20 and standard deviation = 5.


> data<-rnorm(60,mean=20,sd=5)
> data

--> Create and display three vectors of 15 items each, sampled from the data.

> x<-sample(data,15)
> x
> y<-sample(data,15)
> y
> z<-sample(data,15)
> z

--> To bind the vectors together.

> p<-cbind(x,y,z)
> p
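As a quick check on the binding step, the sketch below rebuilds the three vectors and confirms the shape of the bound matrix. The set.seed call and its value are assumptions added only so the run is repeatable; they are not part of the original session.

```r
set.seed(1)  # assumption: fixed seed, only for repeatability

data <- rnorm(60, mean = 20, sd = 5)

# sample() draws without replacement by default,
# so each vector picks 15 different positions of data
x <- sample(data, 15)
y <- sample(data, 15)
z <- sample(data, 15)

p <- cbind(x, y, z)  # one column per vector
dim(p)               # number of rows, number of columns
colnames(p)          # column names are taken from the vector names
```

Because cbind keeps the vector names as column names, p[,1:3] in the plotting step is the same as p itself.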

--> Plotting the graphs (plot3d comes from the rgl package; type 's' draws spheres, 'p' points and 'l' lines).

> library(rgl)
> plot3d(p[,1:3])


> plot3d(p[,1:3], xlab="X Axis", ylab="Y Axis", zlab="Z Axis", col=rainbow(500))

> plot3d(p[,1:3], xlab="X Axis", ylab="Y Axis", zlab="Z Axis", col=rainbow(5000), type='s')

> plot3d(p[,1:3], xlab="X Axis", ylab="Y Axis", zlab="Z Axis", col=rainbow(5000), type='p')

> plot3d(p[,1:3], xlab="X Axis", ylab="Y Axis", zlab="Z Axis", col=rainbow(5000), type='l')

Assignment 2:
Create 2 random variables.
Create 3 plots:
1. X-Y
2. X-Y|Z (introduce a variable z with 5 different categories and cbind it to x and y)
3. Colour code and draw the graph

--> Creating a data set for two random variables and introducing a third variable z.

> x <- rnorm(1000, mean=30, sd=10)
> y <- rnorm(1000, mean=30, sd=10)
> z1 <- sample(letters, 5)
> z2 <- sample(z1, 1000, replace=TRUE)
> z <- as.factor(z2)
> z

--> Creating quick plots (qplot comes from the ggplot2 package).

> library(ggplot2)
> qplot(x,y)

> qplot(x,z)

--> For a semi-transparent plot.
> qplot(x,z, alpha=I(2/10))

--> For coloured plot

> qplot(x,y, color=z)


--> For a logarithmic coloured plot (any negative x or y gives NaN under log() and is dropped from the plot with a warning)

> qplot(log(x),log(y), color=z)

--> Best fit and smooth curve using geom.

> qplot(x,y,geom=c("path","smooth"))

> qplot(x,y,geom=c("point","smooth"))

> qplot(z,y,geom=c("boxplot","jitter"))
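The "X-Y|Z" plot in the assignment can also be read as one X-Y panel per category of z. Here is a minimal sketch of that reading, assuming the ggplot2 package is installed; the data frame name d, the object name g and the seed are illustrative choices, not from the session.

```r
library(ggplot2)

set.seed(10)  # assumption: fixed seed, only for repeatability
x <- rnorm(1000, mean = 30, sd = 10)
y <- rnorm(1000, mean = 30, sd = 10)
z <- as.factor(sample(sample(letters, 5), 1000, replace = TRUE))

d <- data.frame(x, y, z)  # hypothetical name for the combined data

# One panel per level of z, with points also coloured by z
g <- ggplot(d, aes(x = x, y = y, colour = z)) +
  geom_point(alpha = 0.4) +
  facet_wrap(~ z)
print(g)
```

Faceting keeps the five categories apart, which is easier to read than a single colour-coded cloud of 1000 overlapping points.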





    

Friday, 22 March 2013



Session – 9
Wolfram Alpha tool
Wolfram Alpha's Facebook report delves into your profile and breaks down all of your activity into easy-to-digest graphs. It's surprisingly comprehensive: data such as times of interaction, word maps, relationship status and network structure are all visualized for your convenience. This is mostly for fun, since only your personal account is visualized, but if Facebook is your main source of interaction (with subscribers, friends, etc.), you will have a lot of information to help you improve.




 

Billed as a "computational knowledge engine", the Google rival Wolfram Alpha is very good at intelligently displaying charts in response to data queries without the need for any configuration. If you're using publicly available data, it offers a simple widget builder that makes it easy to get visualisations on your site.

Wolfram Alpha now lets anyone do personal analytics with Facebook data. Wolfram Alpha knows about all kinds of knowledge domains; now it can know about you, and apply its powers of analysis to give you all sorts of personal analytics. And this is just the beginning: over the months to come, particularly as we see how people use this, we'll be adding more and more capabilities.
It's pretty straightforward to get your personal analytics report: all you have to do is type "facebook report" into the standard Wolfram|Alpha website.


Friday, 15 March 2013


ITBAL Session 8
Panel Data Analysis: Perform a panel data analysis of "Produc" using three models: Pooled, Fixed Effects and Random Effects.
Then choose the best model using the tests:
pFtest  : between fixed and pooled
plmtest : between pooled and random
phtest  : between random and fixed


To load the plm package and the data
Commands:

> library(plm)
> data("Produc", package="plm")
> head(Produc)


Pooled Effects Model

Commands:

> pool <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc, model="pooling", index=c("state","year"))
> summary(pool)


Fixed Effects Model
Commands:

> fixed <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc, model="within", index=c("state","year"))
> summary(fixed)


Random Effects Model
Commands:

> random <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc, model="random", index=c("state","year"))
> summary(random)




Tests:
Pooled vs Fixed:
H0: Pooled Effects Model
H1: Fixed Effects Model



As the p-value is very small, we reject the null hypothesis and accept the alternative hypothesis.
=> The Fixed Effects Model is accepted.

Pooled vs Random:
H0: Pooled Effects Model
H1: Random Effects Model




As the p-value is very small, we reject the null hypothesis and accept the alternative hypothesis.
=> The Random Effects Model is accepted.

Random vs Fixed:
H0: Random Effects Model
H1: Fixed Effects Model



As the p-value is very small, we reject the null hypothesis and accept the alternative hypothesis.
=> The Fixed Effects Model is accepted.

Result:
From all the above tests, we conclude that the Fixed Effects Model is the best one for the panel data analysis of "Produc".
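The three model-selection tests named at the start of this session can be run as sketched below. This is a minimal sketch assuming the plm package is installed; the object names f, pf, lmt and ph are illustrative, and plmtest is called with its default settings, which may differ from what was used in class.

```r
library(plm)
data("Produc", package = "plm")

# Same formula as in the three models above
f <- log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) +
  log(gsp) + log(emp) + log(unemp)

pool   <- plm(f, data = Produc, model = "pooling", index = c("state", "year"))
fixed  <- plm(f, data = Produc, model = "within",  index = c("state", "year"))
random <- plm(f, data = Produc, model = "random",  index = c("state", "year"))

pf  <- pFtest(fixed, pool)    # pooled vs fixed (F test for individual effects)
lmt <- plmtest(pool)          # pooled vs random (Lagrange multiplier test)
ph  <- phtest(fixed, random)  # random vs fixed (Hausman test)

print(pf)
print(lmt)
print(ph)
```

Each call returns a test object whose p.value element is the value compared against the significance level in the decisions above.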

Wednesday, 13 February 2013

ITBAL Session 6

Assignment:
1. Create log returns on data from 01.01.2012 to 31.01.2013 and calculate the historical volatility.
2. Create an ACF plot for the log returns, perform the ADF test and interpret.


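# ass1 is the price data frame loaded beforehand (loading step not shown); column 5 holds the closing price
closingprice<-ass1[,5]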
closingprice.ts<-ts(closingprice,frequency=252)
temptable<-closingprice.ts-lag(closingprice.ts,k=-1)
lagtable<-cbind(closingprice.ts,lag(closingprice.ts,k=-1),temptable)
lagtable
head(lagtable)
returns<-(closingprice.ts-lag(closingprice.ts,k=-1))/lag(closingprice.ts,k=-1)

# log return = log(P_t / P_{t-1}) = log(1 + simple return)
logreturns<-log(1+returns)
acf(logreturns)

From the above graph we can see that the autocorrelations at non-zero lags lie within the 95% confidence bounds, so the log returns show no significant serial correlation, which is consistent with a stationary time series.


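2. Commands
library(tseries)   # provides adf.test
annualise<-sqrt(252)   # 252 trading days per year; a separate name avoids overwriting T (shorthand for TRUE)
historicalvolatility<-sd(logreturns)*annualise
historicalvolatility
adf.test(logreturns)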

From the output it is clear that the p-value of 0.01 is less than 0.05.
Therefore we reject the null hypothesis of the ADF test (the series has a unit root, i.e. it is non-stationary) and accept the alternative hypothesis, which states that the time series is stationary.
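The log-return and volatility steps can be sketched on made-up prices as follows. The price series is simulated and purely hypothetical (it is not the NSE data used in class), and the names price, simple_ret, log_ret and hist_vol are illustrative.

```r
set.seed(2013)  # assumption: fixed seed, only for repeatability

# Hypothetical closing prices: 270 trading days starting near 100
price <- 100 * cumprod(1 + rnorm(270, mean = 0, sd = 0.01))

simple_ret <- diff(price) / head(price, -1)  # (P_t - P_{t-1}) / P_{t-1}
log_ret    <- diff(log(price))               # log(P_t / P_{t-1})

# The two definitions agree: log return = log(1 + simple return)
all.equal(log_ret, log(1 + simple_ret))

# Annualised historical volatility, assuming 252 trading days per year
hist_vol <- sd(log_ret) * sqrt(252)
hist_vol
```

Multiplying the daily standard deviation by sqrt(252) rests on the usual assumption that daily log returns are uncorrelated, which is what the ACF plot is used to check.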

Thursday, 7 February 2013

ITBAL session 5



Assignment 1: Find and plot the daily returns for more than 6 months of NSE data.
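A minimal sketch of the daily-return calculation, using a simulated price table because the NSE file itself is not part of these notes. The data frame nse and its columns Date, Close and Return are hypothetical names, and the dates are plain calendar days rather than actual trading days.

```r
set.seed(5)  # assumption: fixed seed, only for repeatability

n_days <- 130  # a little over six months of trading days
nse <- data.frame(
  Date  = seq(as.Date("2012-07-02"), by = "day", length.out = n_days),
  Close = 5000 * cumprod(1 + rnorm(n_days, mean = 0, sd = 0.01))
)

# Daily return = (today's close - yesterday's close) / yesterday's close.
# The first day has no previous close, so its return is NA.
nse$Return <- c(NA, diff(nse$Close) / head(nse$Close, -1))

plot(nse$Date, nse$Return, type = "l",
     xlab = "Date", ylab = "Daily return", main = "Daily returns")
```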


Assignment 2: Do a logit analysis on 700 data points and then predict for the remaining 150 data points.

# Read the data file (850 rows)
z<-read.csv(file.choose(),header=TRUE)
head(z)

# Training set: first 700 rows, all 9 columns
z.data<-z[1:700,1:9]
sapply(z.data,mean)
z.data$ed<-factor(z.data$ed)

# Fit the logit model and inspect it
logit.est<-glm(default~age+employ+address+income+debtinc+creddebt+othdebt,data=z.data,family="binomial")
summary(logit.est)
confint.default(logit.est)

# Predict the default probability for the remaining 150 rows
logit.eg2<-with(z[701:850,1:8],data.frame(age=age,employ=employ,address=address,income=income,debtinc=debtinc,creddebt=creddebt,othdebt=othdebt,ed=factor(ed)))
logit.eg2$prob<-predict(logit.est,newdata=logit.eg2,type="response")
head(logit.eg2)
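Since the CSV file is not included in these notes, here is a self-contained sketch of the same fit-then-predict workflow on simulated data. The data frame loans, the two predictors chosen, the coefficients used to simulate defaults and the seed are all assumptions made for illustration; only the 700/150 split follows the assignment.

```r
set.seed(42)  # assumption: fixed seed, only for repeatability

n <- 850
loans <- data.frame(
  income  = rnorm(n, mean = 45, sd = 15),  # hypothetical predictor
  debtinc = runif(n, min = 0, max = 30)    # hypothetical predictor
)

# Simulate a 0/1 default flag from a known logistic relationship
true_prob <- plogis(-3 + 0.15 * loans$debtinc - 0.01 * loans$income)
loans$default <- rbinom(n, size = 1, prob = true_prob)

train   <- loans[1:700, ]                          # rows used to fit
newrows <- loans[701:850, c("income", "debtinc")]  # rows to predict

fit <- glm(default ~ income + debtinc, data = train, family = "binomial")
summary(fit)

# type = "response" returns probabilities rather than log-odds
newrows$prob <- predict(fit, newdata = newrows, type = "response")
head(newrows)
```

Without type = "response", predict returns values on the log-odds scale, which are not directly readable as probabilities.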



Tuesday, 22 January 2013

Business application lab Day 3

Assignment 1 (a): Using the given mileage-groove data, fit lm and comment on its applicability.

The residual plot is not randomly scattered, so a linear model is not applicable.

Assignment 1 (b): Using the given alpha X data, fit lm and comment on its applicability.
As the residual plot is randomly scattered, a linear model is applicable.
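The residual check used in both parts can be sketched as follows. The class data sets are not reproduced here, so the two series below are simulated: one truly linear and one curved. All names and the seed are illustrative.

```r
set.seed(7)  # assumption: fixed seed, only for repeatability

x        <- 1:50
y_lin    <- 2 + 3 * x + rnorm(50, sd = 5)      # linear relationship
y_curved <- 2 + 0.5 * x^2 + rnorm(50, sd = 5)  # curved relationship

fit_lin    <- lm(y_lin ~ x)
fit_curved <- lm(y_curved ~ x)

# Residuals vs fitted values: a random scatter supports the linear model,
# a visible pattern (here a U shape) argues against it
par(mfrow = c(1, 2))
plot(fitted(fit_lin), residuals(fit_lin),
     main = "Linear data", xlab = "Fitted", ylab = "Residuals")
abline(h = 0, lty = 2)
plot(fitted(fit_curved), residuals(fit_curved),
     main = "Curved data", xlab = "Fitted", ylab = "Residuals")
abline(h = 0, lty = 2)
```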

Assignment 2:
Justify the null hypothesis of ANOVA.







Tuesday, 15 January 2013

BUSINESS APPLICATION LAB

DAY 2

Assignment 1:
Create two 3*3 matrices, select one column from each, and merge them into another matrix using the cbind command.

Commands:
z1<-c(1:9)
dim(z1)<-c(3,3)

z2<-c(10:18)
dim(z2)<-c(3,3)

x<-z1[,3]
y<-z2[,1]

z3<-cbind(x,y)
z3

Assignment 2:
Multiply matrix 1 and matrix 2.

z1%*%z2
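The operator %*% does true matrix multiplication (rows times columns), while a plain * multiplies element by element. A small sketch with the same two matrices, built here with matrix() instead of dim(); the names prod_matrix and prod_elementwise are illustrative.

```r
z1 <- matrix(1:9,   nrow = 3)  # filled column by column, same as dim(z1) <- c(3,3)
z2 <- matrix(10:18, nrow = 3)

prod_matrix      <- z1 %*% z2  # matrix product
prod_elementwise <- z1 * z2    # element-by-element product

# Top-left entry of the matrix product: row 1 of z1 times column 1 of z2
# 1*10 + 4*11 + 7*12 = 138
prod_matrix[1, 1]

# Top-left entry of the element-wise product: 1*10 = 10
prod_elementwise[1, 1]
```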

Assignment 3:
Read historical data of indices from the NSE site from Dec 1 2012 to Dec 31 2012. Find the regression and residuals.
To read the NSE file and fit the regression
nse<-read.csv(file.choose(),header=T)
reg<-lm(high~open,data=nse)

To find the residuals
residuals(reg)



Assignment 4:
Generate normal distribution data and plot it.
x=seq(70,130,length=200)
y=dnorm(x,mean=100,sd=10)
plot(x,y)
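The commands above plot the theoretical density curve. A complementary sketch that draws random data from the same distribution and overlays that curve is shown below; the sample size of 1000, the seed and the name samples are assumptions for illustration.

```r
set.seed(123)  # assumption: fixed seed, only for repeatability

# Random draws from a normal distribution with mean 100 and sd 10
samples <- rnorm(1000, mean = 100, sd = 10)

# Histogram on the density scale so it is comparable to dnorm()
hist(samples, breaks = 30, freq = FALSE,
     main = "Normal sample with theoretical density", xlab = "Value")

# Overlay the theoretical density curve
curve(dnorm(x, mean = 100, sd = 10), add = TRUE, lwd = 2)
```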