beyond imagination

Friday, 29 March 2013

ITBAL Session 10: Plotting in R

Assignment 1: Create three vectors x, y, z of equal length, bind them together and create three-dimensional plots of them.

--> First create a random data set of 60 items with mean = 20 and standard deviation = 5.

> data<-rnorm(60,mean=20,sd=5)
> data

--> Create and display three vectors of 15 items each, sampled from the data.

> x<-sample(data,15)

> x

> y<-sample(data,15)

> y

> z<-sample(data,15)

> z

--> To bind the vectors together.

> p<-cbind(x,y,z)

> p
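The steps above can be checked with a short, reproducible sketch. The seed value is an assumption added only so the run can be repeated; it is not part of the original session. Note that sample() draws without replacement by default, so each vector takes 15 different positions of data, but the same value may turn up in more than one vector.

```r
set.seed(42)                          # assumed seed, only for reproducibility
data <- rnorm(60, mean = 20, sd = 5)  # 60 random values

# three vectors of 15 items each, drawn without replacement from data
x <- sample(data, 15)
y <- sample(data, 15)
z <- sample(data, 15)

p <- cbind(x, y, z)   # 15 x 3 matrix, one column per vector
dim(p)
colnames(p)
```

Each row of p is then one (x, y, z) point for the 3D plot.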

--> Plotting of graphs (plot3d comes from the rgl package, so load it first).

> library(rgl)

> plot3d(p[,1:3])

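--> Adding axis labels, colours and different plot types (s = spheres, p = points, l = lines). rainbow(nrow(p)) gives each of the 15 points its own colour; with rainbow(500) or rainbow(5000) only the first 15 colours are used, and they are all nearly the same red.

> plot3d(p[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(nrow(p)))

> plot3d(p[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(nrow(p)), type='s')

> plot3d(p[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(nrow(p)), type='p')

> plot3d(p[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(nrow(p)), type='l')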

Assignment 2:

Create 2 random variables

Create 3 plots:

1.X-Y

2.X-Y|Z (introducing a variable z with 5 different categories and cbind it to x and y)

3.Colour code and draw the graph

--> Creating a data set for two random variables and introducing a third variable z.

> x <- rnorm(1000, mean= 30 , sd=10)

> y <- rnorm(1000, mean= 30, sd=10)

> z1 <- sample(letters, 5)

> z2 <- sample(z1, 1000, replace=TRUE)

> z <- as.factor(z2)

> z

--> Creating quick plots (qplot comes from the ggplot2 package, so load it first).

> library(ggplot2)

> qplot(x,y)

> qplot(x,z)

--> For semi transparent plot.

>qplot(x,z, alpha=I(2/10))

--> For coloured plot

> qplot(x,y, color=z)

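--> For logarithmic coloured plot (any non-positive values of x or y have no finite logarithm and are dropped from the plot with a warning)

> qplot(log(x),log(y), color=z)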

--> Best fit and smooth curve using geom.

> qplot(x,y,geom=c("path","smooth"))

>qplot(x,y,geom=c("point","smooth"))

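--> Box plot of y for each category of z, with jittered points (a box plot needs a categorical variable on the x axis).

> qplot(z,y,geom=c("boxplot","jitter"))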
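Plot 2 of the assignment (X-Y|Z) asks for x against y conditioned on z. One way to sketch this is with ggplot() and facet_wrap(), which draws one panel per category. This is an assumed alternative to the qplot() calls above, not the method used in the session; the data frame name df, the plot object g and the seed are hypothetical.

```r
library(ggplot2)

set.seed(1)   # assumed seed, only for reproducibility
df <- data.frame(
  x = rnorm(1000, mean = 30, sd = 10),
  y = rnorm(1000, mean = 30, sd = 10),
  z = factor(sample(sample(letters, 5), 1000, replace = TRUE))
)

# x against y, coloured by z, with one panel per category of z
g <- ggplot(df, aes(x = x, y = y, colour = z)) +
  geom_point(alpha = 0.2) +
  facet_wrap(~ z)
print(g)
```

Since x and y are generated independently of z here, the panels should all look alike; with real data the panels make differences between categories easy to spot.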

Friday, 22 March 2013

Session – 9

Wolfram Alpha tool

Wolfram Alpha’s Facebook report delves into your profile and breaks down all of your activity into easy-to-digest graphs. It’s surprisingly comprehensive: data like times of interaction, word maps, relationship status and network structure are all visualized for your convenience. This is mostly for fun since it’s only your personal account that’s visualized, but if Facebook is your main source of interaction (with subscribers, friends, etc.), you will have a lot of information to help you improve.

Billed as a "computational knowledge engine", the Google rival Wolfram Alpha is really good at intelligently displaying charts in response to data queries without the need for any configuration. If you’re using publicly available data, it offers a simple widget builder that makes it easy to get visualisations onto your site.

Wolfram Alpha now lets anyone do personal analytics with Facebook data. Wolfram Alpha knows about all kinds of knowledge domains; now it can know about you, and apply its powers of analysis to give you all sorts of personal analytics. This is just the beginning: over the months to come, particularly as we see how people use this, we’ll be adding more and more capabilities.

It’s pretty straightforward to get your personal analytics report: all you have to do is type “facebook report” into the standard Wolfram|Alpha website.

Friday, 15 March 2013

ITBAL Session 8

Panel Data Analysis: Carry out a panel data analysis of "Produc" using the Pooled, Fixed Effects and Random Effects models.
Also choose the best model using the tests:
pFtest : between fixed and pooled
plmtest: between pooled and random
phtest : between random and fixed

To load the plm package and the data
Commands:

> library(plm)
> data("Produc" , package ="plm")
> head(Produc)

Pooled Model

Commands:

> pool <- plm(log(pcap)~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp) , data =Produc, model="pooling", index = c("state","year"))
> summary(pool)

Fixed Effects Model
Commands:

> fixed <- plm(log(pcap)~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp) , data =Produc, model="within", index = c("state","year"))
> summary(fixed)

Random Effects Model
Commands:

> random <- plm(log(pcap)~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp) , data =Produc, model="random", index = c("state","year"))
> summary(random)

Tests:
Pooled vs Fixed:
H0: Pooled Model
H1: Fixed Effects Model

As the p-value is very small, we reject the null hypothesis and accept the alternative hypothesis.
=> Fixed Effects Model is accepted.

Pooled vs Random:
H0: Pooled Model
H1: Random Effects Model

As the p-value is very small, we reject the null hypothesis and accept the alternative hypothesis.
=> Random Effects Model is accepted.

Random vs Fixed:
H0: Random Effects Model
H1: Fixed Effects Model

As the p-value is very small, we reject the null hypothesis and accept the alternative hypothesis.
=> Fixed Effects Model is accepted.

Result:
From the above tests, we conclude that the Fixed Effects Model is the best choice for the panel data analysis of "Produc".
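The notes above list the three tests but not the commands that run them. A minimal sketch follows, using the same formula as the session. The argument order and the result names ft, lt and ht are assumptions for illustration, and plmtest() is called with its default settings.

```r
library(plm)
data("Produc", package = "plm")

# same formula as in the session
form <- log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) +
  log(gsp) + log(emp) + log(unemp)

pool   <- plm(form, data = Produc, model = "pooling", index = c("state", "year"))
fixed  <- plm(form, data = Produc, model = "within",  index = c("state", "year"))
random <- plm(form, data = Produc, model = "random",  index = c("state", "year"))

ft <- pFtest(fixed, pool)     # pooled vs fixed:  H0 = pooled model is adequate
lt <- plmtest(pool)           # pooled vs random: H0 = no individual effects
ht <- phtest(fixed, random)   # random vs fixed (Hausman): H0 = random effects model is consistent

print(ft)
print(lt)
print(ht)
```

Each printout reports a test statistic and a p-value; a p-value below 0.05 rejects the corresponding H0.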

Wednesday, 13 February 2013

ITBAL Session 6

Assignment:
1. Create log returns on data from 01.01.2012 to 31.01.2013 and calculate the historical volatility.
2. Create an ACF plot for the log returns, perform the ADF test and interpret the results.

--> ass1 is the data frame holding the price data; its fifth column is taken as the closing price.

closingprice<-ass1[,5]

closingprice.ts<-ts(closingprice,frequency=252)

temptable<-closingprice.ts-lag(closingprice.ts,k=-1)

lagtable<-cbind(closingprice.ts,lag(closingprice.ts,k=-1),temptable)

lagtable

head(lagtable)

returns<-(closingprice.ts-lag(closingprice.ts,k=-1))/lag(closingprice.ts,k=-1)

--> A log return is log(P_t/P_{t-1}), which equals log(1 + simple return).

logreturns<-log(1+returns)

acf(logreturns)

From the above graph we can see that the autocorrelations lie within the 95% confidence interval; therefore the time series is stationary.

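Commands for the historical volatility and the ADF test (adf.test comes from the tseries package):
library(tseries)
annualfactor<-(252)^0.5
historicalvolatility<-sd(logreturns)*annualfactor
historicalvolatility
adf.test(logreturns)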

From the picture it is clear that the p-value = 0.01 is less than 0.05.

Therefore we reject the null hypothesis and accept the alternative hypothesis, which states that the time series is stationary.
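The log return and volatility calculation can be sketched on simulated prices using base R only. The price series, the seed and the variable names price and simplereturns are hypothetical; the session itself used NSE closing prices. The factor 252 is the number of trading days per year, as in the commands above.

```r
set.seed(123)   # assumed seed, only for reproducibility
# hypothetical closing prices: 252 trading days of a random walk in logs
price <- 100 * exp(cumsum(rnorm(252, mean = 0, sd = 0.01)))

logreturns    <- diff(log(price))                # log(P_t / P_{t-1})
simplereturns <- diff(price) / head(price, -1)   # (P_t - P_{t-1}) / P_{t-1}

# the two definitions agree: log return = log(1 + simple return)
all.equal(logreturns, log(1 + simplereturns))

# annualised historical volatility, assuming 252 trading days per year
historicalvolatility <- sd(logreturns) * sqrt(252)
historicalvolatility
```

diff(log(price)) gives the log returns in one step, without building the lag table first.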

Thursday, 7 February 2013

ITBAL session 5

Assignment 1: Find and plot the daily returns for NSE data covering more than 6 months.

Assignment 2: Do a logit analysis on 700 data points and then predict the remaining 150 data points.

z<-read.csv(file.choose(),header=TRUE)

head(z)

z.data<-z[1:700,1:9]

sapply(z.data,mean)

z.data$ed<-factor(z.data$ed)

logit.est<-glm(default~age+employ+address+income+debtinc+creddebt+othdebt,data=z.data,family="binomial")

summary(logit.est)

confint.default(logit.est)

logit.eg2<-with(z[701:850,1:8],data.frame(age=age,employ=employ,address=address,income=income,debtinc=debtinc,creddebt=creddebt,othdebt=othdebt,ed=factor(z$ed[701:850])))

logit.eg2$prob<-predict(logit.est,newdata=logit.eg2,type="response")

head(logit.eg2)
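The fit-then-predict pattern above can be sketched on simulated data with base R only. Everything in the data set is hypothetical (the column values, the coefficients used to generate the outcome, the seed and the names sim, train, holdout and fit); only the 700/150 split and the use of glm() and predict() follow the session.

```r
set.seed(7)   # assumed seed, only for reproducibility
n <- 850
# hypothetical data: two predictors and a 0/1 outcome
sim <- data.frame(income = rnorm(n, 45, 15), debtinc = rnorm(n, 10, 4))
sim$default <- rbinom(n, 1, plogis(-3 + 0.25 * sim$debtinc - 0.01 * sim$income))

train   <- sim[1:700, ]                          # rows used to fit the model
holdout <- sim[701:850, c("income", "debtinc")]  # rows to predict, outcome removed

fit <- glm(default ~ income + debtinc, data = train, family = "binomial")

# type = "response" returns probabilities rather than log-odds
holdout$prob <- predict(fit, newdata = holdout, type = "response")
head(holdout)
```

The prob column holds the predicted probability of default for each of the 150 held-out rows.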