This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:
library(ggplot2)
summary(cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
cars_local <- cars
cars_means <- apply(cars, 2, mean)
mtcars$cyl <- factor(mtcars$cyl)
ggplot(data = mtcars, aes(x = disp, y = mpg, color = cyl)) + geom_point() +
  xlab("Displacement") + ylab("Miles per Gallon")
We check to see if we can call Python.
print("Hello World. If you can read this, Python is working.")
## Hello World. If you can read this, Python is working.
And we do a little computation. Python is sensitive to indentation, so watch it carefully. Also note that the name r is reserved for the R/Python interface object, which exposes R variables to Python. If you overwrite r, things will break.
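G = 6.67 * (10 ** -11) # Gravitational constant (N m^2 / kg^2)
M = 2.0 * (10 ** 30) # Mass of the Sun (kg)
m = 6.0 * (10 ** 24) # Mass of the Earth (kg)
d = 3.0 * (10 ** 11) # d / 2 = 1.5e11 m, roughly the Earth-Sun distance
F = G*M*m/((d/2) ** 2) # Newton's law of gravitation with separation d / 2
print("Force of gravity = ", F)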
## Force of gravity = 3.5573333333333336e+22
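The computation above is Newton's law of universal gravitation, F = G * M * m / r^2, with r = d / 2. A minimal sketch that wraps it in a reusable function and uses Python's scientific-notation literals might look like the following. The function name `gravitational_force` and its parameter names are illustrative assumptions, not part of the original chunk.

```python
def gravitational_force(mass_1, mass_2, distance, G=6.67e-11):
    """Newton's law of universal gravitation: F = G * m1 * m2 / r**2.

    The name and signature are hypothetical, chosen for this sketch.
    """
    return G * mass_1 * mass_2 / distance ** 2

# Same inputs as the chunk above; the separation used there is d / 2
sun_mass = 2.0e30    # kg
earth_mass = 6.0e24  # kg
d = 3.0e11           # m
force = gravitational_force(sun_mass, earth_mass, d / 2)
print("Force of gravity =", force)
```

Writing `6.67e-11` instead of `6.67 * (10 ** -11)` gives the same value up to floating-point rounding and is easier to read.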
And we mess with conditionals.
price = 257
if (price >= 300):
    price *= 0.7
elif (price >= 200):
    price *= 0.8
elif (price >= 100):
    price *= 0.9
elif (price >= 50):
    price *= 0.95
else:
    price
print(price)
## 205.60000000000002
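The chain of conditions above picks the first threshold the price meets and applies the matching multiplier. As a sketch of an alternative, the same tiers can be stored in a list and scanned in order, which keeps the thresholds in one place. The names `DISCOUNT_TIERS` and `apply_discount` are hypothetical and introduced only for this example.

```python
# Hypothetical table of (minimum price, multiplier) pairs, highest threshold first
DISCOUNT_TIERS = [(300, 0.7), (200, 0.8), (100, 0.9), (50, 0.95)]

def apply_discount(price):
    """Return the price after applying the first matching tier's multiplier."""
    for threshold, multiplier in DISCOUNT_TIERS:
        if price >= threshold:
            return price * multiplier
    return price  # below every threshold: no discount

discounted = apply_discount(257)
print(discounted)
```

Because the tiers are checked from highest to lowest, a price of 257 matches the 200 tier and never reaches the lower ones, just as in the `elif` chain.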
And we mess with strings.
x = 3
y = 1
def rep_cat(x, y):
    return str(x) * 8 + str(y) * 5
z = rep_cat(x, y)
print(z)
## 3333333311111
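The function above relies on two string operators: `*` repeats a string and `+` concatenates. A small sketch that turns the hard-coded repeat counts into parameters follows. The parameter names `x_count` and `y_count` are assumptions made for illustration, and their defaults reproduce the original behavior.

```python
def rep_cat(x, y, x_count=8, y_count=5):
    """Repeat str(x) and str(y) the given number of times and join them."""
    return str(x) * x_count + str(y) * y_count

z = rep_cat(3, 1)                  # defaults match the chunk above
custom = rep_cat("ab", "-", 2, 3)  # works for any values str() accepts
print(z)
print(custom)
```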
Fibonacci is always fun.
def fib(n):
    first = 0
    second = 1
    if (n < 1):
        return -1
    if (n == 1):
        return first
    if (n == 2):
        return second
    i = 3
    while (i <= n):
        fib_n = first + second
        first = second
        second = fib_n
        i += 1
    return fib_n
n = 10
print(fib(n))
## 34
def Fibonacci(n):
    # Negative input is invalid: print a message
    # (the function then returns None)
    if (n < 0):
        print("Incorrect input")
    elif (n == 0):
        return 0
    elif (n == 1 or n == 2):
        return 1
    else:
        return Fibonacci(n-1) + Fibonacci(n-2)
n = 10
print(Fibonacci(n))
## 55
i = 0
while (n > 0):
    print(Fibonacci(i))
    n -= 1
    i += 1
## 0
## 1
## 1
## 2
## 3
## 5
## 8
## 13
## 21
## 34
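Note that the two functions above use different indexing: `fib(10)` returns 34 because it counts 0 as the first term, while `Fibonacci(10)` returns 55 because it starts from `Fibonacci(0) == 0`. The recursive version also recomputes the same subproblems many times. One possible remedy, sketched here with the standard library's `functools.lru_cache`, is to cache results. The name `fib_cached` is hypothetical, and raising `ValueError` on negative input is a choice made for this sketch rather than the original's behavior of printing a message.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_cached(n):
    """Fibonacci with fib_cached(0) == 0, matching the recursive version above."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n
    # Each value is computed once, then served from the cache
    return fib_cached(n - 1) + fib_cached(n - 2)

first_ten = [fib_cached(i) for i in range(10)]
print(first_ten)
print(fib_cached(10))
```

With the cache in place, each Fibonacci number is computed only once, so the number of calls grows linearly with n rather than exponentially.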
We can import Python data into R. First read a CSV file to create a Python data frame.
import pandas as pd
htwt = pd.read_csv("Data/HtWt.csv")
htwt.describe()
## Height Weight Group
## count 20.000000 20.000000 20.000000
## mean 62.100000 139.600000 1.550000
## std 8.441127 43.122103 0.510418
## min 51.000000 82.000000 1.000000
## 25% 56.000000 108.250000 1.000000
## 50% 59.500000 123.500000 2.000000
## 75% 68.000000 166.750000 2.000000
## max 79.000000 228.000000 2.000000
Now we use the Python data frame in an R ggplot call.
library(ggplot2)
summary(py$htwt)
## Height Weight Group
## Min. :51.0 Min. : 82.0 Min. :1.00
## 1st Qu.:56.0 1st Qu.:108.2 1st Qu.:1.00
## Median :59.5 Median :123.5 Median :2.00
## Mean :62.1 Mean :139.6 Mean :1.55
## 3rd Qu.:68.0 3rd Qu.:166.8 3rd Qu.:2.00
## Max. :79.0 Max. :228.0 Max. :2.00
ggplot(py$htwt, aes(x=Height, y=Weight)) + geom_point() +
  geom_smooth(se=FALSE) + geom_smooth(method="lm", color="orange", se=FALSE) +
  xlab("Height (in)") + ylab("Weight (lbs)")
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## `geom_smooth()` using formula 'y ~ x'
The cars data frame is built into R. We can create a local copy of the built-in data frame in our own workspace.
summary(cars)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
cars_local = cars
summary(cars_local)
## speed dist
## Min. : 4.0 Min. : 2.00
## 1st Qu.:12.0 1st Qu.: 26.00
## Median :15.0 Median : 36.00
## Mean :15.4 Mean : 42.98
## 3rd Qu.:19.0 3rd Qu.: 56.00
## Max. :25.0 Max. :120.00
Now we import the R data frames cars_local and mtcars into Python and play with them.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
cars_loc = r.cars_local
cars_loc.describe()
## speed dist
## count 50.000000 50.000000
## mean 15.400000 42.980000
## std 5.287644 25.769377
## min 4.000000 2.000000
## 25% 12.000000 26.000000
## 50% 15.000000 36.000000
## 75% 19.000000 56.000000
## max 25.000000 120.000000
mtcars = r["mtcars"]
mtcars.describe()
## mpg disp hp ... am gear carb
## count 32.000000 32.000000 32.000000 ... 32.000000 32.000000 32.0000
## mean 20.090625 230.721875 146.687500 ... 0.406250 3.687500 2.8125
## std 6.026948 123.938694 68.562868 ... 0.498991 0.737804 1.6152
## min 10.400000 71.100000 52.000000 ... 0.000000 3.000000 1.0000
## 25% 15.425000 120.825000 96.500000 ... 0.000000 3.000000 2.0000
## 50% 19.200000 196.300000 123.000000 ... 0.000000 4.000000 2.0000
## 75% 22.800000 326.000000 180.000000 ... 1.000000 4.000000 4.0000
## max 33.900000 472.000000 335.000000 ... 1.000000 5.000000 8.0000
##
## [8 rows x 10 columns]
fig = plt.figure()
sns.scatterplot(data=mtcars, x='disp', y='mpg', hue='cyl')
plt.xlabel("Displacement")
plt.ylabel("Miles per Gallon")
#plt.gca().legend_.remove()
plt.show()