Hey friends.
Today, I want to talk about two programming languages I learned in my Computational Biology course in Spring 2021.
The first language was R, which I was already familiar with. R is useful for data analysis and for working with large data sets. I use R frequently in my research lab, where we study gut and vaginal microbiomes. You can check out two projects I did using R under “Projects” in the navigation bar.
The second language was Python. I learned basic operations in Python, such as indexing and slicing, as seen below.
list1=[2,4,6,8,10]
list2=['a','b','c','d', 'e']
list1[-2] #second to last element
## 8
plist=list(range(20))
plist
## [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
plist[::-5] #Grab every fifth element in reverse
## [19, 14, 9, 4]
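Slices follow the general pattern `start:stop:step`, so a few more combinations can be sketched with the same `plist`. The variable names `first_five`, `evens`, `reversed_head`, and `last_three` are made up here for illustration.

```python
plist = list(range(20))

first_five = plist[:5]          # positions 0 through 4 (stop is exclusive)
evens = plist[::2]              # every second element, starting at index 0
reversed_head = plist[::-1][:3] # reverse the list, then take the first three
last_three = plist[-3:]         # negative start counts from the end

print(first_five)
print(evens)
print(reversed_head)
print(last_three)
```

Leaving out `start` or `stop` means "from the beginning" or "to the end", which is why `plist[::-5]` above begins at the last element.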
I also learned how to write loops in Python.
#loop with an index
mylist = ['dog', 'cat', 'bunny', 'frog', 'turtle', 'bird', 'mouse']
for i in range(len(mylist)):
    print(mylist[i])
## dog
## cat
## bunny
## frog
## turtle
## bird
## mouse
#loop over a string
for letter in "hi my name is tien":
    print(letter)
## h
## i
##
## m
## y
##
## n
## a
## m
## e
##
## i
## s
##
## t
## i
## e
## n
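The index-based loop above can also be written without `range(len(...))`. This is a small sketch of two common alternatives, looping over the items directly and using the built-in `enumerate`; the `labeled` list is a hypothetical name used only to collect the results.

```python
mylist = ['dog', 'cat', 'bunny', 'frog', 'turtle', 'bird', 'mouse']

# Loop directly over the items when the index is not needed
for animal in mylist:
    print(animal)

# enumerate yields the index and the item together
labeled = []
for i, animal in enumerate(mylist):
    labeled.append(f"{i}: {animal}")

print(labeled)
```

Looping over the items directly is usually easier to read, and `enumerate` covers the cases where the position matters too.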
I also learned how to read in datasets, view them, and analyze them using Python. Below, I used the figure skating dataset from my Project 1.
library(reticulate)
import pandas as pd
import numpy as np
skate=pd.read_csv("https://raw.githubusercontent.com/BuzzFeedNews/2018-02-olympic-figure-skating-analysis/master/data/judged-aspects.csv",index_col=0)
skate.head()
##             performance_id     section  aspect_num  ...    goe  ref  scores_of_panel
## aspect_id                                           ...
## 004e382688      648ff2cbff    elements         2.0  ...   0.80  NaN             6.60
## 005bdf4588      5458eddc1d    elements         3.0  ...  -2.71  NaN             5.49
## 0070f9cc40      c39eade62e  components         NaN  ...    NaN  NaN             6.29
## 0071f2e3ae      cb67dacba3  components         NaN  ...    NaN  NaN             7.25
## 007ae3fc4b      9e771ce55d    elements         3.0  ...   1.90  NaN             8.50
##
## [5 rows x 11 columns]
skate.shape
## (3405, 11)
skate["goe"] > 2
## aspect_id
## 004e382688 False
## 005bdf4588 False
## 0070f9cc40 False
## 0071f2e3ae False
## 007ae3fc4b False
## ...
## ff96152a6c False
## ffa5ee8f76 False
## ffac9b2f6a False
## ffc8eba079 False
## fff0940a94 False
## Name: goe, Length: 3405, dtype: bool
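A comparison like `skate["goe"] > 2` returns a True/False column (a boolean mask), and that mask can then be used to keep only the matching rows. The sketch below shows the idea on a tiny made-up table named `toy`, not the real skating data, so the values and row labels are purely illustrative.

```python
import numpy as np
import pandas as pd

# Made-up toy data for illustration, NOT the real skating scores
toy = pd.DataFrame(
    {
        "section": ["elements", "elements", "components", "elements"],
        "goe": [0.80, 2.50, np.nan, 3.00],
    },
    index=["a1", "a2", "a3", "a4"],
)

mask = toy["goe"] > 2      # boolean Series; missing values (NaN) compare as False
high_goe = toy[mask]       # keep only the rows where the mask is True
n_high = int(mask.sum())   # True counts as 1, so sum() counts the matches

print(high_goe)
print(n_high)
```

The same pattern should carry over to the full dataset, for example `skate[skate["goe"] > 2]`, assuming the column names shown in the output above.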
Finally, I used the reticulate package in R to demonstrate how Python and R can talk to each other and share information across code chunks.
By alternating R and Python code chunks, I created a joke exchange between the two. This is a simple use of reticulate, and much more complex things can be done by combining R and Python. Regardless, please enjoy!
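#R chunk
library(reticulate)
hi <- "Good"
#Python chunk: R objects are reached through r.
hi="afternoon!"
print(r.hi, hi)
## Good afternoon!
#R chunk: Python objects are reached through py$
cat(c(hi,py$hi))
## Good afternoon!
#R chunk
knock <- "knock"
#Python chunk
knock = "knock"
print(r.knock, knock)
## knock knock
#Python chunk
who = "who's there?"
#R chunk
cat(c(py$who))
## who's there?
#R chunk
python <- "python"
#Python chunk
python = "python"
print(r.python)
## python
#Python chunk
snake = "ah! a snake!"
#R chunk
cat(c(py$snake))
## ah! a snake!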