Reticulate is a handy way to combine Python and R code. The reticulate help page suggests that the tool allows for: “Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session. Translation between R and Python objects (for example, between R and Pandas data frames, or between R matrices and NumPy arrays). Flexible binding to different versions of Python including virtual environments and Conda environments.”
This is all great. To unpack the previous statement, we will go through and use reticulate in each of these ways, right here. IMPORTANT NOTE: mixing Python and R in a non-markdown R program can be done to some extent, by calling Python programs and using Python functions in R (see below), but this doesn’t really come across from the language used in the reticulate documentation, at least to me.
First, let’s make a Python code chunk inside our R markdown. To do this, simply specify Python or python (case doesn’t matter) in between the braces that start the code chunk (if using R then there would be an r in between these braces, see below).
DNASeq = 'ATGAAC'
SeqLength = len(DNASeq)
print ('Sequence Length:', SeqLength)
## ('Sequence Length:', 6)
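The tuple-style output above is what Python 2 produces, because there print is a statement and the parentheses build a tuple. As a small sketch, assuming reticulate is bound to Python 3 instead, the same chunk can be written so that the label and value print as plain text:

```python
# Same chunk, written for Python 3 where print() is a function
DNASeq = 'ATGAAC'
SeqLength = len(DNASeq)

# Build the message explicitly so the output does not depend on tuple printing
message = 'Sequence Length: {}'.format(SeqLength)
print(message)  # Sequence Length: 6
```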
We can mix R and Python code chunks in the same markdown document.
print("R is still cool, I guess")
## [1] "R is still cool, I guess"
The other way to use reticulate is to source Python scripts. Objects made by the scripts are put into memory and can be called from either an R or a Python chunk. First, let me make a demo Python script that makes an object. Then I will source that script and do stuff with the objects it makes.
fileConn<-file("deleteme.py")
writeLines("#!/usr/bin/env python\n
print('Hello World')\n
test = 2 + 3", fileConn)
close(fileConn)
Now I will source this script. NOTE: I can source it from WITHIN an R chunk. This is likely to be the most useful way to use reticulate, because one can run various Python programs from within an R script that then operates on the outputs of those programs. This strategy will work in a script interpreted as R; it doesn’t have to be a markdown file. E.g., one could do some data wrangling in R, save the objects, run a Python script that does some machine learning nonsense, then take the output of that back into R for plotting, all in one potentially too long script, like this sentence.
library(reticulate) #needed for source_python(), skip if already loaded
source_python("deleteme.py")
str(test) #looky thar
## int 5
plot(test,main = "the most interesting plot ever made at this moment in time")
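To make the workflow above a bit more concrete, here is a sketch of what a slightly larger script to source might contain. The file name analysis.py and the function summarize are hypothetical, invented for illustration. As I understand it, source_python() exposes functions defined in the script to R as well as plain objects.

```python
# analysis.py (hypothetical file name, for illustration only)

def summarize(values):
    """Return the count, mean, minimum, and maximum of a sequence of numbers."""
    values = list(values)
    n = len(values)
    return {
        'n': n,
        'mean': sum(values) / float(n),  # float() keeps true division on Python 2
        'min': min(values),
        'max': max(values),
    }

# An object made at the top level, like `test` in deleteme.py
result = summarize([2, 3, 5])
```

In R this would look like source_python("analysis.py") followed by summarize(c(2, 3, 5)), with the returned dictionary arriving as a named list.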
We can also import Python modules from within R. This could be very useful, since we could then use these functions on R objects. For instance, here I load numpy, make an array in R, and use numpy to operate on it. IMPORTANT: R and Python print arrays a bit differently, so be careful. See this page for more: https://rstudio.github.io/reticulate/articles/arrays.html
np = import("numpy")
#Make an array in R that we will operate on in Python
aRray <- array(1:100, dim = c(2,50))
#Do some numpy stuff to the array
np$min(aRray)
## [1] 1
np$sum(aRray)
## [1] 5050
Python time…
#Same thing as last chunk, but reptilian
r.aRray.min()
## 1
r.aRray.sum()
## 5050
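One reason array printouts can differ is layout: R fills arrays column by column, while NumPy defaults to row by row. The sketch below uses plain Python lists (no NumPy needed) and two hypothetical helper functions to show how the same six values land in a 2 x 3 grid under each convention.

```python
def fill_column_major(values, nrow, ncol):
    """Lay values out column by column, the way R's array() and matrix() do."""
    return [[values[r + c * nrow] for c in range(ncol)] for r in range(nrow)]

def fill_row_major(values, nrow, ncol):
    """Lay values out row by row, NumPy's default ordering."""
    return [[values[r * ncol + c] for c in range(ncol)] for r in range(nrow)]

values = list(range(1, 7))  # 1..6, like 1:6 in R

col_major = fill_column_major(values, 2, 3)
row_major = fill_row_major(values, 2, 3)

print(col_major)  # [[1, 3, 5], [2, 4, 6]]
print(row_major)  # [[1, 2, 3], [4, 5, 6]]
```

Summaries such as min and sum come out the same either way, which is why the chunks above agree, but anything that depends on position (indexing, reshaping) is where the care is needed.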
Also, we can start up a CLI for Python by calling repl_python(). REPL stands for “read–evaluate–print loop” and is a way to interact with the Python interpreter. It is like using R in the console, typing each command one line at a time. This can be useful for debugging. Objects made using the REPL are available in the R session. To close the REPL, type exit.
From the reticulate documentation, “When calling into Python, R data types are automatically converted to their equivalent Python types. When values are returned from Python to R they are converted back to R types.”
We have already converted a few things in this script with no hassle. Note that to access an R object from Python, use the prefix “r.”, and when going the other way, use “py$”. To demonstrate, I will make some stuff in R and then in Python and then call the R stuff from within Python and vice versa.
df <- data.frame(matrix(1:100, nrow = 25, ncol = 4))
the_almighty_list <- list(df, 1, "string")
scalar <- 1
namedList <- list(thing1 = seq(1,5,1))
scalar = 1
mylist = ['a','b', 2]
#make a dictionary also called a hash
metal = {}
metal['grind'] = ['nails']
metal['death'] = ['tomb mold']
Let’s call the R objects from within Python and see what they look like.
type(r.df)
## <type 'dict'>
r.df['X1']
## [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
type(r.the_almighty_list)
## <type 'list'>
type(r.namedList)
## <type 'dict'>
type(scalar)
## <type 'int'>
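Note that the data frame and the named list got turned into dictionaries. A data frame is a named list, so it is not surprising that they got treated in the same way. (As far as I can tell, when pandas is installed reticulate converts a data frame to a pandas DataFrame instead.) However, a non-named list remained a list in Python. One caveat: type(scalar) above inspects the Python variable scalar made in the Python chunk, not the R one. To check the R object, use type(r.scalar), which should report a float, because R stores 1 as a double unless it is written as 1L.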
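Since the data frame arrived as a dictionary of columns, working with it in Python is plain dictionary and list handling. A minimal sketch, using a hand-built stand-in for r.df (the name df_as_dict is made up here; the real object would come from R):

```python
# Stand-in for r.df: four columns of 25 values each, filled column by column
# like matrix(1:100, nrow = 25, ncol = 4) in R
df_as_dict = {
    'X{}'.format(j + 1): list(range(25 * j + 1, 25 * j + 26))
    for j in range(4)
}

# Pull one column, as r.df['X1'] did above
first_column = df_as_dict['X1']

# Rebuild the first row by taking element 0 from each column
column_names = sorted(df_as_dict)
first_row = [df_as_dict[name][0] for name in column_names]

print(first_row)  # [1, 26, 51, 76]
```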
Now let’s access some Python stuff in R.
class(py$scalar)
## [1] "integer"
class(py$mylist)
## [1] "list"
class(py$metal)
## [1] "list"
py$metal
## $death
## [1] "tomb mold"
##
## $grind
## [1] "nails"
Notably, the dictionary came back as a named list, printed in R’s regular list format rather than as a data.frame (which is also a named list under the hood).
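The named list above came back with death before grind, even though grind was added first. That is consistent with older Pythons, where dictionaries had no guaranteed order. As a small sketch, assuming Python 3.7 or newer, dictionaries keep insertion order:

```python
# Same dictionary as in the Python chunk above
metal = {}
metal['grind'] = ['nails']
metal['death'] = ['tomb mold']

# On Python 3.7+ the keys come back in the order they were added
key_order = list(metal.keys())
print(key_order)  # ['grind', 'death']
```

I would expect py$metal to list grind first in that case, though I have not verified how reticulate orders the names.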
Have fun!