I’m used to object-oriented programming in languages such as C++ and Python, and I must say that the R syntax is not the most intuitive I’ve seen. Besides, it is quite difficult to find good documentation about it. I’m not going to write yet another exhaustive description of OOP in R, but since I had to piece together from many different places how to write a simple class, let me summarize what I found.
First of all, the most relevant and well-written document I found is a PDF called “A (Not So) Short Introduction to S4: Object Oriented Programming in R” by Christophe Genolini.
R has two main OOP systems, called S3 and S4 (Reference Classes exist as well, but I won’t cover them here). Stick to S4, which is the newer and more formal of the two.
Then, keep in mind that when you declare a method, you must know whether a function with that name already exists (in the standard library, for instance). If it does not, you first have to create a generic, so the definition takes two steps; we’ll come to that in a moment.
So, let us be pragmatic and jump directly to the class design.
Definition
First, you create a new class by using the setClass function:
setClass("MyClass",
    # slots (attributes) can be of different types:
    representation(
        myMatrix="matrix",
        myList="list"),
    # validity method (optional): return TRUE, or a message describing the problem
    validity=function(object) {
        msg <- NULL
        # read the slot directly; the myMatrix() accessor is only defined later
        if (nrow(object@myMatrix) <= 1) {
            msg <- "nrow must be strictly larger than 1"
        }
        if (is.null(msg)) TRUE else msg
    })
The class definition lets you declare the attributes (called slots in S4) and perform sanity checks on them.
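To see how the validity function behaves, here is a minimal sketch. It assumes a MyClass definition like the one above, with the validity function reading the slot directly through object@myMatrix. The names ok, bad_msg and direct_msg are only illustrative.

```r
library(methods)

setClass("MyClass",
    representation(myMatrix = "matrix", myList = "list"),
    validity = function(object) {
        if (nrow(object@myMatrix) <= 1) {
            "nrow must be strictly larger than 1"
        } else {
            TRUE
        }
    })

# A 2x2 matrix passes the check.
ok <- new("MyClass", myMatrix = matrix(1:4, nrow = 2), myList = list())

# A 1x4 matrix fails: new() runs the validity function and raises an error.
bad_msg <- tryCatch(
    new("MyClass", myMatrix = matrix(1:4, nrow = 1), myList = list()),
    error = function(e) conditionMessage(e))

# Assigning to a slot with @ does not re-run the validity function,
# so call validObject() yourself after modifying an object.
ok@myMatrix <- matrix(1:4, nrow = 1)
direct_msg <- tryCatch(validObject(ok), error = function(e) conditionMessage(e))
```

Both bad_msg and direct_msg end up holding an error message that contains the string returned by the validity function.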
Constructor
Then, you need a constructor. It can be done directly with the new function:
new("MyClass", myMatrix=matrix(c(1,2,3,4), nrow=2), myList=list(data=c(1,2)))
or via a user-defined constructor:
MyClass <- function(data_matrix, data_list){
    # you can do all sorts of sanity checks here or add a dispatcher,...
    new("MyClass", myMatrix=data_matrix, myList=data_list)
}
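As an illustration of the "sanity checks" a user-defined constructor can perform, here is a small sketch. The particular checks and the default value for data_list are my own assumptions, not something S4 requires.

```r
library(methods)

setClass("MyClass", representation(myMatrix = "matrix", myList = "list"))

# User-defined constructor: check the inputs, then delegate to new().
# The default data_list = list() is an assumption made for convenience.
MyClass <- function(data_matrix, data_list = list()) {
    if (!is.matrix(data_matrix)) {
        stop("data_matrix must be a matrix")
    }
    if (!is.list(data_list)) {
        stop("data_list must be a list")
    }
    new("MyClass", myMatrix = data_matrix, myList = data_list)
}

obj <- MyClass(matrix(c(1, 2, 3, 4), nrow = 2), list(data = c(1, 2)))
is(obj, "MyClass")  # TRUE
```

The constructor function and the class can share the same name without any conflict, since the class name is only ever passed as a string.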
Accessors
Now, for the accessors, you can do the following:
## accessors
myMatrix <- function(obj, ...) obj@myMatrix
myList <- function(obj, ...) obj@myList
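The accessors above are plain functions. If you prefer them to be S4 methods as well, and want a setter too, something along these lines should work. This is only a sketch: declaring the getter as a generic and re-validating inside the setter are my own choices.

```r
library(methods)

setClass("MyClass", representation(myMatrix = "matrix", myList = "list"))

# Getter declared as an S4 generic, so other classes can define their own.
setGeneric("myMatrix", function(obj, ...) standardGeneric("myMatrix"))
setMethod("myMatrix", "MyClass", function(obj, ...) obj@myMatrix)

# Setter: a replacement function whose last argument must be named 'value'.
setGeneric("myMatrix<-", function(obj, value) standardGeneric("myMatrix<-"))
setReplaceMethod("myMatrix", "MyClass", function(obj, value) {
    obj@myMatrix <- value
    validObject(obj)  # re-check the object after the change
    obj
})

obj <- new("MyClass", myMatrix = matrix(1:4, nrow = 2), myList = list())
myMatrix(obj) <- matrix(1:6, nrow = 3)
dim(myMatrix(obj))  # 3 2
```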
Methods
First, the way to call a method associated with an object is:
method(obj)
which is quite different from many other languages where you would type:
obj.method()
What took me a while to understand is that there are two ways of defining a method, depending on whether a function with that name already exists. For instance, plot already exists in the R language, so if you want to create your own plot method, you use the setMethod function directly:
setMethod("plot", "MyClass", function(x, y, ... ){
# do some plotting using the x argument that is your object
})
The first two arguments look obvious: the name of the method and the name of the class. The second one is misleading, though: it is not just a class name but the signature, that is, the classes of the arguments used for dispatch. In our case, it is a single “MyClass” object. Then, the third argument must be function(x, y, ...). Where does it come from? It must have the same arguments as the already declared function. Remember that plot already exists in the R language. If you inspect it, for instance by typing
args(plot)
you will see that it is declared as
function (x, y, ...)
This is what you must use as your third argument. Since the signature you provided as the second argument is a MyClass object, x will be your object and y can simply be ignored. What if you want to dispatch on several inputs?
You can define a new method by passing a full signature as the second argument:
setMethod("plot", signature(x="MyClass", y="MyClass"), function(x, y, ... ){
    # do some plotting with the x and y objects
})
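To tie the plot discussion together, here is a sketch that looks up the arguments of the existing plot function and then registers both a one-argument and a two-argument method. What the methods actually draw (an image of the matrix, a scatter plot of two matrices) is purely an assumption for illustration.

```r
library(methods)

setClass("MyClass", representation(myMatrix = "matrix", myList = "list"))

# The formal arguments of the existing function give the required signature.
plot_args <- names(formals(plot))   # "x" "y" "..."

# One-argument dispatch: x is a MyClass object, y is unused.
setMethod("plot", "MyClass", function(x, y, ...) {
    image(x@myMatrix, ...)
})

# Two-argument dispatch: both x and y are MyClass objects.
setMethod("plot", signature(x = "MyClass", y = "MyClass"), function(x, y, ...) {
    plot(as.vector(x@myMatrix), as.vector(y@myMatrix), ...)
})
```

After this, plot(obj) goes to the first method and plot(obj1, obj2) to the second, while plot called on ordinary vectors keeps its usual behaviour.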
Another simple example, which overloads the length method:
setMethod("length", "MyClass", function(x) length(x@myList))
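Here is a short, self-contained sketch of overloading length for MyClass. It assumes that the "length" of an object should be the number of elements in its myList slot, which is just one possible convention.

```r
library(methods)

setClass("MyClass", representation(myMatrix = "matrix", myList = "list"))

# Assumption: the length of a MyClass object is the number of elements in myList.
setMethod("length", "MyClass", function(x) length(x@myList))

obj <- new("MyClass", myMatrix = matrix(1:4, nrow = 2),
           myList = list(a = 1, b = "two", c = 3))
length(obj)  # 3
```

Note that the argument must be called x, because that is the name used by the built-in length function.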
Now, if you want to create a method whose name does not exist yet (e.g., “newMethod”), you must first create what is called a generic:
setGeneric(
name="newMethod",
def=function(object, optional_arg1=1){standardGeneric("newMethod")}
)
Now you can create the method itself:
setMethod("newMethod", "MyClass",
definition=function(object, optional_arg1=1){
#do something with your object and the optional arg
}
)
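Putting the two steps together, a complete example could look like the following. The body of the method (scaling the matrix by optional_arg1) is a hypothetical behaviour chosen only to have something to run.

```r
library(methods)

setClass("MyClass", representation(myMatrix = "matrix", myList = "list"))

# Step 1: declare the generic.
setGeneric(
    name = "newMethod",
    def = function(object, optional_arg1 = 1) { standardGeneric("newMethod") }
)

# Step 2: define the method for MyClass.
# Hypothetical behaviour: scale the matrix slot by optional_arg1.
setMethod("newMethod", "MyClass",
    definition = function(object, optional_arg1 = 1) {
        object@myMatrix * optional_arg1
    }
)

obj <- new("MyClass", myMatrix = matrix(c(1, 2, 3, 4), nrow = 2), myList = list())
newMethod(obj)                      # matrix unchanged (default factor of 1)
newMethod(obj, optional_arg1 = 10)  # every entry multiplied by 10
```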
One issue is that by calling setGeneric you may replace an existing generic function with the same name. It is therefore recommended to check whether it exists first:
if (!isGeneric("MyMethod")) {
    setGeneric(
        name="MyMethod",
        def=function(object){standardGeneric("MyMethod")}
    )
}
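A quick way to convince yourself that the guard works is to query isGeneric before and after. This sketch assumes the generic is meant to be called MyMethod; before and after are illustrative variable names.

```r
library(methods)

before <- isGeneric("MyMethod")   # FALSE in a fresh session

if (!isGeneric("MyMethod")) {
    setGeneric(
        name = "MyMethod",
        def = function(object) { standardGeneric("MyMethod") }
    )
}

after <- isGeneric("MyMethod")    # TRUE: the generic now exists
```

Running the guarded block a second time does nothing, since isGeneric now returns TRUE.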
or to lock it:
lockBinding("MyMethod", .GlobalEnv)
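As a sketch of what locking does, the following defines a generic, locks its name and then tries to overwrite it. It assumes the generic lives in the global environment, as it does when you run the code at top level; result is an illustrative variable name.

```r
library(methods)

setGeneric("MyMethod", function(object) standardGeneric("MyMethod"))

# Lock the name in the global environment so it cannot be reassigned.
lockBinding("MyMethod", .GlobalEnv)

# Any attempt to overwrite it now raises an error.
result <- tryCatch({
    assign("MyMethod", function(object) NULL, envir = .GlobalEnv)
    "overwritten"
}, error = function(e) "locked")

unlockBinding("MyMethod", .GlobalEnv)  # undo the lock when it is no longer needed
```

Here result ends up as "locked", because the assignment fails while the binding is locked.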
There are of course many more features and tricks when designing R objects. Documenting the classes and methods is not easy either, and of course if you create a package, you need to set up your NAMESPACE file properly, with something like:
# MyClass class
exportClasses(MyClass)
exportMethods("plot", "MyMethod", "length")