Fortran and R – Speed Things Up

Posted: April 11, 2014 in Fortran, Performance, R programming apply lapply tapply

If you are a newcomer to R then you are probably quite busy learning the semantics of the language as you experiment with the apply family of commands or come up to speed on the grouping and conditioning capabilities offered by lattice graphics. And, along the way, you might have heard that R has the ability to “link in” code written in other languages such as C, C++, Fortran, and Java. This is true but until you are presented with a compelling use case you will most likely ignore such capability since you’ve already got plenty to do. But that’s where this blog can help. I’ll talk about how to integrate Fortran code with R and, in a later post, discuss doing the same with C and C++.

DISCLAIMER: To accomplish this work requires the presence of a working Fortran compiler such as the GNU suite of compilers (g77, gfortran, gcc, etc). Relative to operating systems I use Linux and OSX. I rarely use Windows though do know that the GNU suite is available for that OS so you should be able to link Fortran code into R there. However, I’ve never tried it. Additional Disclaimer: The timings I present in this post are based on an Apple MacBook @ OS level 10.9.2 with a 2.5 Ghz i5 processor and 4GB of RAM. Your timings may vary.

Why Bother ?

Good question. Most people wind up wanting to access Fortran from R for a few reasons such as they have some really fast and efficient Fortran code that they want to exploit within R. Or maybe they have written some code in R that winds up being incredibly slow so they write a much faster version in Fortran and then want to call it from R. Perhaps they need to access subroutines from external Fortran libraries. Lastly, it might simply be because your boss or faculty advisor is making you do it ! Whatever your reason(s) we’ll break the process of linking in Fortran code down into three general steps: 1) prepare the subroutine for compilation and generate a “shared object”, 2) load the shared object into an active R session, and 3) provide R variables to the shared object via the “.Fortran” call, which is part of the “Foreign Function Interface” within R.

Foreign                  package:base                  R Documentation

Foreign Function Interface

Description:

     Functions to make calls to compiled code that has been loaded into
     R.

Usage:

            .C(.NAME, ..., NAOK = FALSE, DUP = TRUE, PACKAGE, ENCODING)
      .Fortran(.NAME, ..., NAOK = FALSE, DUP = TRUE, PACKAGE, ENCODING)

Prepare the Fortran Subroutine

Next, I present a very simple Fortran 77 subroutine that computes the factorial of a number “n” and stashes the result into a variable called “answer”. And speaking of subroutines it is important to know that to use the .Fortran interface one must make reference to Fortran subroutines only – not Fortran functions or full on programs. So if you have some code that you want to bring in then you will need to embed that code within a subroutine definition. We also need to pay attention to the variable types as declared within the subroutine so we can match those types accordingly when calling the subroutine from R. This winds up being one of the more critical steps in the process.

        subroutine facto(n,answer)
c
c simple subroutine to compute factorial
c
        integer n, answer, i

        answer = 1
        do 100 i = 2,n
           answer = answer * i
  100   continue 
        
        end

You should make sure, of course, that this code does compile correctly. Our goal is to generate a shared object file ( a “.so” file) that can be linked in to R. Note also that our routine doesn’t print/write things to output. It simply uses its input variables and ultimately sets an output variable. Make your code lean.

$ ls
facto.f

$ gfortran -c facto.f

$ ls
facto.f	facto.o

$ gfortran -shared -o facto.so facto.o
$ ls
facto.f		facto.o		facto.so

So it looks like we are good to go here. However, instead of doing this compilation ourselves we could have allowed R to help us. In fact it is the preferred way to do this since this will insure the compilation is done “under the supervision” of the R tools. Let’s remove the .o and .so files and start over.

$ rm *.*o

$ ls
facto.f

$ R CMD SHLIB facto.f
<you will see various compilation output messages>

$ ls
facto.f		facto.o		facto.so

Load it

So now what ? Let’s fire up R. We’ll use the dyn.load command which is part of the foreign interface capability. It’s purpose is to load/unload shared objects which are also known as DLLs, (dynamically loadable libraries).

system("ls")
facto.f		facto.o		facto.so
dyn.load("facto.so")

Okay not much happened there. What’s going on ? Well all we did was simply load the shared object. We have yet to use it. To do that we rely upon the “.Fortran” function. Keep in mind that the subroutine “facto” has two arguments both of which are integers. We’ll supply a value of 5 for “n” and we’ll pass a single integer as a value for the “answer” variable though that will be overwritten once the subroutine computes the “answer”.

system("ls")
facto.f		facto.o		facto.so

dyn.load("facto.so")

.Fortran("facto",n=as.integer(5),answer=as.integer(1))

$n
[1] 5

$answer
[1] 120

# Or more directly

.Fortran("facto",n=as.integer(5),answer=as.integer(1))$answer
[1] 120

If you are wondering what types are supported or shared between R and Fortran here is a list from the .Fortran man page. It is worth some time to peruse the man page as it provides some caveats and deeper explanations on how to interact with Fortran.

       R          Fortran          
       integer    integer          
       numeric    double precision 
       - or -     real             
       complex    double complex   
       logical    integer          

Wrap it Up!

Okay that was cool but this is sort of an awkward way to call the Fortran subroutine. We should probably write our own wrapper function in R to do the loading of the shared object and the referencing to the .Fortran function. Take a look at this approach which winds up being very “R like”.

myfacto <- function(num) {
  dyn.load("facto.so")
  retvals <- .Fortran("facto",n = as.integer(num), answer = as.integer(1))
  return(retvals$answer)
}
         
myfacto(5)
[1] 120

sapply(1:10,myfacto)
 [1]       1       2       6      24     120     720    5040   40320  362880 3628800

So with the wrapper approach we can use the Fortran subroutine just as we would any other R function since the call to .Fortran is “buried” in the wrapper function. We could make this a bit more robust by putting in some logic to see if the shared object is already loaded.

myfacto <- function(num) {
  if (!is.loaded('facto')) {
     dyn.load("facto.so")
  }
  retvals <- .Fortran("facto",n = as.integer(num), answer = as.integer(1))
  return(retvals$answer)
}
         
myfacto(5)
[1] 120

It’s all Convoluted

Well that was okay but let’s look at a more involved example. Let’s consider the idea of doing discrete convolution between two vectors. (Note: This is discussed in the “Writing R Extensions” manual). Why did I pick such an example ? Well first it’s commonly referenced in R literature and , second, it is a good motivating case for using an external language to speed up the processing. The algorithm itself isn’t hard to code up either in R or Fortran. However, the performance in R isn’t so good once the vectors get larger. Check it out:

conr <- function(x, y) {
    lx <- length(x)
    ly <- length(y)
    cxy <- numeric(lx + ly - 1)
    for(i in 1:lx) {
        xi <- x[i]
        for(j in 1:ly) {
            ij <- i+j-1
            cxy[ij] <- cxy[ij] + xi * y[j]
        }
    }
    return(cxy)
}

# Let's check the timings for vectors of different sizes

v1 = rnorm(100); v2 = rnorm(100)

system.time(conr(v1,v2))
   user  system elapsed 
  0.034   0.000   0.035 

v1 = rnorm(2000); v2 = rnorm(2000)

system.time(conr(v1,v2))
   user  system elapsed 
 13.195   0.020  13.215 

v1 = rnorm(4000); v2 = rnorm(4000)

system.time(conr(v1,v2))
   user  system elapsed 
 57.757   0.130  58.008 

The timings grow significantly longer as the sizes of the vectors grow. So passing vectors of size 10,000 could take a very long time. While this blog isn’t specifically on performance let’s do a little bit more coding to get an idea about how poorly performing the convolution written in R is. We’ll use this for later comparison with the performance numbers resulting from the Fortran subroutine. Let’s write a wrapper function to the conr function. This will call conr with a variable x that represents the size of the vectors we wish to convolute. If you don’t understand exactly what is going on here don’t worry – just think of at as more exposure to the apply family of commands.

timewrapconr <- function(x) {
    times <- system.time(conr(rnorm(x),rnorm(x)))[3]
    return(c(size=x,times))
}

# time the convolution for vectors of size 100,1000,2000, and 4000

(convtimes <- sapply(c(100,1000,2000,4000),timewrapconr))
           [,1]     [,2]     [,3]    [,4]
size    100.000 1000.000 2000.000 4000.00
elapsed   0.033    3.552   13.811   62.15

# Let's plot this
library(lattice)

xyplot(convtimes[2,]~convtimes[1,],type=c("p","l","g"),
       xlab = "vector size", ylab = "elapsed time in seconds",
       main = "Execution times for Convolution in R", pch = 19)
Figure 1. Plot of execution times

Figure 1. Plot of execution times

How do we address this problem ? Well there are opportunities for improvement within the R code by using vectorization techniques. A good start would be to somehow avoid the second for loop and there are ways to do that. In fact there is a way to avoid both loops altogether and maybe we’ll explore such an approach in another post. But for now let’s see if writing the code in Fortran and then linking it in could help improve things. So here is the rewritten convolution algorithm, which we will save into a file called convolvef77.f

      subroutine convolvef77 (x, lx, y, ly, xy)
c
c A basic implementation of convolution algorithm for two vectors
c I use zero-based arrays here.
c
      integer lx, ly, i, j
      double precision x(0:lx-1), y(0:ly-1), xy(0:lx+ly-2)
      do 20 i = 0, (lx-1) 
         do 15 j = 0, (ly-1) 
            xy(i+j) = xy(i+j) + x(i) * y(j) 
  15     continue  
  20  continue 
      end

# Save the above to a file called convolvef77.f
# Now compile it to a shared library

$ R CMD SHLIB convolvef77.f 

Next we’ll write a function in R to call the convolvef77 function. Start up R.

convolvef77 <- function(x,y) {
  dyn.load('convolvef77.so')
  lx = length(x)
  ly = length(y)
  retdata <- .Fortran("convolvef77",
                       x = as.double(x),
                       lx = as.integer(lx), 
                       y = as.double(y), 
                       ly = as.integer(ly), 
                       xy = double(lx+ly-1))$xy
  return(retdata)
}

# Now let's throw some large vectors at it. Look at how much better the times are

v1 = rnorm(4000); v2 = rnorm(4000)

system.time(convolvef77(v1,v2))
   user  system elapsed 
  0.012   0.000   0.012 

v1 = rnorm(8000); v2 = rnorm(8000)

system.time(convolvef77(v1,v2))
   user  system elapsed 
  0.047   0.001   0.083 

So the speed looks really good. So now let’s repeat the timing exercise we applied to the convolutions done in R.

timewrapconf77 <- function(x) {
    times <- system.time(convolvef77(rnorm(x),rnorm(x)))[3]
    return(c(size=x,times))
}

(convtimes <- sapply(c(100,1000,2000,4000),timewrapconf77))
           [,1]  [,2]  [,3]    [,4]
size    100.000 1e+03 2e+03 4.0e+03
elapsed   0.045 2e-03 4e-03 1.3e-02

# Wow. This is FAST !!!!! Let's throw some bigger vectors at it.

(convtimes <- sapply(c(100,1000,2000,4000,10000,20000,50000),timewrap))
         [,1]  [,2]  [,3]    [,4]    [,5]     [,6]      [,7]
size    1e+02 1e+03 2e+03 4.0e+03 1.0e+04 2.00e+04 50000.000
elapsed 1e-03 2e-03 4e-03 1.2e-02 7.2e-02 3.22e-01     2.074

# Plot the times

xyplot(convtimes[2,]~convtimes[1,],type=c("p","l","g"),
       xlab = "vector size", ylab = "elapsed time in seconds",
       main = "Execution times for Convolution in Fortran", pch = 19)
Execution times using Fortran

Execution times using Fortran

So using the Fortran subroutine took 2.0 seconds to convolute vectors of size 50,000 whereas using native R code to convolute a vector of size 1,000 took 3.5 seconds (these timings might vary depending on your architecture and OS). To get a better visual comparison let’s repeat the timings for both approaches, R and Fortran, and plot the results on the same graph so you can get some sense of proportion between the execution times. This isn’t hard to do. We’ll just rerun our timing functions:

(convtimesr <- sapply(c(100,1000,2000,4000,10000,20000),timewrapconr))

           [,1]     [,2]     [,3]     [,4]      [,5]      [,6]
size    100.000 1000.000 2000.000 4000.000 10000.000 20000.000
elapsed   0.034    3.374   14.118   64.894   355.409  1504.517
  
(convtimesf77 <- sapply(c(100,1000,2000,4000,10000,20000),timewrapconf77))
           [,1]    [,2]  [,3]    [,4]    [,5]     [,6]
size    100.000 1.0e+03 2e+03 4.0e+03 1.0e+04 2.00e+04
elapsed   0.071 2.3e-02 4e-03 1.2e-02 6.9e-02 2.99e-01

# Now plot them on the same graph

plot(convtimesr[1,],convtimesr[2,],xlab="Vector size",
     ylab="Elapsed time in seconds", 
     main="Convolution in R vs Fortran",
     type="l",col="red",lwd=2.5)

points(convtimesf77[1,],convtimesf77[2,],type="l",col="blue",lwd=2.5)

legend("topleft",c("R","Fortran"),col=c("red","blue"),
        lty=c(1,1),lwd=c(2.5,2.5))
grid()
Execution times for R and Fortran

Execution times for R and Fortran

Okay, I think you get the point here. Using the Fortran code definitely helped speed things up. However, speed might not be the only reason you choose to link in Fortran code. For example I know of people who have written the bulk of their thesis analysis work using Fortran and now seek to leverage that effort within R. Sure, they could recode their stuff into R but that would probably result in lower performance results. Any time you have a significant body of work in one language you would like to avoid having to recode it in another. Lastly, there are other ways to bring in Fortran that I haven’t discussed here. The “inline” package allows one to compile fortran code inline within a given R program, which might be more appealing to some. Hope this has been helpful.

Advertisements
Comments
  1. […] article was first published on Rolling Your Rs, and kindly contributed to […]

  2. nico says:

    When I run system(“gfortran -shared -o facto.so facto.o”) on a Mac I get the following error:

    i686-apple-darwin8-gfortran-4.2: unrecognized option ‘-shared’
    ld: warning: ignoring file facto.o, file was built for unsupported file format ( 0xcf 0xfa 0xed 0xfe 0x 7 0x 0 0x 0 0x 1 0x 3 0x 0 0x 0 0x 0 0x 1 0x 0 0x 0 0x 0 ) which is not the architecture being linked (i386): facto.o
    Undefined symbols for architecture i386:
    “_MAIN__”, referenced from:
    _main in libgfortranbegin.a(fmain.o)
    ld: symbol(s) not found for architecture i386
    collect2: ld returned 1 exit status

    I will appreciate how to run it on a mac. thanks, nico

    • Steve says:

      Hi Nico, This sounds like a conflict between the compiler and the XCode dev tools. I can’t be sure about this – I just know that when I upgraded to Mavericks, (something I don’t recommend), I had to upgrade Xcode to restore functionality. Even now I get some spurious error messages that make reference to non-existent paths during linking. I obtained a recent version of gofrtran from http://gcc.gnu.org/wiki/GFortranBinariesMacOS To install you need to have the Apple Developer Tools installed, as well as “command line developer tools”. These can be downloaded from http://developer.apple.com/ (free registration required). Click on Downloads > Developer Tools in the sidebar. Xcode 5.0.1 is the latest version. I hope this helps

  3. frakor says:

    Thank you for this great programming example!
    However, replacing convolvef77 with the R convolve function for vectors of size 50,000 takes 20 milliseconds running time (R 3.1 on Ubuntu 13.10)

    • Steve says:

      Hi Thanks for reading. Each OS and associated underlying hardware will yield different response times though I’m confident that as the vector sizes grow that the R implementation of the convolution algorithm will become much slower than the comparable Fortran version. If you are on a speedy system then keep throwing larger vectors to see a greater difference in processing times. Just use the timing code presented above and start with vector sizes of say 50,000 and go from there.

  4. Alan Parker says:

    Thanks for the beautifully clear explanation. You motivated me to shoehorn a monster, bullet-proof piece of Fortran code into R. Someone, Katherine Mullen, maybe, said that R had saved Fortran from extinction. R has certainly made using this legacy code (e.g. ODEPACK) into something anyone can do.

  5. Michael says:

    This is great! I am using ifort compiler rather than gfortran; however, am struggling to generate a shared object file ( a “.so” file). Any advice would be highly appreciated!

    Thanks
    Michael

  6. Eduardo says:

    Thank you ever so much. I have managed to run one of my fortran codes without much trouble. ifort and gfortran work just fine.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s