
R: Zip fastener for two data frames / combining rows or columns of two data frames in an alternating manner


Sometimes I find it useful to merge two data frames like the following ones

  X1 X2 X3 X4      Y1 Y2 Y3 Y4   
1  o  o  o  o       X  X  X  X
2  o  o  o  o       X  X  X  X
3  o  o  o  o       X  X  X  X

by using zip feeding either along the columns

   X1 Y1 X2 Y2 X3 Y3 X4 Y4
1  o  X  o  X  o  X  o  X
2  o  X  o  X  o  X  o  X
3  o  X  o  X  o  X  o  X

or along the rows of the data frames.

  X1 X2 X3 X4
1  o  o  o  o
4  X  X  X  X
2  o  o  o  o
5  X  X  X  X
3  o  o  o  o
6  X  X  X  X

The following function acts like a “zip fastener” for combining two data frames. It takes the first column (or row) of the first data frame and places it next to the first column (or row) of the second data frame, then does the same with the second of each, and so on. Only one dimension of the two data frames has to match: to zip along the columns the number of rows must be equal, and to zip along the rows the number of columns must be equal. If one data frame is longer along the zipped dimension, its surplus columns (or rows) simply follow at the end.
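
Here is a minimal sketch of the idea for the simplest case, where both data frames have the same dimensions. The data frames a and b and their values are made up for illustration, and this is not the function itself, only the interleaving trick it relies on.

```r
# Two small data frames with the same number of rows (illustrative data)
a <- data.frame(X1 = 1:3, X2 = 4:6)
b <- data.frame(Y1 = 7:9, Y2 = 10:12)

n <- ncol(a)

# rbind() stacks the two index vectors on top of each other;
# as.vector() then reads the matrix column by column, which
# interleaves the indices: 1, 3, 2, 4
idx <- as.vector(rbind(seq_len(n), seq_len(n) + n))

# bind the columns side by side, then reorder them with the index
zipped <- cbind(a, b)[ , idx]
names(zipped)
```

The column order of zipped alternates between a and b, i.e. X1, Y1, X2, Y2.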

So here comes the code for the zipFastener() function. Actually it's only the last few lines (from the comment "# zip fastener operations" onward) that do the job, but as I did not want to restrict the function to equal dimensions, there is a little prelude.

###############################################################

# zipFastener for TWO data frames of possibly unequal length
zipFastener <- function(df1, df2, along=2)
{
    # parameter checking
    if(!is.element(along, c(1,2))){
        stop("along must be 1 or 2 for rows and columns respectively")
    }
    # if merged by using zip feeding along the columns, the
    # same no. of rows is required and vice versa
    # (&& is the scalar "and" that belongs in an if() condition)
    if(along==1 && (ncol(df1)!= ncol(df2))) {
        stop("the no. of columns has to be equal to merge them by zip feeding")
    }
    if(along==2 && (nrow(df1)!= nrow(df2))) {
        stop("the no. of rows has to be equal to merge them by zip feeding")
    }

    # zip fastener preparations
    d1 <- dim(df1)[along]
    d2 <- dim(df2)[along]
    i1 <- seq_len(d1)          # index vector 1 (seq_len is safe for d1 = 0)
    i2 <- seq_len(d2) + d1     # index vector 2, offset past df1

    # set biggest dimension dMax and pad the shorter index
    # vector with NAs so that both have length dMax
    dMax <- max(d1, d2)
    length(i1) <- dMax
    length(i2) <- dMax
    
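    # zip fastener operations
    # filling the matrix row-wise and reading it column-wise
    # interleaves i1 and i2
    index <- as.vector(matrix(c(i1, i2), ncol=dMax, byrow=TRUE))
    index <- index[!is.na(index)]         # remove NAs

    # drop=FALSE keeps the result a data frame even if only
    # a single row or column is left
    if(along==1){
        colnames(df2) <- colnames(df1)   # keep 1st colnames
        res <- rbind(df1,df2)[index, , drop=FALSE]   # reorder rows
    } else {
        res <- cbind(df1,df2)[ , index, drop=FALSE]  # reorder columns
    }

    return(res)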
}

###############################################################
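
To see how the index vector is built when the two data frames differ in length, here is a small sketch of the same steps run outside the function. The sizes 4 and 2 are made up for illustration.

```r
d1 <- 4                      # length of the first data frame (assumed)
d2 <- 2                      # length of the second data frame (assumed)

i1 <- seq_len(d1)            # 1 2 3 4
i2 <- seq_len(d2) + d1       # 5 6
length(i2) <- length(i1)     # pad the shorter vector: 5 6 NA NA

# row 1 holds i1, row 2 holds the padded i2
m <- matrix(c(i1, i2), ncol = d1, byrow = TRUE)

# reading column by column alternates between the two rows
index <- as.vector(m)        # 1 5 2 6 3 NA 4 NA
index <- index[!is.na(index)]
index                        # 1 5 2 6 3 4
```

The surplus entries of the longer data frame (here 3 and 4) end up at the tail once the NAs are dropped.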

Here come some examples.

###############################################################
### examples ###
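library(plyr)                           # provides rdply()

# data frames of equal dimensions
df1 <- rdply(3, rep("o",4))[ ,-1]       # [ ,-1] drops rdply's index column
df2 <- rdply(3, rep("X",4))[ ,-1]

zipFastener(df1, df2)                   # along=2 is the default: zip columns
zipFastener(df1, df2, 2)                # same as the call above
zipFastener(df1, df2, 1)                # zip rows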

# data frames unequal in no. of rows
df1 <- rdply(10, rep("o",4))[ ,-1]
zipFastener(df1, df2, 1)
zipFastener(df2, df1, 1)

# data frames unequal in no. of columns
df2 <- rdply(10, rep("X",3))[ ,-1]
zipFastener(df1, df2)
zipFastener(df2, df1, 2)

###############################################################
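
If you want to cross-check the row-wise case without plyr, the following sketch interleaves rows with order() in base R. This is a different technique from the matrix trick in zipFastener(), not part of the function; the data frames top and bottom are made up for illustration.

```r
# Illustrative data frames: same columns, different number of rows
top    <- data.frame(V1 = rep("o", 3), V2 = rep("o", 3))
bottom <- data.frame(V1 = rep("X", 5), V2 = rep("X", 5))

# Each row gets its position within its own data frame as sort key.
# order() leaves ties in their original order, so for equal keys
# the row from `top` comes before the row from `bottom`.
key <- c(seq_len(nrow(top)), seq_len(nrow(bottom)))

zippedRows <- rbind(top, bottom)[order(key), ]
as.character(zippedRows$V1)
```

The rows alternate o, X, o, X, o, X, and the two surplus X rows follow at the end, just as in zipFastener(df1, df2, 1) with unequal row counts.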

I hope you find that useful.

Ciao, Mark


