The example continues


Yesterday we have seen how can we scatter a matrix which resides on a core among the processes. Now I want to make the code clearer and encapsulate it into a function. I call the function dscatter and the following are the parameters:

  • const int& context: input, the blacs context
  • const double* const& GlobalMatrix: input, only relevant for the root process; for the other processes it is safe to use a random pointer or NULL
  • double*& LocalMatrix: output, the given pointer is useless; a new one will be stored there; an allocation will be performed, so after the execution the user has to free the memory using delete[].
  • int& GlobalRows: input for root, output for the other processes; after the execution contains the global number of rows
  • int& GlobalCols: input for root, output for the other processes; after the execution contains the global number of columns
  • int& BlockRows: input for root, output for the other processes; after the execution contains the number of rows in each block
  • int& BlockCols: input for root, output for the other processes; after the execution contains the number of columns in each block
  • int& LocalRows: output, after the execution contains the number of rows of the local matrix
  • int& LocalCols: output, after the execution contains the number of columns of the local matrix
  • const int& root: input, the BLACS id of the matrix that owns the global matrix.
I decided to switch to the Fortran interface because of some problems that I have had with the broadcast operation when using the C interface. I also decided to remove every MPI-dependend code, so this function is usable with every message passing interface that is supported by BLACS. The non-root processes must only know which process owns the matrix and have a context handler; the rest is given by the function. It is crucial that the user remembers to free the memory allocated by this function. I will also provide a function that returns the local number of rows and columns, so that the user can allocate the memory, and a function similar to this one that writes the local memory to the array allocated by the user.
Here is the code:
void dscatter(
    const int& context,                // [IN]
    const double* const& GlobalMatrix, // [IN] Only relevant for root
    double*& LocalMatrix,              // [OUT] The pointer changes
    int& GlobalRows,                   // [IN (root) / OUT (other)]
    int& GlobalCols,                   // [IN (root) / OUT (other)]
    int& BlockRows,                    // [IN (root) / OUT (other)]
    int& BlockCols,                    // [IN (root) / OUT (other)]
    int& LocalRows,                    // [OUT]
    int& LocalCols,                    // [OUT]
    const int& root = 0                // [IN]
) {
    /* Helper variables */
    int iZERO = 0, iONE = 1, iSIX = 6;

    int myid, myrow, mycol, procrows, proccols, procnum, rootrow, rootcol;
    blacs_pinfo_(&myid, &procnum);
    blacs_gridinfo_(&context, &procrows, &proccols, &myrow, &mycol);
    bool iamroot = (myid == root);

    /* Broadcast matrix info */
    int minfo[6];
    if (iamroot) {
        minfo[0] = GlobalRows;
        minfo[1] = GlobalCols;
        minfo[2] = BlockRows;
        minfo[3] = BlockCols;
        minfo[4] = myrow;
        minfo[5] = mycol;
        igebs2d_(&context, "All", " ", &iSIX, &iONE, minfo, &iSIX);
    } else {
        igebr2d_(&context, "All", " ", &iSIX, &iONE, minfo, &iSIX,
                 &iZERO, &iZERO);
    }

    GlobalRows = minfo[0];
    GlobalCols = minfo[1];
    BlockRows  = minfo[2];
    BlockCols  = minfo[3];
    rootrow    = minfo[4];
    rootcol    = minfo[5];

    /* Reserve space */
    LocalRows = numroc_(&GlobalRows, &BlockRows, &myrow, &iZERO, &procrows);
    LocalCols = numroc_(&GlobalCols, &BlockCols, &mycol, &iZERO, &proccols);
    LocalMatrix = new double[LocalRows*LocalCols];

    /* Scatter matrix */
    int destr = 0, destc = 0;
    int SendRows, SendCols;
    int RecvRow = 0, RecvCol = 0;
    for (int r = 0; r < GlobalRows; r += BlockRows, destr=(destr+1)%procrows) {
        destc = 0;

        // Is this the last row bloc?
        SendRows = BlockRows;
        if (GlobalRows-r < BlockRows)
            SendRows = GlobalRows-r;

        for (int c=0; c<GlobalCols; c+=BlockCols, destc=(destc+1)%proccols) {
            // Is this the last column block?
            SendCols = BlockCols;
            if (GlobalCols-c < BlockCols)
                SendCols = GlobalCols-c;

            // Send data
            if (iamroot) {
                dgesd2d_(&context, &SendRows, &SendCols,
                         GlobalMatrix + GlobalRows*c + r,
                         &GlobalRows, &destr, &destc
                );
            }

            // Rerceive data
            if (myrow == destr && mycol == destc) {
                dgerv2d_(&context, &SendRows, &SendCols,
                         LocalMatrix+LocalRows*RecvCol+RecvRow,
                         &LocalRows, &rootrow, &rootcol
                );

                // Adjust the next starting column
                RecvCol = (RecvCol + SendCols) % LocalCols;
            }
        }

        // Adjust the next starting row
        if (myrow == destr)
            RecvRow = (RecvRow + SendRows) % LocalRows;

    }

}
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: