Saturday 28 May 2011

Don't Decay your Arrays!

OK, this is nothing new: the technique is old, it will soon be obsolete thanks to the new std::tr1::array, and why should we use naked arrays anyhow? But recently I couldn't write it down as easily as I would have liked, and as it's still useful in pre-C++2011 code, why not discuss it briefly here?

Imagine you have to implement a Netbios/SMB remote file access protocol. One problem we face while doing this is trivial but important: we need to log the different packet types, operation codes and command options for debugging. They are all defined as integer values, and C++ (like every other language I know) doesn't offer much support for printing their names. So we define our own mini framework using macros. Don't panic, macros aren't as bad as they are made out to be, and in our use case they are a perfect fit: translating between a code (an integer value) and a string (its description). It's like eval(), but in the other direction :). It works like this:
  // trace helper tables
  #define DEF_DESCRIPTION_ROW(a) {a, #a}
  struct CmdNameEntry { int cmdId; string cmdName; };

  const CmdNameEntry g_cmdNameTableNetbios[] = {
      DEF_DESCRIPTION_ROW(SESS_MESSAGE),
      DEF_DESCRIPTION_ROW(SESS_REQUEST),                             
      ...
  };
  const CmdNameEntry g_cmdNameTableSmb[] = { 
      ...
  };
  const CmdNameEntry g_cmdNameTableTrans[] = { 
      ...
  };
  const CmdNameEntry g_cmdNameTableTrans2[] = {                 
      ...
  };    
Got it? We use the preprocessor's stringize operator (#) to convert a constant's name to a string at compile time. As you can see, we need quite a few tables, and each table has many entries (you'll have to believe me, Netbios & Co. are a rather chatty bunch). So why are we using naked arrays? Because the std::vector class didn't have a convenient initializer before C++2011*, a shame! Alternatively, you could copy the array's contents to a fresh vector just to pass it on for further processing, but how ugly is that?
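
To make the stringize trick concrete, here is a minimal, self-contained sketch. Note the assumptions: the two enum values are placeholders invented for illustration (they are not the real Netbios session codes), and g_cmdNameTableDemo and g_cmdNameTableDemoSize are hypothetical names that do not appear in the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Placeholder codes for illustration only; NOT the real Netbios values.
enum { SESS_MESSAGE = 1, SESS_REQUEST = 2 };

// #a turns the macro argument into a string literal as it was spelled.
#define DEF_DESCRIPTION_ROW(a) { a, #a }

struct CmdNameEntry { int cmdId; std::string cmdName; };

// Hypothetical demo table, built the same way as the tables above.
const CmdNameEntry g_cmdNameTableDemo[] = {
    DEF_DESCRIPTION_ROW(SESS_MESSAGE),   // expands to { SESS_MESSAGE, "SESS_MESSAGE" }
    DEF_DESCRIPTION_ROW(SESS_REQUEST),
};

// The classic C idiom for the element count; it only works where the
// array type (and not just a pointer) is still visible.
const std::size_t g_cmdNameTableDemoSize =
    sizeof(g_cmdNameTableDemo) / sizeof(g_cmdNameTableDemo[0]);
```

Each row pairs the integer code with its own name as text, so adding a new command means adding exactly one line to the table.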

Now you'd need to write something like this to translate between integer codes and their descriptions:
  getCmd(g_cmdNameTableSmb, sizeof(g_cmdNameTableSmb)/sizeof(CmdNameEntry)),
because in C/C++ you cannot define a true function on arrays like this:
  getCmd(CmdNameEntry table[N]).
Well, it seems you can write that after all, but it's of no use anyway because (as you know) an array parameter is really just a pointer in disguise: the array decays on its way into the function and the size information is lost. I don't know how other people feel about it, but after some time I got pretty annoyed with that. Couldn't we wrap this in a macro or, even better, use some template trickery? Let's try and write a small utility:
  template<typename T, size_t N>
    string SmbProtocol::getCmdName(int code, T(&table)[N], const char* errorStr)
  {
      for(size_t i = 0; i < N); i++) // <- oops!!! compile error!
      {
          if(table[i].cmdId == code)
              return table[i].cmdName;
      }    
      return errorStr;
  }
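Uh-oh, that doesn't compile! To be fair, the compiler is only tripping over the stray parenthesis after N: N is a compile-time constant, and once the parenthesis is gone, i < N is perfectly legal inside the template. But N is only visible where it has been deduced; anywhere else we hold just the array, and its size lives in its type. What we need is something to translate between type and value, i.e. a bridge between compile time and run time. That's not as complicated as it sounds, just write**: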
  namespace util
  {
  template<typename T, size_t N>
    size_t array_size(T(&)[N]) { return N; }      
  }
As it is a template, it can access the type information, and as it is a function, it can return a value. Bridge ready! Now we only need to replace the incorrect line with:
  for(size_t i = 0; i < util::array_size(table); i++)
and now we are ready to go:
  string SmbProtocol::getNetbiosCmdName(int code)
  {
      return getCmdName(code, g_cmdNameTableNetbios, "UNKNOWN_SESS_MSG");
  }
Now we can define new description tables, forward them to functions using the pair of T(&table)[N] and util::array_size(), and let our pre-C++2011 code look a little more elegant.
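
For reference, the pieces can be put together in one compilable sketch. This is only an illustration under stated assumptions: the lookup is written as a free function rather than as a member of SmbProtocol, and g_demoTable with its codes and names is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct CmdNameEntry { int cmdId; std::string cmdName; };

// Hypothetical demo table; the codes and names are invented for this sketch.
const CmdNameEntry g_demoTable[] = {
    { 10, "DEMO_OPEN" },
    { 20, "DEMO_READ" },
    { 30, "DEMO_CLOSE" },
};

namespace util
{
// Binding the array by reference keeps its extent N as part of the type,
// so the function can hand it back as an ordinary value.
template<typename T, std::size_t N>
std::size_t array_size(T(&)[N]) { return N; }
}

// Free-function variant of the lookup (an assumption; the post uses a member).
// T is deduced as const CmdNameEntry, N as the number of rows.
template<typename T, std::size_t N>
std::string getCmdName(int code, T(&table)[N], const char* errorStr)
{
    for (std::size_t i = 0; i < util::array_size(table); i++)
    {
        if (table[i].cmdId == code)
            return table[i].cmdName;
    }
    return errorStr;   // no row matched the code
}
```

A call such as getCmdName(20, g_demoTable, "UNKNOWN") needs no explicit size argument. Passing a plain pointer like &g_demoTable[0] is rejected at compile time, because no array extent can be deduced from it, which is exactly the safety net we wanted.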
--
* Have a look at a previous installment of this blog for the new C++0x initialization syntax.
** If you wonder about the array parameter syntax, look here for an explanation: template-parameter-deduction-from-array-dimensions
