2011-07-12

SQL identifier case, Unicode case, and boost::to_upper, all in a fight that nobody will win

The SQL standard says that table and column names are case insensitive. Drizzle table names are Unicode, encoded as UTF-8. boost::to_upper et al. convert a std::string one char at a time, which mangles the case of UTF-8 text. Down this path there is going to be a lot of pain. My own inclination is to tell the SQL Standard to realize that it's 2011, no longer 1961, and break all the apps that are lazy about identifier case, but other people will probably disagree.
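
To make the pain concrete, here is a rough sketch of what goes wrong. It does not call Boost at all. `latin1_bytewise_upper` is a made-up stand-in that simulates char-at-a-time upper-casing under an assumed Latin-1 style locale, and `ascii_upper` is an equally hypothetical helper that folds only a-z. The table name is an invented example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: fold only ASCII a-z to A-Z and leave every other byte
// untouched, so multi-byte UTF-8 sequences pass through intact.
std::string ascii_upper(std::string s) {
    for (char& c : s) {
        if (c >= 'a' && c <= 'z') {
            c = static_cast<char>(c - ('a' - 'A'));
        }
    }
    return s;
}

// Simulation (an assumption, not Boost's code) of byte-wise upper-casing
// under a Latin-1 style locale, where bytes 0xE0..0xFE except 0xF7 count as
// lowercase letters and are mapped down by 0x20.
std::string latin1_bytewise_upper(std::string s) {
    for (char& c : s) {
        unsigned char u = static_cast<unsigned char>(c);
        if (u >= 'a' && u <= 'z') {
            u = static_cast<unsigned char>(u - 0x20);
        } else if (u >= 0xE0 && u <= 0xFE && u != 0xF7) {
            u = static_cast<unsigned char>(u - 0x20);
        }
        c = static_cast<char>(u);
    }
    return s;
}

void demo() {
    // "t" followed by TAMIL LETTER TA (U+0BA4), which is E0 AE A4 in UTF-8.
    const std::string name = "t\xE0\xAE\xA4";

    // ASCII-only folding keeps the UTF-8 sequence byte for byte.
    assert(ascii_upper(name) == "T\xE0\xAE\xA4");

    // Byte-wise folding rewrites the lead byte E0 to C0, and C0 is never a
    // valid byte in UTF-8, so the identifier is no longer decodable.
    assert(latin1_bytewise_upper(name) == "T\xC0\xAE\xA4");
}
```

Calling `demo()` passes both assertions. The ASCII-only version is not real Unicode case folding either. It just declines to touch anything it does not understand, which is the least bad thing a byte-oriented function can do.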

6 comments:

  1. My own inclination is that table names should be plain ASCII (case insensitive).

  2. Since the server doesn't know the language of the connecting client machine, I suspect the only fix is either to restrict table names to basic alphanumeric ASCII or to make only the ASCII characters in a table name case insensitive.

  3. From a server-side performance and simplicity point of view, I'd say they should be case sensitive.

  4. I agree with Mark's comment, "SQL Standard to realize that it's 2011, no longer 1961". This is a problem in the SQL standard. Only a few of the world's languages have case at all. If the standard has any respect for global citizens, identifiers should be case sensitive, with no magic.
    Otherwise Drizzle needs to stop me from naming a table in my native language by saying "Only English ASCII is allowed".

  5. 1) I agree. 2) Also, in MySQL we have a history of table names being case sensitive (i.e. table names being filenames), so I doubt it will really break that many applications.

  6. Peter da Silva, over in this discussion in G+, is making a convincing case that all SQL identifiers should be ASCII. Of course, actual data content and strings should still all be Unicode.
