MySQL: Duplicate Entry Error when handling Varchar Primary Keys

| No TrackBacks
mysql.pngIn the last week or so, I worked on a very interesting problem that dealt with using varchar's as a primary key in MySQL.  I wrote some Java code that was migrating sensitive user data from a SQLite database to a MySQL database.  I was inserting the migrated data into a MySQL table that declared a varchar field as a primary key.  On several of the inserts, I received an "ERROR 1062 (23000): Duplicate entry 'bleh' for key 1".  I was puzzled, because the primary key I was inserting into the table was the same string of characters, just with different case.  For example, one key was "Ozzie" and another was "ozzie".  Technically speaking, these SHOULD be different primary keys (and they are) but MySQL didn't immediately think so.

Continue reading for an explanation.
As it turns out, you should use the BINARY operator in MySQL when creating your tables that use a varchar as, or in, a primary key.  According to http://lists.mysql.com/mysql/170390 you can use the BINARY keyword "... to cause sorting and comparison to be case sensitive using the underlying character code values rather then a lexical ordering."

For example, this will fail even though "Ozzie" and "ozzie" are technically separate primary keys:

mysql> create table cat ( name varchar(20), primary key ( name ) );
Query OK, 0 rows affected (0.04 sec)

mysql> insert into cat ( name ) values ( 'ozzie' );
Query OK, 1 row affected (0.00 sec)

mysql> insert into cat ( name ) values ( 'Ozzie' );
ERROR 1062 (23000): Duplicate entry 'Ozzie' for key 1


Declaring the varchar name field BINARY when creating the table solves this problem:

mysql> create table cat ( name varchar(20) BINARY, primary key ( name ) );
Query OK, 0 rows affected (0.04 sec)

mysql> insert into cat ( name ) values ( 'ozzie' );
Query OK, 1 row affected (0.00 sec)

mysql> insert into cat ( name ) values ( 'Ozzie' );
Query OK, 1 row affected (0.00 sec)

I'm not sure why MySQL, by default, uses a lexical ordering instead of a more appropriate approach like comparing the underlying character code values for primary keys.

Anyways, enjoy.

Did You Find this Helpful?

Did you find this post helpful, or at least, interesting?

  

Send Mark a Direct Message

If you'd like to send me a direct message, please do so below. However, I do not publicly post comments or messages submitted directly to me. So, if you're going to try to SPAM me, or my blog, you're pretty much wasting your time.

400 characters remaining

Error

About Mark

A Silicon Valley native, Mark Kolich is a full-time Software Engineer, a casual entrepreneur, and a consultant for hire. A web technologies expert, his current focus is on building powerful and robust cloud-driven web-applications using Java, PHP, Perl, AJAX, DHTML, CSS, and JavaScript. His favorite programming languages are PHP, Java and JavaScript. He uses Linux, enjoys biking to work, loves building great software, and always writes elegant, readable, and maintainable code.

No TrackBacks

No trackbacks attached to this entry.

Twitter (@markkolich)

Translate

About this Entry

This page contains a single entry by Mark Kolich published on October 29, 2008 5:25 PM.

Warning on Registering .pro Domain Names: Process is Quite .lame was the previous entry in this blog.

HOWTO: Generating Many Good Random Passwords 'Automagically' using Bash is the next entry in this blog.

Find recent content on the main index or look in the archives to find all content.