Solved

MySQL: Inserting Unique Column Value

Posted on 2008-06-15
11
953 Views
Last Modified: 2008-06-18
Hi,
Scenario: I have a database table that stores blog posts. It looks like this:

CREATE TABLE `posts` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`title` VARCHAR( 50 ) NOT NULL ,
`safe_title` VARCHAR( 60 ) NOT NULL ,
`content` TEXT NOT NULL ,
`user_id` INT UNSIGNED NOT NULL ,
UNIQUE (`safe_title`)
) ENGINE = MYISAM ;

The `safe_title` column contains the post title in a URL safe format. That is, the post title with all special characters removed. So a post title of "Hello World" becomes "hello_world". It's important that every post has a unique safe title, even if some posts have the same title. Post #1 with the title "Hello World" would have the safe title of "hello_world". Post #2 with the title "Hello World" would have the safe title "hello_world_2", and so on.

The problem: How to insert a unique safe title in only one or two db transactions? I don't want to create code like this:

$found = true;
$counter = 0;
do {
      ++$counter;
      $safe_title = getSafeTitle('Hello World') . '_' . $counter;
      $results = $db->execute("SELECT FROM `posts` WHERE `safe_title` = '$safe_title' LIMIT 1");
      if (count($results) == 0) {
            $found = false;
      }
} while ($found == true);
$db->insert("INSERT INTO ....") // Insert post with safe title

That little bit of code could loop through a dozen times before a unique title is found. Even worse, by the time $db->insert() is called, the safe title that I've found may no longer be unique (If the system is inserting thousands of new blog posts a minute).

So how do I avoid this problem?
0
Comment
Question by:headzoo
  • 4
  • 3
  • 2
  • +1
11 Comments
 
LVL 2

Expert Comment

by:pain_is_relative
ID: 21791106
you are painting yourself into a corner.

lets see...you could attach an md5 hash of the current date after the name.... ugly.

you could make the safe-title a unique primary key, but that could cause you problems... old posts getting over written, errors writing to table. etc.

Alternatively you could reference the page by its primary key ... (id), and let the posts share names. So your posts could be referebced by id=678, id=56789, id=3. that way you would not be limiting your posts to having unique names.
0
 

Author Comment

by:headzoo
ID: 21791270
Hi pain_is_relative,
I wish it could be that simple. But clients today demand SEO friendly URLs, and not URLs that simply work (and work well).

I figure I could ditch the loop, and change my code to something like this:

$safe_title = 'hello_world';
$results = $db->execute(
      "SELECT * FROM `posts` WHERE `safe_title` REGEXP '^$safe_title(_[0-9]+)?$' LIMIT 1"
);

/**
 * Check to see if we found another post with the same
 * safe title
 */
if (count($results) > 0) {
      preg_match('/^(.*?)(_([0-9]))?$/', $results->safe_title, $matches);
      
      /**
       * If the safe title already has a _[0-9] appended to
       * it, we need to increment the [0-9] by one. Otherwise
       * add _1 to the safe title.
       */
      if (!empty($matches[2])) {
            $safe_title = $matches[0] . '_' . ($matches[2] + 1);
      } else {
            $safe_title = $matches[0] . '_1';
      }
}

$db->insert("INSERT INTO ....") // Insert post with safe title

That still doesn't solve the problem of another blog post with the same safe title being inserted microseconds before I call $db->insert(). Also I've never used regular expressions in a SELECT statement before, and I don't know what the performance would be like on large data sets.
0
 
LVL 2

Expert Comment

by:pain_is_relative
ID: 21791322
You could always add the session id to the date and then md5 it then add it to the safe name... but yess.. its ugly...

you could invesitigate using modrewrite, but i am not a fan of that.

i still think that a url of : blah/blah/blogpost.php?id=678 still looks reasonable.
you could even do this:
blah/blah/blogpost.php?id=678&name=my_safe_name_url_is_ace
that way you preserve the page name, and display it in the url, without ever using it. (just faking it it)
0
 
LVL 3

Accepted Solution

by:
pzurowski earned 500 total points
ID: 21791428
Count total pages which begins with safe hello_world. If there will be none, so you have good title, if not you can attach total number of existing similar titles to the title. (If there is one, it will be without number, so the number will be 1; if there is two, there willbe wihout number and with _1, so the total is 2 and you can attach _2; ...so on..)

You can change
   LIKE '$safe_title%'
below to
   RLIKE '^${safe_title}_?[0-9]*\$'
to get more accurate results.
$safe_title = getSafeTitle('Hello World');

$results = $db->execute("SELECT count(*) AS cnt FROM `posts` WHERE `safe_title` LIKE '$safe_title%'");
 

$count = $result->getOne(); // or other function, which get "cnt" from query above
 

if( $count > 0 ){

   $safe_title .= "_" . $count;

}
 

$db->insert("INSERT INTO ....") // Insert post with safe title

Open in new window

0
 

Author Comment

by:headzoo
ID: 21791524
@pzurowski -
You gave me an idea. I could try SQL like this:

INSERT INTO `posts` (`id`, `title`, `content`, `user_id`, `safe_title`)
SELECT
null,
'Hello World',
'This is a test',
1,
IF(COUNT(*) > 1, CONCAT('hello_world_', COUNT(*)), 'hello_world')
FROM `posts` WHERE `safe_title` REGEXP '^hello_world(_[0-9]+)?$'
LIMIT 1

That would insert the blog post, and generate the safe title all in one transaction. This would be very quick, and it becomes unlikely that another post with the same safe title would be inserted before the transaction is complete.

Any thoughts on this kind of SQL?
0
Complete Microsoft Windows PC® & Mac Backup

Backup and recovery solutions to protect all your PCs & Mac– on-premises or in remote locations. Acronis backs up entire PC or Mac with patented reliable disk imaging technology and you will be able to restore workstations to a new, dissimilar hardware in minutes.

 
LVL 3

Assisted Solution

by:pzurowski
pzurowski earned 500 total points
ID: 21791659
think about seperating count and safe url, i mean:

CREATE TABLE `posts` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`title` VARCHAR( 50 ) NOT NULL ,
`safe_title` VARCHAR( 60 ) NOT NULL ,
`which` INT NOT NULL, --- added here
`content` TEXT NOT NULL ,
`user_id` INT UNSIGNED NOT NULL ,
UNIQUE (`safe_title`, `which`) --- added here
) ENGINE = MYISAM ;

and next will be simplier and faster
INSERT INTO `posts` (`id`, `title`, `content`, `user_id`, `safe_title`, `which`)

SELECT

null,

'Hello World',

'This is a test',

1,

COUNT(*) WHERE `safe_title` = 'hello_world'
 

SELECT `id`, IF( `which` > 0, CONCAT( `safe_title`, "_", `which` ), `safe_title` ) FROM `posts`

Open in new window

0
 

Author Comment

by:headzoo
ID: 21791678
@pzurowski -
I assume this is better performance wise, because I'm not requesting a count of all matching rows twice?
0
 
LVL 3

Expert Comment

by:pzurowski
ID: 21791696
First of all, because there is no regexp, which are slower. Moreover you can quicker read it, and it will be faster to expand or bug fix in future :)
0
 
LVL 2

Expert Comment

by:raheel_lips
ID: 21792213
Hi headzoo,

$safe_title = getSafeTitle('Hello World'); // this will give you white spaces removed string.
 

$results = $db->execute("SELECT count(*)+1 AS total FROM `posts` WHERE `safe_title` LIKE '$safe_title%'");
 

if( $count > 2 ){

   $safe_title .= "_" . $count;

} 

 

$db->insert("INSERT INTO ....") // Insert post with safe title

Open in new window

0
 
LVL 2

Expert Comment

by:raheel_lips
ID: 21792225
I missed some comments in my last post so consider these too.

// Above mentioned code is utilizing only 2 hits in post table.
1) check existing
2) Insert valid information

** There is no way to sence the duplicate data during insertion of post in table.

/* You would use one table transaction for check duplication before inserting data in table.
0
 

Author Comment

by:headzoo
ID: 21792292
@raheel_lips

The problem with using LIKE 'hello_world%', is it would also match against safe titles like "hello_world_how_are_you", etc.

@pzurowski

I played around with your SQL, and I can't tell what it's supposed to do. It appears to be completely invalid and/or doesn't work no matter how I rearranged it.

@everyone

It dawned on me that the request URLs would look like this:

http://site.com/users/headzoo/posts/hello_world

that is:

http://site.com/usres/[name of user]/posts/[post safe title]

Which means the user and the safe title could be combined into a unique index. So I changed the table to this:

CREATE TABLE `posts` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`title` VARCHAR( 50 ) NOT NULL ,
`safe_title` VARCHAR( 60 ) NOT NULL ,
`content` TEXT NOT NULL ,
`user_id` INT UNSIGNED NOT NULL ,
UNIQUE (`safe_title`, `user_id`) -- changed here
) ENGINE = MYISAM ;

Because it's impossible for the same user to submit more than one post with the same title, at the same time, I can safely write my code like this:

$safe_title = 'hello_world';
$results = $db->execute("SELECT COUNT(*) FROM `posts` WHERE `safe_title` = '$safe_title'");
if (count($results) > 0) {
      $safe_title .= '_' . (count($results) + 1);
}
$db->insert("INSERT INTO ....") // Insert post with safe title

I don't have to worry about another post with the same safe title being inserted prior to calling $db->insert(), because I can have that. Just so long as it's a different user.

A couple people here gave me the ideas for my final code/sql. So I need to split the points up some way.
0

Featured Post

6 Surprising Benefits of Threat Intelligence

All sorts of threat intelligence is available on the web. Intelligence you can learn from, and use to anticipate and prepare for future attacks.

Join & Write a Comment

Loading csv or delimited data files to MySQL database is a very common task frequently questioned about and almost every time LOAD DATA INFILE comes to the rescue. Here we will try to understand some of the very common scenarios for loading data …
Creating and Managing Databases with phpMyAdmin in cPanel.
In this tutorial you'll learn about bandwidth monitoring with flows and packet sniffing with our network monitoring solution PRTG Network Monitor (https://www.paessler.com/prtg). If you're interested in additional methods for monitoring bandwidt…
This video explains how to create simple products associated to Magento configurable product and offers fast way of their generation with Store Manager for Magento tool.

759 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now