Find a product from an incorrectly entered product code or model using MySQL REGEXP (Regular Expressions)
Ever been searching for a particular product online by a model or product code? It could be a brand spanking new super thin LCD or the latest Adidas Predator football boots. You know exactly what you want and you are armed with the model or code of your product. You are now going to shop around looking for the best price and check out the reviews. You don’t need to browse for the product by name or go to an online shop and find after browsing that they don’t sell the exact thing you are looking for as you know what you want.
Have you ever found that you don’t actually have the correct code or model. You are sure you have it, but you just can’t find it. Do you give up? The problem is that you may have seen a model or code that has a “0″ (the number zero) in it and you have it down as an “O” (the letter o). That along with a number of other similarly looking combinations can mean you don’t find your product online. Take for example the product code “C602E/SS”. This could also be “C6O2E/SS” or even “C602ESS” without the forward slash.
Think about how that would affect you if you were an online business. You may have got the model or code from an advert or something, but simply wrote it down incorrectly. The way around this for an online shop is to search for all the different possible combinations of a searched model or code and return the results. Basically if you type in the wrong code you should still get the results.
We will look at swapping the following combinations:
- 1 to l and l to 1
- 1 to i and i to 1
- l to i and i to l
- 0 to o and o to 0
- 5 to s and s to 5
- 2 to z and z to 2
- 8 to b and b to 8
- removing /
- removing .
- removing spaces
To do this we could use PHPs string_replace and build all the possible string combinations within a MySQL OR statement. The problem with this is that if you use the 10 replacements above for each letter the longer the model or code the bigger the number of combinations there are. In this example we have an eight letter code. This results in an 8 to the power of 10 result: 1,073,741,824 possible combinations. This is why regular expressions are so powerful. Let’s see how.
// Get our search from the post
$product_code = $_GET['product_code'];
// Setup our string that will hold the regular expression.
$regexp_search = '';
// Firstly we need to remove certain characters visitors may
// add that are not in the product code.
$replacements = array('/', '-', '.');
// The regular expressions will handle the uppercase and
// lowercase combinations so lets make it all lowercase.
$search = strtolower($product_code);
// Do an extra clean to make sure we have no special characters.
// Most product codes aren't going to have special characters.
$search = preg_replace("/[^a-z0-9]?/", "", $search);
// Split the product code into an array of letters so we can go
// through them individually.
$search_parts = str_split($search);
// Go through each letter and check to see what the alternatives
// could be and build the regexp search.
// For each letter also add the possibility that a ".", "-", or "/" may
// proceed the letter. This is useful if a visitor enters a code, but
// forgets a ".", "-", or "/"
foreach ($search_parts as $search_part)
{
switch ($piece_part)
{
// Find and upper and lower possible combinations for 1
case "1":
$regexp_search .= "(1|l|i|L|I|1/|l/|i/|L/|I/|1-|l-|i-|L-|I-|1.|l.|i.|L.|I.)";
break;
// Find and upper and lower possible combinations for l
case "l":
$regexp_search .= "(1|l|i|L|I|1/|l/|i/|L/|I/|1-|l-|i-|L-|I-|1.|l.|i.|L.|I.)";
break;
// Find and upper and lower possible combinations for i
case "i":
$regexp_search .= "(1|l|i|L|I|1/|l/|i/|L/|I/|1-|l-|i-|L-|I-|1.|l.|i.|L.|I.)";
break;
// Find and upper and lower possible combinations for 0
case "0":
$regexp_search .= "(0|o|O|0/|o/|O/|0-|o-|O-|0.|o.|O.)";
break;
// Find and upper and lower possible combinations for o
case "o":
$regexp_search .= "(0|o|O|0/|o/|O/|0-|o-|O-|0.|o.|O.)";
break;
// Find and upper and lower possible combinations for 5
case "5":
$regexp_search .= "(5|s|S|5/|s/|S/|5-|s-|S-|5.|s.|S.)";
break;
// Find and upper and lower possible combinations for s
case "s":
$regexp_search .= "(5|s|S|5/|s/|S/|5-|s-|S-|5.|s.|S.)";
break;
// Find and upper and lower possible combinations for 2
case "2":
$regexp_search .= "(2|z|Z|2/|z/|Z/|2-|z-|Z-|2.|z.|Z.)";
break;
// Find and upper and lower possible combinations for z
case "z":
$regexp_search .= "(2|z|Z|2/|z/|Z/|2-|z-|Z-|2.|z.|Z.)";
break;
// Find and upper and lower possible combinations for 8
case "8":
$regexp_search .= "(8|b|B|8/|b/|B/|8-|b-|B-|8.|b.|B.)";
break;
// Find and upper and lower possible combinations for b
case "b":
$regexp_search .= "(8|b|B|8/|b/|B/|8-|b-|B-|8.|b.|B.)";
break;
// Build the combination with upper and lowercase options
default:
$regexp_search .= "(".$search_part."|".strtoupper($search_part)."|"
.$search_part."/|".strtoupper($search_part)."/)";
break;
}
}
// Build the mysql query
$mysql = "SELECT * FROM products WHERE (product_code REGEXP '".$regexp_piece."');";




