How to find unique values within a string array that contains semi-colon delimited values?

evibesmusic
evibesmusic used Ask the Experts™
on
I'm using PHP and MySQL to power a small application. I'm importing data from an .ods file. The import produces ugly string values delimited by semi-colons. There are less than 1000 rows of data within the resulting table.

Example:

"SELECT DISTINCT submissions.tags FROM submissions";

#quality;#RIE;#PDSA;#6S;#MyEvent;
#quality;#RIE;#PDSA;#teamengagement;#6S;

I'm looking to extract the unique values beginning with the # character. An example of the resulting unique array would look like:

#quality;#RIE;#PDSA;#6S;#MyEvent;#teamengagement;

I'm wondering if this can be accomplished using strictly SQL or, if this would best be accomplished using PHP?

In either instance, I'm not knowledgeable of how to go about accomplishing this task?

Any guidance would be appreciated. Cheers!
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Scott FellDeveloper & EE Moderator
Fellow 2018
Most Valuable Expert 2013

Commented:
If you don't need to use the individual items in your query, it is better to do this on the server.

$str = "#quality;#RIE;#PDSA;#6S;#MyEvent;#teamengagement;";


print_r (explode("#",$str));

Open in new window


If  you want to remove the the semicolon at the end of each then
$str = "#quality;#RIE;#PDSA;#6S;#MyEvent;#teamengagement;";

$str = str_replace(";","",$str);

print_r (explode("#",$str));

Open in new window


A usage example
$str = "#quality;#RIE;#PDSA;#6S;#MyEvent;#teamengagement;";

$str = str_replace(";","",$str);

$array =  explode("#",$str);

echo "MyEvent= ".$array[5];

Open in new window

Author

Commented:
Hi Scott,

Thank you. You mention attempting to accomplish this server-side, so I assume you mean using SQL?

If not, using PHP and your examples I would:

- Create an empty array which will hold all values
- Clean up each semi-colon delineated array found in the DB and, add these values to the empty array
- Use the array_unique() function to find the unique values within the newly constructed array

This doesn't seem like the most expeditious process but, I believe it would work.

???
EE Topic Advisor
Most Valuable Expert 2014
Awarded 2013
Commented:
Splitting stings into rows within MySQL isn't trivial as there is no inbuilt "split string to rows" function (even in version 8 I think).

To do this requires in MySQL you require a "numbers table", see A MySQL Tidbit: Quick Numbers Table Generation as a guide.

If you cannot create such a table you can use a subquery to dynamically generate some.

Once you have the numbers table, then this will break each of those tags into seperated items:
select
       id
     , SUBSTRING_INDEX(SUBSTRING_INDEX(submissions.tags, ';', numbers.n), ';', -1) tag
from numbers
inner join submissions
  on CHAR_LENGTH(submissions.tags)
    -CHAR_LENGTH(REPLACE(submissions.tags, ';', '')) >= numbers.n-1
order by
  id, n

Open in new window

The result looks like this:
|    | id |       tag       |
|----|----|-----------------|
|  1 |  1 | #quality        |
|  2 |  1 | #RIE            |
|  3 |  1 | #PDSA           |
|  4 |  1 | #6S             |
|  5 |  1 | #MyEvent        |
|  6 |  1 |                 |
|  7 |  2 | #quality        |
|  8 |  2 | #RIE            |
|  9 |  2 | #PDSA           |
| 10 |  2 | #teamengagement |
| 11 |  2 | #6S             |
| 12 |  2 |                 |

Open in new window

note there may be unwanted rows which you can filter out using a where clause.

Here is an online demo of all the steps above to look at: http://rextester.com/YUL4734

(in the online demo only 1000 numbers are generated, the article by Kevin Cross generates 1,000,000 numbers)
Expert Spotlight: Joe Anderson (DatabaseMX)

We’ve posted a new Expert Spotlight!  Joe Anderson (DatabaseMX) has been on Experts Exchange since 2006. Learn more about this database architect, guitar aficionado, and Microsoft MVP.

Scott FellDeveloper & EE Moderator
Fellow 2018
Most Valuable Expert 2013

Commented:
I was using php
Commented:
Accomplish it using SQL is troublesome and hard to update the code if the format ever changes.

PHP was designed for easing operations with strings (was initially build as a templates engine for C language):

<?php

$str = '#quality;#RIE;#PDSA;#6S;#MyEvent;#quality;#RIE;#PDSA;#teamengagement;#6S;';

// find submissions
preg_match_all('/#(.*?);/', $str, $rawSubmissions, PREG_SET_ORDER);
$submissions = [];

// render unique submissions
array_map(
    function ($submission) use (&$submissions){
        if(!in_array($submission['1'], $submissions)) {
            $submissions[] = $submission['1'];
        }
    }, $rawSubmissions
);

// [optional] do operations on found submissions
sort($submissions);

print_r($submissions);

Open in new window

theGhost_k8Database Consultant

Commented:
There is this procedure to convert delimited string to rowswhich is quiet similar functionally as stated in few posts back but worth trying. You may call the procedure and execute next part on those generated tables.
PortletPaulEE Topic Advisor
Most Valuable Expert 2014
Awarded 2013

Commented:
>>"Accomplish it using SQL is troublesome and hard to update the code if the format ever changes."

I suggest it is no harder to update the SQL code that it is to update the PHP code, e.g. for a different table or column change the bold seen below.

select
       id
     , SUBSTRING_INDEX(SUBSTRING_INDEX(submissions.tags, ';', numbers.n), ';', -1) tag
from numbers
inner join submissions
  on CHAR_LENGTH(submissions.tags)
    -CHAR_LENGTH(REPLACE(submissions.tags, ';', '')) >= numbers.n-1
order by
  id, n

i.e. the structure of the SQL does not alter at all, just the relevant facts need updates

nb: If the delimiter changes there are 3 references to it ';' needs to become ',' for example

There is a one-time setup of a numbers table required - but this has a bundle of other uses as well.

Author

Commented:
Thank you, Experts.

I learned several new techniques including the creation of a numbers table which I had not considered - and - new PHP syntax that I can use to create a custom function.

I appreciate the different approaches and will use both during testing to determine which methodology provides the best performance for my application.

Cheers.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial