How to find unique values within a string array that contains semi-colon delimited values?

I'm using PHP and MySQL to power a small application. I'm importing data from an .ods file. The import produces ugly string values delimited by semi-colons. There are less than 1000 rows of data within the resulting table.

Example:

"SELECT DISTINCT submissions.tags FROM submissions";

#quality;#RIE;#PDSA;#6S;#MyEvent;
#quality;#RIE;#PDSA;#teamengagement;#6S;

I'm looking to extract the unique values beginning with the # character. An example of the resulting unique array would look like:

#quality;#RIE;#PDSA;#6S;#MyEvent;#teamengagement;

I'm wondering if this can be accomplished using strictly SQL or, if this would best be accomplished using PHP?

In either instance, I'm not knowledgeable of how to go about accomplishing this task?

Any guidance would be appreciated. Cheers!
evibesmusicAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Scott Fell, EE MVEDeveloper & EE ModeratorCommented:
If you don't need to use the individual items in your query, it is better to do this on the server.

$str = "#quality;#RIE;#PDSA;#6S;#MyEvent;#teamengagement;";


print_r (explode("#",$str));

Open in new window


If  you want to remove the the semicolon at the end of each then
$str = "#quality;#RIE;#PDSA;#6S;#MyEvent;#teamengagement;";

$str = str_replace(";","",$str);

print_r (explode("#",$str));

Open in new window


A usage example
$str = "#quality;#RIE;#PDSA;#6S;#MyEvent;#teamengagement;";

$str = str_replace(";","",$str);

$array =  explode("#",$str);

echo "MyEvent= ".$array[5];

Open in new window

0
evibesmusicAuthor Commented:
Hi Scott,

Thank you. You mention attempting to accomplish this server-side, so I assume you mean using SQL?

If not, using PHP and your examples I would:

- Create an empty array which will hold all values
- Clean up each semi-colon delineated array found in the DB and, add these values to the empty array
- Use the array_unique() function to find the unique values within the newly constructed array

This doesn't seem like the most expeditious process but, I believe it would work.

???
0
PortletPaulEE Topic AdvisorCommented:
Splitting stings into rows within MySQL isn't trivial as there is no inbuilt "split string to rows" function (even in version 8 I think).

To do this requires in MySQL you require a "numbers table", see A MySQL Tidbit: Quick Numbers Table Generation as a guide.

If you cannot create such a table you can use a subquery to dynamically generate some.

Once you have the numbers table, then this will break each of those tags into seperated items:
select
       id
     , SUBSTRING_INDEX(SUBSTRING_INDEX(submissions.tags, ';', numbers.n), ';', -1) tag
from numbers
inner join submissions
  on CHAR_LENGTH(submissions.tags)
    -CHAR_LENGTH(REPLACE(submissions.tags, ';', '')) >= numbers.n-1
order by
  id, n

Open in new window

The result looks like this:
|    | id |       tag       |
|----|----|-----------------|
|  1 |  1 | #quality        |
|  2 |  1 | #RIE            |
|  3 |  1 | #PDSA           |
|  4 |  1 | #6S             |
|  5 |  1 | #MyEvent        |
|  6 |  1 |                 |
|  7 |  2 | #quality        |
|  8 |  2 | #RIE            |
|  9 |  2 | #PDSA           |
| 10 |  2 | #teamengagement |
| 11 |  2 | #6S             |
| 12 |  2 |                 |

Open in new window

note there may be unwanted rows which you can filter out using a where clause.

Here is an online demo of all the steps above to look at: http://rextester.com/YUL4734

(in the online demo only 1000 numbers are generated, the article by Kevin Cross generates 1,000,000 numbers)
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
The Five Tenets of the Most Secure Backup

Data loss can hit a business in any number of ways. In reality, companies should expect to lose data at some point. The challenge is having a plan to recover from such an event.

Scott Fell, EE MVEDeveloper & EE ModeratorCommented:
I was using php
0
Ionut A. TudorPHP ProgrammerCommented:
Accomplish it using SQL is troublesome and hard to update the code if the format ever changes.

PHP was designed for easing operations with strings (was initially build as a templates engine for C language):

<?php

$str = '#quality;#RIE;#PDSA;#6S;#MyEvent;#quality;#RIE;#PDSA;#teamengagement;#6S;';

// find submissions
preg_match_all('/#(.*?);/', $str, $rawSubmissions, PREG_SET_ORDER);
$submissions = [];

// render unique submissions
array_map(
    function ($submission) use (&$submissions){
        if(!in_array($submission['1'], $submissions)) {
            $submissions[] = $submission['1'];
        }
    }, $rawSubmissions
);

// [optional] do operations on found submissions
sort($submissions);

print_r($submissions);

Open in new window

0
theGhost_k8Database ConsultantCommented:
There is this procedure to convert delimited string to rowswhich is quiet similar functionally as stated in few posts back but worth trying. You may call the procedure and execute next part on those generated tables.
0
PortletPaulEE Topic AdvisorCommented:
>>"Accomplish it using SQL is troublesome and hard to update the code if the format ever changes."

I suggest it is no harder to update the SQL code that it is to update the PHP code, e.g. for a different table or column change the bold seen below.

select
       id
     , SUBSTRING_INDEX(SUBSTRING_INDEX(submissions.tags, ';', numbers.n), ';', -1) tag
from numbers
inner join submissions
  on CHAR_LENGTH(submissions.tags)
    -CHAR_LENGTH(REPLACE(submissions.tags, ';', '')) >= numbers.n-1
order by
  id, n

i.e. the structure of the SQL does not alter at all, just the relevant facts need updates

nb: If the delimiter changes there are 3 references to it ';' needs to become ',' for example

There is a one-time setup of a numbers table required - but this has a bundle of other uses as well.
0
evibesmusicAuthor Commented:
Thank you, Experts.

I learned several new techniques including the creation of a numbers table which I had not considered - and - new PHP syntax that I can use to create a custom function.

I appreciate the different approaches and will use both during testing to determine which methodology provides the best performance for my application.

Cheers.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Query Syntax

From novice to tech pro — start learning today.