How to find word longer than 42 characters

deriklogov
deriklogov used Ask the Experts™
on
Hello,


I need your help ,
I am exporting items from mysql database to xml file , I am using php script,
2 fields contains some texts --  Title and Description.
I need to skip items where any words in those fields are more than 42 characters in size.
So lets say I got database
id | title | description
1 | Hello | Basic Descri
2 | Veryloooooooooooooooooo4234oooo423432oo423423oooooongngngngngg| Description
3 |hey | very easy

How we can skip items from exporting them to xml file (in our example item where so long words are present)
?




Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Reza RadConsultant, Trainer

Commented:
you can specify them in your query, like this:
select * from yourtable where length(description)>=42

Author

Commented:
the thing is , its not the total size of description , 42 is maximum size of any word in description or title
Richard QuadlingSenior Software Developer

Commented:
A long word in a column, so a lot of small words would be OK, even if the total length is greater than 42.

Hmmm.

I have to ask why? I know it isn't part of your question, but considering you are putting the data to XML, what does it matter?

Would simply using the first 38 characters, followed by " ..." be enough to fulfil the criteria?

Are you using mysql to generate the XML or are you passing normal rowsets to PHP and using PHP to generate the XML?

Assuming the later (i.e using PHP to generate the XML), and assuming any non-letter/digit is used as separator, then something like this ...


Not particularly elegant. Could be done with the array_filter/array_map functions probably a lot nicer.

Outputs ...

Array
(
    [0] => Array
        (
            [0] => 1
            [1] => Hello
            [2] => Basic Description
        )

    [2] => Array
        (
            [0] => 3
            [1] => hey
            [2] => very easy
        )

)
<?php
$Rows = array
	(
	array(1, 'Hello', 'Basic Description'),
	array(2, 'Veryloooooooooooooooooo4234oooo423432oo423423oooooongngngngngg', 'Basic Description'),
	array(3, 'hey', 'very easy'),
	);

foreach($Rows as $Row => $Data)
	{
	foreach($Data as $Column => $Value)
		{
		if (is_string($Value))
			{
			$Words = str_word_count($Value, 1, '0123456789');
			foreach($Words as $Word)
				{
				if (strlen($Word) > 42)
					{
					unset($Rows[$Row]);
					break 2;
					}
				}
			}
		}
	}

print_r($Rows);

Open in new window

Learn Ruby Fundamentals

This course will introduce you to Ruby, as well as teach you about classes, methods, variables, data structures, loops, enumerable methods, and finishing touches.

Reza RadConsultant, Trainer

Commented:
>>the thing is , its not the total size of description , 42 is maximum size of any word in description or title
so , it can be :
select * from yourtable where length(description)>=42 or length(title)>=42

Author

Commented:
to RQuadling:

I think you are very close to solution, I am using MYSQL and PHP to export data from mysql to xml.
in you example you putting those words in array, but could you exlpain how its going to look if we doing:

$SQL    = "SELECT * FROM `table` WHERE 1 ";
$result = mysql_query($SQL);
while ($row = mysql_fetch_array($result)) {


//////// So here we need to put IF statement which will check the size of every word in following columns
$row['title']
$row['description']
If any of those columns have words with more than 42 symbols words we skipping column





}
Senior Software Developer
Commented:
Try this - untested as I'm not actually getting any data.Are you wanting the COLUMN to be empty for a single word >42 or the entire ROW?
<?php
$SQL    = "SELECT * FROM `table` WHERE 1 ";
$result = mysql_query($SQL);

$Rows = array();

while (False !== ($Data = mysql_fetch_array($result)))
	{
	$Ignore = False;
	foreach(array('title', 'description') as $Column)
		{
		$Words = str_word_count($Data[$Column], 1, '0123456789');
		foreach($Words as $Word)
			{
			if (strlen($Word) > 42)
				{
				$Ignore = True;
				}
			}
		}
	if (!$Ignore)
		{
		$Rows[] = $Data;
		}
	}

print_r($Rows);

Open in new window

Reza RadConsultant, Trainer

Commented:
why you want to fetch all data and then loop through them and export only the rows that is valid with your criteria, this will not led to good performance!
the only thing you need is to replace your current select query with the query below:
$SQL    = "SELECT * FROM `table` WHERE 1 ";
$result = mysql_query($SQL);
while ($row = mysql_fetch_array($result)) {

$SQL    = "select * from table where length(description)>=42 or length(title)>=42";
$result = mysql_query($SQL);
while ($row = mysql_fetch_array($result)) {

Open in new window

Author

Commented:
you can not compare full length of description to 42, description could be  even 200, only words prohibited longer than 42
Reza RadConsultant, Trainer

Commented:
so, use trim to get only words length

$SQL    = "select * from table where length(trim(description))>=42 or length(trim(title))>=42";
$result = mysql_query($SQL);
while ($row = mysql_fetch_array($result)) {

Open in new window

Richard QuadlingSenior Software Developer

Commented:
You could do the whole thing in T-SQL. I use MS SQL, but I assume you have a similar syntax for mysql.

As I understand things, you'd have to use a cursor to read 1 row at a time.

And, as I understand things and have been recommended, cursors are notoriously awkward in a multi-user environment and are &quot;not the way to do things&quot;. But I don't know.

But.

I have a function which takes a value and splits it into a table. I use it to supply a comma separated list to a insert/update function.

The app's front-end allows a user to select a list of reports to run.

Rather than many INSERT statements, I use a SP to handle the list.

I've included the function below ( which is based upon an answer I got from AngelIII a long time ago).

Using it on a single value ...

DECLARE @Description VARCHAR(MAX)

SELECT @Description = 'This is a description with OneReallyStupidlyLongDescriptionInItIHopeThisIsGreaterThan42CharactersOtherwiseIWillHaveToAddSomeGarbageToIt'

select max(len(value)) from master.dbo.ParametersToList (@Description,' ')

results in

108.

So, something like ...

Select id, title, description
from table
where (select max(len(value)) from master.dbo.ParametersToList(title, ' ') < 42
and (select max(len(description)) from master.dbo.ParametersToList(title, ' ') < 42


___COULD___ be the answer for a pure SQL answer.

Downside? The code is only geared to a single separator. No regex or multiple choice separators.

If you have ....


&quot;PartOfAShortDesc.MoreLongDescButDifferentWord&quot;

The middle . WON'T act as a word break.



USE [master]
GO
/****** Object:  UserDefinedFunction [dbo].[ParametersToList]    Script Date: 02/01/2010 09:48:57 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER FUNCTION [dbo].[ParametersToList] (@s_Parameters VARCHAR(MAX) = NULL, @s_Separator VARCHAR(MAX) = ',')
RETURNS @tbl_List TABLE
	(
	Value VARCHAR(MAX)
	)
AS
BEGIN
	DECLARE
		@tbl_TempList TABLE
			(
			Value VARCHAR(MAX)
			)
	DECLARE
		@s_Value VARCHAR(MAX),
		@i_Pos INT

	SET @s_Parameters = LTRIM(RTRIM(@s_Parameters)) + @s_Separator
	SET @i_Pos = CHARINDEX(@s_Separator, @s_Parameters, 1)

	IF REPLACE(@s_Parameters, @s_Separator, '') <> ''
		BEGIN

		WHILE @i_Pos > 0
			BEGIN

			SET @s_Value = LTRIM(RTRIM(LEFT(@s_Parameters, @i_Pos - 1)))

			IF ISNULL(@s_Value, '') <> ''
				BEGIN
				INSERT INTO
					@tbl_TempList(Value)
					VALUES (@s_Value)
				END

			SET @s_Parameters = RIGHT(@s_Parameters, LEN(@s_Parameters) - @i_Pos - LEN(@s_Separator) + 1)
			SET @i_Pos = CHARINDEX(@s_Separator, @s_Parameters, 1)

			END
		END

	INSERT
		@tbl_List
	SELECT
		Value
	FROM
		@tbl_TempList

	RETURN
END

Open in new window

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial