Extracting a set number of words

I've been looking into creating a news script. On the main page it will display the first 200 words of each article and give a link to the whole article. What would be the best way of doing this?

I've been told that I could extract the first 200 characters but not words. If I extract the first 200 characters, is there some way to prevent it from stopping mid-word? I can post the entire script if needed, but for now I'm only looking for an example.

Basically, when the top story article is posted through a form, it will be processed and saved. During the saving process the script should clip the first 200 words (or, failing that, characters) and save that along with other information to a file:

Author|Date|Time|UserID|Title|Link|((Extracted200WordsHere))

Thanks in advance.
Asked by: KenHeckert
 
jmcg (Owner) commented:
Depending on your definition of "words" you might be satisfied with something like the following:

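my @tokens = split ' ', $content, 201;   # a 201st field, if present, holds the unsplit remainder
$words = join ' ', @tokens[0 .. (@tokens > 200 ? 199 : $#tokens)];

This keeps the first 200 whitespace-delimited tokens of the string $content. It does not remove HTML markup or anything like that, so you may need to prepare the content of the article to ensure that the first 200 words are something meaningful.

The third argument to 'split' limits the number of fields it produces: splitting stops once that many have been generated, and the last field holds the rest of the string unsplit. That is why the code asks for 201 fields and keeps only the first 200.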
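
The same idea can be wrapped in a small subroutine, sketched here under a few assumptions: `first_words` is a hypothetical helper name (it is not part of the original script), and "word" means any run of non-whitespace characters.

```perl
use strict;
use warnings;

# Return the first $n whitespace-delimited words of $text, joined by single spaces.
# first_words is a hypothetical helper name used for illustration only.
sub first_words {
    my ($text, $n) = @_;
    # Ask for one extra field: with a LIMIT, split leaves the unsplit
    # remainder of the string in the last field it returns.
    my @words = grep { length } split ' ', $text, $n + 1;
    # If there were more than $n words, the last element is that remainder.
    pop @words if @words > $n;
    return join ' ', @words;
}

print first_words("  one two\nthree four five ", 3), "\n";   # prints "one two three"
```

Because the words are re-joined with single spaces, any newlines in the article are removed from the excerpt, which suits a one-record-per-line file.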
 
ahoffmann commented:
$content=~s/([^\s]*\s+){200}/$1/;
 
ahoffmann commented:
$content=~s/^(\s*([^\s]+\s+){200}).*/$1/s;
 
KenHeckert (Author) commented:
Thanks, ahoffmann. Works perfectly fine. So simple too.
 
ahoffmann commented:
Just keep in mind that it does not match, and so leaves $content unchanged, if there are fewer than 200 words.
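
To see that behaviour on a small input, here is a sketch with the word count as a parameter. `clip_words` is a hypothetical helper name, and the pattern is a parameterised variation on the substitution above rather than the exact line from the thread.

```perl
use strict;
use warnings;

# Keep the first $n words of $content, where each word must be followed by whitespace.
# clip_words is a hypothetical helper name used for illustration only.
sub clip_words {
    my ($content, $n) = @_;
    # /s lets .* cross newlines, so everything after the kept words is removed.
    $content =~ s/^(\s*(?:\S+\s+){$n}).*/$1/s;
    return $content;
}

print clip_words("one two three four five", 3), "|\n";   # prints "one two three |"
print clip_words("one two", 3), "|\n";                   # prints "one two|"
```

In the second call the pattern cannot find three words, so the substitution does nothing and the short text comes back untouched.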
 
jmcg (Owner) commented:
But that could be fixed by doing:

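$content =~ s/^(\s*(?:\S+\s*){0,200}).*/$1/s;

(\S is the built-in equivalent to [^\s]. Using \s* rather than \s+ after each word keeps a final word that has no trailing whitespace, and the /s modifier lets .* run across newlines.)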
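
The original question also asked whether a 200-character cut can avoid stopping mid-word. One possible approach is sketched below; `clip_chars` is a hypothetical helper name, and the fallback for a single word longer than the limit (a hard cut) is an assumption about what the script should do in that case.

```perl
use strict;
use warnings;

# Clip $text to at most $max characters without cutting a word in half.
# clip_chars is a hypothetical helper name used for illustration only.
sub clip_chars {
    my ($text, $max) = @_;
    return $text if length($text) <= $max;
    # Take one character beyond the limit, so a word that ends exactly
    # at the limit is recognised as complete.
    my $clip = substr($text, 0, $max + 1);
    # Remove the last run of whitespace and whatever partial word follows it.
    # If there is no whitespace at all, fall back to a hard cut (assumption).
    unless ($clip =~ s/\s+\S*\z//) {
        $clip = substr($text, 0, $max);
    }
    return $clip;
}

print clip_chars("The quick brown fox jumps", 12), "\n";   # prints "The quick"
```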
 
KenHeckert (Author) commented:
Alright, thanks jmcg.