Solved

grep/awk: extracting text at variable positions in a text file

Posted on 2013-06-21
Last Modified: 2013-06-21
I have text files I need to extract website links from, but they are in variable positions.

These lines will all be in one text file, and I need to find and pull out only the links.

Examples:

I received the link and it is http://www.google.com/
Bob sent me the best website and I sent it to him:http://www.yahoo.com
Please update to http://www.dropbox.com and I'll get it back to you asap.

Suggestions for DOS, Linux, or Python are all welcome.
Question by:fkn
3 Comments
 
LVL 75

Accepted Solution

by:
käµfm³d   👽 earned 500 total points
ID: 39266773
In Python:

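import re

# file read code modified from http://stackoverflow.com/questions/8369219/how-do-i-read-a-text-file-into-a-string-variable-in-python#answer-8369272
# Raw string so the backslash in the Windows path is taken literally.
with open(r'C:\input.txt', 'r') as inFile:
	# Join with a space so a link at the end of one line is not fused
	# with the first word of the next line.
	text = " ".join(line.rstrip() for line in inFile)

# \S+ stops at any whitespace character, not only at a space.
for match in re.findall(r'http://\S+', text):
	print(match)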
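
A slightly more defensive variant can be sketched as follows, assuming the links may also start with https:// and may be followed by sentence punctuation. The names extract_links and URL_PATTERN are illustrative and not part of the answer above; the sample text is the three example lines from the question.

```python
import re

# Hypothetical helper names; the pattern is an assumption that also accepts https.
URL_PATTERN = re.compile(r'https?://\S+')

def extract_links(text):
    """Return every http/https link in text, with trailing punctuation removed."""
    return [link.rstrip('.,;:!?)') for link in URL_PATTERN.findall(text)]

# The three example lines from the question.
sample = (
    "I received the link and it is http://www.google.com/\n"
    "Bob sent me the best website and I sent it to him:http://www.yahoo.com\n"
    "Please update to http://www.dropbox.com and I'll get it back to you asap.\n"
)

for link in extract_links(sample):
    print(link)
```

Because the pattern is searched anywhere in the text, it does not matter whether the link is preceded by a space, a colon, or nothing at all.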

 

Author Closing Comment

by:fkn
ID: 39266831
On the money.
 
LVL 48

Expert Comment

by:Tintin
ID: 39267352
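Much easier to do with grep (the hyphen goes last in the bracket expression so it is read as a literal character, not as a range):

grep -Po "http://[\w.-]+" file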
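
Note that the two patterns in this thread do not match quite the same text: a character class of word characters, dots, and hyphens stops at the first slash, while a "anything up to whitespace" pattern keeps the rest of the link. A rough Python sketch of the difference, using the first example line from the question (Python's re module only approximates grep -P here, and the variable names are illustrative):

```python
import re

# First example line from the question.
line = "I received the link and it is http://www.google.com/"

# Word characters, dots, and hyphens only: stops before the first "/".
host_only = re.findall(r'http://[\w.-]+', line)

# Everything up to the next whitespace: keeps the trailing slash and any path.
full_link = re.findall(r'http://\S+', line)

print(host_only)
print(full_link)
```

Which one is preferable depends on whether only the site name or the complete link is wanted.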
