jl66
match pattern in a big file
I have a big text file (>2 GB) and want to extract the lines that match a pattern, say 'ABC XYZ'. Lines can be up to 4000 characters long. Questions:
What computer language is suitable for this situation? Could any gurus please share a piece of code or a link about it?
sed -n "/ABC XYZ/p"
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
I had forgotten that sed could have that problem. Thank you for the precaution.
I might have used grep for this task, but it would be hard to call that a programming language.
(Though I guess it could qualify as a "computer language")
In perl, I might prefer
perl -ne 'print if index($_, "ABC XYZ") >= 0' BigTextFile.txt
or
perl -ne 'print if /ABC XYZ/' BigTextFile.txt
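For comparison, here is a minimal sketch of the same substring test in Python. It is only an illustration, not something posted in this thread: the function name matching_lines and the sample data are made up, and the temporary file stands in for BigTextFile.txt. The file is read one line at a time, so memory use should depend on line length rather than file size.

```python
import os
import tempfile
from typing import Iterator


def matching_lines(path: str, needle: bytes) -> Iterator[bytes]:
    """Yield each line of the file at `path` that contains `needle`."""
    # Binary mode avoids decoding errors on odd bytes; iterating the file
    # object reads one line at a time instead of loading the whole file.
    with open(path, "rb") as handle:
        for line in handle:
            if needle in line:  # plain substring test, like Perl's index()
                yield line


# Tiny stand-in for the real multi-gigabyte file (sample data is made up).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first ABC XYZ line\nno match here\nABCXYZ without the space\n")
    sample_path = tmp.name

found = list(matching_lines(sample_path, b"ABC XYZ"))
os.remove(sample_path)
print(found)
```

For a real file you would loop over matching_lines("BigTextFile.txt", b"ABC XYZ") and write each line out as it arrives rather than collecting everything in a list.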
Yes, greater than or equal to zero. What was I thinking?
(The original grep had a similar line length restriction.)
ASKER
Thanks a lot for the info.
I ran it on Windows 7 and got an error. The text I am trying to match in the big file is "text", as in:
...,"text":"RT @...
...,"text":"Test...
F:\>perl -ne 'print if /text/' bigtext.txt
Can't find string terminator "'" anywhere before EOF at -e line 1.
F:\>perl -ne 'print if index($_, "text") >= 0' twitter_short.txt
Can't find string terminator "'" anywhere before EOF at -e line 1.
The perl version is
F:\>perl -version
This is perl 5, version 14, subversion 4 (v5.14.4) built for MSWin32-x64-multi-thread
Copyright 1987-2013, Larry Wall
I don't know why.
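The error message is consistent with the Windows command prompt not treating single quotes as quoting characters, so perl would receive 'print (with the leading quote) as its whole program. The Python sketch below illustrates the difference. Note that split_double_quotes_only is a hypothetical helper and only a rough approximation in which double quotes alone group words; it is not the real Windows argument parser.

```python
import shlex
from typing import List


def split_double_quotes_only(command: str) -> List[str]:
    """Split on whitespace, letting only double quotes group words.

    Rough stand-in (an assumption, not the real Windows parser) for how
    a program started from cmd.exe sees its arguments.
    """
    lexer = shlex.shlex(command, posix=True)
    lexer.whitespace_split = True
    lexer.commenters = ""   # do not treat '#' as a comment start
    lexer.quotes = '"'      # single quotes become ordinary characters
    return list(lexer)


single_quoted = "perl -ne 'print if /text/' bigtext.txt"
double_quoted = 'perl -ne "print if /text/" bigtext.txt'

# POSIX shell rules: single quotes group the whole Perl program.
posix_args = shlex.split(single_quoted)
# Approximate cmd.exe rules: the program is cut off after 'print.
windows_like_args = split_double_quotes_only(single_quoted)
# With double quotes the program survives as one argument.
windows_like_fixed = split_double_quotes_only(double_quoted)

print(posix_args)
print(windows_like_args)
print(windows_like_fixed)
```

Under that reading, using double quotes on Windows, as in perl -ne "print if /text/" bigtext.txt, should hand perl the complete program.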
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks a lot. Big help.