
Ruby - Extract Year from document text

Asked by 707Tech (United States of America)
Tags: Ruby, Programming, Miscellaneous, Programming Languages-Other
Hello, I am trying to update a Ruby script that reads HTML files downloaded from the IRS site and finds a specific line of text in each file in order to extract the year from the document. The script works fine; however, there are now 2 types of documents that can be in the folder, and the line in question is different in each one. How can I add an "or" or "if" clause so the script looks for the text formatted either way and pulls out the year based on how the line reads?

The line currently in the file is displayed as "TAX PERIOD:    DEC. 31, 2014". The line in the 2nd document type that I need to add is displayed as "Tax Period or Periods:  December, 2014". I need the year "2014" extracted from each document. I have attached screenshots of the script in color and the 2 document types.

 
# Open and read the downloaded file
	transcript_html_name = files_in_dir[i]
	File.open(Dir.pwd + "/Transcripts/#{transcript_html_name}", "r") do |f|
		f.each_line do |line|
			# Search for the Tax Period date to get the year
			if line.include? "TAX PERIOD:"
				# Strip HTML tags, newlines and tabs
				line.gsub!(/(<[^>]*>)|\n|\t/, "")
				line.slice!("TAX PERIOD:")
				# Remove the month and day (e.g. "DEC. 31, "), leaving the year
				line.gsub!(/\w\w\w[.]\s\d\d[,]\s/, "")
				year = line
			end
		end
	end


script.jpg
document-sample-1.jpg
document-sample-2.jpg
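
One possible way to handle both line formats is a single case-insensitive pattern with an optional " or Periods" part, rather than a separate "if" branch per document type. The sketch below is only an illustration: the helper name extract_tax_year and the constant TAX_YEAR_PATTERN are hypothetical and not part of the original script, and it assumes the label and the year sit on the same line of the HTML file, as in the two sample lines quoted in the question.

```ruby
# Hypothetical pattern (an assumption, not from the original script):
# matches "TAX PERIOD:" or "Tax Period or Periods:" in any letter case,
# then captures the first standalone four-digit number after the label.
TAX_YEAR_PATTERN = /tax period(?: or periods)?:.*?\b(\d{4})\b/i

# Hypothetical helper: returns the year as a String, or nil when the
# line is not a tax-period line.
def extract_tax_year(line)
  # Strip HTML tags, newlines and tabs, as the original script does.
  text = line.gsub(/<[^>]*>|\n|\t/, "")
  match = TAX_YEAR_PATTERN.match(text)
  match && match[1]
end

puts extract_tax_year("<td>TAX PERIOD:    DEC. 31, 2014</td>\n")
puts extract_tax_year("<td>Tax Period or Periods:  December, 2014</td>\n")
```

Inside the existing each_line loop, this could be used as `found = extract_tax_year(line)` followed by `year = found if found`, so that lines without a tax period leave `year` unchanged. Capturing the four digits directly avoids having to remove the month and day text, which differs between the two document types.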