Ruby - Extract Year from document text

Last Modified: 2016-03-02
Hello, I am trying to update a Ruby script that looks in HTML files downloaded from the IRS site and finds specific text in each file to extract the year from the document. The script works fine; however, there are now 2 types of documents that can be in the folder, and the line in question is different in each one. How can I add an "or" or "if" clause so the script looks for the text formatted either way and pulls out the year based on how the line reads? The line currently in the file is displayed as "TAX PERIOD:    DEC. 31, 2014". The line in the 2nd document that I need to add is displayed as "Tax Period or Periods:  December, 2014". I need the year "2014" extracted from each document. I have attached screenshots of the script in color and the 2 document types.

 
#Open and read the downloaded file
	transcript_html_name = files_in_dir[i]
	File.open(Dir.pwd + "/Transcripts/#{transcript_html_name}", "r") do |f|
		f.each_line do |line|
			 #Search for the Tax Period date to get the year
			if line.include? "TAX PERIOD:" 
				line.gsub!(/(<[^>]*>)|\n|\t/s) {""}
				line.slice!("TAX PERIOD:")
				line.gsub!(/\w\w\w[.]\s\d\d[,]\s/) {""}
				year = line
			end
		end
	end


script.jpg
document-sample-1.jpg
document-sample-2.jpg
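
One possible way to handle both line formats is a single case-insensitive regular expression that accepts either label and captures the first four-digit number after it. This is only a sketch under assumptions: the helper name `extract_tax_year` and the constant `TAX_PERIOD_PATTERN` are hypothetical, and the sample lines are built from the two formats quoted in the question, not from the real IRS files.

```ruby
# Hypothetical pattern: matches "TAX PERIOD:" or "Tax Period or Periods:"
# (case-insensitive) and captures the first standalone four-digit number after it.
TAX_PERIOD_PATTERN = /tax period(?: or periods)?:.*?\b(\d{4})\b/i

# Hypothetical helper: returns the year as a String, or nil when the line
# matches neither format.
def extract_tax_year(line)
  # Remove HTML tags first, as the original script does.
  text = line.gsub(/<[^>]*>/, "")
  match = text.match(TAX_PERIOD_PATTERN)
  match && match[1]
end

# Sample lines assumed from the two formats described in the question.
samples = [
  "<td>TAX PERIOD:    DEC. 31, 2014</td>",
  "<td>Tax Period or Periods:  December, 2014</td>",
  "<td>Some unrelated line</td>"
]

samples.each { |sample| p extract_tax_year(sample) }
```

Inside the existing `each_line` loop, this could replace the `include?`/`gsub!`/`slice!` chain with something like `year = extract_tax_year(line) || year`, so the last matching line sets the year and non-matching lines leave it unchanged.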