Solved

Novice needs help - simple string replace

Posted on 2014-01-23
Last Modified: 2014-01-25
I've recently inherited a website that includes a Ruby application. I have never done anything in Ruby, so I am a complete novice.
The app parses an XML file and updates a database.
The problem I'm having is that some records are rejected because they have an unknown 'type'. (The validator code checks that the type is in a list of allowed types.)
This is the code that I believe is causing the problem:
# v0.1 28/7/11 - Abafim parser
# partly based on xmlfeeds_relm code from v1

# Field count
# field list has 
# base class extract data adds 
# post_process net 
# net result 

require 'baseparser'

class AbafimParser < BaseParser
  include Relmutils

  def initialize(rq, pq, config)
    super

    @fld_list = {:aref => 'Reference', :status => 'Rubrique',
      :address => 'TitreEN', :city => 'Ville',
      :zip => 'CodePostal', :type => 'TypeBien',
      :price => 'Prix', :beds => 'NbChambres',
      :sqfeet => 'SurfaceTerrain', 
      :fulldesc => 'TexteEN',
      :energy => "DPE1", :green =>"DPE2"
    }

    @client_list = {}

    show_version "AbafimParser Version 0.1 1/8/11"
  end

  def do_post_extract
    
    @fields[:agent] = 'AFM'
    @fields[:baths] = "0"
    @fields[:city] = conv(@fields[:city])
    @fields[:city] = capitalize_str(@fields[:city])
    @fields[:type] = convert_type(@fields[:type])
    @fields[:status] = convert_status(@fields[:status])
    @fields[:neighborhood] = create_locality(@fields[:address])
    @fields[:home_features] = ''
  end

  def extract_images
    @images = {}
    imgs = @node.find('URL_Photo')
    
    data = []
    if imgs.size == 0
      @log.warn "Image data missing"
      @images[:count] = 0
    elsif imgs.size <= 6
      imgs.each {|i|
        data << i.content
      }
      @images[:count] = imgs.size
    else
      (0..5).each {|i|
          data << imgs[i].content
      }
      @images[:count] = 6
    end
    @images[:data] = data
    @log.debug "Image count is #{@images[:count]}"
  end

  def checkDict(inval, inhash)
    if inhash.has_key?(inval)
      outval = inhash[inval]
    else
      outval = 'Unknown'
    end

    return outval
  end

  def convert_type(inval)
    @log.debug "convert_type called, with value: #{inval}" if @log

    typeDict = {
      'Terrains' => 'Land only',
      'Fermes' => 'Farmhouse',
      'Neufs - Optimisation fiscale' => 'Commercial',
      'Commerces' => 'Commercial',
      "Gîtes / Chambres d'hôtes" => 'Gites Complex',
      'Appartements' => 'Apartment',
      'Villas' => 'Villa',
      'Châteaux / Belles Demeures' => 'Chateau',
      'Immeubles / Hôtels' => 'Commercial'
    }

    value = checkDict(inval, typeDict)
    @log.debug "convert_type returning value #{value}" if @log
    return value
  end

  def convert_status(inval)
    @log.debug "convert_status called, with value: #{inval}" if @log
    if inval == "VENTE"
      value = "For Sale"
    else
      value = "Unknown"
    end

    @log.debug "convert_status returning value #{value}" if @log
    return value
  end

  def create_locality(inval)
    loc = "Not Specified"

    if inval =~ /[fF]arm|[Bb]arn|[Ff]ermette/
      loc = "Rural"
    end

    return loc
  end
end

The problem is this line (line 85):
"Gîtes / Chambres d'hôtes" => 'Gites Complex',

This was working until recently, when the value in the XML file changed to include a \ before the apostrophe.

I've tried changing the line to:
"Gîtes / Chambres d\'hôtes" => 'Gites Complex',

but it's still not working.

I thought perhaps I could remove the \ before calling convert_type, but I have no idea how to do that.
Question by:fionafenton
5 Comments
 
LVL 8

Expert Comment

by:Surrano
ID: 39805976
On lines 78 and 93 there are two useful log entries:

    @log.debug "convert_type called, with value: #{inval}" if @log
...
    @log.debug "convert_type returning value #{value}" if @log

Can you provide the log output for the problematic call?
 
LVL 1

Author Comment

by:fionafenton
ID: 39806256
There are 4 log files created:
feeds-info-2014-01-23.log
feeds-2014-01-23.log
feeds-db-2014-01-23.log
feeds-debug-2014-01-23.log

The db and debug logs are empty.
It looks like debug is off, but I can't find where to switch it on.

In the other two logs, an example of the output is:

[WARN  validator] 2014-01-23 00:30:18 :: Validator dropped record 110 AF10647 type invalid value: Unknown

Unfortunately it only reports the returned value, Unknown, which isn't very helpful.

In one of the other included files I've found:
    if @config.options[:debug] == true
      @xtr.r.debug = true
      @xtr.p.debug = true
      @xtr.v.debug = true
      @xtr.w.debug = true
    end

So from this I'm guessing I have to set debug=true in config.options?

I searched the files for @config.settings[@debug] and can't find anywhere the value is set, other than a class called configuration, which includes:
    @options[:debug] = false
    opts.on( '--debug', "Log level debug" ) do
      @options[:debug] = true
    end

I tried setting @options[:debug] = true, but the debug file was still empty.
 
LVL 8

Accepted Solution

by:
Surrano earned 500 total points
ID: 39806549
It seems that if you invoke the script with the "--debug" option, it will turn on debug logging.
But what you need now is @log, which may not be set. Try to find out how @log is defined, e.g. is it a command line argument, a DB parameter, or computed from something?
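
For reference, the opts.on('--debug', ...) fragment from the configuration class looks like Ruby's standard OptionParser. Assuming that is what the app uses, here is a minimal sketch of how such a flag normally behaves. The method name parse_options is made up for illustration and is not from the app:

```ruby
require 'optparse'

# Hypothetical stand-in for the app's configuration class.
# Only the '--debug' switch mirrors the fragment quoted in the thread.
def parse_options(argv)
  options = { :debug => false }          # default: debug logging off
  parser = OptionParser.new do |opts|
    opts.on('--debug', 'Log level debug') do
      options[:debug] = true             # runs only when --debug is passed
    end
  end
  parser.parse(argv)                     # non-destructive; argv is left untouched
  options
end

puts parse_options(['--debug'])[:debug]
puts parse_options([])[:debug]
```

If the real class works this way, the flag has to be on the command line that starts the feed run (a cron entry or wrapper script, for instance). That is a guess about this particular app, not something the posted code confirms.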

Another possibility is to change line 71 in checkDict as follows:

...
  def checkDict(inval, inhash)
    if inhash.has_key?(inval)
      outval = inhash[inval]
    else
      outval = "Unknown: ===#{inval}==="
    end

    return outval
  end
...

Hopefully the only log that works will now print something in addition to Unknown, e.g.:

[WARN  validator] 2014-01-23 00:30:18 :: Validator dropped record 110 AF10647 type invalid value: Unknown: ===G?tes / Chambres d'h?tes===

(Note: I replaced the Unicode characters with ? because I expect to see something different there.)
By the way, can't you directly check record 110 AF10647, whatever that refers to, in the input?
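
A variant of the same diagnostic, as a sketch only: String#inspect prints the value as a quoted Ruby literal, so stray backslashes and other escape characters stand out in the log. The name check_dict is just a snake_case stand-in for checkDict. Hash#fetch with a block is standard Ruby.

```ruby
# Sketch of checkDict that reports the rejected key via inspect.
# check_dict is an illustrative name, not the method in the parser.
def check_dict(inval, inhash)
  # fetch returns the mapped value, or calls the block when the key is missing
  inhash.fetch(inval) { |missing| "Unknown: #{missing.inspect}" }
end

types = { 'Villas' => 'Villa' }
puts check_dict('Villas', types)
puts check_dict("Chambres d\\'hotes", types)  # the backslash shows up doubled in inspect output
```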
 
LVL 1

Author Comment

by:fionafenton
ID: 39808613
I changed line 71 as it was the quickest option (been trying to work out how to do that!)

All it's done is confirmed that the value in typeDict is correct

This is what I'm getting:
[WARN  validator] 2014-01-25 12:15:12 :: Validator dropped record 4 AF06626 type invalid value: Unknown: ===Gîtes / Chambres d\'hôtes===

I can't help thinking it's the backslash that's causing the problem, as this is the only thing that's changed. Does it have some sort of significance in Ruby (as it does in PHP), which means that it's effectively ignored in the string?

In line 38, where the values are read in:
@fields[:type] = convert_type(@fields[:type])
Is it not possible to apply another function to first remove the backslash?
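
For anyone wanting to go the "remove the backslash first" route, here is a minimal sketch. The helper strip_escaped_apostrophes and the one-entry TYPE_DICT are made up for illustration. Only String#gsub is standard Ruby, and with a string pattern it matches the text literally.

```ruby
# Hypothetical helper, not part of the original parser:
# turns a backslash-apostrophe pair back into a plain apostrophe.
def strip_escaped_apostrophes(value)
  return value if value.nil?
  value.gsub("\\'", "'")   # "\\'" is a two-character pattern: backslash, apostrophe
end

# One-entry stand-in for typeDict, keyed WITHOUT the backslash.
TYPE_DICT = { "Gîtes / Chambres d'hôtes" => 'Gites Complex' }

raw = "Gîtes / Chambres d\\'hôtes"   # the value as it now arrives from the feed
puts TYPE_DICT[strip_escaped_apostrophes(raw)]
```

In the parser this would mean changing line 38 to something like @fields[:type] = convert_type(strip_escaped_apostrophes(@fields[:type])), so the same dictionary entry matches feed values with or without the backslash.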
 
LVL 1

Author Closing Comment

by:fionafenton
ID: 39808753
Got it!

All I had to do was escape the backslash.

Changing

"Gîtes / Chambres d\'hôtes" => 'Gites Complex',

to

"Gîtes / Chambres d\\'hôtes" => 'Gites Complex',

has solved the problem.
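
As to why this works: in a double-quoted Ruby string, \' is not a meaningful escape, so the backslash is dropped and only the apostrophe remains. Writing \\ produces one literal backslash, which is what the feed now contains. A short sketch to illustrate (the constant names are only for the example):

```ruby
# \' inside double quotes collapses to a plain apostrophe.
DROPPED = "Gîtes / Chambres d\'hôtes"
# \\ is an escaped backslash, so this string really contains one.
KEPT    = "Gîtes / Chambres d\\'hôtes"
PLAIN   = "Gîtes / Chambres d'hôtes"

puts DROPPED == PLAIN         # true: the first attempted fix changed nothing
puts KEPT.include?("\\")      # true: this key matches the value in the feed
puts KEPT.length - PLAIN.length
```

Note that with this change the entry only matches values that contain the backslash. If the feed ever goes back to a plain apostrophe, the dictionary needs both keys.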

Question has a verified solution.