• Status: Solved

Novice needs help - simple string replace

I've recently inherited a website that runs a Ruby application. I have never done anything in Ruby, so I am a complete novice.
The app parses an XML file and updates a database.
The problem I'm having is that some records are rejected because they have an unknown 'type'. (The validator code checks that the type is contained in a list of allowed types.)
This is the code that I believe is causing the problem:
# v0.1 28/7/11 - Abafim parser
# partly based on xmlfeeds_relm code from v1

# Field count
# field list has 
# base class extract data adds 
# post_process net 
# net result 

require 'baseparser'

class AbafimParser < BaseParser
  include Relmutils

  def initialize(rq, pq, config)
    super

    @fld_list = {:aref => 'Reference', :status => 'Rubrique',
      :address => 'TitreEN', :city => 'Ville',
      :zip => 'CodePostal', :type => 'TypeBien',
      :price => 'Prix', :beds => 'NbChambres',
      :sqfeet => 'SurfaceTerrain', 
      :fulldesc => 'TexteEN',
      :energy => "DPE1", :green =>"DPE2"
    }

    @client_list = {}

    show_version "AbafimParser Version 0.1 1/8/11"
  end

  def do_post_extract
    
    @fields[:agent] = 'AFM'
    @fields[:baths] = "0"
    @fields[:city] = conv(@fields[:city])
    @fields[:city] = capitalize_str(@fields[:city])
    @fields[:type] = convert_type(@fields[:type])
    @fields[:status] = convert_status(@fields[:status])
    @fields[:neighborhood] = create_locality(@fields[:address])
    @fields[:home_features] = ''
  end

  def extract_images
    @images = {}
    imgs = @node.find('URL_Photo')
    
    data = []
    if imgs.size == 0
      @log.warn "Image data missing"
      @images[:count] = 0
    elsif imgs.size <= 6
      imgs.each {|i|
        data << i.content
      }
      @images[:count] = imgs.size
    else
      (0..5).each {|i|
          data << imgs[i].content
      }
      @images[:count] = 6
    end
    @images[:data] = data
    @log.debug "Image count is #{@images[:count]}"
  end

  def checkDict(inval, inhash)
    if inhash.has_key?(inval)
      outval = inhash[inval]
    else
      outval = 'Unknown'
    end

    return outval
  end

  def convert_type(inval)
    @log.debug "convert_type called, with value: #{inval}" if @log

    typeDict = {
      'Terrains' => 'Land only',
      'Fermes' => 'Farmhouse',
      'Neufs - Optimisation fiscale' => 'Commercial',
      'Commerces' => 'Commercial',
      "Gîtes / Chambres d'hôtes" => 'Gites Complex',
      'Appartements' => 'Apartment',
      'Villas' => 'Villa',
      'Châteaux / Belles Demeures' => 'Chateau',
      'Immeubles / Hôtels' => 'Commercial'
    }

    value = checkDict(inval, typeDict)
    @log.debug "convert_type returning value #{value}" if @log
    return value
  end

  def convert_status(inval)
    @log.debug "convert_status called, with value: #{inval}" if @log
    if inval == "VENTE"
      value = "For Sale"
    else
      value = "Unknown"
    end

    @log.debug "convert_status returning value #{value}" if @log
    return value
  end

  def create_locality(inval)
    loc = "Not Specified"

    if inval =~ /[fF]arm|[Bb]arn|[Ff]ermette/
      loc = "Rural"
    end

    return loc
  end
end


The problem is this line (line 85)
"Gîtes / Chambres d'hôtes" => 'Gites Complex',

This was working until recently when the value in the xml file changed to include a \ before the apostrophe

I've tried changing the line to
"Gîtes / Chambres d\'hôtes" => 'Gites Complex',

but it's still not working.

I thought perhaps I could remove the \ before calling convert_type, but I have no idea how to do that.
Asked by: fionafenton

Surrano (System Engineer) commented:
On lines 78 and 93 there are two useful log entries:

    @log.debug "convert_type called, with value: #{inval}" if @log
...
    @log.debug "convert_type returning value #{value}" if @log


Can you provide the log output for the problematic call?

fionafenton (Author) commented:
There are 4 log files created:
feeds-info-2014-01-23.log
feeds-2014-01-23.log
feeds-db-2014-01-23.log
feeds-debug-2014-01-23.log

The db and debug logs are empty.
It looks like debug is off, but I can't find where to switch it on.

In the other two logs, an example of the output is:

[WARN  validator] 2014-01-23 00:30:18 :: Validator dropped record 110 AF10647 type invalid value: Unknown

Unfortunately it only reports the value Unknown, which isn't very helpful.

In one of the other included files I've found
if @config.options[:debug] == true
  @xtr.r.debug = true
  @xtr.p.debug = true
  @xtr.v.debug = true
  @xtr.w.debug = true
end

So from this I'm guessing I have to set debug = true in config.options?

I've searched the files for @config.settings[@debug] and can't find anywhere that the value is set, other than in a class called configuration, which includes:
@options[:debug] = false
opts.on( '--debug', "Log level debug" ) do
  @options[:debug] = true
end

I tried setting @options[:debug] = true, but the debug file was still empty.

Surrano (System Engineer) commented:
It seems that invoking the script with the "--debug" option will turn on debug.
But what you need now is @log, which may not be set. Try to find how @log is defined, e.g. is it a command line argument, a DB parameter, or computed from something?
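
As a rough illustration of how a "--debug" switch like the one quoted above usually behaves, here is a minimal sketch. It assumes the configuration class is built on Ruby's standard OptionParser (the `opts.on` call suggests this, but the thread does not confirm it). The method name `parse_options` is hypothetical and is not part of the application.

```ruby
require 'optparse'

# Hypothetical stand-in for the application's configuration class.
# The flag only becomes true when '--debug' is present in the arguments.
def parse_options(argv)
  options = { :debug => false }
  parser = OptionParser.new do |opts|
    opts.on('--debug', 'Log level debug') do
      options[:debug] = true
    end
  end
  parser.parse!(argv.dup) # work on a copy so the caller's array is untouched
  options
end

puts parse_options([])[:debug]          # false
puts parse_options(['--debug'])[:debug] # true
```

If the real class works this way, the flag is driven by the command line, so appending --debug to whatever command currently runs the feed would be the place to try first.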

Another possibility is to change line 71 in checkDict as follows:

...
  def checkDict(inval, inhash)
    if inhash.has_key?(inval)
      outval = inhash[inval]
    else
      outval = "Unknown: ===#{inval}==="
    end

    return outval
  end
...


Hopefully the only log that works will now print something in addition to Unknown, e.g.:

[WARN  validator] 2014-01-23 00:30:18 :: Validator dropped record 110 AF10647 type invalid value: Unknown: ===G?tes / Chambres d'h?tes===

(Note: I replaced the Unicode characters with ? because I expect to see something different there.)
By the way, can't you directly check this record 110 AF10647, whatever it means, in the input?
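
A possible variation on the checkDict change above uses String#inspect instead of the === markers, so that backslashes, trailing spaces and other invisible differences show up in the log. This is only a sketch: the method name `check_dict_verbose` is hypothetical, and it assumes the validator prints the rejected value verbatim, as it appears to in the log line quoted earlier.

```ruby
# Variant of checkDict from the thread (hypothetical name).
# On a miss it returns the value in inspect form, which wraps it in quotes
# and shows each literal backslash as a doubled backslash.
def check_dict_verbose(inval, inhash)
  if inhash.key?(inval)
    inhash[inval]
  else
    "Unknown: #{inval.inspect}"
  end
end

type_dict = { 'Villas' => 'Villa' }
puts check_dict_verbose('Villas', type_dict)   # Villa
puts check_dict_verbose('Villas ', type_dict)  # Unknown: "Villas "
puts check_dict_verbose("d\\'h", type_dict)    # Unknown: "d\\'h"
```

Note that inspect prints one real backslash as two, so a doubled backslash in the log means a single backslash in the data.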

fionafenton (Author) commented:
I changed line 71 as it was the quickest option (I'd been trying to work out how to do that!).

All it's done is confirm that the value in typeDict is correct.

This is what I'm getting:
[WARN  validator] 2014-01-25 12:15:12 :: Validator dropped record 4 AF06626 type invalid value: Unknown: ===Gîtes / Chambres d\'hôtes===

I can't help thinking it's the backslash that's causing the problem, as this is the only thing that's changed. Does it have some sort of significance in Ruby (as it does in PHP), which means that it's effectively ignored in the string?

In line 38, where the values are read in:
@fields[:type] = convert_type(@fields[:type])
is it not possible to apply another function first to remove the backslash?
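
One way this could be done is sketched below. The helper name `unescape_apostrophes` is hypothetical and not part of the parser. It turns every backslash-apostrophe pair into a plain apostrophe and leaves everything else alone. It also converts nil to an empty string, which is an assumption about what the feed can deliver.

```ruby
# Hypothetical helper: replaces each \' in the feed value with a plain '.
# gsub with a String pattern matches the text literally (no regexp involved).
def unescape_apostrophes(inval)
  inval.to_s.gsub("\\'", "'")
end

puts unescape_apostrophes("Chambres d\\'accueil")  # Chambres d'accueil
puts unescape_apostrophes("Villas")                # Villas

# Possible use on line 38 of the parser (untested against the real feed):
#   @fields[:type] = convert_type(unescape_apostrophes(@fields[:type]))
```

With this in place the original typeDict key, written without any backslash, would match again.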

fionafenton (Author) commented:
Got it!

All I had to do was escape the backslash.

Changing

"Gîtes / Chambres d\'hôtes" => 'Gites Complex',

to

"Gîtes / Chambres d\\'hôtes" => 'Gites Complex',

has solved the problem.
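
The reason this works can be checked in a few lines. In a double-quoted Ruby string literal, \' is simply an apostrophe, so the first attempt produced exactly the same key as the original. Only \\ puts a real backslash into the string. The sketch below uses the strings from the thread; keeping both keys in the hash is a suggestion for the case where the feed changes back, not something the original code does.

```ruby
plain   = "Gîtes / Chambres d'hôtes"
attempt = "Gîtes / Chambres d\'hôtes"   # the backslash is dropped by Ruby
fixed   = "Gîtes / Chambres d\\'hôtes"  # contains one real backslash

puts plain == attempt                # true
puts fixed.length - plain.length     # 1

# Keeping both spellings means either form of the feed value is recognised.
type_dict = { plain => 'Gites Complex', fixed => 'Gites Complex' }
puts type_dict.fetch(fixed, 'Unknown')  # Gites Complex
puts type_dict.fetch(plain, 'Unknown')  # Gites Complex
```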