Solved

using filename url method

Posted on 2011-09-09
Last Modified: 2013-11-16
Hi,
 
I sometimes use SAS to query websites repeatedly by running a macro that changes the values of the input parameters and saves the results to a SAS dataset, where I can parse out the information I need. The problem is that the returned web page gets truncated by width, by length, or both.
When I run the code below, I get 282 records, many of which are empty, and not the complete web page. How can I set the record length so that I get all the data, preferably on one line?

I would be willing to take a Perl or Python answer, but the respondent is going to have to spoon-feed me.

Thanks,

Bruce


filename foo url
"http://maps.google.com/maps?q=45.3906+-75.6881&hl=en&sll=37.0625,-95.677068&sspn=34.313287,66.533203&vpsrc=0&t=m&z=16&output=html"
lrecl=5000;


data a ;
infile foo length=len;
   input record $varying5000. len;
 run;

Question by:Diaphanosoma
7 Comments
 
LVL 83

Expert Comment

by:Dave Baldwin
ID: 36512872
That particular page contains a lot of JavaScript, not data and not HTML.  The actual data for the map is retrieved by AJAX from the server after the page is loaded in a browser.  What are you expecting to get?
 
LVL 1

Author Comment

by:Diaphanosoma
ID: 36512910
The 'output=html' parameter is supposed to make the page output HTML. When I open the web page with View Source, I see the street name and address that I am hoping to parse out. However, that text doesn't appear in the dataset downloaded by SAS.
 
LVL 83

Expert Comment

by:Dave Baldwin
ID: 36513110
Then you need to look at how SAS is creating the data sets.  There are very few line breaks or returns in the text for that page.  If '5000' is the length of an individual record, it might be too small.  Since you can compare the View Source to the data you have, maybe you can find out where it is being left off.
 
LVL 83

Accepted Solution

by:
Dave Baldwin earned 250 total points
ID: 36513122
Unless SAS is emulating a browser and accepts cookies, you might not be seeing exactly the same thing as your View Source in a browser.
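
Since the question allows a Python answer, here is a minimal sketch of what "emulating a browser and accepting cookies" could look like outside SAS. It uses only the Python standard library. The function names and the User-Agent string are assumptions made for illustration, and there is no guarantee that any particular site returns the same HTML to a script as it does to a real browser.

```python
import http.cookiejar
import urllib.request

# Hypothetical browser identification string, chosen only for illustration.
BROWSER_UA = "Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20100101 Firefox/6.0"


def build_browser_like_opener(user_agent=BROWSER_UA):
    """Return (opener, jar): an opener that keeps cookies and sends a browser User-Agent."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Replace the default "Python-urllib/x.y" header with the browser-like one.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def fetch_page(url, opener, timeout=30):
    """Fetch url through opener and return the whole body as one string."""
    with opener.open(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling build_browser_like_opener() once and reusing the opener for every fetch_page() call lets cookies set by the first response be sent back on later requests, which is closer to what View Source in a browser reflects.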
 
LVL 7

Assisted Solution

by:d507201
d507201 earned 250 total points
ID: 36513238
I've never worked with the URL engine, but I'd set the length to 32000.  The end of line at 4096 might not be there when you use SAS to read the file.
 
LVL 1

Author Comment

by:Diaphanosoma
ID: 36514432
I've been playing around with the varying length. The maximum is 32767, which is enough to capture the text I'm interested in. I'm not sure what one would do if the text were past that point.

I'll keep the question open till late Monday, in case someone has something else to add.

Bruce
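
For the open question of text that falls past 32767 characters, one option outside SAS is to read the whole response into a single string, which has no fixed record length. The following is a rough Python sketch under that assumption, not a drop-in replacement for the macro workflow. All three function names are hypothetical, as is the "Example Street" marker in the usage note.

```python
import re
import urllib.request


def collapse_whitespace(text):
    """Replace every run of whitespace, including line breaks, with one space."""
    return re.sub(r"\s+", " ", text).strip()


def fetch_as_one_line(url, timeout=30):
    """Download url and return the entire body as a single line of text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        text = response.read().decode(charset, errors="replace")
    return collapse_whitespace(text)


def snippet_around(text, marker, width=40):
    """Return marker plus up to width characters on each side, or None if absent."""
    pos = text.find(marker)
    if pos == -1:
        return None
    return text[max(0, pos - width): pos + len(marker) + width]
```

A typical use would be page = fetch_as_one_line(url) followed by snippet_around(page, "Example Street") to see the markup surrounding the text of interest, wherever it sits in the page. The single-line result can also be written to a file and read back into SAS in pieces if the rest of the processing stays there.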
 
LVL 1

Author Closing Comment

by:Diaphanosoma
ID: 36529882
Thanks for the help. I'll be able to get it going now.

Question has a verified solution.
