Complex Scenario - SAS

Posted on 2011-10-05
Last Modified: 2013-11-16
Hi all,

I have a text file on server A that I need to SCP to server B and turn into a SAS dataset. I already have code built for this. However, the file is around 80 million records, and for each run I need only a subset of it.

The current process is inefficient: it pulls the whole file on every run and then subsets the resulting SAS dataset.

Is there any way where I can only pull a subset of data directly from the text file to server B?

I thought of creating a shell script that builds the subset on server A, so that only the subset is pulled each time. I wanted to know if there is a more efficient way of doing this.

Please help.

Thanks in advance.
Question by:aruku
    LVL 7

    Expert Comment

    At the 2011 SGF I presented a paper on using SAS to move data between servers and the paper has been posted on this site.  Under the Articles tab, search for SAS.

    If both servers have SAS and you can set up a client-server relationship between the two servers then a combination of remote compute services and data transfer services or remote library services will easily do the job.  

    If SAS isn't on one of the servers, but you do have FTP or SFTP on both servers, then SAS has access methods that will work with those.  (9.1 supports only FTP, but 9.2 and up support both FTP and SFTP.)  It's not in the paper, but my PowerPoint deck shows a code example that reads a text file and subsets it while moving it via FTP.  It works the same way under SFTP.  Let me know if you need to see the PowerPoint.

    Read the article and post here if you need more help.

    Author Comment

    Thanks for the comments, d507201. Could I see the PowerPoint? It would help me resolve this issue.
    LVL 7

    Accepted Solution

    Slide 14 is the one that talks about FTPing a text file and subsetting at the same time.  

    Slides 18 and 19 are about SCP.  19 has examples of using the X statement to run SCP from within a SAS program.    
    LVL 14

    Expert Comment

    by:Aloysius Low
    I would say subsetting the data on server A, before transferring it or accessing it from server B, is the most efficient approach. If you try to access the data on server A directly from server B and subset it there, you inevitably pull all the records to server B before the subset takes place.

    Otherwise, you have to consider changing how the file is generated on server A. Why is all the data in one file? For example, can new records be written to a new file? Could you run a grep to pick out only the records with the date/time you want and write the output to another file to be read or transferred?
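
    A minimal sketch of that "filter on server A first" idea, assuming the file is pipe-delimited with the record date in the first field (the original post does not describe the layout). The function name, sample data, host name, and paths are all hypothetical. The local part only demonstrates the filter; the commented ssh line shows how the same filter might run on server A so that only matching rows cross the network.

```shell
#!/bin/sh
# Hypothetical helper: keep only records whose first pipe-delimited field
# (assumed here to be the record date) equals the requested value.
subset_by_date() {
    # $1 = date to keep, $2 = input file
    awk -F'|' -v d="$1" '$1 == d' "$2"
}

# Local demonstration with made-up data standing in for the large file.
sample=$(mktemp)
printf '%s\n' \
    '2011-10-04|A|10' \
    '2011-10-05|B|20' \
    '2011-10-05|C|30' > "$sample"

# Write only the matching rows to a separate, much smaller file.
subset_by_date 2011-10-05 "$sample" > "${sample}.subset"

# Across servers, the same filter could run on server A and stream only the
# matching rows to server B (user, host, and paths are placeholders):
#   ssh userA@serverA "awk -F'|' -v d=2011-10-05 '\$1 == d' /data/big.txt" > /data/subset.txt
```

    An exact field comparison in awk is used here instead of a plain grep so that the date cannot accidentally match inside some other column; grep would work just as well if the pattern is anchored to the start of the line.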
