Solved

Format file for BCP

Posted on 2010-09-09
15
361 Views
Last Modified: 2012-05-10
I have the attached file that I need to post-process with the following steps:

1. Take Subset of file

The post processed file should start with the section that has the header
"name                                                    number      symbol      initial wgt      optimized wgt      target wgt      min wgt      max wgt      volatility      partial 1 (and 2)      partial 3      partial 4      partial 5      r-squared      tilt value stock alerts"

2. Remove Column

Then the column "Stock Alerts" needs to be removed

The columns after removing the first section are tab delimited. I need to then BCP them into SQL Server. Can someone write either a python or a perl script to do this for me?
0
Comment
Question by:copiacap
  • 8
  • 7
15 Comments
 
LVL 10

Expert Comment

by:jeromee
Comment Utility
Can you provide a file sample to make sure I understand what you are asking?

0
 

Author Comment

by:copiacap
Comment Utility
Sorry - I thought I attached the file with the original post
Optimizer.txt
0
 
LVL 10

Expert Comment

by:jeromee
Comment Utility
This will work to create the BCP file (TAB-separated):
perl -F'\t' -ane'BEGIN{<>} print join("\t",@F[0..$#F-1])."\n"' Optimizer.txt

After that I don't know the SQL Server command format to BCP a file in.
0
 

Author Comment

by:copiacap
Comment Utility
Hey Jeromee

I tried running the script but got the following error:

C:\Temp>perl -F'\t' -ane'BEGIN{<>} print join("\t",@F[0..$#F-1])."\n"' Optimizer.txt
> was unexpected at this time.


0
 
LVL 10

Expert Comment

by:jeromee
Comment Utility
the Window/DOS is having a problem with the <>.
Here's workaround.
Create file called make_bcp.pl which will contain this:
<>;
while( <> ) {
   my @F = split/\t/;
   print join("\t",@F[0..$#F-1])."\n";
}

Then run:
    perl make_bcp.pl Optimizer.txt

Good luck!
0
 

Author Comment

by:copiacap
Comment Utility
Thanks Jeromee - I think we're close. Attached is the result of the script you sent me (Optimizer_Formatted.txt) and what the file should look like (Optimizer_Formatted_Correct.txt). The script needs to find the row with the header for the data (name number symbol ...) and then have  that be the start of the file.
Optimizer-Formatted-Correct.txt
Optimizer-Formatted.txt
0
 
LVL 10

Expert Comment

by:jeromee
Comment Utility
Hi copiacap,
I'll take a look tomorrow am when I'm close to a computer...
0
How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

 
LVL 10

Expert Comment

by:jeromee
Comment Utility
Hi copiacap,
The file Optimizer-Formatted-Correct.txt that you posted seems to have bogus value.
Now, I think I understand better what you want.... you want to skip the very long list of field descriptions, all the way to the header line, and then keep that header line as the first line of the output file.

If so, try this.

1 while( ($_=<>) !~ /^name\s+number\s+symbol\s+initial/ );
do {
   my @F = split/\t/;
   print join("\t",@F[0..$#F-1])."\n";
} while( <> )
0
 

Author Comment

by:copiacap
Comment Utility
Ok one more tweak - the file has 16 columns. Rather than removing the last column which is what we're currently doing, it looks like sometimes the file doesn't put a blank for that last column. So some rows have 15 and some have 16. So what we need to do is append a \t if there are only 15 columns to make sure every row has 16 columns, that way the BCP loads it correctly.
0
 
LVL 10

Expert Comment

by:jeromee
Comment Utility
Hi copiacap,
I think this is what you need.

1 while( ($_=<>) !~ /^name\s+number\s+symbol\s+initial/ );
do {
   my @F = split/\t/;
   print join("\t",@F[0..13])."\n";
} while( <> )

I replaced the $#F-1 by the hardcoded 13.

Good luck!
0
 

Author Comment

by:copiacap
Comment Utility
Ok great, I think we need one more fix. We're dropping one column. I changed the script to be @F[0..14]. Only problem is sometimes that last column includes a newline. See the results when I run the script with @F[0..14]. So can we make sure it doesn't include any new line characters in the actual text except for the one we append to the end?
Optimizer-Formatted.txt
0
 

Author Comment

by:copiacap
Comment Utility
One other request - can you add a statement at the beginning to make sure the input file exists and otherwise just exit. Otherwise the perl script hangs waiting for the file I guess
0
 
LVL 10

Accepted Solution

by:
jeromee earned 500 total points
Comment Utility
Try this...

my $input = $ARGV[0] || die "Please specify input file\n";
open(INPUT, $input) || die "Can't open input file $input: $!\n";
1 while( ($_=<INPUT>) !~ /^name\s+number\s+symbol\s+initial/ );
do {
   chomp;
   my @F = split/\t/;
   print join("\t",@F[0..13])."\n";
} while( <INPUT> );
close(INPUT);

0
 

Author Closing Comment

by:copiacap
Comment Utility
Thankes Jeromee - worked like a charm
0
 
LVL 10

Expert Comment

by:jeromee
Comment Utility
Glad to hear that!

0

Featured Post

Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

Join & Write a Comment

This article will show the steps for installing Python on Ubuntu Operating System. I have created a virtual machine with Ubuntu Operating system 8.10 and this installing process also works with upgraded version of Ubuntu OS. For installing Py…
Article by: Swadhin
Introduction of Lists in Python: There are six built-in types of sequences. Lists and tuples are the most common one. In this article we will see how to use Lists in python and how we can utilize it while doing our own program. In general we can al…
Learn the basics of strings in Python: declaration, operations, indices, and slicing. Strings are declared with quotations; for example: s = "string": Strings are immutable.: Strings may be concatenated or multiplied using the addition and multiplic…
Learn the basics of lists in Python. Lists, as their name suggests, are a means for ordering and storing values. : Lists are declared using brackets; for example: t = [1, 2, 3]: Lists may contain a mix of data types; for example: t = ['string', 1, T…

772 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now