• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 601
  • Last Modified:

File copy using bcp with newlines in data

I am dumping data that I would like to import into Excel. Unfortunately, there are many newlines in the text fiels, so I am restoreing to the -r switch to change the record separator, so that the embedded newlines can be distinguished.

This makes is hard to import into Excel.

So far I have tried using regex replacement, but cannot so far get the right combination.

To solve this, I either need a way to escape the newlines as they are exported, or a set to regex replacements that I can do to get the data into excel. It would be ok to have the newlines replaced by some character while in Excel. Unfortunately every combination I have tried so far, evenb including muilti-character sequences, leads to some conflict somewhere along the lines.

0
jasimon9
Asked:
jasimon9
  • 5
  • 3
1 Solution
 
ispalenyCommented:
You cannot use Coll001 = REPLACE(REPLACE(Coll001,CHAR(13),' '),CHAR(10),' ') directly in SELECT statement?
0
 
jasimon9Author Commented:
While it would be possible on a case by case basis to use a queryout with bcp and the suggested transformation, it would be too tediou in general, as there are lots of tables with lots of columns that would need to be handled this way.
0
 
jasimon9Author Commented:
Actually, all it takes sometimes is writing out the problem for the solution to jump into your mind. A simipler solution has just occurred to me that should work:

1. Copy via bcp and substitute a mutil-character string guaranteed not to occur in the data for the record separator.

2. Replace all newlines in the text output file with a symbol that does not occur in the file, such as a pipe. Unfortunately, such a symbol differs by file and there is probably not a generic symbol that works for each file). As stated in my question, it is ok to have such a character left in the Excel file.

3. Replace the multi-character string with newlines.

4. Import into Excel.

I am going to try this approach and report back if it works.
0
Granular recovery for Microsoft Exchange

With Veeam Explorer for Microsoft Exchange you can choose the Exchange Servers and restore points you’re interested in, and Veeam Explorer will present the contents of those mailbox stores for browsing, searching and exporting.

 
jasimon9Author Commented:
While my approach works in principal, it is bogging down in practice. I usually use a regex-enabled text editor, but it is choking on files of production size (over several MB).

So I tried a version of sed to try to replace the newlines, but cannot get that to work. So I still need either more help with this approach, or a new approach.
0
 
ispalenyCommented:
10 MB/min

If you want to make it faster, you must rewrite script logic into VB6 or C++ code.

start /wait CSV2excel.vbs testin.txt testout.txt

CSV2excel.vbs
----------------------------------
dim txtin
dim fso
dim str
dim stra
dim cnt
dim cnta
dim i
dim TEXTDELIMITER
dim LENTEXTDELIMITER

Const ForReading = 1
Const ForWriting = 2

TEXTDELIMITER = """"
LENTEXTDELIMITER = len(TEXTDELIMITER)

set fso=CreateObject("Scripting.FileSystemObject")
set txtin=fso.OpenTextFile(WScript.Application.Arguments(0),ForReading,False)
set txtout=fso.OpenTextFile(WScript.Application.Arguments(1),ForWriting,True)
cnta=0
i=0
do while not txtin.AtEndOfStream

  i=i+1
  str=txtin.ReadLine()
  cnt= ( len(str) - len(replace(str,TEXTDELIMITER,"") )) / LENTEXTDELIMITER
  cnta=cnta+cnt
  if i=1 then
    stra=str
  else
    stra=stra&vbLf&str
  end if
  if cnta MOD 2 = 0 then
    txtout.WriteLine stra
    cnta=0
    stra=""
    i=0
  end if
loop
set fso=nothing
set txt=nothing
0
 
nloeberCommented:
Have you considered using DTS with the Excel provider to export the data directly to an Excel file?
0
 
ispalenyCommented:
With DTS, you are able to fill Excel sheet into correct rows, but contents of output rows is exactly the same as input rows. Excel needs LF row-in-row delimiter (ALT+Enter), for CRLF delimiter it displays 2 "unknown" characters. DTS can handle this in ActiveX transformation, it is similar to REPLACE directly in SELECT,but it uses DTS server memory and CPU.
0
 
jasimon9Author Commented:
The new "bcp" procedure is intended to replace the current direct export using DTS into Excel files. I did not give this background before, but a Q & D export into excel has been in production for several years (and is still).

The reasons to avoid the export into Excel:

1. It is a slow, ugly process. The are some tables which exceed the 64K row limit, which require special processing.

2. It is slow. The Excel export takes many miinutes -- at least 10, whereas the bcp export is less than a minute.

3. The ability to use a batch process which also includes additional post-processing means that ultimately it can be more highly automated.

4. Probably security issues of the current process that I am not going to get into here, but which are completely solved by the new method.

5. Better data handling. Some of the large text fields with embedded linefeeds are not handled properly in Excel. The troubles seem to be erratic.

In any case, the Excel method is really Q & D (quick and dirty) and the newer method is much more robust and elegant.

I have been considering the queryout suggestion from ispaleny. It actually may work because as it turns out, the embedded linefeeds are probably only in 3 fields. So with some custom work on the bcp script for those specific areas, it may provide the solution. When I have a chance I will explore this avenue further.
0
 
jasimon9Author Commented:
I was finally able to finish this project, and last weekend actually replaced the Q & D export into Excel with a bcp-based export. It is working nicely and is now fully automated.

After all was said and done, the queryout method suggested by isapaleny was the only that worked, with special processing for the fields where is was needed.

Therefore, points to isapaleny.
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 5
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now