Solved

PowerShell - filter strings and output to new file for database import

Posted on 2014-10-04
Last Modified: 2014-11-06
Guys,
I have a flat file containing around 1-3 GB of data, and before importing it into the database I need to clean it up so that only new, clean data gets imported. Below is how we do it in the database via T-SQL, but since the database cannot accept any downtime, and the select query below always takes a very long time to complete and very often blocks inserts into the database, we decided to move this task to text file string manipulation.

Does anyone know how to begin with a PowerShell command that outputs a new file without any rows containing the values below?

I think the PowerShell cmdlet Select-String will do this job, but I'm clueless how to write the code for the cleanup.




AGT_domainname = '' OR
      AGT_username LIKE 'SELkry' OR
      AGT_username LIKE 'testuser%'     OR
      AGT_username LIKE 'siteminder'    OR
      AGT_username LIKE 'CN=MYR%'       OR
      AGT_username LIKE 'CN=MYRTEST%'   OR
      AGT_username LIKE 'selypz' OR
      AGT_username LIKE 'selyyz' OR
      AGT_domainname LIKE 'Monitoring External' OR
      AGT_domainname LIKE 'onitoring Internal' OR
      AGT_domainname LIKE 'Test Domain' OR
      AGT_domainname LIKE 'TXTAPP SE'   OR
      AGT_domainname LIKE 'TXTAPP COM'  OR
      AGT_domainname LIKE 'TXTAPP NET'  OR
      AGT_domainname LIKE 'TXTAPP FED'  OR
      AGT_eventid = '7' OR
      AGT_eventid = '8' OR
      AGT_eventid = '9'
Question by:motioneye
LVL 15

Expert Comment

by:Haris Djulic
ID: 40361411
Can you provide sample of the input file?
LVL 69

Accepted Solution

by:Qlemo (earned 500 total points)
ID: 40361594
No, Select-String will not do the job (well). It does not honour fields, though you can emulate that, e.g. by counting delimiters. The search string would end up as a complex regular expression.
Your T-SQL contains mostly LIKE without a wildcard - is the wildcard just missing here, or do you search for exact matches?
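
To illustrate why the wildcard question matters, here is a small sketch of how PowerShell's -like behaves with and without a wildcard. The sample strings are made up for illustration; T-SQL's % corresponds to * in PowerShell, and _ corresponds to ?.

```powershell
# -like without a wildcard only matches the whole string, like -eq.
$exact  = 'selypz'     -like 'selypz'      # True
$longer = 'selypz1'    -like 'selypz'      # False: no wildcard, so no prefix match
$prefix = 'testuser42' -like 'testuser*'   # True: * plays the role of T-SQL's %

$exact, $longer, $prefix
```

So a condition such as LIKE 'siteminder' can be written with -eq (or -like) and will only hit exact values, while LIKE 'testuser%' needs -like with a trailing *.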

You would probably import the file into PowerShell as CSV, and that is the best way, as you need each field for performing the import into the DB anyway. Assuming the CSV contains a header line naming the columns like the DB fields without the AGT_ prefix, the filter then looks like this:
Import-Csv FlatFile.csv | Where-Object {
    $_.domainname -eq '' -or
    $_.username -like 'testuser*'
    # ... and so on: append the remaining conditions, each joined with -or
} | ForEach-Object { <# Do the DB import of each result line here #> }
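
For the original goal of writing a cleaned file, a minimal sketch might look like the following. It is not Qlemo's code. It assumes that rows matching the question's conditions are the ones to drop, that the header columns are named username, domainname and eventid, and that PowerShell 3.0 or later is available (for -in). The function name Test-UnwantedRow, the file names and the sample rows are hypothetical. The pattern 'CN=MYRTEST%' from the question is omitted because 'CN=MYR%' already covers it.

```powershell
# Returns $true for rows that should NOT reach the database.
# Column names are an assumption: DB field names without the AGT_ prefix.
function Test-UnwantedRow {
    param($Row)
    [string]::IsNullOrEmpty($Row.domainname) -or
    $Row.username   -like 'testuser*' -or
    $Row.username   -like 'CN=MYR*' -or
    $Row.username   -in 'SELkry', 'siteminder', 'selypz', 'selyyz' -or
    $Row.domainname -in 'Monitoring External', 'onitoring Internal', 'Test Domain',
                        'TXTAPP SE', 'TXTAPP COM', 'TXTAPP NET', 'TXTAPP FED' -or
    $Row.eventid    -in '7', '8', '9'   # values copied verbatim from the question
}

# Small in-memory check with made-up rows.
$sample = @'
username,domainname,eventid
alice,CORP,1
testuser42,CORP,1
bob,,2
carol,TXTAPP SE,3
dave,CORP,8
'@ | ConvertFrom-Csv

$kept = $sample | Where-Object { -not (Test-UnwantedRow $_) }
$kept.username   # alice

# Same filter against the real file (placeholder paths).
$inputPath = 'FlatFile.csv'
if (Test-Path $inputPath) {
    Import-Csv -Path $inputPath |
        Where-Object { -not (Test-UnwantedRow $_) } |
        Export-Csv -Path 'FlatFile.clean.csv' -NoTypeInformation
}
```

Because the rows flow through the pipeline one at a time and are never collected in a variable, this approach should not need to hold the whole 1-3 GB file in memory, though it is worth timing on a real file first.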


Author Comment

by:motioneye
ID: 40367476
Hi Qlemo
I will start testing this; hopefully it works well and is faster than a database query.

Author Closing Comment

by:motioneye
ID: 40427516
Thanks.
It works in my quick test. I will test more and post a new topic if I have any issues.