Link to home
Create AccountLog in
Python

Python

--

Questions

--

Followers

Top Experts

Avatar of Mike Johnson
Mike Johnson🇺🇸

Parsing text files for fields that are throwing truncation sql errors

Suggestions on how best to parse those csv, text files for fields that are too large and causing truncation sql errors.

Zero AI Policy

We believe in human intelligence. Our moderation policy strictly prohibits the use of LLM content in our Q&A threads.


Avatar of Daniel Pineault (Microsoft MVP)Daniel Pineault (Microsoft MVP)🇨🇦

you haven't explained very much, but typically you know the limits of your columns and truncate them when you parse the csv data to meet your column requirements.

 

import csv

max_lengths = {
    'field1': 255,
    'field2': 1024,
    # add fields and their max sizes
}

def truncate_row(row):
    for field, max_len in max_lengths.items():
        if field in row and len(row[field]) > max_len:
            row[field] = row[field][:max_len]
    return row

with open('data.csv', newline='', encoding='utf-8') as csvfile:
    reader = csv.DictReader(csvfile)
    rows = []
    for row in reader:
        truncated_row = truncate_row(row)
        rows.append(truncated_row)
        # Insert truncated_row into SQL DB safely here

Open in new window

 


Stage each file into a wide staging table using VARCHAR(MAX) or NVARCHAR(MAX) via BULK INSERT or OPENROWSET, then profile lengths with LEN or DATALENGTH to find columns exceeding target sizes. Redirect offenders to an exceptions table using ERRORFILE or a Conditional Split, clean or approve truncation with LEFT, then load the final typed table in a single transaction.


This question needs an answer!
Looks like this question is still being worked on. Think you can help?
Create your account and start contributing!
Create Account

Reward 1Reward 2Reward 3Reward 4Reward 5Reward 6

EARN REWARDS FOR ASKING, ANSWERING, AND MORE.

Earn free swag for participating on the platform.

Python

Python

--

Questions

--

Followers

Top Experts

Python is a widely used general-purpose, high-level programming language. Its design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than would be possible in other languages. Python supports multiple programming paradigms, including object-oriented, imperative and functional programming or procedural styles. It features a dynamic type system and automatic memory management and has a large and comprehensive set of standard libraries, including NumPy, SciPy, Django, PyQuery, and PyLibrary.