Rename or Copy a Text File to a New File Based on Its Contents

Posted on 2012-09-11
Medium Priority
Last Modified: 2012-09-26
I have hundreds of text files with meaningless names like “UINI1234.TXT”, “UINI12345.TXT”, “UINI106__L01_001”, etc.
These small text files, no larger than 30 KB and typically 7 KB, contain important data that I need to be part of the file name.
The data I want in the file name always appears in the same position in these files, as shown in the PowerPoint slide included.

Each Text File has three main blocks:

            SE FACTURO

The problem to be solved is:
1)      To read these data from each text file
2)      To rename the file (or copy to a new text file) with a  name structured as:
3)      To list in a new file the text files renamed or copied.

I include two files already renamed as an example.


It is possible that two different files have identical contents; in that case, the later file has to be deleted. So it is important to check whether the file being processed would receive a name that has already been used.

The solution has to work in Windows 7, in Unix/Linux, or in Excel VBA.

Thank you very much.
Question by:doublemoon

Author Comment

ID: 38388888
I am waiting for a response, I would appreciate your help.
LVL 51

Expert Comment

ID: 38390793
As you did not specify where to find DATA1, DATA2, etc., here's just an example:

mv -i your-file `awk '/DIVISION CENTRO OCC/{a=$1;}/Numero De Cuenta/{b=$NF}END{print a"-"b".txt"}' your-file|sed -e 's/[:\/]/-/g'`

Author Comment

ID: 38391135
Thank you ahoffmann.
That's the idea. I have some clarifications:
   I ran the shell script on a Linux server and obtained a file named:


The "?" character caused a little problem working with the renamed file, but I renamed it with:

mv 2012* otherfile

DATA1, DATA2, ... are described in the attached PowerPoint slide; they are data contained in the original text files.

The names of all the files I need to rename begin with "UINI".

As an example, the attached file UINI001.txt should be renamed as:


That is the idea, I appreciate your valuable help.
LVL 51

Expert Comment

ID: 38391274
mv -i UINI001.txt `awk '/DIVISION CENTRO OCC/{a=$1;}/Numero De Cuenta/{b=$NF}END{print a"-"b".txt"}' UINI001.txt|sed -e 's/[?:\/]/-/g'`

If you have more characters like ? in the data, it might be better to make the final sed stricter, e.g.

    sed -e 's/[^a-zA-Z0-9,._]/-/g'
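
A minimal sketch of what that stricter sed does: every character outside the listed set becomes "-". The function name sanitize_name and the sample string are invented for illustration and do not come from the original files; LC_ALL=C is added here only so that the a-z and A-Z ranges behave predictably.

```shell
# sanitize_name STRING: replace every character that is not a letter,
# a digit, a comma, a dot or an underscore with "-".
# (hypothetical helper name, used only in this example)
sanitize_name() {
    printf '%s\n' "$1" | LC_ALL=C sed -e 's/[^a-zA-Z0-9,._]/-/g'
}

# Invented sample name containing "/", "?" and "-":
sanitize_name '2012/09/11?-31DF25E.txt'
# prints: 2012-09-11--31DF25E.txt
```

Note that "-" itself is not in the allowed set, so an existing "-" is simply replaced by another "-".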

Author Comment

ID: 38391633
Ok, ahoffmann.

I obtained:

There are no more "?" characters.

How do I cut the string so that only "DF25E" (3rd to 6th characters) is used in the name of the renamed file, and not the entire string "31DF25E034970100"?

Thank you.
LVL 51

Expert Comment

ID: 38391836
mv -i UINI001.txt `awk '/DIVISION CENTRO OCC/{a=$1;}/Numero De Cuenta/{b=substr($NF,3,5)}END{print a"-"b".txt"}' UINI001.txt|sed -e 's/[?:\/]/-/g'`

> only "DF25E" (3rd to 6th characters)
is an ambiguous definition; I used 5 characters ;-)
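
To make the difference concrete: in awk, substr(s, 3, 5) returns 5 characters starting at position 3, while "3rd to 6th" read literally would be only 4 characters. A quick check with the string quoted in the question:

```shell
s=31DF25E034970100
# 5 characters starting at position 3
echo "$s" | awk '{ print substr($0, 3, 5) }'   # DF25E
# positions 3 through 6 only (4 characters)
echo "$s" | awk '{ print substr($0, 3, 4) }'   # DF25
```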

Author Comment

ID: 38391994
Thanks again, ahoffmann.

How do I differentiate between the first and the second appearance of the string "Numero De Cuenta" in the file?

I need to specify the first or second time it appears because the associated data may be different.

Thank you for your help.
LVL 51

Accepted Solution

ahoffmann earned 2000 total points
ID: 38393992
mv -i UINI001.txt `awk '/DIVISION CENTRO OCC/{a=$1;}/Numero De Cuenta/{c++;if(c==1){b=substr($NF,3,5)}}END{print a"-"b".txt"}' UINI001.txt|sed -e 's/[?:\/]/-/g'`
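
The one-liner handles a single file. A possible way to apply it to every file whose name begins with "UINI", log what was done, and delete a later file whose contents are identical to one already renamed is sketched below. This is only a sketch under assumptions: the function names new_name and rename_all and the default log name renamed.log are invented here, and the awk patterns are the ones from this thread, so they have to be adapted to the real DATA1, DATA2, ... positions.

```shell
# new_name FILE: build the target name as in the one-liner
# (first "Numero De Cuenta" only, 5 characters from position 3).
new_name() {
    awk '/DIVISION CENTRO OCC/ { a = $1 }
         /Numero De Cuenta/    { c++; if (c == 1) { b = substr($NF, 3, 5) } }
         END { print a "-" b ".txt" }' "$1" | sed -e 's/[?:\/]/-/g'
}

# rename_all [LOGFILE]: process every UINI* file in the current directory.
rename_all() {
    log=${1:-renamed.log}
    for f in UINI*; do
        [ -f "$f" ] || continue                 # no match, or not a regular file
        target=$(new_name "$f")
        if [ "$target" = "-.txt" ]; then        # neither pattern was found
            echo "$f skipped, no data found" >> "$log"
        elif [ ! -e "$target" ]; then
            mv -- "$f" "$target"
            echo "$f -> $target" >> "$log"
        elif cmp -s "$f" "$target"; then        # same name and identical contents
            rm -- "$f"
            echo "$f deleted, duplicate of $target" >> "$log"
        else
            echo "$f skipped, $target exists with different contents" >> "$log"
        fi
    done
}
```

Running rename_all in the directory that holds the files (optionally with a log file name as its argument) leaves one line per processed file in the log. Files that would collide with an existing name but differ in content are left untouched, so nothing is overwritten.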

Author Closing Comment

ID: 38438115
Thank you for your valuable help!
LVL 51

Expert Comment

ID: 38438932
thanks for grading
enjoy the magic of pipes and backticks ;-)
