totoroha
asked on
Determine a file's true extension
In Linux, we can use the command file "example.txt" to determine the true type of a file. Even if I rename the original example.zip to example.txt, the file command can still recognize the true type by reading the magic header.
Is there a way to do this for the whole list of files inside a folder, using bash, PowerShell or Python? I prefer these languages because they are easy for me to read and understand.
Can anyone help me with this?
Thank you so much.
Okay, sounds like you are looking for a Windows alternative to the unix "file" command.
I don't think there is any built in way to do that, even with some of the scripting languages you mentioned.
You could try one of the ports of "file" to Windows, like:
http://gnuwin32.sourceforge.net/packages/file.htm
Or there are some utilities that attempt to do this - it's a bit of a guessing game by examining the file contents and trying to determine the file type based on that. One popular example of that is:
http://mark0.net/soft-trid-e.html
~bp
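To make the "examining the file contents" idea concrete, here is a minimal sketch that prints the first few bytes of a file as hex. Those leading bytes are the kind of signature that file compares against its magic database. The helper name magic_hex is made up for this illustration; head, od and tr are standard tools.

```shell
#!/bin/bash
# Sketch: show the leading "magic" bytes of a file as a hex string.
# magic_hex is a hypothetical helper name, not a standard command.

magic_hex() {
    # $1 = path to inspect, $2 = number of bytes to show (default 4)
    head -c "${2:-4}" "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

# Usage (not run here):
#   magic_hex example.txt      # a ZIP archive renamed to .txt prints 504b0304
#   magic_hex archive.gz 2     # a gzip file prints 1f8b
```

This only shows the raw signature. Turning it into a type name is what file (or TrID) adds on top, so in practice it is simpler to call file itself.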
ASKER
can you explain it to me Mike?
Bill: I have a Linux environment too, so I am not limited to Windows.
If you are using bash, you can always call the file command from the bash prompt or from within a bash script:
file filename
On Linux, can't you use the file command?
~bp
ASKER
I can use the file command in Linux for a single file. But what if I want to check hundreds of files? I cannot type it in a hundred times. ^_^
You can put it in a loop:
ls | while IFS= read -r myfile
do
    file "$myfile"
done
The ls command can also be narrowed down, for example:
ls *.txt | while IFS= read -r myfile
do
    file "$myfile"
done
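A variation on the loop above, as a rough sketch: iterating over a glob instead of the output of ls keeps names with spaces intact and lets the loop skip subdirectories. The function name describe_dir is invented for this example.

```shell
#!/bin/bash
# Sketch: run `file` on every regular file in a directory.
# describe_dir is a hypothetical helper name.

describe_dir() {
    local dir f
    dir=${1:-.}                   # default to the current directory
    for f in "$dir"/*; do
        [ -f "$f" ] || continue   # skip subdirectories and an unmatched glob
        file "$f"
    done
}

# Usage (not run here):
#   describe_dir /path/to/folder
```

Hidden files (names starting with a dot) are not matched by the * glob, so they are left out of this sketch.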
ASKER
Thank you, omar. But my coding skill is nearly 0 ^_^ If you can give me an example that works, I would really appreciate it.
PowerShell, Python or bash is fine.
#!/bin/bash
file *.txt
The example I gave can be put in a file, e.g. myscript. Run:
echo 'ls *.txt | while IFS= read -r myfile
do
    file "$myfile"
done' > myscript
(or use a text editor). Then make the file executable:
chmod +x myscript
Then you can call the script by its name:
./myscript
ASKER
I think the example from ozo is really good in this case. The thing is, I can use wget -i to read all the HTTP links in the text file and download all of them. After that I can check the downloads with the "file" command. The part I don't know is how to automate that whole process in Python or bash.
I would really appreciate it if anyone could help me with that.
You started with how to determine the file type for files inside a folder, and now you are talking about URLs, downloading files, etc.
Please elaborate on your exact requirements.
ASKER
Thanks, omar, for asking that. At first, I thought I could do the job manually by downloading the files, putting them in a folder and checking the file types. But after going through the discussion with everyone, I think that if I can automate this process, it would save me a great deal of time.
So, can you tell me your requirements? You can list the steps you want to perform, and then a script can be provided.
ASKER
Step 1: download every file from a list of HTTP links contained in a text file, A.txt.
Step 2: while downloading,
* if the file is text, put it in one sub-folder;
* if it is not a text file, put it in another sub-folder.
I think that's all.
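The two steps above could be sketched roughly as follows. This is not the accepted solution from the thread, only an illustration under stated assumptions: the function name fetch_and_sort and the folder names incoming, text and other are invented here, and the local file name is simply taken from the last part of each URL.

```shell
#!/bin/bash
# Sketch: download every URL listed in a file, then sort each download
# into text/ or other/ depending on what `file` reports.
# fetch_and_sort and the folder names are illustrative assumptions.

fetch_and_sort() {
    local list dest url name target
    list=$1                       # file with one URL per line, e.g. A.txt
    dest=${2:-.}                  # where the sub-folders are created
    mkdir -p "$dest/incoming" "$dest/text" "$dest/other"

    while IFS= read -r url; do
        [ -n "$url" ] || continue            # skip blank lines
        name=$(basename "$url")              # local name from the URL
        target="$dest/incoming/$name"
        if ! wget -q -O "$target" "$url"; then
            echo "download failed: $url" >&2
            rm -f "$target"
            continue
        fi
        # `file -b` prints the description without the file name in front
        if file -b "$target" | grep -qi 'text'; then
            mv "$target" "$dest/text/"
        else
            mv "$target" "$dest/other/"
        fi
    done < "$list"
}

# Usage (not run here):
#   fetch_and_sort A.txt downloads
```

Two URLs that end in the same file name would overwrite each other in this sketch, so real use would need a naming scheme for such collisions.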
ASKER
hi Omar,
Do you think my idea is clear enough or we need to clarify more? Thank you ^^
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
Dear Omar,
Thank you for your help. Now I can run my script without any errors. However, if I want to add another loop to your code, what should I do? And what language are you using for the code?
Thank you so much
This is a shell script (sh / ksh / bash).
Where do you want to add the loop?
ASKER
If the file is text, I want to move it to the text folder. If it is a zip file, I want to move it to the zip folder; if it is a binary file, to the binary folder. The rest can go to the spam folder. That's what I want to achieve, omar.
Thank you for your help!
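One way to approach the four-way sort described above is to ask file for a MIME type (file -b --mime-type), which is more uniform than its free-text descriptions. The sketch below rests on assumptions: the helper names folder_for_mime and sort_by_type are invented, "binary" is taken to mean unrecognized data or executables, and the exact MIME strings can differ between versions of file.

```shell
#!/bin/bash
# Sketch: sort the files of one directory into text/, zip/, binary/ and
# spam/ by MIME type. Helper names and the mapping are assumptions.

folder_for_mime() {
    # Map a MIME type string to a target folder name.
    case "$1" in
        text/*)                    echo text ;;
        application/zip)           echo zip ;;
        application/octet-stream|application/x-executable|application/x-sharedlib|application/x-pie-executable)
                                   echo binary ;;
        *)                         echo spam ;;
    esac
}

sort_by_type() {
    local dir f mime folder
    dir=${1:-.}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue              # leave existing folders alone
        mime=$(file -b --mime-type "$f")     # e.g. text/plain
        folder=$(folder_for_mime "$mime")
        mkdir -p "$dir/$folder"
        mv "$f" "$dir/$folder/"
    done
}

# Usage (not run here):
#   sort_by_type downloads
```

Keeping the mapping in its own small function means further types can be added as one extra case line each.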
Do you have the return messages for such file types?
ASKER
If we could also export the results to a CSV file with the file name and the message that file returns, that would be great!
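A CSV export of file name and returned message could look roughly like this. It is a sketch only: the function name file_report_csv and the column headings are assumptions, and quoting is limited to doubling embedded double quotes.

```shell
#!/bin/bash
# Sketch: print a two-column CSV (file name, `file` description) for
# every regular file in a directory.
# file_report_csv and the header names are illustrative assumptions.

file_report_csv() {
    local dir f name desc
    dir=${1:-.}
    echo '"file","type"'
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}                        # strip the directory part
        desc=$(file -b "$f")                 # description without the name
        # CSV escapes a double quote by doubling it
        name=$(printf '%s' "$name" | sed 's/"/""/g')
        desc=$(printf '%s' "$desc" | sed 's/"/""/g')
        printf '"%s","%s"\n' "$name" "$desc"
    done
}

# Usage (not run here):
#   file_report_csv downloads > report.csv
```

Quoting every field keeps the commas that file often puts in its descriptions from being read as column separators.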
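I don't have all the messages that file will return for each file type, but I can show you how the loop can be changed:
# sub-folder1 and other-folder2 must already exist
while IFS= read -r url
do
    wget "$url"
    name=$(basename "$url")        # wget saves under the last part of the URL
    filetype=$(file -b "$name")    # -b leaves the file name out of the output
    echo "$filetype" | grep -q text
    if [ $? -eq 0 ]
    then
        mv "$name" sub-folder1
    elif echo "$filetype" | grep -q gzip
    then
        mv "$name" other-folder2
    fi
done < A.txt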
Correction
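while IFS= read -r url
do
    wget "$url"
    name=$(basename "$url")        # wget saves under the last part of the URL
    filetype=$(file -b "$name")    # -b leaves the file name out of the output
    echo "$filetype" | grep -q text
    if [ $? -eq 0 ]
    then
        mv "$name" sub-folder1
    else
        echo "$filetype" | grep -q gzip
        if [ $? -eq 0 ]
        then
            mv "$name" other-folder2
        fi
    fi
done < A.txt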
:p