Solved

bash script to find file type

Posted on 2009-05-06
7
1,643 Views
Last Modified: 2013-12-26
Hi, I need assistance creating a bash script whose main purpose is to search a specific mount directory, /mnt/directories, and list the number of files per directory and their file types, as well as a global total of files and a total per file type. For example, if there are two directories, marc and john, under /mnt/directories, it should list the file types under each directory and the file size per directory, and then give the total number of files per type, for instance 16 .xls files and 15 .doc files. It should also provide the total number of files added, based on date, per extension.
Question by:dpoper1
7 Comments
 
LVL 29

Assisted Solution

by:MikeOM_DBA
MikeOM_DBA earned 20 total points
ID: 24319079

Nice homework assignment, what have you got?
 
Here are some hints:
 
man du
man ls
man awk
${file_name#*.}
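
The last hint can be tried out directly. This is a minimal sketch, assuming "extension" means the text after the last dot in the base name; get_ext is a hypothetical helper name, not something from the thread. Note that ${file_name#*.} removes the shortest prefix ending in a dot, so archive.tar.gz gives tar.gz, while the ## form removes the longest prefix and gives gz.

```bash
#!/bin/bash
# get_ext is a hypothetical helper: print the extension of a file name,
# or an empty line if the name contains no dot.
get_ext() {
    local name
    name=${1##*/}                              # drop any leading directory part
    case $name in
        *.*) printf '%s\n' "${name##*.}" ;;    # text after the last dot
        *)   printf '\n' ;;                    # no dot, so no extension
    esac
}
```

For example, get_ext /home/marc/report.doc prints doc. A hidden file such as .bashrc is reported as having the extension bashrc, which may or may not be what you want.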
 
 

Author Comment

by:dpoper1
ID: 24322480
OK, first of all, I would appreciate it if you could tell me a way to use the find command to find the file type; this would really help.

Regards,

michael
 
LVL 68

Assisted Solution

by:woolmilkporc
woolmilkporc earned 45 total points
ID: 24323696
Hi,
what do you mean by "file type"? An extension like ".txt"?
If so, there are some things to consider:

- Does every file have such an extension? If not, filter out those which don't!
- Are there directories in the path containing a dot (.)? If yes, filter them out!
Use 'find' to find only files. Use 'xargs' with 'basename' to get rid of the displayed directories (containing a dot or not). Use 'awk' to process only files with an extension (a dot in their name) and to print those extensions. Use 'sort' and 'uniq -c' to count.
Tell me how far you got with the above, and I'll help you further.
wmp
 
LVL 16

Assisted Solution

by:ai_ja_nai
ai_ja_nai earned 20 total points
ID: 24325248
You can also use the command

file

with the target file as a parameter to find out the file's MIME type. It works better than searching for extensions, which, I can assure you, makes absolutely no sense in Linux.
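
As a rough illustration of this suggestion, the sketch below counts files by MIME type instead of by extension. It assumes the file utility supports the -b and --mime-type options (as the common Linux implementation does); count_mime_types is a hypothetical name chosen for the example.

```bash
#!/bin/bash
# count_mime_types is a hypothetical helper: print a ranked count of the
# MIME types of all regular files below the directory given as $1.
count_mime_types() {
    # -b drops the file name from the output, --mime-type prints only the type
    find "$1" -type f -exec file -b --mime-type {} + |
        sort | uniq -c | sort -rn
}
```

Calling count_mime_types /mnt/directories would print one line per MIME type, most frequent first.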
 
LVL 14

Accepted Solution

by:
Monis Monther earned 40 total points
ID: 24330931
Try the following. It could be done better, but this is just something really quick; you can craft it to give you a nice report and do more checks. Just make use of the ideas here and make sure it runs correctly, as I have not tried it.


#!/bin/bash

# This loop goes through all your dirs under /mnt
for dir in /mnt/*/
do
    x=$(basename "$dir")

    # First we get the total number of files for each dir
    echo "Total number of files in directory $x is $(ls -1 "$dir" | wc -l) files"

    # Second thing is to get the size (cut keeps only the size column from du)
    echo "Directory $x is consuming $(du -sh "$dir" | cut -f1) of my disk space"

    # List all file types (extensions) in the dir and store them in a file called types.
    # grep keeps only names that contain a dot; sed strips everything up to the last dot.
    ls -1 "$dir" | grep '\.' | sed 's/.*\.//' | sort -u > types

    while read -r i
    do
        # awk compares the last dot-separated field literally, so "doc" does not also match "docx"
        echo "Number of files of type $i is $(ls -1 "$dir" | awk -F. -v ext="$i" 'NF > 1 && $NF == ext' | wc -l)"
    done < types

# Don't forget to close the first loop
done
exit 0
 
LVL 68

Assisted Solution

by:woolmilkporc
woolmilkporc earned 45 total points
ID: 24331061
OK,

just for fun, this is a one-liner to produce a ranked list of your file extensions (only files with extensions are taken into account):

find /mnt -type f -print0 | xargs -0 -n1 basename | awk -F"." '/\./ {print $NF}' | sort | uniq -c | sort -rn

To get a top list, e.g. the top 20, add "| head -20" at the end.

(Don't be surprised by the 'basename': it is there so that a "find ." command is processed correctly.)

Good luck,

wmp
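
The question also asked for a breakdown per directory (marc, john, ...). The following is only a sketch of one way to extend the pipeline idea above. It assumes GNU find (for -mindepth and -printf) and file names without embedded newlines; ext_counts_per_dir is a hypothetical name.

```bash
#!/bin/bash
# ext_counts_per_dir is a hypothetical helper: for each top-level directory
# under $1, print "directory<TAB>extension<TAB>count", sorted.
ext_counts_per_dir() {
    # %P prints each path relative to the starting directory, e.g. marc/a.doc;
    # -mindepth 2 skips files that sit directly in the starting directory.
    find "$1" -mindepth 2 -type f -printf '%P\n' |
        awk -F/ '{
            dir = $1                      # first path component: marc, john, ...
            n = split($NF, parts, ".")    # split the base name on dots
            if (n > 1) count[dir "\t" parts[n]]++
        }
        END { for (k in count) print k "\t" count[k] }' |
        sort
}
```

Running ext_counts_per_dir /mnt/directories would print lines such as marc, doc and a count separated by tabs, which can then be formatted into a report.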

 

Author Comment

by:dpoper1
ID: 24345483
My first step would be to list all of the files that have the extensions found, and then to find today's date.

I would appreciate your help.



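#!/bin/bash

# Collect the distinct extensions found under /mnt, one per line, in the file ext
find /mnt -type f -print0 | xargs -0 -n1 basename | awk -F"." '/\./ {print $NF}' | sort -u > ext

# Start with an empty result file, then append the matches for each extension
: > ext1
while read -r file
do
    # Quote the pattern so the shell does not expand it before find sees it.
    # The search is limited to /mnt, the tree the extensions came from.
    find /mnt -type f -name "*.$file" >> ext1
done < ext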
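
For the date part of the question, one possible approach is sketched below. It assumes GNU find (for -printf) and uses each file's modification date as a stand-in for the date it was "added", since that is what find reports by default; count_by_date_and_ext is a hypothetical name.

```bash
#!/bin/bash
# count_by_date_and_ext is a hypothetical helper: print how many files under $1
# share the same modification date (YYYY-MM-DD) and extension.
count_by_date_and_ext() {
    # -name '*.*' keeps only files whose name contains a dot;
    # %TY-%Tm-%Td is the modification date, %f the base name.
    find "$1" -type f -name '*.*' -printf '%TY-%Tm-%Td %f\n' |
        awk '{
            n = split($0, parts, ".")   # the text after the last dot is the extension
            print $1, parts[n]          # date, extension
        }' |
        sort | uniq -c
}
```

To see only today's files, the output could be filtered with something like count_by_date_and_ext /mnt/directories | grep " $(date +%F) ".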
