Count degrees in a file - Python

I have a file with a list of jobs, one job per line. Each job is a comma-separated string, and the degree is the field at index 5. I need to count the different types of degree across all jobs in the file. There are 5 different degrees in the file.
I wrote the code below, but it is not counting the total across jobs, even though the same degree repeats in many jobs. Can you help?

with open("100 Jobs - MedDeviceManuf.txt", 'r') as f:
    for job in f:
        degree = job.rstrip().split(',')[5]    
        types_degree = {}
        if degree in types_degree:
            types_degree[degree] += 1
        else:
            types_degree[degree] = 1
       
        print str(types_degree)
Iryna253Asked:
peprCommented:
Can you show a few lines of the text file?
Iryna253Author Commented:
IT Business Analyst,Siemens,Hutchinson KS USA,NA,Full time,bachelors degree,2,SAP,Word,Excel,PowerPoint,Outlook,excellent oral & written communication skills,leadership,team player
Senior IT Business Analyst,Siemens,Tarrytown NY USA,NA,Full time,bachelors degree,5,excellent oral & written communication skills,presentation skills,Business process mapping,Business Requirements Analysis
Business Analyst,Fresenius Medical,Austin TX USA,NA,Full time,bachelors degree,3,analytical skills,organizational skills,excellent oral & written communication skills,SQL,access,business Intelligence software,goal oriented,independent
DrDamnitCommented:
Put types_degree = {} above your "with" line.
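A minimal sketch of that fix, assuming the layout from the sample lines above (the degree is the comma-separated field at index 5). The `count_degrees` helper and the inline sample lines are made up for illustration; in the real script the lines would come from the open file. `collections.Counter` is used here as a shortcut for the if/else bookkeeping; a plain dictionary created once before the loop works the same way. `print(...)` is written with parentheses so the sketch also runs on Python 3.

```python
from collections import Counter


def count_degrees(lines, degree_index=5):
    """Count how often each degree appears across all job lines."""
    # The counter is created once, before the loop, so totals accumulate
    # instead of being reset for every job.
    types_degree = Counter()
    for job in lines:
        fields = job.rstrip().split(',')
        if len(fields) > degree_index:  # skip blank or short lines
            types_degree[fields[degree_index]] += 1
    return types_degree


# Hypothetical sample lines standing in for the real file.
sample = [
    "IT Business Analyst,Siemens,Hutchinson KS USA,NA,Full time,bachelors degree,2,SAP,Word",
    "Senior IT Business Analyst,Siemens,Tarrytown NY USA,NA,Full time,bachelors degree,5,presentation skills",
    "Research Scientist,Example Corp,Austin TX USA,NA,Full time,Ph.D.,3,python",
]
degree_counts = count_degrees(sample)
print(dict(degree_counts))
```

With the real file, passing the open file object `f` to `count_degrees` inside the `with open(...)` block should behave the same way, since a file iterates line by line.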
Iryna253Author Commented:
Thank you, I did that, and it counts correctly now. Next, I want to show these numbers on a pie chart. I did it the manual way; is there a way of feeding the output numbers into the pie chart so that I am not typing them into the code?

Output was {'Ph.D.': 10, 'bachelors degree': 73, 'masters degree': 11, 'associates degree': 1, 'NA': 5}

import matplotlib.pyplot as plt

types_degree = {}
with open("100 Jobs - MedDeviceManuf.txt", 'r') as f:
    for job in f:
        degree = job.rstrip().split(',')[5]
        if degree in types_degree:
            types_degree[degree] += 1
        else:
            types_degree[degree] = 1
print            
print str(types_degree)
       

labels = 'Ph.D', 'Bachelors Degree', 'Masters Degree', 'Associates Degree', 'N/A'
sizes = [10, 73, 11, 1, 5]
colors = ['yellowgreen', 'gold', 'lightskyblue', 'lightcoral', 'red']
explode = (0, 0.1, 0, 0, 0)

plt.pie(sizes, explode=explode, labels=labels, colors=colors,
        autopct='%1.1f%%', shadow=True, startangle=10)
plt.axis('equal')
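
One way to avoid typing the numbers by hand is to build the labels and sizes from the dictionary itself. This is only a sketch, not an answer given in this thread: `pie_inputs` and `draw_pie` are hypothetical helper names, the counts are copied from the output above, and matplotlib is imported inside `draw_pie` so that the data step can be checked without opening a chart window.

```python
def pie_inputs(counts):
    """Split a {label: count} dictionary into parallel label and size lists."""
    # Sort by label so the slice order is the same on every run.
    labels = sorted(counts)
    sizes = [counts[label] for label in labels]
    return labels, sizes


def draw_pie(counts):
    """Draw a pie chart directly from the counts dictionary."""
    import matplotlib.pyplot as plt

    labels, sizes = pie_inputs(counts)
    plt.pie(sizes, labels=labels, autopct='%1.1f%%', shadow=True, startangle=10)
    plt.axis('equal')  # equal aspect ratio keeps the pie circular
    plt.show()


# Counts copied from the output shown earlier in this post.
types_degree = {'Ph.D.': 10, 'bachelors degree': 73, 'masters degree': 11,
                'associates degree': 1, 'NA': 5}
labels, sizes = pie_inputs(types_degree)
print(labels)
print(sizes)
```

Calling `draw_pie(types_degree)` after the counting loop would then plot whatever the file produced, with no hard-coded sizes.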
DrDamnitCommented:
You should put that in another question to get more eyeballs on it. I generally do server stuff with python, never touched pie charts.

:-)
Iryna253Author Commented:
Oh OK, I will do that. Can you please help with one more question about this .txt file? I am trying to count the categories that I identified for each job, using a dictionary called skillsets. I would like to count how many jobs in my .txt file mention programming skills and how many mention database skills.

number = {}
with open("100 Jobs - MedDeviceManuf.txt", 'r') as f:
    for job in f:
        skillsets = { 'programming' : ['scripting language', 'r', 'python', 'C'] , 'database' : ['SQL', 'relational database']}
       
        for category in skillsets:
            category = skillsets.keys()
            if category in job.rstrip().split(',')[7:]:
                number[category] += 1
            else:
                number[category] = 1
print            
print str(number)
DrDamnitCommented:
I don't understand your last question, can you please restate it or give me sample output of what you're getting now and tell me what's wrong with that?
Iryna253Author Commented:
Now I am getting an error:
  number[category] = 1

TypeError: unhashable type: 'list'

My output should be something like this:

programming : 3 out of 100 jobs
database: 5 out of 100 jobs

The code should do the following: go to the file, take each job (each line), look at the fields from index 7 onward, identify words (like 'scripting language' or 'C') in those fields, map them to the keys/categories (programming, database), and finally give me the number of jobs in which each key/category was found.
DrDamnitCommented:
First, if the data you presented above is a real sample, then "field 7" (the eighth spot) may or may not even be correct.

Assuming this is a CSV, you have comma separated values in that field.

Regardless, I just wouldn't try it this way. I would likely do this in two passes:

1. First pass: extract / index all the keywords that are in the file (Word, SAP, C, etc...)
2. Second pass: loop through each one to build the counts

This is really a job for a database. But if you insist on doing it this way, try it with the two passes I described above. The first pass is required because we don't know what all the unique terms are (or how they are categorized, really). You'll have to get all the uniques and then manually categorize them into "database" or "Office work" or "programming".
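
The first pass could be sketched roughly as below, assuming the skills start at index 7 of each line, as in the sample posted earlier. `unique_skills` is a hypothetical helper, and the sample lines are shortened, partly made-up versions of that sample.

```python
def unique_skills(lines, first_skill_index=7):
    """First pass: collect every distinct skill term that appears in any job."""
    skills = set()
    for job in lines:
        fields = job.rstrip().split(',')
        # Strip stray spaces so " SQL" and "SQL" count as the same term.
        skills.update(term.strip() for term in fields[first_skill_index:]
                      if term.strip())
    return skills


# Hypothetical sample lines standing in for the real file.
sample = [
    "IT Business Analyst,Siemens,Hutchinson KS USA,NA,Full time,bachelors degree,2,SAP,Word,Excel",
    "Business Analyst,Fresenius Medical,Austin TX USA,NA,Full time,bachelors degree,3,SQL,access,Excel",
]
for skill in sorted(unique_skills(sample)):
    print(skill)
```

The printed list is what would then be sorted by hand into categories.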

Also, to be lazy, I would categorize each of these terms in separate files on the disk so that when the script loads, I can just load the dictionary from that file.
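
Loading one category from disk could look like the sketch below. The one-term-per-line file layout, the `load_terms` name and the `programming.txt` file name are assumptions for illustration, not something specified in this thread.

```python
import os
import tempfile


def load_terms(path):
    """Read one term per line into a set, ignoring blank lines."""
    with open(path) as f:
        return set(line.strip() for line in f if line.strip())


# Demo with a throwaway file; a real script would keep a file such as
# programming.txt (hypothetical name) next to the data file.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "programming.txt")
with open(demo_path, "w") as f:
    f.write("python\nC\nscripting language\n\n")

programming_terms = load_terms(demo_path)
print(sorted(programming_terms))
```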

Then, the second pass will simply read each line in the CSV, compare that field to the pre-defined dictionaries that you have already created by analyzing the uniques, and increment an integer counter.
Iryna253Author Commented:
I have actually already sorted all the unique skills into the categories that I called "programming" and "database", and I added them to the dictionary called "skillsets". Now I need to match the skills in each job to those categories, which is where I am stuck. Can you recommend a link where I can read about it, please?
DrDamnitCommented:
No link required.

Don't use a single dictionary with keys. Use multiple dictionaries (one for each) and then just use integer counters.
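
A rough sketch of that suggestion, using one set of terms per category (a set rather than a dictionary, since only membership matters) and plain integer counters. The term lists are the ones from the question, matched exactly and case-sensitively; the sample lines and the `count_categories` name are hypothetical. For what it's worth, the `TypeError` above comes from `category = skillsets.keys()`, which replaces the loop variable with a list of all the keys, and a list cannot be used as a dictionary key.

```python
PROGRAMMING = {'scripting language', 'r', 'python', 'C'}
DATABASE = {'SQL', 'relational database'}


def count_categories(lines, first_skill_index=7):
    """Second pass: count the jobs that mention at least one term per category."""
    programming_jobs = 0
    database_jobs = 0
    total_jobs = 0
    for job in lines:
        if not job.strip():
            continue  # ignore blank lines
        fields = job.rstrip().split(',')
        skills = set(term.strip() for term in fields[first_skill_index:])
        total_jobs += 1
        # A job counts once per category, however many matching terms it lists.
        if skills & PROGRAMMING:
            programming_jobs += 1
        if skills & DATABASE:
            database_jobs += 1
    return programming_jobs, database_jobs, total_jobs


# Hypothetical sample lines standing in for the real file.
sample = [
    "Analyst,Acme,Austin TX USA,NA,Full time,bachelors degree,3,SQL,python",
    "Engineer,Acme,Boston MA USA,NA,Full time,masters degree,5,C,Excel",
    "Clerk,Acme,Reno NV USA,NA,Full time,NA,1,Word,Outlook",
]
programming_jobs, database_jobs, total_jobs = count_categories(sample)
print("programming : %d out of %d jobs" % (programming_jobs, total_jobs))
print("database: %d out of %d jobs" % (database_jobs, total_jobs))
```

Comparing whole fields against the sets, instead of searching the line as text, keeps a short term like 'C' from matching inside longer words.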

Suhas .Senior QA ManagerCommented:
No comment has been added to this question in more than 21 days, so it is now classified as abandoned.

I have recommended this question be closed as follows:

Accept: DrDamnit (https:#a41327206)

If you feel this question should be closed differently, post an objection and the moderators will review all objections and close it as they feel fit. If no one objects, this question will be closed automatically the way described above.

suhasbharadwaj
Experts-Exchange Cleanup Volunteer