File manipulation in Python (indexing and extraction)

I just wanted to check whether my algorithm is right.
This is what my attached file looks like:
0      1818      1148
0      1818      1147
0      1819      1147
0      1818      1146
etc...
I have a list, [1,2,3,4,1,1], whose indices are [0,1,2,3,4,5], so we can see that indices 0, 4 and 5 belong to the same group.
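The grouping described here can be sketched in a few lines. This is only an illustration: the names `labels` and `group_indices` are made up and do not come from the original program.

```python
from collections import defaultdict

def group_indices(labels):
    """Map each class label to the list of indices where it occurs."""
    groups = defaultdict(list)
    for index, label in enumerate(labels):
        groups[label].append(index)
    return dict(groups)

labels = [1, 2, 3, 4, 1, 1]
groups = group_indices(labels)
print(groups[1])  # the indices that share class 1
```

A single pass over the list is enough; there is no need to rescan it once per class.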

And this is how I read the file to extract the groups:

for i in range(0,len(classList)):
			classe = classList[i].getGraphemClass()
			count = 0
			x0 = []
			y0 = []
			
			while count < len(classList):
				if classList[count].getGraphemClass() == classe:
					for j in range(0,len(classList[count].getListOfGraphems())):
						nb = str(classList[count].getListOfGraphems()[j])
						
						for line in file:
							line = line.strip()
							if not line:
								break
							else:
								line = line.split()
								if line[0] == nb:
									x0.append(line[1])
									y0.append(line[2])
									
				count = count + 1


coordonnes.txt
dadadudeAsked:

dadadudeAuthor Commented:
Is this code OK?
for i in range(0,len(classList)):
			x0 = []
			y0 = []
			file = open('C:\\Decomposition\\Analyse\\image139\\feat\\coordonnes.txt')
			for j in range(0,len(classList[i].getListOfGraphems())):
				classe = int(classList[i].getListOfGraphems()[j])	
				
				while 1:
					lines = file.readlines(100000)
					if not lines:
						break
					for line in lines:
						line = line.split()
						if int(line[0]) == classe:
							x0.append(int(line[1]))
							y0.append(int(line[2]))
					
						
			for c in range(0,len(x0)):
				x = x0[c]
				y = y0[c]
				self.im[x,y,0] = classList[i].getR()
				self.im[x,y,1] = classList[i].getV()
				self.im[x,y,2] = classList[i].getB()


peprCommented:
I suggest cleaning up the code a bit.  Firstly, the Python for-loop is more general than the C for-loop: it iterates directly over all the elements of a sequence.  You should not use range() this way only to obtain indices for accessing the list elements.

Do not use "file" as an identifier: it is a built-in name (in Python 2), and you mask it this way.  A plain f is enough in this case.  Your separate x0 and y0 lists would be better as one list of tuples.

line.split() produces a list of elements.  You probably should not assign it to line again, because that looks confusing.

I will start from the middle:

            points = []
            f = open('C:\\Decomposition\\Analyse\\image139\\feat\\coordonnes.txt')
            #...
            for line in f:    # loop through all the lines             
                lst = line.split()
                if len(lst) == 0:
                    break               # this is not that nice, but OK

                if int(lst[0]) == classe:
                    points.append( (int(lst[1]), int(lst[2])) ) 


It is better to avoid indenting with tabs.  Prefer 4 spaces per indentation level instead (your editor can expand the Tab key to spaces for you).

You can use forward slashes in paths in Python:

    f = open('C:/Decomposition/Analyse/image139/feat/coordonnes.txt')


Or you can use the r prefix on the literal: a raw string, in which backslashes are not interpreted as escapes.

    f = open(r'C:\Decomposition\Analyse\image139\feat\coordonnes.txt')
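
The three spellings can be compared without touching the disk. The sketch below uses the standard `ntpath` module, which applies Windows path rules on any platform; the variable names are only illustrative.

```python
import ntpath  # Windows path rules, importable on any platform

escaped = 'C:\\Decomposition\\Analyse\\image139\\feat\\coordonnes.txt'
raw = r'C:\Decomposition\Analyse\image139\feat\coordonnes.txt'
forward = 'C:/Decomposition/Analyse/image139/feat/coordonnes.txt'

# The escaped literal and the raw literal are the very same string.
print(escaped == raw)

# Windows accepts '/' as a separator too; normpath makes that explicit
# by converting forward slashes to backslashes.
print(ntpath.normpath(forward) == raw)
```

One caveat with raw strings: the literal cannot end in a single backslash, so a trailing separator still has to be written another way.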

peprCommented:
You re-read the file as many times as you have classes.  This is very inefficient.  You should also close the file once you are done with it.
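
One way to avoid the repeated reads is to go through the file once and keep the points grouped in a dictionary. This is a minimal sketch, assuming every data line has the three-column layout shown in the question (segment id, x, y). The names `parse_points` and `load_points` are hypothetical, and the last sample line is made-up data.

```python
from collections import defaultdict

def parse_points(lines):
    """Group (x, y) points by the segment id found in the first column."""
    points_by_segment = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        segment_id, x, y = (int(value) for value in fields)
        points_by_segment[segment_id].append((x, y))
    return dict(points_by_segment)

def load_points(path):
    """Read the whole file once; the with-block closes it afterwards."""
    with open(path) as f:
        return parse_points(f)

# Two lines from the question plus one invented line for segment 1.
sample = ["0      1818      1148\n", "0      1818      1147\n", "\n", "1      7      9\n"]
points = parse_points(sample)
print(points[0])
```

After that, looking up the points of any segment is a dictionary access, however many classes there are. Unlike the loop shown earlier, this sketch skips an empty line instead of stopping at it; that is a design choice, not something the thread prescribes.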
dadadudeAuthor Commented:
I have 63 classes, and it's very slow.
There are more than 4000 attributes, and each attribute has a list of (x,y) coordinates, as u can see in the file.
At first I created 4000 files with the x,y values in them. It worked fine, but it was very slow.
It's the first time that I have faced such a problem.

I might be reading the file in the wrong way.
dadadudeAuthor Commented:
As you can see in the file that I posted at first, I have 2943 elements.
dadadudeAuthor Commented:
And I am still confused about which format I should use for the file.
Should I use the one from the previous question or this one?
dadadudeAuthor Commented:
With the first question it worked perfectly and I was able to color the image, but it was very slow!
Now, with this algorithm, it's not working, although I go through the file correctly, I guess.
dadadudeAuthor Commented:
OK, I cleaned up the code. It's not reading all the classes, it's just giving me one! I don't get it.
It looks logical to me, but the results are weird. The indexing of the file is not working well.
original = 'C:/Decomposition/Analyse/image139/images/image139.png'
		im = imread(original,0)
		file = open('C:/Decomposition/Analyse/image139/feat/coordonnes.txt')
		for i in range(0,len(classList)):
			for j in range(0,len(classList[i].getListOfGraphems())):
				classe = int(classList[i].getListOfGraphems()[j])
				points = []
				for line in file:             
					lst = line.split()
					if len(lst) == 0:
						break              
					if int(lst[0]) == classe:
						  points.append( (int(lst[1]), int(lst[2])) )
								
				for c in range(0,len(points)):
					x = points[c][0]
					y = points[c][1]
					im[x,y,0] = classList[i].getR()
					im[x,y,1] = classList[i].getV()
					im[x,y,2] = classList[i].getB()


peprCommented:
I do not know the problem being solved, so it is difficult for me to guess what you need to achieve.  No problem.  Step by step.  We will both understand it better later.

The last loop of your snippet (lines 15 to 20, starting at "for c in range(0,len(points)):") can be rewritten:

				for p in points:
					x = p[0]
					y = p[1]
					im[x,y,0] = classList[i].getR()
					im[x,y,1] = classList[i].getV()
					im[x,y,2] = classList[i].getB()


You usually do not want to work with lists the way you are forced to in C.  There is no need for range() and indexing.
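
Because each point is a tuple, the loop can go one step further and unpack it directly. This is a small sketch that uses a nested list as a stand-in for the real image array; the helper `paint` and the tiny 2x2 image are made up for illustration.

```python
def paint(image, points, color):
    """Set every (x, y) in points to the (r, g, b) color, in place."""
    for x, y in points:  # tuple unpacking, no range() and no indexing
        image[x][y] = list(color)

# A 2x2 black "image": image[x][y] is an [r, g, b] list.
image = [[[0, 0, 0] for _ in range(2)] for _ in range(2)]
paint(image, [(0, 1), (1, 0)], (255, 0, 0))
print(image[0][1])
```

If im is a NumPy array (an assumption; the thread does not say where it comes from), the loop body could become `im[x, y] = color`, which sets the three channels at once.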

Where does imread() come from?
peprCommented:
dadadudeAuthor Commented:
ok here i am i'll post for u an image with colored segments: it will be all explained on it:
peprCommented:
As for the file format, a text format is fine.  You should choose the one that is easy to generate.  The earlier format may be better because it takes less space.  Disk speed probably limits the processing more than a slightly more complex algorithm would (and it is not really more complex, it only looks so).

But "In the face of ambiguity, refuse the temptation to guess."  Try this in Python's interactive mode:

>>> import this
dadadudeAuthor Commented:
Sorry, wrong post earlier. OK, here I am: I'll post for u an image with colored segments. It is all explained on it.
I circled one segment as an example. The circled segment will look like this in the file:

x:

X1    Y1
X2    Y2
.        .
.        .
Xn   Yn

So basically these coordinates will help me color the segments that belong to the same class with the same color.
Thank you for your help.

test.png
dadadudeAuthor Commented:
I will explain it in a better way.
Step 1: create the classes.
I have a list (called book in the program) such as [0,0,0,1,4,5,....]. The values are the classes and the indices are the segments.

Suppose that I have 9 classes; self.g3 is the number of classes:

# Code book construction
classList = []
# self.g3 = number of classes
# self.code = list of class numbers; the indices are the segments
for i in range(0, self.g3):
    # Create a random color
    r = round(rand.uniform(0, 255))
    v = round(rand.uniform(0, 255))
    b = round(rand.uniform(0, 255))
    listOfGraphems = []
    # Loop through self.code: whenever self.code[j] == i (the class
    # number), add the index j to the list. The book class then holds
    # the class ID (here i), its list of segments, and its color.
    for j in range(0, len(self.code)):
        if self.code[j] == i:
            listOfGraphems.append(j)
    classList.append(book(i, listOfGraphems, r, v, b))


dadadudeAuthor Commented:
Then I move on to the next step, the code I posted earlier, to color the image.
dadadudeAuthor Commented:
As for imread: I use it to load the RGB image that I want to color.
dadadudeAuthor Commented:
I think that the code is much clearer now: I dropped the indexing and switched to the Python way! Thank u for this info, it's great and way easier!
for c in classList:
			graph =  c.getListOfGraphems()
			r = c.getR()
			v = c.getV()
			b = c.getB()
			for i in range(0,len(graph)):
				classe = int(graph[i])
				points = []
				for line in file:             
					lst = line.split()
					if len(lst) == 0:
						break              
					if int(lst[0]) == classe:
						  points.append( (int(lst[1]), int(lst[2])) )
						  
								
					for p in points:
						x = p[0]
						y = p[1]
					
						im[x,y,0] = r
						im[x,y,1] = v
						im[x,y,2] = b


dadadudeAuthor Commented:
still can't color the segments!! WEIRD!!
dadadudeAuthor Commented:
Dear sir,
it worked!! with the other code that u posted!! as u said it was very simple!!!!!!!!!!!!! thank you sooo much for ur help.

Please can u tell me what will be ur algorithm just in simple on how to read this file if u were me.
and do u have advise me to use objects in lists as i am doing, because i find it very easy to manipulate objects, makes my work much easier.

I will also have to change how I iterate through the other lists.

Thank you.
Sincerely,
Hani.
dadadudeAuthor Commented:
Solution:

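for c in classList:
    wanted = c.getListOfGraphems()
    r = c.getR()
    v = c.getV()
    b = c.getB()
    # The loop variable must not be called r, or it would overwrite the red value.
    for seg in wanted:
        # a holds the segment data built earlier (not shown in this post);
        # a[seg][1] is the list of (x, y) points of segment seg.
        points = a[seg][1]
        for p in points:
            x = p[0]
            y = p[1]
            im[x,y,0] = r
            im[x,y,1] = v
            im[x,y,2] = b

return im,self.variance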


dadadudeAuthor Commented:
By the way, I am working on an interactive genetic algorithm with a beautiful interface. I would like u to see the executable version when I finish it. I still have to solve some problems related to PyQt4, but I am sure they will be solved.
peprCommented:
Hi Hani.  You apparently own a clever head (a bit younger than mine, I guess from "ur newspeak").  I guess you are Italian, based on test.png and on "verde".  You may be a postgraduate student trying the genetic algorithm approach to OCR.  I appreciate your politeness, expressed formally.  Still, we are both just people on a kind of forum where informal behaviour is rather the norm ;)  To summarize: no need for "Dear sir" :)  I prefer "friendship" to "hierarchy", as we all have things to accept from and to give to the others, and one feels better among friends than in a hierarchy.  Now back to the problem...

I strongly suggest not using tabs for indentation in your Python sources.  Even though it seems to be a marginal problem, it could get worse later.  The reason is that tabs can be interpreted in a way you do not expect.  They are not visible, and if combined with spaces...  Python relies on indentation, and with tabs combined with spaces it goes wrong very easily.  Use 4 spaces for one level.  Use no tabs in your sources.  This is a somewhat stronger recommendation than the "Python style guide" makes, but I do recommend it.
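
Since tabs are invisible in most editors, a short script can point them out. This is a rough sketch; the function name `lines_with_tab_indent` is made up, and it only inspects leading whitespace.

```python
def lines_with_tab_indent(source):
    """Return the 1-based numbers of lines whose indentation contains a tab."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        indent = line[:len(line) - len(line.lstrip())]
        if '\t' in indent:
            flagged.append(number)
    return flagged

# The second line is indented with a tab, the third with four spaces.
sample = "for p in points:\n\tx = p[0]\n    y = p[1]\n"
print(lines_with_tab_indent(sample))
```

To check a real source file, read its text and pass it to the function. Python 3 also refuses to run code that mixes tabs and spaces inconsistently (it raises TabError), which catches the worst cases automatically.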

Please can u tell me what will be ur algorithm just in simple on how to read this file if u were me.

Do you mean how to read the data from the text file?  What format to choose?  Or do you mean the source file?  Let's continue in comments...


and do u have advise me to use objects in lists as i am doing, because i find it very easy to manipulate objects, makes my work much easier.

Definitely yes.  I prefer the object-oriented approach whenever it is suitable, and Python expresses it extremely nicely.  Actually, every value used in Python is represented as an object.  Every variable is only a name bound to an untyped reference to the object.  Every assignment only copies the reference.  This way a list of integers is no more and no less complex than a list of other objects.
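
The point about references can be seen in a few lines. The `Segment` class below is a hypothetical stand-in for the objects kept in classList, not the class from the original program.

```python
class Segment:
    """Hypothetical stand-in for the objects stored in the list."""
    def __init__(self, ident):
        self.ident = ident
        self.points = []

first = Segment(0)
alias = first                      # copies the reference, not the object
alias.points.append((1818, 1148))

segments = [first, Segment(1)]     # the list stores references as well
print(first.points)                # the change made through alias is visible
print(segments[0] is alias)        # both names refer to one object
```

Putting objects in a list therefore costs no more than putting integers in it: in both cases the list holds references.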

For the future: I do not use PyQt, so I will not be able to help you there ;)
dadadudeAuthor Commented:
Thank you pepr for the comments. They were really helpful and solved all the problems. I also learned a new thing about lists and files so it's great. I'll keep you posted on my coding. When i finish it i'll post the executable and code.

dadadudeAuthor Commented:
Very good user