Solved

Custom IP is very slow (pass data from ISE module to XPS PPC)

Posted on 2011-02-22
Last Modified: 2012-05-11
Hi,
 
I'm currently running an application on a PPC440 in a Virtex-5.
At first, I sent data from a static buffer (in C code) from the PPC over a network link using LWIP, and that works pretty well.
Now, instead of a statically coded buffer, I want to get the data from my VHDL code (a module written in ISE). To do that I built a custom IP with a FIFO.

The PPC runs at 400 MHz and my VHDL code in ISE runs at 100 MHz.
My problem is that I feed the FIFO with 32-bit data every 10 ns, BUT when I use "CUSTOM_IP_TCP_FIFO_mReadFromFIFO(XPAR_CUSTOM_IP_TCP_FIFO_0_BASEADDR, 0)" to read the FIFO from the PPC, it seems to take a lot of time, so the FIFO becomes full after a few moments.

I don't know whether this FIFO approach is the best way to pass data from a submodule to the PPC. Does someone know a better way?

Or maybe I'm doing something wrong with my custom IP (on a PLB bus)?

Thank you for your help, because I'm really lost and my project deadline is coming soon.

 david
Question by:DBTechnique
 

Author Comment

by:DBTechnique
ID: 34958719
someone?
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34960029
It seems you are accessing a resource (the FIFO) that is busy being written: if you write to the FIFO every 10 ns, there is no clock cycle left for reading it.
Try writing to the FIFO more slowly, say every 100 ns. That leaves 9 cycles for reading for every cycle used for writing.
0
 

Author Comment

by:DBTechnique
ID: 34961342
Hi,

Thanks for your advice, but I cannot change the 10 ns write period.

That is why I'm asking whether there is another way to deal with this problem. I can't believe there is no way to transfer data from a VHDL module to the PPC faster than 10 MHz (one word every 100 ns).
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34961417
I do not know the bandwidth of your local bus, but I'd think that you are saturating it (you say that the IP is running at 100 MHz).
A possible solution would be a double FIFO connected to your CPU through two separate and independent local buses. While one FIFO+bus is being written, you can read the other, and vice versa.
0
 

Author Comment

by:DBTechnique
ID: 34961630
OK, I'm not sure my explanation was good, because I don't fully understand the connection between the PPC and the VHDL submodule. So I'll explain it differently:

I have a PPC on a PLB bus (at 100 MHz, I think) created in XPS. I also have a VHDL submodule created in ISE that runs with a 10 ns clock.
I want the PPC and the submodule to exchange data (through a FIFO in my example).

What I don't understand is this: the VHDL is "multi-process", so I can write to the FIFO and do something else, but the PPC is single-threaded, so it can do only one thing at a time.
In other words, while it reads the FIFO it cannot do anything else.

What I'm trying to do is:
- VHDL -> sends data (ISE part, clk = 10 ns)
- FIFO wrapper -> reads the VHDL data and writes it to a FIFO for the PPC (it's a custom IP in XPS; I don't know its clock, but I assume it is the PLB one, so clk = 10 ns)
- PPC -> reads the values from the FIFO (needs maybe 10 * 10 ns to read one word)

Your double FIFO solution may be good, but is it possible to have 2 PLB buses? If yes, how can a single-threaded PPC handle 2 buses at a time?
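
To put rough numbers on the chain described in this post, here is a minimal sketch in C. It assumes a hypothetical FIFO depth (the thread never states one) and takes the 10 ns write period and the guessed 100 ns read period from the list above. It only illustrates why a slower reader must eventually let the FIFO fill up; it is not a model of the actual PLB timing.

```c
#include <assert.h>
#include <stdint.h>

/* Time in ns until a FIFO overflows when one word is written every
 * write_ns and one word is read every read_ns.
 * Returns 0 if the reader keeps up, i.e. the FIFO never overflows.
 * depth_words is a hypothetical parameter; the thread does not give it. */
uint64_t fifo_time_to_full_ns(uint64_t depth_words, uint64_t write_ns,
                              uint64_t read_ns)
{
    assert(write_ns > 0 && read_ns > 0);
    if (read_ns <= write_ns)
        return 0; /* reader is at least as fast as the writer */
    /* Net fill rate is (1/write_ns - 1/read_ns) words per ns, so
     * t = depth / rate = depth * write_ns * read_ns / (read_ns - write_ns). */
    return depth_words * write_ns * read_ns / (read_ns - write_ns);
}
```

For example, with an assumed 512-word FIFO, `fifo_time_to_full_ns(512, 10, 100)` gives about 5.7 us, so during a continuous burst a larger FIFO only delays the overflow.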
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34961683
The PPC only has to manage one FIFO at a time: the one being read.
In reality, maybe you do not need 2 PLBs, just one. What matters is that writing to the FIFO doesn't "block" the PLB.
But again, I am not sure that this is really your problem. It depends on how the FIFO and the entire system are implemented.
0
 

Author Comment

by:DBTechnique
ID: 34961849
Maybe my design is bad. Below is what I want to do; can you please advise me on the main steps I should take:

- I have a trigger (a pulse every 200 us)
- I have a VHDL submodule that sends (synchronized to the pulse) a "32-bit data packet" every 10 ns for "X" time
- I have a PPC that should send all this 32-bit data over the LAN using LWIP

Trigger______| |_________________________| |__________
32b packet ___XXXXXXXXXXXXXXX___________XXXXXXXXXX

Of course, if the burst of "32-bit data packets" lasts longer than 200 us, it cannot work. And if we send only 1 packet it will work, because we still have 199.99 us to handle it.
But I would like to achieve the best performance.

Thank you
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34961977
Just use a DMA controller that reads the 32-bit words from the device and writes them directly to memory. You do not need a FIFO if the DMA controller is fast enough.
 
0
 

Author Comment

by:DBTechnique
ID: 34962039
Is DMA faster than the PLB bus?

Let's say I use DMA to transfer data from the VHDL submodule to memory (SDRAM or block RAM?). After that I have to read the memory back to get the data and put it on the LAN with the line "tcp_write(pcb, BufStream, length * sizeof(Xuint32), 1);", right?

In the end, doesn't that take the same time or more?
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34962120
Due to the nature of the LAN and the data rate, the data received from the TCP socket on the client side of the application cannot be in real time.
You should calculate the data rate expected over the long term (samples per second), then calculate the maximum data rate you can "write" to the TCP socket. 100 Msamples/s is impossible with TCP and a 400 MHz processor, in my opinion.
0
 

Author Comment

by:DBTechnique
ID: 34962267
Of course you are right. To give you a real number, I need a data rate of 6 MByte/s (48 Mbit/s), which in my opinion is possible.
So can you please advise me on the best way to achieve this speed?
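
As a rough illustration of what 6 MByte/s means per trigger, here is a small sketch. It assumes the 200 us trigger period and the 10 ns word period mentioned earlier in the thread, and it assumes that the 6 MByte/s is an average over many trigger periods, which the post does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes that must be shipped per trigger period to sustain a given
 * average rate. */
uint64_t bytes_per_trigger(uint64_t rate_bytes_per_s,
                           uint64_t trigger_period_us)
{
    assert(trigger_period_us > 0);
    return rate_bytes_per_s * trigger_period_us / 1000000u;
}

/* Duration in ns of a burst of `bytes` when one 32-bit word arrives
 * every word_ns. */
uint64_t burst_duration_ns(uint64_t bytes, uint64_t word_ns)
{
    return (bytes / 4u) * word_ns;
}
```

Under these assumptions, 6 MByte/s corresponds to 1200 bytes (300 words) per 200 us trigger, which the submodule produces in a 3 us burst.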
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34962417
First I'd run a test: write continuously to the socket in a tight loop and measure the maximum transfer rate.
I'd bet that you can't reach 2-3 MByte/s. :-)

0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34962432
...of course, you MUST check the other side for integrity! It is worth nothing if you write 10 MByte/s to the socket but data goes missing on the wire!
0
 

Author Comment

by:DBTechnique
ID: 34962600
I did some tests, but I have trouble understanding the results.

- When I send 1 packet of 960 bytes (32 bits * 240): tcp_write only once, 1 ACK
   + All data is received by the host
   + The ACK from the host takes 250 us
   + I cannot calculate the data rate, because there is not enough data for my monitoring software to compute an average

- When I send 10000 packets of 960 bytes (32 bits * 240): tcp_write 10000 times, 10000 ACKs (I wait for the ACK before writing again)
   + All data is received by the host
   + The ACK from the host takes 250 us but sometimes goes up to 5000 us
   + The average data rate is 8 MBytes/s

I can see that I have a good data rate without missing data, but sometimes with a long ACK time. How is that possible??
0
 

Author Comment

by:DBTechnique
ID: 34962620
Do you know the average speed of LWIP in RAW mode?
I found a lot of different figures on the Internet, so I don't know which one is real ....
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34962703
Because that's how TCP works. :-)
ACKs aren't required to be received before sending other packets, because each packet can take a different path (route) to reach the destination, so a packet can arrive after packets that were sent later. Moreover, at the application layer ACKs aren't visible; the TCP/IP layer is completely transparent about this implementation detail.
I am surprised you reach an 8 MByte/s transfer rate, but then you are using a microcontroller without an operating system, so it may be a realistic approximation of the upper limit. I hope you do not have too many other "tasks" to perform...
As for the "real" transfer rate of LWIP in raw mode, it depends heavily on the hardware, so you must run some tests.
0
 

Author Comment

by:DBTechnique
ID: 34962834
I did another test, because the TCP buffer is also part of the equation.
I did the same test as before, but I increased the size of the packet (not the number of packets). You should know that the TCP buffer size is 8192 and the window is 2048 (I don't understand them ....)

- When I send 1 packet of 2160 bytes (32 bits * 540): tcp_write only once, 1 ACK
   + All data is received by the host
   + The ACK from the host takes 250 us
   + I cannot calculate the data rate, because there is not enough data for my monitoring software to compute an average

- When I send 10000 packets of 2160 bytes (32 bits * 540): tcp_write 10000 times, 10000 ACKs (I wait for the ACK before writing again)
   + All data is received by the host
   + The ACK from the host takes 250 us but sometimes goes up to 3000 us
   + The average data rate is 19 MBytes/s

It's really fast, unbelievable compared to what you said. Moreover, how can I have such a long ACK time and such a high speed, given that in my test I wait for the ACK of the first packet before sending the next, and so on.....?
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34962945
There is something wrong in your calculation:

10,000 packets x 250 us (minimum) = 2.5 seconds
10,000 packets x 2160 bytes = 20.6 MByte

20.6 MByte / 2.5 s = ca. 8 MByte/s

That is in the BEST case, where every ACK takes 250 us.
(Side question: how did you measure 250 us?)

Is the data checked for correct reception on the other side?
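
The estimate above can be written as a small helper. This is only a sketch of the arithmetic, assuming that each packet is sent strictly after the previous one has been acknowledged, so that the ACK time bounds the rate. The function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on throughput in bytes/s when each packet is sent only
 * after the previous one has been acknowledged. */
uint64_t stop_and_wait_bytes_per_s(uint64_t packet_bytes,
                                   uint64_t ack_time_us)
{
    assert(ack_time_us > 0);
    return packet_bytes * 1000000u / ack_time_us;
}
```

With a 250 us ACK time this gives 8.64 MByte/s for 2160-byte packets and 3.76 MByte/s for 940-byte packets, so a monitor reporting more than that implies either a shorter real ACK time or a measurement error.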
0
 

Author Comment

by:DBTechnique
ID: 34963159
Actually, the packets I send (940 or 2160 bytes) have real data inside. I also have my own software (connected to the socket) that receives the packets and tells me how many it received. For it to count "+1" it has to find the 940 or 2160 bytes of 32-bit data.

So when I send 940 bytes of data 10,000 times, my software tells me it found 10,000 packets (of 940). So I'm sure of the number of packets I received.

I don't fully understand your calculation, because:
10,000 x 250 us = 2.5 s => 10,000 x 2160 = 21.6 MBytes => 21.6 / 2.5 = 8.64 MBytes/s
10,000 x 250 us = 2.5 s => 10,000 x 940 = 9.4 MBytes => 9.4 / 2.5 = 3.76 MBytes/s
So why does the speed monitoring software I'm using say 8 or 19 MBytes/s and not 8 or 3???

I'm using a small program that monitors the network card of my computer and shows the traffic speed.
For the write/ACK time in us, I have a counter in my FPGA on a 10 ns clock. I read it before the write, then once more when I receive the ACK, and I subtract the two (minus 110 ns for the read function of the counter).
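
The measurement described in the last paragraph can be sketched as follows. The 10 ns tick and the 110 ns read overhead come from the post; the 32-bit counter width is an assumption, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed time in ns between two readings of a free-running counter.
 * Assumes a 32-bit counter (hypothetical width): unsigned subtraction
 * yields the right tick count across a single wraparound.
 * overhead_ns is the fixed cost of the read itself, subtracted once. */
uint64_t elapsed_ns(uint32_t start_ticks, uint32_t end_ticks,
                    uint32_t tick_ns, uint32_t overhead_ns)
{
    assert(tick_ns > 0);
    uint32_t ticks = end_ticks - start_ticks; /* modulo 2^32 */
    uint64_t ns = (uint64_t)ticks * tick_ns;
    return ns > overhead_ns ? ns - overhead_ns : 0;
}
```

Note that a 32-bit counter at 10 ns per tick wraps after roughly 43 seconds, so intervals longer than that cannot be measured this way.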

 

0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34963216
Do you agree with my calculation? If so, I'd simply say that your program doesn't calculate the rate correctly.
Each time it receives a packet it should count the data read (taking the packet size into account). At the end it must calculate the data rate by dividing by the total time ...

0
 

Author Comment

by:DBTechnique
ID: 34963326
Actually your calculation is right; I think I got it wrong when I gave you the write/ACK time.
I redid the test and I get an average of 120 us, so with the same equation as before we get something like:
- 2160-byte packets => 18 MByte/s
- 940-byte packets => 7.8 MBytes/s

Don't you think the speed is incredibly high??

Now that we agree on the speed (are you OK with 18 MBytes/s??), the problem is that I send a packet of data, wait for the ACK, then send another one.
But when the ACK takes 5000 us to come back (more than 200 us, remember the trigger :)), obviously my FIFO is full.
So how can I manage that?
0
 

Author Comment

by:DBTechnique
ID: 34963354
I forgot to say: I didn't test it, but since going from 940 to 2160 bytes doubled my data rate, do you think that if I send, for example, 4096-byte packets I will reach an even higher rate?
That would be an even more unbelievable speed, don't you think?
0
 

Author Comment

by:DBTechnique
ID: 34963455
I think my problem should not be so hard for someone who is familiar with TCP, PPC and LWIP, but the main problem is that I'm not skilled enough in these domains and I don't have enough time to become so. So I'm running into a lot of problems that are not real problems.

I don't know if you are able to do this (I mean with regard to this website's rules), but I wonder if you could help me more deeply with my problem. I mean we could talk over chat software (it's more comfortable than a forum), then I could share my files and code with you and you could help me test in a real environment. It would be easier and more efficient than theoretical questions one by one.
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34963463
Yes. I would check the integrity of the packets: fill each packet with an increasing 32-bit counter, starting from 0. So the first packet would start with 0x00000000 / 0x00000001, the second with 540 / 541 (0x21C / 0x21D), et cetera.
The testing program must check that the right counter values are inside the packets. If they aren't, the test fails and the measured data rate is wrong. If they are, then all data really is written to the wire and the data rate is right.
One thing you didn't mention: what kind of host (Windows PC, FPGA, etc.) is the testing program running on? 120 us for an ACK is a very short response time.
But the test I suggest should remove any doubt about the feasibility.
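
A minimal sketch of such a counter test, in C. The function names are made up for illustration, and byte-order conversion between the big-endian PowerPC and a little-endian PC is deliberately left out, so a real receiver would have to swap the bytes of each word before comparing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sender side: fill one packet with consecutive 32-bit counter values
 * starting at *next, and advance *next so that the following packet
 * continues the sequence. */
void fill_packet(uint32_t *packet, size_t words, uint32_t *next)
{
    assert(packet != NULL && next != NULL);
    for (size_t i = 0; i < words; i++)
        packet[i] = (*next)++;
}

/* Receiver side: return 1 if the packet continues the sequence at
 * *expected (advancing *expected), or 0 at the first mismatch. */
int check_packet(const uint32_t *packet, size_t words, uint32_t *expected)
{
    assert(packet != NULL && expected != NULL);
    for (size_t i = 0; i < words; i++) {
        if (packet[i] != *expected)
            return 0;
        (*expected)++;
    }
    return 1;
}
```

With 540 words per packet, the first packet carries 0 to 539 and the second starts at 540, so any lost, duplicated or reordered word shows up as a mismatch.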
0
 

Author Comment

by:DBTechnique
ID: 34963642
About the environment: I work with a Xilinx Virtex-5 FX70 and a PPC440. My computer is a dual-core 2.4 GHz laptop with 4 GB of RAM running 32-bit Windows XP, and I have a standard 1000BT LAN card.
The system clock of my FPGA is a 100 MHz crystal and the PPC clock is 400 MHz.

Actually I'm partially doing the test that you propose. In each packet sent (940 or 2160 bytes) I have one 32-bit word that is a counter value. So when I send a 940-byte packet (235 @, i.e. 32-bit words) 10,000 times, each received packet has a different value at "@ = 13".

You should know that "@=0" and "@=235" are the start and stop flags and "@=1" is the length of the packet.
So my current software receives a start flag, then checks the length, and if the stop flag is where the length says it should be, it accepts the packet. If even one @ (32 bits) is missing, the entire packet is dropped.
So when this software says it counted 10,000 packets of 940 bytes, I'm sure it received 10,000 packets.
Moreover I'm able to save each packet and check it, and I can easily see that "@=13" was incremented 10,000 times.

Is this test enough?
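
For illustration, the acceptance rule described in this post could look roughly like the sketch below. The flag values are hypothetical (the post does not give them), and the sketch assumes that word 1 holds the total frame length in 32-bit words and that the stop flag is the last word of the frame. Byte order is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define START_FLAG    0xAAAA5555u /* hypothetical value */
#define STOP_FLAG     0x5555AAAAu /* hypothetical value */
#define COUNTER_INDEX 13u         /* word index of the counter, from the post */

/* Return 1 and store the counter if buf holds one complete, well-formed
 * frame; return 0 so that the caller drops the whole frame otherwise.
 * Assumes word 1 is the frame length in words and that the stop flag is
 * the last word of the frame. */
int frame_accept(const uint32_t *buf, size_t avail_words,
                 uint32_t *counter_out)
{
    assert(buf != NULL && counter_out != NULL);
    if (avail_words < 2 || buf[0] != START_FLAG)
        return 0;
    size_t len = buf[1];
    if (len <= COUNTER_INDEX + 1 || len > avail_words)
        return 0; /* too short to hold the counter, or truncated */
    if (buf[len - 1] != STOP_FLAG)
        return 0;
    *counter_out = buf[COUNTER_INDEX];
    return 1;
}
```

A check like this proves that frames arrive whole, but only one word per frame is verified, so corruption elsewhere in the payload would go unnoticed.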


0
 

Author Comment

by:DBTechnique
ID: 34964666
I think your day is over for today, so i hope we can continue tomorrow.
good night :).
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34968019
OK, so we can be reasonably confident that, in a tight loop, you can transfer up to 19 MByte/s.
Back to the original topic: you should use a DMA controller. That way you can transfer the data to memory without spending CPU cycles.
0
 

Author Comment

by:DBTechnique
ID: 34968286
Morning,

OK, but don't you think 19 MByte/s is too much compared to what you expected (2-3 MByte/s)? The DMA seems to be good, but I still have a problem:
- Since I write and then wait for the ACK before writing again, my FIFO becomes full when an ACK takes 5000 us.

Now about the architecture.
Currently I have:
32bitsData -1-> Fifo -2-> PPC buffer -3-> TCP_send+ACK
1/ 940-byte packet @ 100 MHz
2/ 940-byte packet @ 10 MHz
3/ 940-byte packet + TCP header @ 120 us when the ACK is fast

With DMA:
32bitsData -1-> Fifo -2-> Memory -3-> PPC buffer -4-> TCP_send+ACK
1/ 940-byte packet @ 100 MHz
2/ 940-byte packet @ ?
3/ 940-byte packet @ ?, + need to know when there is new data (no empty flag, since it's a memory)
4/ 940-byte packet + TCP header @ 120 us when the ACK is fast

Is that correct? If yes, do you think the DMA path is faster than the FIFO one?
Moreover I still have the problem of the ACK that can take 5000 us instead of 120 us, which messes everything up.
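
To compare the stages listed above, the per-packet time of each word-by-word stage can be estimated as below. This is only a sketch: the 100 MHz and 10 MHz word rates are the figures from the post (the second one is the author's own estimate), and the function name is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Time in ns to move `bytes` as 32-bit words at `word_rate_hz` words
 * per second. */
uint64_t transfer_time_ns(uint64_t bytes, uint64_t word_rate_hz)
{
    assert(word_rate_hz > 0);
    return (bytes / 4u) * 1000000000u / word_rate_hz;
}
```

For a 940-byte packet this gives 2.35 us at 100 MHz and 23.5 us at 10 MHz. Both are small next to the 120 us quoted for the TCP send and ACK, which suggests that the waiting on the network side, not the FIFO read, dominates the per-packet time.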
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34968385
Yes, it is faster, because you do not have to read each word from the bus in a tight loop; you can transfer the block of memory directly.
For the new data, just use an IRQ.
You could use a register to read the data length (if needed) from the DMA controller (actually your reading device), or use a scatter-gather strategy to put different blocks of data around in memory.

As for the ACK, it's not a problem: just use a queue to write packets in the correct order. You do not have to wait for the ACK, because the TCP/IP stack *should* take care of retransmissions.
0
 

Author Comment

by:DBTechnique
ID: 34968492
OK, let's say I'm able to add DMA and an IRQ and that everything works for reading data from the FIFO (which seems really hard to do, because I don't even know what DMA is). In that case I'll be able to get a block of memory and put it directly on the TCP socket.

Now it's good that you mention "You do not have to wait for ack, because the tcp ip stack *should* take care about retransmissions.", because I don't understand this part well either.

So let me tell you what happens:
- By the book (or for normal people :))
I read that usually you fill up your buffer and when it is full it is sent automatically. So usually people write to the buffer until they get an error (full). On error they wait (how long? for an ACK?), then resend the last packet, then continue to write until the next error. Is that correct?
If you want, you can use the function "tcp_output" to send the buffer even if it is not full.

- In my case
When I write a block of packets (a block in my case, because I read many packets) to the TCP socket, I feel (I don't know how to test it) that it is sent right away. So when I use tcp_output I cannot see any change.
I read that if you are in the received_callback it should be like that and send data as soon as it has something (without caring about full). In my case I am in the poll_callback; is it similar??

Before, I was sending packets on the TCP socket without caring about the ACK, and I lost a lot of data; that is why I now wait for it before sending a new one. Does that mean that the TCP stack did not handle the retransmission before??
0

 
LVL 12

Expert Comment

by:HappyCactus
ID: 34968628
I do not know the internals of LWIP or how you are using it. I have never even used LWIP; I am working with the Microchip stack and I hate it ;-)
But from what you say, I would guess that the stack doesn't handle the ACKs automatically. In that case you could use a "retransmission queue", i.e. a buffer where you store the packets that have not been ACKed yet, and retransmit them when a certain timer (the retransmission timer) has expired.
This way you can send as many packets as you want and only resend the ones that haven't received an ACK within, say, 2 or 3 seconds.
0
 

Author Comment

by:DBTechnique
ID: 34968764
As you said before, if we send for example 5 packets, it doesn't mean that we will receive ack1, then ack2, etc.
Maybe we will receive ack2, ack1, ack3, ack5, and ack4 is lost, for example.
So how do you know which ACK corresponds to which packet and which one should be resent?
Another thing: if I send packets 1, 2, 3, 4, 5 and then see, oh god, I have to resend number 2, it means I have to keep that packet in memory or somewhere, right? So how many packets do I have to store, given that in 2 or 3 s a lot of packets will be sent .......

Other information:
I modified my code and now I write packets until the socket buffer is full, then send it. Then I wait for an ACK before writing into it again until it is full, and so on.
The speed increased to 250 Mbit/s (30 MByte/s), can you believe that :) and all data is received.....
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34968773
ACKs carry a sequence number; see the TCP/IP specifications.
0
 

Author Comment

by:DBTechnique
ID: 34968829
........ but with this new speed I sometimes get an error and the socket connection goes down
0
 

Author Comment

by:DBTechnique
ID: 34968849
OK for the sequence number, but say I send 32 packets of 940 bytes and after 3 s I find out that packet 2 was not received, so I have to resend it. Does that mean I have to save 32 or more packets of 940 bytes somewhere?? That's too much .......
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34968871
You only need to keep the packets for which you have not received an ACK; in your example, only packet no. 2.
There is no other way: either you wait for the ACK, or you keep a retransmission queue.
Note that if you wait for the ACK and never receive one from the other host, you wait forever, or the FIFO overflows.
0
 

Author Comment

by:DBTechnique
ID: 34968958
Sorry, but I don't understand.
You said "You should only keep the packets that you do not received any ack", OK, but I don't know in advance which one will not get an ACK .....

- For example, in one process I send data
----packet1---2---3---4---5---6--etc
- The received ack
--------------ack1-----------------2-----------------------------3-----------------------------5
- In another process i checked the ack
------------------------------------------------------------------------------------------------------------------------- after 2s i see that i don't have the ack4

(-- is the time axis)
So when you see that ack4 does not exist and you have to resend the packet, we may already be at packet 48, so do I have to store 48 packets of 940 bytes?
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34969009
No.
When you (the stack) assemble the TCP header, you give it a sequence number.
When the packet is sent, you keep a copy somewhere (the retransmission queue).
When you receive an ACK, you remove the corresponding packet from the queue.
Periodically you must check when each queued packet was sent. If 2 seconds have passed, resend it.
So, if you receive ACKs 1, 2, 3, 5, ... 48, you only have packet no. 4 left in the queue, because only packet no. 4 needs to be resent.
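
The bookkeeping described above can be sketched in C as follows. All names and the queue capacity are hypothetical, and the packet payloads (which a real queue must also keep) are left out. A complete TCP implementation, LWIP included, normally does this kind of bookkeeping internally, so the sketch only illustrates the idea and is not something to bolt onto a TCP socket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SLOTS         8u    /* hypothetical capacity */
#define RETRANSMIT_AFTER_MS 2000u /* the 2 s suggested above */

struct pending {
    int      in_use;
    uint32_t seq;     /* sequence number of the packet */
    uint32_t sent_ms; /* time of the last transmission */
};

struct retx_queue {
    struct pending slot[QUEUE_SLOTS];
};

/* Remember a packet that has just been sent. Returns 0 if the queue is full. */
int retx_on_send(struct retx_queue *q, uint32_t seq, uint32_t now_ms)
{
    assert(q != NULL);
    for (unsigned i = 0; i < QUEUE_SLOTS; i++) {
        if (!q->slot[i].in_use) {
            q->slot[i].in_use = 1;
            q->slot[i].seq = seq;
            q->slot[i].sent_ms = now_ms;
            return 1;
        }
    }
    return 0;
}

/* Drop the packet that an ACK refers to. */
void retx_on_ack(struct retx_queue *q, uint32_t seq)
{
    assert(q != NULL);
    for (unsigned i = 0; i < QUEUE_SLOTS; i++)
        if (q->slot[i].in_use && q->slot[i].seq == seq)
            q->slot[i].in_use = 0;
}

/* Periodic check: if a packet's timer has expired, store its sequence
 * number in *seq_out, restart its timer and return 1; else return 0. */
int retx_next_due(struct retx_queue *q, uint32_t now_ms, uint32_t *seq_out)
{
    assert(q != NULL && seq_out != NULL);
    for (unsigned i = 0; i < QUEUE_SLOTS; i++) {
        if (q->slot[i].in_use &&
            now_ms - q->slot[i].sent_ms >= RETRANSMIT_AFTER_MS) {
            q->slot[i].sent_ms = now_ms;
            *seq_out = q->slot[i].seq;
            return 1;
        }
    }
    return 0;
}
```

The queue only has to hold the packets that are currently unacknowledged, so its size depends on the data rate multiplied by the worst ACK delay, not on the total number of packets sent.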
0
 

Author Comment

by:DBTechnique
ID: 34969121
So let's say I send 1024 packets of 940 bytes; I should receive 1024 ACKs.
After each send, LWIP saves the packet in the retransmission queue and deletes it when it receives the corresponding ACK.
Periodically, I check the queue (how?) and if some packets are still inside I resend them.
So in the end I don't really have to count how many ACKs came back, right?

Another thing: just for testing and understanding, I'm counting the ACKs. I count the number of times the sent_callback is called (which should correspond to the number of ACKs received), but even if my host receives, say, 65536 packets of 940 bytes, I count only 37562 ACKs (the number varies). Why??

0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34970480
The queue must be checked periodically with a timer. I do not remember exactly, but every 2 ms (500 Hz) should be OK.
For the second question, sorry, I do not know. Are you sure that "sent_callback" is related to the ACK and not to the PHY buffer being empty?
0
 

Author Comment

by:DBTechnique
ID: 34974298
Actually I now understand why I get fewer ACKs than frames. I read that LWIP decides when to send the buffer and the length is not always the same, so sometimes it sends 4 frames, sometimes 2, hence a different number of ACKs.
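
Since one ACK can cover several frames, counting acknowledged bytes is more reliable than counting callbacks. In LWIP's raw API, the callback registered with tcp_sent() receives the number of bytes acknowledged; the sketch below keeps only that bookkeeping and leaves out the pcb argument and the err_t return value so that it stands alone. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct tx_stats {
    uint32_t callbacks;   /* how many times the callback fired */
    uint64_t acked_bytes; /* total bytes acknowledged so far */
};

/* Body of a "sent" callback reduced to its bookkeeping: `len` is the
 * number of bytes acknowledged by this call. */
void on_sent(struct tx_stats *st, uint16_t len)
{
    assert(st != NULL);
    st->callbacks++;
    st->acked_bytes += len;
}
```

Dividing `acked_bytes` by the frame size (940 here) then gives the number of frames acknowledged, however the stack grouped them.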

I did other tests and the problem is still the same (slow ACKs).
I write data until my TCP buffer is full and it is sent on the socket. Until now I was waiting for an ACK and then continuing to write, so when the ACK took a really long time to arrive my FIFO was already full.
So I tried not to wait for the ACK but to write to the TCP buffer as soon as it is free. And guess what: it is freed when an ACK arrives.

So in fact I cannot send any data until I receive an ACK that frees space in the TCP buffer. What can I do???

Back to your DMA idea: I think it will be the same problem, because even if I write to memory (I'll save the FIFO time), I still have to wait for an ACK before I can read the memory and put the block on the socket.
Even if it's better, if some ACKs take 5-10 ms, the SDRAM will become full too.

I really have no idea how everybody handles this stuff......
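
How much buffering a slow ACK actually costs can be estimated with a little arithmetic. The sketch below assumes the 6 MByte/s average rate quoted earlier in the thread; the buffer capacity is a free parameter, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes that pile up while the sender is stalled for stall_us at a
 * given average input rate. */
uint64_t backlog_bytes(uint64_t rate_bytes_per_s, uint64_t stall_us)
{
    return rate_bytes_per_s * stall_us / 1000000u;
}

/* Longest stall in us that a buffer of capacity_bytes can absorb at the
 * same input rate. */
uint64_t stall_absorbed_us(uint64_t capacity_bytes,
                           uint64_t rate_bytes_per_s)
{
    assert(rate_bytes_per_s > 0);
    return capacity_bytes * 1000000u / rate_bytes_per_s;
}
```

At 6 MByte/s, a 5 ms stall accumulates 30,000 bytes and a 10 ms stall 60,000 bytes, so a buffer of a few tens of kilobytes would ride out ACK delays of that size as long as the average output rate keeps up.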
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34977990
Yes, of course... this is how a TCP/IP stack works. You can flush the buffer with the corresponding function call (see the docs). This also works for the receive buffer. This function usually sets the PSH flag in the TCP packet, so that the input buffer is PUSHED up to the receiving layer.

The problem you describe tells me that the software layer can't handle this transfer rate; your bottleneck is the time you need to handle the packets.
How much RAM do you have on board? And what is the maximum amount of data you must handle, i.e. how many 940-byte packets? Is it a limited number (for example 2-3 seconds' worth), or do you have to sustain this data rate indefinitely?
0
 

Author Comment

by:DBTechnique
ID: 34978728
Hi,

About RAM: do you mean FPGA block RAM or other memory? For example, I have 64 MByte of DDRAM (the one I'll use if I add the DMA, I think).

To explain more about the whole application: it runs indefinitely and receives analog data on its input, which is converted to digital and sent to the FPGA over RocketIO @ 4 Gbps.
The received data is processed and sent to the PPC, which should then send it to a computer.

So the number of packets is infinite.
Maybe I should have a big buffer that can store data even if an ACK takes a long time.

Other questions about the LWIP settings:
- What is the max length of the sndbuff?
- Is it important to set a big heap and stack? What is a good average size?
- When I declare an array in C code on the PowerPC, does it take a BRAM? If yes, can I access it from VHDL?

And then, since I'll try the DMA, do you know a good tutorial that can help me use it?
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34978770
A tutorial... very difficult, since each DMA has its own usage. Check the IP documentation.
For an introductory read about DMA, I can suggest Wikipedia: http://en.wikipedia.org/wiki/Direct_memory_access

With these requirements, you MUST use the DDRAM.
0
 

Author Comment

by:DBTechnique
ID: 34979010
I want to use the "xps_central_dma_0" v2.01b from XPS. I read the datasheet, but it explains nothing about how to implement it and how to connect it to other modules.
It also doesn't say which C instructions I should write to perform a read or write.

Do you know where I can find this information?
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34979739
The DMA controller should have some address space reserved for access to its registers. Check the configuration in the IP sources. From C you should write to and read from those registers to specify the source and destination addresses and the length, and to start / interrupt / query the transfer.
Check the documentation carefully; it should have all the necessary information.
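
The general pattern (program source, destination and length, then poll for completion) can be shown with a software model. The sketch below is only a simulation: the struct fields are hypothetical and do NOT match the real XPS Central DMA register map, and the copy that the hardware would perform on its own is modelled with memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Software model of a memory-to-memory DMA engine. The fields are
 * hypothetical; take the real register offsets from the IP data sheet. */
struct dma_model {
    const void *src;    /* source address register */
    void       *dst;    /* destination address register */
    uint32_t    length; /* byte count; writing it starts the transfer */
    uint32_t    status; /* bit 0 = done */
};

#define DMA_STATUS_DONE 0x1u

/* What the CPU does: program source and destination, then the length. */
void dma_start(struct dma_model *dma, const void *src, void *dst,
               uint32_t bytes)
{
    assert(dma != NULL && src != NULL && dst != NULL);
    dma->status = 0;
    dma->src = src;
    dma->dst = dst;
    dma->length = bytes;
    /* What the hardware would then do by itself, modelled as a memcpy. */
    memcpy(dma->dst, dma->src, dma->length);
    dma->status |= DMA_STATUS_DONE;
}

/* The CPU later polls the status (or waits for an interrupt instead). */
int dma_done(const struct dma_model *dma)
{
    assert(dma != NULL);
    return (dma->status & DMA_STATUS_DONE) != 0;
}
```

On real hardware the CPU is free to do other work between `dma_start` and the completion check, which is the whole point of using DMA instead of reading the port word by word.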
0
 

Author Comment

by:DBTechnique
ID: 34980471
First, what I don't understand is that I currently have a PPC linked to a DDR2_MemoryControler by an LLDMA bus. That means the PPC has its own controller to talk to the DDR2.

Now if I want the PPC and also my submodule to communicate with the DDR2, do I have to use an MPMC or the central DMA? Or both?

before :
PPC ----- LLDMA ------ Memorycontroler ------- DDR2

after :
VHDL sub module -----PLB---- | MPMC | ---?--| DDR2 |
PPC ----------------------LLDMA---- |              |         |              |

0
 

Author Comment

by:DBTechnique
ID: 34994634
Actually I found out how to connect the central DMA to a PPC440. But since it uses a PLB port to access the PPC, and the PPC then writes to the DDR, I don't understand how it differs in speed from using a custom IP on a PLB port....

If the central DMA uses the PLB, it will be as slow as before, and it seems that it also uses the PPC .....
Do you have any idea??
0
 

Author Comment

by:DBTechnique
ID: 34996403
??
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 34996436
No, the DMA lets you transfer a block of data from a port to RAM.
This means that the CPU is free to do something other than reading the port and saving the data in RAM.
Please take a look at the links I gave you about DMA, in particular the difference between DMA and PIO mode.
0
 

Author Comment

by:DBTechnique
ID: 34996565
OK, but I don't understand, because for the central DMA I need 2 PLB buses:
bus0: the PPC is master; the DDR2 controller, the central DMA and my custom IP are slaves
bus1: the PPC is slave and the central DMA is master

So when the central DMA needs to write data to the DDR2 it has to go through the PPC. Even so, it does not use the PPC?
0
 

Author Comment

by:DBTechnique
ID: 34999073
¿¿
0
 

Author Comment

by:DBTechnique
ID: 35004609
are you there ??
0
 

Author Comment

by:DBTechnique
ID: 35015190
no more help ?
0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 35016128
I do not know how your DMA IP works, so I suggest you check the documentation carefully. Maybe you need to configure your DMA IP.
0
 

Author Comment

by:DBTechnique
ID: 35016448
I managed to implement and configure my central DMA, but now that I'm running a test it doesn't work. When I read back all the registers to check, I get:

DMACR : 0xC0000000 => ok

SA : 0x4600F000 => ok

DA : 0x4600F100 => ok

LENGTH : 0x00000020 => ok

DMASR : 0xA5A5A5A5 => WRONG why this value ?

ISR : 0xA5A5A5A5=> WRONG why this value ?

IER : 0x00000003 => ok


The test is still running, but I get neither an error nor a done flag, so it is stuck.... Do you have an idea of what I didn't set up correctly?

Why do DMASR and ISR seem not to be initialized?
0
 

Author Comment

by:DBTechnique
ID: 35016816
I read the datasheet carefully, so can you please continue to help me?
0
 

Accepted Solution

by:
DBTechnique earned 0 total points
ID: 35026422
Hi,

I'm trying to use an XPS Central DMA. I connected it to the same PLB as my PPC440 and all the other peripherals. As a first step I'm trying a simple BRAM-to-BRAM copy. My code is at the end of this post.

I get no error and everything seems OK: the status register says the DMA is done, but when I read the location where the DMA should have written the data, nothing has been written.

To be sure the BRAM addresses are accessible, I wrote a value (0xDEAD0011) by hand to the first address. When I then read the data back (data_dam), I see my "0xDEAD0011", but the next word is 0x00000000.

So why do I get no error when the DMA doesn't seem to do its job????

Here is a copy of my UART output after one attempt:
data : 0xFFFDDB48
data_0 : 0xFFFDD50C
DPRAMREAD : 0xDEAD0011
DMA_DA : 0xFFFDDB4A
DMA_SA : 0xFFFDD50E
DMA_DMACR : 0xC0000000
DMA_DMASR : 0x00000000
DMA_ISR : 0x00000001
regvalue_reset:0
data_dam : DEAD0011
data_dam : 0


Thank you
int x;
XDmaCentral mdma;
//XDmaCentral_SelfTest(&mdma);

x=XDmaCentral_Initialize(&mdma,0);
if (x != XST_SUCCESS)
{
    xil_printf("Error Initialize \n\r");
    return (XST_FAILURE);
}

XDmaCentral_Reset(&mdma);
XDmaCentral_SetControl(&mdma, XDMC_DMACR_SOURCE_INCR_MASK | XDMC_DMACR_DEST_INCR_MASK);   
                 
static u32  data[2]= {0x00000000,0x00000000};
static u32  data_0[2]= {0xabababab,0xecececec};
u32 len=2;
u32 RegValue;
xil_printf("data : 0x%08X\n\r", data);
xil_printf("data_0 : 0x%08X\n\r", data_0);

XGpio_mWriteReg((void*)data,0,0xDEAD0011);
xil_printf("DPRAMREAD : 0x%08X\n\r",XGpio_mReadReg((void*)data,0));

XDmaCentral_Transfer(&mdma,(void*)data_0,(void*)data,len);
xil_printf("DMA_DA : 0x%08X\n\r", XDmaCentral_GetDestAddress(&mdma));
xil_printf("DMA_SA : 0x%08X\n\r", XDmaCentral_GetSrcAddress(&mdma));
xil_printf("DMA_DMACR : 0x%08X\n\r", XDmaCentral_GetControl(&mdma));
xil_printf("DMA_DMASR : 0x%08X\n\r", XDmaCentral_GetStatus(&mdma));
xil_printf("DMA_ISR : 0x%08X\n\r", XDmaCentral_InterruptStatusGet(&mdma));

do
{
    RegValue = XDmaCentral_GetStatus( &mdma );
    xil_printf("regvalue_reset:%x \n\r",RegValue);
}  while ((RegValue & XDMC_DMASR_BUSY_MASK) == XDMC_DMASR_BUSY_MASK);


if (RegValue &  ((XDMC_DMASR_BUS_ERROR_MASK)| XDMC_DMASR_BUS_TIMEOUT_MASK))
{
    xil_printf("entered error or timeout branch \n\r");
    XDmaCentral_Reset( &mdma );
    return XST_FAILURE;
}

int i=0;
while(i<2)
{                       
    xil_printf("data_dam : %x\n\r", data[i]);
    i++;
}
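
One detail in the UART output above may be worth checking. After the transfer, DMA_DA and DMA_SA are each exactly 2 higher than the addresses printed for data and data_0. That would be consistent with the length argument of the transfer being a byte count, so that len = 2 moves 2 bytes rather than 2 words. This is an assumption to verify against the driver header, and it would not by itself explain why the first word is unchanged. The helpers below are hypothetical and only show the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes the engine appears to have moved, judged from how far an
 * incrementing address register advanced. */
uint32_t bytes_moved(uint32_t start_addr, uint32_t addr_after)
{
    return addr_after - start_addr;
}

/* Byte count for a transfer of n 32-bit words. */
uint32_t words_to_bytes(uint32_t n_words)
{
    assert(n_words <= UINT32_MAX / 4u);
    return n_words * 4u;
}
```

If the argument is indeed in bytes, copying the two-word array would need a length of 8, for example `sizeof(data_0)`.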


0
 
LVL 12

Expert Comment

by:HappyCactus
ID: 35038141
As you want.
0
 

Author Closing Comment

by:DBTechnique
ID: 35081115
I found my solution alone.... kind of.... since you don't want to help me anymore on this topic.
Thanks for your previous answers, but I cannot give you an A since I had to find the solution alone.

Anyway, I'm opening a new question, because I now have trouble with interrupts.
0
