GT300: 256 GB/s memory bandwidth?
walterman
05-12-09, 07:06 AM
Finally off of the GDDR3
NV40 2004, G70 2005, G80 2006, G92 2007, GT200 2008, ...
Yup, almost 5 years using the GDDR3 technology.
Will pci-e 1.1 bottleneck the GT300's or will it just be incompatible?
I thought when the move was made to PCI-E 2.0 it was stated (somewhere) that although 2.0 has much more bandwidth the move wasn't made due to a lack of bandwidth in 1.1... I could be totally wrong but I swear I read an article somewhere stating that 1.1 had loads of bandwidth available but 2.0 was more efficient in some way.
Either way, I'm already on 2.0 so bring it on already!
Dreamweavernoob
05-12-09, 08:28 AM
NV40 2004, G70 2005, G80 2006, G92 2007, GT200 2008, ...
Yup, almost 5 years using the GDDR3 technology.
Bring on DDR7 :D
(nana2)(nana2):headexplode::headexplode:
XDanger
05-12-09, 04:08 PM
I could be totally wrong but I swear I read an article somewhere stating that 1.1 had loads of bandwidth available but 2.0 was more efficient in some way.
Still haven't had a definitive answer on this since the release of PCIe 2.0. What would saturate 1.1?
I hope my p5b can handle it.
josiahsuarez
05-12-09, 04:18 PM
Here is what Wikipedia says on the subject:
http://en.wikipedia.org/wiki/PCI_Express#PCI_Express_2.0
PCI Express 2.0
PCI-SIG announced the availability of the PCI Express Base 2.0 specification on 15 January 2007.[16] PCIe 2.0 doubles the bus standard's per-lane bandwidth from 0.25 GByte/s to 0.5 GByte/s in each direction, meaning a x32 connector can transfer data at up to 16 GByte/s. The per-lane signaling rate rises from 2.5 GT/s in the first version to 5 GT/s in PCIe 2.0.
PCIe 2.0 motherboard slots are backward compatible with PCIe v1.x. PCIe 2.0 cards are backward compatible too: new PCIe 2.0 graphics cards work in PCIe 1.1 motherboards, running at the bandwidth available from PCI Express 1.1. Overall, graphics cards or motherboards designed for v2.0 will be able to work with the other being v1.1 or v1.0.
The PCI-SIG also said that PCIe 2.0 features improvements to the point-to-point data transfer protocol and its software architecture.
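The per-lane figures in the quote above (0.25 and 0.5 GByte/s) can be reproduced from the signaling rates. This is only a rough sketch: it assumes the 2.5 GT/s and 5 GT/s lane rates and the 8b/10b encoding of PCIe 1.x/2.0, counts one direction only, and ignores packet overhead. The function name is made up for illustration.

```python
# Rough PCIe bandwidth per direction, assuming 8b/10b line encoding
# (PCIe 1.x and 2.0): every 10 bits on the wire carry 8 bits of data.

SIGNAL_RATE_GT_S = {"1.1": 2.5, "2.0": 5.0}  # giga-transfers per second, per lane


def pcie_bandwidth_gb_s(version: str, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s (1 GB = 10**9 bytes)."""
    raw_gbit_s = SIGNAL_RATE_GT_S[version] * lanes
    data_gbit_s = raw_gbit_s * 8 / 10  # strip the 8b/10b encoding overhead
    return data_gbit_s / 8  # bits -> bytes


if __name__ == "__main__":
    for version in ("1.1", "2.0"):
        for lanes in (1, 16, 32):
            bw = pcie_bandwidth_gb_s(version, lanes)
            print(f"PCIe {version} x{lanes}: {bw:.2f} GB/s per direction")
```

Under these assumptions a x16 slot works out to 4 GB/s on 1.1 and 8 GB/s on 2.0, so a 2.0 card in a 1.1 board simply gets half the peak bus bandwidth.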
CaptNKILL
05-12-09, 04:55 PM
but....but...... dont you want another opportunity to upgrade :D
:headexplode:
I'll upgrade my system when I find the perfect setup for the right price.
That will probably be after GT300 is released.
walterman
05-13-09, 07:20 AM
Still haven't had a definitive answer on this since the release of PCIe 2.0. What would saturate 1.1?
I hope my p5b can handle it.
If your gfx card has enough RAM, and the games create the texture/vertex/surface/... buffers in local video memory, the PCIe bandwidth is only used for the draw calls & primitives (the instructions of the 3D APIs).
XDanger
05-13-09, 07:44 AM
I like that answer, thanks.
It's a keeper. :)
walterman
05-13-09, 10:20 AM
But, also, you should know, that this was for a single card.
If you run SLI/Xfire, you have extra 'traffic' between the cards. This traffic could be huge, because the cards exchange data during the rendering process (for example, a lot of render targets that are as big as the screen resolution).
If you want to go SLI/Xfire, get the mobo with the fastest PCIe bus (full 16x slots & rev 2.0).
Also, it's a good idea to jump to the Core i7 for SLI/Xfire, because the bandwidth of the NB is used mainly for the PCIe buses, while in the previous architectures it was shared with main memory too (the FSB connected the CPU to the NB, and the NB connected to the RAM and PCIe). And the QPI bus has more bandwidth than any old FSB.
X48: http://www.dinoxpc.com/Tests/articoli/articolo/schedemadri_asus_p5e3_premium_images/x48_block_diagram.jpg
X58: http://hothardware.com/articleimages/Item1244/x58-diagram.jpg
Also, I don't know the bandwidth of those SLI/Xfire bridges, but given the big performance difference between the Core i7 & the previous chips, I think it must be ridiculously low.
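To put a rough number on the render target traffic mentioned above, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration only: the 1920x1200 resolution, 4 bytes per pixel, 4 screen-sized render targets exchanged per frame, and 60 fps. Real SLI/Xfire traffic depends on the game and the driver's rendering mode.

```python
def render_target_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed, screen-sized render target."""
    return width * height * bytes_per_pixel


def inter_gpu_traffic_gb_s(
    width: int,
    height: int,
    targets_per_frame: int,
    fps: int,
    bytes_per_pixel: int = 4,
) -> float:
    """Bytes exchanged per second (in GB/s) if every target is copied each frame."""
    per_frame = render_target_bytes(width, height, bytes_per_pixel) * targets_per_frame
    return per_frame * fps / 1e9


if __name__ == "__main__":
    # Assumed workload, not measured data.
    traffic = inter_gpu_traffic_gb_s(1920, 1200, targets_per_frame=4, fps=60)
    print(f"Estimated inter-GPU traffic: {traffic:.2f} GB/s")
```

With these made-up numbers the estimate is a bit over 2 GB/s, a large slice of the roughly 4 GB/s a 16x PCIe 1.1 slot offers per direction, which is consistent with the advice to prefer full 16x rev 2.0 slots.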
Pistolgrip
05-16-09, 08:20 PM
If your gfx card has enough RAM, and the games create the texture/vertex/surface/... buffers in local video memory, the PCIe bandwidth is only used for the draw calls & primitives (the instructions of the 3D APIs).
The texture uploading (CPU->GPU) is the largest bandwidth consumer these days. If the bandwidth was sufficient you would be able to stream textures to the GPU each frame and do without a lot of video memory. Doing readbacks from the framebuffer and things of this nature are also greatly affected by the bandwidth of PCI-E and its latency.
The unified memory system of the XBOX is where PCs should be going because it's a ridiculous waste having such fast memory ONLY for the video card when the CPU(s) could do with 256GB/s memory too. Of course this would require a new motherboard design and some hardware issues would need to be solved but yeah... integrated GPUs (likely on the CPU) will be pushing us toward this goal in the future I feel, because it's easier to solve the hardware issues with a soldered GPU on the mobo.
walterman
05-17-09, 09:10 AM
The texture uploading (CPU->GPU) is the largest bandwidth consumer these days. If the bandwidth was sufficient you would be able to stream textures to the GPU each frame and do without a lot of video memory. Doing readbacks from the framebuffer and things of this nature are also greatly affected by the bandwidth of PCI-E and its latency.
The unified memory system of the XBOX is where PCs should be going because it's a ridiculous waste having such fast memory ONLY for the video card when the CPU(s) could do with 256GB/s memory too. Of course this would require a new motherboard design and some hardware issues would need to be solved but yeah... integrated GPUs (likely on the CPU) will be pushing us toward this goal in the future I feel, because it's easier to solve the hardware issues with a soldered GPU on the mobo.
Yup, but even if you fill the GTX280 with 1 GB of textures, a 16x PCIe 1.1 slot will send the data in about 1/4 of a second. Add a good texture manager with some sort of LRU policy, and it will keep the texture thrashing performance hit very low.
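A minimal sketch of the idea, assuming a 4 GB/s bus (16x PCIe 1.1, one direction) and a toy texture manager that evicts the least recently used texture when video memory is full. The class and function names are hypothetical; this is not how any real driver works, just the LRU policy in its simplest form.

```python
from collections import OrderedDict


def upload_seconds(num_bytes: int, bus_bytes_per_s: float = 4e9) -> float:
    """Time to push num_bytes over the bus at its theoretical peak rate."""
    return num_bytes / bus_bytes_per_s


class LruTextureCache:
    """Toy model of video memory managed with an LRU eviction policy."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.uploaded_bytes = 0  # total traffic sent over the bus so far
        self._resident: "OrderedDict[str, int]" = OrderedDict()  # name -> size

    def use(self, name: str, size_bytes: int) -> bool:
        """Touch a texture. Returns True on a hit, False if it had to be uploaded."""
        if name in self._resident:
            self._resident.move_to_end(name)  # now the most recently used
            return True
        if size_bytes > self.capacity_bytes:
            raise ValueError("texture is larger than video memory")
        # Evict least recently used textures until the new one fits.
        while self.used_bytes + size_bytes > self.capacity_bytes:
            _, evicted_size = self._resident.popitem(last=False)
            self.used_bytes -= evicted_size
        self._resident[name] = size_bytes
        self.used_bytes += size_bytes
        self.uploaded_bytes += size_bytes
        return False


if __name__ == "__main__":
    print(f"1 GB over the bus: {upload_seconds(10**9):.2f} s")
    cache = LruTextureCache(capacity_bytes=100)
    for name in ("a", "b", "a", "c", "b"):
        hit = cache.use(name, 40)
        print(name, "hit" if hit else "upload")
    print("bytes uploaded:", cache.uploaded_bytes)
```

Only the misses cost bus bandwidth, so as long as the working set of a scene mostly fits in video memory, the per-frame upload traffic stays small.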
The first time I read about the UMA architecture was in '96, when I saw an SGI O2 (a Unix workstation from that time). A UMA architecture for the PC will have thousands of problems due to the long legacy of the x86 architecture. Obviously, like you said, you will have the integration problem, but maybe if things like Larrabee succeed, things may change in the next few years.
Pistolgrip
05-17-09, 08:21 PM
Yup, but even if you fill the GTX280 with 1 GB of textures, a 16x PCIe 1.1 slot will send the data in about 1/4 of a second. Add a good texture manager with some sort of LRU policy, and it will keep the texture thrashing performance hit very low.
If PCIe was fast enough though you could have unlimited numbers of textures (limited only by system RAM) and use them whenever you want. Each frame could have completely different textures, basically because it would access system RAM exactly like it accesses its local RAM now.
The first time I read about the UMA architecture was in '96, when I saw an SGI O2 (a Unix workstation from that time). A UMA architecture for the PC will have thousands of problems due to the long legacy of the x86 architecture. Obviously, like you said, you will have the integration problem, but maybe if things like Larrabee succeed, things may change in the next few years.
Wouldn't you say XBOX was UMA for PC? As are most integrated GPUs currently. It's just the motherboard issues, not having fast enough pipes or enough of them. The issue has also been the lack of fast RAM that is pluggable. But if you had enough DDR2 sticks, you could hit current GDDR speeds, i.e. a motherboard which supports 8 sticks of DDR2-800, each on its own channel, could hit 50GB/s theoretical.
I haven't done much study into actual hardware engineering but I believe soldering memory is better than having pluggable memory when it comes to speeds, and having shorter copper paths also improves the speeds achievable. So the closer you can place the memory to the CPU/GPU the better off you will be. I hope with enhancements in DDR3 and the i7, with its triple-channel memory system, we will be looking at 50GB/sec soon. If AMD went triple/quad channel it would be an awesome integrated GPU setup, a very competitive product (similar performance to a dedicated 4830 with onboard RAM, with a lot more RAM available to it).
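The 50GB/s figure can be checked with simple arithmetic: peak bandwidth is the transfer rate times the bus width times the number of channels. The sketch below assumes 64-bit channels and one DDR2-800 stick per channel (eight independent channels), which is the optimistic case. The DDR3-1333 triple-channel line is an assumed configuration added for comparison, not something stated in the thread.

```python
def peak_bandwidth_gb_s(transfers_mt_s: float, bus_width_bits: int = 64, channels: int = 1) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_mt_s * 1e6 * bytes_per_transfer * channels / 1e9


if __name__ == "__main__":
    # Eight DDR2-800 sticks, each assumed to sit on its own 64-bit channel.
    print(f"8 x DDR2-800:  {peak_bandwidth_gb_s(800, channels=8):.1f} GB/s")
    # Assumed triple-channel DDR3-1333 setup, for comparison only.
    print(f"3 x DDR3-1333: {peak_bandwidth_gb_s(1333, channels=3):.1f} GB/s")
```

Eight channels of DDR2-800 give 51.2 GB/s on paper, which matches the "50GB/s theoretical" estimate. Sticks that share a channel add capacity but not bandwidth, so the number of independent channels is what matters.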
walterman
05-18-09, 08:08 AM
If PCIe was fast enough though you could have unlimited numbers of textures (limited only by system RAM) and use them whenever you want. Each frame could have completely different textures, basically because it would access system RAM exactly like it accesses its local RAM now.
There are other ways. You could use procedural textures, which you can generate on the GPU with maths, or with some blending & multitexturing. If you fill the system RAM with textures, you will probably have a bigger problem: virtual memory thrashing from excessive paging. Not a good idea.
Wouldn't you say XBOX was UMA for PC? As are most integrated GPUs currently. It's just the motherboard issues, not having fast enough pipes or enough of them. The issue has also been the lack of fast RAM that is pluggable. But if you had enough DDR2 sticks, you could hit current GDDR speeds, i.e. a motherboard which supports 8 sticks of DDR2-800, each on its own channel, could hit 50GB/s theoretical.
The XBOX wasn't designed to work with 30 years of x86 legacy. For example, the expansion cards of the 80's and 90's were designed to 'insert' their custom BIOS code above the first 640KB of RAM. The XBOX designers did not need to keep this 'compatibility' :o
A mobo with 8 x 64-bit DDRx channels would be extremely complex to design & build. All the memory traces must have exactly the same length (GDDR5 removes this problem). You could use XDR (Rambus) to overcome this problem, but it's expensive. Not to mention that a speed of 50GB/s is too slow today for a GPU. :D
I haven't done much study into actual hardware engineering but I believe soldering memory is better than having pluggable memory when it comes to speeds, and having shorter copper paths also improves the speeds achievable. So the closer you can place the memory to the CPU/GPU the better off you will be. I hope with enhancements in DDR3 and the i7, with its triple-channel memory system, we will be looking at 50GB/sec soon. If AMD went triple/quad channel it would be an awesome integrated GPU setup, a very competitive product (similar performance to a dedicated 4830 with onboard RAM, with a lot more RAM available to it).
You would have to ask an electrical engineer. Soldering will probably help keep the signal noise low, thus allowing you to reach higher frequencies on the bus. CPUs with big & smart caches suffer less than GPUs. Basically, half of the silicon in a CPU is used for the cache.