| | #31 |
| Registered User Join Date: Jan 2005
Posts: 1,232
| Since some people are looking for information and I am an advocate of core logic knowledge (this is not a pun), I think you will all find this article impressive. From the many sources I have read, this chip is an internal masterpiece. What catches my eye the most is the 128-bit SSE units. http://www.realworldtech.com/page.cf...WT030906143144 |
| (Offline) | |
| | #33 | |
| DP45SG/Q9650 ![]() Join Date: May 2003 Location: Alabama
Posts: 6,088
| Quote:
| |
| (Offline) | |
| | #34 | |
| DP45SG/Q9650 ![]() Join Date: May 2003 Location: Alabama
Posts: 6,088
| Quote:
| |
| (Offline) | |
| | #35 | |
| DP45SG/Q9650 ![]() Join Date: May 2003 Location: Alabama
Posts: 6,088
| Quote:
| |
| (Offline) | |
| | #36 |
| Registered User Join Date: Jan 2005
Posts: 1,232
| That is what I get for skimming rather than reading. You always have a post or a few that do not register. |
| (Offline) | |
| | #37 |
| DP45SG/Q9650 ![]() Join Date: May 2003 Location: Alabama
Posts: 6,088
| Actually I haven't gone through the whole article and you said something I hadn't read. PS. I will try to have my posts register certified from now on |
| (Offline) | |
| | #38 |
| Forget Wakeboarding Join Date: May 2004 Location: Texas
Posts: 2,460
| Yeah, the guy in that forum was even able to do a bulk order of T2600 ES's for members of the forum. I wish that I could be in his shoes. :eek:
__________________ |
| (Offline) | |
| | #39 | |
| ABX Folder Join Date: Mar 2006 Location: The Empire State
Posts: 477
| Quote:
Hmmmmm, that may be a sweet idea. A benefit for membership???? DXM | |
| (Offline) | |
| | #40 |
| Registered User Join Date: Jan 2005
Posts: 1,232
| When I read into the execution pipelines, I was surprised. Even the cache architecture, which Intel has always made very robust, is not in contention with the second core. The decoder, branch, retirement... there is not one part of it so far that I find to be a bad idea or a liability. This is built to be a really nice chip. “.......decodes 4+1 x86 instructions, issues 7 uops, reorders and renames 4 uops, dispatches 6 uops to execution units and retires up to 4 uops each cycle.” Wow! |
| (Offline) | |
| | #41 | |
| DP45SG/Q9650 ![]() Join Date: May 2003 Location: Alabama
Posts: 6,088
| Quote:
| |
| (Offline) | |
| | #42 | ||
| Registered User Join Date: Feb 2005 Location: Surrey, BC
Posts: 445
| Quote:
At best it can do: "In comparison, the Core CPU can perform 4 DP FLOPS/cycle and then some: a 128 bit multiply, a 128 bit add, a 128 bit load, a 128 bit store and then perhaps an ALU or fused compare and jump instruction in the last dispatch port." I have seen Intel's presentation as a PDF. It basically says the same thing as the link. Quote:
Shared caches were already good in Yonah. I am sure they are improved in Core, but that is negligible compared to the other enhancements. | ||
| (Offline) | |
| | #43 |
| Registered User Join Date: Feb 2005 Location: Surrey, BC
Posts: 445
| Merom benchmarks http://www.computerbase.de/news/hard...ore_duo_merom/
Merom 2.0GHz vs. Yonah 2.167GHz:
3DMark06: Merom 941, Yonah 808
3DMark01SE: Merom 17163, Yonah 15845
PCMark04: Merom 6940, Yonah 6544
3DMark03: Merom 5140, Yonah 5130
PCMark05: Merom 4309, Yonah 4213
3DMark05: Merom 2442, Yonah 2279
Both systems used 1GB DDR2 and a Mobility Radeon X1400. According to what I have seen over the net, Merom is 15-20% faster PER CLOCK compared to Yonah, with the same chipset and the same FSB. The doubled cache size only makes a 3-4% difference in 3DMark tests. The rest is architectural advancement. Core is simply amazing. |
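For anyone who wants to check the per-clock claim against the scores above, here is a rough sketch. It simply divides each score by the clock speed and compares the two chips. It assumes the scores scale linearly with CPU clock, which will not hold for tests that are limited by the X1400 graphics chip, so treat the output as a ballpark only. The names `scores` and `per_clock_gain` are made up for this example.

```python
# Clock speeds and scores as posted above (Merom 2.0 GHz, Yonah 2.167 GHz).
MEROM_GHZ = 2.0
YONAH_GHZ = 2.167

# benchmark name -> (Merom score, Yonah score)
scores = {
    "3DMark06": (941, 808),
    "3DMark01SE": (17163, 15845),
    "PCMark04": (6940, 6544),
    "3DMark03": (5140, 5130),
    "PCMark05": (4309, 4213),
    "3DMark05": (2442, 2279),
}


def per_clock_gain(merom_score: float, yonah_score: float) -> float:
    """Fractional advantage of Merom over Yonah after normalizing by clock.

    Assumes score is proportional to CPU clock (a simplification).
    """
    merom_per_ghz = merom_score / MEROM_GHZ
    yonah_per_ghz = yonah_score / YONAH_GHZ
    return merom_per_ghz / yonah_per_ghz - 1.0


gains = {name: per_clock_gain(m, y) for name, (m, y) in scores.items()}

for name, gain in gains.items():
    print(f"{name:<11} {gain:+.1%}")
```

With these six results the normalized gain lands between roughly 9% and 26%, with the GPU-heavy 3DMark03 at the low end, so the 15-20% figure looks plausible as a middle estimate.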
| (Offline) | |
| | #44 |
| Registered User Join Date: Jan 2005
Posts: 1,232
| You will usually find that the units are each particular to a function and therefore will not always be symmetrical. The diagram does not map this out, but the literature does clarify it somewhat. Yes, 128-bit units; now compare them to the previous 64-bit architectures and the instruction flow to them. This is nice no matter how you look at it. As you have noted, and so have I, it will perform quite well. The passage you quoted is better read in full:
"The Core microarchitecture also substantially improves on the floating point and SSE capabilities of its predecessors. Although Core’s 3 SSE units are not fully symmetric, the differences are relatively minor (shifting and multiplication resources). The SSE units are fully pipelined and each one can execute the appropriate 128 bit SSE operation in a single cycle. In comparison, the P4’s SSE resources are somewhat scanty; the two 64 bit SSE units use two cycles to execute 128 bit operations, and therefore are only partially pipelined. Similarly, Yonah only has 64 bit data paths. In comparison, the Core CPU can perform 4 DP FLOPS/cycle and then some: a 128 bit multiply, a 128 bit add, a 128 bit load, a 128 bit store and then perhaps an ALU or fused compare and jump instruction in the last dispatch port."
There is always contention with caches and multiprocessors. How the contention is alleviated depends on the architecture and coherency model used. I am not as pleased with the previous methods used. This design is more of a dual-core SMP implementation, whereas the method AMD uses, even with HyperTransport, seems archaic to me. Core's ability to work with its resources, both within itself and outside itself, is more robust and modern without a doubt.
"The shared L2 cache for the Core MPU is a non-inclusive, non-exclusive design. Latency numbers were not disclosed, but it is very likely that the L1D cache latency is 2-3 cycles, most likely 2. As previously mentioned, Core can transfer directly between the L1D caches in some variants. However, it is currently unknown how often this transfer can occur, how much data is transferred (probably a cache line) and whether such a transaction would replace an L2 cache access. The Core memory subsystem also implements new prefetchers designs to work effectively with shared caches. Each L1D cache has several prefetchers, and the L2 prefetchers dynamically allocate bandwidth between the two CPUs based on the data access patterns and intensity using a modified round-robin algorithm. The front-side bus is similarly arbitrated for fairness." |
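The "4 DP FLOPS/cycle" figure in the quoted article is just lane arithmetic, which can be sketched as below. A 128-bit SSE register holds two 64-bit doubles, so one 128-bit multiply plus one 128-bit add per cycle gives four double-precision operations. The function name and the "two units busy every cycle" setup for the 64-bit case are assumptions made for illustration, not something the article states in this form.

```python
def peak_dp_flops_per_cycle(fp_units: int, datapath_bits: int,
                            element_bits: int = 64) -> int:
    """Peak double-precision operations per cycle.

    fp_units      -- arithmetic units that can each start an op every cycle
    datapath_bits -- width each unit processes per cycle
    element_bits  -- size of one operand (64 for double precision)
    """
    lanes_per_unit = datapath_bits // element_bits
    return fp_units * lanes_per_unit


# Core, per the quote: one 128-bit multiply and one 128-bit add each cycle.
core_peak = peak_dp_flops_per_cycle(fp_units=2, datapath_bits=128)

# A 64-bit datapath design (the quote describes P4 and Yonah this way),
# assuming for illustration that two units can both be kept busy each cycle.
narrow_peak = peak_dp_flops_per_cycle(fp_units=2, datapath_bits=64)

print(core_peak, narrow_peak)
```

These are peak numbers only. Sustained throughput depends on loads, stores and instruction mix, which is why the article adds the 128-bit load and store alongside the arithmetic.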
| (Offline) | |
| | #45 |
| Registered User Join Date: Feb 2005 Location: Surrey, BC
Posts: 445
| Ah, OK, still not 100% clear, but better now. Intel's Israel design center + mobile team rocks!! I heard a new design will be out in two years!!! Better architecture than Conroe + maybe an IMC = wow, that's gonna be unbelievable. |
| (Offline) | |