PS3: Supercomputer or a Clever Hype Machine?

It's coming soon, but will it be what everyone says it is, or will it be a disappointment? Nobody knows yet, but we can look at what we do know and spot some of the potholes in the road ahead.

Hyperbole
To understand the numbers cropping up on sites all over the net, you have to understand how the Hype Machine works and its history, as well as how multiple CPUs work together and how multithreaded applications work. I personally have only a basic knowledge of the concepts, but I know enough to read through theoretical figures and hype.

First of all, let's look at the history of the Hype Machine. The N64 is a classic example of this machine in action, and the reason I won't believe anything these big companies say. I remember looking through 'The EDGE' magazine and seeing wonderfully rendered still images. Now, I'm a graphic designer and I know a marketing image when I see one. The N64 had a glut of these high-resolution, print-quality images. I also know that there is no way these were game graphics, or even cut-scene graphics, as most video systems run at a significantly lower resolution than is needed for good-quality magazine images.
Next we come to all the hype about the memory bus being developed by a supercomputer company. I remember seeing hours of footage generated on SGI workstations and thinking, that's impressive, but it's not an N64. There is also the fact that SGI systems cost thousands of dollars, and a console developed in a similar manner would cost a similar amount, or the graphics industry would be using N64s, not SGIs.
When it came to the crunch and the N64 hit the shelves, it was not a supercomputer; it was a little less powerful than the Dreamcast and vastly inferior to the PS2. The Hype Machine, however, would have had us believe that the N64 would be able to do movie-quality graphics.

Now we look at the hype behind the PS2. The Emotion Engine, which I'm sure you're all familiar with, came with exactly the same claims and was shown off with similar pre-rendered images and video footage. I did not buy it the first time around, and Sony is doing exactly the same with the PS3: hype, and all this talk about how it will produce movie-quality graphics. So what happened with the PS2? Well, not only was the Emotion Engine hard to program for, the memory bandwidth was not able to cope. The proof of the pudding is in the eating, and this Hype Machine would have had us believe it was sweet when in actuality it was rather bland.

Now jump to the present day and the Hype Machine behind the PS3. So far I have seen nothing but pre-rendered stills. I have seen no evidence of actual game footage running on the architecture of the console. Considering there is still no working console containing a Cell chip, I doubt anything that is shown now will reflect the end result. So anything that comes from Sony's Hype Machine is aimed at building hype with the fans, and it seems to be working a treat.
Look at stills from PC games, and then wonder why the game never looks as good when it is running on your own PC. This is because the stills are produced on what is called an Art System/Server, which is geared with enough power to produce stunning, high-quality graphics. These systems always use cutting-edge technology and are usually out of the price range of mortal man. The console industry will do the same until it has a finalised, working architecture.
So until the machine is built and working, what we will see for any console is going to be pre-rendered, or run on a high-end graphics system dedicated to producing images that attract.

If Sony's Cell processor were so revolutionary and powerful, it would make every supercomputer company and PC manufacturer bankrupt in a matter of months. I really cannot see this happening, because I don't see Intel, Cray or Silicon Graphics panicking. In independent tests the first-generation Cell chips run a bit faster than a 4 GHz Intel chip; however, the Cell will suffer from being proprietary technology. Remember, the x86 architecture may be old, but Intel has had many years to perfect it, whereas Sony has had no more than a few months of testing on the Cell chip. Years of experience and tested technology versus marketing hype: I would rather believe in a known factor than hype any day.
Again, the Hype Machine would have you believe the Cell is going to be 10 times faster than current processors; however, I hear the word 'theoretical' bandied about, and this means it works on paper and the numbers look fantastic. But how many other things work on paper and look fantastic, and then when you actually implement them you are smacked in the face by the harsh mistress called reality? I'm sure you can name a few.

Now let's look at the Cell processor. Sony's aim is to produce a low-cost, multi-purpose, multi-platform chip that can be used in anything from microwaves to PCs. Now, as a veteran of the computer industry, I know a dedicated system performs its job better and faster than a system that is designed to do a number of different tasks. In the same way, a PC set up to run one type of application will run it better and faster than one set up to run several types. As many PC gamers know, what makes a good, fast server won't make a good, fast gaming machine; each has to be tailored to its purpose. So Sony's Cell processor will be good at a multitude of different types of application but will never be excellent at just one type. Essentially it will be a jack of all trades but master of none.

Multi-Tasking
Now, to understand the issues involved you need to know how things work inside the CPU. Older CPUs were only able to run one program thread, and to get a second thread to run you had to stop the first thread and reset the system. Today things are a little more advanced, and a lot more complex.
Most PCs with one CPU are able to run multiple threads (or multiple applications) at the same time. This can be seen on a PC running Windows 2000 or XP: just hit CTRL+ALT+DEL, click on Task Manager, then select the Processes tab. Each process runs at least one thread, and each will still need to acquire CPU cycles every now and again. Though a process may only need 3 or 4 cycles, those are 3 or 4 cycles taken from another application, and that's why things start to get slow and become unstable when you have a number of applications open.
The reason for this is that the CPU has a fixed set of registers, and the processor can still only run one instruction at once, essentially one program at once. It shares the CPU time between applications to give you the impression of parallel processing. Now, when you have several programs all running together you will find things get slow quickly, because the CPU is jumping from program to program. The CPU registers also need to be swapped every time a new task is put to the processor. When a running process or thread requires a few CPU clock cycles, the whole register set needs to be changed for those few cycles. This action alone takes up at least one CPU cycle, and in some cases more, so the more threads that are running, the more often the registers need to be swapped and the more cycles you lose. See the example below:

Cycle 1 Thread 1 Processed
Cycle 2 Thread 1 Processed
Cycle 3 Swap CPU Registers
Cycle 4 Thread 2 Processed
Cycle 5 Thread 2 Processed
Cycle 6 Swap CPU Registers
Cycle 7 Thread 3 Processed
Cycle 8 Swap CPU Registers
Cycle 9 Thread 1 Processed
Cycle 10 Thread 1 Processed

This is a small example of ten cycles and 3 active threads. What we see here is that only 7 of those cycles are actually doing calculations and the other three are spent swapping out the registers. Essentially you lose 30% of the CPU cycles in this example, which would make a 3 GHz PC run at the equivalent of a dedicated 2.1 GHz system because of multitasking.
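The arithmetic behind the ten-cycle example can be checked with a short Python sketch. The schedule is copied from the cycle listing above; the function names and the 3 GHz nominal clock are illustrative assumptions, and the model is as simplified as the listing itself (one cycle per register swap).

```python
# Schedule copied from the ten-cycle listing; "swap" marks a register swap.
schedule = [
    "thread1", "thread1", "swap",
    "thread2", "thread2", "swap",
    "thread3", "swap",
    "thread1", "thread1",
]


def swap_overhead(cycles):
    """Return the fraction of cycles spent swapping registers."""
    swaps = sum(1 for cycle in cycles if cycle == "swap")
    return swaps / len(cycles)


def effective_clock_ghz(nominal_ghz, cycles):
    """Scale a nominal clock speed by the share of cycles doing useful work."""
    return nominal_ghz * (1.0 - swap_overhead(cycles))


overhead = swap_overhead(schedule)
print(f"cycles lost to swapping: {overhead:.0%}")
print(f"effective clock: {effective_clock_ghz(3.0, schedule):.1f} GHz")
```

Adding more "swap" entries to the schedule shows how quickly the useful share of the clock shrinks as more threads compete for one CPU.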

In the long run, the faster a CPU gets, the more applications we can run on it. This is true, but you're still going to lose valuable CPU time, and the more you think you can do, the more you will do. This means a greater number of threads, a larger number of register swaps, and a bigger loss in raw processing power. This is why a faster processor won't always seem to run faster: new operating systems and applications will have more threads active and lose more CPU cycles.
It is said that the average PC is only able to utilise 33% of its total power, mainly because of resident operating system processes and the drivers required to resolve compatibility issues. This is why a console with a 733 MHz CPU can produce better graphics than a PC with a faster CPU and graphics card. The console is dedicated to just running games and does not need to keep processes active to maintain an operating system or driver compatibility; therefore fewer CPU cycles are wasted and less raw CPU power is lost.

To add to this, there is a system called pre-emptive multitasking, which in many cases increases the speed of an application that runs repetitive tasks. However, with multiple applications running, your tasks are going to be switching a lot.
RISC (reduced instruction set computer) CPUs are designed with more registers than most CPUs because they are designed for multitasking, which actually makes them better than a CPU designed for pre-emptive multitasking: the number of registers means you don't have to swap them as often. A CISC (complex instruction set computer) system with pre-emptive multitasking suffers greatly as more and more threads are run on it.

Consoles are more likely to use co-operative multitasking, which is better when running a dedicated system for games. Remember, however, that the Cell processor is designed to run everything in the household, not just games.
A co-operative multitasking system offers more control to the programmer, as opposed to pre-emptive systems, which control the multitasking through the operating system. This is why co-operative multitasking is used on games consoles; however, it means the programmer needs to understand the architecture better before they can produce top-notch software. A well-designed game engine for a console will make many of the programmers' day-to-day tasks easy, but you first have to get that engine built and running.
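To make the difference concrete, here is a minimal sketch of co-operative multitasking using Python generators. It is an illustration only, not how any console actually schedules work: the task names and the round-robin scheduler are hypothetical. The point is that each task decides for itself when to hand control back, so the programmer, not the operating system, controls the switching.

```python
from collections import deque


def task(name, steps, log):
    """A hypothetical game task: do one step of work, then yield control."""
    for i in range(steps):
        log.append(f"{name} step {i + 1}")
        yield  # the task itself chooses when to give the CPU back


def run_cooperative(tasks):
    """Round-robin scheduler: each task runs until it voluntarily yields."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)
            queue.append(current)  # not finished, so it rejoins the queue
        except StopIteration:
            pass  # task finished, so it is dropped


log = []
run_cooperative([task("physics", 2, log), task("ai", 1, log), task("sound", 2, log)])
for entry in log:
    print(entry)
```

A task that never reaches a yield would starve every other task, which is exactly why co-operative systems demand more care from the programmer.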

Multiple CPUs
Now, essentially multiple CPUs are just single CPUs all running together. This has the advantage that a single CPU in a multi-CPU system can be dedicated to one thread, which is fantastic if that thread needs to use 100% of a single CPU's power. So, for example, 8 CPUs can run 8 threads simultaneously without them interfering with each other and without any loss of CPU cycles due to register swapping. If you run more than 8 threads, one CPU will have to run two threads and suffer a performance loss due to register swapping. Now the trick is deciding which thread is the one that will suffer the performance degradation. This is not always the one using the least CPU time, as that thread might need to react within 1 clock cycle, so pushing another thread onto its CPU will slow its reaction time. So to some degree automated assignment of threads based on CPU load won't work, although it is an option in a general-purpose system. To get maximum performance, however, you need to be able to program each CPU for the task at hand, and that means the programmer has to decide which CPU runs what. Now, I know many game developers are just getting to grips with this, so the Cell could suffer in its early development if Sony doesn't provide the needed resources and training for its new chip. Not to mention that the programmers will have to get used to a new architecture and programming language. At this point I really feel sorry for those programmers, who have to try to live up to what the Hype Machine says.
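The 8-threads-on-8-CPUs case, and what happens when a ninth thread arrives, can be sketched in a few lines of Python. The round-robin placement here is a deliberately naive assumption, standing in for the kind of automated assignment described above; the thread names are placeholders.

```python
def assign_round_robin(threads, cpu_count):
    """Place threads on CPUs in turn; returns one list of threads per CPU."""
    cpus = [[] for _ in range(cpu_count)]
    for index, thread in enumerate(threads):
        cpus[index % cpu_count].append(thread)
    return cpus


def shared_cpus(cpus):
    """Indices of CPUs holding more than one thread, which must swap registers."""
    return [i for i, assigned in enumerate(cpus) if len(assigned) > 1]


eight_threads = assign_round_robin([f"t{n}" for n in range(8)], 8)
nine_threads = assign_round_robin([f"t{n}" for n in range(9)], 8)

print("CPUs sharing with 8 threads:", shared_cpus(eight_threads))
print("CPUs sharing with 9 threads:", shared_cpus(nine_threads))
```

The naive scheme simply doubles up on the first CPU, whatever thread happens to live there. Choosing which thread can tolerate sharing is the part that needs a programmer's judgement.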

Okay, so we know multiple CPUs mean multiple simultaneous threads, so things won't run slower. However, things won't run faster either, unless the programmer knows how to distribute the workload evenly across a number of CPUs, essentially making one thread's work run on two CPUs. Yet more headaches for the programmers!

So initially what I think will happen is that each processor in the Cell will be dedicated to one thread (example threads: real-world physics, monster AI, world geometry, sound processing, and number cruncher). In doing this, some CPUs may only utilise 25% or less of their power. So when you look at the theoretical numbers for the Cell, you have to understand that they assume each CPU running a single thread at 100%; in gaming this will never happen. More than likely the Cell will only run at about 33% of the speed the Hype Machine claims, but the numbers look good on paper, don't they!
Once the programmers work out how to utilise the power properly, you will still get issues with multithreading on one of the CPUs, and lost CPU cycles due to register swaps. So in the end the PS3 will never be able to use the full power of its Cell chip, because the chip is designed to do a number of different tasks rather than be dedicated to just one. Outside the PS3, however, the chip may make a very good dedicated CPU.
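As a rough illustration of how a one-thread-per-CPU layout falls short of the theoretical peak, the Python sketch below averages per-CPU utilisation. The utilisation figures are invented for illustration, chosen only to land near the 33% estimate; they are not measurements of any real Cell chip.

```python
# Hypothetical utilisation of each dedicated CPU (1.0 would be fully busy).
# These figures are illustrative assumptions, not benchmark results.
utilisation = {
    "physics": 0.50,
    "monster_ai": 0.25,
    "world_geometry": 0.40,
    "sound": 0.15,
    "number_cruncher": 0.35,
}


def effective_fraction(per_cpu):
    """Average utilisation across CPUs, i.e. the share of theoretical peak in use."""
    return sum(per_cpu.values()) / len(per_cpu)


fraction = effective_fraction(utilisation)
print(f"share of theoretical peak actually used: {fraction:.0%}")
```

The theoretical figure corresponds to every entry being 1.0, which is the case the marketing numbers quietly assume.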

Multiprocessor Architecture
There are two main types of architecture: Symmetric Multiprocessing (SMP) and Asymmetric Multiprocessing (ASMP). In an SMP system the operating system considers all the CPUs as equal parts of the whole, which means there will be no favouritism towards one CPU.
ASMP systems, which are not supported by Windows, are set up so that each CPU is dedicated to specific tasks, giving a dedicated system more power to perform its tasks. Usually in ASMP systems one CPU is dedicated to running the system's kernel, and the other CPUs' tasks are delegated from that CPU. This is the system the PS3 seems to be using; the core facts point to one CPU delegating tasks. This is good, but that CPU won't always be fully utilised, and the other, more dedicated CPUs will only run their own tasks well; when used for other, non-native tasks they will suffer a performance loss. If implemented well this is the best option; however, it takes a lot of control out of the hands of the programmers, meaning their flexibility will be stunted.
Because ASMP divides the tasks strictly between the CPUs, it is harder to implement: each CPU needs its own memory space to work in, and in some cases its own I/O bus and sub-system, which pushes up the complexity and cost of these systems.

The model used in the Xbox 2 seems to follow the SMP mould. This means greater flexibility in the hands of the programmers, and with the multiple CPUs you only need one of everything else, which cuts the cost and makes implementation easy.

Scalability
It has been noticed with many multiple-CPU systems that scalability is an issue: the more processors you have, the harder things are to program and get working. As noted with Windows, you won't find a Windows system that can handle more than 8 CPUs; it used to be 4 CPUs with older NT-based systems.
The reason for this is an inherent problem which means that the 9th CPU will only be running at about 60% of its capacity, and the 10th at even less. A multiple-CPU system is only good for dedicated tasks; when you use it for a number of different tasks, the management of the CPUs becomes a big issue and the loss in processing power is significant.
So putting another 4 CPUs in a system that already has 4 won't make it twice as fast; it will make it faster, but cause more issues with scalability. This will be the Cell's downfall. Because we have not seen any practical benchmarks of the Cell chip, the processing power lost over multiple CPUs is an unknown factor. However, the limit of 8 CPU cores in the Cell suggests that this loss is similar to that of a PC running 8 CPUs. The problem is still there, and it will mean the Cell chip will never run at the theoretical speed that is on paper.
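A toy model in Python shows why doubling the CPU count does not double the speed. It assumes, purely for illustration, that each extra CPU delivers 90% of what the previous one did; that efficiency figure and the geometric shape of the model are assumptions, since no practical Cell benchmarks are available.

```python
def total_throughput(cpu_count, efficiency=0.9):
    """Toy model: each extra CPU delivers `efficiency` times the previous one's work.

    The result is measured in multiples of a single CPU's throughput.
    The default efficiency of 0.9 is an illustrative assumption.
    """
    return sum(efficiency ** k for k in range(cpu_count))


four = total_throughput(4)
eight = total_throughput(8)

print(f"4 CPUs: {four:.2f}x a single CPU")
print(f"8 CPUs: {eight:.2f}x a single CPU")
print(f"doubling the CPU count gives {eight / four:.2f}x the speed, not 2x")
```

Lowering the efficiency parameter makes the later CPUs contribute even less, which is the scalability loss described above.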

The Cell Hype Rebound
I have heard numbers such as 1 TFlop (teraflop) and 10 times faster than current processors. This is just marketing hype based on numbers on bits of paper. In reality the picture will be very different: the claimed 1 TFlop may actually be only 165 GFlops (gigaflops), though clever programming could reach 200-250 GFlops. However, the full potential will never be reached, and even the 165 GFlops will never be fully used.
As for grid computing, this just means the Cell can link with another Cell chip and threads can be spread across multiple chips. Again, you have to think about how the chips will be used, the nightmare of programming them, and where the loss of processing power takes things. As the PS3 is going to be a single Cell chip with up to 8 CPUs, I doubt grid computing will actually factor into this, so you can forget about this bit of hype until the 4th generation of consoles.

Conclusion
Never believe hype, and always look at how the technology will be implemented and how a similar system works. On paper everything looks fantastic, but if the implementation is flawed or not able to live up to the theory, then things are going to go bad.
Marketing hype is there to do one thing, and that is to sell you the item. This means making bold claims based on theoretical numbers and producing materials on other systems that won't actually reflect the finished product.

From what I have seen, Sony likes to use proprietary technology because there is no way anyone can disprove its wild claims: it has never been tested outside the lab, so how can anyone disprove it? Microsoft, on the other hand, uses a different strategy, that of tried and tested technology altered to be more dedicated to the task. I rarely see them putting out hype or theoretical numbers, nor have I seen any marketing images or video footage that seem to have been produced on Art Machines.
All I have seen are the XNA demos, which to me look feasible for the next-generation programming suite, and knowing Microsoft they will have been produced with that programming suite. When has Microsoft ever demonstrated a new version of Windows on a non-PC system designed to make it look good? Never is the answer, because if something is going to be applied to a system, then what is the point in telling lies and showing something that is not what the public will have?

I have no doubt that, no matter what Sony says, the PS3 will never live up to its hype. Sony's previous consoles never did, and I see no reason for this one to do so now. The battleground for the next generation of consoles won't be numbers or speed but content: how much, and how fast. The XNA suite will make programming for the next generation easy and fast, and it's already available to programmers. So unless Sony has a rabbit generator in a hat, its development systems will be vastly inferior and considerably harder for programmers to utilise. This means PS3 content will be thin on the ground for a few years, until programmers get to grips with it, whereas the Xbox 2 will have multiple titles coming to market with masses of content.
I mean, if you can programme a game in 18 months on one system but it takes you 36 months on another, you're going to go for the one which is fastest to work with, so you can get a fast turnaround on your project and the money flowing quickly. This is Microsoft's advantage: not the numbers or fantastical claims, but the reality that developers can work quickly and easily with their hardware and produce quality content in half the time compared to other systems. I see a lot of programmers moving their loyalty to Microsoft and the cross-platform XNA Development System, as they can produce a game for two platforms in the same time it takes to produce one game for one platform on any other system.
As we know from the PS2 and Xbox, the best hardware does not mean better content or higher sales, so why would that be a factor with the PS3 and Xbox 2?