Originally posted by vbt@Sat, 2004-12-04 @ 11:19 AM
Yes, I'm really interested 🙂 I have plugged my PC with the comm card back in, and it will stay plugged in (my two PCs are now close to the TV). I could dev all day 🙂 .
[post=125112]Quoted post[/post]
Hey, that's cool!
What do you mean, the DSP or the dual CPU lib?
As for the dual CPU lib, I ask everyone:
What needs should it fulfill?
Does anyone have suggestions on how to do it?
Let me start with the SGL event system. It is a doubly linked list of events that are processed in sequence, over and over. Each event has a structure for local data as well as links to the previous and next events, so values can be communicated between events this way.
The main advantage of SGL events is that the event list can be changed at runtime. The drawback, in my opinion, is that events cannot be processed in parallel.
To exploit the Saturn's full power, SGL's slSlaveFunc() is not sufficient.
The slave CPU is invoked by the master. In the best case, repeatedly calling slSlaveFunc() builds up a work list for the slave, but when the master is busy, the slave has to sit idle waiting for work.
Another point: it is an advantage to let one CPU start processing the next frame (instead of waiting idle) while the other CPU finishes the current frame (assuming no inter-frame dependencies). Of course we must ensure the CPUs don't drift apart by several frames, e.g. one working on frame n while the other is on frame n+5.
Both CPUs share the same resources, so I say the first step is to move away from the master -> slave model.
I propose a work list similar to the SGL event system, with the main difference that independent events can be grouped together for parallel processing. Imagine a flowchart, a graph or an event list that at some places splits up into multiple parallel paths.
The two CPUs should not fight over every single piece of work. Instead, a processing list is set up in advance for both CPUs, updated only at certain positions in the list, and containing the points where communication with the other CPU is necessary.
This reduces CPU <-> CPU communication. We would need some sort of meter, a variable or similar, that hints at the workload difference between the CPUs, so we can determine which work pieces, and how many, to shift from the overloaded CPU to the other for the next frame.
Does anyone have experience with workload balancing?
Connections between events in the work list may get a predicate that determines how to proceed. Something like:
Code:
- just proceed your list, or
- wait for other CPU, or
- wait while (other CPU has NOT passed certain point)
then take this path, or
- if (I finished (my part of an event group) first)
let the other CPU know and proceed path 1
else proceed path 2
Furthermore, for some applications it may be necessary that certain events can only be processed by a specific CPU. So every event should carry this property, especially for workload balancing.
As for workload balancing: it shouldn't be left entirely to the library. The lib can only shift whole events from the busy CPU to the idle one, but an application has more (and faster) ways to do it.
An example: in my voxel demo, I render the screen in two pieces, the left one on the slave and the right one on the master. So I would have only two parallel events, and you can see there is no smart way for the lib to shift events. Of course I could divide the screen into more pieces (the DSP already can, hehe), but in cases like this, additional work is wasted just to perform the division.
Instead, the app can easily balance the work by varying the sizes of the two screen pieces: the CPU that finishes earlier gets its piece enlarged, the other's piece shrunk.
But how far I may vary them is something the lib must tell me.
So we need an interface for workload balancing between lib and app.
In the case above, it may also be necessary to tell the lib not to perform balancing on its own in this region.
Notice the similarities to Petri nets? Maybe a subset.
VBT and everyone who's interested: what do you think?