Model 3 Step 1/1.5 and Step 2 Video Board Differences

gm_matthew
Posts: 34
Joined: Wed Nov 08, 2023 2:10 am

Re: Model 3 Step 1/1.5 and Step 2 Video Board Differences

Post by gm_matthew »

MetalliC wrote: Sun Apr 06, 2025 6:56 pm A bit off topic, but anyway - here is a block diagram of the Sega Hikaru video system. What do you think, does it look similar to the Real3D 100 / Model 3?
IMO it does.
Oh nice, I had been curious about that :D

I'm working on a block diagram to show how the Model 3 video board ASICs are linked together, but suffice it to say that it is not the same as Hikaru.
MetalliC
Posts: 5
Joined: Mon Jul 29, 2024 1:10 pm

Re: Model 3 Step 1/1.5 and Step 2 Video Board Differences

Post by MetalliC »

gm_matthew wrote: Tue Apr 08, 2025 4:37 pm I'm working on a block diagram to show how the Model 3 video board ASICs are linked together, but suffice it to say that it is not the same as Hikaru.
Well, to me it looks similar to some degree.

For example, the two texture bank "Australia" ASICs (the two rightmost in the diagram) each contain 4MB of RAM inside, so they look similar to the Mars ASIC but with embedded RAM. An interesting detail: Hikaru stores different texture LODs in different banks, probably to get "free" trilinear filtering.
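
As a rough sketch of why split banks could make trilinear "free" (the bank assignment and all names here are my own invention, not the real Hikaru layout): if adjacent mip levels live in different banks, the two bilinear fetches that a trilinear sample needs can be issued in parallel.

Code: Select all

#include <cmath>

// Purely illustrative: even/odd mip levels assumed to live in separate
// physical banks, so both LOD fetches of a trilinear sample can happen
// in the same cycle and the result is a simple lerp between them.
struct Color { float r, g, b; };

// Stubs standing in for a bilinear fetch from each physical bank.
static Color bilinear_fetch_bank0(float u, float v, int lod) { return {0, 0, 0}; }
static Color bilinear_fetch_bank1(float u, float v, int lod) { return {0, 0, 0}; }

Color trilinear_sample(float u, float v, float lod)
{
    int   lo   = static_cast<int>(std::floor(lod));
    float t    = lod - static_cast<float>(lo);
    bool  even = (lo % 2) == 0;

    // Adjacent LODs sit in different banks, so these two fetches
    // would not contend for the same RAM in hardware.
    Color a = even ? bilinear_fetch_bank0(u, v, lo)     : bilinear_fetch_bank1(u, v, lo);
    Color b = even ? bilinear_fetch_bank1(u, v, lo + 1) : bilinear_fetch_bank0(u, v, lo + 1);

    return { a.r + (b.r - a.r) * t,
             a.g + (b.g - a.g) * t,
             a.b + (b.b - a.b) * t };
}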

It's the same with the other ASICs - the frame, depth, etc. RAM buffers are inside them, except for the "Antarctic" command processor, which has 8MB of external RAM (probably for storing the polygon/display list) plus a bit of cache RAM (similar to Venus & Mercury), and the "Europa" 2D display controller, which has 8MB of external VRAM for the 2D layer bitmaps.
BTW, the "Europa" feature set is quite weak - it just mixes three bitmap layers (one from the 3D rendering and two more bitmap layers from VRAM with optional scrolling) and outputs the result to the CRTC, which sounds similar to what I've heard about the original Real3D Pro 100 2D subsystem, doesn't it?
gm_matthew
Posts: 34
Joined: Wed Nov 08, 2023 2:10 am

Re: Model 3 Step 1/1.5 and Step 2 Video Board Differences

Post by gm_matthew »

Naturally there are some similarities, such as the use of two texture mapping units to enable one-pass trilinear filtering or multitexturing with bilinear filtering. However, on Model 3 the two Mars ASICs for each Earth ASIC are connected to each other, while the two Australia ASICs on Hikaru are not (at least according to the diagram).

Also, the Model 3 is based on the Pro-1000, not the Pro-100.
gm_matthew
Posts: 34
Joined: Wed Nov 08, 2023 2:10 am

Re: Model 3 Step 1/1.5 and Step 2 Video Board Differences

Post by gm_matthew »

Here is a block diagram of a Model 3 Step 2.x video board:
[Attachment: model 3 diagram.png]
- Mercury is the culling processor; it analyzes the scene graph and decides which objects should be rendered.
- Venus is the geometry T&L unit.
- Earth is the rasterization unit.
- Mars is the texturing/shading unit.
- Jupiter is the image compositor; it merges the 3D layer with the 2D layers from the tilegen chip, performs some post-processing and outputs the result to 315-5648, which outputs to the video DAC.
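
Read as a pipeline, the list above suggests a per-frame dataflow something like this sketch (every type and function name here is invented for illustration):

Code: Select all

// Invented types standing in for the data each ASIC hands to the next.
struct SceneGraph {};  // culling RAM contents
struct VisibleSet {};  // Mercury output: objects that survived culling
struct ScreenTris {};  // Venus output: transformed, lit triangles
struct Fragments  {};  // Earth output: rasterized fragments
struct Shaded     {};  // Mars output: textured/shaded pixels
struct FrameOut   {};  // Jupiter output: 3D merged with tilegen 2D layers

static VisibleSet mercury_cull(const SceneGraph&)    { return {}; }
static ScreenTris venus_tnl(const VisibleSet&)       { return {}; }
static Fragments  earth_rasterize(const ScreenTris&) { return {}; }
static Shaded     mars_shade(const Fragments&)       { return {}; }
static FrameOut   jupiter_composite(const Shaded&)   { return {}; }

FrameOut render_frame(const SceneGraph& sg)
{
    // One frame flows Mercury -> Venus -> Earth -> Mars -> Jupiter,
    // then out through 315-5648 to the video DAC.
    return jupiter_composite(mars_shade(earth_rasterize(venus_tnl(mercury_cull(sg)))));
}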
MetalliC
Posts: 5
Joined: Mon Jul 29, 2024 1:10 pm

Re: Model 3 Step 1/1.5 and Step 2 Video Board Differences

Post by MetalliC »

To me it looks quite similar at the hardware level, especially since we haven't seen any other GPUs from the same era that consist of so many ASICs.

It also looks similar at the software level - Hikaru's GPU operates with a set of objects: materials, texture heads, lights, and also "viewports" (8 in total), each of which has its own projection, depth range settings, and something that looks like a "priority" or "index" (0-7). That looks similar to Real3D, doesn't it?
And the native coordinate system is X -Y -Z, which looks similar too.
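
For concreteness, the per-viewport state being described might look something like this (field names and types are my guesses, not the real Hikaru layout):

Code: Select all

#include <cstdint>

// Hypothetical sketch of one of the 8 Hikaru viewports described above.
struct HikaruViewport {
    float   projection[4][4]; // per-viewport projection
    float   depth_near;       // depth range settings
    float   depth_far;
    uint8_t priority;         // the "priority"/"index" field, 0-7
};

HikaruViewport viewports[8];  // 8 in total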

However, the display list is different and more "handy": it uses a kind of command set where each command is 1 to 8 32-bit words. Here is an example of it in human-readable form: https://pastebin.com/raw/hTyY4Wc8
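
A decoder for a stream like that just reads a length out of each command's first word and skips ahead; a minimal sketch, with the opcode/length encoding invented (the real encoding is in the pastebin above):

Code: Select all

#include <cstddef>
#include <cstdint>
#include <vector>

// Walks a display list where every command is 1 to 8 32-bit words.
// The bit positions of the opcode and length fields are assumptions.
void walk_display_list(const std::vector<uint32_t>& words)
{
    size_t pc = 0;
    while (pc < words.size()) {
        uint32_t head   = words[pc];
        uint32_t opcode = head >> 24;             // assumed opcode field
        size_t   len    = 1 + ((head >> 16) & 7); // assumed length field: 1..8 words
        // dispatch(opcode, &words[pc], len);     // hypothetical handler
        (void)opcode;
        pc += len;
    }
}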

Anyway, does there exist any document or forum topic about how Real3D works, or at least the current understanding of it after all the RE and emulation work?
Yes, the Supermodel code is documentation in itself, but in many places it's not clear whether the hardware really works that way, or whether it's just "that's not how it works, but it's better to do it this way in OpenGL".
First of all: the order in which polygons are processed with respect to viewports and priorities, whether there is a polygon sorter or not, how fragments are rejected (I've seen in the code that fragments are discarded depending on the resulting alpha - is that really the way Real3D works, or some kind of hack?), and everything like that.
Bart
Site Admin
Posts: 176
Joined: Tue Nov 07, 2023 5:50 am

Re: Model 3 Step 1/1.5 and Step 2 Video Board Differences

Post by Bart »

We know the gist of how it works and some details are hinted at in patents. By piecing together what the games are doing, what the official (very high level) documentation says, and hints from the Real3D SDK, more is known. For example, Ian and Matthew recently deduced that part of the Real3D culling RAM is used to buffer writes to other RAM regions before frames are processed.

There is a Pro-1000 Windows SDK that includes the Pro-1000 firmware (roughly equivalent to a Model 3 ROM set but smaller), C++ header files and libraries, etc. There are also some PDF manuals describing the Windows SDK. The Pro-1000 was connected to Windows NT workstations and commands were sent over to it via a SCSI bus. As far as we can tell, these commands closely mirror actual operations on the Pro-1000 (and the Model 3's version of it), so the Pro-1000 firmware (which runs on the PowerPC CPU on the Pro-1000 CPU board) is likely just executing the commands verbatim.

I think fragments being discarded based on translucency values is probably "accurate" (although obviously the hardware processes these differently than a GL fragment shader). There is no polygon sorting, to my knowledge, but there is culling of nodes and Ian handles transparency by performing a second pass for reasons I'm not clear on (I haven't had a chance to understand his renderer and I know a lot has changed in the last few years).
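
The alpha-based discard being referred to amounts to something like this sketch (names and the threshold are invented; whether the real hardware rejects fragments this way is exactly the open question):

Code: Select all

// A fragment whose resulting alpha falls below some threshold is dropped
// outright (no color or depth write) instead of being blended.
struct Fragment { float r, g, b, a; };

bool keep_fragment(const Fragment& f, float threshold = 1.0f / 32.0f)
{
    return f.a >= threshold;  // below threshold: discard entirely
}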

Hikaru is still PowerVR-based, isn't it? This is a very different architecture from Pro-1000. The command lists aren't really comparable at all. Pro-1000 processes a higher-level data structure, basically a scene graph. This is efficient in that it allows meshes (models) to be pre-stored and then manipulated just by updating transform matrices or a few other parameters, without having the CPU perform any transformations (the hardware does all transformation and lighting). But it's also very inflexible in that it's a fixed stage pipeline with very little configurability, especially when it comes to lighting and blending. PowerVR is much more modern and the Real3D approach was completely abandoned. It was likely the last graphics board of its kind.
MetalliC
Posts: 5
Joined: Mon Jul 29, 2024 1:10 pm

Re: Model 3 Step 1/1.5 and Step 2 Video Board Differences

Post by MetalliC »

Bart wrote: Thu Apr 24, 2025 8:47 pm We know the gist of how it works and some details are hinted at in patents. By piecing together what the games are doing, what the official (very high level) documentation says, and hints from the Real3D SDK, more is known. For example, Ian and Matthew recently deduced that part of the Real3D culling RAM is used to buffer writes to other RAM regions before frames are processed.

There is a Pro-1000 Windows SDK that includes the Pro-1000 firmware (roughly equivalent to a Model 3 ROM set but smaller), C++ header files and libraries, etc. There are also some PDF manuals describing the Windows SDK. The Pro-1000 was connected to Windows NT workstations and commands were sent over to it via a SCSI bus. As far as we can tell, these commands closely mirror actual operations on the Pro-1000 (and the Model 3's version of it), so the Pro-1000 firmware (which runs on the PowerPC CPU on the Pro-1000 CPU board) is likely just executing the commands verbatim.

I think fragments being discarded based on translucency values is probably "accurate" (although obviously the hardware processes these differently than a GL fragment shader). There is no polygon sorting, to my knowledge, but there is culling of nodes and Ian handles transparency by performing a second pass for reasons I'm not clear on (I haven't had a chance to understand his renderer and I know a lot has changed in the last few years).
I see, thank you.
Ahh, I see - it seems opaques are discarded only during that "1st pass".
Bart wrote: Thu Apr 24, 2025 8:47 pm Hikaru is still PowerVR-based, isn't it?
No, I'm 99.99% sure Hikaru's GPU has nothing to do with PowerVR or ImgTec.
Bart wrote: Thu Apr 24, 2025 8:47 pm The command lists aren't really comparable at all. Pro-1000 processes a higher-level data structure, basically a scene graph. This is efficient in that it allows meshes (models) to be pre-stored and then manipulated just by updating transform matrices or a few other parameters, without having the CPU perform any transformations (the hardware does all transformation and lighting).
Why? I'd say it looks familiar in general concept, with the difference that Hikaru has it in the form of a "command list" for the GPU, which is probably transformed by the "Command Processor" into lower-level data for the next ASIC, which does the actual rendering.

Hikaru has all the mentioned things as well - at the top code level you set up and create sets of viewports, materials, lights, light groups and "texture heads", then "select" some material, texture and lights as "active", set up the transformation matrix(es) for a specific (static) model and "CALL" it; it will be processed using these parameters and then control returns to the top level of the code.
And so on for the whole scene.

It also supports conditional CALLs and JUMPs depending on a vector math test: a predefined vector is multiplied by the current world matrix, and conditional calls or jumps are made based on the resulting vector's length. Developers used that to create models with LODs, with a kind of automatic LOD selection depending on the current transformation matrix.
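
That test is cheap to model; here is a sketch under my own assumptions about how the resulting length maps to an LOD (all names invented):

Code: Select all

#include <cmath>

// A predefined vector is pushed through the current world matrix and the
// length of the result picks which model variant to CALL.
struct Vec3 { float x, y, z; };
struct Mat4 { float m[4][4]; };

static Vec3 transform(const Mat4& w, const Vec3& v)
{
    return { w.m[0][0]*v.x + w.m[0][1]*v.y + w.m[0][2]*v.z + w.m[0][3],
             w.m[1][0]*v.x + w.m[1][1]*v.y + w.m[1][2]*v.z + w.m[1][3],
             w.m[2][0]*v.x + w.m[2][1]*v.y + w.m[2][2]*v.z + w.m[2][3] };
}

// thresholds[] sorted descending; which way the comparison goes (and how
// many LODs there are) is a guess on my part.
int select_lod(const Mat4& world, const Vec3& test_vec,
               const float* thresholds, int lod_count)
{
    Vec3  t   = transform(world, test_vec);
    float len = std::sqrt(t.x*t.x + t.y*t.y + t.z*t.z);
    for (int i = 0; i < lod_count - 1; ++i)
        if (len >= thresholds[i])
            return i;          // "long" vector: more detailed model
    return lod_count - 1;      // otherwise the coarsest model
}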
Bart wrote: Thu Apr 24, 2025 8:47 pm PowerVR is much more modern and the Real3D approach was completely abandoned. It was likely the last graphics board of its kind.
Well, PowerVR Series 2 was also kind of the "last of its kind": it had unique features like an almost infinite "fill rate" for opaques, OIT for translucents, and "modifier volume" shadows, all of which were abandoned in later generations of ImgTec's products.

But as for Real3D, I have a strong feeling it wasn't the last ;)
Bart
Site Admin
Posts: 176
Joined: Tue Nov 07, 2023 5:50 am

Re: Model 3 Step 1/1.5 and Step 2 Video Board Differences

Post by Bart »

MetalliC wrote: Thu Apr 24, 2025 9:54 pm I see, thank you.
Ahh, I see - it seems opaques are discarded only during that "1st pass".
I don't know if Pro-1000 operates in passes per se but there is this concept of a contour texture. Alpha blending also works entirely differently than expected. One theory is that it actually renders only every other pixel and then biases the rendered pixels toward visible or black. See here.
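
Under that theory, the blend stage would behave roughly like this sketch (the checkerboard pattern and the bias function are my guesses, not confirmed behavior):

Code: Select all

// "Every other pixel" transparency: translucent surfaces only ever touch a
// checkerboard of pixels, and each written pixel is biased toward full
// color or toward black by its alpha instead of blending with what's below.
struct RGB { float r, g, b; };

void draw_translucent_pixel(RGB* fb, int width, int x, int y,
                            RGB src, float alpha)
{
    if (((x ^ y) & 1) == 0)
        return;  // skip half the pixels entirely

    // Guessed bias: push the surviving pixels toward visible or black.
    float bias = (alpha >= 0.5f) ? 1.0f : 2.0f * alpha;
    fb[y * width + x] = { src.r * bias, src.g * bias, src.b * bias };
}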
No, I'm 99.99% sure Hikaru's GPU has nothing to do with PowerVR or ImgTec.
They must be based on some other existing IP. Seems highly unlikely that at that point in time Sega would have developed their own from scratch.
Why? I'd say it looks familiar in general concept, with the difference that Hikaru has it in the form of a "command list" for the GPU, which is probably transformed by the "Command Processor" into lower-level data for the next ASIC, which does the actual rendering.
The command list you linked looks like a set of instructions for the GPU to execute. Pro-1000 doesn't have that concept. The display lists aren't arbitrary lists of commands to follow. They're a data structure that is traversed. At the high level, there is a linked list of viewports to render. Each viewport is described by the exact same struct (which sets lighting, fog, and other parameters) that points to a list of top-level nodes to render. Each of these is a hierarchical set of culling nodes that terminate in a pointer to a model to render (stored in polygon RAM or in VROM). And each node can, IIRC, point to yet another list. So it's a tree-like structure of nodes and lists of nodes. Each time you go one node deeper, you apply a transform matrix (stored elsewhere and referred to by its index), which translates pretty directly to the OpenGL matrix stack.

It's a scene graph: each node basically specifies how to transform the children below it. And at the very end of this chain is the thing to render with those transforms applied. There is also bounding box information and LOD information (up to, I believe, 4 different models to select from depending on distance and view angle) for culling invisible models and switching to lower-fidelity ones.
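
The traversal being described maps to something like this sketch (all types and fields are invented to show the shape of the structure, not the actual culling RAM layout):

Code: Select all

#include <vector>

struct Mat4  { float m[4][4]; };
struct Model;  // pre-stored mesh in polygon RAM or VROM

struct Node {
    int                matrix_index;  // matrix stored elsewhere, referenced by index
    const Model*       model;         // non-null at a leaf
    std::vector<Node*> children;      // or a pointer to yet another node list
    // a real node also carries bounding box and LOD information
};

void traverse(const Node& node, std::vector<Mat4>& matrix_stack,
              const Mat4* matrix_table)
{
    // Going one node deeper applies one more matrix, the direct
    // analogue of glPushMatrix().
    matrix_stack.push_back(matrix_table[node.matrix_index]);

    if (node.model) {
        // draw_model(*node.model, matrix_stack);  // hypothetical leaf draw
    }
    for (const Node* child : node.children)
        traverse(*child, matrix_stack, matrix_table);

    matrix_stack.pop_back();  // glPopMatrix()
}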

But here's where things really differ from a more conventional GPU: rendering state isn't modified arbitrarily by command lists, as in Hikaru. Apart from some viewport-level things like fog and light vector, the various shading and rendering options are configured per-polygon. Every polygon in a model contains a big header consisting of 7 words (32 bits each). These specify texturing, shading, color, blending with fog, etc. Normally, most of these would be global state parameters. You'd set them using some command, submit a bunch of polygons, then change some setting, send more polygons, etc. State transitions are expensive on modern GPUs. Not exactly sure why but I suspect that they try to parallelize as much as they can and so if the state changes, you have to wait for everything to finish drawing before the next batch of triangles with different state parameters can be processed. For example, Supermodel's legacy renderer breaks up a single model into two models based on transparency state (the rest of the parameters it passes to the shaders on a per-vertex basis). I think the new renderer does something similar as well. I think it is normal for modern rendering engines to sort meshes based on their state parameters (lighting, shader used, etc.) That wasn't necessary on Pro-1000.
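
In code terms, the per-polygon state amounts to something like the following; the 7-word size is from the description above, but every bit position shown is an invented example, not the documented Real3D layout:

Code: Select all

#include <cstdint>

// All render state travels with the polygon itself rather than being set
// through global state-change commands.
struct PolygonHeader {
    uint32_t word[7];  // 7 x 32-bit words per polygon
};

// Invented accessors showing the kind of state packed into the header.
inline bool     texture_enabled(const PolygonHeader& h) { return (h.word[1] >> 31) & 1; }
inline bool     smooth_shaded  (const PolygonHeader& h) { return (h.word[2] >> 30) & 1; }
inline uint32_t base_color_rgb (const PolygonHeader& h) { return h.word[4] & 0xFFFFFF; }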

On Pro-1000, all that state was encoded per-polygon. So the system isn't taking a list of commands. It's taking a scene description -- sort of like a modern day .usd file or something. It then traverses all those nodes to determine what to draw. It will even decide on its own to stop drawing if it runs out of time for the current frame. The programmer has almost no control over the process -- it's just like handing off a file to some other program to draw for you.
Ian
Posts: 49
Joined: Wed Nov 08, 2023 10:26 am

Re: Model 3 Step 1/1.5 and Step 2 Video Board Differences

Post by Ian »

For example, Supermodel's legacy renderer breaks up a single model into two models based on transparency state (the rest of the parameters it passes to the shaders on a per-vertex basis). I think the new renderer does something similar as well. I think it is normal for modern rendering engines to sort meshes based on their state parameters (lighting, shader used, etc.) That wasn't necessary on Pro-1000.
The vertex parameters such as colour / normal are passed as vertex attribs with the new renderer. The rest of the polygon rendering bits are used to calculate a 64 bit value which acts as sort of a bucket ID. When parsing the models, each polygon with the same bucket ID gets put in the same bucket. So this groups polygons together which have the same texture number / rendering parameters, which means they can get rendered in the same draw call. You can just create a new draw call every time you find a polygon with a different texture ID, but the bucket approach reduces draw calls by up to 30%.
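
In other words, something along these lines (the state fields and their packing are illustrative; the real key would be built from the polygon header bits):

Code: Select all

#include <cstdint>
#include <unordered_map>
#include <vector>

struct Polygon { /* vertices, attributes, ... */ };

// Pack the state that would otherwise force a draw-call break into one key.
uint64_t bucket_id(uint32_t texture_id, uint32_t texture_format,
                   bool translucent, bool smooth_shaded)
{
    uint64_t id = 0;
    id |= static_cast<uint64_t>(texture_id)     << 32;  // invented packing
    id |= static_cast<uint64_t>(texture_format) << 8;
    id |= static_cast<uint64_t>(translucent)    << 1;
    id |= static_cast<uint64_t>(smooth_shaded);
    return id;
}

// While parsing a model, polygons with the same key land in the same
// bucket; each bucket is then rendered with a single draw call.
using Buckets = std::unordered_map<uint64_t, std::vector<Polygon>>;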

In modern game engines you'd batch render all the geometry with the same texture, etc. But in many games on the Model 3, such as Scud Race, the geometry is split by where it is in the world, which often means almost unrelated geometry is grouped together just because it's in close proximity. That means a single model on the Model 3 can reference a whole bunch of different textures.

The legacy renderer just passes the texture ID as a vertex attrib, which is great, but that's not possible with the new renderer because we are basically at the limit on the number of vertex attributes that can be passed. So models need to be split by texture ID, which increases the number of draw calls.
Bart
Site Admin
Posts: 176
Joined: Tue Nov 07, 2023 5:50 am

Re: Model 3 Step 1/1.5 and Step 2 Video Board Differences

Post by Bart »

To make them single draw calls, do you create a single big mesh of all the sub-meshes to draw? Each one will need a different transform matrix applied. How does that get handled?

I'll need to catch myself up on how you handle translucency and that co-planar/stencil mechanism.