Re: FastRAM oddities.
AndyLandy said:
So, in my recent quests to acquire accelerators for my A1200 and A2000, I have discovered that 68030-based accelerators (Other than C= ones) can often address up to 128MB FastRAM, yet no 68040 ones seem to. They tend to be capped at 64MB or even 32MB -- Anyone here able to shed some light on this? My only assumption is that you can't get sufficiently fast (50ns) 72pin SIMMs that are that large.
Ahhhh, the good ol' memory question -
Go put the kettle on, make yourself a good hearty brew... come back, get into a nice comfy chair and read on --
Motorola 680x0 series CPUs
ALL Motorola 680x0 CPUs work with 32-bit addresses internally, even the humble 68000, although the 68000 and the 68EC020 only bring 24 address lines out to the pins (EC is for Embedded Controller). So why all these crazy limitations?
Well, firstly, 32-bit Fast RAM (that's outside the 8MB Zorro addressing space) is only limited by the controller that's been designed / implemented by the product vendor.
In the case of the Apollo 030 Mk3 that's 64MB in two 32MB sticks of RAM, whereas the Blizzard 030 Mk4 can have up to 128MB, and with the SCSI daughter board that can be increased to 256MB with another 128MB of RAM.
Now why do they not just use a unified controller.... well, this comes down to the price of logic at the end of the day.
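For a feel of what "limited by the controller" means in practice: every doubling of supported RAM is one more address line the controller has to decode. A quick Python sketch of that arithmetic (the function name is just mine for illustration, it doesn't come from any vendor's design):

```python
def address_bits_needed(size_bytes):
    """Smallest number of address lines that can reach every byte of size_bytes."""
    return (size_bytes - 1).bit_length()

MB = 1024 * 1024

# Sizes mentioned in this post: Zorro space, then the various card limits.
for size_mb in (8, 32, 64, 128, 256):
    bits = address_bits_needed(size_mb * MB)
    print(f"{size_mb:>3} MB needs {bits} address bits")
```

So going from a 32MB design to a 128MB one is "only" two more address bits, but those two bits have to be routed and decoded somewhere in the logic.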
Logic me this.... Logic me that -
Logic here means either hard *glue* logic, which is very cheap but limited by board space and power consumption, or its counterpart, programmable logic, which is expensive in software development, components and testing. Those are the limiting factors here.
If a company has previously designed a controller that can use up to 32MB then why re-invent the wheel... we are talking about a time when the operating system only used 880KB of RAM, so 32MB of the stuff is arguably overkill...
Phase 5 / DCE is one of the few companies that redesigned their memory controller(s) from 32MB to 64MB and then on to 128MB, and they implemented this for their 030, 040 and 060 range of cards (although the latter two are the same card, just with a different CPU).
What kinda Matrix does the memory have?
Another limiting factor of the day was standards, and the serious lack of them! Today we enjoy so many well documented memory standards, it truly is awesome.... however, this was not always so.... pre-2000 we had so many types, and then fundamental differences between series within those types....
For instance.... 64-pin SIMMs, Apple had some.... GVP had some.... none of them were compatible with each other....
72-pin SIMMs, some were parity, some were not.... and now we get to the best bit.... the Memory Matrix (sounds cool eh?)
Now imagine a memory stick is organized like a spreadsheet, it has rows and columns.... memory matrices can be made up of Y rows and X columns, so if we have one that has the same number of rows as columns, that's known as a SQUARE matrix, e.g. 1024 cells by 1024 cells.
However, if we had a memory matrix of [Y]2048 by [X]1024 cells, this would be a NARROW matrix, and finally if we had a memory matrix of [Y]1024 by [X]2048 then this is a WIDE matrix....
This might seem quite simple to understand, but your memory controller's logic has to deal with all of the above, and that makes it more complicated, requiring more logic etc etc...
Now if you compound the above by adding parity and ECC check bits you are in for a nightmare trying to create a generic memory controller..
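To make the spreadsheet picture a bit more concrete, here is a rough Python sketch. The function names are my own, and it assumes row and column counts are powers of two; it is a toy model of the idea, not how any particular controller does it.

```python
def classify_matrix(rows, cols):
    """Name the layout using the terms from this post."""
    if rows == cols:
        return "square"
    return "narrow" if rows > cols else "wide"

def address_bits(rows, cols):
    """(row bits, column bits) the controller must drive; assumes powers of two."""
    return rows.bit_length() - 1, cols.bit_length() - 1

def split_address(cell, cols):
    """Split a linear cell number into (row, column) for a given column count."""
    return divmod(cell, cols)

for rows, cols in [(1024, 1024), (2048, 1024), (1024, 2048)]:
    print(classify_matrix(rows, cols),
          address_bits(rows, cols),
          split_address(3000, cols))
```

Note how the same cell number (3000) lands in row 2 on the 1024-column parts but in row 1 on the 2048-column part. A controller wired for one geometry will put the data in the wrong place on another.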
You see, back in the day there were only two ways for a memory controller to work out how much RAM was fitted...
1. Brute force... this writes a chunk of data to memory locations and reads it back, and then it will allocate the amount that returned correctly. (You can already see the issues this has with WIDE and NARROW memory matrices.)
2. Probe SIMM... this feature was designed early in the life of the 72-pin SIMM and was part of the reason to adopt the PS/2 72-pin SIMM standard. However, although the 72-pin format allowed greater sizes, the PROBE (presence-detect) lines (there are 4 of them) were adopted VERY late in its life, so probing SIMMs was almost worthless as there was no data to read back.
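The brute-force approach from point 1 can be sketched roughly like this in Python. Everything here is made up for illustration: `FakeRam` is a toy stand-in that assumes addresses past the fitted size wrap around (alias) onto lower addresses, which is only one of the ways real hardware can misbehave, and `brute_force_size` is my name, not a real routine from any card's ROM.

```python
class FakeRam:
    """Toy RAM: addresses beyond the fitted size wrap around (assumption)."""
    def __init__(self, fitted_bytes):
        self.fitted = fitted_bytes
        self.cells = {}

    def write(self, addr, value):
        self.cells[addr % self.fitted] = value

    def read(self, addr):
        return self.cells.get(addr % self.fitted, 0)

BASE_MARKER = 0xA5A5A5A5

def brute_force_size(ram, max_bytes, step):
    """Write a marker every `step` bytes, read it back, stop when it goes wrong."""
    ram.write(0, BASE_MARKER)
    size = step
    while size < max_bytes:
        marker = 0x5A5A0000 | (size // step)  # unique per probe location
        ram.write(size, marker)
        # Fail if the readback is wrong OR the write clobbered address 0 (alias).
        if ram.read(size) != marker or ram.read(0) != BASE_MARKER:
            break
        size += step
    return size

MB = 1024 * 1024
print(brute_force_size(FakeRam(32 * MB), 128 * MB, MB) // MB, "MB detected")
```

The alias check is the important part: without it the toy 32MB stick would happily "pass" all the way up to the controller's maximum, because every write reads back fine from the wrapped location.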
Refresh Speeds
Ahhh, what's better: 50ns / 60ns / 70ns or 80ns (strictly speaking, these ns ratings are access times rather than refresh rates).... want to know the answer?
The simple answer is ANY SPEED that has NO WAIT STATES. To achieve this it's a calculation of the CPU cycle speed and the memory cycle speed; if you can match these together you have Zero Wait State.... kinda like the optimum fuel-air mixture in a car.
Really crap analogy -
Let's imagine a seesaw, the CPU on one side and the memory on the other. Data is collected by one of them as they are bounced up to the top, with the other going low to the ground. This is a Wait State approach, as one has to wait until the other has collected.
So rather than each of them going up and down with the cycles to exchange data (Wait State approach), if their cycles are carefully balanced (the seesaw is level) both parties will be able to deal with data without any waiting caused by either party.
Now... what's all this nanosecond refresh worth then?.... well, the goal is to have Zero Wait. Processors have different cycle times, memory controllers have different latching times and RAMs have different refresh times... so... IF you can get a SIMM that will refresh quicker than your card cycles, then you have achieved a Zero Wait State.
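As a back-of-an-envelope version of that calculation, here is a small Python sketch. It is a simplified model of my own: it assumes a bus access with no waits takes `base_clocks` CPU clocks (a parameter you would have to look up for your CPU and bus mode), and it ignores controller latching delays, precharge and refresh overhead entirely.

```python
def wait_states(cpu_mhz, ram_ns, base_clocks):
    """Extra clocks the CPU must wait for RAM rated at ram_ns (simplified model)."""
    # Clocks needed to cover ram_ns, rounded up, using integer maths:
    # one clock lasts 1000 / cpu_mhz nanoseconds.
    clocks_needed = -(-ram_ns * cpu_mhz // 1000)
    return max(0, clocks_needed - base_clocks)

# Hypothetical example: base_clocks=2 is an assumption, not a datasheet figure.
for mhz in (25, 40, 50):
    for ns in (50, 60, 70, 80):
        print(f"{mhz} MHz, {ns} ns RAM: {wait_states(mhz, ns, 2)} wait state(s)")
```

The point is just that "faster" RAM only helps when it drops you under the next whole clock boundary; once you are at zero waits, buying quicker SIMMs gains you nothing.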
By far the easiest way to achieve Zero Wait State is using a synchronous clock (i.e. clocking the memory at the same speed as the processor). This gives you a much clearer logic path and is hence simpler to create. Synchronous clocking may still need to introduce a wait every other tick or so to keep the cycles correct (this could be a processor tick that does nothing).
The problem here is that the memory HAS to run at the same speed as the CPU, and this bottlenecks the CPU as memory just cannot be run that fast. 50ns RAM will bottom out and not keep up with any more than a 38-45MHz direct clock, and this doesn't include the cycles needed to instigate the column and row refreshes.
So what else can we do? Well...
Asynchronous (differing speeds). This will free up the processor to actually process, and the RAM to be RAM... with clever logic and the right speed of RAM you can achieve a Zero Wait State through careful design of the cycles and the number of cycles between ticks. As you can imagine, it's a whole lot more complicated, and hence.... pricey in both development and implementation.
so.... there you have it =D