Guru ROM with 14MHz patch - The ultimate A2091 benchmark!

  • Thread starter: SpeedGeek

SpeedGeek

Hello my fellow Amibayers,

The faster hack is the better hack! :-D

Here are some benchmarks from a project I've been quietly working on these past few months. The end result is a combination of software patching and hardware hacks producing faster benchmarks.
Indeed, we are now very close to the Zorro2 Max. transfer rate!

System specs:
- A2091 14MHz SCSI mod + 14MHz patched Guru ROM
- A2630 54.2 MHz mod + 4MB 32bit Fastmem
- Compaq 550MB SCSI-2 HD
- SYNCH transfer mode enabled!

Note: NTSC Zorro2 Bus CLK = 7.16MHz (PAL = 7.09MHz).

P.S. It's entirely up to Mr. Babel to authorize the release of any patched ROMs.

Guru2091_14MHz V1.0 released!

Guru2091_14MHz is a software patch which enables 14MHz driver operation. Now you can enable SYNCH mode post boot (please read the text file after you extract the archive). Happy benchmarking everyone!
:)

Note: The DMACSCLK part of this thread was removed since it was never a reliable hack.

Here is the link to images and downloadable files:

http://eab.abime.net/showthread.php?p=963856
 
Megacool!

Would you consider sharing the GAL code, so it could be compiled for other devices?
I never use the old PALs/GALs in my projects; I prefer the still available ATF series from
Atmel. Is it written in CUPL or VHDL/Verilog?
 
The JEDEC file used for an ATF16V8 is the same as for a GAL16V8. The source was written in CUPL but is still subject to revision when I (time permitting) finish work on the new 2 GAL chip 14MHz DRAM controller project.

Oops, that was supposed to be a secret until it was completed! Anyway, if you're interested, that's a topic for a new thread (when I find some time). :D
 
Ok,

thanks I will try your jedec.

Some additional questions:
- Is your previous 14MHz hack needed to make this last one work?

- I'm aware that Gary delays its #DTACK for accesses to chip-ram if
the bus-master starts a cycle in the "wrong" C1/C3 relationship, or if
the chip-ram is unavailable due to high DMA usage.

Does Gary also do this cycle-syncing if the bus-master accesses fast-ram outside the agnus-realm
(like 0x200000 - 0xbdffff), or does it allow 3-cycle busrequests like
you described above without delaying #DTACK?
I'm now talking about a scenario where there are real fast-ram outside the A590, but the question is
whether I can use the Gary #DTACK for accesses done by other busmasters,
or I need to implement it on the cpu board that contains the RAM (it already does internal #DTACK).

- What's the difference in operating conditions for your 2 benchmarks?
Were both done with sync mode enabled?
 
The last (DMAC_SCLK GAL) hack was designed as a workaround for the Zorro2 DMA bottleneck, which limits the potentially faster SCSI chip @14MHz. (Assuming that's the previous 14MHz hack you're referring to?)

Gary delays /DTACK by one wait state for bus master DMA* to Chip RAM but not for Fast RAM. *Note: Custom chip DMA may add additional wait states as needed.

In the case of real Fast RAM outside of the A590, Gary should run a standard 4 clock cycle (asserting /DTACK on the 3rd clock) but this /DTACK may be delayed by XRDY or disabled by /OVR, allowing the Fast RAM controller to generate its own /DTACK.

For DMAC_SCLK timing early /DTACK is not required but the RAM controller must be able to complete its cycle in three 7MHz clocks. The difference between the 2 benchmarks is self-explanatory if you consider the names of the 2 benchmark images (I tried to make 3 separate posts but vBulletin wants them all combined in 1 post for some reason).
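
To put the 4 clock and 3 clock cycle figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the 16-bit Zorro2 data path, back-to-back DMA cycles with no arbitration or refresh overhead, and that every DMA cycle really completes in three 7MHz clocks with the hack. The function name is only illustrative, and real benchmarks will come in lower than these peaks.

```python
# Rough upper bound on Zorro2 DMA throughput (a sketch, not a measurement).
BUS_WIDTH_BYTES = 2  # Zorro2 moves 16 bits per bus cycle


def peak_rate_mb_s(bus_clock_mhz, clocks_per_cycle):
    """Peak rate in MB/s (1 MB = 10**6 bytes), assuming back-to-back cycles."""
    return BUS_WIDTH_BYTES * bus_clock_mhz / clocks_per_cycle


if __name__ == "__main__":
    for name, clk in (("NTSC", 7.16), ("PAL", 7.09)):
        standard = peak_rate_mb_s(clk, 4)   # standard 4 clock cycle
        shortened = peak_rate_mb_s(clk, 3)  # cycle completed in three clocks
        print(f"{name}: 4-clock {standard:.2f} MB/s, 3-clock {shortened:.2f} MB/s")
```

Under these assumptions an NTSC machine tops out at roughly 3.58 MB/s with standard cycles and roughly 4.77 MB/s with three clock cycles.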

P.S. Recent PM request from another Amibay member:
"Can I just re-focus your attention towards zip -> simm adapters? :laugh:

All this 14mhz speed hack stuff :roll:

Just giving ya crap ! How have you been doing?"

P.P.S. Hey, I'm not really very good at PCB design,
even if I had the time to do all this stuff! :sos
 
Is there any chance this can be made to work with the GVP Impact/G-force combo card's SCSI?
 
Ok, thanks for your information.

I need to add the old 14MHz hack to the scsi chip first for this new hack to have any effect obviously.

I programmed an ATF16V8 device yesterday with your jedec, and hooked it up on a breadboard, together with another PLD
emulating 7M/CDAC pulses at 1Hz, feeding your PLD, (+ LED's) and it turns out it works as you have described above.

I notice that the !OWN signal is also one of the qualifiers to do the 14MHz bus termination.
This signal is left floating from the DMAC chip according to the A590 schematic. When does
the DMAC chip assert this signal?

Will the A590 also do 14MHz bus termination when DMA'ing to chip-mem with your hack?

And another question: Suppose the CPU card asserts /OVR each time it detects
access to its local fast-ram (both during cpu cycles and during cycles by other bus-master),
is there any reason that you could not run the entire write access at 14MHz?
If the RAM is capable of 14MHz no-waitstate operation, the A590 PLD could generate its own #DTACK
for fastram accesses.

Ok you are breaking Z2 specs, but Gary won't know as address decoding
could be disabled during those cycles, and if the CPU board uses SRAM, there is
no problem with refresh timing and such.



 
Is there any chance this can be made to work with the GVP Impact/G-force combo card's SCSI?

It's possible DMAC_SCLK could work with the GVP DPRC chip but with no schematics available you'll have to locate and disconnect the DPRC's 7MHz clock input. The DPRC chip is SMD with very close pin spacing so you need to trace it back to a safer location and remove an SMD resistor or cut a PCB track, etc. A2091/A590 owners can simply disconnect one end of a 33 Ohm thru-hole resistor to replace the DMAC clock input.

However, most GVP owners don't need to worry about any software patching or SCSI chip 14MHz hack since all late Rev. GVP SCSI controllers are already built for 14MHz operation.

There may be some benefit without the "Old" 14MHz hack due to increased "Free Proc. Time" (see the 2 benchmarks) but of course, the main benefit of the faster transfer rate would require it.

/OWN is essentially the same as /BGACK except that it's a Zorro2 bus specific signal and therefore N/C in the case of the A590. Since CPU slot/Accelerator cards may assert /BGACK if they master the bus, /OWN is the closest I can get to determining whether an A2091 DMA cycle is actually occurring on the bus.

DMAC_SCLK doesn't know the difference between a Chip or Fast RAM cycle. It only knows the difference between DMA/Slave and Read/Write cycles. For Bus master DMA cycles the CPU slot/Accelerator card asserting /OVR is optional. For CPU cycles the 68000 /AS is not asserted for local on-board cycles, therefore Gary never detects these cycles and /OVR is not needed.

The A2091/A590 RAM controller already generates its own /DTACK but its timing is 7MHz specific. The A500 doesn't have a Buster chip so you could connect the DMAC_SCLK RW input to GND to enable 14MHz termination for all DMA cycles but please be WARNED of the following risks (and back up all your partitions first!):

- The folks at C= engineering could have designed DMAC's buffer enable timing according to Buster specs.
- The 68000 bus spec. allows data latching on read cycles 1/2 CLK before cycle end
- You may need to qualify DMA cycles to your custom Fast RAM controller if Chip or other standard Fast RAM can't meet the shorter timing spec.
 
OK,

I finally got to install the new fantastic hack.


Installation:

http://s1314.photobucket.com/user/y_u_l_q_u_e_n/media/DSC_0032_zpsbc59093a.jpg.html

I just built it on a small piece of veroboard to test, plugged into the socket where the ram controller
GAL used to be. It is not needed as I do not have any RAM installed in the A590.

I compiled a new PLD with some additional equations added.

It now uses C1, C3 and CDAC from the old GAL socket to generate 7MHz and 14MHz.
There's a jumper to select 7MHz or 14MHz operation for both SCSI controller and DMAC.

When the jumper is open, both operate at 7MHz all the time.

When the jumper is closed, the SCSI-controller gets a constant 14MHz clock, while the
DMAC gets 14MHz clocks at S4-S7 of DMA write cycles to fast-ram, every other cycle runs
at 7MHz.

For DMA accesses to fast-ram, the board asserts #OVR and generates #DTACK at S4.
The DTACK going to the 7407 OC driver is available through the old GAL socket,
and luckily there was an unused gate in the 7407 to wire the OVR to.

I also used A23:A21 to detect fast-ram accesses (I assume the A590 will never DMA
to the custom chips).
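
For anyone trying to follow the logic, the behaviour described above can be sketched roughly in Python. This is only a behavioural model, not the actual PLD equations (which I have not posted). All function and parameter names are hypothetical, and the Fast RAM window of 0x200000-0x9FFFFF is an assumption about where the RAM sits; adjust it for your own CPU board.

```python
# Behavioural sketch of the clock selection described above.
# Names are hypothetical and the Fast RAM window is an assumed example.


def is_fast_ram(addr, fast_base=0x200000, fast_top=0x9FFFFF):
    """Decode only A23:A21, so the window is matched in 2MB blocks."""
    block = (addr >> 21) & 0b111
    return (fast_base >> 21) <= block <= (fast_top >> 21)


def clocks(jumper_closed, dma_cycle, ram_write, addr, state):
    """Return (scsi_mhz, dmac_mhz) for bus state S0..S7 of the current cycle."""
    if not jumper_closed:
        return 7, 7  # jumper open: everything stays at 7MHz
    # Jumper closed: the SCSI chip always gets 14MHz, while the DMAC only
    # gets 14MHz during S4-S7 of DMA write cycles to Fast RAM.
    fast = dma_cycle and ram_write and is_fast_ram(addr) and 4 <= state <= 7
    return 14, (14 if fast else 7)


if __name__ == "__main__":
    print(clocks(True, True, True, 0x200000, 5))   # DMA write to Fast RAM, S5
    print(clocks(True, True, True, 0x000000, 5))   # DMA write to Chip RAM, S5
    print(clocks(False, True, True, 0x200000, 5))  # jumper open
```

Decoding just the top three address bits keeps the term count low in a small PLD, at the cost of only being able to match 2MB blocks.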

It all looks kinda messy, but if it proves to be stable, I might do a PCB that allows me to
put the metal cover in place without any cutting, like with the guru-adapter also seen in the pic.


Test results:

I booted using Kingspec IDE SSD+Acard SCSI adapter from my main A500+/A590 system,
after first backing up everything to CD.

The jumper was closed to enable 14MHz hack during boot, and SSD boots in async mode.


Running RSCP just after boot gives this result:

http://s1314.photobucket.com/user/y_u_l_q_u_e_n/media/DSC_0042_zps5b628eb1.jpg.html

Not bad, but perhaps dangerous?


After successfully running the Guru2091_14MHz command:

http://s1314.photobucket.com/user/y_u_l_q_u_e_n/media/DSC_0043_zpsb73081a4.jpg.html

Same as using async transfer without the 14MHz mod.


And finally, after setting the SSD to sync mode:

http://s1314.photobucket.com/user/y_u_l_q_u_e_n/media/DSC_0048_zpse5e2c9bd.jpg.html

I get a result similar to the one in the opening post.


As a comparison, my main A590/A500+ system, also with guru-rom, does 3200KB/s with the same
HD setup in sync mode without any hardware hacks.
 
Nice work Yulquen74! :thumbsup:

I was confident DMAC_SCLK could achieve a 4MB/sec transfer rate with a fast enough SCSI target device but it's nice to have an actual benchmark result. Otherwise, the "Pics or it never happened" idiots start posting their 2 cents worth.

What I didn't expect was the benchmark obtained in (33C93A internally overclocked) Asynch mode would be faster than Synch mode. My Compaq 550MB SCSI-2 HD is always faster in Synch mode.

It's possible your Acard SCSI bridge may not be optimized for Synch mode transfers. What is the value of the Synch transfer register after you enable Synch mode?
 
Running the 33C93A_Config just after boot, it reports:

Select Sync Timeout Reg = $17
Synch Transfer Reg = $30

After running the Guru2091_14MHz and putting my Acard adapter in sync mode,
it says $2D and $28.


I have also tested with a normal spinning hard disk (a 76GB 80-pin device removed
from an 8 year old server). In async mode just after boot it does
about 3500KB/second, a bit slower than the Acard/SSD combo.

After running the Guru2091_14MHz, it's down to about 2200KB/s, like the Acard/SSD in async mode after running this command.

For some reason I'm unable to put this hard-drive in sync-mode (nothing to do with this hack).



Would you say it is risky to just keep the Acard/SSD in async mode without running the 14M hack?

Would patching the Guru-ROM for real allow us to use sync mode without the boot issue?
 
The Synch Transfer reg = $28 is the same as my Compaq 550MB SCSI-2 HD and probably why your Acard SCSI adapter's Synch mode transfer rate is very close to my Compaq HD.

The left digit is the transfer period and "2" is the minimum transfer period under the SCSI-1 spec. The right digit is the transfer offset and "8" seems to be a typical value used for many SCSI target devices. The 33C93A supports a maximum offset of $C which combined with the min. transfer period should in theory give the highest Synch transfer rate. (My NEC CD-ROM drive uses "$2C" but it never manages more than a 3 MB/sec transfer rate anyway).
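
If you want to split the register value by hand, a minimal Python sketch looks like this. It simply treats the two hex digits as the period code and the offset, as described above; the function name is made up for illustration and it does not convert the period code into nanoseconds.

```python
def decode_sync_transfer(reg):
    """Split a Synch Transfer register value into (period_code, offset)."""
    period_code = (reg >> 4) & 0xF  # left hex digit: transfer period code
    offset = reg & 0xF              # right hex digit: transfer offset
    return period_code, offset


if __name__ == "__main__":
    for value in (0x30, 0x28, 0x2C):
        period_code, offset = decode_sync_transfer(value)
        print(f"${value:02X}: period code {period_code}, offset {offset}")
```

The $30 value reported just after the Asynch boot decodes to an offset of 0, while $28 and $2C both carry the minimum period code of 2.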

I have a Seagate 4.5GB SCSI-2 HD which uses the SDTR (Synch Data Transfer Request) to negotiate the minimum transfer period and then disables Synch transfer mode either because it doesn't fully support it or can't negotiate a compatible Synch offset value.

The C= 7.0 ROM driver permits this and the drive obtains an Asynch transfer rate of 2.8 MB/sec. The Guru ROM V6.14 driver forces the drive to use the default Asynch transfer period and the drive obtains a 2.2 MB/sec transfer rate.

Note: Some SCSI HD's may have a jumper or firmware enable setting for Synch transfer mode.

You should be OK with keeping your Acard SCSI adapter in Asynch mode without the Guru2091_14MHz patch. The only problem is the Select Timeout period is too short to meet the minimum SCSI spec., so if you use more than 1 SCSI target device then I would suggest you disable Reselection.
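
To see why the timeout ends up too short, here is a rough Python sketch. It assumes the relation usually quoted for the 33C93 family, timeout in ms = register value x 80 / input clock in MHz, and a 14.32MHz clock (2 x 7.16MHz) for the overclocked chip. Both are assumptions on my part, so check them against the data sheet before relying on the numbers.

```python
SCSI_SELECT_TIMEOUT_MS = 250.0  # selection timeout commonly cited from the SCSI spec


def select_timeout_ms(reg, clock_mhz):
    """Estimated select timeout, assuming timeout_ms = reg * 80 / clock_mhz."""
    return reg * 80.0 / clock_mhz


if __name__ == "__main__":
    cases = (
        ("$17 @ 7.16MHz (stock)", 0x17, 7.16),
        ("$17 @ 14.32MHz (unpatched driver)", 0x17, 14.32),
        ("$2D @ 14.32MHz (after Guru2091_14MHz)", 0x2D, 14.32),
    )
    for label, reg, clk in cases:
        ms = select_timeout_ms(reg, clk)
        verdict = "OK" if ms >= SCSI_SELECT_TIMEOUT_MS else "too short"
        print(f"{label}: about {ms:.0f} ms ({verdict})")
```

With these assumptions the $17 value gives only about half the intended timeout once the chip runs at 14MHz, and the $2D value written by the patch brings it back above 250 ms.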

The patched ROM will enable a boot in Synch mode but it's not my code to release (as previously indicated in post #1).
 