Quantcast
Channel: Floppy Emu – Big Mess o' Wires
Viewing all articles
Browse latest Browse all 167

Thoughts on Floppy Emu Redesign

$
0
0

I’ve been pondering what a redesigned Floppy Emu might look like – what ICs might be involved. So why mess with success? The current design combines a microcontroller with a CPLD, which has proven to be a powerful and flexible combination. But the specific microcontroller and CPLD that I chose have both become a bit outdated, hard to find, and too expensive relative to the alternatives. By replacing them both with more modern parts, I could probably gain better features while simultaneously improving the manufacturing outlook, for a perfect win/win. Or with a sufficiently powerful microcontroller, I might be able to completely eliminate the CPLD.

 
CPLD

The CPLD (complex programmable logic device) is like a miniature FPGA, and is there to handle the timing-critical bit twiddling that the microcontroller can’t. It ensures the data bitstream appears at exactly the right speed, with no jitter, and also functions as a fancy parallel-to-serial shift register. For Apple II disk emulation, it helps make the work easier, and for Macintosh disk emulation it’s essential. The Mac treats floppy drives sort of like a 16×1-bit external memory, and microcontroller software isn’t fast enough to react to the changing address inputs and supply the correct data output.

The specific CPLD used by Floppy Emu is the Xilinx XC9572XL, which I mainly chose for being 5v tolerant. It works, but it can only store a very small amount of logic, which forces me to have separate and independent firmwares for Apple II and Macintosh emulation. With a larger and more capable chip, I could merge those into a single firmware.

The Xilinx XC9572XL is old enough that it’s approaching “legacy” status, and may be in line for discontinuation soon. Xilinx barely mentions it on their web site, and the chip has been sold out at all of their distributors for the past several weeks, which is worrisome. During the years I’ve been building Floppy Emus, the price of the XC9572XL has also steadily climbed to where it’s now more than double its original cost.

There aren’t many great alternatives, because it seems the whole CPLD market is slowly dying in favor of FPGAs, and most of what’s left has even smaller logic storage limits than the XC9572XL. The best option is probably a small FPGA instead, like one of the Lattice MachXO or MachXO2 devices similar to what’s in the Yellowstone disk controller.

A second idea is to replace the CPLD with a small, dedicated microcontroller, like something from the ATTINY series. With only one or two very simple tasks to do, maybe a mini-microcontroller would be able to keep up. I spent half a day pursuing this line of thinking, going as far as writing a first draft of the code for this hypothetical microcontroller, but decided to focus on other solutions first.

 
Microcontroller

Floppy Emu uses an ATMEGA1284 microcontroller, which in 2018 is almost a joke. Its sole advantage is that it’s a close relative of the ATMEGA chip used in Arduinos, so there’s tons of example code available for it. But it’s woefully underpowered compared to basically any other microcontroller out there. Any of the popular 32-bit ARM microcontrollers would have significantly faster clock speeds, more memory, and better peripheral options, and would probably cost less too.

Sadly there are still no common microcontrollers with enough internal RAM to buffer an entire floppy disk image (140K to 1440K depending on the image type). That would vastly simplify the emulation, help fix some occasional emulation hiccups, and possibly lead to better handling of copy-protected disks. But that will have to wait until the 2020’s, it seems.

After looking at a whole range of options, I’ve got my eye on the Atmel SAM4S series of microcontrollers. These boast speeds up to 120 MHz, while remaining relatively inexpensive.

 
Single-Chip Design?

Earlier I mentioned that the CPLD is essential for certain types of disk emulation, where microcontroller software isn’t fast enough to react to changing inputs in real time. That’s true for an older 8-bit microcontroller like the ATMEGA1284 running at 20 MHz, but what about for a modern 32-bit ARM running at 120 MHz? That might be a different story.

The hardest test will be 3.5 inch floppy disk emulation for the Macintosh. When the Mac wants to check some item of drive state, like whether there’s a disk inserted or whether the disk head is at track zero, it sets four IO lines with the right values to select from among the 16 possible state bits. Then it reads the result on a fifth IO line. It works just like an asynchronous SRAM, and the drive (or Floppy Emu emulating a drive) never knows if or when the Mac is reading state info. The drive needs to constantly respond, so whenever one of those four IO lines changes, the value on the fifth IO line must be updated immediately to reflect the newly selected state.

This would be difficult to emulate in a microcontroller, especially if it also needed to do other work besides merely responding to changing address inputs. The microcontroller would need to enable a pin change interrupt on each of the four address lines, and the interrupt service routine would need to read the address lines and set the data line accordingly. In practice it would be even more complicated, because there are other IO lines like /ENABLE that would also need to be considered, and would need their own interrupt handlers.

Would it be fast enough? Maybe. By itself the code in the ISR would almost surely be fast enough. But there’s overhead to consider – the interrupt service latency, and the time needed to set up the ISR context and exit it again (saving and loading all the registers from the stack). Then there’s the complicating factor of other potential interrupts, or other invocations of this same interrupt, creating additional delays before the ISR actually starts running. And the further complication of some critical code sections where interrupts need to be disabled temporarily, creating yet more delays before the ISR runs.

How fast does it need to be, anyway? How much time elapses between when the Mac sets the four IO lines and when it reads the fifth line? Unfortunately I don’t know, and the worst part is there’s really no way to find out. Because there’s no external indication of when a read is occurring, there’s nothing I can measure with an oscilloscope or logic analyzer. But I can make some rough guesses, based upon examination of a disassembly of the floppy disk driver code in the Macintosh ROM. On the Macintosh Plus, there’s a whopping 6750 ns between address and data for most state reads, but some reads have a much narrower window of about 1500 ns, assuming I’ve interpreted the code correctly.

Are there other faster read examples, buried somewhere in the Mac Plus ROM that I missed? And do Macs with a higher clock speed than the Plus have correspondingly faster read behaviors of disk state info, or do they insert delay code to keep the speed constant? I don’t know, but the Mac Plus is likely far from the worst case. I’ll make a wild guess that a 500 ns response time would probably be fast enough. Keep in mind I’m talking about the speed between presentation of the four address IO lines, and reading state info on the fifth IO line, which is entirely different from the speed or data rate of the disk data itself. It’s not something that’s specced or defined in any official source; it’s just something that arises as a side-effect of the code in the Mac’s ROM.

Despite the uncertainty here, I think there’s a decent chance this could work, so I’m going to try some experiments to test my theories.

If even a 120 MHz microcontroller isn’t fast enough to handle this 4-address-1-data mechanism, I’ve sketched out a possible fallback plan that uses a few discrete logic chips – some muxes and latches and buffers. It’s arguable whether that would be preferable to just using a CPLD (or FPGA), and it would certainly be less flexible. But a small handful of discrete logic combined with a much faster microcontroller could still provide some major advantages, and simplify firmware design by reducing the number of programmable chips on the Emu board from two to one.

So there’s a lot to think about. And anything that grows out of all this ruminating won’t see the light of day for a long time. But I can’t resist daydreaming…


Viewing all articles
Browse latest Browse all 167

Trending Articles