ZVOICE

Authors

Wilf Rigter

Date

October 1986

Pages

9-11

Like many of you, I have been fascinated by the idea of speech synthesis. I even bought a GI SP0256 chip only to have it sit on a shelf, silently gathering dust.

After all, you need a robot or I/O board to make it work, right?

Meantime, others were having fun and contributing to the growing library of speech programs for Karl B.'s ZSPEAK unit.

Then I saw Ken A. do a demo of text to speech in software and heard him describe the potential of such systems for students in the ESL program.

Well, I got hooked and decided to contribute something of my own.

After some discussion with Ken and Harry S., it became apparent that at least 3 improvements might be needed to the existing ZSPEAK system:

  • Make it cheaper, simpler, and more compact.
  • Get rid of the screen flicker.
  • Improve the machine language driver.

ZVOICE is the solution to all 3.

The Hardware

For the first improvement I used the rule “simplicity without compromise” to achieve the desired objective.

Figure 1 shows the resulting circuit: a chip count that is hard to beat, a 50 percent reduction in size, and performance equal to the ZSPEAK system.

The I/O address remains the same as the ZSPEAK system, and hence existing ZSPEAK software can run unmodified on the new ZVOICE unit. Furthermore, new software for the ZVOICE unit can be used with the older unit.

The ZVOICE design takes advantage of the “on chip” I/O port of the SP0256 IC and uses the existing CPU clock signal instead of a crystal oscillator. The SBY handshake signal (low while the chip is busy) is gated onto the data bus by transistor Q1 when I/O address 27h is read. If the chip is not busy, a 6-bit data byte can be written to I/O address 17h to select the phoneme to be voiced.
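For readers following along on a modern machine, the handshake can be modelled in a few lines of Python. This is only a sketch: the port numbers are the decimal 39 and 23 that appear in Listing 1, the names are made up for illustration, and it assumes that only the low 6 bits of a written byte matter to the chip.

```python
# Port numbers as they appear in Listing 1 (decimal 39 and 23).
# All names here are hypothetical, chosen for this sketch only.
STATUS_PORT = 0x27   # read: SBY shows up on data bit 7
DATA_PORT = 0x17     # write: phoneme code

SBY_MASK = 0x80      # bit 7 set means the chip is standing by (not busy)
PHONEME_MASK = 0x3F  # assumption: only the low 6 bits select a phoneme


def is_ready(status_byte: int) -> bool:
    """True when bit 7 of the status read is set, i.e. the chip is not busy."""
    return (status_byte & SBY_MASK) != 0


def phoneme_bits(value: int) -> int:
    """Return the 6-bit phoneme code carried by a byte written to the chip."""
    return value & PHONEME_MASK
```

Under that assumption, a byte of 128 carries phoneme code 0 while still having bit 7 set.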

The ZVOICE can be used with both the TS1000/1500 and 2068 computers.

The Software

Item 2 on our wish list calls for “flicker free” operation and that means SLOW mode for the TS1000.
But existing BASIC software has a hard time keeping up with an SP0256 hungry for phonemes.

Like a printer buffer, a phoneme buffer could be designed in hardware but this would add complexity and cost.
What about a software buffer with some machine code to speed things up?

Listing 1 shows an ML routine that does the job.

When combined with the BASIC program in listing 2, the user can assemble a phonetic word, phrase, paragraph or even a whole book of phonemes in a string variable buffer. Then using RAND 1 and RAND USR 16516, this buffer is loaded into the SP0256, one phoneme at a time, while the printed version of the spoken phrase can be viewed without screen flicker.

The ML routine returns to BASIC after the last phoneme in the buffer, which is marked by a value greater than 127 (bit 7 set), such as the Q=128 terminator in Listing 2.
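The buffer layout is easy to picture with a small Python model. This is a sketch and not part of the ZVOICE software: the function name is invented, and the terminator value of 128 is taken from the Q variable in Listing 2.

```python
TERMINATOR = 128  # value of Q in Listing 2; bit 7 set marks the end of a phrase


def build_phrase(phonemes):
    """Return the bytes of one phrase: the phoneme codes, then the terminator."""
    for code in phonemes:
        if not 0 <= code <= 63:
            raise ValueError(f"phoneme code out of range: {code}")
    return bytes(phonemes) + bytes([TERMINATOR])
```

For example, build_phrase([1, 2, 3]) gives the four bytes 1, 2, 3, 128. The codes are arbitrary and not chosen to spell any particular word.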

You can try this software, together with listing 3, on your existing ZSPEAK units and discover integrated sight and sound.

Listing 1

16514 - 118,118,42,16,64,17,5,0,25
16523 - 237,91,50,64,25,219,39,230
16531 - 128,40,250,126,211,23,35
16538 - 230,128,40,242,201

Listing 2

1 REM
2 DIM A$(256)
10 LET Q=128
99 REM ASSEMBLE PHONEMES IN B$
100 LET B$=""
110 PRINT "INPUT PHONEMES 0 TO 63","""Q"" TERMINATES ENTRY"
120 INPUT A
130 LET B$=B$+CHR$ A
140 PRINT A;",";
150 IF A<>Q THEN GOTO 120
199 REM TEXTINPUT/B$ TO BUFFER
200 PRINT AT 19,0;"ENTER TEST WORD(S)"
210 INPUT T$
215 PRINT T$
217 LET A$(1 TO )=B$
220 RAND 1
230 RAND USR 16516
235 PRINT AT 19,0;"PRESS N/L TO CONTINUE"
240 PAUSE 10000
250 CLS
260 GOTO 100

Software Details

For those of you wishing to delve deeper into this concept, hang on to your heads.

The ML routine in Listing 1 is deceptively simple but conceptually powerful.

As a core routine, called from a BASIC or ML program, it expects to find the phoneme buffer in the first variable of the variable area and uses an offset passed in system variable “SEED” to point to the start of the phoneme phrase to be voiced.

This means that a number of such phrases can be arranged in the buffer, with the start of the selected phrase pointed to by “SEED”.
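A Python sketch of the idea, for illustration only: several phrases are packed end to end, and the starting position of each one is recorded. The function name is hypothetical, and the positions are counted from 1 on the assumption that RAND 1 in Listing 2 selects the very first byte of the buffer.

```python
def pack_phrases(phrases, terminator=128):
    """Concatenate phrases, ending each one with the terminator.

    Returns (buffer, offsets), where offsets[i] is the 1-based position of
    phrase i in the buffer, i.e. the value that would be passed via SEED.
    """
    buffer = bytearray()
    offsets = []
    for phrase in phrases:
        offsets.append(len(buffer) + 1)  # 1-based, like BASIC string slicing
        buffer.extend(phrase)
        buffer.append(terminator)
    return bytes(buffer), offsets
```

Packing the three phrases [1, 2], [3] and [4, 5, 6] this way gives the offsets 1, 4 and 6, so under the same assumption RAND 4 followed by the USR call would voice only the second phrase.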

The ML routine adds the offset to the start of the buffer. The SBY line is tested using IN A,(27h), and the program loops until bit 7 = 1. A phoneme is then loaded using LD A,(HL) and OUT (17h),A. After each phoneme, AND 80h tests bit 7 of the byte just sent: if it is clear, the routine loops back for the next phoneme; if it is set, the routine returns to the calling program.
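The control flow of the routine can be mimicked in Python. This is a behavioural sketch, not a Z80 emulation: FakeChip is a made-up stand-in for the SP0256 that reports busy for a couple of polls after each write, and the seed is treated as a 1-based position in the buffer, as RAND 1 in Listing 2 suggests.

```python
class FakeChip:
    """Hypothetical SP0256 stand-in: busy for a few polls after each write."""

    def __init__(self, busy_polls=2):
        self.busy_polls = busy_polls
        self._remaining = 0
        self.written = []

    def read_status(self):
        # Bit 7 clear while busy, set when ready for the next phoneme.
        if self._remaining > 0:
            self._remaining -= 1
            return 0x00
        return 0x80

    def write(self, value):
        self.written.append(value)
        self._remaining = self.busy_polls


def speak(buffer, seed, chip):
    """Send phonemes from buffer, starting at 1-based position seed."""
    index = seed - 1
    while True:
        # Poll the status until bit 7 is set (chip not busy).
        while (chip.read_status() & 0x80) == 0:
            pass
        value = buffer[index]
        chip.write(value)   # the byte is sent before bit 7 is tested
        index += 1
        if value & 0x80:    # terminator reached: back to the caller
            return
```

Note that in this model, as in the order of the instructions in Listing 1, the terminator byte itself is written to the chip before the routine returns.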

If the calling program is in machine language, a slight variation may be used: the SBY polling loop can instead JR Z,08 to return to the calling program with the Z flag set whenever the chip is busy, allowing the program to execute other stuff between phonemes.
This might provide continuous speech output while writing data to the video screen (e.g. a face with lips moving while speaking).
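The same variation can be sketched in Python to show how the calling program gets its time back. Everything here is hypothetical modelling: FakeChip pretends to be busy for three polls after each phoneme, speak_step returns at once when the chip is busy, and run simply counts the free slots in which a real program would update the screen.

```python
class FakeChip:
    """Hypothetical SP0256 stand-in: busy for a few polls after each write."""

    def __init__(self, busy_polls=3):
        self.busy_polls = busy_polls
        self._remaining = 0
        self.written = []

    def read_status(self):
        if self._remaining > 0:
            self._remaining -= 1
            return 0x00  # bit 7 clear: busy
        return 0x80      # bit 7 set: ready

    def write(self, value):
        self.written.append(value)
        self._remaining = self.busy_polls


def speak_step(buffer, index, chip):
    """Try to send one phoneme; return (new_index, done).

    If the chip is busy, return immediately with the index unchanged.
    The index is 0-based here for convenience.
    """
    if (chip.read_status() & 0x80) == 0:
        return index, False
    value = buffer[index]
    chip.write(value)
    return index + 1, bool(value & 0x80)


def run(buffer, chip):
    """Drive speak_step to the end of the phrase, counting free slots."""
    index, done, free_slots = 0, False, 0
    while not done:
        new_index, done = speak_step(buffer, index, chip)
        if new_index == index:
            free_slots += 1  # chip was busy: time for other work
        index = new_index
    return free_slots
```

With the three-byte phrase 5, 6, 128 and a chip that stays busy for three polls, run reports six free slots, three after each of the first two phonemes.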

Wow, that left me kind of breathless.

A slower but effective method uses a BASIC calling program which executes the other stuff between words when a PAUSE occurs. Still longer pauses between sentences can be used to do floating point calculations which may require more time.

Careful program organization provides smooth results, with the apparent execution of 2 simultaneous tasks.
True multitasking can be achieved by rewriting the video routine at 281h, which is executed 60 times per second, and calling this routine by loading the IX register with the starting address of the new video routine. Considerable fine tuning is required to synchronize with the video timing but the results are worthwhile for this application which could include built-in commands for verbal screen copy, LSPEAK, etc.

This is the same technique used for software highres graphics.
