Written by J. Laroche at the Center for Music Experiment at UCSD, San Diego, California. December 1990.

Using Host to DSP DMA.

It is now possible to use DMA to pass data from the host to the DSP. The DMA transfer rate is about 2 Mbytes per second. Since one 16-bit mono channel at 44.1 kHz needs 88200 bytes per second, this means that about 20 mono 44.1 kHz channels can be processed simultaneously using DMA protocol. Therefore, DMA should be used whenever a fast transfer rate is necessary, as in the case of sound files.

The document 04_Through_DSP gives an example of how to play a sound through the DSP, sending the samples to the DSP in a non-DMA stream, then getting them back to the DACs using DMA. This works with most sounds except when the sampling rate is 44100 and the sound is stereo, in which case the driver cannot send enough data to keep up with the DACs. This is the kind of case where you need to use DMA.

The C program is given below. It can also be found in Examples/06_DMA_To_DSP.

// ------------------------------- Beginning of program.
#import <sound/sound.h>
#import <sound/sounddriver.h>
#import <mach.h>
#import <stdio.h>
#import <stdlib.h>

#define Error(A,B) if((A)) {fprintf(stderr,"%s %s\n",B, SNDSoundError((A)));\
                            mach_error(B,(A)); }
#define DMASIZE 4096

static int done;

static void write_started(void *arg, int tag)
{
    fprintf(stderr,"Starting playing... %d \n",tag);
}

static void write_completed(void *arg, int tag)
{
    fprintf(stderr,"Playing done... %d\n",tag);
    done = 1;
}

static void over_run(void *arg, int tag)
{
    fprintf(stderr,"Under or Over run... %d\n",tag);
}

void main (int argc, char *argv[])
{
    static port_t dev_port, owner_port, cmd_port;
    static port_t reply_port, read_port, write_port;
    int i, protocol;
    kern_return_t k_err;
    snddriver_handlers_t handlers = { 0, 0, write_started,
                write_completed,0,0,0,over_run, 0};
    msg_header_t *reply_msg;
    SNDSoundStruct *dspStruct;
    SNDSoundStruct *sound;
    short *location;
    int length;
    int low_water = 48*1024;
    int high_water = 512*1024;  // 64 instead of 512 makes it work very badly!
    short *foo;
    int LENGTH;
    int low_SR = 0;
    int stereo = 0;

    if(argc == 1)
    {
        printf("I need a 16bit linear sound file...\n");
        exit(1);
    }

    k_err = SNDAcquire(SND_ACCESS_DSP|SND_ACCESS_OUT,0,0,0,
                NULL_NEGOTIATION_FUN,0,&dev_port,&owner_port);
    Error(k_err,"SND and DSP acquisition ");

    k_err = snddriver_get_dsp_cmd_port(dev_port,owner_port,&cmd_port);
    Error(k_err,"Cmd port acquisition ");

    k_err = SNDReadSoundfile(argv[1], &sound);
    Error(k_err,argv[1]);

    low_SR = (sound->samplingRate == SND_RATE_LOW);
    stereo = (sound->channelCount == 2);
    printf("Playing at %d Hz %s\n",((low_SR)?22050:44100),((stereo)?
                "stereo":"mono"));

    k_err = SNDGetDataPointer(sound,(char**)&location,&length,&i);
    Error(k_err,"Data Pointer");

    protocol = SNDDRIVER_DSP_PROTO_RAW;
    k_err = snddriver_stream_setup(dev_port, owner_port,
                SNDDRIVER_DMA_STREAM_TO_DSP,
                DMASIZE,2,
                low_water, high_water,
                &protocol, &read_port);
    Error(k_err,"Stream 1 set_up");

    k_err = snddriver_stream_setup(dev_port, owner_port,((low_SR) ?
                SNDDRIVER_STREAM_DSP_TO_SNDOUT_22:
                SNDDRIVER_STREAM_DSP_TO_SNDOUT_44),
                DMASIZE, 2,
                low_water, high_water,
                &protocol, &write_port);
    Error(k_err,"Stream 2 set_up");

    k_err = snddriver_dsp_protocol(dev_port, owner_port, protocol);
    Error(k_err,"Protocol set-up ");

    k_err = port_allocate(task_self(),&reply_port);
    Error(k_err,"Reply port allocation ");

    k_err = SNDReadDSPfile("perso_b.lod", &dspStruct, NULL);
    Error(k_err,"Reading .lod file ");

    k_err = SNDBootDSP(dev_port, owner_port, dspStruct);
    Error(k_err,"Booting DSP ");
    printf("DSP booted\n");

    if(!stereo)     // If mono, tell it to the DSP!
        k_err = snddriver_dsp_host_cmd(cmd_port,21,SNDDRIVER_LOW_PRIORITY);

    // To use DMA to the DSP, you need alignment with vm pages. Therefore you
    // need to allocate virtual memory, and copy your sound in it. Here, we
    // just copy an integer number of virtual memory pages. To play the whole
    // sound, you would need to allocate more memory, and copy the rest
    // in a for loop. vm_page_size is a global variable containing the size of
    // the virtual memory pages (in bytes.)

    LENGTH = (length*sizeof(short)/vm_page_size)*vm_page_size/sizeof(short);

    // The return values must be stored in k_err for the checks to mean anything.
    k_err = vm_allocate(task_self(),(vm_address_t *)(&foo),2*LENGTH,TRUE);
    Error(k_err,"VM Allocation ");

    k_err = vm_write(task_self(), (vm_address_t)(foo),
                (pointer_t)location,2*LENGTH);
    Error(k_err,"VM Write ");

    k_err = snddriver_stream_start_writing(read_port,
                (void *)foo,
                LENGTH,
                1,
                0,0,
                1,1,0,0,0,1,
                reply_port);
    Error(k_err,"Starting writing ");

    reply_msg = (msg_header_t *)malloc(MSG_SIZE_MAX);
    done = 0;

    // As written, nothing listens on reply_port, so the handlers above are
    // never called and done stays 0: stop the program by hand.
    while (done != 1)
    {
        int i[2];       // 2 values: the header, and the value
        i[0] = 1;       // 1 stands for volume
        printf("Value of the volume (max 8388608)? ");
        scanf("%d",i+1);
        k_err = snddriver_dsp_write(cmd_port,i,2,sizeof(int),
                    SNDDRIVER_MED_PRIORITY);
    }
}
// ---------------------------------- End of program.

SETTING-UP THE DRIVER.

We start by reading the sound file and getting its sampling rate and its number of channels, because we'll need this information to set up the streams correctly. Then comes the actual stream set-up:

    protocol = SNDDRIVER_DSP_PROTO_RAW;
    k_err = snddriver_stream_setup(dev_port, owner_port,
                SNDDRIVER_DMA_STREAM_TO_DSP,
                DMASIZE,2,
                low_water, high_water,
                &protocol, &read_port);
    k_err = snddriver_stream_setup(dev_port, owner_port,((low_SR) ?
                SNDDRIVER_STREAM_DSP_TO_SNDOUT_22:
                SNDDRIVER_STREAM_DSP_TO_SNDOUT_44),
                DMASIZE, 2,
                low_water, high_water,
                &protocol, &write_port);

We set up two streams: one from the DSP to the DACs with a sampling rate corresponding to that of the sound file, and one from the memory to the DSP. The latter is special since instead of SNDDRIVER_STREAM_TO_DSP (as in the non-DMA examples) we use SNDDRIVER_DMA_STREAM_TO_DSP. This signals to the driver that the stream from the memory to the DSP will use DMA protocol. It also means that the DSP code must implement the DMA protocol to get samples from the host. The high_water value is now set to 512*1024.
This is higher than the value used in the non-DMA example: if the high water mark is not high enough, we'll get drop-outs, as in the non-DMA case. 512*1024 is an empirical value that gives satisfactory results.

The DSP is then loaded with the DSP code, and booted. If the sound is mono (only one channel), a host command is issued to tell the DSP that only one channel is being sent (see the DSP code below).

SENDING THE SAMPLES.

Another difference from the non-DMA stream is that the samples we send have to be aligned with virtual memory pages (if they are not, you'll get a "bad alignment" message). This means that the first sample must lie exactly at the start of a virtual memory page. This is not usually the case when you read a sound file using SNDReadSoundfile(), which just copies the samples to a newly allocated place in memory, not necessarily corresponding to a virtual memory page boundary. This is why we allocate virtual memory

    vm_allocate(task_self(),(vm_address_t *)(&foo),2*LENGTH,TRUE);

and copy the sound into it

    vm_write(task_self(), (vm_address_t)(foo), (pointer_t)location,2*LENGTH);

Refer to the mach functions document for info about these functions. The catch is that these mach functions allocate (resp. copy) only an integer number of vm pages. The size of a vm page is given in the global variable vm_page_size, and is currently 8192 bytes. Therefore, if you want to copy the whole sound into an area of virtual memory, you need to allocate more than is really needed, then copy an integer number of pages using vm_write() and finish the rest with a for loop.
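The "allocate more than needed, copy whole pages, finish the rest in a loop" idea can be sketched in portable C. This is only an illustration of the size arithmetic, under stated assumptions: PAGE_BYTES is a hard-coded stand-in for vm_page_size, calloc() and memcpy() stand in for vm_allocate() and vm_write(), and the helper names (round_down_to_page, round_up_to_page, copy_padded) are made up for this sketch. calloc() does not guarantee page alignment, so the real program would still need the mach calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed page size; on the NeXT, read the vm_page_size global instead. */
#define PAGE_BYTES 8192u

/* Largest multiple of `page` that is <= bytes. */
static size_t round_down_to_page(size_t bytes, size_t page)
{
    assert(page > 0);
    return (bytes / page) * page;
}

/* Smallest multiple of `page` that is >= bytes (what to allocate). */
static size_t round_up_to_page(size_t bytes, size_t page)
{
    assert(page > 0);
    return ((bytes + page - 1) / page) * page;
}

/*
 * Copy `bytes` of sound into a zero-filled buffer made of whole pages.
 * Returns NULL on allocation failure; *padded_bytes receives the size
 * of the returned buffer, in bytes.
 */
static short *copy_padded(const short *src, size_t bytes, size_t page,
                          size_t *padded_bytes)
{
    size_t whole = round_down_to_page(bytes, page);
    size_t total = round_up_to_page(bytes, page);
    const unsigned char *in = (const unsigned char *)src;
    unsigned char *out = calloc(total ? total : 1, 1);
    size_t k;

    if (out == NULL)
        return NULL;
    memcpy(out, in, whole);            /* the whole pages (vm_write's job) */
    for (k = whole; k < bytes; k++)    /* the remaining partial page */
        out[k] = in[k];
    *padded_bytes = total;             /* the padding after `bytes` is silence */
    return (short *)out;
}
```

With a 10000-byte sound and 8192-byte pages, copy_padded() would return a 16384-byte buffer whose last 6384 bytes are zero, so the padded tail plays as silence instead of being dropped.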
In the example program, we just copy as many pages as possible, and discard the remaining samples:

    LENGTH = (length*sizeof(short)/vm_page_size)*vm_page_size/sizeof(short);

The samples are then sent using the classical call to the function

    k_err = snddriver_stream_start_writing(read_port,
                (void *)foo,
                LENGTH,
                1,
                0,0,
                1,1,0,0,0,1,
                reply_port);

Note that we ask for write_started, write_completed and overflow messages. Overflow messages are sent each time the DACs run out of samples. They are usually associated with blanks or drop-outs.

CONTROLLING THE PLAY-BACK.

Now that the driver is sending and receiving data to and from the DSP, we need to control the play-back, and therefore we need to send parameters to the DSP (play-back volume, amount of pitch shift...). We cannot use the scheme we used when we only had DMA from the DSP to the host, because there is a risk of collision between the samples and the parameter values. The driver won't reliably write a data word and send a host command within one DMA-out buffer. However, it is possible to send many data words in between DMA buffers. We pass data the following way:

Each parameter update is composed of a header and a value. The header indicates which parameter is to be updated, and the value is the new value of that parameter. The DSP stores these words in a queue inside its memory and, after each DMA-out buffer, dispatches the values to the parameters according to the headers.

For each parameter update, we have two values, the first of which contains the header (1 in our case):

    int i[2];
    i[0] = 1;

These two values are sent using the classical snddriver_dsp_write() function:

    snddriver_dsp_write(cmd_port,i,2,sizeof(int),SNDDRIVER_MED_PRIORITY);

Note that it's the driver's responsibility to avoid collisions between DMA and parameters. It does that pretty well.

DSP CODE.

The DSP code is given below. It can also be found in Examples/06_DMA_To_DSP.
;; -------------------------- Beginning of program
        include "ioequ.asm"

IW_Buff         equ     8192            ;Start address of input buffer
Buff_size       equ     8191
Control_Queue   equ     0               ; Start address of the control value queue
Control_Size    equ     99              ; Size of that queue.
DMA_SIZE        equ     4096
DM_R_REQ        equ     $050001         ;message to host to request dma-OUT
DM_W_REQ        equ     $040002         ;message to host to request dma-IN
VEC_R_DONE      equ     $0024           ;host command: dma-OUT complete
VEC_W_DONE      equ     $0028           ;host command: dma-IN complete
Vol_Header      equ     $01             ; Signals that next value is a volume.
Stuff_Header    equ     $02             ; Signals that next value is something.

;;;------------------------- Variable locations ;;;
x_sFlags        equ     $00fd           ;dspstream flags
DMA_DONE        equ     0               ; indicates that dma is complete
DMA_ACCEPTED    equ     1
Stop_Flag       equ     $00             ; Stop DMA flag
bull            equ     $02
volume          equ     $03

writeHost macro source
_one    jclr    #m_htde,x:m_hsr,_one
        movep   source,x:m_htx
        endm

readHost macro dest
_two    jclr    #m_hrdf,x:m_hsr,_two
        movep   x:m_hrx,dest
        endm

        org     p:$0
        jmp     reset

        org     p:$20
        movep   x:m_hrx,y:(R2)+
        nop

        org     p:$2A
        move    #>2,N1                  ; When the sound is mono.
        nop

        org     p:VEC_R_DONE            ; DMA-OUT completed.
        bset    #DMA_DONE,x:x_sFlags

        org     p:$2C                   ; DMA-IN accepted: start reading.
        jsr     startDMA_In

        org     p:100
reset
        movec   #6,omr                  ;data rom enabled, mode 2
        bset    #0,x:m_pbc              ;host port
        bset    #3,x:m_pcddr            ; pc3 is an output with value
        bclr    #3,x:m_pcd              ; zero to enable the external ram
        movep   #>$000000,x:m_bcr       ;no wait states on the external sram
        movep   #>$00BC00,x:m_ipr       ;intr levels: SSI=2, SCI=1, HOST=2
        clr     a
        move    a,x:x_sFlags            ;clear flags
        bset    #m_hcie,x:m_hcr         ;host command interrupts
        move    #0,sr                   ;enable interrupts
        move    #>1,N1                  ; Stereo sound by default.
        jmp     main

main
        move    #>IW_Buff,R0
        move    #>Buff_size,M0
        move    #>0,R1
        move    #>DMA_SIZE-1,M1
        move    #>Control_Queue,R2
        move    #>Control_Queue,R3
        move    #>Control_Size,M2
        move    #>Control_Size,M3
        move    #>.9,a
        move    a,x:volume
        clr     a
        move    a,x:x_sFlags
_main_loop
        jsr     Read_DMA_Buffer         ; Get a buffer from the host
        jsr     Write_DMA_Buffer        ; Send it back!
        jsr     update_para             ; dispatch the received control values.
        jmp     _main_loop              ; Until the next earthquake...

;; Subroutine that reads one complete DMA from the host, and puts it in the
;; input buffer. If the sound is mono, then two samples are copied instead
;; of just one.
Read_DMA_Buffer
        jset    #m_hf1,x:m_hsr,Read_DMA_Buffer
        move    #>IW_Buff,R0
        bclr    #m_hrie,x:m_hcr         ; Disable the host receive interrupt.
                                        ; since the following values are samples...
        writeHost #DM_W_REQ
        move    #>IW_Buff,R0
        jclr    #m_hf0,x:m_hsr,_ready
_ready
        btst    #DMA_ACCEPTED,x:x_sFlags
        jcc     _ready
        move    #DMA_SIZE,b
        do      b,_end_DMA_loop
_clear
        jclr    #m_hrdf,x:m_hsr,_clear
        movep   x:m_hrx,a
        move    a,x:bull
        jclr    #15,x:bull,_no_correct  ; This is a modification which corrects
        move    #>$FF,a2                ; the driver's bug. It sign-extends the
        move    #>$FF0000,X1            ; received short value, if necessary
        or      X1,a
_no_correct
        rep     N1                      ; If mono, copies samples twice.
        move    a,y:(R0)+               ; for left and right channels.
_end_DMA_loop
        jclr    #m_hrdf,x:m_hsr,_then
        move    x:m_hrx,X0              ; Continue reading incoming data...
_then
        jset    #m_hf1,x:m_hsr,_end_DMA_loop ; until HF1 is reset.
        rts

;; Subroutine that sends a DMA buffer to the host.
;; This is a classical DMA out routine...
Write_DMA_Buffer
        bset    #m_hrie,x:m_hcr         ; enable reception of control values.
        move    #>IW_Buff,R0
        do      N1,_ackEnd              ; If mono, we need to send two buffers
_DMA_out                                ; for each received one...
        jclr    #m_htde,x:m_hsr,_DMA_out
        movep   #DM_R_REQ,x:m_htx
_ackBegin
        jclr    #m_hf1,x:m_hsr,_ackBegin ; wait for HF1 to go high
        move    #>DMA_SIZE,b
        do      b,_prodDMA
_ddd
        move    y:(R0)+,X1
        move    x:volume,X0
        mpyr    X0,X1,a
        writeHost a
_prodDMA
        btst    #DMA_DONE,x:x_sFlags
        jcs     _endDMA
        jclr    #m_htde,x:m_hsr,_prodDMA
        movep   #0,x:m_htx              ; send zeros until noticed
        jmp     _prodDMA
_endDMA
        bclr    #DMA_DONE,x:x_sFlags    ; Clear the flag for next buffer!
_ackEnd
        rts

;; Subroutine called when the host is ready to send the samples. It reads an
;; integer.
startDMA_In
        readHost X0                     ; The host sends an integer.
        bset    #DMA_ACCEPTED,x:x_sFlags ; But we don't really need it.
        rti

;; Subroutine that checks the control values queue, and dispatches the received
;; values to the corresponding parameters (here, only the volume...)
update_para
        move    R2,a
        move    R3,b
        cmp     a,b     #>Vol_Header,b  ; Is the queue empty?
        jeq     _end
        move    y:(R3)+,a               ; If not, what's the header?
        cmp     a,b     #>Stuff_Header,b ; Is it a volume header?
        jeq     _update_vol             ; YES: update the volume
        cmp     a,b                     ; Is it a stuff header?
        jeq     _update_stuff           ; YES: update the stuff, etc...
        jmp     update_para             ; do it again Sam
_end
        rts
_update_vol                             ; Updates the value of the volume
        move    y:(R3)+,a               ; The next value is the volume.
        move    a,x:volume
        jmp     update_para
_update_stuff                           ; would update the value of another
        jmp     update_para             ; parameter (amount of reverb etc...)
;; ----------------------- End of program.

To assemble this DSP program, you would type:

    asm56000 -a -b -os,so -l myProgram.asm

Host to DSP DMA transfer is initiated by the DSP and proceeds as follows:

· When the DSP is ready to read a DMA buffer, it sends the driver a DM_W_REQ request.
· The driver performs all sorts of initializations and, when it's ready to start, sends a DMA-accepted host command (address $2C on the DSP) along with an integer containing additional info.
· Upon reception of the host command, the DSP can start reading data from the host. After one DMA buffer has been read, it continues reading until HF1 is reset.
· When the host has finished sending a buffer, it resets HF1 and sends a host command to the DSP (address $24 in the DSP program memory.)
· When it receives this host command (or as soon as it sees that HF1 is low), the DSP can get another DMA buffer, or send a DMA-out buffer to the host.

The program implemented here is very simple: it reads one DMA buffer from the host, sends it back to the host, also using DMA protocol, and loops forever. The DMA out protocol is the same as in other examples (05_Dig_Ears etc...)
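The ordering of the host to DSP handshake listed above can be summarized as a tiny state machine. The C sketch below is only a model for reasoning about that sequence. The type and function names (dsp_state, dma_event, dma_step) are made up for illustration and belong neither to the sound driver nor to the DSP program, and the model ignores the actual sample transfer.

```c
#include <assert.h>

/* Where the DSP stands in one host-to-DSP DMA transfer. */
typedef enum {
    DSP_IDLE,        /* no transfer pending */
    DSP_REQUESTED,   /* DM_W_REQ sent, waiting for the driver */
    DSP_READING,     /* DMA-accepted host command received, reading data */
    DSP_DONE         /* HF1 reset by the host: the buffer is complete */
} dsp_state;

/* The three events of the handshake, in the order the text gives them. */
typedef enum {
    EV_SEND_DM_W_REQ,   /* DSP -> driver: request a DMA buffer */
    EV_DMA_ACCEPTED,    /* driver -> DSP: host command at $2C */
    EV_HF1_CLEARED      /* host resets HF1 after the last word */
} dma_event;

/*
 * Advance the model by one event.
 * Returns 0 if the event is legal in the current state, -1 otherwise
 * (in which case the state is left unchanged).
 */
static int dma_step(dsp_state *state, dma_event ev)
{
    assert(state != 0);
    switch (*state) {
    case DSP_IDLE:
    case DSP_DONE:       /* after a buffer, the DSP may request another one */
        if (ev == EV_SEND_DM_W_REQ) { *state = DSP_REQUESTED; return 0; }
        break;
    case DSP_REQUESTED:
        if (ev == EV_DMA_ACCEPTED) { *state = DSP_READING; return 0; }
        break;
    case DSP_READING:
        if (ev == EV_HF1_CLEARED) { *state = DSP_DONE; return 0; }
        break;
    }
    return -1;
}
```

Feeding the three events in order takes the model from DSP_IDLE to DSP_DONE; any other order is rejected, which mirrors the fact that the DSP must not start reading samples before the DMA-accepted host command has arrived.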
The DMA in is done as follows: a DM_W_REQ DMA request is sent to the host and the DSP waits until it receives a DMA-accepted host command. This host command calls the startDMA_In subroutine, which reads an integer and sets a bit in x_sFlags:

startDMA_In
        readHost X0                     ; The host sends an integer.
        bset    #DMA_ACCEPTED,x:x_sFlags ; But we don't really need it.
        rti

The DSP checks that bit in x_sFlags to find out whether the driver is ready to send the data:

_ready
        btst    #DMA_ACCEPTED,x:x_sFlags
        jcc     _ready

Then it starts reading the samples from the host, sign-extending them if necessary (see Pitfalls for more info about the necessity of sign extension.) After reading one DMA buffer, the DSP continues reading data sent by the host until the driver resets HF1, indicating that the DMA transfer is done:

_end_DMA_loop
        jclr    #m_hrdf,x:m_hsr,_then
        move    x:m_hrx,X0              ; Continue reading incoming data...
_then
        jset    #m_hf1,x:m_hsr,_end_DMA_loop ; until HF1 is reset.

The DSP can then perform other tasks, and eventually sends the buffer back using the standard DSP->host DMA protocol.

RECEIVING PARAMETER UPDATES.

As we saw in the C program, the host sends parameters as pairs of values {header,value}. The DSP receives them while it is sending a DMA-out buffer (to avoid collisions) using host data receive interrupts, and puts them in a control queue indexed by R2:

        org     p:$20
        movep   x:m_hrx,y:(R2)+
        nop

These host data receive interrupts are disabled during DMA-in because the driver is then sending only samples, and the DSP reads them using the normal hardware handshake. At the end of the DMA-out buffer, the DSP calls a subroutine, update_para, to update its parameters according to what is present in its control queue.
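To make the queue mechanism concrete, here is a small C model of it. It is a sketch under assumptions, not a translation of the DSP code: the struct and function names are made up, write_pos and read_pos play the roles of R2 and R3, and the queue length of 100 words assumes that the modifier value Control_Size = 99 selects modulo-100 addressing. The dispatch function follows the update_para routine of the listing in spirit.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_WORDS  100   /* assumed: M2 = M3 = 99 means modulo 100 */
#define VOL_HEADER   1     /* same values as Vol_Header / Stuff_Header */
#define STUFF_HEADER 2

typedef struct {
    int    words[QUEUE_WORDS];
    size_t write_pos;      /* role of R2: advanced by the receive interrupt */
    size_t read_pos;       /* role of R3: advanced by the dispatcher */
    int    volume;
    int    stuff;
} control_queue;

/* What the receive interrupt does: store one word, advance R2. */
static void queue_push(control_queue *q, int word)
{
    assert(q != NULL);
    q->words[q->write_pos] = word;
    q->write_pos = (q->write_pos + 1) % QUEUE_WORDS;
}

/* Read one word and advance R3. */
static int queue_pop(control_queue *q)
{
    int word = q->words[q->read_pos];
    q->read_pos = (q->read_pos + 1) % QUEUE_WORDS;
    return word;
}

/*
 * Drain the queue, the way update_para is meant to after a DMA-out buffer.
 * Returns the number of parameters updated. A header whose value has not
 * arrived yet is simply dropped: that is a choice of this model.
 */
static int dispatch_updates(control_queue *q)
{
    int updated = 0;

    assert(q != NULL);
    while (q->read_pos != q->write_pos) {      /* empty when R3 == R2 */
        int header = queue_pop(q);
        int has_value = (q->read_pos != q->write_pos);

        if (header == VOL_HEADER && has_value) {
            q->volume = queue_pop(q);
            updated++;
        } else if (header == STUFF_HEADER && has_value) {
            q->stuff = queue_pop(q);
            updated++;
        }
        /* Unknown headers are skipped one word at a time. */
    }
    return updated;
}
```

Pushing {1, 4194304} and then calling dispatch_updates() would set volume to 4194304 and leave the queue empty, which corresponds to what the host does with one call to snddriver_dsp_write() in the C program.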
The queue is examined using R3, until no control data is present:

update_para
        move    R2,a
        move    R3,b
        cmp     a,b
        jeq     _end            ;; _end simply returns (rts)

The headers in the queue are compared to the predefined headers, and the DSP jumps to the corresponding function when it recognizes a header (here, the Stuff_Header could be anything, a filter coefficient etc...) It does that until the queue has been completely checked.

        move    #>Vol_Header,b
        move    y:(R3)+,a
        cmp     a,b
        jeq     _update_vol
        move    #>Stuff_Header,b
        cmp     a,b
        jeq     _update_stuff
        ......
        jmp     update_para

The update function simply stores the value that follows the header, and returns to the beginning of the queue check:

_update_vol
        move    y:(R3)+,a
        move    a,x:volume
        jmp     update_para

This scheme for passing control data to the DSP is very general (very similar to that used in the Orchestra). It makes it possible to send many data words in one call to the snddriver_dsp_write() function, and always updates the parameters at the end of a DMA buffer (which is a nice way to be sure all the parameters are updated at the same time.) This scheme could have been used in all the other examples where host commands were used instead.

IMPORTANT REMARKS.

· After sending a DMA-OUT buffer to the host, the DSP expects another DMA-IN buffer and therefore sends the host a DM_W_REQ DMA request. If the host doesn't have any more data to send, the driver stays stuck in that state, with the DSP waiting for a DMA buffer. This can be annoying. The C program should therefore signal the DSP, after the last DMA buffer, that all the samples have been sent, then send an additional DMA buffer containing junk data to unblock the driver. This can be done with a special host command sent in the write_completed function, for example. The DSP should recognize the host command, receive the following DMA buffer (throwing away the data), and NOT send another DMA-IN request after the end of the reception.
· Be sure you send the correct amount of data to the DSP when you use the snddriver_stream_start_writing() function. Sending data that has not been allocated or copied can wedge the NeXT machine (with a long reboot...) Remember that the arguments to vm_allocate() and vm_write() are expressed in bytes, not in samples!
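In the example program, snddriver_stream_start_writing() is given LENGTH (a count of 16-bit samples) while vm_allocate() and vm_write() are given 2*LENGTH (a count of bytes). A small guard like the following can catch a mix-up between the two units before anything is sent. It is a sketch only: samples_to_bytes() and write_fits() are hypothetical helpers, not part of any NeXT library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes occupied by `samples` 16-bit samples: the unit expected by
   vm_allocate() and vm_write(), according to the remark above. */
static size_t samples_to_bytes(size_t samples)
{
    assert(samples <= SIZE_MAX / sizeof(int16_t));   /* no overflow */
    return samples * sizeof(int16_t);
}

/* Returns 1 if sending `samples` 16-bit samples stays inside a buffer of
   `allocated_bytes` bytes, 0 if it would read past the end of it. */
static int write_fits(size_t samples, size_t allocated_bytes)
{
    return samples <= allocated_bytes / sizeof(int16_t);
}
```

For one 8192-byte page, write_fits(4096, 8192) holds, while write_fits(8192, 8192) does not: passing the byte count where a sample count is expected would make the driver read twice as much memory as was allocated.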