*****************************************************************************
*                        FIDDLING WITH I/O PORTS                            *
*                                                                           *
*                                                                           *
* Copyright (c)  2001  Daniel P. Bovet, Marco Cesati, and Cosimo Comella    *
*    Permission is granted to copy, distribute and/or modify this document  *
*    under the terms of the GNU Free Documentation License, Version 1.1,    *
*    published by the Free Software Foundation; with no Invariant Sections, *
*    with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the *
*    license is included in the file named LICENSE.                         *   
*                                                                           *
* (version 1.1)                                                             *
*****************************************************************************  
The objective is to illustrate how to write User Mode programs that access
I/O ports directly. Several well-known application programs do exactly that:
the X Window System, for instance, takes full control of the I/O ports of both
the graphics interface and the mouse. The same holds for many recreational
applications.

Our talk consists of several parts:

a) introduce the general I/O port model 

b) describe the programming interface of the serial port
  
c) illustrate a few programs that fiddle with the serial port

d) make some timing measurements: how long does it take for the serial port to
send a character along the line?



*****************************************************************************
a)                     THE GENERAL I/O PORT MODEL
*****************************************************************************

On the Intel 80x86 architecture, I/O is performed by means of the two I/O
instructions in and out, which fetch data from and send data to an I/O
port.

Each device interfaces the I/O bus with 4 kinds of ports:

a) Input register   -  data coming from the device and going into RAM are stored
                       here
		    
b) Output register  -  data coming from RAM and going to the device are stored
                       here
		     
c) Command register -  commands sent to the device must be sent to that register
                       by an out instruction
		       
d) Control register -  the current device status can be tested by reading this
                       register via an in instruction

In practice, things are not so neat and the same register is used for different
functions (see the serial port example below).




*****************************************************************************
b)                THE SERIAL PORT PROGRAMMING INTERFACE
*****************************************************************************

Each I/O device has its own set of I/O ports. You may get their addresses by
typing:

            cat /proc/ioports
	    
The result is something like:

0000-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial(auto)
0376-0376 : ide1
0378-037f : parport0
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial(auto)
0778-077a : parport0
d800-d807 : ide0
d808-d80f : ide1
	    
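In this text the first serial port, ttyS0 (or COM1 in the DOS world), is
assigned the 8 consecutive one-byte I/O ports ranging from 0x2f8 to 0x2ff. On
many PCs the assignment is the other way round (ttyS0 at 0x3f8, ttyS1 at
0x2f8), so check which range your own port uses before trying the programs
below.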

Now the difficult part is to get documentation about what each port is supposed
to do. 

In the case of the serial port, which has been around for many years, this
is easy (we got the information we needed from Messmer's book "The Indispensable
PC Hardware Book", but you can find it in several places).

In other cases, getting the documentation may be very hard. A well-known
manufacturer of network cards, for instance, refuses to give its customers
any information about the programming interface of its products.

Coming back to the serial port and assuming a base address equal to 0x2f8
(ttyS0), we have the following layout:

Offset

00          Receiver register or Transmitter register

01          Interrupt enable register

02          Interrupt identification register

03          Line control register

04          Modem control register

05          Line status register

06          Modem status register

07          Scratch pad register

We'll only play with the first three registers.

********************************************************************************
                RECEIVER/TRANSMITTER REGISTER

If the data length is smaller than 8, the most significant bits are not valid
and should be masked. The UART's transmitter control inserts start, stop and
parity bits automatically.

********************************************************************************
                   INTERRUPT ENABLE REGISTER
		    
Used to control interrupt requests coming from the UART

bits 7-4      - always set to 0

bit 3 (SINP)  - if set, the UART activates the INTRP line if the state of one
                of the RS-232C input signals changes (ringing signal, etc.)
	       
bit 2 (ERBRK) - if set, the UART activates the INTRP line if the receiver
                control detects a parity, overrun, or framing error, or a 
		break of the connection during the course of an incoming byte
		
bit 1 (TBE)   - if set, the UART activates the INTRP line as soon as the data
                character to be transmitted has been transferred and the
                transmitter register can accommodate the next character to be
                transmitted
		
bit 0 (RxRD)  - if set, the UART activates the INTRP line as soon as a complete
                character is available in the receiver register

********************************************************************************
                 INTERRUPT IDENTIFICATION REGISTER 

Used to determine whether an interrupt is currently pending or not. Reading
this register automatically clears the pending interrupt.

bits 7-3           - always set to 0

bits 2-1 (ID1,ID0) - identity bits:
                      00 = change of an RS-232 signal
		      01 = transmitter buffer empty
		      10 = data received
		      11 = serialization error or BREAK
		      
bit 0 (not PND)     - 1 if no interrupt pending, 0 otherwise
    



*****************************************************************************
c)     PROGRAMS THAT FIDDLE WITH THE I/O PORTS OF THE SERIAL PORT
*****************************************************************************

Suppose we have a null modem cable and we want to use it to connect the
serial ports of two computers so that they are able to exchange data.

The first thing to do is to check whether the communication parameters of the
two ports are the same. This can be done in several ways: for instance by 
using an ioctl() system call, or by writing into the line control register of
the I/O port.

The simplest way to do it, however, is to run minicom on both computers and
select the <CTRL>-A P option, which corresponds to "Comm Parameters":

                             [Comm Parameters]              
                    x                                        x              
                    x Current: 57600 8N1                     x              
                    x                                        x              
                    x   Speed          Parity          Data  x              
                    x                                        x              
                    x A: 300           K: None         R: 5  x              
                    x B: 1200          L: Even         S: 6  x              
                    x C: 2400          M: Odd          T: 7  x              
                    x D: 4800          N: Mark         U: 8  x              
                    x E: 9600                                x              
                    x F: 19200         O: Space              x              
                    x G: 38400                               x              
                    x H: 57600                               x              
                    x I: 115200        P: 8-N-1              x              
                    x                  Q: 7-E-1              x              
                    x                                        x              
                    x Choice, or <Enter> to exit?  

Remember to run minicom every time you switch on the computers because the
values you selected are not saved as default values.
		    
Once this is done, we can start thinking about programs that send and receive
data.

We have prepared three types of programs:

HIGH LEVEL: TwriteH_ttyS.c and TreadH_ttyS.c exchange data by using the
            /dev/ttyS0 device file
	    
LOW LEVEL:  TwriteL0_ttyS.c writes a single char using the I/O ports of the
            serial port
	    
            TwriteL1_ttyS.c writes a sequence of chars and counts, for each
            char transmitted, the number of busy wait iterations spent
            polling the interrupt identification register

            TwriteL2_ttyS.c is similar to TwriteL1_ttyS.c, except that it
            measures the time elapsed from the "out" instruction until the
            interrupt is raised by using the 64-bit Time Stamp Counter. This
            program works only if there is a new system call that copies into
            the User Mode address space the value of the cpu_khz kernel
            variable, which denotes the frequency of the CPU. The
            corresponding patch is named patch-2.4.16kh3bis and is available
            in the patch list

When performing the tests, the receiving machine runs TreadH_ttyS while the
transmitting machine runs TwriteH_ttyS, TwriteL0_ttyS, TwriteL1_ttyS, or
TwriteL2_ttyS.

/*                            TwriteH_ttyS.c                                */
/*                                                                          */
/*     writes a few chars in serial port using the high level functions     */  
/*     of the /dev/ttyS0 device file                                        */
/*                                                                          */
/*     test OK, but a delay must be introduced between consecutive writes   */
 
#include "test_ioports.h"                           
             
#define N       80

char data[N];

int main(void)
{
    int fd, i;
    ssize_t count;   /* write() returns a signed value: -1 on error */

    /* fill the first 20 entries with the pattern "abab..." */
    for (i = 0; i < 20; i = i + 2) {
        data[i] = 'a';
        data[i+1] = 'b';
    }

    printf("start test\n");

    fd = open("/dev/ttyS0", O_WRONLY);
    if (fd < 0) {
        fprintf(stderr, "open /dev/ttyS0 failed: %s\n", strerror(errno));
        exit(-1);
    }
    /* send the first 10 chars, one per second */
    for (i = 0; i < 10; i++) {
        count = write(fd, &data[i], 1);
        if (count != 1) {
            printf("bad write count: %ld\n", (long)count);
            exit(-1);
        }
        sleep(1);
    }
    close(fd);
    exit(0);
}

*****************************************************************************

/*                            TreadH_ttyS.c                              */
/*                                                                       */
/*        reads endless chars from /dev/ttyS0 or /dev/ttyS1              */
                             
#define ttyS1      /* define here the serial port used: ttyS0 or ttyS1 */

#ifdef ttyS0
#define ttyS "/dev/ttyS0"
#endif
#ifdef ttyS1
#define ttyS "/dev/ttyS1"
#endif

#include "test_ioports.h" 

int main(void)
{
    int fd;
    ssize_t count;   /* read() returns a signed value: -1 on error */
    char c;

    fd = open(ttyS, O_RDONLY);
    if (fd < 0)
        {
        fprintf(stderr, "open %s failed: %s\n", ttyS, strerror(errno));
        exit(-1);
        }
    for ( ; ; ) {
        count = read(fd, &c, 1);
        if (count != 1) {
            printf("bad read count: %ld\n", (long)count);
            exit(-1);
            }
	printf("%c\n", c);   
	}
    close(fd);  
    exit(0);   
}

*****************************************************************************

/*                             TwriteL0_ttyS.c                           */
/*                                                                       */
/*     writes a single char in serial port using the low level in and    */  
/*     out I/O instruction                                               */
/*                                                                       */

#include "io.h"
#include "test_ioports.h"
  
#define	DataRegister0                     0x2f8
#define	InterruptEnableRegister0          0x2f9
#define	InterruptIdentificationRegister0  0x2fa
        
#define	DataRegister1                     0x3f8
#define	InterruptEnableRegister1          0x3f9
#define	InterruptIdentificationRegister1  0x3fa

int main(void)
{
    unsigned char status, data = 'A';
    int cc;
    
    printf("start test\n");
    
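    cc = iopl(3); /* required to access I/O ports */
    if (cc < 0) {
        fprintf(stderr, "iopl failed: %s\n", strerror(errno));
        exit(-1);
    }

    /* enable TBE (data transmitted) interrupt */
    outb(0x02, InterruptEnableRegister1); 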
  
    /* clear pending interrupts */
    status = inb(InterruptIdentificationRegister1); 
   
   /* transmit char */
    outb(data, DataRegister1); 
    
    /* read status */
    status = inb(InterruptIdentificationRegister1);
    
    printf("InterruptIdentificationRegister= %x\n", (int)status);

    exit(0);   
}

*****************************************************************************

/*                             TwriteL1_ttyS.c                           */
/*                                                                       */
/*   writes 20 chars in /dev/ttyS0 or /dev/ttyS1 and computes the number */
/*   of busy wait iterations                                             */
                             

#define ttyS1      /* define here the serial port used: ttyS0 or ttyS1 */
  
#ifdef ttyS0             
#define	DataRegister                     0x2f8
#define	InterruptEnableRegister          0x2f9
#define	InterruptIdentificationRegister  0x2fa
#endif 

#ifdef ttyS1
#define	DataRegister                     0x3f8
#define	InterruptEnableRegister          0x3f9
#define	InterruptIdentificationRegister  0x3fa
#endif 

#include "io.h"
#include "test_ioports.h"

int main(void)
{
    int i, j, cc;
    unsigned char status, data = 'A';
    
    printf("start test\n");
    
    cc = iopl(3); /* required to access I/O ports */
    if (cc < 0) {
        fprintf(stderr, "iopl failed: %s\n", strerror(errno));
        exit(-1);
    }
   
    /* enable TBE (data transmitted) interrupt */
    outb(0x02, InterruptEnableRegister);  
    /* read status and clear pending interrupts */                                           
    status = inb(InterruptIdentificationRegister);
                                          
    printf("InterruptIdentificationRegister= %x\n", (unsigned int)status);
    
    /* transmission cycle  */
    for (i=0; i<20 ;i++) {
    	j = 0;
    	outb(data, DataRegister); /* transmit char */
	for (; ;) { /* polling on PND flag of 
	              interrupt identification register */
		status = inb(InterruptIdentificationRegister); 
		if (!(status & 0x1))
			break;
		j++;
	}
    printf("number of busy wait iterations= %d\n", j);
    
    sleep(1);
    }
    exit(0);   
}

*****************************************************************************

/*                             TwriteL2_ttyS.c                           */
/*                                                                       */
/*   writes 20 chars in /dev/ttyS0 or /dev/ttyS1 and computes the time   */
/*   spent in busy wait iterations by using the Time Stamp Counter       */
/*                                                                       */
/*   CAUTION: you'll have to get the CPU frequency of your computer      */
/*   and replace in the test_ioports.h header file the value 451032 in   */
/*   the line:                                                            */
/*                                                                       */ 
/*                    cpu_khz = 451032;                                  */
/*                                                                       */
/*   with the proper value that you can get by issuing:                  */ 
/*                                                                       */                    
/*              cat /proc/cpuinfo                                        */
/*                                                                       */                  

#define ttyS1      /* define here the serial port used: ttyS0 or ttyS1 */
  
#ifdef ttyS0             
#define	DataRegister                     0x2f8
#define	InterruptEnableRegister          0x2f9
#define	InterruptIdentificationRegister  0x2fa
#endif 

#ifdef ttyS1
#define	DataRegister                     0x3f8
#define	InterruptEnableRegister          0x3f9
#define	InterruptIdentificationRegister  0x3fa
#endif 

#include "io.h"
#include "test_ioports.h"

int main(void)
{
    int i, cc;
    unsigned char status, data = 'A';
    
    printf("start test\n");
    
    cc = iopl(3); /* required to access I/O ports */
    if (cc < 0) {
        fprintf(stderr, "iopl failed: %s\n", strerror(errno));
        exit(-1);
    }
   
    /* enable TBE (data transmitted) interrupt */
    outb(0x02, InterruptEnableRegister);  
    /* read status and clear pending interrupts */                                           
    status = inb(InterruptIdentificationRegister);
                                          
    printf("InterruptIdentificationRegister= %x\n", (unsigned int)status);
    
    /* transmission cycle  */
    for (i=0; i<20 ;i++) {

	/* get starting time before issuing I/O instruction */
	asm("pushl %eax\n\t"
	"pushl %edx\n\t"
        "rdtsc\n\t"
	"movl %eax,tlow1\n\t"
	"movl %edx,thigh1\n\t"
	"popl %edx\n\t"
	"popl %eax");
	
	/* transmit char */
    	outb(data, DataRegister); 
	
	for (; ;) { /* polling on PND flag of 
	              interrupt identification register */
		status = inb(InterruptIdentificationRegister); 
		if (!(status & 0x1))
			break;
	}
    
	/* get ending time right after PND flag has been cleared */
	asm("pushl %eax\n\t"
	"pushl %edx\n\t"
	"rdtsc\n\t"
	"movl %eax,tlow2\n\t"
	"movl %edx,thigh2\n\t"
	"popl %edx\n\t"
	"popl %eax");
    
    /* print the elapsed time interval */
    printf("thigh1= %u, tlow1= %u\n", (unsigned int)thigh1, (unsigned int)tlow1);
    printf("thigh2= %u, tlow2= %u\n", (unsigned int)thigh2, (unsigned int)tlow2);
    
    timediff(thigh1, tlow1, thigh2, tlow2);    
    
    sleep(1);
    }
    exit(0);   
}

*******************************************************************************

/*            header file  test_ioports.h (copied from linux)              */
/*                                                                         */
/*    includes all the APIs based on linux-2.4.?  kernel and other         */
/* utilities,  namely:                                                     */
/*                                                                         */
/* void timediff(unsigned long thigh1, unsigned long tlow1,                */
/*          unsigned long thigh2, unsigned long tlow2)                     */
/* void barrier()                                                          */   

#include <asm/unistd.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <linux/mman.h>
#include <sys/mman.h>     
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/fcntl.h>
#include <sys/time.h>

unsigned long thigh1=0, tlow1=0, thigh2=0, tlow2=0, freq=0;
unsigned int reg_eax, reg_ebx, reg_ecx, reg_edx;

void timediff(unsigned long thigh1, unsigned long tlow1, 
              unsigned long thigh2, unsigned long tlow2)
{
    unsigned long cpu_khz;
    int retcode;
    float tdiffh, tdiffl, elapsed_time, tui, tuisecs, tcpu_khz;
    
    cpu_khz = 451032;   /* replace with the frequency of your CPU in kHz */
    tcpu_khz = cpu_khz;
    printf("tcpu_khz= %12.1f\n", tcpu_khz); 
    tui = 4*1024*1024;
    tui = tui*1024;
    tuisecs = tui / (cpu_khz*1000);
    /* printf("high unit in secs= %12.3f\n", tuisecs); */
    if (thigh2 == thigh1) {
        if (tlow1 > tlow2) {
            printf("bad TSC values\n");
            return;
        }
        /* printf("h2=h1 and l2>=l1\n"); */
        tdiffh = 0.0;
        tdiffl = tlow2 - tlow1;
    } else if (thigh2 > thigh1) {
        if (tlow2 >= tlow1) {
            /* printf("h2>h1 and l2>=l1\n");  */
            tdiffh = thigh2 - thigh1;
            tdiffl = tlow2 - tlow1;
        } else {
            /* printf("h2>h1 and l2<l1\n");  */
            /* borrow one high unit (2^32 ticks) for the low part */
            tdiffh = thigh2 - thigh1 - 1.0;
            /*printf("tui + tlow2= %12.0f\n", tui + tlow2);  */
            tdiffl = (tui + tlow2) - tlow1;
        }
    } else {
        printf("bad TSC values\n");
        return;
    }
    /* printf("tdiffh= %12.0f, tdiffl= %12.0f\n", tdiffh, tdiffl);  
    printf("tdiffl/tui= %12.3f\n", tdiffl / tui);  */
    elapsed_time = (tdiffh + (tdiffl / tui)) * tuisecs;
    printf("elapsed_time= %12.6f\n", elapsed_time);
    return;
}

static inline void read_tsc1(unsigned long thigh1, unsigned long tlow1)
{
        asm("pushl %eax\n\t"
	"pushl %edx\n\t"
        "rdtsc\n\t"
	"movl %eax,tlow1\n\t"
	"movl %edx,thigh1\n\t"
	"popl %edx\n\t"
	"popl %eax");
}

static inline void read_tsc2(unsigned long thigh2, unsigned long tlow2)
{
        asm("pushl %eax\n\t"
	"pushl %edx\n\t"
        "rdtsc\n\t"
	"movl %eax,tlow2\n\t"
	"movl %edx,thigh2\n\t"
	"popl %edx\n\t"
	"popl %eax");
}



inline void cpuid(int op, int *eax, int *ebx, int *ecx, int *edx)
{
asm("cpuid": "=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx): "a" (op): "cc");
}

inline void barrier(void)
{
    asm("pushl %eax\n\t"
    	"pushl %ebx\n\t" 
	"pushl %ecx\n\t"
	"pushl %edx");
    cpuid(0, &reg_eax, &reg_ebx, &reg_ecx, &reg_edx);
    /* restore all four registers saved above, in reverse order */
    asm("popl %edx\n\t"
    	"popl %ecx\n\t" 
	"popl %ebx\n\t"
	"popl %eax");
}

********************************************************************************

/*               header file  io.h (copied from linux)                     */

#ifndef _ASM_IO_H
#define _ASM_IO_H

#include <linux/config.h>

/*
 * This file contains the definitions for the x86 IO instructions
 * inb/inw/inl/outb/outw/outl and the "string versions" of the same
 * (insb/insw/insl/outsb/outsw/outsl). You can also use "pausing"
 * versions of the single-IO instructions (inb_p/inw_p/..).
 *
 * This file is not meant to be obfuscating: it's just complicated
 * to (a) handle it all in a way that makes gcc able to optimize it
 * as well as possible and (b) trying to avoid writing the same thing
 * over and over again with slight variations and possibly making a
 * mistake somewhere.
 */

/*
 * Thanks to James van Artsdalen for a better timing-fix than
 * the two short jumps: using outb's to a nonexistent port seems
 * to guarantee better timings even on fast machines.
 *
 * On the other hand, I'd like to be sure of a non-existent port:
 * I feel a bit unsafe about using 0x80 (should be safe, though)
 *
 *		Linus
 */

 /*
  *  Bit simplified and optimized by Jan Hubicka
  *  Support of BIGMEM added by Gerhard Wichert, Siemens AG, July 1999.
  *
  *  isa_memset_io, isa_memcpy_fromio, isa_memcpy_toio added,
  *  isa_read[wl] and isa_write[wl] fixed
  *  - Arnaldo Carvalho de Melo <acme@conectiva.com.br>
  */

#define IO_SPACE_LIMIT 0xffff

#define XQUAD_PORTIO_BASE 0xfe400000
#define XQUAD_PORTIO_LEN  0x40000   /* 256k per quad. Only remapping 1st */

#ifdef __KERNEL__

#include <linux/vmalloc.h>

/*
 * Temporary debugging check to catch old code using
 * unmapped ISA addresses. Will be removed in 2.4.
 */
#if CONFIG_DEBUG_IOVIRT
  extern void *__io_virt_debug(unsigned long x, const char *file, int line);
  extern unsigned long __io_phys_debug(unsigned long x, const char *file, int line);
  #define __io_virt(x) __io_virt_debug((unsigned long)(x), __FILE__, __LINE__)
//#define __io_phys(x) __io_phys_debug((unsigned long)(x), __FILE__, __LINE__)
#else
  #define __io_virt(x) ((void *)(x))
//#define __io_phys(x) __pa(x)
#endif

/*
 * Change virtual addresses to physical addresses and vv.
 * These are pretty trivial
 */
static inline unsigned long virt_to_phys(volatile void * address)
{
	return __pa(address);
}

static inline void * phys_to_virt(unsigned long address)
{
	return __va(address);
}

/*
 * Change "struct page" to physical address.
 */
#define page_to_phys(page)	((page - mem_map) << PAGE_SHIFT)

extern void * __ioremap(unsigned long offset, unsigned long size, unsigned long flags);

static inline void * ioremap (unsigned long offset, unsigned long size)
{
	return __ioremap(offset, size, 0);
}

/*
 * This one maps high address device memory and turns off caching for that area.
 * it's useful if some control registers are in such an area and write combining
 * or read caching is not desirable:
 */
static inline void * ioremap_nocache (unsigned long offset, unsigned long size)
{
        return __ioremap(offset, size, _PAGE_PCD);
}

extern void iounmap(void *addr);

/*
 * IO bus memory addresses are also 1:1 with the physical address
 */
#define virt_to_bus virt_to_phys
#define bus_to_virt phys_to_virt
#define page_to_bus page_to_phys

/*
 * readX/writeX() are used to access memory mapped devices. On some
 * architectures the memory mapped IO stuff needs to be accessed
 * differently. On the x86 architecture, we just read/write the
 * memory location directly.
 */

#define readb(addr) (*(volatile unsigned char *) __io_virt(addr))
#define readw(addr) (*(volatile unsigned short *) __io_virt(addr))
#define readl(addr) (*(volatile unsigned int *) __io_virt(addr))
#define __raw_readb readb
#define __raw_readw readw
#define __raw_readl readl

#define writeb(b,addr) (*(volatile unsigned char *) __io_virt(addr) = (b))
#define writew(b,addr) (*(volatile unsigned short *) __io_virt(addr) = (b))
#define writel(b,addr) (*(volatile unsigned int *) __io_virt(addr) = (b))
#define __raw_writeb writeb
#define __raw_writew writew
#define __raw_writel writel

#define memset_io(a,b,c)	memset(__io_virt(a),(b),(c))
#define memcpy_fromio(a,b,c)	memcpy((a),__io_virt(b),(c))
#define memcpy_toio(a,b,c)	memcpy(__io_virt(a),(b),(c))

/*
 * ISA space is 'always mapped' on a typical x86 system, no need to
 * explicitly ioremap() it. The fact that the ISA IO space is mapped
 * to PAGE_OFFSET is pure coincidence - it does not mean ISA values
 * are physical addresses. The following constant pointer can be
 * used as the IO-area pointer (it can be iounmapped as well, so the
 * analogy with PCI is quite large):
 */
#define __ISA_IO_base ((char *)(PAGE_OFFSET))

#define isa_readb(a) readb(__ISA_IO_base + (a))
#define isa_readw(a) readw(__ISA_IO_base + (a))
#define isa_readl(a) readl(__ISA_IO_base + (a))
#define isa_writeb(b,a) writeb(b,__ISA_IO_base + (a))
#define isa_writew(w,a) writew(w,__ISA_IO_base + (a))
#define isa_writel(l,a) writel(l,__ISA_IO_base + (a))
#define isa_memset_io(a,b,c)		memset_io(__ISA_IO_base + (a),(b),(c))
#define isa_memcpy_fromio(a,b,c)	memcpy_fromio((a),__ISA_IO_base + (b),(c))
#define isa_memcpy_toio(a,b,c)		memcpy_toio(__ISA_IO_base + (a),(b),(c))


/*
 * Again, i386 does not require mem IO specific function.
 */

#define eth_io_copy_and_sum(a,b,c,d)		eth_copy_and_sum((a),__io_virt(b),(c),(d))
#define isa_eth_io_copy_and_sum(a,b,c,d)	eth_copy_and_sum((a),__io_virt(__ISA_IO_base + (b)),(c),(d))

static inline int check_signature(unsigned long io_addr,
	const unsigned char *signature, int length)
{
	int retval = 0;
	do {
		if (readb(io_addr) != *signature)
			goto out;
		io_addr++;
		signature++;
		length--;
	} while (length);
	retval = 1;
out:
	return retval;
}

static inline int isa_check_signature(unsigned long io_addr,
	const unsigned char *signature, int length)
{
	int retval = 0;
	do {
		if (isa_readb(io_addr) != *signature)
			goto out;
		io_addr++;
		signature++;
		length--;
	} while (length);
	retval = 1;
out:
	return retval;
}

/*
 *	Cache management
 *
 *	This needed for two cases
 *	1. Out of order aware processors
 *	2. Accidentally out of order processors (PPro errata #51)
 */
 
#if defined(CONFIG_X86_OOSTORE) || defined(CONFIG_X86_PPRO_FENCE)

static inline void flush_write_buffers(void)
{
	__asm__ __volatile__ ("lock; addl $0,0(%%esp)": : :"memory");
}

#define dma_cache_inv(_start,_size)		flush_write_buffers()
#define dma_cache_wback(_start,_size)		flush_write_buffers()
#define dma_cache_wback_inv(_start,_size)	flush_write_buffers()

#else

/* Nothing to do */

#define dma_cache_inv(_start,_size)		do { } while (0)
#define dma_cache_wback(_start,_size)		do { } while (0)
#define dma_cache_wback_inv(_start,_size)	do { } while (0)
#define flush_write_buffers()

#endif

#endif /* __KERNEL__ */

#ifdef SLOW_IO_BY_JUMPING
#define __SLOW_DOWN_IO "\njmp 1f\n1:\tjmp 1f\n1:"
#else
#define __SLOW_DOWN_IO "\noutb %%al,$0x80"
#endif

#ifdef REALLY_SLOW_IO
#define __FULL_SLOW_DOWN_IO __SLOW_DOWN_IO __SLOW_DOWN_IO __SLOW_DOWN_IO __SLOW_DOWN_IO
#else
#define __FULL_SLOW_DOWN_IO __SLOW_DOWN_IO
#endif

#ifdef CONFIG_MULTIQUAD
extern void *xquad_portio;    /* Where the IO area was mapped */
#endif /* CONFIG_MULTIQUAD */

/*
 * Talk about misusing macros..
 */
#define __OUT1(s,x) \
static inline void out##s(unsigned x value, unsigned short port) {

#define __OUT2(s,s1,s2) \
__asm__ __volatile__ ("out" #s " %" s1 "0,%" s2 "1"

#ifdef CONFIG_MULTIQUAD
/* Make the default portio routines operate on quad 0 for now */
#define __OUT(s,s1,x) \
__OUT1(s##_local,x) __OUT2(s,s1,"w") : : "a" (value), "Nd" (port)); } \
__OUT1(s##_p_local,x) __OUT2(s,s1,"w") __FULL_SLOW_DOWN_IO : : "a" (value), "Nd" (port));} \
__OUTQ0(s,s,x) \
__OUTQ0(s,s##_p,x) 
#else
#define __OUT(s,s1,x) \
__OUT1(s,x) __OUT2(s,s1,"w") : : "a" (value), "Nd" (port)); } \
__OUT1(s##_p,x) __OUT2(s,s1,"w") __FULL_SLOW_DOWN_IO : : "a" (value), "Nd" (port));} 
#endif /* CONFIG_MULTIQUAD */

#ifdef CONFIG_MULTIQUAD
#define __OUTQ0(s,ss,x)    /* Do the equivalent of the portio op on quad 0 */ \
static inline void out##ss(unsigned x value, unsigned short port) { \
	if (xquad_portio) \
		write##s(value, (unsigned long) xquad_portio + port); \
	else               /* We're still in early boot, running on quad 0 */ \
		out##ss##_local(value, port); \
} 

#define __INQ0(s,ss)       /* Do the equivalent of the portio op on quad 0 */ \
static inline RETURN_TYPE in##ss(unsigned short port) { \
	if (xquad_portio) \
		return read##s((unsigned long) xquad_portio + port); \
	else               /* We're still in early boot, running on quad 0 */ \
		return in##ss##_local(port); \
}
#endif /* CONFIG_MULTIQUAD */

#define __IN1(s) \
static inline RETURN_TYPE in##s(unsigned short port) { RETURN_TYPE _v;

#define __IN2(s,s1,s2) \
__asm__ __volatile__ ("in" #s " %" s2 "1,%" s1 "0"

#ifdef CONFIG_MULTIQUAD
#define __IN(s,s1,i...) \
__IN1(s##_local) __IN2(s,s1,"w") : "=a" (_v) : "Nd" (port) ,##i ); return _v; } \
__IN1(s##_p_local) __IN2(s,s1,"w") __FULL_SLOW_DOWN_IO : "=a" (_v) : "Nd" (port) ,##i ); return _v; } \
__INQ0(s,s) \
__INQ0(s,s##_p) 
#else
#define __IN(s,s1,i...) \
__IN1(s) __IN2(s,s1,"w") : "=a" (_v) : "Nd" (port) ,##i ); return _v; } \
__IN1(s##_p) __IN2(s,s1,"w") __FULL_SLOW_DOWN_IO : "=a" (_v) : "Nd" (port) ,##i ); return _v; } 
#endif /* CONFIG_MULTIQUAD */

#define __INS(s) \
static inline void ins##s(unsigned short port, void * addr, unsigned long count) \
{ __asm__ __volatile__ ("rep ; ins" #s \
: "=D" (addr), "=c" (count) : "d" (port),"0" (addr),"1" (count)); }

#define __OUTS(s) \
static inline void outs##s(unsigned short port, const void * addr, unsigned long count) \
{ __asm__ __volatile__ ("rep ; outs" #s \
: "=S" (addr), "=c" (count) : "d" (port),"0" (addr),"1" (count)); }

#define RETURN_TYPE unsigned char
__IN(b,"")
#undef RETURN_TYPE
#define RETURN_TYPE unsigned short
__IN(w,"")
#undef RETURN_TYPE
#define RETURN_TYPE unsigned int
__IN(l,"")
#undef RETURN_TYPE

__OUT(b,"b",char)
__OUT(w,"w",short)
__OUT(l,,int)

__INS(b)
__INS(w)
__INS(l)

__OUTS(b)
__OUTS(w)
__OUTS(l)

#endif