===========================
Linux for S/390 and zSeries
===========================

Common Device Support (CDS)
Device Driver I/O Support Routines

Authors:
	- Ingo Adlung
	- Cornelia Huck

Copyright, IBM Corp. 1999-2002

Introduction
============

This document describes the common device support routines for Linux/390.
Unlike other hardware architectures, ESA/390 defines a unified I/O access
method. This relieves device drivers of having to deal with different bus
types, polling versus interrupt processing, shared versus non-shared interrupt
processing, DMA versus port I/O (PIO), and other hardware features. However,
this implies that either every single device driver needs to implement the
hardware I/O attachment functionality itself, or the operating system provides
a unified method to access the hardware, providing all the functionality that
every single device driver would otherwise have to provide itself.

This document does not intend to explain the ESA/390 hardware architecture in
every detail. This information can be obtained from the ESA/390 Principles of
Operation manual (IBM Form. No. SA22-7201).

In order to build common device support for ESA/390 I/O interfaces, a
functional layer was introduced that provides generic I/O access methods to
the hardware.

The common device support layer comprises the I/O support routines defined
below. Some of them implement common Linux device driver interfaces, while
some of them are ESA/390 platform specific.

Note:
  In order to write a driver for S/390, you also need to look into the interface
  described in Documentation/s390/driver-model.rst.

Note for porting drivers from 2.4:

The major changes are:

* The functions use a ccw_device instead of an irq (subchannel).
* All drivers must define a ccw_driver (see driver-model.rst) and the associated
  functions.
* request_irq() and free_irq() are no longer done by the driver.
* The oper_handler is (kind of) replaced by the probe() and set_online() functions
  of the ccw_driver.
* The not_oper_handler is (kind of) replaced by the remove() and set_offline()
  functions of the ccw_driver.
* The channel device layer is gone.
* The interrupt handlers must be adapted to use a ccw_device as argument.
  Moreover, they don't return a devstat, but an irb.
* Before initiating an I/O, the options must be set via ccw_device_set_options().
* Instead of calling read_dev_chars()/read_conf_data(), the driver issues
  the channel program and handles the interrupt itself.

ccw_device_get_ciw()
   get commands from extended sense data.

ccw_device_start(), ccw_device_start_timeout(), ccw_device_start_key(), ccw_device_start_key_timeout()
   initiate an I/O request.

ccw_device_resume()
   resume channel program execution.

ccw_device_halt()
   terminate the current I/O request processed on the device.

do_IRQ()
   generic interrupt routine. This function is called by the interrupt entry
   routine whenever an I/O interrupt is presented to the system. The do_IRQ()
   routine determines the interrupt status and calls the device specific
   interrupt handler according to the rules (flags) defined during I/O request
   initiation with ccw_device_start().

The next chapters describe the functions other than do_IRQ() in more detail.
The do_IRQ() interface is not described, as it is called from the Linux/390
first level interrupt handler only and does not comprise a device driver
callable interface. Instead, the functional description of ccw_device_start()
also describes the input to the device specific interrupt handler.

Note:
	All explanations apply also to the 64 bit architecture s390x.


Common Device Support (CDS) for Linux/390 Device Drivers
========================================================

General Information
-------------------

The following chapters describe the I/O related interface routines that the
Linux/390 common device support (CDS) provides to allow for device specific
driver implementations on the IBM ESA/390 hardware platform. These interfaces
are intended to provide the functionality every device driver needs in order
to drive a specific hardware device on the ESA/390 platform. Some of the
interface routines are specific to Linux/390, while others can be found in
other Linux platform implementations as well.
Miscellaneous function prototypes, data declarations, and macro definitions
can be found in the architecture specific C header file
linux/arch/s390/include/asm/irq.h.

Overview of CDS interface concepts
----------------------------------

Unlike other hardware platforms, the ESA/390 architecture doesn't define
interrupt lines managed by a specific interrupt controller and bus systems
that may or may not allow for shared interrupts, DMA processing, etc. Instead,
the ESA/390 architecture implements a so-called channel subsystem that
provides a unified view of the devices physically attached to the system.
Though the ESA/390 hardware platform knows about a huge variety of different
peripheral attachments like disk devices (aka DASDs), tapes, communication
controllers, etc., they can all be accessed by a well-defined access method and
they all present I/O completion in a unified way: I/O interruptions. Every
single device is uniquely identified to the system by a so-called subchannel,
where the ESA/390 architecture allows for 64k devices to be attached.

Linux, however, was first built on the Intel PC architecture, with its two
cascaded 8259 programmable interrupt controllers (PICs), that allow for a
maximum of 15 different interrupt lines. All devices attached to such a system
share those 15 interrupt levels. Devices attached to the ISA bus system must
not share interrupt levels (aka IRQs), as the ISA bus is based on edge-triggered
interrupts. MCA, EISA, PCI and other bus systems are based on level-triggered
interrupts, and therefore allow for shared IRQs. However, if multiple devices
present their hardware status by the same (shared) IRQ, the operating system
has to call every single device driver registered on this IRQ in order to
determine the device driver owning the device that raised the interrupt.

Up to kernel 2.4, Linux/390 used to provide interfaces via the IRQ (subchannel).
For internal use of the common I/O layer, these are still there. However,
device drivers should use the new calling interface via the ccw_device only.

During its startup the Linux/390 system checks for peripheral devices. Each
of those devices is uniquely defined by a so-called subchannel by the ESA/390
channel subsystem. While the subchannel numbers are system generated, each
subchannel also takes a user defined attribute, the so-called device number.
Both subchannel number and device number cannot exceed 65535. During sysfs
initialisation, the information about control unit type and device types that
imply specific I/O commands (channel command words - CCWs) in order to operate
the device is gathered. Device drivers can retrieve this set of hardware
information during their initialization step to recognize the devices they
support using the information saved in the struct ccw_device given to them.
This method implies that Linux/390 doesn't need to probe for free (not
armed) interrupt request lines (IRQs) to drive its devices with. Where
applicable, the device drivers can issue the READ DEVICE CHARACTERISTICS
ccw to retrieve device characteristics in their online routine.

In order to allow for easy I/O initiation the CDS layer provides a
ccw_device_start() interface that takes a device specific channel program (one
or more CCWs) as input, sets up the required architecture specific control
blocks and initiates an I/O request on behalf of the device driver. The
ccw_device_start() routine lets the device driver specify whether it expects
the CDS layer to notify it for every interrupt it observes, or with final status
only. See ccw_device_start() for more details. A device driver must never issue
ESA/390 I/O commands itself, but must use the Linux/390 CDS interfaces instead.

To cancel a long-running I/O request, the CDS layer provides the
ccw_device_halt() function. Some devices require an initial HALT
SUBCHANNEL (HSCH) command to be issued without any I/O request pending. This
function is also covered by ccw_device_halt().


ccw_device_get_ciw() - get command information word

This call enables a device driver to get information about supported commands
from the extended SenseID data.

::

  struct ciw *
  ccw_device_get_ciw(struct ccw_device *cdev, __u32 cmd);

====  ========================================================
cdev  The ccw_device for which the command is to be retrieved.
cmd   The command type to be retrieved.
====  ========================================================

ccw_device_get_ciw() returns:

=====  ================================================================
 NULL  No extended data available, invalid device or command not found.
!NULL  The command requested.
=====  ================================================================
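
The lookup can be pictured as a search through a small table of command
information words, keyed by command type. The following is a user-space model
only, not the kernel's implementation: ``struct my_ciw`` and ``my_find_ciw()``
are hypothetical names, the structure is simplified (the kernel packs these
fields into bit-fields), and the ``CIW_TYPE_*`` values are assumed to match
those in asm/cio.h.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Command types; assumed to match the CIW_TYPE_* values in asm/cio.h. */
  #define CIW_TYPE_RCD 0x0    /* read configuration data */
  #define CIW_TYPE_SII 0x1    /* set interface identifier */
  #define CIW_TYPE_RNI 0x2    /* read node identifier */

  /* Simplified, hypothetical stand-in for struct ciw. */
  struct my_ciw {
      uint8_t  ct;     /* command type */
      uint8_t  cmd;    /* command code to put into a CCW */
      uint16_t count;  /* byte count to put into a CCW */
  };

  /* Return the first entry of the requested type, or NULL if there is none. */
  static const struct my_ciw *my_find_ciw(const struct my_ciw *tbl, size_t n,
                                          uint8_t ct)
  {
      size_t i;

      assert(tbl != NULL || n == 0);
      for (i = 0; i < n; i++)
          if (tbl[i].ct == ct)
              return &tbl[i];
      return NULL;
  }

A driver would typically copy ``cmd`` and ``count`` from the returned entry into
a CCW, and treat a NULL result as "command not supported by this device".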

::

  ccw_device_start() - Initiate I/O Request

The ccw_device_start() routine is the I/O request front-end processor. All
device driver I/O requests must be issued using this routine. A device driver
must not issue ESA/390 I/O commands itself. Instead the ccw_device_start()
routine provides all interfaces required to drive arbitrary devices.

This description also covers the status information passed to the device
driver's interrupt handler as this is related to the rules (flags) defined
with the associated I/O request when calling ccw_device_start().

::

  int ccw_device_start(struct ccw_device *cdev,
		       struct ccw1 *cpa,
		       unsigned long intparm,
		       __u8 lpm,
		       unsigned long flags);
  int ccw_device_start_timeout(struct ccw_device *cdev,
			       struct ccw1 *cpa,
			       unsigned long intparm,
			       __u8 lpm,
			       unsigned long flags,
			       int expires);
  int ccw_device_start_key(struct ccw_device *cdev,
			   struct ccw1 *cpa,
			   unsigned long intparm,
			   __u8 lpm,
			   __u8 key,
			   unsigned long flags);
  int ccw_device_start_key_timeout(struct ccw_device *cdev,
				   struct ccw1 *cpa,
				   unsigned long intparm,
				   __u8 lpm,
				   __u8 key,
				   unsigned long flags,
				   int expires);

============= =============================================================
cdev          ccw_device the I/O is destined for
cpa           logical start address of channel program
intparm       user specific interrupt information; will be presented
	      back to the device driver's interrupt handler. Allows a
	      device driver to associate the interrupt with a
	      particular I/O request.
lpm           defines the channel path to be used for a specific I/O
	      request. A value of 0 will make cio use the opm.
key           the storage key to use for the I/O (useful for operating on a
	      storage with a storage key != default key)
flags         defines the action to be performed for I/O processing
expires       timeout value in jiffies. The common I/O layer will terminate
	      the running program after this and call the interrupt handler
	      with ERR_PTR(-ETIMEDOUT) as irb.
============= =============================================================

Possible flag values are:

========================= =============================================
DOIO_ALLOW_SUSPEND        channel program may become suspended
DOIO_DENY_PREFETCH        don't allow for CCW prefetch; usually
			  this implies the channel program might
			  become modified
DOIO_SUPPRESS_INTER       don't call the handler on intermediate status
========================= =============================================

The cpa parameter points to the first format 1 CCW of a channel program::

  struct ccw1 {
	__u8  cmd_code;/* command code */
	__u8  flags;   /* flags, like IDA addressing, etc. */
	__u16 count;   /* byte count */
	__u32 cda;     /* data address */
  } __attribute__ ((packed,aligned(8)));

with the following CCW flag values defined:

=================== =========================
CCW_FLAG_DC         data chaining
CCW_FLAG_CC         command chaining
CCW_FLAG_SLI        suppress incorrect length
CCW_FLAG_SKIP       skip
CCW_FLAG_PCI        PCI
CCW_FLAG_IDA        indirect addressing
CCW_FLAG_SUSPEND    suspend
=================== =========================
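
As an illustration of how these flags combine, the following user-space sketch
fills a two-CCW channel program in which the first CCW is command-chained to
the second. The struct mirrors the layout shown above using <stdint.h> types.
The flag values are assumed to match those in asm/cio.h, and the command codes,
the 32-bit buffer addresses and the helper names ``fill_ccw()`` and
``build_write_then_read()`` are hypothetical.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>

  /* User-space mirror of the format 1 CCW shown above. */
  struct ccw1 {
      uint8_t  cmd_code;  /* command code */
      uint8_t  flags;     /* flags, like IDA addressing, etc. */
      uint16_t count;     /* byte count */
      uint32_t cda;       /* data address */
  } __attribute__ ((packed, aligned(8)));

  /* Assumed to match asm/cio.h; verify against your kernel tree. */
  #define CCW_FLAG_DC   0x80
  #define CCW_FLAG_CC   0x40
  #define CCW_FLAG_SLI  0x20

  /* Hypothetical, device specific command codes. */
  #define MY_CMD_WRITE  0x01
  #define MY_CMD_READ   0x02

  static void fill_ccw(struct ccw1 *ccw, uint8_t cmd, uint8_t flags,
                       uint16_t count, uint32_t cda)
  {
      ccw->cmd_code = cmd;
      ccw->flags = flags;
      ccw->count = count;
      ccw->cda = cda;
  }

  /* Build "write a request, then read the reply" as one channel program. */
  static void build_write_then_read(struct ccw1 cp[2],
                                    uint32_t req, uint16_t req_len,
                                    uint32_t rsp, uint16_t rsp_len)
  {
      assert(cp != NULL);
      /* CC on the first CCW lets the channel continue with the next one. */
      fill_ccw(&cp[0], MY_CMD_WRITE, CCW_FLAG_CC, req_len, req);
      /* The last CCW carries no chaining flag, which ends the program. */
      fill_ccw(&cp[1], MY_CMD_READ, CCW_FLAG_SLI, rsp_len, rsp);
  }

In a real driver the channel program and its buffers must live in memory the
channel subsystem can address; the plain integers used here only stand in for
such addresses.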


Via ccw_device_set_options(), the device driver may specify the following
options for the device:

========================= ======================================
DOIO_EARLY_NOTIFICATION   allow for early interrupt notification
DOIO_REPORT_ALL           report all interrupt conditions
========================= ======================================


The ccw_device_start() function returns:

======== ======================================================================
      0  successful completion or request successfully initiated
 -EBUSY  The device is currently processing a previous I/O request, or there is
	 a status pending at the device.
-ENODEV  cdev is invalid, the device is not operational or the ccw_device is
	 not online.
======== ======================================================================

When the I/O request completes, the CDS first level interrupt handler will
accumulate the status in a struct irb and then call the device interrupt handler.
The intparm field will contain the value the device driver has associated with a
particular I/O request. If a pending device status was recognized,
intparm will be set to 0 (zero). This may happen during I/O initiation or delayed
by an alert status notification. In any case this status is not related to the
current (last) I/O request. In case of a delayed status notification no special
interrupt will be presented to indicate I/O completion as the I/O request was
never started, even though ccw_device_start() returned with successful completion.

The irb may contain an error value, and the device driver should check for this
first:

========== =================================================================
-ETIMEDOUT the common I/O layer terminated the request after the specified
	   timeout value
-EIO       the common I/O layer terminated the request due to an error state
========== =================================================================
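
The check described above relies on the kernel convention of encoding a
negative errno value in a pointer. The sketch below models that convention in
user space so the logic can be tried out on its own: the ``ERR_PTR()``,
``IS_ERR()`` and ``PTR_ERR()`` stand-ins are simplified re-implementations, and
``my_irb_error()`` is a hypothetical helper, not part of the CDS interface.

.. code-block:: c

  #include <assert.h>
  #include <errno.h>
  #include <stdint.h>

  #define MAX_ERRNO 4095

  struct irb;  /* opaque here; the kernel defines the real layout */

  /* User-space stand-ins for the kernel's error pointer helpers. */
  static inline void *ERR_PTR(long error)
  {
      return (void *)error;
  }

  static inline int IS_ERR(const void *ptr)
  {
      return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
  }

  static inline long PTR_ERR(const void *ptr)
  {
      return (long)ptr;
  }

  /*
   * Hypothetical helper: returns 0 if irb points to real status
   * information, or the negative errno encoded in the pointer.
   */
  static int my_irb_error(const struct irb *irb)
  {
      if (!IS_ERR(irb))
          return 0;
      return (int)PTR_ERR(irb);
  }

An interrupt handler would perform this test before touching any field of the
irb, and only evaluate the subchannel and device status when the result is 0.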

If the concurrent sense flag in the extended status word (esw) in the irb is
set, the field erw.scnt in the esw describes the number of device specific
sense bytes available in the extended control word irb->ecw[]. No device
sensing by the device driver itself is required.

The device interrupt handler can use the following definitions to investigate
the primary unit check source coded in sense byte 0:

======================= ====
SNS0_CMD_REJECT         0x80
SNS0_INTERVENTION_REQ   0x40
SNS0_BUS_OUT_CHECK      0x20
SNS0_EQUIPMENT_CHECK    0x10
SNS0_DATA_CHECK         0x08
SNS0_OVERRUN            0x04
SNS0_INCOMPL_DOMAIN     0x01
======================= ====

Depending on the device status, several of those values may be set together.
Please refer to the device specific documentation for details.
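
Because several bits can be set at once, a handler usually tests each bit
rather than comparing the whole byte. A minimal user-space sketch, using the
values from the table above; ``sns0_decode()`` and the description strings are
hypothetical and exist only for illustration.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>

  #define SNS0_CMD_REJECT        0x80
  #define SNS0_INTERVENTION_REQ  0x40
  #define SNS0_BUS_OUT_CHECK     0x20
  #define SNS0_EQUIPMENT_CHECK   0x10
  #define SNS0_DATA_CHECK        0x08
  #define SNS0_OVERRUN           0x04
  #define SNS0_INCOMPL_DOMAIN    0x01

  #define SNS0_NAMES 7

  struct sns0_name {
      uint8_t bit;
      const char *name;
  };

  static const struct sns0_name sns0_names[SNS0_NAMES] = {
      { SNS0_CMD_REJECT,       "command reject" },
      { SNS0_INTERVENTION_REQ, "intervention required" },
      { SNS0_BUS_OUT_CHECK,    "bus out check" },
      { SNS0_EQUIPMENT_CHECK,  "equipment check" },
      { SNS0_DATA_CHECK,       "data check" },
      { SNS0_OVERRUN,          "overrun" },
      { SNS0_INCOMPL_DOMAIN,   "incomplete domain" },
  };

  /*
   * Store a description for every bit set in sense byte 0 and return
   * how many were stored. out[] must have room for SNS0_NAMES entries.
   */
  static size_t sns0_decode(uint8_t sense0, const char *out[SNS0_NAMES])
  {
      size_t i, n = 0;

      assert(out != NULL);
      for (i = 0; i < SNS0_NAMES; i++)
          if (sense0 & sns0_names[i].bit)
              out[n++] = sns0_names[i].name;
      return n;
  }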

The irb->scsw.cstat field provides the (accumulated) subchannel status:

========================= ============================
SCHN_STAT_PCI             program controlled interrupt
SCHN_STAT_INCORR_LEN      incorrect length
SCHN_STAT_PROG_CHECK      program check
SCHN_STAT_PROT_CHECK      protection check
SCHN_STAT_CHN_DATA_CHK    channel data check
SCHN_STAT_CHN_CTRL_CHK    channel control check
SCHN_STAT_INTF_CTRL_CHK   interface control check
SCHN_STAT_CHAIN_CHECK     chaining check
========================= ============================

The irb->scsw.dstat field provides the (accumulated) device status:

===================== =================
DEV_STAT_ATTENTION    attention
DEV_STAT_STAT_MOD     status modifier
DEV_STAT_CU_END       control unit end
DEV_STAT_BUSY         busy
DEV_STAT_CHN_END      channel end
DEV_STAT_DEV_END      device end
DEV_STAT_UNIT_CHECK   unit check
DEV_STAT_UNIT_EXCEP   unit exception
===================== =================

Please see the ESA/390 Principles of Operation manual for details on the
individual flag meanings.
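
A handler typically combines both status bytes to decide whether a request has
finished. The following user-space sketch shows one deliberately simple policy.
The bit values are assumed to match asm/cio.h, ``my_classify()`` and the
``MY_IO_*`` names are hypothetical, and a real driver may well treat some
conditions (unit exception, for instance) as normal for its device.

.. code-block:: c

  #include <assert.h>
  #include <stdint.h>

  /* Assumed to match asm/cio.h; verify against your kernel tree. */
  #define DEV_STAT_CHN_END     0x08
  #define DEV_STAT_DEV_END     0x04
  #define DEV_STAT_UNIT_CHECK  0x02
  #define DEV_STAT_UNIT_EXCEP  0x01
  #define SCHN_STAT_PCI        0x80

  enum my_io_result { MY_IO_PENDING, MY_IO_DONE, MY_IO_ERROR };

  static enum my_io_result my_classify(uint8_t dstat, uint8_t cstat)
  {
      /* Any subchannel status other than PCI is treated as a failure. */
      if (cstat & ~SCHN_STAT_PCI)
          return MY_IO_ERROR;
      if (dstat & (DEV_STAT_UNIT_CHECK | DEV_STAT_UNIT_EXCEP))
          return MY_IO_ERROR;
      /* Final status: channel end and device end have both been seen. */
      if ((dstat & (DEV_STAT_CHN_END | DEV_STAT_DEV_END)) ==
          (DEV_STAT_CHN_END | DEV_STAT_DEV_END))
          return MY_IO_DONE;
      return MY_IO_PENDING;
  }

  static void my_classify_examples(void)
  {
      /* Channel end and device end together: the request is finished. */
      assert(my_classify(DEV_STAT_CHN_END | DEV_STAT_DEV_END, 0) == MY_IO_DONE);
      /* Primary status only: device end is still outstanding. */
      assert(my_classify(DEV_STAT_CHN_END, 0) == MY_IO_PENDING);
      /* A PCI alone is an intermediate notification, not an error. */
      assert(my_classify(0, SCHN_STAT_PCI) == MY_IO_PENDING);
  }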

Usage Notes:

ccw_device_start() must be called with interrupts disabled and with the ccw
device lock held.

The device driver is allowed to issue the next ccw_device_start() call from
within its interrupt handler already. It is not required to schedule a
bottom-half, unless a non-deterministically long running error recovery
procedure or similar needs to be scheduled. During I/O processing the Linux/390
generic I/O device driver support has already obtained the IRQ lock, i.e. the
handler must not try to obtain it again when calling ccw_device_start(), or it
will deadlock.

If a device driver relies on an I/O request being completed before starting the
next one, it can reduce I/O processing overhead by chaining a NoOp I/O command
CCW_CMD_NOOP to the end of the submitted CCW chain. This will force Channel-End
and Device-End status to be presented together, with a single interrupt.
However, this should be used with care as it implies the channel will remain
busy, not being able to process I/O requests for other devices on the same
channel. Read commands, for example, should therefore never use this technique,
as the result will be presented by a single interrupt anyway.

In order to minimize I/O overhead, a device driver should use the
DOIO_REPORT_ALL option only if the device can report intermediate interrupt
information prior to device-end that the device driver urgently relies on. In
this case all I/O interruptions are presented to the device driver until final
status is recognized.

If a device is able to recover from asynchronously presented I/O errors, it can
perform overlapping I/O using the DOIO_EARLY_NOTIFICATION flag. While some
devices always report channel-end and device-end together, with a single
interrupt, others present primary status (channel-end) when the channel is
ready for the next I/O request and secondary status (device-end) when the data
transmission has been completed at the device.

The above flag makes it possible to exploit this feature, e.g. for communication
devices that can handle lost data on the network, to allow for enhanced I/O
processing.

Unless the channel subsystem at any time presents a secondary status interrupt,
exploiting this feature will cause only primary status interrupts to be
presented to the device driver while overlapping I/O is performed. When a
secondary status without error (alert status) is presented, this indicates
successful completion for all overlapping ccw_device_start() requests that have
been issued since the last secondary (final) status.

Channel programs that intend to set the suspend flag on a channel command word
(CCW) must start the I/O operation with the DOIO_ALLOW_SUSPEND option or the
suspend flag will cause a channel program check. At the time the channel program
becomes suspended an intermediate interrupt will be generated by the channel
subsystem.

ccw_device_resume() - Resume Channel Program Execution

If a device driver chooses to suspend the current channel program execution by
setting the CCW suspend flag on a particular CCW, the channel program execution
is suspended. In order to resume channel program execution the CIO layer
provides the ccw_device_resume() routine.

::

  int ccw_device_resume(struct ccw_device *cdev);

====  ================================================
cdev  ccw_device the resume operation is requested for
====  ================================================

The ccw_device_resume() function returns:

=========   ==============================================
	0   suspended channel program is resumed
   -EBUSY   status pending
  -ENODEV   cdev invalid or not-operational subchannel
  -EINVAL   resume function not applicable
-ENOTCONN   there is no I/O request pending for completion
=========   ==============================================

Usage Notes:

Please have a look at the ccw_device_start() usage notes for more details on
suspended channel programs.

ccw_device_halt() - Halt I/O Request Processing

Sometimes a device driver might need to stop the processing of
a long-running channel program, or the device might require a halt subchannel
(HSCH) I/O command to be issued initially. For those purposes the
ccw_device_halt() command is provided.

ccw_device_halt() must be called with interrupts disabled and with the ccw
device lock held.

::

  int ccw_device_halt(struct ccw_device *cdev,
		      unsigned long intparm);

=======  =====================================================
cdev     ccw_device the halt operation is requested for
intparm  interruption parameter; value is only used if no I/O
	 is outstanding, otherwise the intparm associated with
	 the I/O request is returned
=======  =====================================================

The ccw_device_halt() function returns:

=======  ==============================================================
      0  request successfully initiated
-EBUSY   the device is currently busy, or status pending.
-ENODEV  cdev invalid.
-EINVAL  The device is not operational or the ccw device is not online.
=======  ==============================================================

Usage Notes:

A device driver may write a never-ending channel program by writing a channel
program that at its end loops back to its beginning by means of a transfer in
channel (TIC) command (CCW_CMD_TIC). Usually this is performed by network
device drivers by setting the PCI CCW flag (CCW_FLAG_PCI). Once this CCW is
executed a program controlled interrupt (PCI) is generated. The device driver
can then perform an appropriate action. Before interrupting an outstanding
read to a network device (with or without PCI flag), a ccw_device_halt()
is required to end the pending operation.
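
The looping program described above can be sketched in user space as a ring of
read CCWs closed by a TIC. The flag and command values are assumed to match
asm/cio.h; the read command code, the plain 32-bit addresses and the helper
``build_read_ring()`` are hypothetical.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>

  /* User-space mirror of the format 1 CCW. */
  struct ccw1 {
      uint8_t  cmd_code;
      uint8_t  flags;
      uint16_t count;
      uint32_t cda;
  } __attribute__ ((packed, aligned(8)));

  /* Assumed to match asm/cio.h; verify against your kernel tree. */
  #define CCW_FLAG_CC   0x40
  #define CCW_FLAG_PCI  0x08
  #define CCW_CMD_TIC   0x08

  /* Hypothetical, device specific read command. */
  #define MY_CMD_READ   0x02

  /*
   * Fill cp[0..nbuf-1] with chained reads that each raise a PCI, and
   * cp[nbuf] with a TIC back to the start. cp_addr is the address at
   * which the channel will see cp[0]; buf_addr[i] is the address of
   * the i-th buffer. cp[] must have room for nbuf + 1 entries.
   */
  static void build_read_ring(struct ccw1 *cp, size_t nbuf, uint32_t cp_addr,
                              const uint32_t *buf_addr, uint16_t buf_len)
  {
      size_t i;

      assert(cp != NULL && buf_addr != NULL && nbuf > 0);
      for (i = 0; i < nbuf; i++) {
          cp[i].cmd_code = MY_CMD_READ;
          cp[i].flags = CCW_FLAG_CC | CCW_FLAG_PCI;
          cp[i].count = buf_len;
          cp[i].cda = buf_addr[i];
      }
      /* The TIC carries no flags and no count, only the branch target. */
      cp[nbuf].cmd_code = CCW_CMD_TIC;
      cp[nbuf].flags = 0;
      cp[nbuf].count = 0;
      cp[nbuf].cda = cp_addr;
  }

Since such a program never reaches a final status on its own, the driver has to
end it explicitly, which is where ccw_device_halt() comes in.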

::

  ccw_device_clear() - Terminate I/O Request Processing

In order to terminate all I/O processing at the subchannel, the clear subchannel
(CSCH) command is used. It can be issued via ccw_device_clear().

ccw_device_clear() must be called with interrupts disabled and with the ccw
device lock held.

::

  int ccw_device_clear(struct ccw_device *cdev, unsigned long intparm);

======= ===============================================
cdev    ccw_device the clear operation is requested for
intparm interruption parameter (see ccw_device_halt())
======= ===============================================

The ccw_device_clear() function returns:

=======  ==============================================================
      0  request successfully initiated
-ENODEV  cdev invalid
-EINVAL  The device is not operational or the ccw device is not online.
=======  ==============================================================

Miscellaneous Support Routines
------------------------------

This chapter describes various routines to be used in a Linux/390 device
driver programming environment.

get_ccwdev_lock()

Get the address of the device specific lock. This is then used in
spin_lock() / spin_unlock() calls.

::

  __u8 ccw_device_get_path_mask(struct ccw_device *cdev);

Get the mask of the paths currently available for cdev.
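
One possible use of the returned mask is to pick a single path to pass as the
lpm argument of ccw_device_start(). The user-space sketch below assumes the
usual layout of one bit per channel path with the leftmost bit (0x80) denoting
the first path; ``my_first_path()`` is a hypothetical helper.

.. code-block:: c

  #include <assert.h>
  #include <stdint.h>

  /*
   * Return a mask with only the first available path set, or 0 if the
   * mask is empty. Passing 0 as lpm leaves the path choice to cio.
   */
  static uint8_t my_first_path(uint8_t path_mask)
  {
      uint8_t bit;

      for (bit = 0x80; bit != 0; bit >>= 1)
          if (path_mask & bit)
              return bit;
      return 0;
  }

  static void my_first_path_examples(void)
  {
      /* Paths two and three available: the second path is chosen. */
      assert(my_first_path(0x60) == 0x40);
      /* No path available. */
      assert(my_first_path(0x00) == 0x00);
  }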