summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox/mlx4/cmd.c
Commit message (Collapse)AuthorAgeFilesLines
* IB/mlx4: Miscellaneous adjustments for SR-IOV IB supportJack Morgenstein2012-09-301-0/+1
| | | | | | | | | | | | | | | | | | | | 1. Allow only master to change node description. 2. Prevent AH leakage in send mads. 3. Take device part number from PCI structure, so that guests see the VF part number (and not the PF part number). 4. Place the device revision ID into caps structure at startup. 5. SET_PORT in update_gids_task needs to go through wrapper on master. 6. In mlx4_ib_event(), PORT_MGMT_EVENT needs be handled in a work queue on the master, since it propagates events to slaves using GEN_EQE. 7. Do not support FMR on slaves. 8. Add spinlock to slave_event(), since it is called both in interrupt context and in process context (due to 6 above, and also if smp_snoop is used). This fix was found and implemented by Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
* mlx4: Add alias_guid mechanismJack Morgenstein2012-09-301-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | For IB ports, we paravirtualize the GUID at index 0 on slaves. The GUID at index 0 seen by a slave is the actual GUID occupying the GUID table at the slave-id index. The driver, by default, requests at startup time that subnet manager populate its entire guid table with GUIDs. These guids are then mapped (paravirtualized) to the slaves, and appear for each slave as its GUID at index 0. Until each slave has such a guid, its port status is DOWN. The guid table is cached to support special QP paravirtualization, and event propagation to slaves on guid change (we test to see if the guid really changed before propagating an event to the slave). To support this caching, add capability to __mlx4_ib_query_gid() to obtain the network view (i.e., physical view) gid at index X, not just the host (paravirtualized) view. Based on a patch from Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
* mlx4: MAD_IFC paravirtualizationJack Morgenstein2012-09-301-0/+162
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The MAD_IFC firmware command fulfills two functions. First, it is used in the QP0/QP1 MAD-handling flow to obtain information from the FW (for answering queries), and for setting variables in the HCA (MAD SET packets). For this, MAD_IFC should provide the FW (physical) view of the data. This is the view that OpenSM needs. We call this the "network view". In the second case, MAD_IFC is used by various verbs to obtain data regarding the local HCA (e.g., ib_query_device()). We call this the "host view". This data needs to be paravirtualized. MAD_IFC therefore needs a wrapper function, and also needs another flag indicating whether it should provide the network view (when it is called by ib_process_mad in special-qp packet handling), or the host view (when it is called while implementing a verb). There are currently 2 flag parameters in mlx4_MAD_IFC already: ignore_bkey and ignore_mkey. These two parameters are replaced by a single "mad_ifc_flags" parameter, with different bits set for each flag. A third flag is added: "network-view/host-view". Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
* mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoopJack Morgenstein2012-09-301-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This requires: 1. Replacing the paravirtualized P_Key index (inserted by the guest) with the real P_Key index. 2. For UD QPs, placing the guest's true source GID index in the address path structure mgid field, and setting the ud_force_mgid bit so that the mgid is taken from the QP context and not from the WQE when posting sends. 3. For UC and RC QPs, placing the guest's true source GID index in the address path structure mgid field. 4. For tunnel and proxy QPs, setting the Q_Key value reserved for that proxy/tunnel pair. Since not all the above adjustments occur in all the QP transitions, the QP transitions require separate wrapper functions. Secondly, initialize the P_Key virtualization table to its default values: Master virtualized table is 1-1 with the real P_Key table, guest virtualized table has P_Key index 0 mapped to the real P_Key index 0, and all the other P_Key indices mapped to the reserved (invalid) P_Key at index 127. Finally, add logic in smp_snoop for maintaining the phys_P_Key_cache. and generating events on the master only if a P_Key actually changed. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
* IB/mlx4: Initialize SR-IOV IB support for slaves in master contextJack Morgenstein2012-09-301-0/+3
| | | | | | | | | | | | | | | | | | | | | | | Allocate SR-IOV paravirtualization resources and MAD demuxing contexts on the master. This has two parts. The first part is to initialize the structures to contain the contexts. This is done at master startup time in mlx4_ib_init_sriov(). The second part is to actually create the tunneling resources required on the master to support a slave. This is performed the master detects that a slave has started up (MLX4_DEV_EVENT_SLAVE_INIT event generated when a slave initializes its comm channel). For the master, there is no such startup event, so it creates its own tunneling resources when it starts up. In addition, the master also creates the real special QPs. The ib_core layer on the master causes creation of proxy special QPs, since the master is also paravirtualized at the ib_core layer. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
* mlx4: Add support for EEH error recoveryKleber Sacilotto de Souza2012-07-251-2/+47
| | | | | | | | | | | | | | | | | | | | | Currently the mlx4 drivers don't have the necessary callbacks to implement EEH errors detection and recovery, so the PCI layer uses the probe and remove callbacks to try to recover the device after an error on the bus. However, these callbacks have race conditions with the internal catastrophic error recovery functions, which will also detect the error and this can cause the system to crash if both EEH and catas functions try to reset the device. This patch adds the necessary error recovery callbacks and makes sure that the internal catastrophic error functions will not try to reset the device in such scenarios. It also adds some calls to pci_channel_offline() to suppress reads/writes on the bus when the slot cannot accept I/O operations so we prevent unnecessary accesses to the bus and speed up the device removal. Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Acked-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx4_core: Add firmware commands to support device managed flow steeringHadar Hen Zion2012-07-071-0/+19
| | | | | | | | | | | Add support for firmware commands to attach/detach a new device managed steering mode. Such network steering rules allow the user to provide an L2/L3/L4 flow specification to the firmware and have the device to steer traffic that matches that specification to the provided QP. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx4_core: Fixes for VF / Guest startup flowJack Morgenstein2012-05-311-2/+2
| | | | | | | | | | | | | | | | | - pass the following parameters: - firmware version (added QUERY_FW paravirtualization for that) - disable Blueflame on slaves. KVM disables write combining on guests, and we get better performance without BF in this case. (This requires QUERY_DEV_CAP paravirtualization, also in this commit) - max qp rdma as destination - get rid of a chunk of "if (0)" dead code Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx4_core: Fix potential kernel Oops in res tracker during Dom0 driver ↵Jack Morgenstein2012-05-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | unload Currently the slave and master resources are deleted after master freed all bitmaps. If any resources were not properly cleaned up during the shutdown process, an Oops would result. Fix so that delete slave (only) resources during cleanup. Master resources are cleaned up during unload process, and need not separately be cleaned. Note that during cleanup, we need to split the resource-tracker freeing functionality. Before removing all the bitmaps, we free any leftover slave resources. However, we can only remove the resource tracker linked list after all bitmap frees, since some of the freeing functions (e.g., mlx4_cleanup_eq_table) use paravirtualized FW commands which expect the resource tracker linked list to be present. Found-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx4: Address build warnings on set but not used variablesOr Gerlitz2012-05-161-4/+1
| | | | | | | | | Handle the compiler warnings on variables which are set but not used by removing the relevant variable or casting a return value which is ignored on purpose to void. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlx4_core: fix race on comm channelEugenia Emantayev2012-03-191-0/+9
| | | | | | | | | | | | Prevent race condition between commands on comm channel. Happened while unloading the driver when switching from event to polling mode. VF got completion on the last command before switching to polling mode, but toggle was not changed. After the fix - VF will not write the next command before toggle is updated. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2012-02-101-5/+5
|\ | | | | | | | | | | | | | | | | Conflicts: drivers/infiniband/hw/nes/nes_cm.c Simple whitespace conflict. Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlx4_core: fix memory leak at multi_func_cleanupEugenia Emantayev2012-02-061-5/+5
| | | | | | | | | | | | | | | | Perform cleanup also in non-master flow. The VFs use communication channel as well. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | mlx4: Fix typo in cmd.cMasanari Iida2012-02-041-1/+1
|/ | | | | | | | Correct spelling "reseting" to "resetting" in drivers/net/ethernet/mellanox/mlx4/cmd.c Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlx4_core: map async events to arbitrary slave eqsMarcel Apfelbaum2012-01-221-1/+8
| | | | | | | | | Slave async events were mapped to single eq. This patch fixes this issue, so the slaves can map the async events to any eq. Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlx4: Fixing wrong error codes in communication channelYevgeny Petrilin2011-12-191-32/+56
| | | | | | | | | | The communication channel is HW interface from PF point of view So the command return status should be stored as HW error code and only then translated to errno values. Reporetd-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlx4_core: Modify driver initialization flow to accommodate SRIOV for EthernetJack Morgenstein2011-12-131-1/+169
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Added module parameters sr_iov and probe_vf for controlling enablement of SRIOV mode. 2. Increased default max num-qps, num-mpts and log_num_macs to accomodate SRIOV mode 3. Added port_type_array as a module parameter to allow driver startup with ports configured as desired. In SRIOV mode, only ETH is supported, and this array is ignored; otherwise, for the case where the FW supports both port types (ETH and IB), the port_type_array parameter is used. By default, the port_type_array is set to configure both ports as IB. 4. When running in sriov mode, the master needs to initialize the ICM eq table to hold the eq's for itself and also for all the slaves. 5. mlx4_set_port_mask() now invoked from mlx4_init_hca, instead of in mlx4_dev_cap. 6. Introduced sriov VF (slave) device startup/teardown logic (mainly procedures mlx4_init_slave, mlx4_slave_exit, mlx4_slave_cap, mlx4_slave_exit and flow modifications in __mlx4_init_one, mlx4_init_hca, and mlx4_setup_hca). VFs obtain their startup information from the PF (master) device via the comm channel. 7. In SRIOV mode (both PF and VF), MSI_X must be enabled, or the driver aborts loading the device. 8. Do not allow setting port type via sysfs when running in SRIOV mode. 9. mlx4_get_ownership: Currently, only one PF is supported by the driver. If the HCA is burned with FW which enables more than one PF, only one of the PFs is allowed to run. The first one up grabs a FW ownership semaphone -- all other PFs will find that semaphore taken, and the driver will not allow them to run. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Liran Liss <liranl@mellanox.co.il> Signed-off-by: Marcel Apfelbaum <marcela@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlx4: Ethernet port management modificationsEugenia Emantayev2011-12-131-0/+37
| | | | | | | | | | | | | | | | | The physical port is now common to the PF and VFs. The port resources and configuration is managed by the PF, VFs can only influence the MTU of the port, it is set as max among all functions, Each function allocates RX buffers of required size to meet it's MTU enforcement. Port management code was moved to mlx4_core, as the mlx4_en module is virtualization unaware Move handling qp functionality to mlx4_get_eth_qp/mlx4_put_eth_qp including reserve/release range and add/release unicast steering. Let mlx4_register/unregister_mac deal only with MAC (un)registration. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlx4: Traffic steering management support for SRIOVEugenia Emantayev2011-12-131-0/+9
| | | | | | | | | | | | Let multicast/unicast attaching flow go through resource tracker. The PF is the one responsible for managing all the steering entries. Define and use module parameter that determines the number of qps per multicast group. Minor changes in function calls according to changed prototype. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlx4_core: resource tracking for HCA resources used by guestsEli Cohen2011-12-131-6/+399
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The resource tracker is used to track usage of HCA resources by the different guests. Virtual functions (VFs) are attached to guest operating systems but resources are allocated from the same pool and are assigned to VFs. It is essential that hostile/buggy guests not be able to affect the operation of other VFs, possibly attached to other guest OSs since ConnectX firmware is not tolerant to misuse of resources. The resource tracker module associates each resource with a VF and maintains state information for the allocated object. It also defines allowed state transitions and enforces them. Relationships between resources are also referred to. For example, CQs are pointed to by QPs, so it is forbidden to destroy a CQ if a QP refers to it. ICM memory is always accessible through the primary function and hence it is allocated by the owner of the primary function. When a guest dies, an FLR is generated for all the VFs it owns and all the resources it used are freed. The tracked resource types are: QPs, CQs, SRQs, MPTs, MTTs, MACs, RES_EQs, and XRCDNs. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/mlx4_core: Implement the master-slave communication channelYevgeny Petrilin2011-12-131-27/+672
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When SRIOV is enabled, pf and vfs communicate via shared comm channel. The vf gets its side of the comm channel via a VF BAR. Each VF (slave) creates its vHCR (virtual HCA Command Register), Its DMA address is passed to the PF (master) using Communication Channel Register. The same Register is used to notify the master of commands posted by the slaves and for the master to pass events to the slaves, such as command completions and asynchronous events. The vHCR format is identical to the HCR format, except for the 'go' and 't' bits, which are reserved in the vHCR. Posting commands to the vHCR is identical to the way it is done with the HCR, albeit that the function/PF token fields are used instead of the HCR go bit. Specifically: - When the function prepares a new command in the vHCR, it issues the Post_vHCR_cmd communication channel command and toggles the value of the function token; when PF token has an equal value, the command has been accepted and a new command may be posted. - When the PF detects a Post_vHCR_cmd command, it concludes that a new command is available in the vHCR; after processing the command, the PF toggles the PF token to match the function token. When the 'e' bit is not set, the completion of a Post_vHCR_cmd command also indicates the completion the vHCR command. If, however, the 'e' bit is set, the completion of a Post_vHCR_cmd command only indicates that the vHCR command has been accepted for execution by the PF. Function commands are processed by the PF as follows: -DMA (using the ACCESS_MEM command) the vHCR image into a shadow buffer. -Validate that the opcode is non-privileged, and that the opcode- and input-modifiers are legal. -DMA the in-box (if required) into a shadow buffer. -Validate the command: o Resource ranges (e.g., QP ranges). o Partition key. o Ranges of referenced resources (e.g., CQs within QP contexts). -If the 'e' bit is set o complete the Post_vHCR_cmd command -Execute the command on the HCR. -DMA the results to the vHCR out-box (if required). -If the 'e' bit is set o Indicate command completion by generating a completion event using the GEN_EQE command -Otherwise o DMA the command status to the vHCR o Complete the Post_vHCR_cmd command Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Yevgeny Petrillin <yevgenyp@mellanox.com> Signed-off-by: Liran Liss <liranl@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mlx4_core: Add "native" argument to mlx4_cmd and its callers (where needed)Jack Morgenstein2011-12-131-1/+1
| | | | | | | | | | | | | | | For SRIOV, some Hypervisor commands can be executed directly (native = 1). Others should go through the command wrapper flow (for tracking resource usage, for example, or for changing some HCA configurations that slaves need to be notified of). This patch sets the groundwork for this capability -- adding the correct value of "native" in each case. Note that if SRIOV is not activated, this parameter has no effect. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* drivers/net: Add export.h to files using EXPORT_SYMBOL/THIS_MODULEPaul Gortmaker2011-10-311-0/+1
| | | | | | | | | | | | | | | These were getting the macros from an implicit module.h include via device.h, but we are planning to clean that up. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> drivers/net: Add export.h to wireless/brcm80211/brcmfmac/bcmsdh.c This relatively recently added file uses EXPORT_SYMBOL and hence needs export.h included so that it is compatible with the module.h split up work. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* mlx4: Move the Mellanox driverJeff Kirsher2011-08-111-0/+443
Moves the Mellanox driver into drivers/net/ethernet/mellanox/ and make the necessary Kconfig and Makefile changes. CC: Roland Dreier <roland@kernel.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>