summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* x86: Use this_cpu_inc_return for nmi counterTejun Heo2010-12-301-2/+1
| | | | | | | | | | | this_cpu_inc_return() saves us a memory access there. Reviewed-by: Pekka Enberg <penberg@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* x86: Replace uses of current_cpu_data with this_cpu opsTejun Heo2010-12-3011-29/+28
| | | | | | | | | | | | | | | | | | | Replace all uses of current_cpu_data with this_cpu operations on the per cpu structure cpu_info. The scala accesses are replaced with the matching this_cpu ops which results in smaller and more efficient code. In the long run, it might be a good idea to remove cpu_data() macro too and use per_cpu macro directly. tj: updated description Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* x86: Use this_cpu_ops to optimize codeTejun Heo2010-12-3016-62/+57
| | | | | | | | | | | | | Go through x86 code and replace __get_cpu_var and get_cpu_var instances that refer to a scalar and are not used for address determinations. Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* vmstat: User per cpu atomics to avoid interrupt disable / enableChristoph Lameter2010-12-181-14/+87
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently the operations to increment vm counters must disable interrupts in order to not mess up their housekeeping of counters. So use this_cpu_cmpxchg() to avoid the overhead. Since we can no longer count on preremption being disabled we still have some minor issues. The fetching of the counter thresholds is racy. A threshold from another cpu may be applied if we happen to be rescheduled on another cpu. However, the following vmstat operation will then bring the counter again under the threshold limit. The operations for __xxx_zone_state are not changed since the caller has taken care of the synchronization needs (and therefore the cycle count is even less than the optimized version for the irq disable case provided here). The optimization using this_cpu_cmpxchg will only be used if the arch supports efficient this_cpu_ops (must have CONFIG_CMPXCHG_LOCAL set!) The use of this_cpu_cmpxchg reduces the cycle count for the counter operations by %80 (inc_zone_page_state goes from 170 cycles to 32). Signed-off-by: Christoph Lameter <cl@linux.com>
* irq_work: Use per cpu atomics instead of regular atomicsChristoph Lameter2010-12-181-9/+9
| | | | | | | | | | | | | | | | | | | | | | The irq work queue is a per cpu object and it is sufficient for synchronization if per cpu atomics are used. Doing so simplifies the code and reduces the overhead of the code. Before: christoph@linux-2.6$ size kernel/irq_work.o text data bss dec hex filename 451 8 1 460 1cc kernel/irq_work.o After: christoph@linux-2.6$ size kernel/irq_work.o text data bss dec hex filename 438 8 1 447 1bf kernel/irq_work.o Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Christoph Lameter <cl@linux.com>
* Merge branch 'this_cpu_ops' into for-2.6.38Tejun Heo2010-12-183-68/+316
|\
| * cpuops: Use cmpxchg for xchg to avoid lock semanticsChristoph Lameter2010-12-181-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use cmpxchg instead of xchg to realize this_cpu_xchg. xchg will cause LOCK overhead since LOCK is always implied but cmpxchg will not. Baselines: xchg() = 18 cycles (no segment prefix, LOCK semantics) __this_cpu_xchg = 1 cycle (simulated using this_cpu_read/write, two prefixes. Looks like the cpu can use loop optimization to get rid of most of the overhead) Cycles before: this_cpu_xchg = 37 cycles (segment prefix and LOCK (implied by xchg)) After: this_cpu_xchg = 11 cycle (using cmpxchg without lock semantics) Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * x86: this_cpu_cmpxchg and this_cpu_xchg operationsChristoph Lameter2010-12-182-1/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide support as far as the hardware capabilities of the x86 cpus allow. Define CONFIG_CMPXCHG_LOCAL in Kconfig.cpu to allow core code to test for fast cpuops implementations. V1->V2: - Take out the definition for this_cpu_cmpxchg_8 and move it into a separate patch. tj: - Reordered ops to better follow this_cpu_* organization. - Renamed macro temp variables similar to their existing neighbours. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg supportChristoph Lameter2010-12-181-1/+133
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generic code to provide new per cpu atomic features this_cpu_cmpxchg this_cpu_xchg Fallback occurs to functions using interrupts disable/enable to ensure correct per cpu atomicity. Fallback to regular cmpxchg and xchg is not possible since per cpu atomic semantics include the guarantee that the current cpus per cpu data is accessed atomically. Use of regular cmpxchg and xchg requires the determination of the address of the per cpu data before regular cmpxchg or xchg which therefore cannot be atomically included in an xchg or cmpxchg without segment override. tj: - Relocated new ops to conform better to the general organization. - This patch contains a trivial comment fix. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * percpu,x86: relocate this_cpu_add_return() and friendsTejun Heo2010-12-172-66/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | - include/linux/percpu.h: this_cpu_add_return() and friends were located next to __this_cpu_add_return(). However, the overall organization is to first group by preemption safeness. Relocate this_cpu_add_return() and friends to preemption-safe area. - arch/x86/include/asm/percpu.h: Relocate percpu_add_return_op() after other more basic operations. Relocate [__]this_cpu_add_return_8() so that they're first grouped by preemption safeness. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com>
* | connector: Use this_cpu operationsChristoph Lameter2010-12-171-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch was originally in the use cpuops patchset but it needs an inc_return and is therefore dependent on an extension of the cpu ops. Fixed up and verified that it compiles. get_seq can benefit from this_cpu_operations. Address calculation is avoided and the increment is done using an xadd. Cc: Scott James Remnant <scott@ubuntu.com> Cc: Mike Frysinger <vapier@gentoo.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* | xen: Use this_cpu_inc_returnChristoph Lameter2010-12-171-1/+1
| | | | | | | | | | | | | | | | __this_cpu_inc_return reduces code and simplifies code. Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com>
* | taskstats: Use this_cpu_opsChristoph Lameter2010-12-171-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use this_cpu_inc_return in one place and avoid ugly __raw_get_cpu in another. V3->V4: - Fix off by one. V4-V4f: - Use &listener_array Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* | random: Use this_cpu_inc_returnChristoph Lameter2010-12-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | __this_cpu_inc can create a single instruction to do the same as __get_cpu_var()++. Cc: Richard Kennedy <richard@rsk.demon.co.uk> Cc: Matt Mackall <mpm@selenic.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* | fs: Use this_cpu_inc_return in buffer.cChristoph Lameter2010-12-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | __this_cpu_inc can create a single instruction with the same effect as the _get_cpu_var(..)++ construct in buffer.c. Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Christoph Hellwig <hch@lst.de> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* | highmem: Use this_cpu_xx_return() operationsChristoph Lameter2010-12-171-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use this_cpu operations to optimize access primitives for highmem. The main effect is the avoidance of address calculations through the use of a segment prefix. V3->V4 - kmap_atomic_idx: Do not return a value. - Use __this_cpu_dec without HIGHMEM_DEBUG Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Catalin Marinas <catalin.marinas@arm.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* | vmstat: Use this_cpu_inc_return for vm statisticsChristoph Lameter2010-12-171-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | this_cpu_inc_return() saves us a memory access there. Code size does not change. V1->V2: - Fixed the location of the __per_cpu pointer attributes - Sparse checked V2->V3: - Move fixes to __percpu attribute usage to earlier patch Reviewed-by: Pekka Enberg <penberg@kernel.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* | Merge branch 'this_cpu_ops' into for-2.6.38Tejun Heo2010-12-17369-2072/+3775
|\ \ | |/
| * x86: Support for this_cpu_add, sub, dec, inc_returnChristoph Lameter2010-12-171-0/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Supply an implementation for x86 in order to generate more efficient code. V2->V3: - Cleanup - Remove strange type checking from percpu_add_return_op. tj: - Dropped unused typedef from percpu_add_return_op(). - Renamed ret__ to paro_ret__ in percpu_add_return_op(). - Minor indentation adjustments. Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * percpu: Generic support for this_cpu_add, sub, dec, inc_returnChristoph Lameter2010-12-171-0/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce generic support for this_cpu_add_return etc. The fallback is to realize these operations with simpler __this_cpu_ops. tj: - Reformatted __cpu_size_call_return2() to make it more consistent with its neighbors. - Dropped unnecessary temp variable ret__ from __this_cpu_generic_add_return(). Reviewed-by: Tejun Heo <tj@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notifyLinus Torvalds2010-12-167-38/+68
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.infradead.org/users/eparis/notify: fanotify: fill in the metadata_len field on struct fanotify_event_metadata fanotify: split version into version and metadata_len fanotify: Dont try to open a file descriptor for the overflow event fanotify: Introduce FAN_NOFD fanotify: do not leak user reference on allocation failure inotify: stop kernel memory leak on file creation failure fanotify: on group destroy allow all waiters to bypass permission check fanotify: Dont allow a mask of 0 if setting or removing a mark fanotify: correct broken ref counting in case adding a mark failed fanotify: if set by user unset FMODE_NONOTIFY before fsnotify_perm() is called fanotify: remove packed from access response message fanotify: deny permissions when no event was sent
| | * fanotify: fill in the metadata_len field on struct fanotify_event_metadataEric Paris2010-12-151-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | The fanotify_event_metadata now has a field which is supposed to indicate the length of the metadata portion of the event. Fill in that field as well. Based-in-part-on-patch-by: Alexey Zaytsev <alexey.zaytsev@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com>
| | * fanotify: split version into version and metadata_lenAlexey Zaytsev2010-12-151-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To implement per event type optional headers we are interested in knowing how long the metadata structure is. This patch slits the __u32 version field into a __u8 version and a __u16 metadata_len field (with __u8 left over). This should allow for backwards compat ABI. Signed-off-by: Alexey Zaytsev <alexey.zaytsev@gmail.com> [rewrote descrtion and changed object sizes and ordering - eparis] Signed-off-by: Eric Paris <eparis@redhat.com>
| | * fanotify: Dont try to open a file descriptor for the overflow eventLino Sanfilippo2010-12-071-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | We should not try to open a file descriptor for the overflow event since this will always fail. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Eric Paris <eparis@redhat.com>
| | * fanotify: Introduce FAN_NOFDLino Sanfilippo2010-12-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | FAN_NOFD is used in fanotify events that do not provide an open file descriptor (like the overflow_event). Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Eric Paris <eparis@redhat.com>
| | * fanotify: do not leak user reference on allocation failureEric Paris2010-12-071-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | If fanotify_init is unable to allocate a new fsnotify group it will return but will not drop its reference on the associated user struct. Drop that reference on error. Reported-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com>
| | * inotify: stop kernel memory leak on file creation failureEric Paris2010-12-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If inotify_init is unable to allocate a new file for the new inotify group we leak the new group. This patch drops the reference on the group on file allocation failure. Reported-by: Vegard Nossum <vegard.nossum@gmail.com> cc: stable@kernel.org Signed-off-by: Eric Paris <eparis@redhat.com>
| | * fanotify: on group destroy allow all waiters to bypass permission checkLino Sanfilippo2010-12-073-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When fanotify_release() is called, there may still be processes waiting for access permission. Currently only processes for which an event has already been queued into the groups access list will be woken up. Processes for which no event has been queued will continue to sleep and thus cause a deadlock when fsnotify_put_group() is called. Furthermore there is a race allowing further processes to be waiting on the access wait queue after wake_up (if they arrive before clear_marks_by_group() is called). This patch corrects this by setting a flag to inform processes that the group is about to be destroyed and thus not to wait for access permission. [additional changelog from eparis] Lets think about the 4 relevant code paths from the PoV of the 'operator' 'listener' 'responder' and 'closer'. Where operator is the process doing an action (like open/read) which could require permission. Listener is the task (or in this case thread) slated with reading from the fanotify file descriptor. The 'responder' is the thread responsible for responding to access requests. 'Closer' is the thread attempting to close the fanotify file descriptor. The 'operator' is going to end up in: fanotify_handle_event() get_response_from_access() (THIS BLOCKS WAITING ON USERSPACE) The 'listener' interesting code path fanotify_read() copy_event_to_user() prepare_for_access_response() (THIS CREATES AN fanotify_response_event) The 'responder' code path: fanotify_write() process_access_response() (REMOVE A fanotify_response_event, SET RESPONSE, WAKE UP 'operator') The 'closer': fanotify_release() (SUPPOSED TO CLEAN UP THE REST OF THIS MESS) What we have today is that in the closer we remove all of the fanotify_response_events and set a bit so no more response events are ever created in prepare_for_access_response(). The bug is that we never wake all of the operators up and tell them to move along. You fix that in fanotify_get_response_from_access(). You also fix other operators which haven't gotten there yet. So I agree that's a good fix. [/additional changelog from eparis] [remove additional changes to minimize patch size] [move initialization so it was inside CONFIG_FANOTIFY_PERMISSION] Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Eric Paris <eparis@redhat.com>
| | * fanotify: Dont allow a mask of 0 if setting or removing a markLino Sanfilippo2010-12-071-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In mark_remove_from_mask() we destroy marks that have their event mask cleared. Thus we should not allow the creation of those marks in the first place. With this patch we check if the mask given from user is 0 in case of FAN_MARK_ADD. If so we return an error. Same for FAN_MARK_REMOVE since this does not have any effect. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Eric Paris <eparis@redhat.com>
| | * fanotify: correct broken ref counting in case adding a mark failedLino Sanfilippo2010-12-071-17/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If adding a mount or inode mark failed fanotify_free_mark() is called explicitly. But at this time the mark has already been put into the destroy list of the fsnotify_mark kernel thread. If the thread is too slow it will try to decrease the reference of a mark, that has already been freed by fanotify_free_mark(). (If its fast enough it will only decrease the marks ref counter from 2 to 1 - note that the counter has been increased to 2 in add_mark() - which has practically no effect.) This patch fixes the ref counting by not calling free_mark() explicitly, but decreasing the ref counter and rely on the fsnotify_mark thread to cleanup in case adding the mark has failed. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Eric Paris <eparis@redhat.com>
| | * fanotify: if set by user unset FMODE_NONOTIFY before fsnotify_perm() is calledLino Sanfilippo2010-12-072-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unsetting FMODE_NONOTIFY in fsnotify_open() is too late, since fsnotify_perm() is called before. If FMODE_NONOTIFY is set fsnotify_perm() will skip permission checks, so a user can still disable permission checks by setting this flag in an open() call. This patch corrects this by unsetting the flag before fsnotify_perm is called. Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Eric Paris <eparis@redhat.com>
| | * fanotify: remove packed from access response messageEric Paris2010-12-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since fanotify has decided to be careful about alignment and packing rather than rely on __attribute__((packed)) for multiarch support. Since this attribute isn't doing anything on fanotify_response we just drop it. This does not break API/ABI. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com> Signed-off-by: Eric Paris <eparis@redhat.com>
| | * fanotify: deny permissions when no event was sentEric Paris2010-12-071-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | If no event was sent to userspace we cannot expect userspace to respond to permissions requests. Today such requests just hang forever. This patch will deny any permissions event which was unable to be sent to userspace. Reported-by: Tvrtko Ursulin <tvrtko.ursulin@sophos.com> Signed-off-by: Eric Paris <eparis@redhat.com>
| * | Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linusLinus Torvalds2010-12-1629-152/+350
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (28 commits) MIPS: Add a CONFIG_FORCE_MAX_ZONEORDER Kconfig option. MIPS: LD/SD o32 macro GAS fix update MIPS: Alchemy: fix build with SERIAL_8250=n MIPS: Rename mips_dma_cache_sync back to dma_cache_sync MIPS: MT: Fix typo in comment. SSB: Fix nvram_get on BCM47xx platform MIPS: BCM47xx: Swap serial console if ttyS1 was specified. MIPS: BCM47xx: Use sscanf for parsing mac address MIPS: BCM47xx: Fill values for b43 into SSB sprom MIPS: BCM47xx: Do not read config from CFE MIPS: FDT size is a be32 MIPS: Fix CP0 COUNTER clockevent race MIPS: Fix regression on BCM4710 processor detection MIPS: JZ4740: Fix pcm device name MIPS: Separate two consecutive loads in memset.S MIPS: Send proper signal and siginfo on FP emulator faults. MIPS: AR7: Fix loops per jiffies on TNETD7200 devices MIPS: AR7: Fix double ar7_gpio_init declaration MIPS: Rework GENERIC_HARDIRQS Kconfig. MIPS: Alchemy: Add return value check for strict_strtoul() ...
| | * | MIPS: Add a CONFIG_FORCE_MAX_ZONEORDER Kconfig option.David Daney2010-12-161-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For huge page support with base page size of 16K or 32K, we have to increase the MAX_ORDER so that huge pages can be allocated. [Ralf: I don't think a user should have to configure obscure constants like this but for the time being this will have to suffice.] Signed-off-by: David Daney <ddaney@caviumnetworks.com> To: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/1685/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: LD/SD o32 macro GAS fix updateMaciej W. Rozycki2010-12-162-4/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I am about to commit: http://sourceware.org/ml/binutils/2010-10/msg00033.html that fixes a problem with the LD/SD macro currently implemented by GAS for the o32 ABI in an inconsistent way. This is best illustrated with a simple program, which I'm copying here from the message above for easier reference: $ cat ld.s ld $5,32767($4) ld $5,32768($4) This gets assebled into the following output: $ mips-linux-as -32 -mips3 -o ld.o ld.s $ mips-linux-objdump -d ld.o ld.o: file format elf32-tradbigmips Disassembly of section .text: 00000000 <.text>: 0: dc857fff ld a1,32767(a0) 4: 3c010001 lui at,0x1 8: 00810821 addu at,a0,at c: 8c258000 lw a1,-32768(at) 10: 8c268004 lw a2,-32764(at) ... Oops! The GAS fix makes the macro behave in a consistent way and pairs of LW/SW instructions to be output as appropriate regardless of the size of the offset associated with the address used. The machine instruction is still available, but to reach it macros have to be disabled first. This has a side effect of requiring the use of a machine-addressable memory operand. As some platforms require 64-bit operations for accesses to some I/O registers LD/SD instructions are used in a couple of places in Linux regardless of the ABI selected. Here's a fix for some pieces of code affected I've been able to track down. The fix should be backwards compatible with all supported binutils releases in existence and can be used as a reference for any other places or off-tree code. The use of the "R" constraint guarantees a machine-addressable operand. Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/1680/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: Alchemy: fix build with SERIAL_8250=nManuel Lauss2010-12-161-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 7d172bfe ("Alchemy: Add UART PM methods") I introduced platform PM methods which call a function of the 8250 driver; this patch works around link failures when the kernel is built without 8250 support. Signed-off-by: Manuel Lauss <manuel.lauss@googlemail.com> To: Linux-MIPS <linux-mips@linux-mips.org> Patchwork: https://patchwork.linux-mips.org/patch/1737/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: Rename mips_dma_cache_sync back to dma_cache_syncRalf Baechle2010-12-161-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | This fixes IP22 and IP28 build errors. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: MT: Fix typo in comment.Ralf Baechle2010-12-161-1/+1
| | | | | | | | | | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | SSB: Fix nvram_get on BCM47xx platformHauke Mehrtens2010-12-161-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nvram_get function was never in the mainline kernel, it only existed in an external OpenWrt patch. Use nvram_getenv function, which is in mainline and use an include instead of an extra function declaration. et0macaddr contains the mac address in text from like 00:11:22:33:44:55. We have to parse it before adding it into macaddr. nvram_parse_macaddr will be merged into asm/mach-bcm47xx/nvram.h through the MIPS git tree and will be available soon. It will not build now without nvram_parse_macaddr, but it hasn't before either. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> To: linux-mips@linux-mips.org Cc: mb@bu3sch.de Cc: netdev@vger.kernel.org Cc: Hauke Mehrtens <hauke@hauke-m.de> Acked-by: Michael Buesch <mb@bu3sch.de> Patchwork: https://patchwork.linux-mips.org/patch/1849/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: BCM47xx: Swap serial console if ttyS1 was specified.Hauke Mehrtens2010-12-161-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some devices like the Netgear WGT634U are using ttyS1 for default console output. We should switch to that console if it was given in the kernel_args parameters. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> To: linux-mips@linux-mips.org Cc: Hauke Mehrtens <hauke@hauke-m.de> Patchwork: https://patchwork.linux-mips.org/patch/1848/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: BCM47xx: Use sscanf for parsing mac addressHauke Mehrtens2010-12-162-20/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of writing own function for parsing the mac address we now use sscanf. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> To: linux-mips@linux-mips.org Cc: Hauke Mehrtens <hauke@hauke-m.de> Patchwork: https://patchwork.linux-mips.org/patch/1847/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: BCM47xx: Fill values for b43 into SSB spromHauke Mehrtens2010-12-161-22/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fill the sprom with all available values from the nvram. Most of these new values are needed for the b43 or b43legacy driver. Parts of this patch have been in OpenWRT for a long time and were written by Michael Buesch. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> To: linux-mips@linux-mips.org Cc: Hauke Mehrtens <hauke@hauke-m.de> Patchwork: https://patchwork.linux-mips.org/patch/1846/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: BCM47xx: Do not read config from CFEHauke Mehrtens2010-12-161-19/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The config options read out here are not stored in CFE but only in NVRAM on the devices. Remove reading from CFE and only access the NVRAM. Reading out CFE does not harm but is useless here. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> To: linux-mips@linux-mips.org Cc: Hauke Mehrtens <hauke@hauke-m.de> Patchwork: https://patchwork.linux-mips.org/patch/1845/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: FDT size is a be32Thomas Chou2010-12-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The totalsize field was be32. And the reserve bootmem would cause failure. Signed-off-by: Thomas Chou <thomas@wytron.com.tw> To: devicetree-discuss@lists.ozlabs.org Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: grant.likely@secretlab.ca Cc: David Daney <ddaney@caviumnetworks.com> Cc: Dezhong Diao <dediao@cisco.com> Patchwork: https://patchwork.linux-mips.org/patch/1838/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: Fix CP0 COUNTER clockevent raceKevin Cernekee2010-12-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider the following test case: write_c0_compare(read_c0_count()); Even if the counter doesn't increment during execution, this might not generate an interrupt until the counter wraps around. The CPU may perform the comparison each time CP0 COUNT increments, not when CP0 COMPARE is written. If mips_next_event() is called with a very small delta, and CP0 COUNT increments during the calculation of "cnt += delta", it is possible that CP0 COMPARE will be written with the current value of CP0 COUNT. If this is detected, the function should return -ETIME, to indicate that the interrupt might not have actually gotten scheduled. Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/1836/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: Fix regression on BCM4710 processor detectionKevin Cernekee2010-12-162-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BCM4710 uses the BMIPS32 core (like BCM6345), not the MIPS 4Kc core as was previously believed. Signed-off-by: Kevin Cernekee <cernekee@gmail.com> Tested-by: Alexandros C. Couloumbis <alex@ozo.com> Patchwork: https://patchwork.linux-mips.org/patch/1837/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: JZ4740: Fix pcm device nameLars-Peter Clausen2010-12-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part the ASoC multi-component patch (commit f0fba2ad) the jz4740 pcm driver was renamed to 'jz4740-pcm-audio'. Adjust the device name accordingly. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/1770/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: Separate two consecutive loads in memset.STony Wu2010-12-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | partial_fixup is used in noreorder block. Separating two consecutive loads can save one cycle on processors with GPR intrelock and can fix load-use on processors that need a load delay slot. Also do so for fwd_fixup. [Ralf: Only R2000/R3000 class processors are lacking the the load-user interlock and even some of those got it retrofitted. With R2000/R3000 being fairly uncommon these days the impact of this bug should be minor.] Signed-off-by: Tony Wu <tung7970@gmail.com> To: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/1768/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: Send proper signal and siginfo on FP emulator faults.David Daney2010-12-162-30/+130
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We were unconditionally sending SIGBUS with an empty siginfo on FP emulator faults. This differs from what happens when real floating point hardware would get a fault. For most faults we need to send SIGSEGV with the faulting address filled in in the struct siginfo. Reported-by: Camm Maguire <camm@maguirefamily.org> Signed-off-by: David Daney <ddaney@caviumnetworks.com> To: linux-mips@linux-mips.org Cc: Camm Maguire <camm@maguirefamily.org> Patchwork: https://patchwork.linux-mips.org/patch/1727/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>