Commit Graph

164 Commits

Author SHA1 Message Date
Vivek Goyal 484b90c4b9 [PATCH] kdump: Save parameter segment in protected mode (x86)
o With introduction of kexec as boot-loader, the assumption that parameter
  segment will always be loaded at lower address than kernel and will be
  addressable by early bootup page tables is no longer valid. In kexec on
  panic case parameter segment might well be loaded beyond kernel image and
  might not be addressable by early boot page tables.
o This case might hit in the scenario where user has reserved a chunk of
  memory for second kernel, for example 16MB to 64MB, and has also built
  second kernel for physical memory location 16MB. In this case kexec has no
  choice but to load the parameter segment at a higher address than new kernel
  image at safe location where new kernel does not stomp it.
o Though problem should automatically go away once relocatable kernel for i386
  is in place and kexec can determine the location of new kernel at run time
  and load parameter segment at lower address than kernel image. But till then
  this patch can go in (assuming it does not break something else).
o This patch moves up the boot parameter saving code. Now boot parameters
  are copied out in protected mode before page tables are initialized. This
  will ensure that parameter segment is always addressable irrespective of
  its physical location.

Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:09 -07:00
Petr Tesarik 5fd75ebb1a [PATCH] vm86: Honor TF bit when emulating an instruction
If the virtual 86 machine reaches an instruction which raises a General
Protection Fault (such as CLI or STI), the instruction is emulated (in
handle_vm86_fault).  However, the emulation ignored the TF bit, so the
hardware debug interrupt was not invoked after such an emulated instruction
(and the DOS debugger missed it).

This patch fixes the problem by emulating the hardware debug interrupt as
the last action before control is returned to the VM86 program.

Signed-off-by: Petr Tesarik <kernel@tesarici.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:09 -07:00
Matt Tolentino 7ae65fd334 [PATCH] x86: fix EFI memory map parsing
The memory descriptors that comprise the EFI memory map are not fixed in
stone such that the size could change in the future.  This uses the memory
descriptor size obtained from EFI to iterate over the memory map entries
during boot.  This enables the removal of an x86 specific pad (and ifdef)
in the EFI header.  I also couldn't stomach the broken up nature of the
function to put EFI runtime calls into virtual mode any longer so I fixed
that up a bit as well.

For reference, this patch only impacts x86.

Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:09 -07:00
Venkatesh Pallipadi 4116c527ea [PATCH] hpet: use read_timer_tsc only when CPU has TSC
Only use read_timer_tsc only when CPU has TSC.  Thanks to Andrea for
pointing this out.  Should not be issue on any platforms as all recent
systems that has HPET also has CPUs that supports TSC.  The patch is still
required for correctness.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-05 00:06:09 -07:00
Steven Rostedt 69be8f1896 [PATCH] convert signal handling of NODEFER to act like other Unix boxes.
It has been reported that the way Linux handles NODEFER for signals is
not consistent with the way other Unix boxes handle it.  I've written a
program to test the behavior of how this flag affects signals and had
several reports from people who ran this on various Unix boxes,
confirming that Linux seems to be unique on the way this is handled.

The way NODEFER affects signals on other Unix boxes is as follows:

1) If NODEFER is set, other signals in sa_mask are still blocked.

2) If NODEFER is set and the signal is in sa_mask, then the signal is
still blocked. (Note: this is the behavior of all tested but Linux _and_
NetBSD 2.0 *).

The way NODEFER affects signals on Linux:

1) If NODEFER is set, other signals are _not_ blocked regardless of
sa_mask (Even NetBSD doesn't do this).

2) If NODEFER is set and the signal is in sa_mask, then the signal being
handled is not blocked.

The patch converts signal handling in all current Linux architectures to
the way most Unix boxes work.

Unix boxes that were tested:  DU4, AIX 5.2, Irix 6.5, NetBSD 2.0, SFU
3.5 on WinXP, AIX 5.3, Mac OSX, and of course Linux 2.6.13-rcX.

* NetBSD was the only other Unix to behave like Linux on point #2. The
main concern was brought up by point #1 which even NetBSD isn't like
Linux.  So with this patch, we leave NetBSD as the lonely one that
behaves differently here with #2.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-29 10:03:11 -07:00
Chuck Ebbert b1daec3089 [PATCH] i386: fix incorrect FP signal code
i386 floating-point exception handling has a bug that can cause error
code 0 to be sent instead of the proper code during signal delivery.

This is caused by unconditionally checking the IS and c1 bits from the
FPU status word when they are not always relevant.  The IS bit tells
whether an exception is a stack fault and is only relevant when the
exception is IE (invalid operation.) The C1 bit determines whether a
stack fault is overflow or underflow and is only relevant when IS and IE
are set.

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-23 19:52:37 -07:00
Steven Rostedt cd3716ab40 [PATCH] Mobil Pentium 4 HT and the NMI
I'm trying to get the nmi working with my laptop (IBM ThinkPad G41) and after
debugging it a while, I found that the nmi code doesn't want to set it up for
this particular CPU.

Here I have:

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Mobile Intel(R) Pentium(R) 4 CPU 3.33GHz
stepping        : 1
cpu MHz         : 3320.084
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 3
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni
monitor ds_cpl est tm2 cid xtpr
bogomips        : 6642.39

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Mobile Intel(R) Pentium(R) 4 CPU 3.33GHz
stepping        : 1
cpu MHz         : 3320.084
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 3
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni
monitor ds_cpl est tm2 cid xtpr
bogomips        : 6637.46

And the following code shows:

$ cat linux-2.6.13-rc6/arch/i386/kernel/nmi.c

[...]

void setup_apic_nmi_watchdog (void)
{
        switch (boot_cpu_data.x86_vendor) {
        case X86_VENDOR_AMD:
                if (boot_cpu_data.x86 != 6 && boot_cpu_data.x86 != 15)
                        return;
                setup_k7_watchdog();
                break;
        case X86_VENDOR_INTEL:
                 switch (boot_cpu_data.x86) {
                case 6:
                        if (boot_cpu_data.x86_model > 0xd)
                                return;

                        setup_p6_watchdog();
                        break;
                case 15:
                        if (boot_cpu_data.x86_model > 0x3)
                                return;

Here I get boot_cpu_data.x86_model == 0x4.  So I decided to change it and
reboot.  I now seem to have a working NMI.  So, unless there's something know
to be bad about this processor and the NMI.  I'm submitting the following
patch.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
Acked-by: Mikael Pettersson <mikpe@csd.uu.se>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-19 18:44:56 -07:00
Andi Kleen 6be382ea0c [PATCH] x86: Remove obsolete get_cpu_vendor call
Since early CPU identify is in this information is already available

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-18 12:53:59 -07:00
Andrew Morton 39bbb07d7c [PATCH] transmeta: CONFIG_PROC_FS=n build fix
Fix bug found by Grant Coady <lkml@dodo.com.au>'s autobuild setup.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-01 21:38:00 -07:00
Eric Lammerts cdf32eaa4e [PATCH] disable addres space randomization default on transmeta CPUs
We know that the randomisation slows down some workloads on Transmeta CPUs
by quite large amounts.  We think it's because the CPU needs to recode the
same x86 instructions when they pop up at a different virtual address after
a fork+exec.

So disable randomization by default on those CPUs.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-01 19:13:59 -07:00
Ingo Molnar 6cb54819d7 [PATCH] remove sys_set_zone_reclaim()
This removes sys_set_zone_reclaim() for now.  While i'm sure Martin is
trying to solve a real problem, we must not hard-code an incomplete and
insufficient approach into a syscall, because syscalls are pretty much
for eternity.  I am quite strongly convinced that this syscall must not
hit v2.6.13 in its current form.

Firstly, the syscall lacks basic syscall design: e.g. it allows the
global setting of VM policy for unprivileged users. (!) [ Imagine an
Oracle installation and a SAP installation on the same NUMA box fighting
over the 'optimal' setting for this flag. What will they do? Will they
try to set the flag to their own preferred value every second or so? ]

Secondly, it was added based on a single datapoint from Martin:

 http://marc.theaimsgroup.com/?l=linux-mm&m=111763597218177&w=2

where Martin characterizes the numbers the following way:

 ' Run-to-run variability for "make -j" is huge, so these numbers aren't
   terribly useful except to see that with reclaim the benchmark still
   finishes in a reasonable amount of time. '

in other words: the fundamental problem has likely not been solved, only
a tendential move into the right direction has been observed, and a
handful of numbers were picked out of a set of hugely variable results,
without showing the variability data. How much variance is there
run-to-run?

I'd really suggest to first walk the walk and see what's needed to get
stable & predictable kernel compilation numbers on that NUMA box, before
adding random syscalls to tune a particular aspect of the VM ... which
approach might not even matter once the whole picture has been analyzed
and understood!

The third, most important point is that the syscall exposes VM tuning
internals in a completely unstructured way. What sense does it make to
have a _GLOBAL_ per-node setting for 'should we go to another node for
reclaim'? If then it might make sense to do this per-app, via numalib or
so.

The change is minimalistic in that it doesnt remove the syscall and the
underlying infrastructure changes, only the user-visible changes.  We
could perhaps add a CAP_SYS_ADMIN-only sysctl for this hack, a'ka
/proc/sys/vm/swappiness, but even that looks quite counterproductive
when the generic approach is that we are trying to reduce the number of
external factors in the VM balance picture.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-01 10:03:56 -07:00
Len Brown adbedd3424 merge 2.6.13-rc4 with ACPI's to-linus tree 2005-07-30 01:55:32 -04:00
Len Brown d6ac1a7910 /home/lenb/src/to-linus branch 'acpi-2.6.12' 2005-07-29 23:31:17 -04:00
Dominik Brodowski 4b31e77455 [ACPI] Always set P-state on initialization
Otherwise a platform that supports ACPI based cpufreq
and boots up at lowest possible speed could stay there
forever.  This because the governor may request max speed,
but the code doesn't update if there is no change in
speed, and it assumed the initial state of max speed.

http://bugzilla.kernel.org/show_bug.cgi?id=4634

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-29 18:29:47 -04:00
Natalie.Protasevich@unisys.com e1afc3f522 [PATCH] x86: avoid wasting IRQs patch update
The patch addresses a problem with ACPI SCI interrupt entry, which gets
re-used, and the IRQ is assigned to another unrelated device.  The patch
corrects the code such that SCI IRQ is skipped and duplicate entry is
avoided.  Second issue came up with VIA chipset, the problem was caused by
original patch assigning IRQs starting 16 and up.  The VIA chipset uses
4-bit IRQ register for internal interrupt routing, and therefore cannot
handle IRQ numbers assigned to its devices.  The patch corrects this
problem by allowing PCI IRQs below 16.

Signed-off by: Natalie Protasevich <Natalie.Protasevich@unisys.com>

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-29 15:01:13 -07:00
Linus Torvalds 590f47a1d9 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq 2005-07-29 14:40:08 -07:00
Dave Jones 094ce7fde4 arch/i386/kernel/cpu/cpufreq/powernow-k8.c: In function `powernow_k8_cpu_init_acpi':
arch/i386/kernel/cpu/cpufreq/powernow-k8.c:740: warning: unused variable `vid'
arch/i386/kernel/cpu/cpufreq/powernow-k8.c:739: warning: unused variable `fid'
arch/i386/kernel/cpu/cpufreq/powernow-k8.c:743: warning: unused variable `vid'
arch/i386/kernel/cpu/cpufreq/powernow-k8.c:742: warning: unused variable `fid'
arch/i386/kernel/cpu/cpufreq/powernow-k8.c:746: `fid' undeclared (first use in this function)
arch/i386/kernel/cpu/cpufreq/powernow-k8.c:746: `vid' undeclared (first use in this function)

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-07-29 12:55:40 -07:00
Eric W. Biederman e7b47ccaf6 [PATCH] i386 machine_kexec: Cleanup inline assembly
For some reason I was telling my inline assembly that the
input argument was an output argument.

Playing in the trampoline code I have seen a couple of
instances where lgdt get the wrong size (because the
trampolines run in 16bit mode) so use lgdtl and lidtl to
be explicit.

Additionally gcc-3.3 and gcc-3.4 want's an lvalue for a
memory argument and it doesn't think an array of characters
is an lvalue so use a packed structure instead.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-29 12:17:26 -07:00
Dave Jones 2bcad935a3 Fix up powernow-k8 compile. (Missing definitions).
From: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-07-29 09:56:41 -07:00
Linus Torvalds 33ac02aa4c Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq 2005-07-29 10:16:25 -07:00
Dave Hansen e1474e2d9d [PATCH] re-disable TSC on NUMAQ
Somewhere recently, the TSC got re-enabled for timekeeping on NUMAQ
machines.  However, the hardware makes these get unsynchronized quite
badly.  So badly, in fact, that the code to fix up the skew can just hang
on boot.

This patch re-disables them.  It's nicely confined to the numaq.c file.  It
would be great if this could make it into 2.6.13, I think it counts as a
bugfix.

Tested on a 16-proc 4-node NUMAQ.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28 21:46:05 -07:00
Andi Kleen e2cac78935 [PATCH] x86_64: When running cpuid4 need to run on the correct CPU
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-28 21:46:01 -07:00
Dave Jones 7153d9612f powernow-k8.c: In function `query_current_values_with_pending_wait':
powernow-k8.c:110: warning: `hi' may be used uninitialized in this function

Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2005-07-28 09:45:10 -07:00
Dave Jones 841e40b380 Opteron revision F will support higher frequencies than
can be encoded in the current driver's 4 bit frequency
field.  This patch updates the driver to support Rev F
including 6 bit FIDs and processor ID updates.

This should apply cleanly whether or not the dual-core
bugfix I sent out last week is applied.  I'd prefer
that both get applied, of course.

Signed-off-by: David Keck <david.keck@amd.com>
Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-07-28 09:40:04 -07:00
Dave Jones 03938c3f10 powernow-k8 requires that a data structure for
each core be created in the _cpu_init function
call.  The cpufreq infrastructure doesn't call
_cpu_init for the second core in each processor.
Some systems crashed when _get was called with
an odd-numbered core because it tried to
dereference a NULL pointer since the data
structure had not been created.

The attached patch solves the problem by
initializing data structures for all shared
cores in the _cpu_init function.  It should
apply to 2.6.12-rc6 and has been tested by
AMD and Sun.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2005-07-28 09:38:21 -07:00
Blaisorblade 71ae18ec69 [PATCH] sys_get_thread_area does not clear the returned argument
sys_get_thread_area does not memset to 0 its struct user_desc info before
copying it to user space...  since sizeof(struct user_desc) is 16 while the
actual datas which are filled are only 12 bytes + 9 bits (across the
bitfields), there is a (small) information leak.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:08 -07:00
Brian Gerst 7657e20e46 [PATCH] Fix warning in powernow-k8.c
powernow-k8.c: In function `query_current_values_with_pending_wait':
powernow-k8.c:110: warning: `hi' may be used uninitialized in this function

Signed-off-by: Brian Gerst <bgerst@didntduck.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:25:54 -07:00
Eric W. Biederman 910de55c66 [PATCH] APM: Remove redundant call to set_cpus_allowed
machine_power_off now always switches to the boot cpu so there
is no reason for APM to also do that.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:45 -07:00
Eric W. Biederman 4fa2564a6f [PATCH] i386 machine_power_off cleanup
Call machine_shutdown() to move to the boot cpu
and disable apics.  Both acpi_power_off and
apm_power_off want to move to the boot cpu.
and we are already disabling the local apics
so calling machine_shutdown simply reuses
code.

ia64 doesn't have a special path in power_off
for efi so there is no reason i386 should.  If
we really need to call the efi power off path
the efi driver can set pm_power_off like everyone
else.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:44 -07:00
Eric W. Biederman d8e392e7c8 [PATCH] machine_shutdown: Typo fix to actually allow specifying which cpu to reboot on
This appears to be a typo I introduced when cleaning
this code up earlier. Ooops.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:44 -07:00
Eric W. Biederman 4a1421f81b [PATCH] i386: Implement machine_emergency_reboot
set_cpus_allowed is not safe in interrupt context
and disabling apics is complicated code so don't
call machine_shutdown on i386 from emergency_restart().

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:42 -07:00
Eric W. Biederman 59586e5a26 [PATCH] Don't export machine_restart, machine_halt, or machine_power_off.
machine_restart, machine_halt and machine_power_off are machine
specific hooks deep into the reboot logic, that modules
have no business messing with.  Usually code should be calling
kernel_restart, kernel_halt, kernel_power_off, or
emergency_restart. So don't export machine_restart,
machine_halt, and machine_power_off so we can catch buggy users.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:42 -07:00
Linus Torvalds 72538d8565 Remove "noreplacement" kernel command line option.
It is no longer valid to not replace instructions, since we depend on
different behaviour depending on CPU capabilities.

If you need to limit the capabilities of the replacements (because the
boot CPU has features that non-boot CPU's do not have, for example), you
need to explicitly disable those capabilities that are not shared across
all CPU's.

For example, if your boot CPU has FXSR, but other CPU's in your system
do not, you need to use the "nofxsr" kernel command line, not disable
instruction replacement per se.
2005-07-22 18:29:40 -04:00
Linus Torvalds 8ed1383fb7 x86: make restore_fpu() use alternative assembler instructions
It's really just a single instruction, conditional on whether the CPU
supports FXSR or not, so implement it as such instead of making it a
function that queries FXSR dynamically.

This means that the instruction just gets automatically rewritten to the
correct one at boot-time.
2005-07-22 16:06:16 -04:00
Linus Torvalds b339a18b81 Fix up incorrect "unlikely()" on %gs reload in x86 __switch_to
These days %gs is normally the TLS segment, so it's no longer zero.  As
a result, we shouldn't just assume that %fs/%gs tend to be zero
together, but test them independently instead.

Also, fix setting of debug registers to use the "next" pointer instead
of "current".  It so happens that the scheduler will have set the new
current pointer before calling __switch_to(), but that's just an
implementation detail.
2005-07-22 15:23:47 -04:00
Robert Love 0eeca28300 [PATCH] inotify
inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

        * dnotify requires the opening of one fd per each directory
          that you intend to watch. This quickly results in too many
          open files and pins removable media, preventing unmount.
        * dnotify is directory-based. You only learn about changes to
          directories. Sure, a change to a file in a directory affects
          the directory, but you are then forced to keep a cache of
          stat structures.
        * dnotify's interface to user-space is awful.  Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

        * inotify's interface is a system call that returns a fd, not SIGIO.
	  You get a single fd, which is select()-able.
        * inotify has an event that says "the filesystem that the item
          you were watching is on was unmounted."
        * inotify can watch directories or files.

Inotify is currently used by Beagle (a desktop search infrastructure),
Gamin (a FAM replacement), and other projects.

See Documentation/filesystems/inotify.txt.

Signed-off-by: Robert Love <rml@novell.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 20:38:38 -07:00
Len Brown 5028770a42 [ACPI] merge acpi-2.6.12 branch into latest Linux 2.6.13-rc...
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12 17:21:56 -04:00
Venkatesh Pallipadi 02df8b9385 [ACPI] enable C2 and C3 idle power states on SMP
http://bugzilla.kernel.org/show_bug.cgi?id=4401

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12 00:14:36 -04:00
Nickolai Zeldovich 9d9437759e [ACPI] S3 resume -- use lgdtl, not lgdt
From: Nickolai Zeldovich <kolya@MIT.EDU>
Signed-off-by: Len Brown <len.brown@intel.com>
2005-07-12 00:04:31 -04:00
Christoph Lameter 6c036527a6 [PATCH] mostly_read data section
Add a new section called ".data.read_mostly" for data items that are read
frequently and rarely written to like cpumaps etc.

If these maps are placed in the .data section then these frequenly read
items may end up in cachelines with data is is frequently updated.  In that
case all processors in an SMP system must needlessly reload the cachelines
again and again containing elements of those frequently used variables.

The ability to share these cachelines will allow each cpu in an SMP system
to keep local copies of those shared cachelines thereby optimizing
performance.

Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
Signed-off-by: Shobhit Dayal <shobhit@calsoftinc.com>
Signed-off-by: Christoph Lameter <christoph@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:46 -07:00
Shaohua Li 3b520b238e [PATCH] MTRR suspend/resume cleanup
There has been some discuss about solving the SMP MTRR suspend/resume
breakage, but I didn't find a patch for it.  This is an intent for it.  The
basic idea is moving mtrr initializing into cpu_identify for all APs (so it
works for cpu hotplug).  For BP, restore_processor_state is responsible for
restoring MTRR.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:23:42 -07:00
Rusty Lynch 6772926bef [PATCH] kprobes: fix namespace problem and sparc64 build
The following renames arch_init, a kprobes function for performing any
architecture specific initialization, to arch_init_kprobes in order to
cleanup the namespace.

Also, this patch adds arch_init_kprobes to sparc64 to fix the sparc64 kprobes
build from the last return probe patch.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-05 19:19:00 -07:00
Greg Kroah-Hartman 7586585897 [PATCH] PCI: clean up dynamic pci id logic
The dynamic pci id logic has been bothering me for a while, and now that
I started to look into how to move some of this to the driver core, I
thought it was time to clean it all up.

It ends up making the code smaller, and easier to follow, and fixes a
few bugs at the same time (dynamic ids were not being matched
everywhere, and so could be missed on some call paths for new devices,
semaphore not needed to be grabbed when adding a new id and calling the
driver core, etc.)

I also renamed the function pci_match_device() to pci_match_id() as
that's what it really does.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-07-01 13:35:50 -07:00
Ingo Molnar 306e440daf [PATCH] x86: i8253/i8259A lock cleanup
Introduce proper declarations for i8253_lock and i8259A_lock.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-30 08:45:10 -07:00
Greg KH 8644d2a42b Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2005-06-27 22:07:56 -07:00
Greg Kroah-Hartman 545493917d [PATCH] PCI: add proper MCFG table parsing to ACPI core.
This patch is the first step in properly handling the MCFG PCI table.
It defines the structures properly, and saves off the table so that the
pci mmconfig code can access it.  It moves the parsing of the table a
little later in the boot process, but still before the information is
needed.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27 21:52:47 -07:00
Kenji Kaneshige b1bb248a5d [PATCH] ACPI based I/O APIC hot-plug: add interfaces
This patch adds the following new interfaces for I/O xAPIC
hotplug. The implementation of these interfaces depends on each
architecture.

    o int acpi_register_ioapic(acpi_handle handle, u64 phys_addr,
			       u32 gsi_base);

        This new interface is to add a new I/O xAPIC specified by
        phys_addr and gsi_base pair. phys_addr is the physical address
        to which the I/O xAPIC is mapped and gsi_base is global system
        interrupt base of the I/O xAPIC. acpi_register_ioapic returns
        0 on success, or negative value on error.

    o int acpi_unregister_ioapic(acpi_handle handle, u32 gsi_base);

        This new interface is to remove a I/O xAPIC specified by
        gsi_base. acpi_unregister_ioapic returns 0 on success, or
        negative value on error.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-27 21:52:44 -07:00
Rusty Lynch 4bdbd37f6d [PATCH] Return probe redesign: i386 specific changes
The following patch contains the i386 specific changes for the new
return probe design.  Changes include:

 * Removing the architecture specific functions for querying a return probe
   instance off a stack address
 * Complete rework onf arch_prepare_kretprobe() and trampoline_probe_handler()
 * Removing trampoline_post_handler()
 * Adding arch_init() so that now we handle registering the return probe
   trampoline instead of kernel/kprobes.c doing it

Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:23:53 -07:00
Andrea Arcangeli ffaa8bd6c9 [PATCH] seccomp: tsc disable
I believe at least for seccomp it's worth to turn off the tsc, not just for
HT but for the L2 cache too.  So it's up to you, either you turn it off
completely (which isn't very nice IMHO) or I recommend to apply this below
patch.

This has been tested successfully on x86-64 against current cogito
repository (i686 compiles so I didn't bother testing ;).  People selling
the cpu through cpushare may appreciate this bit for a peace of mind.

There's no way to get any timing info anymore with this applied
(gettimeofday is forbidden of course).  The seccomp environment is
completely deterministic so it can't be allowed to get timing info, it has
to be deterministic so in the future I can enable a computing mode that
does a parallel computing for each task with server side transparent
checkpointing and verification that the output is the same from all the 2/3
seller computers for each task, without the buyer even noticing (for now
the verification is left to the buyer client side and there's no
checkpointing, since that would require more kernel changes to track the
dirty bits but it'll be easy to extend once the basic mode is finished).

Eliminating a cold-cache read of the cr4 global variable will save one
cacheline during the tlb flush while making the code per-cpu-safe at the
same time.  Thanks to Mikael Pettersson for noticing the tlb flush wasn't
per-cpu-safe.

The global tlb flush can run from irq (IPI calling do_flush_tlb_all) but
it'll be transparent to the switch_to code since the IPI won't make any
change to the cr4 contents from the point of view of the interrupted code
and since it's now all per-cpu stuff, it will not race.  So no need to
disable irqs in switch_to slow path.

Signed-off-by: Andrea Arcangeli <andrea@cpushare.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 15:11:44 -07:00
Jens Axboe 22e2c507c3 [PATCH] Update cfq io scheduler to time sliced design
This updates the CFQ io scheduler to the new time sliced design (cfq
v3).  It provides full process fairness, while giving excellent
aggregate system throughput even for many competing processes.  It
supports io priorities, either inherited from the cpu nice value or set
directly with the ioprio_get/set syscalls.  The latter closely mimic
set/getpriority.

This import is based on my latest from -mm.

Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-27 14:33:29 -07:00