zephyr/scripts/process_gperf.py

163 lines
4.5 KiB
Python
Raw Normal View History

kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
#!/usr/bin/env python3
#
# Copyright (c) 2017 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
import sys
import argparse
import os
import re
from distutils.version import LooseVersion
# --- debug stuff ---
"""
gperf C file post-processor
We use gperf to build up a perfect hashtable of pointer values. The way gperf
does this is to create a table 'wordlist' indexed by a string repreesentation
of a pointer address, and then doing memcmp() on a string passed in for
comparison
We are exclusively working with 4-byte pointer values. This script adjusts
the generated code so that we work with pointers directly and not strings.
This saves a considerable amount of space.
"""
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
def debug(text):
if not args.verbose:
return
sys.stdout.write(os.path.basename(sys.argv[0]) + ": " + text + "\n")
def error(text):
sys.stderr.write(os.path.basename(sys.argv[0]) + " ERROR: " + text + "\n")
sys.exit(1)
def warn(text):
sys.stdout.write(
os.path.basename(
sys.argv[0]) +
" WARNING: " +
text +
"\n")
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
def reformat_str(match_obj):
addr_str = match_obj.group(0)
# Nip quotes
addr_str = addr_str[1:-1]
addr_vals = [0, 0, 0, 0]
ctr = 3
i = 0
while (True):
if i >= len(addr_str):
break
if addr_str[i] == "\\":
if addr_str[i + 1].isdigit():
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
# Octal escape sequence
val_str = addr_str[i + 1:i + 4]
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
addr_vals[ctr] = int(val_str, 8)
i += 4
else:
# Char value that had to be escaped by C string rules
addr_vals[ctr] = ord(addr_str[i + 1])
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
i += 2
else:
addr_vals[ctr] = ord(addr_str[i])
i += 1
ctr -= 1
return "(char *)0x%02x%02x%02x%02x" % tuple(addr_vals)
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
def process_line(line, fp):
if line.startswith("#"):
fp.write(line)
return
# Set the lookup function to static inline so it gets rolled into
# _k_object_find(), nothing else will use it
if re.search(args.pattern + " [*]$", line):
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
fp.write("static inline " + line)
return
m = re.search("gperf version (.*) [*][/]$", line)
if m:
v = LooseVersion(m.groups()[0])
v_lo = LooseVersion("3.0")
v_hi = LooseVersion("3.1")
if (v < v_lo or v > v_hi):
warn("gperf %s is not tested, versions %s through %s supported" %
(v, v_lo, v_hi))
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
# Replace length lookups with constant len of 4 since we're always
# looking at pointers
line = re.sub(r'lengthtable[[]key[]]', r'4', line)
# Empty wordlist entries to have NULLs instead of ""
line = re.sub(r'[{]["]["][}]', r'{}', line)
# Suppress a compiler warning since this table is no longer necessary
line = re.sub(r'static unsigned char lengthtable',
r'static unsigned char __unused lengthtable', line)
# drop all use of register keyword, let compiler figure that out,
# we have to do this since we change stuff to take the address of some
# parameters
line = re.sub(r'register', r'', line)
# Hashing the address of the string
line = re.sub(r"hash [(]str, len[)]",
r"hash((const char *)&str, len)", line)
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
# Just compare pointers directly instead of using memcmp
if re.search("if [(][*]str", line):
fp.write(" if (str == s)\n")
return
# Take the strings with the binary information for the pointer values,
# and just turn them into pointers
line = re.sub(r'["].*["]', reformat_str, line)
fp.write(line)
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
def parse_args():
global args
parser = argparse.ArgumentParser(
description=__doc__,
formatter_class=argparse.RawDescriptionHelpFormatter)
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
parser.add_argument("-i", "--input", required=True,
help="Input C file from gperf")
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
parser.add_argument("-o", "--output", required=True,
help="Output C file with processing done")
parser.add_argument("-p", "--pattern", required=True,
help="Search pattern for objects")
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
parser.add_argument("-v", "--verbose", action="store_true",
help="Print extra debugging information")
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
args = parser.parse_args()
if "VERBOSE" in os.environ:
args.verbose = 1
kernel: introduce object validation mechanism All system calls made from userspace which involve pointers to kernel objects (including device drivers) will need to have those pointers validated; userspace should never be able to crash the kernel by passing it garbage. The actual validation with _k_object_validate() will be in the system call receiver code, which doesn't exist yet. - CONFIG_USERSPACE introduced. We are somewhat far away from having an end-to-end implementation, but at least need a Kconfig symbol to guard the incoming code with. Formal documentation doesn't exist yet either, but will appear later down the road once the implementation is mostly finalized. - In the memory region for RAM, the data section has been moved last, past bss and noinit. This ensures that inserting generated tables with addresses of kernel objects does not change the addresses of those objects (which would make the table invalid) - The DWARF debug information in the generated ELF binary is parsed to fetch the locations of all kernel objects and pass this to gperf to create a perfect hash table of their memory addresses. - The generated gperf code doesn't know that we are exclusively working with memory addresses and uses memory inefficently. A post-processing script process_gperf.py adjusts the generated code before it is compiled to work with pointer values directly and not strings containing them. - _k_object_init() calls inserted into the init functions for the set of kernel object types we are going to support so far Issue: ZEP-2187 Signed-off-by: Andrew Boie <andrew.p.boie@intel.com>
2017-08-23 04:15:23 +08:00
def main():
parse_args()
with open(args.input, "r") as in_fp, open(args.output, "w") as out_fp:
for line in in_fp.readlines():
process_line(line, out_fp)
if __name__ == "__main__":
main()