    libsymbols kallsyms: Parse using io api · 53df2b93
    Ian Rogers authored
    
    
    'perf record' will call kallsyms__parse 4 times during startup and
    process megabytes of data. This changes kallsyms__parse to use the io
    library rather than fgets to improve performance of the user code by
    over 8%.
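
    To illustrate why a buffered reader can beat fgets() here, the sketch
    below reads a file descriptor in 4 KiB chunks with read(2) and parses
    each "<hex address> <type> <name>" record byte by byte, with no
    intermediate line copy. It is a minimal sketch and not perf's actual io
    API: struct reader, reader_init(), reader_getc(), reader_get_hex() and
    parse_symbol() are hypothetical names chosen for this example, and
    error handling (for instance EINTR) is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical buffered reader for illustration; not perf's io API. */
struct reader {
	int fd;
	char buf[4096];
	size_t pos, len;
	bool eof;
};

static void reader_init(struct reader *r, int fd)
{
	r->fd = fd;
	r->pos = r->len = 0;
	r->eof = false;
}

/* Return the next byte, or -1 at EOF/error; refill the buffer via read(2). */
static int reader_getc(struct reader *r)
{
	if (r->pos == r->len) {
		ssize_t n;

		if (r->eof)
			return -1;
		n = read(r->fd, r->buf, sizeof(r->buf));
		if (n <= 0) {
			r->eof = true;
			return -1;
		}
		r->pos = 0;
		r->len = (size_t)n;
	}
	return (unsigned char)r->buf[r->pos++];
}

/*
 * Accumulate hex digits into *val. Returns the first non-hex character
 * read (-1 if that was EOF), or -2 if no hex digit was seen at all.
 */
static int reader_get_hex(struct reader *r, uint64_t *val)
{
	bool any = false;

	*val = 0;
	for (;;) {
		int ch = reader_getc(r), digit;

		if (ch >= '0' && ch <= '9')
			digit = ch - '0';
		else if (ch >= 'a' && ch <= 'f')
			digit = ch - 'a' + 10;
		else if (ch >= 'A' && ch <= 'F')
			digit = ch - 'A' + 10;
		else
			return any ? ch : -2;
		*val = (*val << 4) | (uint64_t)digit;
		any = true;
	}
}

/*
 * Parse one "<hex addr> <type> <name>" record and skip the rest of the
 * line (e.g. a trailing "[module]" field). Returns 1 on success and 0 at
 * EOF or on a malformed record. name_sz must be at least 1; longer names
 * are truncated.
 */
static int parse_symbol(struct reader *r, uint64_t *addr, char *type,
			char *name, size_t name_sz)
{
	size_t i = 0;
	int ch = reader_get_hex(r, addr);

	if (ch != ' ')
		return 0;
	ch = reader_getc(r);
	if (ch < 0)
		return 0;
	*type = (char)ch;
	if (reader_getc(r) != ' ')
		return 0;
	while ((ch = reader_getc(r)) >= 0 && ch != '\n' && ch != '\t' && ch != ' ') {
		if (i + 1 < name_sz)
			name[i++] = (char)ch;
	}
	name[i] = '\0';
	while (ch >= 0 && ch != '\n')
		ch = reader_getc(r);
	return 1;
}
```

    A caller would open the kallsyms file, call reader_init() on the
    descriptor, and loop on parse_symbol() until it returns 0. The hex
    value is built while the bytes are scanned, so each byte is visited
    once rather than first copied into a line buffer and then re-parsed.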
    
    Before:
    
      Running 'internals/kallsyms-parse' benchmark:
      Average kallsyms__parse took: 103.988 ms (+- 0.203 ms)
    
    After:
    
      Running 'internals/kallsyms-parse' benchmark:
      Average kallsyms__parse took: 95.571 ms (+- 0.006 ms)
    
    For a workload like:
    
      $ perf record /bin/true

    Run under 'perf record -e cycles:u -g' the time goes from:

      Before
      30.10%     1.67%  perf     perf                [.] kallsyms__parse
      After
      25.55%    20.04%  perf     perf                [.] kallsyms__parse
    
    So a little under 5% of the start-up time is removed. A lot of what
    remains is on the kernel side, but caching kallsyms within perf would at
    least impact memory footprint.
    
    Committer notes:
    
    The internals/kallsyms-parse bench is run using:
    
      [root@five ~]# perf bench internals kallsyms-parse
      # Running 'internals/kallsyms-parse' benchmark:
        Average kallsyms__parse took: 80.381 ms (+- 0.115 ms)
      [root@five ~]#
    
    And this pre-existing test uses these routines to parse kallsyms and
    then compare with the info obtained from the matching ELF symtab:
    
      [root@five ~]# perf test vmlinux
       1: vmlinux symtab matches kallsyms                       : Ok
      [root@five ~]#
    
    Also, we can't remove hex2u64() in this patch, as doing so breaks the build:
    
      /usr/bin/ld: /tmp/build/perf/perf-in.o: in function `modules__parse':
      /home/acme/git/perf/tools/perf/util/symbol.c:607: undefined reference to `hex2u64'
      /usr/bin/ld: /home/acme/git/perf/tools/perf/util/symbol.c:607: undefined reference to `hex2u64'
      /usr/bin/ld: /tmp/build/perf/perf-in.o: in function `dso__load_perf_map':
      /home/acme/git/perf/tools/perf/util/symbol.c:1477: undefined reference to `hex2u64'
      /usr/bin/ld: /home/acme/git/perf/tools/perf/util/symbol.c:1483: undefined reference to `hex2u64'
      collect2: error: ld returned 1 exit status
    
    So leave it in place for now and move it in the next patch.
    
    Signed-off-by: Ian Rogers <irogers@google.com>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lore.kernel.org/lkml/20200501221315.54715-3-irogers@google.com
    
    
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>