We greatly benefited from the help of the perf mailing list . Here is a condensed summary. > > I'm writing a tool to parse the perf.data output of perf and was wondering about the source of the > timestamps found in perf_sample events? Specifically I mean the u64 time field in the perf_sample struct > in util/event.h. I believe it is in nanoseconds from a particular event (system start?), but I can't find > any documentation about where the time comes from. > > > > I'm also interested in reading from this time source from a userspace program, is that possible? > Code wise, you'll need to follow perf_clock(). I believe for x86 with a > stable TSC you end up in native_sched_clock() in arch/x86/kernel/tsc.c. > As for the corresponding userspace call I believe it is > clock_gettime(CLOCK_MONOTONIC, ...). But later it does not prove to be the case: > > Now our main problem is relating the perf timestamps > > to a public clock API (like posix_gettime, etc.) in order > > to chronologically merge the perf log with our external event logs. > > We've so far determined experimentally that even the > > clock_gettime(CLOCK_MONOTONIC, ...) values are not identical > > to the perf timestamps at the same places, though they are quite close. > > We're working around right now by syncing with a timestamp > > of a known event, but that leaves us vulnerable to a time drift. > > If there's no way currently to generate externally meaningful > > timestamps, could we file a feature request for the sake of the future? > You mean like this: https://lkml.org/lkml/2011/6/7/638? > I'm still waiting for certain changes to core perf before I try again to get it committed. I haven't forgotten; it's just not ready yet. > And using the time-of-day patch against a newer kernel (tried 3.2, 3.4 and 3.6) I do see the offset you mentioned. Another issue: > > Are we missing anything? Is the "--timestamp" option of perf-record > > related in any way? Any other tips? > That option adds the timestamps to the samples if it were not otherwise added. Yet another: > > Could someone clarify the behaviour of the -p PID flag of perf-record? > > In our experiments (3.2.0-27 #43-Ubuntu SMP x86_64), > > it ignores events on threads spawned after perf-record is started. > > Is this the intended behaviour, and is there any work-around? > > Generally, what is the best (vs CPU/IO, distortion of profiling results) > > way to record only events of a given OS process, with all its threads? > > The version of perf-record that we use already contains the separate > > -p PID and -t TID options, so the initial PID patch is already in. > > It's just that the -p PID option ignores all threads spawned > > after the process is started, AFAICT. > AFAIK perf events aren't inherited when -p, -t or -u option is used due > to a performance reason. But if you start the process on the command > line with perf record, it'll be inherited and counted properly. > I guess your problem might be solved by using a cgroup: > mkdir ${cgroup_root}/mygroup > echo NNNN > ${cgroup_root}/mygroup/tasks > perf record -a -e cycles -G mygroup Another answer: > I see the problem now. The perf tool gets FORK events as threads are created, > but it does not respond to them and create new counters. Some refactoring is needed to make that happen.