I'm trying to create performance counters for each CPU in my computer and count the number of last-level-cache misses.
I'm doing this in order to change the scheduler so that it can schedule tasks according to their memory bandwidth consumption.
I have an Intel processor and Ubuntu with kernel 3.1.6.
To create the counters, I added a struct perf_event pointer and a struct perf_event_attr field to each run-queue and wrote the following code (sched.c):
Code:
/* Called when a counter overflows; for now it only logs the event. */
static void BW_overflow_handler(struct perf_event *event,
                                struct perf_sample_data *data,
                                struct pt_regs *regs)
{
    printk("BW_event_handler CPU%d event %p\n", smp_processor_id(), event);
}

/* Create an enabled, pinned, per-CPU hardware counter for `config` on `cpu`.
 * Returns the event, or NULL on failure. */
struct perf_event *my_perf_event_open(uint64_t config, int cpu,
                                      struct perf_event_attr *event_attr)
{
    struct perf_event *event;

    memset(event_attr, 0, sizeof(struct perf_event_attr));
    event_attr->type = PERF_TYPE_HARDWARE;
    event_attr->size = sizeof(struct perf_event_attr);
    event_attr->config = config;
    event_attr->disabled = 0;        /* start enabled */
    event_attr->exclude_kernel = 1;  /* count user-space only */
    event_attr->pinned = 1;
    event_attr->exclude_idle = 1;

    /* task == NULL: count whatever runs on this specific cpu
     * (only one task runs there at any given time). */
    event = perf_event_create_kernel_counter(event_attr, cpu, NULL,
                                             BW_overflow_handler, NULL);
    if (IS_ERR(event)) {
        /* PTR_ERR() recovers the negative errno encoded in the pointer */
        printk("IS_ERR returned true on cpu %d, error %ld\n",
               cpu, PTR_ERR(event));
        return NULL;
    }
    if (event->state != PERF_EVENT_STATE_ACTIVE) {
        perf_event_release_kernel(event);
        /* report the target cpu, not the cpu this code happens to run on */
        printk("failed to enable event on cpu %d\n", cpu);
        return NULL;
    }
    printk("an event was opened successfully.\n");
    return event;
}

/*** initialize counters for each run-queue ***/
for_each_possible_cpu(i) {
    struct rq *my_rq;

    my_rq = cpu_rq(i);
    raw_spin_lock(&my_rq->lock);

    /* count cache misses for cpu number i */
    my_rq->cache_misses_event =
        my_perf_event_open(PERF_COUNT_HW_CACHE_MISSES, i,
                           &(my_rq->cache_misses_event_attr));
    if (my_rq->cache_misses_event) {
        printk("cache_misses_event created successfully for cpu %d\n", i);
    } else {
        printk("cache_misses_event was not created for cpu %d\n", i);
    }

    /* count cpu cycles for cpu number i */
    my_rq->cycles_event =
        my_perf_event_open(PERF_COUNT_HW_CPU_CYCLES, i,
                           &(my_rq->cycles_event_attr));
    if (my_rq->cycles_event) {
        printk("cycles_event created successfully for cpu %d\n", i);
    } else {
        printk("cycles_event was not created for cpu %d\n", i);
    }

    raw_spin_unlock(&my_rq->lock);
}

I tried to run the initialization code at system startup, and via a system call.
The system call always fails, i.e. IS_ERR(event) always returns true.
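Since IS_ERR() only says that the call failed, the actual error code has to be pulled out of the returned pointer with PTR_ERR(). Here is a minimal userspace sketch of that pattern. The three macros are simplified copies of the helpers from the kernel's include/linux/err.h, and fake_create_counter() and open_counter() are hypothetical stand-ins made up for illustration (they are not kernel functions):

```c
#include <errno.h>
#include <stdio.h>

/* Simplified userspace copies of the kernel's err.h helpers.
 * Assumption: errors are encoded as pointers in the top 4095 addresses. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical stand-in for perf_event_create_kernel_counter():
 * returns a valid pointer on success or an encoded errno on failure. */
static void *fake_create_counter(int cpu)
{
    static int dummy_event;

    if (cpu < 0)
        return ERR_PTR(-EINVAL);
    return &dummy_event;
}

/* Returns 0 and stores the event in *out on success; otherwise stores
 * NULL and returns the negative errno that was hidden in the pointer. */
static long open_counter(int cpu, void **out)
{
    void *event = fake_create_counter(cpu);

    if (IS_ERR(event)) {
        *out = NULL;
        return PTR_ERR(event);
    }
    *out = event;
    return 0;
}
```

In the kernel code itself the equivalent would be to print PTR_ERR(event) with %ld inside the IS_ERR branch, so the log shows which errno the counter creation returned instead of only "IS_ERR returned true".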
If I run the code during system startup, it sometimes works and sometimes doesn't, with odds of about 1:3 (most of the time it fails). But when it works, it works perfectly, and I'm able to read the counters and reset them with no problem.
When it does not work, I get a black screen (attached) with the message:
BUG:unable to handle kernel NULL pointer dereference at 00000010
What am I doing wrong?
And why doesn't this message appear consistently?
Thanks a lot to anyone who can help!