Strong APA Low-Level Design in RTEMS

We talked about the high-level design in the last post. Here we’ll be talking about the low-level design, that is, the exact implementation of the algorithm from the paper.

  1. Get Lowest Scheduled

We’ll be looking at the code corresponding to each part of the algorithm one by one.

Part 1

Alpha(new) denotes the affinity set of the newly arrived task, which corresponds to filter_base (or rather to the Strong APA node of filter_base, since the latter stores the affinity in its structure).

  for ( cpu_index = 0 ; cpu_index < cpu_max ; ++cpu_index ) { 
    visited[ cpu_index ] = false;
    
    // Check whether this CPU is in the affinity set of the filter node
    if ( _Processor_mask_Is_set( &filter_node->affinity, cpu_index)) { 
      Per_CPU_Control *cpu = _Per_CPU_Get_by_index( cpu_index );
         
      if ( _Per_CPU_Is_processor_online( cpu ) ) {
        rear = rear + 1;
        queue[ rear ] = cpu;
        visited[ cpu_index ] = true;
        caller[ cpu_index ] = filter_base;
      }
    }
  }

Here queue is an array of Per_CPU_Control pointers with CONFIGURE_MAXIMUM_PROCESSORS entries, so we know it can never overflow.

Note that the initial values of front and rear are 0 and -1 respectively.

caller is an array of Scheduler_Node pointers, also of size CONFIGURE_MAXIMUM_PROCESSORS, with each entry corresponding to the Scheduler_Node that appointed the CPU at that index. Appoint here means that the CPU was added to the queue because of the Scheduler_Node stored at that CPU's index in caller. A Scheduler_Node adds to the queue every CPU that is in its affinity set and has not yet been visited. caller is used for backtracking near the end of the algorithm.

Part 2

The queue stores the unvisited CPUs. Chi(T)(cur) denotes the task currently running on a CPU. We use Per_CPU_Control->executing to get the executing thread on a CPU and then _Thread_Scheduler_get_home_node to get the Scheduler_Node corresponding to that thread.

We check that the task is not idle by checking the is_idle attribute of its thread. Adding an element to the queue, marking it visited, and saving its caller is the same as in Part 1 above.

 while( front <= rear ) {
    curr_CPU = queue[ front ]; 
    front = front + 1;
    
    curr_thread = curr_CPU->executing;
    curr_node = _Thread_Scheduler_get_home_node( curr_thread );
  
    curr_priority = _Scheduler_Node_get_priority( curr_node );
    curr_priority = SCHEDULER_PRIORITY_PURIFY( curr_priority ); 
    
    curr_strong_node = _Scheduler_strong_APA_Node_downcast( curr_node );  
      
    if ( curr_priority > max_priority_num ) {
      ret = curr_node;
      max_priority_num = curr_priority;

      if ( curr_priority > filter_priority ) {
        cpu_to_preempt = curr_CPU;
      }
    }
    
    if ( !curr_thread->is_idle ) {
      for ( cpu_index = 0 ; cpu_index < cpu_max ; ++cpu_index ) {
        // Check whether this CPU is in the affinity set of curr_node
        if ( _Processor_mask_Is_set( &curr_strong_node->affinity, cpu_index ) ) {
          Per_CPU_Control *cpu = _Per_CPU_Get_by_index( cpu_index );
          if ( _Per_CPU_Is_processor_online( cpu ) && visited[ cpu_index ] == false ) {
            rear = rear + 1;
            queue[ rear ] = cpu;
            visited[ cpu_index ] = true;
            caller[ cpu_index ] = curr_node;
          }
        }  
      }
    }
  }

Here ret is the final node with the lowest priority (largest priority number) that we will return. Note that since max_priority_num only increases each time the conditional expression executes, we end up with the right ret (the lowest-priority node) and the right cpu_to_preempt.

The conditional expression guarding the assignment to cpu_to_preempt:

 if ( curr_priority > filter_priority ) {
   cpu_to_preempt = curr_CPU;
 }

is important because there exist scenarios where every node we visit has a higher priority (smaller priority number) than our filter_node (the node to schedule). In that case we would still return ret, the minimum-priority node encountered, but there would be no preemption and filter_node would remain unassigned.

Part 3

This is the most important part of the algorithm as this is where the real “shifting” occurs.

  if ( ret->Priority.value > filter_priority ) {
    //Backtrack on the path from
    //_Thread_Get_CPU(ret->user) to ret, shifting along every task
    
    curr_node = caller[ _Per_CPU_Get_index(cpu_to_preempt) ];
    curr_cpu = _Thread_Get_CPU( curr_node->user );
    
    curr_node = caller [ _Per_CPU_Get_index( curr_cpu ) ];
    curr_cpu = _Thread_Get_CPU( curr_node->user );
    
    do {
      next_cpu = _Thread_Get_CPU( curr_node->user );

      _Scheduler_SMP_Preempt(
        context,
        curr_node,
        _Thread_Scheduler_get_home_node( curr_cpu->executing ),
        _Scheduler_strong_APA_Allocate_processor
      );

      curr_cpu = next_cpu;
      curr_node = caller[ _Per_CPU_Get_index( curr_cpu ) ];

    } while ( curr_node != filter_base );
      
    _Scheduler_SMP_Preempt(
      context,
      curr_node,
      _Thread_Scheduler_get_home_node( curr_cpu->executing ),
      _Scheduler_strong_APA_Allocate_processor
    );
      
    filter_base = caller[ _Per_CPU_Get_index(cpu_to_preempt) ];
  }
  return ret;

The first conditional expression makes sure that we shift tasks (eventually preempting one of them from its CPU) only if the lowest-priority task (which would get preempted) has a lower priority (higher priority number) than the filter_node: the node that we are trying to schedule.

Note that ret is executing on the cpu_to_preempt here.

We start backtracking not from cpu_to_preempt, but from the CPU before it, that is, the CPU of the caller node of cpu_to_preempt. We do this because the code that calls get_lowest_scheduled from schedulersmpimpl.h (here and here) preempts the ret node with the filter_base after get_lowest_scheduled returns. So, to create an equivalent effect, we replace filter_base with the caller of cpu_to_preempt, keep the ret node the same, and meanwhile shift all the tasks before it.

In hindsight, we shift all the tasks except the last one, and we return the last task as filter_base to let schedulersmpimpl.h do the job of preempting it. We do this by starting with the Scheduler_Node caller of the CPU on which the Scheduler_Node caller of cpu_to_preempt is executing. Doing this shifts the start of backtracking one CPU behind cpu_to_preempt.

We save next_cpu because when curr_cpu gets preempted so that curr_node can be scheduled on it, the CPU of curr_node changes; saving next_cpu lets us keep following the chain.

Also, note that at any time curr_node is the Scheduler_Node that is about to preempt curr_cpu.

After the while loop ends, we preempt the last task and CPU, which means the filter_node preempts the CPU that it needs to go to, so the original Scheduler_Node filter_base that we started with gets scheduled as well.

Please find the PR corresponding to these changes at https://github.com/richidubey/rtems/pull/6. Referring to the entire code from schedulerstrongapa.c, schedulerstrongapa.h and scheduler.h might help put everything in proper context.

This is a little confusing, and I hope my explanation made things clear.

Thanks!


To learn more about the Get_highest_ready function definition of the Strong APA scheduler, check out this post.
