Use Backtracking to print all Subsets

  • integers • code • generator • algorithm • personal • it • set • english • print • subsets • backtracking
  • 743 words

I'm "studying" this algorithmic technique: Backtracking. This is an algorithmic approach to problem solution that I trust every good Computer Scientist already uses; but taking a good theo-practical look at it is something better.

A Backtracking Tree A Backtracking Tree

I believe you can find enough informations about it online (the 2 links I provided are more than enough), so I'll go straight to the problem.

Problem definition

Given an integer n, and the set S = { 1 ... n }, calculate (print) all the possible subsets of S. For example, for n = 1, all the possible subsets are { 1 } and { } (empty set). For n = 2, all the possible subsets are: { 1 2 } { 1 } { 2 } { }. In general, for the set made of the first n Integers, the number of possible subsets is 2n.

Approaching the problem

A way to describe a possible subset is an array of n elements, one for every integers; every element in the array will have value TRUE if the correspondent integers is part of the subset, FALSE otherwise.

Why the Backtracking then? Because the backtracking technique is designed to generate every possible "candidate solution" once. If we design the algorithm smartly, we can get the backtracking logic to work for us and generate all the possible subsets.

Are you are asking yourself: "isn't this a bit of a stretching of the backtracking approach?". I believe it is: the code could be made way smaller, even though it would have the same complexity. We are going to execute the backtracking, designing the algorithm so it will never stop until it tried every possible solution. No "intermediate stopping condition" then.

Solution

Give a good look at the void backtrack(int *currsol, int currsol_size, int input): it works as a "skeleton" for generic backtracking algorithms. I'm going to use it in the future as well: I'm going to have fun to implement a Sudoku solver ASAP (probably this weekend). Why? Because I suck at Sudoku: never got into it very much to learn how to play :-P.

Anyway, here is the code

#include <stdio.h>
#include <stdlib.h>

// Fake boolean values
#define TRUE    (1)
#define FALSE   (0)

// Candidates used at every recursive call
#define MAX_CANDIDATES 2

// Define the "bool" type using "short int"
typedef short int bool;
static bool finished = FALSE;

bool reject(int* curr_sol, int curr_sol_size, int input) {
    return FALSE; // EMPTY for now
}

bool accept(int* curr_sol, int curr_sol_size, int input) {
    return (curr_sol_size == input);
}

void output(int* curr_sol, int curr_sol_size, int input) {
    static int i;
    printf("[ ");
    for ( i = 1; i <= curr_sol_size; ++i ) {
        if ( curr_sol[i] == TRUE ) printf("%d ", i);
    }
    printf("]\n");
}

void extend_solution(int* curr_sol, int curr_sol_size, int input, int* candidates, int* num_candidates) {
    // Only 2 possibilities: the element is take or not taken
    candidates[0] = TRUE;
    candidates[1] = FALSE;
    *num_candidates = 2;
}

void try(int *curr_sol, int curr_sol_size, int input) {
    // EMPTY for now
}

void revert(int *curr_sol, int curr_sol_size, int input) {
    // EMPTY for now
}

void backtrack(int *curr_sol, int curr_sol_size, int input) {
    int candidates[MAX_CANDIDATES];
    int num_candidates;
    int i;

    if ( reject(curr_sol, curr_sol_size, input) ) {
        return; // Not worth completing
    }

    if ( accept(curr_sol, curr_sol_size, input) ) {
        output(curr_sol, curr_sol_size, input); // Found! Print it
    } else {
        ++curr_sol_size; // Increase solution size

        // Generate solution extension
        extend_solution(curr_sol, curr_sol_size, input, candidates, &num_candidates);
        // Try every candidates just generated
        for ( i = 0; i < num_candidates; ++i ) {
            curr_sol[curr_sol_size] = candidates[i];

            try(curr_sol, curr_sol_size, input);

            backtrack(curr_sol, curr_sol_size, input);

            revert(curr_sol, curr_sol_size, input);

            if ( finished ) return; // Early termination
        }
    }
}

void print_all_subset_from_1_to(int n) {
    int *sol;
    sol = malloc(n * sizeof(int));
    backtrack(sol, 0, n);
    free (sol);
}

int main(int argc, char** argv) {
    int input;
    if ( argc == 2 ) input = atoi(argv[1]); else return (EXIT_FAILURE);

    print_all_subset_from_1_to(input);
}

Complexity

Time Complexity? Actually, I believe we are talking about a very complex algorithm here: there are 2n different subsets here, but to generate everyone of them, the algorithm has to go deep down in every branch of the Backtracking tree. And, given that the depth of a branch is exactly n (because is when the printing actually happens), I believe here we have a O(n * 2^n) complexity; pretty big. But I could be wrong, so if any Big-O notation expert is out there, please advice.

comments powered by Disqus