Leetcode 433 Minimum Genetic Mutation

Minimum Genetic Mutation – LeetCode

A gene string can be represented by an 8-character long string, with choices from ‘A’, ‘C’, ‘G’, and ‘T’. Given the two gene strings startGene and endGene and the gene bank bank, return the minimum number of mutations needed to mutate from startGene to endGene. If there is no such a mutation, return -1.

To address this problem, we can use either Breadth-First Search (BFS) since it’s more suitable for finding the shortest path in an unweighted graph, similar to determining the minimum number of mutations required.

The core idea is to represent each gene string as a node in a graph, with each valid mutation forming an edge between two nodes. By employing BFS, we can explore the graph level by level, starting from the initial gene and progressing towards the target gene. BFS is particularly effective in this context as it identifies the shortest path (i.e., the fewest mutations) from the start node to the end node.

Here are the core steps:

Initialize a set from the bank to quickly verify if a mutation is valid.
Start with a queue containing the initial gene and a mutation count of 0.
Define mutation possibilities for each character using a dictionary for the ease of writing code.
- e.g {A: ['C', 'T', 'G']; C: ['A', 'T', 'G'] ....}
Dequeue an element and iterate over its characters, changing one character at a time based on the mutation possibilities.
If a new mutation is valid (present in the bank), enqueue it with the mutation count increased by one.
If the end gene is reached, return the mutation count.
If the queue is exhausted without reaching the end gene, return -1.

Python:

class Solution(object):
    def minMutation(self, startGene, endGene, bank):
        # For quick lookup and easy delete
        gene_set = set(bank)
        queue = deque([{'gene': startGene, 'step': 0}])
        
        # For easy iteration to generate mutation from the current letter
        mutation_map = {
            'A': ['C', 'G', 'T'],
            'C': ['A', 'G', 'T'],
            'G': ['C', 'A', 'T'],
            'T': ['C', 'G', 'A']
        }
        
        while queue:
            current = queue.popleft()
            gene = current['gene']
            step = current['step']
            
            if gene == endGene:
                return step
            
            for i in range(len(gene)):
                for c in mutation_map[gene[i]]:
                    new_gene_array = list(gene)
                    new_gene_array[i] = c
                    new_gene = ''.join(new_gene_array)
                    
                    if new_gene in gene_set:
                        queue.append({'gene': new_gene, 'step': step + 1})
                        gene_set.remove(new_gene)
        
        return -1

Java:

class Solution {

    private static class GeneStep {
        String gene;
        int step;
        
        GeneStep(String gene, int step) {
            this.gene = gene;
            this.step = step;
        }
    }

    public int minMutation(String startGene, String endGene, String[] bank) {
        // For quick lookup and easy delete
        Set<String> geneSet = new HashSet<>(Arrays.asList(bank));
        Queue<GeneStep> queue = new LinkedList<>();
        queue.offer(new GeneStep(startGene, 0));
        
        // For easy iteration to generate mutation from the current letter
        Map<Character, char[]> mutationMap = new HashMap<>();
        mutationMap.put('A', new char[]{'C', 'G', 'T'});
        mutationMap.put('C', new char[]{'A', 'G', 'T'});
        mutationMap.put('G', new char[]{'C', 'A', 'T'});
        mutationMap.put('T', new char[]{'C', 'G', 'A'});
        
        while (!queue.isEmpty()) {
            GeneStep current = queue.poll();
            String gene = current.gene;
            int step = current.step;
            
            if (gene.equals(endGene)) {
                return step;
            }
            
            for (int i = 0; i < gene.length(); i++) {
                char[] newGeneArray = gene.toCharArray();
                for (char c : mutationMap.get(gene.charAt(i))) {
                    newGeneArray[i] = c;
                    String newGene = new String(newGeneArray);
                    
                    if (geneSet.contains(newGene)) {
                        queue.offer(new GeneStep(newGene, step + 1));
                        geneSet.remove(newGene);
                    }
                }
            }
        }
        
        return -1;
    }
}

Javascript:

/**
 * @param {string} startGene
 * @param {string} endGene
 * @param {string[]} bank
 * @return {number}
 */
var minMutation = function(startGene, endGene, bank) {
    // for quick lookup and easy delete
    let set = new Set(bank);
    let queue = [{gene: startGene, step: 0}];

    // for easy iteration to generate mutation from the current letter
    let map = {
        A: ['C', 'G', 'T'],
        C: ['A', 'G', 'T'],
        G: ['C', 'A', 'T'],
        T: ['C', 'G', 'A']
    }
    
    while(queue.length) {
        let {gene, step} = queue.shift();
        if(gene === endGene) {
            return step
        }

        for(let i = 0; i < gene.length; i++) {
            map[gene[i]].forEach(c => {
                let newGeneArray = gene.split("")
                newGeneArray[i] = c
                let newGene = newGeneArray.join("")
                if(set.has(newGene)) {
                    queue.push({gene: newGene, step: step + 1});
                    set.delete(newGene)
                }
            })
        }
    }

    return -1
};

Time Complexity:
To analyze time complexity, let’s look at the components involved:

The BFS traversal: The algorithm uses a queue to perform a breadth-first search.
Checking each character and trying all possible mutations: For each gene sequence, the algorithm checks each of its N characters (where N is the length of the gene sequence, typically 8 in the problem constraint) and attempts to mutate it to all other possible characters (3 possibilities per character, since there are 4 possible nucleotides).
Checking and removing gene sequences from the set: Upon each successful mutation, it checks if the new sequence is in the set and removes it if present.

Considering that there are M gene sequences in the bank:

In the worst case, the algorithm might have to visit all M sequences in the bank.
For each sequence, we perform N character checks and 3 possible mutations.
Each mutation check involves an O(1) set containment check and O(1) removal operation (since it’s a set).

Thus, the total time complexity can be estimated as O(M * N * 3), which simplifies to O(M * N) as constants can be ignored in Big O notation.

Space Complexity:
The space complexity is determined by the queue and the set, both of which may hold up to M gene sequences. Hence, the space complexity is O(M). Where M is the length of the bank.

2 thoughts on “Leetcode 433 Minimum Genetic Mutation”

John E. Snyder June 23, 2024 at 2:02 am


Hello there! This blog post couldn’t be written much better!
Looking at this post reminds me of my previous roommate!

He constantly kept preaching about this. I most certainly will forward this information to him.
Pretty sure he’s going to have a great read. Thanks
for sharing!

Feel free to surf to my homepage; John E. Snyder
1. leetcode-solution.com June 23, 2024 at 2:25 am
  
  
  Thanks for the kind words and really glad it helped!

Leetcode 433 Minimum Genetic Mutation

Leetcode 433 Minimum Genetic Mutation

2 thoughts on “Leetcode 433 Minimum Genetic Mutation”

Leave a Reply Cancel reply