Leetcode 433 Minimum Genetic Mutation
To address this problem, we can use either Breadth-First Search (BFS) since it’s more suitable for finding the shortest path in an unweighted graph, similar to determining the minimum number of mutations required.
The core idea is to represent each gene string as a node in a graph, with each valid mutation forming an edge between two nodes. By employing BFS, we can explore the graph level by level, starting from the initial gene and progressing towards the target gene. BFS is particularly effective in this context as it identifies the shortest path (i.e., the fewest mutations) from the start node to the end node.
Here are the core steps:
- Initialize a set from the bank to quickly verify if a mutation is valid.
- Start with a queue containing the initial gene and a mutation count of 0.
- Define mutation possibilities for each character using a dictionary for the ease of writing code.
- e.g
{A: ['C', 'T', 'G']; C: ['A', 'T', 'G'] ....}
- e.g
- Dequeue an element and iterate over its characters, changing one character at a time based on the mutation possibilities.
- If a new mutation is valid (present in the bank), enqueue it with the mutation count increased by one.
- If the end gene is reached, return the mutation count.
- If the queue is exhausted without reaching the end gene, return -1.
Python:
class Solution(object):
def minMutation(self, startGene, endGene, bank):
# For quick lookup and easy delete
gene_set = set(bank)
queue = deque([{'gene': startGene, 'step': 0}])
# For easy iteration to generate mutation from the current letter
mutation_map = {
'A': ['C', 'G', 'T'],
'C': ['A', 'G', 'T'],
'G': ['C', 'A', 'T'],
'T': ['C', 'G', 'A']
}
while queue:
current = queue.popleft()
gene = current['gene']
step = current['step']
if gene == endGene:
return step
for i in range(len(gene)):
for c in mutation_map[gene[i]]:
new_gene_array = list(gene)
new_gene_array[i] = c
new_gene = ''.join(new_gene_array)
if new_gene in gene_set:
queue.append({'gene': new_gene, 'step': step + 1})
gene_set.remove(new_gene)
return -1
Java:
class Solution {
private static class GeneStep {
String gene;
int step;
GeneStep(String gene, int step) {
this.gene = gene;
this.step = step;
}
}
public int minMutation(String startGene, String endGene, String[] bank) {
// For quick lookup and easy delete
Set<String> geneSet = new HashSet<>(Arrays.asList(bank));
Queue<GeneStep> queue = new LinkedList<>();
queue.offer(new GeneStep(startGene, 0));
// For easy iteration to generate mutation from the current letter
Map<Character, char[]> mutationMap = new HashMap<>();
mutationMap.put('A', new char[]{'C', 'G', 'T'});
mutationMap.put('C', new char[]{'A', 'G', 'T'});
mutationMap.put('G', new char[]{'C', 'A', 'T'});
mutationMap.put('T', new char[]{'C', 'G', 'A'});
while (!queue.isEmpty()) {
GeneStep current = queue.poll();
String gene = current.gene;
int step = current.step;
if (gene.equals(endGene)) {
return step;
}
for (int i = 0; i < gene.length(); i++) {
char[] newGeneArray = gene.toCharArray();
for (char c : mutationMap.get(gene.charAt(i))) {
newGeneArray[i] = c;
String newGene = new String(newGeneArray);
if (geneSet.contains(newGene)) {
queue.offer(new GeneStep(newGene, step + 1));
geneSet.remove(newGene);
}
}
}
}
return -1;
}
}
Javascript:
/**
* @param {string} startGene
* @param {string} endGene
* @param {string[]} bank
* @return {number}
*/
var minMutation = function(startGene, endGene, bank) {
// for quick lookup and easy delete
let set = new Set(bank);
let queue = [{gene: startGene, step: 0}];
// for easy iteration to generate mutation from the current letter
let map = {
A: ['C', 'G', 'T'],
C: ['A', 'G', 'T'],
G: ['C', 'A', 'T'],
T: ['C', 'G', 'A']
}
while(queue.length) {
let {gene, step} = queue.shift();
if(gene === endGene) {
return step
}
for(let i = 0; i < gene.length; i++) {
map[gene[i]].forEach(c => {
let newGeneArray = gene.split("")
newGeneArray[i] = c
let newGene = newGeneArray.join("")
if(set.has(newGene)) {
queue.push({gene: newGene, step: step + 1});
set.delete(newGene)
}
})
}
}
return -1
};
Time Complexity:
To analyze time complexity, let’s look at the components involved:
- The BFS traversal: The algorithm uses a queue to perform a breadth-first search.
- Checking each character and trying all possible mutations: For each gene sequence, the algorithm checks each of its
N
characters (whereN
is the length of the gene sequence, typically 8 in the problem constraint) and attempts to mutate it to all other possible characters (3 possibilities per character, since there are 4 possible nucleotides). - Checking and removing gene sequences from the set: Upon each successful mutation, it checks if the new sequence is in the set and removes it if present.
Considering that there are M
gene sequences in the bank:
- In the worst case, the algorithm might have to visit all
M
sequences in the bank. - For each sequence, we perform
N
character checks and3
possible mutations. - Each mutation check involves an
O(1)
set containment check andO(1)
removal operation (since it’s a set).
Thus, the total time complexity can be estimated as O(M * N * 3)
, which simplifies to O(M * N)
as constants can be ignored in Big O notation.
Space Complexity:
The space complexity is determined by the queue and the set, both of which may hold up to M
gene sequences. Hence, the space complexity is O(M)
. Where M
is the length of the bank.
Hello there! This blog post couldn’t be written much better!
Looking at this post reminds me of my previous roommate!
He constantly kept preaching about this. I most certainly will forward this information to him.
Pretty sure he’s going to have a great read. Thanks
for sharing!
Feel free to surf to my homepage; John E. Snyder
Thanks for the kind words and really glad it helped!