Photo by Clément Hélardot / Unsplash

HackerRank: Merge the Tools! Notes

Ran
Ran

Table of Contents

Background

This article was written down when I was doing a Python challenge on HackerRank: "Merge the Tools!".

Merge the Tools! | HackerRank
Split a string into subsegments of length $k$, then print each subsegment with any duplicate characters stripped out.

Problem

To help you understand this problem, I will rewrite the description here by adding an extra variable m  (m is an integer) for better explanation.

Given a string s (length is n), we need to split it into m substrings. We call every substring \( t_{i} \), and every \( t_{i} \) has k characters.

We call every character \(c_{i}\), so S has n characters.

Then we need to create \(u_{i}\) using \(t_{i}\) with 2 conditions being satisfied.

\(u_{i}\) needs to satisfy 2 conditions:

  1. The characters in \(u_{i}\) are a subsequence of the characters in \(t_{i}\)
  2. Remove \(t_{i}\)'s duplicate characters while keeping the order

The problem will give you S and k, and ask you to output every \(u_{i}\), i.e., \(u_{0}\) to \(u_{m}\). The problem also says that k is a factor of n, so it guarantees that our m will be an integer.

The reason why the description is hard to understand is that it is all based on S and k. But in fact, it is talking about splitting a string S into m substrings, and each substring has k characters, so \( n = k \times m \). Then we remove each substring's duplicate characters while keeping its order.

For example, assume S = "AAABCADDE", and k = 3.

Analysis

After we understand the description, the key becomes this: giving a string \(t_{i}\), how can we remove the duplicate characters while preserving the order? In the followings, I will give 4 different solutions.

(Attention: The first and the second solution will need the Python version to be above 3.7)

Solution

1st: Use Python Dictionary to Do Hashing

The basic idea is: we want to create the effect of hashing by using the Python dictionary.

def merge_the_tools(string,k):
    n = len(string)
    m = int(n/k)
    for i in range(0,n,k):
        d = {key:0 for key in string[i:i+k]}
        print("".join(d))

The value is set to 0 because we only need the keys when printing the dictionary.

2nd: Use Python's dict.fromkeys() Method

The second method is basically the same as the first one, but we use dict.fromkeys() method to create the dictionary.

How to use dict.fromkeys()

dict.fromkeys() can be used to create a new dictionary. It will accept 2 parameters, and the first one is compulsory. We need to pass an iterable to be the dictionary's keys. The second one is optional. We can pass a specific value to be set as the value for all keys. If we don't pass the second parameter, it will be set to None.

For example:

a = "ABC"
b = 0

print(dict.fromkeys(a,b))

# Output:{'A': 0, 'B': 0, 'C': 0}

Or

a = "ABC"
b = [1,2,3]
print(dict.fromkeys(a,b))

# Output:{'A': [1, 2, 3], 'B': [1, 2, 3], 'C': [1, 2, 3]}

We need to be cautious for the second example here. Even if we pass the iterable, it will be set as the fixed value for all keys instead of matching the key correspondingly.

Solution

Therefore, we got our second solution:

def merge_the_tools(string, k):
    n = len(string)
    m = int(n/k)
    result = []
    for i in range(0,n,k):
        result.append("".join(dict.fromkeys(string[i:i+k])))
    print("\n".join(result))

3rd: Use Python's OrderedDict

The first and the second solution will only be suitable for Python 3.7 or above. Because after Python 3.7.0, dictionary is changed to insertion-order preserved.

Before Python 3.7.0, if we want to preserve the insertion order while creating a dictionary, we need to use OrderedDict.

from collections import OrderedDict

def merge_the_tools(string,k):
    n = len(string)
    m = int(n/k)
    result = []
    for i in range(0,n,k):
        result.append("".join(OrderedDict.fromkeys(string[i:i+k])))
    print("\n".join(result))

4th: Use List Comprehension

The 4th solution came out when I was thinking, "Is it possible to solve it in one line by list comprehension?" and it turned out to be possible. I think it is fun, haha, although it is a bit tricky.

def merge_the_tools(string,k):
    n = len(string)
    m = int(n/k)
    for i in range(0,n,k):
        result = []
        [result.append(x) for x in string[i:i+k] if x not in result]
        print("".join(result))

The key is the second last line here. When I am appending x for result, I am also using result for checking the condition. And if you try to print out the second last line, you will see the result as follows:

# Assume S = AAABCADDE,k = 3

def merge_the_tools3(string,k):
    n = len(string)
    m = int(n/k)
    for i in range(0,n,k):
        result = []
        print([result.append(x) for x in string[i:i+k] if x not in result])
        # print("".join(result))

"""" 
Output:
[None]
[None, None, None]
[None, None]
"""

Some Algorithm Books That I Am Using

(This part will contain affiliate links. Purchases made from these links will give me some money. It doesn't cost you extra. Thank you.)

  1. Grokking Algorithms (by Aditya Bhargava)
Grokking Algorithms: An Illustrated Guide for Programmers and Other Curious People
Get it now:

2. Introduction to Algorithms, fourth edition (by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein)

Introduction to Algorithms, fourth edition
Get it now:

3. Competitive Programming 4 - Book 1: The Lower Bound of Programming Contests in the 2020s (Steven Halim, Felix Halim, Suhendry Effendy) (I have the third version, but I believe the fourth will be better.)

Competitive Programming 4 - Book 1
Get it now:

If you only want to buy one, I believe the first one will be a good book for introduction to algorithms, while the second is the best reference book if you want to have some solid theoretical background. The third one should only be bought if you really want to join in competitive programming.

Reference

  1. Stack Overflow: Order of keys in dictionaries in old versions of Python
PythonHackerRankBasic CPfromkeysCS

Ran

Comments