Python: raw_input(), input(), strip() and split() Explanations

Ran

Background

I wrote this article while working on the HackerRank Python challenge "Find a string". I think it will help explain the difference between raw_input() and input() to others.

You will see raw_input() a lot in the HackerRank Editorial section, since those solutions were written in Python2 at the time.

Difference between raw_input() and input()

raw_input() is a function that only exists in Python2. It behaves the same as input() in Python3: it reads one line and returns it as a string.

Python2 also had an input() function. Its equivalent in Python3 is eval(input()).

The relationship can be described by the following table:

Python2                        Python3
input() = eval(raw_input())    eval(input())
raw_input()                    input()

Python2 to Python3 changes:

  • The original input() was deleted.
  • raw_input() has been renamed to input().
  • The original input() is the same as eval(input()) in Python3.
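
The mapping above can be sketched in Python3 as follows. This is only an illustration under stated assumptions: the helper names raw_input_py2 and input_py2 and the reader parameter are hypothetical, added so the example can run without typing at a prompt.

```python
def raw_input_py2(reader=input):
    """Mimic Python2's raw_input(): return the typed text unchanged, as a str."""
    return reader()


def input_py2(reader=input):
    """Mimic Python2's input(): evaluate the typed text as a Python expression.

    Unsafe with untrusted input, because eval() runs whatever is typed.
    """
    return eval(reader())


# Hypothetical stand-in for a user typing "1 + 2" at the keyboard.
def fake_user():
    return "1 + 2"


print(repr(raw_input_py2(fake_user)))  # '1 + 2'
print(repr(input_py2(fake_user)))      # 3
```

Calling either helper without an argument falls back to the real input(), so the same text typed at the prompt comes back as a string in the first case and as an evaluated value in the second.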

eval() is dangerous

eval(input()) in Python3 is quite dangerous, because eval() parses its argument and evaluates it as a Python expression. Whatever the user types is run as code.

result = 10

guess = eval(input())

if guess == result:
    print("You are right!")
else:
    print("Sorry, it is wrong.")

For example, I set result to 10 and let the user guess the number. Entering 8 or 10 gives different outputs, as expected:

8
Sorry, it is wrong.

10
You are right!

But the interesting thing is that if we enter the word "result", we get an unexpected output:

result
You are right!

It is counted as correct! Because eval() treats the input as a Python expression, the name result is looked up and guess becomes 10. The check is then effectively if result == result:, which evaluates to True.

A similar situation arises when the typed expression can reach a mutable object such as a list:
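
One way to avoid this in the guessing game is to convert the text with int() instead of evaluating it. The sketch below is an assumption about how the example could be restructured, not the original code; check_guess is a hypothetical helper name.

```python
def check_guess(text, result=10):
    """Compare the typed text with result without evaluating it as code."""
    try:
        guess = int(text.strip())
    except ValueError:
        # Anything that is not a whole number, such as "result", ends up here.
        return "Please enter a whole number."
    if guess == result:
        return "You are right!"
    return "Sorry, it is wrong."


print(check_guess("10"))      # You are right!
print(check_guess("8"))       # Sorry, it is wrong.
print(check_guess("result"))  # Please enter a whole number.
```

In a real program the call would be check_guess(input()).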

def append_2(l):
    l.append(2)

result = [1,3,5]

guess = eval(input())

print(result)

Input:

append_2(result)

Output:

[1, 3, 5, 2]

You can see that we added the element 2 to the result list through eval(input()). Therefore, we should be very careful whenever we need to use the eval() function.
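
When the program only needs to read a literal value such as a number or a list, ast.literal_eval() from the standard library is a commonly used safer alternative: it accepts Python literals and rejects names and function calls. A minimal sketch, where parse_literal is a hypothetical helper name:

```python
import ast


def parse_literal(text):
    """Return the Python literal written in text, or None if it is not a plain literal.

    Note: in this sketch the literal "None" also maps to None.
    """
    try:
        return ast.literal_eval(text.strip())
    except (ValueError, SyntaxError, TypeError):
        return None


print(parse_literal("[1, 3, 5]"))         # [1, 3, 5]
print(parse_literal("append_2(result)"))  # None, function calls are rejected
print(parse_literal("result"))            # None, names are rejected
```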

strip() and split()

strip()

Syntax:

str.strip([chars])

Usage:

strip() is a string method. It returns a copy of the string with leading and trailing characters removed. By default, the characters removed are whitespace.

s = "    hey    "
d = "   hey   d    "

print(s.strip())
print(d.strip())

Output:

hey
hey   d

You can see that it works inward from both ends and stops on each side as soon as it reaches the first non-whitespace character. The whitespace inside the string is left untouched.

e = "banana ghost nanaba"

print(e.strip("ban "))

Output:

ghost

Here we passed four characters: b, a, n, and a space. strip() treats them as a set of characters, not as a prefix or suffix: it removes characters from both ends until it reaches the first character that does not belong to this set.
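
The same set-of-characters rule applies to the one-sided variants lstrip() and rstrip(). A short sketch reusing the string from the example above:

```python
e = "banana ghost nanaba"

print(repr(e.strip("ban ")))   # 'ghost'
print(repr(e.lstrip("ban ")))  # 'ghost nanaba'
print(repr(e.rstrip("ban ")))  # 'banana ghost'

# The order of the characters in the argument does not matter.
print(e.strip("ban ") == e.strip(" nab"))  # True
```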

split()

Syntax

str.split(sep=None, maxsplit=-1)

Usage

split() is used to split a string into a list. But a lot of beginners don't know that split() actually has two parameters.

First, the basic usage is to split the string by whitespace:

l = "Amy Sheldon Penny"

x = l.split()

print(x)

Output:

['Amy', 'Sheldon', 'Penny']

You can also specify the separator.

l = "Amy-Sheldon-Penny-Leonard"

x = l.split("-")

print(x)

Output:

['Amy', 'Sheldon', 'Penny', 'Leonard']

We can also pass the maxsplit argument to decide how many splits we want to have.

l = "Amy-Sheldon-Penny-Leonard"

x = l.split("-",2)

print(x)

Output:

['Amy', 'Sheldon', 'Penny-Leonard']

Here only the first two "-" separators are used to split the string, so the list ends up with three elements and the rest of the string stays in the last one.
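
A common use of maxsplit is to split on the first separator only, for example when a value may itself contain the separator. The sample line below is made up for illustration:

```python
# Hypothetical "key=value" line whose value contains another "=".
line = "path=/usr/bin=old"

key, value = line.split("=", 1)
print(key)    # path
print(value)  # /usr/bin=old

# maxsplit can also be passed by keyword.
print("Amy-Sheldon-Penny-Leonard".split("-", maxsplit=1))
# ['Amy', 'Sheldon-Penny-Leonard']
```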

The important thing about the split()!

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.
Source: Python split()

The part we need to be careful about is that runs of consecutive whitespace are regarded as a single separator.

For example:

s = "tom   and jerry"
l = s.split()

print(l)

Output:

['tom', 'and', 'jerry']

Here we do not pass anything for split()'s sep argument. You can see that there are three spaces between "tom" and "and". However, split() treats a run of consecutive whitespace as one single separator when sep is not specified or is None.

The result will be different if we set sep to ' ' (a single space):

s = "tom   and jerry"
l = s.split(' ')

print(l)

Output:

['tom', '', '', 'and', 'jerry']

This is how split() works when a single space is the separator: every space counts as its own separator, so the two extra spaces produce two empty strings.

The good thing about this is that after we process the elements in the list, we can use " ".join(l) to connect them again without changing the whitespace between the words.
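
The two modes also differ for leading and trailing whitespace and for whitespace characters other than a space. A short sketch with made-up strings:

```python
s = "  tom  "
print(s.split())     # ['tom']
print(s.split(" "))  # ['', '', 'tom', '', '']

t = "tom\tand\njerry"
print(t.split())     # ['tom', 'and', 'jerry']
print(t.split(" "))  # ['tom\tand\njerry'], no space in t, so nothing is split
```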

For instance:

s = "tom   and jerry"
l = s.split(' ')

print(l)
print(" ".join(l))

Output:

['tom', '', '', 'and', 'jerry']
tom   and jerry

However, if the sep is None or not specified:

s = "tom   and jerry"
l = s.split()

print(l)
print(" ".join(l))

Output:

['tom', 'and', 'jerry']
tom and jerry

We can see that only a single space is used to join the elements here, so the original spacing is lost.
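
This matters when each word is processed before being joined again. As a sketch, capitalizing every word (the choice of capitalize() is just an example of per-word processing) keeps the original spacing only with split(" "):

```python
s = "tom   and jerry"

# Empty strings from split(" ") stay empty after capitalize(),
# so joining with " " restores the original spacing.
keep_spacing = " ".join(word.capitalize() for word in s.split(" "))

# split() drops the extra spaces, so they cannot be restored.
collapse_spacing = " ".join(word.capitalize() for word in s.split())

print(keep_spacing)      # Tom   And Jerry
print(collapse_spacing)  # Tom And Jerry
```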

HackerRank Find a string Solution

def count_substring(string, sub_string):
    result = 0
    sub_len = len(sub_string)
    for i in range(len(string) - sub_len + 1):
        if sub_string == string[i:i+sub_len]:
            result += 1
    return result
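
The loop above compares a slice at every position, which is why overlapping matches are counted. The built-in str.count() only counts non-overlapping matches, so it can give a smaller number. The following sketch shows the difference with an alternative built on str.find(). The name count_overlapping and the sample string are illustrative and not part of the solution above.

```python
def count_overlapping(string, sub_string):
    """Count occurrences of sub_string in string, including overlapping ones."""
    count = 0
    start = string.find(sub_string)
    while start != -1:
        count += 1
        # Move one character forward so overlapping matches are found too.
        start = string.find(sub_string, start + 1)
    return count


print(count_overlapping("ABCDCDC", "CDC"))  # 2
print("ABCDCDC".count("CDC"))               # 1, str.count() skips overlapping matches
```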

I hope this article helps!

Reference

  1. Python strip()
  2. Python split()
  3. raw_input(), input(), strip() and split() Explained (My Chinese Blog)
  4. HackerRank: Capitalize! Notes (My Chinese Blog)