Text (a.k.a. Strings)

Overview

As we discovered on the earlier page Data & Data Types, strings are alphanumeric (textual) data. Now that we have used strings a bit and explored some additional features, like the basic operators, let's take a closer look at strings. First a bit of review and then we'll build on our knowledge of strings.

Strings are sequences of alphanumeric characters including the alphabetic letters A thru Z, both upper case and lowercase, as well as the digits 0 thru 9. Remember that when a numeric digit is within an alphanumeric string, it is not considered a number but rather a textual representation of that digit. Alphanumeric also includes other characters found on keyboards, such as @ $ # & * ( ) { } [ ] , = - _ . + ; ' /. Also, a blank space (created on the keyboard by the spacebar) is considered alphanumeric.

In some of the code we've seen and worked with so far, we have used string literals, which are strings surrounded by quotes and usually assigned to a variable, like this:

message = "This is a message."

The text "This is a message." is called a string literal. Literal in this context means that the sequence of characters is written directly in the code and makes up the value of the string stored in the variable (message in this case).

Important Note: Strings are immutable in Python.

Concept: Mutability

In programming, mutability refers to the ability of an object (such as the value stored in a variable) to be changed after it has been created. Conversely, an object that cannot be changed and always retains its initial value is called immutable; strings are immutable. Any operation or function that appears to change a string actually produces a new string containing the result and leaves the original untouched. The Python interpreter handles this automatically.
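Here is a minimal sketch of what immutability looks like in practice. The variable names are illustrative only, and the upper() method used here is one of the string methods described later on this page. Assigning to a single character fails, while a method that seems to change the string hands back a new one.

```python
# Illustrative names; any string would behave the same way.
message = "This is a message."

# Trying to change one character in place raises a TypeError.
try:
    message[0] = "t"
    changed_in_place = True
except TypeError:
    changed_in_place = False

# A method that appears to change the string returns a new string instead.
shouted = message.upper()

print(changed_in_place)  # False
print(message)           # This is a message.
print(shouted)           # THIS IS A MESSAGE.
```

The original value in message is the same before and after the call to upper().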

Concatenation

Concatenation occurs when we combine two or more strings together to make one string. We do this in Python using the plus-sign (+) operator. Notice that's the same operator we use with numeric values for addition, so we can think of concatenation as adding strings together.

Here's an example using string literals (read the code comments for details):

# We can concatenate string literals, like this ...
print("Bob" + "Smith")
# ... notice though that when you run this it prints
# BobSmith, with no space. We need to include any
# punctuation or spaces in the string literal ...
print("Bob " + "Smith")
# ... Note that now we included a space in the first
# string literal, so now the output will be
# Bob Smith, which is more common and readable.
Here's another example using string variables:

# First we'll prompt the user for their first
# and last names and store the values they
# enter in variables...
first_name = input("Enter your first name: ")
last_name = input("Enter your last name: ")
# ... then we'll print the first and last
# names concatenating them together using the
# + operator...
print(first_name + last_name)
# ... we have the same problem in this example
# as we did in the previous example, it prints
# the first and last name with no space between
# them. We need to include punctuation and
# spaces with variable-based output as well,
# like this ...
print(first_name + " " + last_name)
# ... note that we are concatenating the first
# name, a string literal containing only a single
# space and the last name.
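One thing worth trying alongside the examples above: the + operator only concatenates when both sides are strings. The following is a small sketch with made-up values (first_name and age are assumptions for illustration). It shows that mixing a string and a number raises an error, and that converting the number with str() first avoids it.

```python
first_name = "Bob"
age = 30  # an integer, not a string

# Joining a string and an integer with + raises a TypeError.
try:
    label = first_name + " is " + age
except TypeError:
    label = None

# Converting the number with str() first makes the concatenation work.
label = first_name + " is " + str(age)
print(label)  # Bob is 30
```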

Character Indexing

There are times when we need to access individual characters in a string. We do this based on the index (position) of each character in the string. If we think of a string as a sequence of individual alphanumeric characters, we can assign a number (called an index) to each character. In Python, indexes for strings start at zero.

Let's look at an example:


In this example, we have declared a variable called full_name and assigned it the string "Bob Smith". The string contains 8 alphabetic characters and 1 space, so the length of the string is 9 alphanumeric characters. Each character has an index, starting at zero: B is at index 0, o at 1, b at 2, the space at 3, S at 4, m at 5, i at 6, t at 7, and h at 8.

To access an individual character in the string we use the variable name and then the index number we want to access in square brackets immediately after the variable name, like this:

print(full_name[2])
The output of this print statement would be:

b
Note that the output is b, which is at index [2]. Remember that we start counting from zero so that lowercase b is the third character in the string.

We can also start at the end of the string and count backward (using negative integers), like this:

print(full_name[-4])
The output of this print statement would be:

m
Notice too that when counting backward from the end of the string, we start at -1, not at zero.
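To see every index at once, the short sketch below lists each character of "Bob Smith" with both its positive and its negative index. It uses a for loop with range() and the len() function, which is described later on this page. Treat it as an illustration rather than something you need to understand fully yet.

```python
full_name = "Bob Smith"

# Print each character with its positive and negative index.
for index in range(len(full_name)):
    negative_index = index - len(full_name)
    print(index, negative_index, repr(full_name[index]))

# The two examples from the text, checked directly.
third_character = full_name[2]        # b
fourth_from_end = full_name[-4]       # m
print(third_character, fourth_from_end)
```

repr() is used so that the space at index 3 shows up visibly as ' ' in the output.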

String Slicing

Another way we can access portions of a string is through slicing. Slicing allows us to select a range of characters within a string. The syntax is similar to using an index as shown above, but now we add the colon operator : and provide another index value after the colon, like this:

print(full_name[2:7])
The output of this print statement would be:


b Smi


In this example, I specified 2:7, which means slice the string starting at index [2] (remember this is the 3rd character in the string) and stop just before index [7]. The character at the end index is not included, so the last character in the output is the one at index [6] (the 7th character). Also notice that the space is included in the count, as it is considered a character of the string.


Continuing with our "Bob Smith" string example (shown above), let's consider several variations of slicing strings. I recommend you try these and other combinations of your choosing in your IDE.

Example 1
Code: print(full_name[2:5])
Output: b S
Description: This example is similar to the first example above: we're slicing the string from index [2] up to, but not including, index [5].

Example 2
Code: print(full_name[1:5])
Output: ob S
Description: This slice starts one character earlier than the one above, running from index [1] up to, but not including, index [5].

Example 3
Code: print(full_name[3:5])
Output:  S (a space followed by S)
Description: This example slices from index [3] up to, but not including, index [5]. Notice that the character at [3] is a space, which is included in the output.

Example 4
Code: print(full_name[1:])
Output: ob Smith
Description: This example demonstrates that we can leave out an index value, which implies, in this case, the end of the string. So, index [1:] means slice from index [1] to the end of the string.

Example 5
Code: print(full_name[:5])
Output: Bob S
Description: This example demonstrates leaving out the first index value. [:5] means slice from the beginning of the string up to, but not including, index [5].

Example 6
Code: print(full_name[:3])
Output: Bob
Description: Again, slice from the beginning of the string up to, but not including, index [3].

Example 7
Code: print(full_name[:-6])
Output: Bob
Description: We can also use negative indexing to count from the end of the string. In this example, we're slicing from the beginning of the string up to, but not including, index [-6] (the space).

Example 8
Code: print(full_name[4:])
Output: Smith
Description: This is another example of starting from an index value, [4] in this case, and slicing to the end of the string.

Example 9
Code: print(full_name[-5:])
Output: Smith
Description: We can also use negative indexing as the first value, in this case from [-5] to the end of the string.

Example 10
Code: print(full_name[:])
Output: Bob Smith
Description: In this last example, [:] means from the beginning of the string to the end of the string, so the entire string.
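A detail that is easy to miss in the slicing examples is that the character at the end index is never part of the result. The sketch below uses the same "Bob Smith" string; the names start, stop, and piece are only for illustration. It checks that the length of a slice equals the end index minus the start index.

```python
full_name = "Bob Smith"

start = 2
stop = 7
piece = full_name[start:stop]

print(piece)            # b Smi
print(len(piece))       # 5, which is stop - start
print(full_name[stop])  # t  (index 7 itself is not part of the slice)

# Splitting at any index and rejoining gives back the original string.
rejoined = full_name[:3] + full_name[3:]
print(rejoined)         # Bob Smith
```

Because the end index is excluded, [:3] and [3:] never overlap and never leave a gap, which is why they rejoin cleanly.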

Concept: Methods

Methods in programming are procedures associated with objects which provide functionality related to that object.


String Methods

The Python string data type has a set of useful built-in methods that we can use to work with strings. We will take a look at a few of them for now, and then explore additional string methods as we progress. There is a full list of the string methods here (opens in a separate tab).

String Method Description & Example(s)
str_var.isalpha() Remember that strings are alphanumeric, which means they can contain digits and special characters as well as alphabetic characters. There are times when we want to know if a string contains only alphabetic characters, and the isalpha() method can tell us. It returns a boolean True if the string (str_var) is not empty and all of its characters are alphabetic (such as A thru Z or a thru z), that is, there are no digits, special characters, or spaces. Otherwise it returns False.

Code Example:

str_var_1 = "Bob"
str_var_2 = "Bob123"
str_var_3 = "Bob Smith"
print(str_var_1.isalpha())
print(str_var_2.isalpha())
print(str_var_3.isalpha())

Output:


True
False
False


The first output line is True because str_var_1 contains only alphabetic characters. The second and third output lines are False because both str_var_2 and str_var_3 contain non-alphabetic characters.

str_var.lower() The lower() method returns a copy of the string with all characters converted to lower case. Note that it returns a copy; remember that strings in Python are immutable (they cannot be changed once set).

Code Example:

str_var_1 = "Bob"
str_var_2 = "Bob123"
str_var_3 = "Bob Smith"
print(str_var_1.lower())
print(str_var_2.lower())
print(str_var_3.lower())

Output:


bob
bob123
bob smith


Notice that all three lines of output have been converted to lower case. Also notice that lower() has no effect on the non-alphabetic characters in the strings: the 123 is unchanged and the space in the third string remains the same.

str_var.upper() The upper() method returns a copy of the string with all characters converted to upper case. Note that it returns a copy; remember that strings in Python are immutable (they cannot be changed once set).

Code Example:

str_var_1 = "Bob"
str_var_2 = "Bob123"
str_var_3 = "Bob Smith"
print(str_var_1.upper())
print(str_var_2.upper())
print(str_var_3.upper())

Output:


BOB
BOB123
BOB SMITH


Notice that all three lines of output have been converted to upper case. Also, notice that upper() has no effect on the non-alphabetic characters in the strings: the 123 is unchanged and the space in the third string remains the same.

str_var.capitalize() The capitalize() method returns a copy of the string with the first character converted to upper case and the remaining characters converted to lower case.

Code Example:

str_var_1 = "this is a sentence."
print(str_var_1.capitalize())

Output:


This is a sentence.


Notice that the original string was all lower case letters. The capitalize() method converted the first character to upper case.

str_var.title() The title() method returns a copy of the string with the first character of every word converted to upper case and the remaining characters converted to lower case.

Code Example:

str_var_1 = "this is a sentence."
print(str_var_1.title())

Output:


This Is A Sentence.


Notice that the original string was all lower case letters. The title() method converted the first character of every word to upper case.
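The examples above start from all lower case text, so one behavior does not show up: both capitalize() and title() also convert the remaining letters to lower case. Here is a brief sketch using a mixed-case string chosen only for illustration.

```python
mixed = "hELLO wORLD"

capitalized = mixed.capitalize()
titled = mixed.title()

print(capitalized)  # Hello world
print(titled)       # Hello World
print(mixed)        # hELLO wORLD  (the original is unchanged)
```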

str_var.replace(old, new [, count]) The replace() method returns a copy of the string with all occurrences of the specified characters (old) replaced by the specified characters (new). There is also an optional count parameter that limits the replacement to the first count occurrences of old. Note that it returns a copy; remember that strings in Python are immutable (they cannot be changed once set).

Code Example:

str_var_1 = "this is a sentence."
print(str_var_1.replace("s", "Z"))
print(str_var_1.replace("s", "ZZZ"))
print(str_var_1.replace("s", "ZZZ", 1))

Output:


thiZ iZ a Zentence.
thiZZZ iZZZ a ZZZentence.
thiZZZ is a sentence.


Notice that in all three output examples, we're replacing "s". The first output line replaces each of the three "s" characters with a single "Z", as specified in the replace() method. The second output demonstrates that the replacement can be any number of characters, so each of the three "s" characters in the original string is replaced with "ZZZ". The third example demonstrates the optional count parameter: only the first "s" is replaced with "ZZZ" because we specified 1 in the replace() method. Also, note that we can reuse the same str_var_1 variable because strings are immutable: each of the three replace() calls returns a copy of the string with the specified replacement completed, and the original is never changed.
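Because replace() returns a copy, a common slip is calling it without storing the result. The following sketch (the variable name sentence is just for illustration) shows that nothing changes until the returned copy is assigned to a variable.

```python
sentence = "this is a sentence."

# Calling replace() without storing the result changes nothing.
sentence.replace("s", "Z")
unchanged = sentence
print(unchanged)  # this is a sentence.

# To keep the result, assign the returned copy to a variable.
sentence = sentence.replace("s", "Z")
print(sentence)   # thiZ iZ a Zentence.
```

The same pattern applies to lower(), upper(), capitalize(), and title().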

Built-In Python Functions with Strings

As indicated previously, Python contains several built-in functions that are more general-purpose (not only useful with strings), like print(). We can use many of these functions with strings as well. There is a full list of the built-in functions here (opens in a separate tab). Let's look at one example for now and we'll learn more later.

Function Description & Example(s)
len(str_var) The len() function returns the length of a string.

Code Example:

str_var_1 = "Bob"
print(len(str_var_1))
str_var_1 = "This is a sentence."
print(len(str_var_1))
str_var_1 = "This is a longer sentence that is about nothing of any particular importance."
print(len(str_var_1))
str_var_1 = "Address: 123 Nowhere Street, SLC, UT 84999"
print(len(str_var_1))

Output:


3
19
77
42


The first output line indicates that the string "Bob" is 3 characters long. The other three examples demonstrate various lengths and also that the len() function counts every alphanumeric character in the string, including spaces, digits, and special characters.
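The length of a string ties back to character indexing: since indexes start at zero, the last valid index is always one less than the length. A small sketch, reusing the "Bob Smith" string from earlier on this page:

```python
full_name = "Bob Smith"
length = len(full_name)

last_character = full_name[length - 1]
print(length)          # 9
print(last_character)  # h  (same character as full_name[-1])

# Index 9 does not exist, so using it raises an IndexError.
try:
    full_name[length]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)    # True
```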

String, ASCII & Unicode Text Art

When we are working on programs that print to the screen, we want to present the output in a user-friendly and readable form. The Python console is non-graphical, meaning it does not display colors, graphics, images, etc. Even so, our output often goes beyond the raw results of calculations or processing: we can arrange plain text characters to make the output look organized and even artistic. Here are a few examples:

----------------------------
Welcome to My Program
----------------------------
1. Menu Option 1
2. Menu Option 2
3. Menu Option 3
4. Exit
----------------------------
User Prompt:
----------------------------

jjjjjjjjjj GGGGGGGGGG
jjjjjjjjjj GGGGGGGGGG
     jj    GG
jj   jj    GG    GGGG
jj   jj    GG      GG
jjjjjjj    GGGGGGGGGG
jjjjjjj    GGGGGGGGGG

       __|__
--o--o--(_)--o--o--

        _______
       //  ||\ \
 _____//___||_\ \___
 ) _ _ \
 |_/ \________/ \___|
___\_/________\_/______

Python has many string handling tools that help us create text-based shapes and art. Notice that all of the individual characters used in the above examples are directly from the alphanumeric character set (visible on most keyboards). There are a large number of additional characters and symbols that we can print in the console as well. Those extra characters come from several sources, such as the extended ASCII character set (see chart below), or the Unicode Character Database which you can find here.

What is ASCII?
Concept: American Standard Code for Information Interchange (ASCII)

A set of characters used in computers for electronic communication. The table presented below is the complete ASCII character set. Also, you can find a more thorough listing of ASCII values here. There are other encoding systems as well, such as EBCDIC.

Every character in the set (see the chart below) has a code number associated with it that computers can interpret to produce the associated character on a computer screen, printer, etc. Some of the ASCII characters are called control characters; they have no visible printed representation but instead perform a function related to electronic communications. For example, character 09 is a tab, the same as if you pressed the Tab key on your keyboard. So, if I put that character in the middle of a string in a Python print statement, a tab appears at that position when the string is printed on the console. Other characters are printable, meaning they are visible on a computer screen, printer, etc. The printable characters you see on most keyboards fall between 32 and 126 (character 127 is the Delete control character). Standard ASCII ends at 127; codes beyond 127 are referred to as the extended ASCII set. Today, however, most systems use the Unicode standard for extended character representation (see the Unicode discussion below).
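Python's built-in ord() and chr() functions let us move between a character and its code number, which makes the description above easy to check. This is only a quick sketch; the variable names are illustrative.

```python
code_for_a = ord("A")      # character -> code number
letter = chr(97)           # code number -> character
code_for_space = ord(" ")

print(code_for_a)      # 65
print(letter)          # a
print(code_for_space)  # 32

# Character 9 is the tab control character; "\t" is its escape sequence.
tab = chr(9)
print(tab == "\t")             # True
print("Name" + tab + "Bob")    # Name and Bob separated by a tab
```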


What is Unicode?
Concept: Unicode

An information technology standard for representing text-based information. It is a worldwide standard used for electronic communication across human languages. Contrast this with the ASCII standard above which only represents English characters.

There are over 140,000 individual characters, across multiple human languages, available through Unicode, which is why it is often the preferred standard to use. You can find a list of the Unicode characters here. There's also a large list of Unicode emoji characters available here.

Unicode Example

Here are a few examples of using Unicode characters in Python. If I go to the Unicode characters page and locate a few interesting characters, I can print them using Python's u'\u____' syntax, where ____ is the character's four-digit hexadecimal Unicode code value. (In Python 3 the u prefix is optional; '\u00A9' works the same way.)

print(u'\u00A9')
print(u'\u00BD')
print(u'\u00F1')
print(u'\u0376')
print(u'\u2600')
print(u'\u259A')
print(u'\u259A' * 30, end="")
print()
print(u'\u2550' * 30)
print("Main Menu")
print(u'\u2550' * 30)
print("1. Menu Option 1")
print("2. Menu Option 2")
print("3. Menu Option 3")
print("4. Exit")
print(u'\u2550' * 30)
print("User Prompt: ")
print(u'\u2550' * 30)
Code Details:
  • Lines 1 thru 6 use the u'\u____' syntax to print characters from Unicode that are not found on the standard keyboard.
  • The code in each print statement, like 00BD, is found in the list of Unicode characters here.
  • Line 7 uses the * operator to repeat the character a specified number of times. The end="" argument keeps print() from moving to the next line afterward.
  • Line 8 ends the line of repeated characters that line 7 left open; it does not add a blank line.
  • Lines 9 thru 18 use Unicode characters for the lines, which look better than our example menu shown earlier (above) that uses dash characters to draw lines.
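As a supplement to the example above, here is a short sketch of a few other ways Python 3 can produce the same characters. The variable names are illustrative, and the output depends on your console being able to display these characters.

```python
copyright_sign = "\u00A9"          # no u prefix needed in Python 3
same_sign = chr(0x00A9)            # build the character from its code number
named_sign = "\N{COPYRIGHT SIGN}"  # look the character up by its Unicode name

print(copyright_sign, same_sign, named_sign)
print(hex(ord(copyright_sign)))    # 0xa9

# \u takes exactly four hex digits; larger code values need \U with eight.
sun = "\u2600"
rocket = "\U0001F680"
print(sun, rocket)
print(len(rocket))                 # 1 (still a single character)
```

All three ways of writing the copyright sign give the same one-character string, so which one to use is mostly a matter of readability.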

Practice Problems

Problem 1

String Indexing 1: Write a Python program that assigns the string shown below to a variable and then prints the 3rd, 9th and 13th characters in the string:


Salt Lake City, UT



Problem 2

String Indexing 2: Write a Python program that produces the same output as the String Indexing 1 problem above, only this time use indexes that start from the end of the string. Also, output only 1 line that prints the 3 characters together, like this:


let



Problem 3

String Slicing 1: Write a Python program that assigns the string shown below to a variable and then uses string slicing to print only the comma in the string. Yes, you could use a single index for this, but for practice, use full slice syntax.


Salt Lake City, UT



Problem 4

String Slicing 2: Write a Python program that uses string slicing to convert the string "Salt Lake City, UT" to "SLC, UT".


SLC, UT




Problem 5

String Methods & Functions: Write a Python program that uses string methods and functions on the string provided below to perform the following tasks, each on its own line of output.


Python is a popular programming language.




Problem 6

Text Art: Write a Python program that produces the following text art airplane. In your solution, use variables for the characters that make up the image, particularly any sequences that repeat (there are at least a couple of them). Use concatenation extensively. Also, do not use any literal strings in your print statements, and use no more than one space character in any literal string segment.

       __|__
--o--o--(_)--o--o--





 






© 2023 John Gordon
Cascade Street Publishing, LLC