Python - String (str type)

About

Text - String in Python

A string is a sequence data type: an ordered, immutable sequence of characters.

Each character in a string has an index (also called a subscript or offset). Indexes start at 0 for the leftmost character and increase by one for each character to the right.

The string PYTHON has 6 characters, numbered 0 to 5:

+---+---+---+---+---+---+
| P | Y | T | H | O | N |
+---+---+---+---+---+---+
  0   1   2   3   4   5

To get the letter P, index the string with 0:

letter_P = "PYTHON"[0]
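The numbering above can be checked with a short sketch. The variable names are illustrative only; enumerate() pairs each character with its index.

```python
word = "PYTHON"

# Pair each character with its index, starting at 0 on the left.
indexed = list(enumerate(word))

first_letter = word[0]             # leftmost character
last_letter = word[len(word) - 1]  # rightmost character, index 5

print(indexed)
print(first_letter, last_letter)
```

The last valid index is always len(word) - 1, because counting starts at 0.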

Initialization

Code

A string is created by writing text between quotation marks, either single (' ') or double (" "). The escape character is the backslash (\).

my_string = "I'm a string!"
my_other_string = 'and me too'
my_string_with_apostrophe = 'It\'s great!'  # \' escapes the quote inside single quotes

string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'

File

with open("file.txt", "r") as fh:
    my_description = fh.read()

str

str is the class that creates string objects.

Character Set

Unicode

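In Python 2, literals prefixed with the letter u are Unicode strings (type unicode). In Python 3, every str is already Unicode and the u prefix is optional. For example, in Python 2: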

s = u"This is a unicode string"
print type(s)
<type 'unicode'>

In Python 2, file I/O is byte based: a read returns bytes (type str), which must be decoded to get a unicode object.

s = file.readline() # bytes
print type(s)
s = file.readline().decode('utf-8') # unicode
print type(s)

s = u'Hello world!'
print type(s), repr(s)
s = 'Hello world!'
print type(s), repr(s)

If you encounter an error involving printing unicode, you can use the encode method to properly print the international characters, like this:

unicode_string = u"aaaàçççñññ"
encoded_string = unicode_string.encode('utf-8')
print encoded_string

“decode” converts from bytes to unicode. “encode” converts from unicode to bytes.
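For comparison, here is a minimal Python 3 sketch of the same round trip. It assumes Python 3, where str holds Unicode text and bytes holds encoded data; the sample text is reused from the example above.

```python
# Python 3: str holds Unicode text, bytes holds encoded data.
text = "aaa\u00e0\u00e7\u00e7\u00e7\u00f1\u00f1\u00f1"  # "aaaàçççñññ"

encoded = text.encode("utf-8")     # str -> bytes
decoded = encoded.decode("utf-8")  # bytes -> str

print(type(text).__name__, type(encoded).__name__)
print(decoded == text)
```

The encoded form is longer than the text because each accented character takes more than one byte in UTF-8.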

Loop

For

string = "Nico!"

for character in string:
    print(character)
N
i
c
o
!

Operator

in

if 'a' in 'Nicolas':
    print('a is a letter of Nicolas')
if 'z' not in 'Nicolas':
    print('z is not a letter of Nicolas')
if 'Nico' in 'Nicolas':
    print('Nico is in Nicolas')

+ (Concat)

The + operator between two strings concatenates them.

print("gerard" + " " + "nico")

Function

Split

3/library/stdtypes.html

The split() method splits a string into a list of words. Called without arguments, it splits on whitespace.

text = "How do you do?"

for word in text.split():
    print(word)
How
do
you
do?

split() returns a list:

>>> type(text.split())
<class 'list'>

split() can be called directly on a unicode or str object. For example, in Python 2:

>>> u'split,me'.split(',')
[u'split', u'me']
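A minimal Python 3 sketch of both forms of split() follows. The use of join() to rebuild the sentence is an addition for illustration, not something the text above covers.

```python
text = "How do you do?"

words = text.split()            # no argument: split on whitespace
fields = "split,me".split(",")  # explicit separator
rejoined = " ".join(words)      # join() reverses the split

print(words)
print(fields)
print(rejoined)
```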

length

  • len() returns the length of a string, that is, its number of characters.
my_string = "Nico Gerard"
print(len(my_string))
# Dot notation works only for string-specific methods (i.e. methods that don't work on anything else).
# my_string.len() therefore does not exist, because len() works on many kinds of objects.

Lower

s.lower()

Upper

s.upper()
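A short sketch of both case methods, with an illustrative value for s. Each call returns a new string and leaves the original untouched.

```python
s = "Nico Gerard"

lowered = s.lower()  # every cased character converted to lowercase
uppered = s.upper()  # every cased character converted to uppercase

print(lowered)
print(uppered)
print(s)  # the original string is unchanged
```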

(Cast|Str)

str() makes a string out of a non-string value. This is an explicit string conversion.

str(2)

Isalphanumeric

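isalpha() returns True only if every character is a letter, so "J123".isalpha() == False. To test for letters and digits together, use isalnum(): "J123".isalnum() == True.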

Slicing

Slicing a substring: string[i:j] gives the characters from position i up to, but not including, position j.

>>> 'foo'[0:2]
'fo'
>>> 'foo'[2:]
'o'
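The exclusive upper bound is easy to verify with a small sketch. The names are illustrative.

```python
word = "PYTHON"

head = word[0:2]    # indexes 0 and 1
tail = word[2:]     # index 2 to the end
middle = word[1:4]  # indexes 1, 2 and 3

# The slice stops before index j, so word[:2] + word[2:] rebuilds the string.
print(head, tail, middle)
print(head + tail == word)
```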

Replace

Syntax:

str.replace(old, new[, count])

Example:

  • Replace boum by hop
'Youplaboum'.replace('boum', 'hop')
'Youplahop'

  • Replace the first boum by hop
'Youplaboumboum'.replace('boum', 'hop',1)
'Youplahopboum'

Strip

strip() removes leading and trailing whitespace.

s.strip()
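A minimal sketch with an illustrative value for s. The one-sided variants lstrip() and rstrip() are shown for comparison and are not covered by the text above.

```python
s = "  Nico Gerard \n"

stripped = s.strip()     # whitespace removed from both ends
left_only = s.lstrip()   # leading whitespace only
right_only = s.rstrip()  # trailing whitespace only

# repr() makes the remaining whitespace visible.
print(repr(stripped))
print(repr(left_only))
print(repr(right_only))
```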

Encoding

Characters are encoded to bytes using a variable-length encoding scheme called UTF-8, which is the default encoding of str.encode() in Python 3.

Each character is represented by one to four bytes.
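The variable length can be observed by encoding a few characters and counting the bytes. This is a sketch for Python 3, and the sample characters are chosen for illustration.

```python
# Sample characters: "a", "é", "€" and an emoji, written as escapes.
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]

# Number of UTF-8 bytes needed for each character.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in byte_lengths.items():
    print(f"U+{ord(ch):04X}", n)
```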

Ord

You can find the numeric value of a character c using ord(c).

Example of numeric values of the characters 'a', 'A' and space:

>>> ord('a')
97
>>> ord('A')
65
>>> ord(' ')
32

Chr

You can obtain the character from a numerical value using chr(i).

To see the string of characters numbered 0 through 9, you can use the following:

s = ' - '.join([chr(i) for i in range(10)])
'\x00 - \x01 - \x02 - \x03 - \x04 - \x05 - \x06 - \x07 - \x08 - \t'
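ord() and chr() are inverses of each other, which the following sketch illustrates with the three characters from the Ord example.

```python
letters = "aA "

codes = [ord(c) for c in letters]          # characters -> numbers
restored = "".join(chr(n) for n in codes)  # numbers -> characters

print(codes)
print(restored == letters)
```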

to

toInt

Python - Integer

int('24')

toByte

Python - Byte

myString = "hello"
myStringInByte = myString.encode()

Properties

immutable

Strings in Python are immutable.

From "Why are Python strings immutable?", there are several advantages:

  • One is performance: knowing that a string is immutable means we can allocate space for it at creation time, and the storage requirements are fixed and unchanging. This is also one of the reasons for the distinction between tuples and lists.
  • Another advantage is that strings in Python are considered as “elemental” as numbers. No amount of activity will change the value 8 to anything else, and in Python, no amount of activity will change the string “eight” to anything else.
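In practice, immutability means that item assignment fails and that changes produce a new string. A minimal sketch, with illustrative names:

```python
s = "eight"
error = None

try:
    s[0] = "E"  # strings do not support item assignment
except TypeError as exc:
    error = str(exc)

# Build a new string instead; the original is left unchanged.
t = "E" + s[1:]

print(error)
print(s, t)
```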
