Lesson-6.5
Sets
Introduction | Iterating through Sets | Growing Sets | Set Operations | Other Set Methods | Mutability
A set is an unordered collection with no duplicate elements [refer]. Unlike lists and tuples, there is no notion of order in a set. This is why it is called an unordered collection as opposed to a sequence. A set can be defined as follows:
even_nums = {2, 4, 6, 8, 10}
print(type(even_nums))
print(isinstance(even_nums, set))
Notice the similarity in syntax between sets and dictionaries. Both are enclosed within curly braces. While a dictionary has key-value pairs in it, a set just has a collection of values. A set in Python is a remarkably accurate representation of a mathematical set. Therefore, most of the properties that you are used to seeing in mathematical sets nicely carry over to Python sets. This connection is so strong that you can often forget that you are dealing with Python sets.
nums_1 = {2, 4, 6, 8, 10}
nums_2 = {2, 2, 4, 4, 6, 6, 8, 8, 10, 10}
print(nums_1, nums_2)
print(nums_1 == nums_2)
print(nums_1 is not nums_2)
As stated before, sets do not support duplicate elements. We see that nums_1 and nums_2 are equal sets. However, they don't point to the same object. Sets support membership tests just like lists, tuples and dictionaries.
nums = {1, 2, 3, 4, 5}
print(1 in nums)
print(6 not in nums)
The number of elements in a set, which is the same as its cardinality, is given by the len function:
nums = {1, 2, 3, 4, 5}
print(f'Cardinality of nums is {len(nums)}')
Sets cannot be indexed. This is quite reasonable as they are not ordered collections. The following code will raise a TypeError:
##### Alarm! Wrong code snippet! #####
some_set = {'this', 'is', 'a', 'set'}
print(some_set[0])
##### Alarm! Wrong code snippet! #####
Any hashable object can be added to sets. This means most of the immutable types such as int, float, str and tuple can be added to sets. A small caveat as far as tuples are concerned: a tuple is hashable only if all of its elements are hashable, so a tuple of lists is unhashable and therefore cannot be added to sets.
a_set = {1.0, 'one', 1, True, (1, )} # valid set
not_a_set = {([1, 2], [3, 4])} # not a valid set
Defining not_a_set raises a TypeError, as expected.
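Two details of the snippet above are easy to miss. The sketch below uses only built-ins to check that 1.0, 1 and True collapse into a single element (they compare equal and hash to the same value), and to catch the TypeError raised by the tuple of lists. The try/except wrapper is added here purely for illustration.

```python
# 1.0, 1 and True are equal and share a hash, so the set keeps only one of them
a_set = {1.0, 'one', 1, True, (1, )}
print(len(a_set))             # 3

# A tuple is hashable only if every element in it is hashable
try:
    not_a_set = {([1, 2], [3, 4])}
except TypeError as err:
    print('TypeError:', err)
```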
Though a set is not a sequence, iterating through the elements of a set is supported.
nums = {1, 2, 3, 4, 5}
for num in nums:
    print(num)
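Since a set has no notion of order, the order in which the loop visits the elements should not be relied upon. If a predictable order is needed, one option is to iterate over the result of sorted, which returns a sorted list of the elements. A minimal sketch, with words as an illustrative name:

```python
words = {'this', 'is', 'a', 'set'}
# sorted() returns a new list, so the loop order is now deterministic
for word in sorted(words):
    print(word)
```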
How do we define an empty set?
##### Alarm! Be careful about the variable name! #####
empty_set = { }
print(isinstance(empty_set, set))
print(isinstance(empty_set, dict))
##### Alarm! Be careful about the variable name! #####
We see that empty_set is in fact an empty dictionary. Computers are precise machines, which makes them very faithful. A few lessons back we used { } to initialize an empty dictionary. It hasn't changed. { } is still an empty dictionary. So, how do we define an empty set then?
empty_set = set()
print(isinstance(empty_set, set))
Simple enough! With the empty set and set-iteration defined, we can now grow sets from scratch.
Consider the first 100 powers of 7:
$7^1, 7^2, \ldots, 7^{100}$
Note down the last digit of each of these powers. How many of them are unique? What are these numbers?
This problem has a simple mathematical solution. But humor me and assume that you don't know how to solve this problem. Let us go for a computational solution.
num = 1
digits = set()
for i in range(100):
    num *= 7
    last = num % 10
    digits.add(last)
print(digits)
add is a method used to add elements to a set. The solution to this problem is a typical use case of sets. When duplicate elements are expected to come up often and only the distinct values matter, sets are ideal objects for storage. The same problem can be solved using lists:
num = 1
digits = [ ]
for i in range(100):
    num *= 7
    last = num % 10
    if last not in digits:
        digits.append(last)
print(digits)
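For comparison, here is a sketch of the same computation that tracks only the last digit instead of the full power. It relies on the fact that the last digit of a product depends only on the last digits of its factors. The variable names are illustrative.

```python
last = 1
digits = set()
for i in range(100):
    # the last digit of the next power depends only on the current last digit
    last = (last * 7) % 10
    digits.add(last)
print(sorted(digits))   # [1, 3, 7, 9]
```

The last digits cycle through 7, 9, 3, 1, which is why only four distinct values appear.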
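Mathematical sets are friendly objects. They routinely interact with each other through one of the following operations: subset, superset, union, intersection and difference.
Python sets strive to be as friendly as their mathematical counterparts. We will see how each of these operations is represented:
Subset: A set $A$ is a subset of $B$ if every element in $A$ is present in $B$. It is denoted by $A \subseteq B$: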
A = {1, 3, 5}
B = {1, 2, 3, 4, 5}
print(A.issubset(B)) # method-1
print(A <= B) # method-2
Both lines print the value True. A set $A$ is a proper subset of $B$ if every element in $A$ is present in $B$ and $A \neq B$. It is denoted by $A \subset B$. That is, there is at least one element in $B$ which is not in $A$:
A = {1, 2, 3}
B = {1, 2, 3}
print(A <= B) # subset check
print(A < B) # proper subset check
The < operator checks if A is a proper subset of B. In this case A and B are equal, so A is not a proper subset of B and the second print statement prints False.
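The definition of a proper subset can be checked directly against the < operator. In the sketch below, is_proper_subset is a hypothetical helper written only for this comparison; it is not part of Python's set API.

```python
def is_proper_subset(X, Y):
    # X is a proper subset of Y if X is a subset of Y and the two are not equal
    return X <= Y and X != Y

A = {1, 2, 3}
B = {1, 2, 3}
C = {1, 2, 3, 4}
print(is_proper_subset(A, B), A < B)   # False False
print(is_proper_subset(A, C), A < C)   # True True
```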
Superset: A set $B$ is a superset of $A$ if $A$ is a subset of $B$. It is denoted by $B \supseteq A$:
A = {1, 3, 5}
B = {1, 2, 3, 4, 5}
print(B.issuperset(A)) # method-1
print(B >= A) # method-2
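Union: The union of two sets $A$ and $B$ is the set of elements present in $A$, in $B$, or in both. It is denoted by $A \cup B$:
A = {1, 3, 5}
B = {2, 4, 6}
C1 = A.union(B) # method-1
C2 = A | B # method-2
print(C1, C2)
print(C1 == C2)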
When there are multiple sets, we could do the following:
A1, A2, A3, A4 = {1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}
B1 = A1.union(A2, A3, A4) # method-1
B2 = A1 | A2 | A3 | A4 # method-2
print(B1, B2)
print(B1 == B2)
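If the sets are stored in a list rather than in separate variables, the union can be accumulated in a loop, starting from the empty set. This is only a sketch; the names sets and total are illustrative.

```python
sets = [{1}, {2, 3}, {4, 5, 6}, {7, 8, 9, 10}]
total = set()               # start from the empty set
for S in sets:
    total = total | S       # fold each set into the running union
print(total == set(range(1, 11)))   # True
```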
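Intersection: The intersection of two sets $A$ and $B$ is the set of elements present in both $A$ and $B$. It is denoted by $A \cap B$:
A = {2, 4, 6}
B = {2, 4}
C1 = A.intersection(B) # method-1
C2 = A & B # method-2
print(C1, C2)
print(C1 == C2)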
What happens if there are no elements in common? We should get the empty set:
even, odd = {2, 4, 6}, {1, 3, 5}
common = even & odd
assert common == set()
We have used an assert statement just to introduce some variation. As it doesn't raise an AssertionError, we are right on target.
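Two sets with no elements in common are called disjoint. Python sets also provide an isdisjoint method for this check, which can be compared with the intersection-based test. A small sketch, where primes is an illustrative extra set:

```python
even, odd = {2, 4, 6}, {1, 3, 5}
print(even.isdisjoint(odd))        # True
print(len(even & odd) == 0)        # True

primes = {2, 3, 5}
print(even.isdisjoint(primes))     # False, since 2 is in both
```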
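Difference: The difference between two sets $A$ and $B$ is the set of elements present in one set but not in the other. It is denoted by $A - B$ or $B - A$, and the two are not the same! $A - B$ contains the elements of $A$ that are not in $B$, while $B - A$ contains the elements of $B$ that are not in $A$: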
A = {1, 2, 3, 4}
B = {2, 4, 5}
C1 = A.difference(B) # method-1
C2 = A - B # method-2
print(C1, C2)
print(C1 == C2)
D1 = B.difference(A) # method-1
D2 = B - A # method-2
print(D1, D2)
print(D1 == D2)
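One way to see how the two differences relate to the original sets: the elements of A split into those that are also in B and those that are not. The sketch below checks this with the operators already introduced; the names only_A, only_B and common are illustrative.

```python
A = {1, 2, 3, 4}
B = {2, 4, 5}
only_A = A - B      # elements of A that are not in B
only_B = B - A      # elements of B that are not in A
common = A & B      # elements present in both

print(only_A != only_B)               # True
print((only_A | common) == A)         # True
print((only_B | common) == B)         # True
```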
The methods that we saw in the previous section had a mathematical flavor. Now, we shall look at those methods that have a computational flavor!
To remove an element from the set, we can use the remove method:
A = {'this', 'is', 'a', 'set'}
print('Before', A)
A.remove('this')
print('After', A)
If we try to remove an element that is not present in the set, the interpreter will raise a KeyError:
A = {'this', 'is', 'a', 'set'}
A.remove('cool') # error!
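If it is not known in advance whether the element is present, there are a couple of ways to avoid the error. The sketch below shows two: guarding the call with a membership test, and the discard method, which removes the element if it is present and does nothing otherwise.

```python
A = {'this', 'is', 'a', 'set'}

# Option 1: check membership before removing
if 'cool' in A:
    A.remove('cool')

# Option 2: discard never raises a KeyError
A.discard('cool')   # no error, A is unchanged
A.discard('this')   # removes 'this'
print(sorted(A))    # ['a', 'is', 'set']
```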
Consider the following problem:
Given a list L, extract all unique elements from it and store the result in another list, L_uniq. The order of elements does not matter.
Let us first look at a solution that doesn't use sets:
L = [1, 2, 3, 3, 4, 5, 6, 1, 2, 2]
L_uniq = [ ]
for elem in L:
    if elem not in L_uniq:
        L_uniq.append(elem)
print(L_uniq)
Now, for some set magic:
L = [1, 2, 3, 3, 4, 5, 6, 1, 2, 2]
S = set(L)
L_uniq = list(S)
print(L_uniq)
Passing a list to the set function removes all duplicates and returns a set of the unique elements. Passing this set to the list function converts it back to a list.
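Keep in mind that converting to a set discards the original order of the list. If the order of first appearance matters, one possible alternative is dict.fromkeys. This assumes Python 3.7 or later, where dictionaries preserve insertion order. The input list here is chosen only for illustration.

```python
L = [3, 1, 3, 2, 1, 5, 2]
# dict keys are unique; in Python 3.7+ they keep insertion order
L_uniq = list(dict.fromkeys(L))
print(L_uniq)    # [3, 1, 2, 5]
```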
Sets are mutable entities.
A = {1, 2, 3}
B = A
B.add(4)
print(A, B)
print(A is B)
A and B refer to the same object, so adding an element through B changes A as well. As before, there are two ways to do a shallow copy:
A = {1, 2, 3}
B1 = A.copy()
B2 = set(A)
B1.add(4)
B2.add(0)
print(A, B1, B2)
print(A is not B1)
print(A is not B2)
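Mutability ties back to hashability: because a set can change, it is not hashable and so cannot be an element of another set. Python's built-in frozenset is an immutable counterpart that can. A brief sketch, with the try/except added only to display the error and the names inner and outer chosen for illustration:

```python
inner = {1, 2}
try:
    outer = {inner}              # a set is mutable, hence unhashable
except TypeError as err:
    print('TypeError:', err)

outer = {frozenset(inner)}       # a frozenset is immutable and hashable
print(frozenset({1, 2}) in outer)   # True
```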