Have you ever tried to organize information in your code, but felt like there was no good way to do it? Maybe you were trying to store a collection of data, but didn’t want to write out hundreds of individual variables.
Or maybe you needed to look up data quickly, but didn’t want to loop through a long list to find what you needed. Fortunately, there are data structures that make these tasks much easier.
In this article, we’ll explore several important data structures in Python.
1) Dictionaries, Maps, and Hash Tables:
A dictionary is a handy way of storing data in Python.
At its most basic level, it’s just like a standard English dictionary – you look up a word (known as a “key”) to find its definition (known as a “value”). This structure is incredibly useful for organizing data that would be difficult to store in a list or tuple.
In Python, dictionaries are defined with curly braces, like this:
my_dictionary = {"key1": "value1", "key2": "value2", "key3": "value3"}
You can access a value in a dictionary by using its key:
print(my_dictionary["key2"]) # Output: "value2"
If you try to access a key that doesn’t exist in the dictionary, you’ll get a KeyError. To avoid this, you can use the “get” method instead:
print(my_dictionary.get("nonexistent_key", "default_value")) # Output: "default_value"
If you want to add a new key/value pair to a dictionary, just assign it like this:
my_dictionary["key4"] = "value4"
In addition to regular dictionaries, Python offers several other specialized types:
collections.OrderedDict
: This type maintains the order in which items were inserted into the dictionary. Regular dictionaries also preserve insertion order as of Python 3.7, but OrderedDict adds order-aware features such as move_to_end() and order-sensitive equality comparisons.
collections.defaultdict
: This type calls a factory function to create a default value when you access a key that isn’t in the dictionary. This can simplify your code if you don’t want to check whether a key exists before accessing it.
collections.ChainMap
: This type groups multiple dictionaries into a single mapping. Lookups search each dictionary in turn, falling back to the next one if the key isn’t found in the previous one.
types.MappingProxyType
: This type creates a read-only proxy of a dictionary, which can be useful if you don’t want other code to accidentally modify your data.
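To make the differences between these four types concrete, here is a minimal sketch that uses each one. The variable names and sample data (ordered, word_lengths, settings, and so on) are made up for illustration:

```python
from collections import ChainMap, OrderedDict, defaultdict
from types import MappingProxyType

# OrderedDict: remembers insertion order and can reorder entries.
ordered = OrderedDict(one=1, two=2, three=3)
ordered.move_to_end("one")
ordered_keys = list(ordered)  # ['two', 'three', 'one']

# defaultdict: a missing key is created on first access by calling the factory (list).
word_lengths = defaultdict(list)
for word in ["cat", "dog", "horse"]:
    word_lengths[len(word)].append(word)

# ChainMap: lookups search each mapping in turn, left to right.
defaults = {"color": "blue", "size": "medium"}
overrides = {"color": "red"}
settings = ChainMap(overrides, defaults)
color = settings["color"]  # 'red' (found in overrides)
size = settings["size"]    # 'medium' (falls back to defaults)

# MappingProxyType: a read-only view; writes raise TypeError.
read_only = MappingProxyType(defaults)
write_blocked = False
try:
    read_only["color"] = "green"
except TypeError:
    write_blocked = True
```

Note that the proxy is only a view: changes made directly to the underlying defaults dictionary are still visible through read_only.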
2) Array Data Structures:
Arrays are another important data structure in Python.
Typed arrays are similar to lists, but they have a few key differences. A typed array can only store a single data type (e.g. only integers, only floats, only characters), whereas a list can contain any combination of data types.
This makes typed arrays more memory-efficient for storing large amounts of data of a single type. Here are the most common array-like types in Python:
list
: As mentioned earlier, this is a common and versatile data structure that can contain any combination of data. Because lists are dynamic (i.e. their length can change), they are useful for storing data that will be frequently modified or added to.
tuple
: This is similar to a list, but is immutable (i.e. it can’t be changed once it’s created). Because a tuple’s size is fixed, Python can store it slightly more compactly than a list, which makes tuples a good fit for data that won’t change.
array.array
: This is a basic typed array, which means it can only hold data of a single type (e.g. integers, floats, characters). Because the array is typed, it can be more memory-efficient than a list.
str
: This is an immutable array of Unicode characters. Because it stores only characters, it is more space-efficient than a list of single-character strings.
bytes
: This is an immutable array of single bytes (i.e. values from 0 to 255). It’s useful for storing binary data, like images or audio files.
bytearray
: This is a mutable array of single bytes. It’s useful for storing binary data that needs to be modified.
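The type restrictions and mutability rules described above are easy to check for yourself. The following sketch uses made-up sample values to show a typed array rejecting the wrong type, a string refusing item assignment, and the difference between bytes and bytearray:

```python
from array import array

# array.array: the type code "i" means signed integers only.
numbers = array("i", [1, 2, 3])
numbers.append(4)
rejected_float = False
try:
    numbers.append(1.5)  # a float does not fit an "i" array
except TypeError:
    rejected_float = True

# str: immutable, so item assignment raises TypeError.
text = "abc"
str_is_immutable = False
try:
    text[0] = "x"
except TypeError:
    str_is_immutable = True

# bytes is immutable; bytearray is a mutable copy of the same data.
raw = bytes([72, 105])      # b'Hi'
buffer = bytearray(raw)
buffer[0] = 104             # replace 'H' (72) with 'h' (104)
modified = bytes(buffer)    # b'hi'
```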
In conclusion, there are many different data structures available in Python that can make organizing and accessing data much easier.
By understanding the strengths and weaknesses of each structure, you can choose the one that’s best suited for your needs. Whether you’re storing simple dictionaries or complex arrays, these data structures can help you write more efficient and organized code.
3) Records, Structs, and Data Transfer Objects:
When working with data, you will often want to group multiple pieces of information together into a single object. This can be done using records, structs, or data transfer objects (DTOs).
These data structures can help you organize your data and make it easier to work with.
1. dict
:
Dictionaries are one of the simplest and most common ways to create data objects in Python. They can be used to store key-value pairs, making them ideal for storing data that can be accessed using a unique identifier.
Dictionaries are also incredibly flexible, since they can store values of any data type. Here’s an example of how you can create a dictionary in Python:
person = {"name": "John Smith", "age": 30, "email": "[email protected]"}
To access a value in the dictionary, you can use its key:
print(person["age"]) # Output: 30
2. tuple
:
Tuples are another way to group data together in Python. They are similar to lists, but are immutable, meaning they cannot be changed once they are created.
This can make tuples more efficient for storing data that doesn’t need to be modified. Here’s an example of how you can create a tuple in Python:
person = ("John Smith", 30, "[email protected]")
To access a value in the tuple, you can use indexing:
print(person[1]) # Output: 30
3. Write a Custom Class:
If the built-in data structures like dictionaries and tuples don’t fit your needs, you can create your own custom class. This gives you more control over how your data is structured and accessed.
You can define attributes and methods that suit your specific use case. Here’s an example of how you can create a custom class in Python:
class Person:
    def __init__(self, name, age, email):
        self.name = name
        self.age = age
        self.email = email
person = Person("John Smith", 30, "[email protected]")
To access a property of the Person object, you can use dot notation:
print(person.age) # Output: 30
4. dataclasses.dataclass
:
Python 3.7 introduced a new feature called data classes, which make it easier to create classes that are primarily used to store data. Data classes automatically generate methods like __init__ and __repr__, which can save you time when writing code.
Here’s an example of how you can create a data class in Python:
from dataclasses import dataclass
@dataclass
class Person:
    name: str
    age: int
    email: str
person = Person("John Smith", 30, "[email protected]")
To access a property of the data class, you can use dot notation:
print(person.age) # Output: 30
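To see what the generated methods give you in practice, here is a small sketch. The Employee class, its fields, and the default value are hypothetical and only serve as an illustration:

```python
from dataclasses import dataclass


@dataclass
class Employee:
    name: str
    age: int
    team: str = "unassigned"  # fields may declare default values


ada = Employee("Ada", 36)
same = Employee("Ada", 36)

# The generated __repr__ lists every field by name.
description = repr(ada)

# A generated __eq__ compares instances field by field.
are_equal = ada == same

# Instances are mutable unless you pass frozen=True to the decorator.
ada.age = 37
```

When the class is defined at module level like this, description is "Employee(name='Ada', age=36, team='unassigned')", which is much more helpful during debugging than the default object representation.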
5. collections.namedtuple
:
Named tuples are a subclass of tuples that have named fields.
This can make your code more readable and easier to understand. Named tuples are also immutable, making them more efficient for storing data that doesn’t need to be modified.
Here’s an example of how you can create a named tuple in Python:
from collections import namedtuple
Person = namedtuple("Person", ["name", "age", "email"])
person = Person("John Smith", 30, "[email protected]")
To access a property of the named tuple, you can use dot notation or indexing:
print(person.age) # Output: 30
print(person[1]) # Output: 30
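Because named tuples are immutable, you "change" one by building a modified copy. The sketch below, which uses a hypothetical two-field Contact type, shows the immutability along with the _replace and _asdict helper methods:

```python
from collections import namedtuple

Contact = namedtuple("Contact", ["name", "age"])
contact = Contact("Ada", 36)

# Fields cannot be reassigned.
assignment_blocked = False
try:
    contact.age = 37
except AttributeError:
    assignment_blocked = True

# _replace returns a new tuple with the given fields changed.
older = contact._replace(age=37)

# _asdict converts the fields to a dictionary.
as_dict = contact._asdict()

# Like any tuple, a named tuple can be unpacked.
name, age = contact
```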
6. typing.NamedTuple
:
In Python 3.6 and higher, you can use the NamedTuple class from the typing module to create named tuples that have type hints.
This can make your code more readable and easier to understand, especially for larger projects. Here’s an example of how you can create a named tuple with type hints:
from typing import NamedTuple
class Person(NamedTuple):
    name: str
    age: int
    email: str
person = Person("John Smith", 30, "[email protected]")
To access a property of the named tuple, you can use dot notation or indexing:
print(person.age) # Output: 30
print(person[1]) # Output: 30
7. struct.Struct
:
The struct module allows you to create C-style struct objects that can be serialized to and from binary data.
This can be useful if you need to interact with other programs or systems that use structured binary data. Here’s an example of how you can create a struct in Python:
import struct
person = struct.Struct("10s i 20s")
data = person.pack(b"John Smith", 30, b"[email protected]")
To access a property of the struct, you need to unpack it:
name, age, email = person.unpack(data)
print(age) # Output: 30
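One detail worth knowing is that fixed-width string fields such as "10s" come back as bytes objects padded with null bytes to their declared length. Here is a minimal sketch of how you might clean them up, using made-up sample values:

```python
import struct

# Same layout as above: 10-byte string, int, 20-byte string.
record_format = struct.Struct("10s i 20s")
packed = record_format.pack(b"Ada", 36, b"ada@example.com")

raw_name, age, raw_email = record_format.unpack(packed)
# raw_name is b'Ada' followed by seven null bytes of padding.

# Strip the padding and decode to get ordinary strings back.
name = raw_name.rstrip(b"\x00").decode("ascii")
email = raw_email.rstrip(b"\x00").decode("ascii")

# The packed data always has the size declared by the format.
packed_size_matches = len(packed) == record_format.size
```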
8. types.SimpleNamespace
:
A simple namespace is an object that groups data together and provides attribute access to its members.
It’s similar to a dictionary, but with easier syntax for accessing its elements. Here’s an example of how you can create a simple namespace in Python:
from types import SimpleNamespace
person = SimpleNamespace(name="John Smith", age=30, email="[email protected]")
To access a property of the simple namespace, you can use dot notation:
print(person.age) # Output: 30
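Unlike a named tuple, a simple namespace is mutable, so you can change, add, and delete attributes after it is created. A short sketch with hypothetical configuration values:

```python
from types import SimpleNamespace

config = SimpleNamespace(host="localhost", port=8080)

config.port = 9090    # existing attributes can be reassigned
config.debug = True   # new attributes can be added at any time

# vars() exposes the attributes as a dictionary; copy it to take a snapshot.
snapshot = dict(vars(config))

del config.debug      # attributes can also be removed
has_debug = hasattr(config, "debug")
```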
4) Sets and Multisets:
Sets and multisets are data structures that are commonly used in mathematics and computer science. They are useful for storing collections of unique items, and for counting the occurrence of items.
1. set
:
The set type is Python’s built-in set implementation.
They are unordered collections that contain unique elements. Sets can be used to perform a variety of operations such as finding a union, intersection, or difference between two sets.
Here’s an example of how you can create a set in Python:
my_set = {1, 2, 3, 4, 5}
my_set.add(6)
To access elements in a set, you can loop through it or use the “in” keyword:
for item in my_set:
    print(item)
print(3 in my_set) # Output: True
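The union, intersection, and difference operations mentioned above each have an operator form. Here is a quick sketch with two made-up sets:

```python
evens = {2, 4, 6, 8}
small = {1, 2, 3, 4}

union = evens | small          # items in either set: {1, 2, 3, 4, 6, 8}
intersection = evens & small   # items in both sets: {2, 4}
difference = evens - small     # items only in evens: {6, 8}
symmetric = evens ^ small      # items in exactly one set: {1, 3, 6, 8}

# Building a set from a list removes duplicates.
unique = set([3, 1, 3, 2, 1])  # {1, 2, 3}
```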
2. frozenset
:
A frozenset is an immutable version of the set object.
This means that once the frozenset is created, it cannot be changed. Frozensets are useful when you want to ensure that the set of elements cannot be modified.
Here’s an example of how you can create a frozenset in Python:
my_frozenset = frozenset([1, 2, 3, 4, 5])
To access elements in a frozenset, you can loop through it or use the “in” keyword:
for item in my_frozenset:
    print(item)
print(3 in my_frozenset) # Output: True
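A practical consequence of immutability is that frozensets are hashable, so they can serve as dictionary keys or as members of other sets, which regular sets cannot. The sketch below illustrates this with a hypothetical routes dictionary:

```python
frozen = frozenset([1, 2, 3])

# There is no add() method, so the contents cannot change.
add_blocked = False
try:
    frozen.add(4)
except AttributeError:
    add_blocked = True

# A frozenset can be a dictionary key; element order does not matter.
routes = {frozenset({"A", "B"}): 5}
distance = routes[frozenset({"B", "A"})]  # 5

# A regular set is unhashable and cannot be used as a key.
set_key_rejected = False
try:
    bad = {set([1, 2]): "x"}
except TypeError:
    set_key_rejected = True
```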
3. collections.Counter
:
A counter is a special type of dictionary that is used to count the occurrences of items in a collection.
It is part of the collections module, and can be used to simplify code that requires counting the frequency of items. Here’s an example of how you can create a counter in Python:
from collections import Counter
my_list = [1, 2, 3, 1, 2, 4, 5, 1]
my_counter = Counter(my_list)
To access the count of an item in the counter, you can use indexing:
print(my_counter[1]) # Output: 3
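Counters offer a few conveniences beyond plain indexing. This sketch, based on a made-up list of words, shows most_common, the zero default for missing items, and update:

```python
from collections import Counter

words = ["red", "blue", "red", "green", "blue", "red"]
counts = Counter(words)

# most_common(n) returns the n most frequent items with their counts.
top_two = counts.most_common(2)  # [('red', 3), ('blue', 2)]

# Looking up a missing item returns 0 instead of raising KeyError.
missing = counts["purple"]       # 0

# update() adds more observations to the existing counts.
counts.update(["green", "green"])
green_count = counts["green"]    # 3
```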
In conclusion, Python provides a variety of data structures to support different programming needs. Understanding their strengths and weaknesses can help you choose the right data structure for your project.
Whether you are storing simple data items or complex objects, Python has the tools needed for your programming tasks.
5) Stacks (LIFOs):
Stacks are a form of data structure that follow the Last-In, First-Out (LIFO) principle.
In other words, the last element added to the stack will be the first one removed. Stacks are especially useful for keeping track of the order in which certain tasks or actions were performed.
1. list
:
Python lists can be used as simple, built-in stacks.
You can add an element to a list using the “append” method, and remove the last element using the “pop” method. Here’s an example of how you can use a list as a stack in Python:
my_stack = []
my_stack.append(1)
my_stack.append(2)
my_stack.append(3)
print(my_stack.pop()) # Output: 3
print(my_stack.pop()) # Output: 2
2. collections.deque
:
collections.deque is an implementation of a double-ended queue (deque) that can be used as a LIFO stack. Deques support fast, thread-safe appends and pops from either end, each taking roughly constant time.
This makes them a great choice for applications that need fast and robust stacks. Here’s an example of how you can use a deque as a stack in Python:
from collections import deque
my_stack = deque()
my_stack.append(1)
my_stack.append(2)
my_stack.append(3)
print(my_stack.pop()) # Output: 3
print(my_stack.pop()) # Output: 2
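Since a deque works from both ends, the same object offers appendleft and popleft alongside append and pop. Here is a brief sketch with made-up values:

```python
from collections import deque

items = deque([1, 2, 3])

items.append(4)          # add on the right: deque([1, 2, 3, 4])
items.appendleft(0)      # add on the left: deque([0, 1, 2, 3, 4])

last = items.pop()       # remove from the right: 4
first = items.popleft()  # remove from the left: 0

# Popping from an empty deque raises IndexError, just like a list.
empty = deque()
pop_failed = False
try:
    empty.pop()
except IndexError:
    pop_failed = True
```

When you use a deque purely as a stack, stick to append and pop so that all operations happen at the same end.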
3. queue.LifoQueue
:
LifoQueue is part of the Python standard library and provides locking semantics that support multiple concurrent producers and consumers, which can be useful in multi-threaded applications.
The LifoQueue class is a queue that returns items in LIFO order, so it behaves like a stack. Here’s an example of how you can use a LifoQueue as a stack in Python:
import queue
my_stack = queue.LifoQueue()
my_stack.put(1)
my_stack.put(2)
my_stack.put(3)
print(my_stack.get()) # Output: 3
print(my_stack.get()) # Output: 2
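The locking only matters once several threads share the stack. The following is a minimal sketch of that situation, assuming a simple setup in which the stack is filled first and three worker threads then drain it. The worker function and the doubling "work" are hypothetical:

```python
import queue
import threading

stack = queue.LifoQueue()
for n in range(10):
    stack.put(n)

results = []
results_lock = threading.Lock()


def worker():
    """Pop items until the stack is empty, doubling each one."""
    while True:
        try:
            # get_nowait() raises queue.Empty instead of blocking.
            item = stack.get_nowait()
        except queue.Empty:
            return
        with results_lock:
            results.append(item * 2)


threads = [threading.Thread(target=worker) for _ in range(3)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Every item was processed exactly once, in whatever order the threads ran.
processed = sorted(results)
```

Keep in mind that a plain get() blocks until an item is available, so calling it on an empty LifoQueue from a single thread will wait forever.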
6) Queues (FIFOs):
Queues are data structures that follow the First-In-First-Out (FIFO) principle. In other words, the first element added to the queue will be the first one removed.
Queues are especially useful for managing collections of tasks or events that need to be processed in the order in which they were received.
1. list
:
You can implement a queue using a simple list, but it can be very slow for large collections of data. To add an element to the queue, you can use the “append” method, and to remove